Alluxio adds POSIX write cache
- Alluxio released AI 3.9 on May 19, adding a POSIX-compatible write cache that shifts the company from read-centric caching toward read-write acceleration. (documentation.alluxio.io)
- The company said validation testing reached 8.99 GB/s peak sequential write throughput in FUSE Full POSIX Workspace and 10.20 GB/s single-worker checkpoint writes. (documentation.alluxio.io)
- Alluxio’s release notes and product blog published May 19 detail configuration, benchmark context and the new FUSE Full POSIX Workspace. (documentation.alluxio.io)

Alluxio released AI 3.9 on May 19, adding a POSIX-compatible write cache and a new FUSE Full POSIX Workspace aimed at write-heavy AI jobs. The company said the release shifts Alluxio from a read-centric cache into a read-write data acceleration platform for AI workloads. (documentation.alluxio.io)

Release notes say the update also adds RDMA support for read I/O and stronger reliability features. Alluxio’s product blog said the new write path is designed for frameworks that checkpoint through a mounted file system rather than through S3. (documentation.alluxio.io) The company named PyTorch Distributed Checkpoint, DeepSpeed, Megatron-LM and Ray Train as examples of software that writes through POSIX interfaces and can stall when shared storage is slow.

### Why did Alluxio add a POSIX write cache now?

Alluxio said May 19 that distributed training jobs spend significant time writing checkpoints and waiting for them to finish. Its blog said those writes often run through remote file systems or object-storage-backed mounts, turning checkpointing into a network-bound step that leaves GPUs idle until the slowest worker completes. (documentation.alluxio.io)

The new feature extends write-back caching that Alluxio introduced in version 3.8 for S3 workloads to the POSIX file-system path, the company said. In Alluxio’s description, applications write to a POSIX-mounted Alluxio file system, data lands first on compute-side NVMe in the worker pool, and persistence to the durable backend happens asynchronously. (alluxio.io)

### What changes in the file-system path?

Release notes for AI 3.9 said Alluxio now supports write-intensive workloads with a POSIX-compatible write cache and S3 multipart upload support. The same notes said the FUSE Full POSIX Workspace supports random writes, overwrites, truncation, rename, symlinks and other standard POSIX operations through a FUSE mount.
(alluxio.io)

The documentation says the workspace uses FoundationDB as a distributed, strongly consistent metadata store. Alluxio said that setup allows multi-node access to the same dataset, while data can sit on worker NVMe for lower latency or on UFS PageStore for higher durability. (alluxio.io)

### What performance numbers did Alluxio publish?

Alluxio’s release notes said validation testing for FUSE Full POSIX Workspace reached up to 8.99 GB/s peak sequential write throughput and 8.01 GB/s peak hot-cache read throughput. The same release notes said checkpoint support through S3 and FUSE showed up to 10.20 GB/s single-worker checkpoint write throughput. (documentation.alluxio.io)

Bill Hodak, writing in Alluxio’s May 19 blog post, said the POSIX write cache delivered 7.6 GiB/s per node throughput with sub-2 millisecond P99 latency in the company’s checkpointing tests. Those figures came from vendor testing, not an independent benchmark. (documentation.alluxio.io)

### What does this change for write-heavy pipelines?

Alluxio’s own benchmark documentation says the default POSIX write type had previously been THROUGH, with no cache on write. That meant writes persisted to the underlying file system and did not automatically populate the Alluxio cache, according to the benchmark guide. (documentation.alluxio.io)

That distinction matters for pipelines that create large volumes of files and metadata in bursts. In practical terms, a write-back cache can absorb writes locally and defer backend persistence, reducing application-visible delay when ingest, checkpointing or large-batch processing pushes shared storage to its limits; that is an inference from Alluxio’s architecture and performance claims, not a separate company statement. (alluxio.io)

### Are there limits or trade-offs in the first release?

Alluxio’s FUSE Full POSIX Workspace documentation says the feature is experimental since AI-3.9-16.0.0.
The same page says workspace data is not automatically persisted to the underlying file system and that current support is limited to TRANSIENT path configuration. (documentation.alluxio.io)

The documentation also says worker-NVMe mode offers low latency but transient durability, with data loss possible if a worker fails. Users who need stronger durability can use UFS PageStore, while metadata remains backed by FoundationDB, according to the guide. (documentation.alluxio.io)

### Where can users see what ships next?

Alluxio’s release notes page lists AI 3.9 as the current release and links to configuration details for checkpointing, FUSE write optimization and FUSE Full POSIX Workspace. The company’s May 19 blog post and documentation pages are the named sources for benchmark methodology, supported operations and deployment requirements. (documentation.alluxio.io 1) (documentation.alluxio.io 2)
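
For readers who want to see the write-through versus write-back distinction discussed above in concrete terms, the following is a minimal Python sketch. It is not Alluxio code and does not use any Alluxio API: `SlowBackend`, `WriteThrough`, `WriteBack` and `timed_writes` are hypothetical names, the 50 ms backend delay is an arbitrary assumption, and an in-memory dictionary stands in for worker NVMe. It only illustrates why a caller returns sooner when persistence is deferred.

```python
import queue
import threading
import time


class SlowBackend:
    """Hypothetical stand-in for remote durable storage; each put sleeps."""

    def __init__(self, delay_s: float = 0.05):
        self.delay_s = delay_s  # assumed per-write latency, chosen arbitrarily
        self.objects: dict[str, bytes] = {}

    def put(self, path: str, data: bytes) -> None:
        time.sleep(self.delay_s)
        self.objects[path] = data


class WriteThrough:
    """The caller blocks until the backend has stored the data."""

    def __init__(self, backend: SlowBackend):
        self.backend = backend

    def write(self, path: str, data: bytes) -> None:
        self.backend.put(path, data)

    def flush(self) -> None:
        pass  # nothing is pending; every write is already persisted


class WriteBack:
    """The caller returns once data is cached; a thread persists it later."""

    def __init__(self, backend: SlowBackend):
        self.backend = backend
        self.cache: dict[str, bytes] = {}  # stands in for local NVMe
        self._pending: queue.Queue = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, path: str, data: bytes) -> None:
        self.cache[path] = data
        self._pending.put(path)  # persistence is deferred

    def _drain(self) -> None:
        while True:
            path = self._pending.get()
            self.backend.put(path, self.cache[path])
            self._pending.task_done()

    def flush(self) -> None:
        self._pending.join()  # block until all queued writes are persisted


def timed_writes(store, n: int = 5) -> float:
    """Return the seconds the caller spends issuing n small writes."""
    start = time.perf_counter()
    for i in range(n):
        store.write(f"ckpt/shard-{i}.bin", b"x" * 1024)
    return time.perf_counter() - start


if __name__ == "__main__":
    through = timed_writes(WriteThrough(SlowBackend()))
    back_store = WriteBack(SlowBackend())
    back = timed_writes(back_store)
    back_store.flush()
    print(f"write-through: {through:.3f}s  write-back: {back:.3f}s")
```

In this toy model, anything still in the cache and not yet flushed would be lost if the process died, which mirrors the durability trade-off the documentation describes for worker-NVMe mode.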