New AI Models Boast 'Persistent Memory'
The latest AI news roundup highlights several breakthroughs, including more lifelike AI companions and new large language models such as Qwen3.5. A key advance is "persistent memory" technology, which could transform cloud and edge computing by enabling faster, more reliable data access for AI applications.
Persistent memory, also known as PMEM, is a high-performance, non-volatile memory technology that pairs latency close to that of DRAM with the data persistence of NAND flash storage. Unlike volatile DRAM, which loses its contents when power is cut, PMEM retains them, so systems and applications can restart far faster because they no longer need to reload data from slower disk storage. The technology bridges a critical gap in the memory hierarchy: it sits on the CPU's memory bus and is byte-addressable like traditional RAM, which is significantly faster than the block-based access of SSDs. Key hardware examples include Intel's Optane DC persistent memory modules and NVDIMMs (Non-Volatile Dual In-line Memory Modules), which install in DIMM slots alongside DRAM.

For AI, persistent memory means that models can maintain context and learn from interactions over time, transforming them from static tools that start fresh with each query into dynamic partners with a continuous memory. This capability is the foundation for more personalized and contextually aware AI agents in customer service, healthcare, and education.

Alibaba's Qwen3.5 is built for this "agentic AI era." It is a native vision-language model, designed from the ground up to understand both text and images. The model series uses a hybrid architecture of Mixture-of-Experts (MoE) and Gated Delta Networks, which gives it the capacity of 397 billion total parameters while activating only 17 billion per token, keeping inference fast and cost-efficient.

The rise of AI companions highlights the demand for this persistence: more than 337 active applications worldwide generate revenues exceeding $580 million. Platforms like Character.ai and Replika are already being used for emotional support and companionship, with some users spending an average of two hours per day on the services.

Beyond AI companions, persistent memory is accelerating data-intensive industries by enabling real-time big data analytics. Use cases include financial trading applications that can rapidly process transactions and fraud detection systems that analyze millions of records to prevent losses.

At the edge, emerging persistent memory technologies like MRAM and FeRAM are crucial for resource-constrained devices. Their low power consumption and high endurance are ideal for IoT applications, enabling continuous data logging and faster, more reliable over-the-air updates without draining batteries.
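The byte-addressable, persistent access described above can be illustrated with a memory-mapped file. This is a minimal sketch rather than a real PMEM deployment: an ordinary temporary file stands in for the persistent region, and the names `open_region`, `bump_counter`, and `demo` are hypothetical, chosen only for this example. On actual persistent memory hardware, the file would typically live on a DAX-mounted filesystem and a dedicated library such as PMDK would handle flushing and consistency.

```python
import mmap
import os
import struct
import tempfile

# One 8-byte little-endian unsigned counter (layout is an assumption for this sketch).
RECORD = struct.Struct("<Q")


def open_region(path, size=4096):
    """Map `path` into memory, creating and zero-extending it if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        return mmap.mmap(fd, size)
    finally:
        # The mapping keeps its own reference, so the descriptor can be closed.
        os.close(fd)


def bump_counter(region, offset=0):
    """Read, increment, and write back the counter in place at `offset`."""
    (value,) = RECORD.unpack_from(region, offset)
    RECORD.pack_into(region, offset, value + 1)
    region.flush()  # ask the OS to push the change to the backing store
    return value + 1


def demo():
    """Update the counter, drop the mapping, then reopen and read it back."""
    path = os.path.join(tempfile.mkdtemp(), "region.bin")
    first = open_region(path)
    bump_counter(first)
    bump_counter(first)
    first.close()  # stands in for the application shutting down

    second = open_region(path)
    (value,) = RECORD.unpack_from(second, 0)
    second.close()
    return value


if __name__ == "__main__":
    print(demo())
```

The program updates eight bytes at a chosen offset instead of rewriting a whole block, and the value is still there when the region is reopened. That is the same pattern an edge device could use for continuous data logging, though the durability and performance of a real deployment depend on the underlying hardware.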