AI Roundup: New Models & Persistent Memory

The AI landscape is advancing rapidly with the release of the Qwen 3.5 model, which reportedly shows a leap in reasoning and memory. Other key developments include breakthroughs in persistent memory to give AI context across sessions, more realistic generative image models, and more adaptive AI characters in multiplayer gaming.

The Qwen model series is developed by Alibaba Cloud. Earlier releases such as Qwen2-72B are built on the Transformer architecture and incorporate Grouped Query Attention (GQA), which speeds up inference and reduces memory usage, along with SwiGLU activation. The Qwen2 family includes models with parameter sizes ranging from 0.5 billion to 72 billion. On performance benchmarks, the Qwen2-72B base model scores 84.2 on MMLU (measuring knowledge), 89.5 on GSM8K (math word problems), and 64.6 on HumanEval (code generation). Its instruction-tuned variant is competitive with other open-source models in reasoning, coding, and multilingual tasks across approximately 30 languages.

Persistent memory addresses a core AI challenge related to "catastrophic forgetting," where a model loses previously learned information when acquiring new data. The goal is to create stateful systems that retain context across multiple interactions, so that the AI evolves from a stateless tool into a continuous digital partner. This allows an AI to remember user preferences, past conversations, and project details over long periods. Achieving this long-term memory often involves hybrid architectures that go beyond the model's parameters. Techniques include external vector databases like Pinecone and Weaviate, structured knowledge graphs, and context-aware retrieval mechanisms that store and recall information efficiently. Major developers, including OpenAI and Google, have been integrating memory features into their platforms.

The realism of generative images has advanced thanks to a shift from Generative Adversarial Networks (GANs) to diffusion models. While GANs pioneered high-resolution face synthesis, they often struggled with overall scene coherence. Diffusion models, in contrast, generate images by iteratively refining random noise into a detailed picture, which allows for greater consistency and complexity. Recent breakthroughs in generative imagery focus on user control.
Technologies like ControlNet allow creators to guide the generation process using structural blueprints such as pose skeletons, sketches, or depth maps. This provides finer-grained control over the final image, addressing earlier problems where models struggled with precise spatial relationships.

In gaming, NVIDIA's ACE (Avatar Cloud Engine) technology is being used to create more lifelike and adaptive non-player characters (NPCs). The goal is to enable characters that can perceive their environment, make independent decisions, and adapt their strategies in real time based on player actions, rather than following a predefined script. Game developers are already implementing these advanced AI characters. Krafton is using a small language model in its simulation game *inZOI* to allow characters to react naturally to events, while NetEase has integrated AI models into the NPCs of its multiplayer game *Sword of Justice*.
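
For readers curious how the retrieval-based memory described earlier might work in practice, here is a minimal sketch. It is an illustration under stated assumptions, not how any named product implements memory: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and the names `MemoryStore`, `remember`, `recall`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    """Hypothetical in-memory substitute for an external vector database."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def remember(self, text):
        """Store a fact from a past session together with its vector."""
        self._items.append((text, embed(text)))

    def recall(self, query, top_k=2):
        """Return up to top_k stored facts most similar to the query."""
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), text) for text, vec in self._items]
        # Drop facts with no overlap at all, then rank by similarity.
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


def build_prompt(store, user_message):
    """Prepend recalled facts so a stateless model sees earlier context."""
    memories = store.recall(user_message)
    context = "\n".join(f"- {m}" for m in memories)
    return f"Known about this user:\n{context}\n\nUser: {user_message}"


if __name__ == "__main__":
    store = MemoryStore()
    store.remember("Prefers concise answers with Python examples")
    store.remember("Working on a weather dashboard project")
    print(build_prompt(store, "any update on the dashboard?"))
```

The model itself stays stateless in this design. Continuity comes from looking up relevant facts before each request and placing them in the prompt, which is the general idea behind pairing a language model with an external store.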
