Qwen3.6 27B beats larger models

- Alibaba Cloud said on April 24 it open-sourced Qwen3.6-27B, a dense coding model that it says beat its older Qwen3.5-397B-A17B flagship.
- Alibaba reported Qwen3.6-27B scored 77.2 on SWE-bench Verified, above Qwen3.5-397B-A17B’s 76.2, while also topping it on Terminal-Bench and SkillsBench.
- The release extends Alibaba’s push from APIs into open weights and cars at the Beijing Auto Show. (alibabacloud.com)

Large language models are software systems trained to predict the next token, or chunk of text, and coding agents use that skill to read files, run commands, and edit repositories. Alibaba Cloud said April 24 that its new Qwen3.6-27B open-weight model now beats its own older 397 billion-parameter flagship on several agentic coding tests. (alibabacloud.com)

The new model is dense, meaning all 27 billion parameters are used for each response, instead of a mixture-of-experts system that activates only part of a much larger network. Alibaba said that simpler setup makes Qwen3.6-27B easier to deploy than Qwen3.5-397B-A17B, which has 397 billion total parameters and 17 billion active parameters. (alibabacloud.com)

On Alibaba’s reported benchmarks, Qwen3.6-27B scored 77.2 on SWE-bench Verified versus 76.2 for Qwen3.5-397B-A17B. It also posted 53.5 versus 50.9 on SWE-bench Pro, 59.3 versus 52.5 on Terminal-Bench 2.0, and 48.2 versus 30.0 on SkillsBench. (alibabacloud.com)

Those tests measure whether a model can fix real software bugs, navigate terminal tasks, and complete coding jobs across repositories. Alibaba said the evaluations used long context windows of 200,000 to 256,000 tokens and multi-run averages on some benchmarks, which affects how directly the scores compare with outside tests. (alibabacloud.com)

Alibaba released Qwen3.6-27B as open weights through the Qwen3.6 repository, alongside access through Qwen Studio and an API path through Alibaba Cloud Model Studio. The GitHub repository says Qwen3.6 adds “Thinking Preservation,” which keeps reasoning context across conversation history during iterative coding sessions. (github.com) (alibabacloud.com)

The broader Qwen3.6 line has split into open and proprietary tracks this month. Qwen’s research index lists Qwen3.6-Plus as an API release on April 1, Qwen3.6-35B-A3B as an open-source release on April 14, Qwen3.6-Max-Preview on April 21, and Qwen3.6-27B as an open-source release on April 21. (qwen.ai)

Alibaba’s pitch is that smaller models can now do more useful work in real coding environments, not just score well on abstract tests. Its Qwen product page says the latest Qwen3 models use “Thinking” and “Non-Thinking” modes, support 119 languages and dialects, and are built to work with Model Context Protocol tools for agent workflows. (alibabacloud.com)

The company is also moving Qwen beyond developer tools and into products. Alibaba Cloud said on April 24 at the Beijing Auto Show that BYD, Geely, Li Auto, Changan, Dongfeng, BAIC, Great Wall Motor, SAIC Volkswagen, and SAIC IM Motors would integrate Qwen into vehicle systems. (alibabacloud.com) (cnbc.com)

For developers, the immediate takeaway is narrower and more practical: Alibaba is arguing that a 27 billion-parameter dense model is now enough for flagship-level coding work. The next test is whether Qwen3.6-27B’s benchmark lead holds up in independent use across open-source coding agents and production software teams. (alibabacloud.com)
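For readers who want to see how wide the reported gaps are, the benchmark figures quoted in this article can be turned into point and percentage margins. The snippet below is only an illustrative sketch: the scores are the ones Alibaba reported, while the names `REPORTED_SCORES` and `margins` are made up for this example and come from no Qwen tooling.

```python
# Scores as reported by Alibaba and quoted in this article:
# benchmark -> (Qwen3.6-27B, Qwen3.5-397B-A17B).
# These are vendor-reported numbers, not independent measurements.
REPORTED_SCORES = {
    "SWE-bench Verified": (77.2, 76.2),
    "SWE-bench Pro": (53.5, 50.9),
    "Terminal-Bench 2.0": (59.3, 52.5),
    "SkillsBench": (48.2, 30.0),
}


def margins(scores):
    """Return {benchmark: (gap in points, gap as a percent of the older score)}."""
    result = {}
    for name, (new, old) in scores.items():
        gap = round(new - old, 1)                      # absolute gap in points
        relative = round(100 * (new - old) / old, 1)   # gap relative to older model
        result[name] = (gap, relative)
    return result


if __name__ == "__main__":
    for name, (gap, relative) in margins(REPORTED_SCORES).items():
        print(f"{name}: +{gap} points ({relative}% over the older model)")
```

On these figures the lead ranges from 1.0 point on SWE-bench Verified to 18.2 points on SkillsBench, so the size of the claimed advantage depends heavily on which test is being read.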
