Google's Gemini 3.1 Pro Outperforms GPT-4.2 and Claude
Google's new Gemini 3.1 Pro model has posted a leading score of 44.4% on the Humanity's Last Exam benchmark. The result surpasses both GPT-4.2 (34%) and Claude Opus (40%), according to a recent podcast analysis. Alongside the model, Google also released Pomeli, an enterprise tool that generates campaign-ready product images and integrates directly with Google Ads.
- While Gemini 3.1 Pro leads on 13 of 16 benchmarks, including more than doubling its predecessor's score on the ARC-AGI-2 abstract reasoning test (77.1%), competitors still lead in key areas. Claude Opus 4.6 outperforms on expert tasks and GPT-5.3-Codex leads on some specialized coding benchmarks.
- The Pomeli tool creates its product images by first scanning a brand's website to build a "Business DNA" profile, which analyzes tone of voice, color palettes, and fonts to ensure generated assets are brand-consistent. As of January 2026, a new feature called Pomeli Animate was added, using the Veo 3.1 model to turn static marketing content into short video animations.
- For agencies with public sector clients, the EU AI Act's transparency obligations for AI-generated content and deepfakes (Article 50) are scheduled to become legally binding on August 2, 2026. A voluntary Code of Practice is expected to be finalized by June 2026 to guide companies on compliance for labeling and watermarking synthetic media.
- The use of generative AI in political campaigns is shifting communication from broad mass messaging to highly personalized digital outreach. AI tools are being used to create multilingual, hyper-targeted voter messages and to optimize content to provoke specific emotional reactions and increase engagement.
- In govtech, a primary driver for AI adoption is workforce strain from hiring freezes and a high number of pending retirements. This is creating pressure to automate structured and repetitive work such as form processing, scheduling, and compliance checks.
- The technology behind these models is enabling a shift for service-based agencies toward "productized" offerings. This model involves identifying the most common client requests and building a standardized solution that is 80% repeatable, allowing the agency to scale revenue without linearly increasing headcount.
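
For readers who want to size the gaps between the exam scores quoted in the opening paragraph, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the three scores are the ones reported above, while the `scores` dictionary and the `lead_over` helper are hypothetical names introduced here, not part of any vendor tooling.

```python
# Scores (percent) exactly as quoted in the opening paragraph.
# The dictionary and helper names are illustrative assumptions.
scores = {
    "Gemini 3.1 Pro": 44.4,
    "Claude Opus": 40.0,
    "GPT-4.2": 34.0,
}


def lead_over(leader: str, other: str) -> tuple[float, float]:
    """Return (percentage-point gap, relative gap in percent) of leader over other."""
    gap = scores[leader] - scores[other]
    relative = gap / scores[other] * 100
    return round(gap, 1), round(relative, 1)


for rival in ("Claude Opus", "GPT-4.2"):
    points, relative = lead_over("Gemini 3.1 Pro", rival)
    print(f"vs {rival}: +{points} points ({relative}% relative)")
```

The helper reports both figures because they answer different questions: the percentage-point gap is the raw difference between two scores, while the relative gap expresses that difference as a share of the rival's score.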