ThinkMorph boosts multimodal IQ

ThinkMorph reports a 34.7% gain on vision-centric benchmarks from multimodal interleaved chain-of-thought, a notable result for models that must reason across text and images. (x.com)

ThinkMorph was fine-tuned on roughly 24,000 high-quality interleaved text-image reasoning traces collected across tasks with varying visual engagement. (arxiv.org) The paper lists Jiawei Gu, Yunzhuo Hao, Huichen Will Wang, Linjie Li, Michael Qizhe Shieh, Yejin Choi, Ranjay Krishna and Yu Cheng as authors, with affiliations including the National University of Singapore, Zhejiang University, University of Washington, Stanford University and The Chinese University of Hong Kong. (arxiv.org)

The project has an official code repository and project website on GitHub, and a model checkpoint (ThinkMorph-7B) is published on Hugging Face for public use. (github.com) (huggingface.co) The authors note that the training data and model checkpoint were released publicly, with repository activity and dataset uploads dating from 29 Oct 2025, enabling replication and follow-up work by the community. (github.com) (huggingface.co)

The team evaluated ThinkMorph across a suite of vision-centric benchmarks supported by their VLMEvalKit, naming VSP, VisPuzzle, ChartQA, VStar, BLINK-J, MMVP, SAT, BLINK and CV-Bench. (github.com) The authors report three emergent multimodal behaviors: visual manipulation capabilities such as zoom-in and image inpainting, autonomous switching between text and vision reasoning modes, and improved test-time scaling via diversified multimodal thought sampling. (iclr.cc) (arxiv.org)

ThinkMorph was accepted to ICLR 2026 as a poster; the OpenReview entry shows a submission/decision timeline with the paper posted 26 Jan 2026 and last modified 28 Feb 2026. (openreview.net) OpenReview meta-reviewers raised four concerns: possible over-generalization of task-specific results, unclear empirical causes for the test-time advantages, the geometric and semantic consistency of the reported "unseen" visual manipulations, and the computational overhead of generating and reintegrating images. (openreview.net)
