Elorian raises $55M for multimodal research

A new lab called Elorian AI, led by former DeepMind researchers, raised $55 million to focus on multimodal reasoning and visual understanding. The funding targets research that could improve how models interpret and reason about images and other visual inputs. (pulse2.com)

Artificial intelligence startup Elorian came out of stealth on April 9 with $55 million to build models that reason about images, not just describe them. (bloomberg.com) The Palo Alto company was co-founded by former Google DeepMind researcher Andrew Dai, along with Yinfei Yang, who worked on artificial intelligence research at Google and Apple, and Seth Neel, a former Harvard professor. Bloomberg reported the round values Elorian at $300 million. (bloomberg.com)

Elorian said its backers include Striker Venture Partners, Menlo Ventures, and Altimeter, with participation from 49 Palms and researcher Jeff Dean. Bloomberg also reported participation from Nvidia and said the company raised the money in two tranches, first at a $120 million valuation and later at $300 million. (elorian.ai, bloomberg.com)

Multimodal models take in more than one kind of input, such as text and images. Elorian argues that most current vision-language systems still convert pictures into words first and then reason in text, a handoff the company says is “fragile, limited, and prone to hallucination.” (elorian.ai)

Elorian’s pitch is that machines need to work with visual information more directly, the way people judge distance, shape, and physical constraints from what they see. On its site, the company says it is training models to manipulate visual representations so they can reason about structure, relationships, and constraints. (elorian.ai)

Dai told Bloomberg current systems still struggle with tasks like analyzing satellite imagery or spotting what is missing from an image, even after billions of dollars of spending by larger labs. He said better visual reasoning could help in architecture, automotive design, and robotics. (bloomberg.com)

The company has hired more than a dozen people, is not generating revenue yet, and is talking with potential customers, according to Bloomberg. Dai said Elorian plans to release its first publicly available reasoning model in about 12 months. (bloomberg.com)

Elorian is joining a crowded market for multimodal artificial intelligence, but its bet is narrower than the all-purpose chatbot race. The company’s launch materials say the target is “the foundation of visual reasoning,” with early use cases in engineering, robotics, medicine, and science. (elorian.ai)
