OpenAI Reportedly Developing Smart Speaker with Camera
OpenAI is reportedly developing an AI-powered smart speaker with a built-in camera. Details are scarce, but the device would enable multimodal interactions that combine voice and vision. The move could signal a broader industry trend toward embedding conversational and perceptual AI in household devices.
- The project is a collaboration between OpenAI CEO Sam Altman and Jony Ive, Apple's former chief design officer, with the goal of creating the "iPhone of artificial intelligence". The venture is reportedly in discussions for up to $1 billion in funding, with SoftBank CEO Masayoshi Son a key party in the talks.
- The device would be powered by a multimodal AI model, likely an evolution of OpenAI's GPT-4o, which is designed to process text, audio, and image inputs within a single model. The "o" in GPT-4o stands for "omni," and the model can respond to audio prompts in roughly 0.3 seconds on average.
- The camera's reported function is to gather contextual information about the user's surroundings, such as identifying objects on a table. The device may also include facial recognition, similar to Apple's Face ID, to authenticate purchases.
- A key design goal is a more natural and intuitive interface for AI that relies less on screens. Ive has previously expressed concern about the compulsive nature of smartphone use and sees the project as an opportunity for a new interaction paradigm.
- The AI is intended to be proactive rather than merely reactive. For example, it might notice a user staying up late before an early meeting and suggest going to bed.
- The smart speaker is reportedly the first of several hardware products OpenAI is exploring, with a team of more than 200 employees dedicated to the initiative. Other devices in early development include smart glasses and a smart lamp.
- The target price for the smart speaker is reportedly between $200 and $300, with a launch in early 2027 at the earliest.
- An always-on camera and microphone raise significant privacy concerns, which could be a major hurdle for consumer adoption compared with traditional audio-only smart speakers.