OpenAI's o3 mini + AFC v2

- OpenAI released o3 mini high, a distilled model tuned for multi-step reasoning and lower latency. - The same update included Advanced Function Calling v2, which adds parallel tool calls and error recovery. - The release and tooling were detailed across developer posts and a public thread documenting the changes. ( )

OpenAI has rolled out o3-mini-high and a new function-calling stack that lets models use tools in more than one step with less waiting. (openai.com) OpenAI introduced o3-mini on January 31, 2025, as a small reasoning model with three effort settings — low, medium, and high — and said the high setting is available in ChatGPT as o3-mini-high. The company said o3-mini is 93% cheaper than o1, has a 200,000-token context window, and supports function calling, structured outputs, streaming, and developer messages. (openai.com) (community.openai.com) A reasoning model is built to work through multi-step problems before answering, the way a calculator shows intermediate work even if the user only sees the result. OpenAI’s developer cookbook says models like o3 and o4-mini are trained for complex planning, coding, and step-by-step tasks, and the newer Responses API is designed to manage those back-and-forth tool interactions. (developers.openai.com) (platform.openai.com) Function calling is the part that lets a model ask an app to run outside code, like checking a database or sending a request to another service. OpenAI’s guide describes a five-step loop: send tools, receive a tool call, execute it in the app, return the result, and get either a final answer or another tool call. (developers.openai.com) The newer change is that OpenAI’s tooling now documents parallel tool calls, which means a model can ask for more than one function in the same turn when those tasks do not depend on each other. OpenAI’s Assistants documentation shows two tool calls in one run, and its reasoning-model cookbook says some models can return an array of functions for parallel execution. (developers.openai.com 1) (developers.openai.com 2) That matters for latency because separate lookups that used to run one after another can be launched together. OpenAI’s cookbook on parallel agents says concurrent execution is used specifically for lower latency, and the o3-mini launch post said the model was tuned for lower latency than o1-mini. (developers.openai.com) (openai.com) OpenAI’s public docs also show the company pushing developers toward the Responses API as the main place to build these workflows. The reference says Responses supports stateful interactions, built-in tools, and custom function calling in one interface, which reduces the amount of conversation state developers have to track themselves. (platform.openai.com) The company has also been explicit that parallel tool use is not universal and that reasoning models may still choose serial calls when one step depends on another. OpenAI’s cookbook says some tasks must be done in sequence, and its prompting guide for o3 and o4-mini warns developers to spell out tool order when the workflow has prerequisites. (developers.openai.com 1) (developers.openai.com 2) OpenAI’s own posts frame the release as part of a broader shift from chatbots that answer in one pass to models that can plan, call tools, and keep going until a task is complete. In that setup, o3-mini-high is the slower, harder-thinking option, and the new function-calling layer is the plumbing that lets that extra reasoning touch outside systems. (openai.com 1) (openai.com 2)

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.