Google readies Gemini agent
- Google is preparing a Gemini "agent" that can run on macOS and, with screen access, move the mouse, type, and organize files on your Mac.
- Gemini in Docs now accepts up to 1,000 persistent custom instructions, and Gemini can export chat outputs as DOCX, PDF, Sheets, and Slides.
- The change moves Gemini from suggestion to execution: agents will run multi-step Workspace workflows and act inside Meet, Gmail, Docs, and the desktop.

Lede
This is about desktop AI, not just a smarter reply box. Google is rolling out a Gemini "agent" that can live on macOS and take actions on your screen: moving the cursor, typing, and tidying files. That matters because it turns an assistant that suggests into one that completes tasks for you. The news combines a native macOS app and new persistent rules in Docs, and it's rolling out now.

What is a Gemini agent?
A Gemini agent is an automated, multi-step helper that plans and executes tasks for you, such as booking, summarizing, or filing. It can pull from the web, your documents, and connected apps, then carry out steps in sequence instead of giving you a one-off answer.

What will it do on a Mac?
On macOS the agent can use screen access and accessibility controls to interact with the desktop. It can display what it's doing, move the mouse, and type, so it can open folders, sort downloads, and rename or move files. That's different from a sidebar helper: it can manipulate apps the way a human would.

How does Gemini behave inside Google Docs?
Gemini in Docs now supports persistent custom instructions. You can set up to 1,000 rules for tone, format, or process, and the model will follow them across sessions. That means you don't have to repeat style directions, and the assistant behaves consistently inside the document. Rollout began in early May and should reach users over the following weeks.

Can it create and export Office and PDF files?
Yes. Gemini can generate formatted outputs from chat and save them as DOCX, PDF, Sheets, Slides, and more. You can ask for a meeting summary, a budget spreadsheet, or a slide deck in chat and get a downloadable file instead of copying and pasting the result into another app.

What about safety and privacy?
There are guardrails. Gemini's documentation warns against putting passwords or sensitive payment data directly in chat, and notes that some agent actions may run even if an app is disconnected.
Screen control requires macOS accessibility permissions, so users must opt in and grant system access. The catch is obvious: an agent with this much power and screen access raises new security questions.

Why does this shift matter?
This moves AI from giving suggestions to finishing work. That changes how success will be measured: not by clicks or answers, but by completed workflows, repeat usage, and time saved. Enterprises will see it as an automation platform; consumers will notice fewer context switches.

Bottom line
Gemini's move to run on your Mac and obey persistent rules in Docs looks small on the surface and is big under the hood. It could save time, but it demands careful permissioning and new habits around security.