Demo Shows Agents Controlling Phones, Macs

A new demo of $PHONECLAW showcases AI agents controlling consumer devices like phones, Macs, and Vision Pro headsets to perform scheduled tasks. The project points to a future of agentic UX that moves beyond chatbots to direct, autonomous action on personal hardware.

The architecture of agentic AI systems is moving beyond early rule-based designs to include dedicated modules for perception, planning, and tool use. Research in this area focuses on hierarchical and multi-step planners and on the use of external tools, through APIs and code execution, to interact with the real world. The key design trade-offs engineers are grappling with are latency versus accuracy and autonomy versus controllability.

Open-source frameworks are accelerating the development of multi-agent systems. Projects such as AutoGen (from Microsoft), CrewAI, and LangGraph give developers foundational tools for building applications with multiple collaborating agents. These frameworks help manage orchestration, memory, and communication between agents, all of which are complex engineering problems. CrewAI, for example, focuses on orchestrating role-playing agents that collaborate on a task.

A significant challenge in deploying these systems is ensuring reliability and managing the handoff between agents, or between an agent and a human. AI agents are non-deterministic: the same input can produce different outputs, which makes consistent performance a major hurdle. Errors also compound across steps. Even if each step in a workflow is 95% reliable, a 20-step process succeeds end to end only about 36% of the time (0.95^20 ≈ 0.36, assuming the steps fail independently).

For consumer-facing AI products, the user experience is paramount. The goal is to make complex agent behavior feel simple and intuitive to everyday users, which means designing effective AI interaction patterns and conversational interfaces. A key part of this is managing user expectations and making the agent's reasoning process transparent enough to build trust.

In China, the AI landscape is shaped by a combination of strategic government plans and specific regulations. The "New Generation Artificial Intelligence Development Plan" sets the goal for China to be a global leader in AI by 2030. Regulations such as the "Interim Measures for the Management of Generative Artificial Intelligence Services" establish frameworks for data handling and content generation. China is also actively involved in setting international AI standards, including leading the development of guidelines for generative AI risk management.

The development of multi-agent systems also brings new security considerations. As AI agents gain access to more personal data and system controls, the potential attack surface grows. This calls for robust security practices: limiting permissions to only what is necessary, implementing strong authentication, and monitoring continuously to prevent malicious use.

A crucial area of research for improving agent capabilities is task planning and tool usage. Studies are comparing different agent types, such as "One-Step Agents" that plan and execute all subtasks at once and "Sequential Agents" that solve problems incrementally. Getting agents to use multiple tools effectively and to handle structured data formats remains a significant challenge.

The future of agentic UX points toward more proactive and autonomous systems. Gartner predicts that by 2028, a significant portion of daily work decisions will be made autonomously by AI agents. This "Great Handoff" from human-led to agent-led tasks will require new infrastructure, including specialized chips for agent-based computing and advanced platforms for orchestrating multiple agents.
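The compounding-error figure cited above (95% reliability per step, 20 steps, roughly 36% end-to-end success) can be checked with a few lines of Python. This is a minimal sketch, not a model of any real agent framework: the function name `workflow_success_rate` is hypothetical, and the calculation assumes every step succeeds or fails independently with the same probability, which real workflows may not satisfy.

```python
def workflow_success_rate(step_reliability: float, steps: int) -> float:
    """Probability that all steps succeed.

    Assumption (not from the article): steps are independent and share
    the same per-step reliability, so the probabilities multiply.
    """
    if not 0.0 <= step_reliability <= 1.0:
        raise ValueError("step_reliability must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_reliability ** steps


if __name__ == "__main__":
    # Compare a few per-step reliabilities over the same 20-step workflow.
    for reliability in (0.90, 0.95, 0.99):
        rate = workflow_success_rate(reliability, 20)
        print(f"{reliability:.0%} per step over 20 steps: {rate:.1%} end to end")
```

Under these assumptions, the 95% case comes out at about 36%, matching the figure in the text. Shortening the workflow or raising per-step reliability both improve the end-to-end rate.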
