LoHo‑Manip breaks long‑horizon tasks into short, replanned steps
- UC San Diego and NVIDIA researchers posted LoHo‑Manip on April 23, a robot-control system that splits long chores into smaller steps and redraws a visual path after each move.
- The system pairs a task manager with a short-horizon executor, using a “done + remaining” memory and a 2D keypoint trace so failed steps stay visible and get retried.
- The paper targets a core robotics problem: short-horizon policies can grasp or place objects, but multi-step household tasks still break under compounding errors. (arxiv.org)
Most robot policies are good at one move at a time and bad at chores that take dozens of moves. LoHo‑Manip is a new system from University of California, San Diego and NVIDIA that tries to fix that by replanning after each step. (arxiv.org) (liuisabella.com)

The paper, “Long-Horizon Manipulation via Trace-Conditioned VLA Planning,” was posted to arXiv on April 23, 2026 by Isabella Liu, An‑Chieh Cheng, Rui Yan and coauthors, with Xiaolong Wang, Hongxu Yin and Sifei Liu listed as equal advising authors. (arxiv.org)

The basic problem is error buildup. A robot can pick, place, open or push in short bursts, but a task like refilling a kettle expands into a chain of dependent actions, and one miss can derail everything that follows. (arxiv.org)

LoHo‑Manip splits that job in two. A high-level vision-language model acts as the manager, while a separate vision-language-action executor handles the local motion. (arxiv.org) (liuisabella.com)

At each step, the manager predicts a “done + remaining” task list and a visual trace, which the authors describe as a compact 2D keypoint trajectory showing where the robot should move or what it should approach next. (arxiv.org)

That turns a long plan into a series of short control problems. If a step fails, the unfinished subtask stays in the next prediction and the trace is updated, so the system can continue without a hand-built failure detector or a large visual-history buffer. (arxiv.org) (emergentmind.com)

The authors say they tested the framework on embodied planning, long-horizon reasoning, trajectory prediction and end-to-end manipulation, including real-world rollouts on a Franka robot. The project page shows tasks such as sorting produce into a black bowl and placing other items into a container. (arxiv.org) (liuisabella.com)

The benchmarks on the project page include EB‑ALFRED, EB‑Habitat, RoboVQA and EgoPlan‑Bench2 for planning and reasoning, plus trajectory prediction and real-robot manipulation results.
The paper says the manager generalizes across different manipulators, objects and scenes because it is decoupled from the executor. (liuisabella.com) (arxiv.org)

The immediate claim is not that robots suddenly solved housework. It is that a planner that keeps rewriting the remaining to-do list and drawing the next path can make long tasks look more like repeated short ones. (arxiv.org)

That is the bet behind LoHo‑Manip: keep the robot anchored to what is left, where to go next, and what just failed, instead of asking one model to remember the whole chore at once. (arxiv.org)
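
For readers who want the replan-after-every-step idea in concrete form, here is a minimal toy sketch in Python. It is not the authors' code. Every name in it (`Plan`, `toy_manager`, `make_flaky_executor`, `run_episode`) is hypothetical, the "manager" is a plain function rather than a vision-language model, the trace is a straight line of three 2D keypoints, and the executor is scripted to fail once so the retry behavior is visible.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Point = Tuple[float, float]


@dataclass
class Plan:
    done: List[str]        # subtasks already completed
    remaining: List[str]   # subtasks still to do, in order
    trace: List[Point]     # 2D keypoints for the next subtask


def toy_manager(goal_steps: List[str], completed: Set[str],
                position: Point, targets: Dict[str, Point]) -> Plan:
    """Hypothetical stand-in for the high-level manager.

    It rebuilds the done/remaining lists from the current state on every
    call, so a failed subtask simply shows up again in `remaining`.
    """
    done = [s for s in goal_steps if s in completed]
    remaining = [s for s in goal_steps if s not in completed]
    trace: List[Point] = []
    if remaining:
        tx, ty = targets[remaining[0]]
        px, py = position
        # Assumption: a straight-line trace with three keypoints.
        trace = [(px + (tx - px) * t, py + (ty - py) * t)
                 for t in (0.0, 0.5, 1.0)]
    return Plan(done, remaining, trace)


def make_flaky_executor(fail_first: Set[str]) -> Callable[[str, List[Point]], bool]:
    """Scripted executor that fails the first attempt at chosen subtasks."""
    attempts: Dict[str, int] = {}

    def executor(subtask: str, trace: List[Point]) -> bool:
        attempts[subtask] = attempts.get(subtask, 0) + 1
        return not (subtask in fail_first and attempts[subtask] == 1)

    return executor


def run_episode(goal_steps: List[str], targets: Dict[str, Point],
                executor: Callable[[str, List[Point]], bool],
                max_steps: int = 10):
    """Replan before every short-horizon attempt; no failure detector."""
    completed: Set[str] = set()
    position: Point = (0.0, 0.0)
    log: List[Tuple[str, bool]] = []
    for _ in range(max_steps):
        plan = toy_manager(goal_steps, completed, position, targets)
        if not plan.remaining:
            break
        subtask = plan.remaining[0]
        success = executor(subtask, plan.trace)
        log.append((subtask, success))
        if success:
            completed.add(subtask)
            position = plan.trace[-1]  # robot ends at the last keypoint
    return log, toy_manager(goal_steps, completed, position, targets)


STEPS = ["pick cup", "place cup", "close drawer"]
TARGETS = {"pick cup": (2.0, 4.0), "place cup": (6.0, 4.0), "close drawer": (6.0, 0.0)}

demo_log, demo_final = run_episode(STEPS, TARGETS, make_flaky_executor({"place cup"}))
for subtask, ok in demo_log:
    print(f"{subtask}: {'ok' if ok else 'failed, will retry'}")
```

In this toy run, "place cup" fails once, reappears at the head of the remaining list on the next replan, and succeeds on the second attempt. The loop never checks for failure explicitly; the retry falls out of recomputing "done + remaining" from the current state, which is the behavior the article attributes to the system.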