Stanford Team Turns Broom Into Instrument
A team at Stanford's TreeHacks hackathon took second place in the music track with "Maestro," an AI system that turns everyday objects into musical instruments. The project combined computer vision with real-time audio synthesis, the kind of interdisciplinary AI work that stands out in a portfolio.
"Maestro" was built in just 36 hours by a team that included Stanford electrical engineering and computer science student Vansh Gadhia. Gadhia, a RISE Global Fellow and a Top 50 finalist for the Chegg.org Global Student Prize, described the project as an "accessibility-first AI music system." It secured second place in the music track at TreeHacks, which was supported by companies including Suno, NVIDIA, and Perplexity.
The team built "Maestro" from several open-source and commercial components. MediaPipe and OpenCV handled hand-gesture tracking, and WebSockets linked inputs from a phone and a web browser. For real-time sound generation, the system relied on FluidSynth and SoundFonts for MIDI playback.
The project also incorporated audio processing and generation tools. The team used Demucs, a source-separation model, and Basic Pitch, an audio-to-MIDI transcriber, for music transcription and reconstructing audio stems. For generative music, the students integrated the Suno API, a tool for AI music creation. The entire system ran on NVIDIA DGX Spark hardware to minimize latency.
TreeHacks is the largest collegiate hackathon in the United States, drawing over 1,000 students from around the globe to Stanford's campus for an intense 36 hours of building. The 2024 event, the 10th edition of the hackathon, featured over $130,000 in prizes and included speakers such as OpenAI's then-CTO, Mira Murati.
Beyond the instrument itself, "Maestro" includes an AI-powered coaching layer. After a performance, the system can analyze the user's posture and timing and provide feedback. The team plans to expand the system's object recognition, add collaborative modes, and build in adaptive practice tracking.
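For readers curious how tracked hand positions could become MIDI notes for a synthesizer such as FluidSynth, here is a minimal sketch in Python. It is not the team's code: the function name, the pentatonic scale, and the note and velocity ranges are all illustrative assumptions. It assumes only that the tracker reports a fingertip position as normalized x and y coordinates between 0 and 1, as MediaPipe's hand landmarks do.

```python
from typing import Tuple

# Hypothetical example, not Maestro's implementation.
# Semitone offsets of a major pentatonic scale within one octave (assumed choice).
PENTATONIC = [0, 2, 4, 7, 9]


def position_to_midi(x: float, y: float,
                     base_note: int = 60, octaves: int = 2) -> Tuple[int, int]:
    """Map a normalized (x, y) position to a (MIDI note, velocity) pair.

    x picks a scale degree from left to right across the frame.
    y sets loudness; the top of the frame (y = 0) is loudest.
    """
    # Clamp inputs so positions slightly outside the frame stay valid.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)

    # Split the frame width into equal zones, one per scale step.
    steps = len(PENTATONIC) * octaves
    index = min(int(x * steps), steps - 1)

    # Convert the zone index into an octave and a degree within the scale.
    octave, degree = divmod(index, len(PENTATONIC))
    note = base_note + 12 * octave + PENTATONIC[degree]

    # Scale loudness into the MIDI velocity range 1..127.
    velocity = 1 + round((1.0 - y) * 126)
    return note, velocity
```

Snapping to a pentatonic scale is one way to keep imprecise gestures sounding musical, since every zone lands on a consonant note. The resulting note and velocity could then be sent to whatever synthesizer handles playback.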