Students Distill Models for Under $100

A group of college students demonstrated they could distill proprietary models like Claude, GPT, and Gemini into high-performing open-source variants for just $52 in compute costs. The project, TeichAI, used 250 reasoning samples to create models that have since been downloaded over 67,000 times. The effort highlights the increasing accessibility of powerful AI, alongside the complex legal and ethical questions of data provenance and model replication.

- The project's method is a form of supervised fine-tuning (SFT), where instead of using human-labeled data, they use the outputs ("reasoning traces") from more powerful "teacher" models to train smaller, open-source "student" models. - To achieve this efficiently, TeichAI uses Unsloth, a library designed to speed up training by up to 30x and reduce memory usage by 60-90%. This is paired with Low-Rank Adaptation (LoRA), a technique that dramatically reduces the number of trainable parameters, making it feasible to fine-tune on consumer-grade hardware. - Performance benchmarks show that this distillation process can be effective; for instance, a model distilled from Gemini 2.5 Flash improved upon the base model's average score by 1.9% across six different reasoning benchmarks, and a GPT-5 Codex-distilled model improved performance on graduate-level science questions by 13.6%. - While the students' approach highlights a path to democratize AI, it operates in a legally gray area. Most proprietary AI model providers explicitly forbid the use of their model outputs to train competing models in their terms of service. - The central legal debate is whether a distilled model constitutes a "derivative work" under copyright law. However, current legal interpretations suggest this is unlikely to be considered copyright infringement because the process mimics functional patterns rather than directly copying expressive content or code. - The more immediate legal risk for those replicating this work is a breach of contract claim for violating the teacher model's terms of service, a simpler case to argue in court than copyright infringement.

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.