Open-Source LLMs Increase in Sophistication and Complexity
The open-source model ecosystem is maturing, marked by detailed technical reports for models like InternLM2 and flexible deployment options for models such as DeepSeek. That sophistication brings growing compliance risk when fine-tuning and deploying open models in enterprise settings, so teams need to audit licensing terms carefully.
- The InternLM2 model series includes 7B and 20B parameter versions and employs a novel training strategy called Conditional Online Reinforcement Learning from Human Feedback (COOL RLHF) to better handle conflicting human preferences.
- InternLM2's architecture was designed for long-context performance, demonstrating near-perfect recall on the 200k "Needle-in-a-Haystack" test by first training on 4k token contexts before advancing to 32k tokens.
- DeepSeek's licensing varies by model; while the DeepSeek-R1 model uses a permissive MIT license that allows for commercial use and derivative works, other versions are released under a more restrictive custom "Model License" that may limit deployment or require closer legal review.
- Fine-tuning an open-source model can inadvertently compromise its built-in safety features; research from Stanford HAI demonstrated that customizing Llama-2-Chat with as few as 10 harmful examples was enough to make it comply with a wide range of harmful requests.
- The process of fine-tuning can amplify a model's tendency to memorize and reveal sensitive information from its training data, with one study showing that fine-tuned Pythia models could extract over 20% more Personally Identifiable Information (PII) than the base model.
- From a regulatory standpoint, frameworks like the EU AI Act may classify a fine-tuned model as an entirely new AI system, which requires an independent compliance assessment and shifts the legal liability to the entity performing the customization.
- Open-source models often present security vulnerabilities due to their public nature and less mature security practices, with one study finding an average security score of just 4.6 out of 10 across various projects.
- Enterprises deploying open-source models face significant operational hurdles, including the lack of guaranteed professional support, slower release cycles for critical security patches, and the need for specialized in-house expertise to handle integration with existing systems.
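
One way a team might put the licensing audit into practice is a simple allowlist gate over its model inventory. The sketch below is illustrative only: the `APPROVED_LICENSES` set, the `ModelRecord` type, the `audit` helper, and the `example-model-custom` entry are all hypothetical names, and which licenses count as pre-approved is an assumption that a real legal team would need to set. Only the DeepSeek-R1 and MIT pairing comes from the reporting above.

```python
from dataclasses import dataclass

# Assumption: licenses a hypothetical legal team has pre-approved for commercial use.
APPROVED_LICENSES = {"mit", "apache-2.0"}


@dataclass(frozen=True)
class ModelRecord:
    name: str     # model identifier as tracked internally
    license: str  # license string recorded when the weights were obtained


def audit(records):
    """Split model names into (approved, needs_review) using the allowlist."""
    approved, needs_review = [], []
    for record in records:
        # Normalize case and whitespace so "MIT" and " mit " match the same entry.
        normalized = record.license.strip().lower()
        if normalized in APPROVED_LICENSES:
            approved.append(record.name)
        else:
            needs_review.append(record.name)
    return approved, needs_review


inventory = [
    ModelRecord("DeepSeek-R1", "MIT"),
    # Hypothetical entry standing in for a model released under a custom license.
    ModelRecord("example-model-custom", "custom-model-license"),
]

approved, needs_review = audit(inventory)
print("approved:", approved)
print("needs legal review:", needs_review)
```

Anything that is not on the allowlist is routed to review instead of being rejected outright, since a custom license may still permit the intended use once someone has read its terms.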