InternLM2 Technical Report Released
A detailed technical report for the InternLM2 large language model has been released. The paper provides an in-depth examination of the model's architecture, training procedures, and evaluation benchmarks, offering insights for advanced machine learning practitioners.
- The model was developed by the Shanghai AI Laboratory and is available in several open-source versions, with parameter sizes of 1.8B, 7B, and 20B. Checkpoints from different stages of training, such as base, Supervised Fine-Tuning (SFT), and Reinforcement Learning from Human Feedback (RLHF), have also been released to aid research.
- A key architectural feature is its support for an extended context window, with strong performance on a 200,000-token "Needle-in-a-Haystack" test. This is enabled in part by Grouped-Query Attention (GQA). A subsequent version, InternLM2.5, has demonstrated a context window of up to 1 million tokens.
- The alignment process uses a novel strategy called Conditional Online Reinforcement Learning from Human Feedback (COOL RLHF), designed to better handle conflicting human preferences and prevent reward hacking. The conditional reward model was trained on a dataset of 2.4 million binarized preference pairs.
- On specific benchmarks, the InternLM2-Chat-20B model surpasses GPT-3.5 in reasoning tasks. Its coding performance also improves significantly, with InternLM2-Chat-20B scoring over 10% higher than previous state-of-the-art models on the HumanEval and MBPP benchmarks.
- The project extends beyond the models themselves, offering a suite of open-source tools for the MLOps lifecycle. These include LMDeploy for model compression and serving, and OpenCompass, a platform for reproducible large model evaluation.
- Specialized versions of the model have been released, such as InternLM2-Math, which was further pre-trained on approximately 100 billion math-related tokens and can generate verification code in the Lean programming language.
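
To make the "Needle-in-a-Haystack" test mentioned above more concrete, here is a minimal sketch of how such a prompt could be assembled and scored. It is not the evaluation code used in the report. The function names (`build_haystack`, `needle_retrieved`), the filler text, and the use of whitespace-separated words as a rough stand-in for tokens are all assumptions made for illustration.

```python
def build_haystack(filler_sentence: str, needle: str, total_words: int, depth: float) -> str:
    """Bury `needle` inside `total_words` words of repeated filler.

    `depth` is the relative insertion point: 0.0 puts the needle at the
    start of the filler, 1.0 at the end. Words are only a rough proxy for
    tokens (an assumption of this sketch, not the report's method).
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    filler_words = filler_sentence.split()
    if not filler_words:
        raise ValueError("filler_sentence must contain at least one word")
    # Repeat the filler until there are enough words, then truncate.
    repeats = total_words // len(filler_words) + 1
    words = (filler_words * repeats)[:total_words]
    insert_at = int(len(words) * depth)
    return " ".join(words[:insert_at] + [needle] + words[insert_at:])


def needle_retrieved(model_answer: str, expected: str) -> bool:
    """Hypothetical scoring rule: case-insensitive substring match."""
    return expected.lower() in model_answer.lower()


if __name__ == "__main__":
    prompt = build_haystack(
        filler_sentence="The sky is blue today.",
        needle="The secret code is 7421.",
        total_words=1000,
        depth=0.5,
    )
    question = "What is the secret code?"
    # A real test would send `prompt + question` to the model under test;
    # here a hard-coded answer stands in for the model's reply.
    answer = "The secret code is 7421."
    print(needle_retrieved(answer, "7421"))
```

A full evaluation would typically sweep both the context length and the insertion depth and record retrieval accuracy at each combination. A real tokenizer would replace the word count used here.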