ModelScope’s dots.mocr OCR
What happened
ModelScope released dots.mocr, a 3B multimodal OCR model that topped benchmarks and is now integrated into vLLM v0.11.0—potentially improving extraction quality for document‑heavy RAG systems. The model’s multimodal gains could reduce noisy retrieval signals from scanned documents. (x.com/ModelScope2022/status/2034826884018500081)
Why it matters
The dots.mocr paper "Multimodal OCR: Parse Anything from Documents" was posted to arXiv on March 13, 2026 and lists authors including Handong Zheng, Yumeng Li, Kaile Zhang and Xiang Bai among the contributing team. (arxiv.org) The project publishes both a general dots.mocr model and a dots.mocr‑svg variant that explicitly targets image→SVG conversion for charts, UI layouts and scientific figures. (huggingface.co) On the olmOCR benchmark table reproduced in the repo, dots.mocr posts per‑category scores such as 85.9, 85.5 and 90.7 on several splits and an overall reported score of 83.9 ± 0.9 in the authors' evaluation table. (github.com) The repository and Hugging Face card include an example serve command for production use: "vllm serve rednote‑hilab/dots.ocr --trust‑remote‑code" and note a vLLM model executor class for dots_ocr in the vLLM API docs. (stable-learn.com) A vLLM Docker image tag frequently referenced for deployment is vllm/vllm-openai:v0.11.0 (multi‑platform image, compressed layers ~11.6 GB in the registry metadata). (hub.docker.com) The project documentation and third‑party deployment guides show a Docker + vLLM compose path and state that performance was validated against the original out‑of‑tree registration during their vLLM server tests. (deepwiki.com)
Key numbers
- ModelScope released dots.mocr, a 3B multimodal OCR model that topped benchmarks and is now integrated into vLLM v0.11.0—potentially improving extraction quality for document‑heavy RAG systems.
- (huggingface.co) On the olmOCR benchmark table reproduced in the repo, dots.mocr posts per‑category scores such as 85.9, 85.5 and 90.7 on several splits and an overall reported score of 83.9 ± 0.9 in the authors' evaluation table.
- (stable-learn.com) A vLLM Docker image tag frequently referenced for deployment is vllm/vllm-openai:v0.11.0 (multi‑platform image, compressed layers ~11.6 GB in the registry metadata).
What happens next
- (arxiv.org) The project publishes both a general dots.mocr model and a dots.mocr‑svg variant that explicitly targets image→SVG conversion for charts, UI layouts and scientific figures.
- The model’s multimodal gains could reduce noisy retrieval signals from scanned documents.
Quick answers
What happened in ModelScope’s dots.mocr OCR?
ModelScope released dots.mocr, a 3B multimodal OCR model that topped benchmarks and is now integrated into vLLM v0.11.0—potentially improving extraction quality for document‑heavy RAG systems. The model’s multimodal gains could reduce noisy retrieval signals from scanned documents. (x.com/ModelScope2022/status/2034826884018500081)
Why does ModelScope’s dots.mocr OCR matter?
The dots.mocr paper "Multimodal OCR: Parse Anything from Documents" was posted to arXiv on March 13, 2026 and lists authors including Handong Zheng, Yumeng Li, Kaile Zhang and Xiang Bai among the contributing team. (arxiv.org) The project publishes both a general dots.mocr model and a dots.mocr‑svg variant that explicitly targets image→SVG conversion for charts, UI layouts and scientific figures. (huggingface.co) On the olmOCR benchmark table reproduced in the repo, dots.mocr posts per‑category scores such as 85.9, 85.5 and 90.7 on several splits and an overall reported score of 83.9 ± 0.9 in the authors' evaluation table. (github.com) The repository and Hugging Face card include an example serve command for production use: "vllm serve rednote‑hilab/dots.ocr --trust‑remote‑code" and note a vLLM model executor class for dots_ocr in the vLLM API docs. (stable-learn.com) A vLLM Docker image tag frequently referenced for deployment is vllm/vllm-openai:v0.11.0 (multi‑platform image, compressed layers ~11.6 GB in the registry metadata). (hub.docker.com) The project documentation and third‑party deployment guides show a Docker + vLLM compose path and state that performance was validated against the original out‑of‑tree registration during their vLLM server tests. (deepwiki.com)