Cohere unveils Command R+

Published by The Daily Scout

What happened

Cohere announced Command R+ with 128K context and long‑document / enterprise RAG capabilities — a clear product push into large‑context inference and retrieval workflows. This positions Cohere to compete on longer‑context use cases that drive heavy inference-loads and persistent model state. (x.com)

Why it matters

Cohere stamped the release as command-r-plus-08-2024 and set a per‑run output cap of 4,000 tokens while listing per‑token pricing at $2.50 per 1M input tokens and $10 per 1M output tokens. (cohere.com) (docs.cohere.com) A Cohere Labs research release on Hugging Face publishes Command R+ weights as a 104‑billion‑parameter model under a CC‑BY‑NC license and includes a hosted demo space. (huggingface.co) (huggingface.co) Microsoft added Command R+ to the Azure AI model catalog as part of a Cohere collaboration and highlighted pairing the model with Cohere Embed and Rerank to produce citation‑backed outputs. (microsoft.com) (techcommunity.microsoft.com) AWS co‑authored a SageMaker JumpStart post showing Command R and R+ are deployable for inference across a long list of regions including us‑east‑1, us‑west‑2, eu‑central‑1 and ap‑northeast‑1 for VPC‑isolated production use. (aws.amazon.com) (aws.amazon.com) Cohere’s August 2024 model refresh claims roughly 50% higher throughput and about 25% lower latencies for Command R+ compared with the prior version, achieved without increasing the hardware footprint. (cohere.com) (docs.cohere.com) Community resources on Hugging Face surface 4‑bit quantized builds and bitsandbytes deployment instructions for Command R+, enabling local or cloud self‑hosting for research and private deployments. (huggingface.co) (huggingface.co) Amazon Bedrock and other platform docs show the models accept a structured documents field and can return outputs that reference those documents, supporting citation workflows in production inference. (aws.amazon.com) (docs.aws.amazon.com)

Key numbers

  • Cohere announced Command R+ with 128K context and long‑document / enterprise RAG capabilities — a clear product push into large‑context inference and retrieval workflows.
  • (x.com) Cohere stamped the release as command-r-plus-08-2024 and set a per‑run output cap of 4,000 tokens while listing per‑token pricing at $2.50 per 1M input tokens and $10 per 1M output tokens.
  • (cohere.com) (docs.cohere.com) A Cohere Labs research release on Hugging Face publishes Command R+ weights as a 104‑billion‑parameter model under a CC‑BY‑NC license and includes a hosted demo space.
  • (aws.amazon.com) (aws.amazon.com) Cohere’s August 2024 model refresh claims roughly 50% higher throughput and about 25% lower latencies for Command R+ compared with the prior version, achieved without increasing the hardware footprint.

Quick answers

What happened in Cohere unveils Command R+?

Cohere announced Command R+ with 128K context and long‑document / enterprise RAG capabilities — a clear product push into large‑context inference and retrieval workflows. This positions Cohere to compete on longer‑context use cases that drive heavy inference-loads and persistent model state. (x.com)

Why does Cohere unveils Command R+ matter?

Cohere stamped the release as command-r-plus-08-2024 and set a per‑run output cap of 4,000 tokens while listing per‑token pricing at $2.50 per 1M input tokens and $10 per 1M output tokens. (cohere.com) (docs.cohere.com) A Cohere Labs research release on Hugging Face publishes Command R+ weights as a 104‑billion‑parameter model under a CC‑BY‑NC license and includes a hosted demo space. (huggingface.co) (huggingface.co) Microsoft added Command R+ to the Azure AI model catalog as part of a Cohere collaboration and highlighted pairing the model with Cohere Embed and Rerank to produce citation‑backed outputs. (microsoft.com) (techcommunity.microsoft.com) AWS co‑authored a SageMaker JumpStart post showing Command R and R+ are deployable for inference across a long list of regions including us‑east‑1, us‑west‑2, eu‑central‑1 and ap‑northeast‑1 for VPC‑isolated production use. (aws.amazon.com) (aws.amazon.com) Cohere’s August 2024 model refresh claims roughly 50% higher throughput and about 25% lower latencies for Command R+ compared with the prior version, achieved without increasing the hardware footprint. (cohere.com) (docs.cohere.com) Community resources on Hugging Face surface 4‑bit quantized builds and bitsandbytes deployment instructions for Command R+, enabling local or cloud self‑hosting for research and private deployments. (huggingface.co) (huggingface.co) Amazon Bedrock and other platform docs show the models accept a structured documents field and can return outputs that reference those documents, supporting citation workflows in production inference. (aws.amazon.com) (docs.aws.amazon.com)

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Published by The Daily Scout - Be the smartest in the room.