Baseten's post‑training push

Published by The Daily Scout

What happened

Baseten visibly ramped up its post‑training and inference push this week: it ran cryptic San Francisco billboards, posted a Post‑Training Research Engineer role (Mar 23), and shipped truss v0.15.8 on GitHub. Taken together, the moves signal a go‑to‑market and productization effort aimed at ML operators, one that will raise questions about scheduler compatibility, multi‑silicon routing, and inference cost. (techspot.com) (linkedin.com) (github.com)

Why it matters

Baseten closed a $300 million Series E in January 2026 with IVP, CapitalG and NVIDIA named as anchor investors, giving the company a multi‑hundred‑million‑dollar war chest to scale inference and go‑to‑market programs. (businesswire.com)

On the product side, the March changelogs show Truss picking up developer‑focused changes in early March 2026: support for pyproject.toml and uv.lock, a --no-cache flag for truss push, and a new “monitor concurrent inference requests” telemetry feature. (baseten.co)

The Truss README and docs list first‑class support for multiple serving backends and GPU configuration, name frameworks such as vLLM and TensorRT‑LLM, and include example deployments for Llama 3 (70B), which points to an operator focus on heterogeneous runtimes and large models. (github.com) (baseten.co)

Baseten also published a billing usage API on March 2, 2026 that exposes a 31‑day usage_summary endpoint for programmatic cost breakdowns across Dedicated Inference, surfacing per‑deployment spend directly to operators. (baseten.co)

On hiring, the mid‑March Post‑Training Research Scientist listing specifies a PhD or equivalent, expects published first‑author work, and requires running experiments at scale (multi‑node, 1T+ parameter models) to translate post‑training methods into production systems. (jobgether.com) (goremotejob.com)

Baseten’s head of marketing framed the recent cryptic out‑of‑home (OOH) billboard push as intentionally exclusive, an “if‑you‑know‑you‑know” play aimed at recruiting and signaling to ML operators rather than mainstream buyers. (npr.org)

Key numbers

  • Three moves in one week: cryptic SF billboards, a Post‑Training Research Engineer role (posted Mar 23), and the truss v0.15.8 release on GitHub. (techspot.com) (linkedin.com) (github.com)
  • $300 million: the Series E that Baseten closed in January 2026, with IVP, CapitalG and NVIDIA named as anchor investors.
  • Three Truss changes in early March 2026: support for pyproject.toml and uv.lock, a --no-cache flag for truss push, and a new “monitor concurrent inference requests” telemetry feature.
  • 31 days: the span of the usage_summary endpoint in the billing usage API published March 2, 2026, which gives programmatic, per‑deployment cost breakdowns across Dedicated Inference.

What happens next

  • The mid‑March Post‑Training Research Scientist listing specifies a PhD or equivalent, expects published first‑author work, and requires running experiments at scale (multi‑node, 1T+ parameter models) to translate post‑training methods into production systems.
  • The combination of billboards, hiring, and a new truss release signals a go‑to‑market and productization move aimed at ML operators, and it is likely to surface questions about scheduler compatibility, multi‑silicon routing, and inference cost.

