Baseten's post‑training push

Published March 24, 2026 by The Daily Scout

What happened

Baseten visibly stepped up its post‑training and inference push this week: it ran cryptic San Francisco billboards, posted a Post‑Training Research Engineer role (Mar 23), and shipped truss v0.15.8 on GitHub (techspot.com) (linkedin.com) (github.com). Taken together, the moves point to a go‑to‑market (GTM) and productization effort aimed at ML operators, one likely to raise questions about scheduler compatibility, multi‑silicon routing, and inference cost.

Why it matters

Baseten closed a $300 million Series E in January 2026 with IVP, CapitalG and NVIDIA named as anchor investors, giving the company a multi‑hundred‑million‑dollar war chest to scale inference and go‑to‑market programs (businesswire.com).

Truss changelogs from early March 2026 show developer‑focused changes: support for pyproject.toml and uv.lock, a --no-cache flag for truss push, and a new “monitor concurrent inference requests” telemetry feature (baseten.co). The Truss README and docs list first‑class support for multiple serving backends and GPU configuration, name frameworks such as vLLM and TensorRT‑LLM, and include example deployments for Llama 3 (70B), signaling an operator focus on heterogeneous runtimes and large models (github.com / baseten; baseten.co).

On March 2, 2026, Baseten published a billing usage API with a 31‑day usage_summary endpoint for programmatic cost breakdowns across Dedicated Inference, which surfaces per‑deployment spend directly to operators (baseten.co).

The mid‑March Post‑Training Research Scientist listing specifies a PhD or equivalent, expects published first‑author work, and requires running experiments at scale (multi‑node, 1T+ parameter models) to translate post‑training methods into production systems (jobgether.com) (goremotejob.com). Baseten’s head of marketing framed the recent cryptic out‑of‑home (OOH) campaign as intentionally exclusive, “if‑you‑know‑you‑know”: a play aimed at recruiting and signaling to ML operators rather than mainstream buyers (npr.org).

Key numbers

$300 million: the Series E closed in January 2026, with IVP, CapitalG and NVIDIA named as anchor investors (businesswire.com).
v0.15.8: the truss release shipped on GitHub this week (github.com).
March 2, 2026: the date Baseten published its billing usage API (baseten.co).
31 days: the period attached to the billing API's usage_summary endpoint (baseten.co).
1T+ parameters: the model scale named in the Post‑Training Research Scientist listing (jobgether.com) (goremotejob.com).

What happens next

The Post‑Training Research Scientist listing, with its emphasis on multi‑node experiments on 1T+ parameter models, indicates that Baseten intends to translate post‑training methods into production systems.
For ML operators, the open questions are scheduler compatibility, multi‑silicon routing, and inference cost.
