Baseten's post‑training push
Baseten visibly ramped a post‑training/inference push this week — they ran cryptic SF billboards, posted a Post‑Training Research Engineer role (Mar 23), and shipped truss v0.15.8 on GitHub. That combo signals a GTM and productization move aimed at ML operators that will surface questions about scheduler compatibility, multi‑silicon routing, and inference cost. (techspot.com) (linkedin.com) (github.com)
Baseten closed a $300 million Series E in January 2026 with IVP, CapitalG and NVIDIA named as anchor investors, giving the company a multi‑hundred‑million war chest to scale inference and go‑to‑market programs. (businesswire.com). March product changelogs show Truss gained developer‑focused changes this month — support for pyproject.toml and uv.lock, a --no‑cache truss push flag, and a new “monitor concurrent inference requests” telemetry feature added across early March 2026. (baseten.co). The Truss README and docs list first‑class support for multiple serving backends and GPU configuration and name frameworks like vLLM and TensorRT‑LLM plus example deployments for Llama 3 (70B), signaling an operator focus on heterogeneous runtimes and large models. (github.com / baseten; baseten.co). Baseten published a billing usage API on March 2, 2026 that returns a 31‑day usage_summary endpoint for programmatic cost breakdowns across Dedicated Inference, which directly surfaces per‑deployment spend for operators. (baseten.co). The mid‑March Post‑Training Research Scientist listing specifies a PhD or equivalent, expects published first‑author work, and requires running experiments at scale (multi‑node, 1T+ parameter models) to translate post‑training methods into production systems. (jobgether.com) (goremotejob.com). Baseten’s head of marketing framed the recent cryptic OOH push as intentionally exclusive — “if‑you‑know‑you‑know” — a play aimed at recruiting and signaling to ML operators rather than mainstream buyers. (npr.org).