Engineer moves: Mainz joins Google TorchTPU

Adam Mainz, previously at Meta AI working on GPU programming and ML systems, announced he’s joining Google’s TorchTPU team to expand PyTorch support on TPUs. The hire is a concrete sign that big-model vendors are investing in software layers that bridge popular frameworks to custom accelerators.

Most artificial intelligence researchers write models in PyTorch, which is the software layer they touch every day, while the chip underneath might be a graphics processing unit from Nvidia or a Tensor Processing Unit from Google. Google said on April 7, 2026 that TorchTPU is meant to let a developer switch a PyTorch script to “tpu” with minimal code changes. (developers.googleblog.com)

A Tensor Processing Unit is Google’s custom chip for machine learning, and Google says those chips already power training and serving for Gemini and Veo inside its own infrastructure. The catch is that most outside developers learned on PyTorch first, not on Google’s older TensorFlow stack. (developers.googleblog.com)

That mismatch is why “PyTorch on TPU” has existed as a bridge project for years. Google Cloud’s current setup guide still tells users to install a package called PyTorch/XLA on TPU virtual machines before they can run PyTorch jobs on TPU slices. (cloud.google.com) PyTorch/XLA is exactly what it sounds like: a translator between PyTorch code and the XLA compiler stack that can target Google TPUs. The official PyTorch/XLA repository describes itself as “Enabling PyTorch on XLA Devices,” with Google TPU as the main example. (github.com)

Google’s new pitch is that the bridge should feel less like a translation layer and more like native driving. In its April 7 post, the TorchTPU team said its goal is usability first, so existing PyTorch workloads can move over without rewriting core training logic. (developers.googleblog.com)

That is the backdrop for Adam Mainz moving from Meta to Google’s TorchTPU team. Mainz’s recent public work at Meta sat in the part of the stack where software meets hardware, including systems for modeling compute, memory, and network behavior across large training clusters. (engineering.fb.com) When an engineer with that profile joins the team responsible for making PyTorch run smoothly on TPUs, the signal is not about one résumé line. It says Google is spending senior engineering talent on the boring, crucial layer that decides whether a lab can move a model from Nvidia-style workflows onto Google hardware without weeks of porting work. (developers.googleblog.com)
