Google's Vertex AI Improves PyTorch Orchestration

Google's Vertex AI Pipelines now offers full support for orchestrating PyTorch-based machine learning workflows. The platform enables serverless automation, real-time monitoring, and scalable deployment for PyTorch models. The update also includes seamless integration with Google's Gemini models, which are available in various sizes for use in end-to-end applications.

- Vertex AI Pipelines uses the open-source Kubeflow Pipelines (KFP) v2 SDK, allowing ML engineers to define and orchestrate workflows in Python. Execution is serverless, which removes the need to manage underlying Kubernetes clusters, a key difference from running open-source Kubeflow.
- The platform automatically tracks pipeline executions, artifacts, and lineage in Vertex ML Metadata. This supports reproducible experiments and analysis of how models were created, both core principles of MLOps.
- For distributed training of PyTorch models, Vertex AI can use Google's custom accelerators such as Cloud TPUs via the PyTorch/XLA library, which connects PyTorch to XLA, an open-source compiler for linear algebra. At its Cloud Next '24 conference, Google announced AI Hypercomputer, a new supercomputing architecture that includes the Cloud TPU v5p to further optimize large-scale training.
- While Vertex AI Pipelines is a managed service, it is based on the open-source Kubeflow project. The trade-off for the convenience of a serverless platform is less low-level control than managing a Kubeflow instance directly on Kubernetes.
- Pricing for Vertex AI Pipelines starts at a fee of $0.03 per pipeline run. This fee is separate from the charges for the underlying Google Cloud resources consumed during the pipeline's execution, such as Compute Engine instances for training or other services called by the pipeline.
- The integration supports tools from the broader PyTorch ecosystem, including an adapter for TorchX to orchestrate components and compatibility with PyTorch Lightning, a lightweight wrapper that organizes PyTorch code and simplifies training on different hardware.
- PyTorch models can be deployed on Vertex AI using pre-built Docker container images or custom containers built with TorchServe, an open-source model serving framework for PyTorch.
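
The pricing split described above (a flat per-run fee plus separately billed resources) can be sketched as a rough monthly estimate. This is a minimal illustration, not an official calculator: the function name `estimate_monthly_cost`, the run count, and the per-run resource cost are hypothetical placeholders, and only the $0.03 fee comes from the briefing.

```python
from decimal import Decimal

# Per-run orchestration fee quoted in the briefing.
PIPELINE_RUN_FEE = Decimal("0.03")


def estimate_monthly_cost(runs_per_month, resource_cost_per_run):
    """Return (orchestration_fee, resource_cost, total) in USD for one month.

    resource_cost_per_run is a hypothetical figure supplied by the caller
    (for example, training VM time per run); it is not a published price.
    """
    if runs_per_month < 0:
        raise ValueError("runs_per_month must be non-negative")
    per_run = Decimal(str(resource_cost_per_run))
    if per_run < 0:
        raise ValueError("resource_cost_per_run must be non-negative")
    orchestration = PIPELINE_RUN_FEE * runs_per_month
    resources = per_run * runs_per_month
    return orchestration, resources, orchestration + resources


# Assumed workload: 200 runs a month, each consuming $1.50 of compute.
fee, resources, total = estimate_monthly_cost(200, "1.50")
print(f"orchestration: ${fee}, resources: ${resources}, total: ${total}")
```

Under these assumed numbers the orchestration fee is a small fraction of the bill, which is why the separately charged resources usually dominate any realistic estimate.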
