Overview

Transform your AI/ML development process with the Amazon SageMaker HyperPod CLI and SDK. These tools handle the complexities of infrastructure management so that you can focus on model development and innovation. Whether you are scaling PyTorch training jobs across thousands of GPUs, deploying production-grade inference endpoints, or managing multiple clusters efficiently, the intuitive command-line interface and programmatic control enable you to:

  • Accelerate development cycles and reduce operational overhead

  • Automate ML workflows while maintaining operational visibility

  • Optimize computing resources across your AI/ML projects

Note

Version Info - You’re viewing the latest documentation for the SageMaker HyperPod CLI and SDK v3.0.0.

What’s New

🚀 We are excited to announce the general availability of the Amazon SageMaker HyperPod CLI and SDK!

Major Updates:

  • Distributed Training: Scale PyTorch jobs across multiple nodes and GPUs with simplified management and automatic fault tolerance.

  • Model Inference: Deploy pre-trained models from SageMaker JumpStart and host custom auto-scaling inference endpoints.

  • Observability: Connect to and manage multiple HyperPod clusters with enhanced monitoring capabilities.

  • Usability Improvements: Intuitive CLI for quick experimentation and cluster management, granular SDK control over workload configurations, and easy access to system logs and observability dashboards for efficient debugging.

Quick Start

Installation

New to HyperPod? Install the CLI/SDK in minutes.

Get Started
Getting Started

Ready to explore? Connect to your cluster before running ML workflows.

Getting Started
Training

Scale your ML models! Get started with training.

Training with SageMaker HyperPod
Inference

Deploy your ML model! Get started with inference.

Inference with SageMaker HyperPod

Advanced Resources

API reference

Explore APIs - Check out the API documentation

sdk/sdk_index.html
GitHub

Example Notebooks - Ready-to-use implementation guides

End-to-End Example and Notebooks
AWS SageMaker HyperPod Docs

HyperPod Documentation - Learn more about HyperPod

https://docs.aws.amazon.com/sagemaker/latest/dg/hyperpod.html
HyperPod Developer Guide

Developer Guide - Refer to this practical development guide

https://catalog.workshops.aws/sagemaker-hyperpod-eks/en-US
SageMaker HyperPod Workshop

Practical Guide - Refer to the workshop for detailed step-by-step instructions

https://catalog.workshops.aws/sagemaker-hyperpod-eks/en-US