AWS security controls detailed for production ML
A recent analysis of production-ready machine learning infrastructure explained the importance of a multi-layered security approach on AWS. Key strategies include using a multi-tier subnet architecture with isolated databases, configuring stateless network ACLs alongside stateful security groups, and implementing AWS WAF to mitigate web attacks. The guidance stresses least-privilege access and regular audits for all resources.
- Data in Amazon SageMaker, including notebooks and model artifacts, is automatically encrypted at rest with AWS Key Management Service (KMS) and in transit with SSL/TLS. For more granular control over data protection, customer-managed keys (CMKs) can be used, allowing for customized rotation schedules and access policies.
- To prevent unauthorized network access and potential data breaches, SageMaker training jobs can be configured with network isolation enabled. This setting blocks all outbound network calls from the training container, including calls to the internet and to other AWS services, so the container cannot reach external resources or transfer data to remote hosts.
- When network isolation is active, SageMaker manages data transfer to and from S3 on behalf of the container. If the job also runs inside a VPC without internet access, a gateway VPC endpoint for S3 must be configured in that VPC and associated with the route tables of the job's subnets.
- To safeguard sensitive information such as database credentials or API keys, AWS Secrets Manager should be used instead of storing them directly in notebooks or environment variables. This practice helps prevent accidental exposure and allows secrets to be retrieved programmatically at runtime.
- For regulated workloads or those handling highly sensitive data, AWS Nitro Enclaves provide isolated compute environments to process data in use. This ensures that even users with root access to the parent instance cannot access the data being processed within the enclave.
- All API calls to SageMaker are made over a secure SSL/TLS connection and must be signed using the Signature Version 4 signing process. This process uses the caller's access keys to add authentication information and to prevent requests from being tampered with in transit.
- To monitor and audit activity within a SageMaker environment, AWS CloudTrail should be enabled to log API actions. This creates an audit trail that can be used for security analysis, compliance reporting, and identifying anomalous activity.
- Object versioning in Amazon S3 should be enabled for buckets storing model artifacts. For an additional layer of security, this can be combined with Multi-Factor Authentication (MFA) Delete, which requires MFA to permanently remove an object version.
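
As an illustration of how the training-job controls above fit together, the following is a minimal sketch that only assembles the request parameters for the SageMaker `CreateTrainingJob` API: network isolation, a customer-managed KMS key for output and volume encryption, and a VPC configuration. The helper name `build_isolated_training_job_request`, the instance type, volume size, runtime limit, and every ARN, URI, and ID shown are hypothetical placeholders, not values from the analysis.

```python
from typing import Any, Dict, List


def build_isolated_training_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    output_s3_uri: str,
    kms_key_id: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
) -> Dict[str, Any]:
    """Assemble CreateTrainingJob parameters with the security settings enabled.

    This helper is hypothetical; it only builds a dictionary and makes no AWS calls.
    """
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Encrypt model artifacts written to S3 with the customer-managed key.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri, "KmsKeyId": kms_key_id},
        # Encrypt the attached training volume with the same key.
        # Instance type and volume size are assumed example values.
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
            "VolumeKmsKeyId": kms_key_id,
        },
        # Run the job in private subnets; S3 is then reached through the
        # VPC's S3 gateway endpoint rather than the public internet.
        "VpcConfig": {"Subnets": subnet_ids, "SecurityGroupIds": security_group_ids},
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # Block all outbound network calls from the training container.
        "EnableNetworkIsolation": True,
        # Encrypt traffic between containers in distributed training.
        "EnableInterContainerTrafficEncryption": True,
    }


if __name__ == "__main__":
    # All identifiers below are placeholders for illustration only.
    request = build_isolated_training_job_request(
        job_name="example-isolated-job",
        image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
        role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
        output_s3_uri="s3://example-bucket/output/",
        kms_key_id="arn:aws:kms:us-east-1:123456789012:key/example-key-id",
        subnet_ids=["subnet-0example"],
        security_group_ids=["sg-0example"],
    )
    print(sorted(request))
```

Assuming boto3 is installed and credentials are configured, the resulting dictionary could be passed as `boto3.client("sagemaker").create_training_job(**request)`. Keeping the request construction separate from the API call makes the security-relevant settings easy to review and to check in tests before anything is launched.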