The updated AWS Well-Architected Machine Learning Lens is now available, featuring the latest capabilities and best practices for developing machine learning (ML) workloads on AWS.
The AWS Well-Architected Framework offers architectural best practices for designing and operating cloud workloads that are reliable, secure, efficient, cost-effective, and sustainable. The Machine Learning Lens applies this framework to guide users through a thorough review of their ML architectures.
This updated lens offers a standardized method for evaluating various ML workloads, from traditional supervised and unsupervised learning to advanced AI applications. It covers key aspects of the entire ML lifecycle, including defining business goals, framing problems, processing data, developing models, deploying them, and monitoring them continuously. The lens integrates AWS ML services and capabilities introduced since 2023, ensuring access to current best practices and implementation advice.
The Machine Learning Lens is one of several Well-Architected lenses available within the AWS Well-Architected Lenses collection.
What is the Machine Learning Lens?
The Well-Architected Machine Learning Lens integrates the six pillars of the Well-Architected Framework with the six phases of the ML lifecycle.

The six phases are:
- Business goal identification: Defining clear business objectives and success metrics for an ML initiative.
- ML problem framing: Converting business challenges into clearly defined ML problems with suitable metrics.
- Data processing: Gathering, preparing, and engineering features from data sources.
- Model development: Constructing, training, tuning, and evaluating ML models, along with tracking experiments.
- Model deployment: Implementing models into production environments, complete with necessary infrastructure and monitoring.
- Model monitoring: Continuously tracking model performance and ensuring model quality over time.
Achieving a functional prototype based on these six ML lifecycle phases typically requires an iterative approach, rather than a traditional waterfall method. The lens offers a collection of established, cloud-agnostic best practices, structured around the Well-Architected Framework pillars for each ML lifecycle phase.
The Well-Architected Machine Learning Lens can be utilized at any stage of a cloud adoption journey. Its guidance can be applied during the initial design of ML workloads or as part of a continuous improvement process for workloads already in production.
Machine Learning Lens Components
The lens is structured around four main focus areas:
- Well-Architected ML design principles: Ten core design principles that underpin the best practices, such as assigning ownership, ensuring reproducibility, optimizing resources, and fostering continuous improvement.
- The ML lifecycle and the Well-Architected Framework pillars: This section examines all facets of the ML lifecycle and reviews design strategies that align with the pillars of the broader Well-Architected Framework:
  - Operational excellence: The capacity to support ongoing development, efficiently operate ML workloads, gain operational insights, and continuously refine processes.
  - Security: The capability to safeguard data, models, and ML infrastructure, leveraging cloud technologies to enhance security.
  - Reliability: The ability of ML workloads to function correctly and consistently as intended, including automatic recovery from failures.
  - Performance efficiency: The ability to use computing resources efficiently for ML workloads and to maintain that efficiency as requirements and technologies evolve.
  - Cost optimization: The means to run ML systems to provide business value at the lowest possible cost through resource optimization and automation.
  - Sustainability: Addresses the environmental footprint of ML workloads, emphasizing energy consumption and resource efficiency.
- Cloud-agnostic best practices: More than 100 detailed best practices that span each ML lifecycle phase and align with the Well-Architected Framework pillars. Each best practice provides:
  - Implementation guidance: Detailed AWS implementation strategies, with references to current AWS ML services and capabilities.
  - Resources: Handpicked links to AWS documentation, blog posts, videos, and code examples that support the best practices.
- Related ML architecture considerations: Discussions on advanced subjects like MLOps patterns, data architecture for ML, strategies for model governance, and considerations for implementing responsible AI.
Additional Topics in the Machine Learning Lens
The Machine Learning Lens also covers these important topics:
- Responsible AI: Detailed guidance for developing fair, explainable, and unbiased ML systems across the entire development lifecycle.
- MLOps and automation: Best practices for establishing continuous integration, continuous deployment, and continuous training for ML workloads (a minimal pipeline sketch follows this list).
- Data architecture for ML: Advice on constructing strong data pipelines, feature stores, and data governance practices to support scalable ML workloads.
- Model governance and lineage: Approaches for tracking model versions, maintaining audit trails, and ensuring adherence to regulatory standards.
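To make the MLOps and automation guidance more concrete, here is a minimal sketch of a continuous-training workflow defined with Amazon SageMaker Pipelines. It is illustrative only: the role ARN, S3 URIs, and names are placeholder assumptions, and the exact step arguments can vary by SageMaker Python SDK version.

```python
# Minimal sketch (assumptions: SageMaker Python SDK v2 is installed; the role
# ARN, S3 URIs, and names below are placeholders for your own resources).
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# A pipeline parameter lets the same definition be re-run on new data (continuous training).
train_data = ParameterString(
    name="TrainDataS3Uri",
    default_value="s3://example-bucket/train/",  # placeholder
)

estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1"),
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/models/",  # placeholder
    sagemaker_session=session,
)

train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,
    inputs={"train": TrainingInput(s3_data=train_data, content_type="text/csv")},
)

pipeline = Pipeline(name="example-ml-pipeline", parameters=[train_data], steps=[train_step])
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
pipeline.start()                # trigger a run, for example from CI/CD or a schedule
```

In practice, a production pipeline would typically also include processing, evaluation, and model-registration steps, with the start call wired into a CI/CD system or an event-based trigger.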
New Features in the Updated Machine Learning Lens
The updated Machine Learning Lens incorporates the latest AWS ML capabilities and best practices introduced since 2023. Key updates include:
- Enhanced data and AI collaborative workflows: Integrated development is supported through Amazon SageMaker Unified Studio – MLOPS02-BP01, MLOPS01-BP01, MLOPS03-BP01, and MLOPS02-BP04.
- AI-assisted development lifecycle: Code generation and productivity are improved using Kiro and Amazon Q Developer – MLCOST01-BP02, MLOPS01-BP01, MLCOST03-BP02, and MLSUS05-BP02.
- Distributed training infrastructure: Large-scale foundation model development and fine-tuning are facilitated with Amazon SageMaker HyperPod – MLCOST04-BP02, MLCOST04-BP07, MLPERF06-BP05, MLSEC03-BP02, MLCOST04-BP06, MLPERF06-BP07, and MLSUS05-BP02.
- Model customization capabilities: Knowledge distillation and fine-tuning for domain-specific applications are available using Amazon Bedrock with Kiro and Amazon Q Developer integration, along with a model hub through Amazon SageMaker JumpStart – MLCOST01-BP02, MLCOST01-BP01, MLCOST03-BP02, MLSUS04-BP02, MLCOST05-BP01, and MLSUS05-BP02.
- No-code ML development: Natural language support for model building is provided via SageMaker Canvas with Amazon Q Developer integration – MLCOST03-BP02, MLCOST03-BP03, MLOPS01-BP01, and MLSUS05-BP02.
- Improved bias detection: Enhanced fairness metrics are included in SageMaker Clarify, along with Model Monitor for drift detection (see the monitoring sketch after this list) – MLREL02-BP01, MLREL03-BP04, MLREL02-BP04, MLREL02-BP05, and MLREL02-BP02.
- Modular inference architecture: Flexible deployment options are offered with SageMaker Inference Components and Multi-Model Endpoints – MLCOST05-BP01, MLREL01-BP01, MLSUS05-BP01, MLCOST05-BP03, and MLREL01-BP02.
- Advanced observability: Debugging capabilities are enhanced with SageMaker Debugger, Model Monitor, and CloudWatch across the ML lifecycle – MLOPS06-BP02, MLOPS05-BP02, MLOPS06-BP01, and MLOPS02-BP04.
- Enhanced cost optimization: Resource management is improved through SageMaker Training Plans, Savings Plans, and Spot Instance support – MLCOST05-BP03, MLOPS05-BP02, MLCOST06-BP01, MLCOST06-BP02, and MLCOST04-BP06.
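To ground the cost-optimization item above, the following sketch shows how managed Spot Instances might be enabled on a SageMaker training job through the Python SDK. The image, role, and S3 paths are placeholder assumptions; checkpointing is configured so that a job interrupted by a Spot reclamation can resume.

```python
# Minimal sketch (assumptions: SageMaker Python SDK v2; the role ARN, image,
# and S3 paths below are placeholders for your own resources).
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1"),
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/models/",            # placeholder
    use_spot_instances=True,                               # request managed Spot capacity
    max_run=3600,                                          # max training time, in seconds
    max_wait=7200,                                         # total time, including waiting for Spot capacity
    checkpoint_s3_uri="s3://example-bucket/checkpoints/",  # resume after a Spot interruption
    sagemaker_session=session,
)

estimator.fit({"train": "s3://example-bucket/train/"})  # placeholder training data
```

For managed Spot jobs, SageMaker reports billable versus total training seconds, which shows the savings realized from Spot capacity.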
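Similarly, for the drift-detection capabilities called out above, this sketch outlines how a data-quality baseline and an hourly monitoring schedule might be attached to an existing endpoint with SageMaker Model Monitor. The endpoint name, dataset, and S3 locations are placeholder assumptions, and the endpoint is assumed to already have data capture enabled.

```python
# Minimal sketch (assumptions: an endpoint named "example-endpoint" already
# exists with data capture enabled; the role ARN and S3 paths are placeholders).
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

# Profile the training data to derive baseline statistics and constraints.
monitor.suggest_baseline(
    baseline_dataset="s3://example-bucket/train/train.csv",    # placeholder
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://example-bucket/monitoring/baseline/",  # placeholder
    wait=True,
)

# Compare captured endpoint traffic against the baseline every hour to surface drift.
monitor.create_monitoring_schedule(
    monitor_schedule_name="example-data-quality-schedule",
    endpoint_input="example-endpoint",                         # placeholder endpoint name
    output_s3_uri="s3://example-bucket/monitoring/reports/",   # placeholder
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```

Violations found by each monitoring run can then be surfaced through CloudWatch metrics and alarms, connecting the model monitoring phase of the lifecycle back to operational excellence.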
Target Audience for the Machine Learning Lens
The Machine Learning Lens offers value to various roles within an organization. Business leaders can use it to understand the full scope of implementation and the business benefits of ML initiatives. Data scientists and ML engineers can use the lens to learn how to build, deploy, and maintain scalable ML systems. DevOps and platform engineers can learn to establish reliable and secure infrastructure for ML workloads. Risk and compliance leaders can gain insight into responsible ML system implementation and adherence to regulatory and governance standards.
For support with implementing or assessing ML workloads, individuals can contact their AWS Solutions Architect or Account Representative.
The updated Machine Learning Lens benefited from contributions across the AWS Solution Architecture, AWS Professional Services, and Machine Learning communities. These contributions incorporated diverse perspectives, expertise, and experiences to develop comprehensive guidance for ML workloads on AWS.
For further reading, see the AWS Well-Architected Framework, or the AWS Well-Architected Generative AI Lens for guidance specific to generative AI workloads.

