Management and Governance

AWS CloudTrail

What it is:
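CloudTrail is a service that records API calls and account activity in your AWS account, including who made the call, which services and resources were affected, and when. Management events are recorded by default; data events (such as S3 object-level operations) are recorded only when you enable them.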

Why it matters:

  • Provides an audit trail of changes and activity in your account
  • Helps you detect suspicious behavior or unauthorized access
  • Useful for compliance reporting and forensic analysis

Typical Use Cases:

  • Investigating security incidents (e.g., who deleted a resource?)
  • Monitoring access to sensitive services (e.g., S3, IAM, SageMaker)
  • Setting up alarms on critical changes (by sending CloudTrail events to CloudWatch Logs or EventBridge)
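
The "who deleted a resource?" investigation can be sketched with the CloudTrail LookupEvents API. This is a minimal sketch, not a definitive implementation: the helper name find_callers and its hours parameter are made up for illustration, the cloudtrail argument is assumed to be a boto3 CloudTrail client, and only the first page of results (up to 50 events) is read.

```python
import json
from datetime import datetime, timedelta, timezone


def find_callers(cloudtrail, event_name, hours=24):
    """Return (time, user, source IP) for recent events with the given name.

    `cloudtrail` is assumed to be a boto3 client: boto3.client("cloudtrail").
    `find_callers` itself is a hypothetical helper, not part of any AWS SDK.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    response = cloudtrail.lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ],
        StartTime=start,
        EndTime=end,
        MaxResults=50,  # first page only; follow NextToken for more
    )
    rows = []
    for event in response.get("Events", []):
        # CloudTrailEvent carries the full record as a JSON string.
        detail = json.loads(event.get("CloudTrailEvent", "{}"))
        rows.append(
            (
                event.get("EventTime"),
                event.get("Username", "unknown"),
                detail.get("sourceIPAddress", "unknown"),
            )
        )
    return rows


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   for row in find_callers(boto3.client("cloudtrail"), "DeleteBucket"):
#       print(row)
```

LookupEvents searches management events from the last 90 days, so older or data-level activity would need a trail delivering logs to S3 or an event data store.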

Amazon CloudWatch

What it is:
CloudWatch is AWS’s central monitoring service for metrics, logs, and alarms. It collects and tracks data from AWS services and custom sources.

Why it matters:

  • Helps you visualize performance (CPU, latency, and, with the CloudWatch agent installed, memory)
  • Allows you to set alarms and get notified when something goes wrong
  • Enables automated actions (e.g., rebooting or recovering an EC2 instance)

Typical Use Cases:

  • Monitoring model performance or resource usage in SageMaker
  • Setting alerts on Lambda failures or high error rates
  • Creating dashboards for your application’s health
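
As a rough illustration of alerting on Lambda failures, the sketch below builds the arguments for the CloudWatch PutMetricAlarm API. The helper name lambda_error_alarm, the alarm naming scheme, the 5-minute period, and the default threshold are all assumptions chosen for the example; AWS/Lambda, Errors, and FunctionName are the standard namespace, metric, and dimension for Lambda.

```python
def lambda_error_alarm(function_name, sns_topic_arn, threshold=1):
    """Build put_metric_alarm arguments for errors on one Lambda function.

    Hypothetical helper: the name, period, and threshold are illustrative.
    """
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,  # seconds per evaluation window
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No invocations means no data points; do not treat that as a failure.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],  # notify this SNS topic on ALARM
    }


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   params = lambda_error_alarm("my-function", "arn:aws:sns:...:my-topic")
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameters in a plain dictionary makes the alarm definition easy to review or unit test before anything is created in the account.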

AWS Config

What it is:
AWS Config is a resource compliance and configuration tracking service. It monitors changes to AWS resources and evaluates them against predefined rules.

Why it matters:

  • Provides a timeline of resource changes
  • Flags resources that do not adhere to your security and compliance policies
  • Supports automatic remediation of non-compliant resources

Typical Use Cases:

  • Checking if S3 buckets are publicly accessible
  • Tracking IAM policy changes
  • Auditing the history of ML model versions or endpoints
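
A check such as "which S3 buckets are publicly readable?" could be answered by asking AWS Config which resources fail a rule. The sketch below is one possible approach: noncompliant_resources is a hypothetical helper, config is assumed to be a boto3 Config client, and the rule name in the usage comment assumes a rule based on the managed rule S3_BUCKET_PUBLIC_READ_PROHIBITED has been deployed under that name.

```python
def noncompliant_resources(config, rule_name):
    """List (resource type, resource id) pairs that fail the given Config rule.

    `config` is assumed to be a boto3 client: boto3.client("config").
    """
    found = []
    kwargs = {"ConfigRuleName": rule_name, "ComplianceTypes": ["NON_COMPLIANT"]}
    while True:
        page = config.get_compliance_details_by_config_rule(**kwargs)
        for result in page.get("EvaluationResults", []):
            qualifier = result["EvaluationResultIdentifier"][
                "EvaluationResultQualifier"
            ]
            found.append((qualifier["ResourceType"], qualifier["ResourceId"]))
        token = page.get("NextToken")
        if not token:
            return found
        kwargs["NextToken"] = token  # fetch the next page of results


# Usage, assuming boto3 is installed and the rule exists under this name:
#   import boto3
#   noncompliant_resources(
#       boto3.client("config"), "s3-bucket-public-read-prohibited"
#   )
```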

AWS Trusted Advisor

What it is:
Trusted Advisor is a service that scans your AWS environment and gives recommendations across cost optimization, performance, security, fault tolerance, and service limits. The full set of checks requires a Business support plan or higher.

Why it matters:

  • Highlights security risks (e.g., security groups with unrestricted ports, a weak IAM password policy)
  • Identifies unused resources to reduce cost
  • Suggests best-practice improvements

Typical Use Cases:

  • Checking for over-provisioned EC2/SageMaker instances
  • Checking that MFA is enabled on the root user
  • Finding unused EBS volumes or idle load balancers
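
Trusted Advisor results can also be read programmatically through the AWS Support API, which is only available on a Business support plan or higher. The sketch below counts flagged resources per check in one category. The helper flagged_by_category is hypothetical, support is assumed to be a boto3 Support client (typically created in us-east-1), and the category string in the usage comment is an assumption about how the API labels cost checks.

```python
def flagged_by_category(support, category):
    """Map check name -> number of flagged resources for one category.

    `support` is assumed to be a boto3 client:
    boto3.client("support", region_name="us-east-1").
    """
    checks = support.describe_trusted_advisor_checks(language="en")["checks"]
    summary = {}
    for check in checks:
        if check["category"] != category:
            continue
        result = support.describe_trusted_advisor_check_result(
            checkId=check["id"], language="en"
        )["result"]
        summary[check["name"]] = len(result.get("flaggedResources", []))
    return summary


# Usage, assuming boto3, credentials, and an eligible support plan;
# "cost_optimizing" is an assumed category value:
#   import boto3
#   support = boto3.client("support", region_name="us-east-1")
#   print(flagged_by_category(support, "cost_optimizing"))
```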

AWS Well-Architected Tool

What it is:
The AWS Well-Architected Tool is a self-assessment tool that helps you review and improve your architecture against the AWS Well-Architected Framework, which has six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability.

Why it matters:

  • Provides a structured review of your architecture
  • Helps you identify risks and improvement areas
  • Guides you in building resilient and efficient applications

Typical Use Cases:

  • Assessing your ML/AI solution before production
  • Aligning your architecture with AWS best practices
  • Comparing designs across multiple workloads or teams
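
To compare workloads that have already been reviewed, one option is to list them through the Well-Architected Tool API and rank them by the number of high-risk issues. This is a sketch under stated assumptions: rank_by_high_risk is a hypothetical helper, wellarchitected is assumed to be a boto3 Well-Architected Tool client, and workloads without recorded risk counts are treated as having zero high risks.

```python
def rank_by_high_risk(wellarchitected):
    """Return (workload name, high-risk count) pairs, riskiest first.

    `wellarchitected` is assumed to be boto3.client("wellarchitected").
    """
    rows = []
    kwargs = {}
    while True:
        page = wellarchitected.list_workloads(**kwargs)
        for summary in page.get("WorkloadSummaries", []):
            # RiskCounts maps risk levels such as "HIGH" to issue counts.
            high = summary.get("RiskCounts", {}).get("HIGH", 0)
            rows.append((summary["WorkloadName"], high))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token  # fetch the next page of workloads
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   for name, high in rank_by_high_risk(boto3.client("wellarchitected")):
#       print(name, high)
```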
