As cloud environments grow in complexity, managing costs becomes increasingly challenging. One often-overlooked driver of rising AWS costs is the accumulation of detached Elastic Block Store (EBS) volumes that are no longer in use. These orphaned volumes are still billed for their provisioned storage, so they can silently inflate your storage expenses over time.
In this post, I explain how I developed an automated cleanup process for unused EBS volumes, significantly reducing costs with a serverless architecture designed for efficiency and safety. An EventBridge-triggered Lambda function identifies old, unused volumes, evaluates them, and safely deletes them, while a tag-based opt-out preserves critical resources.
Introducing the Storage Cleaner Service
To address this challenge, I designed and deployed a serverless architecture focused on cleaning up old, detached EBS volumes in non-production accounts. The architecture was meticulously planned to ensure that resources deleted by the storage-cleaner service could be recovered if necessary, preventing accidental data loss.
Key Features of the Storage Cleaner Service:
- Recovery Assurance: If a resource is deleted by mistake, it can be recovered, minimising the risk of data loss.
- Tag-Based Opt-Out: Workload teams can opt out specific resources from deletion by applying a predefined tag. This ensures that critical resources are preserved.
- Default Deletion Period: Resources that are 30 days or older, and not tagged for preservation, are considered for deletion by default.
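To make the opt-out and age rules concrete, here is a minimal sketch of how they could be combined, not the service's actual code. It assumes volumes shaped like the entries returned by EC2's `describe_volumes` call (`CreateTime`, `Tags`). The tag key `storage-cleaner:opt-out` is a hypothetical placeholder, and measuring age from the volume's creation time is also an assumption.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical opt-out tag key; the real predefined tag is not named here.
OPT_OUT_TAG_KEY = "storage-cleaner:opt-out"
# Default deletion period: resources 30 days or older are considered.
RETENTION = timedelta(days=30)


def is_opted_out(volume: dict) -> bool:
    """True if the volume carries the (assumed) opt-out tag."""
    tags = volume.get("Tags") or []
    return any(tag.get("Key") == OPT_OUT_TAG_KEY for tag in tags)


def is_cleanup_candidate(volume: dict, now: datetime) -> bool:
    """Apply the opt-out tag first, then the 30-day default period."""
    if is_opted_out(volume):
        return False
    # Assumption: "age" is measured from the volume's creation time.
    return now - volume["CreateTime"] >= RETENTION


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = {"VolumeId": "vol-example", "CreateTime": now - timedelta(days=45)}
    print(is_cleanup_candidate(sample, now))
```

Checking the tag before the age keeps the opt-out authoritative: a tagged volume is never considered, however old it is.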
Serverless Architecture Overview: Detached EBS Volume Cleaner
Once the architecture was deployed, the process of cleaning up detached EBS volumes was automated and scheduled to run daily. Here’s a step-by-step explanation of how the Detached EBS Volume Cleaner operates:
- EventBridge Trigger: The cleanup process is initiated by an EventBridge rule scheduled to trigger a Lambda function at 6 PM daily.
- Cross-Account Role Assumption: The Lambda function assumes a cross-account role in the workload account to gain access to the necessary resources.
- Fetching Detached EBS Volumes: The function fetches the list of all detached EBS volumes within the workload account.
- Eligibility Check for Deletion:
  - The function checks whether the storage cleaner service has taken a snapshot of the volume within the last 30 hours.
  - If such a snapshot exists, the volume is deemed safe for deletion and is deleted.
  - If no such snapshot exists, one is created, and the volume is kept until the next scheduled run, when it becomes eligible for deletion.
This approach ensures that volumes are not deleted until a recent snapshot is available, providing an extra layer of safety before resource removal.
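The steps above can be sketched in Python with boto3. Treat this as an illustration under stated assumptions rather than the service's real implementation: the function names, the role session name, and the `created-by: storage-cleaner` tag used to recognise the cleaner's own snapshots are all hypothetical. Requiring the snapshot to be in the `completed` state is an extra precaution I added for the sketch.

```python
from datetime import datetime, timedelta, timezone

SNAPSHOT_WINDOW = timedelta(hours=30)  # the 30-hour window described above
# Hypothetical tag marking snapshots taken by the cleaner itself.
CLEANER_TAG = {"Key": "created-by", "Value": "storage-cleaner"}


def workload_ec2_client(role_arn: str, region: str):
    """Assume the cross-account role and return an EC2 client for it."""
    import boto3  # imported lazily so the helpers below work with any client

    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn, RoleSessionName="storage-cleaner"
    )["Credentials"]
    return boto3.client(
        "ec2",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )


def list_detached_volumes(ec2) -> list:
    """Return every volume in the 'available' (detached) state."""
    paginator = ec2.get_paginator("describe_volumes")
    pages = paginator.paginate(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [volume for page in pages for volume in page["Volumes"]]


def has_recent_cleaner_snapshot(ec2, volume_id: str, now: datetime) -> bool:
    """True if a cleaner-tagged snapshot of the volume is recent enough."""
    paginator = ec2.get_paginator("describe_snapshots")
    pages = paginator.paginate(
        OwnerIds=["self"],
        Filters=[
            {"Name": "volume-id", "Values": [volume_id]},
            {"Name": f"tag:{CLEANER_TAG['Key']}",
             "Values": [CLEANER_TAG["Value"]]},
        ],
    )
    return any(
        # Assumption: only completed snapshots count as a safe backup.
        snap["State"] == "completed"
        and now - snap["StartTime"] <= SNAPSHOT_WINDOW
        for page in pages
        for snap in page["Snapshots"]
    )


def process_volume(ec2, volume: dict, now=None) -> str:
    """Delete the volume if a recent snapshot exists, else snapshot it."""
    now = now or datetime.now(timezone.utc)
    volume_id = volume["VolumeId"]
    if has_recent_cleaner_snapshot(ec2, volume_id, now):
        ec2.delete_volume(VolumeId=volume_id)
        return "deleted"
    ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"storage-cleaner backup of {volume_id}",
        TagSpecifications=[
            {"ResourceType": "snapshot", "Tags": [CLEANER_TAG]}
        ],
    )
    return "snapshot_created"
```

A Lambda handler would call `workload_ec2_client` once per workload account, then run `process_volume` over the result of `list_detached_volumes`. Because the job runs daily and the window is 30 hours, a snapshot taken in one run is still recent when the next run arrives, so a volume is normally deleted on its second pass.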
Logging and Monitoring
To ensure transparency and traceability, detailed logs are maintained in CloudWatch. These logs capture essential information such as the snapshot or EBS volume ID, associated tags, time of deletion, volume size, and more. This logging mechanism helps in auditing and provides insights into the operation of the storage-cleaner service.
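One way to produce such entries is to emit a single JSON object per action, since output from a Lambda function's logger ends up in CloudWatch Logs. The sketch below is an assumption about the shape of a log record, not the service's actual format: the field names and the `storage-cleaner` logger name are hypothetical.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("storage-cleaner")
logger.setLevel(logging.INFO)


def build_log_record(action, resource_id, size_gib, tags, when=None) -> dict:
    """Assemble one audit record; field names are illustrative only."""
    when = when or datetime.now(timezone.utc)
    return {
        "action": action,            # e.g. "volume_deleted"
        "resource_id": resource_id,  # snapshot or EBS volume ID
        "size_gib": size_gib,
        # Flatten EC2-style [{"Key": ..., "Value": ...}] tags into a dict.
        "tags": {tag["Key"]: tag["Value"] for tag in tags or []},
        "timestamp": when.isoformat(),
    }


def log_action(action, resource_id, size_gib, tags, when=None) -> None:
    """Write the record as one JSON line so it is easy to query later."""
    record = build_log_record(action, resource_id, size_gib, tags, when)
    logger.info(json.dumps(record))
```

Keeping each record on one line as JSON makes it straightforward to filter by volume ID or action when auditing a deletion.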
Conclusion
The Storage Cleaner Service for EBS volumes, which I developed, is a practical tool for cutting unnecessary cloud costs. By automating the identification and deletion of detached EBS volumes, this serverless solution helps keep an AWS environment lean and cost-efficient. The combination of EventBridge, Lambda, and tag-based opt-out allows for a safe and flexible cleanup process that protects critical resources while eliminating waste. As AWS environments continue to grow in complexity, implementing automated solutions like this is key to maintaining control over costs and optimising resource utilisation.