Recently I worked on a Python script to monitor and delete objects inside an S3 bucket.
I published an excerpt on GitHub: python-projects/delete-s3-objects
The code in the repo mostly consists of the
cleanup() function. The function uses
boto3 to connect to AWS, pull a list of all the objects contained in a specific bucket, and then delete all the objects older than a given threshold in days.
I have included a few examples of creating a
boto3.client, which is what the function expects as its first argument. The other arguments are used to build the path to the directory inside the S3 bucket where the files are located. In AWS terms, this path is called a Prefix.
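To make those arguments a little more concrete, here is a minimal sketch. The helper names build_prefix and make_client, their parameters, the region and the dummy credentials are all assumptions made for this example; the code in the repo may build its client and prefix differently.

```python
def build_prefix(root_prefix, sub_prefix):
    """Join two path components into an S3 prefix that ends with '/'.

    Hypothetical helper: the real cleanup() may assemble its prefix differently.
    """
    parts = [p.strip("/") for p in (root_prefix, sub_prefix) if p and p.strip("/")]
    return "/".join(parts) + "/" if parts else ""


def make_client(region_name="us-east-1"):
    """Create an S3 client to pass as the first argument (assumed usage)."""
    # Imported lazily so build_prefix() can be used without boto3 installed.
    import boto3

    # Dummy credentials (assumption): fine against a mocked S3, useless against real AWS.
    return boto3.client(
        "s3",
        region_name=region_name,
        aws_access_key_id="testing",
        aws_secret_access_key="testing",
    )


prefix = build_prefix("mock-root-prefix", "mock-sub-prefix")
print(prefix)  # mock-root-prefix/mock-sub-prefix/
```

Keeping the trailing slash on the prefix means that a listing for one "directory" does not also match sibling keys that merely start with the same characters.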
As the number of objects in the bucket can be larger than 1000, which is the limit for a single request to
GET Bucket (List Objects) v2, I used a paginator to pull the entire list. Object removal follows the same principle and processes batches of 1000 objects.
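The listing and removal steps can be sketched roughly as follows. This is not the code from the repo: the function names list_objects and delete_in_batches are made up for illustration, and s3_client is assumed to be a boto3 S3 client (or anything exposing the same get_paginator and delete_objects calls).

```python
def list_objects(s3_client, bucket, prefix):
    """Return (key, last_modified) for every object under prefix, across all pages."""
    paginator = s3_client.get_paginator("list_objects_v2")
    found = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            found.append((obj["Key"], obj["LastModified"]))
    return found


def delete_in_batches(s3_client, bucket, keys, batch_size=1000):
    """Delete keys in chunks of at most batch_size and return how many were deleted."""
    deleted = 0
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        response = s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch]},
        )
        # Successfully removed keys are reported under "Deleted".
        deleted += len(response.get("Deleted", []))
    return deleted
```

The paginator hides the continuation-token handling, so the loop sees every page regardless of how many objects the prefix holds. A real script would probably also inspect the "Errors" entry of each delete_objects response.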
Now this was all good fun, but the really interesting part was writing a proper unit test.
After some searching I found moto, the “Mock AWS Services” library. It is brilliant!
With this library, the test mocks access to the S3 bucket and creates several objects in it. You can leave the dummy AWS credentials in the script, as they won’t be needed.
At this point I wanted to create multiple objects with different timestamps in the mocked S3 environment, but unfortunately I discovered that this is not possible. Once an object is created in S3, its creation-date metadata cannot be easily altered (see here for reference).
Cue another awesome library called freezegun. The test uses freeze_time to mock the date/time and create S3 objects with different timestamps, so that we can safely experiment with the logic of the
cleanup() function (‘leave objects older than n days, delete everything else within the prefix’).
$ python test_script.py
mock-root-prefix/mock-sub-prefix/test_object_01 2019-08-29 00:00:00+00:00
mock-root-prefix/mock-sub-prefix/test_object_02 2019-08-28 00:00:00+00:00
mock-root-prefix/mock-sub-prefix/test_object_03 2019-08-27 00:00:00+00:00
mock-root-prefix/mock-sub-prefix/test_object_04 2019-08-26 00:00:00+00:00
mock-root-prefix/mock-sub-prefix/test_object_05 2019-08-25 00:00:00+00:00
mock-root-prefix/mock-sub-prefix/test_object_06 2019-08-24 00:00:00+00:00
<class 'botocore.client.S3'>
Cleanup S3 backups
Working in the bucket: my-mock-bucket
The prefix is: mock-root-prefix/mock-sub-prefix/
The threshold (n. days) is: 4
Total number of files in the bucket: 7
Number of files to be deleted: 3
Deleting the files from the bucket ...
Deleted: 3
Left to delete: 0
.
----------------------------------------------------------------------
Ran 1 test in 0.798s

OK
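For readers who want to try the moto plus freezegun combination themselves, a minimal sketch of the setup could look like this. It is not the test from the repo: the function name create_dated_objects and its parameters are invented for the example, and the import fallback assumes that newer moto releases expose mock_aws while older ones expose mock_s3.

```python
def create_dated_objects(bucket, prefix, dates):
    """Create one object per date in a mocked S3 and return {key: LastModified}.

    Sketch only: the bucket, prefix and key names are chosen by the caller.
    """
    # Third-party imports are local so this helper can live in a module
    # that stays importable without the test dependencies installed.
    import boto3
    from freezegun import freeze_time
    try:
        from moto import mock_aws  # newer moto releases (assumption: 5.x and later)
    except ImportError:
        from moto import mock_s3 as mock_aws  # older moto releases

    with mock_aws():
        client = boto3.client(
            "s3",
            region_name="us-east-1",
            aws_access_key_id="testing",  # dummy credentials, never sent to AWS
            aws_secret_access_key="testing",
        )
        client.create_bucket(Bucket=bucket)
        for index, day in enumerate(dates, start=1):
            # Freeze the clock while the object is written, so the mocked
            # object is stamped with the frozen date instead of "now".
            with freeze_time(day):
                client.put_object(
                    Bucket=bucket,
                    Key="%stest_object_%02d" % (prefix, index),
                    Body=b"dummy",
                )
        listing = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
        return {obj["Key"]: obj["LastModified"] for obj in listing.get("Contents", [])}
```

Calling it with the bucket and prefix from the output above and a list of dates such as ["2019-08-29", "2019-08-28"] should give back keys with matching timestamps, which a cleanup function can then be exercised against inside the same mocked context.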