Kong Kong - 11 months ago 80
Python Question

S3 Key Not Present Immediately After Listing

I'm using Python and boto3 to work with S3.

I'm listing an S3 bucket and filtering by a prefix:

bucket = s3.Bucket(config.S3_BUCKET)
for s3_object in bucket.objects.filter(Prefix="0000-00-00/", Delimiter="/"):
    print(s3_object)

This gives me an iterable of S3 objects.

If I print the object I see:

s3.ObjectSummary(bucket_name='validation', key=u'0000-00-00/1463665359.Vfc01I205aeM627249')

When I go to get the body though I get an exception:

content = s3_object.get()["Body"].read()

botocore.exceptions.ClientError: An error occurred (NoSuchKey) when
calling the GetObject operation: The specified key does not exist.

So boto just gave me the key, but then it says it doesn't exist?

This doesn't happen for all keys. Just some. If I search for the invalid key in the AWS console it doesn't find it.

Answer Source

You are most likely using the 'standard' endpoint (s3.amazonaws.com). Everything below applies primarily to that endpoint, not to the regional endpoints. S3 operations are atomic, but they are only eventually consistent. The documentation gives several examples, including this:

A process writes a new object to Amazon S3 and immediately lists keys within its bucket. Until the change is fully propagated, the object might not appear in the list.

Delays of many hours have occasionally been seen, but my anecdata agrees with the claim that well over 99% of the data is available within 2 seconds.

You can enable read-after-write consistency, which "fixes" this, by changing your endpoint from s3.amazonaws.com to s3-external-1.amazonaws.com:

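# endpoint_url must include the scheme, otherwise botocore rejects it as an invalid endpoint
s3client = boto3.client('s3', endpoint_url='https://s3-external-1.amazonaws.com')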
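
If changing the endpoint is not an option, another approach is to tolerate the inconsistency in the loop itself. The following is only a minimal sketch, not something from the boto3 documentation: read_body and error_code are hypothetical helper names, and the attempts and delay values are arbitrary. It assumes the object passed in behaves like the s3.ObjectSummary from the question, and that the exception carries the error code in a response dict the way botocore's ClientError does. A key that is still listed after being deleted will never appear, so the helper gives up and returns None rather than retrying forever.

```python
import time


def error_code(exc):
    """Return the S3 error code carried by a botocore ClientError, else None."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def read_body(s3_object, attempts=3, delay=1.0, sleep=time.sleep):
    """Return the object's bytes, or None if the key is still missing.

    `attempts` and `delay` are illustrative values, not AWS recommendations.
    """
    for attempt in range(attempts):
        try:
            return s3_object.get()["Body"].read()
        except Exception as exc:
            if error_code(exc) != "NoSuchKey":
                raise  # any other failure is re-raised unchanged
            if attempt < attempts - 1:
                sleep(delay)  # wait before asking S3 again
    return None


# Usage with the loop from the question (needs boto3 and a real bucket):
#
# for s3_object in bucket.objects.filter(Prefix="0000-00-00/", Delimiter="/"):
#     content = read_body(s3_object)
#     if content is None:
#         continue  # listed but not readable: skip it, or log the key
```

Returning None instead of raising keeps the listing loop going when a single stale key shows up. Whether to skip such keys or treat them as an error depends on your application.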