LearningSlowly - 22 days ago
Python Question

Reading data from S3 using Lambda

I have a number of JSON files stored in an S3 bucket on AWS.

I wish to use the AWS Lambda Python service to parse this JSON and send the parsed results to an AWS RDS MySQL database.

I have a stable Python script for doing the parsing and writing to the database. I need the Lambda script to iterate through the JSON files (when they are added).

Each JSON file contains a list, simply consisting of

results = [content]


In pseudo-code what I want is:


  1. Connect to the S3 bucket (jsondata)

  2. Read the contents of the JSON file (results)

  3. Execute my script for this data (results)



I can list the buckets I have by:

import boto3

s3 = boto3.resource('s3')

for bucket in s3.buckets.all():
    print(bucket.name)


Giving:

jsondata


But I cannot access this bucket to read its results.

There doesn't appear to be a read or load function.

I wish for something like

for bucket in s3.buckets.all():
    print(bucket.contents)


EDIT

I was misunderstanding something. Rather than reading the file in place in S3, Lambda must download it itself.

From here it seems that you must give Lambda a download path, from which it can then access the file:

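import uuid

import boto3

s3_client = boto3.client('s3')

def process_results(download_path):
    # Placeholder for the function to be executed
    # (my existing parsing / database-writing script)
    pass

def handler(event, context):
    # Each record describes one object that was added to the bucket
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        # /tmp is the writable directory inside Lambda
        download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
        s3_client.download_file(bucket, key, download_path)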
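
The handler above stops after the download. A minimal sketch of how a single record might be carried through to the parsing step is shown below. The names handle_record, process_results and tmp_dir are hypothetical and only for illustration; process_results stands in for the existing parsing script. The sketch also assumes two things the snippet above does not handle: object keys in S3 event notifications arrive URL-encoded, and a key containing "/" would otherwise point into a subdirectory of /tmp that does not exist.

```python
import json
import os
import uuid
from urllib.parse import unquote_plus


def handle_record(record, s3_client, process_results, tmp_dir='/tmp'):
    """Download one S3 object named in an event record, parse it as JSON,
    and pass the parsed data to process_results (hypothetical callback)."""
    bucket = record['s3']['bucket']['name']
    # Keys in S3 event notifications are URL-encoded, so decode them first
    key = unquote_plus(record['s3']['object']['key'])
    # Use only the final path component so the local path stays flat
    local_name = '{}-{}'.format(uuid.uuid4(), os.path.basename(key))
    download_path = os.path.join(tmp_dir, local_name)

    s3_client.download_file(bucket, key, download_path)
    try:
        with open(download_path) as f:
            results = json.load(f)
        return process_results(results)
    finally:
        # Lambda's temporary storage is limited, so remove the file afterwards
        os.remove(download_path)

# Possible use inside the Lambda handler (not run here):
#     for record in event['Records']:
#         handle_record(record, s3_client, process_results)
```

Passing the client and the callback in as arguments is just a design choice that makes the function easy to try locally with a stand-in client.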

Answer

You can use bucket.objects.all() to get a list of all the objects in the bucket (there are also alternative methods such as filter, page_size and limit, depending on your needs).

These methods return an iterable of S3.ObjectSummary objects. From there you can call the get method on each object (object.get) to retrieve the file.
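
As a rough sketch of the approach described above, the helper below walks a bucket with objects.all() and reads each object through get(). The name iter_json_results is hypothetical, and skipping keys that do not end in ".json" is an assumption about how the bucket is laid out. The bucket argument is expected to be a boto3 Bucket resource.

```python
import json


def iter_json_results(bucket):
    """Yield (key, parsed_json) for each .json object in a boto3 Bucket resource."""
    for summary in bucket.objects.all():
        # Assumption: only keys ending in .json hold the result lists
        if not summary.key.endswith('.json'):
            continue
        # get() returns a dict whose 'Body' entry is a readable stream
        body = summary.get()['Body'].read()
        yield summary.key, json.loads(body.decode('utf-8'))

# Possible use, with the bucket name from the question (not run here):
#     import boto3
#     bucket = boto3.resource('s3').Bucket('jsondata')
#     for key, results in iter_json_results(bucket):
#         print(key, len(results))
```

This reads each object straight into memory, so no download path is needed; for very large files the download approach may still be preferable.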