Lingbo Tang - 5 months ago
Node.js Question

AWS nodejs microservice: Iteratively invoke service when files in S3 bucket changed

I created a microservice on AWS Lambda using Node.js to generate thumbnails of the images in my S3 bucket. However, it isn't triggered when I upload new images to the bucket. I set the trigger event type to S3 "Object Created", and I configured my test event with:

"eventName": "ObjectCreated:*"
which means that when a file is created or changed in the bucket, the event should fire and invoke this Lambda function. I also set up the same notification configuration on the bucket side. It worked the first time, when I created the Lambda function from this example: Create a deployment package
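
For reference, a minimal sketch of how the bucket and key can be read from such an event. The sample payload is trimmed to the fields the handler uses, and the bucket and key names are made up for illustration. Note that records delivered by S3 carry a concrete event name such as ObjectCreated:Put; the ObjectCreated:* wildcard belongs in the notification configuration.

```javascript
// Trimmed-down S3 event; bucket and key names are hypothetical.
var sampleEvent = {
  Records: [{
    eventSource: 'aws:s3',
    eventName: 'ObjectCreated:Put',
    s3: {
      bucket: { name: 'my-images' },
      object: { key: 'holiday+photos/Happy+Face%281%29.jpg' }
    }
  }]
};

// Return the bucket name and decoded object key for every record.
function parseS3Event(event) {
  return (event.Records || []).map(function (record) {
    return {
      bucket: record.s3.bucket.name,
      // Keys arrive URL-encoded, with spaces turned into "+".
      key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))
    };
  });
}

console.log(parseS3Event(sampleEvent));
```

Logging the decoded key this way makes it easier to see whether the key the function asks S3 for matches the object that was actually uploaded.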

The function only worked for the exact file "HappyFace.jpg" and failed for all other images. I also sometimes got an "Access Denied" error. I'm using the following code:

// dependencies
var async = require('async');
var AWS = require('aws-sdk');
var gm = require('gm')
    .subClass({ imageMagick: true }); // Enable ImageMagick integration.
var util = require('util');
var utils = require('utils');

// constants
var MAX_WIDTH = 100;
var MAX_HEIGHT = 100;

// get reference to S3 client
var s3 = new AWS.S3();

exports.handler = function(event, context, callback) {
    // Read options from the event.
    console.log("Reading options from event:\n", util.inspect(event, {depth: 5}));
    var srcBucket = event.Records[0].s3.bucket.name;
    // Object key may have spaces or unicode non-ASCII characters.
    var srcKey =
        decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, " "));
    var dstBucket = srcBucket + "-resized";
    var dstKey = "resized-" + srcKey;

    // Sanity check: validate that source and destination are different buckets.
    if (srcBucket == dstBucket) {
        callback("Source and destination buckets are the same.");
        return;
    }

    // Infer the image type.
    var typeMatch = srcKey.match(/\.([^.]*)$/);
    if (!typeMatch) {
        callback("Could not determine the image type.");
        return;
    }
    var imageType = typeMatch[1];
    if (imageType != "jpg" && imageType != "png") {
        // Backticks are required for ${...} interpolation.
        callback(`Unsupported image type: ${imageType}`);
        return;
    }

    // Download the image from S3, transform, and upload to a different S3 bucket.
    async.waterfall([
        function download(next) {
            // Download the image from S3 into a buffer.
            s3.getObject({
                    Bucket: srcBucket,
                    Key: srcKey
                },
                next);
        },
        function transform(response, next) {
            gm(response.Body).size(function(err, size) {
                // Infer the scaling factor to avoid stretching the image unnaturally.
                var scalingFactor = Math.min(
                    MAX_WIDTH / size.width,
                    MAX_HEIGHT / size.height
                );
                var width = scalingFactor * size.width;
                var height = scalingFactor * size.height;

                // Transform the image buffer in memory.
                this.resize(width, height)
                    .toBuffer(imageType, function(err, buffer) {
                        if (err) {
                            next(err);
                        } else {
                            next(null, response.ContentType, buffer);
                        }
                    });
            });
        },
        function upload(contentType, data, next) {
            // Stream the transformed image to a different S3 bucket.
            s3.putObject({
                    Bucket: dstBucket,
                    Key: dstKey,
                    Body: data,
                    ContentType: contentType
                },
                next);
        }
        ], function (err) {
            if (err) {
                console.error(
                    'Unable to resize ' + srcBucket + '/' + srcKey +
                    ' and upload to ' + dstBucket + '/' + dstKey +
                    ' due to an error: ' + err
                );
            } else {
                console.log(
                    'Successfully resized ' + srcBucket + '/' + srcKey +
                    ' and uploaded to ' + dstBucket + '/' + dstKey
                );
            }

            callback(null, "message");
        }
    );
};


The code checks the image type before downloading. I tried using s3.listObjects, but that doesn't make sense to me logically: since Lambda is triggered by the upload event, it should be invoked for each new image I upload, so I don't want to list the objects every time.

Update:

I got rid of the Access Denied problem after I was given admin access. That prompted me to inspect the Node packages I had installed, which may be another way to troubleshoot this. However, after installing 'utils' from npm, I cannot invoke the function for existing files.

Answer

The Access Denied error is not necessarily an IAM, S3 bucket, or Lambda permission issue. If the requested key does not exist in the bucket and the caller lacks s3:ListBucket permission, S3 also returns an Access Denied error, because returning NoSuchKey would leak information about the nonexistence of the requested key. For reference, see this link: Causes of Access Denied Error
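
A small sketch of how the handler could tell these cases apart when logging. It assumes the error shape of the AWS SDK for JavaScript v2, where the error object carries a `code` string; `describeS3Error` is a hypothetical helper, not part of the SDK, and the bucket and key in the example call are made up.

```javascript
// Hypothetical helper: turn an S3 error into a log-friendly hint.
// Assumes SDK v2 style errors exposing a `code` property.
function describeS3Error(err, bucket, key) {
  var target = bucket + '/' + key;
  switch (err && err.code) {
    case 'NoSuchKey':
      return target + ' does not exist.';
    case 'AccessDenied':
      // S3 may answer AccessDenied for a missing key when the caller
      // lacks s3:ListBucket, so the key itself is worth checking too.
      return 'Access denied for ' + target +
        ': check the role permissions, and check that the decoded key exists.';
    default:
      return 'Unexpected error for ' + target + ': ' + (err && err.message);
  }
}

console.log(describeS3Error({ code: 'AccessDenied' }, 'my-images', 'a b.jpg'));
```

In the question's code this could be called from the final waterfall callback, in place of the plain string concatenation passed to console.error.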

As for invoking the Lambda function iteratively, you don't need to call s3.listObjects() in your code, because that will slow the function down. If you do want to customize it, this link might help: Listing Large S3 Buckets with the AWS SDK for Node.js. In the example referenced in the question, notice that they include the util package with:

var util = require('util');

But the npm command they use to install the dependencies is:

npm install async gm

If you want the function to be invoked iteratively, you should also install "utils" with npm install utils. When it works through your bucket iteratively, you might get an Access Denied error for some files, because the key might not be configured in your event. You can ignore that.
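
Note that `util` (singular) ships with Node.js itself, which is presumably why the example's install command omits it, whereas `utils` is a separate third-party package on npm. A quick way to check whether a module name is built in, assuming a reasonably recent Node.js version that exposes `builtinModules`; `isBuiltin` is a hypothetical helper name:

```javascript
// List of modules bundled with the Node.js runtime.
var builtinModules = require('module').builtinModules;

// True when `name` can be required without an npm install.
function isBuiltin(name) {
  return builtinModules.indexOf(name) !== -1;
}

console.log('util built in?', isBuiltin('util'));
console.log('utils built in?', isBuiltin('utils'));
console.log('async built in?', isBuiltin('async'));
```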

Update:

I also managed to keep the original images and the thumbnails in the same bucket. You need to do two things:

  1. Skip the thumbnails by checking the key's prefix or suffix.
  2. Set a timeout interval. Since we are using 'async', we don't need setTimeout for the waterfall function; we can set it outside the waterfall but inside the handler. You can also set the timeout and a scheduled event in the GUI.
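
The first point can be sketched as below. It assumes thumbnails are written with the `resized-` prefix used for `dstKey` in the question; `isThumbnail` and `thumbnailKeyFor` are hypothetical helper names.

```javascript
var THUMBNAIL_PREFIX = 'resized-';

// True when the key was produced by the resize function itself.
function isThumbnail(key) {
  return key.indexOf(THUMBNAIL_PREFIX) === 0;
}

// Key under which the thumbnail of `srcKey` is stored.
function thumbnailKeyFor(srcKey) {
  return THUMBNAIL_PREFIX + srcKey;
}

// Inside the handler, bail out early so that uploading a thumbnail
// does not trigger another resize of that thumbnail:
//   if (isThumbnail(srcKey)) { return callback(null, 'skipped'); }
console.log(isThumbnail('HappyFace.jpg'));
console.log(isThumbnail(thumbnailKeyFor('HappyFace.jpg')));
```

Without such a check, every thumbnail written to the shared bucket would itself raise an object-created event and be resized again. S3 notification configurations also accept prefix and suffix filters, which may be a way to keep thumbnail keys from invoking the function at all.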

Important Update:

Unfortunately, my original solution is not perfectly robust. I found a safer one. There are three steps:

  1. Configure your S3 bucket to send its event notifications to an SQS queue.
  2. Listen for incoming messages in an async loop (or with setInterval).
  3. Run the thumbnail function in that loop for every SQS message.

The code will look roughly like this:

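// Assumes `s3`, `sqs`, `receiveParams`, `event` and `thumbnail` are defined elsewhere.
// One-off pass over the bucket when the process starts.
s3.listObjects({Bucket: "myBucket", Delimiter: "", Prefix: ""}, function (err, data) {
    if (err) throw err;

    thumbnail(event, function (err) {});
});

// Poll the queue once per second and process the first message received.
setInterval(function () {
    console.log("Pause");
    sqs.receiveMessage(receiveParams, function (err, data) {
        console.log("Calling");
        if (err) {
            console.log(err);
            return;
        }
        if (data.Messages && data.Messages.length > 0) {
            thumbnail(data.Messages[0].Body, function (err) {
                if (err) {
                    console.log(err);
                }
            });
        }
    });
}, 1000);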
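
When S3 delivers notifications directly to SQS, each message body is a JSON string wrapping the same kind of `Records` array that the Lambda handler receives, so it needs parsing before it can be handed to the thumbnail code. A minimal sketch under that assumption; `recordsFromSqsMessage` is a hypothetical helper and the sample message is made up.

```javascript
// Extract S3 records from one SQS message; returns [] for anything
// that is not an S3 notification (for example the s3:TestEvent
// message S3 sends when the notification is first configured).
function recordsFromSqsMessage(message) {
  var body;
  try {
    body = JSON.parse(message.Body);
  } catch (e) {
    return [];
  }
  return (body && Array.isArray(body.Records)) ? body.Records : [];
}

// Made-up message shaped like an S3 -> SQS notification.
var sampleMessage = {
  Body: JSON.stringify({
    Records: [{
      eventName: 'ObjectCreated:Put',
      s3: { bucket: { name: 'my-images' }, object: { key: 'HappyFace.jpg' } }
    }]
  })
};

recordsFromSqsMessage(sampleMessage).forEach(function (record) {
  console.log(record.s3.bucket.name + '/' + record.s3.object.key);
});
```

Each returned record could be wrapped as `{ Records: [record] }` and passed to the thumbnail function. Messages that were processed would normally be removed with `sqs.deleteMessage` so that they are not delivered again.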