Abdul Manaf - 4 months ago
Node.js Question

aws s3 putObject vs sync

I need to upload a large file to an AWS S3 bucket. Every 10 minutes my code deletes the old file from the source directory and generates a new one. The file size is around 500 MB. Currently I use the s3.putObject() method to upload each file after it is created. I have also heard about aws s3 sync, which comes with the AWS CLI and is used for uploading files to an S3 bucket.

I use the aws-sdk for Node.js for the S3 upload, and it does not contain an s3 sync method. Is s3 sync better than the s3.putObject() method? I need a faster upload.


There's always more than one way to do one thing. To upload a file into an S3 bucket you can:

  • use the AWS CLI and run aws s3 cp ...
  • use the AWS CLI and run aws s3api put-object ...
  • use the AWS SDK (in your language of choice)
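
As a rough illustration of the SDK option in the list above, here is a minimal sketch assuming the aws-sdk (v2) package for Node.js that the question mentions. The helper names (objectKeyFor, putFile), the bucket name and the file path are hypothetical and only serve the example; the S3 client is passed in so the snippet itself does not depend on credentials.

```javascript
const fs = require('fs');
const path = require('path');

// Derive the S3 object key from the local file name, under an optional prefix.
function objectKeyFor(filePath, prefix = '') {
  return prefix + path.basename(filePath);
}

// Upload one file with a single PutObject request.
// `s3` is assumed to be an AWS.S3 client from the aws-sdk (v2) package.
function putFile(s3, bucket, filePath, prefix = '') {
  return s3.putObject({
    Bucket: bucket,
    Key: objectKeyFor(filePath, prefix),
    // Stream the file rather than buffering ~500 MB in memory.
    Body: fs.createReadStream(filePath),
  }).promise();
}

// Usage (bucket name and path are placeholders):
//   const AWS = require('aws-sdk');
//   putFile(new AWS.S3(), 'my-bucket', '/data/export.bin')
//     .then(() => console.log('uploaded'))
//     .catch(console.error);
```

Keeping the key derivation in its own function makes it easy to check without touching S3 at all.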

You can also use the sync method, but for a single file there is no need to sync a whole directory. When looking for better performance, it is generally better to start multiple cp instances in parallel than to rely on a single sync run.
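
The idea of starting several cp instances can be sketched from Node.js with child_process. This is only an illustration, assuming the aws binary is installed and on the PATH; the function names (cpArgs, runCp, copyAll) and the shape of the files array are made up for the example.

```javascript
const { spawn } = require('child_process');

// Build the argument list for `aws s3 cp <local> s3://<bucket>/<key>`.
function cpArgs(localPath, bucket, key) {
  return ['s3', 'cp', localPath, `s3://${bucket}/${key}`];
}

// Run one `aws s3 cp` process and resolve with its exit code.
function runCp(localPath, bucket, key) {
  return new Promise((resolve, reject) => {
    const child = spawn('aws', cpArgs(localPath, bucket, key), { stdio: 'inherit' });
    child.on('error', reject);                    // e.g. aws is not on PATH
    child.on('close', (code) => resolve(code));   // 0 means success
  });
}

// Start every copy at once and wait until all processes have exited.
// `files` is a hypothetical array of { path, key } objects.
function copyAll(files, bucket) {
  return Promise.all(files.map((f) => runCp(f.path, bucket, f.key)));
}

// Usage (placeholders):
//   copyAll([{ path: '/data/a.bin', key: 'a.bin' },
//            { path: '/data/b.bin', key: 'b.bin' }], 'my-bucket')
//     .then((codes) => console.log('exit codes:', codes));
```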

Basically, all of these methods are wrappers for the AWS S3 API calls. From the Amazon documentation:

Making REST API calls directly from your code can be cumbersome. It requires you to write the necessary code to calculate a valid signature to authenticate your requests. We recommend the following alternatives instead:

  • Use the AWS SDKs to send your requests (see Sample Code and Libraries). With this option, you don't need to write code to calculate a signature for request authentication because the SDK clients authenticate your requests by using access keys that you provide. Unless you have a good reason not to, you should always use the AWS SDKs.
  • Use the AWS CLI to make Amazon S3 API calls. For information about setting up the AWS CLI and example Amazon S3 commands see the following topics: Set Up the AWS CLI in the Amazon Simple Storage Service Developer Guide. Using Amazon S3 with the AWS Command Line Interface in the AWS Command Line Interface User Guide.

So Amazon recommends using the SDK. At the end of the day, I think it is really a matter of what you are most comfortable with and how you will integrate this piece of code into the rest of your program. For one-time actions, I always go with the CLI.

In terms of performance, though, using one or the other will not make a difference, since again they are just wrappers around AWS API calls. For transfer optimization, you should look at S3 Transfer Acceleration and see if you can enable it.
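
For what enabling Transfer Acceleration might look like with the aws-sdk (v2) for Node.js, here is a hedged sketch. It assumes the bucket name is compatible with acceleration and that your credentials are allowed to change the bucket configuration; the helper names and the region value are hypothetical. The two helpers only build plain objects, and the SDK calls themselves are shown in the usage comment.

```javascript
// Parameters for s3.putBucketAccelerateConfiguration, which turns
// Transfer Acceleration on for a bucket (a one-time setup step).
function enableAccelerationParams(bucket) {
  return { Bucket: bucket, AccelerateConfiguration: { Status: 'Enabled' } };
}

// Client options that send requests through the accelerate endpoint.
function acceleratedClientOptions(region) {
  return { region, useAccelerateEndpoint: true };
}

// Usage (bucket name and region are placeholders):
//   const AWS = require('aws-sdk');
//   await new AWS.S3()
//     .putBucketAccelerateConfiguration(enableAccelerationParams('my-bucket'))
//     .promise();
//   const s3 = new AWS.S3(acceleratedClientOptions('us-east-1'));
//   // s3.putObject(...) calls on this client now use the accelerate endpoint.
```

Whether this actually speeds things up depends on the distance between the uploader and the bucket's region, so it is worth measuring before relying on it.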