Jaaromy Zierse - 1 month ago
C# Question

Is there a way to bulk insert into Amazon Aurora RDS directly from Amazon S3 tab-delimited files in C#?

I am currently using Amazon Redshift to store aggregated data from the 50 - 100 GB (i.e., millions of rows) of tab-delimited files that are pushed to a bucket in Amazon S3 every day.

Redshift makes this easy by providing a copy command which can be targeted directly to an S3 bucket to bulk load the data.

I would like to use Amazon Aurora RDS for this same purpose. Documentation on Aurora is thin, at best, right now. Is there a way to bulk load directly from S3 into Aurora?

As far as I can tell, MySQL's LOAD DATA INFILE requires a path to the file on disk, which I suppose I can work around by downloading the TSV to an AWS instance and running the command from there, though that isn't ideal.
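
For reference, the workaround described above comes down to issuing one statement once the TSV is on local disk. The following is only a minimal sketch of building that statement in C#. The class name LoadDataSql, the Build method, and the path and table values are hypothetical, and it assumes the server's default handling of backslash escapes in string literals. The LOCAL variant also generally requires local_infile to be enabled on both the server and the client.

```csharp
using System;

public static class LoadDataSql
{
    // Escapes a value for use inside a single-quoted MySQL string literal.
    // Assumes backslash escapes are enabled (NO_BACKSLASH_ESCAPES is not set).
    private static string Quote(string value) =>
        "'" + value.Replace("\\", "\\\\").Replace("'", "\\'") + "'";

    // Builds a LOAD DATA LOCAL INFILE statement for a tab-delimited file.
    // localPath and tableName are hypothetical inputs supplied by the caller.
    public static string Build(string localPath, string tableName, int headerLines = 0)
    {
        if (string.IsNullOrWhiteSpace(localPath))
            throw new ArgumentException("A file path is required.", nameof(localPath));
        if (string.IsNullOrWhiteSpace(tableName))
            throw new ArgumentException("A table name is required.", nameof(tableName));

        // Backtick-quote the table name, doubling any embedded backticks.
        string table = "`" + tableName.Replace("`", "``") + "`";

        // '\t' and '\n' are sent to MySQL as escape sequences, not raw characters.
        return "LOAD DATA LOCAL INFILE " + Quote(localPath) +
               " INTO TABLE " + table +
               " FIELDS TERMINATED BY '\\t'" +
               " LINES TERMINATED BY '\\n'" +
               (headerLines > 0 ? " IGNORE " + headerLines + " LINES" : "");
    }
}
```

The resulting string would then be executed through whichever MySQL connector the application already uses, after the file has been downloaded from S3.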

I've also attempted to read the TSV into memory and construct multiple insert statements. This is obviously slow and clunky.
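
To make the in-memory approach concrete, here is a rough sketch of grouping TSV rows into multi-row, parameterized insert statements, which usually cuts down on round trips compared with one statement per row. The types InsertBatch and TsvInsertBatcher, the @p0-style parameter names, and the default batch size are all hypothetical choices for illustration. The sketch assumes fields contain no embedded tabs or quoting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One multi-row INSERT plus the parameter values it refers to.
public sealed class InsertBatch
{
    public string Sql { get; }
    public IReadOnlyList<KeyValuePair<string, string>> Parameters { get; }

    public InsertBatch(string sql, IReadOnlyList<KeyValuePair<string, string>> parameters)
    {
        Sql = sql;
        Parameters = parameters;
    }
}

public static class TsvInsertBatcher
{
    // Lazily yields one InsertBatch per rowsPerBatch input lines.
    // Validation errors surface when the result is enumerated.
    public static IEnumerable<InsertBatch> Build(
        IEnumerable<string> tsvLines, string table,
        IReadOnlyList<string> columns, int rowsPerBatch = 500)
    {
        if (rowsPerBatch < 1) throw new ArgumentOutOfRangeException(nameof(rowsPerBatch));

        string Quote(string name) => "`" + name.Replace("`", "``") + "`";
        string header = "INSERT INTO " + Quote(table) + " (" +
                        string.Join(", ", columns.Select(Quote)) + ") VALUES ";

        var rows = new List<string>();
        var parameters = new List<KeyValuePair<string, string>>();

        foreach (string line in tsvLines)
        {
            if (line.Length == 0) continue; // skip blank lines

            string[] fields = line.Split('\t');
            if (fields.Length != columns.Count)
                throw new FormatException("Expected " + columns.Count + " fields, got " + fields.Length + ".");

            // Values travel as parameters, never spliced into the SQL text.
            var names = new string[fields.Length];
            for (int i = 0; i < fields.Length; i++)
            {
                names[i] = "@p" + parameters.Count;
                parameters.Add(new KeyValuePair<string, string>(names[i], fields[i]));
            }
            rows.Add("(" + string.Join(", ", names) + ")");

            if (rows.Count == rowsPerBatch)
            {
                yield return new InsertBatch(header + string.Join(", ", rows), parameters);
                rows = new List<string>();
                parameters = new List<KeyValuePair<string, string>>();
            }
        }

        // Flush the final, possibly smaller, batch.
        if (rows.Count > 0)
            yield return new InsertBatch(header + string.Join(", ", rows), parameters);
    }
}
```

Each batch's Sql and Parameters would be handed to the connector's command object and executed, ideally inside a single transaction. Even batched, this is likely to remain slower than a true bulk load for files of this size.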

Ideas?

Answer

You could use AWS Data Pipeline. There is even a template for loading data from S3 to RDS:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-copys3tords.html