I wrote a short function with Oledb which should read large amount of data but there are still several issues I couldnt't solve, it's about reading and inserting semicolon separated data in sql database
Private Function GetCSVFile(ByVal file As String) As DataTable
Dim dt As New DataTable
Dim ConStr As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & TextBox1.Text & ";Extended Properties=""TEXT;HDR=Yes;FMT=Delimited"""
Dim conn As New OleDb.OleDbConnection(ConStr)
Dim da As New OleDb.OleDbDataAdapter("Select * from " & _table & ".csv", conn)
getData = dt
Catch ex As OleDbException
Catch ex As Exception
value(i) = value(i).Replace("\t", Constants.vbTab).Replace("\n", Constants.vbLf).Replace("\r", Constants.vbCr).Replace("\""", """").Replace("\\", "\")
Dim dtInserts As DataTable = db.GetDataTable("SELECT TOP 0 * FROM " & _table)
Dim ListOfTypes As New List(Of System.Type)
For Each _col As DataColumn In dtInserts.Columns
Dim _type As System.Type = _col.DataType
Dim _wert1 As String = "11.11.2011"
Dim _type1 As System.Type = ListOfTypes.Item(1)
If DateTime.TryParse(_wert1, New Date) Then
Dim ConStr As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & TextBox1.Text & ";Extended Properties=""TEXT;HDR=Yes;FMT=Delimited;CODEPAGE=1252"""
CSVHelper can do most of what it sounds like you want. FileHelpers might also work, I just haven't worked with it as much. I wouldn't recommend VB's
TextFieldParser because it returns a string array rather than typed data.
OleDB provides a great way to import, but saving to the database presents a small challenge. All the loaded rows will have a
Unchanged. The only way to change that to
Added is to copy the rows into a new table:
For Each dr As DataRow In dtCSV.Rows dtDest.Rows.Add(dr.ItemArray) Next
It will work, but as a result, you will have 2 tables and all those rows loaded at once. It turns out that using CSVHelper and a simple INSERT query is the most economical - since the records are fetched from an
IEnumerable<T> only 1 source record will be loaded at a time. It is also a bit faster than copying rows: about 20% faster on 500k rows.
Note: We have no idea whatsoever what the data looks like other than it is semicolon delimited...and there is apparently a lot of it.
Using sr As New StreamReader(CSVFilePath, False), csv As New CsvReader(sr) ' some CSVHelper config options csv.Configuration.HasHeaderRecord = True csv.Configuration.TrimFields = True csv.Configuration.TrimHeaders = True csv.Configuration.Delimiter = ";" csv.Configuration.IsHeaderCaseSensitive = False csv.Configuration.RegisterClassMap(Of RandItem.RandItemMap)() ' get the file into IEnumerable collection Dim csvData = csv.GetRecords(Of OleImportItem)() Dim SQL = <sql> INSERT INTO RandomData (Foo, Bar, Cat, Dog...) VALUES (@p1, @p2, @p3, @p4...) </sql>.Value Using dbcon As New OleDbConnection(ACEConnStr) Using cmd As New OleDbCommand(SQL, dbcon) dbcon.Open() ' create the parameters cmd.Parameters.Add("@p1", OleDbType.VarChar) cmd.Parameters.Add("@p2", OleDbType.VarChar) cmd.Parameters.Add("@p3", OleDbType.Integer) cmd.Parameters.Add("@p4", OleDbType.Integer) ... ' load one item at a time, to save it For Each item In csvData cmd.Parameters("@p1").Value = item.Foo cmd.Parameters("@p2").Value = item.Bar cmd.Parameters("@p3").Value = item.Cat cmd.Parameters("@p4").Value = item.Dog ... cmd.ExecuteNonQuery() Next End Using End Using End Using
I cant give a complete tutorial on how to use it, but in general you create a type (Class) which defines the datatype for each column (here,
RandItemMap is another class which specifies the order of those properties in the file. In so doing, CSVHelper knows the datatype of each column and will convert for you.
There are several ways to use it, this way is reading one line at a time from the file and immediately saves that item to the database:
csvData = csv.GetRecords(Of OleImportItem)() initializes
csvData as an
IEnumberable(Of RandItem), so only one item is loaded at a time in the loop which makes it very economical.
In the loop, the code gets a typed
item from the csv file which is then mapped to the respective parameters and saved. This is about 20% faster than copying the data to a DataTable in batches and saving; and much less memory intensive since there is only ever 1 item loaded.
For smaller csv files, you could use
ToList() to load the file to a collection.
Note Importing/parsing a text file without headers is a little different than shown, but just as easy.
It sounds like CSVHelper might obviate most issues in your laundry list.