SF Lee - 1 month ago
C# Question

Regex for parsing CSV

I'm trying to write a Regex that will extract individual fields from a CSV file.

For example, if given the following line in a CSV file:

123, Bob ,Bob, " Foo Bar ", "a, ""b"", c"


It should give the following results (without the single quotes):

'123'
'Bob'
'Bob'
' Foo Bar '
'a, "b", c'


Note that leading and trailing white spaces should be trimmed unless they are within quotes.

I'm not worried about invalid CSV lines such as open quotes without matching closing quotes. You can safely assume that the CSV file is perfectly valid according to the rules above.

I'm also fine with using multiple Regexes if a single one is difficult. But I'd like to avoid using standard C# operations unless they are simple and short. (I don't want to end up writing lots of code.)

So, any suggestions?

Answer

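Well, there are many gotchas and possible errors with Regexes. Try the following code instead (it is VB.NET, but the same TextFieldParser class can be used from C# by referencing the Microsoft.VisualBasic assembly). It did the trick for me, and it is sweet and simple: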

' TextFieldParser lives in Microsoft.VisualBasic.FileIO (reference Microsoft.VisualBasic).
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\MyFile.csv")

    Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited

    ' Treat double-quoted fields as single values so embedded commas are kept.
    Reader.HasFieldsEnclosedInQuotes = True
    Reader.SetDelimiters(",")

    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            ' ReadFields parses the next line into an array of fields.
            currentRow = Reader.ReadFields()
            For Each currentField As String In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            ' LineNumber identifies the line that could not be parsed.
            MsgBox("Line " & ex.LineNumber & " is not valid and will be skipped.")
        End Try
    End While
End Using


Please see the same implementation here.
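
If you would still prefer a Regex in C#, as the question asks, here is a minimal sketch for a single line. It assumes the input is valid under the rules in the question, that quoted fields do not span multiple lines, and that unquoted fields contain no quote characters. The names `CsvLineSketch` and `ParseLine` are made up for illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of any library.
public static class CsvLineSketch
{
    // One field per match: optional whitespace, then either a quoted field
    // (where "" stands for a literal quote) or an unquoted field, then
    // optional whitespace and a comma or the end of the line.
    // \G anchors each match at the position where the previous one ended.
    private static readonly Regex FieldPattern = new Regex(
        @"\G\s*(?:""(?<quoted>(?:[^""]|"""")*)""|(?<plain>[^,""]*?))\s*(?<delim>,|$)");

    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        int pos = 0;
        while (true)
        {
            Match m = FieldPattern.Match(line, pos);
            if (!m.Success)
                throw new FormatException("Cannot parse field starting at index " + pos);

            // Quoted fields keep their inner whitespace; only "" is unescaped.
            // Unquoted fields are already trimmed by the pattern itself.
            Group quoted = m.Groups["quoted"];
            fields.Add(quoted.Success
                ? quoted.Value.Replace("\"\"", "\"")
                : m.Groups["plain"].Value);

            // An empty delimiter means the end of the line was reached.
            if (m.Groups["delim"].Length == 0) break;
            pos = m.Index + m.Length;
        }
        return fields;
    }
}
```

Calling `CsvLineSketch.ParseLine` on the sample line from the question yields the five fields listed there. Anything outside the stated assumptions, such as a stray quote in the middle of an unquoted field, raises a `FormatException` rather than being silently accepted, which is one reason a dedicated parser like TextFieldParser is usually the safer choice.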