HWQ HWQ - 5 months ago 18
HTML Question

Need help in parsing html tags in excel or any other method

I have over a thousand email addresses in the format below, placed in a column in excel 2007, like this:

<td class="Normal">street name1<br>street name 2<br>city, state zipcode<br>country<br>contact no</TD>


Some cells have different
<br>
tags like this:

<td class="Normal">street name 1<br>city, state postal<br>country</TD>


I can extract the last two tags using the excel "text to culumns" functions
but the transformation is not consistent when extracted in columns and it will take forever to align each column to its right place.

The list all have "," to distinguish the street addresses, and I can use "text-to column' feature to extract all data before "," and then work on the first subset to get the data out. like this:

<td class="Normal">street name1<br>street name 2<br>city


I've searched all over the web and have gone through many formulas. Can't seem to extract text between two
<br>
tags.

Is there a way to extract between the two first
<br>
tags or a script to count the number of
<br>
tags and then use a script to extract each set of
<br>
tags in different columns, as some have one
<br>
tags and other have two
<br>
tags, in Excel.

please do suggest any other way or a tool that splits each in their respective columns.

Many Thanks.
Haroon

Answer

I guess this is what you are looking for:

Sub Demo()
    Dim str() As String, tempStr As String
    Dim lastRow As Long, i As Long, colStart As Long, r As Long

    lastRow = Cells(Rows.Count, "A").End(xlUp).Row    '-->get last row with data
    For r = 1 To lastRow
        tempStr = Range("A" & r).Value
        colStart = 2
        str = Split(tempStr, "<br>")    '-->split string on tag <br>
        For i = 1 To UBound(str) - 1
            Cells(r, colStart) = str(i)
            colStart = colStart + 1
        Next
    Next r
End Sub

See image for reference: enter image description here

EDIT# 1: Changes based on our discussion ________________________________________________________________________________

Sub Demo()
    Dim str() As String, tempStr As String
    Dim lastRow As Long, i As Long, colStart As Long, r As Long

    lastRow = Cells(Rows.Count, "A").End(xlUp).Row    '-->get last row with data
    For r = 1 To lastRow
        tempStr = Range("A" & r).Value
        colStart = 2
        str = Split(tempStr, "<br>")    '-->split string on tag <br>
        For i = 1 To UBound(str)
            If i = UBound(str) Then
                'this section will take care of the string with just one <br> tag
                If UBound(str) = 1 Then
                    Cells(r, colStart) = str(1)
                End If
            Else
                Cells(r, colStart) = str(i)
                colStart = colStart + 1
            End If
        Next
    Next r
End Sub
Comments