lazo lazo - 2 months ago 13
Perl Question

Remove entire row from table with HTML::TableExtract

Is there a way to remove an entire row (HTML tags 'n all) from an HTML table with HTML::TableExtract?


Mucking around with the sample code from CPAN, this is what I've tried so far:

use HTML::TableExtract qw(tree);

my $te = HTML::TableExtract->new( headers => [qw(name type members)] );

# get $html_string out of a file...

$te->parse($html_string);
my $table = $te->first_table_found();
my $table_tree = $table->tree;
my $document_tree = $te->tree;
my $document_html = $document_tree->as_HTML;

# write $document_html to a file ...

Now, as the name suggests, calling 'replace_content()' on row 4 removes the content of the row, but the row itself remains in the markup. I need to get the <tr> tags and everything in between removed as well.

Any ideas?

mrk mrk

What you want are the parent and delete methods.

See the docs for HTML::Element and for HTML::Element::delete


OK, click that checkmark and mark this one as answered... Here it is:

my($p) = $table_tree->row(4)->parent();
$p->delete;

Also, NOTE: you need the () parens around $p! Without the parens the assignment happens in scalar context and you don't get back a reference.
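To see why the parens matter, here is a small core-Perl sketch of list versus scalar assignment. The sub parents_of_cells is a made-up stand-in, not part of HTML::TableExtract. What the real parent() call yields in scalar context depends on how the module is written, so treat this only as an illustration of the context rule.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that hands back several references.
sub parents_of_cells {
    my @parents = ( { tag => 'tr' }, { tag => 'tr' }, { tag => 'tr' } );
    return @parents;    # a list in list context, a count in scalar context
}

my ($p)   = parents_of_cells();    # list assignment: first element, a hash ref
my $not_p = parents_of_cells();    # scalar assignment: the number of elements

print ref($p), "\n";    # prints "HASH"
print "$not_p\n";       # prints "3"
```

With the parens, $p holds a reference you can call methods on. Without them, this stand-in gives you a plain number.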

For me, with the above Perl code working on this HTML,

   <tr><td>row1</td><td>row1</td> <td>row1</td></tr>
   <tr><td>row2</td><td>row2</td> <td>row2</td></tr>
   <tr><td>row3</td><td>row3</td> <td>row3</td></tr>
   <tr><td>row4</td><td>row4</td> <td>row4</td></tr>

I get this as a result of printing $document_html


Notice that there is no empty <tr></tr>
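Putting the question's setup and the parent/delete fix together, a self-contained sketch might look like the one below. The inline HTML, including its header row, is made up so the script can run on its own. The explicit eof call and the check on $p are defensive additions rather than part of the original answer. It assumes HTML::TableExtract is installed along with the modules its tree mode needs (HTML::TreeBuilder and HTML::ElementTable).

```perl
use strict;
use warnings;
use HTML::TableExtract qw(tree);

# Made-up input: a header row matching the headers below, plus four data rows.
my $html_string = <<'HTML';
<html><body>
<table>
   <tr><th>name</th><th>type</th><th>members</th></tr>
   <tr><td>row1</td><td>row1</td> <td>row1</td></tr>
   <tr><td>row2</td><td>row2</td> <td>row2</td></tr>
   <tr><td>row3</td><td>row3</td> <td>row3</td></tr>
   <tr><td>row4</td><td>row4</td> <td>row4</td></tr>
</table>
</body></html>
HTML

my $te = HTML::TableExtract->new( headers => [qw(name type members)] );
$te->parse($html_string);
$te->eof;    # signal end of input (usual HTML::Parser protocol; an addition here)

my $table = $te->first_table_found()
    or die "no table matched the headers\n";
my $table_tree = $table->tree;

# List assignment, as in the answer: keep the first parent returned for row 4.
my ($p) = $table_tree->row(4)->parent();

# Defensive check (an addition): only delete if a <tr> element came back.
if ( ref $p && $p->tag eq 'tr' ) {
    $p->delete;
}

my $document_html = $te->tree->as_HTML;
print $document_html, "\n";
```

Deleting the <tr> element itself, rather than replacing its content, is what keeps an empty row from being left behind in the output.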