Andrew Diamond - 3 months ago
Perl Question

perl find and replace deleting files

I'm new to Perl scripting, but I need to do a large amount of regex find-and-replaces across hundreds of files.

I came across this website which recommends the Perl command

perl -p -i -e 's/oldstring/newstring/g' *
to get all files, and then
perl -p -i -e 's/oldstring/newstring/g' `find ./ -name *.html`
to filter that to certain files.

My goal is to find all *.csproj and *.vbproj files and update a reference to a .dll so that it points to a new path.

Those are both XML file types.

The text I'm replacing is

<Reference Include="log4net, Version=1.2.10.0, Culture=neutral, PublicKeyToken=1b44e1d426115821, processorArchitecture=MSIL">
<SpecificVersion>False</SpecificVersion>
</Reference>


with

<Reference Include="log4net, Version=1.2.10.0, Culture=neutral, PublicKeyToken=1b44e1d426115821, processorArchitecture=MSIL">
<SpecificVersion>False</SpecificVersion>
<Private>True</Private>
<HintPath>..\..\..\..\ExternalDLLs\log4net.dll</HintPath>
</Reference>


The command I have so far is

perl -p -i -e 's/<Reference Include="log4net, (?:.*?[\t\s\n\r])*?<\/Reference>/<Reference Include="log4net, Version=1\.2\.10\.0, Culture=neutral, PublicKeyToken=1b44e1d426115821, processorArchitecture=MSIL"><SpecificVersion>False<\/SpecificVersion><Private>True<\/Private><HintPath>\.\.\\\.\.\\\.\.\\\.\.\\ExternalDLLs\\log4net\.dll<\/HintPath><\/Reference>/g' `find . -type f \( -name "*.vbproj" -or -name "*.csproj" \)`


This looks like it starts to work, but it ends up deleting all of my *.vbproj and *.csproj files.

I can't figure out why my script is deleting files.

Any help?

Edit: it prints this out per file

Can't do inplace edit on ./Middletier/TDevAccess/AmCad.Components.TDevAccess.csproj: No such file or directory.


Edit 2: I'm using Bash on Ubuntu on Windows, if that matters.

Could this be related?

Answer

I'd suggest you're going to trip yourself up in two different ways if you're not really careful.

  • Parsing XML with regex is a bad idea. It's messy, because regex isn't contextual, whereas XML is.
  • Perl has a perfectly good File::Find module, which means you don't need to shell out to the find command.

I don't know specifically why you're having a problem, but my guess is that the find command is emitting linefeeds and you're not stripping them.

Anyway, I'd suggest that you do neither, and use XML::Twig and File::Find::Rule to do this job entirely within Perl.

Something like:

#!/usr/bin/perl
use strict;
use warnings;

use File::Find::Rule;
use XML::Twig;

#setup the parser - note, this may reformat (in valid XML sorts of ways).
my $twig = XML::Twig->new(
   pretty_print => 'indented',

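   #set a handler for 'Reference' elements - to insert your values.
   #note: as written, this fires for every Reference, not just log4net.
   twig_handlers => {
      'Reference' => sub {
         #'last_child' appends after SpecificVersion; the default
         #position is 'first_child', which would reverse the order.
         $_->insert_new_elt( 'last_child', 'Private' => 'True' );
         $_->insert_new_elt( 'last_child',
            'HintPath' => '..\..\..\..\ExternalDLLs\log4net.dll' );

         #flush is needed to write out the change.
         $_->flush;
      }
   }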
);

#use rules to find suitable files to alter.
foreach my $xml_file (
   File::Find::Rule->or(
      File::Find::Rule->name('*.csproj'),
      File::Find::Rule->name('*.vbproj'),
   )->in('.')
  )
{
   print "\nFound: $xml_file\n";

   #do the parse.
   $twig->parsefile_inplace($xml_file);
}

Following on from the comments: if you want to match on an attribute of the Reference element, there are two possibilities. Either set a handler on the specific XPath expression:

twig_handlers => {
   'Reference[@Include="log4net, Version=1.2.10.0, Culture=neutral, PublicKeyToken=1b44e1d426115821, processorArchitecture=MSIL"]' => sub {
      #'last_child' appends after the existing children.
      $_->insert_new_elt( 'last_child', 'Private' => 'True' );
      $_->insert_new_elt( 'last_child', 'HintPath' => '..\..\..\..\ExternalDLLs\log4net.dll' );

      #flush is needed to write out the change.
      $_->flush;
   }
}

This selects based on attribute content (but bear in mind the above is quite long and convoluted).

Alternatively - the handler 'fires' for each Reference element you encounter, so you can build the test into the handler itself.

my $twig = XML::Twig->new(
   pretty_print => 'indented',

   #set a handler for 'Reference' elements - to insert your values.
   twig_handlers => {
      'Reference' => sub {
         #note - instead of 'eq' you can do things like regex tests.
         #the // '' guards against a Reference with no Include attribute.
         if ( ( $_->att('Include') // '' ) eq "log4net, Version=1.2.10.0, Culture=neutral, PublicKeyToken=1b44e1d426115821, processorArchitecture=MSIL" ) {
              #'last_child' appends after the existing children.
              $_->insert_new_elt( 'last_child', 'Private' => 'True' );
              $_->insert_new_elt( 'last_child', 'HintPath' => '..\..\..\..\ExternalDLLs\log4net.dll' );
         }

         #flush is needed to write out the change.
         $_->flush;
      },
   }
);