Search and replace nested clauses across multiple lines using a recursive regex
Let me start with the answer. The answer is a recursive regex, run as the perl one-liner given in full further down.
Now the question and story behind it:
Transforming existing code can be a bit tricky, and sometimes it is not worth the effort. But at other times it can literally save our jobs! Here is a typical scenario:
FuncName ("Hell brakes $heck if(blah (blah(blah(".
"blah(blah))blah)))", Heaven);
If the above function is called a million times in different places and you want to change that crazy-looking nested structure inside a function argument, what will you do? OK, you can slurp the file into a scalar as described here:
http://www.modernperlbooks.com/mt/2009/08/a-one-line-slurp-in-perl-5.html. Then use the recursive regexp idiom:
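$re = qr/\((?:(?>[^()]+)|(??{$re}))*\)/;
$input_scalar =~ s/(Hell\s+brakes\s+.[a-z,A-Z,0-9|.]*\s+if\s*$re)(?>\s*)(?!HELLIFY)/$1 HELLIFY /gis;
The $re grabs everything inside the nested brackets, however deep they go (note that you cannot do this with a regular, non-recursive regex). The (?>[^()]+) part is atomic so that a failed match does not retry every way of splitting the text between brackets. The rest is search and replace as usual, with some look-around and grouping; it depends on what you want to do. I wanted to append a string after the nested clause ended, and only once: (?>\s*)(?!HELLIFY) skips a clause that is already followed by HELLIFY, and the atomic group keeps \s* from backing off to zero width, where the lookahead would succeed on the space in front of HELLIFY.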
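To see the idiom at work, here is a small self-contained sketch that runs the substitution on the sample call from the scenario above, twice. It assumes the atomic form of the bracket pattern (the one documented in perlre) and an atomic group in front of the lookahead; the helper name hellify is made up for this illustration.

```perl
use strict;
use warnings;

# Sample input from the scenario; single quotes keep $heck literal.
my $input_scalar = 'FuncName ("Hell brakes $heck if(blah (blah(blah(".' . "\n"
                 . '"blah(blah))blah)))", Heaven);';

# Declare first, then assign: the (??{ $re }) block looks $re up at match time.
my $re;
$re = qr/\((?:(?>[^()]+)|(??{$re}))*\)/;

# hellify() is a hypothetical helper: it applies the substitution to a copy
# of the text and returns the new text plus the number of replacements made.
sub hellify {
    my ($text) = @_;
    my $count = ($text =~ s/(Hell\s+brakes\s+.[a-z,A-Z,0-9|.]*\s+if\s*$re)(?>\s*)(?!HELLIFY)/$1 HELLIFY /gis) || 0;
    return ($text, $count);
}

my ($once,  $n1) = hellify($input_scalar);   # appends HELLIFY after the clause
my ($twice, $n2) = hellify($once);           # finds HELLIFY already there

print "$once\n";
print "first pass: $n1 replacement(s), second pass: $n2 replacement(s)\n";
```

The first pass makes one replacement and the second makes none, so running the script again over an already converted file should leave it alone.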
You already know the short answer! The long answer is below.
Have fun!
sub searchReplaceAcrossLines()
{
    my @files = <dir/*>;         # Read all files in a dir
    my $input_scalar;
    my @input_array;
    my $holdTerminator = $/;     # Remember the input record separator
    # Multiline search: with $/ undefined, <FILE> returns the whole file at once
    undef $/;
    foreach my $file (@files) {  # Process each file
        # Read the file
        open(FILE, "<", $file) or die "Cannot read $file: $!";
        @input_array = <FILE>;
        close(FILE);
        $input_scalar = join("", @input_array);
        # An example
        my $re; # A recursive regular expression for everything in brackets
        # Do your substitution here.
        $re = qr/\((?:(?>[^()]+)|(??{$re}))*\)/;
        # (?>\s*) is atomic, so the lookahead cannot succeed on the space before READONLY
        $input_scalar =~ s/(Hell\s+table\s+.[a-z,A-Z,0-9]*\s*$re)(?>\s*)(?!READONLY)/$1 READONLY /gis;
        # Write to file
        open(OUTPUT, ">", $file) or die "Cannot write $file: $!";
        print(OUTPUT $input_scalar);
        close(OUTPUT);
        print $file . "\n";
    } # End for
    $/ = $holdTerminator; # Restore
    print "Search replace complete \n";
    return 0;
} # end sub
One can shorten, optimize etc. But this does what is needed at the moment. dprofpp output:
Total Elapsed Time = 0.019919 Seconds
  User+System Time = 0.019919 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 50.2   0.010  0.010      2   0.0050 0.0050  main::BEGIN
 50.2   0.010  0.010      1   0.0100 0.0099  main::searchReplaceAcrossLines
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_BRACE
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_NOMAGIC
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_QUOTE
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_TILDE
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_ALPHASORT
 0.00       - -0.000      1        -      -  DynaLoader::dl_load_file
 0.00       - -0.000      1        -      -  DynaLoader::dl_undef_symbols
 0.00       - -0.000      1        -      -  DynaLoader::dl_find_symbol
 0.00       - -0.000      1        -      -  DynaLoader::dl_install_xsub
 0.00       - -0.000      1        -      -  File::Glob::bootstrap
 0.00       - -0.000      1        -      -  File::Glob::doglob
 0.00       - -0.000      1        -      -  warnings::import
 0.00       - -0.000      1        -      -  warnings::BEGIN
Here is how profiling works: http://www.perl.com/pub/2004/06/25/profiling.html
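The short answer as a one-liner (-0777 makes perl read each file whole, which an undef $/ inside the -p loop does too late for the first line):
perl -0777 -p -i -e '$re = qr/\((?:(?>[^()]+)|(??{$re}))*\)/; s/(create\s+table\s+.[a-z,A-Z,0-9,.]*\s*$re)(?>\s*)(?!SPARSIFY)/$1 SPARSIFY /gis' test/*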
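Somewhere between the one-liner and the long sub, the same job can be written as a short script. This is only a sketch under a few assumptions: the sub name tag_nested_clauses is invented here, the file names come from the command line, and a file is rewritten only when a replacement was actually made. It uses the atomic form of the bracket pattern and an atomic group before the lookahead.

```perl
use strict;
use warnings;

# Recursive pattern for one balanced (...) group; declared before it refers to itself.
my $re;
$re = qr/\((?:(?>[^()]+)|(??{$re}))*\)/;

# tag_nested_clauses() is a hypothetical name. It rewrites one file in place
# and returns how many "create table ... (...)" clauses got SPARSIFY appended.
sub tag_nested_clauses {
    my ($file) = @_;

    # Slurp the file: $/ is undefined only inside this block,
    # so there is nothing to restore by hand afterwards.
    my $text = do {
        local $/;
        open(my $in, '<', $file) or die "Cannot read $file: $!";
        <$in>;
    };
    $text = '' unless defined $text;    # an empty file reads as undef

    my $count = ($text =~ s/(create\s+table\s+.[a-z,A-Z,0-9,.]*\s*$re)(?>\s*)(?!SPARSIFY)/$1 SPARSIFY /gis) || 0;

    # Leave the file alone when nothing matched.
    if ($count) {
        open(my $out, '>', $file) or die "Cannot write $file: $!";
        print {$out} $text;
        close($out) or die "Cannot close $file: $!";
    }
    return $count;
}

# Process every file named on the command line.
for my $file (@ARGV) {
    printf "%s: %d replacement(s)\n", $file, tag_nested_clauses($file);
}
```

Saved under a name of your choice, it would be run over the same files as the one-liner, for example with test/* as arguments. Lexical filehandles and local $/ remove the need for the bookkeeping with $holdTerminator.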