Wednesday, November 11, 2015

Sed Global Substitution - Ordering Matters!

I noticed a little nuance while programmatically generating some code segments (programming programming). It seems obvious: when I program by hand I never hit the issue, since I'm fairly tuned in to the logical flow. But when the code is generated by other code, it pays to keep in mind that the order of global substitutions matters.

Test string: "trouble_%here_(_if_order_i$_wrong.html"

\x25 is the hex code for percent (%). If the percent substitution sits at the end (the right-hand side) of the list of sed global substitutions, there is trouble: by the time it runs, the earlier substitutions have already inserted %'s of their own, and those get replaced as well.

E.g.
sed 's/\x20/%20/g;s/\x21/%21/g;s/\x22/%22/g;s/\x23/%23/g;s/\x5c\x24/%24/g;s/\x25/%25/g' <<< 'trouble_%here_(_if_order_i$_wrong.html'

trouble_%25here_(_if_order_i%2524_wrong.html

Since the replacement of percents is all the way on the right-hand side (generated by the sequence {20..25}), the %'s produced by the earlier substitutions are themselves replaced: the $ first becomes %24, and then its % becomes %25, giving %2524.

Moving the percent substitution to the front fixes it:

sed 's/\x25/%25/g;s/\x20/%20/g;s/\x21/%21/g;s/\x22/%22/g;s/\x23/%23/g;s/\x5c\x24/%24/g' <<< 'trouble_%here_(_if_order_i$_wrong.html'

trouble_%25here_(_if_order_i%24_wrong.html

Since the percent substitution is now on the left-hand side, sed replaces any %'s already in the string first, and the %'s introduced by the subsequent substitutions are left alone.
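For the generated case, here is a minimal sketch of how the list could be built so that the percent rule can be placed at either end and the two results compared. The helper names build_script and encode are my own for illustration, not part of the original commands, and the sketch assumes only the codes 20 through 25 are needed. It writes each character literally inside a bracket expression rather than as a \x escape, which keeps $ from being treated as an anchor.

```shell
#!/bin/sh
# Sketch only: build_script and encode are hypothetical helper names.

# Print a sed script covering hex codes 20..25.
# $1 = "first" or "last": where the percent rule (hex 25) is placed.
build_script() {
    rules=''
    for hex in 20 21 22 23 24; do
        # Convert the hex code to its literal character via an octal escape.
        ch=$(printf "\\$(printf '%o' "0x$hex")")
        # A bracket expression keeps characters such as $ literal in the regex.
        rules="${rules:+$rules;}s/[$ch]/%$hex/g"
    done
    if [ "$1" = first ]; then
        printf '%s\n' "s/%/%25/g;$rules"
    else
        printf '%s\n' "$rules;s/%/%25/g"
    fi
}

# Percent-encode string $2 using the rule order named by $1.
encode() {
    printf '%s\n' "$2" | sed "$(build_script "$1")"
}

encode last  'trouble_%here_(_if_order_i$_wrong.html'
encode first 'trouble_%here_(_if_order_i$_wrong.html'
```

With the percent rule last, the first call prints the doubly encoded %2524; with it first, the second call prints the intended %24. Note the test string is single-quoted so the shell does not try to expand $_wrong as a variable.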
