Murray Picton

Using PHP’s preg_replace with the ‘e’ modifier

For anyone who doesn’t know already, preg_replace is the PHP function for doing a regular expression replace on a string using a Perl Compatible Regular Expression (PCRE). It accepts certain modifiers that change the way the function works. One of the more interesting of these is the PREG_REPLACE_EVAL or ‘e’ modifier. For a full list of modifiers, take a look at the PHP manual.

Let’s take a look at a simple use of preg_replace:

<?php
$string = 'This is my string';
$pattern = '/This is my (\w+)$/';
$replacement = 'This is another $1';
echo preg_replace($pattern, $replacement, $string); //Outputs "This is another string"
?>

If you don’t know, regular expressions are basically a way of using wildcards to match arbitrary characters within a string to allow enhanced searching and replacement. In the example above, our replacement uses a back-reference: $1 stands for whatever the first parenthesised group in the pattern, (\w+), matched. If you want more information on PHP PCRE, take a look at the PHP manual or regular-expressions.info.
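A pattern can contain more than one group, and each gets its own numbered back-reference. The snippet below is a minimal sketch of that; the date string and the "items" text are made-up inputs for illustration only. It also shows the ${1} form, which is needed when a literal digit directly follows the reference.

```php
<?php
// Hypothetical input: reorder an ISO-style date using three back-references
$date = '2011-03-15';
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';
$replacement = '$3/$2/$1'; // $1 = year, $2 = month, $3 = day
$reordered = preg_replace($pattern, $replacement, $date);
echo $reordered, "\n"; //Outputs "15/03/2011"

// '$10' would be read as group 10, so ${1} is used to end the reference before the 0
$padded = preg_replace('/(\d+) items/', '${1}0 items', '5 items');
echo $padded, "\n"; //Outputs "50 items"
```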

The ‘e’ modifier

When used in preg_replace, the ‘e’ modifier makes PHP substitute any back-references into the replacement string, evaluate the result as PHP code, and use the returned value as the final replacement. This is amazingly powerful: it lets the developer build much more functionality on top of preg_replace and perform complex tasks in a few simple lines.

To demonstrate this, I will write a piece of code that will find all URLs in a string and urlencode them. Let’s take a look:

<?php
$string = 'http://www.google.com and http://www.murraypicton.com';
$pattern = '!(http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?)!e'; //Notice the 'e' modifier

$replacement = 'urlencode("$1")';
echo preg_replace($pattern, $replacement, $string); //Outputs "http%3A%2F%2Fwww.google.com and http%3A%2F%2Fwww.murraypicton.com"
?>

As you can see, we have replaced each URL with a urlencoded version, but this is not the only thing that can be done. Almost any PHP can be executed in the replacement, providing massive power to the developer. The URL regex was kindly taken from regexlib, a great place to quickly find pre-made regex patterns.

Security vulnerabilities

As is often quoted, “With great power comes great responsibility”, and preg_replace with the ‘e’ modifier is a great example of this. We are executing code that is built from a variable. This can be very dangerous if not properly controlled: if we allow the user to provide that variable, then effectively they could execute any code they want within our file. The ‘e’ modifier does try to get round this by escaping some characters (‘, “, \ and NULL) in the substituted back-references, but this still leaves some room for vulnerabilities.
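It is worth knowing that the ‘e’ modifier was deprecated in PHP 5.5 and removed in PHP 7.0, largely because of this risk. The alternative the PHP manual points to is preg_replace_callback, which is a different function from the one used so far: instead of evaluating a string as code, it passes the matches to a function. Below is a minimal sketch of the URL example rewritten that way, assuming PHP 5.3 or later for the anonymous function. The variable name $encoded is just an illustrative choice.

```php
<?php
$string = 'http://www.google.com and http://www.murraypicton.com';
// Same URL pattern as before, this time without the 'e' modifier
$pattern = '!(http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?)!';

// The callback receives the matches as an array; $matches[1] is the first group.
// The matched text is only ever handled as data, never evaluated as code.
$encoded = preg_replace_callback($pattern, function ($matches) {
    return urlencode($matches[1]);
}, $string);

echo $encoded, "\n"; //Outputs "http%3A%2F%2Fwww.google.com and http%3A%2F%2Fwww.murraypicton.com"
```

Because the user-supplied text never becomes part of a code string, there is nothing for an attacker to break out of, whatever characters the input contains.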
