Next: Capturing Up: Regular Expressions Previous: Regular Expressions


Matching

With matching, we can see if the text conforms to a certain description. In the simplest case, we can check if the text contains a certain sequence of characters. In the following example, we check if the string $log_line contains the word ``apache_pb'', which it does. The `=~' operator binds a string to a regular expression and returns `true' if the string matches the expression.

# matching a string to a string
$log_line =~ /apache_pb/; # returns true
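
Because the match yields a true or false value, it is usually used as a condition. The following is a minimal sketch, assuming a made-up sample log line (this section does not show the actual contents of $log_line); the names $has_logo and $lacks_php are hypothetical as well.

```perl
use strict;
use warnings;

# Hypothetical sample log line, for illustration only
my $log_line = '208.190.213.12 - - [05/Jan/2003:10:15:32 -0600] "GET /icons/apache_pb.gif HTTP/1.0" 200 2326';

# =~ is true when the pattern is found anywhere in the string
my $has_logo = ($log_line =~ /apache_pb/) ? 1 : 0;

# !~ is the negated form: true when the pattern is NOT found
my $lacks_php = ($log_line !~ /\.php/) ? 1 : 0;

print "line mentions apache_pb\n"   if $has_logo;
print "line does not mention .php\n" if $lacks_php;
```

With this sample line, both messages are printed.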

Regular expressions also provide something called a ``character class'', which matches any one of a set of characters at the point where it appears in the regular expression. They also include facilities for specifying how many times in a row an element of the regular expression should match (``quantifiers''), and for specifying where in the text a match should occur (``anchors''). The exact syntax for these facilities is beyond the scope of this tutorial, but you can peruse the following examples for a taste of regular expressions.
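
As a warm-up before the IP address examples, here is a small sketch that shows each of the three facilities in isolation: a character class, a repetition count, and a position anchor. The test strings and variable names are invented for illustration.

```perl
use strict;
use warnings;

# Character class: [aeiou] matches any one vowel at that position
my $class_hit  = ('log' =~ /l[aeiou]g/) ? 1 : 0;   # 'o' is in the class
my $class_miss = ('lxg' =~ /l[aeiou]g/) ? 1 : 0;   # 'x' is not

# Repetition count: \d{2,4} matches two to four digits in a row
my $count_hit  = ('port 8080' =~ /\d{2,4}/) ? 1 : 0;
my $count_miss = ('port 8'    =~ /\d{2,4}/) ? 1 : 0;

# Anchor: ^ ties the match to the start of the string
my $anchor_hit  = ('GET /index.html'    =~ /^GET/) ? 1 : 0;
my $anchor_miss = ('FORGET /index.html' =~ /^GET/) ? 1 : 0;

printf "class: %d %d, count: %d %d, anchor: %d %d\n",
    $class_hit, $class_miss, $count_hit, $count_miss,
    $anchor_hit, $anchor_miss;
# prints "class: 1 0, count: 1 0, anchor: 1 0"
```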

# a simple example
$log_line =~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/;
#
# or, a much more convoluted example
$ipnum = qr/([0-1]?\d\d?|2[0-4]\d|25[0-5])/;
$log_line =~ /^$ipnum\.$ipnum\.$ipnum\.$ipnum\s/;
#
# and finally, a decent compromise
$log_line =~ /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s/;

All these regular expressions will match the IP address at the beginning of $log_line, but they aren't all quite the same. The first will match any sequence of four integers separated by periods (208.190.213.12 and 1.45.3.999 and 314156.2171828.1618.1412). The second will match only valid IP addresses and 0.0.0.0, which is not a valid IP address, but close. The third one falls somewhere in the middle: it limits each number to three digits, so it still accepts 1.45.3.999 but rejects 314156.2171828.1618.1412. It should be sufficient for our purposes.
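
The differences can be checked directly. The sketch below runs the three patterns against the three sample addresses mentioned above. The surrounding log line text and the variable names ($loose, $octet, $strict, $middle, %result) are assumptions made for this illustration, and the octet pattern uses a non-capturing group (?:...) in place of the capturing parentheses, which does not change what it matches.

```perl
use strict;
use warnings;

# The three patterns from this section, precompiled with qr//
my $loose  = qr/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/;
my $octet  = qr/(?:[0-1]?\d\d?|2[0-4]\d|25[0-5])/;   # one number from 0 to 255
my $strict = qr/^$octet\.$octet\.$octet\.$octet\s/;
my $middle = qr/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s/;

my %result;
for my $ip ('208.190.213.12', '1.45.3.999', '314156.2171828.1618.1412') {
    # Hypothetical log line: the address followed by a space and filler
    my $line = "$ip - - rest of line";

    # Record yes/no for each pattern, in the order loose, strict, middle
    $result{$ip} = join ' ',
        map { ($line =~ $_) ? 'yes' : 'no' } $loose, $strict, $middle;

    printf "%-26s %s\n", $ip, $result{$ip};
}
# 208.190.213.12             yes yes yes
# 1.45.3.999                 yes no yes
# 314156.2171828.1618.1412   yes no no
```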


saville stephen a 2003-01-05