
Basics of Regular expressions

Basic Syntax of Regular Expressions (as from PHPBuilder.com)

First of all, let's take a look at two special symbols: '^' and '$'. What they do is indicate the start and the end of a string, respectively, like this:
"^The": matches any string that starts with "The";
"of despair$": matches a string that ends in the substring "of despair";
"^abc$": a string that starts and ends with "abc" -- that could only be "abc" itself!
"notice": a string that has the text "notice" in it.
You can see that if you don't use either of the two characters we mentioned, as in the last example, you're saying that the pattern may occur anywhere inside the string -- you're not "hooking" it to any of the edges.
There are also the symbols '*', '+', and '?', which denote the number of times a character or a sequence of characters may occur. What they mean is: "zero or more", "one or more", and "zero or one." Here are some examples:

"ab*": matches a string that has an a followed by zero or more b's ("a", "ab", "abbb", etc.);
"ab+": same, but there's at least one b ("ab", "abbb", etc.);
"ab?": there might be a b or not;
"a?b+$": a possible a followed by one or more b's ending a string.
You can also use bounds, which come inside braces and indicate ranges in the number of occurrences:
"ab{2}": matches a string that has an a followed by exactly two b's ("abb");
"ab{2,}": there are at least two b's ("abb", "abbbb", etc.);
"ab{3,5}": from three to five b's ("abbb", "abbbb", or "abbbbb").
Note that you must always specify the first number of a range (i.e., "{0,2}", not "{,2}"). Also, as you might have noticed, the symbols '*', '+', and '?' have the same effect as using the bounds "{0,}", "{1,}", and "{0,1}", respectively.
Now, to quantify a sequence of characters, put them inside parentheses:
"a(bc)*": matches a string that has an a followed by zero or more copies of the sequence "bc";
"a(bc){1,5}": one through five copies of "bc."
There's also the '|' symbol, which works as an OR operator:
"hi|hello": matches a string that has either "hi" or "hello" in it;
"(b|cd)ef": a string that has either "bef" or "cdef";
"(a|b)*c": a string that has any sequence of a's and b's (in any order, possibly empty) followed by a c.
A period ('.') stands for any single character:
"a.[0-9]": matches a string that has an a followed by one character and a digit;
"^.{3}$": a string with exactly 3 characters.
Bracket expressions specify which characters are allowed in a single position of a string:
"[ab]": matches a string that has either an a or a b (that's the same as "a|b");
"[a-d]": a string that has lowercase letters 'a' through 'd' (that's equal to "a|b|c|d" and even "[abcd]");
"^[a-zA-Z]": a string that starts with a letter;
"[0-9]%": a string that has a single digit before a percent sign;
",[a-zA-Z0-9]$": a string that ends in a comma followed by an alphanumeric character.
You can also list which characters you DON'T want -- just use a '^' as the first symbol in a bracket expression
(e.g., "%[^a-zA-Z]%" matches a string with a character that is not a letter between two percent signs).
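The bracket expressions above, including the negated form, can be checked with a sketch along these lines. The sample subjects are made up for illustration, and preg_match() with '/' delimiters is assumed.

```php
<?php
$brackets = [
    'letter_start' => preg_match('/^[a-zA-Z]/', 'x1'),          // starts with a letter
    'digit_start'  => preg_match('/^[a-zA-Z]/', '1x'),          // starts with a digit
    'percent'      => preg_match('/[0-9]%/', 'up 5% today'),    // digit before '%'
    'comma_end'    => preg_match('/,[a-zA-Z0-9]$/', 'a,b'),     // ends in ",b"
    'negated_ok'   => preg_match('/%[^a-zA-Z]%/', '%7%'),       // '7' is not a letter
    'negated_fail' => preg_match('/%[^a-zA-Z]%/', '%a%'),       // 'a' is a letter
];

var_export($brackets);
```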
To be taken literally, the characters "^.[$()|*+?{\" must be escaped with a backslash ('\'), as
they have special meaning. On top of that, you must escape the backslash character itself in PHP3 strings, so,
for instance, the regular expression "(\$|¥)[0-9]+" would have the function call: ereg("(\\$|¥)[0-9]+", $str)
(what string does that validate?)
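One way to explore the question above is the sketch below. It is an illustration under assumptions: it uses preg_match() with '/' delimiters rather than ereg() (which newer PHP versions no longer provide), and it writes the pattern in a single-quoted string, where "\$" reaches the regex engine unchanged. The pattern matches any string containing a dollar or yen sign followed by one or more digits. preg_quote() is shown as a way to escape special characters in arbitrary text.

```php
<?php
// Single-quoted PHP strings keep \$ intact, so no doubled backslash is needed.
$pattern = '/(\$|¥)[0-9]+/';

$dollar = preg_match($pattern, 'Total: $42', $m1);   // $m1[0] holds the matched text
$yen    = preg_match($pattern, '¥300', $m2);
$plain  = preg_match($pattern, 'Total: 42');         // no currency sign before the digits

// preg_quote() puts a backslash in front of regex metacharacters.
$quoted = preg_quote('$5.00 (approx)');

var_export([$dollar, $m1[0], $yen, $m2[0], $plain, $quoted]);
```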
Example 1. Examples of valid patterns
 * /<\/\w+>/
 * |(\d{3})-\d+|Sm
 * /^(?i)php[34]/
 * {^\s+(\s+)?$}
Example 2. Examples of invalid patterns
 * /href='(.*)' - missing ending delimiter
 * /\w+\s*\w+/J - unknown modifier 'J'
 * 1-\d3-\d3-\d4| - missing starting delimiter
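To see how delimiter problems surface at run time, here is a minimal sketch that feeds two of the valid and two of the invalid patterns above to preg_match(). The subjects are invented for illustration, and the '@' operator is used only to keep the warnings out of the output. For a pattern that cannot be compiled, preg_match() returns false rather than 0 or 1.

```php
<?php
// Valid: '/' delimiters, and brace-style '{' ... '}' delimiters.
$slashes = @preg_match('/<\/\w+>/', '</p>');
$braces  = @preg_match('{^\s+(\s+)?$}', '   ');

// Invalid: no closing '/' after the pattern.
$missingEnd = @preg_match("/href='(.*)'", "href='x'");

// Invalid: the first character '1' is alphanumeric, so it cannot act as a delimiter.
$missingStart = @preg_match('1-\d3-\d3-\d4|', '1-2');

var_dump($slashes, $braces, $missingEnd, $missingStart);
```

Comparing the result with === false distinguishes a broken pattern from a pattern that simply did not match.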
Some useful PHP functions and their use (php.net man pages)


preg_split

(PHP 3 >= 3.0.9, PHP 4)
preg_split -- Split string by a regular expression

Description
array preg_split ( string pattern, string subject [, int limit [, int flags]])

Returns an array containing substrings of subject split along boundaries matched by pattern.

If limit is specified, then only substrings up to limit are returned, and if limit is -1, it actually means "no limit", which is useful for specifying the flags.

flags can be any combination of the following flags (combined with bitwise | operator):

PREG_SPLIT_NO_EMPTY
If this flag is set, only non-empty pieces will be returned by preg_split().

PREG_SPLIT_DELIM_CAPTURE
If this flag is set, parenthesized expression in the delimiter pattern will be captured and returned as well. This flag was added for 4.0.5.

PREG_SPLIT_OFFSET_CAPTURE
If this flag is set, for every occurring match the appendant string offset will also be returned. Note that this changes the return value into an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1. This flag is available since PHP 4.3.0.

Example 1. preg_split() example: Get the parts of a search string
<?php
// split the phrase by any number of commas or space characters,
// which include " ", \r, \t, \n and \f
$keywords = preg_split("/[\s,]+/", "hypertext language, programming");
?>

Example 2. Splitting a string into component characters
<?php
$str = 'string';
$chars = preg_split('//', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($chars);
?>

Example 3. Splitting a string into matches and their offsets
<?php
$str = 'hypertext language programming';
$chars = preg_split('/ /', $str, -1, PREG_SPLIT_OFFSET_CAPTURE);
print_r($chars);
?>

will yield:

Array
(
    [0] => Array
        (
            [0] => hypertext
            [1] => 0
        )
    [1] => Array
        (
            [0] => language
            [1] => 10
        )
    [2] => Array
        (
            [0] => programming
            [1] => 19
        )
)

Note: Parameter flags was added in PHP 4 Beta 3.
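The manual examples do not show PREG_SPLIT_DELIM_CAPTURE or a positive limit, so here is a small sketch of both, plus PREG_SPLIT_NO_EMPTY on a string with empty pieces. The input strings and variable names are made up for illustration.

```php
<?php
// Parentheses in the delimiter pattern: the matched '+' or '-' is returned as well.
$byOperator = preg_split('/([+-])/', '10+20-5', -1, PREG_SPLIT_DELIM_CAPTURE);

// A limit of 2: at most two pieces, the last one holding the rest of the subject.
$limited = preg_split('/[\s,]+/', 'hypertext language, programming', 2);

// Without the flag, 'a,,b,' would also yield empty strings between and after commas.
$nonEmpty = preg_split('/,/', 'a,,b,', -1, PREG_SPLIT_NO_EMPTY);

print_r($byOperator);
print_r($limited);
print_r($nonEmpty);
```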

preg_match

(PHP 3 >= 3.0.9, PHP 4)
preg_match -- Perform a regular expression match

Description
int preg_match ( string pattern, string subject [, array matches [, int flags]])

Searches subject for a match to the regular expression given in pattern.

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

flags can be the following flag:

PREG_OFFSET_CAPTURE
If this flag is set, for every occurring match the appendant string offset will also be returned. Note that this changes the return value into an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1. This flag is available since PHP 4.3.0.

The flags parameter is available since PHP 4.3.0.

preg_match() returns the number of times pattern matches. That will be either 0 times (no match) or 1 time because preg_match() will stop searching after the first match. preg_match_all() on the contrary will continue until it reaches the end of subject. preg_match() returns FALSE if an error occurred.

Tip: Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() or strstr() instead as they will be faster.

Example 1. Find the string of text "php"
<?php
// The "i" after the pattern delimiter indicates a case-insensitive search
if (preg_match ("/php/i", "PHP is the web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
?>

Example 2. Find the word "web"
<?php
/* The \b in the pattern indicates a word boundary, so only the distinct
 * word "web" is matched, and not a word partial like "webbing" or "cobweb" */
if (preg_match ("/\bweb\b/i", "PHP is the web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
if (preg_match ("/\bweb\b/i", "PHP is the website scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
?>

Example 3. Getting the domain name out of a URL
<?php
// get host name from URL
preg_match("/^(http:\/\/)?([^\/]+)/i", "http://www.php.net/index.html", $matches);
$host = $matches[2];
// get last two segments of host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "domain name is: {$matches[0]}\n";
?>
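The examples above do not exercise PREG_OFFSET_CAPTURE or the strpos() tip, so the following sketch illustrates both on the same sentence. The variable names are chosen for illustration only.

```php
<?php
$subject = 'PHP is the web scripting language of choice.';

// With PREG_OFFSET_CAPTURE, each entry of $matches is [matched text, offset].
$found = preg_match('/\bweb\b/i', $subject, $matches, PREG_OFFSET_CAPTURE);

// "website" contains "web", but there is no word boundary after it.
$partial = preg_match('/\bweb\b/i', 'PHP is the website scripting language of choice.');

// For a plain substring check, strpos() is enough; compare against false,
// because a hit at position 0 is also a valid result.
$pos = strpos($subject, 'web');

print_r($matches);
var_dump($found, $partial, $pos);
```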
