|Easily use the power of regular expressions in your C/C++ applications with RegexBuddy. |
Create and analyze regex patterns with RegexBuddy's intuitive regex building blocks. Implement regexes in your applications with instant C and C++ code snippets. Just tell RegexBuddy what you want to achieve, and copy and paste the auto-generated PCRE-based C code. Get your own copy of RegexBuddy now.
PCRE is short for Perl Compatible Regular Expressions. It is the name of an open source library written in C by Phillip Hazel. The library is compatible with a great number of C compilers and operating systems. Many people have derived libraries from PCRE to make it compatible with other programming languages. E.g. there are several Delphi components that are simply wrappers around the PCRE library compiled into a Win32 DLL. The library is also included with many Linux distributions as a shared .so library and a .h header file. The PHP preg functions and the REALbasic RegEx class are built on top of PCRE.
PCRE implements almost the entire Perl 5.8 regular expression syntax. Only the support for various Unicode properties with \p is incomplete, though the most important ones are supported.
Using PCRE is very straightforward. Before you can use a regular expression, it needs to be converted into a binary format for improved efficiency. To do this, simply call pcre_compile() passing your regular expression as a null-terminated string. The function will return a pointer to the binary format. You cannot do anything with the result except pass it to the other pcre functions.
To use the regular expression, call pcre_exec() passing the pointer returned by pcre_compile(), the character array you want to search through, and the number of characters in the array (which need not be null-terminated). You also need to pass a pointer to an array of integers where pcre_exec() will store the results, as well as the length of the array expressed in integers. The length of the array should equal the number of capturing groups you want to support, plus one (for the entire regex match), multiplied by three (!). The function will return -1 if no match could be found. Otherwise, it will return the number of capturing groups filled plus one. If there are more groups than fit into the array, it will return 0. The first two integers in the array with results contain the start of the regex match (counting bytes from the start of the array) and the number of bytes in the regex match, respectively. The following pairs of integers contain the start and length of the backreferences. So array[n*2] is the start of capturing group n, and array[n*2+1] is the length of capturing group n, with capturing group 0 being the entire regex match.
When you are done with a regular expression, all pcre_dispose() with the pointer returned by pcre_compile() to prevent memory leaks.
The PCRE library only supports regex matching, a job it does rather well. It provides no support for search-and-replace, splitting of strings, etc. This is not a major issue, as you can easily do that in your own code.
You can find more information about PCRE on http://www.pcre.org/.
By default, PCRE compiles without Unicode support. If you try to use \p, \P or \X in your regular expressions, PCRE will complain it was compiled without Unicode support.
To compile PCRE with Unicode support, you need to define the SUPPORT_UTF8 and SUPPORT_UCP conditional defines. If PCRE's configuration script works on your system, you can easily do this by running ./configure --enable-unicode-properties before running make.
Did this website just save you a trip to the bookstore? Please make a donation to support this site, and you'll get a lifetime of advertisement-free access to this site!
Page URL: http://www.Regular-Expressions.info/pcre.html
Page last updated: 22 September 2010
Site last updated: 18 April 2013
Copyright © 2003-2013 Jan Goyvaerts. All rights reserved.
|Languages & Libraries|
|Visual Basic 6|
|XQuery & XPath|
|Tools and Languages|
|About This Site|
|RSS Feed & Blog|