Comparison of regular expression engines
From Seo Wiki - Search Engine Optimization and Programming Languages
Libraries
Library | Official website | Programming language | Software license |
---|---|---|---|
Boost.Regex | Boost C++ Libraries | C++ | Boost Software License |
Boost.Xpressive | Boost C++ Libraries | C++ | Boost Software License |
CL-PPCRE | Edi Weitz | Common Lisp | BSD |
DEELX | RegExLab | C++ | "free for personal use and commercial use" |
GLib/GRegex | Marco Barisione | C | LGPL |
GRETA | Microsoft Research | C++ | ? |
ICU | International Components for Unicode | C/C++/Java | ICU license |
Jakarta/Regexp | The Apache Jakarta Project | Java | Apache License |
JRegex | JRegex | Java | BSD |
Oniguruma | Kosako | C | BSD |
Pattwo | Stevesoft | Java (compatible with Java 1.0) | LGPL |
PCRE | Philip Hazel | C/C++ | BSD |
Qt/QRegExp | Qt Software | C++ | Qt GNU GPL v. 3.0 / Qt GNU LGPL v. 2.1 / Qt Commercial |
regex - Henry Spencer's regular expression libraries | ArgList | C | BSD |
TRE | Ville Laurikari | C | BSD |
TPerlRegEx | TPerlRegEx VCL Component | Object Pascal | MPLv1.1 |
TRegExpr | RegExp Studio | Object Pascal | Freeware |
- ^ Boost.Regex was formerly called Regex++.
- ^ GRegex has been included in GLib since version 2.13.0.
- ^ PCRE's C++ bindings were developed by Google and became officially part of PCRE in 2006.
Languages
Language | Official website | Software license | Remarks |
---|---|---|---|
.NET | MSDN | Proprietary | |
D | D | Proprietary | |
Haskell | Haskell.org | BSD3 | |
Java | Java | GNU General Public License | REs are written as strings (all backslashes must be doubled, hurting readability). |
JavaScript/ECMAScript | | ? | Limited, but REs are first-class citizens of the language with a specific /.../mod syntax. |
Lua | Lua.org | MIT License | Uses a simplified, limited dialect. Can be bound to a more powerful library, like PCRE or an alternative parser like LPeg. |
Perl | Perl.com | Artistic License or the GNU General Public License | Full, central part of the language. |
PHP | PHP.net | ? | Has two implementations; PCRE is the more efficient of the two (speed, functionality). |
Python | python.org | Python Software Foundation License | |
Ruby | ruby-doc.org | GNU Library General Public License | |
SAP ABAP | SAP.com | ? | |
Tcl 8.4 | tcl.tk | Tcl/Tk License (permissive, similar to BSD) | |
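
The Java remark in the table concerns regexes written inside ordinary string literals, where each backslash is escaped once for the string and once for the regex. The following is a minimal sketch of the same effect in Python, chosen here only for illustration (the table itself makes no such remark about Python). It contrasts a normal string literal with a raw string literal; the variable names are made up for the example.

```python
import re

# In a normal string literal every regex backslash must be doubled.
doubled = "\\d+\\.\\d+"
# A raw string literal passes backslashes through unchanged.
raw = r"\d+\.\d+"

# Both literals denote the same pattern text.
same_pattern = doubled == raw

# Either form matches a simple decimal number.
match = re.search(raw, "pi is about 3.14")
found = match.group(0) if match else None
```

Languages with a dedicated regex literal syntax, such as the /.../mod form mentioned for JavaScript, avoid the doubling altogether.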
Language features
NOTE: An application that uses a library for regular expression support does not necessarily offer the library's full feature set. For example, GNU Grep, which uses PCRE, does not offer lookahead support, though PCRE does.
Part 1
Engine | "+" quantifier | Negated character classes | Non-greedy quantifiers | Shy groups | Lookahead | Lookbehind | Backreferences | >9 indexable captures |
---|---|---|---|---|---|---|---|---|
Boost.Regex | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Boost.Xpressive | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
CL-PPCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
EmEditor | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
GLib/GRegex | ? | ? | ? | ? | ? | ? | ? | ? |
GNU Grep | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ? |
Haskell | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Java | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
ICU Regex | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
JGsoft | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
.NET | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
OmniOutliner 3.6.2 | Yes | Yes | Yes | No | No | No | ? | ? |
PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Perl | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
PHP | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Python | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Qt/QRegExp | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
Ruby | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
TRE | Yes | Yes | Yes | Yes | No | No | Yes | No |
Vim | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
- ^ Non-greedy quantifiers match as few characters as possible, instead of the default of as many as possible. Note that many older, pre-POSIX engines were non-greedy and did not have greedy quantifiers at all.
- ^ Shy groups, also called non-capturing groups, cannot be referred to with backreferences; they are used to speed up matching where the group's content need not be accessed later.
- ^ Backreferences enable referring to previously matched groups in later parts of the regex and/or the replacement string (where applicable). For instance, ([ab]+)\1 matches the whole of "abab" but not the whole of "abaab".
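
The features described in these notes can be tried out in any engine that supports them. Here is a minimal sketch using Python's standard `re` module (one of the engines in the table); the sample strings and variable names are illustrative only.

```python
import re

# Greedy vs. non-greedy: ".+" takes as much as possible, ".+?" as little.
greedy = re.search(r"<.+>", "<a><b>").group(0)
lazy = re.search(r"<.+?>", "<a><b>").group(0)

# Shy (non-capturing) group: (?:...) groups without creating a capture.
shy_groups = re.match(r"(?:ab)+(c)", "ababc").groups()

# Lookahead and lookbehind assert context without consuming it.
ahead = re.findall(r"\w+(?=!)", "hello! world")
behind = re.findall(r"(?<=\$)\d+", "cost $42 or 17")

# Backreference: \1 must repeat exactly what group 1 matched.
# fullmatch is used because an unanchored search would still find
# the substring "aa" inside "abaab".
backref = re.compile(r"([ab]+)\1")
full_abab = backref.fullmatch("abab") is not None
full_abaab = backref.fullmatch("abaab") is not None
```

Here `greedy` spans both tags while `lazy` stops at the first closing bracket, and only the `(c)` group shows up in `shy_groups`.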
Part 2
Engine | Directives | Conditionals | Atomic groups | Named capture | Comments | Embedded code | Partial matching | Fuzzy matching | Unicode property support [1] |
---|---|---|---|---|---|---|---|---|---|
Boost.Regex | Yes | Yes | Yes | Yes | Yes | No | Yes | No | Yes |
Boost.Xpressive | Yes | No | Yes | Yes | Yes | No | Yes | No | No |
CL-PPCRE | Yes | Yes | Yes | Yes | Yes | Yes | ? | No | No |
EmEditor | Yes | Yes | ? | ? | Yes | No | Yes | No | ? |
GLib/GRegex | ? | ? | ? | ? | ? | No | Yes | No | Yes |
GNU Grep | Yes | Yes | ? | Yes | Yes | No | ? | No | No |
Haskell | ? | ? | ? | ? | ? | No | ? | No | No |
Java | Yes | Yes | Yes | Yes | No | No | ? | No | Yes |
ICU Regex | Yes | Yes | Yes | No | Yes | No | No | No | Yes |
JGsoft | Yes | Yes | Yes | Yes | Yes | No | Yes | ? | Yes |
.NET | Yes | Yes | Yes | Yes | Yes | No | ? | No | Yes |
OmniOutliner 3.6.2 | ? | ? | ? | ? | No | No | ? | No | ? |
PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
Perl | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes |
PHP | Yes | Yes | Yes | Yes | Yes | No | No | No | No |
Python | Yes | Yes | No | Yes | Yes | No | No | No | No |
Qt/QRegExp | No | No | No | No | No | No | Yes | No | Yes |
Ruby | Yes | No | No | No | Yes | Yes | No | No | No |
TRE | Yes | No | No | No | Yes | No | No | Yes | ? |
Vim | Yes | ? | Yes | ? | ? | No | Yes | No | ? |
- ^ Directives are also known as flag modifiers or option letters. Example pattern: "(?i:test)"
- ^ Atomic groups are also called independent sub-expressions.
- ^ Named capture is similar to backreferences, but with names instead of indices.
- ^ PCRE: named capture is available as of PCRE 7.0 (as of PCRE 4.0 with the Python-like syntax (?P<name>...)).
- ^ Perl: named capture is available as of Perl 5.9.5.
- ^ Boost.Regex, GLib/GRegex, PCRE: Unicode property support requires the optional Unicode support to be enabled.
- ^ Ruby: as of Ruby 1.8. The current development version, Ruby 1.9, has additional features.
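
Several of the Part 2 features that the table marks "Yes" for Python can be sketched with the standard `re` module. This is an illustration under assumptions, not a reference: the sample patterns and variable names are made up, and the scoped directive form `(?i:...)` assumes Python 3.6 or later.

```python
import re

# Directive (inline flag) scoped to a group: only "test" is case-insensitive.
directive = re.fullmatch(r"(?i:test)-ok", "TeSt-ok") is not None
directive_scope = re.fullmatch(r"(?i:test)-ok", "test-OK") is not None

# Named capture with the Python-like syntax (?P<name>...).
date = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2006-11")
year, month = date.group("year"), date.group("month")

# A named group can also be back-referenced by name with (?P=name).
doubled_word = re.search(r"\b(?P<w>\w+) (?P=w)\b", "it was was fine").group("w")

# Comment inside the pattern: (?#...) is ignored by the matcher.
commented = re.fullmatch(r"\d+(?#digits)px", "12px") is not None

# Conditional: (?(1)...) applies only if group 1 took part in the match,
# so a closing ">" is required exactly when an opening "<" was seen.
cond = re.compile(r"(<)?\w+(?(1)>)")
cond_ok = cond.fullmatch("<tag>") is not None
cond_bad = cond.fullmatch("<tag") is not None
```

Other engines spell some of these differently, for example `(?<name>...)` for named capture, so the exact syntax should be checked against each engine's own documentation.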
API features
Engine | Native UTF-16 support | Native UTF-8 support | Non-linear input support | Dot-matches-newline option | Anchor-matches-newline option |
---|---|---|---|---|---|
Boost.Regex | No | No | Yes | Yes | Yes |
Boost.Xpressive | ? | ? | ? | ? | ? |
GLib/GRegex | No | Yes | No | Yes | Yes |
ICU Regex | Yes | No | No | Yes | ? |
Java | Yes | ? | ? | Yes | Yes |
.NET | Yes | No | Yes | Yes | ? |
PCRE | No | Yes | No | Yes | Yes |
Qt/QRegExp | Yes | No | No | No | No |
TRE | No | ? | Yes | Yes | Yes |
- ^ Native support means that conversion between UTF-16 and UTF-8 is not required, Unicode properties are supported, and the encoding type is always available (the platform-dependent wchar_t does not count).
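
The two newline-related options in the table change what "." and the anchors match. The sketch below uses Python's `re` module purely as an example: Python is not listed in this table, and `re.DOTALL` and `re.MULTILINE` are Python's own names for the options. Other engines expose them under different flags.

```python
import re

text = "first line\nsecond line"

# By default "." does not match a newline, so this search finds nothing.
dot_default = re.search(r"first.+second", text)
# With the dot-matches-newline option (DOTALL in Python) it succeeds.
dot_all = re.search(r"first.+second", text, re.DOTALL)

# By default "^" only matches at the start of the whole input.
anchor_default = re.findall(r"^\w+", text)
# With the anchor-matches-newline option (MULTILINE in Python),
# "^" also matches right after each newline.
anchor_multi = re.findall(r"^\w+", text, re.MULTILINE)
```

Without the options only the first line is visible to "^", and the dot cannot cross the line break.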
External links
- Regular Expression Flavor Comparison — Detailed comparison of the most popular regular expression flavors
- Regexp Syntax Summary