
Regular Expression Unicode Syntax Reference

Unicode Characters
Character | Description | Example
\X | Matches a single Unicode grapheme, whether it is encoded as a single code point or as multiple code points using combining marks. A grapheme most closely resembles the everyday concept of a "character". | \X matches à encoded as U+0061 U+0300, à encoded as U+00E0, ©, etc.
\uFFFF, where FFFF are 4 hexadecimal digits | Matches a specific Unicode code point. Can be used inside character classes. | \u00E0 matches à encoded as U+00E0 only. \u00A9 matches ©.
\x{FFFF}, where FFFF are 1 to 4 hexadecimal digits | Perl syntax to match a specific Unicode code point. Can be used inside character classes. | \x{E0} matches à encoded as U+00E0 only. \x{A9} matches ©.
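The difference between matching one code point and matching a whole grapheme can be sketched in Python. This is only an illustration under stated assumptions: Python's standard `re` module is our choice, not a flavor named in the table. It accepts `\uFFFF` in patterns but has no `\X` token, so NFC normalization stands in here for grapheme-aware matching. The helper names `matches_code_point` and `matches_after_nfc` are hypothetical.

```python
import re
import unicodedata

PRECOMPOSED = "\u00E0"   # à as the single code point U+00E0
DECOMPOSED = "a\u0300"   # à as U+0061 followed by combining grave U+0300

# The raw string passes \u00E0 to the regex engine, which reads it
# as "exactly the code point U+00E0".
CODE_POINT = re.compile(r"\u00E0")


def matches_code_point(text: str) -> bool:
    """True if the whole text is the single code point U+00E0."""
    return CODE_POINT.fullmatch(text) is not None


def matches_after_nfc(text: str) -> bool:
    """Compose combining sequences first (NFC), then match U+00E0.

    This is a workaround, not an implementation of \\X: it only helps
    when a precomposed form of the grapheme exists.
    """
    composed = unicodedata.normalize("NFC", text)
    return CODE_POINT.fullmatch(composed) is not None


print(matches_code_point(PRECOMPOSED))  # True
print(matches_code_point(DECOMPOSED))   # False: two code points, neither is U+00E0
print(matches_after_nfc(DECOMPOSED))    # True: NFC composes a + U+0300 into U+00E0

# \uFFFF escapes also work inside a character class.
print(re.fullmatch(r"[\u00A9\u00E0]", "©") is not None)  # True
```

In a flavor that supports `\X`, both encodings of à would match directly without the normalization step.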
Unicode Properties, Scripts and Blocks
Character | Description | Example
\p{L} or \p{Letter} | Matches a single Unicode code point that has the property "letter". See Unicode Character Properties in the tutorial for a complete list of properties. Each Unicode code point has exactly one property. Can be used inside character classes. | \p{L} matches à encoded as U+00E0; \p{S} matches ©.
\p{Arabic} | Matches a single Unicode code point that is part of the Unicode script "Arabic". See Unicode Scripts in the tutorial for a complete list of scripts. Each Unicode code point is part of exactly one script. Can be used inside character classes. | \p{Thai} matches one of 83 code points in Thai script, from until
\p{InBasicLatin} | Matches a single Unicode code point that is part of the Unicode block "Basic Latin". See Unicode Blocks in the tutorial for a complete list of blocks. Each Unicode code point is part of exactly one block. Blocks may contain unassigned code points. Can be used inside character classes. | \p{InLatinExtended-A} matches any of the code points in the block U+0100 until U+017F (Ā until ſ).
\P{L} or \P{Letter} | Matches a single Unicode code point that does not have the property "letter". You can also use \P to match a code point that is not part of a particular Unicode block or script. Can be used inside character classes. | \P{L} matches ©.
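The property and block tests above can be approximated in a language whose regex engine lacks `\p{...}`. The sketch below assumes Python, whose standard `re` module does not support `\p` or `\P`. It looks up each character's general category with `unicodedata.category` and spells the Latin Extended-A block out as an explicit range. All function names are hypothetical. Script membership such as `\p{Arabic}` is not covered, because the standard library does not expose script data.

```python
import re
import unicodedata


def has_letter_property(ch: str) -> bool:
    # Stand-in for \p{L}: letter categories are Lu, Ll, Lt, Lm and Lo.
    return unicodedata.category(ch).startswith("L")


def has_symbol_property(ch: str) -> bool:
    # Stand-in for \p{S}: symbol categories are Sm, Sc, Sk and So.
    return unicodedata.category(ch).startswith("S")


def lacks_letter_property(ch: str) -> bool:
    # Stand-in for \P{L}: the negation of the letter test.
    return not has_letter_property(ch)


# Stand-in for \p{InLatinExtended-A}: the block U+0100 until U+017F
# written as a plain code point range.
LATIN_EXTENDED_A = re.compile(r"[\u0100-\u017F]")


def in_latin_extended_a(ch: str) -> bool:
    return LATIN_EXTENDED_A.fullmatch(ch) is not None


print(unicodedata.category("\u00E0"))   # Ll
print(has_letter_property("\u00E0"))    # True
print(has_symbol_property("\u00A9"))    # True  (© has category So)
print(lacks_letter_property("\u00A9"))  # True
print(in_latin_extended_a("\u0100"))    # True  (Ā)
print(in_latin_extended_a("a"))         # False (a is in Basic Latin)
```

Because every code point has exactly one general category, `has_letter_property` and `lacks_letter_property` partition all characters, mirroring the relationship between `\p{L}` and `\P{L}`.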
