| Unicode Regexes |
| Introduction |
| Astral Characters |
| Code Points and Graphemes |
| Unicode Categories |
| Unicode Scripts |
| Unicode Blocks |
| Unicode Binary Properties |
| Unicode Property Sets |
| Unicode Script Runs |
| Unicode Boundaries |
Property sets are a kind of Unicode property that flavors may support. A property set can have multiple values. Every Unicode code point is assigned exactly one value of each set. In a regular expression you specify both a set and one of its values as the Property in \p{Property}. For example, the Bidi_Class property set has Left_To_Right as one of its values. The regex \p{Bidi_Class=Left_To_Right} matches any code point that has the Left_To_Right value for the Bidi_Class property.
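To make the idea of "one value per code point" concrete, here is a minimal Python sketch. Python is not one of the flavors discussed here, and its built-in re module does not accept \p{...} at all, so this sketch does not use a regex. It uses the standard unicodedata module to approximate what \p{Bidi_Class=Left_To_Right} would select. The helper name chars_with_bidi_class is made up for this illustration, and the values "L" and "R" are the short aliases of Left_To_Right and Right_To_Left.

```python
import unicodedata

def chars_with_bidi_class(text, short_value):
    """Hypothetical helper: keep the characters of text whose Bidi_Class
    short alias (e.g. "L" for Left_To_Right) equals short_value."""
    return [ch for ch in text if unicodedata.bidirectional(ch) == short_value]

# Latin letters, a space, three Hebrew letters (alef, bet, gimel), a space, digits.
sample = "abc \u05d0\u05d1\u05d2 123"

# Roughly what \p{Bidi_Class=Left_To_Right} would match in a supporting flavor.
ltr = chars_with_bidi_class(sample, "L")

# Roughly what \p{Bidi_Class=Right_To_Left} would match.
rtl = chars_with_bidi_class(sample, "R")
```

Each character lands in exactly one list at most, because a code point has only one Bidi_Class value. The spaces and digits appear in neither list since they have other values of the same property set.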
Strictly speaking, General_Category, Script, and Block are also property sets. But we handle those separately in this tutorial because they are supported by many more regex flavors and have alternative syntax specific to them. The only flavors to support any Unicode property sets other than these three are ICU, Perl, Ruby, and PCRE2. The Unicode property set reference lists all the property sets and all their possible values and indicates which versions of which flavors support them. Only ICU and Perl support most of the property sets. Ruby supports the Age property since Ruby 1.9 and the Grapheme_Cluster_Break property since Ruby 2.4. PCRE2 supports only the Bidi_Class property, and only since version 10.40. The same applies to PHP 8.2.0 and R 4.2.2 as they are based on PCRE2.
Many flavors only support a limited number of property sets. The fact that a flavor is built on a certain version of Unicode does not mean it supports all the property sets that exist in that version of Unicode. The table below indicates which flavors support which property sets. If a flavor supports a property set then it supports all property values that are part of that set. Those values are listed in the description. The exact code points matched by each property value depend on the Unicode version the flavor was built with.
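The dependence on the Unicode version can be observed outside of a regex engine too. The following Python sketch, again using the standard unicodedata module rather than any of the flavors discussed here, reports which Unicode version the interpreter's database was built from and collects the code points that it classifies as Bidi_Class=Right_To_Left. The function name code_points_with_bidi_class is hypothetical. Note that Python returns an empty string for code points that have no entry in its database, so the totals may differ from what a regex flavor such as ICU reports.

```python
import sys
import unicodedata

def code_points_with_bidi_class(short_value):
    """Hypothetical helper: list every code point whose Bidi_Class short
    alias equals short_value in this interpreter's Unicode database."""
    return [cp for cp in range(sys.maxunicode + 1)
            if unicodedata.bidirectional(chr(cp)) == short_value]

# The result depends on the Unicode version bundled with the interpreter,
# just as a regex flavor's matches depend on the version it was built with.
right_to_left = code_points_with_bidi_class("R")
print("Unicode version:", unicodedata.unidata_version)
print("Code points with Bidi_Class=R:", len(right_to_left))
```

Running this under interpreters that bundle different Unicode versions can yield different counts, which mirrors how the same \p{...} token can match different code points in different versions of a flavor.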
Page URL: https://www.regular-expressions.info/unicodepropertyset.html
Page last updated: 16 June 2025
Site last updated: 09 January 2026
Copyright © 2003-2026 Jan Goyvaerts. All rights reserved.