Matching Floating Point Numbers with a Regular Expression

This example shows how to avoid a mistake that people inexperienced with regular expressions often make. As an example, we will try to build a regular expression that can match any floating point number. Our regex should also match integers, and floating point numbers where the integer part is not given. We will not try to match numbers with an exponent, such as 1.5e8 (150 million in scientific notation).

At first thought, the following regex seems to do the trick: [-+]?[0-9]*\.?[0-9]*. This defines a floating point number as an optional sign, followed by an optional series of digits (integer part), followed by an optional dot, followed by another optional series of digits (fraction part).

Spelling out the regex in words makes it obvious: everything in this regular expression is optional. This regular expression considers a sign by itself or a dot by itself as a valid floating point number. In fact, it even considers an empty string as a valid floating point number. If you tried to use this regex to find floating point numbers in a file, you’d get a zero-length match at every position in the file where no floating point number occurs.
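A minimal Python sketch of this problem, using the standard re module. The sample strings and variable names are only illustrative and do not come from the original example.

```python
import re

# The first attempt from the text: every element is optional.
naive = re.compile(r"[-+]?[0-9]*\.?[0-9]*")

# fullmatch() succeeds only if the pattern matches the entire string.
lone_sign_ok = naive.fullmatch("+") is not None   # True: a sign by itself
lone_dot_ok = naive.fullmatch(".") is not None    # True: a dot by itself
empty_ok = naive.fullmatch("") is not None        # True: the empty string

# Searching through text produces zero-length matches
# at positions where no number starts.
matches = naive.findall("ab 3.14")
empty_count = matches.count("")

print(lone_sign_ok, lone_dot_ok, empty_ok)
print(matches, empty_count)
```

The list printed at the end contains the real number 3.14 surrounded by empty strings, which is the noise described above.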

Not escaping the dot is also a common mistake. A dot that is not escaped matches any character, including a dot. If we had not escaped the dot, both 4.4 and 4X4 would be considered floating point numbers.
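The difference is easy to check in Python. This sketch compares the regex above with a variant in which the backslash before the dot was left out; the variable names are illustrative.

```python
import re

escaped = re.compile(r"[-+]?[0-9]*\.?[0-9]*")    # \. matches only a literal dot
unescaped = re.compile(r"[-+]?[0-9]*.?[0-9]*")   # . matches any character except a newline

results = {}
for text in ("4.4", "4X4"):
    results[text] = (
        escaped.fullmatch(text) is not None,
        unescaped.fullmatch(text) is not None,
    )
    print(text, results[text])

# 4.4 is accepted by both patterns.
# 4X4 is accepted only by the pattern with the unescaped dot.
```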

When creating a regular expression, it is more important to consider what it should not match than what it should. The above regex indeed matches a proper floating point number in full, because its quantifiers are greedy. But it also matches many things we do not want, which we have to exclude.

Here is a better attempt: [-+]?([0-9]*\.[0-9]+|[0-9]+). This regular expression matches an optional sign that is followed either by zero or more digits, a dot, and one or more digits (a floating point number with optional integer part), or by one or more digits (an integer).

This is a far better definition. Any match must include at least one digit. There is no way around the [0-9]+ part. We have successfully excluded the matches we do not want: those without digits.
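As a quick check, here is a sketch in Python that runs this regex against a few strings. The test strings are assumptions chosen for illustration.

```python
import re

better = re.compile(r"[-+]?([0-9]*\.[0-9]+|[0-9]+)")

should_match = ["42", "-3.14", ".5", "+0.001"]
# "5." is rejected as a whole because the dot must be followed by a digit.
should_not_match = ["", "+", ".", "-.", "abc", "5."]

accepted = [s for s in should_match if better.fullmatch(s)]
rejected = [s for s in should_not_match if not better.fullmatch(s)]

# finditer() is used instead of findall() because the pattern contains a
# capturing group, and findall() would return only the group's text.
found = [m.group(0) for m in better.finditer("x = -3.14, y = .5, z = 42")]
print(found)  # ['-3.14', '.5', '42']
```

Every match now contains at least one digit, so the search no longer returns empty strings.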

We can optimize this regular expression by removing the alternation: [-+]?[0-9]*\.?[0-9]+. The final [0-9]+ still guarantees at least one digit.
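The following Python sketch spot-checks that the two versions accept and reject the same strings. It is a check on a handful of assumed samples, not a proof of equivalence.

```python
import re

alternation = re.compile(r"[-+]?([0-9]*\.[0-9]+|[0-9]+)")
optimized = re.compile(r"[-+]?[0-9]*\.?[0-9]+")

samples = ["42", "-3.14", ".5", "+.5", "5.", "", "+", ".", "1.2.3", "abc"]

# For each sample, both patterns should either match the whole string
# or both fail to match it.
disagreements = [
    s for s in samples
    if (alternation.fullmatch(s) is None) != (optimized.fullmatch(s) is None)
]
print(disagreements)  # []
```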

If you also want to match numbers with exponents, you can use: [-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?. Notice that the entire exponent part is made optional as one group, rather than each element in the exponent being made optional separately. An exponent, if present, must therefore consist of the letter e followed by at least one digit.
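A short Python sketch of the exponent version follows. The sample strings are illustrative assumptions.

```python
import re

with_exponent = re.compile(r"[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?")

complete = ["1.5e8", "-2E-3", "6.02e+23", "42", ".5"]
incomplete = ["1.5e", "1.5e+", "e8"]

complete_ok = all(with_exponent.fullmatch(s) for s in complete)
incomplete_rejected = not any(with_exponent.fullmatch(s) for s in incomplete)

# When searching, a dangling "e" is simply left out of the match,
# because the exponent group as a whole is optional.
partial = with_exponent.search("1.5e").group(0)
print(complete_ok, incomplete_rejected, partial)  # True True 1.5

# The matched text can be handed to float() for conversion.
value = float(with_exponent.fullmatch("1.5e8").group(0))
print(value)  # 150000000.0
```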

Finally, if you want to validate whether a particular string holds a floating point number, rather than find a floating point number within longer text, you’ll have to anchor your regex: ^[-+]?[0-9]*\.?[0-9]+$ or ^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$. You can find additional variations of these regexes in RegexBuddy’s library.
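Validation with the anchored regex might look like this in Python. The function name is_float_literal is a hypothetical helper introduced for this sketch.

```python
import re

ANCHORED = re.compile(r"^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$")


def is_float_literal(text: str) -> bool:
    """Hypothetical helper: True if the anchored regex matches the string."""
    return ANCHORED.search(text) is not None


print(is_float_literal("3.14"))         # True
print(is_float_literal("-1e10"))        # True
print(is_float_literal("3.14 apples"))  # False
print(is_float_literal("price: 3.14"))  # False
print(is_float_literal(""))             # False

# Python-specific detail: without re.MULTILINE, $ also matches just before
# a newline at the very end of the string, so this is accepted too.
print(is_float_literal("3.14\n"))       # True
```

If the trailing newline should be rejected as well, one option in Python is to drop the anchors and call the compiled pattern's fullmatch() method instead of search().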