
In this example, I will show you how to avoid a common mistake made by people inexperienced with regular expressions. We will try to build a regular expression that can match any floating point number. Our regex should also match integers, and floating point numbers where the integer part is omitted (such as .5, where the integer part is implicitly zero). We will not try to match numbers with an exponent, such as 1.5e8 (150 million in scientific notation).
At first glance, the following regex seems to do the trick: [-+]?[0-9]*\.?[0-9]*. This defines a floating point number as an optional sign, followed by an optional series of digits (integer part), followed by an optional dot, followed by another optional series of digits (fraction part).
Spelling out the regex in words makes it obvious: everything in this regular expression is optional. This regular expression will consider a sign by itself or a dot by itself as a valid floating point number. In fact, it will even consider an empty string as a valid floating point number. This regular expression can cause serious trouble if it is used in a scripting language like Perl or PHP to verify user input.
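The claim above is easy to check. Here is a minimal sketch in Python; the choice of language and the helper name fully_matches are assumptions made for illustration, not part of the original example. It tests the all-optional regex against whole strings:

```python
import re

# The all-optional regex discussed above.
ALL_OPTIONAL = re.compile(r"[-+]?[0-9]*\.?[0-9]*")

def fully_matches(pattern, text):
    """Return True if pattern matches the entire string text."""
    return pattern.fullmatch(text) is not None

# Proper numbers match, but so do a lone sign, a lone dot, and the empty string.
for candidate in ["3.14", "-42", "+", ".", ""]:
    print(repr(candidate), fully_matches(ALL_OPTIONAL, candidate))
```

Every candidate in the list is accepted, including the last three, which is exactly the problem.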
Not escaping the dot is also a common mistake. A dot that is not escaped will match any character, including a dot. If we had not escaped the dot, 4.4 would be considered a floating point number, and 4X4 too.
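To see the difference the backslash makes, the sketch below compares the regex with and without the escape. Python and the names ESCAPED and UNESCAPED are assumptions chosen for illustration:

```python
import re

# Same regex twice; only the escaping of the dot differs.
ESCAPED = re.compile(r"[-+]?[0-9]*\.?[0-9]*")    # \. matches a literal dot only
UNESCAPED = re.compile(r"[-+]?[0-9]*.?[0-9]*")   # .  matches any character

for candidate in ["4.4", "4X4"]:
    print(candidate,
          ESCAPED.fullmatch(candidate) is not None,
          UNESCAPED.fullmatch(candidate) is not None)
```

Both versions accept 4.4, but only the unescaped version accepts 4X4.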
When creating a regular expression, it is more important to consider what it should not match than what it should. The above regex will indeed match a proper floating point number, because the regex engine is greedy: each optional part consumes as much of the input as it can. But it will also match many things we do not want, which we have to exclude.
Here is a better attempt: [-+]?([0-9]*\.[0-9]+|[0-9]+). This regular expression matches an optional sign, followed by either zero or more digits, a dot, and one or more digits (a floating point number with optional integer part), or by one or more digits (an integer).
This is a far better definition. Any match will include at least one digit, because there is no way around the [0-9]+ part. We have successfully excluded the matches we do not want: those without digits.
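A quick way to confirm this is to run the improved regex over strings that should and should not be accepted. The following Python sketch does that; the helper name is_float and the sample lists are assumptions made for illustration:

```python
import re

# The improved regex: every alternative requires at least one digit.
BETTER = re.compile(r"[-+]?([0-9]*\.[0-9]+|[0-9]+)")

def is_float(text):
    """Return True if the whole string is a number according to BETTER."""
    return BETTER.fullmatch(text) is not None

accepted = ["3.14", ".5", "-42", "+0.25"]
rejected = ["", "+", ".", "-."]

for candidate in accepted + rejected:
    print(repr(candidate), is_float(candidate))
```

The strings without any digit are now rejected, while integers and numbers with an omitted integer part still pass.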
We can optimize this regular expression as: [-+]?[0-9]*\.?[0-9]+.
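One way to gain confidence that the shorter regex accepts the same strings as the version with alternation is a brute-force comparison over short inputs. This is only a sketch in Python; the function name disagreements, the small alphabet, and the length limit are assumptions chosen to keep the search fast, and such a test is evidence rather than a proof:

```python
import re
from itertools import product

ALTERNATION = re.compile(r"[-+]?([0-9]*\.[0-9]+|[0-9]+)")
OPTIMIZED = re.compile(r"[-+]?[0-9]*\.?[0-9]+")

def disagreements(alphabet="+-.07X", max_len=4):
    """List every short string that one regex fully matches and the other does not."""
    found = []
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            text = "".join(chars)
            a = ALTERNATION.fullmatch(text) is not None
            b = OPTIMIZED.fullmatch(text) is not None
            if a != b:
                found.append(text)
    return found

print(disagreements())
```

An empty list means the two regexes agreed on every string that was tried.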
If you also want to match numbers with exponents, you can use: [-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?. Notice how I made the entire exponent part optional by grouping it together, rather than making each element in the exponent optional.
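The difference between the two approaches can be shown with a counterexample. In the Python sketch below, GROUPED is the regex given above, while LOOSE is a hypothetical variant written for this illustration (it does not appear in the original text) in which each element of the exponent is optional on its own:

```python
import re

# Exponent made optional as a whole group (the regex from the text).
GROUPED = re.compile(r"[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?")
# Hypothetical variant: each exponent element is optional individually.
LOOSE = re.compile(r"[-+]?[0-9]*\.?[0-9]+[eE]?[-+]?[0-9]*")

for candidate in ["1.5e8", "1.5", "1.5e", "1.5e-", "1.5+"]:
    print(candidate,
          GROUPED.fullmatch(candidate) is not None,
          LOOSE.fullmatch(candidate) is not None)
```

The grouped regex accepts only 1.5e8 and 1.5, whereas the loose variant also accepts the three malformed strings. This is the same "everything is optional" mistake, repeated inside the exponent.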
Finally, if you want to validate if a particular string holds a floating point number, rather than finding a floating point number within longer text, you'll have to anchor your regex: ^[-+]?[0-9]*\.?[0-9]+$ or ^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$. You can find additional variations of these regexes in RegexBuddy's library.
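The distinction between finding and validating can be sketched as follows. Python and the names NUMBER, FIND, and VALIDATE are assumptions made for illustration:

```python
import re

NUMBER = r"[-+]?[0-9]*\.?[0-9]+"

FIND = re.compile(NUMBER)                  # finds a number inside longer text
VALIDATE = re.compile("^" + NUMBER + "$")  # requires the whole string to be a number

text = "width 3.14 cm"
print(FIND.search(text).group())           # the number found inside the text
print(VALIDATE.search(text) is not None)   # False: the string is more than a number
print(VALIDATE.search("3.14") is not None) # True: the string is exactly a number

# In Python, $ also matches just before a newline at the very end of the string.
print(VALIDATE.search("3.14\n") is not None)
# fullmatch on the unanchored regex is stricter about that trailing newline.
print(FIND.fullmatch("3.14\n") is not None)
```

Anchor behavior varies between regex flavors, so it is worth checking how your own language treats a trailing line break before relying on $ for validation.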
Page URL: http://www.Regular-Expressions.info/floatingpoint.html
Page last updated: 09 June 2009
Site last updated: 27 November 2009
Copyright © 2003-2010 Jan Goyvaerts. All rights reserved.