Recently, I got a request to show prices in comma-separated format on whatznear.com. Since we are using Rails, we have the handy number_to_currency helper, but unfortunately that was not enough because by default it follows the US system of grouping digits in thousands. My requirement was to show prices in the Indian system, where the last three digits form one group and the remaining digits are grouped in twos.
450,500 # US system
4,50,500 # Indian system
The regex /(\d+?)(?=(\d\d)+(\d)(?!\d))(\.\d+)?/ worked fine for me, but I was curious to know how it works. So I started digging into the regex documentation. Here is what I understood about it.
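Let's look at group 1, i.e. (\d+?), which says that one or more digits should be matched, but as few as possible. There is nothing confusing here except the trailing ?. The + tells the regex engine to match the preceding token one or more times. On its own, a question mark means "match the preceding token zero times or once", but placed right after another quantifier such as + it makes that quantifier lazy, so the engine stops at the shortest match that lets the rest of the pattern succeed.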
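To make the greedy versus lazy difference concrete, here is a small sketch. The sample string and variable names are only illustrative, not part of the original regex.

```ruby
# Greedy vs. lazy quantifiers on the same input (illustrative sample string).
digits = "12345"

greedy = digits[/\d+/]    # + alone is greedy: takes every digit it can
lazy   = digits[/\d+?/]   # +? is lazy: stops after the first digit

puts greedy  # 12345
puts lazy    # 1

# A lazy quantifier still grows when the rest of the pattern requires it:
# here group 1 has to expand until a literal 5 follows.
lazy_before_five = digits[/(\d+?)5/, 1]
puts lazy_before_five  # 1234
```

The last line is the behaviour the formatting regex relies on: group 1 starts with a single digit and only grows as far as it must.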
The next group starts with ?=, which means it is a positive lookahead: the text matched by group 1 must be immediately followed by something that matches this group. A lookahead is zero-width, so whatever it matches is not added to the overall match and does not extend group 1.
For the lookahead to match, there must be one or more pairs of digits ((\d\d)+), followed by a single digit ((\d)) that is not followed by another digit.
(?!\d) is a negative lookahead, which succeeds when the pattern inside it fails, i.e. when the next character is a non-digit or the end of the string. This pins the lookahead to the end of the integer part, so the last 3 digits of a number are kept together as one group.
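The two kinds of lookahead can be tried in isolation. This is a minimal sketch with simplified patterns of my own, chosen only to show that lookaheads check the following text without consuming it.

```ruby
price = "1234567.00"

# Positive lookahead: match three digits only when a dot follows.
# The dot is checked but not consumed, so it stays in post_match.
m = price.match(/\d{3}(?=\.)/)
puts m[0]          # 567
puts m.post_match  # .00

# Negative lookahead: digits that are NOT followed by another digit,
# i.e. the last digit of each run of digits.
last_digits = price.scan(/\d(?!\d)/)
p last_digits      # ["7", "0"]
```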
How group 1 and the positive lookahead work together
When group 1 and the positive lookahead work together, each match must be a run of digits followed by at least one pair of digits, then a single digit, then a non-digit (or the end of the string).
Let's take the number 1234567.00. The regex engine always returns the leftmost match, so the first match will be 12: the lazy \d+? first tries 1, but that leaves 234567, which cannot be split into pairs plus one final digit, so it grows to 12, which is followed by two pairs of digits (34 and 56), a digit (7), and then a . (non-digit). The second match will be 34, since it is followed by one pair of digits (56), a digit (7), and then a . (non-digit). The engine then tries to match again, but 56 won't match since it is followed only by 7, not by a pair of digits. So the resulting matches will be 12 and 34.
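The last group, (\.\d+)?, is meant for the fractional part. It is optional, and in the matches above it matches nothing, because group 1 is always followed by a digit rather than a dot.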
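The walk-through for 1234567.00 can be checked with String#scan, which returns the captured groups of every match. This is a small sketch; the variable names are mine.

```ruby
grouping = /(\d+?)(?=(\d\d)+(\d)(?!\d))(\.\d+)?/

# scan returns one array of captures per match; the first entry is group 1.
first_groups = "1234567.00".scan(grouping).map(&:first)
p first_groups   # ["12", "34"]

# The same check for the number from the introduction.
intro_groups = "450500".scan(grouping).map(&:first)
p intro_groups   # ["4", "50"]
```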
Now we have two matches, with group 1 being 12 and 34 respectively. Since we didn't name the group, group 1 of each match can be referenced as \1 in the replacement string. So with String#gsub we can replace each match with <content of group 1>, i.e. the captured digits followed by a comma:
# replaces 12 with "12," and 34 with "34,"
.gsub(/(\d+?)(?=(\d\d)+(\d)(?!\d))(\.\d+)?/, "\\1,")
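Put together, the substitution could be wrapped in a small helper. This is only a sketch: the method name indian_format is hypothetical, and it assumes the input is a price with at most three decimal places.

```ruby
# Hypothetical helper name; the regex and replacement string are the ones
# discussed in this post. Assumes at most three digits after the dot.
def indian_format(number)
  number.to_s.gsub(/(\d+?)(?=(\d\d)+(\d)(?!\d))(\.\d+)?/, "\\1,")
end

puts indian_format("1234567.00")  # 12,34,567.00
puts indian_format(450500)        # 4,50,500
puts indian_format(999)           # 999
```

One caveat: a fractional part with four or more digits is itself a long run of digits, so the regex would insert commas there too. For prices with two decimal places this does not come up.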
We used two backslashes in "\\1," because the string is in double quotes; if you are using single quotes, '\1,' is enough. If you don't escape the backslash inside double quotes, Ruby treats "\1" as an octal escape, so "\1," becomes the character U+0001 followed by a comma instead of a back-reference.
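The difference between the three spellings is easy to see by looking at the bytes of each string. A minimal sketch, with variable names of my own:

```ruby
double_escaped = "\\1,"  # backslash, "1", comma
single_quoted  = '\1,'   # same three characters
unescaped      = "\1,"   # octal escape: character U+0001, then a comma

p double_escaped == single_quoted  # true
p double_escaped.bytes             # [92, 49, 44]
p unescaped.bytes                  # [1, 44]

# Either of the first two works as a back-reference in gsub.
formatted = "1234567".gsub(/(\d+?)(?=(\d\d)+(\d)(?!\d))/, single_quoted)
p formatted                        # "12,34,567"
```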
Hope this explanation helped. Thank you.