Basic Regular Expressions in Ruby with gsub

Angelo Spampinato
Feb 20, 2019

Regular expressions (regex for short) describe patterns in strings. They can be used to find, replace, or remove parts of a string. In Ruby, a regex literal is usually written between two forward slashes.

For example, using the .gsub method for strings:

sentence = "This is a sample sentence."
sentence.gsub(/a/, "") #returns "This is  smple sentence."

The .gsub method finds every a in the string and removes it, because the second argument (the replacement) is an empty string. Removing the standalone a also leaves two spaces next to each other. If we put something between the quotes instead:

sentence.gsub(/This/, "*") #returns "* is a sample sentence."

It then replaces every instance of “This” with an asterisk (*). The match can also be inverted: inside square brackets, a leading caret (^) matches every character that is not listed:

sentence.gsub(/[^a]/, "*") #returns "********a**a**************"

Everything (including the spaces!) in the string that is NOT an a is replaced with an asterisk. Note that the square brackets ([]) are required for this. Without them, the caret anchors the match to the start of a line, so /^a/ only matches an a at the very beginning of a line, and this gsub would leave the sentence unchanged.
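
To make the difference concrete, here is a small sketch comparing the caret inside and outside square brackets. It reuses the sample sentence from above; the string "a sample" is made up for illustration.

```ruby
sentence = "This is a sample sentence."

# Inside square brackets, ^ negates the set: every character that is not "a" matches.
negated = sentence.gsub(/[^a]/, "*")

# Outside square brackets, ^ anchors the match to the start of a line.
# The sentence starts with "T", so nothing matches and the string comes back unchanged.
anchored = sentence.gsub(/^a/, "*")

# With a string that does start with "a", only that first "a" is replaced.
starts_with_a = "a sample".gsub(/^a/, "*")

puts negated        # ********a**a**************
puts anchored       # This is a sample sentence.
puts starts_with_a  # * sample
```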

Ranges

You can pass number or letter ranges into the regex to avoid typing out every letter of the alphabet or every digit from 0 through 9.

sentence = "Th1s 1s a sampl3 s3nt3nc3."
sentence.gsub(/[0123456789]/, "!") #these return the same thing
sentence.gsub(/[0-9]/, "!") # "Th!s !s a sampl! s!nt!nc!."

And for letter ranges:

sentence.gsub(/[A-Za-z]/, "🔥") 
#returns "🔥🔥1🔥 1🔥 🔥 🔥🔥🔥🔥🔥3 🔥3🔥🔥3🔥🔥3."

A-Z covers all capital letters and a-z covers all lowercase letters, so this gsub replaces every letter with the fire emoji.
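
Ranges can also be combined or shortened. The sketch below is one way to do that, using the same sample sentence. The i flag (case-insensitive matching) is not covered elsewhere in this post and is shown here only as an optional shortcut.

```ruby
sentence = "Th1s 1s a sampl3 s3nt3nc3."

# Several ranges can share one pair of brackets: letters and digits together.
letters_and_digits = sentence.gsub(/[A-Za-z0-9]/, "*")

# The i flag makes the match case-insensitive, so [a-z] also covers A-Z.
letters_only = sentence.gsub(/[a-z]/i, "*")

puts letters_and_digits  # **** ** * ****** ********.
puts letters_only        # **1* 1* * *****3 *3**3**3.
```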

Shortcuts

There are many useful shortcuts available to simplify what gets passed into the regex. Here are some of the most useful I’ve found so far:

  • \w for targeting all letters, numbers, and underscores:

lorem = "Lorem ipsum."
lorem.gsub(/[\w]/, "wow") #returns "wowwowwowwowwow wowwowwowwowwow."

  • \W to do the opposite (in the example below it replaces the space and the period at the end with “wow”):

lorem.gsub(/[\W]/, "wow") #returns "Loremwowipsumwow"

  • \d for targeting all digits:

lorem = "L0r3m 1psum."
lorem.gsub(/[\d]/, "-") #returns "L-r-m -psum."

  • \D to do the opposite:

lorem.gsub(/[\D]/, "-") #returns "-0-3--1-----"
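
A few details about these shortcuts are easy to miss, so here is a brief sketch. The sample string is made up for illustration, and \s (whitespace) is an extra shortcut that is not in the list above.

```ruby
text = "snake_case 4 ever!"

# \w matches letters, digits, and the underscore, so "_" is replaced too.
word_chars = text.gsub(/\w/, "x")

# A shortcut works on its own; square brackets are only needed
# when combining it with other characters, as in [\d!].
no_brackets   = text.gsub(/\d/, "-")
with_brackets = text.gsub(/[\d!]/, "-")

# \s matches whitespace such as spaces, tabs, and newlines.
no_spaces = text.gsub(/\s/, "")

puts word_chars     # xxxxxxxxxx x xxxx!
puts no_brackets    # snake_case - ever!
puts with_brackets  # snake_case - ever-
puts no_spaces      # snake_case4ever!
```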

There are many more shortcuts. A great resource is Rubular, which lists them and lets you test them in the browser.

Blocks

Passing a block to .gsub lets you compute each replacement from the text that matched. Here’s an example for capitalizing every word in a string:

cool_phrase = "slow lorises are cool"
cool_phrase.gsub(/\w+/) {|word| word.capitalize}
#returns "Slow Lorises Are Cool"

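Note: the “+” after \w matches one or more word characters in a row, so each word is passed to the block as a whole. Without the “+”, every character would be matched and capitalized individually, and the whole string would come out uppercase.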
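
As a quick side-by-side check of that note, the sketch below runs the same block with and without the “+”. The variable names are only for illustration.

```ruby
cool_phrase = "slow lorises are cool"

# \w+ matches a whole run of word characters, so the block sees one word at a time.
by_word = cool_phrase.gsub(/\w+/) { |word| word.capitalize }

# \w alone matches a single character, so the block capitalizes every letter separately.
by_char = cool_phrase.gsub(/\w/) { |char| char.capitalize }

puts by_word  # Slow Lorises Are Cool
puts by_char  # SLOW LORISES ARE COOL
```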

Here’s another example where the number in a string is incremented:

year = "The year is 3100"
year.gsub(/\d+/) {|num| num.to_i + 1} #returns "The year is 3101"

Note: again, the “+” matches a run of one or more digits as a whole number. Without it, the return would be “The year is 4211” because each digit would be incremented individually.
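
The same comparison for the year example might look like the sketch below. The block result is converted with .to_s here just to make the string conversion explicit, and the "1999" string is a made-up edge case.

```ruby
year = "The year is 3100"

# \d+ hands the whole number to the block; \d hands over one digit at a time.
whole_number = year.gsub(/\d+/) { |num| (num.to_i + 1).to_s }
each_digit   = year.gsub(/\d/)  { |num| (num.to_i + 1).to_s }

# A 9 matched on its own becomes 10, which makes the string longer.
rollover = "1999".gsub(/\d/) { |num| (num.to_i + 1).to_s }

puts whole_number  # The year is 3101
puts each_digit    # The year is 4211
puts rollover      # 2101010
```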

Use Cases

Regular expressions are often used to validate or filter user input. Imagine your website has a form that takes a user’s phone number. Here’s an example of removing the symbols so that only the digits remain (the result is still a string, not an integer):

phone_number = "(123)456-7890"
phone_number.gsub(/[()-]/, "") #returns "1234567890"
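
Users may also type spaces or dots, so one possible variation is to remove everything that is not a digit with \D instead of listing each symbol. This is only a sketch, and the input string with a space in it is made up for illustration.

```ruby
phone_number = "(123) 456-7890"

# \D matches anything that is not a digit, so the space is removed as well.
digits = phone_number.gsub(/\D/, "")

# gsub always returns a string; call to_i if an integer is really needed.
as_integer = digits.to_i

puts digits      # 1234567890
puts as_integer  # 1234567890
```

Keeping the number as a string is often the safer choice, because to_i would drop any leading zeros.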

When getting the user’s name, it’s useful to keep only the letters and remove any numbers or symbols they typed.

steve = "St97eve Brul()*)e "
steve.gsub(/[\W\d]/, "").split(/(?=[A-Z])/).join(" ")
#returns "Steve Brule"

The above example looks a little complex, but all it does is remove every symbol and number, split the result into an array just before each capital letter, and join the pieces back into a string separated by spaces.
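
Breaking the chain into separate steps can make it easier to follow. The intermediate variable names below are only for illustration.

```ruby
steve = "St97eve Brul()*)e "

# Step 1: remove every non-word character (\W) and every digit (\d).
letters_only = steve.gsub(/[\W\d]/, "")

# Step 2: split just before each capital letter. (?=[A-Z]) is a lookahead,
# so the capital letter itself stays in the result.
parts = letters_only.split(/(?=[A-Z])/)

# Step 3: join the pieces with a space.
full_name = parts.join(" ")

p letters_only  # "SteveBrule"
p parts         # ["Steve", "Brule"]
p full_name     # "Steve Brule"
```

Because the split relies on capital letters, this approach assumes each part of the name starts with exactly one capital, so a name with a capital in the middle would be split in two.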

Conclusion and Resources

Regular expressions and .gsub are powerful tools that can enhance your program, but deciphering them can sometimes feel like reading hieroglyphics (see image below). I recommend taking your time and working through them one at a time. Do this and soon you’ll be a regex master!

[Image caption: What!?]
  • Rubular: useful for testing out different regular expressions in the browser
  • RubyGuides: a master resource for all things regex in Ruby
  • Tutorials Point: more regex knowledge
