Extracting values using Regular Expressions


A regular expression is a sequence of characters that defines a search pattern. Regular expressions are used to find particular sequences in a string or to extract values from a string.

Their applications range from validation to parsing and replacing strings, translating data to other formats, and web scraping.

To define a search pattern in Scala, we import scala.util.matching.Regex (needed to refer to the Regex type by name) and convert a string into a regular expression by calling the .r method on it.
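The original listing is not reproduced here, so the following is only a minimal sketch of what such a definition might look like. The variable name `pattern` follows the text; the sample input strings and the names `firstWord` and `noWord` are assumptions made for illustration. It can be pasted into the Scala REPL or run as a script.

```scala
import scala.util.matching.Regex

// Matches a run of one or more ASCII letters.
val pattern: Regex = "[a-zA-Z]+".r

// findFirstMatchIn returns Option[Regex.Match]:
// Some(m) for the first match, None if nothing matches.
val firstWord: Option[String] =
  pattern.findFirstMatchIn("123 hello world 456").map(_.matched)
val noWord: Option[String] =
  pattern.findFirstMatchIn("123 456").map(_.matched)

firstWord match {
  case Some(word) => println("First match: " + word)
  case None       => println("No match found")
}
```

Because the result is an Option, the caller has to handle the no-match case explicitly instead of checking for null.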

 

In the above example, pattern is the regex variable. The regex assigned to it matches a word that contains only letters. The findFirstMatchIn() method is defined in scala.util.matching.Regex and finds the first occurrence of the pattern in the input string. Its return type is an Option, so it yields None when nothing matches. Similarly, findAllMatchIn() returns an iterator over all the matches found in the input string. In the below example, foundPatterns is an iterator, and we have converted it to an array to see all the matches.
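As the original listing is missing, here is a hedged sketch of findAllMatchIn(). The name `foundPatterns` comes from the text; `wordRegex`, `words`, and the sample input are assumptions made for illustration.

```scala
import scala.util.matching.Regex

// Same letters-only pattern as described above.
val wordRegex: Regex = "[a-zA-Z]+".r

// findAllMatchIn returns an Iterator[Regex.Match]; it can be traversed only once.
val foundPatterns: Iterator[Regex.Match] =
  wordRegex.findAllMatchIn("scala 2 regex 13 demo")

// Materialise the iterator into an array of the matched substrings.
val words: Array[String] = foundPatterns.map(_.matched).toArray

println(words.mkString(", "))
```

Converting to an array consumes the iterator, so any later use of `foundPatterns` would see no remaining elements.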

 

 

Regular expressions are also widely used to extract values from a string. For example, to count the number of words in a file, we usually first remove all the extra spaces and then split the text on commas and dots. A regular expression makes this simpler, because one pattern can treat all of these separators alike. To split the words of a file, we can write the regex as shown below:
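One possible way to write such a regex is sketched below. The original listing is not available, so the character class, the sample text (standing in for a file's contents), and the names `separator`, `tokens`, and `wordCount` are all assumptions.

```scala
import scala.util.matching.Regex

// One or more whitespace characters, commas, or dots count as a single separator.
val separator: Regex = """[\s,.]+""".r

// Sample text standing in for the contents of a file.
val text = "Hello, world. This is   Scala."

// Regex.split breaks the text at every separator match; the filter drops the
// empty string that appears when the text starts with a separator.
val tokens: Array[String] = separator.split(text).filter(_.nonEmpty)
val wordCount: Int = tokens.length

val joined = tokens.mkString(" | ")
println(wordCount.toString + " words: " + joined)
```

Treating a run of separators as one match avoids a separate step to collapse repeated spaces before splitting.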

 

There are many other use cases for regular expressions. In Scala, a Regex also works as an extractor in pattern matching. Each value that needs to be extracted must be wrapped in parentheses, which forms a capture group. For example, to extract the id and name of a student from the input string “Student(1, Aashrita)”, we can use a regular expression as shown below:
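A minimal sketch of this extractor usage follows; the original listing is missing, so the exact pattern and the helper `parseStudent` are hypothetical names introduced for illustration.

```scala
import scala.util.matching.Regex

// Two capture groups: digits for the id, word characters for the name.
// The literal parentheses of "Student(...)" are escaped with backslashes.
val studentPattern: Regex = """Student\((\d+),\s*(\w+)\)""".r

// Hypothetical helper: returns the id and name when the whole input matches.
def parseStudent(input: String): Option[(Int, String)] = input match {
  case studentPattern(id, name) => Some((id.toInt, name))
  case _                        => None
}

val parsed = parseStudent("Student(1, Aashrita)")
println(parsed)
```

When a Regex is used as a pattern in a match expression, the entire input must match, and each capture group is bound, in order, to the names listed in the case clause.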

 

I hope this helps you get a better understanding of how regular expressions work. For more information, please have a look at the references below, and refer to a regex cheat sheet when writing a regular expression for a particular use case.

References




Written by 

Aashrita Goel is an intern with more than 4 months of experience. She is regarded as a cooperative person, a dedicated and capable professional, and an innovation enthusiast. She has strong time management and interpersonal skills. She believes in standard coding practices, and her focus always stays on practical work. Her hobbies include reading books and listening to music.
