Friday, March 1, 2024

2 ways to Split String with Dot (.) in Java using Regular Expression? Examples

You can use the split() method of the java.lang.String class to split a string on the dot. Unlike a comma, colon, or whitespace, a dot is not a common delimiter for joining Strings, which is why beginners often struggle to split a String by dot. Another reason for this struggle is that the dot is a special character in regular expressions. To split a String on the dot, you need to escape it and pass "\\." instead of just "." to the split() method. Alternatively, you can use the regular expression [.] to split the String by a dot in Java. Splitting on a dot is mostly used to get a file extension, as shown in our example.


The logic and method are exactly the same as earlier examples of split string on space and split the string on a comma, the only difference is the regular expression. Once you know how to write the regular expression, all these examples will be the same for you. Java's regular expression is inspired by Perl.

Regular expressions are often considered an advanced concept by Java developers, and that's the main reason many Java programmers are not comfortable using the power of regular expressions for text processing.

Also, some of the most popular Java books, like Head First Java or Core Java Volume 1, don't cover regular expressions in much detail; but, as a Java developer, you cannot ignore regular expressions.

That's why I highly recommend that you join The Complete Regular Expression course for beginners on Udemy. It's a great course to learn everything you want to know about RegEx, and it's also very affordable: you can buy it for just $10 during Udemy sales, which happen every now and then.

A regular expression is one of the best and most powerful tools for an experienced developer, whether you are working on Java projects or searching log files for patterns on a UNIX box. Since Java's regular expressions are Perl-like, learning Java regex also helps you use commands such as find and grep more effectively.





Splitting String by Dot in Java using Regular Expression

Most Java programmers first try the following approach when they need to split a String on the dot character:

String textfile = "ReadMe.txt";
String filename = textfile.split(".")[0];
String extension = textfile.split(".")[1];

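This will not work because the dot (.) is a special character in Java's regular expressions that matches any single character. Every character of the string is therefore treated as a delimiter, only empty strings remain, and split() discards trailing empty strings, so it returns an empty array. The code above then throws java.lang.ArrayIndexOutOfBoundsException when it reads index 0, as shown below: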

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at StringSplitWithRegEx.main(StringSplitWithRegEx.java:9)
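
To see why the array is empty, it can help to count the pieces split() produces. The following is a minimal sketch, not part of the original program; the class name EmptySplitDemo and the helper pieceCount are made up for illustration. It uses the two-argument split(regex, limit) overload, where a limit of 0 behaves like the one-argument call and a negative limit keeps trailing empty strings.

```java
public class EmptySplitDemo {

    // Hypothetical helper: number of pieces split() returns for a regex and limit.
    static int pieceCount(String input, String regex, int limit) {
        return input.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String textfile = "ReadMe.txt"; // 10 characters, each one matched by "."

        // Limit 0 (same as split(".")): trailing empty strings are removed,
        // and since every piece is empty, nothing is left.
        System.out.println(pieceCount(textfile, ".", 0));  // 0

        // Negative limit keeps them: 10 delimiters produce 11 empty strings.
        System.out.println(pieceCount(textfile, ".", -1)); // 11
    }
}
```

With the default limit, reading index 0 of the result is what triggers the exception shown above.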

The problem with this code is that "." is a metacharacter. If you want to use it literally, you need to escape it with a backslash. In the regular expression itself one backslash is enough, i.e. \., but because the backslash is also an escape character in Java string literals, it has to be doubled, so you write "\\." in Java source, as shown below:

String textfile = "ReadMe.txt";
String filename = textfile.split("\\.")[0];
String extension = textfile.split("\\.")[1];
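
The snippet above splits the same string twice. A small variation, offered only as a sketch, splits once and reuses the array. It also shows java.util.regex.Pattern.quote(), a standard library method that builds the literal pattern for you; that alternative is not one of the two ways this article covers, and the class and method names here are hypothetical.

```java
import java.util.regex.Pattern;

public class EscapedDotDemo {

    // Splits on a literal dot using the regex \. (written "\\." in Java source).
    static String[] splitOnDot(String input) {
        return input.split("\\.");
    }

    // Same split, letting Pattern.quote() turn "." into a literal pattern.
    static String[] splitOnDotQuoted(String input) {
        return input.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        String[] parts = splitOnDot("ReadMe.txt"); // split once, reuse the array
        System.out.println(parts[0]); // ReadMe
        System.out.println(parts[1]); // txt

        System.out.println(splitOnDotQuoted("ReadMe.txt").length); // 2
    }
}
```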

Alternatively, you can also use the [.] regular expression to split the String by dots in Java, as shown below:

String extension = "minecraft.exe".split("[.]")[1];

The reason [.] works is that the dot is inside a character class, i.e. square brackets []. Only a few characters, such as ]^-\, have a special meaning inside a character class in Java, and the dot is not one of them, which means it is treated literally inside [ ].

It's good to remember which characters have a special meaning in Java regular expressions inside and outside of a character class:

1) The characters .^$|*+?()[{\ have special meaning outside of character classes.
2) The characters ]^-\ have a special meaning inside of character classes (in Java, [ and && do as well, for nested classes and intersection).
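
The two rules above can be checked with a few one-line splits. This is a minimal sketch using only String.split(); the class name CharClassDemo and its split helper are made up for illustration.

```java
import java.util.Arrays;

public class CharClassDemo {

    // Thin wrapper so each case reads as (input, regex).
    static String[] split(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // |, + and . are metacharacters outside a class but literal inside [].
        System.out.println(Arrays.toString(split("a|b", "[|]")));    // [a, b]
        System.out.println(Arrays.toString(split("1+2", "[+]")));    // [1, 2]
        System.out.println(Arrays.toString(split("x.y", "[.]")));    // [x, y]

        // ^ stays special inside a class: a leading ^ negates the class,
        // so this splits on any character that is not a lowercase letter.
        System.out.println(Arrays.toString(split("a1b", "[^a-z]"))); // [a, b]

        // - stays special inside a class: it forms a range.
        System.out.println(Arrays.toString(split("a5b", "[0-9]")));  // [a, b]
    }
}
```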

If you are an experienced Java developer, you should consider joining one of these regular expression courses to master regex in Java. They will not only teach you the basics of regex but also show you how to use it effectively.





Java Example to Split String by Dot

Here is our sample program to show you how to split a String by dot (.) in Java. In this example, you will find what works, what doesn't work, and why. The examples are much the same as splitting a String by any other delimiter; the only thing to focus on is using the correct regular expression, because the dot is a special character in Java's regular expression API.

import java.util.Arrays;


public class StringSplitWithRegEx {

    public static void main(String[] args) {

        // 1st example - splitting string by dot in Java
        // this will not work because dot is a special character in
        // regular expression which will match with any single character

        String file = "abc.txt";
        String[] array = file.split(".");

        System.out.println("input string: " + file);
        System.out.println("output array after splitting with . : "
                + Arrays.toString(array));

        // solution is to escape dot in Java as shown below
        array = file.split("\\.");
        System.out.println("input string: " + file);
        System.out.println("output array after splitting with regex'\\.' : "
                + Arrays.toString(array));

        // or you can also use following regular expression to split string on dot (.)
        array = file.split("[.]");
        System.out.println("input string: " + file);
        System.out.println("output array after splitting with regex '[.]' : "
                + Arrays.toString(array));

        // once you have got the individual words, you can get the file name
        // and extension as follows
        String filename = array[0];
        String extension = array[1];

        System.out.println("file: " + file);
        System.out.println("name: " + filename);
        System.out.println("extension: " + extension);
    }

}

Output
input string: abc.txt
output array after splitting with . : []
input string: abc.txt
output array after splitting with regex'\.' : [abc, txt]
input string: abc.txt
output array after splitting with regex '[.]' : [abc, txt]
file: abc.txt
name: abc
extension: txt
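
The program above assumes the file name contains exactly one dot. For a name such as filename.test.txt, split() returns three parts, so array[1] is no longer the extension. One possible way to handle that case, sketched here with the hypothetical helpers baseNameOf and extensionOf, is to skip split() and cut the string at the last dot with String.lastIndexOf().

```java
public class FileExtensionDemo {

    // Hypothetical helper: text after the last dot, or "" if there is no dot.
    static String extensionOf(String fileName) {
        int lastDot = fileName.lastIndexOf('.');
        return lastDot < 0 ? "" : fileName.substring(lastDot + 1);
    }

    // Hypothetical helper: text before the last dot, or the whole name if there is no dot.
    static String baseNameOf(String fileName) {
        int lastDot = fileName.lastIndexOf('.');
        return lastDot < 0 ? fileName : fileName.substring(0, lastDot);
    }

    public static void main(String[] args) {
        String file = "filename.test.txt";

        // Splitting on the escaped dot yields three parts here.
        System.out.println(file.split("\\.").length); // 3

        System.out.println(baseNameOf(file));  // filename.test
        System.out.println(extensionOf(file)); // txt
    }
}
```

This sketch treats everything after the last dot as the extension, which is only one convention; names that start with a dot or end with one may need separate handling.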


That's all about how to split a String by dot in Java. Remember, even though you can use the split() method in the same way you used it earlier to split by comma and whitespace, the tricky part here is that the dot is a special character in regular expressions.

In order to use the dot literally, you need to escape it as \. but since the backslash also requires escaping in a Java string literal, you should pass "\\." i.e. a double backslash. Alternatively, you can pass [.] because the dot is not one of the characters, such as ]^-\, that have a special meaning inside a character class ([...]), which means it will be treated literally.


Other Java String tutorials from this blog:
  • How to reverse String in Java using recursion? (solution)
  • 25 Recursion exercise for Beginners (recursion questions)
  • How to format String in Java? (example)
  • 21 String interview questions with answers (questions)
  • How to check if two String are Anagram? (solution)
  • Why String is Immutable in Java? (answer)
  • How String in switch case works in Java? (answer)
  • How to convert Date to String in Java? (answer)
  • 100+ data structure interview questions (answers)
  • How to convert double to String in Java? (solution)
  • 10 Dynamic Programming interview questions (DP questions)
  • How to compare two String in Java? (solution)
  • 10 Microservice questions with answers (microservices)

Thank you for reading this tutorial so far. I appreciate it. If you have any questions feel free to ask. 


3 comments:

  1. I was always afraid of using regular expressions, but the way you explained it, I simply love it.

  2. Does this still work if the file name itself has a dot? Eg: filename.test.txt

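  3. Hello Anonymous, there is no special logic to handle an extra dot (.) in the file name, so filename.test.txt will split into three parts: filename, test, and txt. A dot is a valid character in file names, so if you only need the extension, take the last element of the array.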
