Wednesday, July 28, 2021

How to Split String based on delimiter in Java? Example Tutorial

You can use the split() method of the String class from the JDK to split a String on a delimiter, e.g. splitting a comma-separated String on a comma or breaking a pipe-delimited String on a pipe. It's very similar to the earlier examples where you learned how to split a String in Java. The one important point to remember is that split() takes a regular expression, so you need a little knowledge of regex, especially when the delimiter is also a special character in regular expressions, e.g. pipe (|) or dot (.), as seen in how to split a String by dot in Java. In those cases, you need to escape these characters, e.g. instead of |, you need to pass \\| to the split method.

Alternatively, you can put the special character inside a character class, i.e. inside square brackets, e.g. [|] or [.]. You still need to check that the character has no special meaning inside a character class: pipe and dot don't, but ^ does when it is the first character in the class, where it negates the class.
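
As a quick illustration of both approaches with a dot delimiter, here is a minimal sketch. The class name DotSplitDemo, the method names, and the sample input are made up for this example and are not part of the original article.

```java
import java.util.Arrays;

public class DotSplitDemo {

    // Escapes the dot with a backslash so the regex engine treats it literally.
    public static String[] splitWithEscape(String input) {
        return input.split("\\.");
    }

    // Puts the dot inside a character class, where it has no special meaning.
    public static String[] splitWithCharacterClass(String input) {
        return input.split("[.]");
    }

    public static void main(String[] args) {
        String address = "192.168.1.10"; // sample input, chosen for illustration

        System.out.println(Arrays.toString(splitWithEscape(address)));         // [192, 168, 1, 10]
        System.out.println(Arrays.toString(splitWithCharacterClass(address))); // [192, 168, 1, 10]

        // An unescaped dot matches every character, so only empty strings
        // remain and split() drops them all.
        System.out.println(address.split(".").length); // 0
    }
}
```

Both forms return the same array, so which one you pick is mostly a matter of readability.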

Let's see a couple of examples to split a string based on delimiter in Java to understand the concept better.



1st Example - splitting a pipe-delimited String

Splitting a String on a pipe delimiter is a little bit tricky because the most obvious solution will not work, given that pipe is a special character in Java regular expressions. For example, when you ask a Java programmer to split a pipe-delimited String, they will most likely write the following code:

String pipeDelimited = "IBM|Intel|HP|Cisco"; 
String[] companies = pipeDelimited.split("|");

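This will not yield the result you want, i.e. an array of IBM, Intel, HP, and Cisco. Instead, it returns every character as a separate element, [I, B, M, |, I, n, t, e, l, |, H, P, |, C, i, s, c, o], because | is interpreted as the regex alternation (OR) operator with two empty alternatives, so the Java regex engine splits the String on the empty String. On Java 7 and earlier, the array also starts with an empty String: [, I, B, M, ...].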
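
A quick way to check this on your own JDK is to count the pieces. The sketch below is only an illustration (the class and method names are hypothetical); on Java 8 or later the unescaped pipe should produce one piece per character.

```java
import java.util.Arrays;

public class UnescapedPipeDemo {

    // Splits on the raw "|" pattern, which the regex engine reads as
    // "empty OR empty" and therefore matches between every character.
    public static String[] splitOnRawPipe(String input) {
        return input.split("|");
    }

    public static void main(String[] args) {
        String pipeDelimited = "IBM|Intel|HP|Cisco";
        String[] pieces = splitOnRawPipe(pipeDelimited);

        System.out.println(Arrays.toString(pieces));
        // On Java 8+ the two numbers match: 18 pieces from 18 characters.
        System.out.println(pieces.length + " pieces from "
            + pipeDelimited.length() + " characters");
    }
}
```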


To solve this problem, you need to escape the pipe character when you pass it to the split() method, as shown below:

pipeDelimited.split("\\|"); // return [IBM, Intel, HP, Cisco]

This will give you the String array [IBM, Intel, HP, Cisco] because the Java regex engine now interprets the pipe literally.

There is one more way to solve the above problem: put the pipe character inside square brackets, e.g. [|]. Brackets define a character class in the Java regex API, and the pipe (|) only has a special meaning outside a character class. Inside one it is interpreted literally, as shown in the following example:

pipeDelimited.split("[|]"); // return [IBM, Intel, HP, Cisco]

Though it's always better to check which characters have a special meaning inside and outside a character class in Java regular expressions, e.g. a caret (^) at the start of a character class negates the class and won't be treated literally. A good book or course on Java regular expressions covers this topic in more detail.
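
To make the caret case concrete, here is a small sketch under the assumption that you want to split on a literal caret. The class name, method names, and sample strings are invented for illustration.

```java
import java.util.Arrays;

public class CaretInClassDemo {

    // The caret is not the first character in the class, so it is literal:
    // this splits on either a pipe or a caret.
    public static String[] splitOnPipeOrCaret(String input) {
        return input.split("[|^]");
    }

    // Outside a character class the caret is an anchor, so escape it.
    public static String[] splitOnCaret(String input) {
        return input.split("\\^");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnPipeOrCaret("a|b^c"))); // [a, b, c]
        System.out.println(Arrays.toString(splitOnCaret("x^y^z")));       // [x, y, z]
    }
}
```

Writing the class as [^|] instead would mean "any character except a pipe", which is a very different delimiter.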



2nd Example - splitting a String on the colon as a delimiter

Breaking a String on a colon is easy because the colon is a normal character in Java regular expressions. Just pass ":" to the split() method and it will return an array of Strings broken on the colon, as shown below:

String colonDelimited = "1:2:3:4:5"; 
String[] numbers = colonDelimited.split(":"); 
System.out.println(Arrays.toString(numbers)); // print [1, 2, 3, 4, 5]

If you need a list instead of an array, then see this tutorial to learn how to convert an array to a list in Java.
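
For a quick idea of what that conversion looks like, here is a minimal sketch using Arrays.asList(). The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitToListDemo {

    // Wraps the split result in a fixed-size list backed by the array.
    public static List<String> toFixedSizeList(String input, String regex) {
        return Arrays.asList(input.split(regex));
    }

    // Copies into an ArrayList when the list needs to grow or shrink.
    public static List<String> toMutableList(String input, String regex) {
        return new ArrayList<>(Arrays.asList(input.split(regex)));
    }

    public static void main(String[] args) {
        List<String> numbers = toMutableList("1:2:3:4:5", ":");
        numbers.add("6"); // allowed because this is a real ArrayList
        System.out.println(numbers); // [1, 2, 3, 4, 5, 6]
    }
}
```

Note that the list returned by Arrays.asList() alone cannot be resized, so calling add() on it throws UnsupportedOperationException.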



3rd Example - splitting a comma-separated String 

This is similar to the earlier example, but it's more common because CSV is a popular format for exporting data from databases, tables, or XLS files. You can split a comma-delimited String by passing "," to the split() method, as shown in the following example:

String commaDelimited = "Equity,Gold,FixedIncome,Derivatives"; 
String[] assetClasses = commaDelimited.split(",");

// print [Equity, Gold, FixedIncome, Derivatives] 
System.out.println(Arrays.toString(assetClasses)); 

This is one of the simplest ways to split a CSV String in Java, but if your data contains headers, quoted fields with embedded commas, or other metadata, then you may need a proper CSV parser, e.g. Apache Commons CSV. See this tutorial to learn more about how to load CSV files in Java.
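
One more detail worth knowing before relying on split(",") for CSV-like data: by default it drops trailing empty Strings, which can silently remove empty columns at the end of a line. The sketch below shows the two-argument split(regex, limit) overload with a negative limit, which keeps them. The class name, method names, and sample line are invented for illustration.

```java
public class CsvSplitLimitDemo {

    // Default split: trailing empty strings are dropped from the result.
    public static String[] splitDroppingTrailingEmpties(String line) {
        return line.split(",");
    }

    // A negative limit keeps every field, including trailing empty ones.
    public static String[] splitKeepingAllFields(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "Equity,,Gold,,"; // five columns, three of them empty

        System.out.println(splitDroppingTrailingEmpties(line).length); // 3
        System.out.println(splitKeepingAllFields(line).length);        // 5
    }
}
```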




Java Program to split String based on delimiter

Here is our sample Java program to demonstrate how to split a String on a delimiter in Java. Though you can also use StringTokenizer to split a String on a delimiter, in this program I have focused solely on the split() method. I have shown 3 examples in this article: splitting a comma-separated, a pipe-delimited, and a colon-delimited String.

I have also included one example that specifically demonstrates splitting on a character that is also a special character in regular expressions, e.g. pipe or dot. Such characters require escaping when you pass them to the split() method.

import java.util.Arrays;

public class App {

  public static void main(String[] args) {

    // You can split a String using the split() method of the
    // java.lang.String class. It lets you specify any delimiter, e.g.
    // comma, colon, pipe, etc.
    // It actually expects a regular expression, but an ordinary single
    // character works as a plain delimiter.

    // 1st Example - let's split a String where delimiter is pipe (|)
    // Suppose you have a pipe delimited String like below
    String pipeDelimited = "Google|Amazon|Microsoft|Facebook";

    // 1st try - this is what many Java programmers do, but
    // unfortunately it does not work because the pipe '|' is
    // a special character in the Java regular expression API
    String[] array = pipeDelimited.split("|");

    System.out.println("pipe delimited String: " + pipeDelimited);
    System.out.println("split String with delimiter: "
        + Arrays.toString(array));

    // If you want to break the String on a pipe, you need to escape it,
    // i.e. \\|. A Java string literal needs a double backslash because the
    // backslash is itself an escape character in Java strings.

    array = pipeDelimited.split("\\|");
    System.out.println("split String after escaping pipe: "
        + Arrays.toString(array));

    // Alternatively, you can put the pipe (|) inside a character
    // class, e.g. [|]. It loses its special meaning inside a character
    // class, so the regular expression engine interprets it literally.

    array = pipeDelimited.split("[|]");
    System.out.println("output String after using pipe inside a character class: "
        + Arrays.toString(array));

    // 2nd Example - splitting a colon (:) delimited String
    // Breaking a String on a colon is easy because it's not a special
    // character in regex
    String colonDelimited = "Android:Windows10:Linux:MacOSX";
    String[] os = colonDelimited.split(":");
    System.out.println("colon delimited String: " + colonDelimited);
    System.out.println("split String with delimiter as a colon: "
        + Arrays.toString(os));

    // 3rd Example - splitting a comma-separated String
    // Just pass "," to the split() method; no fuss, because the comma is
    // also a normal character
    String commaDelimited = "find,grep,chmod,netstat";
    String[] commands = commaDelimited.split(",");
    System.out.println("comma delimited String: " + commaDelimited);
    System.out.println("split String with comma as a delimiter: "
        + Arrays.toString(commands));
  }

}
Output
pipe delimited String: Google|Amazon|Microsoft|Facebook
split String with delimiter: [G, o, o, g, l, e, |, A, m, a, z, o, n, |,
 M, i, c, r, o, s, o, f, t, |, F, a, c, e, b, o, o, k]
split String after escaping pipe: [Google, Amazon, Microsoft, Facebook]
output String after using pipe inside a character class: [Google, Amazon,
 Microsoft, Facebook]
colon delimited String: Android:Windows10:Linux:MacOSX
split String with delimiter as a colon: [Android, Windows10, Linux, MacOSX]
comma delimited String: find,grep,chmod,netstat
split String with comma as a delimiter: [find, grep, chmod, netstat]
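
For comparison, here is a rough sketch of the StringTokenizer alternative mentioned earlier. Its delimiters are plain characters rather than a regular expression, so the pipe needs no escaping. The class and method names are hypothetical, and note that the StringTokenizer Javadoc describes it as a legacy class and recommends split() for new code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {

    // Collects tokens with StringTokenizer. Each character in 'delimiters'
    // is treated as a literal delimiter, not as part of a regex.
    public static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Google, Amazon, Microsoft, Facebook]
        System.out.println(tokenize("Google|Amazon|Microsoft|Facebook", "|"));
    }
}
```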


That's all about how to split a String on a delimiter in Java. As I said, you can use the split() method, pass the delimiter you want to use, and it will break the String accordingly. Since split() expects a regular expression, just check whether the delimiter is a special character or not. If it is, e.g. pipe or dot, then you need to escape it.

You can use a double backslash in Java to escape special characters, e.g. \\. or \\|. Alternatively, you can put them inside a character class, e.g. [.] or [|], as these characters only have a special meaning outside a character class. They are treated literally inside a character class.
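
If the delimiter is not known in advance, a third option (not covered in the examples above) is java.util.regex.Pattern.quote(), which turns any String into a literal pattern. The following is a minimal sketch; the class name, method name, and sample inputs are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteDelimiterDemo {

    // Splits on the delimiter taken literally, whatever characters it contains.
    public static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("IBM|Intel|HP|Cisco", "|"))); // [IBM, Intel, HP, Cisco]
        System.out.println(Arrays.toString(splitLiteral("a.b.c", ".")));              // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("x^|y^|z", "^|")));           // [x, y, z]
    }
}
```

This avoids having to remember which characters are special, at the cost of a slightly longer call.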


Other Java String tutorials you may like to explore:
  • How to format String in Java? (example)
  • How to compare two Strings in Java? (solution)
  • How to replace characters and substring in a given String? (example)
  • How to convert double to String in Java? (solution)
  • How to check if two String are Anagram? (solution)
  • How to convert Date to String in Java? (answer)
  • How String in switch case works in Java? (answer)
Thank you for reading this article. If you like it, then please share it with your friends and colleagues. If you have any questions or suggestions, then please drop a comment.
