Saturday, February 5, 2022

How to split a comma separated String in Java? Regular Expression Example

You can use the String.split() method or the StringTokenizer class to split a comma-separated String in Java. Since splitting a String is such a common task, the Java designers have provided a couple of split() methods on the java.lang.String class itself. These split() methods take a regular expression and split the String accordingly. To parse a comma-delimited String, you can just pass "," as the delimiter and it will return an array of String containing the individual values. The split() method relies on Java's regular expression API (java.util.regex) to interpret the delimiter.
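To make the "couple of split() methods" concrete, here is a small sketch of the two overloads, split(regex) and split(regex, limit). The class name, helper method names, and sample strings are made up for illustration only.

```java
import java.util.Arrays;

class SplitOverloadsDemo {

    // split(regex): splits on every match and drops trailing empty strings.
    static String[] splitAll(String csv) {
        return csv.split(",");
    }

    // split(regex, limit): a positive limit caps the number of elements;
    // the last element holds the unsplit remainder of the input.
    static String[] splitAtMost(String csv, int limit) {
        return csv.split(",", limit);
    }

    // A negative limit splits on every match and keeps trailing empty strings.
    static String[] splitKeepingTrailingEmpties(String csv) {
        return csv.split(",", -1);
    }

    public static void main(String[] args) {
        // [Google, Apple, Microsoft]
        System.out.println(Arrays.toString(splitAll("Google,Apple,Microsoft")));
        // [Google, Apple,Microsoft]
        System.out.println(Arrays.toString(splitAtMost("Google,Apple,Microsoft", 2)));
        // 2, because the two trailing empty values are dropped
        System.out.println(splitAll("Google,Apple,,").length);
        // 4, because the trailing empty values are kept
        System.out.println(splitKeepingTrailingEmpties("Google,Apple,,").length);
    }
}
```

If empty trailing columns matter in your data, the negative-limit form is probably the one you want.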

If you are not very familiar with regular expressions, you can also use the StringTokenizer class, which can also split a comma-delimited String. However, StringTokenizer is a legacy class and its use is discouraged in new code.

You should prefer the split() method from the java.lang.String class, as any future performance improvement is more likely to land in this method than in the StringTokenizer class. Let's see a couple of examples of splitting a comma-separated String in Java.

By the way, if you are a complete beginner in regular expressions and don't understand these magical symbols, then I highly recommend you join these best Regular expression courses. It's a great collection of online courses to learn everything you want to know about RegEx.





Split a comma-delimited String into an Array - Example

This is the simplest example of splitting a CSV String in Java. All you need to do is call the split() method with the delimiter, as shown in the following example. This method splits the String based upon the given regular expression and returns a String array with the individual String values.



package beginner;

import java.util.Arrays;

/**
 * Java Program to split a CSV String into individual String values using
 * the String.split() method and print the resulting String array.
 *
 * @author WINDOWS
 */
public class HelloWorldApp {

    public static void main(String... args) {

        String CSV = "Google,Apple,Microsoft";

        String[] values = CSV.split(",");

        System.out.println(Arrays.toString(values));
    }
}

Output: [Google, Apple, Microsoft]

You can also create an ArrayList by splitting a comma-separated String as shown below:

ArrayList<String> list = new ArrayList<>(Arrays.asList(values));
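For completeness, here is the same conversion as a small runnable sketch. The class name, the toList() helper, and the extra "Amazon" value are hypothetical and only there to show that the resulting list can grow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvToListDemo {

    // Splits the CSV text and copies the values into a resizable ArrayList.
    static List<String> toList(String csv) {
        String[] values = csv.split(",");
        // Arrays.asList returns a fixed-size view of the array; wrapping it
        // in a new ArrayList gives a list that supports add() and remove().
        return new ArrayList<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        List<String> list = toList("Google,Apple,Microsoft");
        list.add("Amazon");
        System.out.println(list); // [Google, Apple, Microsoft, Amazon]
    }
}
```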

If your comma-separated String also contains whitespace between values, like "Google, Microsoft, Apple", then you can use the following regular expression to split the CSV string and also strip the whitespace around each comma.

String CSV = "Google, Apple, Microsoft";

String[] values = CSV.split("\\s*,\\s*");

System.out.println(Arrays.toString(values));

Here \\s* is the regular expression that matches zero or more whitespace characters. \s is the metacharacter that matches whitespace, including tabs. Since \ (backslash) requires escaping in a Java string literal, it becomes \\ (double backslash) and \s becomes \\s.

Now coming to * (star or asterisk), it's another special character in regular expressions which means zero or more times. So \\s* means whitespace zero or more times.
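One thing to keep in mind is that this pattern only consumes whitespace next to a comma, so whitespace before the first value or after the last value survives the split. A minimal sketch of one way to handle that, assuming it is acceptable to trim() the whole input first (the class and method names are made up for illustration):

```java
import java.util.Arrays;

class TrimAndSplitDemo {

    // Splits on commas with optional surrounding whitespace. Whitespace at
    // the very start or end of the input is not next to a comma, so it
    // stays attached to the first and last values.
    static String[] splitOnly(String csv) {
        return csv.split("\\s*,\\s*");
    }

    // Trims the whole input first so the first and last values are clean too.
    static String[] splitTrimmed(String csv) {
        return csv.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String csv = " Google , Apple,\tMicrosoft ";
        // [ Google, Apple, Microsoft ] (outer spaces are still there)
        System.out.println(Arrays.toString(splitOnly(csv)));
        // [Google, Apple, Microsoft]
        System.out.println(Arrays.toString(splitTrimmed(csv)));
    }
}
```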




Split Comma Separated String using StringTokenizer in Java

And here is an example of how you can use StringTokenizer to split a comma-separated String into individual values. StringTokenizer is a utility class in the java.util package, which accepts a String and breaks it into tokens.

By default, StringTokenizer breaks the String on whitespace, but you can pass the delimiter you want to use. For example, to break a comma-separated String into individual tokens, we can pass a comma (,) as the delimiter, as shown in the following example.

Once you do that, you can just iterate over the StringTokenizer using hasMoreTokens() and retrieve values using nextToken(), similar to how you iterate over an ArrayList using an Iterator.

import java.util.StringTokenizer;

public class HelloWorldApp {

    public static void main(String... args) {

        String CSV = "Google,Apple,Microsoft";

        StringTokenizer tokenizer = new StringTokenizer(CSV, ",");
        
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }        
    }
}

Output
Google
Apple
Microsoft
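If you need a String array rather than printed lines, one possible approach is to size the array with countTokens() and fill it in a loop. The sketch below also shows a difference worth knowing: StringTokenizer skips empty values between consecutive delimiters, while split() keeps them. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class TokenizerToArrayDemo {

    // Collects all comma-delimited tokens into a String array.
    static String[] tokenize(String csv) {
        StringTokenizer tokenizer = new StringTokenizer(csv, ",");
        // countTokens() reports how many times nextToken() can still be called.
        String[] values = new String[tokenizer.countTokens()];
        for (int i = 0; i < values.length; i++) {
            values[i] = tokenizer.nextToken();
        }
        return values;
    }

    public static void main(String[] args) {
        // [Google, Apple, Microsoft]
        System.out.println(Arrays.toString(tokenize("Google,Apple,Microsoft")));
        // [Google, Microsoft] (the empty middle value is skipped)
        System.out.println(Arrays.toString(tokenize("Google,,Microsoft")));
        // [Google, , Microsoft] (split() keeps the empty middle value)
        System.out.println(Arrays.toString("Google,,Microsoft".split(",")));
    }
}
```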

That's all about how to split a comma-separated String in Java. You can use the split() method to parse a comma-delimited String. Just remember that it takes a regular expression, but to break a CSV string you just need to pass "," if your String is like "a,b,c", i.e. there is no space between two values. If there is space between values, then you can use the regular expression "\\s*,\\s*" to remove the space around the commas.


Other Java String tutorials from this blog:
  • How to reverse String in Java using recursion? (solution)
  • How to compare two Strings in Java? (solution)
  • How to format String in Java? (example)
  • How to convert double to String in Java? (solution)
  • How to check if two String are Anagram? (solution)
  • How to convert Date to String in Java? (answer)
  • How String in switch case works in Java? (answer)
  • 21 String interview questions with answers (questions)
  • 100+ data structure interview questions (answers)
  • 25 Recursion exercise for Beginners (recursion questions)
  • 10 Dynamic Programming interview questions (DP questions)
  • 10 Microservice questions with answers (microservices)

Thanks for reading this article so far. If you like this article then please share it with your friends and colleagues. If you have any questions or feedback then please drop a note and if you like to watch then please subscribe to our Youtube channel by clicking this link. 

5 comments :

Unknown said...

Nice post

Partha Pratim Sanyal said...

Doesnt work

javin paul said...

Hello Partha, what problem are you getting? Can you share more?

Anonymous said...

How can I handle it if my CSV has only three columns and the first column value has a comma in it?
For example:
[Google,Alphabet, Apple, Microsoft]

Note that Google,Alphabet is one value.

How should I read the first value then?

Any suggestions are appreciated.
Thanks.

javin paul said...

I think you should use an open-source CSV library like Apache Commons CSV; it should have some way to handle commas within a value. Otherwise, enclose such values in double quotes to mark them as a single value.
