ts_928 - 29 days ago 24
Java Question

Remove punctuation java

I'm trying to remove punctuation from a string but keep the spaces, since I need to be able to tell the words apart. The end goal is to find the length of each word in the string. I set up a for loop that measures a word up to the next space, but this counts punctuation as letters. I know I would have to change the variable in the if statement to the length of the substring between "i" and the "indexOf" of the next space in the string.

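for (int i = 0, end; i < stringLength; i = end + 1) {
    end = original.indexOf(' ', i);             // next space at or after i
    if (end == -1) end = stringLength;          // the last word has no trailing space
    String word = original.substring(i, end);   // still includes any punctuation
    if (word.length() > minLength) { /* ... */ }
}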

Answer

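While it might be tempting to throw a bunch of for loops and if statements at this, it is cleaner to split the string with a regular expression that matches runs of separators (here periods, commas, semicolons, and spaces):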

Pattern.compile("[.,; ]+").splitAsStream(input)
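
The character class above only covers periods, commas, semicolons, and spaces. If the input may contain other punctuation (question marks, parentheses, quotes), one possible variation is to split on `[\p{Punct}\s]+` instead. This is a sketch, not part of the original answer; the class name `BroadSplit`, the `words` helper, and the sample sentence are made up for illustration. Note that `\p{Punct}` also matches apostrophes and hyphens, so "don't" would come out as "don" and "t".

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class BroadSplit {
    // \p{Punct} matches ASCII punctuation, \s matches whitespace
    private static final Pattern SEPARATORS = Pattern.compile("[\\p{Punct}\\s]+");

    // Hypothetical helper: returns the words of the input with separators dropped
    static List<String> words(String input) {
        return SEPARATORS.splitAsStream(input)
                .filter(w -> !w.isEmpty())   // a leading separator yields one empty token
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [Hello, world, Is, this, working]
        System.out.println(words("Hello, world! (Is this working?)"));
    }
}
```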

A full example:

import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class Counting {
    public static void main(String... args) {
        String text = "This is a string. With some punctuation, but I only care about words.";

        String wordsWithLengths = Pattern.compile("[.,; ]+")
                .splitAsStream(text)
                .map(word -> word + " => " + word.length())
                .collect(Collectors.joining("\n"));

        System.out.println(wordsWithLengths);
    }
}

Output:

This => 4
is => 2
a => 1
string => 6
With => 4
some => 4
punctuation => 11
but => 3
I => 1
only => 4
care => 4
about => 5
words => 5
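
If the lengths are needed as numbers rather than as a printed report, a loop-based version closer to what the question describes (remove the punctuation, keep the spaces, then measure each word) could look roughly like the following. It is only a sketch under assumptions: the class name `WordLengths` and the `lengths` method are hypothetical, and it treats everything in `\p{Punct}` as punctuation to strip.

```java
import java.util.ArrayList;
import java.util.List;

public class WordLengths {
    // Hypothetical helper: length of each space-separated word, ignoring punctuation
    static List<Integer> lengths(String original) {
        // Drop ASCII punctuation but keep the spaces so word boundaries survive
        String cleaned = original.replaceAll("\\p{Punct}", "").trim();
        List<Integer> result = new ArrayList<>();
        if (cleaned.isEmpty()) {
            return result;               // splitting "" would yield one empty token
        }
        for (String word : cleaned.split("\\s+")) {
            result.add(word.length());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "This is a string. With some punctuation, but I only care about words.";
        // prints [4, 2, 1, 6, 4, 4, 11, 3, 1, 4, 4, 5, 5]
        System.out.println(lengths(text));
    }
}
```

The numbers match the lengths in the output above, since both approaches treat the same characters in this sentence as separators.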