TheDoctor - 1 year ago
Java Question

How can I remove punctuation from input text in Java?

I am trying to read a sentence from the user in Java, and I need to make it lowercase and remove all punctuation. Here is my code:

String[] words = instring.split("\\s+");
for (int i = 0; i < words.length; i++) {
    words[i] = words[i].toLowerCase();
}
String[] wordsout = new String[50];
int e = 0;
for (int i = 0; i < words.length; i++) {
    if (words[i] != "") {
        wordsout[e] = words[e];
        wordsout[e] = wordsout[e].replaceAll(" ", "");
    }
}
return wordsout;

I can't seem to find any way to remove all non-letter characters. I have tried using regexes and iterators with no luck. Thanks for any help.

Answer Source

This first removes every character that is not a letter or a space, folds to lowercase, then splits the input, doing all the work in a single line:

String[] words = instring.replaceAll("[^a-zA-Z ]", "").toLowerCase().split("\\s+");

Spaces are initially left in the input so the split will still work.
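
To try the one-liner, here is a minimal runnable sketch. The class name PunctuationDemo, the helper toWords, and the sample sentence are assumptions made up for illustration. The trim() call is also an addition that is not part of the line above: without it, input that starts with a space would produce an empty first element after the split.

```java
import java.util.Arrays;

public class PunctuationDemo {
    // Hypothetical helper wrapping the one-liner from the answer.
    static String[] toWords(String instring) {
        return instring
                .replaceAll("[^a-zA-Z ]", "") // drop everything except ASCII letters and spaces
                .toLowerCase()
                .trim()                       // added here: avoids a leading empty element
                .split("\\s+");               // runs of spaces count as one separator
    }

    public static void main(String[] args) {
        // prints [hello, world, its, oclock]
        System.out.println(Arrays.toString(toWords("Hello, World! It's 9 o'clock.")));
    }
}
```

Note that the character class keeps only ASCII letters and the space character, so digits, accented letters, tabs, and newlines are removed along with the punctuation.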

By removing the rubbish characters before splitting, you avoid having to loop through the elements.
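
For contrast, here is a rough sketch of what the split-first approach tends to look like. The class and method names are hypothetical, and a List is used instead of a fixed-size array as a convenience. Each element has to be cleaned in a loop, and tokens that were pure punctuation have to be skipped once they become empty. The emptiness check uses isEmpty() because != on strings compares references, not contents.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitFirstDemo {
    // Hypothetical split-first variant, shown only for comparison.
    static List<String> toWordsSplitFirst(String instring) {
        List<String> out = new ArrayList<>();
        for (String word : instring.split("\\s+")) {
            // Clean each token individually after the split.
            String cleaned = word.replaceAll("[^a-zA-Z]", "").toLowerCase();
            if (!cleaned.isEmpty()) { // skip tokens that held no letters at all
                out.add(cleaned);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [hi, there, you]
        System.out.println(toWordsSplitFirst("Hi -- there, you!"));
    }
}
```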