DevRj - 2 years ago
Java Question

Check if a string is contained in a text file of words in Java

I have a text file, words.txt (a collection of all valid English words), from a GitHub project.

My text file is under the resources folder in my project.

I also have a list of rows obtained from a table in MySQL.
What I'm trying to do is check whether all the words in every row are valid English words, which is why I compare each row with the words contained in my file.

This is what I've tried so far:

public static void englishCheck(List<String> rows) throws IOException {
    ClassLoader classLoader = ClassLoader.getSystemClassLoader();
    int lenght, occurancy = 0;
    for (String row : rows) {
        File file = new File(classLoader.getResource("words.txt").getFile());
        lenght = 0;
        if (!row.isEmpty()) {
            System.out.println("the row : " + row);
            String[] tokens = row.split("\\W+");
            lenght = tokens.length;
            for (String token : tokens) {
                occurancy = 0;
                BufferedReader br = new BufferedReader(new FileReader(file));
                String line;
                while ((line = br.readLine()) != null) {
                    if ((line.trim().toLowerCase()).equals(token.trim().toLowerCase())) {
                        occurancy++;
                    }
                    if (occurancy == lenght) {
                        System.out.println(" this is english " + row);
                        break;
                    }
                }
            }
        }
    }
}


This works only for the very first rows. After that, my method loops over the rows, only displaying them, and ignores the comparison. I would like to know why this isn't working for my set of rows. It also works if I predefine my list like this:
List<String> raws = Arrays.asList(raw1, raw2, raw3 )
and so on

Answer Source

You can read the words.txt file once, convert the words to lower case, and put them into a HashSet.

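Use boolean contains(Object o) to check a single word, or boolean containsAll(Collection<?> c) to check all the tokens of a row. Each HashSet lookup takes O(1) time on average, so checking a row is O(n) in its number of tokens.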
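As a small illustration of the two lookup methods, here is a minimal sketch. The class name LookupSketch, the helper allWordsKnown and the two-word dictionary are made up for this example and are not part of the question. It also skips the empty token that split("\\W+") produces when a row starts with punctuation, which would otherwise make the check fail.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class LookupSketch {
    // Hypothetical helper: true when every non-empty token of the row is in the dictionary.
    // A row without any word token is treated as valid here; adjust if that is not wanted.
    public static boolean allWordsKnown(Set<String> dictionary, String row) {
        for (String token : row.toLowerCase(Locale.ROOT).split("\\W+")) {
            // split() returns a leading "" when the row starts with a non-word character
            if (!token.isEmpty() && !dictionary.contains(token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Tiny stand-in dictionary, assumed only for the demo
        Set<String> dictionary = new HashSet<>(Arrays.asList("hello", "world"));

        // Single word lookup
        System.out.println(dictionary.contains("hello"));
        // Every element of the collection must be present
        System.out.println(dictionary.containsAll(Arrays.asList("hello", "world")));
        // Row level checks through the helper
        System.out.println(allWordsKnown(dictionary, "Hello, world!"));
        System.out.println(allWordsKnown(dictionary, "...hello bonjour"));
    }
}
```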

TIP: Do not read the file on every check. Reading a file is very slow.

// Needs: java.io.BufferedReader, java.io.InputStreamReader,
// java.nio.charset.StandardCharsets, java.util.Arrays, java.util.HashSet, java.util.Set

// Load the dictionary once, outside the per-row check.
ClassLoader classLoader = ClassLoader.getSystemClassLoader();
Set<String> wordSet = new HashSet<String>();
// try-with-resources closes the reader; UTF-8 is an assumption about words.txt.
try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        classLoader.getResourceAsStream("words.txt"), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        // A line may hold several words, so split it and add every piece.
        wordSet.addAll(Arrays.asList(line.toLowerCase().split("\\W+")));
    }
}

// Then you can use the wordSet to check each row.
// You should convert the tokens to lower case first.
String[] tokens = row.toLowerCase().split("\\W+");
boolean allEnglish = wordSet.containsAll(Arrays.asList(tokens));
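
To tie this back to the list of rows from the question, the following is a rough, self-contained sketch of loading the words once and then filtering the rows. The class EnglishRowFilter and the methods loadWords and englishRows are hypothetical names, and a StringReader stands in for words.txt so the example runs without the resource file. It assumes one word per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class EnglishRowFilter {
    // Reads one word per line into a lower-cased set and closes the reader when done.
    public static Set<String> loadWords(Reader source) throws IOException {
        Set<String> words = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Returns the rows that have at least one word token and whose tokens are all in the set.
    public static List<String> englishRows(List<String> rows, Set<String> words) {
        List<String> result = new ArrayList<>();
        for (String row : rows) {
            List<String> tokens = new ArrayList<>(
                    Arrays.asList(row.toLowerCase(Locale.ROOT).split("\\W+")));
            // Drop the empty token produced by leading punctuation or an empty row
            tokens.removeIf(String::isEmpty);
            if (!tokens.isEmpty() && words.containsAll(tokens)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real file. In the project the reader could instead wrap
        // classLoader.getResourceAsStream("words.txt") in an InputStreamReader.
        Set<String> words = loadWords(new StringReader("Hello\nworld\ngood\nmorning\n"));
        List<String> rows = Arrays.asList("Good morning, world", "bonjour world", "");
        System.out.println(englishRows(rows, words)); // prints [Good morning, world]
    }
}
```

The file is read exactly once here, and no reader is left open, whereas the code in the question opens a new FileReader for every token and never closes it.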