Nat95 - 1 month ago
Java Question

Read a file and group its text

I have a file that contains blocks of text, each ending with a number. The file looks like this:

 to Polyxena. Achilles appears in the in the novel The Firebrand by Marion
the firebrand 14852520
 fantasy novelist David Gemmell omic book hero Captain Marvel is endowed with the courage of Achilles, as well
captain marvel 403585
 the city its central theme and
corfu 45462


What I want is to group all the text up to and including the number into a single string. For example:

" to Polyxena. Achilles appears in the in the novel The Firebrand by Marion the firebrand 14852520"

" fantasy novelist David Gemmell omic book hero Captain Marvel is endowed with the courage of Achilles, as well captain marvel 403585"


I noticed that each group of text begins with whitespace, but I am having difficulty grouping the lines. This is what I have so far:

String line;
String s = " ";
char whiteSpace = s.charAt(0);

ArrayList<String> lines = new ArrayList<>();
BufferedReader in = new BufferedReader(new FileReader(args[0]));
while ((line = in.readLine()) != null) {
    if (whiteSpace == line.charAt(0)) { // start of sentence
        lines.add(line);
    }
}
in.close();

Answer

You could follow this algorithm:

  • Create an empty buffer
  • For each line:
    • Append to the buffer
    • If the line ends with a number:
      • Add the buffer to the list
      • Empty the buffer

Something like this:

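import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

String text = " to Polyxena. Achilles appears in the in the novel The Firebrand by Marion \n" +
        "the firebrand   14852520\n" +
        " fantasy novelist David Gemmell omic book hero Captain Marvel is endowed with the courage of Achilles, as well \n" +
        "captain marvel  403585\n" +
        " the city its central theme and \n" +
        "corfu   45462";
Scanner scanner = new Scanner(text);

List<String> lines = new ArrayList<>();
StringBuilder buffer = new StringBuilder();

// Pair hasNextLine() with nextLine(); hasNext() checks for tokens, not lines.
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    // Lines are concatenated as-is; the sample lines already end with a space.
    buffer.append(line);
    // A line ending in a digit closes the current group.
    if (line.matches(".*\\d+$")) {
        lines.add(buffer.toString());
        buffer.setLength(0); // empty the buffer for the next group
    }
}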
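To apply the same idea to the file from the question rather than to an inline string, something along these lines might work. This is only a sketch under a few assumptions: the class name `LineGrouper` and the method `group` are made up for illustration, the file is assumed to be UTF-8 (the default for `Files.readAllLines`), a single space is inserted between joined lines when the previous line does not already end in whitespace, and any trailing text with no closing number is dropped, as in the snippet above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the original post.
public class LineGrouper {
    // A line closes a group when it ends with a digit (trailing whitespace tolerated).
    private static final Pattern ENDS_WITH_NUMBER = Pattern.compile("\\d\\s*$");

    static List<String> group(List<String> lines) {
        List<String> groups = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (String line : lines) {
            // Assumption: join lines with one space unless whitespace is already there.
            if (buffer.length() > 0
                    && !Character.isWhitespace(buffer.charAt(buffer.length() - 1))) {
                buffer.append(' ');
            }
            buffer.append(line);
            if (ENDS_WITH_NUMBER.matcher(line).find()) {
                groups.add(buffer.toString());
                buffer.setLength(0); // start the next group
            }
        }
        // Text left in the buffer without a closing number is dropped.
        return groups;
    }

    public static void main(String[] args) throws IOException {
        // Read the file named on the command line, or fall back to a tiny sample.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList(" the city its central theme and", "corfu 45462");
        for (String g : group(lines)) {
            System.out.println(g);
        }
    }
}
```

Unlike the loop in the question, this never calls `charAt(0)` on a line, so an empty line in the file does not throw a `StringIndexOutOfBoundsException`.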