Crozin - 12 days ago
Java Question

Read large amount of data from file in Java

I've got a text file that contains 1 000 002 numbers in the following format:

123 456
1 2 3 4 5 6 .... 999999 100000


Now I need to read that data and assign the very first two numbers to int variables and all the rest (1 000 000 numbers) to an int[] array.

It's not a hard task, but it's horribly slow.

My first attempt was java.util.Scanner:



Scanner stdin = new Scanner(new File("./path"));
int n = stdin.nextInt();
int t = stdin.nextInt();
int array[] = new int[n]; // element type is int

for (int i = 0; i < n; i++) {
    array[i] = stdin.nextInt();
}


It works as expected, but it takes about 7500 ms to execute. I need to read that data within a few hundred milliseconds.

Then I tried java.io.BufferedReader. Using BufferedReader.readLine() and String.split() I got the same results in about 1700 ms, but that's still too slow.
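The code for this attempt is not shown, so here is only a rough sketch of what a readLine() plus split() version might look like. It assumes the layout from the sample above (the two leading numbers on the first line, all remaining numbers on the second line, separated by whitespace). The class name SplitReaderSketch and the Data holder are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class SplitReaderSketch {
    /** Hypothetical holder for the two leading numbers and the data array. */
    static final class Data {
        final int n;
        final int t;
        final int[] array;

        Data(int n, int t, int[] array) {
            this.n = n;
            this.t = t;
            this.array = array;
        }
    }

    /** Assumes line 1 is "n t" and line 2 holds at least n whitespace-separated integers. */
    static Data read(BufferedReader reader) throws IOException {
        String[] header = reader.readLine().trim().split("\\s+");
        int n = Integer.parseInt(header[0]);
        int t = Integer.parseInt(header[1]);

        // The whole second line is split at once, which creates one String per number.
        String[] tokens = reader.readLine().trim().split("\\s+");
        int[] array = new int[n];
        for (int i = 0; i < n; i++) {
            array[i] = Integer.parseInt(tokens[i]);
        }
        return new Data(n, t, array);
    }

    public static void main(String[] args) throws IOException {
        // For a real file, wrap a FileReader instead: new BufferedReader(new FileReader("./path"))
        try (BufferedReader reader = new BufferedReader(new StringReader("3 456\n10 20 30\n"))) {
            Data d = read(reader);
            System.out.println(d.n + " " + d.t + " " + d.array[2]); // prints "3 456 30"
        }
    }
}
```

Splitting a line with a million numbers allocates a million temporary String objects before any of them is parsed, which is one plausible reason this approach stays well above the target time.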

How can I read that amount of data in less than 1 second? The final result should be equal to:

int n = 123;
int t = 456;
int array[] = { 1, 2, 3, 4, ..., 999999, 100000 };


Following trashgod's answer: the StreamTokenizer solution is fast (it takes about 1400 ms), but it's still too slow:

StreamTokenizer st = new StreamTokenizer(new FileReader("./test_grz"));
st.nextToken();
int n = (int) st.nval;

st.nextToken();
int t = (int) st.nval;

int array[] = new int[n];

for (int i = 0; st.nextToken() != StreamTokenizer.TT_EOF; i++) {
    array[i] = (int) st.nval;
}


PS. There is no need for validation. I'm 100% sure that the data in the ./test_grz file is correct.

Answer

Thanks for all the answers, but I've already found a method that meets my criteria:

BufferedInputStream bis = new BufferedInputStream(new FileInputStream("./path"));
int n = readInt(bis);
int t = readInt(bis);
int array[] = new int[n];
for (int i = 0; i < n; i++) {
    array[i] = readInt(bis);
}

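// Reads the next run of ASCII digits from the stream as a non-negative int.
// Non-digit bytes before the number are skipped; no validation is done.
private static int readInt(InputStream in) throws IOException {
    int ret = 0;
    boolean dig = false; // true once at least one digit has been seen

    for (int c = 0; (c = in.read()) != -1; ) {
        if (c >= '0' && c <= '9') {
            dig = true;
            ret = ret * 10 + c - '0'; // append the digit to the value
        } else if (dig) break; // first non-digit after the number ends it
    }

    return ret; // 0 if the stream ended before any digit was read
}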

It takes only about 300 ms to read 1 million integers!
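To see how readInt behaves on its own, here is a small self-contained sketch that runs the same logic against in-memory data instead of a file. The class name ReadIntDemo and the sample inputs are made up for illustration. It also shows two consequences of skipping validation: a minus sign is ignored like any other non-digit, and the method returns 0 once the stream is exhausted.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadIntDemo {
    // Same logic as readInt above: skip non-digits, then accumulate digits.
    static int readInt(InputStream in) throws IOException {
        int ret = 0;
        boolean dig = false;

        for (int c; (c = in.read()) != -1; ) {
            if (c >= '0' && c <= '9') {
                dig = true;
                ret = ret * 10 + c - '0';
            } else if (dig) {
                break;
            }
        }
        return ret;
    }

    // Helper (hypothetical) that turns a string into an in-memory stream.
    static InputStream streamOf(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        InputStream in = streamOf("12 34\n5");
        System.out.println(readInt(in)); // prints 12
        System.out.println(readInt(in)); // prints 34
        System.out.println(readInt(in)); // prints 5
        System.out.println(readInt(in)); // prints 0 (end of stream, no digits left)

        // The sign is treated as a separator, so the value comes back positive.
        System.out.println(readInt(streamOf("-7"))); // prints 7
    }
}
```

That is fine for input known to contain only non-negative integers, as stated in the question, but data with negative numbers would need an extra check for '-'.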