Jane Wayne - 12 days ago
Java Question

what is an efficient way to transpose a matrix in a text file?

i have a text file that holds a 2-dimensional matrix. it looks like the following.

01 02 03 04 05
06 07 08 09 10
11 12 13 14 15
16 17 18 19 20


as you can see, each row is delimited by a new line and each column is delimited by a space. i need to transpose this matrix in an efficient way.

01 06 11 16
02 07 12 17
03 08 13 18
04 09 14 19
05 10 15 20


in reality, the matrix is 10,000 by 14,000. the individual elements are double/float. it would be costly, if not impossible, to attempt to transpose this file/matrix all in memory.

does anyone know of a util API to do something like this or an efficient approach?

what i have tried: my naive approach has been to create a temporary file for each row of the transposed matrix (that is, each column of the original). so, if the transposed matrix has 10,000 rows, i will have 10,000 temporary files. as i read each line, i tokenize it and append each value to the corresponding file. so with the example above, i will have something like the following.

file-0: 01 06 11 16
file-1: 02 07 12 17
file-2: 03 08 13 18
file-3: 04 09 14 19
file-4: 05 10 15 20


i then read each file back in and append them into one file. i wonder if there's a smarter way because i know the file i/o operations will be a pain point.

Answer

Solution with minimal memory consumption and extremely low performance:

import org.apache.commons.io.FileUtils;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class MatrixTransposer {

  private static final String TMP_DIR = System.getProperty("java.io.tmpdir") + "/";
  private static final String EXTENSION = ".matrix.tmp.result";
  private final String original;
  private final String dst;

  public MatrixTransposer(String original, String dst) {
    this.original = original;
    this.dst = dst;
  }

  public void transpose() throws IOException {

    deleteTempFiles();

    int max = 0;

    // try-with-resources closes the reader (and the wrapped FileReader) even on failure
    try (BufferedReader reader = new BufferedReader(new FileReader(original))) {
      String row;
      while ((row = reader.readLine()) != null) {
        // each source row contributes one element to every per-column temp file
        max = appendRow(max, row, 0);
      }
    }

    mergeResultingRows(max);
  }

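  private void deleteTempFiles() {
    // File.list() returns null if the directory cannot be read
    String[] names = new File(TMP_DIR).list();
    if (names == null) return;
    for (String tmp : names) {
      if (tmp.endsWith(EXTENSION)) {
        FileUtils.deleteQuietly(new File(TMP_DIR, tmp));
      }
    }
  }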

  private void mergeResultingRows(int max) throws IOException {

    FileUtils.deleteQuietly(new File(dst));

    FileWriter writer = null;
    BufferedWriter out = null;

    try {
      writer = new FileWriter(new File(dst), true);
      out = new BufferedWriter(writer);
      for (int i = 0; i <= max; i++) {
        out.write(FileUtils.readFileToString(new File(TMP_DIR + i + EXTENSION)) + "\r\n");
      }
    } finally {
      if (null != out) out.close();
      if (null != writer) writer.close();
    }
  }

  private int appendRow(int max, String row, int i) throws IOException {

    // split on any run of whitespace so repeated spaces do not yield empty elements
    for (String element : row.trim().split("\\s+")) {

      File tmp = new File(TMP_DIR + i + EXTENSION);
      // File.length() is 0 for a missing or empty file, i.e. nothing written yet
      String prefix = columnPrefix(tmp.length() == 0);
      try (BufferedWriter out = new BufferedWriter(new FileWriter(tmp, true))) {
        out.write(prefix + element);
      }
      max = Math.max(i++, max);
    }
    return max;
  }

  private String columnPrefix(boolean first) {

    // no separator before the first element of a transposed row
    return (first ? "" : " ");
  }

  public static void main(String[] args) throws IOException {

    new MatrixTransposer("c:/temp/mt/original.txt", "c:/temp/mt/transposed.txt").transpose();
  }
}
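
If the per-element file opens above turn out to be too slow, one alternative (not the approach used above, just a sketch) is to skip temp files entirely and re-read the source once per band of columns, keeping only that band in memory. The class name BandTransposer and the band parameter are made up for illustration, and the sketch assumes the matrix is rectangular and whitespace-delimited. Elements are copied as text, so no float parsing or rounding happens.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: transposes a text matrix in several passes over the source.
public class BandTransposer {

  // Writes the transpose of src into dst, buffering at most `band` source columns at a time.
  public static void transpose(Path src, Path dst, int band) throws IOException {
    int cols = countColumns(src);
    try (BufferedWriter out = Files.newBufferedWriter(dst, StandardCharsets.UTF_8)) {
      for (int start = 0; start < cols; start += band) {
        int width = Math.min(band, cols - start);
        // one builder per source column in this band; each becomes one output row
        List<StringBuilder> rows = new ArrayList<>(width);
        for (int c = 0; c < width; c++) rows.add(new StringBuilder());

        // full pass over the source, keeping only the columns in [start, start + width)
        try (BufferedReader in = Files.newBufferedReader(src, StandardCharsets.UTF_8)) {
          String line;
          while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) continue; // skip blank lines
            String[] cells = line.trim().split("\\s+");
            for (int c = 0; c < width; c++) {
              StringBuilder sb = rows.get(c);
              if (sb.length() > 0) sb.append(' ');
              sb.append(cells[start + c]); // assumes every row has `cols` elements
            }
          }
        }
        for (StringBuilder sb : rows) {
          out.write(sb.toString());
          out.newLine();
        }
      }
    }
  }

  // Column count is taken from the first non-blank line.
  private static int countColumns(Path src) throws IOException {
    try (BufferedReader in = Files.newBufferedReader(src, StandardCharsets.UTF_8)) {
      String line;
      while ((line = in.readLine()) != null) {
        if (!line.trim().isEmpty()) return line.trim().split("\\s+").length;
      }
    }
    return 0;
  }
}
```

The trade-off is between passes and memory: the source is read ceil(columns / band) times, and each pass holds roughly rows x band elements as text. A larger band means fewer passes, so it is worth picking the largest value that fits comfortably in the heap.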