
Matrix manipulation: logic not producing the correct answer for higher-order NxN matrix data

I came across the following problem related to matrix manipulation.

Problem statement

There is an NxN matrix, divided into N * N cells, where each cell has a predefined value given as input. The iteration has to happen K times, where K is also given in the test input. At each iteration we have to pick the optimum (minimum) row or column sum. The final output is the cumulative sum of the optimum values saved at the end of each iteration.

Step 1. Sum up each individual row and column, and find the minimum among all the row and column sums (it could be a row or a column; we have to find the minimum).

Step 2. Store the sum found above separately.

Step 3. Increment every element of the minimum-sum row or column by 1.

Repeat steps 1 to 3 K times, adding up the sum saved at each iteration (from step 2).

The output is the cumulative total obtained after the Kth iteration.
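A straightforward (unoptimized) Java sketch of these steps might look like the following. Note that the statement does not specify how ties between a row and a column are broken, so this sketch favours rows; reading the first line of the sample below as N and K is also an assumption:

```java
public class MinLineSumNaive {
    // Direct simulation of steps 1-3: rescan every row and column each iteration.
    static long solve(int[][] matrix, int k) {
        int m = matrix.length, n = matrix[0].length;
        long total = 0;
        for (int it = 0; it < k; it++) {
            // Step 1: find the minimum row or column sum.
            long best = Long.MAX_VALUE;
            int bestIndex = -1;
            boolean bestIsRow = true;
            for (int i = 0; i < m; i++) {
                long s = 0;
                for (int j = 0; j < n; j++) s += matrix[i][j];
                if (s < best) { best = s; bestIndex = i; bestIsRow = true; }
            }
            for (int j = 0; j < n; j++) {
                long s = 0;
                for (int i = 0; i < m; i++) s += matrix[i][j];
                if (s < best) { best = s; bestIndex = j; bestIsRow = false; }
            }
            // Step 2: accumulate the minimum sum.
            total += best;
            // Step 3: increment every element of that row or column by 1.
            if (bestIsRow) for (int j = 0; j < n; j++) matrix[bestIndex][j]++;
            else           for (int i = 0; i < m; i++) matrix[i][bestIndex]++;
        }
        return total;
    }

    public static void main(String[] args) {
        // The sample below reads as N = 2, K = 4, followed by the matrix rows.
        int[][] matrix = { { 1, 3 }, { 2, 4 } };
        System.out.println(solve(matrix, 4)); // prints 22 (with ties going to rows)
    }
}
```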

Sample data

2 4
1 3
2 4

Output data


I was able to write code (in Java) and tested it against some sample test cases, and the output was correct. The code works fine for sample matrices of lower order, say 2x2, 4x4, even up to 44x40 (which needs fewer iterations). However, when the matrix size is increased to 100x100 (with far more iterations), the expected and actual output values differ in the tens and hundreds digits, seemingly at random. Since I am not able to find a pattern relating output to input, it is taking a toll on me to debug something like the 500th loop iteration by hand. Is there a better way or approach to solve such problems involving large matrix manipulation? Has anyone come across and solved issues similar to this?

I am mainly interested in knowing the correct approach to solve the given matrix problem, and which data structure to use in Java. At present I am using primitive data structures and arrays (int[] or long[]) to solve this problem. I appreciate any help in this regard.


Which data structure?

What you need here is a data structure which allows you to efficiently query and update the minimum-sum line. The one most commonly used for this is a heap.

For your purposes it's probably best to just implement the simplest kind, an array-based binary heap; see any standard reference for the implementation details.
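For illustration, a minimal array-based binary min-heap might look like the sketch below (the class name is illustrative, and it stores plain long values, whereas the scheme that follows keys row/column objects by their sum attribute; the mechanics are identical):

```java
// A minimal array-based binary min-heap over long values.
// heap[0] is always the smallest element; the parent of index i is (i - 1) / 2.
final class LongMinHeap {
    private final long[] heap;
    private int size;

    LongMinHeap(long[] values) {
        heap = values.clone();
        size = heap.length;
        // Heapify: sift down every internal node, from the last parent up to the root.
        for (int i = size / 2 - 1; i >= 0; i--) siftDown(i);
    }

    long peekMin() { return heap[0]; }

    // Replace the minimum with a new value and restore the heap property.
    void replaceMin(long value) {
        heap[0] = value;
        siftDown(0);
    }

    private void siftDown(int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
            if (left < size && heap[left] < heap[smallest]) smallest = left;
            if (right < size && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i) return;                    // heap property restored
            long tmp = heap[i]; heap[i] = heap[smallest]; // swap with smaller child
            heap[smallest] = tmp;
            i = smallest;
        }
    }
}
```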


  • Initialize your heap to size M + N, where M and N are the number of rows and columns.
  • Before the loop, pre-compute the sum of each row and column, and add them as objects to the heap. Also keep two arrays A and B which store the row and column objects separately.
  • Now heapify the heap array with respect to the line-sum attribute. This ensures the array satisfies the binary min-heap property (a parent is never larger than its children). See the sketch above for how to implement this (quite easy over a fixed array).
  • For each iteration, look at the first element in the heap array. This is always the one with the smallest line sum; add that sum to a running total. If it is a row object, then increment its sum attribute by N (the number of columns), and increment the sum of each object in B (the list of columns) by 1. Do the symmetric thing if it is a column: increment its sum by M, and each object in A by 1.
  • After this, always heapify again before the next iteration.

At the end, just return the running total accumulated over the K iterations.
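Putting it together, one possible sketch in Java is below. For brevity it uses java.util.PriorityQueue (itself a binary min-heap) instead of a hand-rolled array heap, and it adds one twist to the scheme above: incrementing a row adds 1 to every column sum without changing the columns' relative order, so those uniform +1s can be tracked in a single offset counter per heap instead of touching every object and re-heapifying:

```java
import java.util.PriorityQueue;

public class MinLineSumHeap {
    // Cumulative sum of the minimum row/column sum over k iterations.
    static long solve(int[][] matrix, int k) {
        int m = matrix.length, n = matrix[0].length;
        long[] rowSum = new long[m], colSum = new long[n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) {
                rowSum[i] += matrix[i][j];
                colSum[j] += matrix[i][j];
            }

        // Min-heaps over the "base" sums; the uniform +1s live in the offsets.
        PriorityQueue<Long> rowHeap = new PriorityQueue<>();
        PriorityQueue<Long> colHeap = new PriorityQueue<>();
        for (long s : rowSum) rowHeap.add(s);
        for (long s : colSum) colHeap.add(s);

        long rowOffset = 0; // +1 per column increment, applies to every row sum
        long colOffset = 0; // +1 per row increment, applies to every column sum
        long total = 0;

        for (int it = 0; it < k; it++) {
            long minRow = rowHeap.peek() + rowOffset;
            long minCol = colHeap.peek() + colOffset;
            if (minRow <= minCol) {              // ties broken in favour of rows
                total += minRow;
                rowHeap.add(rowHeap.poll() + n); // the row itself gains n (one per cell)
                colOffset++;                     // every column sum gains 1
            } else {
                total += minCol;
                colHeap.add(colHeap.poll() + m); // the column itself gains m
                rowOffset++;                     // every row sum gains 1
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] matrix = { { 1, 3 }, { 2, 4 } }; // the sample matrix, with K = 4
        System.out.println(solve(matrix, 4));    // prints 22, matching the naive sketch
    }
}
```

Each iteration then costs O(log(M + N)) heap work, on top of a one-off O(M * N) pass to build the initial sums.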

Time complexity:

The original naive solution (looping through all rows and columns every time) is O(K * M * N).

Using a heap, the heapify operation at each step is O(log(M + N)) (for a binary heap).

This means the total complexity is O(K * (max(M, N) + log(M + N))), FAR smaller. The max(M, N) term accounts for the fact that at each iteration either a whole row or a whole column is incremented, which changes all the sums in the other direction by 1.
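To put concrete numbers on it: with M = N = 100 and K = 500, the heap approach does on the order of 500 * (100 + log2(200)) ≈ 54,000 sum updates, versus 500 * 100 * 100 = 5,000,000 cell visits for the naive rescan.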

As a side note, there are other heap structures with even better time complexity than the binary heap, e.g. binomial heaps, Fibonacci heaps, etc. These, however, are far more complicated and have higher constant-factor overheads as a result. For your project I feel they are not necessary, as many of them need phenomenal data-set sizes to justify the constant-factor overhead.

Besides, they all support the same external operations as the binary heap, as defined by the heap abstract data type.

(Heapify is an internal operation specific to the binary-heap structure. Quite a few of the others are theoretically superior, as they do this work implicitly and "lazily".)