Nav Nav - 5 months ago 6x
SQL Question

Retrieving distinct rows where columns have the same value and iterating by not having a cursor timeout?

I have seen this, this and this, but none of the answers solve my problem. Even

group by
isn't working because of a

I have a table with rows:

id, cId, aId, eId, val

1 231 6 56 4

2 231 6 56 7

3 451 6 30 4

4 762 7 56 4

5 342 7 32 5

6 453 7 34 6

7 342 7 32 8

I want distinct rows based on
only. So the result will be:

231 6 56

451 6 30

762 7 56

342 7 32

453 7 34

The two challenges:

1. What query could give me this result?

2. The DB contains around 16 million rows. Once I retrieve the unique rows, I'll be iterating it via
. I've encountered a problem with
, that when iterating a very long result set, the cursor's connection to the database times out even though I'm actively iterating. I've been told that the same can happen with
if it is a very large result set.

So my question here is not just about retrieving those distinct rows, but also about getting it in such a way that I'd be able to iterate on chunks of maybe 10000 of them at a time. I know of the
command, but I don't know how to continue retrieving from the 10001'th result after running a select with limit 10000.


How about

select distinct cId,aId, eId from TheTable;

also see about sql pagination (if necessary).

You'd get the results in pages with repeated commands in Java, like this:

select distinct cId, aId, eId from TheTable limit 1, 10000;
select distinct cId, aId, eId from TheTable limit 10001, 10000;
select distinct cId, aId, eId from TheTable limit 20002, 10000;
select distinct cId, aId, eId from TheTable limit 30003, 10000;

and so on...
You'd need a variable to change the 1 to 10001 and then to 20002 etc. What those statements mean, is select something from TheTable and limit it to the results of the first row of the results to the 10000'th row of the results. The second select does a select from the 10001'th row of the distinct results to the 10001+10000'th row.

But before starting with the pagination, it'd help to know the total number of results you are going to iterate through. You can do that with:

select count(distinct cId,aId, eId) from TheTable;