DhiwaTdG - 1 year ago 65
C Question

# Algorithm for intersection of n arrays in C

I need to write a function which returns the intersection

`(AND condition)`
of all the arrays generated in every iteration for an array of queries.

If my query is given by:
`query[] = {"num>20", "avg==5", "deviation != 0.5"}`
then,
`n`
runs from
`0`
to
`length of query`
. The query is passed on to a function (
`get_sample_ids`
) which compares the condition against a list of samples possessing certain information. The returned array numbers from
`get_sample_ids`
are the index of the respective samples.

``````query[] = {"num>20", "avg==5", "deviation != 0.5"}
int intersected_array*;
for n=0:query.length-1
int arr* = get_sample_ids(query[n]);
// n=0: [1, 7, 4, 2, 6]
// n=1: [3, 6, 2]
// n=2: [6, 2]
end;

Expected output: intersected_array* = [6, 2]
``````

I've coded an implementation which has 2 arrays (
`arr*`
,
`temp*`
). For every array returned in the iteration, it is first stored in the
`temp*`
array and the intersection of
`arr*`
and
`temp*`
is stored in
`arr*`
. Is this an optimal solution or what is the best approach?

This is quite efficient but could be tiresome to implement (haven't tried it).

1. Determine the `shortest` array. Benefit of using C is that if you don't know their length, you can use the pointers to arrays to determine it if they are placed sequentially in memory.
2. Make a `<entry,boolean>` hash `map` for the entries in `shortest`. We know the size and if anything it's only going down in next steps.
3. Iterate through an array. Start by initiating the whole `map` to false. For each entry check it in `map`.
4. Iterate through `map` deleting all the entries that weren't checked. Set all the values to false.
5. If there any new arrays left go back to step 3. with a new array.
6. The result are the keys in the final `map`.

It looks like much but we didn't have to resort to any high complexity measures. Key to good performance is using hash map because of constant access time.

Alternatively:

Make the `map` be `<entry,int>`. This way you can count up all the recurrences and don't have to reset it at every iteration, which adds to complexity.

At the end just compare the number of array's to the the values in `map`. Those that match are your solution.

Complexity:

Seems like O(n).

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download