Omnus Ruthius Omnus Ruthius - 5 months ago 11
SQL Question

Oracle SQL - Isolating Orphaned Data

I have an application that keeps track when a file is being “attempted” to move from one server to another, as well as when it has “succeeded” or “failed.” An "attempt" should always be paired with a "success" or "failure." However, I am encountering orphaned attempts...meaning there have been attempts without any success or failure reported. From another question I asked, I am able to isolate the e_ids in particular that are considered orphans. However, I do not want to return the "non-orphaned" transmissions in the final output. Consider the following sample output:

enter image description here
In the output above, for example, e_id 000125 has a completed transmission (attempt-->success) at time 5/23/2016 11:37:09PM. I do not want to see this. Again, e_id 000672 has a completed transmission(attempt-->failure) at time 5/25/2016 1:28:36PM. I do not want to see this either. I only want to see the orphaned transmissions, with a result set looking like this:

enter image description here

It may be worth noting that all completed transmissions occur within one second of each other (hence why in the first sample output the dates appear as identical, when in reality they differ by milliseconds).

Finally, my query thus far is as follows

--This query shows all data for all contact_id's known to be an orphan
SELECT * FROM
(
SELECT
d.*
FROM
(
SELECT e_id, COUNT(*) AS attempts FROM e_table
WHERE e_comment LIKE '%attempting%'
AND e_date >= '23-MAY-2016'
AND e_date <= '26-MAY-2016'
GROUP BY e_id
) a
FULL OUTER JOIN
(
SELECT e_id, COUNT(*) AS successes FROM e_table
WHERE e_comment LIKE '%successful%'
AND e_date >= '23-MAY-2016'
AND e_date <= '26-MAY-2016'
GROUP BY e_id
) s
ON s.e_id = a.e_id
FULL OUTER JOIN
(
SELECT e_id, COUNT(*) AS failures FROM e_table
WHERE e_comment LIKE '%failed%'
AND e_date >= '23-MAY-2016'
AND e_date <= '26-MAY-2016'
GROUP BY e_id
) f
ON f.e_id = COALESCE(a.e_id, s.e_id)
FULL OUTER JOIN
(
SELECT * FROM e_table
WHERE e_date >= '23-MAY-2016'
AND e_date <= '26-MAY-2016'
) d
ON a.e_id = d.e_id
WHERE
COALESCE(attempts, 0) <> COALESCE(successes, 0) + COALESCE(failures, 0)
)
MINUS
(
--This is where I'm stuck. I figure, with a MINUS, I can remove the
--cases with completed transmissions, thus showing ONLY the orphans.
)


Any help is greatly appreciated.

Answer

The following query accomplishes what I need. Instead of using the range specified in the original question, I pull "yesterday's" data:

SELECT et.* FROM
(
    SELECT ce_id, e_id FROM e_table et
    WHERE e_comment LIKE ("attempt")
    AND e_date >= TO_DATE(TRUNC(SYSDATE-1))
    AND e_date <= TO_DATE(TRUNC(SYSDATE-1) || ' 23:59:59', 'DD-MON-YY HH24:MI:SS')
    AND NOT EXISTS
    (
        SELECT ce_id, e_id FROM e_table ett
        WHERE e_comment LIKE ("success") OR e_comment LIKE ("failure")
        AND et.e_id = ett.e_id
        AND ett.ce_id > et.ce_id
    ) 
) orphans, e_table et
WHERE orphans.ce_id = et.ce_id
AND orphans.e_id = et.e_id
ORDER BY et.ce_id ASC;
Comments