I have been working on this exercise on Codecademy for several days and still couldn't understand the logic behind it.Below is the exercise and code:
It would be interesting to order flights by giving them a sequence number based on time, by carrier.
For instance, assuming flight_id increments with each additional flight, we could use the following query to view flights by carrier, flight id, and sequence number:
SELECT carrier, id,
FROM flights f
WHERE f.id < flights.id
AND f.carrier=flights.carrier) + 1 AS flight_sequence_number
f.id < flights.id
The query selects records from table flights. For each record it selects carrier, id and the flight_sequence_number.
So for every record in flights the subquery gets executed. It reads the table flights again, but only takes records with the same carrier and a flight ID smaller than the one of the main record. In order to talk about the main query record and the sub query record, you need one or two table aliases, for otherwise both records would be called flight as in flight.id, and it would not be clear which one you are talking of. So the table in the inner query gets the alias f. Now you can compare f.id < flights.id and f.carrier = flights.carrier.
COUNT(*) counts hence all records lower than a main record's flight ID for its carrier and thus numbers the carrier's rows. For the smallest ID you find no smaller IDs so the count is 0 and you add one thus getting row number 1. For the second smallest you find one record with a smaller ID, hence you get count 1, add 1 and get row number 2 and so on.
The results will look better, when you add an order by clause
ORDER BY carrier, id, so you show the results in the order used.
As has been mentioned, several modern DBMS offer analytic functions, such as
ROW_NUMBER and the query becomes much simpler:
select carrier, id, row_number() over (partition by carrier order by id) as flight_sequence_number from flights order by carrier, id;