Fishingwest Fishingwest - 1 month ago 7
SQL Question

Find a way to create multiple columns that average data by different groupings within the same SQL query

This query provides us with the time of various operations. Right now it breaks it down the time for each event by name, date, and the shovel model. However, I'm looking to somehow join the average of all the names in one crew into a new column. Being new to the SQL game it would be great to have a strategy that could provide various average definitions and return them to the same table. A picture of the output follows this link.

https://i.stack.imgur.com/6V5XB.png

DECLARE @start AS DATETIME = '2017-08-15'

DECLARE @end AS DATETIME = '2017-08-20'

SELECT *

FROM (

SELECT

[Name],
[Date],
MAX([Crew]) as [Crew],
MAX([Shovel_ID]) as [Shovel Model],
CONVERT(DECIMAL(10,1) ,AVG([empty])) as [Empty],
CONVERT(DECIMAL(10,1) ,AVG([HAUL])) AS [Hauling],
CONVERT(DECIMAL(10,0) ,AVG([Wait At Dump]*60)) AS [Wait at Dump],
CONVERT(DECIMAL(10,0) ,AVG([Dumping]*60)) as [Dumping],
CONVERT(DECIMAL(10,0) ,AVG([Spot at LU]*60)) as [Spot at LU],
CONVERT(DECIMAL(10,0) ,AVG([Load Time]*60)) AS [Truck Loading],
CONVERT(DECIMAL(10,0) ,AVG([QUEUE]*60)) as [Wait at Shovel],
CONVERT(DECIMAL(10,0) ,AVG([Hot Seat]*60)) as [Hot Seat],
CONVERT(DECIMAL(10,0) ,AVG([Empty Hauling]*60)) as [Empty Stopped],
CONVERT(DECIMAL(10,0) ,AVG([HAULING STOPPED]*60)) AS [Hauling Stopped]


FROM (

Select
haul_cycle,
max([name]) as [Name],
max([shovel_id]) as [Shovel_ID],
max([date]) as [Date],
max([crew]) as [Crew],
sum(dumping) as [Dumping],
sum([spot at lu]) as [Spot at LU],
sum([wait at dump]) as [Wait at Dump],
sum([empty]) as [Empty],
sum([hauling]) as [Haul],
SUM([Truck Loading]) as [Load Time],
SUM([Queue at LU]) as [QUEUE],
SUM([Hot Change]) as [Hot Seat],
SUM([EMPTY sTOPPED]) AS [Empty Hauling],
SUM([HAULing STOPPED]) AS [Hauling Stopped],

SUM([HAULing STOPPED]+[EMPTY STOPPED]+[Queue at LU]+[Truck Loading]+
[Hauling]+[Empty]+[Spot at LU]+[wait at dump]+[dumping]) as [Cycle Time],
COUNT([HAUL_CYCLE]) AS [Number of Cycles]

FROM (

SELECT
HAUL_CYCLE_TRANS.HAUL_CYCLE_REC_IDENT as [HAUL_CYCLE],
CREW_IDENT as [CREW],
badge.LAST_NAME + ' , ' + badge.FIRST_NAME as [Name],
DUMP_END_SHIFT_DATE as date ,
DUMP_END_SHIFT_IDENT,
CAST(EQUIPMENT_STATUS_TRANS.END_TIMESTAMP-
EQUIPMENT_STATUS_TRANS.START_TIMESTAMP AS FLOAT)*24*60 AS DURATION,
COALESCE(SUB_STATUS_DESC, STATUS_DESC) AS [Status],

CASE
WHEN HAUL_CYCLE_TRANS.LOADING_UNIT_IDENT IN (4001, 4002, 4005, 4006)
THEN '6060'
ELSE '7495'
END AS SHOVEL_ID

FROM HAUL_CYCLE_TRANS

LEFT JOIN HAUL_UNIT_STATUS_TRANS_COL on
HAUL_UNIT_STATUS_TRANS_COL.HAUL_CYCLE_REC_IDENT =
HAUL_CYCLE_TRANS.HAUL_CYCLE_REC_IDENT

LEFT JOIN EQUIPMENT_STATUS_TRANS on
EQUIPMENT_STATUS_TRANS.EQUIP_STATUS_REC_IDENT =
HAUL_UNIT_STATUS_TRANS_COL.EQUIP_STATUS_REC_IDENT

LEFT JOIN badge on badge.BADGE_IDENT =
haul_cycle_trans.HAULING_UNIT_BADGE_IDENT

LEFT JOIN EQUIP_STATUS_CODE on EQUIP_STATUS_CODE.STATUS_CODE =
EQUIPMENT_STATUS_TRANS.STATUS_CODE


LEFT JOIN EQUIP_SUB_STATUS_CODE on EQUIP_SUB_STATUS_CODE.SUB_STATUS_CODE =
EQUIPMENT_STATUS_TRANS.SUB_STATUS_CODE

WHERE dump_end_shift_date >= @start AND First_Name <> '') raw_data

PIVOT(SUM(duration) FOR [Status] IN ([Dumping], [Spot at LU],[Wait at Dump],
[Empty], [Empty Stopped], [Hauling Stopped], [Hauling], [Queue at LU],
[Truck Loading], [Hot Change])) DATAFIELD

GROUP BY haul_cycle) SUM_DATA_FOR_EACH_REC
GROUP BY [name], [date], Shovel_ID
) AVG_OPERATOR_STATUS

ORDER BY [NAME],[DATE]

Answer Source

You can use the OVER (PARTITION BY) function.

That way you can use whatever split you would like your average to have, by partitioning the dataset in different ways.

You can read more about OVER Clause here.

Example:

SELECT
    DateField
    ,YourName1
    ,YourId
    ,SUM(Value1)
    ,AVG(Value1) OVER (PARTITION BY DateField) AS DateFieldAverage
FROM
    YourTable
GROUP BY
    DateField
    ,YourName1
    ,YourId

Assuming you are using SQL Server (based on the syntax on top of your question).