user7394882, 4 years ago
Scala Question

Aggregate function over a given time interval in Spark

I need to aggregate a dataset into 5-minute intervals, taking the average of the values in each interval. The input and expected output are shown in the attached image. The first column is a timestamp column, and I am using Scala. Any help would be highly appreciated.

Answer Source

Generally, you can compute a 5-minute bucket for each row, e.g. by converting the timestamp to a number of seconds, dividing by 300 (the number of seconds in 5 minutes) and flooring the result.
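To make the arithmetic concrete, here is a minimal plain-Scala sketch of the bucketing step. It assumes timestamps are given as epoch seconds; `bucketSeconds` and `bucketStart` are hypothetical names used only for illustration.

```scala
// Width of one bucket in seconds (5 minutes). Hypothetical name.
val bucketSeconds: Long = 5 * 60L

// Floor an epoch-seconds timestamp to the start of its 5-minute bucket.
// Math.floorDiv rounds toward negative infinity, so timestamps before
// the epoch are bucketed consistently as well.
def bucketStart(epochSeconds: Long): Long =
  Math.floorDiv(epochSeconds, bucketSeconds) * bucketSeconds
```

For example, `bucketStart(1500000123L)` gives `1500000000L`, and every timestamp from `1500000000` through `1500000299` lands in that same bucket.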

Then you simply do:

df.groupBy("bucket").avg("value")
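Putting both steps together, a sketch of the whole thing could look like the following. It assumes the DataFrame has a timestamp column named `ts` and a numeric column named `value`. Those names, the sample rows, and the local `SparkSession` are all assumptions for illustration, since the original input is only available as an image.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, col, floor, unix_timestamp}

object FiveMinuteAverage {
  // Adds a "bucket" column (start of the 5-minute interval) and averages
  // "value" per bucket. Column names "ts" and "value" are assumed.
  def fiveMinuteAverages(df: DataFrame): DataFrame = {
    val bucketSeconds = 300 // 5 minutes
    df.withColumn(
        "bucket",
        // seconds since epoch -> floor to a multiple of 300 -> back to timestamp
        (floor(unix_timestamp(col("ts")) / bucketSeconds) * bucketSeconds).cast("timestamp"))
      .groupBy("bucket")
      .agg(avg("value").as("avg_value"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // local mode, for trying the sketch out
      .appName("five-minute-average")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows, not the asker's data.
    val df = Seq(
      ("2017-01-01 00:01:00", 10.0),
      ("2017-01-01 00:04:00", 20.0),
      ("2017-01-01 00:06:00", 30.0)
    ).toDF("ts", "value").withColumn("ts", col("ts").cast("timestamp"))

    fiveMinuteAverages(df).orderBy("bucket").show(false)
    spark.stop()
  }
}
```

With the sample rows, the first two fall into the same bucket and the third into the next one. Spark also ships a built-in `window` function in `org.apache.spark.sql.functions` (e.g. `groupBy(window(col("ts"), "5 minutes"))`), which may be a simpler alternative to computing the bucket by hand.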