Matthew Matthew - 7 months ago 9
SQL Question

Please help optimize my sql server code

I have a table with the columns: Id, time, value.

First step: Given input parameters as signal id, start time and end time, I want to first extract rows with the the signal id and time is between start time and end time.

Second: Assume I have selected 100 rows in the first step. Given another input parameter which is max_num, I want to further select max_num samples out of 100 rows but in a uniform manner. For example, if max_num is set to 10, then I will select 1, 11, 21, .. 91 rows out of 100 rows.

I am not sure if the stored procedure below is optimal, if you find any inefficiencies of the code, please point that out to me and give some suggestion.

Please let me know if I didn't make the question clear.

create procedure data_selection
@sig_id bigint,
@start_time datetime2,
@end_time datetime2,
@max_num float
AS

BEGIN
declare @tot float
declare @step int
declare @selected table (id int primary key identity not null, Date datetime2, Value real)

// first step
insert into @selected (Date, Value) select Date, Value from Table
where Id = @sig_id
and Date > = @start_time and Date < = @end_time
order by Date

// second step
select @tot = count(1) from @selected
set @step = ceiling(@tot / @max_num)
select * from @selected
where id % @step = 1

END

Answer

EDITED to calculate step on the fly. I had first thought this was an argument.

;with data as (
    select row_number() over (order by [Date]) as rn, *
    from Table
    where Id = @sig_id and Date between @start_time and @end_time
), calc as (
    select cast(ceiling(max(rn) / @max_num) as int) as step from data
)
select * from data cross apply calc as c
where (rn - 1) % step = 0 --and rn <= (@max_num - 1) * step + 1

Or I guess you can just order/filter by your identity value as you already had it:

;with calc as (select cast(ceiling(max(rn) / @max_num) as int) as step from @selected)
select * from @selected cross apply calc as c
where (id - 1) % step = 0 --and id <= (@max_num - 1) * step + 1

I think that because you're rounding step up with ceiling you'll easily find scenarios where you get fewer rows than @max_num. You might want to round down instead: case when floor(max(rn) / @max_num) = 0 then 1 else floor(max(rn) / @max_num) end as step?