Apache11 Apache11 - 5 months ago 7
MySQL Question

section wise sum of data in sql

hello as the title suggest i need a code to sum section wise data --->

can.id status qid marks
001 section 1 question 1 112 3
001 section 1 question 2 117 3
001 section 1 question 3 116 3
001 section 2 question 1 115 3
001 section 2 question 2 114 -1
001 section 2 question 3 111 3
001 section 3 question 1 112 -1
001 section 3 question 2 116 3
002 section 1 question 1 114 3
002 section 1 question 2 111 3
002 section 2 question 2 111 -1
002 section 3 question 1 111 -1


i want to display sum of marks for each can.id for every section. help is appreciated

my code -->

select sum(marks),can.id
from samp_dat
group by can.id, status;

Answer

In R, we can use dplyr. We separate (from tidyr), the 'status' into two new columns ('section', 'question') based on the space delimiter, then grouped by 'can.id' and 'section', get the sum of 'marks'.

library(dplyr)
library(tidyr)
df1 %>% 
  extract(status, into = "section", "(.*\\d+)\\s+[[:alpha:]].*") %>%
  group_by(can.id, section) %>% 
  summarise(SumMarks = sum(marks))
#  can.id   section SumMarks
#   <int>     <chr>    <int>
#1      1 section 1        9
#2      1 section 2        5
#3      1 section 3        2
#4      2 section 1        6
#5      2 section 2       -1
#6      2 section 3       -1

Or using data.table

library(data.table)
setDT(df1)[,.(SumMarks = sum(marks)), .(can.id, 
                 section = sub("\\s+[[:alpha:]].*", "", status))]

data

 df1 <- structure(list(can.id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
 2L, 2L, 2L), status = c("section 1 question 1", "section 1 question 2", 
 "section 1 question 3", "section 2 question 1", "section 2 question 2", 
 "section 2 question 3", "section 3 question 1", "section 3 question 2", 
 "section 1 question 1", "section 1 question 2", "section 2 question 2", 
 "section 3 question 1"), qid = c(112L, 117L, 116L, 115L, 114L, 
  111L, 112L, 116L, 114L, 111L, 111L, 111L), marks = c(3L, 3L, 
 3L, 3L, -1L, 3L, -1L, 3L, 3L, 3L, -1L, -1L)), .Names = c("can.id", 
 "status", "qid", "marks"), class = "data.frame",
  row.names = c(NA, -12L))
Comments