Apache11 Apache11 - 5 months ago 14
MySQL Question

section wise sum of data

hello as the title suggest i need to sum section wise data --->

can.id status qid marks
001 section 1 question 1 112 3
001 section 1 question 2 117 3
001 section 1 question 3 116 3
001 section 2 question 1 115 3
001 section 2 question 2 114 -1
001 section 2 question 3 111 3
001 section 3 question 1 112 -1
001 section 3 question 2 116 3
002 section 1 question 1 114 3
002 section 1 question 2 111 3
002 section 2 question 2 111 -1
002 section 3 question 1 111 -1


i want to display sum of marks for each can.id for every section, help is appreciated....

Answer

In R, we can use dplyr. We extract (from tidyr), substring from 'status' to create 'section', then grouped by 'can.id' and 'section', get the sum of 'marks'.

library(dplyr)
library(tidyr)
df1 %>% 
  extract(status, into = "section", "(.*\\d+)\\s+[[:alpha:]].*") %>%
  group_by(can.id, section) %>% 
  summarise(SumMarks = sum(marks))
#  can.id   section SumMarks
#   <int>     <chr>    <int>
#1      1 section 1        9
#2      1 section 2        5
#3      1 section 3        2
#4      2 section 1        6
#5      2 section 2       -1
#6      2 section 3       -1

Or using data.table

library(data.table)
setDT(df1)[,.(SumMarks = sum(marks)), .(can.id, 
                 section = sub("\\s+[[:alpha:]].*", "", status))]

data

 df1 <- structure(list(can.id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
 2L, 2L, 2L), status = c("section 1 question 1", "section 1 question 2", 
 "section 1 question 3", "section 2 question 1", "section 2 question 2", 
 "section 2 question 3", "section 3 question 1", "section 3 question 2", 
 "section 1 question 1", "section 1 question 2", "section 2 question 2", 
 "section 3 question 1"), qid = c(112L, 117L, 116L, 115L, 114L, 
  111L, 112L, 116L, 114L, 111L, 111L, 111L), marks = c(3L, 3L, 
 3L, 3L, -1L, 3L, -1L, 3L, 3L, 3L, -1L, -1L)), .Names = c("can.id", 
 "status", "qid", "marks"), class = "data.frame",
  row.names = c(NA, -12L))