Dfinzgar - 1 year ago 67

R Question

I have a data.frame with a couple of thousand rows. I am applying several lines of code to subsets of this data.

I have 4 subsets in the column `mergeorder$phylum`:

`[1] "ascomycota" "basidiomycota" "unidentified" "chytridiomycota"`

And on every subset I have to apply this set of functions separately:

```
ascomycota<-mergeorder[mergeorder$phylum %in% c("ascomycota"), ]
group_ascomycota <- aggregate(ascomycota[,2:62], by=list(ascomycota$order), FUN=sum)
row.names(group_ascomycota)<-group_ascomycota[,1]
group_ascomycota$sum <-apply(group_ascomycota[,-1],1,sum)
dat5 <-sweep(group_ascomycota[,2:62], 2, colSums(group_ascomycota[2:62]), '/')
dat5$sum <-apply(group_ascomycota[,-1],1,sum)
reorder_dat5 <- dat5[order(dat5$sum, decreasing=T),]
reorder_dat5$OTU_ID <- row.names(reorder_dat5)
FINITO<-reorder_dat5[1:15,]
write.table(FINITO, file="output_ITS1/ITS1_ascomycota_order_top15.csv", col.names=TRUE,row.names=FALSE, sep=",", quote=FALSE)
```

This code works. However, I would like to apply this code without manually replacing every "ascomycota" with "basidiomycota", "unidentified", "chytridiomycota".

What function should I use? How should I use it? I've been struggling with `sapply()` and `repeat()`.

The end result should execute the whole code and export separate CSV files.

Many thanks for your answer

Answer Source

It's usually possible to write code that handles all subsets in one go. However, what you are doing is pretty complicated. The best thing to do might be to gather all that into a function and then just run the function for each subset. Something like this:

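```
# Run the per-phylum pipeline from the question for one phylum and write its CSV.
# Relies on `mergeorder` existing in the calling environment, as in the question.
subset_transform <- function(phylum_name) {
  # Rows belonging to this phylum
  phylum_rows <- mergeorder[mergeorder$phylum %in% phylum_name, ]
  # Sum the count columns (2:62) within each order
  grouped <- aggregate(phylum_rows[, 2:62], by = list(phylum_rows$order), FUN = sum)
  row.names(grouped) <- grouped[, 1]
  grouped$sum <- apply(grouped[, -1], 1, sum)
  # Divide each count column by its column total
  dat5 <- sweep(grouped[, 2:62], 2, colSums(grouped[2:62]), '/')
  # Same expression as in the question; grouped[, -1] now also contains `sum`
  dat5$sum <- apply(grouped[, -1], 1, sum)
  # Sort by the sum column, largest first, and keep the first 15 rows
  reorder_dat5 <- dat5[order(dat5$sum, decreasing = TRUE), ]
  reorder_dat5$OTU_ID <- row.names(reorder_dat5)
  FINITO <- reorder_dat5[1:15, ]
  # paste0() joins without a separator; paste() would put spaces into the file name
  write.table(FINITO, file = paste0("output_ITS1/ITS1_", phylum_name, "_order_top15.csv"),
              col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE)
}
subset_transform("ascomycota")
subset_transform("basidiomycota")
subset_transform("unidentified")
subset_transform("chytridiomycota")
```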
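
Instead of writing one call per phylum, the phylum names can be taken from the data and the function applied to each of them. Below is a minimal sketch of that idea, not the exact pipeline above: the toy `mergeorder`, the simplified `summarise_phylum()` function, its `out_dir` argument and the output file names are all made up for illustration, and the files go to a temporary directory.

```r
# Toy stand-in for `mergeorder`: a phylum column, an order column and
# two count columns (the real data has 61 count columns).
mergeorder <- data.frame(
  phylum = c("ascomycota", "ascomycota", "basidiomycota", "basidiomycota"),
  order  = c("o1", "o2", "o3", "o3"),
  s1     = c(1, 3, 2, 2),
  s2     = c(4, 0, 1, 5),
  stringsAsFactors = FALSE
)

# Hypothetical, simplified per-phylum step: sum the counts by order and
# write one CSV. `out_dir` is an assumed parameter for this sketch.
summarise_phylum <- function(phylum_name, data, out_dir) {
  rows <- data[data$phylum %in% phylum_name, ]
  grouped <- aggregate(rows[, c("s1", "s2")],
                       by = list(order = rows$order), FUN = sum)
  out_file <- file.path(out_dir, paste0("ITS1_", phylum_name, "_order.csv"))
  write.table(grouped, file = out_file, col.names = TRUE,
              row.names = FALSE, sep = ",", quote = FALSE)
  out_file  # return the path so the caller can see what was written
}

out_dir <- tempdir()
phyla <- unique(mergeorder$phylum)  # one entry per subset, taken from the data

# Call the function once per phylum; vapply() collects the returned paths.
written <- vapply(phyla, summarise_phylum, character(1),
                  data = mergeorder, out_dir = out_dir)
```

With the function from the answer, the same pattern would be `for (p in unique(mergeorder$phylum)) subset_transform(p)`. A plain `for` loop is enough here because the function is called for its side effect (writing a file) rather than for a return value.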