david david - 1 month ago 9
R Question

Add incremental counter to string in rows of a dataframe

I would like to add a counter to all "unknown" strings in the dataframe.
Here is the dataframe

structure(list(Phylum = structure(c(1L, 2L, 2L, 2L, 2L, 2L), .Label = c("Acidobacteria",
"Actinobacteria"), class = "factor"), Class = structure(c(1L,
2L, 2L, 2L, 2L, 2L), .Label = c("Acidobacteria bacterium RIFCSPLOWO2_02_FULL_64_15",
"Actinobacteria"), class = "factor"), Order = structure(c(3L,
1L, 2L, 2L, 2L, 2L), .Label = c("Actinomycetales", "Corynebacteriales",
"unknown"), class = "factor"), Family = structure(c(3L, 1L, 2L,
2L, 2L, 2L), .Label = c("Actinomycetaceae", "Corynebacteriaceae",
"unknown"), class = "factor"), Genus = structure(c(3L, 2L, 1L,
1L, 1L, 1L), .Label = c("Corynebacterium", "Trueperella", "unknown"
), class = "factor"), Species = structure(c(1L, 1L, 1L, 1L, 1L,
1L), .Label = "unknown", class = "factor"), Genecoverage = c(5.58715596330275,
761.405303030303, 42.5656565656566, 91.910447761194, 63.1912223121422,
87.9927983539095)), .Names = c("Phylum", "Class", "Order", "Family",
"Genus", "Species", "Genecoverage"), row.names = c(NA, 6L), class = "data.frame")


For the first row, I would like to replace all "unknown" occurrences with "unknown_1".

For the second row, replace all "unknown" occurrences with "unknown_2", and so on.

Can you help?

thanks

Answer Source

We can select the non-numeric columns and, within each of them, replace every 'unknown' with 'unknown' pasted onto a running count (1, 2, ...) of the 'unknown' entries in that column

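# Flag the non-numeric (factor) columns
i1 <- !sapply(df1, is.numeric)
# In each flagged column, append a running count to every "unknown"
df1[i1] <- lapply(df1[i1], function(x) factor(replace(as.character(x),
     x == "unknown", paste0(x[x == "unknown"], seq_len(sum(x == "unknown"))))))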
df1
#         Phylum                                             Class             Order             Family           Genus  Species Genecoverage
#1  Acidobacteria Acidobacteria bacterium RIFCSPLOWO2_02_FULL_64_15          unknown1           unknown1        unknown1 unknown1     5.587156
#2 Actinobacteria                                    Actinobacteria   Actinomycetales   Actinomycetaceae     Trueperella unknown2   761.405303
#3 Actinobacteria                                    Actinobacteria Corynebacteriales Corynebacteriaceae Corynebacterium unknown3    42.565657
#4 Actinobacteria                                    Actinobacteria Corynebacteriales Corynebacteriaceae Corynebacterium unknown4    91.910448
#5 Actinobacteria                                    Actinobacteria Corynebacteriales Corynebacteriaceae Corynebacterium unknown5    63.191222
#6 Actinobacteria                                    Actinobacteria Corynebacteriales Corynebacteriaceae Corynebacterium unknown6    87.992798
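
Note that the counter above runs within each column, so it matches the row number in the sample data only because the 'unknown' entries there start at row 1 and are contiguous. If the suffix should be the row number itself, with the underscore asked for in the question, something along these lines might work. This is only a sketch: `demo` is made-up toy data (not the data from the question) and `add_row_suffix` is a hypothetical helper name.

```r
# Hypothetical toy data: "unknown" sits in different rows of each column
demo <- data.frame(
  Genus        = c("Trueperella", "unknown", "Corynebacterium", "unknown"),
  Species      = c("unknown", "unknown", "unknown", "unknown"),
  Genecoverage = c(5.6, 761.4, 42.6, 91.9),
  stringsAsFactors = TRUE
)

# Hypothetical helper: suffix each matching cell with its row number
add_row_suffix <- function(df, label = "unknown", sep = "_") {
  is_text <- !vapply(df, is.numeric, logical(1))  # non-numeric columns only
  rows <- seq_len(nrow(df))                       # row index 1..n
  df[is_text] <- lapply(df[is_text], function(x) {
    x <- as.character(x)
    hit <- !is.na(x) & x == label                 # cells to rename
    x[hit] <- paste0(label, sep, rows[hit])       # e.g. "unknown_2" in row 2
    factor(x)                                     # back to factor, as in the answer
  })
  df
}

res <- add_row_suffix(demo)
res
```

Here the 'unknown' in row 2 of `Genus` becomes `unknown_2` rather than `unknown_1`, because the suffix comes from the row index and not from a per-column count.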