user3067851 - 1 year ago 147
R Question

# Matching two lists of unequal length

I am trying to match the values in two lists, but only where the variable names are the same in both lists. I would like the result to be a list the length of the longer list, filled with the count of total matches.

``````
jac <- structure(list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5),
                 .Names = c("s1", "s2", "s3"))

larger <- structure(list(s1 = structure(c(1L, 1L, 1L), .Label = "a", class = "factor"),
                         s2 = structure(c(2L, 1L, 3L), .Label = c("b", "c", "d"), class = "factor"),
                         s3 = c(1, 2, 7)),
                    .Names = c("s1", "s2", "s3"), row.names = c(NA, -3L), class = "data.frame")
``````

I am using
`mapply(FUN = pmatch, jac, larger)`
which gives me the correct matches, but not in the format I would like, shown below:

``````
s1  s2  s3  s1result  s2result  s3result
a   c   1   1         2         NA
a   b   2   1         1         NA
a   d   7   1         3         NA
``````

However, I don't think `pmatch` will ensure that the names match in every situation, so I wrote a function, which I am still having issues with:

``````
prodMatch <- function(jac,larger){
  for(i in 1:nrow(larger)){
    if(names(jac)[i] %in% names(larger[i])){
      r[i] <- jac %in% larger[i]
      r
    }
  }
}
``````

Can anyone help out?

Another dataset, in which the length of one is not a multiple of the other:

``````
larger2 <-
  structure(list(s1 = structure(c(1L, 1L, 1L), class = "factor", .Label = "a"),
                 s2 = structure(c(1L, 1L, 1L), class = "factor", .Label = "c"),
                 s3 = c(1, 2, 7), s4 = c(8, 9, 10)),
            .Names = c("s1", "s2", "s3", "s4"), row.names = c(NA, -3L), class = "data.frame")
``````

`mapply` returns a list of matching indices, which you can convert to a data frame with `as.data.frame`:

``````
as.data.frame(mapply(match, jac, larger))
#   s1 s2 s3
# 1  1  2 NA
# 2  1  1 NA
# 3  1  3 NA
``````
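To see why the conversion works, it may help to look at the intermediate result. The sketch below rebuilds the question's data with `list()` and `data.frame()` (an equivalent construction assumed here for brevity, not the original `structure()` calls) and inspects what `mapply` hands back before `as.data.frame` recycles the shorter elements:

```r
# Same data as in the question, built with list() / data.frame() for brevity
jac <- list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5)
larger <- data.frame(
  s1 = factor(c("a", "a", "a")),
  s2 = factor(c("c", "b", "d")),
  s3 = c(1, 2, 7)
)

# match() is called once per pair of same-position elements
idx <- mapply(match, jac, larger)

# The per-column results have different lengths, so mapply cannot
# simplify them to a matrix and returns a named list instead
is.list(idx)
lengths(idx)

# as.data.frame() recycles the length-1 elements to the longest length
res <- as.data.frame(idx)
res
```

Note that `mapply` pairs elements by position, not by name, so this only lines up when both inputs list their variables in the same order.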

Using `cbind` to combine the result with `larger` gives what you expected:

``````
cbind(larger,
      setNames(as.data.frame(mapply(match, jac, larger)),
               paste(names(jac), "result", sep = "")))

#  s1 s2 s3 s1result s2result s3result
#1  a  c  1        1        2       NA
#2  a  b  2        1        1       NA
#3  a  d  7        1        3       NA
``````

Update: To handle cases where the names of the two lists don't match, we can loop through `larger` and its names simultaneously and extract the matching elements from `jac` as follows:

``````
as.data.frame(
  mapply(function(col, name) {
    m <- match(jac[[name]], col)
    if (length(m) == 0) NA else m  # if the name doesn't exist in jac, return NA as well
  }, larger, names(larger)))

#  s1 s2 s3
#1  1  2 NA
#2  1  1 NA
#3  1  3 NA
``````
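As a rough sketch of how the name-based approach might be applied to the second dataset from the question, the lookup can be wrapped in a small helper. The function name `match_by_name` and the `list()` / `data.frame()` construction of the data are assumptions made for this illustration and are not part of the original post:

```r
# Data from the question, built with list() / data.frame() for brevity
jac <- list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5)
larger2 <- data.frame(
  s1 = factor(c("a", "a", "a")),
  s2 = factor(c("c", "c", "c")),
  s3 = c(1, 2, 7),
  s4 = c(8, 9, 10)
)

# Hypothetical helper: for every column of `df`, look up the element of
# `lst` with the same name and match its values against that column
match_by_name <- function(lst, df) {
  res <- lapply(names(df), function(name) {
    m <- match(lst[[name]], df[[name]])
    # lst[[name]] is NULL when the name is missing, giving a zero-length match
    if (length(m) == 0) NA_integer_ else m
  })
  names(res) <- paste0(names(df), "result")
  as.data.frame(res)  # length-1 results are recycled to the longest length
}

out <- cbind(larger2, match_by_name(jac, larger2))
out
```

Because the helper iterates over the names of the data frame rather than over positions, the extra column `s4` simply yields `NA` instead of triggering the length mismatch that `mapply(match, jac, larger2)` runs into.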