roody roody - 1 year ago 72
R Question

Aggregating by unique identifier and concatenating related values into a string

I have a need that I imagine could be satisfied by an existing R function, but I can't quite figure it out.

I have a list of names (brand) and an accompanying ID number (id). This data is in long form, so names can have multiple IDs. I'd like to de-duplicate by the name (brand) and concatenate the multiple possible ids into a string separated by a comma.

For example,

brand id
RadioShack 2308
Rag & Bone 4466
Ragu 1830
Ragu 4518
Ralph Lauren 1638
Ralph Lauren 2719
Ralph Lauren 2720
Ralph Lauren 2721
Ralph Lauren 2722

Would become...

RadioShack 2308
Rag & Bone 4466
Ragu 1830,4518
Ralph Lauren 1638,2719,2720,2721,2722

How would I accomplish this?

Answer Source

Let's call your data.frame DF

> aggregate(id ~ brand, data = DF, c)
         brand                           id
1   RadioShack                         2308
2   Rag & Bone                         4466
3         Ragu                   1830, 4518
4 Ralph Lauren 1638, 2719, 2720, 2721, 2722

Another alternative using aggregate is:

result <- aggregate(id ~ brand, data = DF, paste, collapse = ",")

This produces the same result, but id is now a character vector rather than a list (thanks to @Frank's comment). To see the class of each column, try:

> sapply(result, class)
      brand          id 
   "factor" "character"
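
To see the list-versus-character difference for yourself, here is a minimal sketch. DF is rebuilt from the sample rows in the question, and the names as_list and as_string are only illustrative. Note that brand will show up as "character" rather than "factor" if it is created with stringsAsFactors = FALSE, as assumed here.

```r
# Sample data rebuilt from the question (assumed column types)
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id = c(2308L, 4466L, 1830L, 4518L, 1638L, 2719L, 2720L, 2721L, 2722L),
  stringsAsFactors = FALSE
)

# Using c keeps each group's ids as a vector, so id becomes a list column
as_list <- aggregate(id ~ brand, data = DF, c)

# Using paste with collapse turns each group into a single string
as_string <- aggregate(id ~ brand, data = DF, paste, collapse = ",")

is.list(as_list$id)         # TRUE
is.character(as_string$id)  # TRUE
```

A character column is usually easier to write out with write.csv or to merge with other data than a list column.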

As mentioned by @DavidArenburg in the comments, another alternative is using the toString function:

aggregate(id ~ brand, data = DF, toString)
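
One detail to be aware of: toString joins values with a comma followed by a space, while paste with collapse = "," uses a bare comma, which matches the output asked for in the question. A small sketch of the difference, again assuming DF is built from the question's sample rows (by_toString and by_paste are illustrative names):

```r
# Sample data rebuilt from the question (assumed column types)
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id = c(2308L, 4466L, 1830L, 4518L, 1638L, 2719L, 2720L, 2721L, 2722L),
  stringsAsFactors = FALSE
)

by_toString <- aggregate(id ~ brand, data = DF, toString)
by_paste    <- aggregate(id ~ brand, data = DF, paste, collapse = ",")

# Compare the separator for one brand
by_toString$id[by_toString$brand == "Ragu"]  # "1830, 4518"
by_paste$id[by_paste$brand == "Ragu"]        # "1830,4518"
```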