Sander Ehmsen - 11 days ago
R Question

Twitter: Get followers from multiple users at once

I am working on a project where I need to find the reach of some social events. I want to know how many people were exposed to comments about a festival in Denmark called Tinderbox.
What I do is fetch the statuses on Twitter that contain the word "tinderbox" and are written in Danish. Then I want to extract the number of followers for the screen names behind these statuses. The first part of my code is:

library("twitteR")
setup_twitter_oauth(consumer_key,consumer_secret,access_token,access_secret)
#get data
TB<-searchTwitter("tinderbox", lang="da", n=10000)
#put into a dataframe
df <- do.call("rbind", lapply(TB, as.data.frame))


My thought is to use the same kind of output as in the example below, that is, to
get followersCount directly from the Twitter data.
The example comes from another Stack Overflow question ("fetching large number of followers and followees in R"), but I don't know how to adapt it to my purpose.

library(twitteR)
user <- getUser("krestenb")
followers <- user$getFollowers()
b <- twListToDF(followers)
f_count <- as.data.frame(b$followersCount)
u_id <- as.data.frame(b$id)
u_sname <- as.data.frame(b$screenName)
u_name <- as.data.frame(b$name)
final_df <- cbind(u_id,u_name,u_sname,f_count)
sort_fc <- final_df[order(-b$followersCount),]
colnames(sort_fc) <- c('id','name','s_name','fol_count')


My problem is that I cannot simply use a vector of user names in followers <- user$getFollowers(), for example the list of screen names extracted from df$screenName.

So my thought was that maybe I need to loop over all the different screen names, but I do not know how to do this.

I hope that I have painted a picture of what I want to get and how I think I can get there.

Help is much appreciated, as the festival is due this weekend.

Answer

Here is some sample code, based on the code in your question, which aggregates the Twitter results for a set of users:

# create a data frame with 4 columns and no rows initially
df_result <- data.frame(t(rep(NA, 4)))
names(df_result) <- c('id', 'name', 's_name', 'fol_count')
df_result <- df_result[0, ]

# you can replace this vector with whatever set of Twitter users you want
users <- c("krestenb", "tjb25587")                    # tjb25587 (me) has no followers

# iterate over the vector of users and aggregate each user's results
sapply(users, function(x) {
                  user <- getUser(x)
                  followers <- user$getFollowers()
                  if (length(followers) > 0) {        # ignore users with no followers
                      b <- twListToDF(followers)
                      f_count <- as.data.frame(b$followersCount)
                      u_id <- as.data.frame(b$id)
                      u_sname <- as.data.frame(b$screenName)
                      u_name <- as.data.frame(b$name)
                      final_df <- cbind(u_id,u_name,u_sname,f_count)
                      sort_fc <- final_df[order(-b$followersCount),]
                      colnames(sort_fc) <- c('id','name','s_name','fol_count')
                      df_result <<- rbind(df_result, sort_fc)
                  }
              })

Important points

I used the global assignment operator <<- when doing the rbind on the df_result data frame so that the result "sticks" outside the function passed to sapply. As I mentioned in my original answer, you can use the sapply function to iterate over a vector of users; each call appends that user's followers to df_result.
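
If you would rather avoid the global assignment, one variant is to let each iteration return its own data frame and bind them all together at the end. The following is only a sketch of that pattern: fetch_followers_df is a hypothetical stand-in for the getUser / getFollowers / twListToDF calls above, and it returns made-up rows so that the code can be run without Twitter credentials.

```r
# Hypothetical stand-in for the Twitter calls: returns a data frame of
# followers for one user, or NULL when the user has no followers.
# The rows are made-up illustration data, not real Twitter results.
fetch_followers_df <- function(screen_name) {
  fake <- list(
    user_a = data.frame(id = c("1", "2"),
                        name = c("Ann", "Bo"),
                        s_name = c("ann", "bo"),
                        fol_count = c(10, 250),
                        stringsAsFactors = FALSE),
    user_b = NULL  # a user without followers
  )
  fake[[screen_name]]
}

users <- c("user_a", "user_b")

# lapply returns one element per user; no global state is modified
per_user <- lapply(users, fetch_followers_df)

# drop users without followers, then stack the remaining data frames
per_user <- Filter(Negate(is.null), per_user)
df_result <- do.call(rbind, per_user)

# sort by follower count, largest first
df_result <- df_result[order(-df_result$fol_count), ]
```

In real use you would replace the body of fetch_followers_df with the Twitter calls from the code above; the lapply / do.call(rbind, ...) part stays the same.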

I tested with a vector containing Twitter users both with and without followers, and it worked.
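
Coming back to the original goal (the reach of the tweets): what you need there is the follower count of each tweet author, and an author who tweeted several times should only be counted once. Below is a rough sketch of that last step. It assumes you already have one row per tweet with the author's screenName and a followersCount column; the tweets data frame is made-up illustration data. As far as I know, twitteR's lookupUsers() accepts a vector of screen names and could supply the follower counts, but please check its documentation. Note that followers shared between authors are still counted more than once, so the sum is an upper bound rather than an exact reach.

```r
# Made-up illustration data: one row per tweet, with the author's
# screen name and that author's follower count
tweets <- data.frame(
  screenName = c("ann", "bo", "ann", "cy"),
  followersCount = c(10, 250, 10, 40),
  stringsAsFactors = FALSE
)

# keep one row per author so repeated tweeters are not double counted
authors <- tweets[!duplicated(tweets$screenName), ]

# upper bound on reach: sum of follower counts over distinct authors
reach <- sum(authors$followersCount)
reach  # 300
```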