Hussain Shehadeh - 2 years ago
R Question

How to extract the user ids for each element

How can I extract the user_id from the retweets collected using this function?

## get only first 8 words from each tweet
x <- lapply(strsplit(dat$text, " "), "[", 1:8)
x <- lapply(x, na.omit)
x <- vapply(x, paste, collapse = " ", character(1))
## get rid of hyperlinks
x <- gsub("http[\\S]{1,}", "", x, perl = TRUE)
## encode for search query (handles the non ascii chars)
x <- sapply(x, URLencode, USE.NAMES = FALSE)
## get up to first 100 retweets for each tweet
data <- lapply(x, search_tweets, verbose = FALSE)


I have a list of 12 elements, each containing a list of user ids. How can I extract only the user ids?

Here is the full code:

library(rtweet)
library(dplyr)
library(plyr)
require(reshape2)

## search for #Newsnight tweets, excluding retweets in the query itself
dor <- search_tweets("#Newsnight -filter:retweets", n = 10000)

## merge tweets data with unique (non duplicated) users data
## exclude retweets
## select status_id, retweet count, followers count, and text columns
dat <- dor %>%
  users_data() %>%
  unique() %>%
  right_join(dor) %>%
  filter(!is_retweet) %>%
  dplyr::select(user_id, screen_name, retweet_count, followers_count, text) %>%
  filter(retweet_count >= 50 & retweet_count < 100 &
           followers_count < 10000 & followers_count > 500)
dat

## get only first 8 words from each tweet
x <- lapply(strsplit(dat$text, " "), "[", 1:8)
x <- lapply(x, na.omit)
x <- vapply(x, paste, collapse = " ", character(1))
## get rid of hyperlinks
x <- gsub("http[\\S]{1,}", "", x, perl = TRUE)
## encode for search query (handles the non ascii chars)
x <- sapply(x, URLencode, USE.NAMES = FALSE)
## get up to first 100 retweets for each tweet
data <- lapply(x, search_tweets, verbose = FALSE)


There are 11 more elements like this one (12 elements in total).

Answer

OK, so you have a list of 12 data frames, each with a column called user_id. If the list is named, the code below will work as is. If it isn't named, take out the df_name = names(data)[x], part.

lapply(seq_along(data), function(x) {
  df <- data[[x]]
  ## take the ids from df (the data frame), not from x (the index)
  data.frame(user_id = df$user_id,
             df_name = names(data)[x],
             df_number = x, stringsAsFactors = FALSE)
}) %>%
  dplyr::bind_rows()

That should give you a new data frame with all of the user ids and the data frame each one came from.
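For anyone who wants to try the idea without calling the Twitter API, here is a minimal base R sketch on made-up data. The list `toy`, its element names, and the ids are all hypothetical stand-ins for `data`. It assumes only that every element is a data frame with a `user_id` column, as described above.

```r
# Toy stand-in for `data`: a list of data frames, each with a user_id column.
# The element names and ids are invented for illustration only.
toy <- list(
  first  = data.frame(user_id = c("101", "102"), stringsAsFactors = FALSE),
  second = data.frame(user_id = c("102", "103", "104"), stringsAsFactors = FALSE)
)

# Option 1: a flat character vector of every distinct user id.
all_ids <- unique(unlist(lapply(toy, function(df) df$user_id),
                         use.names = FALSE))

# Option 2: one row per user id, tagged with the position of the
# list element it came from, then stacked into a single data frame.
ids_by_source <- do.call(rbind, lapply(seq_along(toy), function(i) {
  data.frame(user_id   = toy[[i]]$user_id,
             df_number = i,
             stringsAsFactors = FALSE)
}))

print(all_ids)
print(ids_by_source)
```

Option 1 is enough if you only need the ids themselves. Option 2 keeps track of which tweet's results each id belongs to, at the cost of repeating ids that appear under more than one tweet.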
