Doug Fir - 2 months ago
R Question

Enlarge a Corpus using rep or replicate or similar

I have a small corpus, e.g.:

library(tm)

myvec <- c("n417", "disturbance", "grand theft auto", "assault", "burglary",
"vandalism", "atmt to locate", "drug arrest", "traffic stop",
"larceny", "graffiti complaint / reporting")

corpus <- VCorpus(VectorSource(myvec))


If I wanted to make corpus 10 times bigger, how would I do that so that the resulting variable is a VCorpus and not a list?

Tried:

corpus <- replicate(10, corpus) # returns a list
corpus <- VCorpus(replicate(10, corpus)) # Error: inherits(x, "Source") is not TRUE
corpus <- c(corpus, corpus, corpus, corpus, corpus, corpus, corpus) # works, returns a corpus 7 times bigger, but involves a lot of typing


If I have a small corpus and I want to make it ten times larger for example purposes, how could I do that?

Answer

We can put the corpus in a list, repeat that list with rep, and then combine the copies by calling c on all of them via do.call:

library(tm)
do.call(c, rep(list(corpus), 7))
# <<VCorpus>>
# Metadata:  corpus specific: 0, document level (indexed): 0
# Content:  documents: 77

The same works with replicate, where simplify = FALSE asks for a plain list of corpora:

do.call(c, replicate(7, corpus, simplify = FALSE))
# <<VCorpus>>
# Metadata:  corpus specific: 0, document level (indexed): 0
# Content:  documents: 77

Here simplify = FALSE can be omitted, since replicate already returns a list for a corpus (as noted in the question):

do.call(c, replicate(7, corpus))
# <<VCorpus>>
# Metadata:  corpus specific: 0, document level (indexed): 0
# Content:  documents: 77
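
If this is needed more than once, the do.call/rep approach above could be wrapped in a small helper. This is only a sketch: rep_corpus is a made-up name, not a function from tm, and it assumes that c() on VCorpus objects returns a VCorpus, as the output above suggests.

```r
library(tm)

# Hypothetical helper (not part of tm): repeat a VCorpus `times` times.
rep_corpus <- function(corpus, times) {
  stopifnot(inherits(corpus, "VCorpus"), times >= 1)
  # rep() on a one-element list gives a list holding `times` copies;
  # do.call() then passes all of them to c() in a single call.
  do.call(c, rep(list(corpus), times))
}

myvec <- c("n417", "disturbance", "grand theft auto")
small <- VCorpus(VectorSource(myvec))
big <- rep_corpus(small, 10)

# Check that the result is still a corpus and has grown as expected
inherits(big, "VCorpus")
length(big) == 10 * length(small)
```

Both checks should come back TRUE if c() behaves as in the answer above. The stopifnot at the top just guards against passing in the plain list that replicate produced in the question.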