FilipW FilipW - 8 months ago 45
R Question

Make all words uppercase in Wordcloud in R

When creating Wordclouds it is most common to make all the words lowercase. However, I want the wordclouds to display the words uppercase. After forcing the words to be uppercase the wordcloud still display lowercase words. Any ideas why?

Reproducable code:


data <- data.frame(text = c("Creativity is the art of being ‘productive’ by using
the available resources in a skillful manner.
Scientifically speaking, creativity is part of
our consciousness and we can be creative –
if we know – ’what goes on in our mind during
the process of creation’.
Let us now look at 6 examples of creativity which blows the mind."))

text <- paste(data$text, collapse = " ")

# I am using toupper() to force the words to become uppercase.
text <- toupper(text)

source <- VectorSource(text)
corpus <- VCorpus(source, list(language = "en"))

# This is my function for cleaning the text
clean_corpus <- function(corpus){
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, stripWhitespace)
corpus <- tm_map(corpus, removeWords, c(stopwords("en")))

clean_corp <- clean_corpus(corpus)
data_tdm <- TermDocumentMatrix(clean_corp)
data_m <- as.matrix(data_tdm), colors = c("#224768", "#ffc000"), max.words = 50)

This produces to following output

enter image description here

Answer Source

It's because behind the scenes TermDocumentMatrix(clean_corp) is doing TermDocumentMatrix(clean_corp, control = list(tolower = TRUE)). If you set it to TermDocumentMatrix(clean_corp, control = list(tolower = FALSE)), then the words stay uppercase. Alternatively, you can also adjust the row names of your matrix afterwards: rownames(data_m) <- toupper(rownames(data_m)).