Oshrat Oshrat - 3 months ago 10
R Question

Text analysis within a data frame in R

I am working with Google Store metadata stored as a data frame. For each app, the requested permissions are held in a single cell as one long text, for example:


READ SENSITIVE LOG DATA|RETRIEVE RUNNING APPS|FIND ACCOUNTS ON THE DEVICE|READ YOUR OWN CONTACT CARD|READ YOUR CONTACTS|


I want to split the text at each "|" character into separate cells (columns), so I can analyze the existing permissions. I have not analyzed text with R before. I tried using string functions; however, when I look at the contents of the cell, it is not recognized as a string.

Any suggestions, directions? Thanks!

Pj_ Pj_
Answer

You can do something like this, using an example string:

strin1 <- "READ SENSITIVE LOG DATA|RETRIEVE RUNNING APPS|FIND ACCOUNTS ON THE DEVICE|READ YOUR OWN CONTACT CARD|READ YOUR CONTACTS|"

# colClasses takes a quoted type name; the trailing "|" adds an empty last column
read.table(text = strin1, sep = "|", colClasses = "character")

This does the trick.
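
If the cell "is not recognized as a string", one likely cause (an assumption, since the original data is not shown) is that the column was read in as a factor. A minimal base R sketch of converting it and splitting a whole column follows; the data frame `apps` and its column names `app` and `permissions` are hypothetical:

```r
# Hypothetical data; the names `apps`, `app` and `permissions` are assumptions.
apps <- data.frame(
  app = c("app_a", "app_b"),
  permissions = c(
    "READ SENSITIVE LOG DATA|RETRIEVE RUNNING APPS|READ YOUR CONTACTS|",
    "READ YOUR CONTACTS|FIND ACCOUNTS ON THE DEVICE|"
  ),
  stringsAsFactors = TRUE  # mimic a column that is not stored as character
)

# Convert the factor to character, then split on the literal "|".
# fixed = TRUE is needed because "|" is a regex metacharacter.
perm_list <- strsplit(as.character(apps$permissions), "|", fixed = TRUE)
names(perm_list) <- as.character(apps$app)

# One row per app, one logical column per distinct permission.
all_perms <- sort(unique(unlist(perm_list)))
perm_matrix <- t(sapply(perm_list, function(p) all_perms %in% p))
colnames(perm_matrix) <- all_perms
```

Each row of `perm_matrix` then marks which permissions an app requests, so `colSums(perm_matrix)` gives the number of apps per permission. Base `strsplit()` drops the empty piece after the trailing "|", so no extra cleanup is needed here.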

A better solution is to use the tidyr package, as described in this answer: Splitting a dataframe string column into multiple different columns
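
The linked answer is not reproduced here, so the following is only a sketch of one tidyr approach, not necessarily the one that answer uses. It assumes the tidyr package is installed and uses `separate_rows()` to get one row per permission, which suits apps with differing numbers of permissions; the data frame `apps` and its column names are hypothetical:

```r
library(tidyr)

# Hypothetical data; the names `apps`, `app` and `permissions` are assumptions.
apps <- data.frame(
  app = c("app_a", "app_b"),
  permissions = c("READ YOUR CONTACTS|RETRIEVE RUNNING APPS|",
                  "READ YOUR CONTACTS|"),
  stringsAsFactors = FALSE
)

# One row per (app, permission) pair; sep is a regex, so "|" is escaped.
long <- separate_rows(apps, permissions, sep = "\\|")

# The trailing "|" can leave an empty string behind; drop it if present.
long <- long[long$permissions != "", ]

# Count how many apps request each permission.
perm_counts <- table(long$permissions)
```

The long format keeps the app identifier next to each permission, which makes counting and filtering straightforward.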