AliCivil - 1 month ago
R Question

Breaking a paragraph into a vector of sentences in R

I have the following paragraph:

Well, um...such a personal topic. No wonder I am the first to write a review. Suffice to say this stuff does just what they claim and tastes pleasant. And I had, well, major problems in this area and now I don't. 'Nuff said. :-)


In order to apply the calculate_total_presence_sentiment command from the RSentiment package, I would like to break this paragraph into a vector of sentences as follows:

[1] "Well, um...such a personal topic."
[2] "No wonder I am the first to write a review."
[3] "Suffice to say this stuff does just what they claim and tastes pleasant."
[4] "And I had, well, major problems in this area and now I don't."
[5] "'Nuff said."
[6] ":-)"


Would appreciate your help on this.

Answer

qdap has a convenient function for this:

sent_detect_nlp - Detect and split sentences on endmark boundaries using openNLP & NLP utilities. It matches the behavior of the sentDetect function that older versions of the openNLP package provided and that has since been removed.

library(qdap)

txt <- "Well, um...such a personal topic. No wonder I am the first to write a review. Suffice to say this stuff does just what they claim and tastes pleasant. And I had, well, major problems in this area and now I don't. 'Nuff said. :-)"

sent_detect_nlp(txt)
#[1] "Well, um...such a personal topic."
#[2] "No wonder I am the first to write a review."
#[3] "Suffice to say this stuff does just what they claim and tastes pleasant."
#[4] "And I had, well, major problems in this area and now I don't."
#[5] "'Nuff said."
#[6] ":-)"
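
If installing qdap and its openNLP dependencies is not an option, a rough base R fallback is to split on whitespace that follows an endmark. This is only a sketch and not what sent_detect_nlp does internally. The helper name split_sentences is made up for illustration, and the rule is naive: it would also split after abbreviations such as "Dr." or "e.g.", so it may need tuning for other text.

```r
# Hypothetical helper (not part of qdap or RSentiment): naive sentence splitter.
# Splits on runs of whitespace that come directly after ".", "!" or "?".
# The lookbehind keeps the endmark attached to the sentence it ends.
split_sentences <- function(x) {
  unlist(strsplit(x, "(?<=[.!?])\\s+", perl = TRUE))
}

txt <- "Well, um...such a personal topic. No wonder I am the first to write a review. Suffice to say this stuff does just what they claim and tastes pleasant. And I had, well, major problems in this area and now I don't. 'Nuff said. :-)"

sentences <- split_sentences(txt)
print(sentences)
```

For this particular paragraph the result should have the same six elements as the qdap output above, because "um...such" has no space after the ellipsis and so is not split. Either vector can then be passed on to the sentiment function.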