I have a text document that contains a million words. I need to find the leading and trailing words of a given word, that is, the words that come immediately before and after it, using R.
For example, suppose I want to find the words that come immediately before and after the word "error". It could be done along the following lines, starting with the leading words.
For words coming before error:
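x <- "no error and no error and some error" # input
library(gsubfn)
rx <- "(\\w+) error" # capture the word immediately before "error"
table(strapplyc(x, rx)[[1]]) # strapplyc returns a list with one element per input string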
  no some
   2    1
Replace rx with the following for words after error:
rx <- "error (\\w+)"
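
If you would rather avoid regular expressions over the whole text, a different approach is to split the text into words and look at the positions next to each match. The following is only a minimal base R sketch under some assumptions: `neighbour_words` is a hypothetical helper name, words are taken to be runs of word characters, and matching is case-sensitive.

```r
# Hypothetical helper (not part of gsubfn): tabulate the words found
# immediately before and after every occurrence of `target`.
neighbour_words <- function(text, target) {
  # Split on runs of non-word characters; drop empty strings that appear
  # when the text starts with punctuation or whitespace.
  words <- unlist(strsplit(text, "\\W+"))
  words <- words[nzchar(words)]

  idx <- which(words == target)

  # Skip a match at the very start (no word before it) or at the
  # very end (no word after it).
  before <- words[idx[idx > 1] - 1]
  after  <- words[idx[idx < length(words)] + 1]

  list(before = table(before), after = table(after))
}

x <- "no error and no error and some error"
res <- neighbour_words(x, "error")
res$before  # counts: no = 2, some = 1
res$after   # counts: and = 2
```

For a file, the same function should accept the result of readLines() on your document, since strsplit() works on a character vector; note that words on either side of a line break are then treated as neighbours. Unlike the single-regex version, this also counts both neighbours correctly when "error" appears twice in a row.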