dede dede - 1 month ago 7
R Question

How to generate a string of letters based on some parameters

I have a set of sentences, each with a different number of words. I need to replace each word with a string of letters, but the replacement letters must follow specific criteria. For example, the letter 't' can be replaced only by 'i', 'l' or 'f'; the letter 'e' can be replaced only by 'o' or 'c'; and so on for each letter of the alphabet. Spaces between words must be kept intact, as must full stops, apostrophes and other punctuation marks. Here is an example:
ORIGINAL SENTENCE: He loves dog.
SENTENCE WITH STRING OF LETTERS: Fc tcwoz bcy.

Is there a way to automate this procedure in R? Thank you.

ADDED: I need to do this replacement for about 400 sentences. The sentences are stored in a variable of a data frame (data$sentences).

Answer
# the strings to be encoded
mystrings <- c('abc', 'bye')

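# the dictionary with the replacements for each letter:
# a named character vector, where each name is a letter and each value
# holds the letters that may replace it
replacements <- character(0)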
replacements['a'] <- 'xy'
replacements['b'] <- 'zp'
replacements['c'] <- '91'
# ... 
replacements['e'] <- 'xyv'
replacements['y'] <- 'opj'


encode <- function(string, dictionary) {
  # get the actual substitutions, one character at a time
  substitutions <- sapply(strsplit(string, '')[[1]], function(ch) {

    # if we have no candidates (spaces, punctuation, unmapped letters),
    # keep the character unchanged
    if (is.na(dictionary[ch])) {
      return(ch)
    }

    # possible replacements for the current character
    possible.replacements <- strsplit(dictionary[ch][[1]], '')[[1]]

    # sample one of the possible replacements for the current character
    return(sample(possible.replacements, 1))
  }, USE.NAMES = FALSE, simplify = TRUE)

  # paste the resulting vector into a single string
  result <- paste(substitutions, collapse = '')

  return(result)
}

for (s in mystrings) {
  print(paste(s, ' --> ', encode(s, replacements), sep = ''))
}

UPDATE: I wrapped the code in a function so that it can be called on each element of a vector.
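
For the data frame mentioned in the question, something along these lines might work. It is only a sketch under a few assumptions: the dictionary contains just the two letters given in the question ('t' and 'e'), the function name `encode_sentence` and the column name `encoded` are made up for illustration, and the small `data` frame stands in for the real one. Unlike a plain lookup, it looks letters up in lowercase and restores the original case, since the example in the question turns 'H' into 'F'.

```r
# Assumed dictionary: each name is a lowercase letter, each value holds the
# letters allowed to replace it. Only 't' and 'e' come from the question.
replacements <- c(t = "ilf", e = "oc")

# Hypothetical helper: replace every letter that has candidates and
# leave everything else untouched.
encode_sentence <- function(sentence, dictionary) {
  chars <- strsplit(sentence, "")[[1]]
  encoded <- vapply(chars, function(ch) {
    candidates <- dictionary[tolower(ch)]
    if (is.na(candidates)) {
      return(ch)  # spaces, punctuation and unmapped letters pass through
    }
    pick <- sample(strsplit(candidates, "")[[1]], 1)
    # keep the case of the original letter
    if (ch != tolower(ch)) toupper(pick) else pick
  }, character(1), USE.NAMES = FALSE)
  paste(encoded, collapse = "")
}

# Stand-in for the data frame described in the question
data <- data.frame(sentences = c("He loves dog.", "The tree fell."),
                   stringsAsFactors = FALSE)

set.seed(1)  # only to make the random draws repeatable
data$encoded <- vapply(data$sentences, encode_sentence, character(1),
                       dictionary = replacements, USE.NAMES = FALSE)
print(data)
```

Because the replacements are drawn with `sample()`, the output changes from run to run unless a seed is set. Each encoded sentence keeps the length, spacing and punctuation of the original.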