Levente Makai - 12 days ago
C Question

How to generate every possible 5-letter string from 4 letters and store them in an array in C

In C, I need to make an array that contains every possible 5-letter string made from the letters "A", "C", "G", "T". That is,
AAAAA,
AAAAG,
AAAAC,
etc.
And I need these stored in an array. I'm aware there are 1024 possible combinations, and therefore the array would be allocated with that in mind.
I think the memory allocation would look something like this:

char* combinations[] = calloc(1024, 5*sizeof(char));


I'm not sure how to fill such an array with all possible combinations.

Answer

The following code does what you want.

#include <stdio.h>
#include <stdlib.h>

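char ** getCombinations(void){
  const char letters[] = {'A','C','G','T'};
  // memory to hold pointers to our strings
  // this is less memory efficient, but gives us our char**
  char ** combinations = calloc(1024, sizeof(char*));
  // one contiguous block holding all 1024 strings of 5 chars each
  char * strings = calloc(1024, 5*sizeof(char));
  unsigned int i;
  unsigned int j;

  // free(NULL) is a no-op, so this is safe if only one allocation failed
  if (combinations == NULL || strings == NULL){
    free(combinations);
    free(strings);
    return NULL;
  }

  for (i = 0; i < 1024; i++){
    // the strings are NOT null-terminated: each one is exactly 5 chars
    combinations[i] = &strings[i * 5];
    for ( j = 5; j--;){
      combinations[i][4 - j] = letters[(i >> (j * 2)) % 4];
    }
  }

  return combinations;
}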

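int main(void){
  int i;
  char ** combinations = getCombinations();
  if (combinations == NULL){
    return 1;
  }
  for ( i = 0; i < 1024; i++){
    // the strings are not null-terminated, so print exactly 5 characters
    printf("%.*s\n", 5, combinations[i]);
  }

  // combinations[0] points at the start of the character block, so free it first
  free(combinations[0]);
  free(combinations);
  return 0;
}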
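
As a side note, the comment in the code mentions that the char** layout is less memory efficient. A minimal sketch of one alternative, assuming you don't need a char** and would like null-terminated strings: allocate a single block of 1024 rows of 6 chars. The names Word and makeWords are made up for this illustration and are not part of the code above.

```c
#include <stdlib.h>

#define WORD_LEN 5      /* letters per string */
#define NUM_WORDS 1024  /* 4^5 combinations */

/* Hypothetical type: one combination plus its terminating '\0'. */
typedef char Word[WORD_LEN + 1];

/* Hypothetical helper: returns NUM_WORDS words in one block, or NULL
   if the allocation fails. The caller releases it with one free(). */
Word *makeWords(void) {
  static const char letters[] = {'A', 'C', 'G', 'T'};
  Word *words = calloc(NUM_WORDS, sizeof *words);
  unsigned int i, j;

  if (words == NULL) {
    return NULL;
  }
  for (i = 0; i < NUM_WORDS; i++) {
    /* same digit extraction as in the answer's inner loop */
    for (j = WORD_LEN; j--;) {
      words[i][WORD_LEN - 1 - j] = letters[(i >> (j * 2)) % 4];
    }
    /* calloc zero-fills the block, so words[i][WORD_LEN] is already '\0' */
  }
  return words;
}
```

With this layout, words[0], words[1] and words[1023] can be printed with a plain "%s" and give AAAAA, AAAAC and TTTTT. A single free(words) releases everything.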

The important line is the body of the inner loop:

combinations[i][4 - j] = letters[(i >> (j * 2)) % 4];

The purpose of this line is to turn an index (0-1023) into a combination by counting up in base 4: each index, written as a 5-digit base-4 number, spells out one combination.

Let's break this down:

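  • letters[... % 4] returns a letter based on whatever (...) is. The % 4 part keeps the index in the range 0-3, so that values of 1, 5, 9, ... all return 'C'.

  • (i >> (j * 2)) shifts i right by 2 * j bits, which is the same as dividing by 4 to the power j. Each base-4 digit takes up 2 bits, so this moves digit j of i (counting from the right, starting at 0) into the lowest position, where % 4 picks it out. Base 4 is used because there are 4 possible letters.

  • combinations[i][4 - j] is the character that gets set: the jth letter, counting from the right and starting at 0, of the ith word in the list. Because j runs from 4 down to 0, each word is filled in from left to right.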
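
To make the base-4 idea concrete, here is a small sketch that decodes a single index using division and modulo instead of shifts. The helper name indexToWord is hypothetical and only for illustration; it should produce the same letters as the inner loop above.

```c
/* Hypothetical helper: writes the 5-letter combination for index
   (0-1023) into out, which must have room for at least 6 chars. */
void indexToWord(unsigned int index, char out[6]) {
  static const char letters[] = {'A', 'C', 'G', 'T'};
  int pos;

  /* Peel off base-4 digits from the right: index % 4 is the last
     digit and index / 4 drops it. This matches
     (index >> (j * 2)) % 4 for j = 0..4. */
  for (pos = 4; pos >= 0; pos--) {
    out[pos] = letters[index % 4];
    index /= 4;
  }
  out[5] = '\0';
}
```

For example, index 27 is 1 * 16 + 2 * 4 + 3, so its base-4 digits (padded to 5 places) are 0 0 1 2 3, which maps to A A C G T. Calling indexToWord(27, buf) therefore fills buf with "AACGT".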
