Mike Mike - 27 days ago 9
C Question

converting unquoted slash to newline without strtok or further memory allocation

I'm trying to figure out why this doesn't work.

I'd like to take data from a file using the 'getline()' function and convert the string so that the slashes ('/') that are not in quotes are replaced with new line characters. I'd like to avoid copying the string to another if possible.

I tried my program below, with two attempts to process the same data. The first attempt wasn't quite right. I expected to see the following in both cases:

ABC
DEF'/'GH


But

printf("%s",newline);


only prints this:

ABC
DEF'/'


and:

printf("%s",newline2);


causes a segmentation fault.

Because the getline() function returns the string as a char array with memory pre-allocated to it, I feel a ridiculous solution would be:

char lines[5000000];
strcpy(lines,datafromgetline);
char* newline=parsemulti(lines,10); //prints data almost correctly
printf("%s",newline);


But could I somehow do this where I don't have to allocate local stack space or memory? Can I somehow modify the incoming data directly without a segmentation fault?

#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

// replaces all occurrences of / not within single quotes with a new line character
char* parsemulti(char* input,int inputlen){
  char* fms=strchr(input,'/');
  char output[100000]; //allocate tons of space
  if (!fms){
    return input;
  }else{
    int exempt=0,sz=inputlen;
    char aline[5000];
    char*inputptr=input,*lineptr=aline;
    memset(aline,0,5000);
    while(--sz >= 0){
      if (*inputptr=='\''){exempt=1-exempt;} //toggle exempt when ' is found
      if (*inputptr=='/' && exempt==0){
        *lineptr='\0';
        strcat(output,aline);
        lineptr=aline;
        strcat(output,"\r\n");
      }else{
        *lineptr=*inputptr;lineptr++;
      }
      inputptr++;
    }
    if (exempt==1){printf("\nWARNING: Unclosed quotes\n");}
    *lineptr='\0';
    strcat(output,aline);
    strcat(output,"\r\n");
  }
  strcpy(input,output);
  return input;
}

int main(){
  char lines[5000];
  strcpy(lines,"ABC/DEF'/'GH");
  char* newline=parsemulti(lines,10); //prints data almost correctly
  printf("%s",newline);

  char* lines2="ABC/DEF'/'GH";
  char* newline2=parsemulti(lines2,10); //returns segmentation fault
  printf("%s",newline2);
  return 0;
}

Answer

These two lines

char lines[5000];
strcpy(lines, "ABC/DEF'/'GH");

will

  • allocate memory for 5000 objects of type char on the stack
  • copy the string literal's contents into the array named "lines", which you are free to modify

On the other hand,

char *lines2 = "ABC/DEF'/'GH";

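defines a pointer to a string literal, which is usually placed in read-only memory.
Read-only, as in do not modify me :) Writing through that pointer is undefined behavior in C, which is why the second call crashes when parsemulti() tries to write into it.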
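To make the array-versus-literal difference concrete, here is a minimal sketch. The helper names replace_first_slash() and writable_copy() are made up for this illustration and are not part of the question's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: overwrites the first '/' in s with '\n'.
   s must point to writable memory. Returns 1 if a slash was replaced. */
int replace_first_slash(char *s) {
    char *p = strchr(s, '/');
    if (p == NULL)
        return 0;
    *p = '\n';
    return 1;
}

/* Hypothetical helper: returns a writable heap copy of a string
   (for example a string literal). The caller must free() the result. */
char *writable_copy(const char *literal) {
    size_t n = strlen(literal) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, literal, n);
    return copy;
}
```

Calling replace_first_slash() on an array declared as char lines[] = "ABC/DEF'/'GH"; is fine, because the array holds its own copy of the characters. Passing the literal itself is undefined behavior; passing the result of writable_copy() is fine again, at the cost of one allocation.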

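You tagged this C, so I assume you are talking about the getline() function. It is not part of the C standard; it is specified by POSIX.1-2008 and started out as a GNU C Library extension. It manages memory on its own: if you pass it a NULL pointer it allocates the buffer, and if the buffer you pass is too small it reallocates it. Because it may call realloc() on that buffer, the buffer has to be heap memory obtained from malloc(), so you can't give it the address of a stack char array instead. The buffer it hands back is ordinary writable heap memory, so you can modify the line in place.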
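As a rough sketch of that ownership model, the function below reads from one stream and writes to another. Its name, slashes_to_newlines_stream(), and its two FILE* parameters are assumptions made for this example. It replaces every slash and ignores quotes, since the point here is only that the buffer can be modified in place.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: copies every line of `in` to `out`, replacing each
   '/' with '\n' directly inside the buffer that getline() owns.
   Returns the number of lines read. */
long slashes_to_newlines_stream(FILE *in, FILE *out) {
    char *line = NULL;   /* NULL with size 0 lets getline() allocate */
    size_t cap = 0;
    ssize_t len;
    long count = 0;

    while ((len = getline(&line, &cap, in)) != -1) {
        for (ssize_t i = 0; i < len; i++) {
            if (line[i] == '/')
                line[i] = '\n';   /* the buffer is writable heap memory */
        }
        fwrite(line, 1, (size_t)len, out);
        count++;
    }
    free(line);   /* one free() releases the buffer, however often it grew */
    return count;
}
```

The same buffer is reused for every line, so no per-line copy or stack array is needed.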

To actually find and replace the separator character in the string, I'd say don't reinvent the wheel; use the library string functions.

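#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

int main(void) {
  char *line = NULL;  /* getline() allocates and grows this buffer */
  char *needle;
  ssize_t line_size;
  size_t size = 0;

  line_size = getline(&line, &size, stdin);
  while (line_size != -1) {
    needle = strchr(line, '/');
    while (needle) {
      /* Keep only a slash wrapped directly in quotes, as in '/'.
         A slash at the very start of the line has no previous character. */
      if (needle == line || !(*(needle - 1) == '\'' && *(needle + 1) == '\''))
        *needle = '\n';
      needle = strchr(needle + 1, '/');
    }
    printf("%s", line);
    line_size = getline(&line, &size, stdin);
  }
  free(line);  /* the caller owns the buffer */
  return 0;
}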
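If slashes can appear anywhere inside a longer quoted run, such as 'a/b', checking only the neighbouring characters is not enough. A minimal sketch of an alternative follows. It reuses the quote-toggling idea from the question's exempt flag but writes directly into the input, so no second buffer is involved. The function name unquoted_slashes_to_newlines() and its return convention are assumptions for this example. It also stops at the terminating '\0' rather than taking a length; the question passes 10 for a 12-character string, which would be consistent with the missing "GH".

```c
/* Replaces every '/' that is outside single quotes with '\n', in place.
   s must be a writable, NUL-terminated string.
   Returns 0 on success, or -1 if a quote was left unclosed. */
int unquoted_slashes_to_newlines(char *s) {
    int in_quotes = 0;

    for (; *s != '\0'; s++) {
        if (*s == '\'')
            in_quotes = !in_quotes;      /* toggle on every single quote */
        else if (*s == '/' && !in_quotes)
            *s = '\n';                   /* same length, so no copy needed */
    }
    return in_quotes ? -1 : 0;
}
```

Because one character is swapped for one character, the string never grows. That would not hold for a "\r\n" replacement like the one in the question, which needs extra room for each slash.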