ADOctopus ADOctopus - 2 months ago 6
C Question

Issues with 'saving progress' of function

I'm trying to parse through some CSV log files extracting only the n'th field (dismissing the others for speed). My function works as expected when I use a buffer size with fread greater than the size of the input.

The problem is when I read in part of the input and try to continue where I left off the next time the function is called. I believe the problem lies in how I'm handling the null terminator and setting my globals, but I just can't seem to figure it out.

Any help with understanding what I'm doing wrong greatly appreciated!

Code

#include <stdio.h>
#include <time.h>

int gcomc = 0;
int gpos = 0;

void test(char *str, int len)
{
const char *ptr = str;
char ch;
int i;
char so[10];
int comc = gcomc;
int pos = gpos;

for(i = 0; i < len; i++)
{
ch = ptr[i];

switch(ch)
{
case ';':
comc++;
break;

case '\0':
gcomc = comc;
gpos = pos;
break;

default:
if (comc == 3) {
ch = ptr[i];
so[pos++] = ch;
}
if (comc == 7) {
printf(" %s ", so);
comc = 0;
pos = 0;
gcomc = 0;
gpos = 0;
}
}
}

return;
}

int main(int argc, char* argv[])
{

FILE *fin=fopen("test.txt", "rb");
char buffer[100 + 1];
size_t bsz;

while((bsz = fread(buffer, sizeof *buffer, 100, fin)) > 0)
{
buffer[bsz] = '\0';
test(buffer, bsz);
}

return 1;
}


Input

A;B;C;D;E;F;G;H
I;J;K;L;M;N;O;P
Q;R;S;T;U;V;W;X
Y;Z;1;2;3;4;5;6


Output with buffer size of 100 (101)

D L T 2


Output with buffer size of 10 (11)

D P
Q X
Segmentation fault (core dumped)


Edit:
Thank you for the comments and code, I've reworked my (rather dumb written) code - any further criticism is welcome (constructive or destructive, I learn from it all):

#include <stdio.h>
#include <time.h>

void test(char *str, int len);

int gcomc, gpos = 0;

void test(char *str, int len)
{
const char *ptr = str;
char ch;
int i;
static char so[10];

for(i = 0; i < len; i++)
{
ch = ptr[i];

switch(ch)
{
case ';':
gcomc++;
break;

default:
if (gcomc == 3) {
ch = ptr[i];
so[gpos++] = ch;
}
if (gcomc == 7) {
so[gpos] = '\0'; /* ensure so is null terminated */
printf(" %s ", so);
gcomc = 0;
gpos = 0;
}
}
}

return;
}

extern int main()
{
FILE *fin=fopen("test.txt", "rb");
char buffer[10 + 1];
size_t bsz;

while((bsz = fread(buffer, sizeof *buffer, sizeof buffer, fin)) > 0)
{
test(buffer, bsz);
}

return 1;
}

Answer

There are at least two problems in your code to be able to read the file in chunks.

First, the so array is automatic: it has no reason to keep its values from one call to the others. You should declare it global (outside the test function) or static.

Next, you only copy the local state to global one when you find a null. But the null is at position len, and you exit the loop just before (for(i = 0; i < len; i++) note the <) so on next call you start again with 0, 0. You should choose one method to indicate the end of the buffer, either passing a length, of writing a null marker, but mixing both is error prone. As you use fread, my advice is to stick to a length:

In main use:

while((bsz = fread(buffer, sizeof *buffer, sizeof buffer, fin)) > 0)
{
    test(buffer, bsz);
}

(that way, you only write the size of the buffer once)

and in test:

void test(char *str, int len)
{
    const char *ptr = str;
          char ch;
          int i;
          static char so[10];
          int comc = gcomc;
          int pos = gpos;

    for(i = 0; i < len; i++)
    {
        ch = ptr[i];

        switch(ch) 
        {
             case ';':
                 comc++;
                 break;

             default:
                 if (comc == 3) {
                     ch = ptr[i];
                     so[pos++] = ch;
                 }
                 if (comc == 7) {
                     so[pos] = '\0'; /* ensure so is null terminated */
                     printf(" %s ", so);
                     comc = 0;
                     pos = 0;
                     gcomc = 0;
                     gpos = 0;
                 }
        }
    }
    gcomc = comc;          /* store the state to globals */
    gpos = pos;

return;
}

But as you were said in comments mixing local and globals like that is error prone. It looks like you started coding before designing the structure of the program and identifying what actually needed to be global. You didn't, did you? ;-)

Comments