wj127 wj127 - 19 days ago 5
C Question

How can I read and obtain separated data from a file using 'fread' in C?

I've written in a file (using 'fwrite()') the following:

TUS�ABQ���������������(A����������(A��B������(A��B���A��(A��B���A������B���A������0����A������0�ABQ�������0�ABQ�����LAS����������������A�����������A��&B�������A��&B��B���A��&B��B������&B��
B����153���B����153�LAS�����153�LAS�����LAX���������������:A����������:AUUB������:AUUB��B��:
AUUB��B����UUB��B����������B��������LAX���������LAX�����MDW���������������A����������A��(�������A��(����A��A��(����A������(����A����A�89���A����A�89MDW�����A�89MDW�����OAK���������
����������������������@�����������@�����������@�����������@�������������������������OAK���������OAK�����SAN���������������LA����������LA��P@������LA��P@��@A��LA��P@��@A������P@��@A����������@A��������SAN���������SAN�����TPA�ABQ����������������B�����������B��@�����...(continues)


which is translated to this:

TUSLWD2.103.47.775.1904.06.40.03AMBRFD4.63.228.935.0043.09.113.0ASDGHU5.226.47.78.3.26...(The same structure)


and the hexdump of that would be:

00000000 54 55 53 00 41 42 51 00 00 00 00 00 00 00 00 00 |TUS.ABQ.........|
00000010 00 00 00 00 00 00 28 41 00 00 0e 42 00 00 f8 41 |......(A...B...A|
00000020 00 00 00 00 4c 41 53 00 00 00 00 00 00 00 00 00 |....LAS.........|
00000030 00 00 00 00 00 00 88 41 00 00 26 42 9a 99 11 42 |.......A..&B...B|
(Continues...)


The structure is always 2 words of 3 characters each (e.g. TUS and LWD) followed by 7 floats, and then it repeats over and over until the end of the file.

The key thing is: I just want to read every field separated like 'TUS', 'LWD', '2.10', '3.4', '7.77'...

And I can only use 'fread()' to achieve that! For now, I'm trying this:

aux2 = 0;
fseek(fp, SEEK_SET, 0);
fileSize = 0;
while (!feof(fp) && aux<=2) {
    fread(buffer, sizeof(char)*4, 1, fp);
    printf("%s", buffer);
    fread(buffer, sizeof(char)*4, 1, fp);
    printf("%s", buffer);
    for(i=0; i<7; i++){
        fread(&delay, sizeof(float), 1, fp);
        printf("%f", delay);
    }
    printf("\n");
    aux++;
    fseek(fp,sizeof(char)*7+sizeof(float)*7,SEEK_SET);
    aux2+=36;
}


And I get this result:

TUSABQ0.0000000.0000000.00000010.5000000.0000000.00000010.500000
AB0.0000000.000000-10384675421112248092159136000638976.0000000.0000000.000000-10384675421112248092159136000638976.0000000.000000
AB0.0000000.000000-10384675421112248092159136000638976.0000000.0000000.000000-10384675421112248092159136000638976.0000000.000000


But it does not work correctly...

*Note: ignore the arguments of the last 'fseek()', because I've been trying too many meaningless things!
To write the words (i.e. TUS) into the file, I use this:

fwrite(x->data->key, 4, sizeof(char), fp);


and to write the floats, this:

for (i = 0; i < 7; i++) {
    fwrite(&current->data->retrasos[i], sizeof(float), sizeof(float), fp);
}

Answer

I'd recommend using a structure to hold each data unit:

typedef struct {
    float  value[7];
    char   word1[5];  /* 4 + '\0' */
    char   word2[5];  /* 4 + '\0' */
} unit;

To make the file format portable, you need functions that pack and unpack the above structure to and from a 36-byte array (4 bytes per word plus 7 floats of 4 bytes each). On Intel and AMD architectures, float corresponds to the IEEE 754-2008 binary32 format in little-endian byte order. For example,

#include <errno.h>   /* errno, EINVAL */
#include <string.h>  /* memcpy, size_t */

#define STORAGE_UNIT (4+4+7*4)

#if defined(__i386) || defined(_M_IX86) || defined(__x86_64__) || defined(_M_X64)

size_t unit_pack(char *target, const size_t target_len, const unit *source)
{
    size_t i;

    if (!target || target_len < STORAGE_UNIT || !source) {
        errno = EINVAL;
        return 0;
    }

    memcpy(target + 0, source->word1, 4);
    memcpy(target + 4, source->word2, 4);

    for (i = 0; i < 7; i++)
        memcpy(target + 8 + 4*i, &(source->value[i]), 4);

    return STORAGE_UNIT;
}

size_t unit_unpack(unit *target, const char *source, const size_t source_len)
{
    size_t i;

    if (!target || !source || source_len < STORAGE_UNIT) {
        errno = EINVAL;
        return 0;
    }

    memcpy(target->word1, source, 4);
    target->word1[4] = '\0';

    memcpy(target->word2, source + 4, 4);
    target->word2[4] = '\0';

    for (i = 0; i < 7; i++)
        memcpy(&(target->value[i]), source + 8 + i*4, 4);

    return STORAGE_UNIT;
}

#else
#error Unsupported architecture!
#endif

The above only works on Intel and AMD machines, but it is certainly easy to extend to other architectures if necessary. (Almost all machines currently use IEEE 754-2008 binary32 for float, only the byte order varies. Those that do not, typically have C extensions that do the conversion to/from their internal formats.)
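As one possible way to extend this, here is a minimal sketch of float conversion that does not depend on the host byte order. It assumes that float is IEEE 754 binary32 and is stored in the same byte order as uint32_t, which holds on common platforms but is not guaranteed by the C standard. The helper names put_float_le and get_float_le are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a float as 4 bytes, least significant byte first.
   Assumption: float is IEEE 754 binary32 with the same
   byte order as uint32_t on this machine. */
static void put_float_le(unsigned char *dst, float value)
{
    uint32_t bits;

    assert(sizeof value == sizeof bits);
    memcpy(&bits, &value, sizeof bits);   /* reinterpret without aliasing issues */

    dst[0] = (unsigned char)( bits        & 0xFFu);
    dst[1] = (unsigned char)((bits >>  8) & 0xFFu);
    dst[2] = (unsigned char)((bits >> 16) & 0xFFu);
    dst[3] = (unsigned char)((bits >> 24) & 0xFFu);
}

/* Rebuild a float from 4 bytes stored least significant byte first. */
static float get_float_le(const unsigned char *src)
{
    uint32_t bits = (uint32_t)src[0]
                  | ((uint32_t)src[1] <<  8)
                  | ((uint32_t)src[2] << 16)
                  | ((uint32_t)src[3] << 24);
    float value;

    assert(sizeof value == sizeof bits);
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

With helpers like these, the memcpy() calls for the floats in unit_pack() and unit_unpack() could be replaced, and the architecture check would no longer be needed on machines that satisfy the stated assumption.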

Using the above, you can -- should! must! -- document your file format, for example as follows:

Words are 4 bytes encoded in UTF-8
Floats are IEEE 754-2008 binary32 values in little-endian byte order

A file contains one or more units. Each unit consists of

    Name    Description
    word1   First word
    word2   Second word
    value0  First float
    value1  Second float
    value2  Third float
    value3  Fourth float
    value4  Fifth float
    value5  Sixth float
    value6  Seventh float

There is no padding.
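The layout documented above can also be written down as byte offsets, which helps when you need to seek to a particular unit or field. This is only a sketch; the constant and function names are invented for the example and are not part of the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Field sizes and offsets within one unit, following the documented
   format: two 4-byte words, then seven 4-byte floats, no padding. */
enum {
    WORD_SIZE     = 4,
    FLOAT_SIZE    = 4,
    VALUE_COUNT   = 7,
    OFFSET_WORD1  = 0,
    OFFSET_WORD2  = OFFSET_WORD1 + WORD_SIZE,
    OFFSET_VALUES = OFFSET_WORD2 + WORD_SIZE,
    UNIT_SIZE     = OFFSET_VALUES + VALUE_COUNT * FLOAT_SIZE
};

/* Byte offset of value k (0 to 6) inside a unit. */
static size_t value_offset(size_t k)
{
    assert(k < (size_t)VALUE_COUNT);
    return (size_t)OFFSET_VALUES + k * (size_t)FLOAT_SIZE;
}

/* Byte offset of unit n (counting from 0) from the start of the file,
   suitable for fseek(fp, unit_file_offset(n), SEEK_SET). */
static long unit_file_offset(long n)
{
    return n * (long)UNIT_SIZE;
}
```

UNIT_SIZE works out to the same 36 bytes as STORAGE_UNIT.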

To write a unit, pack it into a char array of size STORAGE_UNIT and write that array. So, if you have unit *one, you can write it to FILE *out using

    char  buffer[STORAGE_UNIT];

    if (!unit_pack(buffer, sizeof buffer, one)) {
        /* Error! Abort program! */
    }
    if (fwrite(buffer, STORAGE_UNIT, 1, out) != 1) {
        /* Write error! Abort program! */
    }

Correspondingly, reading from FILE *in would be

    char buffer[STORAGE_UNIT];

    if (fread(buffer, STORAGE_UNIT, 1, in) != 1) {
        /* End of file, or read error.
           Check feof(in) or/and ferror(in). */
    }
    if (!unit_unpack(one, buffer, STORAGE_UNIT)) {
        /* Error! Abort program! */
    }

If one is an array of units, and you are writing or reading one[k], use &(one[k]) (or equivalently one + k) instead of one.
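Putting the reading side together, a loop over the whole file might look like the sketch below. It repeats the unit structure so that it compiles on its own, and uses a simplified unpack() that assumes the host float is little-endian binary32, as on x86. The function names unpack, read_units and print_unit are hypothetical. Note that the loop tests the return value of fread() instead of feof(), so a short or failed read ends it cleanly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STORAGE_UNIT (4 + 4 + 7*4)

typedef struct {
    float  value[7];
    char   word1[5];  /* 4 + '\0' */
    char   word2[5];  /* 4 + '\0' */
} unit;

/* Simplified unpack. Assumption: host float is 4 bytes,
   little-endian IEEE 754 binary32 (true on x86). */
static void unpack(unit *u, const unsigned char *src)
{
    assert(sizeof(float) == 4);
    memcpy(u->word1, src, 4);
    u->word1[4] = '\0';
    memcpy(u->word2, src + 4, 4);
    u->word2[4] = '\0';
    memcpy(u->value, src + 8, 7 * 4);
}

/* Read up to max units from in into out.
   Returns the number of complete units read; stops at end of
   file, on a read error, or on a trailing partial unit. */
static size_t read_units(FILE *in, unit *out, size_t max)
{
    unsigned char buffer[STORAGE_UNIT];
    size_t n = 0;

    while (n < max && fread(buffer, STORAGE_UNIT, 1, in) == 1)
        unpack(&out[n++], buffer);

    return n;
}

/* Print one unit as: word1 word2 v0 v1 ... v6 */
static void print_unit(FILE *to, const unit *u)
{
    int i;

    fprintf(to, "%s %s", u->word1, u->word2);
    for (i = 0; i < 7; i++)
        fprintf(to, " %f", u->value[i]);
    fputc('\n', to);
}
```

After read_units() returns, checking ferror(in) tells you whether the loop ended because of a read error or simply at end of file.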