machine_1 - 2 months ago
C Question

Opening a file with binary mode for processing

I am opening a file in binary mode, reading it into a buffer, and processing it. I finally realized that opening the file in binary mode is what causes the problem. I searched on Google, but I don't know how to fix it.

The code's aim is to reduce the number of lines by dropping empty ones. Note: the logic of the program is correct.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

int foo(const char *filename);

int main(void)
{
    foo("file.txt");

    fputs("Press any key to continue...", stderr);
    getchar();
    return 0;
}

int foo(const char *filename)
{
    /* open file in binary mode */
    FILE *fp = fopen(filename, "rb");
    if (!fp) {
        perror(filename);
        return -1;
    }

    /* store file size and allocate memory accordingly */
    long f_size;
    fseek(fp, 0L, SEEK_END);
    f_size = ftell(fp);
    fseek(fp, 0L, SEEK_SET);
    char *buf = malloc((size_t)f_size + 1);
    if (!buf) {
        fclose(fp); /* do not leak the stream on failure */
        puts("Error - malloc failed.");
        return -2;
    }

    /* store file contents in buffer */
    size_t bytes_read = fread(buf, 1, (size_t)f_size, fp);
    if (bytes_read != (size_t)f_size) {
        fclose(fp);
        free(buf);
        puts("read error...");
        return -3;
    }
    else {
        fclose(fp);
        buf[f_size] = '\0';
    }

    /* copy the buffer onto itself, skipping every '\n' that does not
       follow at least one non-newline character (f tracks that) */
    bool f = 0;
    size_t n = 0; /* read position */
    size_t m = 0; /* write position */
    while (buf[n]) {
        if (buf[n] == '\n') {
            if (f) {
                f = 0;
                buf[m++] = '\n';
            }
        }
        else {
            f = 1;
            buf[m++] = buf[n];
        }
        n++;
    }
    /* NUL-terminate buffer at m */
    buf[m] = '\0';

    /* open file for writing */
    fp = fopen(filename, "wb");
    if (!fp) {
        perror(filename);
        free(buf);
        return -4;
    }

    /* write buffer to file */
    size_t bytes_written = fwrite(buf, 1, m, fp);
    if (bytes_written != m) {
        puts("fwrite error...");
    }
    fclose(fp);
    free(buf);
    return 0;
}


file.txt:


00000000

00000000

00000000


desired output:


00000000

00000000

00000000

Answer

If you are processing a text file as text, then you should open it in text mode, not binary mode. The physical distinction between a text file and a binary file is system-dependent, and on some systems there is no distinction, but portable programs need to be aware that it at least potentially makes a difference. This is about the C view of the file's contents, not about the (stream) functions you use to access the contents.

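In particular, if you open a file in text mode then the I/O functions will translate between the system's standard convention for (text) line terminators externally and newlines internally. If your program operates on input files that you assume will conform to the local system's idea of line termination, and you want your program to have a view of its input files in which each line is terminated by a newline (alone), then you need to open the files in text mode to engage that translation. Beware, however, that this may have additional implications. For example, on a system that performs translation, the number of characters read in text mode can be smaller than the file's size in bytes, so a check that compares the return value of fread with a size obtained from ftell can report a read error even though the whole file was read.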
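As a rough illustration of the text-mode route, here is a minimal sketch that reads a whole file through a text stream without relying on a size computed beforehand. The helper name read_text_file and the 4096-byte starting capacity are assumptions made for this example, not anything from the question; the idea is only to read until fread returns 0 and grow the buffer as needed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file through a text-mode stream.
 * Returns a NUL-terminated buffer that the caller must free, or NULL on
 * error. If out_len is not NULL, the number of characters read is stored
 * there. */
char *read_text_file(const char *filename, size_t *out_len)
{
    FILE *fp = fopen(filename, "r"); /* "r", not "rb": text mode */
    if (!fp)
        return NULL;

    size_t cap = 4096; /* arbitrary starting capacity (assumption) */
    size_t len = 0;
    char *buf = malloc(cap);
    if (!buf) {
        fclose(fp);
        return NULL;
    }

    for (;;) {
        /* always keep one byte free for the terminating NUL */
        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        len += n;
        if (n == 0)
            break; /* end of file or error; told apart below */
        if (cap - len - 1 == 0) {
            /* buffer is full: double it before the next read */
            char *tmp = realloc(buf, cap * 2);
            if (!tmp) {
                free(buf);
                fclose(fp);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (ferror(fp)) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    buf[len] = '\0';
    if (out_len)
        *out_len = len;
    return buf;
}
```

With a reader like this, the newline-squeezing loop from the question could run on the returned buffer unchanged, and the result would be written back through a stream opened with "w" so that the translation is applied in the other direction.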

If you cannot or will not open the files in text mode, then you need to establish and handle what a "line" means to your program. It is entirely possible to make it recognize line terminators of different types, and even of mixed types. In that case, it may be useful to know that lines of Windows text files are terminated by the two-byte sequence "\r\n", whereas lines of Unix (including OS X) text files are terminated by a single newline. Lines of classic MacOS text files were terminated by single carriage returns, but you probably don't need to worry about that these days. Some other systems have other conventions or support multiple conventions.
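If the file has to stay in binary mode, one possible approach is to make the processing loop itself recognize the terminators listed above. The following is only a sketch under assumptions: the function name squeeze_blank_lines is hypothetical, it works on an explicit length rather than a NUL terminator, and it chooses to write every kept line back with a single '\n', which changes the terminator style of a Windows file.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: drop empty lines from buf[0..len) in place, treating
 * "\r\n", "\n", and a lone "\r" as line terminators. Every kept line is
 * written back terminated by a single '\n' (a design choice of this sketch).
 * Returns the new length, which never exceeds len. */
size_t squeeze_blank_lines(char *buf, size_t len)
{
    size_t m = 0;               /* write position */
    bool line_has_text = false; /* has the current line any content? */

    for (size_t n = 0; n < len; n++) {
        char c = buf[n];
        if (c == '\r' || c == '\n') {
            /* treat "\r\n" as one terminator by consuming the '\n' too */
            if (c == '\r' && n + 1 < len && buf[n + 1] == '\n')
                n++;
            if (line_has_text) {
                buf[m++] = '\n';
                line_has_text = false;
            }
        } else {
            buf[m++] = c;
            line_has_text = true;
        }
    }
    return m;
}
```

In the question's foo, a call such as m = squeeze_blank_lines(buf, bytes_read) could stand in for the while loop. To keep the original terminator style instead, the function would need to remember which terminator it saw and write that back rather than '\n'.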
