DnRng - 1 month ago
Linux Question

Why does size always = 4096 in Linux character driver read call?

I've been working my way through the Linux char driver examples on the web but run across a behavior that I can't explain.

static ssize_t my_read(struct file *f, char __user *user_buf, size_t cnt, loff_t *off)
{
    /* cnt is a size_t, so the matching format specifier is %zu */
    printk(KERN_INFO "Read called for %zu bytes\n", cnt);
    return cnt;
}


The message always indicates that cnt=4096 bytes, regardless of the number of bytes specified in the user space call to read, e.g.:

[11043.021789] Read called for 4096 bytes


However, the user space code calls fread for only 5 bytes:

retval = fread(_rx_buffer, sizeof(char), 5, file_ptr);
printf( "fread returned %d bytes\n", retval );


The output from user space is

fread returned 5 bytes.


How is it that the value of the size in my_read is always 4096 but the value from fread indicates 5? I know there's something I'm missing but not sure what...

Answer Source

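Try read(2) (declared in unistd.h) and your driver should see a request for 5 bytes. The stdio functions (fread(3), fwrite(3), etc.) go through an internal libc buffer. Its size is implementation-defined; with glibc it is typically taken from the file's preferred block size (st_blksize), which is often 4096 bytes, i.e. one page on most systems.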
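As a minimal sketch of the read(2) suggestion above: the helper below issues a single read(2) for exactly the number of bytes asked for, with no stdio layer in between. The name read_once is hypothetical and chosen only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open `path`, issue one read(2) for `count` bytes, and close the file.
 * Returns the number of bytes read, or -1 if open() or read() failed.
 * read_once is a hypothetical helper name, not part of any library.
 */
ssize_t read_once(const char *path, void *buf, size_t count)
{
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* The kernel (and thus a driver's read handler) sees `count` as given. */
    n = read(fd, buf, count);
    close(fd);
    return n;
}
```

Assuming the device node from later in this answer, a call such as read_once("/dev/my-char-device", buf, 5) should make the driver log a request for 5 bytes rather than 4096.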

I believe that the first time you call fread for 5 bytes, libc does an internal read of 4096 bytes, and the following fread calls simply return bytes libc already holds in the buffer associated with the FILE structure you use. This continues until all 4096 buffered bytes have been consumed; asking for the 4097th byte triggers another read of 4096 bytes, and so on.
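One way to observe this read-ahead without a driver is to compare what fread was asked for with how far the underlying file descriptor actually moved. The sketch below does that on a scratch file; fd_offset_after_fread is a hypothetical helper name, and the 16 KiB file size is an arbitrary choice for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * fread() `want` bytes (at most 64) from a 16 KiB scratch file and return
 * the kernel-side offset of the underlying descriptor afterwards, or -1 on
 * error. Hypothetical helper, for illustration only.
 */
long fd_offset_after_fread(size_t want)
{
    char chunk[1024];
    char dest[64];
    long offset = -1;
    FILE *fp;
    int i;

    if (want > sizeof dest)
        return -1;
    fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* Fill the scratch file so libc has plenty of data to read ahead. */
    memset(chunk, 'x', sizeof chunk);
    for (i = 0; i < 16; i++) {
        if (fwrite(chunk, 1, sizeof chunk, fp) != sizeof chunk)
            goto out;
    }
    rewind(fp); /* flush the writes and go back to offset 0 */

    if (fread(dest, 1, want, fp) != want)
        goto out;

    /* Ask the kernel, not libc, where the descriptor now stands. */
    offset = (long)lseek(fileno(fp), 0, SEEK_CUR);
out:
    fclose(fp);
    return offset;
}
```

Calling fd_offset_after_fread(5) should return more than 5, typically the size of the stdio buffer (often 4096 with glibc), although the exact value depends on the libc and the file system.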

The same buffering applies when you write, for example with printf, which is just fprintf with stdout as its first argument. libc won't call write(2) right away; it puts your data in its internal buffer first. The buffer is flushed when it fills up, when you call

fflush(stdout);

yourself, or, when stdout is line-buffered (the default when it refers to a terminal), whenever a newline (byte 0x0a in ASCII) is written.

Try it, you shall see:

#include <stdio.h>
#include <unistd.h> /* for sleep() */

int main(void) {
    printf("the following message won't show up\n");
    printf("hello, world!");
    sleep(3);
    printf("\nuntil now...\n");

    return 0;
}

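This, however, works, because write(2) bypasses libc's buffering:

#include <stdio.h>
#include <unistd.h> /* for sleep(), write() and STDOUT_FILENO */

int main(void) {
    printf("the following message WILL show up\n");
    write(STDOUT_FILENO, "hello!", 6);
    sleep(3);
    printf("see?\n");

    return 0;
}

Note that the file descriptor for standard output is 1 (STDOUT_FILENO); 0 is standard input. Writing to descriptor 0 often appears to work in a terminal, because all three standard descriptors usually refer to the same terminal opened for reading and writing, but it typically fails as soon as input is redirected.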

Flushing every time there's a newline is essential for the user of a terminal to see messages instantly, and it is also helpful for per-line processing, which is done a lot in a Unix environment.

So, even though libc uses the read and write syscalls to fill and flush its buffers (the Microsoft implementation of the C standard library presumably uses the Windows equivalents, probably ReadFile and WriteFile), those syscalls know nothing about libc's buffers. This leads to interesting behaviour when you mix the two:

#include <stdio.h>
#include <unistd.h> /* for write() */

int main(void) {
    printf("1. first message (flushed now)\n");
    printf("2. second message (without flushing)");
    write(STDOUT_FILENO, "3. third message (flushed now)", 30);
    printf("\n");

    return 0;
}

which outputs:

1. first message (flushed now)
3. third message (flushed now)2. second message (without flushing)

(third before second!).
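The reordering goes away if the stdio buffer is flushed before the raw write(2). The following sketch reproduces both orderings on a scratch file instead of the terminal, so the result can be inspected; mixed_output is a hypothetical helper name used only here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write "stdio," through a FILE* and "write," through write(2) to the same
 * scratch file, optionally calling fflush() in between, then copy the
 * resulting file contents into `out` as a NUL-terminated string.
 * Returns 0 on success, -1 on error. Hypothetical helper for illustration.
 */
int mixed_output(int flush_first, char *out, size_t out_size)
{
    FILE *fp;
    ssize_t n;

    if (out_size == 0)
        return -1;
    fp = tmpfile();
    if (fp == NULL)
        return -1;

    fputs("stdio,", fp);        /* stays in libc's buffer for now */
    if (flush_first)
        fflush(fp);             /* hand it to the kernel before write(2) */

    if (write(fileno(fp), "write,", 6) != 6) {
        fclose(fp);
        return -1;
    }
    fflush(fp);                 /* push out whatever is still buffered */

    /* Read the file back from offset 0, bypassing stdio. */
    n = pread(fileno(fp), out, out_size - 1, 0);
    fclose(fp);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

With flush_first set to 0 the stdio text should land after the write(2) text, mirroring the "third before second" output above; with flush_first set to 1 the two pieces should appear in program order.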

Also, note that you can turn off libc's buffering with setvbuf(3). Example:

#include <stdio.h>
#include <unistd.h> /* for sleep() */

int main(void) {
    setvbuf(stdout, NULL, _IONBF, 0);
    printf("the following message WILL show up\n");
    printf("hello!");
    sleep(3);
    printf("see?\n");

    return 0;
}

I have never tried it, but I guess you could do the same with the FILE* you get from calling fopen on your character device and disable I/O buffering for that stream. Call setvbuf right after fopen, before any other operation on the stream:

FILE* fh = fopen("/dev/my-char-device", "rb");
setvbuf(fh, NULL, _IONBF, 0);
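Since this suggestion is untested on a real character device, here is a rough check of the idea using a pipe as a stand-in (like a character device, a pipe is not seekable). It puts 100 bytes into the pipe, reads 5 through a FILE*, and counts how many bytes libc left behind. The helper name bytes_left_after_fread and the 100-byte payload are assumptions made for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Put 100 bytes into a pipe, fread() 5 of them through a FILE*, and return
 * how many bytes are still waiting in the pipe, i.e. were NOT pulled in by
 * libc. Returns -1 on error. Hypothetical helper for illustration.
 */
long bytes_left_after_fread(int unbuffered)
{
    char payload[100], dest[5], rest[128];
    int fds[2];
    FILE *fp;
    long left = -1;
    ssize_t n;

    memset(payload, 'x', sizeof payload);
    if (pipe(fds) != 0)
        return -1;
    n = write(fds[1], payload, sizeof payload);
    close(fds[1]);              /* reader sees EOF once the data is drained */
    if (n != (ssize_t)sizeof payload) {
        close(fds[0]);
        return -1;
    }

    fp = fdopen(fds[0], "rb");
    if (fp == NULL) {
        close(fds[0]);
        return -1;
    }

    /* setvbuf() has to come before any other operation on the stream. */
    if (unbuffered && setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return -1;
    }

    if (fread(dest, 1, sizeof dest, fp) == sizeof dest) {
        /* Bypass stdio to see what is left on the descriptor itself. */
        n = read(fds[0], rest, sizeof rest);
        if (n >= 0)
            left = (long)n;
    }
    fclose(fp);                 /* also closes fds[0] */
    return left;
}
```

On the libc implementations I am aware of, the unbuffered variant should leave 95 bytes in the pipe, while the default buffered stream is expected to drain more than the 5 bytes requested. Whether a given driver then sees exactly the requested count is still worth confirming with its own printk output.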