mk.. mk.. - 1 month ago 16
C Question

copy_to_user vs memcpy

I have always been told (in books and tutorials) that when copying data from kernel space to user space we should use copy_to_user(), and that using memcpy() would cause problems for the system. Recently I used memcpy() by mistake, and it worked perfectly fine without any problems. Why should we use copy_to_user() instead of memcpy()?

My test code (a kernel module) is something like this:

static ssize_t test_read(struct file *file, char __user *buf,
                         size_t len, loff_t *offset)
{
    char ani[100];

    if (!*offset) {
        memset(ani, 'A', 100);
        if (memcpy(buf, ani, 100))
            return -EFAULT;
        *offset = 100;
        return *offset;
    }

    return 0;
}

struct file_operations test_fops = {
    .owner = THIS_MODULE,
    .read = test_read,
};

static int __init my_module_init(void)
{
    struct proc_dir_entry *entry;

    printk("We are testing now!!\n");
    entry = create_proc_entry("test", S_IFREG | S_IRUGO, NULL);
    if (!entry) {
        printk("Failed to create proc entry test\n");
        /* Bail out instead of dereferencing a NULL entry below. */
        return -ENOMEM;
    }

    entry->proc_fops = &test_fops;
    return 0;
}
module_init(my_module_init);


From a user-space app, I am reading my /proc entry and everything works fine.

A look at the source code of copy_to_user() suggests that it is also a simple memcpy(): it checks whether the pointer is valid with access_ok() and then does the memcpy().

So my current understanding is that, if we are sure about the pointer we are passing, memcpy() can always be used in place of copy_to_user().

Please correct me if my understanding is incorrect. Any example where copy_to_user() works and memcpy() fails would also be very useful. Thanks.

Answer

There are a couple of reasons for this.

First, security. The kernel can write to any address it wants, so if you take an address handed in from user space and simply memcpy() to it, an attacker could pass a pointer to memory the calling process has no right to modify (kernel memory, for instance) and have the kernel overwrite it, which is a huge security problem. copy_to_user() checks that the target range is one the current process is allowed to write to.
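The check described above can be modeled outside the kernel. What follows is a minimal user-space sketch, not kernel code: a byte array stands in for the address space, and the names `mem`, `USER_LIMIT`, `model_access_ok`, `model_copy_to_user` and `model_unchecked_copy` are all hypothetical. The one convention borrowed from the real API is that copy_to_user() returns the number of bytes it could not copy, so zero means success.

```c
#include <stddef.h>
#include <string.h>

/* Toy address space: offsets [0, USER_LIMIT) play the role of user memory,
 * offsets [USER_LIMIT, MEM_SIZE) play the role of kernel memory.
 * Every name here is hypothetical; none of this is kernel code. */
#define MEM_SIZE   256
#define USER_LIMIT 128

static unsigned char mem[MEM_SIZE];

/* Overflow-safe range check in the spirit of access_ok():
 * the whole range [addr, addr + len) must lie below USER_LIMIT. */
static int model_access_ok(size_t addr, size_t len)
{
    return len <= USER_LIMIT && addr <= USER_LIMIT - len;
}

/* Same contract as copy_to_user(): returns the number of bytes NOT copied. */
static size_t model_copy_to_user(size_t to, const void *from, size_t n)
{
    if (!model_access_ok(to, n))
        return n;               /* refuse: destination is not "user" memory */
    memcpy(&mem[to], from, n);
    return 0;
}

/* What a bare memcpy() amounts to: it trusts the destination blindly. */
static void model_unchecked_copy(size_t to, const void *from, size_t n)
{
    memcpy(&mem[to], from, n);
}
```

In this model, a call such as `model_copy_to_user(200, src, 16)` is refused and leaves the "kernel" half of `mem` untouched, while `model_unchecked_copy(200, src, 16)` overwrites it without complaint. The second case is what an attacker-supplied pointer achieves when the kernel uses memcpy() directly.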

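There are also some architecture considerations. The destination page may not be resident or mapped when the copy runs; copy_to_user() is built to handle the resulting fault and report how many bytes it could not copy, whereas a fault inside a plain memcpy() is treated as a kernel bug. On x86 with SMAP enabled, for example, kernel accesses to user pages must be explicitly bracketed, which copy_to_user() does and memcpy() does not. On some architectures, you might need special instructions to reach user memory at all. And so on. The Linux kernel's goal of being very portable requires this kind of abstraction.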
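Taken together, these points suggest what the read handler in the question should look like. Note that memcpy() returns its destination pointer, so a test on its result cannot report a failed copy; copy_to_user() returns the number of bytes left uncopied, which is what should be tested. The sketch below is written to compile in user space: `stub_copy_to_user` and `stub_fault` are hypothetical stand-ins for the kernel function and for a bad destination, `read_sketch` is a hypothetical name, and `long` stands in for the kernel's `ssize_t` and `loff_t`. In a real module the stub would be replaced by copy_to_user() and `buf` would carry the `__user` annotation.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's copy_to_user() so the sketch builds in user
 * space. Same contract: returns the number of bytes that could NOT be
 * copied. Setting stub_fault simulates an invalid destination. */
static int stub_fault;

static unsigned long stub_copy_to_user(void *to, const void *from,
                                       unsigned long n)
{
    if (stub_fault)
        return n;
    memcpy(to, from, n);
    return 0;
}

#define MSG_LEN 100

/* Read handler in the shape of test_read(): returns the number of bytes
 * read, 0 at end of data, or -EFAULT if the copy fails. */
static long read_sketch(char *buf, size_t len, long *offset)
{
    char ani[MSG_LEN];
    size_t remaining;

    if (*offset < 0 || *offset >= MSG_LEN)
        return 0;                       /* end of data */

    memset(ani, 'A', sizeof(ani));

    /* Never write more than the caller's buffer can hold. */
    remaining = MSG_LEN - (size_t)*offset;
    if (len > remaining)
        len = remaining;

    /* Non-zero means some bytes were not copied: report the fault. */
    if (stub_copy_to_user(buf, ani + *offset, len))
        return -EFAULT;

    *offset += (long)len;
    return (long)len;
}
```

Two choices here differ from the code in the question: the copy is clamped to `len`, so a short user buffer is never overrun, and the offset only advances after a successful copy, so a failed read can be retried.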