Synchronizing execution of Linux executables with write operations: a locking mechanism between file execution and file writes

  • 2020-05-07 20:46:37
  • OfStack

When an executable file is open for writing, it cannot be executed. Conversely, while a file is being executed, it cannot be opened for writing. The constraint is easy to understand: executing a file and writing to it need to be synchronized, and the kernel guarantees that synchronization. So how does the kernel implement this mechanism?
The inode structure contains a field called i_writecount, of type atomic_t, which counts how many times the file is currently open for writing and is used for this synchronization. As the code below shows, a positive value means the file has writers, and a negative value means write access has been denied.


int get_write_access(struct inode * inode)
{
    spin_lock(&inode->i_lock);
    if (atomic_read(&inode->i_writecount) < 0) { /* write access has been denied, e.g. the file is being executed */
        spin_unlock(&inode->i_lock);
        return -ETXTBSY;
    }
    atomic_inc(&inode->i_writecount);
    spin_unlock(&inode->i_lock);
    return 0;
}

int deny_write_access(struct file * file)
{
    struct inode *inode = file->f_path.dentry->d_inode;

    spin_lock(&inode->i_lock);
    if (atomic_read(&inode->i_writecount) > 0) { /* the file is open for writing: fail */
        spin_unlock(&inode->i_lock);
        return -ETXTBSY;
    }
    atomic_dec(&inode->i_writecount);
    spin_unlock(&inode->i_lock);
    return 0;
}

Both functions are very simple: get_write_access does what its name says, and so does deny_write_access. When a file is executed, deny_write_access should be called before execution starts to shut off write access, so that the file cannot be written while it runs. Let us check that the execve system call does this.
sys_execve calls do_execve, which then calls open_exec. Here is the code of open_exec:


struct file *open_exec(const char *name)
{
    struct file *file;
    int err;

    file = do_filp_open(AT_FDCWD, name,
                O_LARGEFILE | O_RDONLY | FMODE_EXEC, 0,
                MAY_EXEC | MAY_OPEN);
    if (IS_ERR(file))
        goto out;

    err = -EACCES;
    if (!S_ISREG(file->f_path.dentry->d_inode->i_mode))
        goto exit;

    if (file->f_path.mnt->mnt_flags & MNT_NOEXEC)
        goto exit;

    fsnotify_open(file->f_path.dentry);

    err = deny_write_access(file); /* the call we were looking for */
    if (err)
        goto exit;

out:
    return file;

exit:
    fput(file);
    return ERR_PTR(err);
}

You can clearly see the call to deny_write_access, as expected. Correspondingly, the open path should contain a call to get_write_access. That call is made in __dentry_open, one of the functions on the open path:


if (f->f_mode & FMODE_WRITE) {
    error = __get_file_write_access(inode, mnt);
    if (error)
        goto cleanup_file;
    if (!special_file(inode->i_mode))
        file_take_write(f);
}

__get_file_write_access(inode, mnt) wraps get_write_access.
So how does the kernel ensure that a file being written cannot be executed? This is also very simple. When a file is already open for writing, the i_writecount of its inode is at least 1. When execve runs, it calls deny_write_access, which finds i_writecount > 0 and returns an error, so execve fails as well.
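The counter protocol described above can be modelled in user space. The following is only a sketch, not kernel code: struct model_inode and the model_* function names are hypothetical, and a C11 compare-and-swap loop stands in for the spinlock used in the listings above. model_allow_write_access plays the role of the step that undoes a successful deny once execution has finished.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct inode: only the counter is modelled. */
struct model_inode {
    atomic_int i_writecount;   /* > 0: open writers; < 0: write access denied */
};

/* Writer side: succeeds unless write access has been denied. */
int model_get_write_access(struct model_inode *inode)
{
    int old = atomic_load(&inode->i_writecount);
    do {
        if (old < 0)
            return -ETXTBSY;
        /* On failure the CAS reloads 'old', so the check is repeated. */
    } while (!atomic_compare_exchange_weak(&inode->i_writecount, &old, old + 1));
    return 0;
}

/* Exec side: succeeds unless the file is currently open for writing. */
int model_deny_write_access(struct model_inode *inode)
{
    int old = atomic_load(&inode->i_writecount);
    do {
        if (old > 0)
            return -ETXTBSY;
    } while (!atomic_compare_exchange_weak(&inode->i_writecount, &old, old - 1));
    return 0;
}

/* Undo a successful model_get_write_access (the writer closed the file). */
void model_put_write_access(struct model_inode *inode)
{
    int old = atomic_fetch_sub(&inode->i_writecount, 1);
    assert(old > 0);
}

/* Undo a successful model_deny_write_access (execution finished). */
void model_allow_write_access(struct model_inode *inode)
{
    int old = atomic_fetch_add(&inode->i_writecount, 1);
    assert(old < 0);
}
```

Starting from a zeroed struct model_inode, a successful model_get_write_access makes model_deny_write_access return -ETXTBSY until model_put_write_access is called, and the reverse holds for a successful deny. The single signed counter is what makes the two sides mutually exclusive.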
Here is how i_writecount changes when a file is written.
When a file is opened for writing, the function dentry_open runs:

if (f->f_mode & FMODE_WRITE) {
    error = get_write_access(inode);
    if (error)
        goto cleanup_file;
}

Of course, when the file is closed, i_writecount is decremented. On close, the following code runs:

if (file->f_mode & FMODE_WRITE) 
    put_write_access(inode); 

The put_write_access code is simple:

static inline void put_write_access(struct inode * inode) 
{ 
    atomic_dec(&inode->i_writecount); 
} 

To test this, I wrote a trivial program containing just an empty loop. While it was running, I ran echo 111 >> on the executable in bash. As expected, the command failed with the message "Text file busy".
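The same experiment can be tried from inside a single process. This is a minimal sketch, assuming a Linux system with /proc mounted. The function name open_running_executable_for_write is made up for illustration. It tries to open the binary it is running from for writing and reports the errno it gets back. On a typical setup that should be ETXTBSY, although file permissions, a read-only filesystem, or the kernel version may produce a different result.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open the currently running executable for writing.
 * Returns 0 if the open unexpectedly succeeds, otherwise the errno
 * reported by open(2) (ETXTBSY when the kernel's write-count check fires).
 */
int open_running_executable_for_write(void)
{
    /* /proc/self/exe is a Linux-specific link to the running binary. */
    int fd = open("/proc/self/exe", O_WRONLY);

    if (fd >= 0) {
        /* Nothing was written: no O_TRUNC, and we close immediately. */
        close(fd);
        return 0;
    }
    return errno;
}
```

Passing the returned value to strerror gives the same "Text file busy" text that bash printed in the experiment above.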
Does this mechanism also apply to memory mappings? When an executable runs, its dynamic link libraries are mapped with mmap. Are those libraries protected from writes once they are mapped, and is mapping refused while they are open for writing? This needs to be considered because it is a security matter: library files are executable code too, so tampering with them can also cause security problems.
mmap calls mmap_region, which contains this check:


if (vm_flags & VM_DENYWRITE) {
    error = deny_write_access(file);
    if (error)
        goto free_vma;
    correct_wcount = 1;
}

Here the flags argument of the mmap call is translated into vm_flags: when MAP_DENYWRITE is set, VM_DENYWRITE is set accordingly. The following simple program tests this:

#include <stdio.h>
#include <sys/mman.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd;
    void *src = NULL;

    /* Map test.txt read-only with MAP_DENYWRITE set. */
    fd = open("test.txt", O_RDONLY);
    if (fd != -1) {   /* open() returns -1 on failure, not 0 */
        src = mmap(0, 5, PROT_READ | PROT_EXEC, MAP_PRIVATE | MAP_DENYWRITE, fd, 0);
        if (src == MAP_FAILED) {
            printf("MMAP error\n");
            printf("%s\n", strerror(errno));
        } else {
            printf("%p\n", src);
        }
    }

    /* While the mapping exists, try to open the same file for writing. */
    FILE *fd_t = fopen("test.txt", "w");
    if (!fd_t) {
        printf("open for write error\n");
        printf("%s\n", strerror(errno));
        return 1;
    }

    if (fwrite("0000", sizeof(char), 4, fd_t) != 4) {
        printf("fwrite error \n");
    }

    fclose(fd_t);
    if (fd != -1)
        close(fd);
    return 0;
}

In the end, test.txt contains "0000", which is strange: MAP_DENYWRITE appears to have no effect. So I checked man mmap and found:

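MAP_DENYWRITE

This flag is ignored. (Long ago, it signaled that attempts to write to the underlying file should fail with ETXTBUSY. But this was a source of denial-of-service attacks.)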

It turns out that this flag no longer has any effect at the user level, and the man page also explains why: it was open to denial-of-service attacks. An attacker could use MAP_DENYWRITE to map files that some system program needs to write, causing the legitimate program's writes to fail. VM_DENYWRITE is still used inside the kernel, and mmap still contains a call to deny_write_access, but that call is no longer triggered by the MAP_DENYWRITE flag passed to mmap.
This is bad news for the dynamic link libraries associated with an executable. As you know, dynamic link libraries are also loaded with mmap, which means they can be modified at runtime. That is what I set out to confirm. As a result, I had to write my own synchronization control code. We can use i_security in the inode and f_security in the file to implement our own synchronization logic. Security is a tricky issue...

