Two Errors Caused by Boundary Problems When Accessing Files Through mmap

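This post explores how three things relate when a file is accessed through mmap: the size of the file itself, the range covered by the mapping, and the range we can actually access. A few small test programs illustrate the behavior.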

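This post assumes the reader knows that mmap can serve as a file access interface. If you have not used it, run man mmap on Linux or look up other material online. In short, the mmap system call maps a section of a file and returns an address, so data on disk can be manipulated as if it were memory. In many cases the "open + mmap + memcpy + msync" sequence can therefore replace the "open + read/write + fsync" sequence.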
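As a minimal sketch of the mmap-based sequence, the helper below sizes a file with ftruncate, maps it, copies a buffer in with memcpy, and flushes it with msync. The name write_via_mmap and its signature are made up for this illustration, and the code assumes a Linux/POSIX system and a non-empty buffer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store len bytes of data in the file at path using
 * open + mmap + memcpy + msync. Returns 0 on success, -1 on failure. */
int write_via_mmap(const char *path, const void *data, size_t len)
{
    assert(data != NULL && len > 0);      /* mmap rejects a zero length */

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The mapping cannot grow the file, so size the file first. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(addr, data, len);              /* plays the role of write() */
    int rc = msync(addr, len, MS_SYNC);   /* plays the role of fsync() */

    munmap(addr, len);
    close(fd);
    return rc;
}
```

A call such as write_via_mmap("out.bin", "hello", 5) should leave a 5-byte file behind. Note the explicit ftruncate: without it the file would stay empty and the memcpy would touch pages beyond the end of the file.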

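Unlike an appending write, however, storing through an mmap-ed address cannot change the size of the mapped file. So what do we do if we want to change the file size? What happens if we write to an address beyond the actual file size? And what happens if we write to an address beyond the mapped range?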

Two simple groups of tests answer these questions:

Test 1: "inside the file, outside the mmap range" raises SIGSEGV (segmentation fault)

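Test 1 is the simple case. The mmap-ed range is smaller than the actual file, so accessing an address that falls inside the file but outside the mapped region raises a "segmentation fault" (SIGSEGV). This is easy to understand: we accessed an illegal memory address (assuming no other mapping happens to occupy that address).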

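As the diagram below shows, we create a 1 MB file, map its first 512 KB with mmap, and then try to access the file at the 800 KB offset. That offset is inside the file but outside the mapped range. The result is a segmentation fault (SIGSEGV).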

                 +------------------+------------------+
file_testmmap:   |      mmapped     |   not mmapped    |
                 +------------------+------------------+
                 0 (KB)            512         ^      1024
                                               |
                      we try to access here  --+
                    (it will cause seg. fault)

Program 1:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <assert.h>


#define ALLOC_SIZE (1024*1024)
#define MMAP_SIZE  (1024*512)
#define ACCESS_OFF (1024*800)


int main()
{
    // open
    int fd = open("file_testmmap", O_CREAT | O_RDWR | O_DIRECT | O_TRUNC, 0755);
    assert(fd >= 0); // open returns -1 on failure, and 0 is a valid descriptor
    // alloc 1MB
    int ret = fallocate(fd, 0, 0, ALLOC_SIZE);
    assert(!ret);
    // mmap
    char *addr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    assert(addr != MAP_FAILED); // mmap returns MAP_FAILED, not NULL, on failure

    char a = *(volatile char *)(addr + ACCESS_OFF); // we expect the program to abort here (SIGSEGV)
    (void)a;
    printf("The first read succeeded!\n");
    *(addr + ACCESS_OFF) = 'j';
    printf("The first write succeeded!\n");

    return 0;
}

Output 1:

zjc@~/test_mmap$ ./a.out
Segmentation fault (core dumped)

Test 2: "outside the file, inside the mmap range" raises SIGBUS (bus error)

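Test 2 is slightly more involved. It checks the case where the mmap range is larger than the file, and it also covers sparse files (another post on this blog [1] describes sparse files and hole punching in detail).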

                 +---------+------------------------------+--------+
file_testmmap:   |#########|       hole                   |########|
(shaded means    +---------+------------------------------+--------+
allocated)       0 (MB)    1                              5        6

mmapped area:    +---------------------------------------------------------------------+
                start from 0 KB                                             end at 20000KB

                     ^            ^                                       ^
TESTs access:      [1] 500 (KB)  [2] 1500                              [3] 10000 


Our mmap area extends well beyond [1], [2] and [3], so there will be no seg. fault.
TESTs:
    [1]: normal access
    [2]: although we access a hole, the FS allocates a block (4 KB) automatically
         on the write; the space the file occupies grows by 4 KB, but no error happens.
    [3]: we will get a SIGBUS abort.

As the diagram above shows, in detail:

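A. We create a 1 MB file and then allocate another 1 MB to it starting at the 5 MB offset. This gives a file whose size is 6 MB but whose allocated blocks occupy only 2 MB on disk. The 4 MB in the middle is a hole, so the file is a sparse file.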

B. We use mmap to map a 20000 KB region (larger than 6 MB) at address addr.

C. We access the mapped region at offsets of 500 KB, 1500 KB and 10000 KB from addr:

The three cases in detail:

Case 1. 500 KB is in a non-hole part of the file, so the access succeeds as expected.

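Case 2. 1500 KB is in the hole. We tested on both xfs and ext4, and the access also succeeds. On the write, the file system automatically allocates a block for the region containing 1500 KB, so the file's actual disk usage grows from 2048 KB to 2052 KB, because the allocation unit of these file systems is typically a 4 KB block.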

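Case 3. 10000 KB is inside the mmap region but beyond the file size, and this raises a bus error (SIGBUS). It is not a segmentation fault, because mmap gave addr a 20000 KB region and the program is therefore not touching an illegal address. Instead, when the file system handles the page fault for the mapping, it checks whether the faulting offset lies beyond the current file size. If the file is smaller than the requested offset, the process receives SIGBUS.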
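The two conditions discussed so far can be combined into one bounds check. The sketch below is an illustration only: the function name and signature are invented here, and it assumes the mapping starts at file offset 0. It also reflects one detail from the mmap man page that the tests above do not exercise: SIGBUS is raised per page, so the remainder of the last partial page of the file is still accessible (and reads as zeros).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: true if byte off of a mapping that starts at file
 * offset 0 and spans map_len bytes can be touched without a fatal signal,
 * given the current file size (for example st_size from fstat). */
bool mapped_offset_is_safe(off_t file_size, size_t map_len, size_t off)
{
    assert(file_size >= 0);

    /* Outside the mapping: expect SIGSEGV (or a stray hit on some
     * unrelated mapping that happens to live at that address). */
    if (off >= map_len)
        return false;

    /* Pages that start at or beyond the end of the file raise SIGBUS.
     * Round the file size up to the next page boundary to find them. */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t backed_end = ((size_t)file_size + page - 1) / page * page;
    return off < backed_end;
}
```

With the numbers from this test (a 6 MB file and a 20000 KB mapping), the check accepts the 500 KB and 1500 KB offsets and rejects the 10000 KB offset.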

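Note that in case 2 we call mmap first and only then use fallocate to create the hole after 1 MB and the allocated space at 5~6 MB. This also shows that after fallocate, write or similar calls change the file size, an existing larger mapping at addr does not need to be changed. In other words, a program can map a region far larger than the file at startup and then, before accessing an address beyond the current file size, make sure the file has been grown to cover that position. That is enough to avoid the error.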
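The "map large first, grow later" pattern might look like the following sketch. Both function names are hypothetical, ftruncate stands in for whichever call (fallocate, write, ...) the program would use to grow the file, and the demo assumes Linux, where a shared file mapping and pread see the same page cache.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store byte c at offset off through an existing
 * mapping of map_len bytes, growing the file first if off lies beyond its
 * current end. Returns 0 on success, -1 on failure. */
int store_with_growth(int fd, char *map, size_t map_len, size_t off, char c)
{
    assert(map != NULL && map != MAP_FAILED);

    if (off >= map_len)
        return -1;                        /* would leave the mapping */

    off_t size = lseek(fd, 0, SEEK_END);  /* current file size */
    if (size < 0)
        return -1;

    /* Grow the file before touching the page; the mapping stays as is. */
    if ((off_t)off >= size && ftruncate(fd, (off_t)off + 1) != 0)
        return -1;

    map[off] = c;                         /* now backed by the file */
    return 0;
}

/* Maps 1 MiB over an empty file, stores one byte at offset 5000 and reads
 * it back with pread. Returns 0 if the byte arrived in the file. */
int demo_growth(const char *path)
{
    const size_t map_len = 1 << 20;
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    char *map = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    char back = 0;
    int ok = store_with_growth(fd, map, map_len, 5000, 'j') == 0
          && pread(fd, &back, 1, 5000) == 1
          && back == 'j';

    munmap(map, map_len);
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

The mapping is created once, while the file is still empty, and is never remapped. Only the file size changes before the store.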

Program 2:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <assert.h>
#include <linux/falloc.h>

#define ALLOC_SIZE (1024*1024)
#define MMAP_SIZE (1024*20000)
#define ACCESS_OFF1 (1024*500)
#define ACCESS_OFF2 (1024*1500)
#define ACCESS_OFF3 (1024*10000)

void print_size(int fd) 
{
    struct stat st;
    fstat(fd, &st);
    // st_size and st_blocks are wider than int, so cast and print with %lld
    printf("fstat:\t file size: %lld KB (%lld KB allocate).\n\n", (long long)st.st_size / 1024, (long long)st.st_blocks * 512 / 1024);
}

int main()
{
    // open
    int fd = open("file_testmmap", O_CREAT | O_RDWR | O_DIRECT | O_TRUNC, 0755);
    assert(fd >= 0); // open returns -1 on failure, and 0 is a valid descriptor
    // alloc 1MB
    int ret = fallocate(fd, 0, 0, ALLOC_SIZE);
    assert(!ret);
    // mmap
    char *addr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    assert(addr != MAP_FAILED); // mmap returns MAP_FAILED, not NULL, on failure


    // === TEST 1. As expected, access to a normal area of a mmapped file  will not cause an abort.
    printf("=== TEST 1. As expected, access to a normal area of a mmapped file  will not cause an abort.\n");
    char a = *(addr + ACCESS_OFF1);
    printf("The first read (at %d KB) succeed!\tchar: %c\n", ACCESS_OFF1 / 1024, a);
    *(addr + ACCESS_OFF1) = 'j';
    printf("The first write (at %d KB) succeed!\tchar: %c\n", ACCESS_OFF1 / 1024, *(addr + ACCESS_OFF1));
    print_size(fd);

    // === TEST 2. access to a mmapped file hole will not cause an abort.
    printf("=== TEST 2. Access to a mmapped file hole will not cause an abort.\n");
    // we first create a hole
    ret = fallocate(fd, 0, ALLOC_SIZE * 5, ALLOC_SIZE);
    assert(!ret);
    printf("File fallocated and punched:\n");
    print_size(fd);
    // although we are reading and writing into a hole area,
    // it will not cause a SIGBUS since we are writing within the file size
    char b = *(addr + ACCESS_OFF2);
    printf("The second read (at %d KB) succeed!\tchar: %c\n", ACCESS_OFF2 / 1024, b);
    *(addr + ACCESS_OFF2) = 'c';
    printf("The second write (at %d KB) succeed!\tchar: %c\n", ACCESS_OFF2 / 1024, *(addr + ACCESS_OFF2));
    print_size(fd);

    // === TEST 3. access beyond the file size will cause an abort
    printf("=== TEST 3. access beyond the file size will cause an abort.\n");
    char c = *(addr + ACCESS_OFF3); // we expect program abort here! we will get a SIGBUS
    printf("The third read (at %d KB) succeed!\tchar: %c\n", ACCESS_OFF3 / 1024, c);
    *(addr + ACCESS_OFF3) = '!';
    printf("The third write (at %d KB) succeed!\tchar: %c\n", ACCESS_OFF3 / 1024, *(addr + ACCESS_OFF3));
    print_size(fd);

    return 0;
}

Output 2:

zjc@~/test_mmap$ ./a.out
=== TEST 1. As expected, access to a normal area of a mmapped file  will not cause an abort.
The first read (at 500 KB) succeed!     char:
The first write (at 500 KB) succeed!    char: j
fstat:   file size: 1024 KB (1024 KB allocate).

=== TEST 2. Access to a mmapped file hole will not cause an abort.
File fallocated and punched:
fstat:   file size: 6144 KB (2048 KB allocate).

The second read (at 1500 KB) succeed!   char:
The second write (at 1500 KB) succeed!  char: c
fstat:   file size: 6144 KB (2052 KB allocate).

=== TEST 3. access beyond the file size will cause an abort.
Bus error (core dumped)
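Both failure modes can also be observed without crashing the test program itself, by performing the access in a child process and inspecting how it terminated. The sketch below is an assumption-laden illustration rather than part of the original tests: probe_access is an invented name, the file lives in /tmp, and map_len must be a multiple of the page size. To make the SIGSEGV case deterministic, the child reserves twice the mapping length and unmaps the second half, so nothing else can occupy the addresses just past the mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. A child process maps the first map_len bytes of a
 * temporary file of file_len bytes and reads the byte at offset off.
 * Returns 0 if the read succeeds, the number of the signal that killed the
 * child otherwise (compare with SIGSEGV / SIGBUS), or -1 on a setup error.
 * Assumes map_len is a non-zero multiple of the page size and
 * off < 2 * map_len. */
int probe_access(off_t file_len, size_t map_len, size_t off)
{
    assert(map_len > 0 && off < 2 * map_len);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                               /* child: may be killed */
        char path[] = "/tmp/mmap_probe_XXXXXX";
        int fd = mkstemp(path);
        if (fd < 0)
            _exit(2);
        unlink(path);
        if (ftruncate(fd, file_len) != 0)
            _exit(2);

        /* Reserve 2 * map_len of address space, place the file mapping in
         * the first half, and unmap the second half. */
        char *base = mmap(NULL, 2 * map_len, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            _exit(2);
        if (mmap(base, map_len, PROT_READ, MAP_SHARED | MAP_FIXED, fd, 0)
                == MAP_FAILED)
            _exit(2);
        if (munmap(base + map_len, map_len) != 0)
            _exit(2);

        volatile const char *p = base;
        char c = p[off];                          /* the access under test */
        (void)c;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

With pg as the page size, probe_access(4 * pg, 2 * pg, 3 * pg) corresponds to Test 1 and should report SIGSEGV, while probe_access(pg, 4 * pg, 2 * pg) corresponds to case 3 of Test 2 and should report SIGBUS.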

[1] Hole Punching and Its Applications (文件打洞 (Hole Punching) 及其应用), http://blog.jcix.top/2018-09-28/hole_punching/

JciX ~
