Python Popen Hangs

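Sometimes a program cannot avoid running system commands to get a task done.
For example, our system has a small internal feature that relies on rsync for file synchronization.

Recently we noticed that rsync hangs once the amount of data gets a little larger, and at first we did not know why.

After investigating, we found that the cause was incorrect use of subprocess.Popen.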

Preparing the environment

mkdir /tmp/aa /tmp/bb
# Create 10,000 files, each between 10K and 1000K in size
for i in {00001..10000}; do
    file_size=$((1 + $RANDOM % 100))
    dd if=/dev/urandom of=/tmp/aa/file$i.txt bs=10K count=$file_size
done
# du -sh /tmp/aa/
# 4.9G    /tmp/aa/

Code

import logging
from subprocess import PIPE, STDOUT, Popen

LOG = logging.getLogger(__name__)

src_dir = '/tmp/aa/'
tgt_dir = '/tmp/bb/'

# --remove-source-files
command = 'rsync -av %s %s' % (src_dir, tgt_dir)
p = Popen(command, stdin=PIPE, stdout=PIPE, stderr=STDOUT, shell=True)
p.wait()
if p.returncode == 0:
    LOG.info('rsync success')
else:
    LOG.warning('rsync error %d', p.returncode)

The transfer stalled at file0670.txt, after a total of 2.3G of data had been transferred.

Investigation

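The investigation confirmed that the problem was in our own code.
The code captures the child's standard output and standard error through a pipe, but we never read from it. Once rsync's output fills the pipe buffer, rsync blocks on its next write, while the parent sits in p.wait() and never drains the pipe, so neither process can make progress.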

The Popen constructor accepts a pipesize parameter (added in Python 3.10). If it is set, Popen makes calls such as fcntl.fcntl(p2cwrite, fcntl.F_SETPIPE_SZ, self.pipesize) to set the pipe buffer size.
From man fcntl:

Changing the capacity of a pipe
    F_SETPIPE_SZ (int; since Linux 2.6.35)
            Change the capacity of the pipe referred to by fd to be at least arg bytes. An unprivileged process can adjust the pipe capacity to any value between the system page size and the limit defined in /proc/sys/fs/pipe-max-size (see proc(5)). Attempts to set the pipe capacity below the page size are silently rounded up to the page size. Attempts by an unprivileged process to set the pipe capacity above the limit in /proc/sys/fs/pipe-max-size yield the error EPERM; a privileged process (CAP_SYS_RESOURCE) can override the limit.

            When allocating the buffer for the pipe, the kernel may use a capacity larger than arg, if that is convenient for the implementation. (In the current implementation, the allocation is the next higher power-of-two page-size multiple of the requested size.) The actual capacity (in bytes) that is set is returned as the function result.

            Attempting to set the pipe capacity smaller than the amount of buffer space currently used to store data produces the error EBUSY.

            Note that because of the way the pages of the pipe buffer are employed when data is written to the pipe, the number of bytes that can be written may be less than the nominal size, depending on the size of the writes.

    F_GETPIPE_SZ (void; since Linux 2.6.35)
            Return (as the function result) the capacity of the pipe referred to by fd.
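
The two fcntl operations quoted above can be tried directly on a pipe created with os.pipe(). The following is only a minimal sketch, assuming Linux: the helper name resize_pipe and the 100 KiB request are made up for illustration, and the numeric fallbacks are the Linux header values, used when the fcntl module does not expose the constants (it does so since Python 3.10).

```python
import fcntl
import os

# Linux-only fcntl commands; the fcntl module exposes them since Python 3.10.
# The numeric fallbacks are the values from the Linux headers.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def resize_pipe(fd, requested):
    """Ask the kernel for at least `requested` bytes; return (before, after).

    `after` is None when the kernel refuses the request (for example EPERM).
    """
    before = fcntl.fcntl(fd, F_GETPIPE_SZ)
    try:
        after = fcntl.fcntl(fd, F_SETPIPE_SZ, requested)
    except OSError:
        after = None
    return before, after


read_fd, write_fd = os.pipe()
try:
    # 100 KiB is an arbitrary, hypothetical request size.
    before, after = resize_pipe(write_fd, 100 * 1024)
    print("default capacity:", before)
    print("capacity after request:", after)
finally:
    os.close(read_fd)
    os.close(write_fd)
```

According to the man page text above, the kernel may round the request up, so the second number can be larger than what was asked for.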

And from man 7 pipe | grep size -C10:

Pipe capacity
    A pipe has a limited capacity. If the pipe is full, then a write(2) will block or fail, depending on whether the O_NONBLOCK flag is set (see below). Different implementations have different limits for the pipe capacity.
    Applications should not rely on a particular capacity: an application should be designed so that a reading process consumes data as soon as it is available, so that a writing process does not remain blocked.

    In Linux versions before 2.6.11, the capacity of a pipe was the same as the system page size (e.g., 4096 bytes on i386).  Since Linux 2.6.11, the pipe capacity is 16 pages (i.e., 65,536 bytes in a system with a page size of 4096 bytes).  Since Linux 2.6.35, the default pipe capacity is 16 pages, but the capacity can be queried and set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations.  See fcntl(2) for more information.

    The following ioctl(2) operation, which can be applied to a file descriptor that refers to either end of a pipe, places a count of the number of unread bytes in the pipe in the int buffer pointed to by the final argument of the call:

        ioctl(fd, FIONREAD, &nbytes);

    The FIONREAD operation is not specified in any standard, but is provided on many implementations.
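
The FIONREAD call from the excerpt can also be issued from Python. This is a sketch assuming Linux; the helper name unread_bytes is hypothetical, and the constant comes from the standard termios module.

```python
import fcntl
import os
import struct
import termios


def unread_bytes(fd):
    """Return how many bytes are waiting unread in the pipe behind fd."""
    # FIONREAD fills a C int, so pass a buffer of that size and unpack the result.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


read_fd, write_fd = os.pipe()
try:
    os.write(write_fd, b"x" * 1000)   # nobody reads these bytes yet
    pending = unread_bytes(read_fd)
    print("unread bytes in pipe:", pending)
finally:
    os.close(read_fd)
    os.close(write_fd)
```

Polling this value on a child's stdout pipe is one way to see the buffer filling up while debugging a hang like the one described here.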

In other words:

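  1. An unprivileged process can set the pipe buffer size to any value between the page size and /proc/sys/fs/pipe-max-size.
  2. A request below the page size is silently rounded up to the page size.
  3. A request above pipe-max-size fails with EPERM.
  4. A privileged process (CAP_SYS_RESOURCE) can override that limit.
  5. Default pipe buffer size by kernel version:
     a. Before Linux 2.6.11: one system page.
     b. Since Linux 2.6.11: 16 system pages.
     c. Since Linux 2.6.35: still 16 pages by default, but adjustable through fcntl.
  6. When the pipe buffer is full, a write blocks, unless the writer has set non-blocking mode (O_NONBLOCK), in which case the write fails instead.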
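
The last point can be observed without any child process. The sketch below, assuming Linux, puts the write end of a pipe into non-blocking mode and writes page-sized chunks until the kernel refuses; the helper name fill_pipe is made up for illustration.

```python
import os


def fill_pipe(write_fd, chunk_size):
    """Write to a non-blocking pipe until it is full; return the bytes accepted."""
    os.set_blocking(write_fd, False)
    chunk = b"a" * chunk_size
    total = 0
    while True:
        try:
            total += os.write(write_fd, chunk)
        except BlockingIOError:
            # EAGAIN: the pipe is full. A blocking writer would hang here instead.
            return total


read_fd, write_fd = os.pipe()
try:
    page_size = os.sysconf("SC_PAGE_SIZE")
    filled = fill_pipe(write_fd, page_size)
    print("pipe accepted", filled, "bytes before a write would block")
finally:
    os.close(read_fd)
    os.close(write_fd)
```

On a system with the default settings described above, the reported number should match the 16-page capacity, but that depends on the kernel and its configuration.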

Check the page size of the current system

getconf PAGE_SIZE
4096
getconf PAGESIZE
4096

Verification

The pipe buffer on this system should therefore be 16 x 4K = 64K.

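import subprocess
import sys

# 64 KiB of "a" plus the one-byte end="." is 64 KiB + 1 B; use end="" for exactly 64 KiB
cmd = [sys.executable, '-c', 'print("a" * 1024 * 64, end=".")']
# cmd = [sys.executable, '-c', 'import time; time.sleep(10);']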
print(1)
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
print(2)
# stdout, stderr = p.communicate()
# print(repr([stdout, stderr]))
print(3)
p.wait()

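When the child python process writes exactly 64KB, nothing hangs; add a single byte (which is what end="." does in the code above) and the parent gets stuck in wait().
Uncommenting the communicate line lets the program run normally.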
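
The same experiment can be run without hanging forever by giving wait() a timeout. This is a sketch rather than the original test: the helper name wait_without_reading, the 5 second timeout and the two sizes are arbitrary choices, and it assumes a pipe capacity well below 8 MiB.

```python
import subprocess
import sys


def wait_without_reading(n_bytes, timeout=5.0):
    """Start a child that writes n_bytes to a stdout pipe, then wait() on it
    without reading. Return (finished_in_time, bytes_eventually_read)."""
    code = "import sys; sys.stdout.write('a' * %d)" % n_bytes
    p = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    try:
        p.wait(timeout=timeout)
        finished = True
    except subprocess.TimeoutExpired:
        # The child is blocked writing to the full pipe.
        finished = False
    # communicate() drains the pipe, which lets a stuck child finish.
    out, _ = p.communicate()
    return finished, len(out)


small = wait_without_reading(1024)             # fits in the pipe buffer
large = wait_without_reading(8 * 1024 * 1024)  # far larger than the buffer
print("1 KiB:", small)
print("8 MiB:", large)
```

The small write completes because the data fits in the buffer; the large one only completes after communicate() starts reading.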

Fixes

Let the child process use the parent's stdin/stdout/stderr

Popen(command, shell=True)

Redirect to DEVNULL

Popen(command, stdout=subprocess.DEVNULL, shell=True)

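# Python 2 has no subprocess.DEVNULL, so open os.devnull manually
import os
devnull = os.open(os.devnull, os.O_RDWR)  # returns an integer file descriptor
p = Popen(command, stdout=devnull, shell=True)
p.wait()
os.close(devnull)  # an int fd has no .close() method; use os.close()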

Read the child process's stdout

p = Popen(command, stdout=PIPE, stderr=STDOUT, shell=True)
# Blocks until the child exits, reading from its stdout the whole time
p.communicate()
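
communicate() holds all output in memory until the child exits. For a long-running command it may be preferable to consume the output line by line as it arrives. The following is one possible sketch; run_and_stream and demo_cmd are hypothetical names, and the demo child is a stand-in that prints many lines, not rsync itself.

```python
import subprocess
import sys


def run_and_stream(cmd, on_line):
    """Run cmd, pass each output line to on_line as it arrives, return the exit code."""
    p = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so there is only one pipe to drain
        text=True,
    )
    with p.stdout:
        for line in p.stdout:      # reading keeps the pipe from filling up
            on_line(line.rstrip("\n"))
    return p.wait()                # safe now: the pipe has been read to EOF


lines = []
# Stand-in child that produces well over 64 KiB of output.
demo_cmd = [sys.executable, "-c", "for i in range(20000): print('file', i)"]
returncode = run_and_stream(demo_cmd, lines.append)
print(returncode, len(lines))
```

In the rsync case, on_line could be a logger method such as LOG.info, so progress shows up in the log while the transfer is running.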

Use run

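# Without capture_output=True (Python 3.7+), result.stdout and result.stderr are None.
# run() reads the pipes through communicate(), so a full pipe cannot block the child.
result = subprocess.run(command, shell=True, capture_output=True, text=True)
print(result.returncode)
print(result.stdout)
print(result.stderr)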
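
When only the exit status and the error messages matter, as in the rsync job, run() can discard stdout and capture just stderr. This is a sketch of that combination; run_quiet and demo_cmd are hypothetical names, and the demo child merely imitates a noisy command that fails.

```python
import subprocess
import sys


def run_quiet(cmd, timeout=None):
    """Run cmd, discard its stdout, and return (returncode, stderr_text)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,  # nothing to buffer, so nothing can fill up
        stderr=subprocess.PIPE,     # drained by run() via communicate()
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


# Stand-in child: about 1 MB on stdout, a short message on stderr, exit code 3.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write('a' * 1000000); "
    "sys.stderr.write('oops'); sys.exit(3)",
]
returncode, errors = run_quiet(demo_cmd, timeout=30)
print(returncode, repr(errors))
```

The timeout argument is optional; if it expires, run() kills the child and raises subprocess.TimeoutExpired, which gives the caller a chance to log the failure instead of hanging.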