Memory usage when reading lines from a piped subprocess stdout in Python

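I'm just trying to understand what happens behind the scenes, in terms of memory usage, when handling the output of subprocess.Popen() and reading it line by line. Here is a simple example.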

Given the following script test.py, which prints "Hello", then waits 10 s and prints "World":

import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")

Then the following script, test_sub.py, runs test.py as a subprocess, redirects its stdout to a pipe, and reads it line by line:

import subprocess, time, os, sys

cmd = ["python3","test.py"]

p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT, universal_newlines = True)

for line in iter(p.stdout.readline, ''):
   print("---" + line.rstrip())

My question is this: when I run test_sub.py, it prints "Hello", waits 10 seconds until "World" arrives, and then prints that too. What happens to "Hello" during those 10 s of waiting? Does it get stored in memory until test_sub.py finishes, or does it get tossed away in the first iteration?

It may not matter for this example, but it certainly does when dealing with really large files.

1 Answer

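What happens to "Hello" during those 10 s of waiting?

"Hello" (in the parent) remains reachable through the name line until .readline() returns for the second time. In other words, "Hello" lives at least until the output of print("World") has been read in the parent.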

If you mean what happens in the child process: after sys.stdout.flush() there is no reason for the "Hello" object to stay alive, but it may anyway. See, for example, Does Python intern strings?
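As a rough illustration of why a string can outlive its last obvious use, here is a minimal sketch of interning with sys.intern. The helper build() is a hypothetical name introduced only for this example, and how long an interned string stays in memory is an implementation detail that this sketch does not try to show.

```python
import sys

def build():
    # Hypothetical helper: assemble the text at run time so that the
    # compiler cannot fold it into a single shared constant.
    return "".join(["Hel", "lo"])

a, b = build(), build()

# Two equal strings built separately are usually distinct objects.
print("same object before interning:", a is b)

# sys.intern() returns one canonical object for each distinct string value,
# so both calls hand back the very same object.
same_after = sys.intern(a) is sys.intern(b)
print("same object after interning:", same_after)
```

The interpreter may intern some strings on its own, literals that look like identifiers for instance, which is one way an object such as "Hello" can remain alive after the code that printed it has moved on.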

Does it get stored in memory until test_sub.py finishes, or does it get tossed away in the first iteration?

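After .readline() returns for the second time, line refers to "World". What happens to "Hello" after that depends on garbage collection in the specific Python implementation: even once line is "World", the "Hello" object may continue to live for some time. See Releasing memory in Python.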
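The rebinding described above can be sketched in a self-contained form. This is only an illustration, not the asker's exact setup: stream_lines and CHILD are names made up for the example, and the child is an inline stand-in for test.py without the 10 s sleep.

```python
import subprocess
import sys

def stream_lines(cmd):
    """Yield the child's output one line at a time, without the trailing newline."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True) as proc:
        for line in proc.stdout:
            # `line` is rebound on every pass, so the previous string
            # loses this reference as soon as the next line arrives.
            yield line.rstrip("\n")

# Assumed stand-in for test.py, passed inline so the sketch runs on its own.
CHILD = [sys.executable, "-c", "print('Hello'); print('World')"]

if __name__ == "__main__":
    for text in stream_lines(CHILD):
        print("---" + text)
```

Because only the current line is bound to a name at any moment, the parent's memory use should stay roughly flat for large outputs, as long as the loop body does not keep the lines in a list or similar container.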

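You can set the PYTHONDUMPREFS=1 environment variable and run the code with a debug build of Python to see which objects are still alive when the python process exits. For example, consider the following code: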
```
#!/usr/bin/env python3
import threading
import time
import sys

def strings():
    yield "hello"
    time.sleep(.5)
    yield "world"
    time.sleep(.5)

def print_line():
    while True:
        time.sleep(.1)
        print('+++', line, file=sys.stderr)

threading.Thread(target=print_line, daemon=True).start()
for line in strings():
    print('---', line)
time.sleep(1)
```
It shows that line is not rebound until the second yield. The output of PYTHONDUMPREFS=1 ./python . |& grep "'hello'" shows that 'hello' is still alive when python exits.
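A debug build is not always at hand. As a rough complementary check, a weak reference can show when the previous object is actually reclaimed after the loop variable is rebound. The sketch below makes two assumptions: Line is a hypothetical wrapper class (plain str objects cannot be weakly referenced), and the immediate reclamation it observes is a consequence of CPython's reference counting, not something the language guarantees.

```python
import weakref

class Line:
    """Hypothetical stand-in for a line of output that supports weak references."""
    def __init__(self, text):
        self.text = text

def lines():
    yield Line("Hello")
    yield Line("World")

def observe():
    """Return (current text, previous object reclaimed?) for each rebinding."""
    events = []
    watcher = None
    for line in lines():
        if watcher is not None:
            # `line` now names the new object; a dead weak reference
            # (one that returns None) means the old one has been freed.
            events.append((line.text, watcher() is None))
        watcher = weakref.ref(line)
    return events

if __name__ == "__main__":
    print(observe())
```

On CPython the previous object is expected to be reported as reclaimed as soon as line is rebound. Other implementations may free it later, which is the point made above about garbage collection being implementation-specific.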
Answered 2024-04-28T23:47:32+08:00
