首页 文章

从子进程调用时,Python程序永远挂起

提问于
浏览
35

pip测试套件使用子进程调用来运行集成测试 . 最近放置了PR,删除了一些旧的兼容性代码 . 特别是它使用 b"" 文字明确使用了 b() 函数 . 然而,这似乎打破了特定子进程调用永远挂起的地方 . 更糟糕的是,它只会永远挂在Python 3.3(可能只有Python 3.3.5)上,并且不能轻易地在Travis之外复制 .

相关的拉动请求:

其他Pull请求也会出现类似的问题,但是在不同版本的Python和不同的测试用例中它们会失败 . 这些Pull请求是:

另一位用户今天在IRC上向我报告了一个类似的问题,他们说他们可以在Ubuntu 14.04上使用Python 3.3从死神(但不是在OSX上)本地再现它,而不仅仅是在特拉维斯,就像我迄今为止我一直能够做到的那样 . 他们给了我重现的步骤:

$ git clone git@github.com:xavfernandez/pip.git
$ cd pip
$ git checkout debug_stuck
$ pip install pytest==2.5.2 scripttest==1.3 virtualenv==1.11.6 mock==1.0.1 pretend==1.0.8 setuptools==4.0
$ # The below should pass just fine
$ py.test -k test_env_vars_override_config_file -v -s
$ # Now edit pip/req/req_set.py and remove method remove_me_to_block or change its content to print('KO') or pass
$ # The below should hang forever
$ py.test -k test_env_vars_override_config_file -v -s

在上面的例子中, remove_me_to_block 方法不会在任何地方被调用,只是它的存在足以使测试不被阻塞,并且它的不存在(或改变它的内容)足以使测试块永远存在 .

大多数调试都是在此PR(https://github.com/pypa/pip/pull/1901)中进行了更改 . 一次推送一次提交,测试一直通过,直到应用了这个特定的提交 - https://github.com/dstufft/pip/commit/d296df620916b4cd2379d9fab988cbc088e28fe0 . 具体而言,使用 b'\r\n'(entry + endline).encode("utf-8") 的更改将触发它,但是这些都不在 pip install -vvv INITools 的执行路径中,这是它无法执行的命令 .

在试图追查问题的时候,我注意到如果我用 (lambda: "something")().encode("utf8") 替换至少一次对 "something".encode("utf8") 的调用,它就可以了 .

尝试调试时的另一个问题是,我尝试了各种各样的事情(添加打印语句,无操作 atexit 函数,使用异常的异步子进程)只会将问题从特定的Python版本的特定测试用例转移到不同版本的Python上的不同测试用例 .

我知道如果您直接从 subprocess.Popen().stdout/stderr/stdin 读/写, subprocess 模块可能会死锁 . 但是,此代码使用 communicate() 方法,该方法应该解决这些问题 . 它位于 wait() 调用中 communicate() 执行该进程永远挂起等待pip进程退出 .

其他信息:

  • 非常heisenbug-ey,我已经设法让它消失或根据不应该对它产生任何影响的各种事物进行转变 .

  • 我已经跟踪了pip内部的执行,一直到代码路径的末尾,直到 sys.exit() 被调用 .

  • os._exit() 替换 sys.exit() 修复了所有悬而未决的问题,但是我'd rather not do that as we'然后跳过了Python解释器所做的清理工作 .

  • 没有运行其他线程(使用 threading.enumerate 验证) .

  • 我已经有了一些变化的组合,即使没有 subprocess.PIPE 用于stdout / stderr / stdin,它也会挂起,但是如果不使用它们,其他组合将不会挂起(或者它将转移到不同的测试用例) / python版) .

  • 它似乎与时间无关,任何特定提交将在影响测试用例/ Pythons上100%失败或在0%时间失败 .

  • 经常被改变的代码甚至不被pip子流程中的特定代码路径执行,但是仅仅存在改变似乎打破了它 .

  • 我尝试使用 PYTHONDONTWRITEBYTECODE=1 禁用字节码生成,并且在一个组合中产生效果,但在其他组合中它没有效果 .

  • 子进程调用的命令不会在每次调用时挂起(类似的命令是通过测试套件发出的)但是它总是挂在特定提交的完全相同的位置 .

  • 到目前为止,我已经完全无法在测试套件中通过 subproccess 调用之外重现这个,但是我并没有与此相关 .

我完全不知道是什么导致了这一点 .

UPDATE #1

使用 faulthandler.dump_traceback_later() 我得到了这个结果:

Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/requests/packages/urllib3/response.py", line 287 in closed
Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/requests/packages/urllib3/response.py", line 287 in closed
Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
Timeout (0:00:05)!
Current thread 0x00007f417bd92740:
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/requests/packages/urllib3/response.py", line 285 in closed
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__
  [ Duplicate Lines Snipped ]
  File "/tmp/pytest-10/test_env_vars_override_config_file0/pip_src/pip/_vendor/cachecontrol/filewrapper.py", line 24 in __getattr__

这告诉我,问题可能与垃圾收集和urllib3有关吗? pip._vendor.cachecontrol.filewrapper 中的 Filewrapper 被用作urllib3响应对象(其子类 io.IOBase )的包装器,这样我们就可以使用 read() 方法将每个读取调用的结果存储在缓冲区中并返回它,然后一旦文件具有完全消耗了使用该缓冲区的内容运行回调,以便我们可以将项目存储在缓存中 . 这可能会以某种方式与GC进行交互吗?

Update #2

如果我将一个 def __del__(self): pass 方法添加到Filewrapper类中,那么在我've tried. I tested to ensure that this wasn' t的情况下一切正常,因为我碰巧定义了一个方法(有时候"fixes")通过将其更改为 def __del2__(self): pass ,它又开始失败了 . 我不确定为什么这种方法可以正常工作,并且一个no-op __del__ 方法似乎不太理想 .

Update #3

在执行挂起的pip命令期间,将两次 import gc; gc.set_debug(gc.DEBUG_UNCOLLECTABLE) 打印的东西添加到stderr,它们是:

gc: uncollectable <CallbackFileWrapper 0x7f66385c1cd0>
gc: uncollectable <dict 0x7f663821d5a8>
gc: uncollectable <functools.partial 0x7f663831de10>
gc: uncollectable <_io.BytesIO 0x7f663804dd50>
gc: uncollectable <method 0x7f6638219170>
gc: uncollectable <tuple 0x7f663852bd40>
gc: uncollectable <HTTPResponse 0x7f663831c7d0>
gc: uncollectable <PreparedRequest 0x7f66385c1a90>
gc: uncollectable <dict 0x7f663852cb48>
gc: uncollectable <dict 0x7f6637fdcab8>
gc: uncollectable <HTTPHeaderDict 0x7f663831cb90>
gc: uncollectable <CaseInsensitiveDict 0x7f66385c1ad0>
gc: uncollectable <dict 0x7f6638218ab8>
gc: uncollectable <RequestsCookieJar 0x7f663805d7d0>
gc: uncollectable <dict 0x7f66382140e0>
gc: uncollectable <dict 0x7f6638218680>
gc: uncollectable <list 0x7f6638218e18>
gc: uncollectable <dict 0x7f6637f14878>
gc: uncollectable <dict 0x7f663852c5a8>
gc: uncollectable <dict 0x7f663852cb00>
gc: uncollectable <method 0x7f6638219d88>
gc: uncollectable <DefaultCookiePolicy 0x7f663805d590>
gc: uncollectable <list 0x7f6637f14518>
gc: uncollectable <list 0x7f6637f285a8>
gc: uncollectable <list 0x7f6637f144d0>
gc: uncollectable <list 0x7f6637f14ab8>
gc: uncollectable <list 0x7f6637f28098>
gc: uncollectable <list 0x7f6637f14c20>
gc: uncollectable <list 0x7f6637f145a8>
gc: uncollectable <list 0x7f6637f14440>
gc: uncollectable <list 0x7f663852c560>
gc: uncollectable <list 0x7f6637f26170>
gc: uncollectable <list 0x7f663821e4d0>
gc: uncollectable <list 0x7f6637f2d050>
gc: uncollectable <list 0x7f6637f14fc8>
gc: uncollectable <list 0x7f6637f142d8>
gc: uncollectable <list 0x7f663821d050>
gc: uncollectable <list 0x7f6637f14128>
gc: uncollectable <tuple 0x7f6637fa8d40>
gc: uncollectable <tuple 0x7f66382189e0>
gc: uncollectable <tuple 0x7f66382183f8>
gc: uncollectable <tuple 0x7f663866cc68>
gc: uncollectable <tuple 0x7f6637f1e710>
gc: uncollectable <tuple 0x7f6637fc77a0>
gc: uncollectable <tuple 0x7f6637f289e0>
gc: uncollectable <tuple 0x7f6637f19f80>
gc: uncollectable <tuple 0x7f6638534d40>
gc: uncollectable <tuple 0x7f6637f259e0>
gc: uncollectable <tuple 0x7f6637f1c7a0>
gc: uncollectable <tuple 0x7f6637fc8c20>
gc: uncollectable <tuple 0x7f6638603878>
gc: uncollectable <tuple 0x7f6637f23440>
gc: uncollectable <tuple 0x7f663852c248>
gc: uncollectable <tuple 0x7f6637f2a0e0>
gc: uncollectable <tuple 0x7f66386a6ea8>
gc: uncollectable <tuple 0x7f663852f9e0>
gc: uncollectable <tuple 0x7f6637f28560>

然后

gc: uncollectable <CallbackFileWrapper 0x7f66385c1350>
gc: uncollectable <dict 0x7f6638c33320>
gc: uncollectable <HTTPResponse 0x7f66385c1590>
gc: uncollectable <functools.partial 0x7f6637f03ec0>
gc: uncollectable <_io.BytesIO 0x7f663804d600>
gc: uncollectable <dict 0x7f6637f1f680>
gc: uncollectable <method 0x7f663902d3b0>
gc: uncollectable <tuple 0x7f663852be18>
gc: uncollectable <HTTPMessage 0x7f66385c1c10>
gc: uncollectable <HTTPResponse 0x7f66385c1450>
gc: uncollectable <PreparedRequest 0x7f66385cac50>
gc: uncollectable <dict 0x7f6637f2f248>
gc: uncollectable <dict 0x7f6637f28b90>
gc: uncollectable <dict 0x7f6637f1e638>
gc: uncollectable <list 0x7f6637f26cb0>
gc: uncollectable <list 0x7f6637f2f638>
gc: uncollectable <HTTPHeaderDict 0x7f66385c1f90>
gc: uncollectable <CaseInsensitiveDict 0x7f66385b2890>
gc: uncollectable <dict 0x7f6638bd9200>
gc: uncollectable <RequestsCookieJar 0x7f663805da50>
gc: uncollectable <dict 0x7f6637f28a28>
gc: uncollectable <dict 0x7f663853aa28>
gc: uncollectable <list 0x7f663853a6c8>
gc: uncollectable <dict 0x7f6638ede5f0>
gc: uncollectable <dict 0x7f6637f285f0>
gc: uncollectable <dict 0x7f663853a4d0>
gc: uncollectable <method 0x7f663911f710>
gc: uncollectable <DefaultCookiePolicy 0x7f663805d210>
gc: uncollectable <list 0x7f6637f28ab8>
gc: uncollectable <list 0x7f6638215050>
gc: uncollectable <list 0x7f663853a200>
gc: uncollectable <list 0x7f6638215a28>
gc: uncollectable <list 0x7f663853a950>
gc: uncollectable <list 0x7f663853a998>
gc: uncollectable <list 0x7f6637f21638>
gc: uncollectable <list 0x7f6637f0cd40>
gc: uncollectable <list 0x7f663853ac68>
gc: uncollectable <list 0x7f6637f22c68>
gc: uncollectable <list 0x7f663853a170>
gc: uncollectable <list 0x7f6637fa6a28>
gc: uncollectable <list 0x7f66382153b0>
gc: uncollectable <list 0x7f66386a5e60>
gc: uncollectable <list 0x7f663852f2d8>
gc: uncollectable <list 0x7f66386a3320>
    [<pip._vendor.cachecontrol.filewrapper.CallbackFileWrapper object at 0x7f66385c1cd0>, <pip._vendor.cachecontrol.filewrapper.CallbackFileWrapper object at 0x7f66385c1350>]

这是有用的信息吗?我之前从未使用过这面旗帜所以我不知道这是不寻常的 .

1 回答

  • 19

    在Python 2中,如果一组对象在链(引用循环)中链接在一起,并且至少一个对象具有 __del__ 方法,则垃圾收集器将不会删除这些对象 . 如果您有一个参考周期,添加 __del__() 方法可能只是隐藏错误(解决方法错误) .

    根据您的更新#3,看起来您有这样的问题 .

相关问题