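I am using cgroups with libvirt to limit the memory that a group of qemu-kvm guests can use on a custom Linux kernel 4.7.8. After a few tests, I see a kernel panic after the oom-killer is invoked when the libvirt cgroup runs out of memory. This happens even when I set the cgroup memory below the system total and the system is idle apart from running the VMs (a load that leaves memory outside the cgroup for other tasks). For the record, my system has 32GB and I use 20GB for the guest cgroups. Here is part of the crash log (it is long, but I can link to the full log later):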

May 11 15:53:57  kernel: [ 6380.727959] qemu-system-x86 invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), order=0, oom_score_adj=0
May 11 15:53:57  kernel: [ 6380.727966] CPU: 1 PID: 15433 Comm: qemu-system-x86 Tainted: G           O    4.7.8 #25
May 11 15:53:57  kernel: [ 6380.727968] Hardware name: ADLINK TECHNOLOGY Inc. Express-SL/, BIOS 1.22.10.KA08 05/03/2017
May 11 15:53:57  kernel: [ 6380.727971]  ffff880802cbb700 ffff8803259d7a08 ffffffff81336719 ffff88081948fc00
May 11 15:53:57  kernel: [ 6380.727975]  0000000000000296 ffff8801f1638dc0 ffffffff811b50f6 ffff88083dff9800
May 11 15:53:57  kernel: [ 6380.727979]  ffff8803259d7cf8 ffff8803259d7b38 ffffffff8115fe5f 024200ca00000003
May 11 15:53:57  kernel: [ 6380.727982] Call Trace:
May 11 15:53:57  kernel: [ 6380.727989]  [<ffffffff81336719>] dump_stack+0x65/0x8c
May 11 15:53:57  kernel: [ 6380.727994]  [<ffffffff811b50f6>] ? mem_cgroup_select_victim_node+0x17d/0x1ac
May 11 15:53:57  kernel: [ 6380.727999]  [<ffffffff8115fe5f>] dump_header+0x5e/0x286
May 11 15:53:57  kernel: [ 6380.728003]  [<ffffffff81171d2c>] ? try_to_free_mem_cgroup_pages+0x10d/0x16a
May 11 15:53:57  kernel: [ 6380.728007]  [<ffffffff811608ce>] oom_kill_process+0xc2/0x487
May 11 15:53:57  kernel: [ 6380.728010]  [<ffffffff811b53f5>] ? task_in_mem_cgroup+0xc9/0xd6
May 11 15:53:57  kernel: [ 6380.728014]  [<ffffffff812e22c2>] ? selinux_capable+0x1f/0x21
May 11 15:53:57  kernel: [ 6380.728018]  [<ffffffff812d95ac>] ? security_capable_noaudit+0x2b/0x46
May 11 15:53:57  kernel: [ 6380.728023]  [<ffffffff8111469e>] ? css_next_descendant_pre+0x32/0x53
May 11 15:53:57  kernel: [ 6380.728025]  [<ffffffff811afd1c>] ? css_put+0x18/0x1a
May 11 15:53:57  kernel: [ 6380.728028]  [<ffffffff811b235b>] ? mem_cgroup_iter+0x250/0x265
May 11 15:53:57  kernel: [ 6380.728031]  [<ffffffff811607c1>] ? oom_badness+0x10f/0x15a
May 11 15:53:57  kernel: [ 6380.728036]  [<ffffffff811af46d>] ? get_mem_cgroup_from_mm+0x52/0x71
May 11 15:53:57  kernel: [ 6380.728039]  [<ffffffff811b4039>] mem_cgroup_out_of_memory+0x2c7/0x311
May 11 15:53:57  kernel: [ 6380.728044]  [<ffffffff810da147>] ? finish_wait+0x65/0x70
May 11 15:53:57  kernel: [ 6380.728047]  [<ffffffff811b43bb>] mem_cgroup_oom_synchronize+0x1ed/0x27b
May 11 15:53:57  kernel: [ 6380.728051]  [<ffffffff811b14d2>] ? mem_cgroup_count_precharge_pte_range+0xe8/0xe8
May 11 15:53:57  kernel: [ 6380.728054]  [<ffffffff81160ffd>] pagefault_out_of_memory+0x1f/0x76
May 11 15:53:57  kernel: [ 6380.728058]  [<ffffffff8118e399>] ? vma_adjust+0x4b5/0x58b
May 11 15:53:57  kernel: [ 6380.728061]  [<ffffffff810999bf>] mm_fault_error+0x66/0x103
May 11 15:53:57  kernel: [ 6380.728064]  [<ffffffff81099e4c>] __do_page_fault+0x3f0/0x4d8
May 11 15:53:57  kernel: [ 6380.728068]  [<ffffffff81191a7d>] ? SyS_mremap+0x46c/0x4cf
May 11 15:53:57  kernel: [ 6380.728071]  [<ffffffff8109a043>] do_page_fault+0x26/0x2f
May 11 15:53:57  kernel: [ 6380.728074]  [<ffffffff81a3dca8>] page_fault+0x28/0x30
May 11 15:53:57  kernel: [ 6380.728077] Task in /machine/ubc3.libvirt-qemu killed as a result of limit of /machine
May 11 15:53:57  kernel: [ 6380.728082] memory: usage 31457280kB, limit 31457280kB, failcnt 0
May 11 15:53:57  kernel: [ 6380.728084] memory+swap: usage 31457280kB, limit 31457280kB, failcnt 129072
May 11 15:53:57  kernel: [ 6380.728086] kmem: usage 2200kB, limit 9007199254740988kB, failcnt 0
May 11 15:53:57  kernel: [ 6380.728088] Memory cgroup stats for /machine: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
May 11 15:53:57  kernel: [ 6380.728102] Memory cgroup stats for /machine/ubc4.libvirt-qemu: cache:12KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:8KB active_file:4KB unevictable:0KB
May 11 15:53:57  kernel: [ 6380.728114] Memory cgroup stats for /machine/ubc1.libvirt-qemu: cache:32KB rss:6312484KB rss_huge:0KB mapped_file:20KB dirty:0KB writeback:0KB swap:0KB inactive_anon:16KB active_anon:6312488KB inactive_file:12KB active_file:0KB unevictable:0KB
May 11 15:53:57  kernel: [ 6380.728126] Memory cgroup stats for /machine/ubc2.libvirt-qemu: cache:60KB rss:8408808KB rss_huge:0KB mapped_file:20KB dirty:0KB writeback:0KB swap:0KB inactive_anon:16KB active_anon:8408804KB inactive_file:20KB active_file:20KB unevictable:0KB
May 11 15:53:57  kernel: [ 6380.728137] Memory cgroup stats for /machine/ubc3.libvirt-qemu: cache:60KB rss:8410240KB rss_huge:0KB mapped_file:20KB dirty:0KB writeback:0KB swap:0KB inactive_anon:16KB active_anon:8410244KB inactive_file:24KB active_file:16KB unevictable:0KB
May 11 15:53:57  kernel: [ 6380.728149] Memory cgroup stats for /machine/ubc4.libvirt-qemu: cache:88KB rss:8323296KB rss_huge:0KB mapped_file:20KB dirty:0KB writeback:0KB swap:0KB inactive_anon:16KB active_anon:8323260KB inactive_file:32KB active_file:36KB unevictable:0KB
May 11 15:53:57  kernel: [ 6380.728161] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
May 11 15:53:57  kernel: [ 6380.728189] [ 5493]     0  5493  1626653  1580390    3170       9        0             0 qemu-system-x86
May 11 15:53:57  kernel: [ 6380.728193] [15427]     0 15427  2151137  2104431    4195      11        0             0 qemu-system-x86
May 11 15:53:57  kernel: [ 6380.728197] [17683]     0 17683  2151136  2104812    4195      11        0             0 qemu-system-x86
May 11 15:53:57  kernel: [ 6380.728202] [18273]     0 18273  2152955  2083131    4156      12        0             0 qemu-system-x86
May 11 15:53:57  kernel: [ 6380.728205] Memory cgroup out of memory: Kill process 17683 (qemu-system-x86) score 260 or sacrifice child
May 11 15:53:57  kernel: [ 6380.728217] Killed process 17683 (qemu-system-x86) total-vm:8604544kB, anon-rss:8409956kB, file-rss:9272kB, shmem-rss:20kB

May 11 15:53:58  kernel: [ 6381.379203] CPU: 4 PID: 121 Comm: kworker/4:1 Tainted: G           O    4.7.8 #25
May 11 15:53:58  kernel: [ 6381.379206] Hardware name: ADLINK TECHNOLOGY Inc. Express-SL/, BIOS 1.22.10.KA08 05/03/2017
May 11 15:53:58  kernel: [ 6381.379222]  0000000000000296 000000000000039a 0000000000000000 ffffffff81f4b6bb
May 11 15:53:58  kernel: [ 6381.379227]  0000000000000000 ffff8808038f7748 ffffffff810a938a 000000091d19fce0
May 11 15:53:58  kernel: [ 6381.379232] Call Trace:
May 11 15:53:58  kernel: [ 6381.379239]  [<ffffffff81336719>] dump_stack+0x65/0x8c
May 11 15:53:58  kernel: [ 6381.379244]  [<ffffffff810a938a>] __warn+0xdc/0xf7
May 11 15:53:58  kernel: [ 6381.379249]  [<ffffffff810a93bd>] warn_slowpath_null+0x18/0x1a
May 11 15:53:58  kernel: [ 6381.379254]  [<ffffffff81034a38>] mmu_spte_clear_track_bits+0xe6/0x147
May 11 15:53:58  kernel: [ 6381.379259]  [<ffffffff810d258f>] ? check_preempt_wakeup+0x115/0x1b4
May 11 15:53:58  kernel: [ 6381.379263]  [<ffffffff81032058>] ? gfn_to_rmap+0x27/0x5a
May 11 15:53:58  kernel: [ 6381.379267]  [<ffffffff810d3d7c>] ? update_group_capacity+0x25/0x1d0
May 11 15:53:58  kernel: [ 6381.379272]  [<ffffffff810353e7>] drop_spte+0x15/0xa4
May 11 15:53:58  kernel: [ 6381.379277]  [<ffffffff81035513>] mmu_page_zap_pte+0x48/0xc1
May 11 15:53:58  kernel: [ 6381.379282]  [<ffffffff81071dbe>] ? sched_clock+0x9/0xd
May 11 15:53:58  kernel: [ 6381.379287]  [<ffffffff81035703>] kvm_mmu_prepare_zap_page+0x177/0x2ef
May 11 15:53:58  kernel: [ 6381.379293]  [<ffffffff81069828>] ? __switch_to+0x458/0x4ea
May 11 15:53:58  kernel: [ 6381.379298]  [<ffffffff810d04da>] ? sched_clock_cpu+0x21/0xb4
May 11 15:53:58  kernel: [ 6381.379304]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:53:58  kernel: [ 6381.379316]  [<ffffffff810363a7>] kvm_mmu_invalidate_zap_all_pages+0xcc/0x104
May 11 15:53:58  kernel: [ 6381.379321]  [<ffffffff813498c2>] ? percpu_ref_put+0x2e/0x2e
May 11 15:53:58  kernel: [ 6381.379327]  [<ffffffff8102800e>] kvm_arch_flush_shadow_all+0x9/0xb
May 11 15:53:58  kernel: [ 6381.379331]  [<ffffffff8101a7f5>] kvm_mmu_notifier_release+0x2e/0x41
May 11 15:53:58  kernel: [ 6381.379337]  [<ffffffff811a600f>] __mmu_notifier_release+0x4d/0xe3
May 11 15:53:58  kernel: [ 6381.379341]  [<ffffffff811ab39c>] ? kfree+0x167/0x178
May 11 15:53:58  kernel: [ 6381.379346]  [<ffffffff81349bab>] ? percpu_ref_kill_and_confirm+0x60/0x65
May 11 15:53:58  kernel: [ 6381.379351]  [<ffffffff8118cfb3>] exit_mmap+0x22/0x102
May 11 15:53:58  kernel: [ 6381.379357]  [<ffffffff811f4e9b>] ? exit_aio+0xc6/0xd5
May 11 15:53:58  kernel: [ 6381.379362]  [<ffffffff810d04da>] ? sched_clock_cpu+0x21/0xb4
May 11 15:53:58  kernel: [ 6381.379367]  [<ffffffff810a6f30>] __mmput+0x19/0xbc
May 11 15:53:58  kernel: [ 6381.379371]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:53:58  kernel: [ 6381.379376]  [<ffffffff810a6fe3>] mmput_async_fn+0x10/0x12
May 11 15:53:58  kernel: [ 6381.379382]  [<ffffffff810c0620>] process_one_work+0x212/0x353
May 11 15:53:58  kernel: [ 6381.379387]  [<ffffffff81a396c6>] ? schedule+0x98/0xa6
May 11 15:53:58  kernel: [ 6381.379392]  [<ffffffff810c0ace>] worker_thread+0x36d/0x43c
May 11 15:53:58  kernel: [ 6381.379397]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:53:58  kernel: [ 6381.379407]  [<ffffffff810cf9d4>] ? default_wake_function+0xd/0xf
May 11 15:53:58  kernel: [ 6381.379413]  [<ffffffff810c0761>] ? process_one_work+0x353/0x353
May 11 15:53:58  kernel: [ 6381.379417]  [<ffffffff81a396c6>] ? schedule+0x98/0xa6
May 11 15:53:58  kernel: [ 6381.379422]  [<ffffffff810c0761>] ? process_one_work+0x353/0x353
May 11 15:53:58  kernel: [ 6381.379426]  [<ffffffff810c49e8>] kthread+0xc8/0xd2
May 11 15:53:58  kernel: [ 6381.379431]  [<ffffffff810c0761>] ? process_one_work+0x353/0x353
May 11 15:53:58  kernel: [ 6381.379436]  [<ffffffff81a3bf3f>] ret_from_fork+0x1f/0x40
May 11 15:53:58  kernel: [ 6381.379441]  [<ffffffff810c4920>] ? kthread_freezable_should_stop+0x61/0x61
May 11 15:53:58  kernel: [ 6381.379457] Modules linked in: bridge stp llc ipip ip_gre vfio_iommu_type1 vfio_pci vfio vfio_virqfd qcserial qmi_wwan usbnet cdc_wdm clear_stats(O) fusion(O) gpio_pca953x i2c_i801 i2c_acpi_sbus(O) gpio_exar e1000e
May 11 15:53:58  kernel: [ 6381.379481] Hardware name: ADLINK TECHNOLOGY Inc. Express-SL/, BIOS 1.22.10.KA08 05/03/2017
May 11 15:53:58  kernel: [ 6381.379486] Workqueue: events mmput_async_fn
May 11 15:53:58  kernel: [ 6381.379489]  ffffffff81f4b6bb ffff8808038f7708 ffffffff81336719 ffff880800294f00
May 11 15:53:58  kernel: [ 6381.379494]  0000000000000296 dead000000000100 0000000000000000 ffffffff81f4b6bb
May 11 15:53:58  kernel: [ 6381.379499]  0000000000000000 ffff8808038f7748 ffffffff810a938a 0000000900000100
May 11 15:53:58  kernel: [ 6381.379504] Call Trace:
May 11 15:53:58  kernel: [ 6381.379508]  [<ffffffff81336719>] dump_stack+0x65/0x8c
May 11 15:53:58  kernel: [ 6381.379518]  [<ffffffff810a93bd>] warn_slowpath_null+0x18/0x1a
May 11 15:53:58  kernel: [ 6381.379523]  [<ffffffff81034a38>] mmu_spte_clear_track_bits+0xe6/0x147
May 11 15:53:58  kernel: [ 6381.379527]  [<ffffffff810d258f>] ? check_preempt_wakeup+0x115/0x1b4
May 11 15:53:58  kernel: [ 6381.379532]  [<ffffffff81032058>] ? gfn_to_rmap+0x27/0x5a
May 11 15:53:58  kernel: [ 6381.379536]  [<ffffffff810d3d7c>] ? update_group_capacity+0x25/0x1d0
May 11 15:53:58  kernel: [ 6381.379541]  [<ffffffff810353e7>] drop_spte+0x15/0xa4
May 11 15:53:58  kernel: [ 6381.379545]  [<ffffffff81035513>] mmu_page_zap_pte+0x48/0xc1
May 11 15:53:58  kernel: [ 6381.379550]  [<ffffffff81071dbe>] ? sched_clock+0x9/0xd
May 11 15:53:58  kernel: [ 6381.379555]  [<ffffffff81035703>] kvm_mmu_prepare_zap_page+0x177/0x2ef
May 11 15:53:58  kernel: [ 6381.379565]  [<ffffffff810d04da>] ? sched_clock_cpu+0x21/0xb4
May 11 15:53:58  kernel: [ 6381.379570]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:53:58  kernel: [ 6381.379574]  [<ffffffff814585e7>] ? extract_buf+0xf7/0x106
May 11 15:53:58  kernel: [ 6381.379580]  [<ffffffff810363a7>] kvm_mmu_invalidate_zap_all_pages+0xcc/0x104
May 11 15:53:58  kernel: [ 6381.379585]  [<ffffffff813498c2>] ? percpu_ref_put+0x2e/0x2e
May 11 15:53:58  kernel: [ 6381.379590]  [<ffffffff8102800e>] kvm_arch_flush_shadow_all+0x9/0xb
May 11 15:53:58  kernel: [ 6381.379595]  [<ffffffff8101a7f5>] kvm_mmu_notifier_release+0x2e/0x41
May 11 15:53:58  kernel: [ 6381.379600]  [<ffffffff811a600f>] __mmu_notifier_release+0x4d/0xe3
May 11 15:53:58  kernel: [ 6381.379604]  [<ffffffff811ab39c>] ? kfree+0x167/0x178
May 11 15:53:58  kernel: [ 6381.379609]  [<ffffffff81349bab>] ? percpu_ref_kill_and_confirm+0x60/0x65
May 11 15:53:58  kernel: [ 6381.379614]  [<ffffffff8118cfb3>] exit_mmap+0x22/0x102
May 11 15:53:58  kernel: [ 6381.379619]  [<ffffffff811f4e9b>] ? exit_aio+0xc6/0xd5
May 11 15:53:58  kernel: [ 6381.379624]  [<ffffffff810d04da>] ? sched_clock_cpu+0x21/0xb4
May 11 15:53:58  kernel: [ 6381.379632]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:53:58  kernel: [ 6381.379637]  [<ffffffff810a6fe3>] mmput_async_fn+0x10/0x12
May 11 15:53:58  kernel: [ 6381.379642]  [<ffffffff810c0620>] process_one_work+0x212/0x353
May 11 15:53:58  kernel: [ 6381.379647]  [<ffffffff81a396c6>] ? schedule+0x98/0xa6
May 11 15:53:58  kernel: [ 6381.379652]  [<ffffffff810c0ace>] worker_thread+0x36d/0x43c
May 11 15:53:58  kernel: [ 6381.379656]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:53:58  kernel: [ 6381.379661]  [<ffffffff810cf9ab>] ? try_to_wake_up+0x240/0x25c
May 11 15:53:58  kernel: [ 6381.379667]  [<ffffffff810cf9d4>] ? default_wake_function+0xd/0xf
May 11 15:53:58  kernel: [ 6381.379672]  [<ffffffff810c0761>] ? process_one_work+0x353/0x353
May 11 15:53:58  kernel: [ 6381.379681]  [<ffffffff810c0761>] ? process_one_work+0x353/0x353
May 11 15:53:58  kernel: [ 6381.379686]  [<ffffffff810c49e8>] kthread+0xc8/0xd2
May 11 15:53:58  kernel: [ 6381.379696]  [<ffffffff81a3bf3f>] ret_from_fork+0x1f/0x40
May 11 15:53:58  kernel: [ 6381.379701]  [<ffffffff810c4920>] ? kthread_freezable_should_stop+0x61/0x61
May 11 15:53:58  kernel: [ 6381.379715] Modules linked in: bridge stp llc ipip ip_gre vfio_iommu_type1 vfio_pci vfio vfio_virqfd qcserial qmi_wwan usbnet cdc_wdm clear_stats(O) fusion(O) gpio_pca953x i2c_i801 i2c_acpi_sbus(O) gpio_exar e1000e
May 11 15:53:58  kernel: [ 6381.379736] CPU: 4 PID: 121 Comm: kworker/4:1 Tainted: G        W  O    4.7.8 #25
May 11 15:53:58  kernel: [ 6381.379739] Hardware name: ADLINK TECHNOLOGY Inc. Express-SL/, BIOS 1.22.10.KA08 05/03/2017
May 11 15:53:58  kernel: [ 6381.379743] Workqueue: events mmput_async_fn
May 11 15:53:58  kernel: [ 6381.379750]  0000000000000296 dead000000000100 0000000000000000 ffffffff81f4b6bb
May 11 15:53:58  kernel: [ 6381.379760] Call Trace:
.
.
.

May 11 15:54:13  kernel: [ 6396.890938] Workqueue: events mmput_async_fn
May 11 15:54:13  kernel: [ 6396.890940]  ffffffff81f4b6bb ffff8808038f7708 ffffffff81336719 0000000000294f00
May 11 15:54:13  kernel: [ 6396.890945]  0000000000000296 ffffea0002d58220 0000000000000000 ffffffff81f4b6bb
May 11 15:54:13  kernel: [ 6396.890950]  0000000000000000 ffff8808038f7748 ffffffff810a938a 0000000902d58220
May 11 15:54:13  kernel: [ 6396.890954] Call Trace:
May 11 15:54:13  kernel: [ 6396.890958]  [<ffffffff81336719>] dump_stack+0x65/0x8c
May 11 15:54:13  kernel: [ 6396.890963]  [<ffffffff810a938a>] __warn+0xdc/0xf7
May 11 15:54:13  kernel: [ 6396.890968]  [<ffffffff810a93bd>] warn_slowpath_null+0x18/0x1a
May 11 15:54:13  kernel: [ 6396.890972]  [<ffffffff81034a38>] mmu_spte_clear_track_bits+0xe6/0x147
May 11 15:54:13  kernel: [ 6396.890977]  [<ffffffff81032058>] ? gfn_to_rmap+0x27/0x5a
May 11 15:54:13  kernel: [ 6396.890981]  [<ffffffff810353e7>] drop_spte+0x15/0xa4
May 11 15:54:13  kernel: [ 6396.890986]  [<ffffffff81035513>] mmu_page_zap_pte+0x48/0xc1
May 11 15:54:13  kernel: [ 6396.890990]  [<ffffffff8112b110>] ? kprobe_flush_task+0x8d/0xe8
May 11 15:54:13  kernel: [ 6396.890995]  [<ffffffff81035703>] kvm_mmu_prepare_zap_page+0x177/0x2ef
May 11 15:54:13  kernel: [ 6396.891000]  [<ffffffff810cee3a>] ? finish_task_switch+0x19f/0x1d5
May 11 15:54:13  kernel: [ 6396.891005]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:54:13  kernel: [ 6396.891009]  [<ffffffff81a39517>] ? __schedule+0x56f/0x594
May 11 15:54:13  kernel: [ 6396.891014]  [<ffffffff81a39562>] ? preempt_schedule_common+0x26/0x31
May 11 15:54:13  kernel: [ 6396.891020]  [<ffffffff810363a7>] kvm_mmu_invalidate_zap_all_pages+0xcc/0x104
May 11 15:54:13  kernel: [ 6396.891024]  [<ffffffff813498c2>] ? percpu_ref_put+0x2e/0x2e
May 11 15:54:13  kernel: [ 6396.891029]  [<ffffffff8102800e>] kvm_arch_flush_shadow_all+0x9/0x
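
To put numbers on the OOM report above, here is a rough Python sketch that adds up the per-guest `rss` figures and compares them with the `/machine` limit. The regular expressions and the `summarize` helper are my own, not part of any tool, and the sample text is a trimmed copy of the report lines (only the fields used are kept).

```python
import re

# Trimmed copies of lines from the OOM report above (assumption: only the
# "memory:" line and the leading fields of each stats line are needed).
REPORT = """\
memory: usage 31457280kB, limit 31457280kB, failcnt 0
Memory cgroup stats for /machine: cache:0KB rss:0KB rss_huge:0KB
Memory cgroup stats for /machine/ubc4.libvirt-qemu: cache:12KB rss:0KB rss_huge:0KB
Memory cgroup stats for /machine/ubc1.libvirt-qemu: cache:32KB rss:6312484KB rss_huge:0KB
Memory cgroup stats for /machine/ubc2.libvirt-qemu: cache:60KB rss:8408808KB rss_huge:0KB
Memory cgroup stats for /machine/ubc3.libvirt-qemu: cache:60KB rss:8410240KB rss_huge:0KB
Memory cgroup stats for /machine/ubc4.libvirt-qemu: cache:88KB rss:8323296KB rss_huge:0KB
"""

# "memory+swap:" and "kmem:" lines do not contain the text "memory: usage",
# so this only picks up the plain memory counter.
LIMIT_RE = re.compile(r"memory: usage (\d+)kB, limit (\d+)kB")
# \brss: skips "rss_huge:" because of the underscore that follows "rss".
STATS_RE = re.compile(r"Memory cgroup stats for (\S+): .*?\brss:(\d+)KB")


def summarize(report):
    """Return usage, limit and summed rss (all in kB) from an OOM report."""
    match = LIMIT_RE.search(report)
    if match is None:
        raise ValueError("no 'memory: usage ..., limit ...' line found")
    usage_kb, limit_kb = (int(value) for value in match.groups())
    rss_total_kb = sum(int(m.group(2)) for m in STATS_RE.finditer(report))
    return {
        "usage_kb": usage_kb,
        "limit_kb": limit_kb,
        "limit_gib": limit_kb / (1024 * 1024),
        "rss_total_kb": rss_total_kb,
        "unaccounted_kb": usage_kb - rss_total_kb,
    }


if __name__ == "__main__":
    for key, value in summarize(REPORT).items():
        print(f"{key}: {value}")
```

By this tally the guests' anonymous memory accounts for all but a couple of megabytes of the `/machine` limit, which in this particular report works out to 30 GiB.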

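I decided to try disabling the oom-killer on the cgroup and exhausting the cgroup memory to see what would happen. I expected the guests to hang until I manually killed one of them (as described in the cgroup documentation). Surprisingly, every time I repeated the test, one of the guests (seemingly at random) was killed. I am confused: if the oom-killer is disabled, what is killing the process?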
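
For reference, this is roughly how the oom-killer state of the cgroup can be checked. It is only a sketch: the path assumes a cgroup v1 memory controller mounted at `/sys/fs/cgroup/memory` and the `/machine` group seen in the log, and `parse_oom_control` is a helper name I made up. With cgroup v1, reading `memory.oom_control` returns `key value` lines such as `oom_kill_disable` and `under_oom`.

```python
from pathlib import Path

# Assumed location: cgroup v1 memory controller and the /machine group from
# the log. Adjust to the actual mount point and cgroup layout.
OOM_CONTROL = Path("/sys/fs/cgroup/memory/machine/memory.oom_control")


def parse_oom_control(text):
    """Turn the 'key value' lines of memory.oom_control into a dict of ints."""
    state = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            state[parts[0]] = int(parts[1])
    return state


def read_oom_control(path=OOM_CONTROL):
    """Read and parse the control file; raises OSError if it is missing."""
    return parse_oom_control(path.read_text())


if __name__ == "__main__":
    # Sample content in the cgroup v1 format, used when the file is absent.
    sample = "oom_kill_disable 1\nunder_oom 0\n"
    print(parse_oom_control(sample))
    if OOM_CONTROL.exists():
        print(read_oom_control())
```

One thing this makes easy to verify is whether `oom_kill_disable` is really set on the group whose limit is being hit (`/machine` in the log), and not only on one of the per-guest child groups.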

Here are the messages I get from the kernel in that case, and also when the oom-killer is enabled but the system does not crash:

kernel: [ 1143.934857]   cache: task_struct(10:ubc2.libvirt-qemu), object size: 3520, buffer size: 3520, default order: 3, min order: 0
kernel: [ 1143.934860]   node 0: slabs: 3, objs: 27, free: 0
kernel: [ 1143.944535] SLUB: Unable to allocate memory on node -1, gfp=0x24000c0(GFP_KERNEL)
kernel: [ 1143.944541]   cache: cred_jar(10:ubc2.libvirt-qemu), object size: 168, buffer size: 192, default order: 0, min order: 0
kernel: [ 1143.944545]   node 0: slabs: 2, objs: 42, free: 0

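From what I have observed, something seems to kill the process before the oom-killer is invoked (when it is enabled), and in that case the system recovers. But when the oom-killer actually is invoked, the system crashes and the machine needs a reboot.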

So my questions are:

  • What could cause the oom-killer to crash the machine?

  • What is killing the guests when the oom-killer is disabled?

If anyone has any clue on this matter, that would be great! Thanks!

Note: I am using kernel v4.7.8 built with buildroot and compiled with uClibc on an x86 platform. Also, there is no swap on this system.