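I am exploring the CFS scheduler. As I understand CFS, vruntime tracks how long a process has run on the CPU (scaled by the task's load weight), so once a process consumes some CPU time, its vruntime should increase.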
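As far as I understand it, CFS does not add raw CPU time to vruntime but scales it by the task's load weight, so a nice-0 task (weight 1024) accrues vruntime at the same rate as real execution time. Below is a minimal user-space sketch of that rule in C. It is not the kernel's code: the names vruntime_delta and NICE_0_WEIGHT are made up for this example, and the plain integer division stands in for the kernel's fixed-point arithmetic.

```c
#include <stdint.h>

/* Assumed for illustration: load weight of a nice-0 task. */
#define NICE_0_WEIGHT 1024ULL

/*
 * Hypothetical helper: approximate how much vruntime (ns) a task gains
 * after running for delta_exec_ns nanoseconds with the given load weight.
 * Heavier tasks gain vruntime more slowly, lighter tasks faster.
 * Note: the multiplication can overflow for very large deltas; the real
 * scheduler uses a different fixed-point scheme.
 */
static uint64_t vruntime_delta(uint64_t delta_exec_ns, uint64_t weight)
{
    if (weight == NICE_0_WEIGHT)
        return delta_exec_ns;          /* nice 0: vruntime tracks real runtime */
    return delta_exec_ns * NICE_0_WEIGHT / weight;
}
```

In the sample data further down, vruntime and sum_exec_runtime of pid 7566 both advance by 2342 between being scheduled in and scheduled out, which is consistent with a nice-0 weight.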

To understand context switching in depth, I studied the implementation of context_switch() in kernel/sched/core.c.

context_switch(struct rq *rq, struct task_struct *prev,
           struct task_struct *next)

To see what a context switch involves, and in particular which process is scheduled out and which is scheduled in, I added

trace_printk(KERN_INFO "**$$,traceme,%d,%llu,%llu,%d,%llu,%llu\n", (int)(prev->pid),prev->se.vruntime,prev->se.sum_exec_runtime, (int)(next->pid),next->se.vruntime,next->se.sum_exec_runtime);

to the context_switch() function in kernel/sched/core.c.

Some sample data after cleanup:

// Fields: prev->pid, prev->se.vruntime, prev->se.sum_exec_runtime, next->pid, next->se.vruntime, next->se.sum_exec_runtime



 Line-1 :    7560,24498429469681,823155565,7566,24498418258892,1637962
 Line-2 :    7566,24498418261234,1640304,7580,24498417733416,1018016
 Line-3 :    7580,24498417752807,1037407,686,24498429468802,48339928895
 Line-4 :    686,24498429469817,48339929910,7566,24498418261234,1640304
 Line-5 :    7566,24498418263610,1642680,7581,24498417762357,1038126
 Line-6 :    7581,24498417781339,1057108,7560,24498429469681,823155565

 Line-7 :    7560,24498429470724,823156608,7566,24498418263610,1642680
 Line-8 :    7566,24498418265980,1645050,7582,24498418395747,1202608
 Line-9 :    7582,24498418414400,1221261,686,24498429469817,48339929910
 Line-10:    686,24498429470804,48339930897,7566,24498418265980,1645050
 Line-11:    7566,24498418268334,1647404,7583,24498417826636,1168325
 Line-12:    7583,24498417845297,1186986,7560,24498429470724,823156608

 Line-13:    7560,24498429471802,823157686,7566,24498418268334,1647404
 Line-14:    7566,24498418270800,1649870,686,24498429470804,48339930897
 // Up to this line, the vruntime of every process increased on each run, as expected.


 Line-15: 686,24498438028365,48348488458,7560,24498429471802,823157686
 Line-16: 0,0,0,7,918077230457,2930949708
 Line-17: 7,918077232097,2930951348,0,0,0
 Line-18: 7560,6056741110796,823305719,7584,24498429478909,1156272 <---- Here the vruntime of process 7560 has decreased. Why?
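For anyone who wants to post-process records like these, here is a minimal sketch of parsing one cleaned line in user-space C. The struct switch_sample and the function parse_switch_sample are hypothetical names introduced only for this example; the field order is assumed to match the trace_printk() call above.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical container for one cleaned context-switch record. */
struct switch_sample {
    int      prev_pid;
    uint64_t prev_vruntime;   /* prev->se.vruntime */
    uint64_t prev_sum_exec;   /* prev->se.sum_exec_runtime */
    int      next_pid;
    uint64_t next_vruntime;   /* next->se.vruntime */
    uint64_t next_sum_exec;   /* next->se.sum_exec_runtime */
};

/*
 * Parse "pid,vrt,sum,pid,vrt,sum" (optional spaces around the commas).
 * Returns 1 if all six fields were read, 0 otherwise.
 */
static int parse_switch_sample(const char *line, struct switch_sample *out)
{
    int n = sscanf(line,
                   "%d ,%" SCNu64 " ,%" SCNu64 " ,%d ,%" SCNu64 " ,%" SCNu64,
                   &out->prev_pid, &out->prev_vruntime, &out->prev_sum_exec,
                   &out->next_pid, &out->next_vruntime, &out->next_sum_exec);
    return n == 6;
}
```

The "Line-N :" prefixes shown above would need to be stripped before calling the parser.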

From the data above, we can infer how long each process ran (in nanoseconds) before it was scheduled out.

p_pid  p_vrt           p_sum_exe_rt  n_pid  n_vrt           n_sum_exe_rt  |  prev_tslice  next_tslice
7560   24498429469681  823155565     7566   24498418258892  1637962       |  1191         2327
7566   24498418261234  1640304       7580   24498417733416  1018016       |  2342         19462
7580   24498417752807  1037407       686    24498429468802  48339928895   |  19391        1003
686    24498429469817  48339929910   7566   24498418261234  1640304       |  1015         2342
7566   24498418263610  1642680       7581   24498417762357  1038126       |  2376         18429
7581   24498417781339  1057108       7560   24498429469681  823155565     |  18982        1191
7560   24498429470724  823156608     7566   24498418263610  1642680       |  1043         2376
7566   24498418265980  1645050       7582   24498418395747  1202608       |  2370         19520
7582   24498418414400  1221261       686    24498429469817  48339929910   |  18653        1015
686    24498429470804  48339930897   7566   24498418265980  1645050       |  987          2370
7566   24498418268334  1647404       7583   24498417826636  1168325       |  2354         19617
7583   24498417845297  1186986       7560   24498429470724  823156608     |  18661        1043
7560   24498429471802  823157686     7566   24498418268334  1647404       |  1078         2354
7566   24498418270800  1649870       686    24498429470804  48339930897   |  2466         987
686    24498438028365  48348488458   7560   24498429471802  823157686     |  8557561      1078
//----------------------- Up to this line vruntime increased
7560   6056741110796   823305719     7584   24498429478909  1156272       |  148033       18671
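The prev_tslice column can be reproduced by subtracting the sum_exec_runtime recorded when a pid was switched in from the value recorded when it was switched out. A minimal sketch in user-space C, with the hypothetical helper name slice_ns chosen just for this example:

```c
#include <stdint.h>

/*
 * Hypothetical helper: time (ns) a task spent on the CPU during one run,
 * given its sum_exec_runtime when it was scheduled in and when it was
 * scheduled out. Assumes sum_exec_runtime never goes backwards.
 */
static uint64_t slice_ns(uint64_t sum_exec_at_switch_in,
                         uint64_t sum_exec_at_switch_out)
{
    return sum_exec_at_switch_out - sum_exec_at_switch_in;
}
```

For example, pid 7560 was switched in with sum_exec_runtime 823157686 and switched out with 823305719, giving the 148033 ns slice in the last row.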

Everything looks fine so far: on each run, the vruntime of the running process increases.

To my surprise, in the last line the vruntime of a running process has decreased.

For the process with pid 7560:

  • Its vruntime decreased in the last line (from 24498429471802 to 6056741110796). Why?

  • How can the vruntime of a running process decrease even though the process is pinned to a specific CPU core (so it cannot migrate to another core)?

  • Another observation: in the other runs, process 7560 ran for short time slices, but this last time it got a much larger one. Is there any correlation between the larger time slice and the drop in vruntime?
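To spot such drops automatically in a longer trace, one option is to compare each vruntime observation with the previous one for the same pid. The following is a rough user-space sketch in C; struct vrt_obs and first_vruntime_regression are hypothetical names for this example only, and the quadratic scan is only meant for small samples like the one above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one (pid, vruntime) observation in trace order. */
struct vrt_obs {
    int      pid;
    uint64_t vruntime;
};

/*
 * Return the index of the first observation whose vruntime is lower than
 * the most recent earlier observation for the same pid, or -1 if the
 * vruntime of every pid is non-decreasing.
 */
static long first_vruntime_regression(const struct vrt_obs *obs, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        /* Walk backwards to the latest earlier record of the same pid. */
        for (size_t j = i; j-- > 0; ) {
            if (obs[j].pid == obs[i].pid) {
                if (obs[i].vruntime < obs[j].vruntime)
                    return (long)i;
                break;
            }
        }
    }
    return -1;
}
```

Fed with the observations for pid 7560 in trace order, it would flag the record with vruntime 6056741110796.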

I am using Debian 8.0 and Ubuntu 16.04 with kernel 3.16.35, and this happens on both systems.

Any link that helps explain the reason would be a great help.