Linux CPU frequency scaling affects timerfd accuracy

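On a Linux-based, single-core embedded Cortex-A8 machine I have run into a problem with timerfd: I need to trigger some IO every few milliseconds, and so far everything has worked fine with a timer created like this: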

#include <sys/timerfd.h>

/* interval is the timer period in microseconds */
int _timer_fd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
int _flags = 0; /* 0 = relative timer */
struct itimerspec _new_timer;
_new_timer.it_interval.tv_sec = interval / 1000000;
_new_timer.it_interval.tv_nsec = (interval % 1000000) * 1000;
/* first expiration after one full interval */
_new_timer.it_value.tv_sec = _new_timer.it_interval.tv_sec;
_new_timer.it_value.tv_nsec = _new_timer.it_interval.tv_nsec;
timerfd_settime(_timer_fd, _flags, &_new_timer, NULL);

... followed by select() on the file descriptor.
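To make explicit what a late wakeup looks like from user space, here is a minimal Python sketch of the same setup. It assumes Linux and a libc that exports timerfd_create and timerfd_settime, and it calls them through ctypes. The helper names (split_interval_us, open_periodic_timerfd, wait_for_expirations) are made up for this illustration, and the struct layout assumes that time_t has the same width as long. Each read on a timerfd returns a 64-bit count of expirations since the previous read, so a value greater than 1 means wakeups were missed.

```python
import ctypes
import os
import select
import struct
import time


class Timespec(ctypes.Structure):
    # Assumption: time_t and long have the same width on this ABI.
    _fields_ = [("tv_sec", ctypes.c_long), ("tv_nsec", ctypes.c_long)]


class Itimerspec(ctypes.Structure):
    _fields_ = [("it_interval", Timespec), ("it_value", Timespec)]


def split_interval_us(interval_us):
    """Split microseconds into (seconds, nanoseconds), as the C snippet does."""
    return interval_us // 1000000, (interval_us % 1000000) * 1000


def open_periodic_timerfd(interval_us):
    """Create a CLOCK_MONOTONIC timerfd that fires every interval_us microseconds."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = libc.timerfd_create(time.CLOCK_MONOTONIC, 0)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "timerfd_create failed")
    sec, nsec = split_interval_us(interval_us)
    spec = Itimerspec(Timespec(sec, nsec), Timespec(sec, nsec))
    if libc.timerfd_settime(fd, 0, ctypes.byref(spec), None) < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "timerfd_settime failed")
    return fd


def wait_for_expirations(fd):
    """Block in select() until the timer fires, then return the expiration count."""
    select.select([fd], [], [])
    (count,) = struct.unpack("=Q", os.read(fd, 8))
    return count
```

A caller would loop on wait_for_expirations(fd) and treat count - 1 as the number of missed periods for that wakeup.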

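The CPU runs at 800 MHz by default and can be scaled down to 300 MHz. Even at the lowest frequency, and even under high system load and heavy IO, the timer fires regularly.

Now the problem: when I set the CPU frequency governor to ondemand, the timer misses wakeups for several seconds (I have seen up to 2800 ms) when the frequency switches.

The IO I am talking about involves uploading a large file (network IO, extraction/CPU, writing to flash). Merely creating or extracting a large archive does not seem to be a problem.

I modified this handy little Python script to use timerfd to print the CPU frequency and the time difference every 100 ms, and I can reproduce the problem! Running test.py and starting an upload (heavy IO) gives me the following output: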

f=300000 t=0.100021, count=01 *
f=600000 t=0.099609, count=01 *                    <== switch, but no problem
f=600000 t=0.099989, count=01 *
f=300000 t=0.100388, count=01 *                    <== switch, but no problem
f=300000 t=0.099874, count=01 *
f=300000 t=0.099944, count=01 *
f=300000 t=0.100000, count=01 *
f=600000 t=0.099615, count=01 *                    <== switch, but no problem
f=600000 t=0.100033, count=01 *
f=600000 t=0.099958, count=01 *
f=600000 t=0.100003, count=01 *                    <== IO starts
f=600000 t=0.100062, count=01 *
f=600000 t=0.100318, count=01 *
f=800000 t=0.418505, count=04 ****                 <== 3 misses
f=800000 t=0.081735, count=01 *
f=800000 t=0.100019, count=01 *
f=800000 t=0.099284, count=01 *
f=800000 t=0.100584, count=01 *
f=800000 t=0.100089, count=01 *
f=800000 t=0.099623, count=01 *
f=720000 t=1.854099, count=18 ******************   <== 17 misses
f=720000 t=0.046591, count=01 *
f=720000 t=0.099038, count=01 *
f=720000 t=0.100744, count=01 *
f=720000 t=0.099240, count=01 *
f=720000 t=0.100029, count=01 *
f=720000 t=0.099985, count=01 *
f=720000 t=0.100007, count=01 *
f=800000 t=2.715434, count=27 ***************************  <== 26 misses
f=800000 t=0.085148, count=01 *
f=800000 t=0.099992, count=01 *
f=800000 t=0.099648, count=01 *
f=800000 t=0.100367, count=01 *
f=800000 t=0.099406, count=01 *
f=800000 t=0.099984, count=01 *
f=720000 t=2.446585, count=24 ************************  <== 23 misses
f=720000 t=0.054219, count=01 *
f=720000 t=0.099947, count=01 *
f=720000 t=0.099284, count=01 *
f=720000 t=0.100721, count=01 *
f=720000 t=0.099975, count=01 *
f=720000 t=0.100089, count=01 *
f=800000 t=2.391552, count=23 ***********************  <== 22 misses
f=800000 t=0.015058, count=01 *
f=800000 t=0.092592, count=01 *
f=800000 t=0.100651, count=01 *
f=800000 t=0.099982, count=01 *
f=800000 t=0.099967, count=01 *
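In this output, count is the number of timer expirations reported for one wakeup, so the number of missed wakeups is count - 1, and f is the value of scaling_cur_freq, which sysfs reports in kHz. As a small sketch for sifting through longer logs, the following Python helpers (the names parse_line and summarize_gaps are made up for this illustration) pick out the lines where wakeups were missed:

```python
import re

# Matches lines such as "f=800000 t=0.418505, count=04 ****"
LINE_RE = re.compile(r"f=(\d+)\s+t=([\d.]+),\s*count=(\d+)")


def parse_line(line):
    """Return (freq_khz, elapsed_s, count) for one test.py line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2)), int(m.group(3))


def summarize_gaps(lines):
    """Yield (freq_khz, elapsed_s, missed) for every line with missed wakeups."""
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        freq, elapsed, count = parsed
        if count > 1:
            yield freq, elapsed, count - 1
```

Feeding it the log above would list, for example, the 0.418505 s wakeup at 800000 kHz with 3 missed expirations.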

I tried this answer, which suggests setting the priority of my process, but it had no effect.
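The question does not say which priority mechanism was tried. Purely as an illustration of one common approach, and as an assumption about what such an attempt might look like, the sketch below asks the kernel for the SCHED_FIFO real-time policy. The function name try_realtime_priority and the priority value 10 are hypothetical choices. The call normally needs root or CAP_SYS_NICE, so the helper reports failure instead of raising.

```python
import os


def try_realtime_priority(priority=10):
    """Try to switch the calling process to SCHED_FIFO; return True on success."""
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically a PermissionError without root or CAP_SYS_NICE.
        return False
    return True
```

Given that raising the priority made no difference here, this is best read as a way to rule out ordinary scheduling pressure rather than as a fix.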

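Here are my conclusions so far:

The problem is not caused by my C program, because I can reproduce it with a Python script
CPU performance is not the problem, because fixing the frequency at 300 MHz works fine
The process generating the heavy load has to meet certain requirements (see below); doing only network IO or only CPU-intensive work does not trigger the problem
Timer gaps only appear when the gpg process is fed certain data

So my question is: I need an accurate timer with an interval of about 10 ms (a few ms of jitter would be fine). Can I achieve this with timerfd? What are my options?

The kernel version in use is 4.4.19 (OpenEmbedded / Yocto).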

Reproducing

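Currently I know of no way to reproduce the described behavior other than the following:

On an embedded device with network access:

Have nginx installed with a proxy_pass from port 80 to some other port, e.g. 8081
Run receive.py on the device; it listens for POST requests, receives a large file and passes it to GnuPG
Run test.py on the device to observe the CPU frequency and the timer accuracy
Set the CPU governor to ondemand: echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
On another machine, use upload.py to send a 10M file with random content to the embedded device
The content of the uploaded data seems to matter! upload.py <ip/hostname> 10000000 generates a random byte stream and stores it in a file called data-out before POSTing it. In most cases you will not see timer gaps; if you can observe them, keep the file and reuse it later
Running upload.py from the embedded device itself (no network) or leaving out nginx will not work!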
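When repeating these steps many times, it can help to switch and verify the governor from a script instead of typing the echo command. The following is a minimal Python sketch under the assumption that the sysfs directory named above exists and is writable (normally this requires root). The helper names read_cpufreq and set_governor are made up for this illustration.

```python
import os

# Directory taken from the reproduction steps; cpu0 is the only core on this board.
CPUFREQ_DIR = "/sys/devices/system/cpu/cpu0/cpufreq"


def read_cpufreq(name, base=CPUFREQ_DIR):
    """Return the stripped contents of one cpufreq sysfs file."""
    with open(os.path.join(base, name)) as f:
        return f.read().strip()


def set_governor(governor, base=CPUFREQ_DIR):
    """Write the governor name, like `echo ondemand > .../scaling_governor`."""
    with open(os.path.join(base, "scaling_governor"), "w") as f:
        f.write(governor + "\n")
    # Read it back so the caller can see what the kernel reports afterwards.
    return read_cpufreq("scaling_governor", base)
```

For example, set_governor("ondemand") followed by read_cpufreq("scaling_cur_freq") shows the active governor and the current frequency in kHz. The base parameter exists only so the helpers can be pointed at a different directory for testing.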

Files

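Here is the modified version of test.py that produces the output above:

# Note: `async` became a reserved keyword in Python 3.7 and asyncore was
# removed in Python 3.12, so this script needs an older Python 3 interpreter.
import asyncore, time, timerfd.async

class TestDispatcher(timerfd.async.dispatcher):
    def __init__(self, *args):
        super().__init__(*args)
        self._last_t = time.time()

    def handle_expire(self, count):
        t = time.time()
        # current CPU frequency in kHz
        with open('/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq') as freq_file:
            f = freq_file.readline().strip('\n')
        # one '*' per expiration reported for this wakeup
        print("f=%s t=%.6f, count=%0.2d %s" % (f, t - self._last_t, count, '*' * count))
        self._last_t = t

dispatcher = TestDispatcher(timerfd.CLOCK_MONOTONIC)
dispatcher.settime(0, timerfd.itimerspec(0.1, 1))
asyncore.loop()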
receive.py

import subprocess, http.server, socketserver

class InstallationHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # decrypt the upload with gpg and pipe the result into tar
        gpg_process = subprocess.Popen(
            ['gpg', '--homedir', '/home/root/.gnupg', '-u', 'Name', '-d'],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        tar_process = subprocess.Popen(
            ['tar', '-C', '.', '-xzf', '-'],
            stdin=gpg_process.stdout, stderr=subprocess.PIPE)
        content_length = int(self.headers['content-length'])
        # forward the request body to gpg in chunks of at most 1000 bytes
        while content_length > 0:
            content_length -= gpg_process.stdin.write(
                self.rfile.read(min(1000, content_length)))
        gpg_process.stdin.close()
        self.send_response(201)
        self.end_headers()

socketserver.TCPServer.allow_reuse_address = True
socketserver.TCPServer(('', 8081), InstallationHandler).serve_forever()
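upload.py - provide either the name of a file to upload or a number of bytes to generate

import http.client, sys, os

if os.path.exists(sys.argv[2]):
    # second argument names an existing file: upload its contents
    print('read.. %r' % sys.argv[2])
    with open(sys.argv[2], 'rb') as in_file:
        b = in_file.read()
else:
    # otherwise treat it as a byte count and keep a copy in data-out
    print('generate random data..')
    b = os.urandom(int(sys.argv[2]))
    with open('data-out', 'wb') as out_file:
        out_file.write(b)
b = bytes(b)
print('size=%d' % len(b))
h = http.client.HTTPConnection(sys.argv[1])
h.request('POST', '/upload/calibration_data', b)
print(h.getresponse().read())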

1 Answer

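A preliminary answer. Let's assume you don't want to disable cpufreq or make any other invasive kernel configuration change that would alter power consumption.

Let me also assume the jitter does not come from some odd interaction between the CPU clock and the timer clock, which would be hard to eliminate.

And let's assume you are willing to hack your way around the problem a little. In that case... use your own hardware timer!

ARM SoCs usually have many hardware timers, and Linux usually consumes only two of them: one to provide timers (i.e. timerfd and the other timer interfaces) and one for timekeeping. This means you usually have several hardware timers that are free and available.

Unfortunately, Linux does not provide any framework or interface for using them, so you have to roll your own. For example, here is an example for the AR9331, a MIPS SoC.

Doing the same for an ARM SoC is just a matter of reading the datasheet, checking the registers and adapting that example, or coming up with your own solution.

The jitter will be much lower, because this is a hardware timer that generates interrupts and is therefore not affected by regular load.

If you want even less jitter, you can try fast interrupts (FIQ). Bootlin (formerly Free Electrons) explains this neat trick on their blog.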

Answered on 2024-04-20T10:48:07+08:00
