Kernel crash at "cpufreq_governor_dbs+0x397" or "__cpufreq_governor+0x2b"

Updated March 28, 2014 at 11:32

Issue

  • The server has multiple warnings logged in the system log as follows:
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xb8/0xd0() (Tainted: G        W  ---------------   )
Hardware name: ProLiant DL360p Gen8
sysfs: cannot create duplicate filename '/devices/system/cpu/cpu9/cpufreq/ondemand'
Modules linked in: ... cpufreq_ondemand freq_table pcc_cpufreq ... [last unloaded: freq_table]
Pid: 47469, comm: cpuspeed Tainted: G        W  ---------------    2.6.32-358.el6.x86_64 #1
Call Trace:
 [<ffffffff8106e2e7>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106e3d6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff811fb898>] ? sysfs_add_one+0xb8/0xd0
 [<ffffffff811fb918>] ? create_dir+0x68/0xb0
 [<ffffffff811fb9cb>] ? sysfs_create_subdir+0x1b/0x20
 [<ffffffff811fcd18>] ? internal_create_group+0x58/0x1a0
 [<ffffffff811fce93>] ? sysfs_create_group+0x13/0x20
 [<ffffffffa0224a13>] ? cpufreq_governor_dbs+0x163/0x470 [cpufreq_ondemand]
 [<ffffffff8112a281>] ? get_page_from_freelist+0x3d1/0x830
 [<ffffffff81414857>] ? cpufreq_governor_userspace+0x2f7/0x330
 [<ffffffff814131a9>] ? __cpufreq_governor+0xb9/0x180
 [<ffffffff8141343f>] ? __cpufreq_set_policy+0x1cf/0x250
 [<ffffffff81413954>] ? store_scaling_governor+0xe4/0x210
 [<ffffffff814135f0>] ? handle_update+0x0/0x40
 [<ffffffff8127923a>] ? kobject_get+0x1a/0x30
 [<ffffffff8150f200>] ? hrtimer_nanosleep_restart+0x20/0x90
 [<ffffffff81412a27>] ? store+0x67/0xa0
 [<ffffffff811f97c5>] ? sysfs_write_file+0xe5/0x170
 [<ffffffff81180f98>] ? vfs_write+0xb8/0x1a0
 [<ffffffff81181891>] ? sys_write+0x51/0x90
 [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
---[ end trace 448bc8fe604185a3 ]---
  • The system may crash with a "divide error" at "cpufreq_governor_dbs+0x397/0x470", with the following call trace in the crash dump:
divide error: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
Modules linked in: ... cpufreq_ondemand freq_table pcc_cpufreq ... [last unloaded: freq_table]
Pid: 48308, comm: cpuspeed Tainted: G        W  ---------------    2.6.32-358.el6.x86_64 #1 HP ProLiant DL360p Gen8
RIP: 0010:[<ffffffffa037dc47>]  [<ffffffffa037dc47>] cpufreq_governor_dbs+0x397/0x470 [cpufreq_ondemand]
RSP: 0018:ffff883f4eab5be8  EFLAGS: 00010246
RAX: 000000010017c0dc RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880099837548
RBP: ffff883f4eab5c88 R08: ffff882012a3f008 R09: 0000000000000040
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
R13: ffff880099837500 R14: ffff880099837500 R15: 0000000000011400
FS:  00007fdf271f6700(0000) GS:ffff880099840000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000003dd6300040 CR3: 0000003f4d95b000 CR4: 00000000000427e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cpuspeed (pid: 48308, threadinfo ffff883f4eab4000, task ffff88400ad87500)
Stack:
 ffff883f4eab5d08 ffffffff8112a281 ffff883f4eab5c18 0000000000000001
<d> 0000000000000002 000000006b4b36de 0000000000000001 ffff880099837500
<d> 0000000000011400 00000000637742f6 0000000000017500 ffff880099837510
Call Trace:
 [<ffffffff8112a281>] ? get_page_from_freelist+0x3d1/0x830
 [<ffffffff814131a9>] __cpufreq_governor+0xb9/0x180
 [<ffffffff8141343f>] __cpufreq_set_policy+0x1cf/0x250
 [<ffffffff81413954>] store_scaling_governor+0xe4/0x210
 [<ffffffff814135f0>] ? handle_update+0x0/0x40
 [<ffffffff8127923a>] ? kobject_get+0x1a/0x30
 [<ffffffff8150f200>] ? hrtimer_nanosleep_restart+0x20/0x90
 [<ffffffff81412a27>] store+0x67/0xa0
 [<ffffffff811f97c5>] sysfs_write_file+0xe5/0x170
 ...skip...
  • Alternatively, the system may crash with "unable to handle kernel paging request" at "__cpufreq_governor+0x2b/0x180", with the following call trace in the crash dump:
BUG: unable to handle kernel paging request at ffffffffa044a788
IP: [<ffffffff8141311b>] __cpufreq_governor+0x2b/0x180
PGD 1a87067 PUD 1a8b063 PMD 40118fa067 PTE 0
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Modules linked in: ... cpufreq_ondemand freq_table pcc_cpufreq ... [last unloaded: freq_table]
Pid: 29442, comm: cpuspeed Tainted: G        W  ---------------    2.6.32-358.el6.x86_64 #1 HP ProLiant DL360p Gen8
RIP: 0010:[<ffffffff8141311b>]  [<ffffffff8141311b>] __cpufreq_governor+0x2b/0x180
RSP: 0018:ffff881f67e1dc98  EFLAGS: 00010282
RAX: ffff8800998137e0 RBX: ffff884010f50180 RCX: 0000000000124f80
RDX: ffffffff00000001 RSI: ffffffffa044a760 RDI: ffff884010f50180
RBP: ffff881f67e1dcd8 R08: 0000000000249f00 R09: 0000000000249f00
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
R13: 0000000000000000 R14: ffffffffa044a760 R15: ffff881f67e1de08
FS:  00007fbf9e99b700(0000) GS:ffff880099820000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa044a788 CR3: 0000003f6fac1000 CR4: 00000000000427e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cpuspeed (pid: 29442, threadinfo ffff881f67e1c000, task ffff881f56439540)
Stack:
 0000000000000009 0000000000124f80 0000000000249f00 ffff881f67e1dd28
<d> ffff884010f50180 0000000000000000 ffffffffa044a760 ffff881f67e1de08
<d> ffff881f67e1dd08 ffffffff8141341f ffff884010f50180 ffff881f67e1dd28
Call Trace:
 [<ffffffff8141341f>] __cpufreq_set_policy+0x1af/0x250
 [<ffffffff81413954>] store_scaling_governor+0xe4/0x210
 [<ffffffff814135f0>] ? handle_update+0x0/0x40
 [<ffffffff8127923a>] ? kobject_get+0x1a/0x30
 [<ffffffff8150f200>] ? hrtimer_nanosleep_restart+0x20/0x90
 [<ffffffff81412a27>] store+0x67/0xa0
 [<ffffffff811f97c5>] sysfs_write_file+0xe5/0x170
 ...skip...

Environment

  • Red Hat Enterprise Linux 6

Resolution

  • This issue was tracked in the private Red Hat Bugzilla bugs 910617 and 896083. The current status is CLOSED WONTFIX, because the fix would require major changes to the module loading code.

  • A workaround is available: configure the system to use the "performance" frequency scaling governor instead of "ondemand". To do this, enable the cpuspeed service and set the following line in /etc/sysconfig/cpuspeed:

GOVERNOR=performance
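
The workaround above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the helper names set_cpuspeed_governor and show_governors are hypothetical, and the sketch assumes the stock RHEL 6 tools (chkconfig, service) and the standard cpufreq sysfs layout seen in the traces above.

```shell
#!/bin/sh
# Hypothetical helper (not part of cpuspeed): set GOVERNOR=<value> in a
# cpuspeed-style config file. An existing GOVERNOR= line is rewritten;
# if none exists, one is appended.
set_cpuspeed_governor() {
    conf="$1"
    gov="$2"
    if grep -q '^[[:space:]]*GOVERNOR=' "$conf"; then
        tmpf=$(mktemp) || return 1
        # Write through a temp file, then copy back so the original
        # file keeps its ownership and permissions.
        sed "s/^[[:space:]]*GOVERNOR=.*/GOVERNOR=$gov/" "$conf" > "$tmpf" &&
            cat "$tmpf" > "$conf"
        rm -f "$tmpf"
    else
        printf 'GOVERNOR=%s\n' "$gov" >> "$conf"
    fi
}

# Hypothetical helper: print the governor each CPU currently uses.
# The optional argument is the sysfs root (defaults to /sys).
show_governors() {
    root="${1:-/sys}"
    for f in "$root"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        cpu=${f%/cpufreq/scaling_governor}
        cpu=${cpu##*/}
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}

# On the affected system, as root, the steps would look something like:
#   set_cpuspeed_governor /etc/sysconfig/cpuspeed performance
#   chkconfig cpuspeed on
#   service cpuspeed restart
#   show_governors
```

After the service restarts, every CPU listed by show_governors should report "performance"; any CPU still showing "ondemand" suggests the setting did not take effect.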

Diagnostic Steps

  • Check the system log for the warnings shown in the "Issue" section.

  • Obtain a crash dump of the server and check for the call trace shown in the "Issue" section.
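
The log check above can be approximated with a single grep. This is a sketch under stated assumptions: the function name scan_cpufreq_signatures is hypothetical, the patterns are derived only from the traces shown in the "Issue" section, and /var/log/messages is assumed to be where kernel messages are logged.

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) log lines that match
# the signatures from the "Issue" section:
#   - the duplicate sysfs 'ondemand' directory warning
#   - a RIP/IP line pointing into cpufreq_governor_dbs or __cpufreq_governor
scan_cpufreq_signatures() {
    log="$1"
    grep -n -E "sysfs: cannot create duplicate filename '/devices/system/cpu/cpu[0-9]+/cpufreq/ondemand'|IP:.*(cpufreq_governor_dbs|__cpufreq_governor)\+0x" "$log"
}

# On the affected system, something like:
#   scan_cpufreq_signatures /var/log/messages
#
# For a crash dump, the same signatures can be looked for in the output
# of the "log" and "bt" commands of the crash utility, assuming the
# matching kernel-debuginfo package is installed.
```

A match on the warning alone indicates the precondition was hit; a match on a RIP or IP line corresponds to one of the two crash signatures.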
