Summary: BUG: scheduling while atomic: events/0/4/


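For the Linux kernel, an Oops means that the kernel has hit an exception. The kernel prints the CPU state at the time of the exception, the address of the faulting instruction, the data address and other registers, the function call chain, and even the contents of the stack. It then decides what to do next based on how severe the exception is: kill the process that caused it, or halt the system.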

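The most typical exception is a reference to an invalid address in kernel mode, usually through an uninitialized (wild) pointer or a NULL pointer. This causes a page fault and ultimately triggers an Oops.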

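Linux is robust enough to react sensibly to most exceptions. An exception usually kills the current process while the rest of the system keeps running, but the system is then in an unstable state and may fail at any time. Exceptions in interrupt context, or corruption of critical system resources, usually hang the kernel so that it no longer responds to any event.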

2 Kernel exception levels
2.1 Bug
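A Bug is a condition that violates the kernel's intended design but that the kernel can detect, and that usually has no immediate effect on system operation, such as sleeping in atomic context. For example: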
BUG: scheduling while atomic: insmod/826/0x00000002
Call Trace:
[ef12f700] [c00081e0] show_stack+0x3c/0x194 (unreliable)
[ef12f730] [c0019b2c] __schedule_bug+0x64/0x78
[ef12f750] [c0350f50] schedule+0x324/0x34c
[ef12f7a0] [c03515c0] schedule_timeout+0x68/0xe4
[ef12f7e0] [c027938c] fsl_elbc_run_command+0x138/0x1c0
[ef12f820] [c0275820] nand_do_read_ops+0x130/0x3dc
[ef12f880] [c0275ebc] nand_read+0xac/0xe0
[ef12f8b0] [c0262d98] part_read+0x5c/0xe4
[ef12f8c0] [c017bcac] jffs2_flash_read+0x68/0x254
[ef12f8f0] [c0170550] jffs2_read_dnode+0x60/0x304
[ef12f940] [c017088c] jffs2_read_inode_range+0x98/0x180
[ef12f970] [c016e610] jffs2_do_readpage_nolock+0x94/0x1ac
[ef12f990] [c016ee04] jffs2_write_begin+0x2b0/0x330
[ef12fa10] [c005144c] generic_file_buffered_write+0x11c/0x8d0
[ef12fab0] [c0051e48] __generic_file_aio_write_nolock+0x248/0x500
[ef12fb20] [c0052168] generic_file_aio_write+0x68/0x10c
[ef12fb50] [c007ca80] do_sync_write+0xc4/0x138
[ef12fc10] [f107c0dc] oops_log+0xdc/0x1e8 [oopslog]
[ef12fe70] [f3087058] oops_log_init+0x58/0xa0 [oopslog]
[ef12fe80] [c00477bc] sys_init_module+0x130/0x17dc
[ef12ff40] [c00104b0] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff29658
    LR = 0x10031300
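The last part of the BUG line has the form comm/pid/preempt_count, so the hexadecimal field shows what kind of atomic context the task was in. The following is a minimal user-space sketch that splits that field and decodes the counter. The bit layout (8 preemption bits, 8 softirq bits, hardirq bits starting at bit 16) is an assumption based on 2.6.32-era kernels and differs in other versions; parse_sched_bug, print_sched_bug and struct sched_bug are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed preempt_count layout for 2.6.32-era kernels; other versions differ. */
#define PREEMPT_MASK  0x000000fful
#define SOFTIRQ_MASK  0x0000ff00ul
#define HARDIRQ_MASK  0x03ff0000ul
#define SOFTIRQ_SHIFT 8
#define HARDIRQ_SHIFT 16

/* Hypothetical container for the decoded fields of one BUG line. */
struct sched_bug {
    char comm[32];               /* task name, may itself contain '/' */
    long pid;
    unsigned long count;         /* raw preempt_count value */
    unsigned long preempt_depth; /* preemption-disable nesting */
    unsigned long softirq_depth; /* softirq / bottom-half nesting */
    unsigned long hardirq_depth; /* hard interrupt nesting */
};

/* Parse "comm/pid/0xcount". Returns 0 on success, -1 on malformed input. */
int parse_sched_bug(const char *field, struct sched_bug *out)
{
    char buf[64];
    char *count_sep, *pid_sep;

    assert(field != NULL && out != NULL);
    if (strlen(field) >= sizeof(buf))
        return -1;
    strcpy(buf, field);

    /* Split from the right, because comm can contain '/' ("events/0"). */
    count_sep = strrchr(buf, '/');
    if (count_sep == NULL)
        return -1;
    *count_sep = '\0';

    pid_sep = strrchr(buf, '/');
    if (pid_sep == NULL)
        return -1;
    *pid_sep = '\0';

    if (strlen(buf) >= sizeof(out->comm))
        return -1;
    strcpy(out->comm, buf);

    out->pid = strtol(pid_sep + 1, NULL, 10);
    out->count = strtoul(count_sep + 1, NULL, 16); /* accepts the 0x prefix */
    out->preempt_depth = out->count & PREEMPT_MASK;
    out->softirq_depth = (out->count & SOFTIRQ_MASK) >> SOFTIRQ_SHIFT;
    out->hardirq_depth = (out->count & HARDIRQ_MASK) >> HARDIRQ_SHIFT;
    return 0;
}

/* Print a one-line summary of the decoded context. */
void print_sched_bug(const struct sched_bug *b)
{
    printf("%s (pid %ld): hardirq=%lu softirq=%lu preempt=%lu\n",
           b->comm, b->pid, b->hardirq_depth, b->softirq_depth,
           b->preempt_depth);
}
```

Under these assumptions, insmod/826/0x00000002 decodes to a preemption depth of 2 with no interrupt nesting, which suggests process context with preemption disabled. The value 0x00010004 from the IC card report later in this article decodes to a hardirq depth of 1 and a preemption depth of 4, which suggests that schedule() was reached from inside a hard interrupt handler.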

2.2 Oops
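When code running in kernel mode hits an exceptional condition, such as a data exception caused by an invalid pointer or an instruction fetch exception caused by an out-of-bounds array access, the exception handling mechanism catches it and prints key system information to the serial console. Under normal circumstances the Oops message is also recorded in the system log.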

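When an Oops occurs, the process is in kernel mode. It may well be accessing critical system resources and holding locks. When the process is killed because of the Oops, it cannot release the resources it holds, so other processes that need them hang, which disrupts normal operation. In this situation the system is usually unstable and may well crash.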

2.3 Panic
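When an Oops occurs in interrupt context, or in process 0 or process 1, the system hangs completely, because an interrupt service routine cannot be recovered once it has faulted. This is called a kernel panic. In addition, if the panic flag is set (such as the panic_on_oops setting), any Oops causes a kernel panic, whether it occurs in interrupt context or in process context. After a panic in an interrupt service routine the system no longer schedules and syslogd no longer runs, so in this case the Oops message is only printed to the serial console and is not recorded in the system log.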
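The rules above can be sketched as a small decision function. This is a user-space model of the logic as described in this section, not the kernel's own code; struct oops_ctx, enum oops_outcome and classify_oops are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the context in which an Oops occurred. */
struct oops_ctx {
    bool in_interrupt;  /* Oops raised in interrupt context */
    int pid;            /* pid of the current task */
    bool panic_on_oops; /* the "panic flag" mentioned in the text */
};

enum oops_outcome {
    OOPS_KILL_TASK, /* kill the offending task, system keeps running */
    OOPS_PANIC      /* kernel panic, system stops */
};

/* Model of the escalation rules described in section 2.3. */
enum oops_outcome classify_oops(const struct oops_ctx *ctx)
{
    assert(ctx != NULL);

    /* A faulting interrupt handler cannot be recovered. */
    if (ctx->in_interrupt)
        return OOPS_PANIC;

    /* With the panic flag set, every Oops escalates. */
    if (ctx->panic_on_oops)
        return OOPS_PANIC;

    /* Process 0 (idle) and process 1 (init) must never die. */
    if (ctx->pid == 0 || ctx->pid == 1)
        return OOPS_PANIC;

    return OOPS_KILL_TASK;
}
```

For example, an Oops in an ordinary process such as insmod with pid 826 is classified as OOPS_KILL_TASK, while the same Oops with in_interrupt set is classified as OOPS_PANIC.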

 

 

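While debugging an IC card driver, repeatedly inserting and removing the card triggered a "BUG: scheduling while atomic: events/0/4/0x00010004" exception that crashed the system. The details are as follows: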

[ 1947.900000] BUG: scheduling while atomic: events/0/4/0x00010004
<4>[ 1947.900000] @@@@cardslot_iso7816_uart_interrupt,line:1384
<3>[ 1947.900000] BUG: scheduling while atomic: events/0/4/0x00010004
<4>[ 1947.900000] Modules linked in: iccard iso7816_uart bcm589x_pm(P) bar_scanner bcm589x_ped idtechencmag magstripe cx930xx modem slnsp ftp101 printer bcm589x_spi touch_screen bcm5892_adc_driver matrix_keys beeper leds fusion bcm589x_otg bcm589x_dwccom [last unloaded: iso7816_uart]
<4>[ 1947.900000]
<4>[ 1947.900000] Pid: 4, comm:             events/0
<4>[ 1947.900000] CPU: 0    Tainted: P            (2.6.32.9-bcm5892 #8)
<4>[ 1947.900000] PC is at memcpy+0x16c/0x330
<4>[ 1947.900000] LR is at 0xffffff
<4>[ 1947.900000] pc : [<c017c5cc>]    lr : [<00ffffff>]    psr: 20000013
<4>[ 1947.900000] sp : c3823ecc  ip : 00ffffff  fp : c3823f60
<4>[ 1947.900000] r10: 00000000  r9 : 00ffffff  r8 : 00ffffff
<4>[ 1947.900000] r7 : 00ffffff  r6 : 00ffffff  r5 : ff000000  r4 : 0000ffff
<4>[ 1947.900000] r3 : ff00ffff  r2 : 0000003f  r1 : c483ad44  r0 : c4854d40
<4>[ 1947.900000] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
<4>[ 1947.900000] Control: 00c5387d  Table: 43b14008  DAC: 00000017
<4>[ 1947.900000] [<c0032fb4>] (show_regs+0x0/0x4c) from [<c004d85c>] (__schedule_bug+0x48/0x5c)
<4>[ 1947.900000]  r4:c3823e84
<4>[ 1947.900000] [<c004d814>] (__schedule_bug+0x0/0x5c) from [<c032b734>] (schedule+0x78/0x45c)
<4>[ 1947.900000]  r4:c3823ce8
<4>[ 1947.900000] [<c032b6bc>] (schedule+0x0/0x45c) from [<c032c138>] (schedule_timeout+0x20/0x1dc)
<4>[ 1947.900000] [<c032c118>] (schedule_timeout+0x0/0x1dc) from [<c032bfa0>] (wait_for_common+0xf8/0x1b4)
<4>[ 1947.900000]  r6:c3822000 r5:c3815340 r4:c3823ce8
<4>[ 1947.900000] [<c032bea8>] (wait_for_common+0x0/0x1b4) from [<c032c0ec>] (wait_for_completion+0x18/0x1c)
<4>[ 1947.900000] [<c032c0d4>] (wait_for_completion+0x0/0x1c) from [<c0065ba4>] (__cancel_work_timer+0x140/0x194)
<4>[ 1947.900000] [<c0065a64>] (__cancel_work_timer+0x0/0x194) from [<c0065c0c>] (cancel_delayed_work_sync+0x14/0x18)
<4>[ 1947.900000] [<c0065bf8>] (cancel_delayed_work_sync+0x0/0x18) from [<c0198044>] (fb_deferred_io_fsync_delay+0x1c/0x2c)
<4>[ 1947.900000] [<c0198028>] (fb_deferred_io_fsync_delay+0x0/0x2c) from [<c01999dc>] (new8110_lcd_icon_control+0xb0/0xbc)
<4>[ 1947.900000]  r5:c38daa80 r4:00008040
<4>[ 1947.900000] [<c019992c>] (new8110_lcd_icon_control+0x0/0xbc) from [<c003ad64>] (lcd_icon_set+0x20/0x28)
<4>[ 1947.900000]  r7:bf08e6b0 r6:000000ff r5:c3bee70c r4:c3b0aa1c
<4>[ 1947.900000] [<c003ad44>] (lcd_icon_set+0x0/0x28) from [<c023d8c0>] (led_trigger_event+0x64/0xa0)
<4>[ 1947.900000] [<c023d85c>] (led_trigger_event+0x0/0xa0) from [<bf088520>] (card_insert_interrupt+0x70/0x98 [iccard])
<4>[ 1947.900000]  r6:bf08cacc r5:c041cf4c r4:00000040
<4>[ 1947.900000] [<bf0884b0>] (card_insert_interrupt+0x0/0x98 [iccard]) from [<bf084b10>] (cardslot_iso7816_uart_interrupt+0xd4/0xb68 [iccard])
<4>[ 1947.900000]  r7:0000002e r6:c041cf4c r5:0000002e r4:bf08cacc
<4>[ 1947.900000] [<bf084a3c>] (cardslot_iso7816_uart_interrupt+0x0/0xb68 [iccard]) from [<c007e76c>] (handle_IRQ_event+0x3c/0xfc)
<4>[ 1947.900000]  r6:00000000 r5:00000000 r4:c3932da0
<4>[ 1947.900000] [<c007e730>] (handle_IRQ_event+0x0/0xfc) from [<c0080734>] (handle_level_irq+0xcc/0x168)
<4>[ 1947.900000]  r7:00000002 r6:0000002e r5:c3932da0 r4:c0420810
<4>[ 1947.900000] [<c0080668>] (handle_level_irq+0x0/0x168) from [<c0031070>] (asm_do_IRQ+0x70/0x8c)
<4>[ 1947.900000]  r6:00004000 r5:00000000 r4:0000002e
<4>[ 1947.900000] [<c0031000>] (asm_do_IRQ+0x0/0x8c) from [<c0031b78>] (__irq_svc+0x38/0xd4)
<4>[ 1947.900000] Exception stack(0xc3823e84 to 0xc3823ecc)
<4>[ 1947.900000] 3e80:          c4854d40 c483ad44 0000003f ff00ffff 0000ffff ff000000 00ffffff
<4>[ 1947.900000] 3ea0: 00ffffff 00ffffff 00ffffff 00000000 c3823f60 00ffffff c3823ecc 00ffffff
<4>[ 1947.900000] 3ec0: c017c5cc 20000013 ffffffff
<4>[ 1947.900000]  r5:d102a000 r4:ffffffff
<4>[ 1947.900000] [<c0198f84>] (new8110fb_deferred_io+0x0/0x5ac) from [<c0197f40>] (fb_deferred_io_work+0x90/0xe0)
<4>[ 1947.900000] [<c0197eb0>] (fb_deferred_io_work+0x0/0xe0) from [<c00657b8>] (worker_thread+0x178/0x22c)
<4>[ 1947.900000]  r8:c0197eb0 r7:c3801900 r6:c38daa1c r5:c3822000 r4:c38daa20
<4>[ 1947.900000] [<c0065640>] (worker_thread+0x0/0x22c) from [<c00692b8>] (kthread+0x84/0x8c)
<4>[ 1947.900000] [<c0069234>] (kthread+0x0/0x8c) from [<c0056b80>] (do_exit+0x0/0x62c)
<4>[ 1947.900000]  r7:00000000 r6:00000000 r5:00000000 r4:00000000

 

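After repeated debugging, the problem was traced to the card insert/remove interrupt handler. The handler called led_trigger_event(card_insert_led_trigger, LED_OFF). This function takes a lock and can intermittently deadlock or block. The call trace shows the path: from led_trigger_event() the code reaches cancel_delayed_work_sync(), which waits in wait_for_completion() and ends up in schedule(). Blocking functions must never be called in interrupt context, so the kernel reports the bug.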

Removing that call fixed the problem.
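The fix described here simply removes the call. If the LED or icon update is still needed, a common alternative in kernel drivers is to defer the blocking part to process context, for example with a work item queued from the interrupt handler through schedule_work(). The sketch below is only a user-space model of that pattern, not driver code: every function and variable in it is hypothetical, and the atomic check is a simplified stand-in for the kernel's own check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

int hardirq_depth;       /* models the hardirq part of preempt_count */
int sched_bug_count;     /* how many times the model reported the bug */
bool led_update_pending; /* models a queued work item */
int led_state = 1;       /* 1 = on, 0 = off */

/* Models an operation that may sleep, like the LCD icon update in the trace. */
void blocking_led_update(int on)
{
    if (hardirq_depth > 0) {
        /* Sleeping here is what the kernel reports as the BUG. */
        sched_bug_count++;
        fprintf(stderr, "model: scheduling while atomic\n");
        return;
    }
    led_state = on;
}

/* Original pattern: the interrupt handler calls the blocking code directly. */
void buggy_card_irq(void)
{
    hardirq_depth++;
    blocking_led_update(0);
    hardirq_depth--;
}

/* Deferred pattern: the handler only records that work is pending. */
void fixed_card_irq(void)
{
    hardirq_depth++;
    led_update_pending = true;
    hardirq_depth--;
}

/* Runs later in process context, where blocking is allowed. */
void run_pending_work(void)
{
    assert(hardirq_depth == 0);
    if (led_update_pending) {
        led_update_pending = false;
        blocking_led_update(0);
    }
}
```

In this model, buggy_card_irq() triggers the report and leaves the LED unchanged, while fixed_card_irq() followed by run_pending_work() turns the LED off without any report. The handler stays short and never blocks, and the slow work happens where sleeping is allowed.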

