Mx53 efikasb jd by PurpleAlien · Pull Request #1 · genesi/linux-shortbus

PurpleAlien · 2011-09-28T10:38:43Z

This is the mx53 EfikaSB support.

ARCH_MXC_AUDMUX_V2 selected for this to work.

Make sure we compile the correct NAND driver.

actually makes a difference.

Fix the whitespace issues. Kernel use is tabs, not spaces. Signed-off-by: Steev Klimaszewski <steev@genesi-usa.com>

BugLink: http://bugs.launchpad.net/bugs/857057 After adding support for S/PDIF audio output routing to HDMI, the SGTL5000 became card #1, which was no longer #0, thus making it no longer the default ALSA card. imx_sgtl5000_init() doesn't need to be late_initcall(), module_init() works just all right. Signed-off-by: Eric Miao <eric.miao@linaro.org>

Conflicts: sound/soc/imx/Makefile

…ro-natty into mx53_efikasb_jd

There were 2 duplicate includes in the board-mx53_efikasb.c file. Signed-off-by: Steev Klimaszewski <steev@genesi-usa.com>

Signed-off-by: Steev Klimaszewski <steev@genesi-usa.com>

@if

Important: @if (val & SDHCI_INT_CARD_INT) we enter the part which clears and then sets D3CD bit to avoid missing the card interrupt. This is a Freescale patch which in our case causes an interrupt storm and freezes the system. The original steps are: data = readl(host->ioaddr + SDHCI_HOST_CONTROL); data &= ~SDHCI_CTRL_D3CD; writel(data, host->ioaddr + SDHCI_HOST_CONTROL); data |= SDHCI_CTRL_D3CD; writel(data, host->ioaddr + SDHCI_HOST_CONTROL); We just do: data = readl(host->ioaddr + SDHCI_HOST_CONTROL); data |= SDHCI_CTRL_D3CD; writel(data, host->ioaddr + SDHCI_HOST_CONTROL); This needs to be investigated furhter.

This quirck takes care of SDIO devices not supporting 512 byte requests in byte mode during CMD53. http://comments.gmane.org/gmane.linux.kernel.mmc/10087

seems to conflict with audio. This is still under investigation.

Power and main driver are part of the architecture specific files. This can probably be done better with specific platform devices. Drivers for mouse, trackpad and rtc are in their respective directories.

There's a race window in xen_drop_mm_ref, where remote cpu may exit dirty bitmap between the check on this cpu and the point where remote cpu handles drop request. So in drop_other_mm_ref we need check whether TLB state is still lazy before calling into leave_mm. This bug is rarely observed in earlier kernel, but exaggerated by the commit 831d52b ("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm") which clears bitmap after changing the TLB state. the call trace is as below: --------------------------------- kernel BUG at arch/x86/mm/tlb.c:61! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb CPU 1 Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcs pkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table] Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285 RIP: e030:[<ffffffff8103a3cb>] [<ffffffff8103a3cb>] leave_mm+0x15/0x46 RSP: e02b:ffff88002805be48 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0 RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001 RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200 R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880 R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000 FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000 00 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40) Stack: ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88 <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78 <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108 Call Trace: <IRQ> [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46 [<ffffffff81013efe>] xen_do_hyper visor_callback+0x1e/0x30 <EOI> [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef [<ffffffff81042fcf>] ? need_resched+0x23/0x2d [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255 [<ffffffff81114362>] ? do_execve+0x1c3/0x29e [<ffffffff8101155d>] ? sys_execve+0x43/0x5d [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0 [<ffffffff 8106fc45>] ? __call_usermodehelper+0x0/0x6f [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e [<ffffffff81013daa>] ? child_rip+0xa/0x20 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6 [<ffffffff81013da0>] ? child_rip+0x0/0x20 Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8 RIP [<ffffffff8103a3cb>] leave_mm+0x15/0x46 RSP <ffff88002805be48> ---[ end trace ce9cee6832a9c503 ]--- Tested-by: Maoxiaoyun<tinnycloud@hotmail.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> [v1: Fleshed out the git description a bit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

@ap

ae01b24 (libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65) added ATA_FLAG_NO_DIPM and made ata_eh_set_lpm() check the flag. However, @ap is NULL if @link points to a PMP link and thus the unconditional @ap->flags dereference leads to the following oops. BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff813f98e1>] ata_eh_recover+0x9a1/0x1510 ... Pid: 295, comm: scsi_eh_4 Tainted: P 2.6.38.5-core2 #1 System76, Inc. Serval Professional/Serval Professional RIP: 0010:[<ffffffff813f98e1>] [<ffffffff813f98e1>] ata_eh_recover+0x9a1/0x1510 RSP: 0018:ffff880132defbf0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880132f40000 RCX: 0000000000000000 RDX: ffff88013377c000 RSI: ffff880132f40000 RDI: 0000000000000000 RBP: ffff880132defce0 R08: ffff88013377dc58 R09: ffff880132defd98 R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000 R13: 0000000000000000 R14: ffff88013377c000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8800bf700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000018 CR3: 0000000001a03000 CR4: 00000000000406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process scsi_eh_4 (pid: 295, threadinfo ffff880132dee000, task ffff880133b416c0) Stack: 0000000000000000 ffff880132defcc0 0000000000000000 ffff880132f42738 ffffffff813ee8f0 ffffffff813eefe0 ffff880132defd98 ffff88013377f190 ffffffffa00b3e30 ffffffff813ef030 0000000032defc60 ffff880100000000 Call Trace: [<ffffffff81400867>] sata_pmp_error_handler+0x607/0xc30 [<ffffffffa00b273f>] ahci_error_handler+0x1f/0x70 [libahci] [<ffffffff813faade>] ata_scsi_error+0x5be/0x900 [<ffffffff813cf724>] scsi_error_handler+0x124/0x650 [<ffffffff810834b6>] kthread+0x96/0xa0 [<ffffffff8100cd64>] kernel_thread_helper+0x4/0x10 Code: 8b 95 70 ff ff ff b8 00 00 00 00 48 3b 9a 10 2e 00 00 48 0f 44 c2 48 89 85 70 ff ff ff 48 8b 8d 70 ff ff ff f6 83 69 02 00 00 01 <48> 8b 41 18 0f 85 48 01 00 00 48 85 c9 74 12 48 8b 51 08 48 83 RIP [<ffffffff813f98e1>] ata_eh_recover+0x9a1/0x1510 RSP <ffff880132defbf0> CR2: 0000000000000018 Fix it by testing @Link->ap->flags instead. stable: ATA_FLAG_NO_DIPM was added during 2.6.39 cycle but was backported to 2.6.37 and 38. This is a fix for that and thus also applicable to 2.6.37 and 38. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: "Nathan A. Mourey II" <nmoureyii@ne.rr.com> LKML-Reference: <1304555277.2059.2.camel@localhost.localdomain> Cc: Connor H <cmdkhh@gmail.com> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@pobox.com>

…ict() During the sctp_close() call, we do not use rcu primitives to destroy the address list attached to the endpoint. At the same time, we do the removal of addresses from this list before attempting to remove the socket from the port hash As a result, it is possible for another process to find the socket in the port hash that is in the process of being closed. It then proceeds to traverse the address list to find the conflict, only to have that address list suddenly disappear without rcu() critical section. Fix issue by closing address list removal inside RCU critical section. Race can result in a kernel crash with general protection fault or kernel NULL pointer dereference: kernel: general protection fault: 0000 [#1] SMP kernel: RIP: 0010:[<ffffffffa02f3dde>] [<ffffffffa02f3dde>] sctp_bind_addr_conflict+0x64/0x82 [sctp] kernel: Call Trace: kernel: [<ffffffffa02f415f>] ? sctp_get_port_local+0x17b/0x2a3 [sctp] kernel: [<ffffffffa02f3d45>] ? sctp_bind_addr_match+0x33/0x68 [sctp] kernel: [<ffffffffa02f4416>] ? sctp_do_bind+0xd3/0x141 [sctp] kernel: [<ffffffffa02f5030>] ? sctp_bindx_add+0x4d/0x8e [sctp] kernel: [<ffffffffa02f5183>] ? sctp_setsockopt_bindx+0x112/0x4a4 [sctp] kernel: [<ffffffff81089e82>] ? generic_file_aio_write+0x7f/0x9b kernel: [<ffffffffa02f763e>] ? sctp_setsockopt+0x14f/0xfee [sctp] kernel: [<ffffffff810c11fb>] ? do_sync_write+0xab/0xeb kernel: [<ffffffff810e82ab>] ? fsnotify+0x239/0x282 kernel: [<ffffffff810c2462>] ? alloc_file+0x18/0xb1 kernel: [<ffffffff8134a0b1>] ? compat_sys_setsockopt+0x1a5/0x1d9 kernel: [<ffffffff8134aaf1>] ? compat_sys_socketcall+0x143/0x1a4 kernel: [<ffffffff810467dc>] ? sysenter_dispatch+0x7/0x32 Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>

When device_add is called in rc_register_device, the rc sysfs nodes show up, and there's a window in which ir-keytable can be launched via udev and trigger a show_protocols call, which runs without various rc_dev fields filled in yet. Add some locking around registration and store/show_protocols to prevent that from happening. The problem manifests thusly: [64692.957872] BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 [64692.957878] IP: [<ffffffffa036a4c1>] show_protocols+0x47/0xf1 [rc_core] [64692.957890] PGD 19cfc7067 PUD 19cfc6067 PMD 0 [64692.957894] Oops: 0000 [#1] SMP [64692.957897] last sysfs file: /sys/devices/pci0000:00/0000:00:03.1/usb3/3-1/3-1:1.0/rc/rc2/protocols [64692.957902] CPU 3 [64692.957903] Modules linked in: redrat3(+) ir_lirc_codec lirc_dev ir_sony_decoder ir_jvc_decoder ir_rc6_decoder ir_rc5_decoder rc_hauppauge ir_nec _decoder rc_core ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_mi di_event snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem pcsp kr tg3 snd_hwdep emu10k1_gp snd amd64_edac_mod gameport edac_core soundcore edac_mce_amd k8temp shpchp i2c_piix4 lm63 e100 mii uinput ipv6 raid0 rai d1 ata_generic firewire_ohci pata_acpi firewire_core crc_itu_t sata_svw pata_serverworks floppy radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: redrat3] [64692.957949] [64692.957952] Pid: 12265, comm: ir-keytable Tainted: G M W 2.6.39-rc6+ #2 empty empty/TYAN Thunder K8HM S3892 [64692.957957] RIP: 0010:[<ffffffffa036a4c1>] [<ffffffffa036a4c1>] show_protocols+0x47/0xf1 [rc_core] [64692.957962] RSP: 0018:ffff880194509e38 EFLAGS: 00010202 [64692.957964] RAX: 0000000000000000 RBX: ffffffffa036d1e0 RCX: ffffffffa036a47a [64692.957966] RDX: ffff88019a84d000 RSI: ffffffffa036d1e0 RDI: ffff88019cf2f3f0 [64692.957969] RBP: ffff880194509e68 R08: 0000000000000002 R09: 0000000000000000 [64692.957971] R10: 0000000000000002 R11: 0000000000001617 R12: ffff88019a84d000 [64692.957973] R13: 0000000000001000 R14: ffff8801944d2e38 R15: ffff88019ce5f190 [64692.957976] FS: 00007f0a30c9a720(0000) GS:ffff88019fc00000(0000) knlGS:0000000000000000 [64692.957979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [64692.957981] CR2: 0000000000000090 CR3: 000000019a8e0000 CR4: 00000000000006e0 [64692.957983] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [64692.957986] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [64692.957989] Process ir-keytable (pid: 12265, threadinfo ffff880194508000, task ffff88019a9fc720) [64692.957991] Stack: [64692.957992] 0000000000000002 ffffffffa036d1e0 ffff880194509f58 0000000000001000 [64692.957997] ffff8801944d2e38 ffff88019ce5f190 ffff880194509e98 ffffffff8131484b [64692.958001] ffffffff8118e923 ffffffff810e9b2f ffff880194509e98 ffff8801944d2e18 [64692.958005] Call Trace: [64692.958014] [<ffffffff8131484b>] dev_attr_show+0x27/0x4e [64692.958014] [<ffffffff8118e923>] ? sysfs_read_file+0x94/0x172 [64692.958014] [<ffffffff810e9b2f>] ? __get_free_pages+0x16/0x52 [64692.958014] [<ffffffff8118e94c>] sysfs_read_file+0xbd/0x172 [64692.958014] [<ffffffff8113205e>] vfs_read+0xac/0xf3 [64692.958014] [<ffffffff8113347b>] ? fget_light+0x3a/0xa1 [64692.958014] [<ffffffff811320f2>] sys_read+0x4d/0x74 [64692.958014] [<ffffffff814c19c2>] system_call_fastpath+0x16/0x1b Its a bit difficult to reproduce, but I'm fairly confident this has fixed the problem. Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Like the following, mmu_notifier can be called after registering immediately. So, kvm have to initialize kvm->mmu_lock before it. BUG: spinlock bad magic on CPU#0, kswapd0/342 lock: ffff8800af8c4000, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 Pid: 342, comm: kswapd0 Not tainted 2.6.39-rc5+ #1 Call Trace: [<ffffffff8118ce61>] spin_bug+0x9c/0xa3 [<ffffffff8118ce91>] do_raw_spin_lock+0x29/0x13c [<ffffffff81024923>] ? flush_tlb_others_ipi+0xaf/0xfd [<ffffffff812e22f3>] _raw_spin_lock+0x9/0xb [<ffffffffa0582325>] kvm_mmu_notifier_clear_flush_young+0x2c/0x66 [kvm] [<ffffffff810d3ff3>] __mmu_notifier_clear_flush_young+0x2b/0x57 [<ffffffff810c8761>] page_referenced_one+0x88/0xea [<ffffffff810c89bf>] page_referenced+0x1fc/0x256 [<ffffffff810b2771>] shrink_page_list+0x187/0x53a [<ffffffff810b2ed7>] shrink_inactive_list+0x1e0/0x33d [<ffffffff810acf95>] ? determine_dirtyable_memory+0x15/0x27 [<ffffffff812e90ee>] ? call_function_single_interrupt+0xe/0x20 [<ffffffff810b3356>] shrink_zone+0x322/0x3de [<ffffffff810a9587>] ? zone_watermark_ok_safe+0xe2/0xf1 [<ffffffff810b3928>] kswapd+0x516/0x818 [<ffffffff810b3412>] ? shrink_zone+0x3de/0x3de [<ffffffff81053d17>] kthread+0x7d/0x85 [<ffffffff812e9394>] kernel_thread_helper+0x4/0x10 [<ffffffff81053c9a>] ? __init_kthread_worker+0x37/0x37 [<ffffffff812e9390>] ? gs_change+0xb/0xb Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>

Avoid spawning a shell pipeline doing cat, grep, sed, and do it all inside perl. The <*.c> globbing construct works at least as far back as 5.8.9 Note that this is not just an optimization; the sed command in the pipeline was unterminated, due to lack of escape on the end-of-line (\$) in the regex, resulting in this: $ perl ../linux-2.6/scripts/export_report.pl > /dev/null sed: -e expression #1, char 5: unterminated `s' command sh: .mod.c/: not found Comments on an earlier patch sought an all-perl implementation. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> cc: Michal Marek <mmarek@suse.cz>, cc: linux-kbuild@vger.kernel.org cc: Arnaud Lacombe lacombar@gmail.com cc: Stephen Hemminger shemminger@vyatta.com Signed-off-by: Michal Marek <mmarek@suse.cz>

The 'max_part' parameter controls the number of maximum partition a loop block device can have. However if a user specifies very large value it would exceed the limitation of device minor number and can cause a kernel panic (or, at least, produce invalid device nodes in some cases). On my desktop system, following command kills the kernel. On qemu, it triggers similar oops but the kernel was alive: $ sudo modprobe loop max_part0000 ------------[ cut here ]------------ kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65! invalid opcode: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: loop(+) Pid: 43, comm: insmod Tainted: G W 2.6.39-qemu+ #155 Bochs Bochs RIP: 0010:[<ffffffff8113ce61>] [<ffffffff8113ce61>] internal_create_group= +0x2a/0x170 RSP: 0018:ffff880007b3fde8 EFLAGS: 00000246 RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4 RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878 RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000 R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50 R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800 FS: 0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:00000000000000= 00 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000 Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c= 0) Stack: ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b 0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868 0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8 Call Trace: [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12 [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16 [<ffffffff8116a090>] blk_register_queue+0x47/0xf7 [<ffffffff8116f527>] add_disk+0xdf/0x290 [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop] [<ffffffffa0006000>] ? 0xffffffffa0005fff [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e [<ffffffff81096804>] sys_init_module+0x9c/0x1e0 [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb= 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe = 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49 RIP [<ffffffff8113ce61>] internal_create_group+0x2a/0x170 RSP <ffff880007b3fde8> ---[ end trace a123eb592043acad ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Konrad reports: [ 0.930811] RTNL: assertion failed at /home/konrad/ssd/linux/net/core/dev.c (5258) [ 0.930821] Pid: 22, comm: xenwatch Not tainted 2.6.39-05193-gd762f43 #1 [ 0.930825] Call Trace: [ 0.930834] [<ffffffff8143bd0e>] __netdev_update_features+0xae/0xe0 [ 0.930840] [<ffffffff8143dd41>] netdev_update_features+0x11/0x30 [ 0.930847] [<ffffffffa0037105>] netback_changed+0x4e5/0x800 [xen_netfront] [ 0.930854] [<ffffffff8132a838>] xenbus_otherend_changed+0xa8/0xb0 [ 0.930860] [<ffffffff8157ca99>] ? _raw_spin_unlock_irqrestore+0x19/0x20 [ 0.930866] [<ffffffff8132adfe>] backend_changed+0xe/0x10 [ 0.930871] [<ffffffff8132875a>] xenwatch_thread+0xba/0x180 [ 0.930876] [<ffffffff810a8ba0>] ? wake_up_bit+0x40/0x40 [ 0.930881] [<ffffffff813286a0>] ? split+0xf0/0xf0 [ 0.930886] [<ffffffff810a8646>] kthread+0x96/0xa0 [ 0.930891] [<ffffffff815855a4>] kernel_thread_helper+0x4/0x10 [ 0.930896] [<ffffffff815846b3>] ? int_ret_from_sys_call+0x7/0x1b [ 0.930901] [<ffffffff8157cf61>] ? retint_restore_args+0x5/0x6 [ 0.930906] [<ffffffff815855a0>] ? gs_change+0x13/0x13 This update happens in xenbus watch callback context and hence does not already hold the rtnl. Take the lock as necessary. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>

This soft lockup was recently reported: [root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters [root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves bonding: bond5: doing slave updates when interface is down. bonding bond5: master_dev is not up in bond_enslave [root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves bonding: bond5: doing slave updates when interface is down. BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444] CPU 12: Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc be2d Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1 RIP: 0010:[<ffffffff80064bf0>] [<ffffffff80064bf0>] .text.lock.spinlock+0x26/00 RSP: 0018:ffff810113167da8 EFLAGS: 00000286 RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025 RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8 RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000 R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282 FS: 00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0 Call Trace: [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14 [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1 [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0 [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450 [<ffffffff8006457b>] __down_write_nested+0x12/0x92 [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7 [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8 [<ffffffff80016b87>] vfs_write+0xce/0x174 [<ffffffff80017450>] sys_write+0x45/0x6e [<ffffffff8005d28d>] tracesys+0xd5/0xe0 It occurs because we are able to change the slave configuarion of a bond while the bond interface is down. The bonding driver initializes some data structures only after its ndo_open routine is called. Among them is the initalization of the alb tx and rx hash locks. So if we add or remove a slave without first opening the bond master device, we run the risk of trying to lock/unlock a spinlock that has garbage for data in it, which results in our above softlock. Note that sometimes this works, because in many cases an unlocked spinlock has the raw_lock parameter initialized to zero (meaning that the kzalloc of the net_device private data is equivalent to calling spin_lock_init), but thats not true in all cases, and we aren't guaranteed that condition, so we need to pass the relevant spinlocks through the spin_lock_init function. Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to the ndo_init path, so they are ready for use by the bond_store_slaves path. Change notes: v2) Based on conversation with Jay and Nicolas it seems that the ability to enslave devices while the bond master is down should be safe to do. As such this is an outlier bug, and so instead we'll just initalize the errant spinlocks in the init path rather than the open path, solving the problem. We'll also remove the warnings about the bond being down during enslave operations, since it should be safe v3) Fix spelling error Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: jtluka@redhat.com CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: nicolas.2p.debian@gmail.com CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>

The 'max_part' parameter controls the number of maximum partition a brd device can have. However if a user specifies very large value it would exceed the limitation of device minor number and can cause a kernel panic (or, at least, produce invalid device nodes in some cases). On my desktop system, following command kills the kernel. On qemu, it triggers similar oops but the kernel was alive: $ sudo modprobe brd max_part=100000 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae PGD 7af1067 PUD 7b19067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: brd(+) Pid: 44, comm: insmod Tainted: G W 2.6.39-qemu+ #158 Bochs Bochs RIP: 0010:[<ffffffff81110a9a>] [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae RSP: 0018:ffff880007b15d78 EFLAGS: 00000286 RAX: ffff880007b05478 RBX: ffff880007a52760 RCX: ffff880007b15dc8 RDX: ffff880007a4f900 RSI: ffff880007b15e48 RDI: ffff880007a52760 RBP: ffff880007b15da8 R08: 0000000000000002 R09: 0000000000000000 R10: ffff880007b15e48 R11: ffff880007b05478 R12: 0000000000000000 R13: ffff880007b05478 R14: 0000000000400920 R15: 0000000000000063 FS: 0000000002160880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000058 CR3: 0000000007b1c000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000 Process insmod (pid: 44, threadinfo ffff880007b14000, task ffff880007acb980) Stack: ffff880007b15dc8 ffff880007b05478 ffff880007b15da8 00000000fffffffe ffff880007a52760 ffff880007b05478 ffff880007b15de8 ffffffff81143c0a 0000000000400920 ffff880007a52760 ffff880007b05478 0000000000000000 Call Trace: [<ffffffff81143c0a>] kobject_add_internal+0xdf/0x1a0 [<ffffffff81143da1>] kobject_add_varg+0x41/0x50 [<ffffffff81143e6b>] kobject_add+0x64/0x66 [<ffffffff8113bbe7>] blk_register_queue+0x5f/0xb8 [<ffffffff81140f72>] add_disk+0xdf/0x289 [<ffffffffa00040df>] brd_init+0xdf/0x1aa [brd] [<ffffffffa0004000>] ? 0xffffffffa0003fff [<ffffffffa0004000>] ? 0xffffffffa0003fff [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e [<ffffffff8108516c>] sys_init_module+0x9c/0x1dc [<ffffffff812ff4bb>] system_call_fastpath+0x16/0x1b Code: 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 18 48 85 ff 75 04 0f 0b eb fe 48 8b 47 18 49 c7 c4 70 1e 4d 81 48 85 c0 74 04 4c 8b 60 30 8b 44 24 58 45 31 ed 0f b6 c4 85 c0 74 0d 48 8b 43 28 48 89 RIP [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae RSP <ffff880007b15d78> CR2: 0000000000000058 ---[ end trace aebb1175ce1f6739 ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

The 'max_part' parameter controls the number of maximum partition a nbd device can have. However if a user specifies very large value it would exceed the limitation of device minor number and can cause a kernel oops (or, at least, produce invalid device nodes in some cases). In addition, specifying large 'nbds_max' value causes same problem for the same reason. On my desktop, following command results to the kernel bug: $ sudo modprobe nbd max_part=100000 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/virtual/block/nbd4/range CPU 1 Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom Pid: 2522, comm: modprobe Tainted: G W 2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO RIP: 0010:[<ffffffff8115aa08>] [<ffffffff8115aa08>] internal_create_group+0x2f/0x166 RSP: 0018:ffff8801009f1de8 EFLAGS: 00010246 RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3 RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478 RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478 R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0 R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400 FS: 00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0) Stack: ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80 ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468 0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a Call Trace: [<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4 [<ffffffff812e7a80>] ? dev_set_name+0x41/0x43 [<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15 [<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16 [<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd [<ffffffff811f3bdf>] add_disk+0xe4/0x29c [<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd] [<ffffffffa007e000>] ? 0xffffffffa007dfff [<ffffffff8100020f>] do_one_initcall+0x7f/0x13e [<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3 [<ffffffff814f3542>] system_call_fastpath+0x16/0x1b Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49 RIP [<ffffffff8115aa08>] internal_create_group+0x2f/0x166 RSP <ffff8801009f1de8> ---[ end trace 753285ffbf72c57c ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

This commit switches manipulations of the rcu_node ->wakemask field to atomic operations, which allows rcu_cpu_kthread_timer() to avoid acquiring the rcu_node lock. This should avoid the following lockdep splat reported by Valdis Kletnieks: [ 12.872150] usb 1-4: new high speed USB device number 3 using ehci_hcd [ 12.986667] usb 1-4: New USB device found, idVendor=413c, idProduct=2513 [ 12.986679] usb 1-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0 [ 12.987691] hub 1-4:1.0: USB hub found [ 12.987877] hub 1-4:1.0: 3 ports detected [ 12.996372] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input10 [ 13.071471] udevadm used greatest stack depth: 3984 bytes left [ 13.172129] [ 13.172130] ======================================================= [ 13.172425] [ INFO: possible circular locking dependency detected ] [ 13.172650] 2.6.39-rc6-mmotm0506 #1 [ 13.172773] ------------------------------------------------------- [ 13.172997] blkid/267 is trying to acquire lock: [ 13.173009] (&p->pi_lock){-.-.-.}, at: [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa [ 13.173009] [ 13.173009] but task is already holding lock: [ 13.173009] (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58 [ 13.173009] [ 13.173009] which lock already depends on the new lock. [ 13.173009] [ 13.173009] [ 13.173009] the existing dependency chain (in reverse order) is: [ 13.173009] [ 13.173009] -> #2 (rcu_node_level_0){..-...}: [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45 [ 13.173009] [<ffffffff81090794>] rcu_read_unlock_special+0x8c/0x1d5 [ 13.173009] [<ffffffff8109092c>] __rcu_read_unlock+0x4f/0xd7 [ 13.173009] [<ffffffff81027bd3>] rcu_read_unlock+0x21/0x23 [ 13.173009] [<ffffffff8102cc34>] cpuacct_charge+0x6c/0x75 [ 13.173009] [<ffffffff81030cc6>] update_curr+0x101/0x12e [ 13.173009] [<ffffffff810311d0>] check_preempt_wakeup+0xf7/0x23b [ 13.173009] [<ffffffff8102acb3>] check_preempt_curr+0x2b/0x68 [ 13.173009] [<ffffffff81031d40>] ttwu_do_wakeup+0x76/0x128 [ 13.173009] [<ffffffff81031e49>] ttwu_do_activate.constprop.63+0x57/0x5c [ 13.173009] [<ffffffff81031e96>] scheduler_ipi+0x48/0x5d [ 13.173009] [<ffffffff810177d5>] smp_reschedule_interrupt+0x16/0x18 [ 13.173009] [<ffffffff815710f3>] reschedule_interrupt+0x13/0x20 [ 13.173009] [<ffffffff810b66d1>] rcu_read_unlock+0x21/0x23 [ 13.173009] [<ffffffff810b739c>] find_get_page+0xa9/0xb9 [ 13.173009] [<ffffffff810b8b48>] filemap_fault+0x6a/0x34d [ 13.173009] [<ffffffff810d1a25>] __do_fault+0x54/0x3e6 [ 13.173009] [<ffffffff810d447a>] handle_pte_fault+0x12c/0x1ed [ 13.173009] [<ffffffff810d48f7>] handle_mm_fault+0x1cd/0x1e0 [ 13.173009] [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de [ 13.173009] [<ffffffff8156a75f>] page_fault+0x1f/0x30 [ 13.173009] [ 13.173009] -> #1 (&rq->lock){-.-.-.}: [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45 [ 13.173009] [<ffffffff81027e19>] __task_rq_lock+0x8b/0xd3 [ 13.173009] [<ffffffff81032f7f>] wake_up_new_task+0x41/0x108 [ 13.173009] [<ffffffff810376c3>] do_fork+0x265/0x33f [ 13.173009] [<ffffffff81007d02>] kernel_thread+0x6b/0x6d [ 13.173009] [<ffffffff8153a9dd>] rest_init+0x21/0xd2 [ 13.173009] [<ffffffff81b1db4f>] start_kernel+0x3bb/0x3c6 [ 13.173009] [<ffffffff81b1d29f>] x86_64_start_reservations+0xaf/0xb3 [ 13.173009] [<ffffffff81b1d393>] x86_64_start_kernel+0xf0/0xf7 [ 13.173009] [ 13.173009] -> #0 (&p->pi_lock){-.-.-.}: [ 13.173009] [<ffffffff81067788>] check_prev_add+0x68/0x20e [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57 [ 13.173009] [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff81032f3c>] wake_up_process+0x10/0x12 [ 13.173009] [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58 [ 13.173009] [<ffffffff81045286>] call_timer_fn+0xac/0x1e9 [ 13.173009] [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2 [ 13.173009] [<ffffffff8103e487>] __do_softirq+0x109/0x26a [ 13.173009] [<ffffffff8157144c>] call_softirq+0x1c/0x30 [ 13.173009] [<ffffffff81003207>] do_softirq+0x44/0xf1 [ 13.173009] [<ffffffff8103e8b9>] irq_exit+0x58/0xc8 [ 13.173009] [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87 [ 13.173009] [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20 [ 13.173009] [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310 [ 13.173009] [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243 [ 13.173009] [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a [ 13.173009] [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b [ 13.173009] [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0 [ 13.173009] [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de [ 13.173009] [<ffffffff8156a75f>] page_fault+0x1f/0x30 [ 13.173009] [ 13.173009] other info that might help us debug this: [ 13.173009] [ 13.173009] Chain exists of: [ 13.173009] &p->pi_lock --> &rq->lock --> rcu_node_level_0 [ 13.173009] [ 13.173009] Possible unsafe locking scenario: [ 13.173009] [ 13.173009] CPU0 CPU1 [ 13.173009] ---- ---- [ 13.173009] lock(rcu_node_level_0); [ 13.173009] lock(&rq->lock); [ 13.173009] lock(rcu_node_level_0); [ 13.173009] lock(&p->pi_lock); [ 13.173009] [ 13.173009] *** DEADLOCK *** [ 13.173009] [ 13.173009] 3 locks held by blkid/267: [ 13.173009] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8156cdb4>] do_page_fault+0x1f3/0x5de [ 13.173009] #1: (&yield_timer){+.-...}, at: [<ffffffff810451da>] call_timer_fn+0x0/0x1e9 [ 13.173009] #2: (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58 [ 13.173009] [ 13.173009] stack backtrace: [ 13.173009] Pid: 267, comm: blkid Not tainted 2.6.39-rc6-mmotm0506 #1 [ 13.173009] Call Trace: [ 13.173009] <IRQ> [<ffffffff8154a529>] print_circular_bug+0xc8/0xd9 [ 13.173009] [<ffffffff81067788>] check_prev_add+0x68/0x20e [ 13.173009] [<ffffffff8100c861>] ? save_stack_trace+0x28/0x46 [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57 [ 13.173009] [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff81032f3c>] wake_up_process+0x10/0x12 [ 13.173009] [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58 [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff81045286>] call_timer_fn+0xac/0x1e9 [ 13.173009] [<ffffffff810451da>] ? del_timer+0x75/0x75 [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2 [ 13.173009] [<ffffffff8103e487>] __do_softirq+0x109/0x26a [ 13.173009] [<ffffffff8106365f>] ? tick_dev_program_event+0x37/0xf6 [ 13.173009] [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f [ 13.173009] [<ffffffff8157144c>] call_softirq+0x1c/0x30 [ 13.173009] [<ffffffff81003207>] do_softirq+0x44/0xf1 [ 13.173009] [<ffffffff8103e8b9>] irq_exit+0x58/0xc8 [ 13.173009] [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87 [ 13.173009] [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20 [ 13.173009] <EOI> [<ffffffff810bd384>] ? get_page_from_freelist+0x114/0x310 [ 13.173009] [<ffffffff810bd51a>] ? get_page_from_freelist+0x2aa/0x310 [ 13.173009] [<ffffffff812220e7>] ? clear_page_c+0x7/0x10 [ 13.173009] [<ffffffff810bd1ef>] ? prep_new_page+0x14c/0x1cd [ 13.173009] [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310 [ 13.173009] [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243 [ 13.173009] [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99 [ 13.173009] [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a [ 13.173009] [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99 [ 13.173009] [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b [ 13.173009] [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0 [ 13.173009] [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de [ 13.173009] [<ffffffff810d915f>] ? sys_brk+0x32/0x10c [ 13.173009] [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f [ 13.173009] [<ffffffff81065c4f>] ? trace_hardirqs_off_caller+0x3f/0x9c [ 13.173009] [<ffffffff812235dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c [ 13.173009] [<ffffffff8156a75f>] page_fault+0x1f/0x30 [ 14.010075] usb 5-1: new full speed USB device number 2 using uhci_hcd Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>

page_get_storage_key() and page_set_storage_key() expect a page address and not its page frame number. This got inconsistent with 2d42552 "[S390] merge page_test_dirty and page_clear_dirty". Result is that we read/write storage keys from random pages and do not have a working dirty bit tracking at all. E.g. SetPageUpdate() doesn't clear the dirty bit of requested pages, which for example ext4 doesn't like very much and panics after a while. Unable to handle kernel paging request at virtual user address (null) Oops: 0004 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 1 Not tainted 2.6.39-07551-g139f37f-dirty #152 Process flush-94:0 (pid: 1576, task: 000000003eb34538, ksp: 000000003c287b70) Krnl PSW : 0704c00180000000 0000000000316b12 (jbd2_journal_file_inode+0x10e/0x138) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3 Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000 0700000000000000 0000000000316a62 000000003eb34cd0 0000000000000025 000000003c287b88 0000000000000001 000000003c287a70 000000003f1ec678 000000003f1ec000 0000000000000000 000000003e66ec00 0000000000316a62 000000003c287988 Krnl Code: 0000000000316b04: f0a0000407f4 srp 4(11,%r0),2036,0 0000000000316b0a: b9020022 ltgr %r2,%r2 0000000000316b0e: a7740015 brc 7,316b38 >0000000000316b12: e3d0c0000024 stg %r13,0(%r12) 0000000000316b18: 4120c010 la %r2,16(%r12) 0000000000316b1c: 4130d060 la %r3,96(%r13) 0000000000316b20: e340d0600004 lg %r4,96(%r13) 0000000000316b26: c0e50002b567 brasl %r14,36d5f4 Call Trace: ([<0000000000316a62>] jbd2_journal_file_inode+0x5e/0x138) [<00000000002da13c>] mpage_da_map_and_submit+0x2e8/0x42c [<00000000002daac2>] ext4_da_writepages+0x2da/0x504 [<00000000002597e8>] writeback_single_inode+0xf8/0x268 [<0000000000259f06>] writeback_sb_inodes+0xd2/0x18c [<000000000025a700>] writeback_inodes_wb+0x80/0x168 [<000000000025aa92>] wb_writeback+0x2aa/0x324 [<000000000025abde>] wb_do_writeback+0xd2/0x274 [<000000000025ae3a>] bdi_writeback_thread+0xba/0x1c4 [<00000000001737be>] kthread+0xa6/0xb0 [<000000000056c1da>] kernel_thread_starter+0x6/0xc [<000000000056c1d4>] kernel_thread_starter+0x0/0xc INFO: lockdep is turned off. Last Breaking-Event-Address: [<0000000000316a8a>] jbd2_journal_file_inode+0x86/0x138 Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

The runtime PM changes introduce sh_dmae_rst() wrapping via the runtime_resume helper, depending on dev_get_drvdata() to fetch the platform data needed for the DMAOR initialization default at a time where drvdata hasn't yet been established by the probe path, resulting in general probe misery: Unable to handle kernel NULL pointer dereference at virtual address 000000c4 pc = 8025adee *pde = 00000000 Oops: 0000 [#1] Modules linked in: Pid : 1, Comm: swapper CPU : 0 Not tainted (3.0.0-rc1-00012-g9436b4a-dirty #1456) PC is at sh_dmae_rst+0x28/0x86 PR is at sh_dmae_rst+0x22/0x86 PC : 8025adee SP : 9e803d10 SR : 400080f1 TEA : 000000c4 R0 : 000000c4 R1 : 0000fff8 R2 : 00000000 R3 : 00000040 R4 : 000000f0 R5 : 00000000 R6 : 00000000 R7 : 804f184c R8 : 00000000 R9 : 804dd0e8 R10 : 80283204 R11 : ffffffda R12 : 000000a0 R13 : 804dd18c R14 : 9e803d10 MACH: 00000000 MACL: 00008f20 GBR : 00000000 PR : 8025ade8 Call trace: [<8025ae70>] sh_dmae_runtime_resume+0x24/0x34 [<80283238>] pm_generic_runtime_resume+0x34/0x3c [<80283370>] rpm_callback+0x4a/0x7e [<80283efc>] rpm_resume+0x240/0x384 [<80283f54>] rpm_resume+0x298/0x384 [<8028428c>] __pm_runtime_resume+0x44/0x7c [<8038a358>] __ioremap_caller+0x0/0xec [<80284296>] __pm_runtime_resume+0x4e/0x7c [<8038a358>] __ioremap_caller+0x0/0xec [<80666254>] sh_dmae_probe+0x180/0x6a0 [<802803ae>] platform_drv_probe+0x26/0x2e Fix up the ordering accordingly. Signed-off-by: Paul Mundt <lethal@linux-sh.org>

This fixes the A->B/B->A locking dependency, see the warning below. The function task_exit_notify() is called with (task_exit_notifier) .rwsem set and then calls sync_buffer() which locks buffer_mutex. In sync_start() the buffer_mutex was set to prevent notifier functions to be started before sync_start() is finished. But when registering the notifier, (task_exit_notifier).rwsem is locked too, but now in different order than in sync_buffer(). In theory this causes a locking dependency, what does not occur in practice since task_exit_notify() is always called after the notifier is registered which means the lock is already released. However, after checking the notifier functions it turned out the buffer_mutex in sync_start() is unnecessary. This is because sync_buffer() may be called from the notifiers even if sync_start() did not finish yet, the buffers are already allocated but empty. No need to protect this with the mutex. So we fix this theoretical locking dependency by removing buffer_mutex in sync_start(). This is similar to the implementation before commit: 750d857 oprofile: fix crash when accessing freed task structs which introduced the locking dependency. Lockdep warning: oprofiled/4447 is trying to acquire lock: (buffer_mutex){+.+...}, at: [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile] but task is already holding lock: ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 ((task_exit_notifier).rwsem){++++..}: [<ffffffff8106557f>] lock_acquire+0xf8/0x11e [<ffffffff81463a2b>] down_write+0x44/0x67 [<ffffffff810581c0>] blocking_notifier_chain_register+0x52/0x8b [<ffffffff8105a6ac>] profile_event_register+0x2d/0x2f [<ffffffffa00013c1>] sync_start+0x47/0xc6 [oprofile] [<ffffffffa00001bb>] oprofile_setup+0x60/0xa5 [oprofile] [<ffffffffa00014e3>] event_buffer_open+0x59/0x8c [oprofile] [<ffffffff810cd3b9>] __dentry_open+0x1eb/0x308 [<ffffffff810cd59d>] nameidata_to_filp+0x60/0x67 [<ffffffff810daad6>] do_last+0x5be/0x6b2 [<ffffffff810dbc33>] path_openat+0xc7/0x360 [<ffffffff810dbfc5>] do_filp_open+0x3d/0x8c [<ffffffff810ccfd2>] do_sys_open+0x110/0x1a9 [<ffffffff810cd09e>] sys_open+0x20/0x22 [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b -> #0 (buffer_mutex){+.+...}: [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711 [<ffffffff8106557f>] lock_acquire+0xf8/0x11e [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309 [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile] [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile] [<ffffffff81467b96>] notifier_call_chain+0x37/0x63 [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67 [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16 [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c [<ffffffff81039e8f>] do_exit+0x2a/0x6fc [<ffffffff8103a5e4>] do_group_exit+0x83/0xae [<ffffffff8103a626>] sys_exit_group+0x17/0x1b [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b other info that might help us debug this: 1 lock held by oprofiled/4447: #0: ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67 stack backtrace: Pid: 4447, comm: oprofiled Not tainted 2.6.39-00007-gcf4d8d4 #10 Call Trace: [<ffffffff81063193>] print_circular_bug+0xae/0xbc [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711 [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile] [<ffffffff8106557f>] lock_acquire+0xf8/0x11e [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile] [<ffffffff81062627>] ? mark_lock+0x42f/0x552 [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile] [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309 [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile] [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile] [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67 [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67 [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile] [<ffffffff81467b96>] notifier_call_chain+0x37/0x63 [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67 [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16 [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c [<ffffffff81039e8f>] do_exit+0x2a/0x6fc [<ffffffff81465031>] ? retint_swapgs+0xe/0x13 [<ffffffff8103a5e4>] do_group_exit+0x83/0xae [<ffffffff8103a626>] sys_exit_group+0x17/0x1b [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Carl Love <carll@us.ibm.com> Cc: <stable@kernel.org> # .36+ Signed-off-by: Robert Richter <robert.richter@amd.com>

The lockdep warning below detects a possible A->B/B->A locking dependency of mm->mmap_sem and dcookie_mutex. The order in sync_buffer() is mm->mmap_sem/dcookie_mutex, while in sys_lookup_dcookie() it is vice versa. Fixing it in sys_lookup_dcookie() by unlocking dcookie_mutex before copy_to_user(). oprofiled/4432 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<ffffffff810b444b>] might_fault+0x53/0xa3 but task is already holding lock: (dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (dcookie_mutex){+.+.+.}: [<ffffffff8106557f>] lock_acquire+0xf8/0x11e [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309 [<ffffffff81124e5c>] get_dcookie+0x30/0x144 [<ffffffffa0000fba>] sync_buffer+0x196/0x3ec [oprofile] [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile] [<ffffffff81467b96>] notifier_call_chain+0x37/0x63 [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67 [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16 [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c [<ffffffff81039e8f>] do_exit+0x2a/0x6fc [<ffffffff8103a5e4>] do_group_exit+0x83/0xae [<ffffffff8103a626>] sys_exit_group+0x17/0x1b [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b -> #0 (&mm->mmap_sem){++++++}: [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711 [<ffffffff8106557f>] lock_acquire+0xf8/0x11e [<ffffffff810b4478>] might_fault+0x80/0xa3 [<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149 [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b other info that might help us debug this: 1 lock held by oprofiled/4432: #0: (dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149 stack backtrace: Pid: 4432, comm: oprofiled Not tainted 2.6.39-00008-ge5a450d #9 Call Trace: [<ffffffff81063193>] print_circular_bug+0xae/0xbc [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711 [<ffffffff8102ef13>] ? get_parent_ip+0x11/0x42 [<ffffffff810b444b>] ? might_fault+0x53/0xa3 [<ffffffff8106557f>] lock_acquire+0xf8/0x11e [<ffffffff810b444b>] ? might_fault+0x53/0xa3 [<ffffffff810d7d54>] ? path_put+0x22/0x27 [<ffffffff810b4478>] might_fault+0x80/0xa3 [<ffffffff810b444b>] ? might_fault+0x53/0xa3 [<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149 [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b References: https://bugzilla.kernel.org/show_bug.cgi?id=13809 Cc: <stable@kernel.org> # .27+ Signed-off-by: Robert Richter <robert.richter@amd.com>

We were incorrectly executing PCIe specific workarounds on PCI cards. This resulted in: Machine check in kernel mode. Caused by (from SRR1=149030): Transfer error ack signal Oops: Machine check, sig: 7 [#1] Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>

Running ktest.pl, I hit this bug: [ 19.780654] BUG: unable to handle kernel NULL pointer dereference at 0000000c [ 19.780660] IP: [<c112efcd>] dev_get_drvdata+0xc/0x46 [ 19.780669] *pdpt = 0000000031daf001 *pde = 0000000000000000 [ 19.780673] Oops: 0000 [#1] SMP [ 19.780680] Dumping ftrace buffer:^M [ 19.780685] (ftrace buffer empty) [ 19.780687] Modules linked in: ide_pci_generic firewire_ohci firewire_core evbug crc_itu_t e1000 ide_core i2c_i801 iTCO_wdt [ 19.780697] [ 19.780700] Pid: 346, comm: v4l_id Not tainted 2.6.39-test-02740-gcaebc16-dirty #4 /DG965MQ [ 19.780706] EIP: 0060:[<c112efcd>] EFLAGS: 00010202 CPU: 0 [ 19.780709] EIP is at dev_get_drvdata+0xc/0x46 [ 19.780712] EAX: 00000008 EBX: f1e37da4 ECX: 00000000 EDX: 00000000 [ 19.780715] ESI: f1c3f200 EDI: c33ec95c EBP: f1e37d80 ESP: f1e37d80 [ 19.780718] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 19.780721] Process v4l_id (pid: 346, ti=f1e36000 task=f2bc2a60 task.ti=f1e36000) [ 19.780723] Stack: [ 19.780725] f1e37d8c c117d395 c33ec93c f1e37db4 c117a0f9 00000002 00000000 c1725e54 [ 19.780732] 00000001 00000007 f2918c90 f1c3f200 c33ec95c f1e37dd4 c1789d3d 22222222 [ 19.780740] 22222222 22222222 f2918c90 f1c3f200 f29194f4 f1e37de8 c178d5c4 c1725e54 [ 19.780747] Call Trace: [ 19.780752] [<c117d395>] st_kim_ref+0x28/0x41 [ 19.780756] [<c117a0f9>] st_register+0x29/0x562 [ 19.780761] [<c1725e54>] ? v4l2_open+0x111/0x1e3 [ 19.780766] [<c1789d3d>] fmc_prepare+0x97/0x424 [ 19.780770] [<c178d5c4>] fm_v4l2_fops_open+0x70/0x106 [ 19.780773] [<c1725e54>] ? v4l2_open+0x111/0x1e3 [ 19.780777] [<c1725e9b>] v4l2_open+0x158/0x1e3 [ 19.780782] [<c065173b>] chrdev_open+0x22c/0x276 [ 19.780787] [<c0647c4e>] __dentry_open+0x35c/0x581 [ 19.780792] [<c06498f9>] nameidata_to_filp+0x7c/0x96 [ 19.780795] [<c065150f>] ? cdev_put+0x57/0x57 [ 19.780800] [<c0660cad>] do_last+0x743/0x9d4 [ 19.780804] [<c065d5fc>] ? path_init+0x1ee/0x596 [ 19.780808] [<c0661481>] path_openat+0x10c/0x597 [ 19.780813] [<c05204a1>] ? trace_hardirqs_off+0x27/0x37 [ 19.780817] [<c0509651>] ? local_clock+0x78/0xc7 [ 19.780821] [<c0661945>] do_filp_open+0x39/0xc2 [ 19.780827] [<c1cabc76>] ? _raw_spin_unlock+0x4c/0x5d^M [ 19.780831] [<c0674ccd>] ? alloc_fd+0x19e/0x1b7 [ 19.780836] [<c06499ca>] do_sys_open+0xb7/0x1bd [ 19.780840] [<c0608eea>] ? sys_munmap+0x78/0x8d [ 19.780844] [<c0649b06>] sys_open+0x36/0x58 [ 19.780849] [<c1cb809f>] sysenter_do_call+0x12/0x38 [ 19.780852] Code: d8 2f 20 c3 01 83 15 dc 2f 20 c3 00 f0 ff 00 83 05 e0 2f 20 c3 01 83 15 e4 2f 20 c3 00 5d c3 55 89 e5 3e 8d 74 26 00 85 c0 74 28 <8b> 40 04 83 05 e8 2f 20 c3 01 83 15 ec 2f 20 c3 00 85 c0 74 13 ^M [ 19.780889] EIP: [<c112efcd>] dev_get_drvdata+0xc/0x46 SS:ESP 0068:f1e37d80 [ 19.780894] CR2: 000000000000000c [ 19.780898] ---[ end trace e7d1d0f6a2d1d390 ]--- The id of 0 passed to st_kim_ref() found no device, keeping pdev null, and causing pdev->dev cause a NULL pointer dereference. After having st_kim_ref() check for NULL, the st_unregister() function needed to be updated to handle the case that st_gdata was not set by the st_kim_ref(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Doing a ktest.pl randconfig, I stumbled across the following bug on boot up: ------------[ cut here ]------------ WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/lockdep.c:2649 lockdep_trace_alloc+0xed/0x100() Hardware name: Modules linked in: Pid: 0, comm: swapper Not tainted 3.0.0-rc1-test-00054-g1d68b67 #1 Call Trace: [<ffffffff810626ad>] warn_slowpath_common+0xad/0xf0 [<ffffffff8106270a>] warn_slowpath_null+0x1a/0x20 [<ffffffff810b537d>] lockdep_trace_alloc+0xed/0x100 [<ffffffff81182fb0>] __kmalloc_node+0x30/0x2f0 [<ffffffff81153eda>] pcpu_mem_alloc+0x13a/0x180 [<ffffffff82be022c>] percpu_init_late+0x48/0xc2 [<ffffffff82bd630c>] ? mem_init+0xd8/0xe3 [<ffffffff82bbcc73>] start_kernel+0x1c2/0x449 [<ffffffff82bbc35c>] x86_64_start_reservations+0x163/0x167 [<ffffffff82bbc493>] x86_64_start_kernel+0x133/0x142^M ---[ end trace a7919e7f17c0a725 ]--- Then I ran a ktest.pl config_bisect and it came up with this config as the problem: CONFIG_SLOB Looking at what is different between SLOB and SLAB and SLUB, I found that the gfp flags are masked against gfp_allowed_mask in SLAB and SLUB, but not SLOB. On boot up, interrupts are disabled and lockdep will warn if some flags are set in gfp and interrupts are disabled. But these flags are masked off with the gfp_allowed_mask during boot. Because SLOB does not mask the flags against gfp_allowed_mask it triggers the warn on. Adding this mask fixes the bug. I also found that kmem_cache_alloc_node() was missing both the mask and the lockdep check, and that was added too. Acked-by: Matt Mackall <mpm@selenic.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>

Check pers->hot_remove_disk instead of pers->hot_add_disk in slot_store() during disk removal. The linear personality only has ->hot_add_disk and no ->hot_remove_disk, so that removing disk in the array resulted to following kernel bug: $ sudo mdadm --create /dev/md0 --level=linear --raid-devices=4 /dev/loop[0-3] $ echo none | sudo tee /sys/block/md0/md/dev-loop2/slot BUG: unable to handle kernel NULL pointer dereference at (null) IP: [< (null)>] (null) PGD c9f5d067 PUD 8575a067 PMD 0 Oops: 0010 [#1] SMP CPU 2 Modules linked in: linear loop bridge stp llc kvm_intel kvm asus_atk0110 sr_mod cdrom sg Pid: 10450, comm: tee Not tainted 3.0.0-rc1-leonard+ #173 System manufacturer System Product Name/P5G41TD-M PRO RIP: 0010:[<0000000000000000>] [< (null)>] (null) RSP: 0018:ffff880085757df0 EFLAGS: 00010282 RAX: ffffffffa00168e0 RBX: ffff8800d1431800 RCX: 000000000000006e RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff88008543c000 RBP: ffff880085757e48 R08: 0000000000000002 R09: 000000000000000a R10: 0000000000000000 R11: ffff88008543c2e0 R12: 00000000ffffffff R13: ffff8800b4641000 R14: 0000000000000005 R15: 0000000000000000 FS: 00007fe8c9e05700(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 00000000b4502000 CR4: 00000000000406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process tee (pid: 10450, threadinfo ffff880085756000, task ffff8800c9f08000) Stack: ffffffff8138496a ffff8800b4641000 ffff88008543c268 0000000000000000 ffff8800b4641000 ffff88008543c000 ffff8800d1431868 ffffffff81a78a90 ffff8800b4641000 ffff88008543c000 ffff8800d1431800 ffff880085757e98 Call Trace: [<ffffffff8138496a>] ? slot_store+0xaa/0x265 [<ffffffff81384bae>] rdev_attr_store+0x89/0xa8 [<ffffffff8115a96a>] sysfs_write_file+0x108/0x144 [<ffffffff81106b87>] vfs_write+0xb1/0x10d [<ffffffff8106e6c0>] ? trace_hardirqs_on_caller+0x111/0x135 [<ffffffff81106cac>] sys_write+0x4d/0x77 [<ffffffff814fe702>] system_call_fastpath+0x16/0x1b Code: Bad RIP value. RIP [< (null)>] (null) RSP <ffff880085757df0> CR2: 0000000000000000 ---[ end trace ba5fc64319a826fb ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>

Affected kernels 2.6.36 - 3.0 AppArmor may do a GFP_KERNEL memory allocation with task_lock(tsk->group_leader); held when called from security_task_setrlimit. This will only occur when the task's current policy has been replaced, and the task's creds have not been updated before entering the LSM security_task_setrlimit() hook. BUG: sleeping function called from invalid context at mm/slub.c:847 in_atomic(): 1, irqs_disabled(): 0, pid: 1583, name: cupsd 2 locks held by cupsd/1583: #0: (tasklist_lock){.+.+.+}, at: [<ffffffff8104dafa>] do_prlimit+0x61/0x189 #1: (&(&p->alloc_lock)->rlock){+.+.+.}, at: [<ffffffff8104db2d>] do_prlimit+0x94/0x189 Pid: 1583, comm: cupsd Not tainted 3.0.0-rc2-git1 #7 Call Trace: [<ffffffff8102ebf2>] __might_sleep+0x10d/0x112 [<ffffffff810e6f46>] slab_pre_alloc_hook.isra.49+0x2d/0x33 [<ffffffff810e7bc4>] kmem_cache_alloc+0x22/0x132 [<ffffffff8105b6e6>] prepare_creds+0x35/0xe4 [<ffffffff811c0675>] aa_replace_current_profile+0x35/0xb2 [<ffffffff811c4d2d>] aa_current_profile+0x45/0x4c [<ffffffff811c4d4d>] apparmor_task_setrlimit+0x19/0x3a [<ffffffff811beaa5>] security_task_setrlimit+0x11/0x13 [<ffffffff8104db6b>] do_prlimit+0xd2/0x189 [<ffffffff8104dea9>] sys_setrlimit+0x3b/0x48 [<ffffffff814062bb>] system_call_fastpath+0x16/0x1b Signed-off-by: John Johansen <john.johansen@canonical.com> Reported-by: Miles Lane <miles.lane@gmail.com> Cc: stable@kernel.org Signed-off-by: James Morris <jmorris@namei.org>

Following OOPS was seen when booting with card inserted BUG: unable to handle kernel NULL pointer dereference at 0000004c IP: [<f8b7718c>] cfg80211_get_drvinfo+0x21/0x115 [cfg80211] *pde = 00000000 Oops: 0000 [#1] SMP Modules linked in: iwl3945 iwl_legacy mwifiex_sdio mac80211 11 sdhci_pci sdhci pl2303 'ethtool' on the mwifiex device returned this OOPS as wiphy_dev() returned NULL. Adding missing set_wiphy_dev() call to fix the problem. Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>

Current oprofile's x86 callgraph support may trigger page faults throwing the BUG_ON(in_nmi()) message below. This patch fixes this by using the same nmi-safe copy-from-user code as in perf. ------------[ cut here ]------------ kernel BUG at .../arch/x86/kernel/traps.c:436! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:07:00.0/0000:08:04.0/net/eth0/broadcast CPU 5 Modules linked in: Pid: 8611, comm: opcontrol Not tainted 2.6.39-00007-gfe47ae7 #1 Advanced Micro Device Anaheim/Anaheim RIP: 0010:[<ffffffff813e8e35>] [<ffffffff813e8e35>] do_nmi+0x22/0x1ee RSP: 0000:ffff88042fd47f28 EFLAGS: 00010002 RAX: ffff88042c0a7fd8 RBX: 0000000000000001 RCX: 00000000c0000101 RDX: 00000000ffff8804 RSI: ffffffffffffffff RDI: ffff88042fd47f58 RBP: ffff88042fd47f48 R08: 0000000000000004 R09: 0000000000001484 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88042fd47f58 R13: 0000000000000000 R14: ffff88042fd47d98 R15: 0000000000000020 FS: 00007fca25e56700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000074 CR3: 000000042d28b000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process opcontrol (pid: 8611, threadinfo ffff88042c0a6000, task ffff88042c532310) Stack: 0000000000000000 0000000000000001 ffff88042c0a7fd8 0000000000000000 ffff88042fd47de8 ffffffff813e897a 0000000000000020 ffff88042fd47d98 0000000000000000 ffff88042c0a7fd8 ffff88042fd47de8 0000000000000074 Call Trace: <NMI> [<ffffffff813e897a>] nmi+0x1a/0x20 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771 <<EOE>> Code: ff 59 5b 41 5c 41 5d c9 c3 55 65 48 8b 04 25 88 b5 00 00 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 f6 80 47 e0 ff ff 04 74 04 <0f> 0b eb fe 81 80 44 e0 ff ff 00 00 01 04 65 ff 04 25 c4 0f 01 RIP [<ffffffff813e8e35>] do_nmi+0x22/0x1ee RSP <ffff88042fd47f28> ---[ end trace ed6752185092104b ]--- Kernel panic - not syncing: Fatal exception in interrupt Pid: 8611, comm: opcontrol Tainted: G D 2.6.39-00007-gfe47ae7 #1 Call Trace: <NMI> [<ffffffff813e5e0a>] panic+0x8c/0x188 [<ffffffff813e915c>] oops_end+0x81/0x8e [<ffffffff8100403d>] die+0x55/0x5e [<ffffffff813e8c45>] do_trap+0x11c/0x12b [<ffffffff810023c8>] do_invalid_op+0x91/0x9a [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee [<ffffffff8131e6fa>] ? oprofile_add_sample+0x83/0x95 [<ffffffff81321670>] ? op_amd_check_ctrs+0x4f/0x2cf [<ffffffff813ee4d5>] invalid_op+0x15/0x20 [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee [<ffffffff813e8e7a>] ? do_nmi+0x67/0x1ee [<ffffffff813e897a>] nmi+0x1a/0x20 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771 <<EOE>> Cc: John Lumby <johnlumby@hotmail.com> Cc: Maynard Johnson <maynardj@us.ibm.com> Cc: <stable@kernel.org> # .37+ Signed-off-by: Robert Richter <robert.richter@amd.com>

Running a ktest.pl test, I hit the following bug on x86_32: ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:81 __kunmap_atomic+0x64/0xc1() Hardware name: Modules linked in: Pid: 93, comm: sh Not tainted 2.6.39-test+ #1 Call Trace: [<c04450da>] warn_slowpath_common+0x7c/0x91 [<c042f5df>] ? __kunmap_atomic+0x64/0xc1 [<c042f5df>] ? __kunmap_atomic+0x64/0xc1^M [<c0445111>] warn_slowpath_null+0x22/0x24 [<c042f5df>] __kunmap_atomic+0x64/0xc1 [<c04d4a22>] unmap_vmas+0x43a/0x4e0 [<c04d9065>] exit_mmap+0x91/0xd2 [<c0443057>] mmput+0x43/0xad [<c0448358>] exit_mm+0x111/0x119 [<c044855f>] do_exit+0x1ff/0x5fa [<c0454ea2>] ? set_current_blocked+0x3c/0x40 [<c0454f24>] ? sigprocmask+0x7e/0x8e [<c0448b55>] do_group_exit+0x65/0x88 [<c0448b90>] sys_exit_group+0x18/0x1c [<c0c3915f>] sysenter_do_call+0x12/0x38 ---[ end trace 8055f74ea3c0eb62 ]--- Running a ktest.pl git bisect, found the culprit: commit e303297 ("mm: extended batches for generic mmu_gather") But although this was the commit triggering the bug, it was not the one originally responsible for the bug. That was commit d16dfc5 ("mm: mmu_gather rework"). The code in zap_pte_range() has something that looks like the following: pte = pte_offset_map_lock(mm, pmd, addr, &ptl); do { [...] } while (pte++, addr += PAGE_SIZE, addr != end); pte_unmap_unlock(pte - 1, ptl); The pte starts off pointing at the first element in the page table directory that was returned by the pte_offset_map_lock(). When it's done with the page, pte will be pointing to anything between the next entry and the first entry of the next page inclusive. By doing a pte - 1, this puts the pte back onto the original page, which is all that pte_unmap_unlock() needs. In most archs (64 bit), this is not an issue as the pte is ignored in the pte_unmap_unlock(). But on 32 bit archs, where things may be kmapped, it is essential that the pte passed to pte_unmap_unlock() resides on the same page that was given by pte_offest_map_lock(). The problem came in d16dfc5 ("mm: mmu_gather rework") where it introduced a "break;" from the while loop. This alone did not seem to easily trigger the bug. But the modifications made by e303297 caused that "break;" to be hit on the first iteration, before the pte++. The pte not being incremented will now cause pte_unmap_unlock(pte - 1) to be pointing to the previous page. This will cause the wrong page to be unmapped, and also trigger the warning above. The simple solution is to just save the pointer given by pte_offset_map_lock() and use it in the unlock. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

We have migrated the space for the delayed inode items from trans_block_rsv to global_block_rsv, but we forgot to set trans->block_rsv to global_block_rsv when we doing delayed inode operations, and the following Oops happened: [ 9792.654889] ------------[ cut here ]------------ [ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681 btrfs_alloc_free_block+0xca/0x27c [btrfs]() [ 9792.654899] Hardware name: To Be Filled By O.E.M. [ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211 snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan] [ 9792.654919] Pid: 2762, comm: rm Tainted: G W 2.6.39+ #1 [ 9792.654920] Call Trace: [ 9792.654922] [<ffffffff81053c4a>] warn_slowpath_common+0x83/0x9b [ 9792.654925] [<ffffffff81053c7c>] warn_slowpath_null+0x1a/0x1c [ 9792.654933] [<ffffffffa038e747>] btrfs_alloc_free_block+0xca/0x27c [btrfs] [ 9792.654945] [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs] [ 9792.654953] [<ffffffffa038189b>] __btrfs_cow_block+0xfc/0x30c [btrfs] [ 9792.654963] [<ffffffffa0396aa6>] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs] [ 9792.654970] [<ffffffffa0382e48>] ? read_block_for_search+0x94/0x368 [btrfs] [ 9792.654978] [<ffffffffa0381ba9>] btrfs_cow_block+0xfe/0x146 [btrfs] [ 9792.654986] [<ffffffffa03848b0>] btrfs_search_slot+0x14d/0x4b6 [btrfs] [ 9792.654997] [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs] [ 9792.655022] [<ffffffffa03938e8>] btrfs_lookup_inode+0x2f/0x8f [btrfs] [ 9792.655025] [<ffffffff8147afac>] ? _cond_resched+0xe/0x22 [ 9792.655027] [<ffffffff8147b892>] ? mutex_lock+0x29/0x50 [ 9792.655039] [<ffffffffa03d41b1>] btrfs_update_delayed_inode+0x72/0x137 [btrfs] [ 9792.655051] [<ffffffffa03d4ea2>] btrfs_run_delayed_items+0x90/0xdb [btrfs] [ 9792.655062] [<ffffffffa039a69b>] btrfs_commit_transaction+0x228/0x654 [btrfs] [ 9792.655064] [<ffffffff8106e8da>] ? remove_wait_queue+0x3a/0x3a [ 9792.655075] [<ffffffffa03a2fa5>] btrfs_evict_inode+0x14d/0x202 [btrfs] [ 9792.655077] [<ffffffff81132bd6>] evict+0x71/0x111 [ 9792.655079] [<ffffffff81132de0>] iput+0x12a/0x132 [ 9792.655081] [<ffffffff8112aa3a>] do_unlinkat+0x106/0x155 [ 9792.655083] [<ffffffff81127b83>] ? path_put+0x1f/0x23 [ 9792.655085] [<ffffffff8109c53c>] ? audit_syscall_entry+0x145/0x171 [ 9792.655087] [<ffffffff81128410>] ? putname+0x34/0x36 [ 9792.655090] [<ffffffff8112b441>] sys_unlinkat+0x29/0x2b [ 9792.655092] [<ffffffff81482c42>] system_call_fastpath+0x16/0x1b [ 9792.655093] ---[ end trace 02b696eb02b3f768 ]--- This patch fix it by setting the reservation of the transaction handle to the correct one. Reported-by: Josef Bacik <josef@redhat.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>

This patch fixes two possible NULL pointer dereferences in target v4.0 code where se_tmr release path in core_tmr_release_req() can OOPs upon transport_get_lun_for_tmr() failure by attempting to access se_device or se_tmr->tmr_list without a valid member of se_device->tmr_list during transport_free_se_cmd() release. This patch moves the se_tmr->tmr_dev pointer assignment in transport_get_lun_for_tmr() until after possible -ENODEV failures during unpacked_lun lookup. This addresses an OOPs originally reported with LIO v4.1 upstream on .39 code here: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x00000000 BUG: unable to handle kernel NULL pointer dereference at 0000000000000550 IP: [<ffffffff81035ec4>] __ticket_spin_trylock+0x4/0x20 PGD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map CPU 1 Modules linked in: netconsole target_core_pscsi target_core_file tcm_qla2xxx target_core_iblock tcm_loop target_core_mod configfs ipmi_devintf ipmi_si ipmi_msghandler serio_raw i7core_edac ioatdma dca edac_core ps_bdrv ses enclosure usbhid usb_storage ahci qla2xxx hid uas e1000e mpt2sas libahci mlx4_core scsi_transport_fc scsi_transport_sas raid_class scsi_tgt [last unloaded: netconsole] Pid: 0, comm: kworker/0:0 Tainted: G W 2.6.39+ #1 Xyratex Storage Server RIP: 0010:[<ffffffff81035ec4>] [<ffffffff81035ec4>]__ticket_spin_trylock+0x4/0x20 RSP: 0018:ffff88063e803c08 EFLAGS: 00010286 RAX: ffff880619ab45e0 RBX: 0000000000000550 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000550 RBP: ffff88063e803c08 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000568 R13: 0000000000000001 R14: 0000000000000000 R15: ffff88060cd96a20 FS: 0000000000000000(0000) GS:ffff88063e800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000550 CR3: 0000000001a03000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/0:0 (pid: 0, threadinfo ffff880619ab8000, task ffff880619ab45e0) Stack: ffff88063e803c28 ffffffff812cf039 0000000000000550 0000000000000568 ffff88063e803c58 ffffffff8157071e ffffffffa028a1dc ffff88060f7e4600 0000000000000550 ffff880616961480 ffff88063e803c78 ffffffffa028a1dc Call Trace: <IRQ> [<ffffffff812cf039>] do_raw_spin_trylock+0x19/0x50 [<ffffffff8157071e>] _raw_spin_lock+0x3e/0x70 [<ffffffffa028a1dc>] ? core_tmr_release_req+0x2c/0x60 [target_core_mod] [<ffffffffa028a1dc>] core_tmr_release_req+0x2c/0x60 [target_core_mod] [<ffffffffa028d0d2>] transport_free_se_cmd+0x22/0x50 [target_core_mod] [<ffffffffa028d120>] transport_release_cmd_to_pool+0x20/0x40 [target_core_mod] [<ffffffffa028e525>] transport_generic_free_cmd+0xa5/0xb0 [target_core_mod] [<ffffffffa0147cc4>] tcm_qla2xxx_handle_tmr+0xc4/0xd0 [tcm_qla2xxx] [<ffffffffa0191ba3>] __qla24xx_handle_abts+0xd3/0x150 [qla2xxx] [<ffffffffa0197651>] qla_tgt_response_pkt+0x171/0x520 [qla2xxx] [<ffffffffa0197a2d>] qla_tgt_response_pkt_all_vps+0x2d/0x220 [qla2xxx] [<ffffffffa0171dd3>] qla24xx_process_response_queue+0x1a3/0x670 [qla2xxx] [<ffffffffa0196281>] ? qla24xx_atio_pkt+0x81/0x120 [qla2xxx] [<ffffffffa0174025>] ? qla24xx_msix_default+0x45/0x2a0 [qla2xxx] [<ffffffffa0174198>] qla24xx_msix_default+0x1b8/0x2a0 [qla2xxx] [<ffffffff810dadb4>] handle_irq_event_percpu+0x54/0x210 [<ffffffff810dafb8>] handle_irq_event+0x48/0x70 [<ffffffff810dd5ee>] ? handle_edge_irq+0x1e/0x110 [<ffffffff810dd647>] handle_edge_irq+0x77/0x110 [<ffffffff8100d362>] handle_irq+0x22/0x40 [<ffffffff8157b28d>] do_IRQ+0x5d/0xe0 [<ffffffff81571413>] common_interrupt+0x13/0x13 <EOI> [<ffffffff813003f7>] ? intel_idle+0xd7/0x130 [<ffffffff813003f0>] ? intel_idle+0xd0/0x130 [<ffffffff8144832b>] cpuidle_idle_call+0xab/0x1c0 [<ffffffff8100a26b>] cpu_idle+0xab/0xf0 [<ffffffff81566c59>] start_secondary+0x1cb/0x1d2 Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

Loading the ehci-hcd module on the ath79 platform causes a NULL pointer dereference: CPU 0 Unable to handle kernel paging request at virtual address 00000000, epc == c0252928, ra == c00de968 Oops[#1]: Cpu 0 $ 0 : 00000000 00000070 00000001 00000000 $ 4 : 802cf870 0000117e ffffffff 8019c7bc $ 8 : 0000000a 00000002 00000001 fffffffb $12 : 8026ef20 0000000f ffffff80 802dad3c $16 : 8077a2d4 8077a200 c00f3484 8019ed84 $20 : c00f0000 00000003 000000a0 80262c2c $24 : 00000002 80079da0 $28 : 80788000 80789c80 80262b14 c00de968 Hi : 00000000 Lo : b61f0000 epc : c0252928 __mod_vermagic5+0xc260/0xc7e8 [ehci_hcd] Not tainted ra : c00de968 usb_add_hcd+0x2a4/0x858 [usbcore] Status: 1000c003 KERNEL EXL IE Cause : 00800008 BadVA : 00000000 PrId : 00019374 (MIPS 24Kc) Modules linked in: ehci_hcd(+) pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filte r ip_tables xt_tcpudp x_tables ppp_async ppp_generic slhc ath mac80211 usbcore nls_base input_polldev crc_ccitt cfg80211 compat input_core a rc4 aes_generic crypto_algapi Process insmod (pid: 379, threadinfo=80788000, task=80ca2180, tls=77fe52d0) Stack : c0253184 80c57d80 80789cac 8077a200 00000001 8019edc0 807fa800 8077a200 8077a290 c00f3484 8019ed84 c00f0000 00000003 000000a0 80262c2c c00de968 802d0000 800878cc c0253228 c02528e4 c0253184 80c57d80 80bf6800 80ca2180 8007b75c 00000000 8077a200 802cf830 802d0000 00000003 fffffff4 00000015 00000348 00000124 800b189c c024bb4c c0255000 801a27e8 c0253228 c02528e4 ... Call Trace: [<c0252928>] __mod_vermagic5+0xc260/0xc7e8 [ehci_hcd] It is caused by: commit c430131 Author: Jan Andersson <jan@gaisler.com> Date: Tue May 3 20:11:57 2011 +0200 USB: EHCI: Support controllers with big endian capability regs The two first HC capability registers (CAPLENGTH and HCIVERSION) are defined as one 8-bit and one 16-bit register. Most HC implementations have selected to treat these registers as part of a 32-bit register, giving the same layout for both big and small endian systems. This patch adds a new quirk, big_endian_capbase, to support controllers with big endian register interfaces that treat HCIVERSION and CAPLENGTH as individual registers. Signed-off-by: Jan Andersson <jan@gaisler.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> The reading of the HC capability register has been moved by that commit to a place where the ehci->caps field is not initialized yet. This patch moves the reading of the register back to the original place. Acked-by: Jan Andersson <jan@gaisler.com> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

…highmem machines During 32/64 NUMA init unification, commit 797390d ("x86-32, NUMA: use sparse_memory_present_with_active_regions()") made 32bit mm init call memory_present() automatically from active_regions instead of leaving it to each NUMA init path. This commit description is inaccurate - memory_present() calls aren't the same for flat and numaq. After the commit, memory_present() is only called for the intersection of e820 and NUMA layout. Before, on flatmem, memory_present() would be called from 0 to max_pfn. After, it would be called only on the areas that e820 indicates to be populated. This is how x86_64 works and should be okay as memmap is allowed to contain holes; however, x86_32 DISCONTIGMEM is missing early_pfn_valid(), which makes memmap_init_zone() assume that memmap doesn't contain any hole. This leads to the following oops if e820 map contains holes as it often does on machine with near or more 4GiB of memory by calling pfn_to_page() on a pfn which isn't mapped to a NUMA node, a reported by Conny Seidel: BUG: unable to handle kernel paging request at 000012b0 IP: [<c1aa13ce>] memmap_init_zone+0x6c/0xf2 *pdpt =3D 0000000000000000 *pde =3D f000eef3f000ee00 Oops: 0000 [#1] SMP last sysfs file: Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.39-rc5-00164-g797390d #1 To Be Filled By O.E.M. To Be Filled By O.E.M./E350M1 EIP: 0060:[<c1aa13ce>] EFLAGS: 00010012 CPU: 0 EIP is at memmap_init_zone+0x6c/0xf2 EAX: 00000000 EBX: 000a8000 ECX: 000a7fff EDX: f2c00b80 ESI: 000a8000 EDI: f2c00800 EBP: c19ffe54 ESP: c19ffe34 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 0, ti=3Dc19fe000 task=3Dc1a07f60 task.ti=3Dc19fe000) Stack: 00000002 00000000 0023f000 00000000 10000000 00000a00 f2c00000 f2c00b58 c19ffeb0 c1a80f24 000375fe 00000000 f2c00800 00000800 00000100 00000030 c1abb768 0000003c 00000000 00000000 00000004 00207a02 f2c00800 000375fe Call Trace: [<c1a80f24>] free_area_init_node+0x358/0x385 [<c1a81384>] free_area_init_nodes+0x420/0x487 [<c1a79326>] paging_init+0x114/0x11b [<c1a6cb13>] setup_arch+0xb37/0xc0a [<c1a69554>] start_kernel+0x76/0x316 [<c1a690a8>] i386_start_kernel+0xa8/0xb0 This patch fixes the bug by defining early_pfn_valid() to be the same as pfn_valid() when DISCONTIGMEM. Reported-bisected-and-tested-by: Conny Seidel <conny.seidel@amd.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: hans.rosenfeld@amd.com Cc: Christoph Lameter <cl@linux.com> Cc: Conny Seidel <conny.seidel@amd.com> Link: http://lkml.kernel.org/r/20110628094107.GB3386@htj.dyndns.org Signed-off-by: Ingo Molnar <mingo@elte.hu>

This silences dma-debug warnings: https://lkml.org/lkml/2011/6/30/341 ------------[ cut here ]------------ WARNING: at /home/jimc/projects/lx/linux-2.6/lib/dma-debug.c:820 check_unmap+0x1fe/0x56a() natsemi 0000:00:06.0: DMA-API: device driver frees DMA memory with different size [device address=0x0000000006ef0040] [map size=1538 bytes] [unmap size=1522 bytes] Modules linked in: pc8736x_gpio pc87360 hwmon_vid scx200_gpio nsc_gpio scx200_hrt scx200_acb i2c_core arc4 rtl8180 mac80211 eeprom_93cx6 cfg80211 pcspkr rfkill scx200 ide_gd_mod ide_pci_generic ohci_hcd usbcore sc1200 ide_core Pid: 870, comm: collector Not tainted 3.0.0-rc5-sk-00080-gca56a95 #1 Call Trace: [<c011a556>] warn_slowpath_common+0x4a/0x5f [<c02565cb>] ? check_unmap+0x1fe/0x56a [<c011a5cf>] warn_slowpath_fmt+0x26/0x2a [<c02565cb>] check_unmap+0x1fe/0x56a [<c0256aaa>] debug_dma_unmap_page+0x53/0x5b [<c029d6cd>] pci_unmap_single+0x4d/0x57 [<c029ea0a>] natsemi_poll+0x343/0x5ca [<c0116f41>] ? try_to_wake_up+0xea/0xfc [<c0122416>] ? spin_unlock_irq.clone.28+0x18/0x23 [<c02d4667>] net_rx_action+0x3f/0xe5 [<c011e35e>] __do_softirq+0x5b/0xd1 [<c011e303>] ? local_bh_enable+0xa/0xa <IRQ> [<c011e54b>] ? irq_exit+0x34/0x75 [<c01034b9>] ? do_IRQ+0x66/0x79 [<c034e869>] ? common_interrupt+0x29/0x30 [<c0115ed0>] ? finish_task_switch.clone.118+0x31/0x72 [<c034cb92>] ? schedule+0x3b2/0x3f1 [<c012f4b0>] ? hrtimer_start_range_ns+0x10/0x12 [<c012f4ce>] ? hrtimer_start_expires+0x1c/0x24 [<c034d5aa>] ? schedule_hrtimeout_range_clock+0x8e/0xb4 [<c012ed27>] ? update_rmtp+0x68/0x68 [<c034d5da>] ? schedule_hrtimeout_range+0xa/0xc [<c017a913>] ? poll_schedule_timeout+0x27/0x3e [<c017b051>] ? do_select+0x488/0x4cd [<c0115ee2>] ? finish_task_switch.clone.118+0x43/0x72 [<c01157ad>] ? need_resched+0x14/0x1e [<c017a99e>] ? poll_freewait+0x74/0x74 [<c01157ad>] ? need_resched+0x14/0x1e [<c034cbc1>] ? schedule+0x3e1/0x3f1 [<c011e55e>] ? irq_exit+0x47/0x75 [<c01157ad>] ? need_resched+0x14/0x1e [<c034cf8a>] ? preempt_schedule_irq+0x44/0x4a [<c034dd1e>] ? need_resched+0x17/0x19 [<c024bc12>] ? put_dec_full+0x7b/0xaa [<c0240060>] ? blkdev_ioctl+0x434/0x618 [<c024bc70>] ? put_dec+0x2f/0x6d [<c024c6a5>] ? number.clone.1+0x10b/0x1d0 [<c034cf8a>] ? preempt_schedule_irq+0x44/0x4a [<c034dd1e>] ? need_resched+0x17/0x19 [<c024d046>] ? vsnprintf+0x225/0x264 [<c024cea0>] ? vsnprintf+0x7f/0x264 [<c018346f>] ? seq_printf+0x22/0x40 [<c01a2fcc>] ? do_task_stat+0x582/0x5a3 [<c017a913>] ? poll_schedule_timeout+0x27/0x3e [<c017b1b5>] ? core_sys_select+0x11f/0x1a3 [<c017a913>] ? poll_schedule_timeout+0x27/0x3e [<c01a34a1>] ? proc_tgid_stat+0xd/0xf [<c012357c>] ? recalc_sigpending+0x32/0x35 [<c0123b9c>] ? __set_task_blocked+0x64/0x6a [<c011dfb0>] ? timespec_add_safe+0x24/0x48 [<c0123449>] ? spin_unlock_irq.clone.16+0x18/0x23 [<c017b3a1>] ? sys_pselect6+0xe5/0x13e [<c034dd65>] ? syscall_call+0x7/0xb [<c0340000>] ? rpc_clntdir_depopulate+0x26/0x30 ---[ end trace 180dcac41a50938b ]--- Reported-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Tested-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>

PurpleAlien and others added 13 commits September 20, 2011 04:19

EfikaSB support.

ba0fcc3

Fixed dam driver, but for MX53 only. TODO: more elegant solution.

08625c9

Fixed ssi driver, but for MX53 only. TODO: more elegant solution.

26358b9

CS42L52 codec. TODO: cleanup

376745d

The CS42L52 machine driver, and make sure arch/arm/mach-mx5/Kconfig has

beb6033

ARCH_MXC_AUDMUX_V2 selected for this to work.

Fall back to the mxc_nd2 NAND driver instead of mxc_nand.

fc0f6d4

Add correct NAND chip id for the EfikaSB NAND and correct the buswidth.

cd015c0

Make sure we compile the correct NAND driver.

Add our screenmode to ldb.

7c6587f

Set NAND clock to Micron max freq. TODO: benchmark and see if it

e340b67

actually makes a difference.

Proper defconfig for mx53 efikasb.

3cc6b6a

mach-types: Fix white space issues

a5f63f5

Fix the whitespace issues. Kernel use is tabs, not spaces. Signed-off-by: Steev Klimaszewski <steev@genesi-usa.com>

Removed 'extern' variable. Not needed.

f70c1a1

Better reset sequence for WiFi.

9264bd8

PurpleAlien and others added 16 commits October 10, 2011 02:26

Merge branch 'master' into mx53_efikasb_jd

aa38e51

Conflicts: sound/soc/imx/Makefile

Merge branch 'mx53_efikasb_jd' of https://github.com/steev/linux-lina…

5bd444c

…ro-natty into mx53_efikasb_jd

Clean up GPU driver of old unused files

f42be05

gpu: remove one last file

45d30cd

gpu: update platform data to match BSP

691ac52

Sync GPU to production kernel

63d77d0

gpu: port to 2.6.38

93106fa

mx53_efikasb: Remove duplicate includes

47387b1

There were 2 duplicate includes in the board-mx53_efikasb.c file. Signed-off-by: Steev Klimaszewski <steev@genesi-usa.com>

sppp: Add initial header for sppp

45f4a9a

sppp: whitespace cleanups

9d6750c

Signed-off-by: Steev Klimaszewski <steev@genesi-usa.com>

Step two to get the cw1200 driver to work.

4736164

This quirck takes care of SDIO devices not supporting 512 byte requests in byte mode during CMD53. http://comments.gmane.org/gmane.linux.kernel.mmc/10087

Wrong mute mask, and some restructuring for clarity.

50f5458

Don't register RTC. We have the SPPP RTC, and the srtc driver

3cdc226

seems to conflict with audio. This is still under investigation.

SPPP integration for keyboard, trackpad, clock and power.

8f7b7c5

Power and main driver are part of the architecture specific files. This can probably be done better with specific platform devices. Drivers for mouse, trackpad and rtc are in their respective directories.

Updated defconfig to reflect SPPP driver addition

abe5e73

efikasb: cleanups for efikasb and sppp

13d1a67

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mx53 efikasb jd#1

Mx53 efikasb jd#1
PurpleAlien wants to merge 34 commits intomasterfrom
mx53_efikasb_jd

PurpleAlien commented Sep 28, 2011

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

PurpleAlien commented Sep 28, 2011

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants