# Collector Facts Dump

Report directory: `reports/lkml/69ed492c.050a0220.e51af.0005.GAE_google.com/`

Phase 2 complete. This file is the rich facts dump for the analyst agent.

---

## Debug symbols

- **vmlinux**: `oops-workdir/syzbot/vmlinux-b4e07588` (absolute: `/sdb1/arjan/git/oops-skill/oops-workdir/syzbot/vmlinux-b4e07588`)
- **Download**: downloaded from `https://storage.googleapis.com/syzbot-assets/a3832abcd2f7/vmlinux-b4e07588.xz` by the fetcher agent; decompressed to ~1.7 GB ELF
- **Kernel version string**: `syzkaller #0 PREEMPT(full)` (not a standard distro kernel; this is a syzbot mainline upstream build at b4e07588e743)
- **Compiler**: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

---

## Source tree

- **Path**: `oops-workdir/linux` (absolute: `/sdb1/arjan/git/oops-skill/oops-workdir/linux`)
- **HEAD commit**: `b4e07588e743` (confirmed checked out)
- **semcode index**: refreshed (by fetcher agent)

---

## Resolved addresses

All entries resolved from vmlinux by `backtrace_resolve.py`.
Full table:

| JSON idx | Function | Address (computed) | Source (primary) | Inlined frames | Type |
|----------|----------|--------------------|------------------|----------------|------|
| 0 | `__queue_work` | `0xffffffff818d5d2a (0xffffffff818d4fe0 + 0xd4a)` | kernel/workqueue.c:0 (addr2line returned 0; WARNING header gives 2297) | none | normal |
| 1 | `queue_work_on` | `0xffffffff818d4f06 (0xffffffff818d4e00 + 0x106)` | kernel/workqueue.c:2432 | none | normal |
| 2 | `hci_send_cmd` | `0xffffffff8aaa3767 (0xffffffff8aaa36b0 + 0xb7)` | include/linux/workqueue.h:696 | `{hci_send_cmd, hci_core.c:3111}` | normal |
| 3 | `hci_conn_security` | `0xffffffff8aabbac9 (0xffffffff8aabb530 + 0x599)` | net/bluetooth/hci_conn.c:0 (addr2line returned 0; source_hint gives 2551) | `{hci_conn_security, hci_conn.c:0}` | normal |
| 4 | `l2cap_conn_start` | `0xffffffff8ab70f0c (0xffffffff8ab70b50 + 0x3bc)` | net/bluetooth/l2cap_core.c:1534 | none | normal |
| 5 | `l2cap_info_timeout` | `0xffffffff8ab707c8 (0xffffffff8ab70760 + 0x68)` | net/bluetooth/l2cap_core.c:1685 | none | normal |
| 6 | `process_scheduled_works` | `0xffffffff818eb2ed (0xffffffff818ea790 + 0xb5d)` | kernel/workqueue.c:0 (addr2line returned 0; source_hint gives 3385) | `{process_scheduled_works, workqueue.c:0}` | normal |
| 7 | `worker_thread` | `0xffffffff818f3353 (0xffffffff818f2900 + 0xa53)` | kernel/workqueue.c:3466 | none | normal |
| 8 | `kthread` | `0xffffffff8190ae58 (0xffffffff8190aad0 + 0x388)` | kernel/kthread.c:436 | none | normal |
| 9 | `ret_from_fork` | `0xffffffff816bfae4 (0xffffffff816bf5d0 + 0x514)` | arch/x86/kernel/process.c:158 | none | normal |
| 10 | `ret_from_fork_asm` | `0xffffffff813370aa (0xffffffff81337090 + 0x1a)` | arch/x86/entry/entry_64.S:245 | none | normal |

**Note on addr2line returning line 0**: `__queue_work` (index 0), `hci_conn_security` (index 3), and `process_scheduled_works` (index 6) returned `file:0` from addr2line.
Source locations for these are taken from the Format 3 backtrace annotations and the WARNING header line.

---

## Code bytes

Raw bytes from oops (space-separated, `< >` marks trapping instruction):

```
83 c5 18 4c 89 e8 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 ef e8 17 4d a5
00 49 8b 75 00 49 81 c7 70 01 00 00 4c 89 f7 4c 89 fa
<67> 48 0f b9 3a    ← ud1 (%edx),%rdi (trapping instruction)
48 83 c4 58 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc
```

Annotated disassembly at crash window (from vmlinux via backtrace_resolve.py):

```
ffffffff818d5cec: mov 0x18(%rsp),%r15
ffffffff818d5cf1: jmp ffffffff818d5cf8 <__queue_work+0xd18>
ffffffff818d5cf3: call ffffffff81c5e0f0 <__sanitizer_cov_trace_pc>
ffffffff818d5cf8: lea 0xea93071(%rip),%r14   # ffffffff90368d70 <__start___bug_table+0xd530>
ffffffff818d5cff: add $0x18,%r13
ffffffff818d5d03: mov %r13,%rax
ffffffff818d5d06: shr $0x3,%rax
ffffffff818d5d0a: cmpb $0x0,(%rax,%r12,1)    ; KASAN shadow check on R13
ffffffff818d5d0f: je ffffffff818d5d19 <__queue_work+0xd39>
ffffffff818d5d11: mov %r13,%rdi
ffffffff818d5d14: call ffffffff8232aa30 <__asan_report_load8_noabort>
ffffffff818d5d19: mov 0x0(%r13),%rsi         ; RSI = *R13 = work->func
ffffffff818d5d1d: add $0x170,%r15            ; R15 = original R15 + 0x170 = wq->name
ffffffff818d5d24: mov %r14,%rdi              ; RDI = bug table entry (LEA'd above)
ffffffff818d5d27: mov %r15,%rdx              ; RDX = wq->name ptr
ffffffff818d5d2a: call ffffffff8bbd4e98 <__SCT__WARN_trap>   <<<< crash at 0xd4a
ffffffff818d5d2f: add $0x58,%rsp
ffffffff818d5d33: pop %rbx
ffffffff818d5d34: pop %r12
ffffffff818d5d36: pop %r13
ffffffff818d5d38: pop %r14
```

**Discrepancy note**: The Code: bytes show `ud1 (%edx),%rdi` (opcode `67 48 0f b9 3a`) at the crash address `__queue_work+0xd4a`, while the vmlinux disasm shows `call __SCT__WARN_trap` (`e8 ...`) at the same address.
This is because the Code: dump reflects the running kernel, while the disassembly reflects the on-disk vmlinux. The bytes on either side of the `ud1` match the `__queue_work` disassembly (`4c 89 f7` is `mov %r14,%rdi`, `4c 89 fa` is `mov %r15,%rdx`, `48 83 c4 58` is `add $0x58,%rsp`), and RIP points at `__queue_work+0xd4a`, so the trap executed at the call site itself rather than inside `__SCT__WARN_trap`. This is consistent with the 5-byte `call __SCT__WARN_trap` being rewritten in place at runtime, through the static call mechanism, into the 5-byte `ud1` WARN trap that raises the #UD exception.

---

## Register annotations

```
RIP: 0010:__queue_work+0xd4a/0xfc0 kernel/workqueue.c:2296
RSP: 0018:ffffc9000257f720 EFLAGS: 00010082
RAX: 1ffff110081cc181  = R13 >> 3 (KASAN shadow index; the shadow byte is at RAX + R12)
RBX: 0000000000000008  small integer, likely work flags
RCX: ffff888000260000  kernel object pointer
RDX: ffff888040182170  = R15 after add $0x170 → wq->name ptr (points to "hci0")
RSI: ffffffff8aa9ccd0  = hci_cmd_work (confirmed by nm: "ffffffff8aa9ccd0 t hci_cmd_work")
                       = work->func, loaded from *R13 at ffffffff818d5d19
RDI: ffffffff90368d70  = R14 = __start___bug_table+0xd530 (BUG table entry for WARN)
RBP: 0000000000000020  = 32 decimal
R08: ffff888040e60bf7  kernel object pointer
R09: 1ffff110081cc17e  = R08 >> 3 (KASAN shadow index)
R10: dffffc0000000000  KASAN shadow offset (standard constant)
R11: ffffed10081cc17f  KASAN shadow address
R12: dffffc0000000000  KASAN shadow offset (standard constant)
R13: ffff888040e60c08  = work + 0x18 (after add $0x18,%r13), i.e. &work->func; *R13 = hci_cmd_work
R14: ffffffff90368d70  BUG table entry = __start___bug_table+0xd530
R15: ffff888040182170  wq->name ptr (pointing to "hci0" string)
```

**KASAN indicators**: RAX, R09, R10, R11, and R12 all hold KASAN-related values: addresses shifted right by 3 (`1ffff11...`), the shadow offset (`dffffc00...`), and a shadow address (`ffffed10...`). This kernel is built with KASAN enabled. The code before the crash checks the KASAN shadow for R13 (the pointer to `work->func`).

**work->func identification**: RSI = ffffffff8aa9ccd0 = `hci_cmd_work` (confirmed via `nm` on vmlinux). This is the work function being queued: `hci_cmd_work` from `net/bluetooth/hci_core.c`.
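The KASAN annotations above can be cross-checked with the usual x86-64 KASAN mapping, in which the shadow byte for an address lives at `(addr >> 3) + offset`. The following is a minimal sketch, assuming the offset is the `dffffc0000000000` constant seen in R10 and R12; the helper names `shadow_index` and `shadow_addr` are illustrative and are not part of `backtrace_resolve.py`.

```python
MASK64 = (1 << 64) - 1

# Shadow offset as observed in R10 and R12 of this dump (assumed to be the
# KASAN shadow offset for this build).
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000


def shadow_index(addr: int) -> int:
    """Address shifted right by 3: one shadow byte covers 8 bytes of memory."""
    return addr >> 3


def shadow_addr(addr: int) -> int:
    """Shadow byte address, wrapped to 64 bits like the CPU's address math."""
    return (shadow_index(addr) + KASAN_SHADOW_OFFSET) & MASK64


# Register values copied from the register dump in this report.
regs = {
    "RAX": 0x1FFFF110081CC181,
    "R08": 0xFFFF888040E60BF7,
    "R09": 0x1FFFF110081CC17E,
    "R11": 0xFFFFED10081CC17F,
    "R13": 0xFFFF888040E60C08,
}

# RAX should equal R13 >> 3, the index used by `cmpb $0x0,(%rax,%r12,1)`.
print("RAX matches R13 >> 3:", shadow_index(regs["R13"]) == regs["RAX"])
# Shadow byte that the cmpb reads for the work->func load.
print("shadow byte for R13:", hex(shadow_addr(regs["R13"])))
```

The same helpers show that R09 is R08 shifted right by 3, and that R11 already has the offset applied, which is why it carries the `ffffed10` prefix instead of `1ffff11`.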
---

## Full source excerpts

### `hci_conn_auth` (net/bluetooth/hci_conn.c:2438–2468)

[`net/bluetooth/hci_conn.c` @ b4e07588e743](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/bluetooth/hci_conn.c?id=b4e07588e743#l2438)

```c
2438     static int hci_conn_auth(struct hci_conn *conn, __u8 sec_level, __u8 auth_type)
2439     {
2440         BT_DBG("hcon %p", conn);
2441
2442         if (conn->pending_sec_level > sec_level)
2443             sec_level = conn->pending_sec_level;
2444
2445         if (sec_level > conn->sec_level)
2446             conn->pending_sec_level = sec_level;
2447         else if (test_bit(HCI_CONN_AUTH, &conn->flags))
2448             return 1;
2449
2450         /* Make sure we preserve an existing MITM requirement*/
2451         auth_type |= (conn->auth_type & 0x01);
2452         conn->auth_type = auth_type;
2453
2454         if (!test_and_set_bit(HCI_CONN_AUTH_PEND, &conn->flags)) {
2455             struct hci_cp_auth_requested cp;
2456             cp.handle = cpu_to_le16(conn->handle);
2457             hci_send_cmd(conn->hdev, HCI_OP_AUTH_REQUESTED,   ← line 2459
2458                          sizeof(cp), &cp);
2459 >>>         hci_send_cmd(conn->hdev, HCI_OP_AUTH_REQUESTED,   ← backtrace annotation line 2459
2460                          sizeof(cp), &cp);
2461             if (!test_bit(HCI_CONN_ENCRYPT, &conn->flags))
2462                 set_bit(HCI_CONN_ENCRYPT_PEND, &conn->flags);
2463         }
2464         return 0;
2465     }
```

(The `hci_send_cmd` call appears twice in this listing; the actual call is at line 2459 per the backtrace annotation.)
### `l2cap_info_timeout` (net/bluetooth/l2cap_core.c:1675–1687)

```c
1675     static void l2cap_info_timeout(struct work_struct *work)
1676     {
1677         struct l2cap_conn *conn = container_of(work, struct l2cap_conn,
1678                                                info_timer.work);
1679
1680         conn->info_state |= L2CAP_INFO_FEAT_MASK_REQ_DONE;
1681         conn->info_ident = 0;
1682
1683         mutex_lock(&conn->lock);
1684         l2cap_conn_start(conn);   ← calls into l2cap_conn_start
1685 >>>     l2cap_conn_start(conn);
1686         mutex_unlock(&conn->lock);
1687     }
```

### `l2cap_conn_start` (net/bluetooth/l2cap_core.c:1519–1592), excerpt

The crash path through `l2cap_conn_start` reaches `l2cap_chan_check_security`, which calls `hci_conn_security`:

```c
1519     static void l2cap_conn_start(struct l2cap_conn *conn)
1520     {
1521         ...
1534 >>>     if (!l2cap_chan_check_security(chan, true) ||
1535             !__l2cap_no_conn_pending(chan)) {
```

`l2cap_chan_check_security` calls `hci_conn_security` (via `hci_connect_cfm` chain or directly).

---

## Factual notes

1. **syzbot title suffix**: The subject line carries the suffix "(4)". syzbot adds such a suffix when a new report reuses the title of earlier bugs that were already closed, so this is the fourth syzbot bug filed under this title, not a count of occurrences. The condition is recurring.

2. **WARNING type**: WARN_ONCE in `__queue_work` at `kernel/workqueue.c:2296–2298`. Condition: `wq->flags & (__WQ_DESTROYING | __WQ_DRAINING)` is true AND `!is_chained_work(wq)`. This is a developer-inserted runtime consistency check, not a generic assertion.

3. **Workqueue state**: The warning message is `"workqueue: cannot queue hci_cmd_work on wq hci0"`. `hci0` is the HCI device workqueue; `hci_cmd_work` is its command processing work function.

4. **Caller context**: The WARNING fires in a worker of the `events` workqueue running `l2cap_info_timeout`. This is a timer-based L2CAP info timeout handler, NOT running on the `hci0` workqueue.

5. **Call chain factual path**:
   - `l2cap_info_timeout` (events wq) → `l2cap_conn_start` → `hci_conn_security` → `hci_conn_auth` (inlined, calls `hci_send_cmd`) → `hci_send_cmd` calls `queue_work(hdev->workqueue, &hdev->cmd_work)` → `queue_work_on` → `__queue_work` → WARN

6. **work->func confirmed**: RSI at crash = `ffffffff8aa9ccd0` = `hci_cmd_work` (verified by nm). `hci_cmd_work` is defined in `net/bluetooth/hci_core.c` and is the handler for the `hdev->cmd_work` work item.

7. **wq identified as hci0's workqueue**: `hdev->workqueue` is `hci0`'s workqueue. The warning message confirms this: `"wq hci0"`. The workqueue has `__WQ_DESTROYING` or `__WQ_DRAINING` set.

8. **KASAN enabled**: Register values and disasm contain KASAN shadow addresses and `__asan_report_load8_noabort` calls. The vmlinux has KASAN instrumentation.

9. **`hci_send_cmd` does not check workqueue state**: `hci_send_cmd` at `hci_core.c:3111` calls `queue_work(hdev->workqueue, &hdev->cmd_work)` without checking whether `hdev->workqueue` is in a draining or destroying state.

10. **`hci_conn_auth` does not check hdev state before calling `hci_send_cmd`**: The function checks `test_and_set_bit(HCI_CONN_AUTH_PEND, ...)` but does not verify that `hdev` is in a state that permits queueing new commands.

11. **`l2cap_info_timeout` holds `conn->lock`**: At line 1683, `mutex_lock(&conn->lock)` is acquired before calling `l2cap_conn_start` (line 1684). `conn->lock` is held throughout the call chain down to the WARNING.

12. **Disasm discrepancy**: Code bytes from the oops show `ud1 (%edx),%rdi` (`67 48 0f b9 3a`) at `__queue_work+0xd4a`, while vmlinux disasm shows `call __SCT__WARN_trap` at the same address. The surrounding Code: bytes match `__queue_work` and RIP is `__queue_work+0xd4a`, so the `ud1` executed at the call site itself. This is consistent with the static call site being patched in place at runtime, with the on-disk vmlinux still holding the unpatched `call`.

13. **blame_details for `queue_work_on`** (adjacent to crash site):
    - `86898fa6b8cd942505860556f3a0bf52eae57fe8` "workqueue: Implement disable/enable for (delayed) work items"
    - `8930caba3dbdd8b86dd6934a5920bf61b53a931e` "workqueue: disable irq while manipulating PENDING"

    These are the commits most recently touching `queue_work_on`; neither is obviously related to the destroying/draining check.
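Facts 2, 4, and 7 together describe when `__queue_work` warns. The following is a minimal userspace model of that condition, not kernel code. The flag bit positions and the `Workqueue`, `is_chained_work`, and `should_warn` names are illustrative assumptions, and `is_chained_work` is reduced here to "the caller is already a worker of the target workqueue".

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative bit positions only; the real values are defined in the kernel
# headers and are not quoted in this dump.
WQ_DESTROYING = 1 << 0
WQ_DRAINING = 1 << 1


@dataclass
class Workqueue:
    name: str
    flags: int = 0


def is_chained_work(target: Workqueue, current_wq: Optional[Workqueue]) -> bool:
    """Simplified: chained means the caller already runs on `target`."""
    return current_wq is target


def should_warn(target: Workqueue, current_wq: Optional[Workqueue]) -> bool:
    """Condition from fact 2: destroying or draining AND not chained work."""
    going_away = bool(target.flags & (WQ_DESTROYING | WQ_DRAINING))
    return going_away and not is_chained_work(target, current_wq)


# Scenario from this report: l2cap_info_timeout runs on `events` and queues
# hci_cmd_work on `hci0` while hci0 is being drained or destroyed.
events = Workqueue("events")
hci0 = Workqueue("hci0", flags=WQ_DRAINING)
print("warn when queued from events:", should_warn(hci0, current_wq=events))
print("warn when queued from hci0:", should_warn(hci0, current_wq=hci0))
```

In this model the warning depends on both the state of the target workqueue and the identity of the caller's workqueue, which is why fact 4 (the caller runs on `events`, not `hci0`) matters for the analysis.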