Linux kernel crash report

Syzbot report: WARNING — ODEBUG bug in lane_ioctl (net/atm/lec.c, net-next tree). Source email: 69f16c26.170a0220.34e5b8.0013.GAE@google.com Dashboard: https://syzkaller.appspot.com/bug?extid=ca9d5686d06994c6547c

Root cause: lec_atm_close clears priv->lecd to NULL before calling lec_arp_destroy, creating a window where a concurrent lecd_attach can call lec_arp_init on a still-active lec_arp_work timer.

Analysis complete.

Key elements

Field Value Implication
MSGID <69f16c26.170a0220.34e5b8.0013.GAE@google.com>
MSGID_URL 69f16c26.170a0220.34e5b8.0013.GAE@google.com
BUG_URL https://syzkaller.appspot.com/bug?extid=ca9d5686d06994c6547c
UNAME syzkaller #0 PREEMPT(full)
DISTRO syzbot / net-next
PROCESS syz-executor253 (PID 5933, UID 0)
TAINT Not tainted
HARDWARE Google Compute Engine
BIOS Google 04/18/2026
CRASH_TYPE WARNING — ODEBUG: init active
CRASH_SITE __debug_object_init+0x30f/0x4e0lib/debugobjects.c:632 WARN fires inside debug_print_object (inlined)
CONFIG_REQUIRED CONFIG_DEBUG_OBJECTS_TIMERS Timer-specific ODEBUG; no-op when this config is disabled
HEAD_COMMIT e728258debd5 net-next, Merge tag ‘net-7.1-rc1’
SOURCEDIR oops-workdir/linux Checked out to HEAD_COMMIT
VMLINUX oops-workdir/syzbot/vmlinux-e728258d Downloaded from syzbot storage

Kernel modules

Module Flags Backtrace Location Flag Implication
(no modules linked in)

Backtrace

Address Function Offset Size Context Module Source location
debug_print_object (inlined) Task lib/debugobjects.c:629
0xffffffff84bc2d2f (0xffffffff84bc2a20 + 0x30f) __debug_object_init 0x30f 0x4e0 Task lib/debugobjects.c:780
debug_timer_init (inlined) Task kernel/time/timer.c:788
debug_init (inlined) Task kernel/time/timer.c:836
0xffffffff81b1ce61 (0xffffffff81b1ce20 + 0x41) timer_init_key 0x41 0x2c0 Task kernel/time/timer.c:880
lec_arp_init (inlined) Task net/atm/lec.c:1274
lecd_attach (inlined) Task net/atm/lec.c:781
0xffffffff8ae1a809 (0xffffffff8ae19290 + 0x1579) lane_ioctl 0x1579 0x2220 Task net/atm/lec.c:1037
0xffffffff8ae003dd (0xffffffff8ae00070 + 0x36d) do_vcc_ioctl 0x36d 0x9d0 Task net/atm/ioctl.c:159
0xffffffff8adfe8c6 (0xffffffff8adfe6d0 + 0x1f6) svc_ioctl 0x1f6 0x7d0 Task net/atm/svc.c:611
0xffffffff897a71a1 (0xffffffff897a70a0 + 0x101) sock_do_ioctl 0x101 0x320 Task net/socket.c:1313
0xffffffff897a5b06 (0xffffffff897a5540 + 0x5c6) sock_ioctl 0x5c6 0x7f0 Task net/socket.c:1434
vfs_ioctl (inlined) Task fs/ioctl.c:51
__do_sys_ioctl (inlined) Task fs/ioctl.c:597
0xffffffff824860cc (0xffffffff82485fd0 + 0xfc) __se_sys_ioctl 0xfc 0x170 Task fs/ioctl.c:583
do_syscall_x64 (inlined) Task arch/x86/entry/syscall_64.c:63
0xffffffff8bb861df (0xffffffff8bb86080 + 0x15f) do_syscall_64 0x15f 0xf80 Task arch/x86/entry/syscall_64.c:94
0xffffffff81000130 (0xffffffff810000b9 + 0x77) entry_SYSCALL_64_after_hwframe 0x77 0x7f Task arch/x86/entry/entry_64.S:121

CPU Registers

Register Value Notes
RIP 0xffffffff84bc2d2f __debug_object_init+0x30f — crash site
RSP 0xffffc90003e278f8
EFLAGS 0x00010246
RAX 0x1ffffffff179e734
RBX 0xffff888077a60f78 Object address (timer_list being re-initialised)
RCX 0x0000000000000000
RDX 0xffffffff8c28ab20
RSI 0xffffffff8c28a720
RDI 0xffffffff903e69a0
RBP 0x0000000000000003
R08 0xffff888077a60f78 Same as RBX — object address repeated
R09 0xffffffff8bcf4d00
R10 0xdffffc0000000000 KASAN shadow base
R11 0xffffffff81b236d0
R12 0xffffffff8c28ab20 Same as RDX
R13 0xffffffff8bcf39a0
R14 0xffffffff903e69a0 Same as RDI
R15 0xdffffc0000000000 KASAN shadow base
FS 0x00007f3e1e8976c0 TLS base (userspace)
GS 0xffff888125213000 per-CPU GS base
CS 0x0010 Kernel code segment
CR0 0x0000000080050033
CR2 0x000055a602110fd8 Last page-fault address (user space — unrelated to this crash)
CR3 0x00000000753ea000 Page directory base
CR4 0x00000000003526f0

Backtrace source code

1. __debug_object_init — crash site (lib/debugobjects.c:632)

The WARN fires inside debug_print_object (inlined into __debug_object_init). The object at 0xffff888077a60f78 (a timer_list) is in the active state when debug_object_init is called on it a second time.

lib/debugobjects.c on elixir

// lib/debugobjects.c:611
611 static void debug_print_object(struct debug_obj *obj, char *msg)
612 {
613 const struct debug_obj_descr *descr = obj->descr;
614 static int limit;
615 
616 /*
617  * Don't report if lookup_object_or_alloc() by the current thread
618  * failed because lookup_object_or_alloc()/debug_objects_oom() by a
619  * concurrent thread turned off debug_objects_enabled and cleared
620  * the hash buckets.
621  */
622 if (!debug_objects_enabled)
623 return;
624 
625 if (limit < 5 && descr != descr_test) {
626 void *hint = descr->debug_hint ?
627 descr->debug_hint(obj->object) : NULL;
628 limit++;
629 WARN(1, KERN_ERR "ODEBUG: %s %s (active state %u) "
630  "object: %p object type: %s hint: %pS\n",
631 msg, obj_states[obj->state], obj->astate,
632 obj->object, descr->name, hint);   // <<< WARN fires here
633 }
634 debug_objects_warnings++;
635 }

lib/debugobjects.c__debug_object_init

// lib/debugobjects.c:747
747 static void
748 __debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack)
749 {
750 struct debug_obj *obj, o;
751 struct debug_bucket *db;
752 unsigned long flags;
753 
754 debug_objects_fill_pool();
755 
756 db = get_bucket((unsigned long) addr);
757 
758 raw_spin_lock_irqsave(&db->lock, flags);
759 
760 obj = lookup_object_or_alloc(addr, db, descr, onstack, false);
761 if (unlikely(!obj)) {
762 raw_spin_unlock_irqrestore(&db->lock, flags);
763 debug_objects_oom();
764 return;
765 }
766 
767 switch (obj->state) {
768 case ODEBUG_STATE_NONE:
769 case ODEBUG_STATE_INIT:
770 case ODEBUG_STATE_INACTIVE:
771 obj->state = ODEBUG_STATE_INIT;
772 raw_spin_unlock_irqrestore(&db->lock, flags);
773 return;
774 default:
775 break;
776 }
777 
778 o = *obj;
779 raw_spin_unlock_irqrestore(&db->lock, flags);
780 debug_print_object(&o, "init");    // <<< called when state is not NONE/INIT/INACTIVE
781 
782 if (o.state == ODEBUG_STATE_ACTIVE)
783 debug_object_fixup(descr->fixup_init, addr, o.state);
784 }

2. timer_init_keykernel/time/timer.c:880

kernel/time/timer.c on elixir

// kernel/time/timer.c:786
786 static inline void debug_timer_init(struct timer_list *timer)
787 {
788 debug_object_init(timer, &timer_debug_descr);  // <<< [inline] backtrace entry
789 }

// kernel/time/timer.c:834
834 static inline void debug_init(struct timer_list *timer)
835 {
836 debug_timer_init(timer);   // <<< [inline] backtrace entry
837 trace_timer_init(timer);
838 }

// kernel/time/timer.c:876
876 void timer_init_key(struct timer_list *timer,
877     void (*func)(struct timer_list *), unsigned int flags,
878     const char *name, struct lock_class_key *key)
879 {
880 debug_init(timer);                          // <<< backtrace entry
881 do_init_timer(timer, func, flags, name, key);
882 }
883 EXPORT_SYMBOL(timer_init_key);

3. lane_ioctl / lecd_attach / lec_arp_init (net/atm/lec.c:1037)

net/atm/lec.c on elixir

// net/atm/lec.c:1264
1264 static void lec_arp_init(struct lec_priv *priv)
1265 {
1266 unsigned short i;
1267 
1268 for (i = 0; i < LEC_ARP_TABLE_SIZE; i++)
1269 INIT_HLIST_HEAD(&priv->lec_arp_tables[i]);
1270 INIT_HLIST_HEAD(&priv->lec_arp_empty_ones);
1271 INIT_HLIST_HEAD(&priv->lec_no_forward);
1272 INIT_HLIST_HEAD(&priv->mcast_fwds);
1273 spin_lock_init(&priv->lec_arp_lock);
1274 INIT_DELAYED_WORK(&priv->lec_arp_work, lec_arp_check_expire);  // <<< inlined entry
1275 schedule_delayed_work(&priv->lec_arp_work, LEC_ARP_REFRESH_INTERVAL);
1276 }
// net/atm/lec.c:748
748 static int lecd_attach(struct atm_vcc *vcc, int arg)
749 {
750 int i;
751 struct lec_priv *priv;
752 
753 lockdep_assert_held(&lec_mutex);
754 if (arg < 0)
755 arg = 0;
756 if (arg >= MAX_LEC_ITF)
757 return -EINVAL;
758 i = array_index_nospec(arg, MAX_LEC_ITF);
759 if (!dev_lec[i]) {
760 int size;
761 
762 size = sizeof(struct lec_priv);
763 dev_lec[i] = alloc_etherdev(size);
764 if (!dev_lec[i])
765 return -ENOMEM;
766 dev_lec[i]->netdev_ops = &lec_netdev_ops;
767 dev_lec[i]->max_mtu = 18190;
768 snprintf(dev_lec[i]->name, IFNAMSIZ, "lec%d", i);
769 if (register_netdev(dev_lec[i])) {
770 free_netdev(dev_lec[i]);
771 dev_lec[i] = NULL;
772 return -EINVAL;
773 }
774 
775 priv = netdev_priv(dev_lec[i]);
776 } else {
776 priv = netdev_priv(dev_lec[i]);
777 if (rcu_access_pointer(priv->lecd))
778 return -EADDRINUSE;
779 }
781 lec_arp_init(priv);   // <<< called unconditionally for both new and existing priv
782 ...
// net/atm/lec.c:1018
1018 static int lane_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
1019 {
1020 struct atm_vcc *vcc = ATM_SD(sock);
1021 int err = 0;
1022 
1023 switch (cmd) {
1024 case ATMLEC_CTRL:
1025 case ATMLEC_MCAST:
1026 case ATMLEC_DATA:
1027 if (!capable(CAP_NET_ADMIN))
1028 return -EPERM;
1029 break;
1030 default:
1031 return -ENOIOCTLCMD;
1032 }
1033 
1034 mutex_lock(&lec_mutex);
1035 switch (cmd) {
1036 case ATMLEC_CTRL:
1037 err = lecd_attach(vcc, (int)arg);   // <<< backtrace entry
1038 if (err >= 0)
1039 sock->state = SS_CONNECTED;
1040 break;
1041 ...
1049 mutex_unlock(&lec_mutex);
1050 return err;
1051 }

4. do_vcc_ioctl (net/atm/ioctl.c:159)

net/atm/ioctl.c on elixir

// net/atm/ioctl.c:153
153 error = -ENOIOCTLCMD;
154 
155 mutex_lock(&ioctl_mutex);
156 list_for_each(pos, &ioctl_list) {
157 struct atm_ioctl *ic = list_entry(pos, struct atm_ioctl, list);
158 if (try_module_get(ic->owner)) {
159 error = ic->ioctl(sock, cmd, arg);  // <<< backtrace entry — calls lane_ioctl
160 module_put(ic->owner);
161 if (error != -ENOIOCTLCMD)
162 break;
163 }
164 }
165 mutex_unlock(&ioctl_mutex);

Phase 3 — Root Cause Analysis

What

The ODEBUG WARNING fires inside __debug_object_init when INIT_DELAYED_WORK(&priv->lec_arp_work, lec_arp_check_expire) is called on a timer_list that is already in ODEBUG_STATE_ACTIVE (state 3).

The object at 0xffff888077a60f78 is the timer_list embedded in lec_priv.lec_arp_work (a delayed_work). Its callback hint resolves to lec_arp_check_expire, confirming the identity.

__debug_object_init transitions the object state:

// lib/debugobjects.c:767
767     switch (obj->state) {
768     case ODEBUG_STATE_NONE:
769     case ODEBUG_STATE_INIT:
770     case ODEBUG_STATE_INACTIVE:
771         obj->state = ODEBUG_STATE_INIT;
772         raw_spin_unlock_irqrestore(&db->lock, flags);
773         return;                  // ← safe cases return here
774     default:
775         break;
776     }
780     debug_print_object(&o, "init");   // ← fires when state == ACTIVE (3)

Register RBP = 0x0000000000000003 confirms the object state was ODEBUG_STATE_ACTIVE at the time of the WARN.

INIT_DELAYED_WORK expands to __INIT_DELAYED_WORKtimer_setuptimer_init_keydebug_initdebug_timer_initdebug_object_init__debug_object_init.

The call reaches lec_arp_init via:

// net/atm/lec.c:1264
1264 static void lec_arp_init(struct lec_priv *priv)
1265 {
1266     unsigned short i;
1267     for (i = 0; i < LEC_ARP_TABLE_SIZE; i++)
1268         INIT_HLIST_HEAD(&priv->lec_arp_tables[i]);
1269     INIT_HLIST_HEAD(&priv->lec_arp_empty_ones);
1270     INIT_HLIST_HEAD(&priv->lec_no_forward);
1271     INIT_HLIST_HEAD(&priv->mcast_fwds);
1272     spin_lock_init(&priv->lec_arp_lock);
1274     INIT_DELAYED_WORK(&priv->lec_arp_work, lec_arp_check_expire);  // ← WARN triggered here
1275     schedule_delayed_work(&priv->lec_arp_work, LEC_ARP_REFRESH_INTERVAL);
1276 }

How

Q1: How is lec_arp_init called on a priv whose lec_arp_work is already active?

A1: lecd_attach calls lec_arp_init(priv) unconditionally for both new and existing dev_lec[i] entries:

// net/atm/lec.c:748
748 static int lecd_attach(struct atm_vcc *vcc, int arg)
749 {
        ...
759     if (!dev_lec[i]) {
            // new device: allocate, register, set priv
775         priv = netdev_priv(dev_lec[i]);
776     } else {
777         priv = netdev_priv(dev_lec[i]);
778         if (rcu_access_pointer(priv->lecd))
779             return -EADDRINUSE;          // ← blocks if daemon already attached
780     }
781     lec_arp_init(priv);    // ← called for existing priv WITHOUT cancelling work

In the else-branch, the only guard is that priv->lecd is NULL (no daemon currently attached). There is no cancellation of priv->lec_arp_work before the re-initialization.

Q2: How is priv->lecd NULL while lec_arp_work is still active?

A2: lec_atm_close — the ATM VCC close handler — clears priv->lecd to NULL before calling lec_arp_destroy:

// net/atm/lec.c:487
487 static void lec_atm_close(struct atm_vcc *vcc)
488 {
489     struct net_device *dev = (struct net_device *)vcc->proto_data;
490     struct lec_priv *priv = netdev_priv(dev);
491
492     rcu_assign_pointer(priv->lecd, NULL);  // ← lecd cleared here
493     synchronize_rcu();                     // ← wait for RCU readers
494     /* Do something needful? */
495     netif_stop_queue(dev);
496
497     lec_arp_destroy(priv);                 // ← work cancelled here (too late)

lec_arp_destroy calls cancel_delayed_work_sync(&priv->lec_arp_work), but it is invoked after priv->lecd is already NULL. Since lec_atm_close does not hold lec_mutex, there is a race window:

Thread A (lec_atm_close):           Thread B (lecd_attach via lane_ioctl):
  rcu_assign_pointer(lecd, NULL)
  synchronize_rcu()                   mutex_lock(&lec_mutex)
  [window open]                       sees priv->lecd == NULL  ← passes guard
                                      lec_arp_init(priv)       ← INIT_DELAYED_WORK
                                                                  on active timer ← WARN
  lec_arp_destroy(priv)              [too late: work already re-initialized]

lec_arp_check_expire (the work function) does not acquire lec_mutex, so calling cancel_delayed_work_sync from within a lec_mutex-holding context is safe — no deadlock risk.

Root cause (complete fact, confirmed): lec_atm_close clears priv->lecd to NULL without holding lec_mutex, creating a race window between the NULL assignment and lec_arp_destroy. During this window a concurrent lecd_attach re-enters lec_arp_init on a still-active lec_arp_work, triggering the ODEBUG WARNING.

Where

The fix belongs in lecd_attach, in the else-branch that handles an existing device. Before calling lec_arp_init(priv), cancel any in-flight delayed work:

    } else {
        priv = netdev_priv(dev_lec[i]);
        if (rcu_access_pointer(priv->lecd))
            return -EADDRINUSE;
+       cancel_delayed_work_sync(&priv->lec_arp_work);
    }
    lec_arp_init(priv);

cancel_delayed_work_sync is a no-op if the work has already been cancelled by lec_arp_destroy in the concurrent lec_atm_close path, so it is safe in both the racy and the non-racy case.

lec_arp_check_expire (the work function) only holds priv->lec_arp_lock (a spinlock) and never acquires lec_mutex, so the cancel_delayed_work_sync call is safe from a mutex-held context.

PATCH_BASE: e728258debd5

--- a/net/atm/lec.c
+++ b/net/atm/lec.c
@@ -776,6 +776,7 @@ static int lecd_attach(struct atm_vcc *vcc, int arg)
 } else {
 priv = netdev_priv(dev_lec[i]);
 if (rcu_access_pointer(priv->lecd))
 return -EADDRINUSE;
+cancel_delayed_work_sync(&priv->lec_arp_work);
 }
 lec_arp_init(priv);

Bug Introduction

The race exists because lec_atm_close does not hold lec_mutex. The lec_mutex was introduced by commit d13a3824bfd2 (“net: atm: add lec_mutex”) to serialise access to dev_lec[] from lecd_attach, lec_vcc_attach, and lec_mcast_attach, but lec_atm_close was not updated to take the mutex. The bug was therefore introduced by d13a3824bfd2 (or was pre-existing but became exploitable once lec_mutex serialised the concurrent path).

The unconditional lec_arp_init(priv) call for existing devices predates d13a3824bfd2 and has always been present in the LANE driver.


Search anchor: bug introduction commit d13a3824bfd2 (“net: atm: add lec_mutex”).

Git log — commits to net/atm/lec.c between d13a3824bfd2 and e728258debd5 (HEAD):

Hash Subject
922814879542 atm: lec: fix use-after-free in sock_def_readable()
101bacb303e8 atm: lec: fix null-ptr-deref in lec_arp_clear_vccs
bf4afc53b77a Convert ‘alloc_obj’ family to use the new default GFP_KERNEL argument
69050f8d6d07 treewide: Replace kmalloc with kmalloc_obj for non-scalar types
d03b79f459c7 net: atm: fix /proc/net/atm/lec handling

None of these address the lec_arp_work re-initialisation race. No commit cancels lec_arp_work before calling lec_arp_init in the else-branch of lecd_attach.

lore.kernel.org search — searched for patches touching lecd_attach, lec_arp_init, lec_arp_work, or lane_ioctl since the introduction of lec_mutex (d13a3824bfd2, 2025-06-18): no results. No fix for this specific race has been posted to LKML or netdev as of the time of analysis.

Conclusion: no existing upstream fix was found. The proposed fix in report.patch (adding cancel_delayed_work_sync(&priv->lec_arp_work) in the else-branch of lecd_attach before lec_arp_init) is novel.


Patch

Status: succeeded

Base commit: e728258debd5 (exact — net-next, Merge tag ‘net-7.1-rc1’)

Validation: git apply --check passed cleanly against e728258debd5.

Output files: - patch-email.txt — LKML-ready patch email - git-send-email.sh — send script (--to netdev@vger.kernel.org, --in-reply-to 69f16c26.170a0220.34e5b8.0013.GAE@google.com)


Fact Check

All checked items verified.

See factcheck.md for detailed verification results and cosmetic fixes applied.


Patch Review

Verdict: PASS

All checklist items cleared — the patch is safe and correct. The added cancel_delayed_work_sync(&priv->lec_arp_work) eliminates the race condition by ensuring in-flight work is drained before lec_arp_init() reinitializes the lec_arp_work timer. No resource leaks, lock imbalances, NULL dereferences, or deadlock risks are introduced. The fix properly serializes the work lifecycle in the reattach path and is consistent with the existing lec_arp_destroy() pattern. Deadlock safety is confirmed: the work callback only holds priv->lec_arp_lock (orthogonal to lec_mutex held by lecd_attach), so cancel_delayed_work_sync() is safe from within the mutex-held context.