# Linux kernel crash report

Source: syzbot/LKML email report, HEAD commit `6596a02b2078` (Merge tag 'drm-next-2026-04-22'), upstream git tree.

Oops-Analysis: http://oops.fenrus.org/reports/email/69e87e0e.a00a0220.9259.001c.GAE_google.com/report.html

---

## Key elements

| Field | Value | Implication |
|----------------|-------|-------------|
| UNAME | `syzkaller #0` | Upstream kernel, exact commit `6596a02b2078` |
| PROCESS | `syz.6.3284` (PID 17707) | syzkaller fuzzer process |
| TAINT | L (SOFTLOCKUP) | Indicates a system stall prior to crash |
| HARDWARE | QEMU Standard PC (Q35 + ICH9, 2009) | QEMU VM used by syzbot |
| BIOS | 1.16.3-debian-1.16.3-2 04/01/2014 | |
| MSGID | `<69e87e0e.a00a0220.9259.001c.GAE@google.com>` | |
| MSGID_URL | [69e87e0e.a00a0220.9259.001c.GAE@google.com](https://lore.kernel.org/all/69e87e0e.a00a0220.9259.001c.GAE@google.com/) | |
| SOURCEDIR | `oops-workdir/linux` at commit [`6596a02b2078`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6596a02b20788d4211eed1e048a6b9a9ef25e9c3) | Exact commit available |
| VMLINUX | Not available locally (syzbot asset would need downloading) | Source-only analysis |
| INTRODUCED-BY | [`f1327abd6abe`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f1327abd6abed031ae4146825c6b28bdd1456474), "RDMA/rxe: Support RDMA link creation and destruction per net namespace" | Introduced race condition |

---

## Kernel modules

| Module | Flags | Backtrace | Location | Flag Implication |
|--------|-------|-----------|----------|-----------------|
| *(module list not available in this report)* | | | | |

---

## Backtrace

| Address | Function | Offset | Size | Context | Module | Source location |
|---------|----------|--------|------|---------|--------|----------------|
| (RIP) | [`iput.part.0`](#1-iputpart0--crash-site-fsinodec1980) | `+0xa94` | `0xf50` | Task | *(built-in)* | [fs/inode.c:1980](#1-iputpart0--crash-site-fsinodec1980) |
| | `iput` | `+0x35` | `0x40` | Task | *(built-in)* | [fs/inode.c:1975](#2-iput--fsinodec1975) |
| | `__sock_release` (inlined) | | | Task | *(built-in)* | [net/socket.c:734](#3-__sock_release--netsocketc734) |
| | `sock_release` | `+0x169` | `0x1c0` | Task | *(built-in)* | [net/socket.c:750](#3-__sock_release--netsocketc734) |
| | `rxe_release_udp_tunnel` (inlined) | | | Task | *(built-in)* | [drivers/infiniband/sw/rxe/rxe_net.c:294](#4-rxe_sock_put--rxe_netc639) |
| | `rxe_sock_put` | `+0xae` | `0x130` | Task | *(built-in)* | [drivers/infiniband/sw/rxe/rxe_net.c:639](#4-rxe_sock_put--rxe_netc639) |
| | `rxe_net_del` | `+0x83` | `0x120` | Task | *(built-in)* | drivers/infiniband/sw/rxe/rxe_net.c:660 |
| | `rxe_dellink` | `+0x15` | `0x20` | Task | *(built-in)* | drivers/infiniband/sw/rxe/rxe.c:254 |
| | `nldev_dellink` | `+0x289` | `0x3c0` | Task | *(built-in)* | drivers/infiniband/core/nldev.c:1849 |
| | `rdma_nl_rcv_msg` | `+0x392` | `0x6f0` | Task | *(built-in)* | drivers/infiniband/core/netlink.c:195 |
| | `rdma_nl_rcv_skb.constprop.0.isra.0` | `+0x2cb` | `0x410` | Task | *(built-in)* | drivers/infiniband/core/netlink.c:239 |
| | `netlink_unicast_kernel` (inlined) | | | Task | *(built-in)* | net/netlink/af_netlink.c:1318 |
| | `netlink_unicast` | `+0x585` | `0x850` | Task | *(built-in)* | net/netlink/af_netlink.c:1344 |
| | `netlink_sendmsg` | `+0x8b0` | `0xda0` | Task | *(built-in)* | net/netlink/af_netlink.c:1894 |
| | `sock_sendmsg_nosec` (inlined) | | | Task | *(built-in)* | net/socket.c:787 |
| | `__sock_sendmsg` (inlined) | | | Task | *(built-in)* | net/socket.c:802 |
| | `____sys_sendmsg` | `+0x9e1` | `0xb70` | Task | *(built-in)* | net/socket.c:2698 |
| | `___sys_sendmsg` | `+0x190` | `0x1e0` | Task | *(built-in)* | net/socket.c:2752 |
| | `__sys_sendmsg` | `+0x170` | `0x220` | Task | *(built-in)* | net/socket.c:2784 |
| | `do_syscall_x64` (inlined) | | | Task | *(built-in)* | arch/x86/entry/syscall_64.c:63 |
| | `do_syscall_64` | `+0x10b` | `0xf80` | Task | *(built-in)* | arch/x86/entry/syscall_64.c:94 |
| | `entry_SYSCALL_64_after_hwframe` | `+0x77` | `0x7f` | Task | *(built-in)* | arch/x86/entry/common.h |

---

## CPU Registers

```
RIP: 0010:iput.part.0+0xa94/0xf50 fs/inode.c:1980
RSP: 0018:ffffc90005107128 EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffff888059f79900 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81250760 RDI: fffff52000a20de7
RBP: 0000000000000300 R12: ffff888059f79860
R13: ffffffff90dc56e4 R14: ffff888059f799d0 R15: dffffc0000000000
CR0: 0000000080050033 CR2: 00007f3056de9f00 CR3: 0000000058d5d000 CR4: 0000000000352ef0
```

**Notable registers:**

- **RBX = `ffff888059f79900`**: the `struct inode *` passed into `iput()`, confirmed by the `VFS_BUG_ON_INODE` output (`inode:ffff888059f79900`).
- **RBP = `0x300`**: holds the inode state at the assertion site. `0x300 = I_FREEING (0x100) | I_CLEAR (0x200)`, meaning the inode has already been torn down by an earlier release. This is direct evidence of the double release.
- **R12 = `ffff888059f79860`**: `inode - 0xa0`, most likely the `struct socket` that precedes the inode inside the sockfs `struct socket_alloc` allocation.

---

## Code byte line extraction

```
Code: 88 76 ff 48 c7 c6 60 9a c5 8b 48 89 df e8 74 68 ff ff 90 0f 0b e8 ac 88 76 ff 48 c7 c6 40 8f c5 8b 48 89 df e8 5d 68 ff ff 90 <0f> 0b e8 95 88 76 ff 48 c7 c6 00 9a c5 8b 48 89 df e8 46 68 ff ff
```

Decoded (via `scripts/decodecode`):

```
  ...
  13:   0f 0b     ud2         ; first BUG() in the sequence
  ...
  2a:*  0f 0b     ud2         ; <-- TRAPPING INSTRUCTION (RIP points here)
  2c:   e8 ...    call ...    ; next statement after the trapping ud2
```

The trapping instruction is `ud2`, the x86 encoding of `BUG()`. This is the invalid-opcode trap emitted by the `VFS_BUG_ON_INODE` macro at `fs/inode.c:1980`. The surrounding pattern (several `ud2` instructions, each preceded by a `call`) is characteristic of consecutive `VFS_BUG_ON_INODE` assertions in `iput()`: each assertion dumps the inode and then traps.

---

## Backtrace source code

### 1. `iput.part.0` — crash site (`fs/inode.c:1980`)

[fs/inode.c at commit 6596a02b2078](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/inode.c?h=v6.14#n1972)

```c
1972 void iput(struct inode *inode)
1973 {
1974 	might_sleep();
1975 	if (unlikely(!inode))
1976 		return;
1977
1978 retry:
1979 	lockdep_assert_not_held(&inode->i_lock);
1980 	VFS_BUG_ON_INODE(inode_state_read_once(inode) & (I_FREEING | I_CLEAR), inode);
     	// ← CRASH HERE: inode (RBX=ffff888059f79900) has state 0x300
     	//   I_FREEING(0x100)|I_CLEAR(0x200): already torn down
     	//   i_count=0: no live references remain
     	...
2010 }
```

The `VFS_BUG_ON_INODE` macro dumps the inode and calls `BUG()` when the kernel is built with `CONFIG_DEBUG_VFS`; `BUG()` emits the `ud2` instruction. The crash confirms that `iput()` is being called on an inode that is already in state `I_FREEING | I_CLEAR`: the inode was released by a prior call and its reference count is already 0.

### 2. `iput` — `fs/inode.c:1975`

[fs/inode.c at commit 6596a02b2078](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/inode.c?h=v6.14#n1972)

```c
1972 void iput(struct inode *inode)
1973 {
1974 	might_sleep();
1975 	if (unlikely(!inode)) // ← call here (wrapper iput → iput.part.0)
1976 		return;
     	...
```

The public `iput()` at `+0x35/0x40` is the thin wrapper that dispatches to `iput.part.0`, the compiler-split body of the same function. The offset `+0x35` places execution just past the NULL guard.

### 3. `__sock_release` — `net/socket.c:734`

[net/socket.c at commit 6596a02b2078](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/net/socket.c?h=v6.14#n713)

```c
713 static void __sock_release(struct socket *sock, struct inode *inode)
714 {
715 	const struct proto_ops *ops = READ_ONCE(sock->ops);
716
717 	if (ops) {
    		...
722 		ops->release(sock);
723 		sock->sk = NULL;
    		...
728 	}
729
730 	if (sock->wq.fasync_list)
731 		pr_err("%s: fasync list not empty!\n", __func__);
732
733 	if (!sock->file) {
734 		iput(SOCK_INODE(sock)); // ← call here: drops inode ref for kernel socket
735 		return;
736 	}
737 	WRITE_ONCE(sock->file, NULL);
738 }
739
740 ...
748 void sock_release(struct socket *sock)
749 {
750 	__sock_release(sock, NULL); // ← sock_release call site
751 }
```

`sock_release()` calls `__sock_release()` with `inode=NULL`. Kernel sockets have no backing file, so the `!sock->file` test at line 733 is true for them and `iput(SOCK_INODE(sock))` at line 734 always runs. When the inode has already been released by a previous `sock_release()`, this second `iput()` triggers the BUG.

### 4. `rxe_sock_put` — `rxe_net.c:639`

[drivers/infiniband/sw/rxe/rxe_net.c at commit 6596a02b2078](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/infiniband/sw/rxe/rxe_net.c?h=v6.14#n632)

```c
291 static void rxe_release_udp_tunnel(struct socket *sk)
292 {
293 	if (sk)
294 		udp_tunnel_sock_release(sk); // ← inlined call at rxe_net.c:294
295 }
296 ...
632 static void rxe_sock_put(struct sock *sk,
633 			 void (*set_sk)(struct net *, struct sock *),
634 			 struct net *net)
635 {
636 	if (refcount_read(&sk->sk_refcnt) > SK_REF_FOR_TUNNEL) {
637 		__sock_put(sk);
638 	} else {
639 		rxe_release_udp_tunnel(sk->sk_socket); // ← call here (via inline)
    		// → udp_tunnel_sock_release → sock_release → __sock_release → iput
    		// CRASH: inode already I_FREEING|I_CLEAR from prior release
640 		sk = NULL;
641 		set_sk(net, sk); // NOTE: pointer is cleared AFTER releasing
642 	}
643 }
```

---

## What-How-Where analysis

### What

The kernel BUG at `fs/inode.c:1980` is triggered by `VFS_BUG_ON_INODE`, an assertion that fires when `iput()` is called on an inode that has already been torn down (`I_FREEING | I_CLEAR`, state `0x300`). The inode belongs to a sockfs socket: the per-network-namespace UDP tunnel socket that the RXE soft-RoCE driver uses to encapsulate RDMA traffic.
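The state arithmetic cited above can be checked with a small standalone sketch. The flag values are the ones quoted in this report (`0x100` and `0x200`), not values copied from a kernel header, and `decode_state()` and `would_trip_assertion()` are hypothetical helpers written only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Flag values as quoted in this report; assumed, not taken from a kernel header. */
#define REPORT_I_FREEING 0x100u
#define REPORT_I_CLEAR   0x200u

/*
 * Hypothetical helper: append the names of the set flags to buf and
 * return any bits that match neither flag.
 */
static unsigned int decode_state(unsigned int state, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (state & REPORT_I_FREEING)
		strncat(buf, "I_FREEING ", len - strlen(buf) - 1);
	if (state & REPORT_I_CLEAR)
		strncat(buf, "I_CLEAR ", len - strlen(buf) - 1);
	return state & ~(REPORT_I_FREEING | REPORT_I_CLEAR);
}

/* Mirrors the condition quoted from fs/inode.c:1980. */
static int would_trip_assertion(unsigned int state)
{
	return (state & (REPORT_I_FREEING | REPORT_I_CLEAR)) != 0;
}
```

Calling `decode_state(0x300, buf, sizeof(buf))` from a test harness names both flags and leaves no unexplained bits, which matches the reading of RBP given in this report.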
Register evidence:

- **RBX = `ffff888059f79900`** is the `struct inode *` of the (already released) sockfs inode.
- **RBP = `0x300`** is the inode state snapshot at the check, confirming `I_FREEING|I_CLEAR`.
- The `VFS_BUG_ON_INODE` line in the oops header shows `state:0x300 count:0`, i.e. the inode reference count is already zero and the inode has been torn down.

In plain terms: `iput()` is called twice on the same sockfs inode, and the second call happens after the first has already released the inode.

### How

**Q1: Who released the socket the first time?**

A1: `rxe_ns_exit()` in `drivers/infiniband/sw/rxe/rxe_ns.c`, the pernet cleanup callback registered via `register_pernet_subsys(&rxe_net_ops)`. When a network namespace is torn down, `rxe_ns_exit()` reads the stored UDP socket pointer from the per-namespace `rxe_ns_sock` structure and calls `udp_tunnel_sock_release(sk->sk_socket)`:

```c
38 static void rxe_ns_exit(struct net *net)
39 {
40 	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
41 	struct sock *sk;
42
45 	rcu_read_lock();
46 	sk = rcu_dereference(ns_sk->rxe_sk4); // reads pointer
47 	rcu_read_unlock();
48 	if (sk) {
49 		rcu_assign_pointer(ns_sk->rxe_sk4, NULL);
50 		udp_tunnel_sock_release(sk->sk_socket); // ← first release
51 	}
   	...
62 }
```

**Q2: Who released the socket the second time (the crash)?**

A2: `rxe_sock_put()` in `rxe_net.c`, called from `rxe_net_del()` → `rxe_dellink()` → `nldev_dellink()`, triggered by a user-space `sendmsg` to the RDMA netlink socket (explicit deletion of the RXE device):

```c
632 static void rxe_sock_put(struct sock *sk,
633 			 void (*set_sk)(struct net *, struct sock *),
634 			 struct net *net)
635 {
636 	if (refcount_read(&sk->sk_refcnt) > SK_REF_FOR_TUNNEL) {
637 		__sock_put(sk);
638 	} else {
639 		rxe_release_udp_tunnel(sk->sk_socket); // ← second release → CRASH
640 		sk = NULL;
641 		set_sk(net, sk); // clears pointer AFTER releasing: too late
642 	}
643 }
```

**Q3: How does the race happen?**

A3: Both code paths independently read the namespace socket pointer and decide to release the socket, with no synchronisation between them:

```
CPU 0 (namespace teardown)                  CPU 1 (nldev_dellink)
─────────────────────────────────           ──────────────────────────────────
rxe_ns_exit():                              rxe_net_del() → rxe_sock_put():
  rcu_read_lock()                             sk = rxe_ns_pernet_sk4(net)
  sk = rcu_dereference(ns_sk->rxe_sk4)        // reads non-NULL pointer
  rcu_read_unlock()
                   ← both see non-NULL sk ←
  rcu_assign_pointer(ns_sk->rxe_sk4, NULL)
  udp_tunnel_sock_release(sk->sk_socket)
    → sock_release → iput
    // inode freed, I_FREEING|I_CLEAR set
                                              rxe_release_udp_tunnel(sk->sk_socket)
                                                → sock_release → iput → BUG!
```

The specific vulnerability is:

1. **`rxe_ns_exit()` sets the namespace pointer to NULL *before* releasing the socket**, but the read and the clear are not one atomic step. If `rxe_net_del()` reads the pointer at any time before the `rcu_assign_pointer(…, NULL)` takes effect, both paths hold a copy of the same pointer.

2. **`rxe_sock_put()` clears the namespace pointer *after* releasing the socket** (line 641 comes after line 639). This creates an additional window where the socket has already been freed but the namespace pointer still points to it. If `rxe_ns_exit()` reads the pointer during this window, it gets a dangling pointer to a freed socket and tries to release it again.

There is no locking between the two release paths. The race was introduced by commit [`f1327abd6abe`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f1327abd6abed031ae4146825c6b28bdd1456474) (March 2026), which added per-network-namespace socket management for RXE devices, together with companion commit [`13f2a53c2a71`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=13f2a53c2a71e12478279261b35e2abf393523b3), which added `rxe_ns.c`/`rxe_ns.h`.

### Where

The fix must ensure that the socket is released at most once, regardless of whether the release comes from `rxe_ns_exit()` (namespace teardown) or `rxe_net_del()` (explicit device deletion). The two operations must be serialized.

**Proposed fix:** add a `struct mutex` to `struct rxe_ns_sock` that serializes the read-and-clear of the socket pointer. The actual `udp_tunnel_sock_release()` can run outside the lock (it sleeps), but the "claim" (read + set-NULL) must be atomic:

```diff
--- a/drivers/infiniband/sw/rxe/rxe_ns.c
+++ b/drivers/infiniband/sw/rxe/rxe_ns.c
@@ -15,18 +15,42 @@
 struct rxe_ns_sock {
 	struct sock __rcu *rxe_sk4;
 	struct sock __rcu *rxe_sk6;
+	struct mutex lock; /* serializes socket claim/release */
 };
 
 static unsigned int rxe_pernet_id;
 
 static int rxe_ns_init(struct net *net)
 {
+	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+
+	mutex_init(&ns_sk->lock);
 	return 0;
 }
 
+/*
+ * Atomically claim the socket pointer from the namespace: read it and set it
+ * to NULL under the lock, then return the pointer (or NULL if already gone).
+ * The caller is responsible for releasing the socket after the lock is dropped.
+ */
+static struct sock *rxe_ns_claim_sk(struct rxe_ns_sock *ns_sk,
+				     struct sock __rcu **sk_rcu)
+{
+	struct sock *sk;
+
+	mutex_lock(&ns_sk->lock);
+	sk = rcu_dereference_protected(*sk_rcu, lockdep_is_held(&ns_sk->lock));
+	if (sk)
+		rcu_assign_pointer(*sk_rcu, NULL);
+	mutex_unlock(&ns_sk->lock);
+	return sk;
+}
+
 static void rxe_ns_exit(struct net *net)
 {
 	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
 	struct sock *sk;
 
-	rcu_read_lock();
-	sk = rcu_dereference(ns_sk->rxe_sk4);
-	rcu_read_unlock();
-	if (sk) {
-		rcu_assign_pointer(ns_sk->rxe_sk4, NULL);
-		udp_tunnel_sock_release(sk->sk_socket);
-	}
+	sk = rxe_ns_claim_sk(ns_sk, &ns_sk->rxe_sk4);
+	if (sk)
+		udp_tunnel_sock_release(sk->sk_socket);
 
 #if IS_ENABLED(CONFIG_IPV6)
-	rcu_read_lock();
-	sk = rcu_dereference(ns_sk->rxe_sk6);
-	rcu_read_unlock();
-	if (sk) {
-		rcu_assign_pointer(ns_sk->rxe_sk6, NULL);
-		udp_tunnel_sock_release(sk->sk_socket);
-	}
+	sk = rxe_ns_claim_sk(ns_sk, &ns_sk->rxe_sk6);
+	if (sk)
+		udp_tunnel_sock_release(sk->sk_socket);
 #endif
 }
```

Expose the claim helper to `rxe_net.c` via `rxe_ns.h`:

```diff
--- a/drivers/infiniband/sw/rxe/rxe_ns.h
+++ b/drivers/infiniband/sw/rxe/rxe_ns.h
@@ -5,6 +5,8 @@
 struct sock *rxe_ns_pernet_sk4(struct net *net);
 void rxe_ns_pernet_set_sk4(struct net *net, struct sock *sk);
+struct sock *rxe_ns_pernet_take_sk4(struct net *net);
+struct sock *rxe_ns_pernet_take_sk6(struct net *net);
 
 #if IS_ENABLED(CONFIG_IPV6)
 void rxe_ns_pernet_set_sk6(struct net *net, struct sock *sk);
```

Add the implementations in `rxe_ns.c`:

```c
struct sock *rxe_ns_pernet_take_sk4(struct net *net)
{
	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);

	return rxe_ns_claim_sk(ns_sk, &ns_sk->rxe_sk4);
}

struct sock *rxe_ns_pernet_take_sk6(struct net *net)
{
#if IS_ENABLED(CONFIG_IPV6)
	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);

	return rxe_ns_claim_sk(ns_sk, &ns_sk->rxe_sk6);
#else
	return NULL;
#endif
}
```

And update `rxe_net_del()` to use the atomic-claim helpers, replacing the racy read-then-put pattern:

```diff
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -645,13 +645,14 @@ void rxe_net_del(struct ib_device *dev)
 	net = dev_net(ndev);
 
-	sk = rxe_ns_pernet_sk4(net);
+	/* Atomically claim (read + null) the pointer so rxe_ns_exit()
+	 * cannot race and double-release the same socket. */
+	sk = rxe_ns_pernet_take_sk4(net);
 	if (sk)
 		rxe_sock_put(sk, rxe_ns_pernet_set_sk4, net);
 
-	sk = rxe_ns_pernet_sk6(net);
+	sk = rxe_ns_pernet_take_sk6(net);
 	if (sk)
 		rxe_sock_put(sk, rxe_ns_pernet_set_sk6, net);
```

With this change `rxe_sock_put` no longer needs to call `set_sk(net, NULL)` (the pointer was already cleared by the claim helper), but the call can be left in place as a harmless no-op (setting NULL to NULL). One caveat of the sketch as written: when `rxe_sock_put()` takes the `__sock_put()` branch because the socket is still shared, the claim has already cleared the namespace pointer, so the remaining users would no longer find the socket through the namespace. A complete patch would need to restore the pointer in that branch or make the keep-or-release decision under the same lock.

**Why this is correct:** `rxe_ns_claim_sk` uses the mutex to make the read-and-null operation atomic. Exactly one of the two racing paths, `rxe_ns_exit` or `rxe_net_del`, gets a non-NULL pointer back and proceeds to release the socket. The other gets NULL and does nothing. In addition, `udp_tunnel_sock_release()` calls `synchronize_rcu()` right after `rcu_assign_sk_user_data(sk, NULL)`, so RCU readers that were already running finish before the socket is torn down.

---

## Bug introduction

The bug was introduced by the following two commits, part of a series merged in March 2026:

**Primary:** [`f1327abd6abe`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f1327abd6abed031ae4146825c6b28bdd1456474), *"RDMA/rxe: Support RDMA link creation and destruction per net namespace"*
Author: Zhu Yanjun, Date: 2026-03-12.
This commit introduced `rxe_sock_put()` and `rxe_net_del()` with the racy socket release pattern: the namespace socket pointer is not cleared atomically before the socket is released, and there is no coordination with the `rxe_ns_exit()` path.

**Supporting:** [`13f2a53c2a71`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=13f2a53c2a71e12478279261b35e2abf393523b3), *"RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets"*
Author: Zhu Yanjun, Date: 2026-03-12.
This commit introduced `rxe_ns.c` with `rxe_ns_exit()`, the second release path. It reads the socket pointer under an RCU read lock but holds no lock that would prevent `rxe_net_del` from reading the same pointer concurrently.

**No upstream fix has been identified** for this specific race within the current search budget. The one commit that appeared in the search (`RDMA/rxe: Fix double free in rxe_srq_from_init`) addresses a different double-free in the SRQ allocation path and does not cover the UDP socket race.

---

## Analysis, conclusions and recommendations

**Summary:** The crash is a race-condition double release of a sockfs inode. The UDP tunnel socket shared per-network-namespace by RXE devices can be released twice: once by `rxe_ns_exit()` (namespace teardown) and once by `rxe_sock_put()` (explicit RXE device deletion via `nldev_dellink`). Both code paths independently read the same namespace socket pointer and call `sock_release()` without any mutual exclusion. The second `sock_release()` → `iput()` call triggers the `VFS_BUG_ON_INODE` assertion because the sockfs inode is already in `I_FREEING | I_CLEAR` state.

**Root cause commits:** [`f1327abd6abe`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f1327abd6abed031ae4146825c6b28bdd1456474) and [`13f2a53c2a71`](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=13f2a53c2a71e12478279261b35e2abf393523b3), both authored by Zhu Yanjun, merged in March 2026.

**Confidence:** High. The crash site, register values (RBP=0x300), inode state, and the complete call chain are all consistent with a double `sock_release()` on the RXE UDP tunnel socket. The two commits above are the only changes to this code path that could cause this race.

**Recommendations:**

1. Add a mutex to `struct rxe_ns_sock` and use it to serialize the read-and-clear of the socket pointer in both `rxe_ns_exit()` and the `rxe_net_del()` path (see the proposed fix in the **Where** section above).
2. Report and submit the fix upstream to the RDMA mailing list (linux-rdma@vger.kernel.org), CC: Zhu Yanjun, Leon Romanovsky, David Ahern.
3. Consider whether the `SK_REF_FOR_TUNNEL = 2` reference-count boundary in `rxe_sock_put` is correct: with two devices sharing a socket (refcount=2), `2 > 2` is FALSE and the socket is fully released instead of just dropping one reference. This appears to be a separate off-by-one, but it is not the direct cause of this crash.
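
Recommendation 1 relies on a claim-once pattern: whoever reads the pointer also clears it inside the same critical section, so only one caller ever sees a non-NULL value. The following is a minimal userspace sketch of that idea, assuming a POSIX mutex in place of the kernel `struct mutex`. Every name in it (`fake_sock`, `fake_ns`, `claim_sk`, `release_if_claimed`, `run_both_paths`) is hypothetical and exists only for illustration; none of it is kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins; all names are hypothetical, not kernel APIs. */
struct fake_sock {
	int release_count;	/* how many times the socket was released */
};

struct fake_ns {
	pthread_mutex_t lock;	/* serializes the read-and-clear */
	struct fake_sock *sk;	/* models the per-namespace socket pointer */
};

/* Read the pointer and clear it inside one critical section. */
static struct fake_sock *claim_sk(struct fake_ns *ns)
{
	struct fake_sock *sk;

	pthread_mutex_lock(&ns->lock);
	sk = ns->sk;
	ns->sk = NULL;
	pthread_mutex_unlock(&ns->lock);
	return sk;
}

/* Shared tail of both release paths: release only if the claim succeeded. */
static int release_if_claimed(struct fake_ns *ns)
{
	struct fake_sock *sk = claim_sk(ns);

	if (!sk)
		return 0;
	sk->release_count++;	/* stands in for the real socket release */
	return 1;
}

/* Run the teardown path and the delete path; return the release count. */
static int run_both_paths(void)
{
	struct fake_sock sock = { .release_count = 0 };
	struct fake_ns ns = { .sk = &sock };
	int winners;

	pthread_mutex_init(&ns.lock, NULL);
	winners = release_if_claimed(&ns);	/* models namespace teardown */
	winners += release_if_claimed(&ns);	/* models explicit deletion */
	pthread_mutex_destroy(&ns.lock);
	assert(winners == sock.release_count);
	return sock.release_count;
}
```

The two paths run back to back here to keep the result deterministic; the mutex is what would preserve the same single-release invariant if they ran on different CPUs. The release itself happens outside the lock, as in the proposed fix, because the real release can sleep.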