| Field | Value | Implication |
|---|---|---|
| UNAME | 6.18.13-200.fc43.x86_64 | |
| DISTRO | Fedora | |
| DISTRO_VERSION | fc43 | |
| SOURCEDIR | oops-workdir/linux (tag: kernel-6.18.13-0) | |
| VMLINUX | oops-workdir/fedora/files/usr/lib/debug/lib/modules/6.18.13-200.fc43.x86_64/vmlinux | |
| BASEDIR | oops-workdir/fedora/files/ | |
| PROCESS | krxrpcio/7001 | Kernel RxRPC I/O thread for the client local endpoint at UDP port 7001 |
| HARDWARE | VMware, Inc. VMware Virtual Platform | |
| CRASH_TYPE | BUG (invalid opcode / ud2) | Intentional kernel assertion failure |
| MSGID | <CAPhRvkyZGKHRTBhV3P2PCCRxmRKGEvJQ0W5a9SMW3qwS2hp2Qw@mail.gmail.com> | |
| MSGID_URL | CAPhRvkyZGKHRTBhV3P2PCCRxmRKGEvJQ0W5a9SMW3qwS2hp2Qw@mail.gmail.com | |
| CONFIG_REQUIRED | (unconditional — fires in all builds) | The BUG() in rxrpc_destroy_client_conn_ids() is unconditional |
| INTRODUCED-BY | 9d35d880e0e4 ("rxrpc: Move client call connection to the I/O thread") | |

BUG (invalid opcode) — kernel BUG at net/rxrpc/conn_client.c:64! The "Oops: invalid opcode: 0000" line confirms the crash was triggered by a ud2 instruction, the x86 encoding of BUG(). This is an intentional assertion failure: the kernel detected an invariant violation (a client connection was left registered in the connection-ID IDR while the local endpoint was being destroyed) and crashed deliberately.
| Module | Flags | Backtrace | Location | Flag Implication |
|---|---|---|---|---|
| rxrpc | | Y | oops-workdir/fedora/files/usr/lib/debug/lib/modules/6.18.13-200.fc43.x86_64/kernel/net/rxrpc/rxrpc.ko.debug | |
| fcrypt | | | | |
| pcbc | | | | |
| ip6_udp_tunnel | | | | |
| krb5 | | | | |
| udp_tunnel | | | | |
| rfkill | | | | |
| vmxnet3 | | | | |
| (remaining modules omitted for brevity) | | | | |
| Register | Value |
|---|---|
| RAX | 0x0000000000000000 |
| RBX | 0xffff88810a6b4800 |
| RCX | 0x0000000000000000 |
| RDX | 0x0000000000000000 |
| RSI | 0x0000000000000000 |
| RDI | 0xffff88810a6b4920 |
| RBP | 0xffff888123398000 |
| R8 | 0xffffc900159cfdb8 |
| R9 | 0xffff88810a6b4928 |
| R10 | 0x0000000000000018 |
| R11 | 0x0000000040000000 |
| R12 | 0xffff88810a9cda00 |
| R13 | 0xffff88810a6b4800 |
| R14 | 0xffffc900159cfe70 |
| R15 | 0xffff88812d0c2800 |
| RSP | 0xffffc900159cfdd8 |
| RIP | 0x160d8 (rxrpc_purge_client_connections+0x58) |
| CR2 | 0x00007faf20630030 |
| CR3 | 0x000000000382e002 |
| CR4 | 0x00000000003706f0 |
| EFLAGS | 0x00010246 |
| CS | 0x0010 |
Code bytes at the faulting RIP:

```
28 01 00 00 00 74 25 31 c0 48 8d 74 24 0c 48 89
cf 89 44 24 0c 48 89 0c 24 e8 d4 ec c2 c1 48 89
c6 48 85 c0 0f 85 49 dd 01 00 <0f> 0b 31 f6 48
89 cf 48 89 0c 24 e8 c8 aa c4 c1 48 8b 0c 24 85
c0
```

The <0f> 0b bytes at the RIP are the ud2 instruction — confirming this is a deliberate BUG() assertion.
| Address | Function | Offset | Size | Context | Module | Source Location |
|---|---|---|---|---|---|---|
| | rxrpc_destroy_client_conn_ids (inlined) | | | Task | rxrpc | net/rxrpc/conn_client.c:64 |
| 0x160d8 (0x16080 + 0x58) | rxrpc_purge_client_connections | 0x58 | 0xa0 | Task | rxrpc | net/rxrpc/conn_client.c:145 |
| 0x21ab9 (0x219f0 + 0xc9) | rxrpc_destroy_local | 0xc9 | 0xe0 | Task | rxrpc | net/rxrpc/local_object.c:451 |
| 0x1f3cd (0x1ed70 + 0x65d) | rxrpc_io_thread | 0x65d | 0x750 | Task | rxrpc | net/rxrpc/io_thread.c:598 |
| 0xffffffff813f24ec (0xffffffff813f23f0 + 0xfc) | kthread | 0xfc | 0x240 | Task | vmlinux | kernel/kthread.c:463 |
| 0xffffffff8132ab54 (0xffffffff8132aa60 + 0xf4) | ret_from_fork | 0xf4 | 0x110 | Task | vmlinux | arch/x86/kernel/process.c:158 |
| 0xffffffff812d8dca (0xffffffff812d8db0 + 0x1a) | ret_from_fork_asm | 0x1a | 0x30 | Task | vmlinux | arch/x86/entry/entry_64.S:245 |

Note: ?-marked entries (__pfx_rxrpc_io_thread, __pfx_kthread) excluded per backtrace rules (more than 2 high-confidence entries present).
rxrpc_destroy_client_conn_ids (inlined into rxrpc_purge_client_connections), net/rxrpc/conn_client.c @ kernel-6.18.13-0:

```c
54 static void rxrpc_destroy_client_conn_ids(struct rxrpc_local *local)
55 {
56 	struct rxrpc_connection *conn;
57 	int id;
58
59 	if (!idr_is_empty(&local->conn_ids)) {
60 		idr_for_each_entry(&local->conn_ids, conn, id) {
61 			pr_err("AF_RXRPC: Leaked client conn %p {%d}\n",
62 			       conn, refcount_read(&conn->ref));
63 		}
64 		BUG(); // <- crash here: conn_ids IDR is not empty at endpoint destruction
65 	}
66
67 	idr_destroy(&local->conn_ids);
68 }
```

rxrpc_purge_client_connections, net/rxrpc/conn_client.c @ kernel-6.18.13-0:

```c
143 void rxrpc_purge_client_connections(struct rxrpc_local *local)
144 {
145 	rxrpc_destroy_client_conn_ids(local); // <- call here
146 }
```

rxrpc_destroy_local, net/rxrpc/local_object.c @ kernel-6.18.13-0:

```c
420 void rxrpc_destroy_local(struct rxrpc_local *local)
421 {
422 	struct socket *socket = local->socket;
423 	struct rxrpc_net *rxnet = local->rxnet;
...
427 	local->dead = true;
...
433 	rxrpc_clean_up_local_conns(local); // only cleans the idle_client_conns list
434 	rxrpc_service_connection_reaper(&rxnet->service_conn_reaper);
435 	ASSERT(!local->service);
...
450 	rxrpc_purge_queue(&local->rx_queue);
451 	rxrpc_purge_client_connections(local); // <- call here -> BUG fires inside
452 	page_frag_cache_drain(&local->tx_alloc);
453 }
```

rxrpc_clean_up_local_conns — the incomplete cleanup, net/rxrpc/conn_client.c @ kernel-6.18.13-0:

```c
813 void rxrpc_clean_up_local_conns(struct rxrpc_local *local)
814 {
815 	struct rxrpc_connection *conn;
816
817 	local->kill_all_client_conns = true;
818
819 	timer_delete_sync(&local->client_conn_reap_timer);
820
821 	while ((conn = list_first_entry_or_null(&local->idle_client_conns,
822 						struct rxrpc_connection, cache_link))) {
	// Only processes connections on idle_client_conns -- connections
	// in bundles (bundle->conns[]) that have not yet gone idle are missed.
823 		list_del_init(&conn->cache_link);
824 		atomic_dec(&conn->active);
825 		trace_rxrpc_client(conn, -1, rxrpc_client_discard);
826 		rxrpc_unbundle_conn(conn);
827 		rxrpc_put_connection(conn, rxrpc_conn_put_local_dead);
828 	}
829 }
```

rxrpc_io_thread (main loop excerpt), net/rxrpc/io_thread.c @ kernel-6.18.13-0:

```c
554 	if (!list_empty(&local->new_client_calls))
555 		rxrpc_connect_client_calls(local); // allocates connections, moves calls to bundles
...
569 	if (should_stop)
570 		break; // exits loop when kthread_should_stop() and queues empty
...
596 	__set_current_state(TASK_RUNNING);
597 	rxrpc_see_local(local, rxrpc_local_stop);
598 	rxrpc_destroy_local(local); // <- call here
```

In rxrpc_destroy_client_conn_ids() (inlined into
rxrpc_purge_client_connections()), the IDR
local->conn_ids is found to be non-empty. The kernel
prints:

```
rxrpc: AF_RXRPC: Leaked client conn 00000000bf02a6a7 {1}
```

and then fires BUG() at
net/rxrpc/conn_client.c:64. The leaked connection has
refcount=1, meaning it was allocated but never put. The connection is
registered in the conn_ids IDR but was not cleaned up
before rxrpc_destroy_local() was called by the I/O thread
during socket teardown.
When a client calls sendmsg() on an
AF_RXRPC socket to initiate a call, the call is placed on
local->new_client_calls. The I/O thread picks it up in
its main loop at io_thread.c:554–555 via
rxrpc_connect_client_calls(). Inside that function, a
client connection is allocated via
rxrpc_add_conn_to_bundle() →
rxrpc_alloc_client_connection(). This allocates a
rxrpc_connection object with refcount=1 and
registers it in local->conn_ids (the IDR). The
connection is stored in bundle->conns[slot] and
bundle->conn_ids[slot]. At this point the call is moved
from new_client_calls to
bundle->waiting_calls, and new_client_calls
becomes empty.
Now the race: after rxrpc_connect_client_calls()
returns, the I/O thread re-evaluates its exit condition (line 558–570).
If kthread_should_stop() is true and all work queues
(including new_client_calls) appear empty, the thread exits
the loop and calls rxrpc_destroy_local().
Inside rxrpc_destroy_local():
rxrpc_clean_up_local_conns() is called. It sets
kill_all_client_conns=true and iterates over
local->idle_client_conns to free connections that have
gone idle. The connection just allocated in the step above is NOT on
idle_client_conns — it is in the bundle’s
conns[] array, waiting to be activated for the pending
call. This connection is completely missed by
rxrpc_clean_up_local_conns().
The socket is shut down, queues are purged.
rxrpc_purge_client_connections() →
rxrpc_destroy_client_conn_ids() is called. It finds
local->conn_ids non-empty, logs the leaked connection,
and fires BUG().
The root cause is a coverage gap in
rxrpc_clean_up_local_conns(): it only iterates
local->idle_client_conns but does not iterate
connections in client bundles (local->client_bundles
RB-tree → bundle->conns[]). A connection allocated for a
pending call that hasn’t yet been activated on a channel (and thus never
went idle) falls through this gap.
This gap was introduced by commit 9d35d880e0e4
(“rxrpc: Move client call connection to the I/O thread”), which moved
connection allocation into the I/O thread as part of call setup. Prior
to that change, the connection lifecycle was managed differently, and
the idle-list cleanup was sufficient. After the change, connections can
be in a “bundle-allocated but not yet idle” state that the cleanup path
does not handle.
A related fix, fc9de52de38f
(“rxrpc: Fix missing locking causing hanging calls”), is already
included in kernel 6.18.13. That commit added a missing lock around
rxrpc_disconnect_client_call()’s removal of a call from
new_client_calls, preventing list corruption. It does not
address the idle-list coverage gap described above.
The fix must ensure that when rxrpc_destroy_local()
tears down a local endpoint, all client connections
registered in local->conn_ids are properly cleaned up —
not just those that have reached the idle state.
Two approaches:

1. Extend rxrpc_clean_up_local_conns() to also iterate over all entries in the local->client_bundles RB-tree, unbundling and putting each connection found in bundle->conns[] slots. This mirrors what the idle-list loop does via rxrpc_unbundle_conn() + rxrpc_put_connection().
2. Abort pending calls before teardown: in rxrpc_destroy_local(), before calling rxrpc_clean_up_local_conns(), abort all calls still in bundle->waiting_calls. When calls are aborted, their disconnect path will properly remove the connection from the bundle (via rxrpc_disconnect_client_call() → rxrpc_put_connection()), which will ultimately call rxrpc_kill_client_conn() → rxrpc_put_client_connection_id() to remove the connection from conn_ids.
Approach 1 is more direct and lower-risk. A sketch of the change would be to add the following after the idle-list loop in rxrpc_clean_up_local_conns() (field and helper names follow the idle-list loop above, and `conn` reuses the function's existing local; this is illustrative, not a tested patch):

```c
	/* Also clean up any connections still held in bundles (not yet
	 * idle). Repeatedly pick one bundle-held connection under the
	 * lock, then drop the lock to unbundle and put it, mirroring
	 * what the idle-list loop above does per connection.
	 */
	struct rxrpc_bundle *bundle;
	struct rb_node *node;
	size_t i;

	for (;;) {
		conn = NULL;
		spin_lock(&local->client_bundles_lock);
		for (node = rb_first(&local->client_bundles);
		     node && !conn; node = rb_next(node)) {
			bundle = rb_entry(node, struct rxrpc_bundle, local_node);
			for (i = 0; i < ARRAY_SIZE(bundle->conns); i++) {
				if (bundle->conns[i]) {
					conn = bundle->conns[i];
					break;
				}
			}
		}
		spin_unlock(&local->client_bundles_lock);
		if (!conn)
			break;
		rxrpc_unbundle_conn(conn);
		rxrpc_put_connection(conn, rxrpc_conn_put_local_dead);
	}
```

The exact implementation should be reviewed by the rxrpc maintainer (David Howells), as additional locking and refcounting considerations may apply (for example, taking a reference on the connection before dropping the lock).
The bug was introduced by commit 9d35d880e0e4
(“rxrpc: Move client call connection to the I/O thread”,
2022-10-19).
That commit moved connection allocation out of the app-thread sendmsg
path and into the I/O thread, creating a new “allocated in bundle, not
yet idle” state for connections. The existing
rxrpc_clean_up_local_conns() function only handles the
idle_client_conns list and was not updated to also cover
the new state.
| Field | Value |
|---|---|
| INTRODUCED-BY | 9d35d880e0e4 ("rxrpc: Move client call connection to the I/O thread") |
Searched git history (^kernel-6.18.13-0 origin/master -- net/rxrpc/) for commits that fix the specific coverage gap in rxrpc_clean_up_local_conns(). No commit addressing cleanup of non-idle client connections was found at the time of the search.
The fix fc9de52de38f (“rxrpc: Fix missing locking
causing hanging calls”) is already present in
kernel-6.18.13-0 and addresses a different (though related)
bug in the same code path.
No upstream fix was identified for this specific issue within the search budget.
Conclusion (high confidence): The kernel BUG at
net/rxrpc/conn_client.c:64 is triggered when an AF_RXRPC
client socket is closed while the I/O thread has already allocated a
client connection for a pending call but that connection has not yet
been activated on a channel (and therefore never appears on the
idle_client_conns list). The
rxrpc_clean_up_local_conns() function misses this
connection, leaving it registered in the conn_ids IDR,
which then trips the BUG assertion in
rxrpc_destroy_client_conn_ids().
This is an Unprivileged Application crash: a regular
user can trigger it by creating an AF_RXRPC socket and
closing it rapidly while the I/O thread is mid-connection-setup. No root
privileges are required; the rxrpc_create() path has no
capability check. The rxrpc module must be loaded, which is
the case on any system running the AFS client (kafs) or where the module
has been manually loaded.
Recommendation: The rxrpc maintainer (David Howells)
should extend rxrpc_clean_up_local_conns() to also release
connections that are stored in bundle->conns[] but have
not yet appeared on idle_client_conns, ensuring
rxrpc_destroy_client_conn_ids() always finds an empty IDR.
The reproducer provided in the bug report reliably triggers the issue
and can serve as a regression test.
The Linux Kernel CVE team is likely to assign a CVE to this issue (Unprivileged Application crash, no upstream fix identified).
See report.html