| Field | Value | Implication |
|---|---|---|
| UNAME | 7.0.0-08391-g1d51b370a0f8 | git-describe format; exact hash 1d51b370a0f8 checked out for source |
| DISTRO | (none; custom build) | |
| SOURCEDIR | /sdb1/arjan/git/oops-skill/oops-workdir/linux @ 1d51b370a0f8 | |
| HARDWARE | QEMU Standard PC (i440FX + PIIX, 1996) | Virtual machine |
| PROCESS | repro1 (PID 334) | User-space reproducer |
| TAINT | G W | G = only GPL modules (no implication); W = prior WARNING: an earlier WARNING occurred; may be related |
| MSGID | <CAPHJ_VJeBAL_fk+P79guYTABZgW1hkcAz8t=c_nVK1mbn3_FYw@mail.gmail.com> | |
| MSGID_URL | CAPHJ_VJeBAL_fk+P79guYTABZgW1hkcAz8t=c_nVK1mbn3_FYw@mail.gmail.com | |
| CRASH_TYPE | BUG (BUG_ON) | |
| INTRODUCED-BY | 67cf5b09a46f (ext4: add the basic function for inline data support) | BUG_ON present since inline data feature introduction (Dec 2012) |

- Type: BUG / BUG_ON, an intentional kernel assertion failure
- Variant: BUG_ON(pos + len > EXT4_I(inode)->i_inline_size) at fs/ext4/inline.c:240
- Oops header: Oops: invalid opcode: 0000 [#1] SMP KASAN. The ud2 instruction from BUG_ON triggered an “invalid opcode” hardware exception.

| Module | Flags | Backtrace | Location | Flag Implication |
|---|---|---|---|---|
| (module list not available in this report) |
| Address | Function | Offset | Size | Context | Module | Source Location |
|---|---|---|---|---|---|---|
| n/a | ext4_write_inline_data | 0x3d0 | 0x490 | Task | (built-in) | fs/ext4/inline.c:240 |
| n/a | ext4_write_inline_data_end | 0x293 | 0xc90 | Task | (built-in) | fs/ext4/inline.c:825 |
| n/a | ext4_da_write_end | 0x521 | 0xec0 | Task | (built-in) | fs/ext4/inode.c:3291 |
| n/a | ext4_buffered_write_iter | 0x11a | 0x430 | Task | (built-in) | |
| n/a | ext4_file_write_iter | 0x561 | 0x1840 | Task | (built-in) | |
| n/a | iter_file_splice_write | 0xa33 | 0x11c0 | Task | (built-in) | |
| n/a | direct_splice_actor | 0x18f | 0x7a0 | Task | (built-in) | |
| n/a | do_splice_direct | 0x41 | 0x50 | Task | (built-in) | |
| n/a | do_sendfile | 0xa86 | 0xda0 | Task | (built-in) | |
| n/a | __x64_sys_sendfile64 | 0x1cf | 0x210 | Task | (built-in) | |

Note: The console output pastebin (C0XjNMXp) was inaccessible due to network proxy restrictions; the backtrace above is taken from the lore email, which may be truncated.

kernel BUG at fs/ext4/inline.c:240

Line 240 in ext4_write_inline_data:

    BUG_ON(pos + len > EXT4_I(inode)->i_inline_size);

len here is the copied argument passed from ext4_write_inline_data_end(), that is, the number of bytes actually written to the folio. The assertion fires because pos + copied > EXT4_I(inode)->i_inline_size.

fs/ext4/inline.c @ 1d51b370a0f8

    228 static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc,
    229                                    void *buffer, loff_t pos, unsigned int len)
    230 {
    231         struct ext4_xattr_entry *entry;
    232         struct ext4_xattr_ibody_header *header;
    233         struct ext4_inode *raw_inode;
    234         int cp_len = 0;
    235
    236         if (unlikely(ext4_emergency_state(inode->i_sb)))
    237                 return;
    238
    239         BUG_ON(!EXT4_I(inode)->i_inline_off);
    240         BUG_ON(pos + len > EXT4_I(inode)->i_inline_size); // ← CRASH HERE
    241
    242         raw_inode = ext4_raw_inode(iloc);
    243         buffer += pos;
    244
    245         if (pos < EXT4_MIN_INLINE_DATA_SIZE) {
    246                 cp_len = pos + len > EXT4_MIN_INLINE_DATA_SIZE ?
    247                         EXT4_MIN_INLINE_DATA_SIZE - pos : len;
    248                 memcpy((void *)raw_inode->i_block + pos, buffer, cp_len);
    249                 len -= cp_len;
    250                 buffer += cp_len;
    251                 pos += cp_len;
    252         }
    253
    254         if (!len)
    255                 return;
    256
    257         pos -= EXT4_MIN_INLINE_DATA_SIZE;
    258         header = IHDR(inode, raw_inode);
    259         entry = (struct ext4_xattr_entry *)((void *)raw_inode +
    260                                             EXT4_I(inode)->i_inline_off);
    261
    262         memcpy((void *)IFIRST(header) + le16_to_cpu(entry->e_value_offs) + pos,
    263                buffer, len);
    264 }

    794 int ext4_write_inline_data_end(struct inode *inode, loff_t pos, unsigned len,
    795                                unsigned copied, struct folio *folio)
    796 {
    797         handle_t *handle = ext4_journal_current_handle();
    798         int no_expand;
    799         void *kaddr;
    800         struct ext4_iloc iloc;
    801         int ret = 0, ret2;
    802
    803         if (unlikely(copied < len) && !folio_test_uptodate(folio))
    804                 copied = 0;
    805
    806         if (likely(copied)) {
    807                 ret = ext4_get_inode_loc(inode, &iloc);
    808                 if (ret) { ... goto out; }
    809                 ext4_write_lock_xattr(inode, &no_expand);
    810                 BUG_ON(!ext4_has_inline_data(inode));
    811
    812                 /*
    813                  * ei->i_inline_off may have changed since
    814                  * ext4_write_begin() called ext4_try_to_write_inline_data()
    815                  */
    816                 (void) ext4_find_inline_data_nolock(inode); // refreshes i_inline_size from disk
    817
    818                 kaddr = kmap_local_folio(folio, 0);
    819                 ext4_write_inline_data(inode, &iloc, kaddr, pos, copied); // ← CALLS INTO BUG
    820                 kunmap_local(kaddr);

At line 240 in ext4_write_inline_data(), the assertion pos + len > i_inline_size fired, where len is copied (bytes written to the folio) and i_inline_size is the total allocated inline storage for this inode (EXT4_MIN_INLINE_DATA_SIZE + e_value_size from the on-disk system.data xattr).

The i_inline_size value had just been refreshed from disk by ext4_find_inline_data_nolock() in the write-end path (cited in this report as line 822; the abridged excerpt above numbers the same call 816). The crash therefore means that pos + copied exceeded the inline capacity at the time of the write commit.

Key structs:

- EXT4_I(inode)->i_inline_size: total inline capacity in bytes (sizeof(i_block) + e_value_size = EXT4_MIN_INLINE_DATA_SIZE (60) + whatever e_value_size holds for the system.data xattr)
- EXT4_I(inode)->i_inline_off: byte offset of the system.data xattr entry within the raw inode (non-zero means the xattr exists)

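
The capacity arithmetic above can be made concrete with a small user-space model. This is only a sketch: struct inline_state and both helper functions are hypothetical names introduced for illustration, not kernel code, and the constant 60 is the value this report gives for EXT4_MIN_INLINE_DATA_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of i_block in the raw inode; this report gives it as 60 bytes. */
#define EXT4_MIN_INLINE_DATA_SIZE 60u

/* Hypothetical stand-in for the inline-data state discussed above. */
struct inline_state {
	uint16_t i_inline_off;	/* 0 means no system.data xattr entry */
	uint32_t e_value_size;	/* value size of the system.data xattr */
};

/* Total inline capacity: i_block plus the xattr value area. */
static uint32_t inline_capacity(const struct inline_state *st)
{
	return EXT4_MIN_INLINE_DATA_SIZE + st->e_value_size;
}

/* True when the condition tested by the BUG_ON at line 240 holds. */
static bool write_exceeds_inline(const struct inline_state *st,
				 uint64_t pos, uint32_t len)
{
	return pos + len > inline_capacity(st);
}
```

With e_value_size = 4 the capacity is 64 bytes, so a 64-byte write at pos 0 fits, while a 5-byte write at pos 60 satisfies the BUG_ON condition.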
The call stack is sendfile64 →
ext4_da_write_end → ext4_write_inline_data_end
→ ext4_write_inline_data. This is the
delayed-allocation (DA) inline-data write path.
Two prior fixes for the same class of bug exist in the tree:

- 892e1cf17555 (“ext4: refresh inline data size before write operations”): fixed a race where concurrent xattr operations could shrink i_inline_size between ext4_get_max_inline_size() and ext4_write_lock_xattr() inside ext4_prepare_inline_data().
- ed9356a30e59 (“ext4: convert inline data to extents when truncate exceeds inline size”): fixed the case where ext4_setattr() grew a file’s size beyond inline capacity via truncate() without first converting the inode from inline-data to extent-based storage.

Both fixes are confirmed present in the UNAME tree at commit 1d51b370a0f8.

This crash represents a remaining instance of the same root class. The most likely trigger for a syzkaller-crafted image is one of two scenarios:

Scenario A (crafted image, no truncate): The ext4 image is directly crafted so that an inode has EXT4_INODE_INLINE_DATA set with an i_size that already exceeds the inline storage capacity, without any truncate() call needed. The fix ed9356a30e59 only guards the ext4_setattr() path; an inode loaded from disk whose i_size > i_inline_size from the start bypasses that guard. When sendfile writes at a position that the current on-disk e_value_size cannot accommodate, ext4_find_inline_data_nolock() in write_end refreshes i_inline_size to the (too-small) disk value, and the assertion fires.

Scenario B (remaining race window): Even with fix 892e1cf17555, there is a window between the end of ext4_prepare_inline_data() (which releases the xattr write lock after ensuring i_inline_size ≥ pos + len) and the re-acquisition of the lock in ext4_write_inline_data_end(). In a multi-threaded reproducer, a concurrent xattr operation could shrink i_inline_size in that window.

In either scenario, the BUG_ON is fundamentally the wrong response to filesystem corruption or a race: a crafted/corrupted filesystem should never crash the kernel.

Q1: Why does pos + copied > i_inline_size at write_end?

Because ext4_find_inline_data_nolock(inode) in ext4_write_inline_data_end reads a SMALLER i_inline_size from disk than what ext4_prepare_inline_data had established during write_begin. This can happen if:

- (Scenario A) The on-disk e_value_size was never updated because ext4_update_inline_data returned early (line 350), believing the current size was sufficient, based on an in-memory i_inline_size that did NOT reflect the true disk state of a crafted image.
- (Scenario B) A concurrent thread modified the xattr between write_begin and write_end, reducing i_inline_size.

Q2: Why doesn’t ext4_prepare_inline_data prevent this?

Fix 892e1cf17555 added an ext4_find_inline_data_nolock call under ext4_write_lock_xattr in ext4_prepare_inline_data to refresh the cached i_inline_size. This prevents stale reads within write_begin. The refresh in write_end (at line 822) also runs under the xattr write lock. However, between write_begin’s lock release and write_end’s lock acquisition, the on-disk state can change (Scenario B), or write_begin may not have fully reconciled the disk state (Scenario A with a deeply corrupted image).

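
The unlocked window described for Scenario B can be illustrated with a deterministic, single-threaded model. This is a sketch under stated assumptions: struct race_inode and the model_* functions are hypothetical, the real write_begin path may also grow the inline area (omitted here), and the "concurrent" shrink is simulated by an ordinary call placed between the two checks.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_INLINE 60u	/* EXT4_MIN_INLINE_DATA_SIZE per this report */

/* Hypothetical model: on-disk xattr size plus the cached capacity. */
struct race_inode {
	uint32_t disk_value_size;	/* e_value_size as stored on disk */
	uint32_t i_inline_size;		/* cached inline capacity */
};

/* Models the refresh done by ext4_find_inline_data_nolock(). */
static void refresh_inline_size(struct race_inode *in)
{
	in->i_inline_size = MIN_INLINE + in->disk_value_size;
}

/* write_begin side: refresh under the lock, then check that the range fits. */
static bool model_write_begin(struct race_inode *in, uint64_t pos, uint32_t len)
{
	refresh_inline_size(in);
	return pos + len <= in->i_inline_size;
}

/* Another xattr operation shrinks the value while no lock is held. */
static void model_concurrent_shrink(struct race_inode *in, uint32_t new_size)
{
	in->disk_value_size = new_size;
}

/* write_end side: refresh again; true if the BUG_ON condition would hold. */
static bool model_write_end_trips(struct race_inode *in, uint64_t pos,
				  uint32_t copied)
{
	refresh_inline_size(in);
	return pos + copied > in->i_inline_size;
}
```

Starting from disk_value_size = 40, a 20-byte write at pos 80 passes model_write_begin(); after model_concurrent_shrink(&in, 8) the same range fails the write_end check, which is the state in which the BUG_ON fires.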
Option A (defensive BUG_ON replacement): The BUG_ONs in ext4_write_inline_data() should be replaced with proper error handling using ext4_error_inode(), following the pattern established by commit 099b847ccc6c (“ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr”) and commit 356227096eb6 (“ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio”).

The function must be changed from static void to static int, the two BUG_ONs replaced with ext4_error_inode() + return -EFSCORRUPTED, and the early-return guard converted to return 0. Both callers (ext4_write_inline_data_end and ext4_restore_inline_data) must be updated to handle the error return.
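
The shape of Option A can be sketched in user space as follows. This is not the proposed patch (that is in report.patch); it is a model under assumptions: struct model_inode, the 256-byte value area, and MODEL_EFSCORRUPTED (set to 117, the usual Linux value of EUCLEAN, to which the kernel maps EFSCORRUPTED) are all introduced here for illustration. The point it shows is that both former assertions become error returns before any byte is copied.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_MIN_INLINE   60u	/* EXT4_MIN_INLINE_DATA_SIZE per this report */
#define MODEL_EFSCORRUPTED 117	/* assumption: mirrors EUCLEAN on Linux */

/* Hypothetical model of an inode with inline data. */
struct model_inode {
	uint16_t i_inline_off;			/* 0: no system.data xattr */
	uint32_t i_inline_size;			/* total inline capacity */
	uint8_t  i_block[MODEL_MIN_INLINE];	/* first 60 bytes of data */
	uint8_t  xattr_value[256];		/* value area; size is arbitrary */
};

/* Returns 0 on success or a negative error instead of asserting. */
static int model_write_inline_data(struct model_inode *in,
				   const uint8_t *buffer,
				   uint64_t pos, uint32_t len)
{
	uint32_t cp_len;

	/* Was BUG_ON(!i_inline_off); the kernel fix would log via ext4_error_inode(). */
	if (!in->i_inline_off)
		return -MODEL_EFSCORRUPTED;
	/* Was BUG_ON(pos + len > i_inline_size). */
	if (pos + len > in->i_inline_size)
		return -MODEL_EFSCORRUPTED;
	/* Model-only guard so the demo arrays cannot be overrun. */
	if (in->i_inline_size > MODEL_MIN_INLINE + sizeof(in->xattr_value))
		return -MODEL_EFSCORRUPTED;

	buffer += pos;
	if (pos < MODEL_MIN_INLINE) {
		/* The part that lands in i_block. */
		cp_len = pos + len > MODEL_MIN_INLINE ?
			(uint32_t)(MODEL_MIN_INLINE - pos) : len;
		memcpy(in->i_block + pos, buffer, cp_len);
		len -= cp_len;
		buffer += cp_len;
		pos += cp_len;
	}
	if (!len)
		return 0;

	/* The remainder lands in the xattr value area. */
	pos -= MODEL_MIN_INLINE;
	memcpy(in->xattr_value + pos, buffer, len);
	return 0;
}
```

A caller modelled on ext4_write_inline_data_end() would check the return value and unwind (unmap, drop the xattr lock, release the inode location) rather than continue with the commit.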

Option B (early size check in write_begin): In ext4_da_write_begin(), before taking the inline path, check that inode->i_size <= ext4_get_max_inline_size(inode) and convert to extents if not. This catches the Scenario A case (crafted image with i_size > inline_capacity without a prior truncate call). This is complementary to Option A, not a replacement.
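
The Option B decision reduces to a single comparison, sketched below. The enum and function names are hypothetical; in the kernel the second argument would come from ext4_get_max_inline_size(inode) and the non-inline branch would perform the conversion to extents, neither of which is modelled here.

```c
#include <stdint.h>

/* Hypothetical outcome of the pre-write check. */
enum write_path {
	PATH_INLINE,			/* safe to stay on the inline path */
	PATH_CONVERT_TO_EXTENTS		/* i_size no longer fits inline */
};

/* Stay inline only while the current file size fits the inline capacity. */
static enum write_path choose_write_path(uint64_t i_size,
					 uint32_t max_inline_size)
{
	return i_size <= max_inline_size ? PATH_INLINE
					 : PATH_CONVERT_TO_EXTENTS;
}
```

For example, an inode loaded from a crafted image with i_size = 4096 and a maximum inline size of 156 would be routed to conversion before any inline write is attempted.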
This report implements Option A (the defensive BUG_ON replacement), as it is the most robust defence against any current or future corrupted filesystem trigger.
See report.patch for the proposed fix.
The two BUG_ONs at lines 239–240 were introduced in
commit 67cf5b09a46f
by Tao Ma on 2012-12-10 (“ext4: add the basic function for inline data
support”). They have been present since the inline data feature was
first added to ext4.
The bug has been present since the introduction of the ext4 inline
data feature in commit 67cf5b09a46f
(December 2012). The BUG_ONs were appropriate developer assertions at
feature introduction time, but should be converted to graceful error
handling now that maliciously crafted images are a known threat
vector.
No upstream commit was found that directly replaces these specific
BUG_ONs in ext4_write_inline_data. The fix proposed here is
new.
The closely related commit 356227096eb6
(“ext4: replace BUG_ON with proper error handling in
ext4_read_inline_folio”, present in the upstream tree but not yet in
this kernel version) shows the precedent and pattern for this class of
fix.
<shicenci@gmail.com>