Kernel BUG Analysis: ext4_write_inline_data BUG_ON at fs/ext4/inline.c:240

Key Elements

Field | Value | Implication
UNAME | 7.0.0-08391-g1d51b370a0f8 | git-describe format; exact hash 1d51b370a0f8 checked out for source
DISTRO | (none, custom build) |
SOURCEDIR | /sdb1/arjan/git/oops-skill/oops-workdir/linux @ 1d51b370a0f8 |
HARDWARE | QEMU Standard PC (i440FX + PIIX, 1996) | Virtual machine
PROCESS | repro1 (PID 334) | User-space reproducer
TAINT | G W | G = only GPL modules loaded (no implication); W = an earlier WARNING occurred and may be related
MSGID | <CAPHJ_VJeBAL_fk+P79guYTABZgW1hkcAz8t=c_nVK1mbn3_FYw@mail.gmail.com> |
MSGID_URL | CAPHJ_VJeBAL_fk+P79guYTABZgW1hkcAz8t=c_nVK1mbn3_FYw@mail.gmail.com |
CRASH_TYPE | BUG (BUG_ON) |
INTRODUCED-BY | 67cf5b09a46f (“ext4: add the basic function for inline data support”) | BUG_ON present since inline data feature introduction (Dec 2012)

Crash Classification

Type: BUG / BUG_ON — intentional kernel assertion failure
Variant: BUG_ON(pos + len > EXT4_I(inode)->i_inline_size) at fs/ext4/inline.c:240
Oops header: Oops: invalid opcode: 0000 [#1] SMP KASAN — the ud2 instruction from BUG_ON triggered an “invalid opcode” hardware exception.


Modules List

Module Flags Backtrace Location Flag Implication
(module list not available in this report)

Backtrace

Address Function Offset Size Context Module Source Location
ext4_write_inline_data 0x3d0 0x490 Task (built-in) fs/ext4/inline.c:240
ext4_write_inline_data_end 0x293 0xc90 Task (built-in) fs/ext4/inline.c:825
ext4_da_write_end 0x521 0xec0 Task (built-in) fs/ext4/inode.c:3291
ext4_buffered_write_iter 0x11a 0x430 Task (built-in)
ext4_file_write_iter 0x561 0x1840 Task (built-in)
iter_file_splice_write 0xa33 0x11c0 Task (built-in)
direct_splice_actor 0x18f 0x7a0 Task (built-in)
do_splice_direct 0x41 0x50 Task (built-in)
do_sendfile 0xa86 0xda0 Task (built-in)
__x64_sys_sendfile64 0x1cf 0x210 Task (built-in)

Note: The console output pastebin (C0XjNMXp) was inaccessible due to network proxy restrictions; the backtrace above is taken from the lore email, which may be truncated.


BUG Condition

kernel BUG at fs/ext4/inline.c:240

Line 240 in ext4_write_inline_data:

BUG_ON(pos + len > EXT4_I(inode)->i_inline_size);

len here is the copied argument passed from ext4_write_inline_data_end() — the number of bytes actually written to the folio. The assertion fires because pos + copied > EXT4_I(inode)->i_inline_size.


Source: ext4_write_inline_data (fs/ext4/inline.c)

fs/ext4/inline.c @ 1d51b370a0f8

228  static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc,
229                                     void *buffer, loff_t pos, unsigned int len)
230  {
231      struct ext4_xattr_entry *entry;
232      struct ext4_xattr_ibody_header *header;
233      struct ext4_inode *raw_inode;
234      int cp_len = 0;
235  
236      if (unlikely(ext4_emergency_state(inode->i_sb)))
237          return;
238  
239      BUG_ON(!EXT4_I(inode)->i_inline_off);
240      BUG_ON(pos + len > EXT4_I(inode)->i_inline_size);   // ← CRASH HERE
241  
242      raw_inode = ext4_raw_inode(iloc);
243      buffer += pos;
244  
245      if (pos < EXT4_MIN_INLINE_DATA_SIZE) {
246          cp_len = pos + len > EXT4_MIN_INLINE_DATA_SIZE ?
247                   EXT4_MIN_INLINE_DATA_SIZE - pos : len;
248          memcpy((void *)raw_inode->i_block + pos, buffer, cp_len);
249          len -= cp_len;
250          buffer += cp_len;
251          pos += cp_len;
252      }
253  
254      if (!len)
255          return;
256  
257      pos -= EXT4_MIN_INLINE_DATA_SIZE;
258      header = IHDR(inode, raw_inode);
259      entry = (struct ext4_xattr_entry *)((void *)raw_inode +
260                                          EXT4_I(inode)->i_inline_off);
261  
262      memcpy((void *)IFIRST(header) + le16_to_cpu(entry->e_value_offs) + pos,
263             buffer, len);
264  }
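
The split between the two storage regions in the function above can be illustrated with a small userspace model. This is only a sketch: struct model_inode, model_write_inline() and the 256-byte xattr buffer are hypothetical stand-ins, not kernel types, and the model assumes the caller has already checked pos + len against the inline capacity.

```c
#include <stddef.h>
#include <string.h>

/* Size of the i_block area in bytes (EXT4_MIN_INLINE_DATA_SIZE in the report). */
#define MODEL_MIN_INLINE 60u

/*
 * Hypothetical stand-in for an inline-data inode: the first 60 bytes of
 * file data live in i_block, the remainder in the system.data xattr value.
 * The 256-byte xattr buffer is an arbitrary bound chosen for this model.
 */
struct model_inode {
    unsigned char i_block[MODEL_MIN_INLINE];
    unsigned char xattr_value[256];
};

/*
 * Copy len bytes at file offset pos from the page-sized buffer buf into the
 * two regions. Returns how many bytes landed in i_block.
 * Precondition (not checked here): pos + len fits the inline capacity.
 */
static size_t model_write_inline(struct model_inode *mi,
                                 const unsigned char *buf,
                                 size_t pos, size_t len)
{
    size_t cp_len = 0;

    buf += pos; /* the buffer is indexed by file offset */

    if (pos < MODEL_MIN_INLINE) {
        /* Part (or all) of the write falls inside i_block. */
        cp_len = (pos + len > MODEL_MIN_INLINE) ? MODEL_MIN_INLINE - pos : len;
        memcpy(mi->i_block + pos, buf, cp_len);
        len -= cp_len;
        buf += cp_len;
        pos += cp_len;
    }

    /* Whatever remains goes to the xattr value, rebased to offset 0. */
    if (len)
        memcpy(mi->xattr_value + (pos - MODEL_MIN_INLINE), buf, len);

    return cp_len;
}
```

With pos = 50 and len = 20, the first 10 bytes go to i_block[50..59] and the remaining 10 to the start of the xattr value. Without the bounds check that precedes the copy, a write past the capacity would run off the end of the xattr value.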

Source: ext4_write_inline_data_end (fs/ext4/inline.c)

794  int ext4_write_inline_data_end(struct inode *inode, loff_t pos, unsigned len,
795                                 unsigned copied, struct folio *folio)
796  {
797      handle_t *handle = ext4_journal_current_handle();
798      int no_expand;
799      void *kaddr;
800      struct ext4_iloc iloc;
801      int ret = 0, ret2;
802  
803      if (unlikely(copied < len) && !folio_test_uptodate(folio))
804          copied = 0;
805  
806      if (likely(copied)) {
807          ret = ext4_get_inode_loc(inode, &iloc);
808          if (ret) { ... goto out; }
809          ext4_write_lock_xattr(inode, &no_expand);
810          BUG_ON(!ext4_has_inline_data(inode));
811  
812          /*
813           * ei->i_inline_off may have changed since
814           * ext4_write_begin() called ext4_try_to_write_inline_data()
815           */
816          (void) ext4_find_inline_data_nolock(inode);  // refreshes i_inline_size from disk
817  
818          kaddr = kmap_local_folio(folio, 0);
819          ext4_write_inline_data(inode, &iloc, kaddr, pos, copied); // ← CALLS INTO BUG
820          kunmap_local(kaddr);

What — What Happened

At line 240 in ext4_write_inline_data(), BUG_ON(pos + len > i_inline_size) fired. Here len is copied (the bytes actually written to the folio) and i_inline_size is the total inline storage allocated to this inode: EXT4_MIN_INLINE_DATA_SIZE plus e_value_size from the on-disk system.data xattr.

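The i_inline_size value had just been refreshed from disk by ext4_find_inline_data_nolock() in the write-end path, at line 822 of fs/ext4/inline.c. (The condensed excerpt above numbers that call 816 and the call into ext4_write_inline_data() 819, whereas the backtrace places the latter at line 825; the excerpt elides lines, so its numbering runs behind the tree.) The crash therefore means that pos + copied exceeded the inline capacity at the moment the write was committed.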

Key fields:
- EXT4_I(inode)->i_inline_size: total inline capacity in bytes, i.e. sizeof(i_block) + e_value_size = EXT4_MIN_INLINE_DATA_SIZE (60) + the e_value_size of the system.data xattr
- EXT4_I(inode)->i_inline_off: byte offset of the system.data xattr entry within the raw inode (non-zero means the xattr exists)
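
The capacity arithmetic and the failing condition can be written out as a minimal userspace sketch. The helper names inline_capacity() and write_exceeds_inline() are hypothetical, and the unsigned types are a simplification of the kernel's loff_t and int fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* 60 bytes: the i_block area, as stated in the report. */
#define MODEL_MIN_INLINE_DATA_SIZE 60u

/* Total inline capacity for a given on-disk e_value_size. */
static uint64_t inline_capacity(uint32_t e_value_size)
{
    return (uint64_t)MODEL_MIN_INLINE_DATA_SIZE + e_value_size;
}

/* Mirrors the BUG_ON condition: true when the write does not fit. */
static bool write_exceeds_inline(uint64_t pos, uint32_t len,
                                 uint32_t e_value_size)
{
    return pos + len > inline_capacity(e_value_size);
}
```

As an illustration with made-up numbers: an e_value_size of 40 gives a capacity of 100 bytes, so a write of 8 bytes at offset 96 ends at 104 and would trip the assertion.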


How — Root Cause

The call stack is sendfile64 → ext4_da_write_end → ext4_write_inline_data_end → ext4_write_inline_data. This is the delayed-allocation (DA) inline-data write path.

Two prior fixes for the same class of bug exist in the tree:

  1. 892e1cf17555 — “ext4: refresh inline data size before write operations” — fixed a race where concurrent xattr operations could shrink i_inline_size between ext4_get_max_inline_size() and ext4_write_lock_xattr() inside ext4_prepare_inline_data().

  2. ed9356a30e59 — “ext4: convert inline data to extents when truncate exceeds inline size” — fixed the case where ext4_setattr() grew a file’s size beyond inline capacity via truncate() without converting the inode from inline-data to extent-based storage first.

Both fixes are confirmed present in the UNAME tree at commit 1d51b370a0f8.

This crash represents a remaining instance of the same root class. The most likely trigger for a syzkaller-crafted image is one of two scenarios:

Scenario A (crafted image, no truncate): The ext4 image is directly crafted so that an inode has EXT4_INODE_INLINE_DATA set with an i_size that already exceeds the inline storage capacity — without any truncate() call needed. The fix ed9356a30e59 only guards the ext4_setattr() path; an inode loaded from disk whose i_size > i_inline_size from the start bypasses that guard. When sendfile writes at a position that the current on-disk e_value_size cannot accommodate, ext4_find_inline_data_nolock() in write_end refreshes i_inline_size to the (too-small) disk value, and the assertion fires.

Scenario B (remaining race window): Even with fix 892e1cf17555, there is a window between the end of ext4_prepare_inline_data() (which releases the xattr write lock after ensuring i_inline_size ≥ pos + len) and the re-acquisition of the lock in ext4_write_inline_data_end(). In a multi-threaded reproducer, a concurrent xattr operation could shrink i_inline_size in that window.

In either scenario, the BUG_ON is fundamentally the wrong response to filesystem corruption or a race: a crafted/corrupted filesystem should never crash the kernel.
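
The sequence common to both scenarios (a size that passes the check at write_begin but no longer holds at write_end) can be modelled sequentially in userspace. This is a sketch under stated assumptions: struct race_inode and the race_* helpers are hypothetical, locking is not modelled, and the shrink is simulated by assigning a new on-disk size between the two calls.

```c
#include <stdbool.h>
#include <stdint.h>

#define RACE_MIN_INLINE 60u /* bytes held in i_block */

/* Hypothetical model: the on-disk xattr value size and its cached sum. */
struct race_inode {
    uint32_t disk_e_value_size; /* what the system.data xattr holds */
    uint32_t i_inline_size;     /* cached capacity */
};

/* Stand-in for ext4_find_inline_data_nolock(): re-derive the cache. */
static void race_refresh(struct race_inode *ri)
{
    ri->i_inline_size = RACE_MIN_INLINE + ri->disk_e_value_size;
}

/* write_begin-time check: does the planned write fit right now? */
static bool race_write_begin_ok(struct race_inode *ri,
                                uint64_t pos, uint32_t len)
{
    race_refresh(ri);
    return pos + len <= ri->i_inline_size;
}

/* write_end-time check: true when the BUG_ON condition would hold. */
static bool race_write_end_would_bug(struct race_inode *ri,
                                     uint64_t pos, uint32_t copied)
{
    race_refresh(ri);
    return pos + copied > ri->i_inline_size;
}
```

If disk_e_value_size is 40 at write_begin, a 20-byte write at offset 80 passes (100 <= 100). If the on-disk size is then 8 by the time write_end refreshes, the same write fails the check (100 > 68), which is the state the BUG_ON observes.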

Q1: Why does pos + copied > i_inline_size at write_end?
Because ext4_find_inline_data_nolock(inode) in ext4_write_inline_data_end reads a SMALLER i_inline_size from disk than what ext4_prepare_inline_data had established during write_begin. This can happen if:
- (Scenario A) The on-disk e_value_size was never updated because ext4_update_inline_data returned early (line 350) believing the current size was sufficient, based on an in-memory i_inline_size that did NOT reflect the true disk state of a crafted image.
- (Scenario B) A concurrent thread modified the xattr between write_begin and write_end, reducing i_inline_size.

Q2: Why doesn’t ext4_prepare_inline_data prevent this?
Fix 892e1cf17555 added an ext4_find_inline_data_nolock() call under the xattr write lock in ext4_prepare_inline_data() to refresh the cached i_inline_size. This prevents stale reads within write_begin. The refresh in write_end (at line 822) also runs under the xattr write lock, but the lock is dropped between write_begin and write_end. In that gap the on-disk state can change (Scenario B), or write_begin may not have fully reconciled the disk state in the first place (Scenario A with a deeply corrupted image).


Where — Fix

Option A: Defensive BUG_ON replacement (preferred)

The BUG_ONs in ext4_write_inline_data() should be replaced with proper error handling using ext4_error_inode(), following the exact pattern established by commit 099b847ccc6c (“ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr”) and commit 356227096eb6 (“ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio”).

The function must be changed from static void to static int, the two BUG_ONs replaced with ext4_error_inode() + return -EFSCORRUPTED, and the early-return guard converted to return 0. Both callers (ext4_write_inline_data_end and ext4_restore_inline_data) must be updated to handle the error return.
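
The shape of that change can be sketched in userspace C. This is not the content of report.patch: struct fix_inode, fix_write_inline() and the 256-byte xattr bound are hypothetical, and the comments mark where the kernel version would call ext4_error_inode(). The only kernel fact relied on is that ext4 defines EFSCORRUPTED as EUCLEAN.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define FIX_MIN_INLINE 60u  /* bytes held in i_block */
#define FIX_XATTR_MAX  256u /* arbitrary bound, for this model only */

#ifndef EUCLEAN
#define EUCLEAN 117         /* Linux value; fallback for other platforms */
#endif
#define FIX_EFSCORRUPTED EUCLEAN

/* Hypothetical stand-in for the in-memory and raw inode state. */
struct fix_inode {
    size_t i_inline_off;  /* 0 means no system.data xattr */
    size_t i_inline_size; /* total inline capacity in bytes */
    unsigned char i_block[FIX_MIN_INLINE];
    unsigned char xattr_value[FIX_XATTR_MAX];
};

/* Returns 0 on success or a negative errno instead of crashing. */
static int fix_write_inline(struct fix_inode *fi, const unsigned char *buf,
                            size_t pos, size_t len)
{
    size_t cp_len;

    /* Was: BUG_ON(!i_inline_off). Kernel code would report via ext4_error_inode(). */
    if (!fi->i_inline_off)
        return -FIX_EFSCORRUPTED;
    /* Model-only sanity bound so the xattr buffer cannot be overrun. */
    if (fi->i_inline_size > FIX_MIN_INLINE + FIX_XATTR_MAX)
        return -FIX_EFSCORRUPTED;
    /* Was: BUG_ON(pos + len > i_inline_size), written to avoid overflow. */
    if (len > fi->i_inline_size || pos > fi->i_inline_size - len)
        return -FIX_EFSCORRUPTED;

    buf += pos;
    if (pos < FIX_MIN_INLINE) {
        cp_len = (pos + len > FIX_MIN_INLINE) ? FIX_MIN_INLINE - pos : len;
        memcpy(fi->i_block + pos, buf, cp_len);
        len -= cp_len;
        buf += cp_len;
        pos += cp_len;
    }
    if (len)
        memcpy(fi->xattr_value + (pos - FIX_MIN_INLINE), buf, len);
    return 0;
}
```

Callers then propagate the negative return value rather than assuming success, which is the part of the change that touches ext4_write_inline_data_end and ext4_restore_inline_data.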

Option B: Add i_size consistency check at write_begin (complementary)

In ext4_da_write_begin(), before taking the inline path, check that inode->i_size <= ext4_get_max_inline_size(inode) and convert to extents if not. This catches the Scenario A case (crafted image with i_size > inline_capacity without a prior truncate call). This is complementary to Option A, not a replacement.
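
The decision Option B describes reduces to a single comparison, sketched here in userspace C. The enum and choose_write_path() are hypothetical names; in the kernel the second argument would come from ext4_get_max_inline_size(inode).

```c
#include <stdint.h>

/* Hypothetical outcome of the write_begin consistency check. */
enum write_path {
    PATH_INLINE,             /* i_size fits: keep using inline data */
    PATH_CONVERT_TO_EXTENTS  /* i_size too large: convert first */
};

/* Pick the write path from the file size and the inline capacity. */
static enum write_path choose_write_path(uint64_t i_size,
                                         uint64_t max_inline_size)
{
    return i_size <= max_inline_size ? PATH_INLINE
                                     : PATH_CONVERT_TO_EXTENTS;
}
```

An inode loaded from a crafted image with i_size already above the inline capacity would take the conversion path on its first write instead of reaching the assertion.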

This report implements Option A (the defensive BUG_ON replacement), as it is the most robust defence against any current or future corrupted filesystem trigger.

See report.patch for the proposed fix.


Bug Introduction

The two BUG_ONs at lines 239–240 were introduced in commit 67cf5b09a46f by Tao Ma on 2012-12-10 (“ext4: add the basic function for inline data support”), so the bug has been present since the inline data feature was first added to ext4. The BUG_ONs were appropriate developer assertions at the time, but should be converted to graceful error handling now that maliciously crafted images are a known threat vector.


Upstream Fix Status

No upstream commit was found that directly replaces these specific BUG_ONs in ext4_write_inline_data. The fix proposed here is new.

The closely related commit 356227096eb6 (“ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio”, present in the upstream tree but not yet in this kernel version) shows the precedent and pattern for this class of fix.


Metadata