Linux 6.16 Released: Resets, colors, fixes and more
Igalia has been contributing to the Linux kernel for many years at this point, helping our clients upstream changes, fixing bugs or adding new features, and the latest release is no different. Linux 6.16 brings a lot of enhancements made by the general community, including:
- Support for Intel Trust Domain Extensions (TDX): this CPU feature encrypts the memory of the guest VM, for confidential computing use cases in a cloud environment.
- Support for USB audio offload: this huge patchset allows audio DSPs to talk directly to xHCI USB endpoints, meaning that a USB audio device can keep playing while the CPU is in a low-power state.
- Futex: three new features for futexes were merged:
  - Historically, futex has used a single global hash table for waiters, with all tasks sharing its buckets. This causes collateral latency issues, where one task with a lot of waiters can add overhead to another, unrelated task. To avoid this, any process can now create its own private hash table with the new prctl(PR_FUTEX_HASH).
  - Similarly, a global hash table has latency issues on NUMA machines, where one node hosts the hash table and all other nodes need a more expensive memory operation to access it. With FUTEX2_NUMA, it’s possible to have one global hash table per node.
  - Finally, users can set FUTEX2_MPOL to make the global hash allocation respect a defined memory allocation policy on NUMA machines, such as the preferred node to be used.
- There’s a new build option for x86 machines, called CONFIG_X86_NATIVE_CPU. When building a kernel with this option, the kernel will be optimized for the CPU of the build machine, meaning that it will be aware of that CPU’s particular features and settings, such as the cache size. The performance gains vary from 0.2% to 3.3% in some benchmarks, which isn’t a game changer, but it’s impressive for a one-line change.
- Automatic tuning for NUMA memory allocation: in some NUMA machines, each node might have a different “weight” reflecting its bandwidth. In the past, sysadmins could manually provide the weight of each node so that the kernel would make smarter choices when allocating memory. Now, the kernel is able to calculate such weights automatically and apply them.
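As a rough illustration of the private futex hash mentioned in the list above, the sketch below asks the kernel for a per-process hash table through prctl(). The numeric constants are assumptions, copied from what the 6.16 UAPI header linux/prctl.h is understood to define, and the helper names are hypothetical. On systems without the feature the calls simply fail and the helpers report that.

```python
import ctypes

# Assumed values, taken from the Linux 6.16 <linux/prctl.h>; verify them
# against your own kernel headers before relying on them.
PR_FUTEX_HASH = 78
PR_FUTEX_HASH_SET_SLOTS = 1
PR_FUTEX_HASH_GET_SLOTS = 2


def _prctl(option, arg2=0, arg3=0, arg4=0, arg5=0):
    """Call prctl(2) through libc; return its result, or None on failure."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        prctl = libc.prctl
    except (OSError, AttributeError, TypeError):
        return None  # no libc prctl available (for example, not Linux)
    args = [ctypes.c_ulong(a) for a in (arg2, arg3, arg4, arg5)]
    ret = prctl(ctypes.c_int(option), *args)
    if ret < 0:
        return None  # kernels without the feature reject the request
    return ret


def set_private_hash_slots(slots):
    """Request a private futex hash with `slots` buckets; True on success."""
    return _prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, slots) == 0


def get_private_hash_slots():
    """Return the current private hash size, or None if unsupported."""
    return _prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS)


if __name__ == "__main__":
    if set_private_hash_slots(16):
        print("private futex hash slots:", get_private_hash_slots())
    else:
        print("PR_FUTEX_HASH is not available on this system")
```

The slot count of 16 is arbitrary; a real application would size the table according to how many futexes it expects to contend on.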
For the full details, we suggest having a look at the Kernel Newbies summary. Now, let’s have a look at Igalia’s contributions to 6.16.
DRM scheduler
The DRM scheduler is a core piece of the Linux kernel graphics subsystem, and it’s used by several GPU drivers. At Igalia, we have been working on several improvements to modernize it. However, when you are making big changes to critical code, you must have ways to reduce the odds of introducing regressions. That’s where good tests come in!
Historically, the DRM GPU scheduler has been largely uncharted territory when it comes to dedicated testing. That’s why we developed a basic set of unit tests based on the kernel’s unit testing framework, KUnit. This addition provides us with increased confidence as we proceed with upcoming work that aims to simplify and enhance the scheduler. For a deeper dive into our future plans for the scheduler, take a look at Tvrtko’s blog post: Fair(er) DRM GPU scheduler.
AMDGPU & Intel Xe drivers
In Linux 6.16, Igalia continued its dedicated efforts to refine and improve two popular GPU drivers: AMDGPU and Intel Xe. Our contributions aimed to improve maintainability, documentation, and stability.
For AMDGPU, we submitted several code refactorings for the Clear State Indirect Buffer (CSB) logic and also enriched the existing AMDGPU documentation. On the display side, we addressed issues on plane updates and color management found in the latest DCN4.01 hardware, and also expanded debug tool coverage for better maintainability.
Meanwhile, our work on the Intel Xe driver targeted primarily Intel Alder Lake-based GPUs. We contributed several fixes to enhance stability and functionality of the hardware.
GPU resets in Raspberry Pi
For this release, Igalia tackled some critical GPU reset issues on the Raspberry Pi 4 and 5. In general, we don’t want GPU resets to happen; however, when they do, we want the GPU and the system to recover gracefully.
Users were experiencing GPU hangs and UI freezes after a GPU reset. We fixed this by correctly configuring and utilizing a new set of registers for the RPi, which are vital for power management and reset operations on V3D 7.x hardware.
Further enhancing stability, we also addressed a kernel crash that happened during a GPU reset, causing a complete system freeze. These combined efforts significantly improve GPU reset reliability and user experience on the RPi 4 and 5.
Device drivers infrastructure
Since Linux 6.2, misc-based devices are no longer restricted to 255 minor numbers. However, that support was incomplete and did not work correctly until 6.15. Now, this and other bugs can be tested with a KUnit test suite that was improved during this cycle. Also, all dynamically allocated minor numbers are now above 255, which simplifies the allocation process.
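To check which misc devices on a running system sit in the dynamic range, /proc/misc can be parsed. The sketch below assumes the usual two-column "minor name" layout of that file and relies on the statement above that dynamic minors are above 255. The sample data and the device name example-dynamic are made up for illustration.

```python
LAST_STATIC_MINOR = 255  # per the text above: dynamic minors are above this


def parse_proc_misc(text):
    """Parse /proc/misc content into a list of (minor, name) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # skip blank or unexpected lines
        entries.append((int(fields[0]), fields[1]))
    return entries


def dynamic_minors(entries):
    """Return the entries whose minor lies in the dynamic range (> 255)."""
    return [(minor, name) for minor, name in entries
            if minor > LAST_STATIC_MINOR]


# Illustrative sample; on a real system the text comes from /proc/misc.
SAMPLE = """\
229 fuse
232 kvm
256 example-dynamic
"""

if __name__ == "__main__":
    try:
        with open("/proc/misc") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample when /proc/misc is missing
    for minor, name in dynamic_minors(parse_proc_misc(text)):
        print(minor, name)
```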
Filesystems, memory management and more fixes
The FUSE subsystem has had a simple way for the user-space file system servers to ask the kernel to invalidate the cached data for an inode or a dentry. Unfortunately, this mechanism was only allowed to invalidate a single object at a time. In this release, we added a new mechanism that allows the user-space server to ask the kernel to force the invalidation of all its dentries the next time they need to be re-validated.
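One way to picture this kind of lazy, bulk invalidation is an epoch counter: every cached dentry remembers the epoch in which it was cached, and bumping the counter makes all older entries fail their next revalidation. The toy model below is only a sketch under that assumption; the class and method names are hypothetical and are not the kernel's or libfuse's API.

```python
class DentryCache:
    """Toy model of epoch-based invalidation (not real FUSE code)."""

    def __init__(self):
        self.epoch = 0
        self._entries = {}  # name -> epoch at which the entry was cached

    def add(self, name):
        """Cache an entry, stamping it with the current epoch."""
        self._entries[name] = self.epoch

    def invalidate_all(self):
        """Constant-time: nothing is walked, stale entries are caught lazily."""
        self.epoch += 1

    def revalidate(self, name):
        """Return True if the cached entry may still be used."""
        cached_epoch = self._entries.get(name)
        if cached_epoch is None:
            return False
        if cached_epoch < self.epoch:
            del self._entries[name]  # stale: drop it, forcing a fresh lookup
            return False
        return True


if __name__ == "__main__":
    cache = DentryCache()
    cache.add("a.txt")
    cache.add("b.txt")
    print(cache.revalidate("a.txt"))  # entry is still current
    cache.invalidate_all()
    print(cache.revalidate("a.txt"))  # entry predates the new epoch
    cache.add("a.txt")
    print(cache.revalidate("a.txt"))  # re-cached under the new epoch
```

The point of the design is that the server pays a single cheap operation up front, while the cost of discarding each entry is deferred until that entry is actually looked at again.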
Alongside that, as usual, Igalians have been working to fix kernel bugs, and in this release we notably sent fixes for the kernel timer watchdog, filesystems (ext4, overlayfs, and the core VFS) and memory management (huge pages).
Authored (55)
André Almeida
Gavin Guo
- mm/huge_memory: adjust try_to_migrate_one() and split_huge_pmd_locked()
- mm/huge_memory: remove useless folio pointers passing
Guilherme G. Piccoli
Helen Koike
Jose Maria Casanova Crespo
- drm/v3d: fix client obtained from axi_ids on V3D 4.1
- drm/v3d: client ranges from axi_ids are different with V3D 7.1
Luis Henriques
- fuse: add more control over cache invalidation behaviour
- fs: drop assert in file_seek_cur_needs_f_lock
Maíra Canal
- drm/v3d: Associate a V3D tech revision to all supported devices
- dt-bindings: gpu: v3d: Add per-compatible register restrictions
- dt-bindings: gpu: v3d: Add SMS register to BCM2712 compatible
- dt-bindings: gpu: v3d: Add V3D driver maintainer as DT maintainer
- drm/v3d: Use V3D_SMS registers for power on/off and reset on V3D 7.x
- drm/v3d: Avoid NULL pointer dereference in v3d_job_update_stats()
- drm/etnaviv: Protect the scheduler’s pending list with its lock
- drm/v3d: Disable interrupts before resetting the GPU
Melissa Wen
- drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp
- Revert “drm/amd/display: Hardware cursor changes color when switched to software cursor”
- drm/amd/display: only collect data if debug gamut_remap is available
- drm/amd/display: no 3D and blnd LUT as DPP color caps for DCN401
- drm/amd/display: Disable CRTC degamma LUT for DCN401
Rodrigo Siqueira
- Documentation/gpu: Add new acronyms
- Documentation/gpu: Change index order to show driver core first
- Documentation/gpu: Create a documentation entry just for hardware info
- Documentation/gpu: Add explanation about AMD Pipes and Queues
- Documentation/gpu: Create a GC entry in the amdgpu documentation
- Documentation/gpu: Add an intro about MES
- drm/amdgpu/gfx: Introduce helpers handling CSB manipulation
- drm/amdgpu/gfx: Use CSB helpers in gfx_v11_0_get_csb_buffer
- drm/amdgpu/gfx: Use CSB helpers in gfx_v10_0_get_csb_buffer
- drm/amdgpu/gfx: Use CSB helpers in gfx_v9_0_get_csb_buffer
- drm/amdgpu/gfx: Use CSB helpers in gfx_v8_0_get_csb_buffer
- drm/amdgpu/gfx: Use CSB helpers in gfx_v7_0_get_csb_buffer
- drm/amdgpu/gfx: Fix gfx_v7_0_get_csb_buffer to use rb_config
- drm/amdgpu/gfx: Use CSB helpers in gfx_v6_0_get_csb_buffer
- drm/amdgpu: Add documentation associated with CSB
- drm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb
- Documentation/gpu: Add new entries to amdgpu glossary
Thadeu Lima de Souza Cascardo
- char: misc: restrict the dynamic range to exclude reserved minors
- char: misc: add test cases
- char: misc: make miscdevice unit test built-in only
- ext4: inline: fix len overflow in ext4_prepare_inline_data
Tvrtko Ursulin
- drm/xe: Fix MOCS debugfs LNCF readout
- drm/xe: Fix ring flush invalidation
- drm/xe: Pass flags directly to emit_flush_imm_ggtt
- drm/xe: Use correct type width for alignment in fb pinning code
- drm: Move some options to separate new Kconfig
- drm/sched: Add scheduler unit testing infrastructure and some basic tests
- drm/sched: Add a simple timeout test
- drm/sched: Add basic priority tests
- drm/sched: Add a basic test for modifying entities scheduler list
- drm/sched: Add a basic test for checking credit limit
- drm/xe: Adjust ringbuf emission for maximum possible size
- drm/amdgpu: Make amdgpu_ctx_mgr_entity_fini static
- drm/amdgpu: Remove duplicated “context still alive” check
Reviewed (39)
André Almeida
- selftests/futex: Use TAP output in futex_priv_hash
- selftests/futex: Use TAP output in futex_numa_mpol
- tools headers: Synchronize prctl.h ABI header
- futex: Correct the kernedoc return value for futex_wait_setup().
- selftests/futex: Add futex_numa to .gitignore
Iago Toral Quiroga
- drm/v3d: Associate a V3D tech revision to all supported devices
- drm/v3d: Use V3D_SMS registers for power on/off and reset on V3D 7.x
- drm/v3d: Disable interrupts before resetting the GPU
Jose Maria Casanova Crespo
Juan A. Suarez
Maíra Canal
- drm/vc4: tests: Use return instead of assert
- drm/vc4: tests: Document output handling functions
- drm/vc4: tests: Stop allocating the state in test init
- drm/vc4: tests: Retry pv-muxing tests when EDEADLK
- drm/v3d: fix client obtained from axi_ids on V3D 4.1
- drm/v3d: client ranges from axi_ids are different with V3D 7.1
- drm: writeback: Fix drm_writeback_connector_cleanup signature
- Revert “staging: vchiq_arm: Improve initial VCHIQ connect”
- Revert “staging: vchiq_arm: Create keep-alive thread during probe”
Rodrigo Siqueira
- drm/amd/display: add proper error message for vblank init
- drm/amd/display: add proper error message for vblank init
- drm/amdgpu: add initial documentation for debugfs files
- drm/amdgpu: drop some dead code
- drm/amdgpu/gfx6: fix CSIB handling
- drm/amdgpu/gfx7: fix CSIB handling
- drm/amdgpu/gfx8: fix CSIB handling
- drm/amdgpu/gfx9: fix CSIB handling
- drm/amdgpu/gfx10: fix CSIB handling
- drm/amdgpu/gfx11: fix CSIB handling
Tvrtko Ursulin
- drm/xe/rtp: Drop sentinels from arg to xe_rtp_process_to_sr()
- drm/radeon: fix the warning for radeon_cs_parser_fini
- drm: add drm_file_err function to add process info
- drm/amdgpu: add drm_file reference in userq_mgr
- drm/amdgpu: use drm_file_err in fence timeouts
- drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userq.c
- drm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver
- dma-buf: fix compare in WARN_ON_ONCE
- drm/amdgpu: Fix memory leak in amdgpu_ctx_mgr_entity_fini
- drm/panfrost: Fix scheduler workqueue bug
Tested (1)
André Almeida
Acked (24)
Changwoo Min
- sched_ext: Indentation updates
- sched_ext: Remove scx_ops_enq_* static_keys
- sched_ext: Remove scx_ops_cpu_preempt static_key
- sched_ext: Remove scx_ops_allow_queued_wakeup static_key
- sched_ext: Make scx_has_op a bitmap
- sched_ext: change the variable name for slice refill event
- sched_ext: add helper for refill task with default slice
- sched_ext: Clarify CPU context for running/stopping callbacks
- sched_ext: Introduce scx_sched
- sched_ext: Avoid NULL scx_root deref through SCX_HAS_OP()
- sched_ext: Use dynamic allocation for scx_sched
- sched_ext: Inline create_dsq() into scx_bpf_create_dsq()
- sched_ext: Factor out scx_alloc_and_add_sched()
- sched_ext: Move dsq_hash into scx_sched
- sched_ext: Move global_dsqs into scx_sched
- sched_ext: Relocate scx_event_stats definition
- sched_ext: Factor out scx_read_events()
- sched_ext: Move event_stats_cpu into scx_sched
- sched_ext: Move disable machinery into scx_sched
- sched_ext: Clean up SCX_EXIT_NONE handling in scx_disable_workfn()
- sched_ext: Call ops.update_idle() after updating builtin idle bits
Tvrtko Ursulin
- drm/i915/backlight: drop dmesg suggestion to file bugs
- drm/i915/error: drop dmesg suggestion to file bugs on GPU hangs
- Revert “drm/i915/gem: Allow EXEC_CAPTURE on recoverable contexts on DG1”