= Troubleshooting
Brandon Nielsen; Jibec
:revnumber: unknown
:revdate: 2021-02-22
:category: Kernel
:tags: How-to, Kernel, Troubleshooting
:page-aliases: kernel/troubleshooting.adoc

The kernel, like any software, has bugs. It is a large, complex project, and problems can be difficult to troubleshoot. This document covers some basic troubleshooting techniques to help narrow down the root cause of an issue.

== Boot failures
Sometimes the kernel fails to boot. Depending on where the problem is in the
boot process, there may or may not be any output. Some good first steps are:

* Remove `quiet` and `rhgb` from the boot flags. Removing `quiet` enables more
log messages, and removing `rhgb` disables the graphical boot screen. If the
text output scrolls too fast to read, add `boot_delay=1000` (the number of
milliseconds to delay between each printk during boot). You can use a camera
to take pictures of the output.
* Boot with `vga=791` (or just `vga=1` if the video card does not support 791)
to put the framebuffer into a high-resolution mode. This fits more lines of
text on screen, giving more context for bug analysis.
* Add the `initcall_debug` parameter, which traces the initcalls as they are
executed.
* If you get no output at all from the kernel, booting with `earlyprintk=vga`
can sometimes yield something of interest.
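
As a rough sketch of the first and third suggestions above, the following shell snippet previews what an edited kernel command line would look like. The `debug_cmdline` helper is hypothetical and exists only for illustration; the commented-out `grubby` command assumes a Fedora system where `grubby` manages the boot entries.

```shell
# Hypothetical helper: print a kernel command line with `quiet` and `rhgb`
# removed and two of the debugging options from the list above appended.
# The body runs in a subshell so that `set -f` (no glob expansion while
# splitting the arguments) does not leak into the calling shell.
debug_cmdline() (
    set -f
    out=""
    for arg in $1; do
        case "$arg" in
            quiet|rhgb) ;;                  # drop the flags that hide boot output
            *) out="$out${out:+ }$arg" ;;   # keep everything else, in order
        esac
    done
    printf '%s\n' "${out:+$out }boot_delay=1000 initcall_debug"
)

# Preview the change against the running kernel's command line.
if [ -r /proc/cmdline ]; then
    debug_cmdline "$(cat /proc/cmdline)"
fi

# To make the change persistent for all installed kernels (run as root;
# commented out here because it modifies the boot configuration):
# grubby --update-kernel=ALL --remove-args="quiet rhgb" --args="boot_delay=1000 initcall_debug"
```

For a one-off test it is usually enough to press `e` in the GRUB menu and edit the `linux` line by hand, which leaves the saved configuration untouched.
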

== Hangs and freezes

* Check whether pressing the CapsLock key (or NumLock or ScrollLock) changes
the state of the light on the keyboard. If the light does not respond, the
kernel has probably hung completely; if it does, something else is going on.
* The SysRq magic keys may still work. You may need to add
`sysrq_always_enabled=1` to the kernel boot command line. See
https://fedoraproject.org/wiki/QA/Sysrq[the wiki article on SysRq] for usage
details.
* Setting `nmi_watchdog=1` on the kernel command line will cause a panic when
an NMI watchdog timeout occurs.
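
Before a hang happens, it can help to check whether SysRq is already enabled on the running system. The sketch below reads the setting from `/proc/sys/kernel/sysrq`. The `sysrq_state` helper is hypothetical, and it assumes the kernel's documented meaning of the value: 0 disables SysRq, 1 enables all functions, and larger values are a bitmask of allowed functions.

```shell
# Hypothetical helper: describe the SysRq setting stored in a file.
# With no argument it reads the live value from /proc/sys/kernel/sysrq.
sysrq_state() {
    value=$(cat "${1:-/proc/sys/kernel/sysrq}") || return 1
    case "$value" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $value: only some functions enabled" ;;
    esac
}

# Report the state of the running system, if the file is available.
if [ -r /proc/sys/kernel/sysrq ]; then
    sysrq_state
fi

# To enable all SysRq functions until the next reboot (run as root):
# echo 1 > /proc/sys/kernel/sysrq
```

The runtime setting is lost on reboot, so `sysrq_always_enabled=1` on the kernel command line remains the option to use when the hang occurs early in boot.
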

== Logs to collect
When reporting an issue with the kernel you should always attach the kernel
logs, usually collected with the `dmesg` command. For some types of issues,
you may need to collect more logs.
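
A minimal sketch of saving the kernel log to a file that can be attached to a bug report follows. The `save_kernel_log` helper and its default file name are hypothetical. Reading the log with `dmesg` may require root on systems where `kernel.dmesg_restrict` is set.

```shell
# Hypothetical helper: write the current boot's kernel log to a file and
# print the file name. The default name includes the running kernel version.
save_kernel_log() {
    out="${1:-dmesg-$(uname -r).txt}"
    if dmesg > "$out"; then
        echo "$out"
    else
        rm -f "$out"    # do not leave an empty file behind on failure
        return 1
    fi
}

# Usage (commented out so that sourcing this snippet writes nothing):
# save_kernel_log
#
# After a hang that forced a reboot, the previous boot's kernel messages can
# often be recovered from the journal, assuming the journal is persistent:
# journalctl -k -b -1 > previous-boot.txt
```
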

=== Input issues (touchpad etc.)

Instructions for collecting logs are documented on the https://wayland.freedesktop.org/libinput/doc/latest/reporting-bugs.html[libinput website].

=== Sound issues

`alsa-info.sh` provides information about both kernel and userspace components.
If you have both a working and a non-working kernel, provide the `alsa-info.sh`
output for each.

== Bisecting the kernel
If the problem you've encountered isn't present in older versions of the
kernel, it is very helpful to use `git-bisect` to find the commit that
introduced the problem. For a general overview of `git-bisect`, see its
https://git-scm.com/docs/git-bisect[documentation]. An outline of how to bisect
the kernel is included in the
https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html[kernel
documentation]. This guide contains Fedora-specific details.

[NOTE]
====
Bisecting is a time-consuming task, but it's very straightforward and is
often the best way to find the cause of a problem. If you're really interested
in getting the problem you're seeing fixed, bisecting will speed up the process
considerably in most cases.
====
. Find the newest version you can that works. This will be the initial "good"
version. The first version you find that doesn't work will be the initial "bad"
version.
. Install the xref:kernel/build-custom-kernel.adoc#_get_the_dependencies[dependencies]
required to build the kernel.
. Next, xref:kernel/build-custom-kernel.adoc#_getting_the_sources[get the source code].
. Prepare a `.config` file. Assuming you've got both the good and bad kernel
installed, the config for both will be in `/boot/`.footnote:[When bisecting
between major versions (e.g. `v4.16` and `v4.15`) new configuration options
will be added and removed as you bisect. It's _usually_ safe to select the
default.]
. Start a new `git-bisect` with `git bisect start`.
. Mark the newest version that works as "good" with `git bisect good <tag>`.
For example: `git bisect good v4.16.8`.
. Mark the first version that does not work as "bad" with `git bisect bad
<tag>`. For example: `git bisect bad v4.17`.
. xref:kernel/build-custom-kernel.adoc#_building_the_kernel[Build the kernel]. Sometimes
commits cannot be built. If this happens, skip the commit with `git bisect
skip`.
. xref:kernel/build-custom-kernel.adoc#_installing_the_kernel[Install the kernel].
. Reboot into the new kernel and test to see if it works.
. If the new kernel works, mark it as good with `git bisect good`. Otherwise,
mark it as bad with `git bisect bad`.
. Repeat the previous four steps (build, install, test, and mark) until you've
found the commit that introduced the problem.