Radeon RX6600 GPU hang leading to Xserver crash #4

Open
opened 2024-09-30 17:50:30 +00:00 by humaton · 0 comments

Every 2-3 days or so, my X server freezes for about a minute and then exits. It seems to be related to new output causing scrolling in xterm, although I have also seen it in Ghidra occasionally. My hardware is an old i7-3770K with a recently fitted Radeon RX6600, so I don't know if this is a regression. It did not happen with the Intel iGPU.

Reproducible: Sometimes

Steps to Reproduce:
1. Run the XFCE desktop with extensive use of xterm (the real old-fashioned X11 xterm)
2. Generally use the desktop for web browsing, YouTube, and software development for a couple of days, using suspend-to-RAM overnight
3. Every so often, do something which results in the output scrolling in xterm

Actual Results:
The X server freezes and, after a minute or so, crashes back to the greeter/user login prompt. In the frozen state, there is almost always an xterm in the process of scrolling where the new line is a corrupted black-and-white pattern instead of new text.

The dmesg output has many repeated instances of this:

[458041.735598] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:7 pasid:32772, for process Xorg pid 390300 thread Xorg:cs0 pid 390320)
[458041.735611] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800109e12000 from client 0x1b (UTCL2)
[458041.735615] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00701031
[458041.735618] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[458041.735621] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[458041.735623] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[458041.735625] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[458041.735627] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[458041.735629] amdgpu 0000:03:00.0: amdgpu: RW: 0x0

Expected Results:
The desktop should not crash.

$ uname -a
Linux stando.fishzet.co.uk 6.3.11-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Jul 2 13:17:31 UTC 2023 x86_64 GNU/Linux

$ rpm -qa | grep Xorg
xorg-x11-server-Xorg-1.20.14-23.fc38.x86_64
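
For anyone triaging similar logs, the sub-fields that the driver prints under GCVM_L2_PROTECTION_FAULT_STATUS can be recomputed from the raw register value. The following is a minimal sketch, not the driver's own code: the bit positions in `FIELDS` are an assumption about the register layout on this GPU generation, and the names `FIELDS` and `decode_fault_status` are made up for illustration. With the value from the log above (0x00701031), the assumed layout gives the same MORE_FAULTS, WALKER_ERROR, PERMISSION_FAULTS, MAPPING_ERROR, RW, client ID and vmid values that dmesg reports.

```python
# Assumed (low bit, width) layout of GCVM_L2_PROTECTION_FAULT_STATUS.
# This layout is an assumption; check it against the kernel's register
# headers before relying on it for another GPU generation.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # UTCL2 client ID
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> dict[str, int]:
    """Extract each bit field from the raw status register value."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Raw value copied from the dmesg output in this report.
    for name, value in decode_fault_status(0x00701031).items():
        print(f"{name}: {value:#x}")
```

Decoding the value this way makes it easier to compare many fault entries at once, for example to check whether every fault comes from the same client ID and vmid.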

Reference: rpms/mesa#4