
System crash?


#1

A month ago I built a new computer with an H270 chipset and an Intel Core i7-7700 CPU. I made sure to handle all components in an ESD-safe way. After the build was completed, I installed Void Linux and stress tested the system with Prime95 and MemTest86. No errors were reported during testing, and the system did not crash or become unstable. After the initial testing I did not use the system much: I would log on to update it and, a few times, to browse the web and watch YouTube videos, but it mostly sat idle.

Yesterday I tried to log in via SSH after not having used the system for two days, but I was unable to connect. I then tried to log in on the console instead, but I got no output on the monitor and the Num Lock/Caps Lock/Scroll Lock LEDs did not respond to key presses, so the only thing I could do was a hard power-off. The system then booted fine and has been running since.

Logging was not activated so I have absolutely no idea what happened. The only thing I know is that the system was idle and Xorg was not running during the crash.

My biggest worry is that the crash was caused by a hardware problem, especially since the system was doing nothing when it happened. I don’t want to replace my current system with an unstable one, as I’m pretty dependent on a working computer.

What is your opinion on this? What would you do in this case? Could this simply be a software bug that I don’t need to worry about, or does it sound more like a hardware problem?


(David) #2

I’d tend to blame a bleeding edge distro for any problems rather than a newly built system doing nothing. Enable logging and if this happens again, maybe you can find out what’s going on. More info about your particular Void setup would help too.


#3

I really hope so. A software-caused crash now and then is not a problem. However, I’m not a hardware person and building the system was kind of a hassle, so I would prefer not to mess around with it anymore.

Just a note: after thinking some more about it, I’m still 99% sure that the system was doing nothing when it crashed. If it was doing anything, it was running a small script in tmux that looked up the highest CPU core temperature in /proc every 0.5 seconds and printed it to the screen.
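The script itself was not posted, so the following is only a rough sketch of what such a temperature poller could look like. It assumes the readings come from the hwmon sysfs files (/sys/class/hwmon/hwmon*/temp*_input, values in millidegrees Celsius) rather than from /proc, and the function names max_temp and watch_max_temp are made up for illustration.

```shell
#!/bin/sh
# Print the highest value found in the given temperature files.
# Assumption: each file holds a single non-negative integer in
# millidegrees Celsius, as hwmon temp*_input files do.
max_temp() {
    max=""
    for f in "$@"; do
        [ -r "$f" ] || continue            # skip missing or unreadable files
        t=$(cat "$f")
        case $t in
            ''|*[!0-9]*) continue ;;       # skip empty or non-numeric content
        esac
        if [ -z "$max" ] || [ "$t" -gt "$max" ]; then
            max=$t
        fi
    done
    if [ -n "$max" ]; then
        echo "$max"
    else
        return 1                           # no usable reading found
    fi
}

# Poll twice per second and print the highest reading in whole degrees.
# Not called automatically; run watch_max_temp by hand (Ctrl-C to stop).
# Fractional sleep is not POSIX, but GNU coreutils and BusyBox accept it.
watch_max_temp() {
    while :; do
        if m=$(max_temp /sys/class/hwmon/hwmon*/temp*_input); then
            echo "$((m / 1000)) C"
        fi
        sleep 0.5
    done
}
```

A loop like this only reads a few small files and sleeps, so it is very light and the machine would still count as practically idle while it runs.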

I’m waiting for a BIOS update to my motherboard that will fix the Skylake/Kaby Lake hyper-threading flaw, but I have read that it’s very rare to trigger it and that the system would have to be under load, so I think it’s unlikely to have anything to do with my problem.

I will definitely enable logging, and I’m also planning to set up ‘netconsole’ in case the system is unable to write the logs locally.
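For reference, the netconsole module takes a single parameter of the form src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac. Below is a minimal sketch of assembling that string; the helper name netconsole_param and all addresses, ports and the interface name in the usage comment are placeholders, not values from this setup.

```shell
#!/bin/sh
# Build the value for the netconsole= module parameter.
# Arguments: src_port src_ip dev tgt_port tgt_ip [tgt_mac]
# The target MAC is optional and left empty when not given.
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "${6:-}"
}

# Hypothetical usage, as root, with placeholder values:
#   modprobe netconsole \
#     "netconsole=$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20)"
```

The receiving machine then only needs something listening on the chosen UDP port to capture the kernel messages.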

The setup is completely standard. I booted the computer in UEFI mode and installed Void using the void-installer. It is connected to the network by ethernet. The installation is only used for testing various things and when I’m ready to shift from my main system, I will do a proper installation with encryption etc. and that’s why I unfortunately did not enable logging to begin with.


#4

The Debian message says you should install the Intel microcode package:

“THE MICROCODE PACKAGES FROM THE RECENT STABLE RELEASE (June 17th, 2017)
ALREADY HAVE THE SKYLAKE FIX, BUT YOU MAY HAVE TO INSTALL THEM.”

$ xbps-query -Rs intel-ucode
[*] intel-ucode-20170707_1 Microcode update files for Intel CPUs

#5

The i7-7700 is a Kaby Lake processor, so the fix is currently only available through a BIOS/UEFI update:

The Kaby Lake microcode updates that fix this issue are currently only
available to system vendors, so you will need a BIOS/UEFI update to get
it.


#6

The Debian message is outdated.

Look at the latest microcode update from Intel, released a few days ago:

“This update includes patches for SKL and KBL processors that addresses a processor/microcode defect with Hyper-Threading enabled. There are total of three patches - one for SKL and two for KBL.”

The problem seems to be fixed now.


#7

You are absolutely right. I didn’t think Intel would be so fast to release the microcode.

Now I can install the package and rule out hyper-threading as the cause of the crash if it happens again. Maybe something else has also been fixed in an earlier version. Thanks for pointing this out.
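One way to confirm that the new microcode is actually loaded is to compare the revision the kernel reports before and after installing the package and rebooting. A minimal sketch, assuming the usual "microcode : 0x.." line in /proc/cpuinfo; the function name microcode_rev is made up for illustration.

```shell
#!/bin/sh
# Print the microcode revision of the first CPU listed in a
# cpuinfo-style file (defaults to /proc/cpuinfo).
microcode_rev() {
    awk -F': *' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# Usage: note the value, install intel-ucode, reboot, then compare.
#   microcode_rev
```

If the revision does not change after the reboot, the update is probably not being applied at boot. On Void I believe the initramfs has to be regenerated after installing intel-ucode, but check the Void documentation for the exact steps.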