Bad RAM, or Is It?

Good Morning from my Robotics Lab! This is Shadow_8472, and today I’m solving Derpy’s derp once and for all (I hope)! Let’s get started!

Background

There’s seemingly always been something off about Derpy, even before I dubbed it DerpyChips. First it was a cheap hard drive shorting out. Every so often, the system would throw a power kernel error. I fixed it inadvertently when I switched to a solid-state drive while installing Linux.

More recently, we got some new RAM to replace the sticks I borrowed for another computer, but then the system developed occasional twitching fits. It took a while, but I identified one of its four 4 GB RAM modules as having a number of bad memory addresses. The others passed the testing tool. The twitching improved, but did not go away entirely.

Long-Haul Diagnostics

I started using Derpy a little more heavily a bit under a month ago, and the symptoms escalated from Discord randomly restarting to Kerbal Space Program crashing at least once every couple of hours, though only during loading screens and, conveniently, always shortly after a save or autosave.

Sometimes my game would catch, and I could close it out properly before it quit with a note about “core dumped.” If I was unlucky, even the mouse would freeze for minutes on end. Rebooting didn’t help.

With problem frequency rising into the barely diagnosable range, I decided to run Derpy’s RAM through a less formal test. I temporarily moved the suspect sticks to my Manjaro workstation, isolating them from an unknown number of possible causes. Both computers were pretty dusty, so I took them outside and hit them with the canned air. To keep things fair, I only installed three out of four known good RAM sticks into Derpy and stored the fourth in a drawer.

A Test in Question

Both systems were usably stable right away. Derpy’s fan especially wasn’t panting nearly so hard without a giant dust mat blocking the heat sink. But what if that was the problem all along? Dust in the computer is the perfect unstory. It was early enough in the week that, had dust been the whole answer, I would have sought out another topic.

I spent an afternoon of perfect stability on Derpy, then focused my efforts on passively testing the RAM with my main computer. 99% stability. I had a few audio glitches, but for all I knew, they were always there and I was just noticing them because I was expecting trouble.

Over the course of the week, more problems started showing up. Not many, not often, but KSP did eventually lag hard a couple of times, and Discord restarted, switching screens while it was at it. At one point the whole system went unresponsive, even when I tried switching to the “real” terminals outside the graphical environment. One time in a week of testing, I was forced to reboot.

Takeaway

I wasn’t as intense with the RAM as I would have liked, but all things considered, I’m around 95-98% sure Derpy’s replacement RAM is no good. I’ll be interested in seeing about switching it out under warranty this coming week.

To give an idea of how hard this error is to reproduce on command: after a couple of big crashes, I finally had the idea to keep the terminal I’ve been running KSP from up on my second monitor. The plain loading screen is a buzz of activity as the terminal dutifully posts entries in the log file. I was hoping to see its activity during a bad crash, but it wasn’t to be.
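If I wanted a record that survives a freeze instead of relying on watching the terminal, something like the following might do. This is only a rough sketch of the idea, not what I actually ran: the function name run_and_log, the log file name, and the launch command in the comment are all my own placeholders. It runs a program, echoes its output as usual, and appends every line to a file with a timestamp, forcing each line to disk as it goes.

```python
import os
import subprocess
import sys
import time


def run_and_log(cmd, log_path):
    """Run cmd, echo its output, and append each line to log_path with a timestamp.

    run_and_log and log_path are hypothetical names for this sketch.
    Returns the exit code of the command.
    """
    with open(log_path, "a") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold error output into the same stream
            text=True,
            errors="replace",          # don't die on odd bytes in the output
            bufsize=1,                 # line-buffered reads
        )
        for line in proc.stdout:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {line}")
            log.flush()
            os.fsync(log.fileno())     # push the line to disk before a freeze can eat it
            sys.stdout.write(line)
        return proc.wait()


# Example call (the path to the game binary is an assumption; adjust to your install):
# run_and_log(["./KSP.x86_64"], "ksp-session.log")
```

After a forced reboot, the last few timestamped lines in the file would at least show what the game was doing right before things went sideways. Syncing on every line is slow, but for a diagnostic run that seems like a fair trade.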

Final Question

I’ve narrowed down the problem to the part and prescribed a solution, but I still don’t understand the why. My best guess is that the RAM has degraded somehow since the test, or else it’s weaker under extra heat. Why does my RAM glitch out the way it does?
