Friday, November 30, 2018

Production Postmortem: the ARM Is Killing Me

“If a tree falls in a forest and no one is around to hear it, does it make a sound?” This is a well-known philosophical question, and this story is its technological equivalent. We got a report that RavenDB was failing in the field. But the details around the failure were critical.

The failure happened in the field, literally. The system runs an industrial robot using a custom ARM board. The failure would only happen on the robot in the field and would not reproduce in the user’s test environment or on our own systems. Initially, that was all the information that we had: “This particular robot works fine for a while, but as soon as there is a break, RavenDB dies and needs to be restarted.” I have to say, that was the first time I had run into a system that would crash when it went idle instead of dying under load.



from DZone.com Feed https://ift.tt/2rcb7d9
