Here’s how troubleshooting dead servers at Twitter led to my taking on a long-running project working with the company’s server vendor.
I provide a list of tools for data center technicians to keep in a backpack or on a cart so they can perform their job properly.
Data center after-hours work is hurry up and wait: I wait for devices to switch off, do my work, and then wait for them to come back online.
Data center on-call life is one where every Slack ping brings fear because you think all heck has broken loose and you need to fix it now!
Rack push days at the Twitter data center sucked because we were always tired and sweaty from pushing heavy racks down a long hallway.
There are several dangers in the data center that can befall technicians and even customers, and I list them along with ways to avoid them.
Due to the amount of walking and standing, proper shoes are vital for data center techs, or they will be in pain. My recommendation: Crocs.
My opinion is harsh, but git gud, techs, and stop taking easy tickets just because you don’t want to get “stuck” with a difficult-to-diagnose server.
In the Twitter data center, if a network switch had a dead port, techs replaced the entire switch. That’s wasteful, right? Nope.
For us Site Operations Techs, diagnosing faulty servers in Twitter’s data center took multiple commands and sometimes a crash cart.