I had an interesting day at work today trying to untangle some legacy shit and merging old stuff into a brand spanking new monitoring cluster.

It was a lot of iterating through the same stuff over and over. Breaking stuff into pieces and finding out what it’s supposed to do. And trying to read the manuals available.

I felt like the dog meme above:

"LET ME IN I NEED TO GO BACK OUT AGAIN!"

I’ve spent some time wrecking havoc on a op5 installation today while trying to setup a load balanced cluster with a couple of servers. The Scalable Monitoring documentation was straightforward but I have a lot of tweaking to do before it works as we want it. Using a existing setup is always harder than installing from scratch, I guess.

The strangest thing from the instructions was the official way of pushing configuration with their clustered setup.

mon restart ; sleep 3 ; mon oconf push

Why would you use mon restart;sleep 3 instead of stopping the monitor service, pushing the conf and starting when you’re done? And why is the sleep exactly 3 seconds? I can’t leave this be…

There has to be some sort of underlying story that made them add a 3 seconds sleep to the official documentation.