iGNEOS - 4 months ago
Linux Question

Linux (Debian 8 Jessie) HRTimer - Kernel - Leap Seconds

ANSWER: VM Time Syncing is an art, I'm gonna count my blessings and use what already works for me.

This is why (quote):
I've spent several years playing with synchronizing time between host and guest VMs, with and without NTP etc. - it's basically a black art and heavily dependent on hypervisor, both kernels, and a host of settings. We aren't going to sort it out in comments on SO, but if you have a known good configuration, I'd revert to that and change bit by bit until you know what breaks it.

– abligh




Edit:

The Percona information is for context; it is what I'm working on. But the issue IS showing up as a leap second issue in the Debian 8 kernel, specifically with the hrtimer (high-resolution timer). Mozilla's suggested fix is to reset the time using the date command in Linux, but attempting to set the date tells me I don't have permission despite being root.




Information: The Inside Story of the Extra Second That Crashed the Web

http://wired.com/2012/07/leap-second-glitch-explained




Screenshots:

Recent Attempt: http://prnt.sc/c208x2

Top: http://prnt.sc/c202q9

Mysql ProcessList: http://prnt.sc/c20b4f




Context:

I've heard this has to do with a leap second issue. Thing is, it doesn't fit my situation exactly.

I have several instances of Debian Linux in VMs on Proxmox.

2 separate Percona Galera MySQL clusters.

The first cluster is the original one, which is in use, and it doesn't have this problem and never has. My new cluster setup has it on all 3 nodes.

I've tried applying several versions of the leap second fix:

date
date -s "`date`"
date -s "`date`"


All of them except the first end with a message that I don't have permission to do that, but I'm root!

And no work-y.

Many reboots.

Each instance except one slave has the same specs; I boosted that one to test whether it was a limited-resource issue.

1 GB RAM

1 CPU at 3.6 GHz

The only other thing I can say is the problem starts AFTER I upload a backup database.

The database is one I already have in the first cluster. It is around 5 MB and is mostly relational lookup-style tables. The new nodes are not in use and not doing anything, and many reboots later I'm getting pretty stuck.

PS: Process List is empty. (1 for self look-up and 2 sleep idles)

Answer

I am guessing that the problem here is that you are running in a VM (or even a container). Although you are root, you don't have permission to change the time from inside the container / guest OS. That's because the time is synced from the host operating system, and clearly the guest can't change that.

If so, can you fix it in the host OS? Or do you not have access to that?
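One way to test that guess from inside the guest is to look at the effective capability mask. Setting the clock requires CAP_SYS_TIME (capability number 25), and containers are often started with it dropped even for root. This is only a rough sketch: the helper name has_cap_sys_time is made up for illustration, and it assumes a Linux /proc filesystem is mounted.

```shell
#!/bin/sh
# CAP_SYS_TIME is capability number 25 in linux/capability.h.
CAP_SYS_TIME_BIT=25

# has_cap_sys_time MASK_HEX (hypothetical helper):
# prints "yes" if bit 25 is set in the given hex capability mask, else "no".
has_cap_sys_time() {
    mask=$(( 0x$1 ))
    if [ $(( ($mask >> $CAP_SYS_TIME_BIT) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# CapEff is the effective capability mask; a process started from this
# shell normally inherits the same mask as the shell itself.
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
    if [ -n "$cap_eff" ]; then
        echo "CapEff=$cap_eff CAP_SYS_TIME=$(has_cap_sys_time "$cap_eff")"
    fi
fi
```

If this prints CAP_SYS_TIME=no while you are root, the clock can only be changed from the host side.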

Also useful would be an

strace -f -s2048 -o/tmp/post-this-file date -s "`date`"

i.e. a trace of the system calls made by date.
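If you want to skim the trace yourself before posting it, something like the following should pull out the interesting lines. It is a sketch under assumptions: failed_time_calls is a made-up helper name, and I am assuming date sets the clock through clock_settime, settimeofday or stime, which may differ with your libc version.

```shell
#!/bin/sh
# failed_time_calls FILE (hypothetical helper):
# print the time-setting system calls in an strace log that returned an error.
failed_time_calls() {
    grep -E '(clock_settime|settimeofday|stime)\(.*\) += -1 E[A-Z]+' "$1"
}

# Same output path as in the strace command above.
trace=/tmp/post-this-file
if [ -r "$trace" ]; then
    failed_time_calls "$trace" || echo "no failed time-setting calls in $trace"
fi
```

An EPERM on one of those calls would point at the kernel refusing the request, rather than at date itself.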

The HiRes timer livelock you refer to was fixed upstream in commit id=6b43ae8a619d17c4935c3320d2ef9e92bdeed05d.

The Debian bug to incorporate the fix into the kernel seems to be here, and the fixes are described here (incorporated into Debian's 3.2.29-1 and 2.6.32-46 kernels). What kernel are you actually running (output of uname -a would be helpful)? This isn't always obvious, especially in a container environment, where the guest sees the host's kernel.
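As a rough first check, the running kernel release can be compared against the 3.2.29 version mentioned above. Treat this as a sketch with caveats: version_ge is a made-up helper, it relies on GNU sort -V, it only looks at the upstream part of uname -r (not the Debian package revision), and it says nothing useful about the separately patched 2.6.32 line.

```shell
#!/bin/sh
# version_ge A B (hypothetical helper):
# succeeds if version A is at or above version B in GNU "sort -V" ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Upstream part of the release string, e.g. "3.16.0" from "3.16.0-4-amd64".
running=$(uname -r | cut -d- -f1)
# Upstream version of the fixed Debian 3.2 package named above.
fixed="3.2.29"

if version_ge "$running" "$fixed"; then
    echo "kernel $running is at or above $fixed"
else
    echo "kernel $running predates $fixed"
fi
```

The full uname -a output is still the thing to post, since distribution kernels carry backported patches that a plain version comparison cannot see.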

Restarting Percona may fix the issue if you haven't tried that.

Post a little more about the environment and I might be able to be more help.