TSC warp, Threads

Hal Murray hmurray at megapathdsl.net
Thu Mar 11 01:25:26 UTC 2021


Does anybody have access to server class systems?  Or know somebody who can 
run a quick test for me?

Long story.

I'm trying to set up a test environment for the thread work.  Dell high end 
workstations use Xeon chips.  I have a T5610, but it has TSC warp so the kernel 
doesn't use the TSC for timekeeping.  That's not fatal, but it's annoying and 
it skews timing measurements away from what a real system would get.

In case you aren't familiar with TSC warp...  Using the TSC for timekeeping 
requires that the TSCs on all the cores be in sync.  It took Intel a few tries 
to figure out what they had to do.  Chip reset has to clear them all, but a 
soft reset for an individual CPU must not clear the TSC.  So old chips don't 
work.  I haven't found a list of chips that do/don't work, or a recipe.  As far 
as I can tell, there isn't a feature flag for this.

The Linux kernel now tests to see if the TSCs are in sync.  When they are not 
in sync, dmesg/syslog will have things like this:
Mar  1 08:22:43 sam kernel: TSC synchronization [CPU#0 -> CPU#1]:
Mar  1 08:22:43 sam kernel: Measured 443896 cycles TSC warp between CPUs, turning off TSC clock.
Mar  1 08:22:43 sam kernel: tsc: Marking TSC unstable due to check_tsc_sync_source failed

The problem could be in the processor chip or the kernel or the BIOS.  The 
kernel works on other systems, so I'm pretty sure it's not guilty.  I found 
some Intel documentation for the chips I'm using that lists doing the resets 
right (for timekeeping) as an erratum.  That leaves the BIOS.

So I'd like to see if the BIOSes on other systems have this bug.  If anybody 
has access to a server or high end workstation, anything running Linux on a 
Xeon chip, I'd like to know if it does or doesn't have a warped TSC.  Just run 
"dmesg | grep -i tsc".  If you get the 3 lines above, it's warped.
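
For anyone checking several machines, the same test can be scripted.  This is 
only a sketch: the function name and the result keys are made up for 
illustration, and the patterns assume the kernel messages look like the ones 
quoted above (other kernel versions may word them differently).

```python
import re

# Patterns taken from the kernel messages quoted above (assumed wording).
WARP_RE = re.compile(r"Measured (\d+) cycles TSC warp between CPUs")
UNSTABLE_RE = re.compile(r"Marking TSC unstable", re.IGNORECASE)


def tsc_warp_report(log_text):
    """Summarize TSC warp evidence found in dmesg/syslog text."""
    warps = [int(m.group(1)) for m in WARP_RE.finditer(log_text)]
    return {
        "warped": bool(warps),                          # any warp message seen
        "max_warp_cycles": max(warps) if warps else 0,  # largest reported warp
        "marked_unstable": bool(UNSTABLE_RE.search(log_text)),
    }
```

Feed it the text from dmesg or a syslog file.  As a cross-check, 
/sys/devices/system/clocksource/clocksource0/current_clocksource shows which 
clocksource the kernel ended up using.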

I'd like to know the type/model of system and the processor chip and the 
yes/no on the warp.  You can get the CPU chip from /proc/cpuinfo.
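
A small sketch for pulling the chip name out of /proc/cpuinfo.  It assumes the 
x86 layout, where each logical CPU has a "model name" line; the function name 
is hypothetical.

```python
def cpu_models(cpuinfo_text):
    """Return {model name: logical CPU count} from /proc/cpuinfo text."""
    counts = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only "model name" lines matter; lines without a colon are skipped.
        if sep and key.strip() == "model name":
            name = value.strip()
            counts[name] = counts.get(name, 0) + 1
    return counts
```

On a Linux box: cpu_models(open("/proc/cpuinfo").read())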

Ideally, I'd like to find a workstation (quieter than servers) that doesn't 
have this bug/feature and is old enough so I can get a refurbished system at a 
reasonable price.

Data from servers can confirm that the chip is not broken.  I'm particularly 
interested in E5-26xx chips.

---------------

I've been debugging test code.  I have a multi-threaded echo server with a 
knob to spin for N uSec before replying.  I also have a test harness to drive 
clients on several systems, each with several threads/sockets so several 
low-end systems can generate enough traffic to saturate a high end system.
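
To make the setup concrete, here is a rough sketch of a threaded UDP echo 
server with a spin knob.  It is not the actual test code, which isn't shown 
here.  The names, the use of UDP, and the single shared socket are all 
assumptions, and Python's GIL means this will not get anywhere near the 
packet rates discussed below.

```python
import socket
import threading
import time


def spin(usec):
    """Busy-wait for roughly `usec` microseconds to simulate per-packet work."""
    deadline = time.perf_counter() + usec / 1e6
    while time.perf_counter() < deadline:
        pass


def echo_worker(sock, spin_usec, stop):
    """Echo each datagram back to its sender after spinning."""
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue  # wake up periodically to check the stop flag
        spin(spin_usec)
        sock.sendto(data, addr)


def start_echo_server(host="127.0.0.1", port=0, workers=4, spin_usec=0):
    """Start `workers` threads sharing one UDP socket.

    Returns (sock, stop_event, threads).  port=0 lets the OS pick a port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(0.2)
    stop = threading.Event()
    threads = [
        threading.Thread(target=echo_worker, args=(sock, spin_usec, stop),
                         daemon=True)
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    return sock, stop, threads
```

The spin is a busy loop rather than a sleep so that it burns CPU on the worker 
the way real per-packet processing would.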

If N is 0, my best workstation can echo a million packets per second.  I think 
the limiting factor is the kernel thread processing input packets.  It has to 
figure out which socket gets each packet.  It takes about 4 uSec per packet 
for the worker threads.  If N is big enough then the worker CPUs get saturated.
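
A back-of-the-envelope model of where the bottleneck moves as N grows.  This 
is a sketch under loud assumptions: the 4 uSec is treated as CPU time per 
packet on one worker, each worker has its own core, and the million 
packets/second at N = 0 is treated as a hard cap from the kernel input thread. 
The worker count is a made-up parameter.

```python
KERNEL_LIMIT_PPS = 1_000_000   # echo rate reported above with N = 0
WORKER_COST_USEC = 4.0         # per-packet worker cost reported above


def worker_capacity_pps(workers, spin_usec):
    """Packets/second the workers can handle, one core per worker (assumed)."""
    return workers * 1e6 / (WORKER_COST_USEC + spin_usec)


def bottleneck(workers, spin_usec):
    """Return (achievable pps, limiting stage) under this simple model."""
    cap = worker_capacity_pps(workers, spin_usec)
    if cap < KERNEL_LIMIT_PPS:
        return cap, "workers"
    return float(KERNEL_LIMIT_PPS), "kernel input thread"
```

With a hypothetical 8 workers, the model says the workers become the limit 
once N goes past 4 uSec, since 8 / (4 + N) drops below one packet per uSec.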

I have a similar test harness for NTP.  (This is why I'm poking at 
ntp_control.  I want to look at CPU usage (getrusage).)
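
For reference, a minimal sketch of measuring CPU time per packet with 
getrusage, using Python's resource module (Unix only).  The function name and 
the handle_packet callback are hypothetical, and this says nothing about how 
the numbers would be exposed through ntp_control.

```python
import resource


def cpu_usec_per_packet(handle_packet, packets):
    """Return (user, system) CPU microseconds per packet for `handle_packet`."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    for pkt in packets:
        handle_packet(pkt)
    after = resource.getrusage(resource.RUSAGE_SELF)
    n = max(len(packets), 1)  # avoid dividing by zero on an empty batch
    user = (after.ru_utime - before.ru_utime) * 1e6 / n
    system = (after.ru_stime - before.ru_stime) * 1e6 / n
    return user, system
```

RUSAGE_SELF covers all threads in the process, so with several workers this 
gives total CPU per packet, not per-thread numbers.  Use a large batch; the 
rusage counters are too coarse for a handful of packets.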

NTP numbers on a fast PC are, roughly:
         uSec/packet   packets/second
  NTP        6             167K
  AES        7.2           138K
  NTS       17              59K
Details depend a lot on how well your chip does AES.  Intel has improved that 
over the years.

FreeBSD is significantly faster than Linux.


-- 
These are my opinions.  I hate spam.




