Use of pool servers reveals unacceptable crash rate in async DNS

Sun Jun 26 00:37:13 UTC 2016

esr at thyrsus.com said:
> I think the hack is to force libgcc_s to be loaded early. I don't know how
> to do that in waf. 

There are two problems in this area.  One is the end-of-thread code not 
getting locked into memory.  I think that is what you are running into.

The other is a tangle of error handling on out-of-memory issues by things 
like pthread_create and DNS lookup.  I think the latter ends up with a retry 
error code.  I think I fixed some/many of them to crash rather than retry, on 
the assumption that memory wasn't going to get freed and I didn't know of any 
other reason to retry.  But that was a long time ago (maybe pre-fork) and I 
don't remember the details.
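
A minimal sketch of the crash-rather-than-retry idea, applied to thread
creation.  This is not the ntpd code: start_thread_or_die and demo_worker are
hypothetical names, and treating every pthread_create failure (including
EAGAIN) as fatal is an assumption about what the fix looked like.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker, only here so the helper has something to run. */
void *
demo_worker(void *arg)
{
	*(int *)arg = 42;
	return NULL;
}

/*
 * Hypothetical helper: create a thread, and abort the process if that
 * fails instead of looping on a "try again" error.  The assumption is
 * that memory will not be freed while we wait, so a retry cannot help.
 */
void
start_thread_or_die(pthread_t *tid, void *(*fn)(void *), void *arg)
{
	int rc = pthread_create(tid, NULL, fn, arg);

	if (rc != 0) {
		/* pthread_create returns the error number directly. */
		fprintf(stderr, "pthread_create failed: %s\n", strerror(rc));
		abort();
	}
}
```

The trade-off is that a transient shortage also kills the daemon, so this
only makes sense if something restarts it.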

I think we should copy the warmup code from ntp classic.  It's basically an 
upstream bug.  Warmup seems like a reasonable workaround.

It's in ntpd/ntpd.c.  Search for NEED_PTHREAD_WARMUP and back up over the
long comment which describes what's going on.
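
For readers without the classic tree handy, a rough sketch of what a warmup
of this kind can look like.  It is written from the description above, not
copied from ntpd.c; thread_warmup and warmup_worker are hypothetical names.
The assumption is that starting and cancelling one throwaway thread early
makes the C library pull in its thread-exit support (libgcc_s on glibc)
before the process locks memory or otherwise restricts itself.

```c
#include <pthread.h>
#include <unistd.h>

/* Worker that never returns on its own; it exists only to be cancelled. */
static void *
warmup_worker(void *arg)
{
	(void)arg;
	for (;;)
		sleep(10);	/* sleep() is a cancellation point */
	return NULL;
}

/*
 * Hypothetical warmup: create one thread, cancel it, and wait for it.
 * The cancel/join pair exercises the end-of-thread code path once.
 * Returns 0 on success, or the pthread error number on failure.
 */
int
thread_warmup(void)
{
	pthread_t tid;
	int rc;

	rc = pthread_create(&tid, NULL, warmup_worker, NULL);
	if (rc != 0)
		return rc;

	pthread_cancel(tid);
	return pthread_join(tid, NULL);
}
```

It would be called once from main(), early in startup, before any memory
locking; where exactly that belongs in our tree is left open here.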

There is a note about not working on FreeBSD.  I haven't sorted that out.  It 
may refer to the linker hack.

Here are the bugs I remember:
  https://bugs.ntp.org/show_bug.cgi?id=2831
    FreeBSD page fault story, morphs into lock discussion
  https://bugs.ntp.org/show_bug.cgi?id=2905
    rlimit/memlock discussion

There is more info in various bugs:
  https://bugs.ntp.org/show_bug.cgi?id=2332
  https://bugs.ntp.org/show_bug.cgi?id=2954
  https://bugs.ntp.org/show_bug.cgi?id=2817
The signal/noise may not be good.

-- 
These are my opinions.  I hate spam.