Proposal - drop the GPSD JSON driver

Gary E. Miller gem at rellim.com
Thu Oct 20 21:31:56 UTC 2016


Yo Eric!

On Thu, 20 Oct 2016 16:11:41 -0400
"Eric S. Raymond" <esr at thyrsus.com> wrote:

> Gary E. Miller <gem at rellim.com>:
> > > What sort of problems are there with the current one?  
> > 
> > Do we really need to go over this yet again?  These have been
> > covered for years.
> > 
> > First, the structure needs to be polled, sometimes leading to
> > long-delayed and even missed samples.  
> 
> This will be the case with any shared-memory driver that doesn't have
> a wait semaphore.

But we try to stay POSIX, and the POSIX SHM has them, so a non-issue.

> Which means that a wait semaphore is a good idea,

And POSIX thought so too.

> but is orthogonal to the struct-vs.-JSON-question to put in the box.

Yes, we do tend to get all fuzzy.  Sort of like the presidential debates?

> You seem to have changed the subject from "the existing SHM driver is
> bad" to "any shared-memory driver would be bad".  Do you believe that?

Of course not.  Which is why I keep saying I favor JSON in POSIX SHM.
And I have been saying it for at least a month.

> > Second, a big one, the C structure is of loosely defined size and
> > shape. Even on the same host you need to use the same compiler and
> > word length for client and server.  For example, you can't compile
> > the client as 32-bit and the server as 64-bit.  
> 
> No sale. You don't need to use the same compiler, just two compilers
> with self-aligned padding.

Fair enough, sadly a common problem, so still a reason against.

> If this weren't a safe assumption, NTP
> Classic would have been changed to remove it decades ago.

Uh, look at the (missing) commit history.  You will see that interface
has been patched, over and over again.

And I cannot believe you hold up NTP Classic as a model to emulate!

> (Note to self: Revise "The Lost Art of C Structure Packing" to reflect
> the NTP field evidence that this has not been an issue since
> approximately forever.)

Or search the list server archives and the commit history and see you
are wrong.  You yourself added memory barriers, which only work on some
OSes.

> Word-length mismatch between two programs built under the same OS
> never happens, or close enough to never that I don't care.

Uh, no.  Remember when Intel OSes went from 32-bit to 64-bit?  It was
a huge issue with ntpd.  RasPi is about to have the same problems.

History shows us it keeps happening; history repeats itself.
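
To make the word-length point concrete, here is a minimal Python sketch
using ctypes.  The struct and its field names (mode, stamp_sec,
stamp_usec) are hypothetical and are not the real NTP SHM segment.  The
only assumption is that a time field is 4 bytes in a 32-bit build and
8 bytes in a 64-bit build of the same source.

```python
import ctypes


def make_sample_struct(time_type):
    """Build a hypothetical sample struct whose time field has the given width."""

    class Sample(ctypes.Structure):
        _fields_ = [
            ("mode", ctypes.c_int32),
            ("stamp_sec", time_type),       # stands in for a time_t-like field
            ("stamp_usec", ctypes.c_int32),
        ]

    return Sample


# Same source-level definition, two assumed word lengths.
Sample32 = make_sample_struct(ctypes.c_int32)
Sample64 = make_sample_struct(ctypes.c_int64)


def layout(struct_type):
    """Return (total size in bytes, byte offset of stamp_usec)."""
    return ctypes.sizeof(struct_type), struct_type.stamp_usec.offset


if __name__ == "__main__":
    print("32-bit time field:", layout(Sample32))
    print("64-bit time field:", layout(Sample64))
```

The two layouts disagree on both total size and the offset of
stamp_usec, so a client built one way would silently read the wrong
bytes from a segment written the other way.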

>  Not
> willing to eat more jitter to armor against that extremely remote
> contingency.

What jitter?  Do I really need to keep repeating myself?  There is
NO added jitter in the existing ntpd/chrony schemes.  There IS packet
loss and polling overhead.  

ntpd sits on the numbers for 64 seconds by default before looking at
them!  Check out the code for minpoll and maxpoll, you'll see.  All the
jitter is frozen in place before gpsd gives the numbers to ntpd.

> Have we on GPSD ever had an actual bug report due to these causes?  I
> know the answer, and it's a big "no".

Since gpsd is not big on the bug tracker thing, you are correct, narrowly.
But, look in the git history, and you will see many patches.  Look
on the list server archives for the complaints.

> > Third, you pretty much need to reboot the server when you change
> > the struct size as deleting the old SHM segment while running is
> > problematic.  
> 
> Not true, we have a recipe for handling this case without rebooting in
> the GPSD Time Service HOWTO.

Yeah, like how many people get that far?  If we can't do it automagically,
then we should not do it!  POSIX SHM can, if we work with it.

> > Fourth, teaching people to debug with ipcs is a PITA.  
> 
> True, but pretty minor stuff. Cases where you have to do that kind of
> debugging are rare.

But often enough to cause users to give up.  This is exactly why you
had to write ntpshmmon, because it is hard.

> > Fifth, security is hopeless.  Client and server both need to run as
> > root to be somewhat safe and only 2 SHM slots can do that.  After
> > that the SHMs are wide open to anyone on the server to mess with.  
> 
> I think you're being over-dramatic here.  Yes, there is a theoretical
> hole here but no evidence that it's ever been exploited.  "Hopeless"
> would be if there were a history of such attacks.

OK, can we agree on a big gaping hole?  Do we really need exploits in the
wild to fix a potential vulnerability?  Especially when chrony does
not have the same vulnerability?

> Is this a contingent problem with the way permissions are set
> by the existing driver?  If so, we can fix it in a new one.

POSIX SHM can do it right, the question was what are the known problems
with the existing driver.

> > Sixth, so many of these problems are so hard to grasp that people
> > get amnesia about them after one day.
> > 
> > Seventh...  well.  I could go on but I need more espresso, and the
> > following should be more than enough to show the current SHM needs
> > killing off. Sadly that will take a decade, after the next good
> > solution is implemented.  
> 
> You seem to have mixed together at least three different categories of
> objections.  One is to the existing SHM driver.  A second is to any
> driver based on a shared-memory drop with structure passing.  A third
> is to any shared-memory driver at all.  It would be helpful if you
> separated these more cleanly.

Yes, because I keep getting questions from eight different angles.  I'm just
answering the questions I'm asked, not trying to propose anything.  Not my
fault if they keep getting conflated in the responses.

But, to sum up:

1: Hal asked about that specifically.  I answered specifically.

2: Passing C structures between programs, locally or remotely, is an
   idea that needs to die.  Hal asked about C structures in #1 for
   an example.

3: I have no problem with a properly implemented shared memory driver.
   In fact, I'm the one that keeps saying JSON over POSIX SHM is a good way
   (not the only way) to go.  How many times do you want to put those 
   words in my mouth that I never said or implied?
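
For illustration only, a rough Python sketch of what JSON in a shared
memory segment could look like.  Everything here is an assumption and
not a proposed design: the 4-byte length header, the function names
publish() and read_sample(), and the sample fields are all hypothetical.
Python's multiprocessing.shared_memory is backed by POSIX shared memory
on Unix-like systems, but the standard library does not expose POSIX
named semaphores, so the wait semaphore is left out.  A real driver in C
would presumably pair the segment with sem_post()/sem_wait() instead of
polling.

```python
import json
import struct
from multiprocessing import shared_memory

# Hypothetical framing: 4-byte little-endian payload length, then UTF-8 JSON.
HEADER = struct.Struct("<I")


def publish(shm, sample):
    """Serialize a sample as JSON and store it in the segment."""
    payload = json.dumps(sample).encode("utf-8")
    end = HEADER.size + len(payload)
    if end > shm.size:
        raise ValueError("sample does not fit in the segment")
    # Write the payload first and the length last, so a zero length
    # always means "nothing published yet".
    shm.buf[HEADER.size:end] = payload
    HEADER.pack_into(shm.buf, 0, len(payload))


def read_sample(shm):
    """Return the published sample as a dict, or None if the segment is empty."""
    (length,) = HEADER.unpack_from(shm.buf, 0)
    if length == 0:
        return None
    raw = bytes(shm.buf[HEADER.size:HEADER.size + length])
    return json.loads(raw.decode("utf-8"))


def demo():
    """Round-trip one hypothetical time sample through a fresh segment."""
    writer = shared_memory.SharedMemory(create=True, size=4096)
    try:
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            publish(writer, {"source": "GPS0", "clock_sec": 1476999116,
                             "clock_nsec": 0, "real_sec": 1476999116,
                             "real_nsec": 250})
            return read_sample(reader)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # remove the segment so nothing stale is left behind


if __name__ == "__main__":
    print(demo())
```

The reader never depends on the writer's compiler, padding or word
length, only on the text format, which is the whole point of #2 and #3.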

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588