droproot, seccomp

Hal Murray hmurray at megapathdsl.net
Tue Feb 25 23:36:19 UTC 2020


> Is there anything preventing the possibility of an early looser
> seccomp setup and then tightening it later possibly with a knob
> to generate terse or verbose warnings instead of dying. 

> Do you have an implementation strategy in mind? 

The API, or the subset we are using, is:
  This list of syscalls is OK.  (everything else is not)
The filter is BPF-style, so it could also check the first parameter and 
things like that, but we don't do that.

So for the 2 step process, you would need 2 lists.  The first would have to 
allow seccomp or attempting to switch to the second list would trap.
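
A rough sketch of a sanity check on the two lists, in Python rather than 
ntpd's C.  check_two_stage and its parameters are hypothetical names, not 
anything in the ntpd tree.  It relies on the kernel behaviour that stacked 
seccomp filters are all evaluated and the most restrictive result wins, so 
the second list can only take syscalls away, never add them back.

```python
def check_two_stage(early, late, installer=("seccomp",)):
    """Return a list of problems with an early/late pair of allow lists.

    installer names the syscalls needed to load the second filter.
    The default follows the text above; add "prctl" if the loader uses
    prctl(PR_SET_SECCOMP) instead (an assumption to confirm with strace).
    """
    early, late = set(early), set(late)
    problems = []

    # The early filter is still active when the late one is installed.
    missing = sorted(set(installer) - early)
    if missing:
        problems.append(
            "early list lacks " + ", ".join(missing)
            + "; loading the late list would trap"
        )

    # Filters stack, so anything the early list rejects stays rejected.
    widened = sorted(late - early)
    if widened:
        problems.append(
            "late list allows " + ", ".join(widened)
            + " but the early list does not"
        )
    return problems
```

An empty result means the pair is at least self-consistent; it says nothing 
about whether either list is complete.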

If something bad happens, you have 2 options: Kill the process or raise a 
signal.

We catch the signal and log a message and stack trace.  That's all done from a 
signal handler, so it might screw up.  So far, it mostly works.  Well, it 
mostly works if the un-OKed syscall isn't used by the logging code.

I just discovered that threads get started in the kill mode so our current 
signal handler doesn't work if a thread hits a syscall that isn't on the list.

I think it would be possible for the signal handler to stash the info in 
global memory, set a flag, and have the non SIGNAL level main loop log the 
event.
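
The stash-and-flag idea, sketched in Python purely to show the control flow.  
The real handler is C, where async-signal-safety is the actual constraint, 
and all names here (_pending, install_handler, log_pending_event) are made 
up for illustration.

```python
import signal

# Filled in by the handler, drained by the main loop.
_pending = {"seen": False, "signum": None, "line": None}


def _on_trap(signum, frame):
    # Keep the handler tiny: record facts and set the flag, no logging here.
    _pending["signum"] = signum
    _pending["line"] = frame.f_lineno if frame is not None else None
    _pending["seen"] = True


def install_handler(sig=signal.SIGSYS):
    # SIGSYS is the signal a seccomp trap action delivers.
    signal.signal(sig, _on_trap)


def log_pending_event(log=print):
    """Call once per pass of the main loop; logs outside signal context."""
    if not _pending["seen"]:
        return None
    _pending["seen"] = False
    msg = "trapped signal %d near line %s" % (
        _pending["signum"], _pending["line"])
    log(msg)
    return msg
```

The main loop would call log_pending_event() on each pass.  This does nothing 
for the thread case above, where the process is killed before any handler 
runs.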

-------------

I don't think it's worth the effort to maintain 2 lists.  We can revisit that 
if you think it's appropriate.

------------

I'm working on a cleanup.

The basic idea is to have a per-distro/version/architecture list of syscalls 
in a separate file.  The seccomp code does a #include of a file.  You specify 
the file with something like --enable-seccomp=fedora-31-x86_64.

I'm working on a way to semi-automate generating the list.  The basic idea is 
to run ntpd under strace on the type of system you are interested in to 
collect a lot of data, then run a script to extract the list of syscalls from 
the strace log file.  So far, I have data for 2 systems.  I debugged things 
on the first one, but the output from the final run seems to be working.  The 
second one just worked.
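
For illustration only, the extraction step might look something like this in 
Python.  It is not the script described above.  The regular expression 
assumes strace's usual line layout (optional pid from -f, optional timestamp 
from -t/-tt, then "name("), and SAMPLE_LOG is a made-up log in that layout, 
not real ntpd output.

```python
import re

SYSCALL_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"    # optional pid prefix from strace -f
    r"(?:\d+:\d+:\d+(?:\.\d+)?\s+)?"   # optional timestamp from -t / -tt
    r"([a-z_][a-z0-9_]*)\("            # syscall name followed by "("
)


def syscalls_in_log(lines):
    """Return the set of syscall names seen in strace output lines."""
    names = set()
    for line in lines:
        # Signal lines ("--- SIG... ---"), exit lines ("+++ ... +++") and
        # "<... name resumed>" continuations do not match and are skipped;
        # a resumed call was already counted on its "<unfinished ...>" line.
        m = SYSCALL_RE.match(line)
        if m:
            names.add(m.group(1))
    return names


def merge_lists(*lists):
    """Union several per-system lists into one sorted list."""
    return sorted(set().union(*lists))


# Made-up sample in strace's usual format, for trying the parser.
SAMPLE_LOG = """\
1234  openat(AT_FDCWD, "/etc/ntp.conf", O_RDONLY) = 3
1234  read(3, "# sample\\n", 4096) = 9
1234  clock_gettime(CLOCK_REALTIME,  <unfinished ...>
1235  recvmsg(4, {msg_name=...}, 0) = 48
1234  <... clock_gettime resumed> {tv_sec=1, tv_nsec=0}) = 0
1234  --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
1234  +++ exited with 0 +++
"""
```

Calling syscalls_in_log(SAMPLE_LOG.splitlines()) picks out the four syscalls 
in the sample, and merge_lists() combines the sets from several systems.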

There are 46 syscalls in each list and 55 in the merged list, so the two 
lists share 37 and each has 9 that the other lacks.

I'm about to collect more data on other systems and move the enable-seccomp 
code to very late in the initialization.  It will be interesting to see how 
many syscalls that saves.

There may be a long tail.  I don't have any refclock data yet.

----------

NB: With this approach we would be collecting and distributing a bunch of 
lists.  That may turn into a list of the systems that we actually support.  
We would have to do a release whenever a supported system makes a major 
change, and there may be problems even when one makes a minor change (bug 
fix) to a library.



-- 
These are my opinions.  I hate spam.




