seccomp (short for secure computing mode) is a computer security facility in the Linux kernel. seccomp allows a process to make a one-way transition into a "secure" state where it cannot make any system calls except exit(), sigreturn(), read() and write() to already-open file descriptors. Should it attempt any other system calls, the kernel will terminate the process with SIGKILL or SIGSYS. In this sense, it does not virtualize the system's resources but isolates the process from them entirely.
https://en.wikipedia.org/wiki/Seccomp
OpenBSD pledge(2) is similar.
https://man.openbsd.org/pledge.2
From looking at the systemd source, it looks like it uses mount() to make directories read-only to implement the ProtectHome= and similar directives.
Programming Tips - Linux: Create a sandbox for an application using system calls
Date: 2022jan28
Keywords: syscall
OS: Linux
Language: C/C++
Q. Linux: Create a sandbox for an application using system calls
A. If it's a service, it's easiest to use systemd's sandbox options.
https://www.freedesktop.org/software/systemd/man/systemd.exec.html
But if you want to programmatically remove some abilities from a program using system calls,
prctl(2) and seccomp(2) are the way. A program can do it for itself at startup, or a supervisor program
(like systemd) can do it after fork()ing the child but before exec()ing it.
prctl(2) man page
https://man7.org/linux/man-pages/man2/prctl.2.html
seccomp(2) man page
https://man7.org/linux/man-pages/man2/seccomp.2.html
Quoting Wikipedia article about seccomp in general: