Awesome
WHAT IS IT?
Kafel is a language and library for specifying syscall filtering policies. The policies are compiled into BPF code that can be used with seccomp-filter.
This is NOT an official Google product.
Usage
With verbose error reporting
struct sock_fprog prog;
kafel_ctxt_t ctxt = kafel_ctxt_create();
kafel_set_input_string(ctxt, seccomp_policy);
if (kafel_compile(ctxt, &prog)) {
fprintf(stderr, "policy compilation failed: %s", kafel_error_msg(ctxt));
kafel_ctxt_destroy(&ctxt);
exit(-1);
}
kafel_ctxt_destroy(&ctxt);
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog, 0, 0);
free(prog.filter);
Without verbose error reporting
struct sock_fprog prog;
if (kafel_compile_string(seccomp_policy, &prog)) {
fputs("policy compilation failed", stderr);
exit(-1);
}
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog, 0, 0);
free(prog.filter);
Policy language
A simple language is used to define policies.
Policy file consists of statements.
A statement can be:
- a constant definition
- a policy definition
- a policy definition statement
- a default action statement
Policy definition statements placed at file scope will be added to the implicit top level policy. This top level policy is going to be compiled.
Default action statement
DEFAULT the_action
Specifies that action the_action
should be taken when no rule matches.
The default action must be specified just once.
If the policy file specifies no default actions, the default action will be KILL
Numbers
Kafel supports following number notations:
- Decimal
42
- Hexadecimal
0xfa1
- Octal
0777
- Binary
0b10101
Constant definitions
You may define numeric constants to make your policies more readable. Constant definitions may be placed almost anywhere in the policy file. A constant definition cannot be placed inside of a policy definition. The defined constants can then be used anywhere where a number is expected.
#define MYCONST 123
Policy definitions
Policy definition is a list of action blocks and use statements separated by commas.
samples/ contains some example policies that demonstrate supported features.
Use statements
A USE someOtherPolicy
behaves as if someOtherPolicy
body was pasted in its
place. You may only use policies defined before the use statement.
With use statements you can create meaningful groups of filtering rules that are building blocks of bigger policies.
Action blocks
Action block consist of a target and list of syscall matching rules separated with commas.
Target of first rule matched is the policy decision.
Following table list Kafel targets and their corresponding seccomp-filter return values.
Kafel | seccomp-filter |
---|---|
ALLOW | SECCOMP_RET_ALLOW |
LOG | SECCOMP_RET_LOG |
KILL , KILL_THREAD , DENY | SECCOMP_RET_KILL |
KILL_PROCESS | SECCOMP_RET_KILL_PROCESS |
USER_NOTIF | SECCOMP_RET_USER_NOTIF |
ERRNO(number) | SECCOMP_RET_ERRNO+number |
TRAP(number) | SECCOMP_RET_TRAP+number |
TRACE(number) | SECCOMP_RET_TRACE+number |
Syscall matching rules
A rules consist of syscall name and optional list of boolean expressions.
List of boolean expressions separated by commas.
A comma is semantically equivalent to ||
but has the lowest precedence,
therefore it may be easier to read.
Syscall naming
Normally syscalls are specified by their names as defined in Linux kernel.
However, you may also filter custom syscalls that are not in the standard
syscall list.
You can either define a constant and use it in place of syscall name or
utilize SYSCALL
keyword.
#define mysyscall -1
POLICY my_const {
ALLOW {
mysyscall
}
}
POLICY my_literal {
ALLOW {
SYSCALL[-1]
}
}
Argument filtering
Boolean expressions are used to filter syscalls based on their arguments. A expression resembles C language syntax, except that there are no arithmetic operators.
some_syscall(first_arg, my_arg_name) { first_arg == 42 && my_arg_name != 42 }
Bitwise and (&
) and or ('|') operators can be used to test for flags.
mmap { (prot & PROT_EXEC) == 0 },
open { flags == O_RDONLY|O_CLOEXEC }
You don't have to declare arguments for well-known syscalls but can just use
their regular names as specified in Linux kernel and man
pages.
write { fd == 1 }
Include directive
In order to simplify reuse and composition of policies, kafel provides include support.
#include "some_other_file.policy"
Kafel looks for included files only under directories explicitly added to the search paths.
kafel_include_add_search_path(ctxt, "includes/path");
Adds includes/path
to search paths - the example include directive will refer
then to includes/path/some_other_file.policy
.
Include directive is terminated by a newline or a semicolon. Multiple files, separated by whitespace, can be specified in one directive.
#include "first.policy" "second.policy"; #include "third.policy"
Example
When used with nsjail, the following command allows to create a fairly constrained environment for your shell
$ ./nsjail --chroot / --seccomp_string 'POLICY a { ALLOW { write, execve, brk, access, mmap, open, newfstat, close, read, mprotect, arch_prctl, munmap, getuid, getgid, getpid, rt_sigaction, geteuid, getppid, getcwd, getegid, ioctl, fcntl, newstat, clone, wait4, rt_sigreturn, exit_group } } USE a DEFAULT KILL' -- /bin/sh -i
[2017-01-15T21:53:08+0100] Mode: STANDALONE_ONCE
[2017-01-15T21:53:08+0100] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/sh', bind:[::]:0, max_conns_per_ip:0, uid:(ns:1000, global:1000), gid:(ns:1000, global:1000), time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-01-15T21:53:08+0100] Mount point: src:'/' dst:'/' type:'' flags:0x5001 options:''
[2017-01-15T21:53:08+0100] Mount point: src:'(null)' dst:'/proc' type:'proc' flags:0x0 options:''
[2017-01-15T21:53:08+0100] PID: 18873 about to execute '/bin/sh' for [STANDALONE_MODE]
/bin/sh: 0: can't access tty; job control turned off
$ set
IFS='
'
OPTIND='1'
PATH='/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
PPID='0'
PS1='$ '
PS2='> '
PS4='+ '
PWD='/'
$ id
Bad system call
$ exit
[2017-01-15T21:53:17+0100] PID: 18873 exited with status: 159, (PIDs left: 0)