weggli-patterns

"No one cares about the old scene people anymore, I’m sure,
bunch of old people grepping for the last of the memcpy."

-- Bas Alberts

A collection of my weggli patterns to facilitate vulnerability research.

Blog post:
https://security.humanativaspa.it/a-collection-of-weggli-patterns-for-c-cpp-vulnerability-research

See also:
https://github.com/weggli-rs/weggli
https://dustri.org/b/playing-with-weggli.html
https://github.com/plowsec/weggli-patterns
https://github.com/synacktiv/Weggli_rules_SSTIC2023
https://twitter.com/richinseattle/status/1729654184633327720

buffer overflows

call to unbounded copy functions (CWE-120, CWE-242, CWE-676)

weggli -R 'func=^gets$' '{$func();}' .
weggli -R 'func=st(r|p)(cpy|cat)$' '{$func();}' .
weggli -R 'func=wc(s|p)(cpy|cat)$' '{$func();}' .
weggli -R 'func=sprintf$' '{$func();}' .
weggli -R 'func=scanf$' '{$func();}' .

incorrect use of strncat (CWE-193, CWE-787)

weggli '{strncat(_,_,sizeof(_));}' .
weggli '{strncat(_,_,strlen(_));}' .
weggli '{strncat($dst,$src,sizeof($dst)-strlen($dst));}' .
weggli '{_ $buf[$len]; strncat($buf,_,$len);}' .

The last pattern won't work with integer literals due to known limitations.
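
For illustration, a minimal C sketch of the call shape these patterns hunt for, next to a bounded alternative. The helper name append_bounded is hypothetical. strncat() appends at most n characters and then always writes a terminating NUL, so passing the full remaining space as n can write one byte too many.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the NUL-terminated string in dst,
 * whose total capacity is cap bytes. One byte is reserved for the NUL
 * that strncat() writes in addition to the appended characters. */
size_t append_bounded(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);

    if (used + 1 >= cap)
        return used;                    /* no room for another character */
    strncat(dst, src, cap - used - 1);  /* note the "- 1" */
    return strlen(dst);
}
```

A call such as strncat(dst, src, sizeof(dst) - strlen(dst)) lacks the "- 1" and is the shape the third pattern is meant to surface.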

destination buffer access using size of source buffer (CWE-806)

weggli -R 'func=cpy$' '{$func(_,$src,_($src));}' .
weggli -R 'func=cpy$' '{$len=_($src); $func(_,$src,$len);}' .
weggli -R 'func=cpy$' '{_ $src[$len]; $func($dst,$src,$len);}' .

The last pattern won't work with integer literals due to known limitations.

use of sizeof() on a pointer type (CWE-467)

weggli '{_* $ptr; sizeof($ptr);}' .
weggli '{_* $ptr=_; sizeof($ptr);}' .
weggli '_ $func(_* $ptr) {sizeof($ptr);}' .

Apparently, global variables are not supported so this won't work:

weggli '_* $ptr=_; _ $func(_) {sizeof($ptr);}' .
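
As a reminder of why these matches matter, here is a small C sketch with hypothetical function names. sizeof applied to a pointer yields the size of the pointer itself, not the size of the buffer it points to.

```c
#include <stddef.h>

/* The parameter is a pointer, so sizeof(buf) is the pointer size. */
size_t size_seen_by_callee(const char *buf)
{
    return sizeof(buf);
}

/* In the scope that declares the array, sizeof yields the array size. */
size_t size_seen_by_owner(void)
{
    char buf[256];

    return sizeof(buf);
}
```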

use of sizeof() on a character constant

weggli "sizeof('_')" .

In C (but not in C++) character constants have type int.
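
A short C sketch of the consequence, with hypothetical function names. It must be compiled as C, since the result differs in C++.

```c
#include <stddef.h>

/* In C a character constant has type int. */
size_t size_of_char_constant(void)
{
    return sizeof('A');
}

/* An object of type char always has size 1. */
size_t size_of_char_object(void)
{
    char c = 'A';

    return sizeof(c);
}
```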

lack of explicit NUL-termination after strncpy(), etc. (CWE-170)

weggli -R 'func=ncpy$' '{$func($buf,_); not:$buf[_]=_;}' .

Some possible variants: memcpy, read, readlink, fread, etc.
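
A minimal C sketch of the explicit termination whose absence this pattern looks for. The helper name copy_terminated is hypothetical, and cap is assumed to be greater than zero. strncpy() does not write a NUL when the source is at least n characters long.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (cap bytes, cap > 0) and
 * terminate the result explicitly. */
void copy_terminated(char *dst, size_t cap, const char *src)
{
    strncpy(dst, src, cap - 1);
    dst[cap - 1] = '\0';    /* the assignment the pattern checks for */
}
```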

off-by-one error (CWE-193)

weggli '{$buf[sizeof($buf)];}' .
weggli '{_ $buf[$len]; $buf[$len]=_;}' .
weggli '{strlen($src)>sizeof($dst);}' .
weggli '{strlen($src)<=sizeof($dst);}' .
weggli '{sizeof($dst)<strlen($src);}' .
weggli '{sizeof($dst)>=strlen($src);}' .
weggli '{$buf[strlen($buf)-1];}' .
weggli -R 'func=allocf?$' '{$func(strlen($buf));}' .
weggli -R 'func=allocf?$' '{$len=strlen(_); $ptr=$func($len);}' .
weggli -R 'func=allocf?$' '{$len=snprintf(_); $ptr=$func($len);}' .

The second pattern won't work with integer literals due to known limitations.
< should also cover > and <= should also cover >=; however, let's keep all variants just to be sure.
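
To make the length-check variants concrete, a C sketch with hypothetical function names. A check that rejects only strlen(src) > size still accepts a string of exactly size characters, whose terminating NUL needs one more byte.

```c
#include <stddef.h>
#include <string.h>

/* Flagged shape: only strlen(src) > cap is rejected. */
int fits_flagged(const char *src, size_t cap)
{
    return !(strlen(src) > cap);
}

/* Stricter check that leaves room for the terminating NUL. */
int fits_with_nul(const char *src, size_t cap)
{
    return strlen(src) < cap;
}
```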

use of pointer subtraction to determine size (CWE-469)

weggli '{_* $ptr1; $ptr1-$ptr2;}' .
weggli '{_* $ptr2; $ptr1-$ptr2;}' .
weggli '{_* $ptr1=_; $ptr1-$ptr2;}' .
weggli '{_* $ptr2=_; $ptr1-$ptr2;}' .
weggli '_ $func(_* $ptr1) {$ptr1-$ptr2;}' .
weggli '_ $func(_* $ptr2) {$ptr1-$ptr2;}' .
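
A brief C sketch of what to check when reviewing these matches (function names are hypothetical). Subtracting two pointers yields a number of elements, not a number of bytes, so using the difference directly as a byte size is wrong for any element type wider than char.

```c
#include <stddef.h>

/* Pointer difference: a count of elements. Assumes end >= start and
 * that both point into the same array. */
size_t elements_between(const int *start, const int *end)
{
    return (size_t)(end - start);
}

/* Byte size: the element count scaled by the element size. */
size_t bytes_between(const int *start, const int *end)
{
    return (size_t)(end - start) * sizeof(*start);
}
```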

potentially unsafe use of the return value of snprintf(), etc. (CWE-787)

weggli -R 'func=(nprintf|lcpy|lcat)$' '{$ret=$func();}' .
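
A C sketch of a careful use of the return value; the helper format_id and its format string are made up for the example. snprintf() returns the length the complete output would have had, so on truncation the value is larger than what was actually stored.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns the number of characters actually stored
 * in buf, excluding the terminating NUL. */
size_t format_id(char *buf, size_t cap, int id)
{
    int ret = snprintf(buf, cap, "id=%d", id);

    if (ret < 0)
        return 0;                   /* output error */
    if ((size_t)ret >= cap)
        return cap ? cap - 1 : 0;   /* output was truncated */
    return (size_t)ret;
}
```

Code that uses the raw return value as an index or offset into buf without the truncation check is the case to look for among the matches.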

direct write into buffer allocated on the stack (CWE-121)

weggli -R 'func=(cpy|cat|memmove|memset|sn?printf)$' '{_ $buf[_]; $func($buf,_);}' .
weggli '{_ $buf[_]; $buf[_]=_;}' .

Some possible variants: bcopy, gets, fgets, getwd, getcwd, fread, read, pread, recv, recvfrom, etc.

integer overflows

incorrect unsigned comparison (CWE-697)

weggli -R '$type=(unsigned|size_t)' '{$type $var; $var<0;}' .
weggli -R '$type=(unsigned|size_t)' '{$type $var; $var<=0;}' .
weggli -R '$type=(unsigned|size_t)' '{$type $var; $var>=0;}' .
weggli -R '$type=(unsigned|size_t)' '{$type $var=_; $var<0;}' .
weggli -R '$type=(unsigned|size_t)' '{$type $var=_; $var<=0;}' .
weggli -R '$type=(unsigned|size_t)' '{$type $var=_; $var>=0;}' .

< should also cover > and <= should also cover >=; however, let's keep all variants just to be sure.
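
A minimal C sketch of the bug class, with hypothetical function names. An unsigned value is never negative, so a "less than zero" test on it is dead code. Compilers may warn about the first function, which is the point.

```c
#include <stddef.h>

/* Flagged shape: always returns 0 because len is unsigned. */
int is_negative_flagged(size_t len)
{
    return len < 0;
}

/* The check only makes sense on a signed type, before any conversion. */
int is_negative(long len)
{
    return len < 0;
}
```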

signed/unsigned conversion (CWE-195, CWE-196)

weggli -R '$copy=(cpy|ncat)$' '{int $len; $copy(_,_,$len);}' .
weggli -R '$copy=(cpy|ncat)$' '{int $len=_; $copy(_,_,$len);}' .
weggli -R '$copy=(cpy|ncat)$' '_ $func(int $len) {$copy(_,_,$len);}' .

weggli -R '$copy=nprintf$' '{int $len; $copy(_,$len);}' .
weggli -R '$copy=nprintf$' '{int $len=_; $copy(_,$len);}' .
weggli -R '$copy=nprintf$' '_ $func(int $len) {$copy(_,$len);}' .

weggli -R '$type=(unsigned|size_t)' '{$type $var1; int $var2; $var2=_($var1);}' .
weggli -R '$type=(unsigned|size_t)' '{$type $var1; int $var2; $var1=_($var2);}' .
weggli -R '$type=(unsigned|size_t)' '{$type $var1; int $var2=_($var1);}' .
weggli -R '$type=(unsigned|size_t)' '{int $var1; $type $var2; $var2=_($var1);}' .
weggli -R '$type=(unsigned|size_t)' '{int $var1; $type $var2; $var1=_($var2);}' .
weggli -R '$type=(unsigned|size_t)' '{int $var1=_; $type $var2=_($var1);}' .

weggli -R '$type=(unsigned|size_t)' '_ $func(int $var2) {$type $var1; $var1=_($var2);}' .
weggli -R '$type=(unsigned|size_t)' '_ $func(int $var2) {$type $var1=_($var2);}' .

weggli -R '$type=(unsigned|size_t)' '$type $func(_) {int $var; return $var;}' .
weggli -R '$type=(unsigned|size_t)' 'int $func(_) {$type $var; return $var;}' .

There are many possible variants of these patterns...
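
A C sketch of what a safe call site looks like when the length is a signed int; the helper copy_checked is hypothetical. A negative int passed to a size_t parameter becomes a very large value, so the sign has to be checked before the conversion.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 0 on success, -1 if len is rejected. */
int copy_checked(char *dst, size_t cap, const char *src, int len)
{
    if (len < 0 || (size_t)len > cap)
        return -1;
    memcpy(dst, src, (size_t)len);
    return 0;
}
```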

integer truncation (CWE-197)

weggli -R 'type=(short|int|long)' '{$type $large; char $narrow; $narrow = $large; }' .
weggli -R 'type=(short|int|long)' '{$type $large; char $narrow = $large; }' .
weggli -R 'type=(int|long)' '{$type $large; short $narrow; $narrow = $large; }' .
weggli -R 'type=(int|long)' '{$type $large; short $narrow = $large; }' .
weggli '{long $large; int $narrow; $narrow = $large; }' .
weggli '{long $large; int $narrow = $large; }' .

weggli -R 'type=(short|int|long)' '_ $func($type $large) {char $narrow; $narrow = $large; }' .
weggli -R 'type=(short|int|long)' '_ $func($type $large) {char $narrow = $large; }' .
weggli -R 'type=(int|long)' '_ $func($type $large) {short $narrow; $narrow = $large; }' .
weggli -R 'type=(int|long)' '_ $func($type $large) {short $narrow = $large; }' .
weggli '_ $func(long $large) {int $narrow; $narrow = $large; }' .
weggli '_ $func(long $large) {int $narrow = $large; }' .

There are many possible variants of these patterns...
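
A C sketch of the effect, with hypothetical function names and assuming 8-bit bytes. unsigned char is used instead of char so that the result of the narrowing conversion is well defined.

```c
#include <limits.h>

/* Flagged shape: only the low-order bits of large survive. */
unsigned char narrow_flagged(int large)
{
    unsigned char narrow;

    narrow = large;
    return narrow;
}

/* Range check before narrowing; returns -1 if the value does not fit. */
int narrow_checked(int large, unsigned char *out)
{
    if (large < 0 || large > UCHAR_MAX)
        return -1;
    *out = (unsigned char)large;
    return 0;
}
```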

use of signed or short sizes, lengths, offsets, counts (CWE-190, CWE-680)

weggli 'short _' .
weggli 'int _' .

Some possible variants: short int, unsigned short, unsigned short int, int.

cast of the return value of strlen(), wcslen() to short (CWE-190, CWE-680)

weggli -R 'func=(str|wcs)len$' '{short $len; $len=$func();}' .

Some possible variants: short int, unsigned short, unsigned short int, even signed int.

integer wraparound (CWE-128, CWE-131, CWE-190, CWE-680)

weggli -R 'func=allocf?$' '{$func(_*_);}' .
weggli -R 'func=allocf?$' '{$func(_+_);}' .
weggli -R 'func=allocf?$' '{$n=_*_; $func($n);}' .
weggli -R 'func=allocf?$' '{$n=_+_; $func($n);}' .

weggli -R 'alloc=allocf?$' -R 'copy=cpy$' '{$alloc($x*_); $copy(_,_,$x);}' .
weggli -R 'alloc=allocf?$' -R 'copy=cpy$' '{$alloc($x+_); $copy(_,_,$x);}' .
weggli -u -R 'alloc=allocf?$' -R 'copy=cpy$' '{$n=_*_; $alloc($n); $copy(_,_,$x);}' .
weggli -u -R 'alloc=allocf?$' -R 'copy=cpy$' '{$n=_+_; $alloc($n); $copy(_,_,$x);}' .

weggli '{$x>_||($x+$y)>_;}' .
weggli '{$x>=_||($x+$y)>_;}' .
weggli '{$x>_||($x+$y)>=_;}' .
weggli '{$x>=_||($x+$y)>=_;}' .
weggli '{$x<_&&($x+$y)<_;}' .
weggli '{$x<=_&&($x+$y)<_;}' .
weggli '{$x<_&&($x+$y)<=_;}' .
weggli '{$x<=_&&($x+$y)<=_;}' .

weggli '{$x>_||($x*$y)>_;}' .
weggli '{$x>=_||($x*$y)>_;}' .
weggli '{$x>_||($x*$y)>=_;}' .
weggli '{$x>=_||($x*$y)>=_;}' .
weggli '{$x<_&&($x*$y)<_;}' .
weggli '{$x<=_&&($x*$y)<_;}' .
weggli '{$x<_&&($x*$y)<=_;}' .
weggli '{$x<=_&&($x*$y)<=_;}' .

< should also cover > and <= should also cover >=; however, let's keep all variants just to be sure.
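
A minimal C sketch of an allocation that guards against wraparound; the helper alloc_array is hypothetical. The check divides instead of multiplying, because a check written as count * size > limit can itself wrap.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate count * size bytes, or return NULL if
 * the product would wrap around SIZE_MAX. */
void *alloc_array(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;
    return malloc(count * size);
}
```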

format strings

call to printf(), scanf(), syslog() family functions (CWE-134)

weggli -R 'func=(printf|scanf|syslog)$' '{$func();}' .

Some possible variants: printk, warn, vwarn, warnx, vwarnx, err, verr, errx, verrx, warnc, vwarnc, errc, verrc, etc.
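
When triaging the matches, the question is whether the format argument can be influenced by an attacker. A C sketch of the safe shape, with the hypothetical helper log_message:

```c
#include <stddef.h>
#include <stdio.h>

/* The vulnerable shape would be snprintf(out, cap, input), where any
 * '%' in input is interpreted as a conversion specification. Passing
 * input as an argument to a constant format keeps it as plain data. */
int log_message(char *out, size_t cap, const char *input)
{
    return snprintf(out, cap, "%s", input);
}
```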

memory management

call to alloca() (CWE-676, CWE-1325)

weggli -R 'func=alloca$' '{$func();}' .

use after free (CWE-416)

weggli '{free($ptr); not:$ptr=_; not:free($ptr); _($ptr);}' .

double free (CWE-415)

weggli '{free($ptr); not:$ptr=_; free($ptr);}' .
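
Both patterns above skip code that reassigns the pointer after free() (the not:$ptr=_ part). A C sketch of that idiom, with the hypothetical helper free_and_clear:

```c
#include <stdlib.h>

/* Hypothetical helper: release *pp and reset it to NULL. A second call
 * is then harmless, because free(NULL) does nothing. */
void free_and_clear(char **pp)
{
    free(*pp);
    *pp = NULL;
}
```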

calling free() on memory not allocated in the heap (CWE-590)

weggli '{_ $ptr[]; free($ptr);}' .
weggli '{_ $ptr[]=_; free($ptr);}' .

weggli '{_ $ptr[]; $ptr2=$ptr; free($ptr2);}' .
weggli '{_ $ptr[]=_; $ptr2=$ptr; free($ptr2);}' .

weggli '{_ $var; free(&$var);}' .
weggli '{_ $var=_; free(&$var);}' .
weggli '{_ $var[]; free(&$var);}' .
weggli '{_ $var[]=_; free(&$var);}' .
weggli '{_ *$var; free(&$var);}' .
weggli '{_ *$var=_; free(&$var);}' .

weggli '{$ptr=alloca(_); free($ptr);}' .

returning the address of a stack-allocated variable (CWE-562)

weggli '{_ $ptr[]; return $ptr;}' .
weggli '{_ $ptr[]=_; return $ptr;}' .

weggli '{_ $ptr[]; $ptr2=$ptr; return $ptr2;}' .
weggli '{_ $ptr[]=_; $ptr2=$ptr; return $ptr2;}' .

weggli '{_ $var; return &$var;}' .
weggli '{_ $var=_; return &$var;}' .
weggli '{_ $var[]; return &$var;}' .
weggli '{_ $var[]=_; return &$var;}' .
weggli '{_ *$var; return &$var;}' .
weggli '{_ *$var=_; return &$var;}' .

unchecked return code of malloc(), etc. (CWE-252, CWE-690)

weggli -R 'func=allocf?$' '{$ret=$func(); not:if(_($ret)){};}' .

call to putenv() with a stack-allocated variable (CWE-686)

weggli '{_ $ptr[]; putenv($ptr);}' .
weggli '{_ $ptr[]=_; putenv($ptr);}' .

weggli '{_ $ptr[]; $ptr2=$ptr; putenv($ptr2);}' .
weggli '{_ $ptr[]=_; $ptr2=$ptr; putenv($ptr2);}' .

exposure of underlying memory addresses (CWE-200, CWE-209, CWE-497)

weggli -R 'func=printf$' -R 'fmt=(.*%\w*x.*|.*%\w*X.*|.*%\w*p.*)' '{$func("$fmt");}' .

mismatched memory management routines (CWE-762)

weggli -R 'func=allocf?$|strn?dup$' '{not:$ptr=$func(); free($ptr);}' .

weggli --cpp -R 'func=allocf?$|strn?dup$' '{not:$ptr=$func(); free($ptr);}' .
weggli --cpp '{not:$ptr=new $obj; delete $ptr;}' .

Apparently, delete[] is not supported so this won't work properly:

weggli --cpp '{not:$ptr=new $obj[$len]; delete[] $ptr;}' .

use of uninitialized pointers (CWE-457, CWE-824, CWE-908)

weggli '{_* $ptr; not:$ptr=_; not:_(&$ptr); $func($ptr);}' .
weggli '{_* $ptr; not:$ptr=_; not:_(&$ptr); _($ptr);}' .

These patterns might generate many false positives that should be manually investigated.

command injection

call to system(), popen() (CWE-78, CWE-88, CWE-676)

weggli -R 'func=(system|popen)$' '{$func();}' .
weggli -R 'func=(system|popen)$' '{$func($arg);}' .

The second pattern is meant to filter out string literals, but it might cause some false negatives.

race conditions

call to access(), stat(), lstat() (CWE-367)

weggli -R 'func=(access|l?stat)$' '{$func();}' .

call to mktemp(), tmpnam(), tempnam() (CWE-377)

weggli -R 'func=(mktemp|te?mpnam)$' '{$func();}' .

call to signal() (CWE-364, CWE-479, CWE-828)

weggli -R 'func=signal$' '{$func();}' .

privilege management

privilege management functions called in the wrong order (CWE-696)

weggli '{not:setuid(0); setuid(); setgid();}' .
weggli '{not:seteuid(0); seteuid(); not:seteuid(0); setegid();}' .
weggli '{not:seteuid(0); seteuid(); not:seteuid(0); setuid();}' .
weggli '{not:seteuid(0); seteuid(); not:seteuid(0); seteuid();}' .

unchecked return code of setuid(), seteuid() (CWE-252)

weggli -R 'func=sete?uid$' '{strict:$func();}' .

miscellaneous

wrong order of arguments in call to memset()

weggli -R 'func=memset(_explicit)?$' '{$func(_,_,0);}' .
weggli -R 'func=memset(_explicit)?$' '{$func(_,sizeof(_),_);}' .
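
A C sketch of the mistake, with hypothetical function names. The signature is memset(ptr, value, length), so swapping the last two arguments writes zero bytes and leaves the buffer untouched.

```c
#include <stddef.h>
#include <string.h>

/* Flagged shape: fill value and length are swapped. */
void clear_flagged(unsigned char *buf, size_t len)
{
    memset(buf, (int)len, 0);
}

/* Intended call: fill value second, length third. */
void clear_buffer(unsigned char *buf, size_t len)
{
    memset(buf, 0, len);
}
```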

call to rand(), srand() (CWE-330, CWE-338)

weggli -R 'func=s?rand$' '{$func();}' .

source and destination overlap in sprintf(), snprintf()

weggli -R 'func=^sn?printf$' '{$func($dst,_,$dst);}' .
weggli -R 'func=^sn?printf$' '{$func($dst,_,_,$dst);}' .
weggli -R 'func=^sn?printf$' '{$func($dst,_,_,_,$dst);}' .

And so on...

size check implemented with an assertion macro

weggli -R 'assert=(?i)^\w*assert\w*\s*$' '{$assert(_<_);}' .
weggli -R 'assert=(?i)^\w*assert\w*\s*$' '{$assert(_<=_);}' .
weggli -R 'assert=(?i)^\w*assert\w*\s*$' '{$assert(_>_);}' .
weggli -R 'assert=(?i)^\w*assert\w*\s*$' '{$assert(_>=_);}' .

< should also cover > and <= should also cover >=; however, let's keep all variants just to be sure.

unchecked return code of scanf(), etc. (CWE-252)

weggli -R 'func=scanf$' '{strict:$func();}' .

call to atoi(), atol(), atof(), atoll()

weggli -R 'func=ato(i|ll?|f)$' '{$func();}' .
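
These functions cannot report conversion errors to the caller. A C sketch of a stricter replacement built on strtol(); the helper parse_int is hypothetical:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal int. Returns 0 on success and
 * -1 on empty input, trailing garbage, or an out-of-range value. */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;      /* no digits, or characters after the number */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;      /* value does not fit in an int */
    *out = (int)v;
    return 0;
}
```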

command-line argument or environment variable access

weggli -R 'var=argv|envp' '{$var[_];}' .

missing default case in a switch construct (CWE-478)

weggli -l 'switch(_) {_; not:default:_; _;}' .

-l shows only the first match in each function, so it might be overkill and hide additional matches in the same function.

missing break or equivalent statement in a switch construct (CWE-484)

weggli -l 'switch(_) {case _: not:break; not:exit; not:return; not:goto _; case _:_;}' .

-l shows only the first match in each function, so it might be overkill and hide additional matches in the same function.
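
A C sketch of the fall-through this pattern is after, with hypothetical function names and values. Compilers may warn about the first function.

```c
/* Flagged shape: case 1 has no break and falls through into case 2. */
int classify_flagged(int code)
{
    int level = 0;

    switch (code) {
    case 1:
        level = 1;
    case 2:
        level = 2;
        break;
    default:
        break;
    }
    return level;
}

/* Each case ends with a break. */
int classify(int code)
{
    int level = 0;

    switch (code) {
    case 1:
        level = 1;
        break;
    case 2:
        level = 2;
        break;
    default:
        break;
    }
    return level;
}
```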

missing return statement in a non-void function (CWE-393, CWE-394)

weggli -R 'type!=void' '$type $func(_) {_; not:return;}' .

typos with security implications (CWE-480, CWE-481, CWE-482, CWE-483)

weggli '{for (_==_;_;_) {}}' .
weggli 'if (_=_) {}' .
weggli 'if (_&_) {}' .
weggli 'if (_|_) {}' .
weggli '{_=+_;}' .
weggli '{_=-_;}' .
weggli -R 'func=strn?cpy$' 'if ($func()==_) {}' .

There are many possible additional patterns in this category...
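
As one example of this category, a C sketch of the assignment-in-condition typo (function names are hypothetical). Compilers may warn about the first function.

```c
/* Flagged shape: "=" instead of "==". The condition is the assigned
 * value 0, so the branch is never taken and uid is overwritten. */
int is_root_flagged(int uid)
{
    if (uid = 0)
        return 1;
    return 0;
}

/* Intended comparison. */
int is_root(int uid)
{
    if (uid == 0)
        return 1;
    return 0;
}
```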

keywords that suggest the presence of bugs

weggli -R 'pattern=(?i)(unsafe|insecure|dangerous|warning|overflow)' '$pattern' .

weggli -R 'func=(?i)(encode|decode|convert|interpret|compress|fragment|reassemble)' '_ $func(_) {}' .
weggli -R 'func=(?i)(mutex|lock|toctou|parallelism|semaphore|retain|release|garbage|mutual)' '_ $func(_) {}' .

There are many possible additional patterns in this category...