fccf: Fast C/C++ Code Finder
`fccf` is a command-line tool that quickly searches through C/C++ source code in a directory based on a search string and prints relevant code snippets that match the query.
Highlights
- Quickly identifies source files that contain a search string.
- For each candidate source file, builds an abstract syntax tree (AST).
- Visits the nodes in the AST, looking for function declarations, classes, enums, variables etc., that match the user's request.
- Pretty-prints the identified snippet of source code to the terminal.
- MIT License
Searching the Linux kernel source tree
The following video shows `fccf` searching and finding snippets of code in torvalds/linux.
https://user-images.githubusercontent.com/8450091/165400381-9ba49a62-97fb-4f4a-890a-0dc9b20dfe75.mp4
Searching the `fccf` source tree (Modern C++)
The following video shows `fccf` searching the `fccf` C++ source code.
Note that search results here include:
- Class declaration
- Functions and function templates
- Variable declarations, including lambda functions
https://user-images.githubusercontent.com/8450091/165402206-65d9ed43-b9dd-4528-92bd-0b4ce76b6468.mp4
Search for any of `--flag` that matches
Provide an empty query to match any `--flag`, e.g., any enum declaration.
... or any class constructor
<img width="694" alt="image" src="https://user-images.githubusercontent.com/8450091/165858122-ecfaf103-8e84-418f-8aaa-8f0fc1d087ea.png">
Searching a `for` statement
Use `--for-statement` to search `for` statements. `fccf` will try to find `for` statements (including C++ range-based `for`) that contain the query string.
Searching expressions
Use the --include-expressions
option to find expressions that match the query.
The following example shows fccf
find calls to isdigit()
.
The following example shows fccf
finding references to the clang_options
variable.
Searching for `using` declarations
Use the `--using-declaration` option to find `using` declarations, `using` directives, and type alias declarations.
Searching for `namespace` aliases
Use the `--namespace-alias` option to find `namespace` aliases.
Searching for `throw` expressions
Use `--throw-expression` with a query to search for specific `throw` expressions that contain the query string.
As presented earlier, an empty query here will attempt to match any `throw` expression in the code base:
Build Instructions
Build `fccf` using CMake. For more details, see BUILDING.md.
NOTE: `fccf` requires `libclang` and `LLVM` to be installed.
# Install libclang and LLVM
# sudo apt install libclang-dev llvm
git clone https://github.com/p-ranav/fccf
cd fccf
# Build
cmake -S . -B build -D CMAKE_BUILD_TYPE=Release
cmake --build build
# Install
sudo cmake --install build
`fccf` Usage
foo@bar:~$ fccf --help
Usage: fccf [--help] [--version] [--help] [--exact-match] [--json] [--filter VAR] [-j VAR] [--enum] [--struct] [--union] [--member-function] [--function] [--function-template] [-F] [--class] [--class-template] [--class-constructor] [--class-destructor] [-C] [--for-statement] [--namespace-alias] [--parameter-declaration] [--typedef] [--using-declaration] [--variable-declaration] [--verbose] [--include-expressions] [--static-cast] [--dynamic-cast] [--reinterpret-cast] [--const-cast] [-c] [--throw-expression] [--ignore-single-line-results] [--include-dir VAR]... [--language VAR] [--std VAR] [--no-color] query [path]...
Positional arguments:
query
path [nargs: 0 or more]
Optional arguments:
-h, --help shows help message and exits
-v, --version prints version information and exits
-h, --help Shows help message and exits
-E, --exact-match Only consider exact matches
--json Print results in JSON format
-f, --filter Only evaluate files that match filter pattern [nargs=0..1] [default: "*.*"]
-j Number of threads [nargs=0..1] [default: 5]
--enum Search for enum declaration
--struct Search for struct declaration
--union Search for union declaration
--member-function Search for class member function declaration
--function Search for function declaration
--function-template Search for function template declaration
-F Search for any function or function template or class member function
--class Search for class declaration
--class-template Search for class template declaration
--class-constructor Search for class constructor declaration
--class-destructor Search for class destructor declaration
-C Search for any class or class template or struct
--for-statement Search for `for` statement
--namespace-alias Search for namespace alias
--parameter-declaration Search for function or method parameter
--typedef Search for typedef declaration
--using-declaration Search for using declarations, using directives, and type alias declarations
--variable-declaration Search for variable declaration
--verbose Request verbose output
--ie, --include-expressions Search for expressions that refer to some value or member, e.g., function, variable, or enumerator.
--static-cast Search for static_cast
--dynamic-cast Search for dynamic_cast
--reinterpret-cast Search for reinterpret_cast
--const-cast Search for const_cast
-c Search for any static_cast, dynamic_cast, reinterpret_cast, or const_cast expression
--throw-expression Search for throw expression
--isl, --ignore-single-line-results Ignore forward declarations, member function declarations, etc.
-I, --include-dir Additional include directories [nargs=0..1] [default: {}] [may be repeated]
-l, --language Language option used by clang [nargs=0..1] [default: "c++"]
--std C++ standard to be used by clang [nargs=0..1] [default: "c++17"]
--nc, --no-color Stops fccf from coloring the output
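The matching options above can be read as two small predicates. The following is a minimal sketch, not `fccf`'s actual code: `spelling_matches` and `keep_result` are hypothetical names, and the default substring behavior (and the empty query matching under `--exact-match`) is an assumption based on the descriptions of empty queries and queries that are "contained" in a result.

```cpp
#include <string_view>

// Hypothetical helper (not part of fccf): decides whether a declaration
// name ("spelling") satisfies the query.
bool spelling_matches(std::string_view spelling,
                      std::string_view query,
                      bool exact_match)
{
  // An empty query matches any declaration of the requested kind.
  if (query.empty()) return true;
  // --exact-match: the whole name must equal the query.
  if (exact_match) return spelling == query;
  // Assumed default: the name only needs to contain the query.
  return spelling.find(query) != std::string_view::npos;
}

// Hypothetical helper: with --ignore-single-line-results, drop results whose
// source range starts and ends on the same line (e.g., forward declarations).
bool keep_result(unsigned start_line, unsigned end_line, bool ignore_single_line)
{
  return !(ignore_single_line && start_line == end_line);
}
```

Under these assumptions, the query `lex` matches the name `lexer` by default but not with `--exact-match`, and a result spanning lines 14 to 40 is kept regardless of `--ignore-single-line-results`.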
How it works
- `fccf` does a recursive directory search for a needle in a haystack, like `grep` or `ripgrep`. It uses an `SSE2` `strstr` SIMD algorithm (modified Rabin-Karp SIMD search) if possible to quickly find, in multiple threads, a subset of the source files in the directory that contain a needle.
- For each candidate source file, it uses `libclang` to parse the translation unit (build an abstract syntax tree).
- Then it visits each child node in the AST, looking for specific node types, e.g., `CXCursor_FunctionDecl` for function declarations.
- Once the relevant nodes are identified, if the node's "spelling" (`libclang` name for the node) matches the search query, then the source range of the AST node is identified. The source range is the start and end index of the snippet of code in the buffer.
- Then, it pretty-prints this snippet of code. I have a simple lexer that tokenizes this code and prints colored output.
Note on include_directories
For all this to work, `fccf` first identifies candidate directories that contain header files, e.g., paths that end with `include/`. It then adds these paths to the clang options (before parsing the translation unit) as `-Ifoo -Ibar/baz` etc. Additionally, for each translation unit, the parent and grandparent paths are also added to the include directories for that unit in order to increase the likelihood of successful parsing.
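The heuristic in the paragraph above could look roughly like the following. This is a sketch under stated assumptions rather than `fccf`'s implementation: `looks_like_include_dir` and `include_options` are hypothetical names, the list of directories is passed in instead of being discovered by walking the file system, and "ends with `include/`" is interpreted as "the last path component is `include`".

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: true if the last component of `dir` is "include",
// with or without a trailing slash.
bool looks_like_include_dir(const fs::path& dir)
{
  fs::path name = dir.filename();
  // "foo/include/" has an empty filename; fall back to the parent's name.
  if (name.empty()) name = dir.parent_path().filename();
  return name == "include";
}

// Hypothetical helper: builds the -I options for one translation unit from
// the candidate directories plus the unit's parent and grandparent paths.
std::vector<std::string> include_options(const std::vector<fs::path>& dirs,
                                         const fs::path& translation_unit)
{
  std::vector<std::string> options;
  for (const auto& dir : dirs) {
    if (looks_like_include_dir(dir)) options.push_back("-I" + dir.string());
  }

  const fs::path parent = translation_unit.parent_path();
  if (!parent.empty()) {
    options.push_back("-I" + parent.string());
    const fs::path grandparent = parent.parent_path();
    if (!grandparent.empty()) options.push_back("-I" + grandparent.string());
  }
  return options;
}
```

For a translation unit at `proj/source/lexer.cpp`, this adds `-Iproj/source` and `-Iproj` after any directories that look like include directories.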
Additional include directories can also be provided to `fccf` using the `-I` or `--include-dir` option. With verbose output (`--verbose`), errors from the `libclang` parsing can be identified and fixes can be attempted (e.g., adding the right include directories so that `libclang` is happy).
To run `fccf` on the `fccf` source code without any `libclang` errors, I had to explicitly provide the include path from LLVM-12 like so:
foo@bar:~$ fccf --verbose 'lexer' . --include-dir /usr/lib/llvm-12/include/
Checking ./source/lexer.cpp
Checking ./source/lexer.hpp
Checking ./source/searcher.cpp
// ./source/lexer.hpp (Line: 14 to 40)
class lexer
{
std::string_view m_input;
fmt::memory_buffer* m_out;
std::size_t m_index {0};
bool m_is_stdout {true};
char previous() const;
char current() const;
char next() const;
void move_forward(std::size_t n = 1);
bool is_line_comment();
bool is_block_comment();
bool is_start_of_identifier();
bool is_start_of_string();
bool is_start_of_number();
void process_line_comment();
void process_block_comment();
bool process_identifier(bool maybe_class_or_struct = false);
void process_string();
std::size_t get_number_of_characters(std::string_view str);
public:
void tokenize_and_pretty_print(std::string_view source,
fmt::memory_buffer* out,
bool is_stdout = true);
}