scip-clang: SCIP indexer for C and C++
scip-clang is a precise code indexer based on Clang 16, which supports cross-repository code navigation for C, C++ and CUDA in Sourcegraph.
Here are some code navigation examples:
- Cross-repository navigation in Boost
- In Chromium
- In llvm/llvm-project
- Find references for a CUDA kernel in apache/mxnet
Table of Contents
- Supported Platforms
- Quick Start
- System Requirements
- Usage
- Troubleshooting
- Reporting issues
- Documentation
- Contributing
Supported Platforms
Binary releases are available for x86_64 Linux (glibc 2.16 or newer) and x86_64 macOS (supported on arm64 macOS via Rosetta).
We're exploring Windows support.
Codebases that are routinely compiled with GCC, Clang, or both are supported. For codebases built exclusively with GCC, compatibility should be as good as Clang's own compatibility with GCC (i.e. most features should work, with graceful degradation for features that don't).
<details>
<summary>Extra requirements for indexing CUDA</summary>
When indexing CUDA code, an installation of Clang is required (using your OS package manager or otherwise), and the clang executable must be available on PATH, so that Clang's CUDA-related headers can be found. We recommend Clang 16 or newer, but in our testing, headers from Clang 14 also work.
The CUDA SDK must also be installed.
</details>
scip-clang currently supports indexing using a JSON compilation database. CMake, Bazel and Meson support emitting this format for compatibility with clang-based tooling. Projects which use Make or other build systems may be able to use Bear to intercept compilation commands and generate a compilation database.
We're interested in exploring more native Bazel support in the future.
The use of pre-compiled headers is not supported, as the format of pre-compiled headers varies across compilers and individual compiler versions.
Quick Start
The easiest way to use scip-clang, once you have a JSON compilation database, is to invoke scip-clang from the project root like so:
scip-clang --compdb-path=path/to/compile_commands.json
WARNING: You must invoke scip-clang from the project root, not from a subdirectory, even when you only want to index a subdirectory. If you only want to index a subdirectory, filter out unnecessary entries in the compilation database.
If you see any errors, see the Troubleshooting section.
If all goes well, indexing will generate a file, index.scip, which can be uploaded to a Sourcegraph instance using src-cli v4.5 or newer.
# See https://docs.sourcegraph.com/cli/references/code-intel/upload
# Make sure to authenticate earlier or provide an access token
src code-intel upload -file=index.scip
See the Usage section for step-by-step instructions.
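The upload comment above mentions authenticating first. One way to do that is through the SRC_ENDPOINT and SRC_ACCESS_TOKEN environment variables that src-cli reads. The sketch below assumes that route; check_src_env is a hypothetical helper name introduced only for illustration, and all it does is refuse to proceed when either variable is empty.

```shell
# Hypothetical helper (not part of src-cli or scip-clang): fail early if the
# environment variables that src-cli reads for authentication are unset.
check_src_env() {
  if [ -z "${SRC_ENDPOINT:-}" ] || [ -z "${SRC_ACCESS_TOKEN:-}" ]; then
    echo "Set SRC_ENDPOINT and SRC_ACCESS_TOKEN before uploading." >&2
    return 1
  fi
}

# Example usage (the endpoint and token below are placeholders):
#   export SRC_ENDPOINT=https://sourcegraph.example.com
#   export SRC_ACCESS_TOKEN=<your-token>
#   check_src_env && src code-intel upload -file=index.scip
```

Running the check before the upload keeps a missing token from surfacing only after a long indexing run.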
System Requirements
- About 2MB of temporary space for every translation unit (TU) in the compilation database. To estimate the total:
echo "$(perl -e "print $(jq 'length' build/compile_commands.json) / 512.0") GB"
- On Linux, about 2MB of space in /dev/shm per core (check with df -h /dev/shm). This may particularly be an issue when using Docker on a high core count machine, as the default size of /dev/shm in Docker is 64MB. See also: how to troubleshoot low disk space for IPC.
- 2GB RAM per core is generally sufficient.
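The /dev/shm figure above can be turned into a quick pre-flight check. This is a minimal sketch that assumes the "about 2MB per core" estimate applies to every core on the machine; the function names are hypothetical and not part of scip-clang.

```shell
# Rough /dev/shm requirement in MB, using the "about 2MB per core" figure.
shm_needed_mb() {
  echo $(( $1 * 2 ))
}

# Free space in /dev/shm in MB (Linux); prints nothing if df fails.
shm_available_mb() {
  df -Pk /dev/shm 2>/dev/null | awk 'NR == 2 { print int($4 / 1024) }'
}

# Usage: shm_check <cores> <available_mb>
# Returns non-zero when the available space is below the estimate.
shm_check() {
  needed=$(shm_needed_mb "$1")
  if [ "$2" -lt "$needed" ]; then
    echo "Need about ${needed}MB in /dev/shm, only ${2}MB available." >&2
    return 1
  fi
}

# Example usage on Linux (nproc is from GNU coreutils):
#   shm_check "$(nproc)" "$(shm_available_mb)"
```

By this estimate, Docker's 64MB default covers roughly 32 cores. Beyond that, the size of /dev/shm can be raised with the --shm-size option of docker run.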
Usage
Generating a compilation database
- CMake: Add -DCMAKE_EXPORT_COMPILE_COMMANDS=ON to the cmake invocation. For typical projects, the overall invocation will look like:
cmake -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
- Bazel: Use either hedronvision/bazel-compile-commands-extractor or grailbio/bazel-compilation-database. Caveat: The grailbio generator sometimes accidentally adds unexpanded Make variables in compilation commands, so you may need to remove them as a preprocessing step before invoking scip-clang.
- Meson: Use the Ninja backend, which generates a compilation database.
- Nix + Make: When using Make under Nix, in our testing, the compilation database generated by Bear (recommended below) omits some flags needed to find headers from libc and libstdc++/libc++. Using mini_compile_commands instead avoids that.
- Make or other build systems: Use Bear to wrap the build system invocation which can build all the code. For example:
bear -- make all
In our testing on Linux, Bear works with Boost's B2 build system as well.
Some other tools which may work include:
- compiledb (Linux, macOS, Windows): For Make-style systems, supposedly faster than Bear as it doesn't require a clean build.
- compile-db-gen (Linux): Uses strace.
- clade (Linux, macOS, partial Windows support).
We have not tested any of these.
The official Clang docs may also have additional suggestions for generating a compilation database.
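For the Bazel caveat above, it can help to check whether a generated compilation database contains unexpanded Make variables before indexing. The sketch below is only a detection step and rests on an assumption: that such variables appear in the literal form $(NAME). The helper name is hypothetical, and the exact text the generator leaves behind may differ.

```shell
# Hypothetical helper: list lines of a compilation database that contain
# something shaped like an unexpanded Make variable, e.g. $(SOME_VAR).
# Usage: find_unexpanded_make_vars <compdb.json>
# Exit status follows grep: 0 if any match was found, 1 if none.
find_unexpanded_make_vars() {
  grep -n -E '\$\([A-Za-z_][A-Za-z0-9_]*\)' "$1"
}

# Example usage (the path is a placeholder):
#   find_unexpanded_make_vars compile_commands.json
```

If the check finds matches, the affected flags can be removed or replaced by hand or with a small script, depending on what each variable was meant to expand to.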
Building code
Large projects typically use various forms of code generation. scip-clang re-runs type-checking, so it needs access to generated code. This means that scip-clang should preferably be run after building compilation artifacts.
Initial scip-clang testing
For large codebases, we recommend first testing scip-clang on a subset of a compilation database with diagnostics turned on. For example:
# Using jq (https://github.com/stedolan/jq)
jq '.[0:5]' build/compile_commands.json > build/small_compdb.json
# Invoke scip-clang from the project root
scip-clang --compdb-path=build/small_compdb.json --show-compiler-diagnostics
WARNING: You must invoke scip-clang from the project root, not from a subdirectory, even when you only want to index a subdirectory. If you only want to index a subdirectory, filter out unnecessary entries in the compilation database.
<details>
<summary>Known diagnostics when indexing CUDA</summary>
- If you see an error related to the texture template, that is likely because the installed Clang is older than Clang 16. See https://github.com/llvm/llvm-project/issues/61340
- If you see any errors related to GCC headers, that's a known issue. It shouldn't affect indexer correctness.
- If you see an error related to an unknown flag, you can generally ignore it. scip-clang skips all known NVCC-specific flags as they generally don't affect the semantics of code navigation. We can easily add more flags to skip here if needed.
</details>
If there are errors about missing system or SDK headers, install the relevant system dependencies.
If there are errors about missing generated headers, make sure to build your code first.
If there are any other errors, such as standard library or platform headers not being found, please report an issue.
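Indexing only a subdirectory means filtering the compilation database instead of changing the working directory. A minimal sketch with jq follows. It assumes the "file" field of each entry is written in a form the prefix can match; some generators emit absolute paths and others emit paths relative to "directory", so check your database first. filter_compdb is a hypothetical helper name.

```shell
# Hypothetical helper: keep only the entries whose "file" field starts with
# the given prefix. Usage: filter_compdb <compdb.json> <path-prefix>
filter_compdb() {
  jq --arg prefix "$2" '[.[] | select(.file | startswith($prefix))]' "$1"
}

# Example usage, run from the project root (paths are placeholders):
#   filter_compdb build/compile_commands.json "$PWD/src/parser/" > build/parser_compdb.json
#   scip-clang --compdb-path=build/parser_compdb.json
```

Headers included from outside the subdirectory are still read while type-checking, so the filter only limits which translation units are processed.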
Running scip-clang on a single repo
scip-clang --compdb-path=build/compile_commands.json
The --show-compiler-diagnostics flag is deliberately omitted here, since scip-clang is still able to index code in the presence of compiler errors, and any errors in headers will get repeated for each translation unit in which the header is included.
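If you do want diagnostics for a full run, collapsing the repeats makes the output easier to read. The sketch below assumes that diagnostics are printed in Clang's usual one-line "file:line:col: error: message" form and that they can be captured from the combined output stream; neither is confirmed here, and summarize_diagnostics is a hypothetical helper name.

```shell
# Hypothetical helper: read diagnostic output on stdin, keep the one-line
# headers for warnings and errors, and print each distinct line once with
# the number of times it occurred, most frequent first.
summarize_diagnostics() {
  grep -E ': (fatal error|error|warning): ' | sort | uniq -c | sort -rn
}

# Example usage, from the project root:
#   scip-clang --compdb-path=build/compile_commands.json \
#     --show-compiler-diagnostics 2>&1 | summarize_diagnostics | head -n 20
```

A header error that is reported once per including translation unit then shows up as a single line with a high count.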
Setting up cross-repo code navigation
See the cross-repository setup docs.
Troubleshooting
See the Troubleshooting docs.
Reporting issues
Create a new GitHub issue with any relevant logs attached.
Sourcegraph customers may ask their Customer Engineers for help with filing an issue confidentially, as the log may contain information such as file names.
Documentation
Run scip-clang --help to see documentation for different flags.
A CHANGELOG is also available.
Contributing
- Development.md covers build instructions etc.
- Design.md covers the high-level architecture and design considerations.
- This GitHub comment covers why we decided to write a new C++ indexer.