Chapi

<img src="docs/logo.svg" width="100" height="100" alt="Chapi Logo">

CHAPI (Common Hierarchical Abstract Parser and Information Converter) streamlines code analysis by converting source code from diverse programming languages into a unified hierarchical abstract model, simplifying cross-language development.

Chapi => Cha Pi => Tea Pi => Tea π => 茶 π. See also, in the refs: Tea if by sea, cha if by land.

Chapi (pronounced /tʃɑpi/) can also be pronounced like "XP" in Chinese, if you read X as 叉 (chā).

language stages:

Features | Java | Python | Go | Kotlin | TS/JS | C | C# | Scala | C++ | Rust
http api decl🆕🆕🆕🆕🆕
syntax parse🆕🆕
function call🆕🆕
arch/package🆕🆕
real world

IDL stages:

Features | Protobuf | Thrift
syntax parse
http api decl
arch/package
real world

PS: PRs that add your own projects are welcome.

Language Information

Language versions (tested):

// tier 1 languages
":chapi-ast-java",
":chapi-ast-typescript",

// tier 1 model language
":chapi-ast-protobuf",

// tier 2 languages
":chapi-ast-kotlin",
":chapi-ast-go",
":chapi-ast-python",
":chapi-ast-scala",

// tier 3 languages
":chapi-ast-rust",
":chapi-ast-csharp",
":chapi-ast-c",
":chapi-ast-cpp",

// others
":chapi-parser-toml",
":chapi-parser-cmake",

Language Family wiki

Algol Family https://wiki.c2.com/?AlgolFamily

Family | Languages | Planned support
C family | C#, Java, Go, C, C++, Objective-C, Rust, ... | C++, C, Java, C#, Rust?
Functional | Scheme, Lisp, Clojure, Scala, ... | Scala
Scripting | Lua, PHP, JavaScript, Python, Perl, Ruby, ... | Python, JavaScript
Other | Fortran, Swift, Matlab, ... | Swift?, Fortran?

Language-Specific Rules

Source code is scanned twice. To get correct results, note the following per-language rules:

TypeScript

  1. PackageName is derived from the resolved path: the package of src/grammar/blbla.ts is @.grammar
  2. A function declared at file level uses default as its DataStructure.Name
  3. An export default object in a file uses default as its FunctionName and belongs to the default DataStructure
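
The first rule can be illustrated with a small sketch. The helper `packageNameOf` is hypothetical and not part of Chapi's API; it assumes that the first path segment is the source root (shown as `@`) and that the remaining directories are joined with dots, which matches the single example given above but may not cover every case Chapi handles.

```kotlin
// Hypothetical helper, not Chapi's API: derives a package name from a
// relative TypeScript path, assuming the first segment is the source root.
fun packageNameOf(relativePath: String): String {
    val segments = relativePath
        .replace('\\', '/')
        .split('/')
        .filter { it.isNotEmpty() }
    // Drop the source root (first segment) and the file name (last segment).
    val dirs = segments.drop(1).dropLast(1)
    // "@" stands for the source root; directories are joined with dots.
    return (listOf("@") + dirs).joinToString(".")
}
```

Under these assumptions, `packageNameOf("src/grammar/blbla.ts")` yields `@.grammar`, and a file directly under the source root yields `@`.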

C# issues

C

We use https://github.com/shevek/jcpp to pre-process C code.

Kotlin

Usage

  1. add to dependencies
dependencies {
    implementation 'com.phodal.chapi:chapi-ast-java:2.3.6'
    implementation 'com.phodal.chapi:chapi-domain:2.3.6'
}
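
If your build uses the Gradle Kotlin DSL (a build.gradle.kts file) rather than the Groovy DSL shown above, the same two dependencies would be declared roughly as follows. The coordinates and version are copied from the snippet above; only the syntax differs.

```kotlin
// build.gradle.kts: same coordinates as above, in Kotlin DSL syntax
dependencies {
    implementation("com.phodal.chapi:chapi-ast-java:2.3.6")
    implementation("com.phodal.chapi:chapi-domain:2.3.6")
}
```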

  2. call the analyser from Kotlin (the example wraps the C# analyser in an ArchGuard scanner)

import chapi.domain.core.CodeDataStruct
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.runBlocking
import org.archguard.scanner.core.sourcecode.LanguageSourceCodeAnalyser
import org.archguard.scanner.core.sourcecode.SourceCodeContext
import java.io.File

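// Assumes LanguageSourceCodeAnalyser (from ArchGuard) declares analyse() and
// provides the getFilesByPath and File.readContent helpers used below.
class CSharpAnalyser(override val context: SourceCodeContext) : LanguageSourceCodeAnalyser {
    private val client = context.client
    private val impl = chapi.ast.csharpast.CSharpAnalyser()

    // Parse every .cs file under context.path concurrently, then persist the results.
    override fun analyse(): List<CodeDataStruct> = runBlocking {
        getFilesByPath(context.path) {
            it.absolutePath.endsWith(".cs")
        }
            .map { async { analysisByFile(it) } }.awaitAll()
            .flatten()
            .also { client.saveDataStructure(it) }
    }

    // Parse one file and stamp each data structure with the file's imports and path.
    fun analysisByFile(file: File): List<CodeDataStruct> {
        val codeContainer = impl.analysis(file.readContent(), file.name)
        return codeContainer.Containers.flatMap { container ->
            container.DataStructures.map {
                it.apply {
                    it.Imports = codeContainer.Imports
                    it.FilePath = file.absolutePath
                }
            }
        }
    }
}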

Examples

Example Java source code:

package adapters.outbound.persistence.blog;

public class BlogPO implements PersistenceObject<Blog> {
    @Override
    public Blog toDomainModel() {

    }
}

Example output:

{
    "Imports": [],
    "Implements": [
        "PersistenceObject<Blog>"
    ],
    "NodeName": "BlogPO",
    "Extend": "",
    "Type": "CLASS",
    "FilePath": "",
    "InOutProperties": [],
    "Functions": [
        {
            "IsConstructor": false,
            "InnerFunctions": [],
            "Position": {
                "StartLine": 6,
                "StartLinePosition": 133,
                "StopLine": 8,
                "StopLinePosition": 145
            },
            "Package": "",
            "Name": "toDomainModel",
            "MultipleReturns": [],
            "Annotations": [
                {
                    "Name": "Override",
                    "KeyValues": []
                }
            ],
            "Extension": {},
            "Override": false,
            "extensionMap": {},
            "Parameters": [],
            "InnerStructures": [],
            "ReturnType": "Blog",
            "Modifiers": [],
            "FunctionCalls": []
        }
    ],
    "Annotations": [],
    "Extension": {},
    "Parameters": [],
    "Fields": [],
    "MultipleExtend": [],
    "InnerStructures": [],
    "Package": "adapters.outbound.persistence.blog",
    "FunctionCalls": []
}
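
To show how such output might be consumed, here is a minimal sketch that mirrors a few fields of the JSON above with simplified stand-in classes. `DataStructInfo`, `FunctionInfo`, `AnnotationInfo`, and both extension functions are illustrative names, not Chapi's own types; the real CodeDataStruct carries many more fields.

```kotlin
// Simplified stand-ins for the JSON above; only a handful of fields are modelled.
data class AnnotationInfo(val Name: String)

data class FunctionInfo(
    val Name: String,
    val ReturnType: String,
    val Annotations: List<AnnotationInfo> = emptyList()
)

data class DataStructInfo(
    val Package: String,
    val NodeName: String,
    val Type: String,
    val Implements: List<String>,
    val Functions: List<FunctionInfo>
)

// Fully qualified name built from Package and NodeName.
fun DataStructInfo.fullName(): String =
    if (Package.isEmpty()) NodeName else "$Package.$NodeName"

// Names of the functions that carry the given annotation.
fun DataStructInfo.functionsAnnotatedWith(name: String): List<String> =
    Functions.filter { f -> f.Annotations.any { it.Name == name } }.map { it.Name }

// The BlogPO example from above, rebuilt by hand.
val blogPO = DataStructInfo(
    Package = "adapters.outbound.persistence.blog",
    NodeName = "BlogPO",
    Type = "CLASS",
    Implements = listOf("PersistenceObject<Blog>"),
    Functions = listOf(
        FunctionInfo("toDomainModel", "Blog", listOf(AnnotationInfo("Override")))
    )
)
```

With these definitions, `blogPO.fullName()` gives `adapters.outbound.persistence.blog.BlogPO`, and `blogPO.functionsAnnotatedWith("Override")` gives a list containing only `toDomainModel`.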

Development

Syntax parsing identifies the following elements:

  1. package name
  2. import name
  3. class / data struct
    1. struct name
    2. struct parameters
    3. function name
    4. return types
    5. function parameters
  4. function
    1. function name
    2. return types
    3. function parameters
  5. method call
    1. new instance call
    2. parameter call
    3. field call

Build Antlr Grammar

  1. set up ANTLR: brew install antlr
  2. compile the grammars: ./scripts/compile-antlr.sh

Data Structures

// for multiple project analysis
code_project
code_module

// for package dependency analysis
code_package_info
code_dependency

// package or file as dependency analysis
code_package
code_container

// class-first or function-first
code_data_struct
code_function

// function or class detail
code_annotation
code_field
code_import
code_member
code_position
code_property

// method call information
code_call
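
The names above suggest a nesting of containers, data structures, functions, and calls. The sketch below walks such a nesting to list call edges. All four classes are simplified stand-ins written for this illustration; the field names `Containers`, `DataStructures`, `Functions`, and `FunctionCalls` follow the usage code and example output in this README, while the fields of `Call` are an assumption.

```kotlin
// Simplified stand-ins for the domain model; not Chapi's real classes.
data class Call(val NodeName: String, val FunctionName: String)
data class Func(val Name: String, val FunctionCalls: List<Call> = emptyList())
data class Struct(val NodeName: String, val Functions: List<Func> = emptyList())
data class Container(
    val FullName: String,
    val DataStructures: List<Struct> = emptyList(),
    val Containers: List<Container> = emptyList()
)

// Collects "caller -> callee" edges from a container and, recursively,
// from every container nested inside it.
fun callEdges(container: Container): List<String> {
    val own = container.DataStructures.flatMap { s ->
        s.Functions.flatMap { f ->
            f.FunctionCalls.map { c ->
                "${s.NodeName}.${f.Name} -> ${c.NodeName}.${c.FunctionName}"
            }
        }
    }
    return own + container.Containers.flatMap { callEdges(it) }
}
```

A list of edges like this is one possible starting point for the package or file dependency analysis mentioned above.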

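Development (translated from the Chinese version)

Environment: IntelliJ IDEA, JDK 11+

  1. Clone the code: git clone https://github.com/phodal/chapi

  2. Build: ./gradlew build

Contributing

To keep bugs to a minimum, the project follows TDD: write the syntax test first, then the implementation. Keeping test coverage as high as possible reduces the chance of bugs.

The project is built mainly from domain + the AST modules for each language + application.

The entry point of each AST project is xxAnalyser, which returns a CodeContainer. For languages other than C#, a CodeContainer is equivalent to a CodeFile, that is, a single source file.

The domain model inside a CodeContainer is as follows: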

// class-first or function-first
code_data_struct // class, struct, interface, etc.
code_function    // function. In languages with first-class functions, it is wrapped in a code_data_struct with NodeName = "default"

// function or class detail
code_annotation  // annotation
code_field       // global variable
code_import      // package dependency
code_member      // reserved field
code_position    // position information
code_property    // parameter-related

// method call information
code_call        // function call, e.g. fmt.Println
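
The "default" wrapping described for code_function can be pictured with a short sketch. `FunctionDecl`, `StructSketch`, and `groupIntoStructs` are hypothetical names used only here; the sketch assumes that a function with no owning class is a top-level function.

```kotlin
// Hypothetical input: a function name plus the class that owns it,
// or null when the function is declared at file level.
data class FunctionDecl(val name: String, val owner: String?)

// Simplified stand-in for code_data_struct.
data class StructSketch(val NodeName: String, val Functions: List<String>)

// Groups functions by owner; file-level functions end up in a
// data structure whose NodeName is "default".
fun groupIntoStructs(decls: List<FunctionDecl>): List<StructSketch> =
    decls.groupBy { it.owner ?: "default" }
        .map { (owner, fns) -> StructSketch(owner, fns.map { it.name }) }
```

Because `groupBy` keeps insertion order, the resulting data structures appear in the order in which their first function was declared.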

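Getting involved

  1. Find a language you are interested in, or add an AST for a new language

Implement the features below step by step using TDD (ideally in the order listed). For a reference example, see JavaFullIdentListenerTest.kt. The features are: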

  1. package name
  2. import name
  3. class / data struct
    1. struct name
    2. struct parameters
    3. function name
    4. return types
    5. function parameters
  4. function
    1. function name
    2. return types
    3. function parameters
  5. method call
    1. new instance call
    2. parameter call
    3. field call
    4. other calls...
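
As a rough picture of the test-first workflow for the first item (package name), consider the sketch below. `identPackageName` is a toy, regex-based stand-in written for this illustration. It is not how Chapi identifies packages (Chapi works on ANTLR grammars) and exists only so the shape of a test is visible without extra dependencies; a real test would use the project's test framework and analyser classes.

```kotlin
// Toy stand-in for an analyser: finds a Java-style package declaration.
val PACKAGE_DECL = Regex("""^\s*package\s+([\w.]+)\s*;""", RegexOption.MULTILINE)

fun identPackageName(code: String): String? =
    PACKAGE_DECL.find(code)?.groupValues?.get(1)

// Written first, before the implementation exists: the expected behaviour.
fun shouldIdentifyPackageName() {
    val code = """
        package adapters.outbound.persistence.blog;

        public class BlogPO {}
    """.trimIndent()
    check(identPackageName(code) == "adapters.outbound.persistence.blog")
}

// A second case pins down what happens when no package is declared.
fun shouldReturnNullWithoutPackage() {
    check(identPackageName("public class BlogPO {}") == null)
}
```

Each later item in the list (import name, struct name, and so on) would get its own failing test first, followed by just enough implementation to make it pass.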

Commit message format

Commit messages are used to generate a standard CHANGELOG.md at release time.

Format: <type>[Language]: <message>. Example: feat(java): <grammars> init python & go grammars Phodal Huang 2020/2/2, 5:01 PM
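
A message in this format could be checked mechanically. The sketch below assumes the parenthesised form used in the example (`feat(java): ...`) and treats the language part as optional; `CommitMessage`, `COMMIT_PATTERN`, and `parseCommitMessage` are illustrative names, not part of the project's tooling.

```kotlin
// Parsed pieces of a commit subject line.
data class CommitMessage(val type: String, val language: String?, val message: String)

// type, optional "(language)", then ": " and the message text.
val COMMIT_PATTERN = Regex("""^(\w+)(?:\(([\w#+/-]+)\))?: (.+)$""")

// Returns null when the line does not follow the format.
fun parseCommitMessage(line: String): CommitMessage? {
    val match = COMMIT_PATTERN.matchEntire(line) ?: return null
    // An unmatched optional group is reported as an empty string.
    val (type, language, message) = match.destructured
    return CommitMessage(type, language.ifEmpty { null }, message)
}
```

For instance, `parseCommitMessage("feat(java): init python & go grammars")` returns a `CommitMessage` with type `feat` and language `java`, while a line without the `type: message` shape returns null.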

For all types, see:

License

@2020 A Phodal Huang's Idea. This code is distributed under the MPL license. See LICENSE in this directory.