GitHub Sensitive Information Scanning Tool

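Feature design

GitPrey is a tool that searches for projects by company keyword and then scans them for sensitive files and sensitive file contents. Its design is as follows: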

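In addition, the number of suspect projects found varies by keyword, from a handful to dozens or even hundreds, and this directly affects search and scan time (the other factor is the number of file-name keywords and content keywords to match): the more projects and keywords, the longer the scan. The scan depth can therefore be chosen as needed. Depth is determined by the code result pages sorted by GitHub's Recently Indexed order: the deeper the level, the more projects are retrieved, and vice versa. The depth options are described below: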

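Scan depth should be positively correlated with the company's scan interval: a shallow depth calls for a short interval. At Level 1, for instance, the scan could run daily or weekly depending on the company's situation, while at Level 5 the interval can be lengthened accordingly. For example, for the keyword "Google" the maximum depth (Level 5) reaches project code uploaded two days earlier, whereas the keyword "repoog" returns less than one page of results.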

Technical implementation

The project configuration file Config.py must be set up with the user's GitHub username and password:

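Usage

Since v2.2, GitPrey no longer uses the ACCESS_TOKEN setting or the SEARCH_LEVEL and KEYWORDS options in the configuration file; these are passed as command-line arguments instead: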

USAGE:
        -l  Set level for searching within 1~5, default level is 1.
        -k  Set key words for searching projects.
        -h  Show help information.
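
The option set above can be sketched with Python's standard argparse module. This is a minimal illustration, not GitPrey's actual parsing code: the function names `build_parser` and `parse_args`, the attribute names `level` and `keywords`, and the choice to make `-k` mandatory are all assumptions made for the example.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented -l, -k and -h options.

    Hypothetical sketch: names and validation rules are assumptions,
    not taken from the GitPrey source.
    """
    parser = argparse.ArgumentParser(
        description="Search GitHub projects by keyword and scan them.",
        add_help=False,  # register -h manually so only the short form exists
    )
    parser.add_argument(
        "-l", dest="level", type=int, choices=range(1, 6), default=1,
        help="Set level for searching within 1~5, default level is 1.",
    )
    parser.add_argument(
        "-k", dest="keywords", required=True,  # assumption: -k is mandatory
        help="Set key words for searching projects.",
    )
    parser.add_argument("-h", action="help", help="Show help information.")
    return parser


def parse_args(argv):
    """Parse a list of argument strings and return the resulting namespace."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    import sys

    options = parse_args(sys.argv[1:])
    print("level:", options.level, "keywords:", options.keywords)
```

Restricting `-l` with `choices=range(1, 6)` makes argparse reject a level outside 1 to 5 before any search starts, which keeps the depth check in one place.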

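File configuration

pattern is the directory that holds the search pattern configuration files. The relevant files are described below: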

Updates


Sensitive information scan tool for GitHub

Function introduction and design

GitPrey is a tool for finding sensitive information or data on GitHub based on a company name or other keyword. Its design follows the way sensitive data leaks are searched for on GitHub:

Note that GitPrey may miss some files or report some files by mistake, for the following reasons:

GitPrey also provides a search level to adjust the scan depth, from Level 1 to Level 5:

You can modify the level in Config.py. To search as quickly as possible, configure your own GitHub account username and password to avoid 429 errors (too many requests).
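
If a 429 response still occurs, one common way to cope is to retry with exponential backoff. The sketch below illustrates the idea only and is not how GitPrey handles rate limiting: `TooManyRequests`, `fetch_with_backoff`, and the `fetch` callable are hypothetical names, and the caller is assumed to raise `TooManyRequests` whenever the server answers with status 429.

```python
import time


class TooManyRequests(Exception):
    """Raised by the hypothetical fetch callable on an HTTP 429 response."""


def fetch_with_backoff(fetch, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry with exponential backoff on TooManyRequests.

    fetch       -- zero-argument callable that performs one request
    max_retries -- number of retries after the first attempt
    base_delay  -- delay in seconds before the first retry; doubles each time
    sleep       -- injectable sleep function, which makes testing easy
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except TooManyRequests:
            if attempt == max_retries:
                raise  # out of retries: let the caller see the error
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter lets a test substitute a recorder for `time.sleep`, so the retry schedule can be checked without waiting.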

Technical details

Some notes on the technical details:

GitPrey usage

GitPrey removed the ACCESS_TOKEN, SEARCH_LEVEL and KEYWORDS configuration as of v2.2:

USAGE:
        -l  Set search level for searching projects within 1-5, default level is 1.
        -k  Set key words for searching projects.
        -h  Show help information.

Pattern files

pattern is the directory that holds the db files: