Home

Awesome

CircleCI

CFPQ_PyAlgo

The CFPQ_PyAlgo is a repository for developing, testing and benchmarking algorithms that solve Formal-Language-Constrained Path Problems, such as Context-Free Path Queries and Regular Path Queries. All algorithms are based on the GraphBLAS framework that allows you to represent graphs as matrices and work with them in terms of linear algebra. For convenience, all the code is written in Python using pygraphblas or in C/C++ using purely SuiteSparse with a Python wrapper.

Installation

First of all you need to clone repository with its submodules:

git clone --recurse-submodules https://github.com/JetBrains-Research/CFPQ_PyAlgo.git
cd CFPQ_PyAlgo/ 
git submodule init
git submodule update

Then the easiest way to get started is to use Docker. An alternative, which is more correct, is to install everything directly.

Using Docker

The first way to start is to use Docker:

# build docker image
docker build --tag <some_tag> .

# run docker container
docker run --rm -it -v ${PWD}:/CFPQ_PyAlgo <some_tag> bash

After it you can develop everything locally and run tests and benchmarks inside the container. Also you can use PyCharm and configure an interpreter using Docker.

Direct install

The correct way is to install everything into your local python interpreter or virtual environment.

First of all you need to install pygraphblas package.

pip3 install pygraphblas

Secondly you need to install cfpq_data_devtools package and other requirements:

cd deps/CFPQ_Data
pip3 install -r requirements.txt
python3 setup.py install --user

cd ../../
pip3 install -r requirements.txt

To check if the installation was successful you can run simple tests

python3 -m pytest test -v -m "CI"

Benchmark

Look please Readme in benchmark

Usage

Let's describe an example of using the implementation outside this environment.

For example, you want to solve a basic problem CFPQ using the matrix algorithm. To do this, you need a context-free grammar (Gr), as well as a graph (G) in the format of "triplets".

Then the matrix algorithm can be run as follows, where PATH_TO_GRAMMAR --- path to file with Gr, PATH_TO_GRAPH --- path to file with G

from src.problems.Base.algo.matrix_base.matrix_base import MatrixBaseAlgo
from cfpq_data import cfg_from_txt
from src.graph.graph import Graph

from pathlib import Path

algo = MatrixBaseAlgo()
algo.prepare(Graph.from_txt(Path(PATH_TO_GRAPH)), cfg_from_txt(Path(PATH_TO_GRAMMAR)))
res = algo.solve()
print(res.matrix_S.nvals)

The given fragment displays the number of pairs of vertices between which the desired path exists.

More examples can be found in test

Project structure

The global project structure is the following:

.
├── deps
│   └── CFPQ_Data - repository with graphs and grammars suites
├───benchmark - directory for performance measurements of implementations
├── src
│   ├── problems - directory where all the problems CFPQ that we know how to solve
│   │   ├───AllPaths
│   │   ├───Base
│   │   ├───MultipleSource
│   │   └───SinglePath
│   ├── grammar - directory for all grammar formats representation and its loading  
│   ├── graph - directory for all graph formats representation and its loading
│   └── utils - directory for other useful classes and methods
└── test
    ├───AllPaths - tests for implementations in src.problems.AllPaths
    ├───Base - tests for implementations in src.problems.Base
    ├───data - dataset for tests
    ├───MultipleSource - tests for implementations in src.problems.MultipleSource
    ├───SinglePath - tests for implementations in src.problems.SinglePath
    └───suites