Awesome
About
This is the PyPI package of the simple-graph implementation in Python, which is a simple graph database in SQLite, inspired by "SQLite as a document database".
Build and Test
How to generate the distribution archive and confirm it on test.pypi.org, also based on the pypa/sampleproject:
rm -rf build dist src/simple_graph_sqlite.egg-info
python -m build
python -m twine upload --repository testpypi dist/*
Create a virtual environment for the test package, activate it, pull from test.pypi.org (the --extra-index-url
is necessary since the graphviz
and/or Jinja2
dependencies may not be in the test archive), and confirm the package is available:
$ pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple simple-graph-sqlite graphviz==0.16 Jinja2==3.1.2
$ python
Python 3.6.13 |Anaconda, Inc.| (default, Jun 4 2021, 14:25:59)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from simple_graph_sqlite import database as db
With the test package installed, update PYTHONPATH
to include ./tests
and run pytest
from the root of this repository. If the tests pass, rebuild and push to pypi.org:
rm -rf build dist src/simple_graph_sqlite.egg-info
python -m build
python -m twine upload --repository pypi dist/*
Structure
The schema consists of just two structures:
- Nodes - these are any json objects, with the only constraint being that they each contain a unique
id
value - Edges - these are pairs of node
id
values, specifying the direction, with an optional json object as connection properties
There are also traversal functions as native SQLite Common Table Expressions:
Applications
- Social networks
- Interest maps/recommendation finders
- To-do / task lists
- Bug trackers
- Customer relationship management (CRM)
- Gantt chart
Usage
Installation Requirements
- SQLite, version 3.31.0 or higher; get the latest source or precompiled binaries from the SQLite Download Page
- Graphviz for visualization (download page, installation procedure for Windows)
- Jinja2 for the search and traversal templates
Basic Functions
The database script provides convenience functions for atomic transactions to add, delete, connect, and search for nodes.
Any single node or path of nodes can also be depicted graphically by using the visualize
function within the database script to generate dot files, which in turn can be converted to images with Graphviz.
Example
Dropping into a python shell, we can create, upsert, and connect people from the early days of Apple Computer. The resulting database will be saved to a SQLite file named apple.sqlite
:
>>> apple = "apple.sqlite"
>>> from simple_graph_sqlite import database as db
>>> db.initialize(apple)
>>> db.atomic(apple, db.add_node({'name': 'Apple Computer Company', 'type':['company', 'start-up'], 'founded': 'April 1, 1976'}, 1))
>>> db.atomic(apple, db.add_node({'name': 'Steve Wozniak', 'type':['person','engineer','founder']}, 2))
>>> db.atomic(apple, db.add_node({'name': 'Steve Jobs', 'type':['person','designer','founder']}, 3))
>>> db.atomic(apple, db.add_node({'name': 'Ronald Wayne', 'type':['person','administrator','founder']}, 4))
>>> db.atomic(apple, db.add_node({'name': 'Mike Markkula', 'type':['person','investor']}, 5))
>>> db.atomic(apple, db.connect_nodes(2, 1, {'action': 'founded'}))
>>> db.atomic(apple, db.connect_nodes(3, 1, {'action': 'founded'}))
>>> db.atomic(apple, db.connect_nodes(4, 1, {'action': 'founded'}))
>>> db.atomic(apple, db.connect_nodes(5, 1, {'action': 'invested', 'equity': 80000, 'debt': 170000}))
>>> db.atomic(apple, db.connect_nodes(1, 4, {'action': 'divested', 'amount': 800, 'date': 'April 12, 1976'}))
>>> db.atomic(apple, db.connect_nodes(2, 3))
>>> db.atomic(apple, db.upsert_node(2, {'nickname': 'Woz'}))
There are also bulk operations, to insert and connect lists of nodes in one transaction.
The nodes can be searched by their ids:
>>> db.atomic(apple, db.find_node(1))
{'name': 'Apple Computer Company', 'type': ['company', 'start-up'], 'founded': 'April 1, 1976', 'id': 1}
Searches can also use combinations of other attributes, both as strict equality, or using LIKE
in combination with a trailing %
for "starts with" or %
at both ends for "contains":
>>> db.atomic(apple, db.find_nodes([db._generate_clause('name', predicate='LIKE')], ('Steve%',)))
[{'name': 'Steve Wozniak', 'type': ['person', 'engineer', 'founder'], 'id': 2, 'nickname': 'Woz'}, {'name': 'Steve Jobs', 'type': ['person', 'designer', 'founder'], 'id': 3}]
>>> db.atomic(apple, db.find_nodes([db._generate_clause('name', predicate='LIKE'), db._generate_clause('name', predicate='LIKE', joiner='OR')], ('%Woz%', '%Markkula',)))
[{'name': 'Steve Wozniak', 'type': ['person', 'engineer', 'founder'], 'id': 2, 'nickname': 'Woz'}, {'name': 'Mike Markkula', 'type': ['person', 'investor'], 'id': 5}]
More complex queries to introspect the json body, using the sqlite json_tree() function, are also possible, such as this query for every node whose type
array contains the value founder
:
>>> db.atomic(apple, db.find_nodes([db._generate_clause('type', tree=True)], ('founder',), tree_query=True, key='type'))
[{'name': 'Steve Wozniak', 'type': ['person', 'engineer', 'founder'], 'id': 2, 'nickname': 'Woz'}, {'name': 'Steve Jobs', 'type': ['person', 'designer', 'founder'], 'id': 3}, {'name': 'Ronald Wayne', 'type': ['person', 'administrator', 'founder'], 'id': 4}]
See the _generate_clause()
and _generate_query()
functions in database.py for usage hints.
Paths through the graph can be discovered with a starting node id, and an optional ending id; the default neighbor expansion is nodes connected nodes in either direction, but that can changed by specifying either find_outbound_neighbors
or find_inbound_neighbors
instead:
>>> db.traverse(apple, 2, 3)
['2', '1', '3']
>>> db.traverse(apple, 4, 5)
['4', '1', '2', '3', '5']
>>> db.traverse(apple, 5, neighbors_fn=db.find_inbound_neighbors)
['5']
>>> db.traverse(apple, 5, neighbors_fn=db.find_outbound_neighbors)
['5', '1', '4']
>>> db.traverse(apple, 5, neighbors_fn=db.find_neighbors)
['5', '1', '2', '3', '4']
Any path or list of nodes can rendered graphically by using the visualize
function. This command produces dot files, which are also rendered as images with Graphviz:
>>> from visualizers import graphviz_visualize
>>> graphviz_visualize(apple, 'apple.dot', [4, 1, 5])
The resulting text file also comes with an associated image (the default is png, but that can be changed by supplying a different value to the format
parameter)
The default options include every key/value pair (excluding the id) in the node and edge objects, and there are display options to help refine what is produced:
>>> graphviz_visualize(apple, 'apple.dot', [4, 1, 5], exclude_node_keys=['type'], hide_edge_key=True)
The resulting dot file can be edited further as needed; the dot guide has more options and examples.