Home

Awesome

Build Status

Pelote

Pelote is a python library full of graph-related functions that can be used to complement networkx for higher-level tasks.

It mainly helps with the following things:

As such it is the perfect companion to ipysigma, our Jupyter widget that can render interactive graphs directly within your notebooks.

Installation

You can install pelote with pip with the following command:

pip install pelote

If you want to be able to use the library with pandas, you will need to install it also:

pip install pandas

Usage


Tabular data to graphs

table_to_bipartite_graph

Function creating a bipartite graph from the given tabular data.

Arguments

Returns

nx.AnyGraph - the bipartite graph.

tables_to_graph

Function creating a graph from two tables: a table of nodes and a table of edges.

from pelote import tables_to_graph

table_nodes = [
    {"name": "alice", "age": 50},
    {"name": "bob", "age": 12}
]

table_edges = [
    {"source": "alice", "target": "bob", "weight": 0.8},
    {"source": "bob", "target": "alice", "weight": 0.2}
]

g = tables_to_graph(
    table_nodes, table_edges, node_col="name", node_data=["age"], edge_data=["weight"], directed=True
)

Arguments

Returns

nx.AnyGraph - the resulting graph.

edges_table_to_graph

Function creating a graph from a table of edges.

Arguments

Returns

nx.AnyGraph - the resulting graph.


Graphs to tabular data

graph_to_nodes_dataframe

Function converting the given networkx graph into a pandas DataFrame of its nodes.

from pelote import graph_to_nodes_dataframe

df = graph_to_nodes_dataframe(graph)

Arguments

Returns

pd.DataFrame - A pandas DataFrame

graph_to_edges_dataframe

Function converting the given networkx graph into a pandas DataFrame of its edges.

Arguments

Returns

pd.DataFrame - A pandas DataFrame

graph_to_dataframes

Function converting the given networkx graph into two pandas DataFrames: one for its nodes, one for its edges.

Arguments

Returns

None - (pd.DataFrame, pd.DataFrame)


Graph projection

monopartite_projection

Function returning the monopartite projection of a given bipartite graph wrt one of both partitions of the graph.

That is to say the resulting graph will keep a single type of nodes sharing weighted edges based on the neighbors they shared in the bipartite graph.

import networkx as nx
from pelote import monopartite_projection

bipartite = nx.Graph()
bipartite.add_nodes_from([1, 2, 3], part='account')
bipartite.add_nodes_from([4, 5, 6], part='color')
bipartite.add_edges_from([
    (1, 4),
    (1, 5),
    (2, 6),
    (3, 4),
    (3, 6)
])

# Resulting graph will only contain nodes [1, 2, 3]
# with edges: (1, 3) and (2, 3)
monopartite = monopartite_projection(bipartite, 'account')

Arguments

Returns

nx.Graph - the projected monopartite graph.


Graph sparsification

global_threshold_sparsification

Function returning a copy of the given graph without edges whose weight is less than a given threshold.

Arguments

Returns

nx.AnyGraph - the sparse graph.

multiscale_backbone

Function returning the multiscale backbone of the given graph, i.e. a copy of the graph were we only kept "relevant" edges, as defined by a statistical test where we compare the likelihood of a weighted edge existing vs. the null model.

Article

Serrano, M. Ángeles, Marián Boguná, and Alessandro Vespignani. "Extracting the multiscale backbone of complex weighted networks." Proceedings of the national academy of sciences 106.16 (2009): 6483-6488.

References

Arguments

Returns

nx.AnyGraph - the sparse graph.


Miscellaneous graph-related metrics

edge_disparity

Function computing the disparity score of each edge in the given graph. This score is typically used to extract the multiscale backbone of a weighted graph.

The formula from the paper (relying on integral calculus) can be simplified to become:

disparity(u, v) = min(
    (1 - normalizedWeight(u, v)) ^ (degree(u) - 1)),
    (1 - normalizedWeight(v, u)) ^ (degree(v) - 1))
)

where

normalizedWeight(u, v) = weight(u, v) / weightedDegree(u)
weightedDegree(u) = sum(weight(u, v) for v in neighbors(u))

This score can sometimes be found reversed likewise:

disparity(u, v) = max(
    1 - (1 - normalizedWeight(u, v)) ^ (degree(u) - 1)),
    1 - (1 - normalizedWeight(v, u)) ^ (degree(v) - 1))
)

so that higher score means better edges. We chose to keep the metric close to the paper to keep the statistical test angle. This means that, in this implementation at least, a low score for an edge means a high relevance and increases its chances to be kept in the backbone.

Note that this algorithm has no proper definition for directed graphs and is only useful if edges have varying weights. This said, it could be possible to compute the disparity score only based on edge direction, if we drop the min part.

Article

Serrano, M. Ángeles, Marián Boguná, and Alessandro Vespignani. "Extracting the multiscale backbone of complex weighted networks." Proceedings of the national academy of sciences 106.16 (2009): 6483-6488.

References

Arguments

Returns

dict - Dictionnary with edges - (source, target) tuples - as keys and the disparity scores as values

triangular_strength

Function returning a graph edges triangular strength, sometimes also called Simmelian strength, i.e. the number of triangles each edge is a part of.

Arguments

Returns

dict - mapping of edges to their triangular strength.


Graph utilities

union_of_maximum_spanning_trees

Generator yielding the edges belonging to any Maximum Spanning Tree (MST) of the given networkx graph.

Note that this function will give to each edge with no weight a default weight of 1.

Article

Arlind Nocaj, Mark Ortmann, and Ulrik Brandes "Untangling Hairballs. From 3 to 14 Degrees of Separation." Computer & Information Science, University of Konstanz, Germany, 2014, https://dx.doi.org/10.1007/978-3-662-45803-7_9.

References

Arguments

Yields

tuple - source, target, attributes

largest_connected_component

Function returning the largest connected component of given networkx graph as a set of nodes.

Note that this function will consider any given graph as undirected and will therefore work with weakly connected components in the directed case.

Arguments

Returns

set - set of nodes representing the largest connected component.

crop_to_largest_connected_component

Function mutating the given networkx graph in order to keep only the largest connected component.

Note that this function will consider any given graph as undirected and will therefore work with weakly connected components in the directed case.

Arguments

largest_connected_component_subgraph

Function returning the largest connected component subgraph of the given networkx graph.

Note that this function will consider any given graph as undirected and will therefore work with weakly connected components in the directed case.

Arguments

Returns

nx.AnyGraph - the subgraph.

remove_edges

Function removing all edges that do not pass a predicate function from a given networkx graph.

Note that this function mutates the given graph.

Arguments

filter_edges

Function returning a copy of the given networkx graph but without the edges filtered out by the given predicate function

Arguments

Returns

nx.AnyGraph - the filtered graph.

remove_nodes

Function removing all nodes that do not pass a predicate function from a given networkx graph.

Note that this function mutates the given graph.

from pelote import remove_nodes

g = nx.Graph()
g.add_node(1, weight=22)
g.add_node(2, weight=4)
g.add_edge(1, 2)

remove_nodes(g, lambda n, a: a["weight"] >= 10)

Arguments

filter_nodes

Function returning a copy of the given networkx graph but without the nodes filtered out by the given predicate function

from pelote import filter_nodes

g = nx.Graph()
g.add_node(1, weight=22)
g.add_node(2, weight=4)
g.add_edge(1, 2)

h = filter_nodes(g, lambda n, a: a["weight"] >= 10)

Arguments

Returns

nx.AnyGraph - the filtered graph.

remove_leaves

Function removing all leaves of the graph, i.e. the nodes incident to a single edge, i.e. the nodes with degree 1.

This function is not recursive and will only remove one layer of leaves.

Note that this function mutates the given graph.

from pelote import remove_leaves

g = nx.Graph()
g.add_edge(1, 2)
g.add_edge(2, 3)

remove_leaves(g)

list(g.nodes)
>>> [2]

Arguments

filter_leaves

Function returning a copy of the given networkx graph but without its leaves, i.e. the nodes incident to a single edge, i.e. the nodes with degree 1.

This function is not recursive and will only filter only one layer of leaves.

from pelote import remove_leaves

g = nx.Graph()
g.add_edge(1, 2)
g.add_edge(2, 3)

h = filter_leaves(g)

list(h.nodes)
>>> [2]

Arguments


Learning

floatsam_threshold_learner

Function using an iterative algorithm to try and find the best weight threshold to apply to trim the given graph's edges while keeping the underlying community structure.

It works by iteratively increasing the threshold and stopping as soon as a significant connected component starts to drift away from the principal one.

This is basically an optimization algorithm applied to a complex nonlinear function using a very naive cost heuristic, but it works decently for typical cases as it emulates the method used by hand by some researchers when they perform this kind of task on Gephi, for instance.

When working on metrics where lower is better (i.e. edge disparity), you can reverse the logic of the algorithm by tweaking starting_threshold and giving a negative learning_rate.

Arguments

Returns

float - The found threshold


Reading & Writing

read_graphology_json

Function reading and parsing the given json file representing a serialized graphology graph as a networkx graph.

Note that this function cannot parse a true mixed graph since this is not supported by networkx.

Arguments

Returns

nx.AnyGraph - a networkx graph instance.

write_graphology_json

Function serializing the given networkx graph as JSON, using the graphology format.

Note that both node keys and attribute names will be cast to string so they can safely be represented in JSON. As such in some cases (where your node keys and/or attribute names are not strings), this function will not be bijective when used with read_graphology_json.

Arguments

Returns

dict - JSON data