graph_tools package

Submodules

graph_tools.epidemics_utils module

graph_tools.epidemics_utils.simulate_infection_size(A, pi0, B=0.1, y=0.1, T=5, dt=0.05, ensure_probability_between_0_1=False, plot=True, plot_pit=False, figsize=(6, 6), return_pit=False, title_suffix=None)[source]

Compute the average degree connectivity of graph.

The average degree connectivity is the average nearest neighbor degree of nodes with degree k. For weighted graphs, an analogous measure can be computed using the weighted average neighbors degree defined in [1]_, for a node i, as

\[k_{nn,i}^{w} = \frac{1}{s_i} \sum_{j \in N(i)} w_{ij} k_j\]

where s_i is the weighted degree of node i, w_{ij} is the weight of the edge that links i and j, and N(i) are the neighbors of node i.

Parameters:
  • G (NetworkX graph) –

  • source ("in"|"out"|"in+out" (default:"in+out")) – Directed graphs only. Use “in”- or “out”-degree for source node.

  • target ("in"|"out"|"in+out" (default:"in+out") – Directed graphs only. Use “in”- or “out”-degree for target node.

  • nodes (list or iterable (optional)) – Compute neighbor connectivity for these nodes. The default is all nodes.

  • weight (string or None, optional (default=None)) – The edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1.

Returns:

d – A dictionary keyed by degree k with the value of average connectivity.

Return type:

dict

Raises:

NetworkXError – If either source or target are not one of ‘in’, ‘out’, or ‘in+out’. If either source or target is passed for an undirected graph.

Examples

>>> G = nx.path_graph(4)
>>> G.edges[1, 2]["weight"] = 3
>>> nx.average_degree_connectivity(G)
{1: 2.0, 2: 1.5}
>>> nx.average_degree_connectivity(G, weight="weight")
{1: 2.0, 2: 1.75}

See also

average_neighbor_degree

References

graph_tools.graph_path_utils module

graph_tools.graph_path_utils.example()[source]
graph_tools.graph_path_utils.perc_95(array)[source]
graph_tools.graph_path_utils.perc_k(array, k)[source]
graph_tools.graph_path_utils.shortest_path_distance_samples(G, n_samples=10000, nodes=None, source_nodes=None, target_nodes=None, edge_weight=None, nodes_weight=None, plot=False, title_prefix=None, log_scale=True, verbose=True, verbose_time=False, undirected=False, path_nodes=None, return_no_path_perc=False, **kwargs)[source]

Purpose: To compute the shortest path distance between nodes in a graph (may just need to sample the path)

Pseudocode: 1) Generate x samples of nodes in the graph as the sources and as the targets 2) Make sure the two nodes are not the same 3) Compute the shortest path and add to the list 4) Compute the mean and standard deviation 5) Plot a histogram if requested

graph_tools.graph_path_utils.shortest_path_distance_samples_mean(G, nodes=None, source_nodes=None, target_nodes=None, n_samples=40000, return_no_path_perc=False, **kwargs)[source]
graph_tools.graph_path_utils.shortest_path_distance_samples_mean_from_source(G, nodes=None, n_samples=40000, return_no_path_perc=False, **kwargs)[source]
graph_tools.graph_path_utils.shortest_path_distance_samples_mean_from_source_undirected(G, nodes=None, n_samples=40000, return_no_path_perc=False, **kwargs)[source]
graph_tools.graph_path_utils.shortest_path_distance_samples_mean_undirected(G, nodes=None, source_nodes=None, target_nodes=None, n_samples=40000, return_no_path_perc=False, **kwargs)[source]
graph_tools.graph_path_utils.shortest_path_distance_samples_perc_95_from_source(G, nodes=None, n_samples=40000, return_no_path_perc=False, **kwargs)[source]
graph_tools.graph_path_utils.shortest_path_distance_samples_perc_95_from_source_undirected(G, nodes=None, n_samples=40000, return_no_path_perc=False, **kwargs)[source]
graph_tools.graph_path_utils.shortest_path_distance_samples_stat(G, stat, n_samples=10000, nodes=None, source_nodes=None, target_nodes=None, edge_weight=None, nodes_weight=None, plot=False, title_prefix=None, log_scale=True, verbose=False, return_no_path_perc=False, **kwargs)[source]

graph_tools.graph_preprocessing module

graph_tools.graph_preprocessing.random_subgraph(G, size=None, p=None, verbose=False, with_replacement=False, return_largest_component=False)[source]

Purpose: To produce a random sample of graphs

graph_tools.graph_statistics module

Useful python functions main from networkx for estimating network stats

Notes: most of functions orignially from /old_modules/graph_statistics_and_simulations.py

packages that need to be installed: 1) ndlib 2) powerlaw

graph_tools.graph_statistics.adjacency_eig_vals_vecs(G, verbose=False)[source]
graph_tools.graph_statistics.average_clustering(G, **kwargs)[source]
local clustering: theoretically the fraction of traingles that actually exist /

all possible traingles in its neighborhood

How it is computed: 1) choose random node 2) choose 2 neighbors at random 3) check if traingle (if yes increment traingle counter) 4) Repeat and compute number with triangle_counter/ trials

graph_tools.graph_statistics.average_degree_connectivity(G, **kwargs)[source]

Returns dictionary that maps nodes with a certain degree to the average degree of the nearest neightbors

graph_tools.graph_statistics.average_shortest_path_length(G)[source]
graph_tools.graph_statistics.betweenness_centrality(G, normalized=True, endpoints=False, **kwargs)[source]
graph_tools.graph_statistics.critical_occupation_probability(G1)[source]
graph_tools.graph_statistics.degree_centrality(G, weight=None, nodes=None, normalize=True)[source]

Purpose: To compute the degree centrality for nodes (with an option for weighted)

graph_tools.graph_statistics.degree_distribution(G, nodes=None, percentile=None, degree_type='in_and_out', **kwargs)[source]

Purpose: To find the degree distribution of a graph (and optionally apply a node mask or restrict to a percentile of nodes)

graph_tools.graph_statistics.degree_distribution_analysis(G, graph_title='Graph (No Title)', degree_type_list=['in_and_out', 'in', 'out'], percentile=99.5, verbose=True, plot_distributions=True, **kwargs)[source]

Purpose: Will get statistics and possibly plot degree distribution data for a graph

graph_tools.graph_statistics.degree_distribution_mean(G, nodes=None, percentile=None, degree_type='in_and_out', **kwargs)[source]
graph_tools.graph_statistics.degree_distribution_mean_simple(G, **kwargs)[source]
graph_tools.graph_statistics.degree_distribution_median(G, nodes=None, percentile=None, degree_type='in_and_out', **kwargs)[source]
graph_tools.graph_statistics.degree_distribution_stat(G, nodes=None, stat='mean', percentile=None, degree_type='in_and_out', verbose=False, **kwargs)[source]
graph_tools.graph_statistics.degree_sequences(A, return_non_zero_degree_mask=True, filter_away_disconnected_nodes=False)[source]
graph_tools.graph_statistics.diameter(G)
graph_tools.graph_statistics.eig_vals_vecs_from_matrix(array, verbose=False)[source]

Eigenvectors are in the columns and organized in ascending order (so last one is the largest)

graph_tools.graph_statistics.eigenvector_centrality(G, weight=None, **kwargs)[source]
graph_tools.graph_statistics.eigenvector_centrality_numpy(G, weight=None, max_iter=50, tol=0)[source]

Compute the eigenvector centrality for the graph G.

Eigenvector centrality computes the centrality for a node based on the centrality of its neighbors. The eigenvector centrality for node $i$ is

\[Ax = \lambda x\]

where $A$ is the adjacency matrix of the graph G with eigenvalue $lambda$. By virtue of the Perron–Frobenius theorem, there is a unique and positive solution if $lambda$ is the largest eigenvalue associated with the eigenvector of the adjacency matrix $A$ ([2]).

Parameters:
  • G (graph) – A networkx graph

  • weight (None or string, optional (default=None)) – The name of the edge attribute used as weight. If None, all edge weights are considered equal. In this measure the weight is interpreted as the connection strength.

  • max_iter (integer, optional (default=100)) – Maximum number of iterations in power method.

  • tol (float, optional (default=1.0e-6)) – Relative accuracy for eigenvalues (stopping criterion). The default value of 0 implies machine precision.

Returns:

nodes – Dictionary of nodes with eigenvector centrality as the value.

Return type:

dictionary

Examples

>>> G = nx.path_graph(4)
>>> centrality = nx.eigenvector_centrality_numpy(G)
>>> print([f"{node} {centrality[node]:0.2f}" for node in centrality])
['0 0.37', '1 0.60', '2 0.60', '3 0.37']

See also

eigenvector_centrality, pagerank, hits

Notes

The measure was introduced by [1]_.

This algorithm uses the SciPy sparse eigenvalue solver (ARPACK) to find the largest eigenvalue/eigenvector pair.

For directed graphs this is “left” eigenvector centrality which corresponds to the in-edges in the graph. For out-edges eigenvector centrality first reverse the graph with G.reverse().

Raises:

NetworkXPointlessConcept – If the graph G is the null graph.

References

graph_tools.graph_statistics.find_pandemic_beta(graph, n_time_iterations=200, initial_infected_prop=0.05, gamma=0.01, beta_start=1e-05, current_jump=0.001, pandemic_threshold=0.7, pandemic_dev=0.01, max_iterations=50)[source]
graph_tools.graph_statistics.get_degree_distribution(G)[source]
graph_tools.graph_statistics.in_degree_mean(G, nodes=None, percentile=None, **kwargs)[source]
graph_tools.graph_statistics.in_degree_median(G, nodes=None, percentile=None, **kwargs)[source]
graph_tools.graph_statistics.inverse_average_shortest_path(G)[source]
graph_tools.graph_statistics.laplacian_eig_vals_vecs(G, verbose=False)[source]
graph_tools.graph_statistics.largest_adj_eigen_value(G1)[source]
graph_tools.graph_statistics.largest_connected_component_size(G, **kwargs)[source]
graph_tools.graph_statistics.largest_laplacian_eigen_value(G1)[source]
graph_tools.graph_statistics.local_clustering_coefficients(G, nodes=None, **kwargs)[source]
graph_tools.graph_statistics.longest_shortest_path(G)[source]
graph_tools.graph_statistics.min_weighted_vertex_cover_len(G, **kwargs)[source]

Returns length of Minimum number of vertices so that all edges are coincident on at least one vertice

graph_tools.graph_statistics.n_edges_empirical(G)[source]
graph_tools.graph_statistics.n_edges_from_A(A)[source]
graph_tools.graph_statistics.n_maximal_cliques(G, **kwargs)[source]

clique is just subset of vertices group where every vertex in group is connected (subgraph induced is complete)

Maximal clique = clique that cannot be extended by including one or more adjacent vertex (aka not subset of larger clique) Maximum clique = clique of the largest size in a graph clique number = number of vertices in a maxium clique

graph_tools.graph_statistics.n_reciprocal(A)[source]
graph_tools.graph_statistics.n_triangles(G)[source]
graph_tools.graph_statistics.node_attribute_mean(G, attribute, nodes=None, verbose=False)[source]
graph_tools.graph_statistics.node_attribute_median(G, attribute, nodes=None, verbose=False)[source]
graph_tools.graph_statistics.node_attribute_stat(G, attribute, stat='mean', nodes=None, verbose=False)[source]

Purpose: To get a summary statistic of a node attribute from all nodes in graph

Pseudocode: 1) Get the attribute for all nodes specified 2) Run the summary statistic

Ex: gstat.node_attribute_stat(

G = G_auto, attribute = “axon_skeletal_length”, stat = “mean”, verbose = True,

)

graph_tools.graph_statistics.node_connectivity(G, **kwargs)[source]
graph_tools.graph_statistics.number_connected_components(G, **kwargs)[source]
graph_tools.graph_statistics.out_degree_mean(G, nodes=None, percentile=None, **kwargs)[source]
graph_tools.graph_statistics.out_degree_median(G, nodes=None, percentile=None, **kwargs)[source]
graph_tools.graph_statistics.pandemic_beta(graph)[source]
graph_tools.graph_statistics.pandemic_beta_average(graph, average_iterations=5, n_time_iterations=200, initial_infected_prop=0.05, gamma=0.01, beta_start=1e-05, current_jump=0.001, pandemic_threshold=0.7, pandemic_dev=0.01, max_iterations=50, use_optimized_beta_finder=False)[source]
graph_tools.graph_statistics.power_exp_fit_ratio(G)[source]

Will return the loglikelihood ratio of the power and exponential graph R: Will be positive if power is more likely

negative exponential

p: significance of fit

graph_tools.graph_statistics.power_law_alpha(G)[source]
graph_tools.graph_statistics.power_law_alpha_sigma(G)[source]
graph_tools.graph_statistics.power_law_sigma(G)[source]
graph_tools.graph_statistics.random_degree_site_percolation(G, n_iterations=200)[source]
graph_tools.graph_statistics.reciprocity(A)[source]
graph_tools.graph_statistics.rich_club_transitivity(G)[source]

Computes the triad closure percentage between only those nodes with same or higher degree

class graph_tools.graph_statistics.run_options(directional=False, multiedge=False)[source]

Bases: object

__init__(directional=False, multiedge=False)[source]
graph_tools.graph_statistics.run_site_percolation(G, vertex_order_type='random', n_iterations=1000)[source]
graph_tools.graph_statistics.second_smallest_laplacian_eigen_value(G1)[source]
graph_tools.graph_statistics.size_maximum_clique(G, **kwargs)[source]

clique is just subset of vertices group where every vertex in group is connected (subgraph induced is complete)

Maximum clique = clique of the largest size in a graph clique number = number of vertices in a maxium clique

graph_tools.graph_statistics.smallest_adj_eigen_value(G1)[source]
graph_tools.graph_statistics.smallest_laplacian_eigen_value(G1)[source]
graph_tools.graph_statistics.sparsity(A, verbose=False)[source]
graph_tools.graph_statistics.top_heavy_percentage(G1, top_percentage=0.9)[source]
graph_tools.graph_statistics.transitivity(G, **kwargs)[source]

transitivity: Fraction of all possible traingles present in G Triad = 2 edges with a shared vertex

Transitivity = 3* # of triangles/ # of traids

graph_tools.graph_statistics.tree_number(G, **kwargs)[source]

Returns an approximation of the tree width of the graph (aka how tree-like it is): The lower the value the more tree-like the graph is

graph_tools.graph_statistics.trunc_power_stretched_exp_fit_ratio(G)[source]

Will return the loglikelihood ratio of the power and exponential graph R: Will be positive if power is more likely

negative exponential

p: significance of fit

graph_tools.graph_visualizations module

The parameters for plotting networkx graphs using nx.draw https://github.com/networkx/networkx/blob/main/networkx/drawing/nx_pylab.py#L584

Ggraph

A networkx graph

posdictionary, optional

A dictionary with nodes as keys and positions as values. If not specified a spring layout positioning will be computed. See networkx.drawing.layout for functions that compute node positions.

arrowsbool or None, optional (default=None)

If None, directed graphs draw arrowheads with ~matplotlib.patches.FancyArrowPatch, while undirected graphs draw edges via ~matplotlib.collections.LineCollection for speed. If True, draw arrowheads with FancyArrowPatches (bendable and stylish). If False, draw edges using LineCollection (linear and fast). For directed graphs, if True draw arrowheads. Note: Arrows will be the same color as edges.

arrowstylestr (default=’-|>’ for directed graphs)

For directed graphs, choose the style of the arrowsheads. For undirected graphs default to ‘-’ See matplotlib.patches.ArrowStyle for more options.

arrowsizeint or list (default=10)

For directed graphs, choose the size of the arrow head’s length and width. A list of values can be passed in to assign a different size for arrow head’s length and width. See matplotlib.patches.FancyArrowPatch for attribute mutation_scale for more info.

with_labelsbool (default=True)

Set to True to draw labels on the nodes.

axMatplotlib Axes object, optional

Draw the graph in the specified Matplotlib axes.

nodelistlist (default=list(G))

Draw only specified nodes

edgelistlist (default=list(G.edges()))

Draw only specified edges

node_sizescalar or array (default=300)

Size of nodes. If an array is specified it must be the same length as nodelist.

node_colorcolor or array of colors (default=’#1f78b4’)

Node color. Can be a single color or a sequence of colors with the same length as nodelist. Color can be string or rgb (or rgba) tuple of floats from 0-1. If numeric values are specified they will be mapped to colors using the cmap and vmin,vmax parameters. See matplotlib.scatter for more details.

node_shapestring (default=’o’)

The shape of the node. Specification is as matplotlib.scatter marker, one of ‘so^>v<dph8’.

alphafloat or None (default=None)

The node and edge transparency

cmapMatplotlib colormap, optional

Colormap for mapping intensities of nodes

vmin,vmaxfloat, optional

Minimum and maximum for node colormap scaling

linewidthsscalar or sequence (default=1.0)

Line width of symbol border

widthfloat or array of floats (default=1.0)

Line width of edges

edge_colorcolor or array of colors (default=’k’)

Edge color. Can be a single color or a sequence of colors with the same length as edgelist. Color can be string or rgb (or rgba) tuple of floats from 0-1. If numeric values are specified they will be mapped to colors using the edge_cmap and edge_vmin,edge_vmax parameters.

edge_cmapMatplotlib colormap, optional

Colormap for mapping intensities of edges

edge_vmin,edge_vmaxfloats, optional

Minimum and maximum for edge colormap scaling

stylestring (default=solid line)

Edge line style e.g.: ‘-’, ‘–’, ‘-.’, ‘:’ or words like ‘solid’ or ‘dashed’. (See matplotlib.patches.FancyArrowPatch: linestyle)

labelsdictionary (default=None)

Node labels in a dictionary of text labels keyed by node

font_sizeint (default=12 for nodes, 10 for edges)

Font size for text labels

font_colorstring (default=’k’ black)

Font color string

font_weightstring (default=’normal’)

Font weight

font_familystring (default=’sans-serif’)

Font family

labelstring, optional

Label for graph legend

kwdsoptional keywords

See networkx.draw_networkx_nodes(), networkx.draw_networkx_edges(), and networkx.draw_networkx_labels() for a description of optional keywords.

graph_tools.graph_visualizations.draw_G_with_color_array(G, colors, pos=None, vmin=None, vmax=None, cmap=<matplotlib.colors.LinearSegmentedColormap object>)[source]

Purpose: To graph the colors associated with an array referencing the individual nodes of a graph

ex: eigvals,eigvecs = laplacian_eig_vals_vecs(G) draw_G_with_color_array(

G, eigvecs[:,1]

)

graph_tools.graph_visualizations.graph_stats_dicts_to_plt_table(stats_list, graph_names_list)[source]

Psuedocode: To assemble all of the statistics in to a table that can be printed oout and put in a paper

Stats want to note of n_nodes n_edge unique edges Largest connected cluster Average Degree Average In Degree Average Out Degree

Iterate through all of the graph types 1) Compute a dictionary of the above mentioned stats

  1. Create a Dataframe from all the dictionaries

  2. Export a Nice looking copy of the DataFrame

graph_tools.graph_visualizations.plot_degree_distribution(G, degree_type='in_and_out', density=False, logscale=False, n_bins=50, bin_width=None, bin_max=None, bin_min=None, title=None, title_append=None, degree_distribution=None, print_degree_distribution_stats=True, fontsize_axes=20, fontsize_title=20, **kwargs)[source]

Purpose: To plot the degree distribution

graph_tools.graph_visualizations.plot_degree_distribution_simple(G, bins=20)[source]
graph_tools.graph_visualizations.plot_modularity_vs_spectral_partitioning(G)[source]
graph_tools.graph_visualizations.plot_partition(G, values, nodes=None, pos=None, class_colors=('blue', 'red'))[source]

graph_tools.netsci_utils module

Note: The motifs it finds, if it finds 3 nodes for a motif, those 3 nodes are not used in double counting of another motif (NO DOUBLE COUNTING)

graph_tools.netsci_utils.example_plot_3_node_motifs()[source]
graph_tools.netsci_utils.example_plot_3_node_motifs_comparison(frequency_syn, frequency_prox)[source]
graph_tools.netsci_utils.plot_all_triads(order=array([3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]), o=(1, 0), delta=(1, 0), r=0.4, node_colors=None, **kwargs)[source]

graph_tools.null_models module

class graph_tools.null_models.SynConnData(G, filter_away_disconnected_nodes=False, **kwargs)[source]

Bases: object

Purpose: from an adjacency matrix, compute the degree sequences and number of edges as a data-driven class

A_from_G(G)[source]
__init__(G, filter_away_disconnected_nodes=False, **kwargs)[source]
property edges
property edges_syn
property n_edges_raw
property n_edges_syn
property nodes
property nodes_syn
graph_tools.null_models.create_degree_sequence(n, sfunction=None, max_tries=200, **kwds)[source]

Attempt to create a valid degree sequence of length n using specified function sfunction(n,**kwds).

Parameters:
  • n (int) – Length of degree sequence = number of nodes

  • sfunction (function) – Function which returns a list of n real or integer values. Called as “sfunction(n,**kwds)”.

  • max_tries (int) – Max number of attempts at creating valid degree sequence.

Notes

Repeatedly create a degree sequence by calling sfunction(n,**kwds) until achieving a valid degree sequence. If unsuccessful after max_tries attempts, raise an exception.

For examples of sfunctions that return sequences of random numbers, see networkx.Utils.

Examples

>>> from networkx.utils import uniform_sequence, create_degree_sequence
>>> seq=create_degree_sequence(10,uniform_sequence)
graph_tools.null_models.erdos_renyi(n, p)[source]
graph_tools.null_models.erdos_renyi_random_location(n, p)[source]

Erdos Renyi graph that has random locations generated

graph_tools.null_models.linear_preferencial_attachment_random(n, m, p=0.5, G=None, seed=None)[source]

Will generate a rich get richer graph

preferential attachment but starts from a random graph

Example: G_LPA = linear_preferencial_attachment_random(20,5,0.3) nx.draw(G_LPA)

m = number of edges you want per node p = the probability of connection at the start

graph_tools.null_models.linear_preferncial_attachment_wheel(n, m, seed=None)[source]

Returns a random graph according to the Barabási–Albert preferential attachment model. Preferential Attachment but starts from spoke and wheel graph

A graph of n nodes is grown by attaching new nodes each with m edges that are preferentially attached to existing nodes with high degree.

Parameters:
  • n (int) – Number of nodes

  • m (int) – Number of edges to attach from a new node to existing nodes

  • seed (int, optional) – Seed for random number generator (default=None).

Returns:

G

Return type:

Graph

Raises:

NetworkXError – If m does not satisfy 1 <= m < n.

References

graph_tools.null_models.p_ER_from_number_of_nodes_edges(n_nodes, n_edges)[source]
graph_tools.null_models.power_law_sequence(n, alpha, xmin, before_xmin_func=None, perc_before_xmin=0, before_xmin_func_args={})[source]
graph_tools.null_models.random_power_law(n, alpha, xmin=3)[source]
graph_tools.null_models.random_sequence_law(n, **kwargs)[source]

Will create a random graph based on the sequence function you pass it and other parameters

Example use: random_sequence_law(n=10,sfunction=power_law_sequence, max_tries=50, xmin = 1,alpha=2)

graph_tools.null_models.random_tree_random_location(n)[source]
graph_tools.null_models.random_uniform(n, m)[source]
graph_tools.null_models.rename(newname)[source]
graph_tools.null_models.small_world(n, m, p)[source]
graph_tools.null_models.uniform_sequence(n, k_max)[source]

Return sample sequence of length n from a uniform distribution.

graph_tools.null_models.vertex_duplication(n, p, seed=None)[source]

Returns an undirected graph using the duplication-divergence model.

A graph of n nodes is created by duplicating the initial nodes and retaining edges incident to the original nodes with a retention probability p.

Parameters:
  • n (int) – The desired number of nodes in the graph.

  • p (float) – The probability for retaining the edge of the replicated node.

  • seed (int, optional) – A seed for the random number generator of random (default=None).

Returns:

G

Return type:

Graph

Raises:

NetworkXError – If p is not a valid probability. If n is less than 2.

Example: nx.draw(vertex_duplication(20,0.5))

graph_tools.null_models.vertex_duplication_with_complement(n, p, p2, seed=None)[source]

Returns an undirected graph using the duplication-divergence model. with probability p copies the edges of a node with probability (1-p makes random edge)

graph_tools.null_models.vertex_duplication_with_mutation(n, p, p2, seed=None)[source]

Returns an undirected graph using the duplication-divergence model. with probability p copies the edges of a node with probability (1-p makes random edge)

graph_tools.null_models.watts_strogatz_graph_smallworld_random_location(n, m, p)[source]

Module contents