Preprocessing

Graph Cuts

Constants

graspologic.preprocessing.LARGER_THAN_INCLUSIVE

Cut any edge or node >= the cut_threshold

graspologic.preprocessing.LARGER_THAN_EXCLUSIVE

Cut any edge or node > the cut_threshold

graspologic.preprocessing.SMALLER_THAN_INCLUSIVE

Cut any edge or node <= the cut_threshold

graspologic.preprocessing.SMALLER_THAN_EXCLUSIVE

Cut any edge or node < the cut_threshold
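The four constants differ only in the comparison applied against cut_threshold. The sketch below makes that concrete with a lookup table. CUT_COMPARATORS and should_cut are hypothetical names introduced for illustration and are not part of graspologic. The sketch assumes the conventional reading of the names, in which an "inclusive" cut also removes values equal to the threshold.

```python
import operator

# Hypothetical lookup table (not graspologic's code): maps each cut_process
# string to the comparison "value <op> cut_threshold" that triggers a cut.
# Assumption: "inclusive" means values equal to the threshold are cut too.
CUT_COMPARATORS = {
    "larger_than_inclusive": operator.ge,
    "larger_than_exclusive": operator.gt,
    "smaller_than_inclusive": operator.le,
    "smaller_than_exclusive": operator.lt,
}


def should_cut(value, cut_threshold, cut_process):
    """Return True when value would be cut under the given cut_process."""
    return CUT_COMPARATORS[cut_process](value, cut_threshold)


values = [1, 5, 10]
# For each process, list which of the sample values would be cut at threshold 5.
cut_at_five = {
    process: [v for v in values if should_cut(v, 5, process)]
    for process in CUT_COMPARATORS
}
```

Only the value equal to the threshold (5) changes sides between the inclusive and exclusive variants.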

Classes

class graspologic.preprocessing.DefinedHistogram

A named tuple that holds a histogram together with the edges of its bins. bin_edges is one element longer than histogram because it contains the minimum and maximum edges as well as every edge in between.

Create new instance of DefinedHistogram(histogram, bin_edges)

histogram: ndarray

Alias for field number 0

bin_edges: ndarray

Alias for field number 1

static __new__(_cls, histogram, bin_edges)

Create new instance of DefinedHistogram(histogram, bin_edges)

Parameters:

histogram (ndarray)
bin_edges (ndarray)

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=sys.maxsize, /)

Return first index of value.

Raises ValueError if the value is not present.
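The relationship between the two fields can be illustrated with a stand-in named tuple. HistogramSketch is a hypothetical class written for this example. It mirrors the two documented fields but uses plain lists instead of ndarray, and its sample values are made up.

```python
from typing import List, NamedTuple


class HistogramSketch(NamedTuple):
    """Hypothetical stand-in with the same two fields as DefinedHistogram."""

    histogram: List[int]
    bin_edges: List[float]


# Three bins need four edges: the minimum, the maximum, and two in between.
sketch = HistogramSketch(histogram=[3, 1, 2], bin_edges=[0.0, 1.0, 2.0, 3.0])

# Fields are reachable by name, by position, or by tuple unpacking.
counts, edges = sketch

# Pair each count with the (lower, upper) edges of its bin.
bins = [(edges[i], edges[i + 1], counts[i]) for i in range(len(counts))]
```

Because the type is a tuple, count and index operate on the two fields themselves, not on the values stored inside the arrays.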

Functions

graspologic.preprocessing.cut_edges_by_weight(graph, cut_threshold, cut_process, weight_attribute='weight', prune_isolates=False)

Returns a copy of the graph from which every edge whose weight meets the cut_process condition relative to cut_threshold has been removed. The original graph is not modified.

Parameters:

graph: Union[nx.Graph, nx.DiGraph]

The graph that will be copied and pruned.

cut_threshold: Union[int, float]

The threshold for making cuts based on weight.

cut_process: str

Describes how we should make the cut; cut all edges larger or smaller than the cut_threshold, and whether exclusive or inclusive. Allowed values are

larger_than_inclusive
larger_than_exclusive
smaller_than_inclusive
smaller_than_exclusive

weight_attribute: str

The weight attribute name in the edge's data dictionary. Default is weight.

prune_isolates: bool

If True, remove any vertex that no longer has an edge. Note that this only prunes vertices that lost edges in the cut; any vertex that was already isolated before the cut is retained. Default is False.

Returns:

Union[nx.Graph, nx.DiGraph]: Pruned copy of the same type of graph provided

Parameters:

graph (Graph | DiGraph)
cut_threshold (int | float)
cut_process (str)
weight_attribute (str)
prune_isolates (bool)

Return type:

Graph | DiGraph

Notes

Edges without a weight_attribute field will be excluded from these cuts. Enable logging to view any messages about edges without weights.
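The pruning rules described above (unweighted edges are left alone, and prune_isolates only removes vertices that lost edges in the cut) can be sketched on a plain edge list. This is an illustration of the documented behavior, not graspologic's implementation. The function cut_edges_sketch, the should_cut predicate standing in for cut_process and cut_threshold, and the sample data are all hypothetical.

```python
def cut_edges_sketch(nodes, edges, should_cut, weight_attribute="weight",
                     prune_isolates=False):
    """Return (kept_nodes, kept_edges) after applying a weight-based cut.

    edges is a list of (u, v, data) tuples, where data is a dict.
    should_cut is a predicate on the weight (hypothetical stand-in for
    the cut_process / cut_threshold pair).
    """
    kept_edges = []
    touched = set()  # vertices that lost at least one edge in the cut
    for u, v, data in edges:
        if weight_attribute in data and should_cut(data[weight_attribute]):
            touched.update((u, v))
        else:
            # Edges without the weight attribute are never cut.
            kept_edges.append((u, v, data))
    kept_nodes = set(nodes)
    if prune_isolates:
        still_connected = {n for u, v, _ in kept_edges for n in (u, v)}
        # Only vertices isolated by the cut are removed; vertices that
        # were isolated beforehand are never in `touched`.
        kept_nodes -= touched - still_connected
    return kept_nodes, kept_edges


nodes = ["a", "b", "c", "d", "e"]  # "e" has no edges to begin with
edges = [
    ("a", "b", {"weight": 1}),
    ("b", "c", {"weight": 9}),
    ("c", "d", {"weight": 7}),
    ("d", "a", {}),  # no weight: excluded from the cut
]
kept_nodes, kept_edges = cut_edges_sketch(
    nodes, edges, should_cut=lambda w: w > 5, prune_isolates=True
)
```

Here "c" is removed because both of its edges were cut, while "e" survives because it was isolated before the cut.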

graspologic.preprocessing.cut_vertices_by_betweenness_centrality(graph, cut_threshold, cut_process, num_random_samples=None, normalized=True, weight_attribute='weight', include_endpoints=False, random_seed=None)

Given a graph, a cut_threshold, and a cut_process, returns a copy of the graph from which every vertex whose betweenness centrality meets the cut_process condition relative to cut_threshold has been removed.

The betweenness centrality calculation can take advantage of networkx's implementation of randomized sampling by providing num_random_samples (or k, in networkx betweenness_centrality nomenclature).

Parameters:

graph: Union[nx.Graph, nx.DiGraph]

The graph that will be copied and pruned.

cut_threshold: Union[int, float]

The threshold for making cuts based on betweenness centrality.

cut_process: str

Describes how we should make the cut; cut all vertices whose betweenness centrality is larger or smaller than the cut_threshold, and whether exclusive or inclusive. Allowed values are

larger_than_inclusive
larger_than_exclusive
smaller_than_inclusive
smaller_than_exclusive

num_random_samples: Optional[int]

Number of vertices to sample when estimating betweenness. num_random_samples should be <= len(graph.nodes). The larger num_random_samples is, the better the approximation. Default is None.

normalized: bool

If True, the betweenness values are normalized by \(2/((n-1)(n-2))\) for undirected graphs, and \(1/((n-1)(n-2))\) for directed graphs, where n is the number of vertices in the graph. Default is True.

weight_attribute: Optional[str]

If None, all edge weights are considered equal. Otherwise holds the name of the edge attribute used as weight. Default is weight.

include_endpoints: bool

If True, include the endpoints in the shortest path counts. Default is False.

random_seed: Optional[Union[int, random.Random, np.random.RandomState]]

Random seed or preconfigured random instance to be used for selecting random samples. Only used if num_random_samples is set. None will generate a new random state. Specifying a random state will provide consistent results between runs.

Returns:

Union[nx.Graph, nx.DiGraph]: Pruned copy of the same type of graph provided

Parameters:

graph (Graph | DiGraph)
cut_threshold (int | float)
cut_process (str)
num_random_samples (int | None)
normalized (bool)
weight_attribute (str | None)
include_endpoints (bool)
random_seed (int | Random | RandomState | None)

Return type:

Graph | DiGraph
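To show how the normalization factor and the threshold interact, the sketch below computes betweenness centrality by brute force on a four-vertex path and then applies a cut. It is an illustration only: it handles unweighted, undirected graphs, does no sampling, and is neither networkx's nor graspologic's code. The betweenness function, the adjacency-dict input format, and the threshold of 0.5 are assumptions made for the example.

```python
from collections import deque


def betweenness(adjacency, normalized=True):
    """Brandes-style betweenness for an unweighted, undirected graph."""
    nodes = list(adjacency)
    scores = {node: 0.0 for node in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from source.
        distance = {source: 0}
        path_count = {node: 0 for node in nodes}
        path_count[source] = 1
        predecessors = {node: [] for node in nodes}
        visit_order = []
        queue = deque([source])
        while queue:
            current = queue.popleft()
            visit_order.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[current] + 1:
                    path_count[neighbor] += path_count[current]
                    predecessors[neighbor].append(current)
        # Accumulate dependencies from the farthest vertices back to source.
        dependency = {node: 0.0 for node in nodes}
        for node in reversed(visit_order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    n = len(nodes)
    # Each undirected pair was counted from both ends, so halve the totals.
    scale = 0.5
    if normalized and n > 2:
        scale *= 2 / ((n - 1) * (n - 2))
    return {node: value * scale for node, value in scores.items()}


# Path graph a - b - c - d, given as an adjacency dict.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
centrality = betweenness(path)

# Cut every vertex whose centrality is strictly greater than the threshold.
cut_threshold = 0.5
remaining = [node for node in path if not centrality[node] > cut_threshold]
```

With n = 4 the undirected factor is 2/((4-1)(4-2)) = 1/3, so the two interior vertices, each lying on two shortest paths, score 2/3 and are cut, while the endpoints score 0 and remain.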