Consolidate a graph by removing intermediate nodes (nodes that occur exactly twice) and optionally dropping loop, duplicate, and singleton edges (leading to dead ends). This simplifies the network topology while preserving connectivity.
Usage
consolidate_graph(
graph_df,
directed = FALSE,
drop.edges = c("loop", "duplicate", "single"),
consolidate = TRUE,
by = NULL,
keep.nodes = NULL,
...,
recursive = "full",
verbose = TRUE
)Arguments
- graph_df
A data frame representing a graph with columns:
fromandto(node IDs), and optionally other columns to preserve. If coordinate columns (FX,FY,TX,TY) are present, they will be preserved and updated based on the consolidated node coordinates.- directed
Logical (default: FALSE). Whether the graph is directed.
- drop.edges
Character vector (default:
c("loop", "duplicate", "single")). Types of edges to drop:"loop"removes self-loops (edges where from == to),"duplicate"removes duplicate edges (same from-to pair),"single"removes edges leading to singleton nodes (nodes that occur only once). Set toNULLto keep all edges.- consolidate
Logical (default: TRUE). If TRUE, consolidates the graph by removing intermediate nodes (nodes that occur exactly twice) and merging connecting edges. If FALSE, only drops edges as specified in
drop.edges.- by
Link characteristics to preserve/not consolidate across, passed as a one-sided formula or character vector of column names. Typically this includes attributes like mode, type, or capacity to ensure that only edges with the same characteristics are consolidated.
- keep.nodes
Numeric vector (optional). Node IDs to preserve during consolidation, even if they occur exactly twice. Also used to preserve nodes when dropping singleton edges.
- ...
Arguments passed to
collap()for aggregation across consolidated edges. The defaults areFUN = fmeanfor numeric columns andcatFUN = fmodefor categorical columns. Select columns usingcolsor use argumentcustom = list(fmean = cols1, fsum = cols2, fmode = cols3)to map different columns to specific aggregation functions. It is highly recommended to weight the aggregation (usingw = ~ weight_col) by the length/cost of the edges.- recursive
One of
"none"/FALSE,"partial"(recurse on dropping single edges and consolidation but only aggregate once), or"full"/TRUE(recursively consolidates and aggregates the graph until no further consolidation is possible). This ensures that long chains of intermediate nodes are fully consolidated in a single call.- verbose
Logical (default: TRUE). Whether to print messages about dropped edges and consolidation progress.
Value
A data frame representing the consolidated graph with:
edge- Edge identifier (added as first column)All columns from
graph_df(aggregated if consolidation occurred), excludingfrom,to, and optionallyFX,FY,TX,TY(which are re-added if present in original)from,to- Node IDs (updated after consolidation)Coordinate columns (
FX,FY,TX,TY) if present in originalAttribute
"keep.edges"- Indices of original edges that were keptAttribute
"gid"- Edge group IDs mapping consolidated edges to original edges
Details
This function simplifies a graph by:
Dropping edges: Optionally removes self-loops, duplicate edges, and edges leading to singleton nodes (nodes that appear only once in the graph)
Consolidating nodes: Removes intermediate nodes (nodes that occur exactly twice) by merging the two edges connected through them into a single longer edge
Aggregating attributes: When edges are merged, attributes/columns are aggregated using
collap(). The default aggregation is mean for numeric columns and mode for categorical columns.Recursive consolidation: If
recursive = TRUE, the function continues consolidating until no more nodes can be consolidated, ensuring complete simplification
Consolidation is useful for simplifying network topology while preserving connectivity.
For example, if node B connects A->B and B->C, it will be removed and replaced with A->C.
With recursive = TRUE, long chains (A->B->C->D) are fully consolidated to A->D in
a single call.
For undirected graphs, the algorithm also handles cases where a node appears twice as either origin or destination (circular cases).
If coordinate columns (FX, FY, TX, TY) are present in the input,
they are preserved and updated based on the consolidated node coordinates from the original graph.
Examples
library(flownet)
library(sf)
# Convert segments to undirected graph
graph <- africa_segments |>
linestrings_from_graph() |>
linestrings_to_graph() |>
create_undirected_graph(FUN = "fsum")
# Get nodes to preserve (city/port locations)
nodes <- nodes_from_graph(graph, sf = TRUE)
nearest_nodes <- nodes$node[st_nearest_feature(africa_cities_ports, nodes)]
# Consolidate graph, preserving city nodes
graph_cons <- consolidate_graph(graph, keep = nearest_nodes, w = ~ passes)
#> Consolidating undirected graph graph with 11385 edges using full recursion
#> Initial node degrees:
#> 1 2 3 4 5 6 7 8 9 10
#> 125 5293 3240 395 102 31 4 2 1 1
#>
#> Dropped 44 loop edges
#> Dropped 11 edges leading to singleton nodes
#> Oriented 3431 undirected intermediate edges
#> Consolidated 5079 intermediate nodes
#> Oriented 10 undirected intermediate edges
#> Consolidated 10 intermediate nodes
#> Aggregated 11330 edges down to 6597 edges
#> Oriented 361 undirected intermediate edges
#> Consolidated 375 intermediate nodes
#> Oriented 8 undirected intermediate edges
#> Consolidated 8 intermediate nodes
#> Aggregated 6597 edges down to 6264 edges
#> Oriented 41 undirected intermediate edges
#> Consolidated 43 intermediate nodes
#> Oriented 1 undirected intermediate edges
#> Consolidated 1 intermediate nodes
#> Aggregated 6264 edges down to 6224 edges
#> Oriented 2 undirected intermediate edges
#> Consolidated 3 intermediate nodes
#> Aggregated 6224 edges down to 6221 edges
#> No nodes to consolidate, returning graph
#>
#> Consolidated undirected graph graph from 11385 edges to 6221 edges (54.6%)
#> Final node degrees:
#> 1 2 3 4 5 6 7 8
#> 114 236 3246 392 84 18 2 1
# Consolidated graph has fewer edges
c(original = nrow(graph), consolidated = nrow(graph_cons))
#> original consolidated
#> 11385 6221
