External BFS
Run unweighted shortest-path queries through NetworkX in process, instead of through Neo4j. Implemented for completeness, but excluded from the default benchmark — here is the honest reason.
This module regressed on the real dataset-backed runs. At SF1 the cost of fetching up to 20,000 edges from Neo4j into a Python NetworkX graph dominates the cost of the BFS itself, and Neo4j's native shortest-path implementation is already faster. The module remains in the codebase as a reference for future hybrid-execution work, but every benchmark configuration in run_benchmark.py sets EXTERNAL_BFS=False.
What the module does
If a Cypher query matches a regex for unweighted shortest paths and does not mention weight, cost, or distance, the module attempts to run it externally. It pulls a bounded set of edges from Neo4j into a networkx.DiGraph, runs NetworkX's unweighted shortest-path (BFS) algorithm, and caches the result under the dedicated bfs: prefix.
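As a rough sketch of how a caller would wire this into the read path (the names bfs_module, cache_client, and neo4j_client are illustrative, not taken from the codebase, and the actual cache backend is not shown in this section):

# Hypothetical wiring; only the metadata keys (bfs_override, bfs_cache_key)
# come from the module's run() contract shown below.
async def execute_read(query: str, params: dict, cache_key: str):
    if bfs_module.is_candidate(query):
        result, meta = await bfs_module.run(query, params, cache_key)
        if meta["bfs_override"]:
            # Store under the dedicated bfs: prefix so external BFS results
            # never collide with ordinary Cypher cache entries.
            await cache_client.set(meta["bfs_cache_key"], result)
            return result
    # Non-candidates and refused candidates go to Neo4j as usual.
    return await neo4j_client.execute_cypher(query, params)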
Candidate detection
The is-this-a-BFS-candidate check is a regex over the raw Cypher text. It matches three patterns: an explicit shortestPath or allShortestPaths call, or any variable-length right-pointing or left-pointing relationship. The negative regex rejects any query that mentions edge weights.
import re

SHORTEST_PATH_RE = re.compile(
    r"(shortestPath|allShortestPaths|-\[[^\]]*\*[^\]]*\]->|<-\[[^\]]*\*[^\]]*\]-)",
    re.IGNORECASE,
)
WEIGHT_HINT_RE = re.compile(r"weight|cost|distance", re.IGNORECASE)

def is_candidate(self, query: str) -> bool:
    return bool(SHORTEST_PATH_RE.search(query)) and not bool(WEIGHT_HINT_RE.search(query))
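To illustrate what the regexes accept and reject, assuming an instance of the module named bfs (the Cypher strings are illustrative, not taken from the benchmark workload):

# Explicit shortestPath call with no weight hint: candidate.
bfs.is_candidate("MATCH p = shortestPath((a:Person)-[:KNOWS*]-(b:Person)) RETURN p")  # True

# Variable-length, right-pointing relationship: candidate.
bfs.is_candidate("MATCH (a)-[:KNOWS*1..3]->(b) RETURN b")  # True

# Mentions a weight property, so the negative regex rejects it.
bfs.is_candidate("MATCH p = shortestPath((a)-[*]-(b)) WHERE a.weight > 0 RETURN p")  # False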
Running the BFS
The flow is:
- Validate the request has a source_id in params.
- Pull edges from Neo4j with a node-id projection, capped at BFS_MAX_EDGES (default 20,000).
- Refuse if the cap was hit, since the result would be incomplete.
- Build a NetworkX DiGraph by inserting every edge.
- Refuse if the node count exceeds BFS_MAX_NODES (default 5,000).
- Run shortest_path if both source and target are provided, otherwise single_source_shortest_path for all reachable nodes.
async def run(self, query, params, cache_key):
    source = params.get("source_id")
    target = params.get("target_id")
    if source is None:
        return None, {"bfs_override": False, "bfs_reason": "missing_source"}
    start = time.perf_counter()
    # Bounded edge fetch: project node ids only, never full nodes.
    edges = await self.neo4j_client.execute_cypher(
        """
        MATCH (a)-[r]->(b)
        WHERE a.id IS NOT NULL AND b.id IS NOT NULL
        RETURN a.id AS source, b.id AS target
        LIMIT $limit
        """,
        {"limit": self.settings.BFS_MAX_EDGES},
    )
    # Hitting the cap means the edge set may be truncated, so the BFS
    # result would be unreliable; fall back to Neo4j instead.
    if len(edges) >= self.settings.BFS_MAX_EDGES:
        return None, {"bfs_override": False, "bfs_reason": "edge_limit_exceeded"}
    graph = nx.DiGraph()
    for edge in edges:
        graph.add_edge(edge["source"], edge["target"])
    if graph.number_of_nodes() > self.settings.BFS_MAX_NODES:
        return None, {"bfs_override": False, "bfs_reason": "node_limit_exceeded"}
    try:
        if target is not None:
            # Single source-target query: one path.
            path = nx.shortest_path(graph, source=source, target=target)
            result = [{"path": path, "length": max(len(path) - 1, 0)}]
        else:
            # No target: hop counts and paths to every reachable node.
            paths = nx.single_source_shortest_path(graph, source)
            result = [{"target": node, "path": path, "length": len(path) - 1} for node, path in paths.items()]
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        result = []
The speedup metric
The module records a Prometheus histogram of the wall-clock latency and an estimated speedup against the assumed Neo4j cost. The estimate uses the per-request estimated_neo4j_ms param (the same one used by the jitter module's XFetch rule), with a floor of 1.0× so that runs where the external BFS is slower than the assumed Neo4j cost report no speedup rather than a fractional one.
    elapsed = time.perf_counter() - start
    BFS_EXTERNAL_LATENCY.observe(elapsed)
    estimated_speedup = max(
        params.get("estimated_neo4j_ms", 100.0) / max(elapsed * 1000, 1),
        1.0,
    )
    BFS_SPEEDUP.observe(estimated_speedup)
    return result, {
        "bfs_override": True,
        "bfs_reason": "executed",
        "bfs_cache_key": f"{self.settings.BFS_PREFIX}{cache_key}",
    }
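For illustration: with the default estimated_neo4j_ms of 100.0 and an external BFS that finishes in 40 ms, the recorded speedup is 100 / 40 = 2.5×; if the BFS instead takes 250 ms, the raw ratio of 0.4 is clamped to the 1.0 floor, so the histogram never records a value below 1.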
Why it regressed on the real dataset
Three things compound to make the module unprofitable at SF1:
- Edge fetch is the bottleneck. Pulling 20,000 edges through the Bolt driver and round-tripping them as Python dicts takes longer than Neo4j's native shortest-path query, which never leaves the database.
- Caching does not amortise enough. The benchmark workload issues many distinct shortest-path queries, so a freshly fetched edge set rarely gets reused before it expires.
- NetworkX is single-threaded pure Python. Even when the edge fetch is hot in cache, the BFS itself is slower than Neo4j's native shortest-path implementation for graphs of this size.
The module would be more interesting on a workload that issues many shortest-path queries against a small, stable subgraph: a small office hierarchy, a product catalog with bounded depth, or a frozen graph snapshot. None of those match LDBC SNB SF1.
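The first point is straightforward to verify with a small profiling helper. The sketch below is hypothetical and not part of the benchmark code; it only reuses the neo4j_client.execute_cypher call and the node-id projection from the run() excerpt above.

import time
import networkx as nx

async def profile_external_bfs(neo4j_client, source, target, max_edges=20_000):
    # Phase 1: pull the edge list over Bolt as Python dicts.
    t0 = time.perf_counter()
    edges = await neo4j_client.execute_cypher(
        "MATCH (a)-[]->(b) WHERE a.id IS NOT NULL AND b.id IS NOT NULL "
        "RETURN a.id AS source, b.id AS target LIMIT $limit",
        {"limit": max_edges},
    )
    fetch_ms = (time.perf_counter() - t0) * 1000

    # Phase 2: build the DiGraph and run the BFS entirely in-process.
    t1 = time.perf_counter()
    graph = nx.DiGraph()
    for edge in edges:
        graph.add_edge(edge["source"], edge["target"])
    try:
        nx.shortest_path(graph, source=source, target=target)
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        pass
    bfs_ms = (time.perf_counter() - t1) * 1000
    return fetch_ms, bfs_ms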
How the exclusion is enforced
The benchmark runner explicitly disables the flag in every run, including baseline and all_enabled:
DISABLED_BENCHMARK_FLAGS = {
    "EXTERNAL_BFS": False,
}

def build_run_matrix():
    # every entry merges DISABLED_BENCHMARK_FLAGS into its toggle dict
    runs = [("baseline", {**DISABLED_BENCHMARK_FLAGS, **{flag: False for flag in flags}})]
    ...
    runs.append(("all_enabled", {**DISABLED_BENCHMARK_FLAGS, **{flag: True for flag in flags}}))
    return runs
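A quick sanity check over the run matrix, sketched on the assumption that build_run_matrix is importable from run_benchmark (this is not an excerpt from the test suite):

from run_benchmark import build_run_matrix

# Every configuration, including all_enabled, should carry the disabled flag.
for name, toggles in build_run_matrix():
    assert toggles["EXTERNAL_BFS"] is False, f"{name} enables EXTERNAL_BFS"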
This means the published "all_enabled" numbers in this thesis are honest: nothing is hidden behind a faulty optimization. The module exists, has a unit test (tests/test_external_bfs.py), and is available for anyone who wants to revive it for a more amenable workload.
External execution of graph algorithms in front of a graph database is a tempting idea, but it pays off only when the algorithm is structurally different from what the database can do natively (think: an in-memory PageRank that runs once per session and is reused), or when the database round-trip dominates and the in-memory copy is already amortised. Pure unweighted BFS is neither.