KG Cache Middleware

A FastAPI middleware that sits between clients and Neo4j, with a Redis-backed cache and five pluggable graph-aware optimizers. This guide walks through the actual code, module by module.

Python 3.12 · FastAPI · Neo4j 5.x · Redis · Prometheus

What this is

The middleware exposes a single endpoint, POST /query, that takes a Cypher string plus parameters and returns the result. Behind that endpoint sits a pipeline of optimizers, each independently flag-controlled, that decide whether the query can be answered from cache, whether its result deserves a slot, when a hot key should be refreshed early, and what is likely to be asked next.
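As a rough illustration of the request shape, the sketch below builds (but does not send) a POST /query call using only the standard library. The JSON field names `query` and `params`, the host, and the port are assumptions made for this example; the repository's request model is the authority on the real schema.

```python
import json
import urllib.request


def build_query_request(cypher, params, base_url="http://localhost:8000"):
    """Build a POST /query request. Field names and base_url are assumed."""
    body = json.dumps({"query": cypher, "params": params}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_query_request(
    "MATCH (p:Person {id: $id}) RETURN p.name",
    {"id": 42},
)
# Passing req to urllib.request.urlopen would send it to a running instance.
```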

This site is a guided tour through the actual implementation. Code blocks on the module pages are taken from the repository as it ships, with annotations that explain why each piece exists. Numerical claims are tied back to the benchmark CSVs in benchmark_results/.

Reading order

If you are new to the project, work through the pages in the order below. Each page is self-contained but builds on earlier concepts.

01 · Foundation

Architecture

Request lifecycle from FastAPI entry to Redis hit, with the orchestration in middleware/cache.py.

main.py cache.py
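The hit-or-miss core of that lifecycle can be sketched as a plain cache-aside lookup. This is a minimal illustration, not the orchestration in middleware/cache.py: `InMemoryTTLCache` is a dict-based stand-in for Redis, and `handle_query` and `run_on_neo4j` are hypothetical names.

```python
import time


class InMemoryTTLCache:
    """Dict-based stand-in for Redis so the sketch runs without a server."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= time.monotonic():
            del self._store[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def handle_query(cache, key, run_on_neo4j, ttl_seconds=60.0):
    """Return (result, source), where source is 'cache' or 'neo4j'."""
    cached = cache.get(key)
    if cached is not None:
        return cached, "cache"
    result = run_on_neo4j()  # hypothetical callable that executes the Cypher
    cache.set(key, result, ttl_seconds)
    return result, "neo4j"
```

In this sketch a result of `None` is never treated as a hit; a real implementation would need a sentinel to cache empty results.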

02 · Foundation

Query Normalization

How Cypher text and parameters are reduced to a deterministic cache key, shared by every optimizer.

query_parser.py key_generator.py
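One common way to get a deterministic key is to collapse whitespace in the query text, serialize the parameters with sorted keys, and hash the result. The sketch below assumes that approach; the function name, the `kg:` prefix, and the choice of SHA-256 are illustrative and may not match key_generator.py.

```python
import hashlib
import json
import re


def make_cache_key(cypher, params):
    """Derive a stable key from query text and parameters (illustrative)."""
    # Collapse runs of whitespace so layout differences do not change the key.
    normalized = re.sub(r"\s+", " ", cypher).strip()
    # Sorted keys make the serialization independent of dict insertion order.
    payload = json.dumps(
        {"q": normalized, "p": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return "kg:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = make_cache_key("MATCH (n:Person)\n  RETURN n", {"a": 1, "b": 2})
key_b = make_cache_key("MATCH (n:Person) RETURN n", {"b": 2, "a": 1})
```

Here `key_a` and `key_b` are equal. Note that naive whitespace collapsing also rewrites whitespace inside string literals, which a real parser would need to leave alone.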

03 · Optimizer

Overlapping Subqueries

The canonical-path-signature scheme that lets two different Cypher queries reuse the same cached result.

strongest module · overlapping_subqueries.py
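To make the idea of a path signature concrete, here is a deliberately simplified sketch: it strips variable names from a path fragment so that fragments differing only in naming hash to the same value. The regular expressions and the function name are assumptions for illustration; they ignore property maps and other Cypher syntax that overlapping_subqueries.py presumably handles.

```python
import hashlib
import re

# Node pattern: optional variable name, optional single label, no properties.
_NODE = re.compile(r"\(\s*\w*\s*(:\w+)?\s*\)")
# Relationship pattern: optional variable, optional type, optional *min..max.
_REL = re.compile(r"\[\s*\w*\s*(:\w+)?\s*(\*\d*(?:\.\.\d*)?)?\s*\]")


def path_signature(fragment):
    """Hash a path fragment with variable names removed (simplified)."""
    canon = _NODE.sub(lambda m: f"({m.group(1) or ''})", fragment)
    canon = _REL.sub(
        lambda m: f"[{m.group(1) or ''}{m.group(2) or ''}]", canon
    )
    canon = re.sub(r"\s+", "", canon)
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()[:16]


sig_1 = path_signature("(a:Person)-[:KNOWS*1..3]->(b:Person)")
sig_2 = path_signature("(x:Person)-[r:KNOWS*1..3]->(y:Person)")
sig_3 = path_signature("(a:Person)-[:KNOWS*1..2]->(b:Person)")
```

`sig_1` and `sig_2` match because only the variable names differ, while `sig_3` differs because the hop range changes.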

04 · Optimizer

Frequency-Aware Cache

TinyLFU-inspired admission with a sub-linear bounded TTL formula. Stops one-time misses from polluting the cache.

frequency_aware.py
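The admission and TTL ideas can be sketched as follows. This is an assumption-heavy illustration: a plain `Counter` stands in for the compact frequency sketch that TinyLFU uses, the square root is just one possible sub-linear function, and every threshold and name here is hypothetical rather than taken from frequency_aware.py.

```python
import math
from collections import Counter


class FrequencyGate:
    """Admit keys seen often or expensive to compute; TTL grows sub-linearly."""

    def __init__(self, min_hits=2, cost_threshold_ms=100.0,
                 base_ttl=30.0, max_ttl=600.0):
        self.min_hits = min_hits
        self.cost_threshold_ms = cost_threshold_ms
        self.base_ttl = base_ttl
        self.max_ttl = max_ttl
        self.counts = Counter()  # stand-in for a count-min sketch

    def record(self, key):
        self.counts[key] += 1
        return self.counts[key]

    def should_admit(self, key, cost_ms):
        # Either condition is enough: repeated access or an expensive query.
        return (self.counts[key] >= self.min_hits
                or cost_ms >= self.cost_threshold_ms)

    def ttl(self, key):
        # sqrt grows more slowly than the access count; max_ttl bounds it.
        hits = max(1, self.counts[key])
        return min(self.max_ttl, self.base_ttl * math.sqrt(hits))


gate = FrequencyGate()
gate.record("q1")
first_seen_cheap = gate.should_admit("q1", cost_ms=5.0)    # one hit, cheap
first_seen_costly = gate.should_admit("q1", cost_ms=500.0)  # one hit, costly
```

With these assumed defaults a key seen once is rejected unless it was expensive to compute, which is how one-time misses are kept out.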

05 · Optimizer

Adaptive Prefetch

A first-order Markov model over per-session query patterns, plus a one-hop neighbour-expansion injection.

adaptive_prefetch.py
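A first-order Markov model only needs transition counts from the previous pattern to the current one. The sketch below shows that idea under assumed names (`MarkovPredictor`, `observe`, `predict`, `min_prob`); it omits the one-hop neighbour expansion and is not the code in adaptive_prefetch.py.

```python
from collections import Counter, defaultdict


class MarkovPredictor:
    """Count pattern-to-pattern transitions, tracked per session."""

    def __init__(self):
        self.transitions = defaultdict(Counter)  # prev pattern -> next counts
        self.last_seen = {}                      # session id -> last pattern

    def observe(self, session_id, pattern):
        prev = self.last_seen.get(session_id)
        if prev is not None:
            self.transitions[prev][pattern] += 1
        self.last_seen[session_id] = pattern

    def predict(self, pattern, min_prob=0.5):
        """Return the most likely next pattern, or None if too uncertain."""
        followers = self.transitions.get(pattern)
        if not followers:
            return None
        candidate, count = followers.most_common(1)[0]
        probability = count / sum(followers.values())
        return candidate if probability >= min_prob else None


model = MarkovPredictor()
for pattern in ["A", "B", "A", "B", "A", "C"]:
    model.observe("session-1", pattern)
next_after_a = model.predict("A")  # A was followed by B twice and C once
```

A prefetcher would warm the cache for the predicted pattern only when the estimated probability clears the threshold, which limits wasted work on weak predictions.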

06 · Optimizer

Jitter & Stampede

Probabilistic XFetch refresh plus a topology-sensitive variant that scales with node degree.

novel · jitter_stampede.py
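The XFetch rule refreshes a key early with a probability that rises as expiry approaches: recompute when `now - delta * beta * ln(u) >= expiry`, where `delta` is the last recompute time and `u` is uniform in (0, 1]. The sketch below implements that published rule; `degree_scaled_beta` is only a guess at what a degree-sensitive variant could look like, since the scaling used in jitter_stampede.py is not shown here.

```python
import math
import random
import time


def should_refresh_early(expiry, delta, beta=1.0, now=None, rng=random.random):
    """XFetch check: True means recompute before the entry expires."""
    now = time.time() if now is None else now
    u = 1.0 - rng()  # rng() is in [0, 1), so u is in (0, 1] and log is safe
    # log(u) <= 0, so the subtraction pushes the effective "now" forward.
    return now - delta * beta * math.log(u) >= expiry


def degree_scaled_beta(degree, base_beta=1.0, scale=0.1):
    """Hypothetical scaling: higher-degree nodes refresh more eagerly."""
    return base_beta * (1.0 + scale * math.log1p(degree))


# One second before expiry, with a 2-second recompute cost and u = 0.5:
early = should_refresh_early(expiry=100.0, delta=2.0, now=99.0,
                             rng=lambda: 0.5)
```

A larger `beta` makes early refresh more likely, so scaling it with node degree spends the extra refreshes on the keys whose recomputation touches the most of the graph.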

07 · Optimizer

External BFS

NetworkX-backed shortest-path computation. Implemented but excluded from the default benchmark; this page explains why.

excluded · external_bfs.py
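For readers unfamiliar with the technique, an unweighted shortest path is a breadth-first search over an adjacency structure held outside the database. To stay dependency-free, the sketch below uses a pure-Python BFS as a stand-in for the NetworkX call the module relies on; the function name and the dict-of-lists graph format are assumptions.

```python
from collections import deque


def bfs_shortest_path(adjacency, source, target):
    """Return one shortest path as a list of nodes, or None if unreachable."""
    if source == target:
        return [source]
    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour in parent:
                continue
            parent[neighbour] = node
            if neighbour == target:
                # Walk parent links back to the source, then reverse.
                path = [neighbour]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
path = bfs_shortest_path(graph, "a", "e")
```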

08 · Evaluation

Benchmarking

The run matrix, dataset setup (LDBC SNB SF1, SSCA-inspired), and how the headline numbers are produced.

run_benchmark.py ssca_workload.py

Five knobs at a glance

Every optimizer has a single feature flag in config.yaml. You can enable or disable any subset to ablate behaviour.

Flag	Module	What it does	Default
`JITTER_STAMPEDE`	`jitter_stampede.py`	Single-flight lock + XFetch early refresh, plain or topology-sensitive.	on
`FREQUENCY_AWARE`	`frequency_aware.py`	Admit on access count or compute cost; assign sub-linear TTL.	on
`ADAPTIVE_PREFETCH`	`adaptive_prefetch.py`	Predict next likely query per session; warm one-hop neighbours.	on
`OVERLAPPING_SUBQUERIES`	`overlapping_subqueries.py`	Hash variable-length path fragments into reusable cache keys.	on
`EXTERNAL_BFS`	`external_bfs.py`	Run unweighted shortest paths through NetworkX rather than Neo4j.	off (in benchmark)
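One way to organise an ablation over these flags is to generate a baseline, one run per single optimizer, and an all-enabled run. The sketch below does that with the flag names from the table; the run names, the helper, and the choice to exclude `EXTERNAL_BFS` are illustrative assumptions, and the run matrix in run_benchmark.py may be organised differently.

```python
FLAGS = [
    "JITTER_STAMPEDE",
    "FREQUENCY_AWARE",
    "ADAPTIVE_PREFETCH",
    "OVERLAPPING_SUBQUERIES",
    "EXTERNAL_BFS",
]


def ablation_matrix(flags, exclude=("EXTERNAL_BFS",)):
    """Map run name -> {flag: enabled} for baseline, single, and all runs."""
    active = [f for f in flags if f not in exclude]
    runs = {"baseline": {f: False for f in flags}}
    for name in active:
        # Exactly one optimizer on, everything else off.
        runs[f"only_{name.lower()}"] = {f: f == name for f in flags}
    runs["all_enabled"] = {f: f in active for f in flags}
    return runs


runs = ablation_matrix(FLAGS)
```

Each dictionary in `runs` could then be written out as the flag section of a config file before the corresponding benchmark run.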

Headline results

The numbers below come from the per-run summary CSVs in this repository. See Benchmarking for the full breakdown.

Workload	Metric	Baseline	All Enabled	Change
LDBC SNB SF1	Throughput	56.77 qps	187.03 qps	+229.5%
LDBC SNB SF1	P95 latency	242.40 ms	51.20 ms	−78.9%
SSCA-inspired	Throughput	48.76 qps	413.61 qps	+748.2%
SSCA-inspired	P95 latency	167.22 ms	7.95 ms	−95.2%
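The Change column is a relative change against the baseline, (new − baseline) / baseline × 100. A short sketch for recomputing it from the table values follows; the function name is hypothetical.

```python
def pct_change(baseline, new):
    """Relative change of new against baseline, in percent."""
    return (new - baseline) / baseline * 100.0


rows = [
    ("LDBC SNB SF1 throughput", 56.77, 187.03),
    ("LDBC SNB SF1 P95 latency", 242.40, 51.20),
    ("SSCA-inspired P95 latency", 167.22, 7.95),
]
changes = {label: round(pct_change(base, new), 1) for label, base, new in rows}
```

Because the table shows rounded values, recomputing from them can differ from the published figure in the last digit; for SSCA throughput the rounded inputs give about +748.3%.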

▶ Tip

The strongest single optimizer is overlapping subqueries in canonical-hybrid mode. On SSCA it reaches 559.58 qps on its own, higher than the all-enabled run (413.61 qps), because every other module adds bookkeeping overhead.