KG Cache Middleware
A FastAPI middleware that sits between clients and Neo4j, with a Redis-backed cache and five pluggable graph-aware optimizers. This guide walks through the actual code, module by module.
What this is
The middleware exposes a single endpoint, POST /query, that takes a Cypher string plus parameters and returns the result. Behind that endpoint sits a pipeline of optimizers, each independently flag-controlled, that decide whether the query can be answered from cache, whether its result deserves a slot, when a hot key should be refreshed early, and what is likely to be asked next.
This site is a guided tour through the actual implementation. Every code block is taken from the repository as it ships, with annotations that explain why each piece exists. Numerical claims are tied back to the benchmark CSVs in benchmark_results/.
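To make the endpoint shape concrete, here is a minimal client round-trip. The host, port, and response fields are assumptions for illustration; only the `POST /query` contract (a Cypher string plus parameters) comes from the description above.

```python
import json
import urllib.request

# Hypothetical client call against the middleware's single endpoint.
# The request body carries the raw Cypher text plus its parameters;
# the middleware decides whether Redis or Neo4j answers it.
payload = {
    "query": "MATCH (p:Person {id: $id})-[:KNOWS]->(f) RETURN f.name",
    "params": {"id": 42},
}
req = urllib.request.Request(
    "http://localhost:8000/query",  # assumed host/port
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# response = urllib.request.urlopen(req)  # cached or fresh result, same shape either way
```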
Reading order
If you are new to the project, work through the pages in the order below. Each page is self-contained but builds on earlier concepts.
01 · Architecture
Request lifecycle from FastAPI entry to Redis hit, with the orchestration in middleware/cache.py.
02 · Query Normalization
How Cypher text and parameters are reduced to a deterministic cache key, shared by every optimizer.
03 · Optimizer · Overlapping Subqueries
The canonical-path-signature scheme that lets two different Cypher queries reuse the same cached result.
04 · Optimizer · Frequency-Aware Cache
TinyLFU-inspired admission with a sub-linear bounded TTL formula. Stops one-time misses from polluting the cache.
05 · Optimizer · Adaptive Prefetch
A first-order Markov model over per-session query patterns, plus a one-hop neighbour-expansion injection.
06 · Optimizer · Jitter & Stampede
Probabilistic XFetch refresh plus a topology-sensitive variant that scales with node degree.
07 · Optimizer · External BFS
NetworkX-backed shortest-path computation. Implemented but excluded from the default benchmark — here is why.
08 · Evaluation · Benchmarking
The run matrix, dataset setup (LDBC SNB SF1, SSCA-inspired), and how the headline numbers are produced.
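Because every optimizer keys off the same normalized form (page 02), it helps to see what a deterministic cache key can look like. This is a sketch only: the whitespace-collapsing and sorted-parameter serialization below are assumptions, and the repository's actual normalization rules may differ.

```python
import hashlib
import json
import re

def cache_key(cypher: str, params: dict) -> str:
    """Reduce a Cypher string plus parameters to a deterministic cache key.

    Sketch: collapse runs of whitespace in the query text and serialize
    parameters with sorted keys, so semantically identical requests that
    differ only in formatting or parameter order hash to the same key.
    """
    normalized = re.sub(r"\s+", " ", cypher).strip()
    param_blob = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{normalized}|{param_blob}".encode()).hexdigest()

# Two textually different but equivalent requests map to the same key.
k1 = cache_key("MATCH (n)   RETURN n", {"b": 2, "a": 1})
k2 = cache_key(" MATCH (n) RETURN n ", {"a": 1, "b": 2})
```

A deterministic key like this is what lets independently flagged optimizers share one cache namespace without coordinating.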
Five knobs at a glance
Every optimizer has a single feature flag in config.yaml. You can enable or disable any subset to ablate behaviour.
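A config.yaml that ablates a single optimizer might look like the fragment below. The key names mirror the flags listed here, but the exact nesting and schema are assumptions.

```yaml
# Hypothetical shape: the repository's real schema may nest these differently.
optimizers:
  JITTER_STAMPEDE: true
  FREQUENCY_AWARE: true
  ADAPTIVE_PREFETCH: true
  OVERLAPPING_SUBQUERIES: false   # ablated for this run
  EXTERNAL_BFS: false
```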
| Flag | Module | What it does | Default |
|---|---|---|---|
| JITTER_STAMPEDE | jitter_stampede.py | Single-flight lock + XFetch early refresh, plain or topology-sensitive. | on |
| FREQUENCY_AWARE | frequency_aware.py | Admit on access count or compute cost; assign sub-linear TTL. | on |
| ADAPTIVE_PREFETCH | adaptive_prefetch.py | Predict next likely query per session; warm one-hop neighbours. | on |
| OVERLAPPING_SUBQUERIES | overlapping_subqueries.py | Hash variable-length path fragments into reusable cache keys. | on |
| EXTERNAL_BFS | external_bfs.py | Run unweighted shortest paths through NetworkX rather than Neo4j. | off (in benchmark) |
Headline results
Numbers from the per-run summary CSVs in this repository. See Benchmarking for the full breakdown.
| Workload | Metric | Baseline | All Enabled | Change |
|---|---|---|---|---|
| LDBC SNB SF1 | Throughput | 56.77 qps | 187.03 qps | +229.5% |
| LDBC SNB SF1 | P95 latency | 242.40 ms | 51.20 ms | −78.9% |
| SSCA-inspired | Throughput | 48.76 qps | 413.61 qps | +748.2% |
| SSCA-inspired | P95 latency | 167.22 ms | 7.95 ms | −95.2% |
The strongest single optimizer is overlapping subqueries in canonical-hybrid mode. On SSCA, it alone reaches 559.58 qps — higher than the all-enabled run, because every other module adds bookkeeping overhead.