KG Cache Middleware

A FastAPI middleware that sits between clients and Neo4j, with a Redis-backed cache and five pluggable graph-aware optimizers. This guide walks through the actual code, module by module.

Python 3.12 · FastAPI · Neo4j 5.x · Redis · Prometheus

What this is

The middleware exposes a single endpoint, POST /query, that takes a Cypher string plus parameters and returns the result. Behind that endpoint sits a pipeline of optimizers, each independently flag-controlled, that decide whether the query can be answered from cache, whether its result deserves a slot, when a hot key should be refreshed early, and what is likely to be asked next.
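The cache-or-database decision at the heart of that pipeline can be sketched in a few lines. This is an illustrative sketch, not the repository's code: `cache_key`, `handle_query`, the `q:` key prefix, and the plain dict standing in for Redis are all assumptions made for the example, and `run_on_db` is a placeholder for whatever function actually sends Cypher to Neo4j.

```python
import hashlib
import json
from typing import Any, Callable


def cache_key(cypher: str, params: dict[str, Any]) -> str:
    """Derive a deterministic key from the query text and its parameters.

    Hypothetical helper: sort_keys=True makes the key independent of the
    order in which the client listed the parameters.
    """
    payload = json.dumps({"cypher": cypher, "params": params}, sort_keys=True)
    return "q:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def handle_query(
    cypher: str,
    params: dict[str, Any],
    cache: dict[str, Any],
    run_on_db: Callable[[str, dict[str, Any]], Any],
) -> tuple[Any, bool]:
    """Return (result, was_cache_hit) using a plain cache-aside lookup.

    `cache` is an in-memory dict standing in for Redis; `run_on_db` stands
    in for the Neo4j call. Both are assumptions for this sketch.
    """
    key = cache_key(cypher, params)
    if key in cache:
        return cache[key], True
    result = run_on_db(cypher, params)
    cache[key] = result  # a real admission policy would decide this per query
    return result, False
```

Each optimizer can be thought of as refining one step of this flow: admission replaces the unconditional write, early refresh adds a check on the hit path, and prefetching issues extra lookups after the response is returned.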

This site is a guided tour through the actual implementation. Every code block is taken from the repository as it ships, with annotations that explain why each piece exists. Numerical claims are tied back to the benchmark CSVs in benchmark_results/.

Reading order

If you are new to the project, work through the pages in the order below. Each page is self-contained but builds on earlier concepts.

Five knobs at a glance

Every optimizer has a single feature flag in config.yaml. You can enable or disable any subset to ablate behaviour.

| Flag | Module | What it does | Default |
| --- | --- | --- | --- |
| JITTER_STAMPEDE | jitter_stampede.py | Single-flight lock + XFetch early refresh, plain or topology-sensitive. | on |
| FREQUENCY_AWARE | frequency_aware.py | Admit on access count or compute cost; assign sub-linear TTL. | on |
| ADAPTIVE_PREFETCH | adaptive_prefetch.py | Predict next likely query per session; warm one-hop neighbours. | on |
| OVERLAPPING_SUBQUERIES | overlapping_subqueries.py | Hash variable-length path fragments into reusable cache keys. | on |
| EXTERNAL_BFS | external_bfs.py | Run unweighted shortest paths through NetworkX rather than Neo4j. | off (in benchmark) |
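To make the "XFetch early refresh" entry concrete, here is a minimal sketch of the plain XFetch rule as it is usually stated: refresh a key early when `now - delta * beta * ln(u) >= expiry`, where `delta` is how long the value took to compute and `u` is a uniform random draw. The function name `should_refresh_early` and its parameters are hypothetical, and the code in jitter_stampede.py may differ, in particular for the topology-sensitive variant.

```python
import math
import random
from typing import Callable


def should_refresh_early(
    now: float,
    expiry: float,
    delta: float,
    beta: float = 1.0,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide probabilistically whether to recompute a key before it expires.

    now, expiry: timestamps in seconds.
    delta: how long the last recomputation took, in seconds.
    beta: values above 1.0 refresh more eagerly, below 1.0 less eagerly.
    rng: injectable source of uniform numbers in [0.0, 1.0), for testing.
    """
    # random.random() can return 0.0, so flip the draw into (0.0, 1.0]
    # to keep math.log() defined.
    u = 1.0 - rng()
    # log(u) <= 0, so the subtracted term pushes "now" forward by a random
    # amount that scales with the cost of recomputing the value.
    return now - delta * beta * math.log(u) >= expiry
```

Expensive keys (large `delta`) are therefore likely to be refreshed further ahead of expiry than cheap ones, and a key that has already expired always returns True. Combined with a single-flight lock, only one caller performs the refresh while the others keep serving the cached value.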

Headline results

Numbers from the per-run summary CSVs in this repository. See Benchmarking for the full breakdown.

| Workload | Metric | Baseline | All Enabled | Change |
| --- | --- | --- | --- | --- |
| LDBC SNB SF1 | Throughput | 56.77 qps | 187.03 qps | +229.5% |
| LDBC SNB SF1 | P95 latency | 242.40 ms | 51.20 ms | −78.9% |
| SSCA-inspired | Throughput | 48.76 qps | 413.61 qps | +748.2% |
| SSCA-inspired | P95 latency | 167.22 ms | 7.95 ms | −95.2% |

Tip

The strongest single optimizer is overlapping subqueries in canonical-hybrid mode. On the SSCA-inspired workload it reaches 559.58 qps on its own, higher than the 413.61 qps of the all-enabled run, because every other module adds bookkeeping overhead.