SQLite performance recommendations for the reference runtime
Status: implemented 2026-04-19 (Tim authorized optimistic config change)
Raised: 2026-04-19
Trigger: the Claude Code ingest wrote 9.8 GB to disk from 2.2 GB of source JSONL, a 4.5× write amplification, at an observed rate of ~1 MB/s sustained. That is roughly 10× slower than it should be for a single-writer workload.
What landed
- `reference-implementation/server/db.js` — file-backed `initDb` now sets WAL, `synchronous=NORMAL`, `temp_store=MEMORY`, `mmap_size=256 MB`, `cache_size=64 MB`. In-memory paths untouched (WAL doesn't apply).
- `reference-implementation/runtime/index.js` — `BATCH_SIZE` default bumped from 50 to 500, overridable via `PDPP_RUNTIME_BATCH_SIZE`.
- Currently-running ingests (Claude Code / Codex) don't benefit; their DBs opened with the old pragmas. The next orchestrator invocation will pick up the faster config.
Root cause analysis
Current config (`reference-implementation/server/db.js`):
- Default journal mode is `DELETE` (not `WAL`) — every COMMIT rewrites a journal file
- Default `synchronous=FULL` — two fsyncs per commit
- `@databases/sqlite` opens one connection; no per-table bulk-insert pragmas
- Runtime batch size is 50 records per `/v1/ingest` call (`runtime/index.js:420`)
- Each batch flush emits a `spine_events` row — double-writes per batch
Recommendations (in priority order)
1. Enable WAL mode (biggest win, ~10× speedup on write-heavy workloads)
In `initDb()`, immediately after `db = createDatabase(path)`:
```js
await db.query(sql`PRAGMA journal_mode = WAL`);
await db.query(sql`PRAGMA synchronous = NORMAL`);
await db.query(sql`PRAGMA temp_store = MEMORY`);
await db.query(sql`PRAGMA mmap_size = 268435456`); // 256 MB
await db.query(sql`PRAGMA cache_size = -65536`);   // 64 MB cache (negative = KiB)
```
Durability trade-off: synchronous=NORMAL with WAL fsyncs the WAL file at checkpoint, not every commit. A power loss between commit and checkpoint can lose the last ~second of writes. For personal-server workloads this is acceptable; if not, keep synchronous=FULL and accept a smaller speedup.
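The pragma sequence above can live in one small helper so the file-backed and in-memory paths stay easy to audit. This is a sketch only — `applyWritePragmas` is a hypothetical name, not the shipped `initDb` code, and it takes any async `query(text)` function (e.g. a thin wrapper over `@databases/sqlite`'s `db.query`):

```javascript
// Sketch only: applyWritePragmas is a hypothetical helper, not the shipped
// db.js code. `query` is any async function that executes one SQL string.
// File-backed databases only: WAL does not apply to in-memory databases.
async function applyWritePragmas(query) {
  const pragmas = [
    'PRAGMA journal_mode = WAL',     // biggest win for write-heavy loads
    'PRAGMA synchronous = NORMAL',   // fsync at checkpoint, not per commit
    'PRAGMA temp_store = MEMORY',    // temp tables and indices in RAM
    'PRAGMA mmap_size = 268435456',  // 256 MB memory-mapped I/O
    'PRAGMA cache_size = -65536',    // negative means KiB, so 64 MB
  ];
  for (const p of pragmas) {
    await query(p); // apply in order: journal_mode first
  }
  return pragmas;
}
```

Keeping the list in one array also gives tests a single place to assert the pragma order.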
2. Increase runtime batch size
`runtime/index.js:420` → `const BATCH_SIZE = 500;` (was 50).
A batch size of 50 meant ~220 round-trips per 11k-record file; at 500, it's ~22. The Node<->SQLite savings alone are a micro-optimization; the real cost is the per-batch HTTP round trip over localhost when using the embedded orchestrator. Bigger batches amortize that cost.
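Since the landed change makes the batch size env-overridable, the parse should be defensive so a typo in `PDPP_RUNTIME_BATCH_SIZE` can't zero out batching. A minimal sketch — `resolveBatchSize` and `trips` are hypothetical helpers, not the actual `runtime/index.js` code:

```javascript
// Hypothetical helper: resolve the batch size from the environment,
// falling back to the new default of 500 when the override is unset,
// non-numeric, or not a positive integer.
function resolveBatchSize(env = process.env) {
  const raw = Number.parseInt(env.PDPP_RUNTIME_BATCH_SIZE ?? '', 10);
  return Number.isInteger(raw) && raw > 0 ? raw : 500;
}

// Round-trips needed for an 11k-record file at a given batch size.
const trips = (size) => Math.ceil(11000 / size);
```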
3. Consider dropping per-batch spine_events
Every `flushBatch` inserts a `spine_events` row (`run.batch_ingested`). For a large ingest this doubles write volume. Options:
- Aggregate: one `spine_events` row per run-completion rather than per batch
- Move `spine_events` to a separate DB file (attached) so its writes don't contend with the `records` WAL
- Make it opt-in via a `spine_granularity` config
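The first option (aggregate per run) amounts to a small in-memory accumulator. Everything in this sketch is hypothetical — `SpineAggregator`, `onBatchFlushed`, and the `run.completed` event type are illustrative names, not the existing spine schema:

```javascript
// Hypothetical sketch of the aggregate option: count batches in memory
// and write a single spine_events row at run completion instead of one
// per batch. `emitEvent` stands in for whatever inserts a spine_events row.
class SpineAggregator {
  constructor(emitEvent) {
    this.emitEvent = emitEvent;
    this.batches = 0;
    this.records = 0;
  }
  onBatchFlushed(recordCount) {
    // No DB write here: just counters.
    this.batches += 1;
    this.records += recordCount;
  }
  onRunComplete(runId) {
    // One write per run, not per batch.
    this.emitEvent({
      type: 'run.completed',
      runId,
      batches: this.batches,
      records: this.records,
    });
  }
}
```

The trade-off is losing per-batch progress visibility in the spine, which is what the `spine_granularity` option would let operators choose.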
4. Run PRAGMA optimize at connection close
```js
await db.query(sql`PRAGMA optimize`);
```
Tells SQLite to update stats based on actual query patterns; next open benefits.
5. Consider prepared statement reuse
`@databases/sqlite` prepares each ``db.query(sql`…`)`` call. For hot paths (every INSERT into `records`), a cached prepared statement would avoid re-parsing. This is a library-level concern; it may require switching from `@databases/sqlite` to `node:sqlite` or `better-sqlite3`, where prepared-statement caching is explicit.
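If that switch happens, the caching pattern itself is small. A sketch assuming a `better-sqlite3` / `node:sqlite`-style `db.prepare(sql)` API; `makeStatementCache` is a hypothetical name:

```javascript
// Hypothetical sketch: memoize prepared statements by SQL text so hot
// INSERTs (e.g. into records) are parsed once per connection, not per call.
// Assumes a db object exposing a better-sqlite3 / node:sqlite style prepare().
function makeStatementCache(db) {
  const cache = new Map();
  return function getStatement(sqlText) {
    let stmt = cache.get(sqlText);
    if (stmt === undefined) {
      stmt = db.prepare(sqlText); // parse once
      cache.set(sqlText, stmt);   // keyed by exact SQL text
    }
    return stmt;
  };
}
```

Caveat: cached statements belong to one connection, so the cache must be dropped when the connection closes.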
What the impact would be on today's workload
Claude Code ingest at ~1 MB/s → 10 MB/s = same data in 15 minutes instead of 2.5 hours. Codex similarly.
Why we're not doing this today
- `db.js` is owned by the reference-implementation track; changing it mid-ingest would break live writers.
- The speedup is irrelevant to correctness — we're just slower than we could be.
- Proper fix belongs with the runtime/reference agent who can adjust tests + durability-mode trade-offs deliberately.
Action items for the runtime agent
- Add WAL pragmas to `initDb`
- Bump `BATCH_SIZE` to 500 (or make it configurable via env)
- Decide `spine_events` granularity policy
- Add a perf regression test with a known-size fixture so write amplification is visible in CI
Related
- `rs-storage-topology-open-question.md` — topology choice interacts with WAL (WAL doesn't federate across attached DBs cleanly)
- `connector-configuration-open-question.md` — `BATCH_SIZE` is a tuning knob; belongs in the options surface we're proposing