Core question
In SQLite WAL mode, readers and one writer should usually coexist. So why does writer contention appear to hurt the daemon’s ability to read?
What we know
The daemon makes many concurrent write attempts to a database that allows only one writer at a time.
Observed writer contention:
many callers enter
BEGIN IMMEDIATEonly one can hold the SQLite writer lock
others wait inside the SQLite busy handler
after ~10s they return
SQLITE_BUSY
Important distinction
Writer contention does not necessarily block reads directly at the SQLite lock level.
In WAL mode:
Many readers + one writer can coexist
Only one writer can write
But writer contention can still indirectly hurt reads.
Main mechanism: shared connection pool starvation
Reads and writes share the same sqlitex.Pool.
When many writer goroutines are stuck waiting in BEGIN IMMEDIATE, each one may hold a pooled SQLite connection.
Then read paths can block before SQLite even executes the read:
conn, release, err := db.Conn(ctx)So the read is not blocked by a SQLite read lock. It is blocked because all/most DB connections are occupied by writers waiting for the write lock.
This is probably the main explanation.
Other mechanisms
CPU / I/O pressure
Heavy writers like indexing/sync can consume:
disk bandwidth
page cache
CPU
SQLite internal work
Reads are logically allowed, but they still compete for machine resources.
WAL growth / checkpoint pressure
In WAL mode, writes append to
db.sqlite-wal.If checkpointing falls behind, the WAL grows. A large WAL can make reads and checkpoints more expensive.
We observed an old daemon with a very large WAL file (~679 MB).
Mixed request pipelines
Many user-visible operations are not pure reads.
Example pipeline:
read peer list connect to peers update peers table fetch blobs PutMany/index blobs read document/eventsSo a user may experience “reading is slow”, but the request may be waiting on best-effort writes somewhere earlier in the pipeline.
“Read-looking” code may still use savepoints
Many read paths use
WithSave. Pure read savepoints are okay, but they still hold a DB connection. If any statement mutates temp/real tables, it can enter writer accounting.Current likely offenders / signals
From debug snapshots:
hmnet.(*Node).connecthas highbegin_busycount and ~10s waits.Its own hold time is tiny.
So it is mostly a victim/amplifier.
blob.(*Index).PutMany-range1can hold writer lock for hundreds of ms.sync/indexing paths can repeatedly take the writer lock.
domain cache writes are also small but can add pressure.
feed/list/event queries are mostly read-side performance concerns, not necessarily direct writer-lock holders.
Key conclusion
The issue is likely not:
writer lock directly blocks all reads
It is more likely:
writer storm consumes DB pool + I/O + scheduler capacity
so reads cannot get resources promptly
Proposed direction
We need both:
Reduce write work
optimize heavy SQL
shorten transactions
batch writes carefully
Control write admission
avoid many goroutines racing
BEGIN IMMEDIATEuse Go-level queue/gate for writes
batch/coalesce best-effort writes
prioritize user-visible writes
Separate/guard read capacity
consider read vs write pool behavior
ensure writers waiting for lock do not consume all connections needed by reads
Practical next fix
Move hmnet.(*Node).connect peer-row update into peerWriter.
Reason:
it is best-effort bookkeeping
currently it can wait 10s on SQLite busy
it creates many busy events
it should not block successful network connect or read/sync flows
Then consider broader writer admission control for hot paths like PutMany, domain updates, and sync writes.
Do you like what you are reading? Subscribe to receive updates.
Unsubscribe anytime