Hosted onseed.hyper.mediavia theHypermedia Protocol

Core question

In SQLite WAL mode, readers and one writer should usually coexist. So why does writer contention appear to hurt the daemon’s ability to read?

What we know

The daemon makes many concurrent write attempts to a database that allows only one writer at a time.

Observed writer contention:

  • many callers enter BEGIN IMMEDIATE

  • only one can hold the SQLite writer lock

  • others wait inside the SQLite busy handler

  • after ~10s they return SQLITE_BUSY

Important distinction

Writer contention does not necessarily block reads directly at the SQLite lock level.

In WAL mode:

Many readers + one writer can coexist

Only one writer can write

But writer contention can still indirectly hurt reads.

Main mechanism: shared connection pool starvation

Reads and writes share the same sqlitex.Pool.

When many writer goroutines are stuck waiting in BEGIN IMMEDIATE, each one may hold a pooled SQLite connection.

Then read paths can block before SQLite even executes the read:

conn, release, err := db.Conn(ctx)

So the read is not blocked by a SQLite read lock. It is blocked because all/most DB connections are occupied by writers waiting for the write lock.

This is probably the main explanation.

Other mechanisms

  1. CPU / I/O pressure

    Heavy writers like indexing/sync can consume:

    • disk bandwidth

    • page cache

    • CPU

    • SQLite internal work

    Reads are logically allowed, but they still compete for machine resources.

  2. WAL growth / checkpoint pressure

    In WAL mode, writes append to db.sqlite-wal.

    If checkpointing falls behind, the WAL grows. A large WAL can make reads and checkpoints more expensive.

    We observed an old daemon with a very large WAL file (~679 MB).

  3. Mixed request pipelines

    Many user-visible operations are not pure reads.

    Example pipeline:

    read peer list
    connect to peers
    update peers table
    fetch blobs
    PutMany/index blobs
    read document/events
    

    So a user may experience “reading is slow”, but the request may be waiting on best-effort writes somewhere earlier in the pipeline.

  4. “Read-looking” code may still use savepoints

    Many read paths use WithSave. Pure read savepoints are okay, but they still hold a DB connection. If any statement mutates temp/real tables, it can enter writer accounting.

    Current likely offenders / signals

    From debug snapshots:

    • hmnet.(*Node).connect has high begin_busy count and ~10s waits.

      • Its own hold time is tiny.

      • So it is mostly a victim/amplifier.

    • blob.(*Index).PutMany-range1 can hold writer lock for hundreds of ms.

    • sync/indexing paths can repeatedly take the writer lock.

    • domain cache writes are also small but can add pressure.

    • feed/list/event queries are mostly read-side performance concerns, not necessarily direct writer-lock holders.

Key conclusion

The issue is likely not:

writer lock directly blocks all reads

It is more likely:

writer storm consumes DB pool + I/O + scheduler capacity
so reads cannot get resources promptly

Proposed direction

We need both:

  1. Reduce write work

    • optimize heavy SQL

    • shorten transactions

    • batch writes carefully

  2. Control write admission

    • avoid many goroutines racing BEGIN IMMEDIATE

    • use Go-level queue/gate for writes

    • batch/coalesce best-effort writes

    • prioritize user-visible writes

  3. Separate/guard read capacity

    • consider read vs write pool behavior

    • ensure writers waiting for lock do not consume all connections needed by reads

Practical next fix

Move hmnet.(*Node).connect peer-row update into peerWriter.

Reason:

  • it is best-effort bookkeeping

  • currently it can wait 10s on SQLite busy

  • it creates many busy events

  • it should not block successful network connect or read/sync flows

Then consider broader writer admission control for hot paths like PutMany, domain updates, and sync writes.

Do you like what you are reading? Subscribe to receive updates.

Unsubscribe anytime