How I handle offline sessions, synthetic stops, and late syncs in a time-tracking desktop app
When I started building TimePerch, I faced a few edge cases that made me rethink my architecture. What happens if a user starts a work session, then loses internet connectivity? They continue working for hours offline. When they finally reconnect, how do we ensure their time entries are accurate?
In a naive system, that data is lost. The server thinks they are still working. A day later, the dashboard says they worked a 24-hour shift. This creates “Orphaned Sessions”, the bane of every monitoring tool.
The options were simple: block the UI until the internet comes back (useless), or embrace chaos. I chose the latter. Here is how I implemented Eventual Consistency to survive the disconnect, and how the system decides how much to trust the data it recovers.
The Architecture of Failure
I moved from a simple client-server model to an Offline-First design. The Desktop App is the source of truth, not the database.
The system relies on four distinct components working in harmony:
- Desktop Agent (Rust/Tauri): The “Edge” node. Holds a local SQLite DB. Never trusts the network.
- Backend API (Golang): The ingestion engine. Accepts high-volume writes.
- Reporting API: The “Cleaner.” Reconciles messy data into clean reports.
- Web Dashboard: The view layer that the manager sees.
Here is the exact flow when a user goes offline and comes back:
- The Action: User clicks “Stop Work” while offline.
- Local Persistence: The event is saved to a local queue on disk with a generated UUID (e.g., 1234).
- The Wait: The network is dead, so the agent does nothing. The data sits safely on the user’s hard drive.
- The Reconnect: Connection is restored 1 hour later.
- The Sync: The agent pushes the saved event 1234 to the API.
- The Reconciliation: The server accepts the “late” timestamp, updates the timeline retroactively, and assigns a confidence score to the entry.
That last step is the part most systems skip. It is also the part that matters most.
The 5-Pillar Solution
1. The Database in Your Pocket (Client-Side Durability)
Most apps try to send data immediately (POST /time-entry). If it fails, they show an error. We don’t do that here. Instead, when you click Stop, the request is saved to a local queue on disk first and assigned a client-generated UUID.
Only then is it sent. If the internet is dead, the agent just waits and retries later, whether that is in 5 minutes or 5 hours. The data is safe on your hard drive.
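To make the save-first, send-later idea concrete, here is a minimal sketch of a disk-backed outbox. It is written in Go for brevity even though the real agent is Rust, and it uses an append-only JSON-lines file instead of SQLite so it stays self-contained. The names Outbox, QueuedEvent, ErrOffline, and outbox.jsonl are illustrative assumptions, not TimePerch's actual code.

```go
package main

import (
	"bufio"
	"crypto/rand"
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// ErrOffline is a hypothetical error a sender returns when the network is down.
var ErrOffline = errors.New("offline")

// QueuedEvent is an assumed shape for one pending event.
type QueuedEvent struct {
	ClientEventID string `json:"client_event_id"`
	Kind          string `json:"kind"`
	Timestamp     string `json:"timestamp"`
}

// Outbox is a file-backed queue holding one JSON event per line.
type Outbox struct{ Path string }

// NewEventID returns a random version 4 UUID string.
func NewEventID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// Enqueue appends the event and fsyncs, so it is on disk before any send is tried.
func (o Outbox) Enqueue(ev QueuedEvent) error {
	f, err := os.OpenFile(o.Path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	if err := json.NewEncoder(f).Encode(ev); err != nil {
		return err
	}
	return f.Sync()
}

// Pending reads back every event still waiting to be sent.
func (o Outbox) Pending() ([]QueuedEvent, error) {
	f, err := os.Open(o.Path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var evs []QueuedEvent
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		var ev QueuedEvent
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil {
			return nil, err
		}
		evs = append(evs, ev)
	}
	return evs, sc.Err()
}

// Flush sends events in order and stops at the first failure.
// Unsent events stay on disk. A crash after a send but before the rewrite
// causes a resend, which is why the server must deduplicate.
func (o Outbox) Flush(send func(QueuedEvent) error) (int, error) {
	evs, err := o.Pending()
	if err != nil {
		return 0, err
	}
	sent := 0
	for sent < len(evs) && send(evs[sent]) == nil {
		sent++
	}
	if sent == 0 {
		return 0, nil // nothing was delivered, leave the file untouched
	}
	tmp, err := os.Create(o.Path + ".tmp")
	if err != nil {
		return sent, err
	}
	enc := json.NewEncoder(tmp)
	for _, ev := range evs[sent:] {
		if err := enc.Encode(ev); err != nil {
			tmp.Close()
			return sent, err
		}
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return sent, err
	}
	if err := tmp.Close(); err != nil {
		return sent, err
	}
	return sent, os.Rename(o.Path+".tmp", o.Path) // swap in the remainder
}

func main() {
	q := Outbox{Path: "outbox.jsonl"}
	id, err := NewEventID()
	if err != nil {
		panic(err)
	}
	if err := q.Enqueue(QueuedEvent{ClientEventID: id, Kind: "stop", Timestamp: "2024-01-01T17:30:00Z"}); err != nil {
		panic(err)
	}
	n, _ := q.Flush(func(QueuedEvent) error { return ErrOffline })
	fmt.Println("sent while offline:", n)
	n, _ = q.Flush(func(QueuedEvent) error { return nil })
	fmt.Println("sent after reconnect:", n)
}
```

The key property is ordering: the write to disk completes before the first network attempt, so a failed send costs nothing but time.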
2. Idempotency (The Magic UUID)
What happens if the network is “flaky”? The client sends a request, the server receives it, writes it to the DB, but the acknowledgment (200 OK) gets lost on the way back. The client thinks it failed, so it sends it again.
Without protection, you now have duplicate time entries. This was solved with Idempotency Keys.
CREATE TABLE time_entries (
    id UUID PRIMARY KEY,
    user_id UUID NOT NULL,
    -- The Secret Sauce:
    client_event_id UUID NOT NULL UNIQUE,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
    status VARCHAR(50)
);
Because client_event_id is unique, the database rejects the second write. The Go handler catches that error and returns a Success to the client anyway. The system heals itself.
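Here is a rough sketch of that "duplicate means success" handling. To keep it runnable without a database driver, an in-memory MemStore stands in for the table, and ErrDuplicate stands in for whatever unique-violation error a real driver would return. Ingest, MemStore, TimeEntry, and ErrDuplicate are all hypothetical names for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"sync"
)

// ErrDuplicate stands in for the unique-constraint violation a real
// database driver would report (assumption for this sketch).
var ErrDuplicate = errors.New("duplicate client_event_id")

// TimeEntry carries only the fields this sketch needs.
type TimeEntry struct {
	ClientEventID string
	UserID        string
	Status        string
}

// MemStore mimics the UNIQUE constraint on client_event_id with a map.
type MemStore struct {
	mu   sync.Mutex
	rows map[string]TimeEntry
}

func NewMemStore() *MemStore {
	return &MemStore{rows: make(map[string]TimeEntry)}
}

// Insert rejects a second write that reuses the same client_event_id.
func (s *MemStore) Insert(e TimeEntry) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.rows[e.ClientEventID]; exists {
		return ErrDuplicate
	}
	s.rows[e.ClientEventID] = e
	return nil
}

// Count reports how many rows were actually stored.
func (s *MemStore) Count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.rows)
}

// Ingest returns the HTTP status the handler would send back.
// A duplicate is reported as success because the first write already landed.
func Ingest(s *MemStore, e TimeEntry) int {
	if e.ClientEventID == "" {
		return http.StatusBadRequest
	}
	err := s.Insert(e)
	switch {
	case err == nil, errors.Is(err, ErrDuplicate):
		return http.StatusOK
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	s := NewMemStore()
	e := TimeEntry{ClientEventID: "1234", UserID: "u1", Status: "stopped"}
	// The retry gets the same answer as the original, and only one row exists.
	fmt.Println(Ingest(s, e), Ingest(s, e), s.Count()) // prints "200 200 1"
}
```

From the client's point of view, the retry is indistinguishable from a first attempt that worked, so it can safely drop the event from its local queue.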
3. Two-Phase Synthetic Entry
This was the hardest part. What if a user’s computer crashes while they are working? They never clicked “Stop.” If no heartbeat arrives within 15 minutes, a Synthetic Stop entry is created automatically.
Phase 1 (Provisional): The session gets marked as “Ended (System Auto-Close).”
Phase 2 (Reconciliation): If the user comes back online the next day and uploads their logs, the real Stop event surfaces from the uploaded logs. The reconciliation engine finds the Synthetic entry and supersedes it with the real data.
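The two phases can be sketched roughly like this. The 15-minute timeout comes from the description above; everything else is an assumption for illustration, including the Session and Stop types, the function names, and the choice to timestamp the synthetic stop at the last heartbeat rather than at detection time.

```go
package main

import (
	"fmt"
	"time"
)

// HeartbeatTimeout is the 15-minute silence window described in the text.
const HeartbeatTimeout = 15 * time.Minute

// Stop records when a session ended and whether the system invented it.
type Stop struct {
	At        time.Time
	Synthetic bool
}

// Session is a hypothetical, minimal view of one work session.
type Session struct {
	Start         time.Time
	LastHeartbeat time.Time
	Stop          *Stop
}

// AutoClose is phase 1: close a silent session with a provisional stop.
// It reports whether a synthetic stop was created.
func AutoClose(s *Session, now time.Time) bool {
	if s.Stop != nil || now.Sub(s.LastHeartbeat) < HeartbeatTimeout {
		return false
	}
	// Assumption: the last heartbeat is the latest moment we know the user was active.
	s.Stop = &Stop{At: s.LastHeartbeat, Synthetic: true}
	return true
}

// Reconcile is phase 2: a real Stop from uploaded logs supersedes a synthetic one.
// An existing real stop is never overwritten.
func Reconcile(s *Session, realStopAt time.Time) bool {
	if s.Stop != nil && !s.Stop.Synthetic {
		return false
	}
	s.Stop = &Stop{At: realStopAt, Synthetic: false}
	return true
}

// at parses an "HH:MM" clock time on a fixed example date.
func at(hhmm string) time.Time {
	t, err := time.Parse("2006-01-02 15:04", "2024-01-01 "+hhmm)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	s := &Session{Start: at("09:00"), LastHeartbeat: at("14:00")}

	fmt.Println(AutoClose(s, at("14:10")), s.Stop == nil)    // prints "false true"
	fmt.Println(AutoClose(s, at("14:15")), s.Stop.Synthetic) // prints "true true"

	// Next day the agent uploads its log, which contains the real Stop.
	fmt.Println(Reconcile(s, at("16:30")), s.Stop.Synthetic, s.Stop.At.Format("15:04")) // prints "true false 16:30"
}
```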
But what if no real Stop event ever arrives? That is where the confidence score comes in.
4. The Reporting API (The Truth Layer)
The Web Dashboard never reads raw logs. It reads from a Reconciled View, which a background job in the Reporting API builds by looking for conflicts (overlapping sessions, superseded synthetic entries) and flattening them into a clean timeline.
What gets surfaced depends on one thing: how much the system trusts each entry. High-confidence entries show cleanly. Low-confidence ones get a Pending flag for manager review. The manager never sees incorrect math, just occasionally a flag that says “look at this one.”
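As a rough illustration of that flattening step, the sketch below drops superseded entries, orders what remains, and raises a Pending flag for low-confidence rows. The 0.8 threshold, the Entry and Row types, and the function name are assumptions made for the example, and overlap resolution is left out to keep it short.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// PendingBelow is an assumed review threshold, chosen only for this example.
const PendingBelow = 0.8

// Entry is a hypothetical raw time entry as stored by the ingestion API.
type Entry struct {
	ID         string
	Start, End time.Time
	Confidence float64
	Superseded bool // true once a real Stop replaced this synthetic entry
}

// Row is what the dashboard would render.
type Row struct {
	ID      string
	Hours   float64
	Pending bool
}

// ReconciledView drops superseded entries, sorts by start time,
// and flags low-confidence rows for manager review.
func ReconciledView(entries []Entry) []Row {
	kept := make([]Entry, 0, len(entries))
	for _, e := range entries {
		if !e.Superseded {
			kept = append(kept, e)
		}
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Start.Before(kept[j].Start) })

	rows := make([]Row, 0, len(kept))
	for _, e := range kept {
		rows = append(rows, Row{
			ID:      e.ID,
			Hours:   e.End.Sub(e.Start).Hours(),
			Pending: e.Confidence < PendingBelow,
		})
	}
	return rows
}

// at parses an "HH:MM" clock time on a fixed example date.
func at(hhmm string) time.Time {
	t, err := time.Parse("2006-01-02 15:04", "2024-01-01 "+hhmm)
	if err != nil {
		panic(err)
	}
	return t
}

// sampleEntries returns deliberately unordered demo data.
func sampleEntries() []Entry {
	return []Entry{
		{ID: "evening-synthetic", Start: at("19:00"), End: at("21:00"), Confidence: 0.35},
		{ID: "afternoon-synthetic", Start: at("13:00"), End: at("14:00"), Confidence: 0.4, Superseded: true},
		{ID: "morning", Start: at("09:00"), End: at("12:00"), Confidence: 1.0},
		{ID: "afternoon-real", Start: at("13:00"), End: at("17:30"), Confidence: 1.0},
	}
}

func main() {
	for _, r := range ReconciledView(sampleEntries()) {
		fmt.Printf("%-18s %.1fh pending=%v\n", r.ID, r.Hours, r.Pending)
	}
}
```

With this sample data, the superseded afternoon entry disappears from the view and only the unexplained evening session is flagged for review.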
5. The Confidence Score (Knowing When to Distrust Your Own Data)
Not all recovered data deserves equal trust. Every entry in TimePerch carries a confidence score between 0.0 and 1.0.
Real entries, ones where the user explicitly clicked Stop, always score 1.0. Synthetic entries are more nuanced. Rather than treating them all equally, the reconciliation engine analyzes each one against known patterns:
- Did the user historically log out around this time?
- Does the timestamp fall within an admin-defined shift window?
- Is this a clean end-of-shift or an anomalous mid-session drop?
A synthetic stop at 5:30 PM for a user who always clocks out between 5 and 6 scores significantly higher than a random disconnect at 2 AM. Shift rules defined by the admin push confidence even higher. If the system knows shifts end at 6 PM, a synthetic stop at 5:58 is treated almost like a real one.
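A toy version of that scoring might look like the following. The base score, the weights, and the 15-minute "near shift end" tolerance are invented purely to show how the signals could combine; they are not TimePerch's actual numbers, and the Profile type and function names are hypothetical. Times are minutes since midnight to keep the example small.

```go
package main

import "fmt"

// Profile holds what the system knows about a user.
// All times are minutes since midnight.
type Profile struct {
	UsualOutFrom, UsualOutTo int  // historical clock-out window
	HasShift                 bool // whether an admin defined a shift
	ShiftStart, ShiftEnd     int
}

// hm converts a 24-hour clock time to minutes since midnight.
func hm(h, m int) int { return h*60 + m }

// RealConfidence is the score for an explicit user Stop.
func RealConfidence() float64 { return 1.0 }

// SyntheticConfidence scores a system-generated stop.
// Weights are illustrative assumptions; the maximum stays below 1.0
// so a synthetic stop never outranks a real one.
func SyntheticConfidence(stop int, p Profile) float64 {
	score := 0.2 // assumed baseline for any synthetic stop

	// Did the user historically log out around this time?
	if stop >= p.UsualOutFrom && stop <= p.UsualOutTo {
		score += 0.4
	}
	if p.HasShift {
		// Does the timestamp fall within the admin-defined shift window?
		if stop >= p.ShiftStart && stop <= p.ShiftEnd {
			score += 0.1
		}
		// Clean end-of-shift rather than an anomalous mid-session drop?
		diff := p.ShiftEnd - stop
		if diff < 0 {
			diff = -diff
		}
		if diff <= 15 {
			score += 0.25
		}
	}
	return score
}

func main() {
	history := Profile{UsualOutFrom: hm(17, 0), UsualOutTo: hm(18, 0)}
	withShift := history
	withShift.HasShift = true
	withShift.ShiftStart, withShift.ShiftEnd = hm(9, 0), hm(18, 0)

	fmt.Printf("%.2f %.2f %.2f\n",
		SyntheticConfidence(hm(17, 30), history),   // usual clock-out time
		SyntheticConfidence(hm(2, 0), history),     // random 2 AM disconnect
		SyntheticConfidence(hm(17, 58), withShift), // two minutes before shift end
	) // prints "0.60 0.20 0.95"
}
```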
This is what ties the whole system together. The Desktop Agent preserves the data. Idempotency prevents duplication. The Synthetic Entry fills the gap when data never arrives. And the Confidence Score tells the Reporting API how loudly to flag what it found.
The result: a manager never sees a sudden unexplained 4-hour block. They might see a brief Pending flag, but never silent incorrect data.
The confidence scoring and reconciliation logic is the part I’m still iterating on. Happy to go deeper on any of this in the comments.