Your Browser is Finally Your Database

Matthew Tyson
13 Min Read

The thick client paradigm is resurfacing. Discover how cutting-edge local databases such as PGlite and RxDB are enabling robust, feature-rich data management directly within web browsers.


Historically, computing began with centralized mainframes and basic, non-programmable terminals. This centralized power shifted dramatically with the advent of personal computers, distributing control to individual desktops. The internet then emerged, making the browser the dominant application, which recentralized power in servers and the cloud. Now, this architectural pendulum is once again swinging towards distributed client-side capabilities.

This piece offers an initial exploration into the burgeoning local-first movement and the innovative technologies enabling sophisticated data storage directly within web browsers.

PGlite: A fully fledged database for your browser

Contemporary web browsers are powerful platforms, honed by extensive development and rigorous testing, often running on high-performance machines. Despite this, they frequently rely on the server for data, behaving as mere data consumers. Browser state remains transient, vanishing with each page refresh. The familiar loading spinners, UI delays, and ‘click-and-wait’ experiences stem from this continuous dependence on the backend for persistent data management.

However, a new paradigm is taking shape. The idea is to embed a relational database directly in the browser, holding a subset of the data, with a synchronization (sync) engine keeping it consistent with the server. The application reads and writes this local data store, which synchronizes with the server in the background. This approach delivers near-instantaneous front-end responsiveness while preserving backend data integrity. Browser applications built this way gain a more robust, persistent state that goes beyond temporary caching.

Multiple advancements have made browsers more capable data stores, notably IndexedDB and WebAssembly. These technologies have enabled in-browser NoSQL solutions like PouchDB. Yet the standout innovation right now is the PGlite SQL database.

Naturally, every technological shift involves compromises. Integrating a database directly into the browser introduces new architectural considerations. However, the most profound impact of this change lies in its progressive departure from two fundamental pillars of web development: JSON and REST.

Towards an isomorphic future

My previous work highlighted WinterTC’s role in advancing the vision of isomorphic JavaScript, where server and client environments are unified. The next step is to achieve comparable uniformity across data stores. This has only recently become feasible thanks to the maturity of WASM, which can now run a complete PostgreSQL instance within the browser. This WASM-powered database instance is precisely what PGlite offers.

While SQLite brings you close to enterprise-grade database functionality in the browser, PGlite *is* the identical database found in data centers, eliminating dialect inconsistencies. The WebAssembly (WASM) runtime, a marvel of contemporary programming, allows PGlite to be a compact compilation of the authentic Postgres codebase.

Consequently, we are closer than ever to realizing the thick client model. Naturally, this evolution entails various subtleties and unforeseen developments along the way.

Synchronization driven by ‘shapes’

Even with an identical API and implementation, replicating the entire database into the browser is impractical because of size constraints and security risks. The goal is to provision only the data pertinent to a particular user’s session.

A compelling concept gaining traction is ‘shape-based’ syncing, pioneered by ElectricSQL, the developers of PGlite. A shape works much like a database view: it uses one or more queries to populate the client’s local database with a relevant subset of the data. The server maintains the authoritative dataset, while the client subscribes to a defined shape on the server (e.g., SELECT * FROM issues WHERE assigned_to = 'me').
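
To make the idea of a shape concrete, here is a minimal sketch in plain JavaScript. It is not ElectricSQL's API: the shape object, the `materializeShape` helper, and the sample rows are all hypothetical, and the filter is written as a JavaScript predicate standing in for the SQL WHERE clause a real engine would use.

```javascript
// Hypothetical shape: a table name plus a row filter.
// A real sync engine would express the filter as a SQL WHERE clause.
const myIssuesShape = {
  table: "issues",
  where: (row) => row.assigned_to === "me",
};

// Illustrative stand-in for the authoritative server-side tables.
const serverTables = {
  issues: [
    { id: 1, title: "Fix login redirect", assigned_to: "me" },
    { id: 2, title: "Update billing copy", assigned_to: "someone_else" },
    { id: 3, title: "Add dark mode", assigned_to: "me" },
  ],
};

// Copy only the rows that fall inside the shape into the client's local store.
function materializeShape(shape, tables) {
  const rows = tables[shape.table] ?? [];
  return rows.filter(shape.where).map((row) => ({ ...row }));
}

const localIssues = materializeShape(myIssuesShape, serverTables);
console.log(localIssues.map((issue) => issue.id)); // ids 1 and 3
```

Because the server keeps the authoritative rows, the local subset can be rebuilt at any time by running the same materialization again.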

At its core, synchronization leverages Postgres’s built-in logical replication protocol. The sync engine operates as a middleware consumer, monitoring the database’s write-ahead log, a live stream of all server-side modifications. When a change matches a client’s subscribed shape, the engine pushes that specific update over a background connection (a WebSocket, for example) to the browser’s PGlite instance. The process is bidirectional: local writes update the UI instantly, are queued, and are then streamed back to the central database, with the engine managing conflict resolution.
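
The routing step described above can be sketched as follows. This is a simplified, in-memory illustration, not the logical replication wire format or any engine's real API: the change records, the `ClientReplica` class, and the `fanOut` function are assumptions made for the example. It also ignores the harder case where an update moves a row into or out of a shape.

```javascript
// Illustrative change records, loosely modeled on what a consumer of the
// write-ahead log might see. The field names are invented for this sketch.
const changeLog = [
  { table: "issues", op: "insert", row: { id: 1, title: "Fix login", assigned_to: "alice" } },
  { table: "issues", op: "insert", row: { id: 2, title: "Update docs", assigned_to: "bob" } },
  { table: "issues", op: "update", row: { id: 1, title: "Fix login flow", assigned_to: "alice" } },
  { table: "issues", op: "delete", row: { id: 2, title: "Update docs", assigned_to: "bob" } },
];

// Hypothetical client-side replica holding only the rows in its shape.
class ClientReplica {
  constructor(shape) {
    this.shape = shape;
    this.rows = new Map(); // primary key -> row
  }

  apply(change) {
    if (change.op === "delete") {
      this.rows.delete(change.row.id);
    } else {
      this.rows.set(change.row.id, { ...change.row }); // insert or update
    }
  }
}

// Deliver a change only to clients whose shape it matches.
// Returns how many clients received it.
function fanOut(change, clients) {
  let delivered = 0;
  for (const client of clients) {
    const { table, where } = client.shape;
    if (table === change.table && where(change.row)) {
      client.apply(change);
      delivered += 1;
    }
  }
  return delivered;
}

const alice = new ClientReplica({ table: "issues", where: (r) => r.assigned_to === "alice" });
const bob = new ClientReplica({ table: "issues", where: (r) => r.assigned_to === "bob" });

for (const change of changeLog) {
  fanOut(change, [alice, bob]);
}
```

After the log is replayed, each replica holds only the rows its own shape covers, and neither client ever received the other's data.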

Previously, with progressive web apps (PWAs), developers manually crafted imperative code to re-execute failed requests upon reconnecting. While functional, this method was fragile and led to a subpar developer experience. Contemporary sync engines provide a more refined solution by automating this complex process.
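
To see what a sync engine automates here, consider a hand-rolled version of the retry logic. The `OutboundQueue` class and its `send` callback are hypothetical, and `send` is synchronous only to keep the sketch short; real network calls would be asynchronous.

```javascript
// Hypothetical write queue: mutations are recorded locally first and
// delivered to the server in order whenever a connection is available.
class OutboundQueue {
  constructor(send) {
    this.send = send; // callback that delivers one mutation; throws on failure
    this.pending = [];
    this.online = false;
  }

  // Record a local write and try to deliver it right away.
  write(mutation) {
    this.pending.push(mutation);
    this.flush();
  }

  // Called when connectivity changes, e.g. from the browser's online/offline events.
  setOnline(online) {
    this.online = online;
    this.flush();
  }

  // Deliver queued mutations oldest-first; stop at the first failure so that
  // ordering is preserved and nothing is dropped.
  flush() {
    while (this.online && this.pending.length > 0) {
      try {
        this.send(this.pending[0]);
      } catch (error) {
        return; // leave the mutation queued for the next flush
      }
      this.pending.shift();
    }
  }
}

const received = [];
const queue = new OutboundQueue((mutation) => received.push(mutation));

queue.write({ op: "insert", id: 1 }); // offline: stays queued
queue.write({ op: "update", id: 1 }); // offline: stays queued
const queuedWhileOffline = queue.pending.length;
queue.setOnline(true); // reconnect: both mutations are replayed in order
```

Even this small version has to reason about ordering, partial failure, and connectivity state, which is the kind of bookkeeping a sync engine takes off the application developer's hands.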

You might now be considering: “Browsers are temporary! What if users clear their cache?”

The sync engine addresses this valid concern. To grasp the mechanism, consider the Git architecture:

  • The remote repository (GitHub) serves as the definitive source of truth.
  • The local copy (on your device) holds the active working data.

Should a user clear their browser cache, their data isn’t lost; they’ve merely removed their local data repository. Upon subsequent login, the sync engine effectively executes a git clone operation, retrieving their defined ‘data shape’ back onto the device.

Understanding Conflict-Free Replicated Data Types (CRDTs)

Yet, a crucial question arises: what if two users modify the same data concurrently while offline? In conventional databases, the most recent write would typically override prior changes. Therefore, the synchronization logic must be exceptionally advanced to manage numerous clients working on distinct ‘shapes’ and consistently pushing their updates to the central data store.

This is precisely where CRDTs (Conflict-Free Replicated Data Types) become essential.

CRDTs are a collection of mathematical constructs that sound abstract but offer highly practical solutions to synchronization problems. These data structures (such as a Map or List) are designed so that concurrent changes can always be merged automatically and deterministically. Think of the difference between a Git merge conflict, which halts progress until someone resolves it manually, and Google Docs, which combines multiple users’ input without interruption. Using CRDT logic, sync engines can ensure that offline edits are preserved and merged smoothly once connectivity is re-established.
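
As an illustration, here is a minimal sketch of a last-writer-wins map, one of the simplest CRDTs. It is not taken from any particular sync engine: the `LwwMap` class is hypothetical, and the integer timestamps are placeholders for the logical or hybrid clocks that production systems typically use.

```javascript
// Decide which of two writes wins. Ties on timestamp are broken by replica id
// so that every replica reaches the same answer.
function isNewer(a, b) {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp;
  return a.replicaId > b.replicaId;
}

class LwwMap {
  constructor(replicaId) {
    this.replicaId = replicaId;
    this.entries = new Map(); // key -> { value, timestamp, replicaId }
  }

  set(key, value, timestamp) {
    this.mergeEntry(key, { value, timestamp, replicaId: this.replicaId });
  }

  get(key) {
    const entry = this.entries.get(key);
    return entry ? entry.value : undefined;
  }

  mergeEntry(key, incoming) {
    const current = this.entries.get(key);
    if (!current || isNewer(incoming, current)) {
      this.entries.set(key, incoming);
    }
  }

  // Merging is commutative, associative, and idempotent, so replicas can
  // exchange state in any order, any number of times, and still converge.
  merge(other) {
    for (const [key, entry] of other.entries) {
      this.mergeEntry(key, entry);
    }
    return this;
  }
}

const laptop = new LwwMap("laptop");
const phone = new LwwMap("phone");

laptop.set("title", "Buy milk", 1);
phone.merge(laptop); // both replicas start from the same state

// Offline edits on each device, touching different fields.
laptop.set("title", "Buy oat milk", 2);
phone.set("done", true, 2);

// Back online: exchange state in both directions.
laptop.merge(phone);
phone.merge(laptop);
```

Both devices end up with the renamed title and the completed flag. When two replicas write the same key, the newer stamp wins and the tiebreak on replica id keeps the outcome identical everywhere, at the cost of discarding the losing value; richer CRDTs exist for cases where that is not acceptable.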

Allow me to anticipate your next thought: This introduces considerable architectural overhead. We now have dual databases, a dedicated syncing engine, and the ‘shape’ seemingly duplicates a SELECT statement that would ordinarily reside on the server.

In essence, we’ve taken on distributed computing, a notoriously complex domain, at the data store level. The aim is to avoid sluggish data loading, at the cost of diverging from established patterns like the JSON API and REST.

Given our willingness to embrace such complexities, we must be anticipating significant benefits, wouldn’t you agree?

Transcending the JSON API

This novel methodology enables an aspiration web developers have pursued for two decades: achieving a desktop-grade application experience.

Interacting with local data allows a level of UI responsiveness that is unattainable when every interaction waits on a network request. Embedding a full PostgreSQL instance offers a more complete solution than partial approaches like local caches. It also presents an intriguing potential improvement in developer experience.

By removing the backend API, we shed an entire layer of coupling that has long been a burden for web developers. The objective is to circumvent the manual translation of client-side data into a transport format, then into a data store format, and back again. (Frameworks like HTMX also aspire to this, albeit through alternative means.)

Under the local-first data paradigm, the manual marshaling of JSON is eliminated. We simply compose our desired SQL statements, and the sync engine autonomously manages data transport according to predefined rules. This means no longer crafting a GET /todos endpoint; instead, we embed an SQL query directly within our component, such as: SELECT * FROM todos.
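
The contrast can be sketched in a few lines. To keep the example self-contained, `createStubDb` below is a hypothetical in-memory stand-in that understands exactly one query; its async `query(sql)` method returning `{ rows }` is modeled on the call shape PGlite exposes, but nothing here is PGlite itself. The `renderTodos` function plays the role of the component.

```javascript
// Hypothetical stand-in for a local database client. It only knows how to
// answer the one query used in this sketch.
function createStubDb(todos) {
  return {
    async query(sql) {
      if (sql !== "SELECT * FROM todos") {
        throw new Error(`stub cannot run: ${sql}`);
      }
      return { rows: todos.map((todo) => ({ ...todo })) };
    },
  };
}

// The "component": it asks the local database directly. There is no fetch()
// call, no GET /todos endpoint, and no JSON parsing step in between.
async function renderTodos(db) {
  const { rows } = await db.query("SELECT * FROM todos");
  return rows
    .map((todo) => `${todo.done ? "[x]" : "[ ]"} ${todo.title}`)
    .join("\n");
}

const db = createStubDb([
  { id: 1, title: "Write article", done: true },
  { id: 2, title: "Ship demo", done: false },
]);

renderTodos(db).then((text) => console.log(text));
```

In a real application the `db` argument would be an actual local database instance, and keeping it in step with the server would be the sync engine's job rather than the component's.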

The roles of IndexedDB and OPFS

While PGlite is a captivating technological advance, it’s not the whole story in the local-first data landscape. It is part of a broader ecosystem of technologies. Developers have, after all, long sought ways to persist data on the client, including localStorage, cookies, and IndexedDB.

IndexedDB represents a genuine effort to give browsers database capabilities (and can indeed serve as a storage backend for PGlite instances). However, it’s hindered by a cumbersome API and inherent performance bottlenecks. It functions more as a key-value object store than a full database engine, lacking support for joins, constraints, or a declarative query language. Complex operations require custom JavaScript logic to filter data in memory, which hurts performance. In practice, it is often unwieldy for real-world applications.

IndexedDB served as a crucial intermediate step but not the ultimate solution. The bedrock of this modern development is formed by WebAssembly and the Origin Private File System (OPFS). These innovations enable us to cease reimplementing databases in JavaScript and instead integrate battle-tested database engines directly into the client environment.

OPFS: The high-performance file system

Though it may seem like an esoteric browser specification, OPFS is pivotal for contemporary local-first architectures. If WASM delivers the runtime environment, OPFS furnishes the essential file system capabilities.

OPFS gives web applications direct, high-speed access to a private, per-origin file system on the user’s device. In contrast to IndexedDB, which reads and writes whole objects, OPFS supports random-access writes. This capability allows databases such as PGlite to alter a small 4KB data page within a 1GB file without needing to rewrite the entire file. This is the crucial advance that lets server-class databases run in browsers with near-native performance.
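
The page-level write can be illustrated with a short sketch. In a browser, the handle would come from calling `createSyncAccessHandle()` on an OPFS file handle inside a worker; so that the example runs anywhere, `FakeAccessHandle` below is a hypothetical in-memory stand-in that copies only the `write(buffer, { at })` and `read(buffer, { at })` call shape of the real `FileSystemSyncAccessHandle`. The `writePage` and `readPage` helpers are likewise illustrative.

```javascript
const PAGE_SIZE = 4096; // 4KB database page

// Hypothetical in-memory stand-in for an OPFS sync access handle.
class FakeAccessHandle {
  constructor(size) {
    this.bytes = new Uint8Array(size);
    this.bytesWritten = 0; // tracks how much data writes actually touched
  }

  write(buffer, { at = 0 } = {}) {
    this.bytes.set(buffer, at);
    this.bytesWritten += buffer.byteLength;
    return buffer.byteLength;
  }

  read(buffer, { at = 0 } = {}) {
    const slice = this.bytes.subarray(at, at + buffer.byteLength);
    buffer.set(slice);
    return slice.byteLength;
  }
}

// Overwrite exactly one page in place, at its byte offset within the file.
function writePage(handle, pageNumber, page) {
  if (page.byteLength !== PAGE_SIZE) {
    throw new RangeError(`page must be exactly ${PAGE_SIZE} bytes`);
  }
  return handle.write(page, { at: pageNumber * PAGE_SIZE });
}

// Read one page back without touching the rest of the file.
function readPage(handle, pageNumber) {
  const page = new Uint8Array(PAGE_SIZE);
  handle.read(page, { at: pageNumber * PAGE_SIZE });
  return page;
}

const handle = new FakeAccessHandle(16 * PAGE_SIZE); // a small 64KB "file"
writePage(handle, 3, new Uint8Array(PAGE_SIZE).fill(7));
```

Only 4,096 bytes are written no matter how large the file is, which is the property a page-oriented database engine depends on.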

RxDB: The NoSQL counterpart

While PGlite spearheads the ‘SQL on the client’ movement, RxDB (Reactive Database) stands as its NoSQL parallel. It builds upon the legacy of PouchDB, a long-established, in-browser NoSQL database with years of practical deployment.

Whereas PGlite emphasizes replicating server-side database structure on the client, RxDB prioritizes the dynamics of contemporary user interfaces. Its design centers on reactivity (as indicated by the Rx prefix). Unlike conventional databases where queries yield static results, RxDB allows you to subscribe to a query:

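// In RxDB, the database IS the state manager.
// Assumes `db` is an existing RxDB database with a `todos` collection
// and `render` is your own UI update function.
// `$` exposes the query as an observable that re-emits whenever its results change.
db.todos.find().$.subscribe(todos => {
  render(todos);
});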

As the sync engine delivers new data from the server, your UI reflects the updates immediately. This can reduce or even remove the need for state management libraries such as Redux or Pinia, because the database itself acts as the reactive source of truth.
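
The mechanism behind a subscribable query can be illustrated without RxDB. The `ReactiveCollection` class below is a hypothetical toy, not RxDB's implementation: it simply re-runs each subscribed query and notifies its listener whenever a document is inserted.

```javascript
// Hypothetical toy collection whose queries can be subscribed to.
class ReactiveCollection {
  constructor() {
    this.docs = [];
    this.subscriptions = new Set();
  }

  // Register a query (a filter function) and a listener. The listener gets the
  // current results immediately and again after every insert.
  subscribe(query, listener) {
    const subscription = { query, listener };
    this.subscriptions.add(subscription);
    listener(this.docs.filter(query));
    return () => this.subscriptions.delete(subscription); // unsubscribe
  }

  insert(doc) {
    this.docs.push(doc);
    for (const { query, listener } of this.subscriptions) {
      listener(this.docs.filter(query));
    }
  }
}

const todos = new ReactiveCollection();
const renders = []; // stands in for the UI: each entry is one "render"

const unsubscribe = todos.subscribe(
  (todo) => !todo.done,
  (openTodos) => renders.push(openTodos.map((todo) => todo.title)),
);

todos.insert({ title: "Write article", done: false }); // e.g. arriving via sync
todos.insert({ title: "Old task", done: true });
unsubscribe();
todos.insert({ title: "Ignored after unsubscribe", done: false });
```

The listener never asks for data; it is handed fresh results whenever the underlying documents change, which is why a separate client-side store can become optional.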

Conclusion

The browser has evolved beyond a mere document viewer or basic terminal. However, the burgeoning local-first architecture represents a significant departure from our accustomed REST and REST-like solutions, introducing its own set of complexities.

With a standardized cross-runtime API (WinterTC) and the arrival of production-ready local databases (PGlite and RxDB), the browser has the potential to become a comprehensive application platform. Yet whether it will fully displace established approaches remains to be seen. Such changes won’t happen quickly, because familiarity exerts considerable inertia in the programming ecosystem.

The combination of local-first design and robust syncing mechanisms might someday challenge the dominance of JSON and REST. However, this will first require thorough validation of its real-world viability.
