Real-time collaboration is a big part of what we love about modern software – it drives the productivity gains we all feel in our day-to-day lives. Disparate teams can now work together on the same document, at the same time, seeing changes instantly as they work. In 2023, real-time collaboration feels like table stakes for modern software, but that doesn’t make it any easier to build. Any developer will tell you that the journey from vision to a working collaborative, real-time application is riddled with challenges.
This article is about those challenges, some common solutions, and how to approach building the real-time, collaborative experience your business and customers need. To understand how we got to where we are today, let’s start with a bit of history.
Tech Giants Innovate
Given the complexities, early examples of truly collaborative software were dominated by tech giants like Google and Microsoft. Setting our wayback machine to 2006: Google acquired a product called Writely, which laid the foundation for what later became Google Docs. Nearly 20 years later, sending files back and forth over email has gone the way of the fax machine, but the bar for developing real-time text editing applications remains very high.
Google had the foresight, resources, time, and expertise to build and scale a collaborative solution. But what’s the little guy to do?
Empowering Developers (CKEditor 4 → 5)
That’s where editors like CKEditor, TinyMCE and Tiptap come in. CKEditor's own journey paints a vivid picture of just how challenging real-time editing can be. CKEditor 4 had tremendous adoption and was a beloved product with collectively more than 50 person-years of development behind it. For all the product’s strengths, the market pushed CKEditor in a new direction, one where collaboration and real-time editing were quickly becoming the norm. The team had to completely rethink and rebuild the entire solution. The result, CKEditor 5, took four intense years to perfect, but it delivered an enormously flexible and future-proof architecture that made real-time editing possible, along with the accompanying Cloud back-end SaaS. It also provided a base framework that allows for fully customizable solutions from a UX perspective.
Why is Real-Time Editing (still) Hard?
Conflict Resolution Algorithm
The conflict resolution algorithm is at the heart of any real-time editing system. It addresses the inevitable challenges brought by network communication latencies, which result in editing conflicts.
Decades of computer science research have sought solutions to these conflicts, and several notable methods have been developed. If you are building real-time text editing, you’ll have to explore these approaches:
- Operational Transformation (OT): A method embraced by platforms like CKEditor. Its versatility allows for seamless collaboration, but even this established approach has room for improvement. CKEditor's team, for instance, had to entirely reshape their definition of "operations" in their OT algorithm to support a richer text data model. Their journey, filled with intricate challenges, is documented in their blog post.
- Conflict-Free Replicated Data Type (CRDT): Adopted by Yjs, a renowned library that underpins the collaborative modes of various editors, including Tiptap.
- Total Order Broadcast: Microsoft's choice for their Fluid Framework, which powers many of their products. Aimed at achieving high performance and scalability, this algorithm showcases the continuous quest for enhanced collaborative tools.
- Server Reconciliation: The approach used in multiplayer video games and adopted by platforms such as Replicache and Reflect. It lets you run custom, authoritative conflict resolution logic on the server, which makes it possible to enforce arbitrary business logic, fine-grained authorization, server-side integrations, and more.
While these are among the best-known algorithms, implementing them is no walk in the park! Real-world applications demand extensive research and fine-tuning.
Now think about all the other more complex elements that can be present in a document: tables, images, custom blocks. Conflict resolution has to cover all the internals of any of these blocks! The search for optimal solutions in this domain remains a real challenge and no case is the same.
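To make the idea behind Operational Transformation a little more concrete, here is a minimal sketch in TypeScript. It assumes a plain-string document and a single, simplified operation type (`InsertOp`); the type and function names are hypothetical and are not taken from CKEditor or any other library. Real editors deal with deletions, formatting, and nested structures, so treat this as an illustration of the principle only.

```typescript
// Hypothetical, simplified operation: insert `text` at a character index.
type InsertOp = { position: number; text: string; clientId: string };

// Apply an insert to a plain-string document.
function applyInsert(doc: string, op: InsertOp): string {
  return doc.slice(0, op.position) + op.text + doc.slice(op.position);
}

// Rewrite `op` so it can be applied AFTER `applied`, a concurrent insert
// that was made against the same original document.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
  // If both inserts target the same index, the client id breaks the tie
  // so that every replica orders them the same way.
  const appliedGoesFirst =
    applied.position < op.position ||
    (applied.position === op.position && applied.clientId < op.clientId);
  return appliedGoesFirst
    ? { ...op, position: op.position + applied.text.length }
    : op;
}

const base = "Hello world";
const fromAlice: InsertOp = { position: 5, text: ",", clientId: "alice" };
const fromBob: InsertOp = { position: 11, text: "!", clientId: "bob" };

// Replica 1 sees Alice's edit first, then Bob's (transformed).
const replica1 = applyInsert(
  applyInsert(base, fromAlice),
  transformInsert(fromBob, fromAlice)
);

// Replica 2 sees Bob's edit first, then Alice's (transformed).
const replica2 = applyInsert(
  applyInsert(base, fromBob),
  transformInsert(fromAlice, fromBob)
);

console.log(replica1 === replica2, replica1);
```

Both replicas end up with the same text even though they received the edits in a different order. That convergence is the property a conflict resolution algorithm has to guarantee for every pair of operations it supports.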
Network Communications
After mastering conflict resolution, the next hurdle is networking: how can clients efficiently exchange data and share edits in real time? One might consider enabling direct communication between clients. However, this approach has limitations, including:
- Browser Constraints: Modern browsers set restrictions on the number of simultaneous connections a webpage can initiate. This becomes a bottleneck, especially when your aim is a scalable solution catering to potentially hundreds of collaborators.
- Loss of Data Consistency: Without a centralized system, identifying which client holds the most up-to-date version becomes a challenge. This decentralized approach can jeopardize data integrity.
Given these challenges, a more viable solution emerges: a Centralized Communication Hub. With this structure, all clients connect to and communicate through a central server. This server not only maintains the latest version of the document but also manages the inflow and outflow of edits between clients, ensuring real-time collaboration is consistent.
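As a rough sketch of that hub, the TypeScript below keeps everything in memory and leaves out the actual network transport. The `CollaborationHub` class and its method names are hypothetical, invented for this illustration. The hub gives every incoming edit a revision number, stores it in a log, and relays it to every other connected client.

```typescript
type Edit = { clientId: string; payload: string };
type SequencedEdit = Edit & { revision: number };
type Listener = (edit: SequencedEdit) => void;

// In-memory stand-in for the central server (no real networking here).
class CollaborationHub {
  private revision = 0;
  private readonly log: SequencedEdit[] = [];
  private readonly listeners = new Map<string, Listener>();

  // Register a client and return the revision it should start from.
  connect(clientId: string, listener: Listener): number {
    this.listeners.set(clientId, listener);
    return this.revision;
  }

  disconnect(clientId: string): void {
    this.listeners.delete(clientId);
  }

  // Give the edit the next revision number, record it, and relay it
  // to every connected client except its author.
  submit(edit: Edit): SequencedEdit {
    this.revision += 1;
    const sequenced: SequencedEdit = { ...edit, revision: this.revision };
    this.log.push(sequenced);
    this.listeners.forEach((listener, id) => {
      if (id !== edit.clientId) listener(sequenced);
    });
    return sequenced;
  }

  // Let a reconnecting client catch up on the edits it missed.
  editsSince(revision: number): SequencedEdit[] {
    return this.log.filter((e) => e.revision > revision);
  }
}

const hub = new CollaborationHub();
const receivedByBob: SequencedEdit[] = [];

hub.connect("alice", () => {});
hub.connect("bob", (edit) => {
  receivedByBob.push(edit);
});

hub.submit({ clientId: "alice", payload: "insert 'Hi' at 0" });
hub.submit({ clientId: "bob", payload: "insert '!' at 2" });
```

Because the hub alone hands out revision numbers, every client agrees on the order of edits, and a client that drops off can ask for everything after the last revision it saw. In a production system the listeners would typically be WebSocket connections, and the log would live in durable storage.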
Backend Infrastructure (the Unsung Hero)
Delving into backend infrastructure quickly unveils the depth and breadth of challenges awaiting solution architects and developers. Each aspect plays a crucial role in ensuring a seamless, real-time editing experience.
- Scalability: The backbone of any real-time solution is its ability to handle a massive volume of connections simultaneously. The question becomes: Can your infrastructure withstand peak loads and high-traffic moments?
- Security: Determining user permissions is paramount. Which clients can access specific documents, and to what extent? This involves looking up permissions in your database, authenticating users, and validating an access token for each client connection.
- Data Persistence: With the server acting as the primary source of truth, there's an immediate need to securely and reliably store the documents. This isn't just about having space; it's about ensuring data integrity and quick retrieval.
- Multimedia Data Handling: Beyond plain text, documents today encompass rich media like images, videos, and attached files. Effective storage solutions for such content are essential, as is defining access controls to safeguard sensitive media.
- Communication Protocol: The nature of data transferred over the network, its format, and its compression are all pivotal. It's crucial to devise strategies that ensure even users with slower internet connections have a fluid experience. The endeavors of the CKEditor team shed light on this intricate process: How CKEditor Optimized Data Traffic for Real-time Collaboration.
- Notifications and Alerts: A comprehensive collaboration experience is punctuated by timely notifications. Whether someone edits a document, leaves comments, or mentions another user, the server must flawlessly track and notify relevant parties.
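To give a flavor of the security point above, here is a small TypeScript sketch of a per-connection permission check. Everything in it is an assumption made for illustration: the `AccessClaims` shape, the three roles, and the `isAllowed` function are hypothetical. It also assumes the token's signature has already been verified elsewhere, so only the decoded claims are inspected.

```typescript
type Role = "viewer" | "commenter" | "editor";
type Action = "read" | "comment" | "write";

// Hypothetical claims decoded from an already-verified access token.
interface AccessClaims {
  userId: string;
  expiresAt: number; // Unix time in milliseconds
  documentRoles: Record<string, Role | undefined>;
}

// Which actions each role may perform.
const allowedActions: Record<Role, readonly Action[]> = {
  viewer: ["read"],
  commenter: ["read", "comment"],
  editor: ["read", "comment", "write"],
};

// Decide whether the token holder may perform `action` on `documentId`.
function isAllowed(
  claims: AccessClaims,
  documentId: string,
  action: Action,
  now: number = Date.now()
): boolean {
  if (now >= claims.expiresAt) return false; // expired token
  const role = claims.documentRoles[documentId];
  if (role === undefined) return false; // no access to this document
  return allowedActions[role].indexOf(action) !== -1;
}

const claims: AccessClaims = {
  userId: "user-42",
  expiresAt: 2_000,
  documentRoles: { "doc-1": "editor", "doc-2": "viewer" },
};

const canWriteDoc1 = isAllowed(claims, "doc-1", "write", 1_000);
const canCommentDoc2 = isAllowed(claims, "doc-2", "comment", 1_000);
const canReadAfterExpiry = isAllowed(claims, "doc-1", "read", 3_000);
```

Running the check on the server for every connection and every incoming edit, rather than trusting the client, keeps the central server the single authority on who can do what.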
Let’s Navigate this Landscape, Together
Hopefully we helped break down some of the complexity of building real-time, collaborative apps. There’s no shortage of complexity, but neither is there a shortage of opportunities. The challenges you’ll face are the moat that will give your app a competitive advantage.
What would your customers and users be able to accomplish with real-time editing, collaboration, voice typing, generative AI, and other dynamic experiences? We’re here to help you explore those opportunities from the design phase to navigating the complex landscape of editors and technical implementation challenges.
We've walked this path, we understand its nuances, and we have the expertise to guide you through. With us, you’re not just getting a solution, but a partnership dedicated to elevating your real-time collaboration experience. If you’re starting your journey to a more collaborative, real-time app experience and need advice, reach out and schedule some time. No strings attached, we’re happy to share our experience.