Why Rollups Are Not Shards: Deconstructing the Logical Fallacies

Jul 03, 2024

https://x.com/VitalikButerin/status/1331428783635656704

Rollups are not holistic blockchain scaling solutions; rather, they are centralized mini-blockchains that post their opaque proofs to Ethereum L1.

In one of his recent blog posts, Vitalik argued that “rollups are shards.” For me, this was a surreal moment because Vitalik is someone I respect a lot. As someone who has worked with passion and grit on blockchain architectures and developed new distributed ledger technologies over the past 12 years, I found this to be one of the most intellectually dishonest takes I have ever seen from Vitalik.

In this write-up, I will deconstruct that narrative and tell you why rollups are NOT shards but a band-aid solution that brings back all the headaches Ethereum tried to dodge by avoiding sharding in the first place.

I will also show you how “The Rollup Centric Scalability Roadmap” is leading Ethereum down a path of unnecessary complexity, fragmentation, and technical debt.


Understanding Sharding

Let's start from the basics.

What is sharding?

In the context of distributed ledger networks, sharding is a method used to partition the ledger into smaller, more manageable pieces called shards. This technique allows for more transactions to be processed in parallel, thereby improving scalability and performance.

There are four primary types of sharding, each with its own characteristics and trade-offs: Network Topology Sharding, Transaction Sharding, Execution Sharding, and State Sharding.

Network Topology Sharding

Network Topology Sharding is the most basic form of sharding in a distributed network. In this approach, the network is sharded based on geographical or logical clusters. For example, one cluster could be North America, another could be Europe, and another could be Asia.

The idea is that by grouping nodes that are physically close together, the network can achieve better finality and responsiveness. Transactions originating from a client in Europe would be sent to validators also located in Europe, making the process more efficient for the user.

However, this method relies heavily on centralized entities for geolocation and IP management, making it difficult to audit and verify. Moreover, ensuring that all validators in a specific region are genuinely located there adds another layer of complexity.

Transaction Sharding

Transaction Sharding takes a different approach by sharding the network based on transaction hashes. Transactions are assigned to different validators based on their hash ranges. For example, transactions with hashes in a certain range go to one set of validators, while those in another range go to a different set. This method is simpler to verify than ‘Network Topology Sharding’ because you can check which validators are responsible for specific transactions.

However, it introduces issues around composability and state fragmentation. For instance, if a transaction needs information from another shard, it requires cross-shard communication, which can be complex and inefficient with this model. This fragmentation can lead to situations where parts of the state, such as account balances or UTXOs, are scattered across different shards, necessitating constant communication between them.
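The hash-range assignment described above can be sketched in a few lines of Python. This is only an illustration: the shard count, the choice of SHA-256, and the function name are assumptions, not taken from any particular protocol.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for_transaction(tx_bytes: bytes, num_shards: int = NUM_SHARDS) -> int:
    """Map a transaction to a shard by the range its hash falls into."""
    digest = hashlib.sha256(tx_bytes).digest()
    # Interpret the first 8 bytes of the hash as an integer in [0, 2**64).
    value = int.from_bytes(digest[:8], "big")
    # Split the 64-bit hash space into num_shards contiguous ranges.
    range_size = 2**64 // num_shards
    # Clamp so leftover values at the top of the space go to the last shard.
    return min(value // range_size, num_shards - 1)
```

Anyone can recompute the hash and check which validator set should have handled a transaction, which is what makes this scheme easy to verify. Note, however, that two transactions touching the same account will generally hash into different ranges, which is where the fragmentation problem comes from.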

Execution Sharding

Source: Ethereum 2.0 overall architecture. Original diagram by Hsiao-Wei Wang

In execution sharding, the network is divided into several shards, each operating as an independent blockchain. Transactions and smart contracts are distributed across these shards, with each shard processing its own set of transactions and maintaining its own state. Cross-shard communication protocols facilitate interactions between different shards when necessary.
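To give a feel for what "cross-shard communication" involves, here is a rough Python sketch of a two-step, receipt-style transfer between two independent shards. The `Shard` and `Receipt` types and all method names are hypothetical, and the sketch simply trusts the receipt; a real design would also have to prove to the destination shard that the receipt was actually included on the source shard.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Receipt(NamedTuple):
    receipt_id: int
    source_shard: int
    dest_shard: int
    recipient: str
    amount: int


class Shard:
    """An independent chain with its own balances (hypothetical toy model)."""

    def __init__(self, shard_id: int, balances: Dict[str, int]) -> None:
        self.shard_id = shard_id
        self.balances = dict(balances)
        # (source_shard, receipt_id) pairs that were already credited here.
        self.consumed: Set[Tuple[int, int]] = set()

    def debit_and_emit(self, sender: str, dest_shard: int, recipient: str,
                       amount: int, receipt_id: int) -> Receipt:
        """Step 1, on the source shard: remove funds locally, emit a receipt."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid or unfunded transfer")
        self.balances[sender] -= amount
        return Receipt(receipt_id, self.shard_id, dest_shard, recipient, amount)

    def consume(self, receipt: Receipt) -> None:
        """Step 2, on the destination shard: credit the funds exactly once."""
        if receipt.dest_shard != self.shard_id:
            raise ValueError("receipt addressed to another shard")
        key = (receipt.source_shard, receipt.receipt_id)
        if key in self.consumed:
            raise ValueError("receipt already consumed")
        self.consumed.add(key)
        self.balances[receipt.recipient] = (
            self.balances.get(receipt.recipient, 0) + receipt.amount
        )
```

Even in this toy form, a single transfer becomes two operations on two chains, with replay protection and an in-between period where the funds exist on neither shard.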

Originally part of Ethereum's scaling roadmap, execution sharding was eventually revised due to the significant complexity it adds to the protocol. Ensuring efficient and secure communication between shards presents challenges, and each shard must maintain adequate security on its own, which is harder than securing a single chain or a network using the final sharding model, state sharding.

State Sharding

Source: https://www.radixdlt.com/blog/cerberus-infographic-series-chapter-ii

In this approach, individual pieces of state are distributed across the network. Validators are responsible for specific pieces of state, and these pieces can be scattered throughout the network. The primary advantage of State Sharding is its unified nature and efficiency; it eliminates the need for complex communication protocols between shards. The state remains in a fixed location, and validators move around to access it as needed.

This method reduces complexity and ensures better load balancing across the network. Since all validators use the same primitives and proofs, auditing and verifying transactions become much simpler. State Sharding also avoids the pitfalls of state yanking (moving state from one shard to another), making it a more scalable and efficient solution.
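As a very rough illustration of "state stays put, validators come to it", the Python sketch below maps each piece of state to a fixed shard and works out which shards a transaction would need to involve. It is a toy mapping, not the Cerberus design; the shard count, the hashing scheme, and the function names are all assumptions.

```python
import hashlib
from typing import Iterable, Set

NUM_SHARDS = 16  # hypothetical shard count, chosen only for illustration


def shard_for_state(state_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Each piece of state has one fixed home, derived from its identifier."""
    digest = hashlib.sha256(state_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_involved(touched_state: Iterable[str],
                    num_shards: int = NUM_SHARDS) -> Set[int]:
    """A transaction only involves the shards holding the state it touches."""
    return {shard_for_state(state_id, num_shards) for state_id in touched_state}
```

The point of the sketch is that the mapping never changes: state is not moved between shards to serve a transaction, so every validator set can apply the same rules to whatever state it is responsible for.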

For a deeper understanding, here’s the link to my recent post on “State Sharding”:

Why "Sharding Splits Security" Is a Mid-Curve Argument

Dan Hughes

June 27, 2024


Why Did Ethereum Pivot from Sharding to a Rollup Centric Scalability Roadmap?

So why did Ethereum move away from execution sharding, which was originally part of Ethereum's roadmap, in favor of a rollup-centric approach?

In his recent article, Vitalik gives a set of reasons for moving away from execution sharding, most of them rooted in the significant challenges Ethereum developers encountered in implementing it.

Expensive cross-shard communication, ensuring data availability, and maintaining state consistency and integrity across shards proved to be complex problems. Coordinating validators across shards and preventing attacks like single-shard takeovers were also difficult, because each shard acts as an independent chain.

On top of all that, sharding demanded a complete overhaul of Ethereum's core architecture and consensus mechanism.

It was like trying to rebuild a plane while flying it…

The risks and complexity outweighed any potential gains. So they made a bold call – ditch sharding and look for other ways to scale. This is understandable, if short-sighted, because sharding would have introduced incredible complexities to the original Ethereum L1, but that doesn’t mean the current path Ethereum is pursuing is great either.

In his article, Vitalik tries to whitewash the rollup-centric approach. But guess what? Rollups bring their own bag of complexity and security nightmares.

So let's go through each of those points and see why you end up with a band-aid solution that brings back all the headaches Ethereum tried to dodge by avoiding sharding in the first place and even more problems that were not present in the original model.

The Illusion of Execution Environment Diversity

There is this growing misconception surrounding the supposed benefits of diverse execution environments in Layer 2 solutions. And now Vitalik is projecting this as one of the biggest advantages of the path Ethereum is pursuing with Rollups.

However, this perceived diversity can introduce a lot of complexities and challenges that may outweigh its purported benefits. I argue that the notion of execution environment diversity is largely an illusion, fraught with pitfalls that undermine its practicality and security.

The Importance of State Model Consistency

The argument for diverse execution environments often stems from a misguided belief that different use cases necessitate radically different execution models.

However, this perspective fails to grasp the essence of blockchain's core purpose: maintaining a consistent, verifiable state.

Let's break this down.

At its core, distributed ledger technology revolves around state transitions governed by execution constraints. The rules of consensus then ensure that everyone agrees it was executed correctly.

Whether we're dealing with UTXOs, account-based models, financial transactions, social media interactions, or complex smart contracts, the fundamental requirement is to transition from one valid state to another in a deterministic and verifiable manner.

The execution environment's primary function is to facilitate these state transitions within the defined constraints.

Source: https://www.researchgate.net/figure/State-Transition-Function-in-Ethereum_fig1_357042521
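A deterministic state transition of the kind described above can be sketched in Python as a pure function from an old state and a transaction to a new state. The account-based `State`, the `Transfer` type, and the validity rules here are simplified assumptions for illustration, not any specific chain's rules.

```python
from typing import Dict, NamedTuple

State = Dict[str, int]  # account name -> balance (simplified assumption)


class Transfer(NamedTuple):
    sender: str
    recipient: str
    amount: int


def apply_transfer(state: State, tx: Transfer) -> State:
    """Return the next state, or raise if the transition is invalid."""
    if tx.amount <= 0:
        raise ValueError("amount must be positive")
    if state.get(tx.sender, 0) < tx.amount:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # never mutate the previous state
    new_state[tx.sender] = new_state[tx.sender] - tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state
```

Because the function has no hidden inputs, every node that applies the same transaction to the same state arrives at the same result, which is what consensus then agrees on.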

Introducing multiple execution environments doesn't solve any inherent problem. Instead, it exponentially increases the complexity for developers, auditors, and users.

Each unique execution environment necessitates its own set of interpreters, compilers, and verification tools. This fragmentation complicates cross-chain interactions, as verifying state and execution logic between disparate environments becomes a non-trivial task.

For instance, reconciling a UTXO-based transaction from one L2 with an account-based model in another introduces unnecessary complexity in state representation and validation.
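To see why the two models do not line up cleanly, here is a small Python sketch that derives account-style balances from a set of unspent outputs. The `Utxo` type and function name are hypothetical and heavily simplified.

```python
from typing import Dict, List, NamedTuple


class Utxo(NamedTuple):
    owner: str
    amount: int


def balances_from_utxos(utxos: List[Utxo]) -> Dict[str, int]:
    """Collapse a UTXO set into account-style balances (a lossy view)."""
    balances: Dict[str, int] = {}
    for utxo in utxos:
        balances[utxo.owner] = balances.get(utxo.owner, 0) + utxo.amount
    return balances
```

The mapping only goes one way: many different UTXO sets collapse to the same balances, so any system translating between the two models has to decide what to do with the structure that gets lost.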

Imagine having to understand and support 27 different execution environments just to confirm a simple token transfer from Alice to Bob.

Not only that, each new execution environment introduces its own set of potential vulnerabilities, edge cases, and failure modes, and the interactions between these environments create new potential exploit vectors. From a security auditing perspective, this diversity mandates expertise across multiple systems, significantly increasing the cognitive load on security professionals and, consequently, the likelihood of overlooked vulnerabilities.

If you want an idea of the mess this approach could cause, a quick Google search will list plenty of results highlighting issues between different Ethereum implementations. Some of these issues have been pretty serious, resulting in the network forking! Now imagine all these different implementations also have to support multiple execution environments!

This diversity doesn't enhance security or functionality; it merely obfuscates the verification process, making it increasingly likely that critical errors will slip through unnoticed.

It's not just impractical; it's fucking madness.

The Power of Constraint-Based Execution

Source: https://vitalik.eth.limo/general/2024/05/23/l2exec.html

The main argument often presented is that this diversity is necessary to support various use cases. However, I struggle to see many scenarios that genuinely require fundamentally different execution environments. Most use cases can be accommodated within a well-designed, Turing-complete execution environment with appropriate domain-specific constraints.

Take, for example, the approach used in systems like Scrypto.

By using a robust, Turing-complete language like Rust as the foundation, you provide developers with the full power of a general-purpose programming language. Guardrails and constraints specific to blockchain operations are then implemented as language extensions, defining what can and cannot happen to the state on the ledger. This ensures that all operations adhere to the fundamental rules of the system.

Let me elaborate on why this approach is so powerful.

The execution environment allows for virtually unlimited flexibility in how computations are performed. Developers can employ complex algorithms, mathematical operations, or even unconventional approaches to problem-solving.

For instance, if a developer needs to perform multiplication, they could implement it as a series of additions within a loop. The execution environment doesn't restrict how the computation is done, only the nature of the final output.

While computation is flexible, the rules governing state transitions are strict and well-defined. The ledger enforces a set of immutable conditions that must be met for any state change to be valid. These might include rules like non-negative balances, conservation of total supply, or specific structural requirements for data objects.

This approach creates a clear separation between the computation logic and the state transition rules.

The execution environment is concerned with how a result is computed, while the ledger is only concerned with whether the result meets the predefined criteria. This separation simplifies auditing and verification processes.
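The separation between "how a result is computed" and "whether the result is allowed" can be sketched as follows. This is Python standing in for the idea, not Scrypto code; the function names and the two constraints checked (non-negative balances and conserved total supply) are assumptions drawn from the examples above.

```python
from typing import Callable, Dict

State = Dict[str, int]  # account name -> balance (simplified assumption)


def multiply_by_addition(a: int, b: int) -> int:
    """Multiply a by a non-negative b using repeated addition in a loop."""
    total = 0
    for _ in range(b):
        total += a
    return total


def satisfies_ledger_constraints(old: State, new: State) -> bool:
    """The ledger's only concern: does the proposed result obey the rules?"""
    non_negative = all(balance >= 0 for balance in new.values())
    supply_conserved = sum(old.values()) == sum(new.values())
    return non_negative and supply_conserved


def pay_for_units(state: State, payer: str, payee: str, unit_price: int,
                  units: int, multiply: Callable[[int, int], int]) -> State:
    """Compute a cost with any multiply strategy, then let the ledger judge."""
    if unit_price < 0 or units < 0:
        raise ValueError("price and units must be non-negative")
    cost = multiply(unit_price, units)
    proposed = dict(state)
    proposed[payer] = proposed.get(payer, 0) - cost
    proposed[payee] = proposed.get(payee, 0) + cost
    if not satisfies_ledger_constraints(state, proposed):
        raise ValueError("proposed state violates ledger constraints")
    return proposed
```

Passing `multiply_by_addition` or an ordinary `lambda x, y: x * y` produces the same accepted state, and an auditor only has to read `satisfies_ledger_constraints` to know what the ledger will and will not permit.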

By using a single, well-defined execution environment with clear constraints, we eliminate the need for multiple, disparate execution paradigms. This unification significantly reduces complexity in areas such as cross-chain interactions, security auditing, and developer onboarding.

Despite operating within a unified framework, this approach allows for tremendous extensibility. New use cases can be accommodated by expanding the constraint set or introducing new data structures, all while maintaining compatibility with the core execution environment.

With a single, well-understood execution environment, security efforts can be focused and more effective. Vulnerabilities are easier to identify and address when they're not spread across multiple, fundamentally different environments.

By taking this approach you allow for flexible computation within the execution environment, but enforce strict rules on the final state transitions. The ledger doesn't care how you arrive at a result, only that the result meets predefined constraints.

The Ethereum Counterargument

Proponents of Ethereum's rollup-centric approach might argue that their system achieves a similar end by publishing proofs from various L2 rollups back to the Ethereum L1.

But this argument is fundamentally flawed.

Each distinct L2 potentially requires a different method of proof verification, adding unnecessary complexity to the L1.

The more diverse the L2 landscape, the more difficult it becomes to maintain a clear, auditable trail of state transitions across the entire ecosystem.
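The way verification logic multiplies with each new kind of L2 can be sketched as a registry of per-rollup verifiers. Everything here is hypothetical: the `L1Registry` class, the rollup identifiers, and the two "verifiers", which are plain hash checks standing in for real proof systems purely so the example runs.

```python
import hashlib
from typing import Callable, Dict

# A verifier takes (claimed_state_root, proof) and reports whether it accepts.
Verifier = Callable[[bytes, bytes], bool]


def toy_verifier_a(state_root: bytes, proof: bytes) -> bool:
    """Stand-in for one proof system (here just a SHA-256 check)."""
    return proof == hashlib.sha256(state_root).digest()


def toy_verifier_b(state_root: bytes, proof: bytes) -> bool:
    """Stand-in for a different, incompatible proof system (BLAKE2b here)."""
    return proof == hashlib.blake2b(state_root).digest()


class L1Registry:
    """Hypothetical base-layer view: one verification path per kind of L2."""

    def __init__(self) -> None:
        self._verifiers: Dict[str, Verifier] = {}

    def register(self, rollup_id: str, verifier: Verifier) -> None:
        self._verifiers[rollup_id] = verifier

    def accept(self, rollup_id: str, state_root: bytes, proof: bytes) -> bool:
        verifier = self._verifiers.get(rollup_id)
        if verifier is None:
            return False  # unknown rollup: nothing on L1 can judge this proof
        return verifier(state_root, proof)
```

A proof that one verifier accepts means nothing to another, so every additional proof format is one more code path that has to be written, audited, and kept correct.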

The argument that "it's all the same" because proofs are published to Ethereum L1 glosses over a critical issue: the fundamental incompatibilities in state representations across different L2s.

Take, for example, an L2 using a UTXO model versus one using an account-based model. When these disparate state representations are flattened into proofs on the L1, we lose the rich context and structure that makes each model efficient in its native environment.

This flattening process creates significant challenges for cross-L2 interactions:

How does one efficiently transfer assets or state between a UTXO-based L2 and an account-based L2?

The L1 becomes a bottleneck, forced to act as a complex translator between incompatible state models. This not only reduces efficiency but also introduces potential points of failure in cross-L2 transactions.

On top of all this, Ethereum L1 has a critical blindness when it comes to verifying L2 executions.

The Ethereum L1 Blind Trust Issue

The crux of the problem lies in what I call the “blind trust dilemma”: Ethereum L1 is inherently incapable of verifying the correctness of L2 executions.

This isn't just a minor oversight; it's a fundamental flaw that undermines the entire premise of trustless, decentralized computation.

The base layer, the bedrock of trust in the Ethereum ecosystem, is essentially flying blind when it comes to L2 operations.

Let that sink in for a moment.

And why is this the case? It boils down to the diversity of execution environments in L2 solutions and the fact that Ethereum's L1, designed with a specific execution model in mind, lacks the capability to interpret or validate the myriad of proof formats and execution results coming from these diverse L2s.

Each L2 can potentially implement its own unique execution model, complete with custom primitives, state representations, and proof systems. While this flexibility might seem advantageous on the surface, it introduces a critical vulnerability at the L1 level.

It's akin to a mathematician being asked to verify the correctness of a mathematical proof written in a notation they don't understand. They might nod and say, "Yes, this looks like a proof," but they have no way of knowing if it's actually valid or if it's complete gibberish.

I understand that this is probably the first time you are hearing about this, so let me break it down for you.

Posting “Opaque Proofs” on Ethereum L1 has Far-Reaching Implications

When L2 solutions submit proofs to the Ethereum L1, these proofs are essentially opaque. The L1 has no inherent mechanism to understand or verify the contents of these proofs.

In practice, this means Ethereum L1 is blindly accepting state transitions proposed by L2s. It's operating on what I call "proof theater": the appearance of verification without actual verification, or at best, reliance on designated third-party actors in the L1 ecosystem to attest to a proof's validity.

This is not merely a matter of processing different data formats; it involves understanding and validating fundamentally different computational models.

Many L2s are far less decentralized than Ethereum's L1. By extension, they are offloading critical computations to potentially centralized entities, which goes against the core ethos of distributed ledger technology.

For anyone trying to audit the system, it becomes an almost impossible task. You'd need to understand not just Ethereum's execution environment, but also the intricacies of every L2 solution out there. It's a Herculean task that borders on the impractical.

Consider this scenario: To audit a single transaction, you might need to:

Understand the specific L2's execution environment
Interpret its unique proof format
Comprehend how it represents state (UTXO, account-based, or something entirely different)
Know the potential failure modes of this specific L2
Understand how this L2's state translates back to L1

Now the proliferation of different L2 solutions, each with its own execution environment, exacerbates this problem. Ethereum L1 isn't equipped to interpret or validate computations from these diverse environments.

The security and decentralization of the base layer become largely irrelevant if the majority of activity and value is concentrated in centralized L2s. The decentralization of the L1 then merely serves as a record that all Ethereum nodes have seen the proof and agreed to include it on faith that it is valid. That holds little meaning when what truly matters is the decentralization of the L2s.

In the pursuit of scalability and speed, L2 solutions often compromise on decentralization, resulting in a less decentralized environment compared to the Ethereum base layer. This trade-off raises concerns about the overall security and transparency of the system.

In fact, a less decentralized Ethereum that could handle all transactions on L1 would be preferable, as it would allow for easier auditing and monitoring of all activities from an external perspective.

The Ethereum ecosystem is now essentially creating a multi-tiered system where the security of the entire network is only as strong as its weakest L2. The huge amount of economic security at the L1 is essentially doing nothing. Any compromise in an L2 could potentially propagate to the L1 without detection. It's a recipe for disaster.

There are No Organizational and Cultural Benefits

First and foremost, the analogy of a "free market" of rollups leading to innovation is fundamentally flawed.

What we're actually witnessing is a fragmentation of resources and a dilution of focus.

Instead of fostering innovation, this approach is creating a chaotic landscape where developers are forced to navigate a labyrinth of incompatible systems. It's akin to asking every web developer to build their own internet protocol before they can create a simple website.

The rollup-centric roadmap has effectively shifted the focus away from what truly matters: application development.

We're seeing the brightest minds wasting their talents on solving infrastructure problems that shouldn't exist in the first place.

Interoperability issues, bridging challenges – these are symptoms of a fractured ecosystem, not hallmarks of innovation.

The cultural argument for this approach is equally misguided. The idea that a diverse rollup ecosystem will lead to a thriving, competitive environment is naive at best.

In reality, it's creating a landscape where the majority of resources are being poured into infrastructure development rather than actual applications. It's akin to a city where everyone is busy building power stations, but no one is actually using the electricity to create anything of value.

Because every dev out there is building their own “Rollup”. Source: https://vitalik.eth.limo/general/2024/05/23/l2exec.html

This fragmentation doesn't just affect developers; it impacts users as well. The average user doesn't care about the intricacies of L2 solutions or the philosophical implications of decentralization. They want applications that work, period.

By forcing developers to grapple with these low-level concerns, we're creating barriers to entry and stifling the very innovation we claim to promote.

Same Problems, Different Package

With the “Rollup Centric Scalability Roadmap,” the Ethereum ecosystem has essentially repackaged the same fundamental problems it faced with “execution sharding” into a shiny new L2 wrapper called rollups. This sleight of hand does nothing to address the underlying issues; it merely shifts them to a different layer, potentially making them even worse.

Cross-shard communication issues in both sharding and rollups

The coordination challenges that plagued execution sharding have found a new home in rollups. In both scenarios, the complex task of managing cross-shard or cross-rollup communication is at play.

This isn't just a minor inconvenience; it's a fundamental issue that affects the entire system's efficiency and security.

In execution sharding, there was a struggle with efficiently moving data and state between shards without compromising the system's integrity. Now, with rollups, the exact same problem is being faced, just in a different context.

The question is how to ensure that a transaction on Arbitrum can seamlessly interact with a smart contract on Optimism.

The fundamental problem remains: How to maintain a coherent, unified state across a fragmented system without compromising on security, decentralization, or efficiency.

Fragmentation of State and Liquidity

The fragmentation of state and liquidity is another carry-over from the sharding approach. In execution sharding, the challenge was grappling with the fact that not all state was available everywhere at all times. This led to potential liquidity silos and increased complexity in state management.

Rollups haven't solved this problem; they've just shifted it to a different layer. Now, instead of shards, there are multiple L2 solutions, each with its own state and liquidity pool. The result is the same fragmentation that plagued Ethereum’s sharding efforts and caused them to be abandoned. Users and developers now have to navigate a complex landscape of different L2s, each with its own quirks and limitations.

Complexity Overload

This is where the situation becomes more problematic, resulting in worse outcomes than if the initial plan had been pursued.

The rollup-centric approach introduces a level of complexity that is orders of magnitude greater than any variant of execution sharding that Ethereum would have pursued.

The original execution sharding plan for Ethereum envisioned a uniform execution environment (EVM) across all shards. Even if one argues that the EVM is a flawed programming paradigm, that approach was far better than the current situation in the Ethereum ecosystem.

Now, developers must contend with a multitude of execution environments, each with its own set of rules, primitives, and quirks. This diversity comes at the cost of increased cognitive load and a higher probability of errors.

For instance, consider the challenge of moving assets between different L2s. A developer must now be intimately familiar with the state models and execution environments of both the source and destination rollups. They must understand how to interpret and translate between different representations of state - for example, how to reconcile a UTXO-based model with a balance-based model. This complexity extends to every aspect of cross-L2 interactions, from state verification to proof generation and interpretation.
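As a small taste of that translation work, the Python sketch below performs the "same" payment in both models. All types and function names are hypothetical, and the coin selection is a naive greedy pass rather than anything a production system would use.

```python
from typing import Dict, List, NamedTuple


class Utxo(NamedTuple):
    owner: str
    amount: int


def pay_account_model(balances: Dict[str, int], sender: str,
                      recipient: str, amount: int) -> Dict[str, int]:
    """Balance-based model: one debit and one credit."""
    if amount <= 0 or balances.get(sender, 0) < amount:
        raise ValueError("invalid or unfunded payment")
    updated = dict(balances)
    updated[sender] -= amount
    updated[recipient] = updated.get(recipient, 0) + amount
    return updated


def pay_utxo_model(utxos: List[Utxo], sender: str,
                   recipient: str, amount: int) -> List[Utxo]:
    """UTXO model: select inputs, destroy them, create payment and change."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    remaining: List[Utxo] = []
    selected_total = 0
    for utxo in utxos:
        if utxo.owner == sender and selected_total < amount:
            selected_total += utxo.amount  # consume this output as an input
        else:
            remaining.append(utxo)  # untouched output stays in the set
    if selected_total < amount:
        raise ValueError("insufficient funds")
    remaining.append(Utxo(recipient, amount))
    if selected_total > amount:
        remaining.append(Utxo(sender, selected_total - amount))  # change
    return remaining
```

The end balances agree, but the steps, failure cases, and resulting state shape do not, and a developer bridging two such rollups has to get both sides right.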

Imagine you're told that if you buy milk and put it in your car, you can't use the third gear while driving home. If you do, your car will explode. It's an absurd rule, right? But this is the level of arbitrary complexity and quirks that these diverse L2 solutions are introducing into the Ethereum ecosystem, and developers will have to remember all of it.

The Inevitability of Catastrophic Errors

The sheer complexity of this system makes catastrophic errors a certainty, much more so than if Ethereum had pursued the execution sharding path.

This complexity is a ticking time bomb. With so many moving parts and intricate interactions between different L2s, we’re not talking about small glitches here; we're looking at potential losses in the hundreds of millions or even billions of dollars.

Remember the KyberSwap hack? It came down to subtle rounding errors that only bite under rare conditions.

Source: https://blocksec.com/blog/kyberswap-incident-masterful-exploitation-of-rounding-errors-with-exceedingly-subtle-calculations

Now, imagine that level of vulnerability multiplied across dozens of different L2 systems, each with its own unique attack surface. It's not a matter of if a major failure will occur, but when.

The Ethereum community seems to have forgotten the lessons of The DAO hack—complex systems with high stakes are breeding grounds for expensive mistakes.

With rollups, the Ethereum ecosystem is embracing orders of magnitude more complexity than any other path it could have taken for scalability.

This is Why Rollups are Not Shards

And this is all because rollups are not shards but mini blockchains running on three centralized servers located in data centers, each with its own set of execution environments and consensus rules, occasionally posting proofs of execution to the Ethereum mainchain, which is essentially blind to what the rollup has processed.

This is in stark contrast to a true sharding model, where the base layer would have intrinsic knowledge of the execution rules and state transitions occurring within each shard.

This fundamental disconnect between the rollups and the base layer underscores the fact that Ethereum has only traded one set of complex problems for another, more challenging set, and why rollups cannot be considered a holistic blockchain scaling solution but rather mini blockchains posting their opaque proofs to Ethereum L1.

Instead of simplifying the system, Ethereum’s rollup ecosystem has added layers of abstraction that make it increasingly difficult for developers to create secure, efficient applications.

This is not how scaling models should work. What is needed is a unified execution environment that reduces complexity and provides an infrastructure where developers can focus on building applications rather than reinventing the wheel by developing new execution environments.
