GRANDPA Equivocation and sysinfo Process Collection Results In Slashing on Kusama Network: a Post-Mortem.

Multiple bugs caused nodes to drop out of the Kusama network and lose the database that records which blocks they had validated. As a result, the same nodes double-signed those blocks on restart. The slashes caused by this issue have been reverted via Kusama Council motions.

By Polkadot, August 18, 2020

On Friday, July 31, two Kusama validators on runtime version v2019 reported an issue: their nodes had started crashing every few minutes with two distinctive errors. At first glance, the problem seemed to be related to the validators' keys. This was subsequently ruled out, as the affected validators confirmed they had not changed keys in the process. Additionally, the issue appeared to be present only on the Kusama network, not on Polkadot.

Going a bit further down the rabbit hole, the team realised that the issue seemed to have started as a result of a GRANDPA equivocation causing a slash event in Kusama, originally triggered by a file descriptor leak that caused nodes to crash. This leak prevented nodes from writing the GRANDPA voter state (the votes at a given round) to disk and caused the nodes that lost this data to vote again after restarting, this time voting for a block newer than their original choice. This led to an equivocation.
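The failure mode described above can be illustrated with a minimal sketch. This is not GRANDPA's actual implementation: the `VoteStore` class, its JSON file layout, and the `is_equivocation` helper are hypothetical names introduced here for illustration. The idea is that a voter durably records its vote for a round before broadcasting it, so that after a restart it repeats the same vote instead of choosing a newer block.

```python
import json
import os
import tempfile


class VoteStore:
    """Hypothetical on-disk record of the vote cast in each round."""

    def __init__(self, path):
        self.path = path
        self.votes = {}
        if os.path.exists(path):
            with open(path) as f:
                # JSON object keys are strings, so convert rounds back to int.
                self.votes = {int(r): b for r, b in json.load(f).items()}

    def _persist(self, votes):
        # Write to a temporary file, fsync, then atomically rename, so a crash
        # leaves either the old state or the new state, never a partial file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(votes, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def cast_vote(self, round_number, block_hash):
        """Return the vote to broadcast for this round.

        If a vote for the round was already recorded, it is returned
        unchanged, even when the caller now prefers a different block.
        """
        if round_number in self.votes:
            return self.votes[round_number]
        updated = dict(self.votes)
        updated[round_number] = block_hash
        # Persist first; if this raises (for example because no file
        # descriptor is available), the vote is never recorded or returned.
        self._persist(updated)
        self.votes = updated
        return block_hash


def is_equivocation(vote_a, vote_b):
    """Same voter, same round, different blocks."""
    return (
        vote_a["voter"] == vote_b["voter"]
        and vote_a["round"] == vote_b["round"]
        and vote_a["block"] != vote_b["block"]
    )
```

In this sketch, a node that restarts with its store intact re-sends its earlier vote. A node that lost the store has no record of the round and may vote for a different block, which is the equivocation described above.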

The combination of these two events, which resulted in validators being slashed, started at some point after v0.8.15 (v2015 in Kusama) was released and the network was upgraded. The Authority Discovery feature had already been in place for some time at the runtime module level but was not enabled by default on the client, and this version also enabled GRANDPA to report equivocations via unsigned extrinsics.

With this information in hand, the team's main hypothesis was that equivocations caused by the file descriptor leak could have started happening earlier but were only reported after the v0.8.15 upgrade in July: once running this version, nodes started reporting themselves after crashing, and this attracted the attention of the teams involved. Still, an investigation of the logs of nodes run by Parity did not find any previous equivocations (these would have been logged to the terminal).

Further investigation into the root causes of the file descriptor leak pointed at two main culprits: authority discovery and metrics collection. Authority discovery was using an excessive number of sockets to query data from the DHT (i.e. to discover other authorities' IP addresses). For system metrics collection (e.g. CPU and memory), we were relying on the sysinfo crate, which kept a cache of file descriptors for every process on the system and for the threads of each process (it fetches this data by reading from /proc).
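A leak of this kind can be observed by watching how many descriptors a process holds relative to its limit. The following is a minimal sketch, assuming a Linux system where open descriptors appear as entries under `/proc/<pid>/fd`. The function names and the 0.8 threshold are illustrative assumptions, and this is not how sysinfo or the node itself collects metrics.

```python
import os


def count_open_fds(fd_dir="/proc/self/fd"):
    """Count entries in a /proc/<pid>/fd directory (Linux-specific layout).

    Listing the directory briefly opens one descriptor itself, so the
    result for the current process may be one higher than expected.
    """
    return len(os.listdir(fd_dir))


def fd_usage_ratio(open_fds, soft_limit):
    """Fraction of the soft descriptor limit currently in use."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    return open_fds / soft_limit


def is_near_limit(open_fds, soft_limit, threshold=0.8):
    """True when usage reaches the (assumed) warning threshold."""
    return fd_usage_ratio(open_fds, soft_limit) >= threshold
```

On Unix systems the soft limit can be read with `resource.getrlimit(resource.RLIMIT_NOFILE)`. A count that keeps climbing toward that limit over time, rather than levelling off, is the usual sign of a leak.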

The short-term solution was to disable the Authority Discovery feature by default and to stop collecting system metrics. The Authority Discovery module will be re-enabled in a future release once there is a proper fix for the excessive use of sockets.

Until a new version was available, the Parity team recommended manually disabling Authority Discovery. Additionally, whenever a node crashed, validators were advised to wait 1-2 minutes before restarting it. This reduces the likelihood of the node equivocating in GRANDPA if its votes were not persisted to disk.
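The restart delay can be sketched as a small supervisor loop. This is an illustration only: the function name, its parameters, and the 90-second default (chosen from within the recommended 1-2 minute window) are assumptions, and the command to run is left to the operator. In practice a service manager setting such as systemd's `RestartSec` serves the same purpose.

```python
import subprocess
import time


def run_with_restart_delay(command, delay_seconds=90, max_restarts=None,
                           run=subprocess.call, sleep=time.sleep):
    """Run `command`; after any non-zero exit, wait before starting it again.

    `run` and `sleep` are injectable so the loop can be exercised without
    launching a real process. Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        exit_code = run(command)
        if exit_code == 0:
            return restarts  # clean shutdown, do not restart
        if max_restarts is not None and restarts >= max_restarts:
            return restarts  # give up after the configured number of restarts
        # Pause so the node does not rejoin and vote again straight away.
        sleep(delay_seconds)
        restarts += 1
```

The delay does not restore lost voter state. As noted above, it only reduces the likelihood that the restarted node equivocates.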

After some discussion and development, Polkadot v0.8.22 was released, including the short-term fixes detailed above. All validators should upgrade to this version and monitor the results. All slashes caused by this bug were reverted by the Kusama Council. In the same spirit, a new discussion was opened regarding the reversion of the economic loss, but not the nomination loss, incurred by validators.

To keep up with developments, there are plenty of ways to get plugged in to the Kusama community. Join the discussion on the Direction Channel. Learn more about Kusama on our website and in the Kusama Wiki. Want to join the core growth team behind Kusama? Join the Ambassador Program.
