Image credit: Florian Klauer

Unpacking the Ethereum stack for developers

How many layers are there, and what do they do?

Rob Hitchens
Published in B9lab blog
7 min read · Jul 11, 2018

Ethereum is a distributed computer. As with any general-purpose computer, each architectural layer presents various interfaces and performs certain tasks. A clear understanding of each layer is important context that helps researchers and developers interpret documentation and examples.

This article will unpack the Ethereum stack from the glass in, starting at the user’s screen and working inward toward the network.

Let’s begin with the user.

Ethereum DApp stack (part 1)

  • User (human, server or IoT device) controls an …
  • Externally Owned Account (EOA), and uses a …
  • Browser or app connected to a node that interacts with the blockchain, to facilitate interactions with a …
  • DApp, whose on-chain part is smart contract bytecode derived from …
  • Solidity (popular and mature), Serpent (deprecated) or LLL and Vyper (formative).

A user can be a human, server, or even a device of some kind. Users possess private keys for signing transactions from Externally Owned Accounts (EOAs). The signature proves users are originators of the transactions, and proves they have knowledge of the signing key. Importantly, nothing occurs on the Ethereum network unless someone or something signs a transaction.

Transactions may be simple value transfers, or they may send information to a smart contract on the blockchain. Commonly, a user interface is purpose-built to simplify interactions with a smart contract. This combination of on-chain smart contracts and off-chain user interfaces that are meant to work together is called a decentralized application, or DApp.

The on-chain smart contracts may be written in one of several human-readable languages. The most popular at the time of writing is Solidity. Serpent has fallen into general disuse while Vyper and LLL are in the early formative stages. In all cases, human-readable code must be compiled for the Ethereum Virtual Machine and is stored on the blockchain as ByteCode. One might ask how contract ByteCode becomes part of the blockchain. Users sign specially crafted transactions that carry the ByteCode and omit the target address. These transactions, if accepted, make the ByteCode and the assigned address a fact of the blockchain history.
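As a rough sketch of that difference, the snippet below builds two transaction objects in the general shape a node's JSON-RPC `eth_sendTransaction` method accepts. The helper names, the addresses and the ByteCode string are hypothetical placeholders chosen for illustration; signing and submission are left out.

```python
def build_transaction(sender, data="0x", to=None, value=0):
    """Return a transaction object shaped like the one eth_sendTransaction takes."""
    tx = {"from": sender, "value": hex(value), "data": data}
    if to is not None:
        tx["to"] = to  # ordinary value transfer or contract call
    return tx  # with no "to" field, the transaction creates a contract


def is_contract_creation(tx):
    """A transaction without a target address is a contract deployment."""
    return "to" not in tx


SENDER = "0x" + "11" * 20    # hypothetical account address
CONTRACT = "0x" + "22" * 20  # hypothetical address of a deployed contract
INIT_CODE = "0x6002600301"   # placeholder bytes, not a real contract

deploy_tx = build_transaction(SENDER, data=INIT_CODE)
call_tx = build_transaction(SENDER, to=CONTRACT, value=10**18)

print(is_contract_creation(deploy_tx), is_contract_creation(call_tx))
```

The only structural difference between the two objects is the missing `to` field; everything else about signing and broadcasting is the same.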

ByteCode is the native machine language of the Ethereum Virtual Machine, or EVM. The EVM supports a very small instruction set with machine-level instructions called OPCODES. The EVM is modeled as a stack machine: operands are pushed onto a stack, and each instruction consumes its inputs from the top of the stack, last in, first out. The EVM supports a very large persistent state that includes contracts that have been deployed, value transfers and data sent to contracts. State changes arise from transactions that are permitted by the protocol. Transactions are ordered in blocks created via the consensus process, which is currently proof-of-work mining.
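To make the stack machine idea concrete, here is a minimal sketch of an interpreter that understands only four OPCODES (STOP, ADD, MUL and PUSH1, using their EVM byte values). It is a toy under heavy simplifying assumptions: there is no gas, memory, storage or error handling beyond rejecting unknown bytes.

```python
STOP, ADD, MUL, PUSH1 = 0x00, 0x01, 0x02, 0x60
WORD = 2**256  # EVM words are 256 bits wide, so arithmetic wraps around


def run(bytecode: bytes) -> list:
    """Execute a tiny subset of EVM ByteCode and return the final stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while pc < len(bytecode):
        op = bytecode[pc]
        pc += 1
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(bytecode[pc])  # the next byte is the value to push
            pc += 1
        elif op == ADD:
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % WORD)
        elif op == MUL:
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) % WORD)
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


# PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL
program = bytes.fromhex("6002600301600402")
result = run(program)
print(result)  # [20]
```

Each arithmetic instruction pops its two operands from the top of the stack and pushes the result back, which is why the operands appear in the ByteCode before the instruction that uses them.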

There is more than one way to create a client that interacts with an on-chain contract. Let’s look at a few tools that can help you do that:

Ethereum Software Clients and Developer Tools

  • Web3
  • Truffle and Etherpudding
  • Pythereum
  • Nethereum

Developers use various libraries to communicate with Ethereum nodes. The most popular is Web3, a JavaScript library that fits naturally into browser-based apps built with JavaScript frameworks, as well as servers running Node.js.

Truffle, which itself uses a library called Etherpudding, provides a higher level of abstraction, “truffle-contract”, which facilitates code that is more concise, readable and application-oriented than can be achieved with Web3 alone. Pythereum and Nethereum are attempts to provide general-purpose libraries for Python and .NET developers.

Ethereum DApp stack (part 2)

  • Public or private networks. Deep down, higher level abstractions such as Web3 are reduced to …
  • JSON-RPC, which interacts with Ethereum nodes, which can be …
  • Geth, Parity, LightWallets, or simulations such as Ganache-cli.

In contrast to the client-server model, where the server is often a distant monolithic construct, Ethereum users generally connect to their own Ethereum nodes.

The Ethereum node interface uses JSON-RPC, and it exposes a connection to the underlying Ethereum network. The underlying network may be the main public Ethereum network, a testnet for public testing or demonstration, or even an in-house private network.
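For a feel of what higher-level libraries reduce to, the sketch below builds a raw JSON-RPC request for the `eth_blockNumber` method using only the Python standard library. The helper names are made up for this example, and the node URL in the comment is an assumption (a node listening locally on port 8545); nothing is sent unless `call_node` is actually invoked.

```python
import json
import urllib.request


def make_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body as a plain dictionary."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }


def call_node(url, method, params=None):
    """POST the request to a node and return the 'result' field of the reply."""
    body = json.dumps(make_request(method, params)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["result"]


payload = make_request("eth_blockNumber")
print(json.dumps(payload))

# Assuming a node is reachable at this URL, the latest block number
# comes back as a hex string and could be read like this:
# block_number = int(call_node("http://localhost:8545", "eth_blockNumber"), 16)
```

The same request works against a mainnet node, a testnet node or a local simulation, since they all expose the same interface.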

There is no “official” client implementation, and any software that follows the protocol can fill this role. Examples include the Go-Ethereum client, geth, and the Parity client. Users who don’t wish to sync a full node have options including LightWallets, MetaMask and Infura (the underlying service that powers MetaMask).

The underlying blockchain can use a variety of Consensus and Membership algorithms including …

Ethereum DApp stack (part 3)

  • Proof-of-Work (Ethereum public mainnet and Ropsten testnet)
  • Proof-of-Authority (Ethereum Rinkeby testnet)
  • Proof-of-Stake (Ethereum proposed roadmap)
  • Or even no proof at all (Ganache-cli) which is just a simulation for development purposes.

There are numerous forward-looking improvement proposals that offer various approaches to improving performance and scalability.

Examples of such proposals include:

Performance & Scaling Proposals

  • Sharding
  • Plasma
  • Sidechains
  • State Channels
  • Raiden
  • Loom

and more! Generally, these improvements can be understood in terms of how they approach four important concerns:

General Concerns

  • Network Topology (Membership)
  • Network State
  • State Transitions
  • Consensus

Importantly, these proposals tend to trade some portion of proof-of-work’s demonstrated consensus resilience in order to achieve greater parallelism and transaction throughput.

The proposals also vary in terms of scope. For example, high-performance payment systems may offer only compatibility with a narrow band of basic transfer transactions and compatible token contracts, while other proposals strive for more generalized overall scaling of any and all smart contract throughput.

The Ethereum Virtual Machine is…

  • A protocol-based, distributed state machine
  • Simple assembler-like ByteCode
  • Supports deterministic state changes only
  • Truth is whatever the consensus agrees on
  • Uses “gas” as a unit of account, an abstraction of network resource costs and as a solution to the halting problem.
  • Storage is, by far, the most expensive family of OPCODES.

In more detail:

The EVM is a protocol-based distributed State Machine. This is in contrast to other blockchain-inspired networks that do not use the virtual machine metaphor.

The EVM uses a simple machine-code called ByteCode. It is a small instruction set, by design. The creators felt that a smaller surface area is a good first step towards a dependable and predictable platform.

Importantly, anything that follows the protocol can participate and there is no reference client. Nor is there any official way to organize physical local storage of the state. Everything is defined at a logical level, with considerable latitude given to clients that wish to implement the protocol.

Ethereum smart contracts are strictly limited to deterministic computations. That is, given a set of inputs, every node must be able to independently reach identical outputs, now, and in the future. The stand-out implication of this is that Ethereum smart contracts cannot receive information from traditional APIs or news feeds. The Oracle pattern is a go-to approach for solving the challenge of working with information from the outside world.
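The following sketch illustrates why determinism matters and how an oracle fits in. It assumes a drastically simplified model: state is a dictionary, a transaction sets one key, and SHA-256 stands in for the hashing a real client would use. The price value is a hypothetical example of a fact reported by an oracle.

```python
import hashlib
import json


def apply_tx(state, tx):
    """Pure state transition: depends only on the prior state and the transaction."""
    new_state = dict(state)
    new_state[tx["key"]] = tx["value"]
    return new_state


def state_digest(state):
    """Fingerprint of the state (SHA-256 is a stand-in for illustration)."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def replay(txs):
    """What every node does: start from the same state and apply the same transactions."""
    state = {}
    for tx in txs:
        state = apply_tx(state, tx)
    return state_digest(state)


# An oracle reports an off-chain fact by sending it in a transaction, so the
# value becomes part of the shared inputs instead of being fetched by each node.
txs = [{"key": "ETH/USD", "value": "500"}]  # hypothetical reported value

node_a = replay(txs)
node_b = replay(txs)
print(node_a == node_b)
```

If `apply_tx` were allowed to call an external API, two nodes replaying the same history at different times could compute different digests and fall out of consensus.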

While the consensus mechanisms of implementations vary in the details, the aim of consensus is to establish the truth as it is understood by all correctly functioning nodes. It is fair to say that the truth is whatever consensus the nodes have reached.

Ethereum uses gas as a rough approximation of the computational cost of various operations. Gas is also Ethereum’s answer to the “Halting Problem”. Since a node cannot determine in advance whether an arbitrary program will ever finish, there must be a way out if a program loops forever, whether by accident or by malicious design. A combination of financial cost and absolute maximum limits ensures that the EVM can continue processing, even in the case that a program demands excessive resources.

This solution ensures the EVM can continue processing without the need for a reboot under any anticipated circumstance, which is a very good thing because the EVM is designed to be virtually unstoppable.
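The idea can be sketched in a few lines. This is a toy model, not the EVM's actual metering: every step costs one unit of a hypothetical gas budget, and execution stops as soon as the budget cannot cover the next step.

```python
class OutOfGas(Exception):
    """Raised when a program asks for more work than its gas budget allows."""


def run_metered(step, gas_limit, cost_per_step=1):
    """Call step() until it returns False or the gas budget is exhausted."""
    gas = gas_limit
    steps = 0
    while True:
        if gas < cost_per_step:
            raise OutOfGas(f"halted after {steps} steps")
        gas -= cost_per_step  # pay before doing the work
        steps += 1
        if not step():
            return steps, gas


def loops_forever():
    return True  # always asks for another step


outcome = None
try:
    run_metered(loops_forever, gas_limit=1000)
except OutOfGas as exc:
    outcome = str(exc)

print(outcome)  # halted after 1000 steps
```

The interpreter never needs to decide whether the program would have finished on its own; it only needs to know when the budget has run out.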

Each machine-level OPCODE has an associated fee, expressed in units of gas. Gas can only be purchased with Ether. These are protocol-level rules, so any node that doesn’t follow the rules cannot remain in consensus with the majority. Transaction gas cost is a measure of the amount of work performed. A key point to remember is that the amount of gas an operation consumes is fixed by the protocol and is independent of currency exchange rate fluctuations. The price of gas, which the sender offers in Ether per unit of gas, is affected by supply and demand and will tend to increase during periods of network congestion. Separating the gas price from the gas cost decouples the cost of computations from the price of Ether, so that transaction costs do not necessarily increase with the price of Ether.
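A short worked example may help separate the two quantities. A simple value transfer consumes 21,000 gas; the gas prices and Ether exchange rates below are hypothetical numbers picked to make the arithmetic easy to follow.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18
TRANSFER_GAS = 21000  # gas consumed by a simple value transfer


def fee_in_wei(gas_used, gas_price_gwei):
    """Fee = gas used x gas price, expressed in wei."""
    return gas_used * gas_price_gwei * WEI_PER_GWEI


def fee_in_fiat(gas_used, gas_price_gwei, ether_price):
    """Convert the fee to a fiat amount at a given (hypothetical) Ether price."""
    fee_in_ether = Decimal(fee_in_wei(gas_used, gas_price_gwei)) / WEI_PER_ETHER
    return fee_in_ether * Decimal(ether_price)


# Same work (21,000 gas) in both cases. If Ether doubles in price and the
# market gas price halves, the fiat cost of the transfer is unchanged.
before = fee_in_fiat(TRANSFER_GAS, gas_price_gwei=20, ether_price=500)
after = fee_in_fiat(TRANSFER_GAS, gas_price_gwei=10, ether_price=1000)

print(fee_in_wei(TRANSFER_GAS, 20))  # 420000000000000 wei, i.e. 0.00042 Ether
print(before == after)
```

The gas used is a property of the work performed, while the gas price is whatever the sender offers, which is what lets fees adjust independently of the exchange rate.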

Since state changes are replicated across all nodes and increase the size of the blockchain, state changes are, by far, the most expensive type of operation. Reading the state is a close second, owing to the approximate cost of storage access as compared to in-memory computations. Consequently, smart contracts tend to store only the minimum information possible.

Distributed on-chain applications are seldom suitable for implementing existing application designs. The paradigm is simply too different. That is, it is rarely appropriate to “port” an existing application design to Ethereum.

It may be more fruitful to think of smart contracts as a means of defining stateful software-defined networks with strong assurances about allowable state transitions.

Hopefully this post provides some clarity about what’s going on, but the best way to learn is by doing. I invite you to try for yourself in the online Ethereum Developer Course I teach over at B9lab.
