I found Solidity early in my career. While an undergrad, I was roaming the purgatory of how-to JavaScript Medium articles and the bootcamp hell of Python. I even dabbled with Arduino for a while (CMV: Arduino should be a skill every engineer knows anyway) and poorly implemented my own language by following an article. I had no purpose and no happiness till I found Solidity.
Solidity is a fundamentally beautiful language. It will help you understand something fundamental about human nature, since humanity is powered by transactions. When you code a liquidity pool in Solidity, you break it down to its bare pieces and build the logic of a decentralized order book back up, without worrying about I/O and serialization; you’re just writing fundamental logic. Sure, in a few decades, Solidity might be to DeFi (and Ethereum) what COBOL is to finance right now, but till then I will defend Solidity against some of the bad-choice arguments. It’s fun and important.
The article I am responding to is this link from HN. Both are old at this point, but some of the criticisms still apply.
Here is my point-by-point response to the criticisms.
- 256-bit wide types: The EVM uses a 256-bit word, a size that suits the 256-bit hashes and elliptic-curve arithmetic Ethereum relies on and keeps the underlying implementation simple. While this decision does lead to memory inefficiencies, a uniform word size keeps computation straightforward. It is also worth noting that Solidity has a multitude of narrower data types (like "uint8", "int8", etc.); several of them can be packed into a single 32-byte storage slot, which saves space when state variables are declared with that in mind.
- String manipulation: The limitations with string handling in Solidity can indeed be frustrating. However, Solidity was designed primarily for smart contracts, which typically don't require extensive string manipulations. It's also important to remember that every operation on a blockchain incurs a cost, and complex operations like string manipulation can be expensive.
- Garbage collection: Lack of garbage collection is more of a feature than a bug. Ethereum is a state machine, and every transaction brings it from one state to another. Manual memory management aligns with this model and gives the programmer explicit control over the state of their contracts.
- OOP & "this" keyword: Solidity's object-oriented programming (OOP) design does have its quirks, especially regarding the use of "this" keyword. However, its usage is well-documented, and these quirks can be avoided with an understanding of how function calls in Solidity work.
- Number handling: Solidity does not support floating-point numbers, since floating-point results can differ across hardware and implementations, which could lead to consensus issues on the network. Integer operations could overflow silently in older versions, which is why libraries like OpenZeppelin's SafeMath provided overflow-checked operations. Since Solidity 0.8.0, arithmetic reverts on overflow by default, so such libraries are rarely needed.
- Function return types: The limits on returning variably sized arrays can be a hindrance, but they were a trade-off to avoid undue computational complexity and to ensure predictable gas costs. Newer compiler versions have relaxed much of this.
- For loops: The pitfall mentioned here comes from the "var" keyword, which infers the narrowest type that fits the initial value, so a counter initialized to 0 becomes a "uint8" and wraps around at 255. That is why its usage was discouraged in favor of explicitly typed variables, and "var" was removed from the language entirely in Solidity 0.5.0. A disciplined coder declares the loop counter with an explicit type.
- Arrays: Solidity's array declaration syntax may seem unconventional, but it is consistent within the language itself. The limitations with multi-dimensional dynamic arrays are known issues and can typically be worked around by restructuring the data.
- Compiler bugs: The bugs mentioned here are indeed serious. However, it's worth mentioning that Solidity is under active development, with many of these issues already addressed in the newer versions. Also, the transparent reporting of these bugs in a JSON format allows for automated parsing and analysis, making it easier for developers to stay informed about the issues.
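The number-handling point is easier to see in code. Below is a minimal sketch, assuming Solidity 0.8 or later; the contract name `NumberHandlingSketch`, its function names, and the 18-decimal scaling constant `WAD` are illustrative choices of mine, not something taken from the criticized article.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract NumberHandlingSketch {
    // Assumed fixed-point scale: 1.0 is represented as 1e18.
    uint256 private constant WAD = 1e18;

    // Checked by default since 0.8.0: reverts if a + b exceeds 255.
    function checkedAdd(uint8 a, uint8 b) external pure returns (uint8) {
        return a + b;
    }

    // Opting out of the check restores the old wrapping behavior.
    function wrappingAdd(uint8 a, uint8 b) external pure returns (uint8) {
        unchecked {
            return a + b;
        }
    }

    // Fractions without floats: multiply two scaled integers,
    // then divide once by the scale to stay in the same units.
    function wadMul(uint256 x, uint256 y) external pure returns (uint256) {
        return (x * y) / WAD;
    }
}
```

With this sketch, `checkedAdd(255, 1)` reverts while `wrappingAdd(255, 1)` returns 0, and `wadMul(1.5e18, 2e18)` returns `3e18`, i.e. 1.5 × 2 = 3 in fixed-point terms. Every node computes exactly the same integers, which is the determinism the bullet above is about.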
A little more on the garbage collection point, quickly.
A garbage collector (GC) is a form of automatic memory management used by many modern programming languages. Its role is to automatically reclaim memory that is no longer in use by the program, freeing up resources and preventing potential memory leaks.
Here's a simplified version of how it works:
- Allocation: When objects are created, memory is allocated to them. This is often done on a region of memory known as the heap.
- Marking: The garbage collector periodically examines all objects in memory to determine which are still in use. It does this by starting from "root" references (variables in the stack, static variables, etc.) and tracing all reachable objects.
- Sweeping: Once all reachable objects are marked, the garbage collector reclaims the memory occupied by unreachable objects -- the ones not marked in the marking phase. These are considered "garbage" as the application no longer has any way to reference and use them.
By automatically handling the deallocation of memory, garbage collection helps developers avoid common mistakes that can lead to issues such as dangling pointers, double frees, and memory leaks. However, garbage collection is not free and can impact performance, as the GC process consumes CPU resources. Furthermore, it can potentially introduce latency in the program execution at unpredictable times, as most GC routines are non-deterministic regarding when they execute.
In languages that do not have built-in garbage collection, such as C and C++, programmers need to manually manage memory. This involves explicitly allocating and deallocating memory as needed.
Here's a brief overview of how it works:
- Allocation: When a programmer needs to use a chunk of memory (e.g., for a variable, array, or object), they must explicitly request this from the system. In C and C++, this is typically done using functions such as `malloc()`, `calloc()`, or the `new` keyword in C++.
- Usage: The programmer uses the allocated memory as needed, accessing and modifying the data stored within.
- Deallocation: When the memory is no longer needed (e.g., the data is no longer needed, or the program is finishing execution), the programmer must explicitly return the memory to the system. In C and C++, this is typically done using functions such as `free()` or the `delete` or `delete[]` keywords in C++.
Failure to properly deallocate memory when it's no longer needed leads to what's known as a memory leak, where a program uses progressively more memory over time, which can slow down the system and eventually cause the program or even the whole system to crash. Conversely, deallocating memory that is still in use can lead to another class of errors, as the program may still try to access the deallocated memory, leading to undefined behavior. Thus I understand the concern over lack of GC in Solidity.
However, memory in Solidity works a bit differently due to the nature of the Ethereum platform and its need for deterministic computation across the network.
There are two main types of memory spaces in Solidity/Ethereum:
- Storage: This is where all contract state variables reside. Every contract has its own persistent storage, which is stored on the Ethereum blockchain. The storage is essentially a large mapping of 256-bit words, and reads and writes are among the most expensive operations in Ethereum in terms of gas costs.
- Memory: This is a temporary space used during contract execution. It's a linear byte array, and its size expands as needed, but it's not persistent. When execution stops, it's erased.
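The difference between the two spaces can be sketched as follows. This is a minimal illustration, assuming Solidity 0.8 or later; the contract name `DataLocationSketch` and its functions are made up for this example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract DataLocationSketch {
    // State variable: lives in persistent contract storage.
    uint256[] private values;

    // Storage write: the new element survives after the call ends.
    function append(uint256 v) external {
        values.push(v);
    }

    // Works on a temporary memory copy; storage is left untouched.
    function sumDoubled() external view returns (uint256 total) {
        uint256[] memory copy = values; // copies storage into memory
        for (uint256 i = 0; i < copy.length; i++) {
            copy[i] *= 2; // changes the memory copy only
            total += copy[i];
        }
        // `copy` is discarded when the call ends.
    }

    // Reads a single element straight from storage.
    function valueAt(uint256 i) external view returns (uint256) {
        return values[i];
    }
}
```

After calling `append(1)` and `append(2)`, `sumDoubled()` returns 6, yet `valueAt(0)` still returns 1: the doubling happened in memory and vanished with the call.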
For the memory space, Solidity uses a simple, predictable model rather than a garbage collector. Value-type local variables live on the EVM stack and disappear when the function returns. Reference types declared in memory (arrays, structs, strings) are allocated by bumping a free-memory pointer; they are never freed individually, but the whole memory area is discarded when the call ends. This mimics the behavior of garbage collection in that the memory is reclaimed when it's no longer in use, but it's a simpler and more predictable mechanism.
```solidity
pragma solidity >=0.4.22 <0.9.0;

contract AccessControl {
    mapping (address => bool) public admins;

    function addAdmin(address _newAdmin) public {
        // For simplicity, we'll allow anyone to add an admin
        // In a real contract, you'd likely only allow the contract owner or existing admins to do this
        admins[_newAdmin] = true;
    }

    function removeAdmin(address _oldAdmin) public {
        // Similarly, in a real contract you'd restrict who can remove an admin
        delete admins[_oldAdmin];
    }

    function isAdmin(address _address) public view returns (bool) {
        return admins[_address];
    }
}
```
For the storage space, Solidity does not automatically free up storage even if variables are no longer in use. This is because deleting storage is not always desired and can actually be costly in terms of gas. Therefore, it's up to the contract developer to explicitly clear storage when necessary using the `delete` keyword.
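What `delete` actually does depends on the type it is applied to. Here is a small sketch, assuming Solidity 0.8 or later; `DeleteSketch` and its members are hypothetical names used only to show the behavior.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract DeleteSketch {
    uint256 public counter = 42;
    uint256[] public items;
    mapping(address => uint256) public balances;

    // Populate some storage so there is something to clear.
    function fill() external {
        items.push(1);
        items.push(2);
        balances[msg.sender] = 100;
    }

    function clear() external {
        delete counter;              // resets the value to 0
        delete items;                // resets the array to length 0
        delete balances[msg.sender]; // resets this one entry to 0
        // `delete balances;` would not compile: a mapping does not
        // track its keys, so it cannot be cleared as a whole.
    }

    function itemCount() external view returns (uint256) {
        return items.length;
    }
}
```

After `fill()` followed by `clear()`, `counter` reads 0, `itemCount()` returns 0, and the caller's balance reads 0. Entries stored under other keys of `balances` would be untouched, which is why clearing a mapping means tracking its keys yourself.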
As work on statelessness advances, on-chain storage should eventually become more of an abstract concept that fewer and fewer developers and validators need to worry about directly.
A follow-up word on domain-specific languages and their importance
Domain-Specific Languages (DSLs), as the name implies, are programming languages specifically designed for a particular problem domain. Unlike General-Purpose Languages (GPLs), which are designed to solve a broad range of problems, DSLs focus on a specialized area, delivering more efficient and expressive tools for domain-specific tasks. This specialism proves incredibly beneficial in fields that require highly specialized knowledge or operations, such as the cryptocurrency and blockchain domain.
Solidity, Cairo, Vyper, Move, Sway etc. are all DSLs that provide either different toolkits for different networks, or different advantages for the same network. Solana and Polkadot go in a different direction by using Rust or C. (Cosmos technically has business logic built into the chain via Go, but a lot of Cosmos app-chains now depend on CosmWasm, whose contracts are written in Rust.) Even Bitcoin has its own scripting language for custom transactions. I feel that, compared to general-purpose programming languages, DSLs provide more control in the given context of the network, without compromising too much on security. I can't give an exact reason right now for why Rust would be less secure for blockchain application programming; purely on a vibe level, I feel it is insecure and simply hasn't been exploited because there is not enough incentive yet.