Testing Solidity/Ethereum smart contracts with Spock Framework

The full source is available on GitHub.

I started this project after evaluating a few tools for smart-contract development and not being particularly happy with any of them. It is based on the great Spock framework.

The idea was to define the contract in the form of a specification while developing it. The tests themselves are connected directly to an Ethereum test network, as mocking would make even less sense here than usual: most of the problems that you want to discover come from the network. If you are not familiar with Spock or specification testing, please check my previous post on the subject.

Design choices

I just love Docker in development, as it helps to keep the dev environment uncluttered.

The compiler that compiles the Solidity smart-contracts is run with Docker. If your system does not support Docker, you can install the compiler manually and change the compileSolidity task in build.gradle.

Groovy was the obvious choice with Spock. What else? Maybe TypeScript next time.

About this example

This example is about a smart-contract that allows 2 people to sign a Deal with specific wordings. It is modeled according to this specification.

The implementing contract is written as a Solidity smart-contract here.

The rest of the project is really just dynamic glue code that binds the smart-contract and the specification together. Under the hood it uses web3j to generate Java wrappers from the Solidity code. The glue should be generic and reusable for any smart-contract I might write in the future.

Prerequisites to run this example
Access to blockchain

You need access to some Ethereum blockchain network. For me it did not make sense to run my own node just for testing, as it can take a lot of disk space and can be slow to start. So I used Infura, which provides an API that the standard client libraries know how to connect to. From a development perspective it works as if you had your own node.

Tokens to execute transactions on the blockchain

You need to pay gas, which is the cost of a transaction, whenever you create a smart-contract or execute any operation on the blockchain. Gas is paid in Ether.

To have Ether, you first need a wallet. The wallet contains the address that is used as a key to pretty much everything. It is your identifier on the blockchain. This project generates 2 test wallets for you. You need 2 wallets because the smart-contract has two parties (Supplier and Purchaser) that we need to test.

To generate the test wallets, run the Gradle task: gradle prepareFirstRun

In development you can use test networks, where you can obtain Ether more easily without actually having to pay. I used the Rinkeby test network, for which you can request Ether from here.

You should request the Ether for your Supplier wallet. To see the wallet address after you have generated it, run the Gradle task: gradle printSupplierWalletAddress

After you have Ether in your wallet, you can run the previously mentioned prepareFirstRun again to transfer money from the Supplier wallet to the Purchaser wallet.

The gas price depends on the network, and the gas price you offer for a transaction may affect how fast it is processed. The test sets its gas price to a fraction of the price indicated by the network. You can tune it in the SmartContract.groovy class if needed.
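To make the "fraction of the network price" idea concrete, here is a minimal Java sketch of the arithmetic. It is not the code from SmartContract.groovy: the class name, the method names and the 0.5 fraction are made up for illustration, and the network price is passed in as a plain number instead of being fetched through web3j.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

/** Hypothetical helper; names are illustrative and not taken from the project. */
final class GasPriceSketch {

    private GasPriceSketch() {}

    /**
     * Scales a network-indicated gas price (in wei) by a fraction.
     * The result is rounded up to a whole number of wei.
     */
    static BigInteger scaledGasPrice(BigInteger networkPriceWei, BigDecimal fraction) {
        if (networkPriceWei.signum() < 0 || fraction.signum() <= 0) {
            throw new IllegalArgumentException("price must be >= 0 and fraction must be > 0");
        }
        return new BigDecimal(networkPriceWei)
                .multiply(fraction)
                .setScale(0, RoundingMode.CEILING)
                .toBigIntegerExact();
    }

    /** Upper bound of the fee in wei: gas limit multiplied by gas price. */
    static BigInteger maxFeeWei(BigInteger gasPriceWei, BigInteger gasLimit) {
        return gasPriceWei.multiply(gasLimit);
    }

    public static void main(String[] args) {
        // 20 gwei is only an example value, not a measured network price.
        BigInteger networkPrice = new BigInteger("20000000000");
        BigInteger price = scaledGasPrice(networkPrice, new BigDecimal("0.5"));
        System.out.println(price);                                       // 10000000000
        System.out.println(maxFeeWei(price, BigInteger.valueOf(21000))); // 210000000000000
    }
}
```

A lower fraction makes the tests cheaper but can make transactions wait longer, so the fraction is worth keeping as a single tunable value.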

You can see transaction prices and other cool stats from the Rinkeby test network here

Running the project:

Once the following preconditions are met (the machine has Docker, the properties files contain the web3 endpoints, and at least the Supplier wallet has Ether as described above), run:

On Linux or macOS:
./gradlew prepareFirstRun
./gradlew test

On Windows:
gradlew.bat prepareFirstRun
gradlew.bat test

Using Drools with Groovy and Gradle: A poker hand classification example with BDD testing

The full source is available on GitHub.
I'm not covering much rule-engine theory in this tutorial, but if you are not familiar with it, here is a short summary. A rule-engine is software that executes code described in a certain rule format. The idea is to use a high-level descriptive language to encode domain knowledge into a system. Rule-engine systems are, however, much more than the rule programming language. They typically come with special tooling, such as repositories where the knowledge is managed and pushed to runtime systems. Drools is an open-source rule-engine that you can try and use for free. This tutorial is more about how to use Drools with Groovy, Gradle and Spock. If you find Drools interesting and don't know anything about it yet, please read this tutorial first.
I’ve been doing rule-engine projects for a decade now, and it’s been over five years since I first tried the Groovy language mixed with Java on these projects.
When talking with my colleagues, it amazes me how little Groovy is used. Since we are about to upgrade our knowledge anyway, why shouldn't we also take in the latest advancements in build and testing tools?
Automated testing is the most important thing in a rule project. Mainstream Java tooling has not had a very good solution for testing all the variations. Testing this sort of stuff with JUnit is a pain, so years ago I wrote my own tools to do it. With Groovy I suggest the Spock framework, which is also used in this tutorial.
Basic project setup

If you are familiar with the Maven project structure, then you are familiar with the Gradle project structure, since they are the same. Gradle is a full replacement for Maven, so you can use it in an existing repository architecture instead of Maven.
For those who are not familiar with this convention:
-You should use predefined locations for files. This is called convention-over-configuration, and it is a paradigm that saves us a great deal of configuration effort.
-You put all your source and resource files into predefined directories:
  • src/main/groovy -> Groovy programming language files
  • src/test/groovy -> Groovy programming language tests
  • src/main/java -> Java programming language files
  • src/main/resources -> Resource files that are included in the package (jar or war).
  • src/test/resources -> Resource files that are on the classpath during testing
We put our program code into src/main/groovy and the Spock tests into src/test/groovy. Rule files that are loaded during unit testing are located in src/main/resources/ (not in /test, since they are part of the system and not just for tests).
The Gradle configuration file is located at the top level and is named build.gradle.
In build.gradle you declare all your project dependencies and list the Gradle plugins you want to use. Here we need only the Groovy plugin.
apply plugin: 'groovy'

// In this section you declare where to find the dependencies of your project
repositories {
    // Use 'maven central' for resolving your dependencies.
    // You can declare any Maven/Ivy/file repository here.
    mavenCentral()
}

// In this section you declare the dependencies for your production and test code
dependencies {
    // We use the latest groovy 2.x version for building this library
    compile 'org.codehaus.groovy:groovy:2.2.2'
    // These are needed for building example project. Commons-lang is for hashcode builder
    compile 'commons-lang:commons-lang:2.6'

    // We use the awesome Spock testing and specification framework
    testCompile 'org.spockframework:spock-core:0.7-groovy-2.0'
    // Drools needs these, and for some reason they are not in dependencies
    testCompile 'com.sun.xml.bind:jaxb-xjc:2.1.12', 'com.sun.xml.bind:jaxb-impl:2.1.12'
}

To build a Gradle project, you don't need to install Gradle itself. An installed Gradle is only needed to create Gradle projects. There is a wrapper executable, gradlew.bat / gradlew depending on your operating system, that you can use to build. This wrapper takes care of downloading all the files needed to run Gradle on the project.
To execute the tests in the project, simply run (on Windows, in CMD): gradlew.bat test
If you want to import the project into Eclipse, install the Eclipse Gradle plugin and import the project as a Gradle project.
Test report generation in Gradle / Spock
When you run the tests with Gradle, it generates great reports into the directory build/reports/tests. You can open index.html with a web browser or Eclipse to view the test results.
Now that we have a working project setup and a way to execute tests and view their results, we can continue to implementing tests. Tests are called Specifications in the Spock framework. This is a Behavior Driven Development (BDD) concept which states that any software unit should be specified in terms of its desired behavior. This vague description becomes more concrete when you look at the tests.
Spock tests are very similar to Drools rules and have two blocks: When and Then (this is oversimplified; there are of course more than just 2 blocks, but these are the ones that are used 99% of the time :). The When-block describes the condition under which the behavior happens. When this condition is created, the situation in the Then-block should follow. The Then-block is the asserting block, where we test that the behavior of the unit is correct in the given situation. BDD tests classically also have a Given block that describes preconditions. Spock supports that block too, but it is not needed in our test project.
A Spock Spec (from Spock documentation):
def "HashMap should accept null key"() {
    setup:
    def map = new HashMap()
    map.put(null, "elem")
}

Most rule tests tend to be the same test run with different combinations of data. This is especially true with rule-sheets. However, you should still test all the possible combinations. Writing such combinatorial tests with plain JUnit makes Jack a dull boy, so there have been some extensions for this. Spock, however, has native support for combinations that is very usable for rule testing. The combinations are described as a text table in a Where block after the When/Then blocks.
A Spock Spec with Where-block (without making any sense):
def "Size of language should be the length"() {
    when:
    def language = name
    then:
    language.size() == length
    where:
    name     | length
    "Groovy" | 6
    "Java"   | 4
}

The classification of Poker hands

There are about 2.6 million (exactly 2,598,960) combinations of 5 poker cards. There are also variant games like Texas Hold'em, where it should be possible to classify other numbers of cards for the highest rank. There are multiple solutions to this problem, but here we use knowledge representation with a rule-engine. We know how to classify the cards, so we do not need blind classification with a neural network or the like. We just encode the knowledge we have into the rule-engine.
From the behavior driven development perspective this gets interesting: we have 2 places that encode a scenario. Both the rules and the tests define the exact same scenario in almost the same format. I suggest keeping the tests as tests for the rules, and trying to maintain the documentative qualities in the rules. No harm done if you manage to do it in both. From the development perspective it is quite impossible to write rules without tests, so the best way to start implementing is to write a test case.
In this example we test each single poker hand using a combinatorial Spock test. This, however, leaves about 2.6 million minus 9 poker hands untested :). After all the hands are implemented in rules, we write a test that checks 250 000 hands from a preclassified test file. That should give us a degree of confidence in our solution.
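The number of distinct hands is a plain binomial coefficient: choosing 5 cards out of 52 without regard to order. The short Java sketch below computes it exactly. The class and method names are made up for this illustration and are not part of the example project.

```java
import java.math.BigInteger;

/** Hypothetical helper for counting card combinations. */
final class HandCountSketch {

    private HandCountSketch() {}

    /** Number of ways to choose k items out of n, ignoring order. */
    static BigInteger choose(int n, int k) {
        if (k < 0 || k > n) {
            return BigInteger.ZERO;
        }
        k = Math.min(k, n - k); // symmetry keeps the loop short
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i the value equals C(n - k + i, i), so the division is always exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(choose(52, 5)); // 2598960 five-card hands
        System.out.println(choose(52, 7)); // 133784560 seven-card sets, as dealt in Texas Hold'em
    }
}
```

Against these numbers, a preclassified file of 250 000 hands covers a bit under a tenth of all five-card hands.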
So we write our first test:
@Unroll
def "Given cards should give Rank #rank "() {
    when: "5 cards in hand"
    def hand = new Hand()
    hand.cards = [
        new Card(rank: rnk1, suit: suit1 ),
        new Card(rank: rnk2, suit: suit2 ),
        new Card(rank: rnk3, suit: suit3 ),
        new Card(rank: rnk4, suit: suit4 ),
        new Card(rank: rnk5, suit: suit5 )]
    then: "Hand should have rank #rank"
    pc.classify(hand).rank == rank

    where:
    rnk1 | suit1 | rnk2 | suit2 | rnk3 | suit3 | rnk4 | suit4 | rnk5 | suit5 | rank
    1    | 1     | 10   | 1     | 11   | 1     | 12   | 1     | 13   | 1     | 9
}

Here in the When-block we define a hand that contains 5 new cards. Each card gets its rank and suit value in the constructor. Groovy has a feature that lets us initialize a new object by passing any bean property to the constructor directly, without writing the actual constructor. The values for the ranks and suits are taken from the Where-block table. The @Unroll annotation is needed for Spock to report a separate test for each row of the where-table. Here we have only one row, representing the Royal flush hand with a rank of 9. You can check out the complete source from GitHub to see all the rows testing all the hands.
And then the Drools rule that implements it:
rule "Royal flush"
when
    Card(s: suit, rank == 13 )
    Card(suit == s, rank == 12 )
    Card(suit == s, rank == 11 )
    Card(suit == s, rank == 10 )
    Card(suit == s, rank == 1 )
    cr : ClassificationResult(rank < 9)
then
    cr.setHand("Royal flush");
    update( cr );
end
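For readers who think more easily in plain code than in rule patterns, the condition that this rule matches can be sketched in Java. This is only an illustration of the pattern, not how Drools evaluates it, and the Card class here is a stand-in for the project's Groovy domain class. The rank encoding (1 for the ace, 11 to 13 for the face cards) is taken from the test table row above. Everything else is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical plain-Java version of the "Royal flush" pattern. */
final class RoyalFlushSketch {

    private RoyalFlushSketch() {}

    /** Stand-in for the domain class: integer rank and suit, as in the test table. */
    static final class Card {
        final int rank;
        final int suit;

        Card(int rank, int suit) {
            this.rank = rank;
            this.suit = suit;
        }
    }

    // Ace, ten, jack, queen and king in the encoding used by the test table.
    private static final List<Integer> ROYAL_RANKS = Arrays.asList(1, 10, 11, 12, 13);

    /** True if some single suit among the cards holds all five royal ranks. */
    static boolean isRoyalFlush(List<Card> cards) {
        Map<Integer, Set<Integer>> ranksBySuit = new HashMap<>();
        for (Card card : cards) {
            ranksBySuit.computeIfAbsent(card.suit, s -> new HashSet<>()).add(card.rank);
        }
        for (Set<Integer> ranks : ranksBySuit.values()) {
            if (ranks.containsAll(ROYAL_RANKS)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Card> hand = Arrays.asList(
                new Card(1, 1), new Card(10, 1), new Card(11, 1),
                new Card(12, 1), new Card(13, 1));
        System.out.println(isRoyalFlush(hand)); // true
    }
}
```

Like the rule, the check does not care how many cards are passed in, so it also works for seven cards. What the sketch leaves out is the ClassificationResult(rank < 9) guard, which in the rule keeps a lower rank from overwriting a result that is already as good or better.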

To be able to run the test and classify the hand, we need to write some boilerplate code. The domain objects representing the game are easy: you need Cards and then a Hand to be able to classify one. However, this is a simple example, and sometimes getting the right level of abstraction is tricky. One should think carefully about the domain modeling and abstraction when designing the system. What are the limits of the system? If you write too generic a domain model, you end up writing too much unchanging logic into the rule-engine, where it serves no purpose. For example, here we could have opted for a generic domain model that supports any game, and then described the special needs of a poker game on top of that generic system in the rules. There is a trap in this thinking: even if you manage to produce a poker game system with such an approach, it still might be difficult to write a backgammon system with the same domain model. You could end up with a lot of code in the rules that has no business value and does not encode any business knowledge.

Basic rule loading is done in RuleHelper.groovy. Rule execution is done in PokerClassifier. Check out the complete code from GitHub!
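class PokerClassifier {
    // Knowledge base compiled from the rule file
    def kbase

    PokerClassifier(String rulefile) {
        kbase = RuleHelper.createBase(new File(rulefile).text)
    }

    ClassificationResult classify(hand) {
        def session = kbase.newStatelessKnowledgeSession()
        def result = new ClassificationResult()

        // Facts handed to the rule-engine: the result holder and the cards
        def toRuleEngine = []
        toRuleEngine << result
        // Assumed step: the rules match on Card facts, so the cards go in too
        toRuleEngine.addAll(hand.cards)

        session.execute( toRuleEngine )
        return result
    }
}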

The pattern matching ability of Drools works great for poker hand classification. You can actually classify any number of cards, and it should pick the best rank. It also seems pretty fast, if you check the execution times in the test report.