Killing Test Setup Bloat with Tiny, Reproducible Generators
I’ve been developing software for over twenty years and started my professional career just a few years after Test-Driven Development (TDD) was introduced. Back then, I often found myself being the one to bring TDD into new projects.
During these years, I've seen a lot of tests.
They pretty much always follow the same pattern: setup, execution, assertions, and on occasion - teardown.
What some call "given, when, then".
Naming aside, the amount of code in each section is far from evenly distributed.
In most tests, the setup is bloated.
This makes tests harder to read, hides the intent, and makes changes brittle –
even when only one field actually matters.
Here I'll eventually share my remedy for bloated setup - Gen
.
Before addressing a bloated test example and how to solve it, we need to define a domain model.
case class Address(street: String, city: String, postalCode: String, country: String)
case class Department(id: String, name: String, costCenter: String)
case class Grade(code: String, eligibleForPension: Boolean)
case class Job(title: String, grade: Grade, level: Int)
case class BankDetails(iban: String, bic: String, bankName: String)
case class Employee(
id: String,
name: String,
birthDate: String,
address: Address,
department: Department,
job: Job,
salary: Double,
bonuses: List[Double],
bankDetails: Option[BankDetails],
active: Boolean
):
def isPensionEligible: Boolean = job.grade.eligibleForPension
end Employee
Yes, it’s bloated and probably makes little sense — but that’s the point: real-world domain objects often are bloated as they accumulate fields over time.
There are a lot of things going on to define an employee.
When used in a test, it quickly gets messy.
Here is a test for verifying Employee.isPensionEligible
.
//> using test.dep org.scalameta::munit::1.0.4
import munit.FunSuite
class EmployeePensionTest extends FunSuite:
test("employee is pension eligible if job grade allows it"):
// 🧱 Massive setup just to reach job.grade.eligibleForPension
val address = Address("Karl Johansgate 1", "Oslo", "NO", "0123")
val department = Department("D-01", "Finance", "FIN-COST")
val grade = Grade("G7", eligibleForPension = true) // 🔥 only thing that matters
val job = Job("Analyst", grade, level = 2)
val bank = BankDetails("NO9386011117947", "DNBANOKKXXX", "DNB")
val employee = Employee(
id = "emp-777",
name = "Greta Svendsen",
birthDate = "1990-05-12",
address = address,
department = department,
job = job,
salary = 650000.0,
bonuses = List(2000),
bankDetails = Some(bank),
active = true
)
// ✅ Execution
val result = employee.isPensionEligible
// ✅ Assertion
assert(result)
That’s a lot of setup for something where hardly anything matters.
eligibleForPension = true
is the only thing that really matters.
I really like to focus on the stuff that matters.
What I've ended up doing is drawing inspiration from property-based testing made famous by QuickCheck. I'm not inspired by everything — just the generator part. Hence, I'm not using ScalaCheck (one scala re-implementation of QuickCheck) as there is a lot that I don't need and I'm not really doing property-based testing.
Instead, I built a minimal Gen
abstraction tailored for what I want to achieve; readability and test clarity.
Here is the general outline; there could be more to it, but the main idea should be clear.
import scala.annotation.targetName
import scala.collection.BuildFrom
import scala.util.Random
type Seed = Long
// Using scala 3 context function
type Gen[T] = Random ?=> T
extension [T](gen: Gen[T])
@targetName("optional")
def ? : Gen[Option[T]] = option(gen)
end extension
object Gen:
def apply[T](f: Gen[T], seed: Seed = 1): T = f.apply(using Random(seed))
def fromSeed[T](seed: Seed)(f: Gen[T]): T = f.apply(using Random(seed))
def option[T](f: Gen[T]): Gen[Option[T]] = Option.when(boolean)(f)
val int: Gen[Int] = r ?=> if boolean then r.nextInt() else -r.nextInt()
val boolean: Gen[Boolean] = r ?=> r.nextBoolean()
val double: Gen[Double] = r ?=> r.nextDouble() * (if boolean then Double.MaxValue else Double.MinValue)
val uniformDistribution: Gen[Double] = r ?=> r.nextDouble()
def between(minInclusive: Int, maxExclusive: Int): Gen[Int] = r ?=> r.between(minInclusive, maxExclusive)
val char: Gen[Char] = r ?=> r.nextPrintableChar()
def string(length: Int): Gen[String] = r ?=> (0 to length).map(_ => char).mkString
@targetName("alphanumericString")
def str(length: Int): Gen[String] = r ?=> r.alphanumeric.take(length).mkString
def shuffle[T, C](xs: IterableOnce[T])(implicit bf: BuildFrom[xs.type, T, C]): Gen[C] = r ?=> r.shuffle(xs)
// There are generally more basic generators methods, but they are just more of the same.
This leverages Scala 3 Context Functions, which is a really nice feature.
Let's apply Gen
the EmployeePensionTest
:
//> using test.dep org.scalameta::munit::1.0.4
import munit.FunSuite
import scala.util.Random
class EmployeePensionTest extends FunSuite:
test("employee is pension eligible if job grade allows it"):
// 🧱 Expressive setup just to reach job.grade.eligibleForPension
val employee: Employee = Gen.fromSeed(2):
val address = Address(str(10), str(6), str(2), str(4))
val department = Department(str(5), str(8), str(6))
val grade = Grade(str(2), eligibleForPension = true) // 🔥 only thing that matters
val job = Job(str(10), grade, level = int)
val bank = BankDetails(str(15), str(11), str(between(3, 10)))
Employee(
id = str(8),
name = str(10),
birthDate = str(10),
address = address,
department = department,
job = job,
salary = double,
bonuses = List.fill(between(1, 10))(double),
bankDetails = Some(bank),
active = boolean
)
// ✅ Execution
val result = employee.isPensionEligible
// ✅ Assertion
assert(result)
Here there are a bunch of parameters, and the only thing that matters is explicitly defined. The rest isn't relevant.
Setting the seed to something specific is important so failures can be reproduced.
Gen:
is equivalent to Gen.fromSeed(1):
.
Selecting a different seed generates different data which might be helpful.
With many tests, reuse becomes essential. So let's make some reusable domain specific generators:
val genGrade: Gen[Grade] = Grade(
code = str(3),
eligibleForPension = boolean,
)
val genJob: Gen[Job] = Job(
title = str(12),
grade = genGrade,
level = between(1, 10),
)
val genAddress: Gen[Address] = Address(
street = str(10),
city = str(6),
postalCode = str(4),
country = str(2),
)
def genDepartment: Gen[Department] = Department(
id = str(5),
name = str(8),
costCenter = str(6),
)
val genBankDetails: Gen[BankDetails] = BankDetails(
iban = str(15),
bic = str(8),
bankName = str(6),
)
val genEmployee: Gen[Employee] = Employee(
id = str(8),
name = str(10),
birthDate = str(10),
address = genAddress,
department = genDepartment,
job = genJob,
salary = double,
bonuses = List.fill(between(1, 10))(double),
bankDetails = genBankDetails.?,
active = boolean
)
So the test can be reduced to:
//> using test.dep org.scalameta::munit::1.0.4
import munit.FunSuite
import scala.util.Random
class EmployeePensionTest extends FunSuite:
test("employee is pension eligible if job grade allows it"):
// 🧱 Less setup just to reach job.grade.eligibleForPension
val employee: Employee = Gen.fromSeed(3):
Employee(
id = str(8),
name = str(10),
birthDate = str(10),
address = genAddress,
department = genDepartment,
job = Job(
title = str(10),
grade = Grade(
str(2),
eligibleForPension = true // 🔥 only thing that matters
),
level = int
),
salary = double,
bonuses = List.fill(between(1, 10))(double),
bankDetails = genBankDetails.?,
active = boolean
)
// ✅ Execution
val result = employee.isPensionEligible
// ✅ Assertion
assert(result)
It’s less, but there’s still a lot of setup.
Just to keep it really basic, let's just do a simple copy. There isn't anything else to it.
//> using dep org.scalameta::munit::1.0.4
import munit.FunSuite
import scala.util.Random
class EmployeePensionTest extends FunSuite:
test("employee is pension eligible if job grade allows it"):
// 🧱 Small setup just to set job.grade.eligibleForPension
val employee: Employee = Gen.fromSeed(4):
genEmployee.copy(
job = genJob.copy(
grade = genGrade.copy(
eligibleForPension = true // 🔥 only thing that matters
)
)
)
// ✅ Execution
val result = employee.isPensionEligible
// ✅ Assertion
assert(result)
This is really clear. Just setting the minimal that needs to be set. Nothing else matters, so I don't add anything to it.
If you want to avoid copy calls altogether, lenses are another option — but that’s a topic for a future post.
I've added the generators to a little code base that can be found here https://github.com/skytteren/dataseed. There is little to the basic generators. At the time of writing, there are less than 70 lines of code in one file. So there is no release. The sensible thing is to copy-paste, and then adjust the file to one's liking.
After using these generators for more than a year on a project, I’ve found that focusing only on what matters in a test makes them easier to read, faster to write, and more resilient to change.
I hope this approach is food for thought.