Killing Test Setup Bloat with Tiny, Reproducible Generators

Published 2025-08-15
scalatest

I’ve been developing software for over twenty years and started my professional career just a few years after Test-Driven Development (TDD) was introduced. Back then, I often found myself being the one to bring TDD into new projects.

During these years, I've seen a lot of tests. They almost always follow the same pattern: setup, execution, assertions, and occasionally teardown, or what some call "given, when, then". Naming aside, the code is far from evenly distributed across these sections. In most tests, the setup is bloated. This makes tests harder to read, hides the intent, and makes them brittle to change, even when only one field actually matters.

Here I'll share my remedy for bloated setup: Gen.

Before addressing a bloated test example and how to solve it, we need to define a domain model.

case class Address(street: String, city: String, postalCode: String, country: String)
case class Department(id: String, name: String, costCenter: String)

case class Grade(code: String, eligibleForPension: Boolean)
case class Job(title: String, grade: Grade, level: Int)
case class BankDetails(iban: String, bic: String, bankName: String)

case class Employee(
  id: String,
  name: String,
  birthDate: String,
  address: Address,
  department: Department,
  job: Job,
  salary: Double,
  bonuses: List[Double],
  bankDetails: Option[BankDetails],
  active: Boolean
):
  def isPensionEligible: Boolean = job.grade.eligibleForPension

end Employee

Yes, it’s bloated and probably makes little sense — but that’s the point: real-world domain objects often are bloated as they accumulate fields over time. There are a lot of things going on to define an employee. When used in a test, it quickly gets messy. Here is a test for verifying Employee.isPensionEligible.

//> using test.dep org.scalameta::munit::1.0.4
import munit.FunSuite

class EmployeePensionTest extends FunSuite:

  test("employee is pension eligible if job grade allows it"):
    // 🧱 Massive setup just to reach job.grade.eligibleForPension
    val address = Address("Karl Johansgate 1", "Oslo", "0123", "NO") // street, city, postalCode, country
    val department = Department("D-01", "Finance", "FIN-COST")
    val grade = Grade("G7", eligibleForPension = true) // 🔥 only thing that matters
    val job = Job("Analyst", grade, level = 2)
    val bank = BankDetails("NO9386011117947", "DNBANOKKXXX", "DNB")

    val employee = Employee(
      id = "emp-777",
      name = "Greta Svendsen",
      birthDate = "1990-05-12",
      address = address,
      department = department,
      job = job,
      salary = 650000.0,
      bonuses = List(2000.0),
      bankDetails = Some(bank),
      active = true
    )

    // ✅ Execution
    val result = employee.isPensionEligible

    // ✅ Assertion
    assert(result)

That’s a lot of setup for something where hardly anything matters. eligibleForPension = true is the only thing that really matters.

I really like to focus on the stuff that matters.

What I've ended up doing is drawing inspiration from property-based testing, made famous by QuickCheck. I'm not inspired by everything, just the generator part. Hence, I'm not using ScalaCheck (a Scala re-implementation of QuickCheck): it has a lot that I don't need, and I'm not really doing property-based testing.

Instead, I built a minimal Gen abstraction tailored to what I want to achieve: readability and test clarity.

Here is the general outline; there could be more to it, but the main idea should be clear.


import scala.annotation.targetName
import scala.collection.BuildFrom
import scala.util.Random

type Seed = Long

// Using scala 3 context function
type Gen[T] = Random ?=> T

extension [T](gen: Gen[T])
  
  @targetName("optional")
  def ? : Gen[Option[T]] = option(gen)

end extension

object Gen:
  def apply[T](f: Gen[T], seed: Seed = 1): T = f.apply(using Random(seed))
  def fromSeed[T](seed: Seed)(f: Gen[T]): T = f.apply(using Random(seed))

def option[T](f: Gen[T]): Gen[Option[T]] = Option.when(boolean)(f)

val int: Gen[Int] = r ?=> if boolean then r.nextInt() else -r.nextInt()
val boolean: Gen[Boolean] = r ?=> r.nextBoolean()
val double: Gen[Double] = r ?=> r.nextDouble() * (if boolean then Double.MaxValue else Double.MinValue)
val uniformDistribution: Gen[Double] = r ?=> r.nextDouble()
def between(minInclusive: Int, maxExclusive: Int): Gen[Int] = r ?=> r.between(minInclusive, maxExclusive)
val char: Gen[Char] = r ?=> r.nextPrintableChar()
def string(length: Int): Gen[String] = r ?=> (0 until length).map(_ => char).mkString // `until`, so exactly `length` chars
@targetName("alphanumericString")
def str(length: Int): Gen[String] = r ?=> r.alphanumeric.take(length).mkString
def shuffle[T, C](xs: IterableOnce[T])(implicit bf: BuildFrom[xs.type, T, C]): Gen[C] = r ?=> r.shuffle(xs)

// There are more basic generator methods, but they are just more of the same.

This leverages Scala 3 Context Functions, which is a really nice feature.
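
To make the role of the context function concrete, here is a small sketch that writes the same generator twice: once as a plain function that threads the Random by hand, and once as a context function. The names diceExplicit, twoDiceExplicit, dice, twoDice and run are made up for this illustration and are not part of the Gen code above.

```scala
import scala.util.Random

// 1. Plain function: the Random has to be passed along by hand.
val diceExplicit: Random => Int = r => r.nextInt(6) + 1
def twoDiceExplicit(r: Random): Int = diceExplicit(r) + diceExplicit(r)

// 2. Context function: the compiler passes the Random that is in scope.
type Gen[T] = Random ?=> T
val dice: Gen[Int] = summon[Random].nextInt(6) + 1
val twoDice: Gen[Int] = dice + dice // each `dice` draws from the same Random

// Hypothetical runner: the Random is supplied once, at the edge.
def run[T](seed: Long)(gen: Gen[T]): T = gen(using new Random(seed))
```

Both versions draw the same two numbers for a given seed. The context function version just leaves the plumbing to the compiler, which is what keeps the generators in this post so short.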

Let's apply Gen to the EmployeePensionTest:

//> using test.dep org.scalameta::munit::1.0.4
import munit.FunSuite
import scala.util.Random

class EmployeePensionTest extends FunSuite:

  test("employee is pension eligible if job grade allows it"):

    // 🧱 Expressive setup just to reach job.grade.eligibleForPension
    val employee: Employee = Gen.fromSeed(2):
      val address = Address(str(10), str(6), str(4), str(2))
      val department = Department(str(5), str(8), str(6))
      val grade = Grade(str(2), eligibleForPension = true) // 🔥 only thing that matters
      val job = Job(str(10), grade, level = int)
      val bank = BankDetails(str(15), str(11), str(between(3, 10)))
    
      Employee(
        id = str(8),
        name = str(10),
        birthDate = str(10),
        address = address,
        department = department,
        job = job,
        salary = double,
        bonuses = List.fill(between(1, 10))(double),
        bankDetails = Some(bank),
        active = boolean
      )

    // ✅ Execution
    val result = employee.isPensionEligible

    // ✅ Assertion
    assert(result)

There are still a bunch of parameters, but the only one that matters is defined explicitly. The rest is generated because it isn't relevant to the test.

Setting the seed to something specific is important so that failures can be reproduced. Gen: is equivalent to Gen.fromSeed(1):. Selecting a different seed generates different data, which can be helpful.
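
A minimal, self-contained sketch of the reproducibility claim follows. It redefines a stripped-down fromSeed and str so that it runs on its own. These are stand-ins that mirror the definitions earlier in the post, not the exact code from the repository.

```scala
import scala.util.Random

type Gen[T] = Random ?=> T

// Stand-in for Gen.fromSeed: run a generator with a freshly seeded Random.
def fromSeed[T](seed: Long)(gen: Gen[T]): T = gen(using new Random(seed))

// Stand-in for the alphanumeric str generator.
def str(length: Int): Gen[String] = summon[Random].alphanumeric.take(length).mkString

val first: String = fromSeed(2)(str(8))
val again: String = fromSeed(2)(str(8)) // same seed, so the same value as `first`
val other: String = fromSeed(3)(str(8)) // different seed, most likely a different value
```

Because the seed lives in the test source, a failing test produces the same data again on the next run, on any machine.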

With many tests, reuse becomes essential. So let's make some reusable domain specific generators:

val genGrade: Gen[Grade] = Grade(
  code = str(3),
  eligibleForPension = boolean,
)

val genJob: Gen[Job] = Job(
  title = str(12),
  grade = genGrade,
  level = between(1, 10),
)

val genAddress: Gen[Address] = Address(
  street = str(10),
  city = str(6),
  postalCode = str(4),
  country = str(2),
)

val genDepartment: Gen[Department] = Department(
  id = str(5),
  name = str(8),
  costCenter = str(6),
)

val genBankDetails: Gen[BankDetails] = BankDetails(
  iban = str(15),
  bic = str(8),
  bankName = str(6),
)

val genEmployee: Gen[Employee] = Employee(
  id = str(8),
  name = str(10),
  birthDate = str(10),
  address = genAddress,
  department = genDepartment,
  job = genJob,
  salary = double,
  bonuses = List.fill(between(1, 10))(double),
  bankDetails = genBankDetails.?,
  active = boolean
)
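
Domain generators do not have to produce random strings everywhere. As a sketch, and assuming a helper named oneOf that is not part of the generators shown above, a field can be drawn from a fixed set of plausible values instead. The names oneOf, genCountry, genLevel and run are hypothetical.

```scala
import scala.util.Random

type Gen[T] = Random ?=> T

// Hypothetical helper: pick one element of a non-empty sequence.
def oneOf[T](xs: Seq[T]): Gen[T] = xs(summon[Random].nextInt(xs.length))

// Hypothetical domain values, chosen only for illustration.
val genCountry: Gen[String] = oneOf(Seq("NO", "SE", "DK"))
val genLevel: Gen[Int] = oneOf(1 to 10)

// Minimal runner so the sketch is self-contained.
def run[T](seed: Long)(gen: Gen[T]): T = gen(using new Random(seed))
```

A generator like genCountry could then replace str(2) for the country field, which keeps the generated data readable when a test fails.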

With these generators in place, the test can be reduced to:

//> using test.dep org.scalameta::munit::1.0.4
import munit.FunSuite
import scala.util.Random

class EmployeePensionTest extends FunSuite:
  
  test("employee is pension eligible if job grade allows it"):
    // 🧱 Less setup just to reach job.grade.eligibleForPension
    val employee: Employee = Gen.fromSeed(3):
      Employee(
        id = str(8),
        name = str(10),
        birthDate = str(10),
        address = genAddress,
        department = genDepartment,
        job = Job(
          title = str(10),
          grade = Grade(
            str(2),
            eligibleForPension = true // 🔥 only thing that matters
          ),
          level = int
        ),
        salary = double,
        bonuses = List.fill(between(1, 10))(double),
        bankDetails = genBankDetails.?,
        active = boolean
      )

    // ✅ Execution
    val result = employee.isPensionEligible

    // ✅ Assertion
    assert(result)

It's less, but there's still a lot of setup.

To keep it really basic, let's generate a complete employee and use a simple copy to override the one field that matters. There isn't anything more to it.

//> using test.dep org.scalameta::munit::1.0.4
import munit.FunSuite
import scala.util.Random

class EmployeePensionTest extends FunSuite:

  test("employee is pension eligible if job grade allows it"):
    // 🧱 Small setup just to set job.grade.eligibleForPension
    val employee: Employee = Gen.fromSeed(4):
      genEmployee.copy(
        job = genJob.copy(
          grade = genGrade.copy(
            eligibleForPension = true // 🔥 only thing that matters
          )
        )
      )

    // ✅ Execution
    val result = employee.isPensionEligible

    // ✅ Assertion
    assert(result)

This is really clear: only the minimum that needs to be set is set. Nothing else matters, so nothing else is mentioned.

If you want to avoid copy calls altogether, lenses are another option — but that’s a topic for a future post.

I've added the generators to a small code base that can be found at https://github.com/skytteren/dataseed. There isn't much to the basic generators: at the time of writing, they amount to fewer than 70 lines of code in one file, so there is no release. The sensible thing is to copy and paste the file and adjust it to your liking.

After using these generators for more than a year on a project, I've found that focusing only on what matters makes tests easier to read, faster to write, and more resilient to change.

I hope this approach is food for thought.
