Mutual recursion for fun and profit
I've been wanting to write this post for a while, about what I think is an elegant way to solve some constraint satisfaction problems. Constraints come up fairly often in real-world programs, and sometimes it can be effective to treat them as constraint satisfaction problems. This post starts with some background on constraint satisfaction problems I've encountered recently, then develops Rust code for an algorithm that we can easily use to solve some Advent of Code problems, and finally uses it to build a solver for sudoku puzzles. Along the way, we explain the syntax we use, so it shouldn't be too hard to follow for someone who is unfamiliar with the language.
Why not just use postgres?
The past few weeks I've been experimenting with DuckDB, and as a consequence I've ended up talking about it a lot as well. I'm not going to lie, I really like it! However, experienced programmers will rightly be skeptical of adding new technology that overlaps with something that already works great. So why not just use postgres?
Analyzing open data with DuckDB
When I started writing this blogpost a few weeks back, I was planning to look into DuckDB and write about it for maybe a couple of hours, and see where that got me. Something close to 30 hours of work later, I'm clawing myself back up from a very deep rabbit hole, and I don't know how to introduce this blogpost properly. See, what happened was that I discovered data.entur.no while looking for some meaningful data sets to apply DuckDB to. It contains a data set with arrival times, departure times and scheduled times for most of the public transit in Norway.
Deno Scripting
Have you ever felt like you’re making too many shell scripts? Some may say that every line of bash is too much bash.
An example of why bash may not be the best solution for writing scripts is shellcheck. There are too many possible mistakes.
Instrumenting an application with OpenTelemetry without editing code
It's a Tuesday afternoon, and management has been in a state of poorly contained hysteria all day, running in and out of meetings continuously. The team is trying to focus on work, but it's hard to avoid gossiping.
🎶 These points of data make a beautiful line 🎶
One of my most vivid memories is from the day in my late teens when I first got a contact lens for my left eye. It took a long time to discover that I had poor vision in this eye; you see, like many people, I chose to keep both of my eyes open when I wasn't sleeping. It was the headaches that sent me to a doctor. I was adamant that I could see well, but when he blocked my right eye, I had the humbling experience of no longer being able to read the biggest letters on the poster. It turned out that my headaches were probably due to my brain working overtime to interpret the world using mostly only one eye. My appointment with an optician was only a few days later, and I got to try a contact lens that same day.
Exploring a webapp using psql and pg_stat_statements
It's always an exciting day for me when I get access to the source code for an entirely new application I need to work on. How does it look inside, how does it work? Sometimes, there's some design documentation along with it, or operational procedures, or maybe some developer handbook in a wiki. I do check all of those, but I don't expect any of those things to accurately describe how the code works, because they tend to change less frequently. It's also a fairly low-bandwidth approach; it takes a ton of time to ingest technical text.
What if it isn't a bool?
A common way that code grows difficult to reason about is increasing the number of things you must keep in your head simultaneously to understand it. Often, this simply happens by adding one attribute, one variable, one column at a time. Some people are gifted with a great capacity for working memory, but most of us aren't -- having to hold the state of 5 variables in your head simultaneously to understand a piece of code may be pushing it, according to this Wikipedia article:
When to avoid the in operator in postgres
In my post about batch operations, I used the where id = any(:ids) pattern, with :ids bound to a JDBC array. I've gotten questions about that afterwards, asking why I do it like that, instead of using in (:id1, :id2, ...). Many libraries can take care of the dynamic SQL generation for you, so often you can just write in (:ids), just like the array example. I would still prefer to use the = any(:ids) pattern, and I decided to write down my reasoning here.
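As a sketch of the difference (the table and column names here are hypothetical, not from the post): with the array pattern the statement text stays constant no matter how many ids you pass, while the in-list pattern produces a different statement for every list length.

```sql
-- Array pattern: one fixed statement for any number of ids.
-- On the JDBC side, :ids would be bound with something like
-- connection.createArrayOf("bigint", ids).
select id, name
from users
where id = any(:ids);

-- In-list pattern: the statement text changes with the list length,
-- so lists of 1, 2 and 3 ids each yield a distinct statement for the
-- database to parse, plan and cache.
select id, name
from users
where id in (:id1, :id2, :id3);
```

A constant statement text is friendlier to prepared-statement caches on both the driver and server side, which is one reason to prefer the array form.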
Batch operations using composite keys in postgres over jdbc
Throughout a career as a software developer, you encounter many patterns. Some appear just often enough to remember that they exist, but you still need to look them up every time. I've discovered that writing things down helps me remember them more easily. This particular pattern is very useful for my current project. So, it's time to write it down and hopefully commit it to memory properly this time. Although this post is specific to PostgreSQL, I'm sure other databases have the necessary features to achieve the same results efficiently.
Scalafix
One of the things I love about the Scala ecosystem is the tooling. One of the tools I believe we should have used more is Scalafix. It's an awesome tool that can be used for everything from linting to automatic rewrites. In the blog post we-love-scala3 we wrote some rules to help adoption of the new Scala 3 syntax. Now let's explore why we ended up with rewrites that did not compile.
We love scala 3 - Let's migrate!
I have a confession to make: I find Scala 3 awesome and I love it! I am a lazy developer; I love a good codebase, and I absolutely hate doing boring chores. Manual labor is not for me. Let's automate it ;-). The Scala team and the community have enabled us to automate the migration, and in this post we will take it one step further.
Careful with that Lock, Eugene: Part 2
A while back, I wrote Careful with That Lock, Eugene about an idea for how to check whether a database migration is likely to disturb production. That post came about after an inspiring chat with a colleague about the advantages of transactional migration scripts and the ability to check the postgres system catalog views before committing a transaction.
Careful with that Lock, Eugene
It is rewarding to work on software that people care about and use all around the clock. This constant usage means we can't simply take the system offline for maintenance without upsetting users. Therefore, techniques that allow us to update the software seamlessly without downtime or compromising service quality are incredibly valuable.
How to test for missing indexes on foreign keys
If you're developing a transactional application backed by postgres, there's a pretty cool trick you can use to check whether you're missing indexes that could cause serious performance issues or even outages. In particular, I mean foreign keys where the referencing side of the constraint does not have an index. The idea is simple: select all of the columns that take part in a foreign key, then remove the ones that are covered by a complete index, and the remainder should be the empty set, or possibly match a known allowlist. I think this is a valuable addition to the test cases for your database migrations, or, if you can't easily do that, to your CI/CD pipeline.
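A minimal sketch of the catalog query the idea describes, assuming a modern postgres; it lists foreign keys whose referencing columns are not the leading columns of any non-partial index. This is an illustration of the pattern, not the post's exact query, so test it against your own schema before trusting it in CI.

```sql
-- Foreign keys on the referencing table that no index covers.
-- Expect zero rows, or rows matching a known allowlist.
select
  c.conrelid::regclass as referencing_table,
  c.conname            as constraint_name,
  array_agg(a.attname order by k.ord) as referencing_columns
from pg_constraint c
cross join lateral unnest(c.conkey) with ordinality as k(attnum, ord)
join pg_attribute a
  on a.attrelid = c.conrelid and a.attnum = k.attnum
where c.contype = 'f'           -- foreign key constraints only
  and not exists (
    select 1
    from pg_index i
    where i.indrelid = c.conrelid
      and i.indpred is null     -- partial indexes don't count
      -- the FK columns must be among the leading index columns
      and (i.indkey::int2[])[0 : cardinality(c.conkey) - 1] @> c.conkey
  )
group by c.conrelid, c.conname;
```

Running this as an assertion in a migration test suite catches the missing index at review time instead of during a production delete on the referenced table.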
Friends don't let friends export to CSV
I worked for a few years in the intersection between data science and software engineering. On the whole, it was a really enjoyable time and I'd like to have the chance to do so again at some point. One of the least enjoyable experiences from that time was dealing with big CSV exports. Unfortunately, this file format is still very common in the data science space. It is easy to understand why -- it seems to be ubiquitous, it's human-readable, it's less verbose than options like JSON and XML, and it's super easy to produce from almost any tool. What's not to like?