Book review: The Goal

I am a sucker for novels as allegory for technologists. When my friend Bex recommended Eliyahu M. Goldratt’s The Goal, I knew I had to give it a try. Surprisingly, the wait for the audiobook was measured in months, so I had plenty of time to wait. But the wait was worth it.

The Goal is set in a Rust Belt town where the local factory is struggling to remain productive (and therefore profitable). Our narrator is the plant manager, who is pouring all of himself into trying to save the plant. This, as you might expect, puts a serious strain on his home life. By chance, the narrator runs into his college physics professor. The professor works in a few moments here and there to help his former student improve the plant’s flow — often in counterintuitive ways — by asking questions that lead the team to a successful outcome.

You might ask what early-1980s industrial manufacturing has to do with the modern software industry, and you’re right to ask it. The setting is not immediately applicable, and a lot of the references (and attitudes in a few places) are very dated. But just like in my review of Tom DeMarco’s The Deadline, I did not find myself thinking “boy, I’m sure glad that problem is solved now.”

What’s the goal?

We still face (as do people in many other industries) being judged on local performance, whether or not that helps — or even hurts — the broader organizational goals. A person whose performance is graded based on the number of tasks they complete will complete as many tasks as they can, regardless of whether or not it creates a bottleneck downstream or is “wasted’ work. As Jonah, the professor character, says “a system of local optimums is not an optimum system at all. It is a very inefficient system.”

This, perhaps, is the one lesson that we’ve retained. System administrators know, even if managers and executives sometimes forget, that fully-utilized systems are brittle. If you want a resilient system, you need to have excess capacity in order to handle bursts. Stable systems are inefficient, and efficiency for efficiency’s sake is a bad goal.

The overall message of The Goal, and the first question that Jonah asks, is “what is the goal of the organization?” For businesses, the goal is to make money. It’s not to manufacture widgets, ship software, sell services, or whatever. Those are ways the business achieves the goal of making money. Market fit, quality, price, etc are factors in how successful the product is, but aren’t the goal in themselves. The rest of the book describes how to analyze flow through the process to optimize overall throughput in service of that goal.

Other lessons along the way

I’m not sure how much Goldratt intended for me to take these additional lessons from his novel, but I picked up a few other things along the way. First, is that if you’re too busy to update your documentation, you will get busier due to out-of-date documentation. At one point, the plant’s leadership team is operating under assumptions based on outdated documentation, which causes them to make the wrong decision. Bad documentation leads to rework, which is both demoralizing to the worker and counter to optimizing throughput.

Another lesson is that priorities that aren’t clearly communicated and understood aren’t priorities in fact. People have to understand what they’re supposed to be working on and why. If that changes and they don’t know about it, they’ll keep following the old priorities. Understanding the “why?”, especially when something seems counterintuitive, is important to get people to buy in and follow the plan.

Lastly, the Socratic method only works if both people are willing to use it. At one point, having seen how well the Socratic method worked to lead him to improvements at the plant, the plant manager tries it on his estranged wife. Because she sees his questions as having seemingly-obvious answers, he comes across as an asshole. Asking questions to understand the layers of an answer can be a very helpful approach, but everyone involved has to be in the right space for it.

My verdict

Overall, I thought The Goal was a solid book. The lessons are well-communicated, valuable, and not overly-preachy. The audiobook version is surprisingly well-produced, which made it fun as well as informative. If I ever become a college professor (unlikely!), I’d love to teach a course of allegorical novels, and The Goal will fit in well with The Deadline and The Phoenix Project.

Book review: Guiding Star OKRs

Setting, tracking, and reporting OKRs is terrible. But what if it wasn’t? In Guiding Star OKRs: A New Approach to Setting and Achieving Goals, Staffan Nöteberg lays out a framework that focuses on heading in the right direction instead of trying to meet exact targets. Unlike the OKRs you may have experienced, Guiding Star OKRs are focused on getting the right results, not the pre-determined results.

In a previous role, management was big into setting cascading OKRs (which Nöteberg says not to do). My objectives were my manager’s key results. My manager’s objectives were my VP’s key results, and so on. The end result was that there was no reward for helping colleagues meet their individual OKR targets. Instead of working together, we all worked individually in the hope that it ended up achieving the company’s broader goals. Spoiler alert: it did not.

I went into reading Guiding Star OKRs expecting to shake my head at a slight variation on a broken system. Instead, I came away enthusiastic about Nöteberg’s approach. Finally, OKRs that are meaningful!

The concept of setting and attracting objectives and key results (OKRs) really took off a Google. As anyone in the sysadmin/DevOps space in the early 2010s can tell you, a lot of organizations copied Google without considering if they need to.

This book is full of practical advice and examples to help the reader adopt the framework to their organization’s specific needs. For example: “never engage in setting objectives or key results when tasks are already determined.”

Guiding Star OKRs is a must-read for leaders who want to achieve results in a sustainable way. It’s now available in print and digital formats from The Pragmatic Bookshelf.

Full disclosure: I provided the publisher with a praise quote for this book and received a complimentary copy as a result. I received no compensation for this review.

#inaction bcotton

On 25 June 2018, I published a post called “It’s hattening”. After years of rejected applications, I was finally starting a job at Red Hat. On 24 April 2023, Red Hat announced a 4% reduction in global staff. As a member of that 4%, today is my last day at Red Hat.

What does this mean for Ben?

This is the first time I’ve been laid off from a job. I hope it will be the last, but who can say? I’d be lying if I said I haven’t felt a big range of emotions in the past three weeks: confusion, anger, sadness, amusement.

But I’ve also felt loved. I’ve received so much support from people since the news started spreading. It’s like that end scene of “It’s a Wonderful Life” and I’m George Bailey. I’m proud of the contributions I’ve made to the Fedora community over the last five years, and it feels good to have others recognize that.

While I won’t be contributing as the Fedora Program Manager anymore, I was a Fedora contributor long before I joined Red Hat, and I’m not letting them take that away from me. I’ll still be around Fedora in ways that spark joy, although perhaps not much at first as I let my wounds heal.

I’ve had the great fortune to build an incredible professional and personal network over the years. I’m already pursuing a few opportunities and if those don’t pan out, I’ll be asking for your help finding more. In the meantime, I have (at least) a few weeks to relax for a bit. There’s a ton of work to do around the house, many trails to hike, Program Management for Open Source Projects to promote, and an embarrassingly-large backlog for Duck Alignment Academy articles.

What does this mean for Fedora?

I’ve told folks that if Fedora falls off the rails, then I have failed. I’m working with Matthew, Justin, and others to ensure coverage of the core job duties one way or another. I’ve worked hard over the years to automate tasks that can be automated. The documentation is far more comprehensive than what I inherited.

No doubt there are gaps in what I’ve left for my successors. However, my goal is that in a few months, nobody will notice that I’m gone. That’s my measure of success. The only reason I’ve been successful in my role is because of the work done by my predecessors: John, Robyn, Jaroslav, and Jan.

As to what the broader implication behind the loss of my position might be, I don’t know. There’s no indication that my role was targeted specifically. There are definitely people in Red Hat who continue to view Fedora as strategically important. I wish I had a clearer understanding of how they chose people/roles to cut, but I’ll probably never know the process. What I do know is that I fully intend to still be participating in the Fedora community when my account hits the 20-year mark in May 2029.

In defense of Fedora’s release cycle

Earlier this week, Thorsten Leemhuis published a thoughtful post about what he’d change if he magically became the supreme leader of Fedora. In that post and subsequent commentary on Mastodon and Fedora Discussion, he talked about changing Fedora’s release cycle. Since the Fedora Linux release process is my job, I figured I should explain why I disagree.

Integration projects are different

If you haven’t read the post, you should. But here’s the short version: Fedora Linux uses a release model rooted in the 1990s and should move to a “modern” model. Thorsten suggests a one-month cadence for those who want the latest versions and a one-year “steady” release. Such a model has worked well for Firefox, he argues, and so it should work for Fedora.

The key reason I think this is wrong is because Firefox is a development project whereas Fedora is an integration project. Integration projects don’t write a lot of code, they take the work of others and turn it into a coherent whole. This is a fundamentally different kind of work and it takes longer by necessity.

You can’t reliably integrate disparate pieces when they’re in constant motion. That’s why we have freezes leading up to the beta and final releases — they give the QA team time to test against a stationary target. It takes time to run through all of the test cases that make Fedora Linux a reliable operating system. So the choice becomes reducing the pre-release testing or spending a significant portion of the cycle in a freeze, which limits the the usefulness of the one-month cycle.

You can solve some of this with automated testing. And the QA does do a lot of automated testing. But those tests still take time, and there are a lot of interrelated parts in a Linux distribution.

Six months isn’t magic

There’s nothing objectively correct about a six month release cycle. It’s mostly because that’s how you get two releases a year. If the calendar had 10 months, the release cycle would be five. But there is a lower bound where you’ve become a de facto rolling release, even if you still have discrete releases. I don’t know where exactly that boundary is, but I suspect that one month is at or just beyond it.

Similarly, there’s an upper limit where you’re now a slow, plodding project. Again, I can’t say where the line is. Six months may be uncomfortably close to it, but I suspect it’s closer to a year. And, of course, it depends on the nature of the specific project.

So there’s no particular reason Fedora Linux couldn’t move to a shorter release cycle. Five months is totally doable. Four is possible. Three would require a tremendous amount of work before it could be considered. But what’s the benefit of going to a shorter cycle? Does five months instead of six make a meaningful difference? At least with six months, you know there’s a release targeted for April and October. Predictability is nice.

Solving the actual problem

The bigger issue, though, is that I don’t think people actually want this. Yes, you might want your web browser and other applications to update frequently. But that doesn’t mean you want your compiler or Python interpreter or C libraries to update frequently. Most people will avoid this in favor of the “steady” stream. This eliminates the intended benefit to upstream projects.

The people who do want everything to update quickly use a rolling release distribution, something that Thorsten explicitly says his proposal is not.

Fundamentally, the proposal looks at the problem the wrong way. The problem isn’t that a six month cycle is too long. The problem is that application delivery is coupled to operating system delivery. Most people want the latest versions of the applications they care about and for everything else to remain unchanged. The challenge, of course, is that not everyone draws that distinction in the same way.

We unsuccessfully tried to solve this with Modularity. Flatpak, at least for graphical applications, offers another attempt to solve this problem.

Historically, the system and application layers have been distributed together. Figuring out how to decouple these (including how to draw the line between them) is the interesting work. And it provides real value to the end users.

Don’t make new tools fit the same hole as the old tools

I forgot what prompted me to have this thought three and a half years ago, but it seems fitting for the moment.

One of the worst things you can do when replacing a tool is to try to make it work just like the old one. If the old and new tools were meant to be exactly the same, they wouldn’t be different tools.

Change is hard but necessary. You know what else is hard? Trying to contort your old workflow to work with the new tool. Replacing the tool is an opportunity to improve your processes. If nothing else, it prevents you from fighting the tool.

This holds true even if you’re writing the tool yourself. If you’re doing the work to write a new tool (or rewrite an old one), take the opportunity to re-think how you work. What assumptions have you carried forward that are no longer valid? What new ways of working have you learned since you first adopted the old tool?

I’m seeing this play out on Mastodon as people used to Twitter try to adjust. They expect certain things based on their use of Twitter. And while Mastodon has a lot in common with Twitter, they’re not the same. Some things may change as Mastodon grows. And some of the Mastodon experience probably should be more like Twitter, even if it isn’t. But if you make the switch, think about why you think it should work the way you want.

Balancing advancement and legacy

Later today, I’ll submit a contentious Change proposal to the Fedora Engineering Steering Committee. Several contributors proposed deprecating support for legacy BIOS starting in Fedora Linux 37. The feedback on the mailing list thread and in social media is…let’s call it “mixed”.

The bulk of the objections distill down to: I have old hardware and it should still work. Indeed, when proprietary operating systems vendors (both in the PC and mobile spaces) embrace varying forms of planned obsolescence, open source operating systems can allow users to continue using the hardware they own. Why shouldn’t it continue to be supported?

Nothing comes for free. Maintaining legacy support requires work. Bugs need fixes. Existing code can hamper the addition of new features. Even in a community-driven project, time is not unlimited. It’s hard to ask people to keep supporting software that they’re no longer interested in.

I think some distros should strive to provide indefinite support for older hardware. I don’t think all distros need to. In particular, Fedora does not need to. That’s not what Fedora is. “First” is one of our Four Foundations for a reason. Other distros focus on long-term support and less on integrating the latest from upstreams. That’s good. We want different distros to focus on different benefits.

That’s not to say that we should abandon old hardware willy-nilly. It’s a balance between legacy support and advancing innovation. The balance isn’t always easy to find, but it’s there somewhere. There are always tradeoffs.

I don’t have a strong opinion on this specific case because I don’t know enough about it. We have to make this decision at some point. Is that now? Maybe, or maybe not.

Sidebar: it’s hard to know

One of the benefits of (most) open source operating systems also makes these kinds of decisions harder. We don’t collect detailed data about installations. This is a boon for user privacy, but it means we’re generally left guessing about the hardware that runs Fedora Linux. Some educated guesses can be made from the architecture of bug reports or from opt-in hardware surveys. But they’re not necessarily representative. So we’re largely left with hunches and anecdata.

My 2021 Todoist year in review

I use Todoist to manage my to-do lists. I was surprised to receive a “2021 year in review” email from the service the other day, so I thought I’d share some thoughts on it. I’m not what I’d call a productivity whiz, or even a particularly productive person. But the data provide some interesting insights.

2021 status

First, I apparently completed 2900 tasks in 2021. That’s an interestingly-round number. The completed the most tasks in November, which was a little surprising. I don’t feel like November was a particularly busy month for me. However, I’m not surprised that May had my lowest number. I was pretty burnt out, had just handed off a major project at work, and took over e-learning for my kids. May was unpleasant.

MonthTasks
January201
Feburary205
March254
April261
May196
June260
July201
August227
September251
October280
November296
December268

Looking at the days of the week, Thursday had the most completed tasks. Saturday had the fewest (yay! I’m doing an okayish job of weekending.)

Zooming in to the time of day, the 10 AM–12 PM window was my most productive. That makes sense, since it’s after I’ve had a chance to sort through email and have my early meetings. I definitely tend to feel most productive during that time. Perhaps I should start blocking that out for focus work. Similarly, I was most likely to postpone tasks at 3pm. This is generally the time that I either recognize that I’m not going to get something done that day or that I decide I just don’t want to do that thing.

Making sense of it all

Todoist says I’m in the top 2% of users in 2021. Perhaps that argues against my “I’m not particularly productive” assertion. It’s more likely that I just outsource my task management more than the average person. I put a lot of trivial tasks in so that I can get that sweet dopamine hit, but also that I just don’t have to think about them.

I don’t remember if Todoist did a year in review last year, but if they did I spent no time thinking about it. But based on what I’ve learned about the past year, I’m going to guard my late-morning time a little more jealously. I’ll try to save the trivial tasks for the last hour of the work day. This may prove challenging for me. It’s basically a more boring version of the marshmallow test.

Using variables in Smartsheet task names

I use Smartsheet to generate the Fedora Linux release schedules. I generally copy the previous release’s schedule forward and update target release date. But then I have to search for the release number (and the next release number, the previous release number, and the previous-previous release number) to update them. Find and replace is a thing, but I don’t want to do it blindly.

But last week, I figured out a trick to use variables in the task names. This way when I copy a new schedule, I just have to update the number once and all of the numbers are updated automatically.

First you have to create a field in the Sheet Summary view. I called it “Release” and set it to be of the Text/Number type. I put the release number in there.

Then in the task name, I can use that field. What tripped me up at first was that I was trying to do variable substitution like you might do in the Bash shell. But really, what you need to do is string concatenation. So I’d use

="Fedora Linux " + Release# + " release"

This results in “Fedora Linux 37 release” when release is set to 37. To get the next release, you do math on the variable:

="Fedora Linux " + (Release# + 1) + " release"

This results in “Fedora Linux 38 release” when release is set to 37. This might be obvious to people who use Smartsheet deeply, but for me, it was a fun discovery. It saves me literally minutes of work every three years.

What does it mean for a Linux distribution to be “fresh”?

I recently had a discussion with Luboš Kocman of openSUSE about how distros can monitor their “freshness”. In other words: how close is a distro to upstream? From our perspectives, it’s helpful to know which packages are significantly behind their upstreams. These packages represent areas that might need attention, whether that be a gentle nudge to the maintainer or recruiting additional volunteers from the community.

The challenge is that freshness can mean different things. The Repology project monitors a large number of distributions and upstreams to report on the status. But simply comparing the upstream version number to the packaged version number ignores a lot of very important context.

Updating to the latest upstream version as soon as it comes out is the most obvious definition of “fresh”, but it’s not always the best. Rolling releases (and their users) probably want that. In Fedora, policy is to not do “major updates” within a release. Many other release-oriented distributions have a similar policy, with varying degrees of “major”. Enterprise distributions add another wrinkle: they’ll backport security fixes (and sometimes key features), so the difference in version number doesn’t necessarily tell you what’s missing.

Of course, the upstream’s version number doesn’t necessarily tell you much. Semantic versioning is great, but not everyone uses it. And not everyone that uses it uses it well. If a distribution has version 1.4 and upstream released 1.5, is that a lack of freshness or an intentional decision to avoid mid-release compatibility changes?

I don’t have a good answer. This is a hard problem to solve. Something like Repology may be the best we can do with reasonable effort. But I’d love to have a more accurate view of how fresh Fedora packages are within the bounds of policy.

How to document your job

I was recently talking to a coworker who is in the middle of a large project to document her job. The goal is two-fold: to give teammates the ability to cover for her when she’s out of the office and to improve the onboarding experience for new team members. She has her manager’s support, but it’s a large project. She finds it difficult to make progress, in part because she doesn’t particularly like writing. As a result, the deadline keeps slipping further out.

I won’t claim to be the best documentarian, but I try to always leave my successor with better documentation than my predecessor left me. I’d like to think I’ve succeeded in that. In some cases, the bar was pretty low because there was no documentation at all. At any rate, this post is a collection of things I’ve learned over the years. This is an opinionated post and does not necessarily represent the One True Way to Document Your Job™.

Writing

  • Separate the why and the how. You have two separate audiences: someone filling in for a day or two and someone taking over the role. The information they need is very different. A person covering short-term probably doesn’t need or want to know the whole history of a process. They just want to know the steps they need to take. On the other hand, your replacement will benefit from a greater level of explanation. They’ll want to know why the steps are they way they are. In particular, knowing what didn’t work well in the past is a good reference to help them avoid re-learning that lesson later.
  • Start at a high level. If you cover the broad stuff first, that gives your reader something to work with, even if they need to ask you or someone else to help fill in the details. On the other hand, a detailed list for process A does nothing to help with process B. Starting at a high level also allows you to…
  • Write small chunks. You don’t need to write everything in a day. Starting with higher level concepts allows you to break the role into a few key areas. From there, you can break it down further and further.
  • Limit the work in progress. Similarly, you don’t need to write everything at the same time. if you work one one or two concepts at a time, you can get those sufficiently done and move on to the next. Yes, it’s a bummer if the one of the incomplete concepts is what your coworker needs to cover a sick day, but some fully-complete documentation is generally better than a lot of woefully incomplete documentation.
  • Avoid duplication. The more places the same information is recorded, the more likely it will become out of date. If documentation for something already exists, link to the authoritative source. You can provide some basic context around the link, but minimize the amount of copy/paste you do.
  • Write as you learn. If you’re just getting started, try to write as much as you can as you learn it. That way, you don’t end up assuming knowledge that your reader won’t have. If you’re learning from someone else, this also gives you the opportunity to let them verify it.

Verifying

The first time you write documentation, it will almost always be incomplete. You’ll want to get feedback to help improve it. This turns out to be another benefit of small chunks with limited work in progress. It’s almost like Agile was on to something here!

  • Trade places with a coworker. If you can (this requires management support for sure), swapping jobs with a coworker for a few hours gives you the opportunity to see how the documentation performs in real life. Doing it while you’re still around means that the documentation has been tested before you have a sick day or vacation. If there are problems, you can answer them without being inconvenienced.
  • Get feedback quicky. Even if you don’t swap roles, at least let someone look at it as soon as you’re done writing. This way everything is fresh in your mind as the questions come up.

The mechanics

You may have noticed that I cleverly avoided discussing the technical aspects. Do you write a wiki page? A word processor document? A standalone note taking app (like CherryTree)? Rendered HTML on a docs site? I leave this as an exercise for the reader. They all have benefits and drawbacks, so whatever works best for your team is the right answer. If people can’t find or update the documentation, then why did you bother writing it?