#inaction bcotton

On 25 June 2018, I published a post called “It’s hattening”. After years of rejected applications, I was finally starting a job at Red Hat. On 24 April 2023, Red Hat announced a 4% reduction in global staff. As a member of that 4%, today is my last day at Red Hat.

What does this mean for Ben?

This is the first time I’ve been laid off from a job. I hope it will be the last, but who can say? I’d be lying if I said I haven’t felt a big range of emotions in the past three weeks: confusion, anger, sadness, amusement.

But I’ve also felt loved. I’ve received so much support from people since the news started spreading. It’s like that end scene of “It’s a Wonderful Life” and I’m George Bailey. I’m proud of the contributions I’ve made to the Fedora community over the last five years, and it feels good to have others recognize that.

While I won’t be contributing as the Fedora Program Manager anymore, I was a Fedora contributor long before I joined Red Hat, and I’m not letting them take that away from me. I’ll still be around Fedora in ways that spark joy, although perhaps not much at first as I let my wounds heal.

I’ve had the great fortune to build an incredible professional and personal network over the years. I’m already pursuing a few opportunities, and if those don’t pan out, I’ll be asking for your help finding more. In the meantime, I have (at least) a few weeks to relax for a bit. There’s a ton of work to do around the house, many trails to hike, Program Management for Open Source Projects to promote, and an embarrassingly large backlog for Duck Alignment Academy articles.

What does this mean for Fedora?

I’ve told folks that if Fedora falls off the rails, then I have failed. I’m working with Matthew, Justin, and others to ensure coverage of the core job duties one way or another. I’ve worked hard over the years to automate tasks that can be automated. The documentation is far more comprehensive than what I inherited.

No doubt there are gaps in what I’ve left for my successors. However, my goal is that in a few months, nobody will notice that I’m gone. That’s my measure of success. The only reason I’ve been successful in my role is because of the work done by my predecessors: John, Robyn, Jaroslav, and Jan.

As to what the broader implication behind the loss of my position might be, I don’t know. There’s no indication that my role was targeted specifically. There are definitely people in Red Hat who continue to view Fedora as strategically important. I wish I had a clearer understanding of how they chose people/roles to cut, but I’ll probably never know the process. What I do know is that I fully intend to still be participating in the Fedora community when my account hits the 20-year mark in May 2029.

In defense of Fedora’s release cycle

Earlier this week, Thorsten Leemhuis published a thoughtful post about what he’d change if he magically became the supreme leader of Fedora. In that post and subsequent commentary on Mastodon and Fedora Discussion, he talked about changing Fedora’s release cycle. Since the Fedora Linux release process is my job, I figured I should explain why I disagree.

Integration projects are different

If you haven’t read the post, you should. But here’s the short version: Fedora Linux uses a release model rooted in the 1990s and should move to a “modern” model. Thorsten suggests a one-month cadence for those who want the latest versions and a one-year “steady” release. Such a model has worked well for Firefox, he argues, and so it should work for Fedora.

The key reason I think this is wrong is that Firefox is a development project whereas Fedora is an integration project. Integration projects don’t write a lot of code; they take the work of others and turn it into a coherent whole. This is a fundamentally different kind of work, and it takes longer by necessity.

You can’t reliably integrate disparate pieces when they’re in constant motion. That’s why we have freezes leading up to the beta and final releases — they give the QA team time to test against a stationary target. It takes time to run through all of the test cases that make Fedora Linux a reliable operating system. So the choice becomes reducing the pre-release testing or spending a significant portion of the cycle in a freeze, which limits the usefulness of the one-month cycle.

You can solve some of this with automated testing. And the QA team does do a lot of automated testing. But those tests still take time, and there are a lot of interrelated parts in a Linux distribution.

Six months isn’t magic

There’s nothing objectively correct about a six-month release cycle. It’s mostly that six months gets you two releases a year. If the calendar had 10 months, the release cycle would be five. But there is a lower bound where you’ve become a de facto rolling release, even if you still have discrete releases. I don’t know where exactly that boundary is, but I suspect that one month is at or just beyond it.

Similarly, there’s an upper limit where you’re now a slow, plodding project. Again, I can’t say where the line is. Six months may be uncomfortably close to it, but I suspect it’s closer to a year. And, of course, it depends on the nature of the specific project.

So there’s no particular reason Fedora Linux couldn’t move to a shorter release cycle. Five months is totally doable. Four is possible. Three would require a tremendous amount of work before it could be considered. But what’s the benefit of going to a shorter cycle? Does five months instead of six make a meaningful difference? At least with six months, you know there’s a release targeted for April and October. Predictability is nice.

Solving the actual problem

The bigger issue, though, is that I don’t think people actually want this. Yes, you might want your web browser and other applications to update frequently. But that doesn’t mean you want your compiler or Python interpreter or C libraries to update frequently. Most people will avoid this in favor of the “steady” stream. This eliminates the intended benefit to upstream projects.

The people who do want everything to update quickly use a rolling release distribution, something that Thorsten explicitly says his proposal is not.

Fundamentally, the proposal looks at the problem the wrong way. The problem isn’t that a six month cycle is too long. The problem is that application delivery is coupled to operating system delivery. Most people want the latest versions of the applications they care about and for everything else to remain unchanged. The challenge, of course, is that not everyone draws that distinction in the same way.

We unsuccessfully tried to solve this with Modularity. Flatpak, at least for graphical applications, offers another attempt to solve this problem.

Historically, the system and application layers have been distributed together. Figuring out how to decouple these (including how to draw the line between them) is the interesting work. And it provides real value to the end users.

Balancing advancement and legacy

Later today, I’ll submit a contentious Change proposal to the Fedora Engineering Steering Committee. Several contributors proposed deprecating support for legacy BIOS starting in Fedora Linux 37. The feedback on the mailing list thread and in social media is…let’s call it “mixed”.

The bulk of the objections distill down to: I have old hardware and it should still work. Indeed, when proprietary operating system vendors (in both the PC and mobile spaces) embrace varying forms of planned obsolescence, open source operating systems can allow users to continue using the hardware they own. Why shouldn’t it continue to be supported?

Nothing comes for free. Maintaining legacy support requires work. Bugs need fixes. Existing code can hamper the addition of new features. Even in a community-driven project, time is not unlimited. It’s hard to ask people to keep supporting software that they’re no longer interested in.

I think some distros should strive to provide indefinite support for older hardware. I don’t think all distros need to. In particular, Fedora does not need to. That’s not what Fedora is. “First” is one of our Four Foundations for a reason. Other distros focus on long-term support and less on integrating the latest from upstreams. That’s good. We want different distros to focus on different benefits.

That’s not to say that we should abandon old hardware willy-nilly. It’s a balance between legacy support and advancing innovation. The balance isn’t always easy to find, but it’s there somewhere. There are always tradeoffs.

I don’t have a strong opinion on this specific case because I don’t know enough about it. We have to make this decision at some point. Is that now? Maybe, or maybe not.

Sidebar: it’s hard to know

One of the benefits of (most) open source operating systems also makes these kinds of decisions harder. We don’t collect detailed data about installations. This is a boon for user privacy, but it means we’re generally left guessing about the hardware that runs Fedora Linux. Some educated guesses can be made from the architecture information in bug reports or from opt-in hardware surveys, but those aren’t necessarily representative. So we’re largely left with hunches and anecdata.

What does it mean for a Linux distribution to be “fresh”?

I recently had a discussion with Luboš Kocman of openSUSE about how distros can monitor their “freshness”. In other words: how close is a distro to upstream? From our perspectives, it’s helpful to know which packages are significantly behind their upstreams. These packages represent areas that might need attention, whether that be a gentle nudge to the maintainer or recruiting additional volunteers from the community.

The challenge is that freshness can mean different things. The Repology project monitors a large number of distributions and upstreams to report on the status. But simply comparing the upstream version number to the packaged version number ignores a lot of very important context.
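
For what it’s worth, the raw comparison is easy enough to pull yourself. Here’s a rough sketch using Repology’s public API (the project name is just an example, and it assumes curl and jq are installed):

    # List the versions of "firefox" that Repology knows about in Fedora repos,
    # along with Repology's freshness status for each
    curl -s "https://repology.org/api/v1/project/firefox" \
      | jq -r '.[] | select(.repo | startswith("fedora")) | "\(.repo): \(.version) (\(.status))"'

That gets you the numbers; the context is the hard part.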

Updating to the latest upstream version as soon as it comes out is the most obvious definition of “fresh”, but it’s not always the best. Rolling releases (and their users) probably want that. In Fedora, policy is to not do “major updates” within a release. Many other release-oriented distributions have a similar policy, with varying degrees of “major”. Enterprise distributions add another wrinkle: they’ll backport security fixes (and sometimes key features), so the difference in version number doesn’t necessarily tell you what’s missing.
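
As an aside, on RPM-based systems the package changelog is often a better clue than the version number for spotting those backports. A quick sketch (the package name is just an example):

    # Look for backported security fixes that the version number hides
    rpm -q --changelog openssl | grep -i 'CVE-' | head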

Of course, the upstream’s version number doesn’t necessarily tell you much. Semantic versioning is great, but not everyone uses it. And not everyone that uses it uses it well. If a distribution has version 1.4 and upstream released 1.5, is that a lack of freshness or an intentional decision to avoid mid-release compatibility changes?

I don’t have a good answer. This is a hard problem to solve. Something like Repology may be the best we can do with reasonable effort. But I’d love to have a more accurate view of how fresh Fedora packages are within the bounds of policy.

Using Element as an IRC client

Like many people who work in open source communities, I rely on IRC as a key part of my daily life. Its simplicity has made it a mainstay. But the lack of richness also makes it unattractive to many newcomers. As a result, newer chat protocols are gaining traction. Matrix is one of those. I first created a Matrix account to participate in the Fedora Social Hour. But since Matrix.org is bridged to Freenode, I thought I’d give Element (a popular Matrix client) a try as an IRC client, too.

I’ve been using Element almost exclusively for the last few months. Here’s what I think of it.

Pros

The biggest pro for me is also the most surprising. I like getting IRC notifications on my phone. Despite being bad at it (as you may have read last week), I’m a big fan of putting work aside when I’m done with work. But I’m also an anxious person who constantly worries about what’s going on when I’m not around. It’s not that I think the place will fall apart because I’m not there. I just worry that it happens to be falling apart when I’m not there.

Getting mobile notifications means I can look, see that everything is fine (or at least not on fire enough that I need to jump in and help), and then go back to what I’m doing. But it also means I can engage with conversations if I choose to without having to sit at my computer all day. As someone who has previously had to learn and re-learn not to have work email alerts on my phone, I’m surprised at my reaction to having chat notifications there.

Speaking of notifications, I like the ability to set per-room notification settings. I can set different levels of notification for each channel and those settings reflect across all devices. This isn’t unique to Element, but it’s a nice feature nonetheless. In fact, I wish it were even richer. Ideally, I’d like to have my mobile notifications be more restrictive than my desktop notifications. Some channels I want to see notifications for when I’m at my desk, but don’t care enough to see them when I’m away.

I also really like the fact that I can have one fewer app open. Generally, I have Element, Signal, Slack, and Telegram, plus Google Chat all active. Not running a standalone IRC client saves a little bit of system resources and also lets me find the thing that dinged at me a little quicker.

Cons

By far the biggest drawback, and the reason I still use Konversation sometimes, is the mishandling of multi-line copy/paste. Element sends it as a single multi-line message, which appears on the IRC side as “bcotton has sent a long message: <url>”. When running an IRC meeting, I often have reason to paste several lines at once. I’d like them to be sent as individual lines so that IRC clients (and particularly our MeetBot implementation) see them.

The Matrix<->IRC bridge is also laggy sometimes. Every so often, something gets stuck and messages don’t go through for up to a few minutes. This is not how instant messaging is supposed to work and is particularly troublesome in meetings.

Overall

Generally, using Element for IRC has been a net positive. I’m looking forward to more of the chats I use becoming Matrix-native so I don’t have to worry about the IRC side as much. I’d also like the few chats I have on Facebook Messenger and Slack to move to Matrix. But that’s not a windmill I’m willing to tilt at for now. In the meantime, I’ll keep using Element for most of my IRC needs, but I’m not quite ready to uninstall Konversation.

Applying the Potter Stewart rule to release blockers

I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description, and perhaps I could never succeed in intelligibly doing so. But I know it when I see it.

Justice Potter Stewart in Jacobellis v. Ohio

Potter Stewart was talking about hard-core pornography when he wrote “I know it when I see it”, but the principle also applies to release blockers. These are bugs that are so bad that you can’t release your software until they are fixed. Most bugs do not fall into this category. It would be nice to fix them, of course, but they’re not world-stopping.

Making a predictable process

For the most part, you want your release blockers to be defined by specific criteria. It’s even better if automated testing can use the criteria, but that’s not always possible. The point is that you want the process to be predictable.

A predictable blocker process provides value and clarity throughout the process. Developers know what the rules are so they can take care to fix (or avoid!) potential blockers early. Testers know what tests to prioritize. Users know that, while the software might do something annoying, it won’t turn the hard drive into a pile of melted metal, for example. And everyone knows that the release won’t be delayed indefinitely.

Having a predictable blocker process means not only developing the criteria, but sticking to the criteria. If a bug doesn’t violate a criterion, you can’t call it a blocker. Of course, it’s impossible to come up with every possible reason why you might want to block a release, so sometimes you have to add a new criterion.

Breaking the predictable process

This came up in last week’s Go/No-Go meeting for Fedora Linux 34 Beta. We were considering a bug that caused a delay of up to two minutes before gnome-initial-setup started. I argued that this should be a blocker because of the negative reputational impact, despite the fact that it does not break any specific release criterion. I didn’t want first time users (or worse: reviewers) to turn away from Fedora Linux because of this bug.

QA wizard Adam Williamson argued against calling it a blocker without developing a specific criterion it violates. To do otherwise, he said, makes a mockery of the process. I understand his position, but I disagree. While there are a variety of reasons to have release blockers, I see the preservation of reputation as the most important. In other words, the point of the process is to prevent the release of software so bad that it drives away current and potential users.

Admittedly, that is an entirely subjective opinion. I expect disagreement on it. But if you accept my premise and acknowledge that you can’t pre-write criteria to catch every possible issue, then it follows that squishy “I know it when I see it” rules are sometimes okay.

Can’t you just make that a criterion?

The best blocking criteria are objective. This aids automated testing and (mostly) avoids arguments about interpretation. But that’s not always possible. Even saying “feature X works” is open to argument over what constitutes “working”.

The challenge lies in how to incorporate “this bug is bad even though there’s no specific rule” without making the process unpredictable. In this case, Adam’s position makes a lot of sense. It’s much easier to write rules to address specific issues and apply them retroactively. Of course, doing that in a go/no-go meeting is perhaps straining the word “predictable” a bit, too.

So what happened?

In the case of this specific bug, we had an escape hatch. The release was likely to be declared no-go for other reasons, so we didn’t need to come to a decision either way. With a candidate fix available, we could just pull that in as a freeze exception and write a new criterion for the next time.

Because of this, I decided not to push my argument. I declined to propose a criterion in-meeting because I wanted to take some time to think about what the right approach is. I also wanted to spend some time thinking about the blocker process holistically. This gives me a blog post to publish (hi!) and some content for a project I’ll be announcing in the near future. In the meantime, I’ve proposed a change to the criteria.

What do “rolling release” and “stable” mean in the context of operating systems?

In a recent post on his blog, Chris Siebenmann wrote about his experience with Fedora upgrades and how, because of some of the non-standard things he does, upgrades are painful for him. At the end, he said “What I really want is a rolling release of ‘stable’ Fedora, with no big bangs of major releases, but this will probably never exist.”

I’m sympathetic to that position. Despite the fact that developers have worked to improve the ease of upgrades over the years, they are inherently risky. But what would a stable rolling release look like?

“Arch!” you say. That’s not wrong, but it also misses the point. What people generally want is new stuff so long as it doesn’t cause surprises. Rolling releases don’t prevent those surprises; they spread them out. With Fedora’s policy, for example, major changes (should) happen as the release is being developed. Once it’s out, you get bugfixes and minor enhancements, but no big changes. You get the stability.

On the other hand, you can run Fedora Rawhide, which gets you the new stuff as soon as it’s available, but you don’t know when the big changes will come. And sometimes, the changes (big and little) are broken. It can be nice because you get the newness quickly. And the major changes (in theory) don’t all come at once.

Rate of change versus total change

For some people, it’s the distribution of change, not the total amount of change that makes rolling releases compelling. And in most cases, the changes aren’t that dramatic. When updates are loosely-coupled or totally independent, the timing doesn’t matter. The average user won’t even notice the vast majority of them.

But what happens when a really monumental change comes in? Switching the init system, for example, is kind of a big deal. In this case, you generally want the integration that most distributions provide. It’s not just that you get an assortment of packages from your distribution, it’s that you get a set of packages that work together. This is a fundamental feature for a Linux distribution (excepting those where do-it-yourself is the point).

Applying it to Fedora

An alternate phrasing of what I understand Chris to want is “release-quality packages made available when they’re ready, not on the release schedule.” That’s perfectly reasonable. And in general, that’s what Fedora wants Rawhide to be. It’s something we’re working on, particularly with the ability to gate Rawhide updates.

But part of why we have defined releases is to ensure the desired stability. The QA team and other testers put a lot of effort into automated and manual tests of releases. It’s hard to test against the release criteria when the target keeps shifting. It’s hard to make the distribution a cohesive whole instead of a collection of packages.

What Chris asks for isn’t wrong or unreasonable. But it’s also a difficult task to undertake and sustain. This is one area where ostree-based variants like Fedora CoreOS (for servers/cloud), Silverblue (for desktops), and IoT (for edge devices) bring a lot of benefit. The big changes can be easily rolled back if there are problems.
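
For example, if an update goes badly on one of those systems, reverting is a single command (a sketch; the change takes effect on the next boot):

    # Make the previous rpm-ostree deployment the default again
    sudo rpm-ostree rollback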

Removing unmaintained packages from an installed system

Earlier this week, Miroslav Suchý proposed removing retired packages as part of the Fedora upgrade process (editor’s note: the proposal was withdrawn after community feedback). As it stands right now, if a package is retired in a subsequent release, it will stick around on upgraded systems. For example, I have 34 packages on my work laptop from Fedora 28 (the version I first installed on it) through Fedora 31. The community has been discussing this, with no clear consensus.
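
If you’re curious how many of these you have on your own system, here’s a rough way to check (a sketch; "dnf list extras" shows installed packages that aren’t available in any enabled repository, and the rpm query matches anything still carrying an old dist tag):

    # Installed packages no longer available in any enabled repo
    dnf list extras

    # Installed packages built for older Fedora releases (adjust the versions)
    rpm -qa | grep -E '\.fc(28|29|30|31)\.'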

I’m writing this post to explore my own thoughts. It represents my opinions as Ben Cotton: Fedora user and contributor, not as Ben Cotton: Fedora Program Manager.

What does it mean for a package to be “maintained”?

This question is the heart of the discussion. In theory, a maintained package means that there’s someone who can apply security and other bug fixes, update to new releases, etc. In practice, that’s not always the case. Anyone who has had a bug closed due to the end-of-life policy will attest to that.

The practical result is that as long as the package continues to compile, it may live on for a long time after the maintainer has given up on it. This doesn’t mean that it will get updates, it just means that no one has had a reason to remove it from the distribution.

On the other hand, the mere fact that a package has been dropped from the distribution doesn’t mean that something is wrong with it. If upstream hasn’t made any changes, the “unmaintained” version is just as functional as a maintained version would be.

What is the role of a Linux distribution?

Why do Linux distributions exist? After all, people could just download the software and build it themselves. But that’s asking a lot of most people. Even among those who have sufficient technical knowledge to compile all of the different packages in different languages with different quirks, few have the time or desire to do so.

So a distribution is, in part, a sharing of labor. By dividing the work, we reduce our own burden and democratize access.

A distribution is also a curated collection. It’s the set of software that the contributors say is worth using, configured in the “right way”. Sure, there are a dozen or so web browsers in the Fedora repos, but that’s not the entirety of web browsers that exist. Just as an art museum may have several similar paintings, a distribution might have several similar packages. But they’re all there for a reason.

To remove or not to remove?

The question of whether to remove unmaintained packages then becomes a balance between the shared labor and the curation aspects of a distribution.

The shared labor perspective supports not removing packages. If the package is uninstalled at update, then someone who relies on that package now has to download and build it themselves. It may also cause user confusion if something that previously worked suddenly stops, or if a package that exists on an upgraded system can’t be installed on a new one.

On the other hand, the curation perspective supports removing the package. Although there’s no guarantee that a maintained package will get updates, there is a guarantee that an unmaintained package won’t. Removing obsolete packages at upgrade also means that the upgraded system more closely resembles a freshly-installed system.

There’s no right answer. Both options are reasonable extensions of fundamental purposes of a distribution. Both have obvious benefits and drawbacks.

Pick a side, Benjamin

If I have to pick a side, I’m inclined to side with the “remove the packages” argument. But we have to make sure we’re clearly communicating what is happening to the user. We should also offer an easy opt-out for users who want to say “I know what you’re trying to do here, but keep these packages anyway.”

Cherrytree updates in COPR

For Fedora 31 users, I have updated the cherrytree package in my COPR to the latest upstream release (0.39.2). For Fedora 32 and rawhide users…well, there’s a problem. As you may know, Python 2 has reached end of life. And that means most of Python 2 is gone in Fedora 32. I tried to build the dependency chain in COPR, but the yaks kept getting hairier and hairier. Instead, I’ve packaged the C++ rewrite as cherrytree-future.

cherrytree-future is available for Fedora 31, Fedora 32, and rawhide. I have packages for x86_64 and aarch64 for all three versions and for armhfp on Fedora 31 and 32 (the rawhide builder was out of disk space, oops!).

Because cherrytree-future is still pre-release, I intentionally did not have the package obsolete cherrytree. If you’re upgrading from Fedora 31 to Fedora 32, you will first have to remove cherrytree and install cherrytree-future.
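
Something like this should do it (a sketch; it assumes the COPR repository is already enabled on your system):

    # Swap the old Python 2 package for the C++ rewrite
    sudo dnf remove cherrytree
    sudo dnf install cherrytree-future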

I have been using cherrytree-future for the last day and it’s working well for me so far. If you encounter any problems with the package (e.g. a missing dependency), please file an issue on my GitHub repo. If you encounter problems with the program itself, file the bug upstream.

Once upstream cuts an official release of the rewrite, I’ll work on getting it into the official repos.

[solved] Can’t log in to KDE on Fedora 31

Earlier today, I ran dnf update on my laptop, as I do regularly. After rebooting, I couldn’t log in. When I typed in my user name and password, it almost immediately returned to the login screen. Running startx from the command line failed, too. I spent an hour or two trying to diagnose the problem. There were a lot of distracting messages in the xorg log.

The problem turned out to be that the startkde command was no longer on my machine. It seems upgrading from version 5.16 to 5.17 of the plasma-workspace package removes startkde in favor of startplasma-x11. Creating a symlink fixed it as a workaround.

This is reported as bug #1785826, and I’m sure Rex and the rest of the Fedora KDE team will have a suitable fix out soon. In the meantime, creating a symlink appears to be the best way to fix it.
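
If you’re hitting the same thing, this is roughly the workaround (assuming the binaries live in /usr/bin, as they do on my system):

    # Point the old startkde name at the new Plasma launcher
    sudo ln -s /usr/bin/startplasma-x11 /usr/bin/startkde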

Why the symlink works

When an X session starts, it looks in a few different places to see what should be run. One of those places is /etc/X11/xinit/Xclients. This file checks for a preferred desktop environment. If one isn’t specified, it works through a list trying to find one that works. It does this by looking for the specific desktop environment’s executable.

Since startkde no longer exists, it had no way of checking for KDE Plasma. I don’t have any other desktop environments installed on this machine, so there was no other desktop environment to fall back to. I suspect if GNOME were installed, it would have logged me into GNOME instead, at least when running startx.
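
To illustrate, the check looks something like this (a simplified sketch of the fallback logic, not the literal contents of the file):

    # Simplified illustration of the Xclients fallback: try each desktop
    # environment's launcher until one is found
    if command -v startkde >/dev/null 2>&1; then
        exec startkde
    elif command -v gnome-session >/dev/null 2>&1; then
        exec gnome-session
    fi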

So another fix would be to replace instances of startkde with startplasma-x11 in the Xclients file (similarly if you have that file in your home directory). However, this leaves anything else that might check for the existence of startkde in the lurch. (I don’t know if anything does).

There are probably more options for fixing it out there; this is very much not my area of expertise. I’d have to say that this was the most frustrating issue I’ve had to debug in a long time, in part because it took me a while to even know where the problem was. The fact that moving my ~/.kde directory didn’t result in a new one being created told me that it was pretty early in the process.

What distractions did I see?

In trying to diagnose the issue, I got distracted by a variety of error messages:

  • xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)
  • /dev/fb0: permission denied
  • gkr-pam: unable to locate daemon control file
  • pam_kwallet5: couldn't open file