What does it mean for a Linux distribution to be “fresh”?

I recently had a discussion with Luboš Kocman of openSUSE about how distros can monitor their “freshness”. In other words: how close is a distro to upstream? From our perspectives, it’s helpful to know which packages are significantly behind their upstreams. These packages represent areas that might need attention, whether that be a gentle nudge to the maintainer or recruiting additional volunteers from the community.

The challenge is that freshness can mean different things. The Repology project monitors a large number of distributions and upstreams to report on the status. But simply comparing the upstream version number to the packaged version number ignores a lot of very important context.

Updating to the latest upstream version as soon as it comes out is the most obvious definition of “fresh”, but it’s not always the best. Rolling releases (and their users) probably want that. In Fedora, policy is to not do “major updates” within a release. Many other release-oriented distributions have a similar policy, with varying degrees of “major”. Enterprise distributions add another wrinkle: they’ll backport security fixes (and sometimes key features), so the difference in version number doesn’t necessarily tell you what’s missing.

Of course, the upstream’s version number doesn’t necessarily tell you much. Semantic versioning is great, but not everyone uses it. And not everyone that uses it uses it well. If a distribution has version 1.4 and upstream released 1.5, is that a lack of freshness or an intentional decision to avoid mid-release compatibility changes?
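To make the 1.4 versus 1.5 question concrete, here is a minimal sketch of a policy-aware freshness check. It assumes purely numeric, dotted, roughly semantic version strings, which (as noted above) many upstreams do not provide. The function names and the `allowed_within_release` parameter are hypothetical and are not part of Repology or any Fedora tooling.

```python
from itertools import zip_longest


def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted numeric version such as "1.4.2" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def classify_gap(packaged: str, upstream: str) -> str:
    """Return "major", "minor", or "patch" for the first component where
    upstream is ahead, or "current" if the package is not behind."""
    labels = ["major", "minor", "patch"]
    pairs = zip_longest(parse_version(packaged), parse_version(upstream), fillvalue=0)
    for index, (ours, theirs) in enumerate(pairs):
        if theirs > ours:
            # Anything past the third component is treated as a patch-level gap.
            return labels[min(index, len(labels) - 1)]
        if ours > theirs:
            return "current"
    return "current"


def needs_attention(packaged: str, upstream: str, allowed_within_release: set[str]) -> bool:
    """Flag a package only if the gap is one the distro's policy would allow closing."""
    gap = classify_gap(packaged, upstream)
    return gap != "current" and gap in allowed_within_release
```

Under a hypothetical policy that only allows patch-level updates within a release, `needs_attention("1.4", "1.5", {"patch"})` is false: the gap is deliberate. For a rolling release that allows everything, the same pair is flagged. Backported fixes remain invisible to a check like this.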

I don’t have a good answer. This is a hard problem to solve. Something like Repology may be the best we can do with reasonable effort. But I’d love to have a more accurate view of how fresh Fedora packages are within the bounds of policy.

Using Element as an IRC client

As for many who work in open source communities, IRC is a key part of my daily life. Its simplicity has made it a mainstay. But the lack of richness also makes it unattractive to many newcomers. As a result, newer chat protocols are gaining traction. Matrix is one of those. I first created a Matrix account to participate in the Fedora Social Hour. But since Matrix.org is bridged to Freenode, I thought I’d give Element (a popular Matrix client) a try as an IRC client, too.

I’ve been using Element almost exclusively for the last few months. Here’s what I think of it.

Pros

The biggest pro for me is also the most surprising. I like getting IRC notifications on my phone. Despite being bad at it (as you may have read last week), I’m a big fan of putting work aside when I’m done with work. But I’m also an anxious person who constantly worries about what’s going on when I’m not around. It’s not that I think the place will fall apart because I’m not there. I just worry that it happens to be falling apart when I’m not there.

Getting mobile notifications means I can look, see that everything is fine (or at least not on fire enough that I need to jump in and help), and then go back to what I’m doing. But it also means I can engage with conversations if I choose to without having to sit at my computer all day. As someone who has previously had to learn and re-learn not to have work email alert on the phone, I’m surprised at my reaction to having chat notifications on my phone.

Speaking of notifications, I like the ability to set per-room notification settings. I can set different levels of notification for each channel and those settings reflect across all devices. This isn’t unique to Element, but it’s a nice feature nonetheless. In fact, I wish it were even richer. Ideally, I’d like to have my mobile notifications be more restrictive than my desktop notifications. Some channels I want to see notifications for when I’m at my desk, but don’t care enough to see them when I’m away.

I also really like the fact that I can have one fewer app open. Generally, I have Element, Signal, Slack, and Telegram, plus Google Chat all active. Not running a standalone IRC client saves a little bit of system resources and also lets me find the thing that dinged at me a little quicker.

Cons

By far the biggest drawback, and the reason I still use Konversation sometimes, is the mishandling of multi-line copy/paste. Element sends it as a single multi-line message, which appears on the IRC side as “bcotton has sent a long message: <url>”. When running an IRC meeting, I often have reason to paste several lines at once. I’d like them to be sent as individual lines so that IRC clients (and particularly our MeetBot implementation) see them.
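The behavior I would like can be sketched in a few lines. This is only an illustration of the desired splitting, not anything Element or the bridge provides; `split_for_irc` is a hypothetical helper.

```python
def split_for_irc(pasted: str) -> list[str]:
    """Turn one multi-line paste into the separate lines an IRC channel expects.

    Blank lines are dropped because there is nothing to send for them.
    """
    return [line.rstrip() for line in pasted.splitlines() if line.strip()]


# Example: a block of MeetBot commands pasted in one go.
paste = "#topic Blockers\n#info one accepted\n\n#action bcotton to follow up\n"
for message in split_for_irc(paste):
    print(message)  # each line would be sent as its own IRC message
```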

The Matrix<->IRC bridge is also laggy sometimes. Every so often, something gets stuck and messages don’t go through for up to a few minutes. This is not how instant messaging is supposed to work and is particularly troublesome in meetings.

Overall

Generally, using Element for IRC has been a net positive. I’m looking forward to more of the chats I use becoming Matrix-native so I don’t have to worry about the IRC side as much. I’d also like the few chats I have on Facebook Messenger and Slack to move to Matrix. But that’s not a windmill I’m willing to tilt at for now. In the meantime, I’ll keep using Element for most of my IRC needs, but I’m not quite ready to uninstall Konversation.

Applying the Potter Stewart rule to release blockers

I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description, and perhaps I could never succeed in intelligibly doing so. But I know it when I see it.

Justice Potter Stewart in Jacobellis v. Ohio

Potter Stewart was talking about hard-core pornography when he wrote “I know it when I see it”, but the principle also applies to release blockers. These are bugs that are so bad that you can’t release your software until they are fixed. Most bugs do not fall into this category. It would be nice to fix them, of course, but they’re not world-stopping.

Making a predictable process

For the most part, you want your release blockers to be defined by specific criteria. It’s even better if automated testing can use the criteria, but that’s not always possible. The point is that you want the process to be predictable.

A predictable blocker process provides value and clarity throughout the process. Developers know what the rules are so they can take care to fix (or avoid!) potential blockers early. Testers know what tests to prioritize. Users know that, while the software might do something annoying, it won’t turn the hard drive into a pile of melted metal, for example. And everyone knows that the release won’t be delayed indefinitely.

Having a predictable blocker process means not only developing the criteria, but sticking to the criteria. If a bug doesn’t violate a criterion, you can’t call it a blocker. Of course, it’s impossible to come up with every possible reason why you might want to block a release, so sometimes you have to add a new criterion.

Breaking the predictable process

This came up in last week’s Go/No-Go meeting for Fedora Linux 34 Beta. We were considering a bug that caused a delay of up to two minutes before gnome-initial-setup started. I argued that this should be a blocker because of the negative reputational impact, despite the fact that it does not break any specific release criterion. I didn’t want first-time users (or worse: reviewers) to turn away from Fedora Linux because of this bug.

QA wizard Adam Williamson argued against calling it a blocker without developing a specific criterion it violates. To do otherwise, he said, makes a mockery of the process. I understand his position, but I disagree. While there are a variety of reasons to have release blockers, I see the preservation of reputation as the most important. In other words, the point of the process is to prevent the release of software so bad that it drives away current and potential users.

Admittedly, that is an entirely subjective opinion. I expect disagreement on it. But if you accept my premise and acknowledge that you can’t pre-write criteria to catch every possible problem, then it follows that squishy “I know it when I see it” rules are sometimes okay.

Can’t you just make that a criterion?

The best blocking criteria are objective. This aids automated testing and (mostly) avoids arguments about interpretation. But that’s not always possible. Even saying “feature X works” is open to argument over what constitutes “working”.

The challenge lies in how to incorporate “this bug is bad even though there’s no specific rule” without making the process unpredictable. In this case, Adam’s position makes a lot of sense. It’s much easier to write rules to address specific issues and apply them retroactively. Of course, doing that in a go/no-go meeting is perhaps straining the word “predictable” a bit, too.

So what happened?

In the case of this specific bug, we had an escape hatch. The release was likely to be declared no-go for other reasons, so we didn’t need to come to a decision either way. With a candidate fix available, we could just pull that in as a freeze exception and write a new criterion for the next time.

Because of this, I decided not to push my argument. I declined to propose a criterion in-meeting because I wanted to take some time to think about what the right approach is. I also wanted to spend some time thinking about the blocker process holistically. This gives me a blog post to publish (hi!) and some content for a project I’ll be announcing in the near future. In the meantime, I’ve proposed a change to the criteria.

What do “rolling release” and “stable” mean in the context of operating systems?

In a recent post on his blog, Chris Siebenmann wrote about his experience with Fedora upgrades and how, because of some of the non-standard things he does, upgrades are painful for him. At the end, he said “What I really want is a rolling release of ‘stable’ Fedora, with no big bangs of major releases, but this will probably never exist.”

I’m sympathetic to that position. Despite the fact that developers have worked to improve the ease of upgrades over the years, they are inherently risky. But what would a stable rolling release look like?

“Arch!” you say. That’s not wrong, but it also misses the point. What people generally want is new stuff so long as it doesn’t cause surprise. Rolling releases don’t prevent that, they spread it out. With Fedora’s policy, for example, major changes (should) happen as the release is being developed. Once it’s out, you get bugfixes and minor enhancements, but no big changes. You get the stability.

On the other hand, you can run Fedora Rawhide, which gets you the new stuff as soon as it’s available, but you don’t know when the big changes will come. And sometimes, the changes (big and little) are broken. It can be nice because you get the newness quickly. And the major changes (in theory) don’t all come at once.

Rate of change versus total change

For some people, it’s the distribution of change, not the total amount of change, that makes rolling releases compelling. And in most cases, the changes aren’t that dramatic. When updates are loosely coupled or totally independent, the timing doesn’t matter. The average user won’t even notice the vast majority of them.

But what happens when a really monumental change comes in? Switching the init system, for example, is kind of a big deal. In this case, you generally want the integration that most distributions provide. It’s not just that you get an assortment of packages from your distribution, it’s that you get a set of packages that work together. This is a fundamental feature for a Linux distribution (excepting those where do-it-yourself is the point).

Applying it to Fedora

An alternate phrasing of what I understand Chris to want is “release-quality packages made available when they’re ready, not on the release schedule.” That’s perfectly reasonable. And in general, that’s what Fedora wants Rawhide to be. It’s something we’re working on, particularly with the ability to gate Rawhide updates.

But part of why we have defined releases is to ensure the desired stability. The QA team and other testers put a lot of effort into automated and manual tests of releases. It’s hard to test against the release criteria when the target keeps shifting. It’s hard to make the distribution a cohesive whole instead of a collection of packages.

What Chris asks for isn’t wrong or unreasonable. But it’s also a difficult task to undertake and sustain. This is one area where ostree-based variants like Fedora CoreOS (for servers/cloud), Silverblue (for desktops), and IoT (for edge devices) bring a lot of benefit. The big changes can be easily rolled back if there are problems.

Removing unmaintained packages from an installed system

Earlier this week, Miroslav Suchý proposed removing retired packages as part of a Fedora upgrade (editor’s note: the proposal was withdrawn after community feedback). As it stands right now, if a package is removed in a subsequent release, it will stick around. For example, I have 34 packages on my work laptop from Fedora 28 (the version I first installed on it) through Fedora 31. The community has been discussing this, with no clear consensus.
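One rough way to find such leftovers is to group installed packages by the Fedora version in their dist tag (the `.fc28` part of the release string). The sketch below assumes an RPM-based system for the query step; the helper names are hypothetical. An old dist tag is only a hint: a package that is still in the repositories but has not been rebuilt looks exactly the same as a retired one.

```python
import re
import subprocess
from collections import defaultdict

DIST_TAG = re.compile(r"\.fc(\d+)")


def group_by_fedora_release(lines: list[str]) -> dict[int, list[str]]:
    """Group "name release" lines by the Fedora version in the release's dist tag."""
    groups: dict[int, list[str]] = defaultdict(list)
    for line in lines:
        name, _, release = line.partition(" ")
        match = DIST_TAG.search(release)
        if match:  # entries without an .fcNN tag (e.g. gpg-pubkey) are skipped
            groups[int(match.group(1))].append(name)
    return dict(groups)


def leftovers(lines: list[str], current: int) -> dict[int, list[str]]:
    """Keep only packages whose dist tag is older than the running release."""
    grouped = group_by_fedora_release(lines)
    return {ver: names for ver, names in grouped.items() if ver < current}


def installed_packages() -> list[str]:
    """Ask rpm for every installed package as "name release"."""
    result = subprocess.run(
        ["rpm", "-qa", "--queryformat", "%{NAME} %{RELEASE}\n"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

On a Fedora 32 machine, `leftovers(installed_packages(), 32)` would return a dictionary keyed by the older release numbers.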

I’m writing this post to explore my own thoughts. It represents my opinions as Ben Cotton: Fedora user and contributor, not as Ben Cotton: Fedora Program Manager.

What does it mean for a package to be “maintained”?

This question is the heart of the discussion. In theory, a maintained package means that there’s someone who can apply security and other bug fixes, update to new releases, etc. In practice, that’s not always the case. Anyone who has had a bug closed due to the end-of-life policy will attest to that.

The practical result is that as long as the package continues to compile, it may live on for a long time after the maintainer has given up on it. This doesn’t mean that it will get updates, it just means that no one has had a reason to remove it from the distribution.

On the other hand, the mere fact that a package has been dropped from the distribution doesn’t mean that something is wrong with it. If upstream hasn’t made any changes, the “unmaintained” version is just as functional as a maintained version would be.

What is the role of a Linux distribution?

Why do Linux distributions exist? After all, people could just download the software and build it themselves. That’s asking a lot of most people. Even among those who have sufficient technical knowledge to compile all of the different packages in different languages with different quirks, few have the time or desire to do so.

So a distribution is, in part, a sharing of labor. By dividing the work, we reduce our own burden and democratize access.

A distribution is also a curated collection. It’s the set of software that the contributors say is worth using, configured in the “right way”. Sure there are a dozen or so web browsers in the Fedora repos, but that’s not the entirety of web browsers that exist. Just as an art museum may have several similar paintings, a distribution might have several similar packages. But they’re all there for a reason.

To remove or not to remove?

The question of whether to remove unmaintained packages then becomes a balance between the shared labor and the curation aspects of a distribution.

The shared labor perspective supports not removing packages. If the package is uninstalled at update, then someone who relies on that package now has to download and build it themselves. It may also cause user confusion if something that previously worked suddenly stops, or if a package that exists on an upgraded system can’t be installed on a new one.

On the other hand, the curation perspective supports removing the package. Although there’s no guarantee that a maintained package will get updates, there is a guarantee that an unmaintained package won’t. Removing obsolete packages at upgrade also means that the upgraded system more closely resembles a freshly-installed system.

There’s no right answer. Both options are reasonable extensions of fundamental purposes of a distribution. Both have obvious benefits and drawbacks.

Pick a side, Benjamin

If I have to pick a side, I’m inclined to side with the “remove the packages” argument. But we have to make sure we’re clearly communicating what is happening to the user. We should also offer an easy opt-out for users who want to say “I know what you’re trying to do here, but keep these packages anyway.”

Cherrytree updates in COPR

For Fedora 31 users, I have updated the cherrytree package in my COPR to the latest upstream release (0.39.2). For Fedora 32 and rawhide users…well, there’s a problem. As you may know, Python 2 has reached end of life. And that means most of Python 2 is gone in Fedora 32. I tried to build the dependency chain in COPR, but the yaks kept getting hairier and hairier. Instead, I’ve packaged the C++ rewrite as cherrytree-future.

cherrytree-future is available for Fedora 31, Fedora 32, and rawhide. I have packages for x86_64 and aarch64 for all three versions and for armhfp on Fedora 31 and 32 (the rawhide builder was out of disk space, oops!).

Because cherrytree-future is still pre-release, I intentionally did not have the package obsolete cherrytree. If you’re upgrading from Fedora 31 to Fedora 32, you will first have to remove cherrytree and install cherrytree-future.

I have been using cherrytree-future for the last day and it’s working well for me so far. If you encounter any problems with the package (e.g. a missing dependency), please file an issue on my GitHub repo. If you encounter problems with the program itself, file the bug upstream.

Once upstream cuts an official release of the rewrite, I’ll work on getting it into the official repos.

[solved] Can’t log in to KDE on Fedora 31

Earlier today, I ran dnf update on my laptop, as I do regularly. After rebooting, I couldn’t log in. When I typed in my user name and password, it almost immediately returned to the login screen. Running startx from the command line failed, too. I spent an hour or two trying to diagnose the problem. There were a lot of distracting messages in the xorg log.

The problem turned out to be that the startkde command was no longer on my machine. It seems upgrading from version 5.16 to 5.17 of the plasma-workspace package removes startkde in favor of startplasma-x11. Creating a symlink fixed it as a workaround.

This is reported as bug #1785826, and I’m sure Rex and the rest of the Fedora KDE team will have a suitable fix out soon. In the meantime, creating a symlink appears to be the best way to fix it.
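For reference, the workaround amounts to something like the sketch below. The `/usr/bin` paths in the usage comment are an assumption about where the executables live, so check your own system first. `create_compat_link` is a hypothetical helper, and writing into a system directory requires root.

```python
from pathlib import Path


def create_compat_link(link: Path, target: Path) -> bool:
    """Create link -> target unless link already exists.

    Returns True if the symlink was created, False if something was
    already at the link path. Raises FileNotFoundError if the target
    is missing, so a dangling link is never created.
    """
    if link.exists() or link.is_symlink():
        return False
    if not target.exists():
        raise FileNotFoundError(target)
    link.symlink_to(target)
    return True


# Usage (as root, assuming both commands live in /usr/bin):
#   create_compat_link(Path("/usr/bin/startkde"), Path("/usr/bin/startplasma-x11"))
```

Once a fixed package lands, the link can simply be removed again.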

Why the symlink works

When an X session starts, it looks in a few different places to see what should be run. One of those places is /etc/X11/xinit/Xclients. This file checks for a preferred desktop environment. If one isn’t specified, it works through a list trying to find one that works. It does this by looking for the specific desktop environment’s executable.

Since startkde no longer exists, it had no way of checking for KDE Plasma. I don’t have any other desktop environments installed on this machine, so there was no other desktop environment to fall back to. I suspect if GNOME were installed, it would have logged me into GNOME instead, at least when running startx.
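The lookup described above can be modeled roughly as follows. The real Xclients file is a shell script; this is only an illustration of the fallback logic, and the candidate list and the `pick_session` name are made up for the example.

```python
import shutil
from typing import Callable, Optional, Sequence


def pick_session(
    candidates: Sequence[str],
    preferred: Optional[str] = None,
    lookup: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first session launcher that exists.

    The preferred launcher, if any, is tried first; otherwise the
    candidates are tried in order. None means nothing could be started.
    """
    ordered = ([preferred] if preferred else []) + [c for c in candidates if c != preferred]
    for command in ordered:
        path = lookup(command)
        if path:
            return path
    return None


# After the update, only startplasma-x11 exists, so a list that still
# names startkde finds nothing to run.
after_update = {"startplasma-x11": "/usr/bin/startplasma-x11"}
print(pick_session(["gnome-session", "startkde"], lookup=after_update.get))
```

Passing a dictionary's `get` as `lookup` makes it easy to try out different installed sets without touching the real system; by default the function searches the actual `PATH` with `shutil.which`.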

So another fix would be to replace instances of startkde with startplasma-x11 in the Xclients file (similarly if you have that file in your home directory). However, this leaves anything else that might check for the existence of startkde in the lurch. (I don’t know if anything does.)

There are probably more options for fixing it out there; this is very much not my area of expertise. I’d have to say that this was the most frustrating issue I’ve had to debug in a long time, in part because it took me a while to even know where the problem was. The fact that moving my ~/.kde directory didn’t result in a new one being created told me that it was pretty early in the process.

What distractions did I see?

In trying to diagnose the issue, I got distracted by a variety of error messages:

  • xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)
  • /dev/fb0: permission denied
  • gkr-pam: unable to locate daemon control file
  • pam_kwallet5: couldn't open file

New to Fedora: z

Earlier this month, I attended Chris Waldon’s session “Terminal Velocity: Work faster in your shell” at All Things Open. He covered several interesting tools, one of which is a project called z. z is a smarter version of the cd command. It keeps track of what directories you change to and uses a combination of the frequency and recency (“frecency”) to make an educated guess about where you wanted to go.
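To give a feel for how frequency and recency can be combined, here is a small sketch. The time buckets and multipliers are illustrative and should not be read as z's exact formula; the `Visit` record and function names are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Visit:
    path: str
    count: int         # how many times the directory was visited
    last_visit: float  # Unix timestamp of the most recent visit


def frecency(visit: Visit, now: float) -> float:
    """Weight the visit count by how recently the directory was used."""
    age = now - visit.last_visit
    if age < 3600:       # within the last hour
        return visit.count * 4
    if age < 86400:      # within the last day
        return visit.count * 2
    if age < 604800:     # within the last week
        return visit.count / 2
    return visit.count / 4


def best_match(query: str, visits: Iterable[Visit], now: Optional[float] = None) -> Optional[str]:
    """Return the highest-scoring path containing the query, or None if nothing matches."""
    now = time.time() if now is None else now
    matches = [v for v in visits if query in v.path]
    if not matches:
        return None
    return max(matches, key=lambda v: frecency(v, now)).path
```

With this weighting, a directory visited three times in the last minute outranks one visited ten times more than a week ago, which matches the intuition that you probably want to go back to where you were just working.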

I find this really appealing because I often forget where in the file system I put a directory. And z is written as a shell script, so it’s easy to package and use.

z is now packaged and submitted to rawhide, with updates pending for F31 and F30.

FPgM report: 2018-30

Inspired by bex’s “Slice of cake” updates, I present to the community this report of what has happened in Fedora Program Management this week.

Schedule

  • REMINDER — Software string freeze is July 31.

Changes

Announced

Submitted to FESCo

Approved by FESCo

I am on PTO this week, so anything not immediately obviously pertaining to submitted changes will be taken care of early next week.

FPgM report: 2018-29

Inspired by bex’s “Slice of cake” updates, I present to the community this report of what has happened in Fedora Program Management this week.

Schedule

  • REMINDER — Self-Contained Change submission deadline is July 24.
  • REMINDER — Software string freeze is July 31.

Changes

Announced

Submitted to FESCo

Approved by FESCo

I will be on PTO next week, but I will be checking in daily to shepherd last-minute change submissions.