Does it matter if you anger practitioners and enthusiasts?

I put this topic on my kanban board in August of 2023. At the time, Hashicorp had just angered a lot of people by switching their popular tools to the Business Source License (BuSL). This was not long after Red Hat angered a lot of people by changing how they made source code for Red Hat Enterprise Linux (RHEL) public. While the angry responses were numerous and voluminous, my suspicion was that it didn’t matter.

The people who buy enterprise software are, by and large, not the enthusiasts who care about the practical or philosophical importance of open source software. You can anger the nerds (I use that term with endearment, as I am one, too) all you want, because they’re not the ones who sign the checks. This is a cynical take, but Hashicorp’s subsequent (pending) acquisition by IBM would seem to reinforce it. Much of the analysis I read at the time suggested that switching to the BuSL was part of Hashicorp trying to be more appealing to potential buyers and perhaps the reason that IBM made the acquisition offer. I’m not convinced of the latter, since IBM was willing to pay a lot more for Red Hat, who open sourced everything*.

It doesn’t seem to matter

Shortly after the Hashicorp acquisition was announced, SJVN wrote an article in which he noted that IBM’s share price had fallen 8% on the news. This, combined with the fact that Hashicorp stock fell 22% between the announcement of the BuSL switch and the acquisition news, led him to wonder if the acquisition was a “blunder” on IBM’s part. But as of Friday’s close, just over four months since the news was announced, IBM shares are up almost 10% from the pre-announcement price and have reached a one-year high. If the price isn’t quite at an all-time high, it’s within spitting distance of the 2013 peak.

But now we have actual analysis. RedMonk analyst Rachel Stephens published an article last week examining the effect of license changes on Hashicorp, MongoDB, Elastic, and Confluent. You should read the whole article, but the quick summary is that the license changes appear to have had no impact on revenue. The changes in valuation and net income are a little messier, but the conclusion holds: “[t]here does not seem to be a clear link between moving from an open source to proprietary license and increasing the company’s value.”

Importantly, though, the analysis does not show a clear link between moving to a proprietary license and decreasing the company’s value. It’s a small sample size, but as best we can tell, changing the license doesn’t matter either way. If your primary business is enterprises, not SMBs or individuals, there’s little evidence that a change that angers practitioners matters, at least in the short term.

What about Elasticsearch?

Elastic dropped a surprise on Thursday. They announced that Elasticsearch is once again open source, this time under the AGPL. Elastic had moved away from the Apache Software License in January 2021, driven in part by issues with Amazon Web Services. Does this mean they finally caved to pressure from the people who were upset about the initial change?

I don’t think so. On the same day Elastic announced the shift to the AGPL, they also released their quarterly earnings. Although the earnings and revenue beat expectations, the company cut their guidance for the coming quarter. The market was displeased, and about a quarter of Elastic’s market value went poof in a single day.

I’m not a market analyst and I don’t have any inside knowledge, so the best I can do is speculate based on what I know about the software industry. But my take is that this license change is less about responding to the pressure from open source fans and more about reducing ambiguity for potential adopters. The Server Side Public License has some vague terms that could scare away someone trying the free offering, and a better-known license can remove that friction. Is this a win? Maybe, but not a philosophically driven one.

Elastic’s re-re-license announcement mentions working issues out with AWS. The fact that they switched to the AGPL seems like a defensive move against future perceived shenanigans. If this is the primary motivator, then it seems like the move away from open source was successful and justifies the first license change.

So what do we do?

Anger from practitioners doesn’t seem like a meaningful consequence, so how do we prevent the sort of license changes that have caused such a stir in the last few years? If I had an easy answer, I’d get paid a lot of money as a consultant. (Spoiler: I do not get paid a lot of money as a consultant.)

But it’s clear that when we make the case for open source at our employers, we should not rely on philosophical arguments (because those only work when money is cheap and times are good). Nor should we rely on a “this will kill us with enthusiasts” argument, because it turns out that might not matter.

The case for open source has to be presented in terms that matter to the business. This doesn’t have to be revenue or profit, which is good since the links are not always direct. But it does need to connect in clear, direct ways to the business’s goals, strategy, or other decision influences.

The role of the Linux distro in modern computing

John Mark Walker recently published a post on how the Linux distribution has lost prominence in the DevOps ecosystem (right around the time we started using the term “DevOps”) and how it might be poised for a comeback. His take centers on the idea that downloading your dependencies on the fly as you build and deploy software is a risky proposition, supply-chain-securitally-speaking.

I can’t disagree with that. I’ve always felt that modern language ecosystems are bonkers in that regard. Yes, it’s great that there are better dependency relationships than in C/C++ where the relationships might as well be “LOL, GFY”. But the idea that you’d just download some libraries on the fly and hope it all works as expected? Well, we’ve learned a few times how that can go sideways.

So, yeah. I generally agree with what John Mark says here. Heck, I’ve even given a talk on several occasions with a premise of “operating systems are boring.” So can the Linux distribution become un-boring again and help us fix our woeful supply chains?

Why distros aren’t the answer

The fundamental problem that Linux distributions face is sometimes called “too fast, too slow.” Distributions prioritize stability and coherence to some degree, even the cutting-edge ones. Enterprise distributions — or more accurately, their customers — place a particular emphasis on long-term stability. But development tools and programming languages change quickly.

Some applications need the latest version of libraries. Others update slowly and need old versions. It’s hard to meet both of those needs at the same time. Various attempts have been made, like Fedora’s Modularity, but there’s no great answer. The modern language ecosystems exist, in part, to sidestep this problem.

Another problem is that not all distribution package maintainers are technically solid or even ethical. The vast majority are, of course, but there are exceptions. As a package maintainer myself, I’m not adding a ton of value from a supply chain security standpoint. I take the upstream sources, wrap them in an RPM spec file, and submit them to the build system.
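To make that concrete, here’s a minimal sketch of what that thin wrapper looks like. The package name, URL, version, and maintainer are hypothetical, but the structure and macros follow standard RPM spec conventions.

```
# Hypothetical spec file for an imaginary upstream project "example-tool".
# This is an illustrative sketch, not a real Fedora package.
Name:           example-tool
Version:        1.2.3
Release:        1%{?dist}
Summary:        Example of a thin packaging layer over upstream sources
License:        MIT
URL:            https://example.com/example-tool
Source0:        https://example.com/example-tool/archive/example-tool-%{version}.tar.gz

%description
A minimal package that mostly passes the upstream source through.

%prep
# Unpack the upstream tarball
%autosetup

%build
# Use the upstream build system as-is
%configure
%make_build

%install
%make_install

%files
%license LICENSE
%{_bindir}/example-tool

%changelog
* Wed Jan 01 2025 Example Maintainer <maintainer@example.com> - 1.2.3-1
- Update to upstream 1.2.3
```

The point is how thin that layer is: the spec mostly tells the build system where the upstream tarball lives and how to run its build, which is why the packaging step by itself doesn’t add much supply chain scrutiny.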

Distributions also create a disconnect between developers and end users. This separation has some value, but it also creates additional points for slow bug fixes, malware injection, and social engineering attacks.

Why distros can be the answer

While it’s true that not all distribution package maintainers are capable of fixing issues in the packages they maintain, some can. And even when a maintainer can’t fix an issue, they can get help from other maintainers in the distribution’s community. Issues that upstream chooses not to fix can be fixed in the distribution’s build.

Distributions aren’t just RPMs or debs or whatever anymore. Containerized application delivery methods like Flatpak, Snap, and toolbx give distributions the ability to provide curated environments in a way that improves parallel installability. Of course, upstreams can produce their own containers, but this gives end users the ability to pick the right option for their needs.

In conclusion

I’m not sure I see distributions making the comeback that John Mark hopes for. Enterprise distros move too slowly, by and large, to address the needs of language ecosystems. And there’s not a ton of value in distributions blindly packaging language libraries.

John Mark suggested that organizations would want to “outsource risk mitigation to a curated distribution as much as possible.” I don’t disagree. But community distributions can’t take on the risk that companies want to outsource.

That said, the problem won’t fix itself, so let’s work toward something. If that ends up being Linux distributions, then great. If not, distributions will still have an important role to play.

Open source AI and open data

I’m a little late to the party with this post, but I need to get it out of my head. The question of “what is ‘open source AI’, exactly?” has been a hot topic in some circles for a while now. The Open Source Initiative, keepers of the Open Source Definition, have been working on developing a definition for open source AI. The latest draft notably does not require the training data to be available under an open license. I believe this is a mistake.

Open source AI must include open data

Data is critical to modern computing. I called this out in a 2020 DevConf talk and I can hardly claim to be the first or only person to make this observation. More recently, Tom “spot” Callaway wrote his objections to a definition of “open source AI” that doesn’t include open data. My objections (and I venture to say spot’s as well) have nothing to do with ideological purity. I wrote over three years ago that I don’t care about free/open source software as an end goal. What matters is the human impact.

Even before ChatGPT hit the scene, there were countless examples of AI exacerbating biases and inequities. Part of addressing that issue is providing a better training data set. But if we don’t know what an AI model is trained on, we don’t know what sort of biases it’s reproducing. This is a data problem, not a model weights problem. The most advanced AI in the world is still going to produce biased output if trained on biased sources.

OSI attempts to address this by requiring “data information.” This is insufficient. I’ll again defer to spot to make this case better than I could. OSI raises valid points about how rules governing data can be different from those covering code. Oh well. The solution is to acknowledge that some models won’t meet the requirements instead of watering down the requirements.

No one is owed an “open source AI”

Part of the motivation behind OSI’s choices here seems to be the creation of a definition that commercially viable AI models can meet. They say “We need an Open Source AI Definition that can effectively guide users and developers to make the right choice. We need one that doesn’t put developers of Open Source AI at a disadvantage compared to proprietary ones.” Tara Tarakiyee wrote in response “Well, if the price of making Open Source ‘AI’ competitive with proprietary ‘AI’ is to break the openness that is fundamental to the definition, then why are we doing it?”

I agree with Tara. His whole post is well worth a read. But what this particular thread comes down to is this: we don’t owe anyone a commercially viable definition just because meeting a stricter one is hard. There’s nothing in the Open Source Definition that says “but you can skip some of these requirements if you can’t figure out how to make money.”

“Can’t” and “won’t” aren’t the same thing

I’ve seen some people argue that creating a definition that results in zero “open source AI” models is useless. It’s important to distinguish here between “can’t” and “won’t”: they are not the same.

It’s true that a definition that no model could possibly meet is useless. But a definition that no model currently chooses to meet is valuable. AI developers could certainly choose to make their training data available. If they don’t want to, they don’t get to call their model open source. It’s the same as wanting to release software under a license that doesn’t meet some part of the Open Source Definition. As I said in the previous section, no one is owed a definition that meets their business needs.

The argument is silly, anyway. There are at least two models that would meet a more appropriate definition.

Where to go from here?

I wrote this post because I needed to get the words out of my head and onto “paper”. I have no expectation it will change the direction of OSI’s next draft. They seem pretty committed to their choice at this point. I’m not really sure what is gained by making this compromise. Nothing of worth, I think.

This is a problem we should have been addressing years ago, instead of rushing to catch up once the cat was out of the proverbial bag. Collectively, we seem to have a tendency to skate to where the puck was, not where it will be. This isn’t the first time. At FOSDEM 2021, Bradley Kuhn said something to the effect of “if I would have known proprietary software would be funded by advertising instead of license sales, I would have done a lot of things differently.”

I’m not sure what the next big challenge will be. But you can be sure if I figure it out, I’ll push a lot harder to address it before we get passed by again.

Thoughts on the Functional Source License

Earlier this month, Sentry announced a new software license called the “Functional Source License.” The announcement has a tagline “Freedom without Free-riding”. It purports to address some of the issues with the Business Source License.

An improvement?

It does, in fact, address some issues with the Business Source License. The Business Source License is less of a license and more of a homework problem for a combinatorics class. It has many options that materially change the terms, so being told that software is under the Business Source License doesn’t tell you much. The Functional Source License has one choice: does the license change to the Apache Software License or the MIT license?

While the Functional Source License does reduce complexity, it still suffers from the same problem: it’s not open source. That’s not a problem in itself; it’s a problem because it tries to co-opt the well-known “open source” label without actually being open source. Ultimately, these are attempts to use a license to address a business model problem. As we all should know by now, open source is not a business model.

The “right” approach

I’ve seen some people ask “then what’s the right approach?” The answer depends on what it is you want to do.

If you’re trying to achieve ubiquity, make the source fully closed and give the software away to people who aren’t going to pay you anyway. Or go with a “freemium” or “open core” model.

If you want your users to see the code so that they can trust it, then just make it source-available. This doesn’t make a lot of sense in a software-as-a-service context (which is what the Functional Source License is geared toward) because the user has no way of knowing that the code they’re inspecting is the code you’re running.

If you want to get community contributions, use an open source license. Otherwise, you’re the free rider. A copyleft license won’t prevent competitors from building a product on your software, but it does prevent them from keeping their changes to themselves.

On “non-code contributors”

I was dismayed when I read Justin Dorfman’s “Why I’m proud to be a non-code open source contributor and you should be too” this morning. It’s mostly a great article. Dorfman makes valid points about the value of contributing to open source projects beyond code. Open source needs — and should encourage — these kinds of contributions.

Which brings me to my issue with the article. Dorfman anonymously quotes a well-respected open source leader:

If you find yourself about to use the phrase “non-code contributors” you should stop and use entirely different language.

He calls that a “horrible idea” and suggests that it discourages the kinds of contributions we need. But this take is disingenuous at best. I happen to know the post he’s referring to, and it continues:

Defining people by what they are not is not a valid pathway to inclusion. Want to attract designers? Say so. Want to attract technical writers or community managers? Say so.

Far from suggesting that people should be quiet about non-code contributions, the post is calling on project leaders to stop othering those contributors and explicitly value them. The author is saying that lumping everything that’s not code together as “not code” diminishes it. It’s a shame that the article misrepresents the post when it could simply have agreed with what was actually written.

Does open source matter?

Matt Asay’s article “The Open Source Licensing War is Over” has been making the rounds this week, as text and subtext. While his position is certainly spicy, I don’t think it’s entirely wrong. “It’s not that open source doesn’t matter, but rather it has never mattered in the way some hoped or believed,” Asay writes. I think that’s true, and it’s our fault.

To the average person, and even to many developers, the freeness or openness of the software doesn’t matter. They want to be able to solve their problem in the easiest (and cheapest) way. Often that’s open source software. Sometimes it isn’t. But they’re not sitting there thinking about the societal impact of their software choices. They’re trying to get a job done.

Free and open source software (FOSS) advocates often tout the ethical benefits of FOSS. We talk about the “four essential freedoms”. And while those should matter to people, they often don’t. I’ve said before — and I still believe it — FOSS is not the end goal. Any time we end with “and thus: FOSS!”, we’re doing it wrong.

FOSS advocacy — and I suspect this is true of other advocacy efforts as well — tends to try to meet people where we want them to be. The problem, of course, is that people are not where we want them to be. They’re where they are. We have to meet them there, with language that resonates with them, addressing the problems they currently face instead of hypothetical future problems. This is all easier said than done, of course.

Open source licenses don’t matter — they’ve never mattered — except as an implementation detail for the goal we’re trying to achieve.

Ended, the clone wars have?

I have done my damnedest to avoid posting publicly about Red Hat’s decision to stop publishing RHEL srpms. For one, the Discourse around it has been largely stupid. I didn’t want any part of the mess. For another, I didn’t have anything particularly novel to add. I’m breaking my silence now because the dust seems to have settled in a very beneficial way that I haven’t seen widely discussed. (To be fair, since I’ve been trying to avoid the discussion, I probably just missed it.)

Full disclosure: as you may know, my role at Red Hat was eliminated earlier this year. This does not make me particularly inclined to give Red Hat as a company the benefit of the doubt, but I try to be fair. Also: during my time at Red Hat, I was the program manager for the creation of CentOS Stream. However, I did not make business decisions about it, nor did I have any say in the termination of CentOS Linux or the recent srpm change.

My take on the situation

I won’t get into the entire history of Red Hat Enterprise Linux, clones, or competitors here. Joe Brockmeier’s ongoing “Clone Wars” series covers the long-term history in detail. I do think it’s worth providing my take on the last few years, though, so you understand my perspective on the future.

First of all, I don’t think Red Hat (or IBM, if you’d rather) acted with evil intent. That doesn’t mean I think the decision was correct, but I do think it was a legitimate business choice. I disagree with the decision, but as much as they didn’t ask me before, they sure as hell don’t ask me now.

If RHEL development had started out with a CentOS Stream model, I’m not sure CentOS Linux (and the other RHEL clones) would have existed in the first place. But we don’t live in that timeline, so RHEL clones exist.

There are plenty of valid reasons for wanting RHEL but not wanting to pay for the subscription. It’s not just that people are being cheap. Until 2018, users of Spot instances on Amazon Web Services couldn’t use RHEL. In a former role, we had RHEL customers who used CentOS Linux in AWS precisely because they wanted to use Spot instances. Others used CentOS Linux in AWS because they didn’t want to deal with subscription management for environments that might come and go. (I understand that subscription-manager is much easier to work with now.)

So while Red Hat may be right to say that RHEL clones don’t add value to Red Hat (and I disagree there, too), RHEL clones clearly add value for their users, which include Red Hat customers. It’s fair to say that, for some people, the perceived value of a RHEL subscription does not match what Red Hat charges for it. How to solve that mismatch is not a problem I’m concerned with.

So what now?

Two community-driven clones popped up in the immediate aftermath of the death of CentOS Linux: Rocky Linux and AlmaLinux. Both of these aimed to fill the role formerly held by CentOS Linux: a bug-for-bug clone of Red Hat Enterprise Linux. I never quite understood what differentiated them in practice.

But now duplicated effort becomes differentiated effort. Rocky Linux will continue to provide a bug-for-bug clone. AlmaLinux, meanwhile, will shift to making an ABI-compatible distribution — one where “software that runs on RHEL will run the same on AlmaLinux.” This differentiated effort allows those communities to serve different use cases. They now have their own niche to succeed or fail in.

Time will tell, but I think Alma’s approach is a better fit for most clone users. I suspect that most people don’t need bug-for-bug compatibility (except in the XKCD #1172 scenario). For many use cases, CentOS Stream is suitable. Of course, people make decisions based on what they think they need, not what they actually need. Third-party software vendors may end up being the deciding factor.

Given the different approaches Rocky and Alma are taking, I think Red Hat’s decision ended up being beneficial to the broader ecosystem. I don’t think it was done with that intent, and I am not arguing that the ends justify the means, but the practical result seems positive on the whole.

#inaction bcotton

On 25 June 2018, I published a post called “It’s hattening”. After years of rejected applications, I was finally starting a job at Red Hat. On 24 April 2023, Red Hat announced a 4% reduction in global staff. As a member of that 4%, today is my last day at Red Hat.

What does this mean for Ben?

This is the first time I’ve been laid off from a job. I hope it will be the last, but who can say? I’d be lying if I said I haven’t felt a big range of emotions in the past three weeks: confusion, anger, sadness, amusement.

But I’ve also felt loved. I’ve received so much support from people since the news started spreading. It’s like that end scene of “It’s a Wonderful Life” and I’m George Bailey. I’m proud of the contributions I’ve made to the Fedora community over the last five years, and it feels good to have others recognize that.

While I won’t be contributing as the Fedora Program Manager anymore, I was a Fedora contributor long before I joined Red Hat, and I’m not letting them take that away from me. I’ll still be around Fedora in ways that spark joy, although perhaps not much at first as I let my wounds heal.

I’ve had the great fortune to build an incredible professional and personal network over the years. I’m already pursuing a few opportunities and if those don’t pan out, I’ll be asking for your help finding more. In the meantime, I have (at least) a few weeks to relax for a bit. There’s a ton of work to do around the house, many trails to hike, Program Management for Open Source Projects to promote, and an embarrassingly-large backlog for Duck Alignment Academy articles.

What does this mean for Fedora?

I’ve told folks that if Fedora falls off the rails, then I have failed. I’m working with Matthew, Justin, and others to ensure coverage of the core job duties one way or another. I’ve worked hard over the years to automate tasks that can be automated. The documentation is far more comprehensive than what I inherited.

No doubt there are gaps in what I’ve left for my successors. However, my goal is that in a few months, nobody will notice that I’m gone. That’s my measure of success. The only reason I’ve been successful in my role is because of the work done by my predecessors: John, Robyn, Jaroslav, and Jan.

As to what the broader implication behind the loss of my position might be, I don’t know. There’s no indication that my role was targeted specifically. There are definitely people in Red Hat who continue to view Fedora as strategically important. I wish I had a clearer understanding of how they chose people/roles to cut, but I’ll probably never know the process. What I do know is that I fully intend to still be participating in the Fedora community when my account hits the 20-year mark in May 2029.

In defense of Fedora’s release cycle

Earlier this week, Thorsten Leemhuis published a thoughtful post about what he’d change if he magically became the supreme leader of Fedora. In that post and subsequent commentary on Mastodon and Fedora Discussion, he talked about changing Fedora’s release cycle. Since the Fedora Linux release process is my job, I figured I should explain why I disagree.

Integration projects are different

If you haven’t read the post, you should. But here’s the short version: Fedora Linux uses a release model rooted in the 1990s and should move to a “modern” model. Thorsten suggests a one-month cadence for those who want the latest versions and a one-year “steady” release. Such a model has worked well for Firefox, he argues, and so it should work for Fedora.

The key reason I think this is wrong is that Firefox is a development project whereas Fedora is an integration project. Integration projects don’t write a lot of code; they take the work of others and turn it into a coherent whole. This is a fundamentally different kind of work, and it takes longer by necessity.

You can’t reliably integrate disparate pieces when they’re in constant motion. That’s why we have freezes leading up to the beta and final releases — they give the QA team time to test against a stationary target. It takes time to run through all of the test cases that make Fedora Linux a reliable operating system. So the choice becomes reducing the pre-release testing or spending a significant portion of the cycle in a freeze, which limits the usefulness of the one-month cycle.

You can solve some of this with automated testing. And the QA team does do a lot of automated testing. But those tests still take time, and there are a lot of interrelated parts in a Linux distribution.

Six months isn’t magic

There’s nothing objectively correct about a six month release cycle. It’s mostly because that’s how you get two releases a year. If the calendar had 10 months, the release cycle would be five. But there is a lower bound where you’ve become a de facto rolling release, even if you still have discrete releases. I don’t know where exactly that boundary is, but I suspect that one month is at or just beyond it.

Similarly, there’s an upper limit where you’re now a slow, plodding project. Again, I can’t say where the line is. Six months may be uncomfortably close to it, but I suspect it’s closer to a year. And, of course, it depends on the nature of the specific project.

So there’s no particular reason Fedora Linux couldn’t move to a shorter release cycle. Five months is totally doable. Four is possible. Three would require a tremendous amount of work before it could be considered. But what’s the benefit of going to a shorter cycle? Does five months instead of six make a meaningful difference? At least with six months, you know there’s a release targeted for April and October. Predictability is nice.

Solving the actual problem

The bigger issue, though, is that I don’t think people actually want this. Yes, you might want your web browser and other applications to update frequently. But that doesn’t mean you want your compiler or Python interpreter or C libraries to update frequently. Most people will avoid this in favor of the “steady” stream. This eliminates the intended benefit to upstream projects.

The people who do want everything to update quickly use a rolling release distribution, something that Thorsten explicitly says his proposal is not.

Fundamentally, the proposal looks at the problem the wrong way. The problem isn’t that a six month cycle is too long. The problem is that application delivery is coupled to operating system delivery. Most people want the latest versions of the applications they care about and for everything else to remain unchanged. The challenge, of course, is that not everyone draws that distinction in the same way.

We unsuccessfully tried to solve this with Modularity. Flatpak, at least for graphical applications, offers another attempt to solve this problem.

Historically, the system and application layers have been distributed together. Figuring out how to decouple these (including how to draw the line between them) is the interesting work. And it provides real value to the end users.

Open source is selfish: that’s good and bad

Back in May, Devin Prater wrote an excellent piece on Medium titled “Linux Accessibility: an unmaintained Mess”. Devin talks about the poor state of accessibility on mainstream Linux distributions. While blind people have certainly used Linux, it’s generally not an easy task. There’s a simple explanation for this: most open source contributors aren’t blind.

There’s no rule that you can’t make accessible software if you don’t need that particular accessibility feature. But for many open source contributors, their contributions are based on “scratching their own itch.” People work on the things that are personally interesting to them or impact them in some way.

That’s a good thing! It means they’re invested in how well the software works. I’m sure you’ve used some applications where you thought “there’s no way the people who made this actually used it.”

The problem comes when we’re excluding potential users and contributors. People with vision problems can’t contribute because they can’t easily use the software. And when they can use it, the tools for contributing add another barrier. I can’t imagine trying to understand a patch or an XML file read aloud, but there are people who have to do that.

In Program Management for Open Source Projects, I wrote “software is only useful to the degree that people can use it”. I don’t have a great solution. As a community, we need to figure out how to keep the good part of the selfishness while being more inclusive.