Open source is selfish: that’s good and bad

Back in May, Devin Prater wrote an excellent piece on Medium titled “Linux Accessibility: an unmaintained Mess“. Devin talks about the poor state of accessibility on mainstream Linux distributions. While blind people have certainly used Linux, it’s generally not an easy task. There’s a simple explanation for this: most open source contributors aren’t blind.

There’s no rule that you can’t make accessible software if you don’t need that particular accessibility feature. But for many open source contributors, their contributions are based on “scratching their own itch.” People work on the things that are personally interesting to them or impact them in some way.

That’s a good thing! It means they’re invested in how well the software works. I’m sure you’ve used some applications where you thought “there’s no way the people who made this actually used it.”

The problem comes when we’re excluding potential users and contributors. People with vision problems can’t contribute because they can’t easily use the software. And when they can use it, the tools for contributing add another barrier. I can’t imagine trying to understand a patch or an XML file read aloud, but there are people who have to do that.

In Program Management for Open Source Software, I wrote “software is only useful to the degree that people can use it”. I don’t have a great solution. As a community, we need to figure out how to keep the good part of the selfishness while being more inclusive.

The right of disattribution

While discussing the ttyp0 font license, Richard Fontana and I had a disagreement about its suitability for Fedora. My reasoning for putting it on the “good” list was taking shape as I wrote. Now that I’ve had some time to give it more thought, I want to share a more coherent (I hope) argument. The short version: authors have a fundamental right to require disattribution.

What is disattribution?

Disattribution is a word I invented because the dictionary has no antonym for attribution. Attribution, in the context of open works, means saying who authored the work you’re building on. For example, this post is under the Creative Commons Attribution-ShareAlike 4.0 license. That means you can use and remix it, provided you credit me (Attribution) and also let others use and remix your remix (ShareAlike). On the other hand, disattribution would say something like “you can use and remix this work, but don’t put my name on it.”

Why disattribution?

There are two related reasons an author might want to require disattribution. The first is that either the original work or potential derivatives are embarrassing. Here’s an example: in 8th grade, my friend wrote a few lines of a song about the destruction of Pompeii. He told me that I could write the rest of it on the condition that I don’t tell anyone that he had anything to do with it.

The other reason is more like brand protection. Or perhaps avoiding market confusion. This isn’t necessarily due to embarrassment. Open source maintainers are often overworked. Getting bugs and support requests from a derivative project because the user is confused is a situation worth avoiding.

Licenses that require attribution are uncontroversial. If we can embrace the right of authors to require attribution, we can embrace the right of authors to require disattribution.

Why not disattribution?

Richard’s concerns seemed less philosophical and more practical. Open source licenses are generally concerned with copyright law. Disattribution, particularly in the second reasoning, is closer to trademark law. But licenses are the tool we have available; don’t be surprised when we ask them to do more than they should.

Perhaps the bigger concern is the constraint it places on derivative works. The ttyp0 license requires not using “UW” as the foundry name. Richard’s concern was that two-letter names are too short. I don’t agree. There are plenty of ways to name a project that avoid one specific word. Even in this specific case, a name like “nuwave”—which contains “uw”—because it’s an unrelated “word.”

Excluding a specific word is fine. A requirement that excludes many words or provides some other unreasonable constraint would be the only reason I’d reject such a license.

Isn’t it better to contribute code than money?

Recently, I was in a discussion about making contributions to open source projects. One person said it would be nice if their employer gave each employee a budget that could be directed to open source projects at the employee’s discretion. The idea is that it would be a way for employees to support the specific projects that make their jobs or lives better. Another person said “isn’t it better to contribute” code to the project?

No, it is not. Even in software companies, a large percentage of employees lack the skills necessary to make meaningful code contributions to projects. Even when you consider (the very valuable) non-code contributions like documentation, testing, graphic design, et cetera. Money is quicker and easier.

Money gives the project maintainers to put it where they need it. They could buy test hardware, pay for web hosting, hire a contractor, buy themselves a nice cup of coffee. Whatever. This is the same reason charities prefer money over goods for disaster relief donations.

Of course, money isn’t perfect either. Not all projects are equipped to accept financial donations. Even if there’s a way to route money to them, they may not want to deal with tax implications. Loosely-governed projects may not have a good mechanism for deciding how to spend the money. Money can make relationships go south in a hurry.

If you’re a company looking for ways to let employees support the open source projects that they depend on, I advocate the “¿por que no los dos?” approach. Give your employees time to contribute effort in whatever way they’re able. But also give them a pool of money to sprinkle on the projects that provide value to your company.

What does it mean for a Linux distribution to be “fresh”?

I recently had a discussion with Luboš Kocman of openSUSE about how distros can monitor their “freshness”. In other words: how close is a distro to upstream? From our perspectives, it’s helpful to know which packages are significantly behind their upstreams. These packages represent areas that might need attention, whether that be a gentle nudge to the maintainer or recruiting additional volunteers from the community.

The challenge is that freshness can mean different things. The Repology project monitors a large number of distributions and upstreams to report on the status. But simply comparing the upstream version number to the packaged version number ignores a lot of very important context.

Updating to the latest upstream version as soon as it comes out is the most obvious definition of “fresh”, but it’s not always the best. Rolling releases (and their users) probably want that. In Fedora, policy is to not do “major updates” within a release. Many other release-oriented distributions have a similar policy, with varying degrees of “major”. Enterprise distributions add another wrinkle: they’ll backport security fixes (and sometimes key features), so the difference in version number doesn’t necessarily tell you what’s missing.

Of course, the upstream’s version number doesn’t necessarily tell you much. Semantic versioning is great, but not everyone uses it. And not everyone that uses it uses it well. If a distribution has version 1.4 and upstream released 1.5, is that a lack of freshness or an intentional decision to avoid mid-release compatibility changes?

I don’t have a good answer. This is a hard problem to solve. Something like Repology may be the best we can do with reasonable effort. But I’d love to have a more accurate view of how fresh Fedora packages are within the bounds of policy.

FOSS licenses permit, not restrict

Last week, Matthew Wilson shared a very correct take on Twitter:

A few people in the mentions argued that the GPL is doing it wrong by his definition. This is incorrect. Copyleft licenses do not prevent the user from doing things, they ensure that subsequent users can do the same thing.

This may seem like a semantic argument, but there’s substance to it. All licenses (except those that amount to a public domain dedication) contain some conditions, minimal though they may be. It’s important to remember that the default is that you can do nothing with a work. Copyright is by definition a monopoly on a work.The entire point of free and open source software licenses is to tell you what you can do, because the default position is that you can’t.

One of the most annoying things about license wars is the argument that one category of license is somehow more free than another. That’s dumb. Both copyleft and permissive licenses promote freedom, just from different perspectives. Permissive licenses give the next person in line the freedom to do (essentially) whatever they want. Copyleft licenses preserve freedoms for all subsequent users, no matter how many hands the work passes through. There are plenty of philosophical and practical reasons you might choose one class of license over the other (I tend to prefer copyleft licenses, myself), but it’s wrong to paint one or the other as anti-freedom.

Getting back to Matthew’s point, there has been a fair amount of license weaponization in the last few years. By this I mean the use of a license to try to exclude a certain class of user. Some of this I’m sympathetic to (e.g. the “ethical source” movement), some of this I’m not (e.g. the various “you can do what you want, just don’t make a successful software-as-a-service offering” licenses that have popped up). In both cases, I think copyright is the wrong mechanism for achieving the goals.

Excluding classes of users is antithetical to ideals free software and open source. That may be okay. As I’ve written, free software is not the end goal. But if you’re going to claim to be open source, you should act open source.

Balancing incoming tasks in volunteer projects

Open source (and other volunteer-driven) communities are often made up of a “team of equals.” Each member of the group is equally empowered to act on incoming tasks. But balancing the load is not easy. One of two things happens: everyone is busy with other work and assumes someone else will handle it, or a small number of people immediately jump on every task that comes in. Both of these present challenges for the long-term health of the team.

Bystander effect

The first situation is known as the “bystander effect.” Because every member of the team bears an equal responsibility, each member of the team assumes that someone else will take an incoming task. The sociological research is apparently mixed, but I’ve observed this enough to know that it’s at least possible in some teams. You’ve likely heard the saying “if everyone is responsible then no one is.”

The Bystander effect has two outcomes. The first is that the team drops the task. No one acts on it. If the task happens to be an introduction from a new member or the submission of content, this demoralizes the newcomer. If the team drops enough tasks, the new tasks stop coming.

The other possibility is that someone eventually notices that no one else is taking the task, so they take it. In my experience, it’s generally the same person who does this every time. Eventually, they begin to resent the other members of the team. They may burn out and leave.

Oxygen theft

Sometimes one or two team members jump on new tasks before anyone else does. Like the delayed version in the bystander effect scenario, this can lead to burn out. But worse, it can drive away team members who want to take tasks. If they’re constantly missing work because they weren’t able to immediately jump on it, they’ll go find other places to contribute. I call this “oxygen theft” because it’s like sucking all of the oxygen out of the room: it puts out the flames.

I have been an oxygen thief myself. Shortly after I started as the Fedora Program Manager, I became an editor on the Fedora Community Blog. I was publishing regular posts and I happen to be a decent editor, so it made sense to give me that privilege. But because Fedora was my day job, I was often the first to notice new submissions. Over time, I eventually became the only editor working on posts. By accident, the editorial team became a team of one. That’s on my list to fix in the near future.

Solving the problem

Letting either the bystander effect or oxygen theft cases go for too long harms the team. But with volunteers, it’s hard to balance the work. Team members may not have consistent availability. For example, if one of the team members dayjob schedule varies from week. They probably don’t have evenly distributed availability, either. Someone who is paid to be on a project will likely have a lot more time available than someone volunteering.

One way to solve the problem is to take turns being in charge of the incoming tasks for a period of time. This addresses “if everyone is responsible then no one is” by making a single person responsible. But by making it a rotating duty, you can spread the load.

After learning my lesson with the Fedora Community Blog, I was hesitant to be too aggressive with taking tasks as an editor of the Fedora Magazine. But the Magazine team was definitely suffering from the bystander effect.

To fix this, I proposed having an Editor of the Week. Each week, one person volunteers to be responsible for making sure new article pitches got timely responses and the comments were moderated. Any of the editors are free to help with those tasks, but the Editor of the Week is the one accountable for them.

It’s not a perfect system. The Editor of the Week role is taken on a volunteer basis, so some editors serve more frequently than others. Still, it seems to work well for us overall. Pitches get feedback more quickly than in the past, and we’re not putting all of the work on one person’s plate.

[If you are intrigued by this half-baked post, you’ll enjoy my book on program management for open source projects, coming from The Pragmatic Bookshelf in 2022.]

Projects shouldn’t write their own tools

Over the weekend, the PHP project learned that its git server had been compromised. Attackers inserted malicious code into the repo. This is very bad. As a result, the project moved development to GitHub.

It’s easy to say that open source projects should run their own infrastructure. It’s harder to do that successfully. The challenges compound when you add in writing the infrastructure applications.

I understand the appeal. It’s zero-price (to write; you still need the hardware to run it). Bespoke software meets your needs exactly. And it can be a fun diversion from the main thing you’re working on: who doesn’t like going to chase a shiny for a little bit?

Of course, there’s always the matter of “the thing you wanted didn’t exist when you started the project.” PHP’s first release predates the launch of GitHub by 13 years. It’s 10 years older than git, even.

Of course, this means that at some point PHP moved from some other version control system to Git. That also means they could have moved from their homegrown platform to GitHub. I understand why they’d want to avoid the pain of making that switch, but sometimes it’s worthwhile.

Writing secure and reliable infrastructure is hard. For most projects, the effort and risk of writing their own tooling isn’t worth the benefit. If the core mission of your project isn’t the production of infrastructure applications, don’t write it.

Sidebar: Self-hosting

The question of whether or not to write your infrastructure applications is different from the question of whether or not to self-host. While the former has a pretty easy answer of “no”, the latter is mixed. Self-hosting still costs time and resources, but it allows for customization and integration that might be difficult with software-as-a-service. It also avoids being at the whims of a third party who may or may not share your project’s values. But in general, projects should do the minimum that they can reasonably justify. Sometimes that means running your own instances of an infrastructure application. Very rarely does it mean writing a bespoke infrastructure application.

The FSF does not represent my views

Earlier this week, Richard Stallman announced that he was rejoining the board of the Free Software Foundation. You may recall that he resigned as president and board member in 2019 after making unacceptable remarks about the sexual assault of a minor. This was not the first instance of unacceptable behavior. The FSF made no real changes to address the issue and now has welcomed Stallman back.

I’m thankful that the people I choose to associate with have universally condemned this as harmful. I wrote in 2012 that I think he hurts his own ideological cause. At the time I wrote the post, I was thinking entirely of his rigid aherence to free software over all else. In truth, the harm he does goes well beyond that. For me, the licensing terms of free and open source software are not as important as the human impact.

As I wrote last month, free and open source software is not the end goal. What good is free software that is used to harm others? And what good is a free software movement that is not willing to include underindexed groups. We cannot tolerate nor enable this sort of behavior.

The Fedora Council spent a lot of time debating our vision statement.

The Fedora Project envisions a world where everyone benefits from free and open source software built by inclusive, welcoming, and open-minded communities.

Fedora PRoject vision

The inclusion of the “built by” is no accident. We want our community to be vibrant and healthy. That cannot happen when bad behavior is allowed to persist.

I think it’s too late for the FSF. They’ve painted themselves into a corner long ago. This only cements that. Still, perhaps with a new slate, the organization can be reborn into something that aids the cause it purports to champion. That is why I have signed the open letter calling for the resignation of Stallman and the entire FSF Board of Directors.

Should we treat OSD compliance as a binary?

So often, we think about whether a software license complies with the Open Source Definition (OSD) as a binary: it complies or it doesn’t. But the OSD has 10 criteria. If a license complies with all except for one of those criteria, it’s non-compliant, but is it non-compliant in the same way that a license that doesn’t comply with four criteria?

I got to thinking about this as I tried to come up with names for the four quadrants in Tobie Langel’s license classification chart. It occurred to me that the bottom half represented two concepts: not explicitly OSD-compliant because it was never submitted and explicitly not OSD-compliant because it violates one or more criteria.

A diagram of the open source landscape considering licenses and norms. Created by Tobie Langel and used under CC BY-SA 4.0.

There must be 50 ways to violate the OSD

Knowing how many (and which) criteria a non-compliant license meets is important. I argue that not allowing derived works is far more important to the idea of “open source in spirit” than not restricting other software by requiring all software distributed alongside it be free.

To add even more complication, not all violations of the same criteria are equal. A license that restricts users from hunting humans for sport would be seen more favorably than a license that restricts users from making ice cream.

Saying a license is OSD-compliant tells us something. Saying it is non-compliant tells us nothing.I don’t know if there’s a succinct way to express the 1,024 possible ways a license could be non-compliant. Certainly there is not if you also include the specific reasoning.

As I showed above, saying a license is 90% compliant is not particularly useful if the 10% is really important to you. And not all 90%s are created equal. It doesn’t make sense to put the criteria on a spectrum and describe the license by how far along it gets. Again, the violation may or may not matter for your purposes. And how can we say which criteria are most important in a way that will garner any sort of widespread support?

It may be possible to group the criteria into two or three broader categories. I’m not entirely sure that would be easy to express—certainly not in a simple chart.

Do we care?

And then there’s the question of if that even matters. I wrote last week’s “free and open source software is not the end goal” post as I thought about this question. From an intellectual property law standpoint, OSD compliance matters. (In that it gives you at least a broad idea of what you’re working with.) From a “why the hell am I writing this software to begin with?” standpoint, I’m not sure that it does.

We’re back to the beginning. If the goal is to write software that advances the state of humanity, you may choose a license that is explicitly not OSD-compliant because you don’t want it used for nefarious purposes. That’s a valid choice, although a very complicated one. Is it reasonable to lump that in with all of the other non-compliant licenses? The answer depends on your context.

There is no easy answer. Tobie’s other axis (follows norms) is also messy. Even more, probably, because there’s no defined standard to measure against. Perhaps for this purpose we continue to treat it as a binary after all. The model can show which quadrant a project falls in; understanding why is left as an exercise to the reader.

Refining the model to account for all (okay, some) of the complexities I’ve discussed would make an excellent dissertation topic for an aspiring PhD student.

Free and open source software is not the end goal

When I first started thinking about this article, the title was going to be “I don’t care about free software anymore.” But I figured that would be troll bait and I thought I should be a little less spicy. It’s true in a sense, though. I don’t care about free/open source software as an end goal.

The Free Software Foundation (FSF) says “free software is about having control over the technology we use in our homes, schools and businesses”. The point isn’t that the software itself is freely-licensed, it’s about what the software license permits or restricts. I used to think that free software was a necessary-but-insufficient condition for users having control over their computing. I don’t think that’s necessarily the case anymore.

Why free software might not matter

Software isn’t useful until someone uses it. So we should evaluate software in that context. And most software use these days involves 1. data and 2. computers outside the user’s control. We’ll get back to #2 in a moment, but I want to focus on the data. If Facebook provided the source code to their entire stack tomorrow—indeed, if they had done it from the beginning—that would do nothing to prevent the harms caused by that platform. One, it does nothing to diminish the “joys” of spreading disinformation. Two, it would be no guarantee that something else isn’t reading the data.

While we were so focused on the software, we essentially ignored the data. Now, the data is just as important, if not more, as the software. There are plenty of examples of this in my talk “We won. Now what?” presented at DevConf.CZ (25 minutes) and DevConf.US (40 minutes) last year. Being open is no guarantee of data protection, just as being proprietary is not guarantee of data harm.

We’ll always use other people’s computers

Let’s return to the “computers outside the user’s control” point. There’s a lot of truth to the “there is no cloud, there’s only other people’s computers” argument. And certainly if everyone ran their own services, that would reduce the risk of harm.

But here in the real world, that’s not going to happen. Most people cannot run their own software services—they have neither the skill nor the resources. Among those who do, many have no desire to. Apart from the impossibility of people running their own services, there’s the fact that communication means that the information lives in two places, so you’re still using someone else’s computer.

It’s all very complicated

There’s also the question of whether or not the absolutist view of software freedom is the right approach. The free software movement seems to be very libertarian in nature: if each user has freedom over their computing, that is a benefit to everyone. Others would argue (as the Ethical Source movement has) that enabling unethical uses of software is harmful. These two positions are at odds.

Whether or not you think the software license is the appropriate places to address this issue, I suspect many, if not most, developers would prefer that their software not be used for evil purposes. In order to enforce that, the software becomes non-free.

This is a complicated issue, with no right answer and no universal agreement. I don’t know what the way forward is, but I know that we cannot act like free software is the end goal. If we want to get the general public on board, we have to convince them in terms that make sense to their values and concerns, not ours. We must make software that is useful and usable in addition to being free. And we must understand that people choosing non-free software is not a moral failing but a decision to optimize for other values. We must update our worldview to match the 2020s; the 1990s are not coming back.