Decentralization is more appealing in theory than in reality

One of the appeals of open standards is that they allow market forces to work. Consumers can choose the tool or service that best meets their specific needs, and if a provider turns out to be a bad actor, the consumer can flee. Centralized proprietary services, in contrast, tie you to a specific provider. Consumers who want or need the service have no choice.

It’s not as great as it sounds

This is true in theory, but reality is more complicated. Centralization allows for lower friction. Counterintuitively, centralization can allow for greater advancement. As Moxie Marlinspike wrote in an Open Whisper Systems blog post, open standards got the Internet “to the late 90s.” Decentralization is (in part) why email isn’t end-to-end encrypted by default. Centralization is (again, in part) why Slack has largely supplanted IRC and XMPP.

Particularly for services with a directory component (social networks for sure, but also tools like GitHub), centralization makes a lot of sense. It lowers the friction of finding those you care about. It also makes moderation easier.

Of course, those benefits can also be disadvantages. Easier moderation also means easier censorship. But not everyone is capable of or willing to run their own infrastructure. Or to find the “right” service among twenty nearly-identical offerings. The free market requires an informed consumer, and most consumers lack the knowledge necessary to make an informed choice.

Decentralization in open source

Centralized services versus federated (or isolated) services is a common discussion topic in open source. Jason Baker recently wrote a comment on a blog post that read in part:

I use Slack and GitHub and Google * and many other services because they’re simply easier – both for me, and for (most) of the people I’m collaborating with. The cost of being easier for most people I collaborate with is that I’m also probably excluding someone. Is that okay? I’m not sure. I go back and forth on that question a lot. In general, though, I try to be flexible to accommodate the people I’m actually working with, as opposed to solving the hypothetical/academic/moral question.

Centralization to some degree is inevitable. Whether built on open standards or not, most projects would rather work on their project than run their own infrastructure. And GitHub (like SourceForge before it) has enabled many small projects to flourish because they don’t need to spend time on infrastructure. Imagine if every project needed to run its own issue tracker, code repository, and so on. The barrier to entry would be too high.

Striking a balance

GitHub provides an instructive example. It uses an open, decentralized technology (git) and layers centralized services on top. Users can get the best of both worlds in this sense. Open purists may not find this acceptable, but I think a pragmatic view is more appropriate. If allowing some proprietary services enables a larger and more robust open software ecosystem, isn’t that worthwhile?

Is it really open source if…?

I see this question asked (or stated as an accusation) with some regularity: “is it really open source if ___?” Fill in the blank with whatever you like.

This conversation started with a criticism of GitHub restricting Iranian users in order to comply with US law. Like true Scotsmen, we all have a notion in our heads of what open source means, and the picture doesn’t always align from person to person.

Part of the problem is that there are two separate parts: the output and the input. The output is the legal part, the part that deals with licensing. That’s easy to handle. Software is open source if it is released under a license that meets the Open Source Initiative’s definition. Easy.

The hard part is that open source also has a cultural component to it. This is the input. There’s a community involved in the project. That’s often what people think of when they consider “open source”, but it also has no real definition. So we argue about it. A lot.

Is it really open source if you don’t allow it to be used by Iranians? No. That violates number 5 in the Open Source Definition. Is it really open source if you don’t allow Iranians to be in your community? Yes. Does that make it right? Well, that’s the real question we should be asking.

Does open source benefit independent developers?

Patrick’s heresy is not an unreasonable statement. My current employer grew to a $34 billion market value using, creating, and supporting open source software. But there are plenty of stories of open source developers working on key projects that are barely able to sustain themselves.

Benefits of use

It’s clear that independent software developers benefit from using open source software. If nothing else, the programming languages themselves are immensely beneficial. Much of the tooling, frameworks, and libraries used for developing applications these days are available for free under open source licenses. The economic barrier to entry would be much higher if everything had to be paid for.

Benefits of development

This is where the answer becomes more qualified. Contributing to open source projects can open the door to being hired at one of the businesses that do make good money from open source. But that’s not attractive to everyone. And an “everything should be free” mindset can make it hard to earn money.

It comes down to what you want to get out of it. If you’re doing it to make money, then it might not be beneficial, unless it’s a boost to your resume. But for some people, contributing to open source projects is a hobby they enjoy. I got started out of a sense of giving back to the community that provided me with a free operating system. The fact that it eventually became a paying job is a nice benefit.

Harm of not contributing

The flip side of the question is “does not contributing to open source harm indie developers?” The answer is “yes” far too often. A lot of development positions explicitly or implicitly expect your GitHub profile to be a key part of your resume. But not everyone has the privilege to be able to contribute to open source projects in their spare time. Hopefully that understanding spreads more broadly through the industry.

If you want a diverse community, you have to stand up for marginalized members

On Monday, The Perl Conference’s Standards of Conduct Committee published an incident report. In a talk at the conference, a speaker deadnamed and misgendered a member of the community. As a result, they took down the YouTube video of that talk.

The speaker in question happens to be a keynote speaker at PerlCon. The PerlCon organizers wrote a pretty awful post (that they have since edited) that essentially dismissed the concerns of the transgender members of the Perl community. They went on to mock the pain that deadnaming causes.

I did not think to save their post in the Wayback Machine before they edited it down to a brief and meaningless statement. “We are preparing a great conference and do not want to break the festival and trash our year-long hard work we did for preparing the confernece, the program and the entertainment program.”

What they don’t say is “we value the marginalized members of our community.” And they make that very clear. Editing out the insulting part of their post does not mean much, as VM Brasseur pointed out.

If you value an inclusive community, you have to stand up for the marginalized members when they are excluded. If you don’t, you make it very clear that you’re only giving lip service to the idea. As it is, the PerlCon organizers don’t even apologize. They’re more concerned about the bad publicity than the bad effect on the community.

Naming your language interpreters and ecosystem

Last week, Fedora contributors proposed to change the meaning of “python” from “python2” to “python3” starting with Fedora 31. This makes sense in the context of Python 2’s upcoming demise. But some feedback on the mailing list, including mine, wondered why we’re keeping an unversioned “python” command at all.

Should there also be no unversioned “pip”, “pytest”, “pylint”, … commands? I would say “yes”. Admittedly, that trades some current pain for avoiding future pain. But we’re already dealing with a disruption, so why set ourselves up for the same issues when it’s time for Python 4?

This is bigger than Python, though. If you’re following semantic versioning, I argue that you should name your interpreters and any ecosystem executables with the major version name. Unless you’re promising to always maintain backward compatibility (or to just stop working on the project), you’re eventually setting your users up for pain.
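For a Python project, a minimal sketch of this approach might look like the setup.py below. The package, module, and command names are invented for illustration, not taken from any real project: the major-versioned command is the canonical entry point, and the bare name is just an alias that can be repointed (or dropped) when an incompatible major release arrives.

```python
# Hypothetical setup.py for an imaginary "mytool" project.
from setuptools import setup

setup(
    name="mytool",
    version="2.3.1",
    py_modules=["mytool"],  # assumes a mytool.py module with a main() function
    entry_points={
        "console_scripts": [
            # The versioned command is the one users and scripts should rely on.
            "mytool2 = mytool:main",
            # The bare name is a convenience alias; when an incompatible
            # mytool3 ships, only this line needs to change.
            "mytool = mytool:main",
        ],
    },
)
```

Scripts that call mytool2 explicitly keep working across the transition; only the alias moves.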

What about things other than programming languages? This is probably good advice for anything with a client-server model (e.g. databases), or anything else where the command is separated from other components of the ecosystem. You could extend this to any executable or script that may be called by another. That’s not wrong, but there’s probably a reasonable line to draw somewhere.

Wherever you draw the line, doing it from the beginning makes life easier when the new, incompatible version comes out.

What kind of documentation are you writing?

Hopefully the answer isn’t “none”! Let’s assume you’re writing documentation because you’re a wonderful person. Is it a comprehensive discussion of all the features in your software? Good…sort of.

There’s a place for in-depth, comprehensive reference documentation. And that place is often “over there in the corner collecting dust until someone really needs it.” By and large, people are going to need smaller, more task-focused docs. It’s a difference between reference guides and user guides. Or as I like to think of it: “what could I do?” versus “what should I do?” These are not the same documents.

“What could I do?” docs should be chock-full of facts that are easy to discover when you know what you’re looking for. They don’t have opinions, they just list facts. You go to them for an answer to a very specific question that has a correct answer.

“What should I do?” docs should be opinionated. “So you want to do X? The best way to do that is Y, Z.” They’re focused on accomplishing some use case that the reader can probably describe in human language, but might not know how to do technically.

A great example of “What should I do?” docs is the tldr project. Unlike man pages, which are generally reference docs, tldr-pages focus on use cases.
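As a rough illustration of the difference, here is a Python sketch built around an invented export() function (the function and its parameters are hypothetical, chosen only to contrast the two styles):

```python
# Reference-style, "what could I do?": exhaustive, unopinionated facts.
def export(data, fmt="csv", delimiter=",", encoding="utf-8", compress=False):
    """Write data to stdout in the given format.

    fmt: one of "csv", "json", or "tsv".
    delimiter: field separator (csv and tsv only).
    encoding: output text encoding.
    compress: gzip the output stream before writing.
    """
    ...

# Task-style, "what should I do?": opinionated and use-case driven.
# "Sharing results with a spreadsheet user? Export compressed CSV:"
#
#     export(results, fmt="csv", compress=True)
```

The reference docstring answers specific questions once you know what to ask; the task-style note tells the reader which combination of options to reach for.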

When I was more active in the HTCondor project, I often dreamed (sometimes literally) of writing a book about administering HTCondor pools. The developers had a great reference manual, but it often lacked the more opinionated “here’s what you should do and why.” It’s something we should all consider when we write documentation.

Don’t give gatekeepers a foothold

Gatekeepers are a problem in communities. They decide — often arbitrarily — who can and cannot be members of a community. Are you a true Scotsman? A gatekeeper will tell you. And if there’s something they don’t like about you, or they’re feeling particularly ornery, they’ll keep you away from the community.

Gatekeeping is a problem in open source communities. More experienced (or just louder) contributors set a bar that new contributors cannot meet. This is bad for folks who want to contribute to the project, and it’s bad for the project’s sustainability.

A recent Opensource.com article asked “What is a Linux user?”. In the initial version, it left open the possibility that if you’ve only used a Linux desktop that doesn’t require a ton of tinkering, then you’re not a real Linux user. Fortunately, the comments called this out quickly. And the author, to his credit, did not hold this view. He quickly updated the article.

The revised article does a much better job of closing the door on gatekeeping, but I would rather it have never run at all. By engaging in debate on the question, you give it validity. It’s best to deal with gatekeeping by not even acknowledging that the question is valid.

Community-contributed versus community-led projects

Chris Siebenmann recently wrote a post about Golang in which he said: “Go is Google’s language, not the community’s.” The community makes contributions, sometimes important ones, but does not set the direction. We frequently use “community project” to mean two separate ideas: a corporate-led project that accepts community input, and a project (which may have corporate backing) led by the community.

Neither one is particularly better or worse, so long as we’re honest about the kind of project we’re running. Community-contributed projects are likely to drive away some contributors who don’t feel like they have an ownership stake in the project. Chris mentions that Go’s governance has this effect on him. And that’s okay, if you’re making that decision for your project intentionally.

Some community-contributed projects would probably welcome being community-led, or at least somewhere closer to that. But technical or governance barriers may inadvertently make it too difficult for would-be contributors to ramp up. This is one area where I don’t think GitHub’s position as the dominant code hosting platform gets enough credit. By having a single account and consistent interface across many unrelated projects, it becomes much easier for someone to progress from being a bug filer to making small contributions to becoming (if the project allows it) a key contributor.

Pay maintainers! No, not like that!

A lot of people who work on open source software get paid to do so. Many others do not. And as we learned during the Heartbleed aftermath, sometimes the unpaid (or under-paid) projects are very important. Projects have changed their licenses (e.g. MongoDB, which is now not an open source project by the Open Source Initiative’s definition) in order to cut off large corporations that don’t pay for the free software.

There’s clearly a broad recognition that maintainers need to be paid in order to sustain the software ecosystem. So if you expected people to be happy with GitHub’s recent announcement of GitHub Sponsors, you have clearly spent no time in open source software communities. The reaction has included a lot of “pay the maintainers! No, not like that!”, which strikes me as obnoxious and unhelpful.

GitHub Sponsors is not a perfect model. Bradley Kuhn and Karen Sandler of the Software Freedom Conservancy called it a “quick fix to sustainability”. That’s the most valid criticism. It turns out that money doesn’t solve everything. Throwing money at a project can sometimes add to the burden, not lessen it. Money brings a lot of messiness and management overhead, especially if there’s no legal entity behind the project. That’s where the services provided by fiscal sponsor organizations like Conservancy come in.

But throwing money at a problem can sometimes help. Projects can opt in to accepting money, which means they can avoid the problems if they want. On the other hand, if they do want to take in money, GitHub just made it pretty easy. The patronage model has worked well for artists; it could also work for coders.

The other big criticism that I’ll accept is that it puts the onus on individual sponsorships (indeed, that’s the only kind available at the moment), not on corporate sponsorship.

Like with climate change or reducing plastic waste, the individual’s actions are insignificant compared to the effects of corporate action. But that doesn’t mean individual action is bad. If iterative development is good for software, then why not iterate on how we support the software? GitHub just reduced the friction of supporting open source developers significantly. Let’s start there and fix the system as we go.

Apache Software Foundation moves to GitHub

Last week, GitHub and the Apache Software Foundation (ASF) announced that ASF migrated their git repositories to GitHub. This caused a bit of a stir. It’s not every day that “the world’s largest open source foundation” moves to a proprietary hosting platform.

Free software purists expressed dismay. One person described it as “a really strange move. In part because Apache’s key value add [was] that they provided freely available infrastructure.” GitHub, while it may be “free as in beer”, is definitely not “free as in freedom”. Git itself is open source software, but GitHub’s “special sauce” is not.

For me, it’s not entirely surprising that ASF would make this move. I’ve always seen ASF as a more pragmatically-minded organization than, for example, the Free Software Foundation (FSF). I’d argue that the ecosystem benefits from having both ASF- and FSF-type organizations.

It’s not clear what savings ASF gets from this. Their blog post says they maintain their own mirrors, so there’s still some infrastructure involved. Of course, it’s probably smaller than running the full service, but by how much?

More than a reduced infrastructure footprint, I suspect the main benefit to the ASF is that it lowers the barrier to contribution. Like it or not, GitHub is the go-to place to find open source code. Mirroring to GitHub makes the code available, but you don’t get the benefits of integrated issues and pull requests (at least not trivially). Major contributors will do what it takes to adopt whatever tools a project uses, but drive-by contributions should be as easy as possible.

There’s also another angle, which probably didn’t drive the decision but brings a benefit nonetheless. Events like Hacktoberfest and 24 Pull Requests help motivate new contributors, but they’re based on GitHub repositories. Using GitHub as your primary forge means you’re accessible to the thousands of developers who participate in these events.

In a more ideal world, ASF would use a more open platform. In the present reality, this decision makes sense.