Thoughts on Elastic License v2

Yesterday Elastic announced a revision to their not-great Elastic License. The Elastic License v2 was updated based on feedback from the community and apparently had a lawyer’s input. And while they seem to be backing off trying to imply that it is open source (because it decidedly is not), it still doesn’t seem like a good license.

First of all, it doesn’t comply with the Open Source Definition, so if that’s important to you, that’s all you need to know. I’m assuming if you’re reading this, you care about the license beyond that. And while I’m not a lawyer (so this is very much not legal advice), here are my thoughts: it’s vague! Seriously, the vagueness makes it a big risk whether or not you care about OSD compliance (and there are many reasons you might not, as I’ll discuss in an upcoming post).

The first line in the Limitations section reads thus:

You may not provide the software to third parties as a hosted or managed service, where the service provides users with access to any substantial set of the features or functionality of the software.

This contains two things I have questions about. First of all, what is a “managed service” exactly? Does that include consulting services where someone provides direct management of a customer’s software? I have a good idea of what “managed service” means in industry terms, but if a licensor using this license decides they don’t like what you’re doing, there’s enough vagueness there for them to cause you problems. And of course, if you want to use it in a Software-as-a-Service model, you can’t use it under this license. You can use it under the SSPL, of course, but that is a non-starter for a lot of users.

Secondly, what is a “substantial set of the features or functionality of the software”? If someone does their own implementation of the functionality, does that count? If someone develops additional code that extends the functionality of the software and the upstream project later adds that functionality, does the additional code now violate the license?

Another problem is that it treats “you” and “your company” as distinct entities. This doesn’t make a lot of sense to me. If I use software on behalf of my employer, the employer is the licensee. The “Patents” section contains the only uses of “your company” and says “[i]f your company makes such a claim, your patent license ends immediately for work on behalf of your company”, but that’s redundant because the license was always for my company, not for me.

Frankly, I don’t see why anyone would use this license, particularly now that Amazon has forked the project.

What does “open source” mean in 2021?

The licensing discourse in the last few weeks has highlighted a difference between what “open source” means and what we’re talking about when we use the term. Strictly speaking, open source software is software released under a license approved by the Open Source Initiative. In most practical usage, we’re talking about software developed in a particular way. When we talk about open source, we talk about the communities of users and developers, (generally) not the license. “Open source” has come to define an ethos that we all have our own definition of.

Open source is still not a business model

If you thought 2021 was going to be the year without big drama in the world of open source licensing, you didn’t have to wait long to be disappointed. Two stories have already sprung up in the first few weeks of the year. They’re independent, but related. Both of them remind us that open source is a development model, not a business model.

Elasticsearch and Kibana

A few years ago, it seemed like I couldn’t go to any sysadmin/DevOps conference or meetup without hearing about the “ELK stack”. ELK stands for the three pieces of software involved: Elasticsearch, Logstash, and Kibana. Because it provided powerful aggregation, search, and visualization of arbitrary log files, it became very popular. This also meant that Amazon Web Services (AWS) saw value in providing an Elasticsearch service.

As companies moved more workloads to AWS it made sense to pay AWS for Amazon Elasticsearch Service instead of paying Elastic. This represented what you might call a revenue problem for Elastic. So they decided to follow MongoDB’s lead and change their license to the Server Side Public License (SSPL).

The SSPL is essentially a “you can’t use it, AWS” license. This makes it decidedly not open source. Insultingly, Elastic’s announcement and follow-up messaging include phrases like “doubling down on open”, implying that the SSPL is an open source license. It is not. It is a source-available license. And, as open source business expert VM Brasseur writes, it creates business risk for companies that use Elasticsearch and Kibana.

Elastic is, of course, free to use whatever license it wants for the software it develops. And it’s free to want to make money. But it’s not reasonable to get mad at companies using the software under the license you chose to use for it. Picking a license is a business decision.

Shortly before I sat down to write this post, I saw that Amazon has forked Elasticsearch and Kibana. They will take the last-released versions and continue to develop them as open source projects under the Apache License v2. This is entirely permissible and to be expected when a project makes a significant licensing change. So now Elastic is in danger of a sizable portion of the community moving to the fork and away from their projects. If that pans out, it may end up being more harmful than Amazon Elasticsearch Service ever was.

Nmap Public Source License

The second story actually started in the fall of 2020, but didn’t seem to get much notice until after the new year. The developers of nmap, the widely-used security scanner, began using a new license. Prior to the release of version 7.90, nmap was under a modified version of the GNU General Public License version 2 (GPLv2). This license had some additional “gloss”, but was generally accepted by Linux distributions to be a valid free/open source software license.

With version 7.90, nmap is now under the Nmap Public Source License (NPSL). Version 0.92 of this license contained some phrasing that seemed objectionable. The Gentoo licenses team brought their concerns to the developers in a GitHub issue. Some of their concerns seemed like non-issues to me (and to the lawyers at work I consulted with on this), but one part in particular stood out.

Proprietary software companies wishing to use or incorporate Covered Software within their programs must contact Licensor to purchase a separate license

It seemed clear that the intent was to restrict proprietary software, not otherwise-compliant projects from companies that produce proprietary software. Nonetheless, as it was written, it constituted a violation of the Open Source Definition, and we rejected it for use in Fedora.

To their credit, the developers took the feedback well and quickly released an updated version of the license. They even retroactively licensed affected releases under the updated license. Unfortunately, version 0.93 still contains some problems. In particular, the annotations still express field of endeavor restrictions.

While the license text is the most important part, the annotations still matter. They indicate the intent of the license and guide the interpretation by lawyers and judges. So newer versions of nmap remain unsuitable for some distributions.

Licenses are not for you to be clever

Like with Elastic, I’m sympathetic to the nmap developers’ position. If someone is going to use their project to make money, they’d like to get paid, too. That’s an entirely reasonable position to take. But the way they went about it isn’t right. As noted in the GitHub issue, they’re not copyright attorneys. If they were, the license would be much better.

It seems like the developers are fine with people free-riding profit off of nmap so long as the software used to generate the profit is also open source. In that case, why not just use a professionally-drafted and vetted license like the AGPL? The NPSL is already using the GPLv2 and adding more stuff on top of it, and it’s the more stuff on top of it that’s causing problems.

Trying to write your business model into a software license that purports to be open source is a losing proposition.

Did software stagnate in 1996?

Betteridge’s Law says “no”. But in a blog post last week, Jonathan Edwards says “yes”. Specifically, he says:

Software is eating the world. But progress in software technology itself largely stalled around 1996. 

It’s not clear what Edwards thinks happened in 1996. Maybe he blames the introduction of the Palm Pilot? In any case, he argues that the developments since 1996 have all been incremental improvements upon existing technology. Nothing revolutionary has happened in programming languages, databases, etc.

This has real “old man yells at cloud” energy. Literally. He includes “AWS” in his list of technology he dismisses.

Edwards sets up a strawman to knock down. Maybe “[t]his is as good as it gets: a 50 year old OS, 30 year old text editors, and 25 year old languages,” he proposes. “Bullshit,” he says.

I’d employ my expletive differently: who gives a shit?

Programming does not exist for the benefit of programmers. Software is written to do something for people. The universe of what is possible with computing is inarguably broader than in 1996. Much of that is owed to improvements in hardware, to be sure. And you can certainly argue that some of what’s possible with computing is bad. But that’s not what’s at issue here.

I don’t see carpenters bemoaning the lack of innovation in hammers. Software development isn’t special. It’s a trade like any other. And if the tools are working, let them work.

I won’t even bother with his “open source is stifling innovation” nonsense. Rebutting that is left as an exercise to the reader.

Moving the website to Lektor

Years ago, I moved all of funnelfiasco.com (except the blog, which runs on WordPress) from artisanally hand-crafted HTML to using a static site generator. At the time, I chose a project called “blatter” which used jinja2 templates to generate a site. This gave me the opportunity to change basic information across the whole site at once. Not something I do often, but it’s a pain when I do.

Unfortunately, blatter was apparently quietly abandoned by the developer. This wasn’t really a problem until Python 2 reached end of life. Fedora (reasonably) retired much of the Python 2 ecosystem. I tried to port it to Python 3, but ran into a few problems. And frankly, the idea of taking on the maintenance burden for a project that hadn’t been updated in years was not at all appealing. So I went looking for something else.

I wanted to find something that used jinja2 in order to minimize the amount of work involved. I also wanted something focused on websites, not blogs specifically. It seems like so many platforms today are blog-first. That’s fine, it’s just not what I want. After some searching and a little bit of trial and error, I ended up selecting Lektor.

The good

Lektor is written in (primarily) Python 3 and uses jinja2 templates, so it hit my most important points. It has a command to run a local webserver for testing. In addition, you can set up multiple server configurations for deployment. So I can have the content sync to my local web server to verify it and then deploy that to my “production” webserver. Builds are destructive, but the deploys are not, which means I don’t have to shoe-horn everything into Lektor.

Another great feature is the ability to programmatically generate thumbnails of images. I’ve made a little bit of use of that for the time being. In the future, especially if I ever go storm chasing again, I can see myself using that feature a lot more.

Lektor optionally supports writing the page content in markdown. I haven’t done this much since I was migrating pre-written content. I expect new content will be much markdownier. Markdown isn’t flexible enough for a lot of web purposes, but it covers some use cases well. Why write HTML when it’s not needed?

Lektor uses databags to provide input data to templates. I do this using JSON files. Complex operations with that are a lot easier than the embedded Python data structures that Blatter supported.

If I were interested in translating my site into multiple languages, Lektor has good support for that (including changing URLs). It also has a built-in admin and editing console, which is not something I use, but I can see the appeal.

The bad

Unlike Blatter, Lektor puts contents and templates in separate files. This makes it a little more difficult to special-case a specific site.

It also has a “one directory, one file” paradigm. Directories can have “attachments”, which can include html files, but they won’t get processed, so they need to stand alone. This is not such an issue if you’re starting from scratch. Since I’m not, it was more of a headache. You can overwrite the page’s slug, but that also makes certain assumptions.

For the Forecast Discussion Hall of Fame, I wanted to keep URLs as-is. That site has been linked to from a lot of places, and I’d hate to break those inbound links. Writing an htaccess file to redirect to the new URLs didn’t sound ideal either. I ended up writing a one-line patch that passed the argument I need to the python-slugify library. I tried to do it the right way so that it would be configurable, but it was beyond my skill to do so.

The big downside is the fact that the development has ground to a halt. It’s not abandoned, but the development activity happens in spurts. Right now it’s doing what I need it to do, but I worry at some point I’ll have to make a switch again. I’d like to contribute more upstream, but my skills are not advanced enough for this.

GitHub should stand up to the RIAA over youtube-dl

Earlier this week, GitHub took down the repository for the youtube-dl project. This came in response to a request from the RIAA—the recording industry’s lobbying and harassment body. youtube-dl is a tool for downloading videos. The RIAA argued that this violates the anticircumvention protections of the Digital Millennium Copyright Act (DMCA). While GitHub taking down the repository and its forks is true to the principle of minimizing corporate risk, it’s the wrong choice.

Microsoft—currently the world’s second-most valuable company with a market capitalization of $1.64 trillion—owns GitHub. If anyone is in a position to fight back on this, it’s Microsoft. Microsoft’s lawyers should have a one word answer to the RIAA’s request: “no”. (full disclosure: I own a small number of shares of Microsoft)

The procedural argument

The first reason to tell the RIAA where to stick it is procedural. The RIAA isn’t arguing that youtube-dl is infringing its copyrights or circumventing its protections. It is arguing that youtube-dl circumvents YouTube’s protections. So even if it does, that’s YouTube’s problem, not the RIAA’s.

The factual argument

I have some sympathy for the anticircumvention argument. I’m not familiar with the specifics of how youtube-dl works, but it’s at least possible that youtube-dl circumvents YouTube’s copy protection. This would be a reasonable basis for YouTube to take action. Again, YouTube, not the RIAA.

I have less sympathy for the infringement argument. youtube-dl doesn’t induce infringement more than a web browser or screen recorder does. There are a variety of uses for youtube-dl that are not infringing. Foremost is the fact that some YouTube videos are under a license that explicitly allows sharing and remixing. Archivers use it to archive content. Some people who have time-variable Internet billing use it to download videos overnight.

So, yes, youtube-dl can be used to infringe the RIAA’s copyrights. It can also be used for non-infringing purposes. The code itself does not infringe. There’s nothing about it that gives the RIAA a justification to take it down.

youtube-dl isn’t the whole story

youtube-dl provides a focal point, but there’s more to it. Copyright law is now used to suppress instead of promote creative works. The DMCA, in particular, favors the large rightsholders over smaller developers and creators. It essentially forces sites to act on a “guilty until proven innocent” model. Companies in a position to push back have an obligation to do so. Microsoft has become a supporter of open source; now it’s time to show they mean it.

We should also consider the risks of consolidation. git is a decentralized system. GitHub has essentially centralized it. Sure, many competitors exist, but GitHub has become the default place to host open source code projects. The fact that GitHub’s code is proprietary is immaterial to this point. A FOSS service would pose the same risk if it became the centralized service.

I saw a quote on this discussion (which I can’t find now) that said “code is free, infrastructure is not.” And while projects self-hosting their code repository, issue tracker, etc. may be philosophically appealing, that’s not realistic. Software-as-a-Service has lowered the barrier for starting projects, which is a good thing. But it doesn’t come without risk, which we are now seeing.

I don’t know what the right answer is for this. I know the answer won’t be easy. But both this specific case and the general issues it highlights are important for us to think about.

Linux distros should be opinionated

Last week, the upstream project for a package I maintain was discussing whether or not to enable autosave in the default configuration. I said if the project doesn’t, I may consider making that the default in the Fedora package. Another commenter said “is it a good idea to have different default settings per packaging ? (ubuntu/fedora/windows)”

My take? Absolutely yes. As I said in the post on “rolling stable” distros, a Linux distribution is more than an assortment of packages; it is a cohesive whole. This necessarily requires changes to upstream defaults.

Changes to enable a functional, cohesive whole are necessary, of course. But there’s more than “it works”, there’s “it works the way we think it should.” A Linux distribution targets a certain audience (or audiences). Distribution maintainers have to make choices to make the distro meet that audience’s needs. They are not mindless build systems.

Of course, opinions do have a cost. If a particular piece of software works differently from one distro to another, users get confused. Documentation may be wrong, sometimes harmfully so. Upstream developers may have trouble debugging issues if they are not familiar with the distro’s changes.

Thus, opinions should be implemented judiciously. But when a maintainer has given a change due thought, they should make it.

What do “rolling release” and “stable” mean in the context of operating systems?

In a recent post on his blog, Chris Siebenmann wrote about his experience with Fedora upgrades and how, because of some of the non-standard things he does, upgrades are painful for him. At the end, he said “What I really want is a rolling release of ‘stable’ Fedora, with no big bangs of major releases, but this will probably never exist.”

I’m sympathetic to that position. Despite the fact that developers have worked to improve the ease of upgrades over the years, they are inherently risky. But what would a stable rolling release look like?

“Arch!” you say. That’s not wrong, but it also misses the point. What people generally want is new stuff so long as it doesn’t cause surprise. Rolling releases don’t prevent that, they spread it out. With Fedora’s policy, for example, major changes (should) happen as the release is being developed. Once it’s out, you get bugfixes and minor enhancements, but no big changes. You get the stability.

On the other hand, you can run Fedora Rawhide, which gets you the new stuff as soon as it’s available, but you don’t know when the big changes will come. And sometimes, the changes (big and little) are broken. It can be nice because you get the newness quickly. And the major changes (in theory) don’t all come at once.

Rate of change versus total change

For some people, it’s the distribution of change, not the total amount of change that makes rolling releases compelling. And in most cases, the changes aren’t that dramatic. When updates are loosely-coupled or totally independent, the timing doesn’t matter. The average user won’t even notice the vast majority of them.

But what happens when a really monumental change comes in? Switching the init system, for example, is kind of a big deal. In this case, you generally want the integration that most distributions provide. It’s not just that you get an assortment of packages from your distribution, it’s that you get a set of packages that work together. This is a fundamental feature for a Linux distribution (excepting those where do-it-yourself is the point).

Applying it to Fedora

An alternate phrasing of what I understand Chris to want is “release-quality packages made available when they’re ready, not on the release schedule.” That’s perfectly reasonable. And in general, that’s what Fedora wants Rawhide to be. It’s something we’re working on, particularly with the ability to gate Rawhide updates.

But part of why we have defined releases is to ensure the desired stability. The QA team and other testers put a lot of effort into automated and manual tests of releases. It’s hard to test against the release criteria when the target keeps shifting. It’s hard to make the distribution a cohesive whole instead of a collection of packages.

What Chris asks for isn’t wrong or unreasonable. But it’s also a difficult task to undertake and sustain. This is one area where ostree-based variants like Fedora CoreOS (for servers/cloud), Silverblue (for desktops), and IoT (for edge devices) bring a lot of benefit. The big changes can be easily rolled back if there are problems.

How I broke KDE Plasma by changing my shell (and also writing a bad script)

My friends, I’d like to tell you the story of how I spent Monday morning. I had a one-on-one with my manager and a team coffee break to start the day. Since the weather was so nice, I thought I’d take my laptop and my coffee out to the deck. But when I tried to log in to my laptop, all I had was the mouse cursor. Oh no!

I did my meeting with my manager on my phone and then got to work trying to figure out what went wrong. I saw some errors in the journal, but it wasn’t clear to me what was wrong.

Aug 31 09:23:00 fpgm akonadi_control[5155]: org.kde.pim.akonadicontrol: ProcessControl: Application '/usr/bin/akonadi_googlecalendar_resource' returned with exit code 253 (Unknown error)
Aug 31 09:23:00 fpgm akonadi_googlecalendar_resource[6249]: QObject::connect: No such signal QDBusAbstractInterface::resumingFromSuspend()
Aug 31 09:23:00 fpgm akonadiserver[5159]: org.kde.pim.akonadiserver: New notification connection (registered as Akonadi::Server::NotificationSubscriber(0x7f4d9c010140) )
Aug 31 09:23:00 fpgm akonadi_googlecalendar_resource[6249]: Icon theme "breeze" not found.
Aug 31 09:23:00 fpgm akonadiserver[5159]: org.kde.pim.akonadiserver: Subscriber Akonadi::Server::NotificationSubscriber(0x7f4d9c010140) identified as "AgentBaseChangeRecorder - 94433180309520"
Aug 31 09:23:01 fpgm akonadi_googlecalendar_resource[6249]: kf5.kservice.services: KMimeTypeTrader: couldn't find service type "KParts/ReadOnlyPart"  
                                                           Please ensure that the .desktop file for it is installed; then run kbuildsycoca5.

What broke

Before starting the weekend, I had updated all of the packages, as I normally did. But none of the updated packages seemed relevant. I hadn’t done any weird customization. As “pino|work” in IRC and I tried to work through it, I remembered that I had added a startup script to set the XDG_DATA_DIRS environment variable in the hopes of getting installed flatpaks to show up in the menu. (Hold on to this thought, it becomes important again later.)

I moved it out of the way to get things cleaned up (by removing the plasma-org.kde.plasma.desktop-appletsrc and plasmashellrc files). Looking at the script, I realized I had a syntax error (a stray single quote ended up in there) while trying to set XDG_DATA_DIRS. Yay! That’s easy enough to fix.

Why it broke

Except it was still broken. It was broken because I referred to XDG_DATA_DIRS but it was undefined. Why didn’t it inherit it? Ohhhhh because fish doesn’t use the /etc/profile.d directory.

So remember how I did this in order to get Flatpaks to show up in my start menu? I could have sworn they did at some point. It turns out that I was right. The flatpak package installs the scripts into /etc/profile.d, which fish doesn’t read. So when I switched my shell from Bash to fish a while ago, those scripts never ran at login.
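To make the failure mode concrete, here is a minimal Python sketch of building an XDG_DATA_DIRS value that stays valid when the variable was never set. It is an illustration, not the startup script described above. The function name is hypothetical, the fallback is the default the XDG Base Directory Specification gives for an unset or empty variable, and the two Flatpak export paths are assumptions about a typical install.

```python
import os

# Default from the XDG Base Directory Specification, used when the
# variable is unset or empty (the situation a fish login shell was in).
XDG_DEFAULT_DATA_DIRS = "/usr/local/share/:/usr/share/"


def data_dirs_with_flatpak(environ, home):
    """Return an XDG_DATA_DIRS value that includes Flatpak export dirs.

    `environ` is any mapping of environment variables and `home` is the
    user's home directory. Hypothetical helper for illustration only.
    """
    current = environ.get("XDG_DATA_DIRS") or XDG_DEFAULT_DATA_DIRS
    dirs = [d for d in current.split(":") if d]

    # Assumed Flatpak export locations: per-user, then system-wide.
    flatpak_dirs = [
        os.path.join(home, ".local/share/flatpak/exports/share"),
        "/var/lib/flatpak/exports/share",
    ]
    for path in flatpak_dirs:
        if path not in dirs:
            dirs.append(path)
    return ":".join(dirs)


if __name__ == "__main__":
    print(data_dirs_with_flatpak(os.environ, os.path.expanduser("~")))
```

The point of the fallback is that extending an undefined variable without it yields a value that drops the standard directories, which is consistent with the missing icon theme and service type errors in the journal.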

How I “fixed” it

To fix my problem, I could have written scripts that work with fish. Instead, I decided to take the easy route and change my shell back to bash. But in order to keep using fish, I set Konsole to launch fish instead of bash. Since I only ever do a graphical login on my desktop, that’s no big deal, and it avoids a lot of headache.

The bummer of it all is that I lost some of the configuration I had in the files I deleted. But apparently the failed logins made it far enough to modify the files in a way that Plasma doesn’t like. At any rate, I didn’t do much customization, so I didn’t lose much either.

Removing unmaintained packages from an installed system

Earlier this week, Miroslav Suchý proposed removing retired packages as part of Fedora upgrades (editor’s note: the proposal was withdrawn after community feedback). As it stands right now, if a package is removed in a subsequent release, it will stick around. For example, I have 34 packages on my work laptop from Fedora 28 (the version I first installed on it) through Fedora 31. The community has been discussing this, with no clear consensus.
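One rough way to get a count like that is to group installed packages by the Fedora dist tag in their release field. The sketch below is an illustration, not necessarily how the number above was produced. It assumes its input is the output of `rpm -qa --queryformat '%{NAME} %{RELEASE}\n'`, and the function names are hypothetical. An old dist tag only shows which release a package was last built for, so this is an upper bound on leftover packages, not a list of retired ones.

```python
import re
from collections import Counter

# Fedora dist tags look like ".fc31" inside the RPM release field.
DIST_TAG = re.compile(r"\.fc(\d+)")


def count_by_fedora_release(lines):
    """Count packages per Fedora release number.

    `lines` is an iterable of "NAME RELEASE" strings. Entries without
    a Fedora dist tag (for example gpg-pubkey) are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        match = DIST_TAG.search(parts[1])
        if match:
            counts[int(match.group(1))] += 1
    return counts


def built_before(counts, current_release):
    """Total packages whose dist tag is older than `current_release`."""
    return sum(n for release, n in counts.items() if release < current_release)


if __name__ == "__main__":
    sample = ["foo 3.fc28", "bar 1.fc31", "baz 2.fc32", "gpg-pubkey 5e3006fb"]
    counts = count_by_fedora_release(sample)
    print(dict(counts), built_before(counts, 32))
```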

I’m writing this post to explore my own thoughts. It represents my opinions as Ben Cotton: Fedora user and contributor, not as Ben Cotton: Fedora Program Manager.

What does it mean for a package to be “maintained”?

This question is the heart of the discussion. In theory, a maintained package means that there’s someone who can apply security and other bug fixes, update to new releases, etc. In practice, that’s not always the case. Anyone who has had a bug closed due to the end-of-life policy will attest to that.

The practical result is that as long as the package continues to compile, it may live on for a long time after the maintainer has given up on it. This doesn’t mean that it will get updates, it just means that no one has had a reason to remove it from the distribution.

On the other hand, the mere fact that a package has been dropped from the distribution doesn’t mean that something is wrong with it. If upstream hasn’t made any changes, the “unmaintained” version is just as functional as a maintained version would be.

What is the role of a Linux distribution?

Why do Linux distributions exist? After all, people could just download the software and build it themselves. That’s asking a lot of most people. Even among those who have sufficient technical knowledge to compile all of the different packages in different languages with different quirks, few have the time or desire to do so.

So a distribution is, in part, a sharing of labor. By dividing the work, we reduce our own burden and democratize access.

A distribution is also a curated collection. It’s the set of software that the contributors say is worth using, configured in the “right way”. Sure there are a dozen or so web browsers in the Fedora repos, but that’s not the entirety of web browsers that exist. Just as an art museum may have several similar paintings, a distribution might have several similar packages. But they’re all there for a reason.

To remove or not to remove?

The question of whether to remove unmaintained packages then becomes a balance between the shared labor and the curation aspects of a distribution.

The shared labor perspective supports not removing packages. If the package is uninstalled at update, then someone who relies on that package now has to download and build it themselves. It may also cause user confusion if something that previously worked suddenly stops, or if a package that exists on an upgraded system can’t be installed on a new one.

On the other hand, the curation perspective supports removing the package. Although there’s no guarantee that a maintained package will get updates, there is a guarantee that an unmaintained package won’t. Removing obsolete packages at upgrade also means that the upgraded system more closely resembles a freshly-installed system.

There’s no right answer. Both options are reasonable extensions of fundamental purposes of a distribution. Both have obvious benefits and drawbacks.

Pick a side, Benjamin

If I have to pick a side, I’m inclined to side with the “remove the packages” argument. But we have to make sure we’re clearly communicating what is happening to the user. We should also offer an easy opt-out for users who want to say “I know what you’re trying to do here, but keep these packages anyway.”