Licensing and open source communities

At FOSDEM 2014, Eileen Evans gave a talk entitled “Licensing Models and Building an Open Source Community“. The talk is basically a discussion how Evans changed her mind about the suitability of permissive licenses in vibrant open source communities. She proposes that a vibrant community requires excellent technology, suitable governance, and a license that the community perceives as fair.

A decade ago, Evans was working at Sun and considering what license to use for OpenSolaris. The decision at the time was that because copyleft licenses require downstream changes to be returned to the community (in the sense that they remain freely-licensed), copyleft licenses are necessary for a healthy community.

In the intervening years, many projects have adopted permissive licenses. The GPL family is no longer the majority license, according to several surveys. Vendor participating in open source projects favored strong copyleft until around 2006, but the preference has shifted toward permissive licenses. A survey of GitHub projects showed the MIT license with a dramatic lead over the next-most-widely-used license.

Based on this, Evans concluded that permissive licenses can, in fact, be used

Is that still true today? Projects are increasingly using permissive licenses. MIT dominates GitHub. Vendor engagement (participation in projects) was toward strong copyleft until ~2006 when permissive licenses take over. 5x increased in contributors to CloudStack after changing from copyleft to permissive. Permissive licenses may be used to build a community.

Of course, there are few who would take the position these days that permissive licenses can’t be used. Even noted copyleft advocate Bradley Kuhn can be heard agreeing on the video, though he points out his view that copyleft licenses make for better communities. Perhaps the question should be phrased as “what kind of communities develop?”

In conducting research for my thesis, I came across a study that showed copyleft licenses were associated with higher user engagement, but permissive licenses were associated with higher developer engagement. This makes sense, since not all developers develop FLOSS. A developer who isn’t developing FLOSS would probably be more drawn to a project where the license was conducive to proprietary downstreams.

Evans’ anecdote about the increase in contributions to CloudStack when it switched from copyleft to permissive licensing may or may not tell us something. It may be purely coincidental. An increase in the popularity of the project or of cloud computing generally may have driven the change. And of course, there’s more to a community than the number of committers.

I suspect that the license itself may be less important than the overall governance model. It’s certainly an area that merits further research.

How do you measure software quality?

There are two major license types in the free/open source software world: copyleft (e.g. GPL) and permissive (e.g. BSD). Because of the different legal ramifications of the licenses, it’s possible to make theoretical arguments that either license would tend to produce higher quality software. For my master’s thesis, I would like to investigate the quality of projects licensed under these paradigms, and whether there’s a significant difference. In order to do this, I’ll need some objective mechanism for measuring some aspect(s) of software quality. This is where you come in: if you have any suggestions for measures to use, or tools to get these measures, please let me know. It will have to be language-independent and preferably not rely on bug reports or other similar data. Operating on source would be preferable, but I have no objections to building binaries if I have to.

The end goal (apart from graduating) is to provide guidance for license selection in open source projects when philosophical considerations are not a concern. I have no intention or desire to turn this into a philosophical debate on the merits of different license types.

CNET considered harmful

In my younger days, I made great use of CNET’s website. It was an excellent tool for finding legal software. Apparently, it has also become an excellent tool for finding malware. An article posted to describes how CNET has begun wrapping packages with an installer that bundles unwanted, potentially malicious software with the desired package.

This is terrible, and not just for the obvious reasons. It’s bad for the free software community because it makes us look untrustworthy. There’s a perception among some people (especially in the business world) that software can only be free if it’s no good. I suppose that’s one reason some in the community use “libre” to emphasize the free-as-in-freedom aspect. (Of course, not all free-as-in-beer software is free-as-in-freedom. That’s another reason the distinction can be important.)

When this conveniently-bundled malware causes problems for users, it’s not CNET who gets the blame. Users will unfairly blame the package developer, even though the developer had nothing to do with it. For well-established and well-respected packages like nmap, this reputation damage may not be that important. For a new project just getting started — or for the idea of free software in general — this can be devastating.