Book review: Range

In many parts of society, we ask people to specialize early and go very deep. This is the path to excellence. In Range: why generalists triumph in a specialized world, David Epstein examines the role breadth plays. I should admit my bias up front: I am definitely a width person, not a depth person. So maybe I just agreed with this book because it reinforced the story I tell myself about my success.

But I do think there’s something to this. Throughout my career, I’ve found that the best colleagues are the ones who have academic or work experience outside of the tech industry. It’s not that they’re necessarily better technically, but they grasp the context much more easily. That becomes increasingly important when dealing with novel and poorly-defined problems.

I’ve long understood the value of coursework outside one’s major. Range helped me understand why that value exists. I sometimes heard at my alma mater that “we have a liberal arts school so we can produce well-rounded engineers.” Now I think perhaps we should have fewer major courses and more gen ed courses. (In addition to ethics classes which should be added to all curricula for separate reasons.)

In the context of the current time, with conspiracy theories enjoying a disturbing degree of acceptance, I find Epstein’s emphasis on amateurs a little concerning. Yes, novices sometimes make discoveries that elude the experts. Still, we must be careful not to replace “appeal to authority” with “appeal to lack of authority”.

I didn’t find Epstein’s writing style particularly compelling. This surprised me since he’s a journalist. I suppose books are a different beast. But the arguments were well-reasoned and supported by research. I would recommend this book to anyone thinking about their future career or seeking reinforcement of their past, seemingly odd changes in direction.

On the “commercial side of cancel culture”

Writing on his blog last week, Evan Brown said about “morals clauses” in contracts:

These clauses provide the means for the commercial side of cancel culture to flourish.

Evan Brown

Evan is a fellow 812 native and a person for whom I have tremendous respect. I wouldn’t think to argue with him on a matter of law, but this is a matter of culture, so I will.

We’re starting from different points. Evan clearly believes that “cancel culture” is a real concern. I don’t. I’ll grant that there is a possible extreme that would be a problem, but I do not believe we are there or that we are approaching it. What is called “cancel culture” is often “facing accountability for one’s improper actions.”

To that end, I proposed that the “commercial side of cancel culture” could also be called “the free market”. This specific case—Jeep pulling use of an ad featuring Bruce Springsteen after news came out that he recently had a DUI arrest—is a particularly bad example to use to make the point, too. First of all, it is very reasonable that a vehicle company would want to dissociate from someone who recently had a drunk driving arrest. Secondly, the harm is not on the person “canceled” (to the degree that Bruce Springsteen can be canceled).

Springsteen presumably got paid already (although there may be a clawback clause or a payment structure that has not fully completed yet). He doesn’t need the money or the fame. Meanwhile, Jeep produced an ad and purchased air time for it. They probably planned for a much longer run over which to recoup their expenses. I’m not suggesting we feel sorry for Jeep, but I also don’t think we need to shed any tears for The Boss.

Why newsletters are email not RSS

Some friends were recently discussing newsletters and one raised the question of why newsletters are done (largely) as email instead of blog posts shared via RSS. I’m going to answer that question in this post. Some of the answers are my own reasoning for sending Newsletter Fiasco as an email. Other answers are what I know or reasonably assume are the motivations for other newsletter senders. And, yes, many newsletters are also available via RSS, even if that’s not the intended distribution mechanism.

Email is universal

Approximately everyone who might want to read your newsletter has an email address. For all its shortcomings, email is the best example of decentralized, standards-driven digital communication. RSS, especially post-Google-Reader, tends to skew nerdy. Many of my tech enthusiast friends use RSS readers of some kind. Most of my other friends don’t. Social media platforms have supplanted RSS for a lot of people. If you’re distributing via RSS, you’ve already narrowed your potential audience quite a bit.

Email can wait

I won’t pretend that my usage of RSS is generalizable to all RSS users, but here’s how I use RSS. Mostly, I use the Feedly widget in my browser to tell me when I have unread items. A few times a day, I scan through the unread items and open the ones that I want to read. Then I mark the rest as read. I may not read the open tabs right away, but I generally do it in short order. RSS, then, is an “I’ll read it now or I’ll read it never” proposition. And the longer I go between checking my feeds, the lower the percentage of articles I’ll read.

On the other hand, I might leave a newsletter unread in my email inbox for a few days. This is particularly true for The Sunday Long Read, which is full of great articles that probably require more than a few minutes to read. Sometimes I’ll let a couple of them pile up before I have a chance to sit down and look at them. That doesn’t work well with how I consume RSS.

Email can be forwarded

Forwarding is a key part of the email experience. This is bad when it’s an unhinged conspiracy from a relative (although I only get those via Facebook Messenger these days), but good when you want to share a newsletter you liked. And because it’s universal you can share it with anyone easily (as opposed to sharing on Twitter and Facebook and LinkedIn and … ).

Email feels more personal and direct

Readers understand that the newsletter isn’t written directly to them in particular. But because it comes to their inbox, it can feel more personal. Plus, many newsletter platforms allow for personalization. You can greet all your readers by their name. Or give them stats about how close they are to earning the next swag item by sharing their unique referral code with friends.

Email can be tracked

As a newsletter reader, you probably don’t love this one. But as a newsletter writer, it can be incredibly valuable. A lot of people who write newsletters are doing it in service of a #brand, either their personal brand or a professional brand. This means it’s important to know not only how many people read the content, but who. And while this may feel a little icky, I argue that it’s way less icky than web cookie tracking. It’s a compromise level of icky.

For Newsletter Fiasco, I don’t look at the stats. I have no idea what my open and click rates are. I have never looked to see who is clicking what links. Let’s be honest, I started my newsletter because I wanted to be cool like the other people who had newsletters. That anyone reads it is always a welcome surprise.

But when I worked in marketing, it was important to know who was clicking what links. If they were current customers, it was just nice to see they liked us enough to pay us and also read our newsletter. But for potential customers, seeing what items from our news roundup interested them helped our sales team make the pitch that mattered to them specifically. If they only ever clicked articles about GCP, why waste time telling them about our AWS-specific features? If nobody ever clicked the links about job schedulers, we’d stop putting them in the newsletter.
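As a rough illustration of that kind of analysis, here is a minimal Python sketch. The click log, the email addresses, and the topic list are all made up for the example; a real newsletter platform would export this data in its own format.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (subscriber, topic of the link they clicked).
clicks = [
    ("alice@example.com", "GCP"),
    ("alice@example.com", "GCP"),
    ("bob@example.com", "AWS"),
    ("bob@example.com", "GCP"),
]

# Hypothetical set of topics that appeared in the newsletter.
newsletter_topics = {"GCP", "AWS", "job schedulers"}

# Tally clicks per subscriber so sales can see what each person cares about.
per_subscriber = defaultdict(Counter)
for subscriber, topic in clicks:
    per_subscriber[subscriber][topic] += 1

# Tally clicks per topic across all subscribers.
topic_totals = Counter(topic for _, topic in clicks)

# Topics nobody clicked are candidates to drop from future issues.
never_clicked = newsletter_topics - set(topic_totals)

print(dict(per_subscriber["alice@example.com"]))
print(sorted(never_clicked))
```

In this toy data, the first subscriber only ever clicks GCP links, and the job scheduler links get no clicks at all.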

Even unsubscribes can give you useful information. Many unsubscribe pages offer an optional one-question survey: why are you unsubscribing? If someone stops visiting your blog, all you know is that they’re not visiting anymore. Well, you know that the views are down, assuming the person who left isn’t offset by a new reader. That churn number can be informative, too.

This is what a “newsletter” is

There’s probably some amount of “this is how it’s always been” here, too. Newsletters were a thing you printed and sent to people in the analog era, so that’s what they are in the digital era, too. A newsletter distributed via blog is called a blog. In that sense, the name “newsletter” is more about the distribution mechanism than the content. A good example of this is Jim Grey’s weekly “Recommended Reading” blog post. The content could easily be a newsletter, except it’s not because it’s a blog post.

Are these good reasons?

I leave that up to you, Dear Reader. I won’t claim that any of these reasons are particularly good or bad. They’re just the reasons the person producing the newsletter would use email instead of a blog.

Should we treat OSD compliance as a binary?

So often, we think about whether a software license complies with the Open Source Definition (OSD) as a binary: it complies or it doesn’t. But the OSD has 10 criteria. If a license complies with all except for one of those criteria, it’s non-compliant, but is it non-compliant in the same way as a license that doesn’t comply with four criteria?

I got to thinking about this as I tried to come up with names for the four quadrants in Tobie Langel’s license classification chart. It occurred to me that the bottom half represented two concepts: not explicitly OSD-compliant because it was never submitted and explicitly not OSD-compliant because it violates one or more criteria.

A diagram of the open source landscape considering licenses and norms. Created by Tobie Langel and used under CC BY-SA 4.0.

There must be 50 ways to violate the OSD

Knowing how many (and which) criteria a non-compliant license meets is important. I argue that not allowing derived works is far more important to the idea of “open source in spirit” than not restricting other software by requiring all software distributed alongside it be free.

To add even more complication, not all violations of the same criteria are equal. A license that restricts users from hunting humans for sport would be seen more favorably than a license that restricts users from making ice cream.

Saying a license is OSD-compliant tells us something. Saying it is non-compliant tells us nothing. I don’t know if there’s a succinct way to express the 1,024 possible ways a license could be non-compliant. Certainly there is not if you also include the specific reasoning.

As I showed above, saying a license is 90% compliant is not particularly useful if the 10% is really important to you. And not all 90%s are created equal. It doesn’t make sense to put the criteria on a spectrum and describe the license by how far along it gets. Again, the violation may or may not matter for your purposes. And how can we say which criteria are most important in a way that will garner any sort of widespread support?
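The counting can be sketched in a few lines of Python. This assumes each of the ten criteria is a simple pass/fail flag, which is a simplification (as noted above, not all violations of the same criterion are equal). Ten flags give 2^10 = 1,024 patterns, one of which is full compliance. The variable names are only for illustration.

```python
from itertools import product

# The OSD has 10 criteria; model each one as a pass/fail flag (a simplification).
NUM_CRITERIA = 10

# Every possible pass/fail pattern across the criteria.
patterns = list(product([True, False], repeat=NUM_CRITERIA))
total_patterns = len(patterns)

# Exactly one pattern passes everything; every other pattern is non-compliant.
non_compliant = [p for p in patterns if not all(p)]

# Patterns that fail exactly one criterion: each is "90% compliant",
# but each fails a different criterion.
ninety_percent = [p for p in non_compliant if sum(p) == NUM_CRITERIA - 1]

print(total_patterns, len(non_compliant), len(ninety_percent))
# prints: 1024 1023 10
```

The ten distinct “90% compliant” patterns are the point: the score is the same, but which criterion fails is what matters for your purposes.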

It may be possible to group the criteria into two or three broader categories. I’m not entirely sure that would be easy to express—certainly not in a simple chart.

Do we care?

And then there’s the question of whether that even matters. I wrote last week’s “free and open source software is not the end goal” post as I thought about this question. From an intellectual property law standpoint, OSD compliance matters. (In that it gives you at least a broad idea of what you’re working with.) From a “why the hell am I writing this software to begin with?” standpoint, I’m not sure that it does.

We’re back to the beginning. If the goal is to write software that advances the state of humanity, you may choose a license that is explicitly not OSD-compliant because you don’t want it used for nefarious purposes. That’s a valid choice, although a very complicated one. Is it reasonable to lump that in with all of the other non-compliant licenses? The answer depends on your context.

There is no easy answer. Tobie’s other axis (follows norms) is also messy. Even more, probably, because there’s no defined standard to measure against. Perhaps for this purpose we continue to treat it as a binary after all. The model can show which quadrant a project falls in; understanding why is left as an exercise to the reader.

Refining the model to account for all (okay, some) of the complexities I’ve discussed would make an excellent dissertation topic for an aspiring PhD student.

Free and open source software is not the end goal

When I first started thinking about this article, the title was going to be “I don’t care about free software anymore.” But I figured that would be troll bait and I thought I should be a little less spicy. It’s true in a sense, though. I don’t care about free/open source software as an end goal.

The Free Software Foundation (FSF) says “free software is about having control over the technology we use in our homes, schools and businesses”. The point isn’t that the software itself is freely licensed; it’s about what the software license permits or restricts. I used to think that free software was a necessary-but-insufficient condition for users having control over their computing. I don’t think that’s necessarily the case anymore.

Why free software might not matter

Software isn’t useful until someone uses it. So we should evaluate software in that context. And most software use these days involves 1. data and 2. computers outside the user’s control. We’ll get back to #2 in a moment, but I want to focus on the data. If Facebook provided the source code to their entire stack tomorrow—indeed, if they had done it from the beginning—that would do nothing to prevent the harms caused by that platform. One, it does nothing to diminish the “joys” of spreading disinformation. Two, it would be no guarantee that something else isn’t reading the data.

While we were so focused on the software, we essentially ignored the data. Now, the data is just as important as the software, if not more so. There are plenty of examples of this in my talk “We won. Now what?” presented at DevConf.CZ (25 minutes) and DevConf.US (40 minutes) last year. Being open is no guarantee of data protection, just as being proprietary is no guarantee of data harm.

We’ll always use other people’s computers

Let’s return to the “computers outside the user’s control” point. There’s a lot of truth to the “there is no cloud, there’s only other people’s computers” argument. And certainly if everyone ran their own services, that would reduce the risk of harm.

But here in the real world, that’s not going to happen. Most people cannot run their own software services—they have neither the skill nor the resources. Among those who do, many have no desire to. Apart from the impossibility of people running their own services, there’s the fact that communication means that the information lives in two places, so you’re still using someone else’s computer.

It’s all very complicated

There’s also the question of whether or not the absolutist view of software freedom is the right approach. The free software movement seems to be very libertarian in nature: if each user has freedom over their computing, that is a benefit to everyone. Others would argue (as the Ethical Source movement has) that enabling unethical uses of software is harmful. These two positions are at odds.

Whether or not you think the software license is the appropriate place to address this issue, I suspect many, if not most, developers would prefer that their software not be used for evil purposes. In order to enforce that, the software becomes non-free.

This is a complicated issue, with no right answer and no universal agreement. I don’t know what the way forward is, but I know that we cannot act like free software is the end goal. If we want to get the general public on board, we have to convince them in terms that make sense to their values and concerns, not ours. We must make software that is useful and usable in addition to being free. And we must understand that people choosing non-free software is not a moral failing but a decision to optimize for other values. We must update our worldview to match the 2020s; the 1990s are not coming back.

Indiana COVID-19 update: 5 February 2021

On Wednesday, the Indianapolis Star reported that a state audit discovered just over 1,500 “missing” COVID-19 deaths. These deaths were added to the Indiana State Department of Health’s dashboard on Thursday. The state “snuck” them in, not including them in the “newly reported” deaths for that day’s update. Fortunately, I had the data before and after and was able to produce some information on my dashboard.

It wasn’t as good as we thought

“Missing” COVID-19 deaths by day

The missing deaths stretch as far back as early April, but the bulk came in November through January. This is also when the overall death rate was the highest. On the whole, approximately 15% of COVID-19 deaths were not included on the state’s dashboard prior to February 4. But on 48 days, the missing deaths exceeded 20%. On December 18, 31 deaths (29% of the total) were missing. Instead of having a peak death count of 97, we’ve instead exceeded 100 deaths on several days with a peak of 118.
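For anyone who wants to check the arithmetic, here is a small Python sketch using only the December 18 figures quoted above. The implied counts are backed out from the rounded 29%, not read from the state’s data, so treat them as approximate. The function name is just for this example.

```python
def share_missing(reported: int, corrected: int) -> float:
    """Fraction of the corrected death count that was absent from the original report."""
    return (corrected - reported) / corrected

# December 18: 31 missing deaths, stated as 29% of that day's total.
missing = 31
stated_share = 0.29

# Back out the implied corrected total and the originally reported count.
# These are inferred from a rounded percentage, so they are approximate.
implied_corrected = round(missing / stated_share)
implied_reported = implied_corrected - missing

print(implied_corrected, implied_reported)
print(round(share_missing(implied_reported, implied_corrected), 2))
```

In other words, a day that first appeared to have roughly 76 deaths would have had roughly 107 after the correction.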

COVID-19 deaths per day before and after the “missing” deaths were included

I wrote in the last update that I thought deaths were missing, particularly given the abrupt drop in December. It turns out that I was more right than I could have imagined. “I’m not trying to sound like a conspiracy theorist,” I wrote. “I don’t think there was any malfeasance.” I’m trying very hard to continue believing that.

At the very least, this represents appalling incompetence. This isn’t just a problem for making graphs. The death toll of this pandemic is serious. Losing 15% of the deaths is not only disrespectful to the dead and their families, but it robs decision-makers of reliable data. What decisions would have been made differently if we had known the true death toll?

Of course, we may never be sure of the true death toll, particularly early in the pandemic. At the time, testing was scarce. I’ve heard anecdotes from several reliable friends of loved ones not getting tested after death. We can compare 2020’s overall deaths to previous years, but that will not be definitive.

The future

The good news is that the overall numbers continue to trend in the right direction. Yesterday, hospitalizations were below 1,500 for the first time since October 20. Deaths, new cases, hospitalizations, and positivity all continue to drop. Mask usage is up and mobility remains 20% below the baseline, per the Institute for Health Metrics and Evaluation (IHME). Perversely, the corrected death totals represent a positive of sorts: the recent model runs have proven more accurate than it appeared.

Observed and forecast COVID-19 deaths in Indiana by day

As best I can tell, IHME’s most recent model run did not include the adjusted death totals, so it will be interesting to see how much changes in the next update. The observed death trend is dropping at a faster rate than the models would suggest, but that may flatten a bit over the coming days. Still, the trends are encouraging.

Causes for concern

But all is not well. Although IHME’s latest model run does not show an increase in deaths through the end of May, they say some states will see that. But even more worrying, it appears some of the new variants may lead to reinfection in people who already have immunity.

The Novavax Phase III trial in South Africa placebo arm found that prior infection provided no protection from variant B.1.351. The implication of this finding is that herd immunity is only variant-specific; if this finding is confirmed in the Johnson & Johnson placebo arm data, our worse scenario is likely too optimistic.

IHME COVID-19 Policy Brief for the United States, 3 February 2021

With the next update, IHME will incorporate cross-variant reinfection into the model. I’ll continue to update my dashboard with the new model runs as they’re available.

Thoughts on Elastic License v2

Yesterday Elastic announced a revision to their not-great Elastic License. The Elastic License v2 was updated based on feedback from the community and apparently had a lawyer’s input. And while they seem to be backing off trying to imply that it is open source (because it decidedly is not), it still doesn’t seem like a good license.

First of all, it doesn’t comply with the Open Source Definition, so if that’s important to you, that’s all you need to know. I’m assuming if you’re reading this, you care about the license beyond that. And while I’m not a lawyer (so this is very much not legal advice), here are my thoughts: it’s vague! Seriously, the vagueness makes it a big risk whether or not you care about OSD compliance (and there are many reasons you might not, as I’ll discuss in an upcoming post).

The first line in the Limitations section reads thus:

You may not provide the software to third parties as a hosted or managed service, where the service provides users with access to any substantial set of the features or functionality of the software.

This contains two things I have questions about. First of all, what is a “managed service” exactly? Does that include consulting services where someone provides direct management of a customer’s software? I have a good idea of what “managed service” means in industry terms, but if a licensor using this license decides they don’t like what you’re doing, there’s enough vagueness there for them to cause you problems. And of course, if you want to use it in a Software-as-a-Service model, you can’t use it under this license. You can use it under the SSPL, of course, but that is a non-starter for a lot of users.

Secondly, what is a “substantial set of the features or functionality of the software”? If someone does their own implementation of the functionality, does that count? If someone develops additional code that extends the functionality of the software and the upstream project later adds that functionality, does the additional code now violate the license?

Another problem is that it treats “you” and “your company” as distinct entities. This doesn’t make a lot of sense to me. If I use software on behalf of my employer, the employer is the licensee. The “Patents” section contains the only uses of “your company” and says “[i]f your company makes such a claim, your patent license ends immediately for work on behalf of your company”, but that’s redundant because the license was always for my company, not for me.

Frankly, I don’t see why anyone would use this license, particularly now that Amazon has forked the project.

Other writing: January 2021

What have I been writing when I haven’t been writing here?

Stuff I wrote

Fedora

Stuff I curated

Fedora

I’m not over college sports, but it’s different

On Sunday, my friend Chris O’Donnell published a post titled “I’m over college sports”. Given the timing, I thought it was due to the fact that Purdue and Michigan held a men’s basketball game after a Purdue player tested positive for COVID-19 and before the entire Michigan athletic department shut down for two weeks due to a COVID-19 outbreak. But “COVID” doesn’t appear in his post at all. Instead, he raises other very valid concerns about the state of Division I sports.

I don’t disagree with any of his points. I suppose I choose to ignore them so I can keep enjoying the games. But not being able to go to games has certainly changed things for me. Purdue football has been an “I’ll watch if I have nothing else going on” thing for me most of the last decade. For much of that time, I’ve chosen to listen to the radio broadcasts while I do something more useful with my Saturday.

Basketball is another story. I’ve had season tickets to Purdue men’s basketball for a long time and am an…enthusiastic fan. I’ve re-arranged my calendar more than once to accommodate going to a basketball game. But now that all of the games are TV-only, I’ve found that I haven’t watched nearly as much as I would otherwise.

I can’t see myself not going back to basketball games once the option is available. But other sports consumption…who can say?

What does “open source” mean in 2021?

The licensing discourse in the last few weeks has highlighted a difference between what “open source” means and what we’re talking about when we use the term. Strictly speaking, open source software is software released under a license approved by the Open Source Initiative. In most practical usage, we’re talking about software developed in a particular way. When we talk about open source, we talk about the communities of users and developers, (generally) not the license. “Open source” has come to define an ethos that we all have our own definition of.
