You are responsible for (thinking about) how people use your software

Earlier this week, Marketplace ran a story about Michael Osinski. You probably haven’t heard of Osinski, but he played a role in the financial crisis of 2008. Osinski wrote software that made it easier for banks to package loans into a tradable security. These “mortgage-backed securities” played a major role in the collapse of the financial sector ten years ago.

It’s not fair to say that Osinski is responsible for the Great Recession. But it is fair to say he did not give sufficient consideration to how his software might be (mis)used. He told Marketplace’s Eliza Mills:

Most people realized that we wrote a good piece of software that we sold in the marketplace. How people use that software is … you know, you really can’t control that.

Osinski is right that he couldn’t control how people used the software he wrote. Whenever we release software to the world, it will get used how the user wants to use it, even if the license prohibits certain fields of endeavor. This could be innocuous misuse, the way graduate students design conference posters in PowerPoint or businesspeople use Excel for all conceivable tasks. But it could also be malicious misuse, the way Russian troll farms use social media to spread false news or sow discord.

So when we design software, we must consider how actual users, both benevolent and malign, will use it. To the degree we can, we should mitigate abuse or at least provide users a way to defend themselves from it. We are long past the point where we can pretend technology is amoral.

In a vacuum, technological tools are amoral. But we don’t use technology in a vacuum. The moment we put it to use, it becomes a multiplier for both good and evil. If we want to make the world a better place, we cannot pretend it will happen on its own.

“You’ve been hacked” corrects behavior

Part of running a community means enforcing community norms. This can be an awkward and uncomfortable task. I recently saw a Tweet that suggests it might be easier than you thought:

It’s nice because it’s subtle and gives people a chance to self-correct. On the other hand, there’s some value in letting community members (and potential community members) see enforcement actions. Not as a punitive measure, but as a signal that you take your code of conduct seriously.

This won’t work for every case, but I do like the idea as a response to the first violation, so long as it’s a minor violation. Repeated or flagrant violation of the community’s code of conduct will have to be dealt with more strongly.

Twitter interactions are not a polling mechanism

Way back in the day, clever Brands tried to conduct Twitter polls by saying “retweet for the first choice and favorite (now like) for the second choice.” This was obviously very prone to bias. The first choice’s fans will spread the poll, so virality favors the first option. But it was also the best choice available, other than linking to an external poll site (which means a much lower interaction rate).

Then Twitter introduced native polls. Now you can post a question with up to four answers. It even makes a nice bar chart of the results. Twitter interactions are not a polling mechanism, so why are you using them?!

The answer lies in the word “interaction”. Social media interactions are a way for Brands to measure the success of their social media efforts. Conducting polls via interactions instead of the native polling mechanism is a cheap way to drive up interactions. It’s a good indication that you’re not interested in the answers. People who want actual answers can use polls.

This concludes today’s episode of “Old man yells at cloud”.

Date-based conditional formatting in Google Sheets

Sometimes a “real” project management tool is too heavy. And spreadsheets may be the most-abused software tool. So if you want to track the status of some tasks, you might want to drop them into a Google spreadsheet.

You have a spreadsheet with four columns: task, due, completed, and owner. For each row, you want that row to be formatted strikethrough if it’s complete, highlighted in yellow if it’s due today, and highlighted in red if it’s overdue. You could write conditional formatting rules for each row individually, but that sounds painful. Instead, we’ll use a custom formula.

Apply each of the following rules to the range A2:D.

The first rule will strike out completed items. We’ll base this on whether or not column C (completed) has content. The custom formula is =$C:$C<>"". Set the formatting style to Custom, clear the color fill, and select strikethrough.

The second rule will highlight overdue tasks. We only want to highlight incomplete overdue tasks. If it’s done, we stop caring if it was done on time. So we need to check that the due date (column B) is before today and that the completion date (column C) is blank. The rule to use here is =AND($C:$C="",$B:$B<today(),$A:$A<>""). The check on column A keeps rows with no task from being highlighted. Here, you can select the “Red highlight” style.

Lastly, we need to highlight the tasks due today. Like with the overdue tasks, we only care if they’re not done. The rule to use here is =AND($C:$C="",$B:$B=today(),$A:$A<>""). This time, use the “Yellow highlight” style.
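If it helps to see the three rules side by side, here is a rough Python sketch of the same decision logic. It is not anything Google Sheets runs. The function name row_style and the style labels are made up for illustration. One assumption to flag: the sketch leaves a row with no due date unformatted, which may not match how Sheets compares a blank cell against today().

```python
from datetime import date
from typing import Optional


def row_style(task: str, due: Optional[date], completed: Optional[date],
              today: date) -> str:
    """Return the label of the rule that matches one row of the sheet.

    Hypothetical helper for illustration only. The labels are not
    Sheets API values.
    """
    # Rule 1: anything in the "completed" column means strikethrough.
    if completed is not None:
        return "strikethrough"
    # Rules 2 and 3 both require a task name. Assumption: a missing
    # due date is treated as "no formatting" in this sketch.
    if not task or due is None:
        return "none"
    # Rule 2: incomplete and due before today.
    if due < today:
        return "red"
    # Rule 3: incomplete and due today.
    if due == today:
        return "yellow"
    return "none"


if __name__ == "__main__":
    today = date(2018, 6, 10)
    print(row_style("Write report", date(2018, 6, 1), None, today))
    print(row_style("Book travel", date(2018, 6, 10), None, today))
```

Because the red and yellow rules both require a blank completion date, a finished task can only ever match the strikethrough rule, no matter when it was due.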

And that’s it. You can fill in as many tasks as you’d like and get the color coding populated automatically. I created an example sheet for reference.

Microsoft bought GitHub. Now what?

Last Monday, a weekend of rumors proved to be true. Microsoft announced plans to buy code-hosting site GitHub for $7.5 billion. Microsoft’s past, particularly before Satya Nadella took the corner office a few years ago, was full of hostility to open source. “Embrace, extend, extinguish” was the operative phrase. It should come as no surprise, then, that many projects responded by abandoning the platform.

But beyond the kneejerk reaction, there are two questions to consider. First: can open source projects trust Microsoft? Second: should open source (and free software in particular) projects rely on corporate hosting?

Microsoft as a friend

Let’s start with the first question. With such a long history of active assault on open source, can Microsoft be trusted? Understanding that some people will never be convinced, I say “yes”. Both from the outside and from my time as a Microsoft employee, it’s clear that the company has changed under Nadella. Microsoft recognizes that open source projects are not only complementary, but strategically important.

This is driven by a change in the environment that Microsoft operates in. The operating system is less important than ever. Desktop-based office suites are giving way to web-based tools for many users. Licensed revenue may be the past and much of the present, but it’s not the future. Subscription revenue, be it from services like Office 365 or Infrastructure-as-a-Service offerings, is the future. And for many of these, adoption and consumption will be driven by open source projects and the developers (developers! developers! developers! developers!) that use them.

Microsoft’s change of heart is undoubtedly driven by business needs, but that doesn’t make it any less real. Jim Zemlin, Executive Director at the Linux Foundation, expressed his excitement, implying it was a victory for open source. Tidelift ran the numbers to look at Microsoft’s contributions to non-Microsoft projects. Their conclusion?

…today the company is demonstrating some impressive traction when it comes to open source community contributions. If we are to judge the company on its recent actions, the data shows what Satya Nadella said in his announcement about Microsoft being “all in on open source” is more than just words.

And in any acquisition, you should always ask “if not them, then who?” CNBC reported that GitHub was also in talks with Google. While Google may have a better reputation among the developer community, I’m not sure they’d be better for GitHub. After all, Google had Google Code, which it shut down in 2016. Would a second attempt in this space fare any better? Google Code had a two-year head start on GitHub, but it languished.

As for other major tech companies, this tweet sums it up pretty well:

Can you trust anyone to host?

My friend Lyz Joseph made an excellent point on Facebook the day the acquisition was announced:

Unpopular opinion: If you’re an open source project using GitHub, you already sold out. You traded freedom for convenience, regardless of what company is in control.

People often forget that GitHub itself is not open source. Some projects have avoided hosting on GitHub for that very reason. Even though the code repo itself is easily mirrored or migrated, that’s not the real value in GitHub. The “social coding” aspects (the issues, fork tracking, wikis, ease of pull requests, etc.) are what make GitHub valuable. Chris Siebenmann called it “sticky in a soft way.”

GitLab, at least, offers a “community edition” that projects can self-host. In a fantasy world, each project would run its own infrastructure, perhaps with federated authentication for ease of use when you’re a participant in many projects. But that’s not the reality we live in. Hosting servers costs money and time. Small projects in particular lack both of those. Third-party infrastructure will always be attractive for this reason. And as good as competition is, having a dominant social coding site is helpful to users in the same way that a dominant social network is simpler: network effects are powerful.

So now what?

The deal isn’t expected to close for a while, and Microsoft plans to seek regulatory approval, which will not speed the process. Nothing will change immediately. In the medium term, I don’t expect much to change either. Microsoft has made it clear that it plans to run GitHub as a fairly autonomous business (the way it does with LinkedIn). GitHub gets the stability that comes from the support of one of the world’s largest companies. Microsoft gets a chance to improve its reputation and an opportunity to make it easier for developers to use Azure services.

Full disclosure: I am a recent employee of Microsoft and a shareholder. I was not involved in the acquisition and had no inside knowledge pertinent to the acquisition or future plans for GitHub.

You’re an SEO company

Business owners, regardless of their industry, often view themselves in terms of what their business does. “We’re a bookstore, a coffee shop, a web design company,” or whatever goods or services customers pay money for. But a recent conversation made me realize that most small businesses in a mature market are really search engine optimization (SEO) companies.

Okay, there are a few caveats here. I’m thinking of mature markets as fields where there are many small or small-ish players that are attempting to serve a large number of users. Think generally of the early and late majority sections of the technology adoption life cycle. Ride sharing, for example, is out of scope. It’s pretty solidly in the middle of the bell curve, but it has three players: Uber, Lyft, and everyone else.

The subject of the conversation was a VPN service. A friend was using VPN software and observed that it would be easy to share his server with others for a fee. All the other challenges of running a business aside, I immediately asked what his differentiation is.

VPN services may not be mainstream exactly, but the market is mainstream enough. And there are a lot of players with no one particularly dominant. So how does a new entry set itself apart? There’s a little bit of room to differentiate on price, location, service, etc, but not much. So the best way to differentiate and get new customers is to be better at search engine optimization than the rest of the field.

In essence, making a business successful requires skills entirely unrelated to the business itself. When you can’t easily differentiate your product, you have to differentiate your marketing.

Book review: Habeas Data

What does modern technology say about you? What can the police or other government agencies learn? What checks on their power exist? These questions are the subject of a new book from technology reporter Cyrus Farivar.

Habeas Data (affiliate link) explores the jurisprudence that has come to define modern privacy law, drawing on interviews with lawyers, police officers, professors, and others who have shaped the precedent. What makes this such an interesting subject is the very nature of American privacy law. Almost nothing is explicitly defined by legislation. Instead, legal notions of privacy come from how courts interpret the Fourth Amendment to the United States Constitution. This gives government officials the incentive to push as far as they can in the hopes that no court cases arise to challenge their methods.

For the first two centuries or so, this served the republic fairly well. Search and seizure were constrained to the physical realm. Technological advances did little to improve the efficiency of law enforcement. This started to change with the advent of the telegraph and then the telephone, but it’s the rapid advances in computing and mobility that have rendered this unworkable.

As slow as legislatures can be to react to technological advances, courts are even slower. And while higher court rulings have generally been more favorable to a privacy-oriented view, not everyone agrees. The broad question that courts must grapple with is which matters more: the practical effects of the technology changes or the philosophical underpinnings?

To his credit, Farivar does not claim to have an answer. Ultimately, it’s a matter of what society determines is the appropriate balance between individual rights and the needs of the society at large. Farivar has his opinions, to be sure, but Habeas Data does not read like an advocacy piece. It is written by a seasoned reporter looking to inform the populace. Only by understanding the issues can the citizenry make an informed decision.

With that in mind, Habeas Data is an excellent book. Someone looking for fiery advocacy will likely be disappointed, but for anyone looking to understand the issue, it’s a great fit. Technology law and ethics courses would be well-advised to use this book as part of the curriculum. It is deep and well-researched while still remaining readable.

It has its faults, too. The flow of chapters seems a little haphazard at times. On the other hand, they can largely be treated as standalone studies on particular issues. And the book needed one more copy editing pass. I saw a few typographic errors, which is bound to happen in any first-run book, but was jarred by a phrase that appeared to have been accidentally copy/pasted in the middle of a word.

None of this should be used as a reason to pass on this book. I strongly recommend Habeas Data to anyone interested in the law and policy of technology, and even more strongly to those who aren’t interested. The shape that privacy law takes in the next few years will have impacts for decades to come.

Google Duplex and the future of phone calls

For the longest time, I would just drop by the barber shop in the hopes they had an opening. Why? Because I didn’t want to make a phone call to schedule an appointment. I hate making phone calls. What if they don’t answer and I have to leave a voicemail? What if they do answer and I have to talk to someone? I’m fine with in-person interactions, but there’s something about phones. Yuck. So I initially greeted the news that Google Duplex would handle phone calls for me with great glee.

Of course it’s not that simple. A voice-enabled AI that can pass for human is ripe for abuse. Imagine the phone scams you could pull.

I recently called a local non-profit that I support to increase my monthly donation. They did not verify my identity in any way. So that’s one very obvious way to cause mischief. I could also see tech support scammers using this as a tool in their arsenal, if not to actually conduct the fraud then to pre-screen victims so that humans only have to talk to likely victims. It’s efficient!

Anil Dash, among many others, pointed out the apparent lack of consent in Google Duplex:

The fact that Google inserted “um” and other verbal placeholders into Duplex makes it seem like they’re trying to hide the fact that it’s an AI. In response to the blowback, Google has said it will disclose when a bot is calling:

That helps, but I wonder how much abuse consideration Google has given this. It will definitely be helpful to people with disabilities that make using the phone difficult. It can be a time-saver for the Very Important Business Person™, too. But will it be used to expand the scale of phone fraud? Could it execute a denial of service attack against a business’s phone lines? Could it be used to harass journalists, advocates, abuse victims, etc?

As I read news coverage of this, I realized that my initial reaction didn’t consider abuse scenarios. That’s one of the many reasons diverse product teams are essential. It’s easy for folks who have a great deal of privilege to be blind to the ways technology can be misused. I think my conclusion is a pretty solid one:

The tech sector still has a lot to learn about ethics.

I was discussing this with some other attendees at the Advanced Scale Forum last week. Too many computer science and related programs do not require any coursework in ethics, philosophy, etc. Most of computing has nothing to do with computers, but instead with the humans and societies that the computers interact with. We see the effects play out in open source communities, too: anything that’s not code is immediately devalued. But the last few years should teach us that code without consideration is dangerous.

Ben Thompson had a great article in Stratechery last week comparing the approaches of Apple and Microsoft versus Google and Facebook. In short: Apple and Microsoft are working on AI that enhances what people can do while Google and Facebook are working on AI to do things so people don’t have to. Both are needed, but the latter would seem to have a much greater level of ethical concerns.

There are no easy answers yet, and it’s likely that in a few years tools like Google Duplex will not even be noticeable because they’ve become so ubiquitous. The ethical issues will be addressed at some point. The only question is if it will be proactive or reactive.

LISA wants you: submit your proposal today

I have the great honor of being on the organizing committee for the LISA conference this year. If you’ve followed me for a while, you know how much I enjoy LISA. It’s a great conference for anyone with a professional interest in sysadmin/DevOps/SRE. This year’s LISA is being held in Nashville, Tennessee, and the committee wants your submission.

As in years past, LISA content is focused on three tracks: architecture, culture, and engineering. There’s great technical content (one year I learned about Linux filesystem tuning from the guy who maintains the ext filesystems), but there’s also great non-technical content. The latter is a feature more conferences need to adopt.

I’d love to see you submit a talk or tutorial about how you solve the everyday (and not-so-everyday) problems in your job. Do you use containers? Databases? Microservices? Cloud? Whatever you do, there’s a space for your proposal.

Submit your talk to https://www.usenix.org/conference/lisa18/call-for-participation by 11:59 PM Pacific on Thursday, May 24. Or talk one of your coworkers into it. Better yet, do both! LISA can only remain a great conference with your participation.