Naming your files is important

I recently shared a Tweet about file names.

The inspiration for this was adding a new podcast to my podcatcher. For reasons that are mostly nerdy, I use bashpodder. I run it a couple of times an hour during my waking hours and stream or copy the files to whatever device I happen to be at. It’s a setup that works pretty well for me in general.

The downside is that all of the files get dumped into a directory by date. Some podcasts (e.g. Marketplace) do a good job of naming files: I know what show it is and when it’s from just by looking at the file name. Others use the network (e.g. “GLT” for Gimlet Media) and a string of numbers without any obvious meaning. The worst offender is the host where I get “The Greatest Generation” and Akimbo. Those shows have UUIDs as filenames.

I can understand why, on the backend, that is beneficial. The files themselves are just one part of (I assume) a database of shows. No human ever has to touch it, so you might as well name it in a way that minimizes the risk of a naming collision. But it’s extremely hostile to the user.

I suspect that most podcast listeners these days use an app and don’t directly download the files. But for those that do, sane file names are important. A friend asked about using just the date as the file name, as he apparently does for recordings from his church. That’s even worse, because it assumes that the listener saves them in a unique location.

When it comes to media that you intend for others to download, it’s vitally important to not make any assumptions about how they will store it. Maybe they save everything to their Downloads folder and never move it. If two separate items were produced on the same day and both are named for the date alone, one of them will potentially get overwritten. That’s probably not what you want to happen.
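As a rough illustration of what a collision-resistant, human-readable name could look like, here is a minimal Python sketch. The function name episode_filename, its parameters, and the show-date-title layout are all my own assumptions for the example, not something any podcast host actually uses.

```python
import re
from datetime import date


def episode_filename(show: str, published: date, title: str, ext: str = "mp3") -> str:
    """Build a descriptive file name from show, date, and episode title.

    Hypothetical helper: the show-date-title layout is just one reasonable choice.
    """
    def slug(text: str) -> str:
        # Lowercase, then collapse every run of characters other than
        # a-z and 0-9 into a single hyphen. Non-ASCII letters are dropped,
        # which is a simplification for this sketch.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(show)}-{published.isoformat()}-{slug(title)}.{ext}"


name = episode_filename("Marketplace", date(2018, 6, 4), "Morning Report")
print(name)
```

Because the show and the title are both in the name, two items published on the same day no longer overwrite each other in a shared Downloads folder.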

Wrists on with the Samsung Galaxy Watch

I’ve owned the same watch for two decades or so. It’s a Timex Expedition that I paid about $25 for. I’ve paid far more than that to replace batteries over the years. But recently I decided to get a new watch, so I popped into the T-Mobile store to get the Samsung Galaxy Watch.

The style is nice. The 42mm body fits my wrists well. While the strap won’t win any design awards, it’s unobtrusive. Of course, the face can be whatever you want. I have the “Analog Utility” face, but in fancier situations, I might set it to something a little more elegant. Or not.

Setting up the watch was simple. I like that I can decide which apps will notify on the Watch. The Galaxy Wearable app on my phone made it simple to apply updates and install new apps to the Watch. Of course, the app selection for the Tizen operating system is pretty limited. Samsung Health and replies to incoming texts and Facebook Messenger messages are about the limit of my usage so far.

Composing those replies has been a trip. The default input method is to write the letters with a finger. That generally works pretty well, but the character set is limited. Typing on the T9 option takes some getting used to, since it isn’t the T9 you remember from your featurephone days. Speech-to-text is…underwhelming. It’s clear that Bixby is not in the same league as Google Assistant.

The battery life is pretty good. I’ve been wearing my Watch all night and charging it during the day when I’m at my desk. Even over the weekend, it doesn’t take long on the wireless charger to get enough juice. I could probably get close to two days without charging. Longer would be nice, but this is good enough for me.

I have the built-in SIM, although I haven’t used the Watch away from the phone yet. I can’t see myself doing that too often. But using it for payment instead of pulling out my phone is slightly more convenient. Samsung Pay makes it easy to quickly select which card I want to use. Still, I’m more likely to have already pulled out my wallet by the time I realize that using Samsung Pay is an option.

What’s been most interesting to me is how the Watch has changed my behavior. I’ve noticed I have my phone out less now because I can get notifications on my wrist. From there I can decide whether to pull out my phone or just wait a bit. As a parent who sometimes gets too distracted by his phone, I appreciate this. I also like that it can count my steps without having to remember to put my phone back in my pocket. Having sleep data and heart rate data is interesting, although I haven’t done much with it. I see that data as something to look at retrospectively.

In all, I’m pretty happy with the Galaxy Watch. It won’t last as long as my Timex, but if I get a few solid years out of it, I’ll probably buy a new one.

Why subscribe to a newsletter you don’t read?

Why would you subscribe to a newsletter that you don’t read? I mean, maybe you intend to. Maybe it’s sitting there in your inbox unread just waiting for you to get around to it Real Soon Now. Or maybe you filter it off to some folder where email goes to die. I get that. I do that all the time.

No, what I’m thinking about is the case where an obvious spam account signs up for a newsletter. As of this writing, my newsletter has 283 subscribers — a number that has grown 27% in the past month. But only 40 people at most have ever opened it. The number of opens has stayed relatively constant even as the subscriber count has gone up.

So why do I think the accounts are spam? For one, there’s the fact that most of them haven’t opened any newsletters. Sure, maybe there’s a reason for that. But also they look…spammy. The addresses are often yahoo or other domains that have fallen out of favor. The names represented by the addresses don’t look like the names of people I know. I can’t imagine why people I do know read my newsletter, never mind why strangers would. Taken all together, I feel safe calling many of these accounts spam.

But to what end? I understand spam accounts on Twitter liking random posts in the hopes that someone will look at the profile and click a link to whatever thing someone’s trying to peddle. Or maybe follow the account and get clicks that way. That makes sense to me. But what can a spammer do with a newsletter subscription? Is it a really crappy denial of service attack? Do they hope that after a few years my subscriber list will exceed Mailchimp’s free tier? Maybe it’s done to hide nefarious activity in a flood of confirmation emails. That seems like the most likely answer, but it doesn’t seem very efficient. Then again, I’m not a spammer, so what do I know?

If everyone followed good password advice, we’d be less secure

Passwords are hard. To be useful, they must be hard to guess. But the rules we put in place to make them hard to guess also make them hard to remember. So people do the minimum they can get away with.

Earlier this week, security company Webroot took a look at the unintended consequences of password constraints. The rules organizations set in order to ensure passwords are sufficiently complex reduce the total number of possible passwords. This can make automated password guessing more effective.

Good passwords are easy for the user to remember and hard for computers and other humans to guess. Let’s say I wanted to use a password like 2Clippy2Furious!! Various password checking sites rate it highly. It’s 17 characters long and contains upper- and lower-case letters, digits, and special characters. But because it contains consecutive repeating letters, some companies won’t allow it.

Writing for Webroot, Randy Abrams says “it’s length, not complexity that matters.” And he’s right. That’s the point behind the “correct horse battery staple” password in XKCD #936. So let’s all do that, right?

Well…it’s not so simple. If I were trying to brute force passwords, and I knew everyone was using four (or five or six) words, then “CorrectHorseBatteryStaple” is no longer 25 characters long; it’s effectively four symbols long. Granted, the symbol set goes from 95 characters to (using /usr/share/dict/words on my laptop) 479,828 words. “CorrectHorseBatteryStaple” is many powers of 10 more secure if the attacker doesn’t know you’re using words.
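To put rough numbers on that comparison, here is a small Python sketch. It assumes the attacker tries every combination, either character by character over 95 printable characters or word by word over a 479,828-word list (the figures quoted above), and it works in powers of 10 to keep the numbers readable. The constant and function names are mine.

```python
import math

PASSPHRASE = "CorrectHorseBatteryStaple"
PRINTABLE_CHARACTERS = 95    # character set size quoted above
DICTIONARY_WORDS = 479_828   # word count quoted above; other word lists will differ
WORDS_IN_PASSPHRASE = 4


def log10_guesses(symbols: int, length: int) -> float:
    """Return log10 of symbols**length, the size of the brute-force search space."""
    return length * math.log10(symbols)


# Attacker does not know words are in use: every character is a free choice.
as_characters = log10_guesses(PRINTABLE_CHARACTERS, len(PASSPHRASE))
# Attacker knows the password is four dictionary words: every word is one symbol.
as_words = log10_guesses(DICTIONARY_WORDS, WORDS_IN_PASSPHRASE)

print(f"character by character: about 10^{as_characters:.0f} guesses")
print(f"word by word:           about 10^{as_words:.0f} guesses")
print(f"difference:             about {as_characters - as_words:.0f} powers of 10")
```

Under these assumptions the character-by-character search space is on the order of 10^49, while the word-by-word space is on the order of 10^23. Both are large, but the gap is what an attacker gains from knowing the scheme.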

And let’s be real: they don’t. It will be a long time before this hypothetical weakness becomes a real concern. Don’t believe me? Just look at the password dumps when a site gets hacked. There are a lot of really bad passwords out there. If we took all the constraints off (except for minimum length), people would just use really dumb, easily-guessed passwords again. But it amuses me that if everyone followed good password advice, we’d actually make it worse for ourselves. Passwords are hard.

Sidebar: Yes, I know

The savvier among you probably read this and thought “it’s better to use a random string that you never have to memorize because your password manager handles it for you. Just set a very long and memorable password on that and you’re good to go.” Yes, you’re right. But people, even those who use password managers, will often go to memorable passwords for low-risk sites or passwords they have to use often (e.g. to log in to their computer so they can access the password manager). 

You are responsible for (thinking about) how people use your software

Earlier this week, Marketplace ran a story about Michael Osinski. You probably haven’t heard of Osinski, but he played a role in the financial crisis of 2008. Osinski wrote software that made it easier for banks to package loans into a tradeable security. These “mortgage-backed securities” played a major role in the collapse of the financial sector ten years ago.

It’s not fair to say that Osinski is responsible for the Great Recession. But it is fair to say he did not give sufficient consideration to how his software might be (mis)used. He told Marketplace’s Eliza Mills:

Most people realized that we wrote a good piece of software that we sold in the marketplace. How people use that software is … you know, you really can’t control that.

Osinski is right that he couldn’t control how people used the software he wrote. Whenever we release software to the world, it will get used how the user wants to use it, even if the license prohibits certain fields of endeavor. This could be innocuous misuse, the way graduate students design conference posters in PowerPoint or businesspeople use Excel for all conceivable tasks. But it could also be malicious misuse, the way Russian troll farms use social media to spread false news or sow discord.

So when we design software, we must consider how actual users — both benevolent and malign — will use it. To the degree we can, we should mitigate against abuse or at least provide users a way to defend themselves from it. We are long past the point where we can pretend technology is amoral.

In a vacuum, technological tools are amoral. But we don’t use technology in a vacuum. The moment we put it to use, it becomes a multiplier for both good and evil. If we want to make the world a better place, we cannot pretend it will happen on its own.

“You’ve been hacked” corrects behavior

Part of running a community means enforcing community norms. This can be an awkward and uncomfortable task. I recently saw a Tweet that suggests it might be easier than you thought:

It’s nice because it’s subtle and gives people a chance to self-correct. On the other hand, there’s some value in letting community members (and potential community members) see enforcement actions. Not as a punitive measure, but as a signal that you take your code of conduct seriously.

This won’t work for every case, but I do like the idea as a response to the first violation, so long as it’s a minor violation. Repeated or flagrant violation of the community’s code of conduct will have to be dealt with more strongly.

Twitter interactions are not a polling mechanism

Way back in the day, clever Brands tried to conduct Twitter polls by saying “retweet for the first choice and favorite (now like) for the second choice.” This was obviously very prone to bias. The first choice’s fans will spread the poll, so virality favors the first option. But it was also the best choice available, other than linking to an external poll site (which means a much lower interaction rate).

Then Twitter introduced native polls. Now you can post a question with up to four answers. It even makes a nice bar chart of the results. Twitter interactions are not a polling mechanism, so why are you using them?!

The answer lies in the word “interaction”. Social media interactions are a way for Brands to measure the success of their social media efforts. Conducting polls via interactions instead of the native polling mechanism is a cheap way to drive up interactions. It’s a good indication that you’re not interested in the answers. People who want actual answers can use polls.

This concludes today’s episode of “Old man yells at cloud”.

Date-based conditional formatting in Google Sheets

Sometimes a “real” project management tool is too heavy. And spreadsheets may be the most-abused software tool. So if you want to track the status of some tasks, you might want to drop them into a Google spreadsheet.

You have a spreadsheet with four columns: task, due, completed, and owner. For each row, you want that row to be formatted strikethrough if it’s complete, highlighted in yellow if it’s due today, and highlighted in red if it’s overdue. You could write conditional formatting rules for each row individually, but that sounds painful. Instead, we’ll use a custom formula.

Apply each of the following rules to the range A2:D.

The first rule will strike out completed items. We’ll base this on whether or not column C (completed) has content. The custom formula is =$C:$C<>"". Set the formatting style to Custom, clear the color fill, and select strikethrough.

The second rule will highlight overdue tasks. We only want to highlight incomplete overdue tasks. If it’s done, we stop caring if it was done on time. So we need to check that the due date (column B) is before today and that the completion date (column C) is blank. We also check that the task (column A) isn’t empty so that unused rows stay unformatted. The rule to use here is =AND($C:$C="",$B:$B<today(),$A:$A<>""). Here, you can select the “Red highlight” style.

Lastly, we need to highlight the tasks due today. Like with the overdue tasks, we only care if they’re not done. The rule is =AND($C:$C="",$B:$B=today(),$A:$A<>""). This time, use the “Yellow highlight” style.
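If it helps to see the three rules as one decision, here is a minimal Python sketch of the same logic. It is only a model of the rules described above, not anything Google Sheets runs. The function name row_style and the style labels are made up for illustration, and treating a blank due date as "no highlight" is my assumption, which may differ from how Sheets compares an empty cell to a date.

```python
from datetime import date
from typing import Optional


def row_style(task: str, due: Optional[date], completed: Optional[date],
              today: date) -> Optional[str]:
    """Return the style one row would get under the three rules (hypothetical labels)."""
    # Rule 1: anything in the completed column means strikethrough.
    if completed is not None:
        return "strikethrough"
    # Rules 2 and 3 only apply to rows that actually contain a task.
    # Assumption: a missing due date gets no highlight.
    if not task or due is None:
        return None
    # Rule 2: incomplete and due before today is overdue.
    if due < today:
        return "red"
    # Rule 3: incomplete and due today.
    if due == today:
        return "yellow"
    return None


today = date(2018, 6, 11)
print(row_style("Write report", date(2018, 6, 10), None, today))
print(row_style("Send invoice", date(2018, 6, 11), None, today))
print(row_style("Book venue", date(2018, 6, 1), date(2018, 6, 2), today))
```

The order of the checks mirrors the order of the rules: completion is tested first, so a task finished late is struck through rather than highlighted red.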

And that’s it. You can fill in as many tasks as you’d like and get the color coding populated automatically. I created an example sheet for reference.

Microsoft bought GitHub. Now what?

Last Monday, a weekend of rumors proved to be true. Microsoft announced plans to buy code-hosting site GitHub for $7.5 billion. Microsoft’s past, particularly before Satya Nadella took the corner office a few years ago, was full of hostility to open source. “Embrace, extend, extinguish” was the operative phrase. It should come as no surprise, then, that many projects responded by abandoning the platform.

But beyond the kneejerk reaction, there are two questions to consider. First: can open source projects trust Microsoft? Second: should open source (and free software in particular) projects rely on corporate hosting?

Microsoft as a friend

Let’s start with the first question. With such a long history of active assault on open source, can Microsoft be trusted? Understanding that some people will never be convinced, I say “yes”. Both from the outside and from my time as a Microsoft employee, it’s clear that the company has changed under Nadella. Microsoft recognizes that open source projects are not only complementary, but strategically important.

This is driven by a change in the environment that Microsoft operates in. The operating system is less important than ever. Desktop-based office suites are giving way to web-based tools for many users. Licensed revenue may be the past and much of the present, but it’s not the future. Subscription revenue, be it from services like Office 365 or Infrastructure-as-a-Service offerings, is the future. And for many of these, adoption and consumption will be driven by open source projects and the developers (developers! developers! developers! developers!) that use them.

Microsoft’s change of heart is undoubtedly driven by business needs, but that doesn’t make it any less real. Jim Zemlin, Executive Director at the Linux Foundation, expressed his excitement, implying it was a victory for open source. Tidelift ran the numbers to look at Microsoft’s contributions to non-Microsoft projects. Their conclusion?

…today the company is demonstrating some impressive traction when it comes to open source community contributions. If we are to judge the company on its recent actions, the data shows what Satya Nadella said in his announcement about Microsoft being “all in on open source” is more than just words.

And in any acquisition, you should always ask “if not them, then who?” CNBC reported that GitHub was also in talks with Google. While Google may have a better reputation among the developer community, I’m not sure they’d be better for GitHub. After all, Google had Google Code, which it shut down in 2016. Would a second attempt in this space fare any better? Google Code had a two-year head start on GitHub, but it languished.

As for other major tech companies, this tweet sums it up pretty well:

Can you trust anyone to host?

My friend Lyz Joseph made an excellent point on Facebook the day the acquisition was announced:

Unpopular opinion: If you’re an open source project using GitHub, you already sold out. You traded freedom for convenience, regardless of what company is in control.

People often forget that GitHub itself is not open source. Some projects have avoided hosting on GitHub for that very reason. Even though the code repo itself is easily mirrored or migrated, that’s not the real value in GitHub. The “social coding” aspects (the issues, fork tracking, wikis, ease of pull requests, etc.) are what make GitHub valuable. Chris Siebenmann called it “sticky in a soft way.”

GitLab, at least, offers a “community edition” that projects can self-host. In a fantasy world, each project would run their own infrastructure, perhaps with federated authentication for ease of use when you’re a participant in many projects. But that’s not the reality we live in. Hosting servers costs money and time. Small projects in particular lack both of those. Third-party infrastructure will always be attractive for this reason. And as good as competition is, having a dominant social coding site is helpful to users in the same way that a dominant social network is simpler: network effects are powerful.

So now what?

The deal isn’t expected to close for a while, and Microsoft plans to seek regulatory approval, which will not speed the process. Nothing will change immediately. In the medium term, I don’t expect much to change either. Microsoft has made it clear that it plans to run GitHub as a fairly autonomous business (the way it does with LinkedIn). GitHub gets the stability that comes from the support of one of the world’s largest companies. Microsoft gets a chance to improve its reputation and an opportunity to make it easier for developers to use Azure services.

Full disclosure: I am a recent employee of Microsoft and a shareholder. I was not involved in the acquisition and had no inside knowledge pertinent to the acquisition or future plans for GitHub.