No, the cloud is not dead

If you think cloud hype is bad, cloud-is-dead hype may be worse. There’s nothing like declaring something dead to get attention. Consider, for example, this recent article in Wired. I’ll give Jeremy Hsu credit: he probably didn’t write the headline. Nonetheless, it’s an article in search of conflict.

The Wired article introduces its readers to the concept of edge computing. The idea is simple: move computation closer to the consumer, at the edge of the network instead of in a central data center.

Edge computing has great benefit in certain situations. Latency-sensitive applications such as mobile augmented reality (e.g. Pokemon Go) do better the closer the compute is to the user and their data. In fact, if all the computation can happen locally (e.g. on the user’s phone), that’s the best scenario. I don’t like Hsu’s example of self-driving cars, though. Cars that require a network connection to avoid running into things are cars that do not belong on the road.

But even with edge computing having solid use cases, that doesn’t mean a thing for the idea of cloud computing. First of all, there are still plenty of cases where edge computing doesn’t make sense. Centralization allows for greater economy of scale, which is great for many applications. Secondly, compute demand doesn’t decrease. More computing at the edge doesn’t mean less computing at the core, it means more computing total.

Now the rapid growth in cloud usage (and thus revenue) can’t go on forever at the current rate. At some point, it will level off and reach a steadier rate of growth. That’s the nature of the market. But it’s a mistake to equate maturity with death.

Disclosure: my employer is a leading public cloud provider.

Other writing in August 2017

What am I writing when I’m not writing here?

SSH login failures when you have too many keys

I recently ran into an interesting issue where SSH logins to both work and personal servers were failing. When I tried to log in, I’d immediately get

Received disconnect from w.x.y.z port 22:2: Too many
authentication failures for funnelfiasco
Authentication failed.

This was a surprise, because I hadn’t tried to log in for a while. Why would I get “too many authentication failures”? I knew we ran fail2ban on the work servers and I figured my web host used something like that, too, so I thought maybe something was triggering a ban.

I checked that there wasn’t something on my network that was generating SSH attacks. tcpdump didn’t show anything (whew!).

It turns out that the issue is due to how the SSH agent works. The SSH agent holds your SSH keys. This allows you to remote into a Unix server with a key without having to re-type your passphrase every time. This is really useful behavior, especially if you make remote connections regularly (whether directly SSHing or using something like git over SSH). But it has some behaviors that can cause problems.

By default, if you have an SSH agent running, your client will offer every key the agent holds, even if you’ve explicitly specified the identity to use. If the agent holds more keys than the server’s MaxAuthTries setting allows, you may exhaust your login attempts before the client gets to the key you want. If you don’t want this behavior, you can add IdentitiesOnly yes to your SSH config file.
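As an illustration, a host entry like the following in ~/.ssh/config tells the client to offer only the key you name; the hostname and key filename here are placeholders, not my actual setup:

```
# Hypothetical host entry -- hostname and key path are placeholders
Host myserver.example.com
    User funnelfiasco
    IdentityFile ~/.ssh/id_ed25519
    IdentitiesOnly yes
```

With IdentitiesOnly yes set, the agent is still used for the passphrase-free signing, but only for the identity file listed.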

Your assumptions may outlive you

Walt Mankowski had a great article last week called “Don’t hate COBOL until you’ve tried it”. In it, he shares the story of a bug: because certain columns are significant in some versions of COBOL, his code didn’t behave the way he expected.

The lesson I took away from this is: be careful about the assumptions that you make because they might bite someone decades later.

This isn’t a knock on Grace Hopper. At the time COBOL was invented, 80-column punch cards had been in use for over a century. It made sense at the time to treat that as a given. But here’s the thing about the 20th century: not only did technology change, but the rate of change increased. The punch cards that had survived over 100 years were well on their way to obsolescence 10 years later.

The future is hard. You can’t fault pioneers for not seeing how people would use computers decades later. But it turns out that this assumption was not future-proof.

Maybe that’s the better lesson: if you make something well, your assumptions will outlive you.

How I back up my data

I talk about backups a lot (seriously, have them!), but I don’t really explain what I do. The answer is pretty simple, but I need some blog filler.

In short, I use CrashPlan. I have it write to a local USB drive and upload to their cloud service. The unlimited hosted backups cost $65 a year and mostly just work for me. I keep all of my important stuff on my main server. When I’m using another computer (e.g. my laptop), I use typical remote-access tools (e.g. NFS). I assume that anything on my laptop is subject to going poof at any moment.

I also use SpiderOak to synchronize and back up a few small things: my podcast downloads, a few application configs, etc.

In the old days, I had an rsync script that copied only particular directories to my USB drive. It would also copy to an external drive I kept at my office. This worked well when most of what I cared about was on one of two mountpoints. Now I use LVM to carve up a filesystem for broad categories, which would necessarily complicate the script. Frankly, I’d rather just let someone else’s software handle it.

One thing I don’t do, but really want to, is set up configuration management. While that’s not backup, it’s a shortcut to “get this machine back to where I want it to be”. Data is one thing, configuration is another. I could get back most of my configuration from the CrashPlan backup, but having configuration management would get my system functional quickly while I wait for the restore process to complete. Someday.

Communicating uncertainty to the public

Forecasting weather is a very imprecise endeavor. This is due in part to the fact that forecasts matter on very local scales. If I cancel a cookout due to a thunderstorm forecast, I won’t care that it rained everywhere else if it didn’t rain in my back yard. Given that the forecast will never be certain, how can forecasters communicate uncertainty to the public?

As the Washington Post’s Capital Weather Gang wrote:

As much as we communicated the uncertainty, the forecast cannot be considered a success if the message we were trying to send … did not reach some people.

So what do they suggest? The high-level suggestion is to use a “traffic light” metaphor to indicate how people should proceed with their day. This benefits from being simple, but it has some key failings. As Jason Samenow noted, it needs to be broken into at least two dayparts. As many as five dayparts may be necessary: morning commute, day, afternoon commute, evening, and overnight.

Time-of-day isn’t the only issue. Thunderstorm forecasts are particularly sensitive to geography as well. Even within a single metro area, different neighborhoods may see very different weather. So the system would need to account for multiple areas. If you divide the area into quadrants, five dayparts times four quadrants gives you 20 time/area combinations.

At some point, the simplified system becomes almost as complicated as the status quo. This means that the public will miss the nuance in the same way they do now. It’s a hard problem. An ideal balance between simplicity and nuance exists somewhere. But who can say where?

Potential Tropical Cyclone Nine Forecast Contest

Hear ye! Hear ye! I’ve opened up the Tropical Forecast Game for Potential Tropical Cyclone Nine. Forecasts are due by 8 PM EDT Friday.

Chances are very good that this storm will be named later today. I’ll keep the “nine” appellation until after the contest closes to avoid any confusion. (Yes, the code is old and crusty so the name matters).

I have a new employer

If you haven’t already heard the news, my employer was acquired by Microsoft this week. Now if this had happened 10 or maybe even 5 years ago, I probably would have noped on out of there. But the Microsoft of today is, at least from outward indications, not at all the Microsoft of yore.

I am legitimately excited about this. The company started out as an $8,000 credit card bill 12 years ago and we’ve made it through on revenue ever since. It says a lot for our team, our product, and our customers that we’ve been able to grow and be successful.

At the same time, it’s been hard not having a big cushion to fall back on. When you’re bootstrapped, everything depends on deals closing and checks being sent on time. At Microsoft, we’ll have more latitude to make strategic investments.

Along those lines, I won’t just be marketing CycleCloud anymore. I’ll be working on marketing for the entire cloud HPC ecosystem. This is a daunting challenge, but one I’m really looking forward to. If I can do it well, my next step will be whatever I want it to be. If I do it poorly, I will have learned a lot along the way.

I can’t wait to see what happens.

Are shared block lists the answer to Twitter abuse?


At the beginning of the year, the good folks at Lawfare suggested shared block and follow lists could be an answer to the lack of civility and excess of abuse on Twitter. This is not a new idea. The Lawfare article discusses Block Together – a robust tool for sharing block lists. And Randi Harper’s GGautoblocker builds a block list of accounts that appear to be associated with “Gamer Gate”. What Citron and Wittes propose is to take similar functionality and make it natively a part of Twitter.

I understand their reasoning, but I don’t think it’s the right answer. First, there’s the practical concern: people use blocks in different ways. Some people block liberally. They might not have a problem with the person per se, but maybe they just don’t want to be reminded of something. Others block only as a last resort. Trying to manage your own preferences while automatically inheriting someone else’s can be challenging.

And then of course there’s the fact that it doesn’t get harassers off of the platform. If a garbage human shitposts on Twitter, but no one is around to see it, are they still a garbage human? Of course they are. I understand Twitter’s commitment to being “the free speech wing of the free speech party.” The idea of free speech is critical to a free society. At the same time, there’s no reason Twitter has to give abusive speech a platform.

I don’t have the answers. Targeted abuse is a tough problem to solve. As an example, one or more people have been creating account after account targeting meteorologists on Twitter. At last check, Twitter had suspended more than 700 accounts, identical in every respect except for the incremented number at the end of the handle. Worse, the abuse has spread to other platforms, so even if Twitter had a good way of addressing it, they’d be limited by the borders of their service.

Shared block lists might not be the answer to abuse, but maybe they’re the best answer we have right now?

Other writing in July 2017

What have I been writing when I haven’t been writing here?