Brainstorming Security For The Internet Of Things

Introduction

This afternoon, my internet connection was so unusable that I couldn’t even watch non-HD Youtube videos. I decided that before blaming Comcast again, I should at least try to make sure the problem wasn’t on my end. I started by resetting my wifi router to the defaults and reconfiguring it from scratch.

I had long suspected that (a) some neighbor had cracked my WPA password and was wasting all my bandwidth; and/or (b) that the router itself was thoroughly pwned. I am of course extremely lazy, so I let this enjoyable paranoia simmer in the back of my mind, unresolved, for months. Besides, I had forgotten the admin password, so I knew I would have to reset it to factory defaults just to get back into the administration interface. Today was that day.

Note: In this post I’ll discuss some specific vulnerabilities I found in my wifi router, which any competent security engineer could find upon cursory inspection. The reason I describe them is to solidify with specific examples a larger point about the emerging “Internet Of Things” (IoT) market segment and its engineering requirements and limitations. (Especially authentication.) I believe that if we engineers/managers/marketers/business people are going to serve this IoT market, it is our duty to make the products as safe as we can — that is, much safer than they currently are. We are in early days, so now is the time to establish best practice and ratchet up the engineering culture.

For more on wifi router vulnerabilities specifically, see the results of the SOHOpelessly Broken contest at DEFCON 22.

The State Of The (Consumer-Grade) Art

So, while setting my router up, I decided to try the HTTPS option for the administration interface. By default, it’s HTTP-only.

Interface options.

Now, I expected to get the “authority invalid” HTTPS error page when I tried to connect to the router. “Authority invalid” means that no public, well-known (by the browser) certification authority (CA) has vouched for the server’s cryptographic identity (its certificate). This makes perfect sense, since my device is private: no CA could possibly have vetted my little router’s certificate, nor its (non-unique, private) IP address, nor its (made-up by me just now) name.

Given that, clicking through this warning screen would at least maybe make sense:

Authority invalid. But that is actually OK in this case.

Before continuing, I decided to take a look at the connection info and the certificate.

Connection tab in the Origin Info Bubble.

Certificate info.

For some reason, the router serves a certificate signed with MD5withRSA and a 512-bit RSA key — obsolete algorithm and key size — yet uses a curiously strong 256-bit cipher (presumably some mode of AES) for bulk encryption.

(I say “curiously strong” because usually, cryptography engineers seek to set all crypto parameters to the same security level, as measured in powers-of-2 complexity. AES-256 is many orders of magnitude stronger than RSA 512 and MD5withRSA; mixing algorithms at these varying levels of strength does not make much sense. See this article on key size for example.)

You might imagine that there would be some performance concern with using sufficiently-modern (i.e. 2048-bit or larger) RSA keys; after all, this device is very tiny and doesn’t have much compute power. So a modern key size might cause the machine to establish TLS sessions slowly, due to the cost of the asymmetric crypto. But, on a machine with gigabit Ethernet, and for which the user will only rarely use the management interface, I don’t really think that explains it. And if the engineers were really concerned about compute resources, they might more likely have chosen RC4 and a smaller key for the bulk encryption, instead of AES 256. So, these crypto parameters are a bit mysterious to me. (Not that I advocate the use of RC4, of course.)

(Note also that the certificate is not valid before 8:20 PM PDT; I observed this certificate at 7:27 PM. When I later went to double-check this, I found that the router did have the correct time. However, I found that every time you disable and then re-enable the HTTPS option, the machine generates a new certificate with a Not Valid Before date 1 hour in the future, and with a new, distinct 512-bit RSA key. I suspect a time zone/daylight savings time math mistake in the programming. On the bright side, it’s very good that the machine generates a fresh key every time you re-enable HTTPS: That means that the key is not static, or identical on all the routers of the same make or model.)

Because 512-bit RSA and MD5withRSA are so obsolete, Chrome and Firefox simply refuse, as a matter of policy, to even connect to servers that present such cryptographic configurations. You can’t click through the HTTPS warning page; you get an outright network failure:

No data wanted.

Firefox refuses to talk to the server in a similar manner, and for the same reason.

“No problem,” I thought, “I’ll just upgrade this thing’s firmware, which will probably fix this and lots of other things. After all, since I assume this machine is pwned, it needs at least a re-install.” Regular readers know me for my boundless optimism.

So I hit the vendor’s support page for the device, and noted that it’s not HTTPS even though it serves a firmware download. (Yay! There is an updated firmware! The release notes refer to “various security vulnerabilities”, with no details. Boo, hiss.)

Even if you manually upgrade the page to HTTPS, it has mixed image content (not too terrible, but not great) and it still serves an HTTP link to the firmware. So, rather than click that link, I copied it, pasted it into a new tab, and manually upgraded it to HTTPS. Alas:

Akamai: Sad Trombone Distribution Network (STDN)

Still, I downloaded and installed it anyway, over broken HTTPS. For science.

Basic Web Application Safety: A Sidebar

It’s easy to check whether or not an application defends against cross-site request forgery (CSRF). It seems my router’s management interface does not.

To defend against CSRF, an application needs to verify that an incoming request was previously “formulated” by the application itself, and not by a 3rd-party attacker. (See the references in the Wikipedia page, e.g. Jesse Burns’ paper.) To do so, it should include an unpredictable secret value in the request parameters that only the server and the true client know.
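
To make that concrete, here is a minimal, framework-agnostic sketch in Java of what minting and checking such a token might look like. (The class and method names are mine, invented for illustration; a modern web framework handles this for you.)

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Minimal CSRF token sketch: mint a per-session secret, embed it in every
// form as a hidden field, and verify it on every state-changing request.
public final class CsrfToken {
    private static final SecureRandom RNG = new SecureRandom();

    // Call once per session; store the result server-side and render it
    // into forms as a hidden field.
    public static String generate() {
        byte[] token = new byte[32];
        RNG.nextBytes(token);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(token);
    }

    // Call on every POST: the submitted token must match the session's token.
    // MessageDigest.isEqual avoids bailing out at the first mismatched byte.
    public static boolean verify(String submitted, String expected) {
        if (submitted == null || expected == null) {
            return false;
        }
        return MessageDigest.isEqual(
            submitted.getBytes(StandardCharsets.UTF_8),
            expected.getBytes(StandardCharsets.UTF_8));
    }
}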

Note in this request that the only authentication token is the session_id, a 32 hex digit (16 byte, 128 bit) random-looking number that is a parameter in the URL query string. There is no separate CSRF defense token. Here is some copy-pasta from the Network tab of Chrome’s Developer Tools, from when I changed the device’s name to “noncombatant2” and the secondary DNS server to the fake value “8.8.4.3”:

Request
Remote Address:10.0.0.1:80
Request URL:http://10.0.0.1/apply.cgi;session_id=[redacted]
Request Method:POST
Status Code:200 Ok

Request Headers
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:819
Content-Type:application/x-www-form-urlencoded
Host:10.0.0.1
Origin:http://10.0.0.1
Referer:http://10.0.0.1/index.asp;session_id=[redacted]
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.13 Safari/537.36

Form Data
submit_button:index
change_action:
submit_type:
gui_action:Apply
hnap_devicename:noncombatant2
...elided...
lan_netmask:255.255.255.0
machine_name:noncombatant2
...elided...
wan_dns0_0:8
wan_dns0_1:8
wan_dns0_2:8
wan_dns0_3:8
wan_dns1_0:8
wan_dns1_1:8
wan_dns1_2:4
wan_dns1_3:3
...elided...

Response Headers
Cache-Control:no-cache
Connection:close
Content-Type:text/html
Date:Sun, 12 Oct 2014 04:05:19 GMT
Expires:0
Pragma:no-cache
Server:httpd

(Some boring stuff elided.) Any attacker who can discover the session_id — such as when it leaks over HTTP when you click on the link to the Linksys web site from the Firmware Upgrade page —

Referer Leak Here.

There goes that auth token.

could mount a CSRF attack to, for example, set your DNS servers to be malicious servers that always point to the attacker’s web server or proxy. They could thus intercept, eavesdrop on, and falsify all your non-HTTPS web browsing (among other potential attacks).

If you have this router model, you can work around this vulnerability by setting “Access via Wireless” to Disabled in the Administration page. Assuming you usually browse the web from a computer connected to the wifi interface and only manage the router from the wired interface, that reduces the window of vulnerability. Additionally, since you are rarely logged into the router’s administration interface, the window of vulnerability is further narrowed. (CSRF attacks only apply to users who are logged in to the vulnerable web app at the moment of attack.)

Internet Of Vulnerable Things

Let’s summarize what we’ve learned about my router.

  • its obsolete and unusable HTTPS means it can only be managed non-securely
  • its management interface is remotely vulnerable to an easily-exploited class of web application attack known since 2001
  • the session_id is easily leaked due to its placement
  • if the firmware update is secure, that is not apparent to the user
  • the firmware update (released “04/24/2014 Ver.1.0.06 (build 2)”) does not resolve the most immediately obvious problems (but does fix other unspecified vulnerabilities)

This is a mass-market, fairly powerful device, from a major vendor. It seems to have been originally sold in late 2009 or early 2010 (according to the reviews on the vendor’s product web page), so we can perhaps assume its hardware and software were designed and implemented/manufactured in late 2008 or early 2009. While not new, this is not quite a prehistoric, pre-security machine; 2008 is well after the Microsoft Trustworthy Computing initiative (but only the beginning of the time when some major internet services started offering HTTPS).

I should probably buy the very latest wifi router from this vendor and repeat the above tests. It’d be interesting to see if anything has improved. I should note that, last night, a friend of mine was setting up his brand-new wifi routers (from a different vendor). He determined they also were at least vulnerable to CSRF.

So, How Could We Improve?

If we are going to live in an “internet of things” (IoT) world, vendors need to improve far beyond the state that my middle-aged wifi router is in.

Consumer appliances like wifi routers, file servers, and printers are all relatively powerful computers that can definitely support the full range of security goodness:

  • transport encryption and authentication
  • storage encryption (where applicable)
  • type-safe implementation languages (at least for code with network-facing attack surface)
  • automatic and well-authenticated updates
  • modern frameworks for things like the web-based management application
  • perhaps even secure boot? Dare I dream?

We can expect these devices to represent the high-end of the IoT, and that smaller devices may not meet such a high engineering quality bar (at least initially).

We might need to upgrade the specifications of lower-end devices to meet a bare minimum, or perhaps apply alternative security strategies. For example, if a device is only marketable if its price point is so low that it cannot be secure, perhaps it should disable itself after some reasonable life-time. That way, at least the devices won’t live on for a long time, making their users vulnerable. Or perhaps such devices can fall back to minimal functionality, automatically reducing their attack surface when they get too old.

The Authentication Problem

Note that even if my wifi router were perfect in every way, there would still be that initial problem: “invalid authority”. That is, we still need a way for TLS clients to authenticate an IoT device’s TLS server (note that I am leaving room for the application protocol to be anything, not just HTTPS). I can think of at least these ways to approach that problem, which I’ll sketch here. I stress that this is currently not a solved problem.

Trust on first use, and remember. Currently, Firefox allows users to click through most HTTPS error screens, with the option to “confirm this exception”, so that Firefox will remember that the user accepts the error for the given site. With Chrome, we are experimenting with variations on this idea. (See chrome://flags/#remember-cert-error-decisions in Chrome 39 Beta.)

Trust and then key-pin on first use. Perhaps IoT is enough like the SSH use case: The device could generate a new key and certificate each time it is reset (or, as my router does, each time the HTTPS server is launched), and the client would leap-of-faith trust it on the first connection (perhaps prompting the user, perhaps not). Thereafter, the client would expect the same public key from that server. (Or expect 1 member of a set of public keys, if the server serves a certificate chain.) If a device by that name ever served a new key — such as because it was reset, or because there was truly a man-in-the-middle attacker — the client would reject the connection. As with SSH, the user would have to affirmatively delete the old name/key association and then re-establish trust.
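
A rough sketch of the client-side bookkeeping for that, in Java (the names are mine, and a real client would need persistence and some recovery UX around this):

import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.cert.X509Certificate;
import java.util.Base64;
import java.util.Map;

// Trust-on-first-use key pinning sketch: remember a hash of the server's
// public key the first time we connect, and require the same key thereafter.
final class TofuPinStore {
    private final Map<String, String> pinsByHost; // persisted somewhere real

    TofuPinStore(Map<String, String> pinsByHost) {
        this.pinsByHost = pinsByHost;
    }

    static String pinOf(X509Certificate cert) throws GeneralSecurityException {
        byte[] spki = cert.getPublicKey().getEncoded(); // SubjectPublicKeyInfo
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(spki);
        return Base64.getEncoder().encodeToString(digest);
    }

    // Returns true if the connection should proceed.
    boolean check(String host, X509Certificate serverCert) throws GeneralSecurityException {
        String observed = pinOf(serverCert);
        String remembered = pinsByHost.get(host);
        if (remembered == null) {
            pinsByHost.put(host, observed); // leap of faith on first use
            return true;
        }
        return remembered.equals(observed); // reject on mismatch, as SSH does
    }
}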

“Confirm this exception” and TOFU + key pinning are not necessarily great solutions. The easier it is to confirm such exceptions, the more likely it is that users will mistakenly accept authentication errors on the public web. Yet it must be easy to confirm such exceptions, so that users can use the product. Recovering from legitimate key rotation would likely be a pain point for users. (Perhaps clients could incorporate some easy recovery UX flow, but that is still an open problem in secure UX design.)

Dual-mode general-purpose clients. Perhaps general-purpose clients, such as browsers, should be able to go into 2 modes: Public Internet Mode (the current behavior, in which clients discourage self-signed certificates), and IoT Mode (in which clients expect self-signed or alternative trust anchors). The client would need some reliable way to know which mode to use; for example, clients might go into IoT Mode for servers using non-unique private IP addresses, non-ICANN-approved gTLDs, or dotless hostnames.
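
For example, a crude heuristic for “this is probably a local device” might look like the following sketch (illustration only; it skips the gTLD check, and a real client would need a much more carefully specified rule):

import java.net.InetAddress;

// Rough heuristic sketch: treat servers on private or link-local addresses,
// or with dotless hostnames, as candidates for "IoT Mode".
final class IotModeHeuristic {
    static boolean looksLikeIotDevice(String hostname, InetAddress address) {
        boolean dotlessName = !hostname.contains(".");
        boolean privateAddress = address.isSiteLocalAddress() // 10/8, 172.16/12, 192.168/16
                || address.isLinkLocalAddress()                // 169.254/16, fe80::/10
                || address.isLoopbackAddress();
        return dotlessName || privateAddress;
    }
}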

Dedicated management client, with baked-in trust anchors. Another possibility is not to try to authenticate IoT devices in general-purpose clients at all. Instead, for example, vendors could ship an Android app and an iOS app and a Windows app and a Mac OS X app and a Linux app so that users could use and manage the vendor’s devices. Since the client and server would be more tightly integrated, they could use an alternative, vendor-managed trust anchor, rather than relying on self-signed certificates.

Obviously, developing clients for many platforms is more expensive than developing for just the web platform. But specialized clients can have their advantages.

That doesn’t mean that vendors will necessarily give their customers the full benefit of those advantages. I once had a Drobo (a file server appliance) that worked this way. Both for file service and for management, it could only be used with a dedicated client program (I used the Mac OS X client). If I recall correctly, it did not serve files by open or semi-open standards like SMB/CIFS or NFS. Instead, it actually used a kernel module to implement its own network filesystem. Unfortunately, it did not take advantage of the tight client-server integration to use a vendor-managed trust anchor; all communication was unauthenticated and unencrypted. Still, it could have had authentication and encryption with good usability.

How Do We Get There From Here?

A key problem with IoT is that, in general, the price point for the devices must be very, very low. This puts huge pressure on vendors: We want high engineering quality, including good security and good usability, yet we also want low prices. And low engineering quality could doom the entire IoT product class: It might only take a few news reports of users being surveilled and trolled by their refrigerators before people decide to stop buying IoT things.

I think I bought this wifi router for $50 USD. So how do you get strong authentication and encryption, resilience against native code vulnerabilities, frequent and secure updates, defense against well-known web application attack classes, and so on, for $50? While also getting high performance networking like 802.11ac?

The good news is that there are a few design decisions that vendors can take early on in the product development lifecycle to keep costs lower and bugs fewer. In addition to those, for web-enabled things I’d add a requirement to use a modern web application framework that resolves XSS, CSRF, session fixation, auth token leakage, et c. by design. (Here is 1 example of a bestiary of web application attack classes. A successful web application must resolve all that apply.)

For updates, vendors need to use signed updates (with the public key(s) baked into the firmware), and to automate delivery, and to manage the signing keys extremely well. All of this is hard, and so far only a few software vendors have been able to do it.
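
The verification side is conceptually small. Here is a sketch in Java, with the vendor’s public key baked in at build time (the names are illustrative, not any vendor’s real API); as noted, the hard parts are automating delivery and protecting the signing key, not this code:

import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;

// Signed-update verification sketch: the device refuses to install any image
// whose detached signature does not verify under the baked-in vendor key.
final class UpdateVerifier {
    private final PublicKey vendorKey;

    UpdateVerifier(byte[] bakedInPublicKeyDer) throws GeneralSecurityException {
        // The vendor's public key, compiled into the currently running firmware.
        this.vendorKey = KeyFactory.getInstance("RSA")
                .generatePublic(new X509EncodedKeySpec(bakedInPublicKeyDer));
    }

    boolean isAuthentic(byte[] firmwareImage, byte[] detachedSignature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(vendorKey);
        verifier.update(firmwareImage);
        return verifier.verify(detachedSignature);
    }
}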

There is a long road between where we are now and a secure internet of things. I actually am optimistic that we can make good progress down this road, but it will require engineers and business people to get creative — the sooner the better.

Setting Up HTTPS Is Cheap And Not Too Hard

When I advocate that people use HTTPS for their web sites, I get a lot of pushback, usually along these lines:

  • Setting up HTTPS is too difficult
  • Setting up HTTPS costs too much
  • HTTPS is too slow
  • I am ideologically opposed to giving certification authorities (CAs) money

I am going to address the first 2 points in this article. To find out if setting up HTTPS is too difficult or expensive, I found a domain name I had already bought but had not set up a web server for, and stood up an HTTPS server for it from scratch.

I decided to use Amazon EC2 to get a virtual server, to use Apache as my web server software (just because I am familiar with it), and sslmate.com as my X.509 certificate provider. Here is my tale.

Step 0: Buy a domain

I had already bought nonfreesoftware.org from register.com. Their prices: 1 year: $38, 2 years: $70, 3 years: $79. I bought it for 1 year.

Step 1: Get a virtual server

Start time: 11:15 20 Aug 2014.

I had to create an Amazon AWS account before I could rent an EC2 instance. I did so, and then started up a t2.micro Ubuntu 14.04 instance. I created a new security group to allow SSH, HTTP, and HTTPS from anywhere on the internet.

Step 2: Install Apache

I created an SSH key for the instance, logged in, and installed Apache.

~ # apt-get install apache2

Step 3: Set up DNS

Now that I had an IP address from Amazon, I set the A records for nonfreesoftware.org and www.nonfreesoftware.org to point to my IP.

Step 4: Set up an sslmate.com account

On the web site, I created an account.

Then I tried to install the sslmate client software on my EC2 instance using the .deb for Ubuntu 14.04, as documented at https://sslmate.com/help#install, but I got this error:

E: Release file for http://packages.sslmate.com/ubuntu/dists/trusty/InRelease is expired (invalid since 41d 3h 58min 21s). Updates for this repository will not be applied.

So, I sent this message to sslmate.com’s tech support using the form on the web site:

Trying to install the .deb for Ubuntu 14.04, I get:

~ # apt-get update
[...]
Get:7 http://packages.sslmate.com trusty InRelease [5,357 B]
E: Release file for http://packages.sslmate.com/ubuntu/dists/trusty/InRelease is expired (invalid since 41d 3h 58min 21s). Updates for this repository will not be applied.

Tried to change the URL in /etc/apt/sources.list.d/sslmate.list to https://, and got this warning:

W: Failed to fetch https://packages.sslmate.com/ubuntu/dists/trusty/main/binary-amd64/Packages gnutls_handshake() warning: The server name sent was not recognized

Time: 11:46. Took a break.

Started again at 11:55. I decided to try installing sslmate via npm, as the web site also suggests. npm and its dependencies are a 127 MB install, including the GCC compiler. Yikes. Oh well.

~ # apt-get install npm
[...]
~ # npm install sslmate
npm http GET https://registry.npmjs.org/sslmate
npm http 200 https://registry.npmjs.org/sslmate
npm http GET https://registry.npmjs.org/sslmate/-/sslmate-0.1.4.tgz
npm http 200 https://registry.npmjs.org/sslmate/-/sslmate-0.1.4.tgz
sslmate@0.1.4 node_modules/sslmate

Since that worked, I decided to try buying a certificate for 1 year ($15.95).

~ # sslmate buy www.nonfreesoftware.org 1
sslmate: command not found
~ # find / -name '*sslmate*'
/etc/apt/sources.list.d/sslmate.list
/etc/apt/trusted.gpg.d/sslmate.gpg
/var/lib/apt/lists/partial/packages.sslmate.com_ubuntu_dists_trusty_InRelease.reverify
/home/ubuntu/node_modules/sslmate
/home/ubuntu/node_modules/sslmate/bin/sslmate
/home/ubuntu/node_modules/.bin/sslmate
/home/ubuntu/.npm/sslmate
/home/ubuntu/.npm/sslmate/0.1.4/package/bin/sslmate
~ # ./node_modules/sslmate/bin/sslmate buy www.nonfreesoftware.org 1
/usr/bin/env: node: No such file or directory

I take it the sslmate program is a Node JS script that starts with #!/usr/bin/env node. Unfortunately, I don’t have a program called node, even though I installed npm. Hmm. Word to the wise: apt-get install node gets you something called ax25-node — not what we want.

Interlude

Time: 12:03. I got an email from the founder of sslmate.com, saying that he had forgotten to update the .deb file. He thanked me for pointing out the problem and said he’d fix it. He noted, correctly, that since the .deb files are signed with GPG, they don’t need to be transported over HTTPS for authentication and integrity. I replied saying thanks, and that I was only trying to download the apt source via HTTPS on a lark anyway. (Still, it would be nice to make it available.)

Back to work

But, this works:

~ # nodejs ./node_modules/sslmate/bin/sslmate buy www.nonfreesoftware.org 1
If you don't have an account yet, visit https://sslmate.com/signup
Enter your SSLMate username: noncombatant
Enter your SSLMate password: *
Linking account... Done.
Generating private key... Done.
Generating CSR... Done.
Submitting order...

We need to send you an email to verify that you own this domain.
Where should we send this email?

1. postmaster@www.nonfreesoftware.org
2. webmaster@www.nonfreesoftware.org
3. hostmaster@www.nonfreesoftware.org
4. administrator@www.nonfreesoftware.org
5. admin@www.nonfreesoftware.org
6. postmaster@nonfreesoftware.org
7. webmaster@nonfreesoftware.org
8. hostmaster@nonfreesoftware.org
9. administrator@nonfreesoftware.org
10. admin@nonfreesoftware.org

Oops, I have to set up an email account for the domain first.

Step 5: Set up Gmail for the domain

I added the new nonfreesoftware.org domain to my existing noncombatant.org account. Following Google’s tech support page instructions, I proved to Google that I own the domain by setting up a TXT record in DNS, in register.com’s web interface. Again following instructions, I added MX records in DNS to point to the Gmail servers. I verified that the settings work with dig.

Time: 12:19. Stopped, waiting for Google to finish verifying MX setup. Ate a piece of cold pizza and finished my coffee.

12:30: Started again, adding a new webmaster@nonfreesoftware.org account to satisfy sslmate.

Step 6: Try buying a certificate again

~ # nodejs ./node_modules/sslmate/bin/sslmate buy www.nonfreesoftware.org 1
Generating private key... Done.
Generating CSR... Done.
Submitting order...

We need to send you an email to verify that you own this domain.
Where should we send this email?

1. postmaster@www.nonfreesoftware.org
2. webmaster@www.nonfreesoftware.org
3. hostmaster@www.nonfreesoftware.org
4. administrator@www.nonfreesoftware.org
5. admin@www.nonfreesoftware.org
6. postmaster@nonfreesoftware.org
7. webmaster@nonfreesoftware.org
8. hostmaster@nonfreesoftware.org
9. administrator@nonfreesoftware.org
10. admin@nonfreesoftware.org
Enter 1-10 (or q to quit): 7

============ Order summary ============
Host Name: www.nonfreesoftware.org
Product: Standard SSL
Price: $15.95 / year
Years: 1

=========== Payment details ===========
Credit Card: MasterCard ending in [censored]
Amount Due: $15.95 (USD)

Press ENTER to confirm order (or q to quit):
Placing order...
Order complete.

You will soon receive an email at webmaster@nonfreesoftware.org from sslorders@geotrust.com. Follow the instructions in the email to verify your ownership of your domain. Once you've verified ownership, your certs will be automatically downloaded.

If you'd rather do this later, you can hit Ctrl+C and your certs will be delivered over email instead.

Waiting for ownership confirmation...

I checked my webmaster@nonfreesoftware.org Gmail, and saw that I had a new email from RapidSSL. The email contained a magical link for me to click on to prove that I had received the email (hence that I can read email at the domain). I clicked on it. Then I Command-Tab’d back to my shell window, and found that sslmate had finished:

Your certificate is ready for use!

Private key file: www.nonfreesoftware.org.key
Certificate file: www.nonfreesoftware.org.crt
Certificate chain file: www.nonfreesoftware.org.chain.crt
Certificate with chain file: www.nonfreesoftware.org.chained.crt

Sweet!

Step 7: Configure Apache to use the certificate

Below is my Apache configuration file for the domain.

I want to use only strong protocol versions and ciphersuites, so I did a web search for [ apache ssl ciphersuites ]. That got me to http://httpd.apache.org/docs/2.0/ssl/ssl_howto.html.

I also want the http:// site to automatically redirect to https://, so I searched for [ apache redirect ssl ]. That got me to https://wiki.apache.org/httpd/RedirectSSL.

I also want to use Strict Transport Security, so I searched for [ apache hsts ]. That got me to https://www.owasp.org/index.php/HTTP_Strict_Transport_Security.

To figure out the certificate chain business, I searched for [ apache server intermediate certificate ], and found http://www.digicert.com/ssl-certificate-installation-apache.htm. You need to serve both the end-entity (EE) certificate and any intermediate certificates that allow the EE to chain up to the root. As you’ll recall from the sslmate output, it got me the EE, the intermediate, and a file containing both.

<VirtualHost *:80>
  ServerName nonfreesoftware.org
  ServerAlias www.nonfreesoftware.org
  ServerAdmin webmaster@nonfreesoftware.org
  DocumentRoot /var/www/html
  LogLevel info ssl:warn
  ErrorLog ${APACHE_LOG_DIR}/error.log
  CustomLog ${APACHE_LOG_DIR}/access.log combined
  Redirect permanent / https://nonfreesoftware.org/
</VirtualHost>
<VirtualHost *:443>
  ServerName nonfreesoftware.org
  ServerAlias www.nonfreesoftware.org
  ServerAdmin webmaster@nonfreesoftware.org
  DocumentRoot /var/www/html
  LogLevel info ssl:warn
  ErrorLog ${APACHE_LOG_DIR}/error.log
  CustomLog ${APACHE_LOG_DIR}/access.log combined
  SSLEngine On
  SSLCertificateFile /etc/ssl/certs/www.nonfreesoftware.org.crt
  SSLCertificateChainFile /etc/ssl/certs/www.nonfreesoftware.org.chain.crt
  SSLCertificateKeyFile /etc/ssl/private/www.nonfreesoftware.org.key
  Header add Strict-Transport-Security "max-age=1576800"
  SSLProtocol all -SSLv2
  SSLCipherSuite HIGH:!aNULL:!MD5
</VirtualHost>

That config required me to enable mod_ssl (for SSL/TLS), mod_socache_shmcb (for TLS session caching), and mod_headers (for the HSTS header directive):

/etc/apache2/mods-enabled # ls -l
[...]
lrwxrwxrwx 1 root root 26 Aug 20 19:42 ssl.conf -> ../mods-available/ssl.conf
lrwxrwxrwx 1 root root 26 Aug 20 19:42 ssl.load -> ../mods-available/ssl.load
lrwxrwxrwx 1 root root 36 Aug 20 19:43 socache_shmcb.load -> ../mods-available/socache_shmcb.load
lrwxrwxrwx 1 root root 30 Aug 20 19:57 headers.load -> ../mods-available/headers.load

Set-up complete

Time: 12:46. So, it took me about 90 minutes, including breaks and a technical mishap, to go from 0 to a running HTTPS web server with a valid certificate.

I spent another 90 minutes or so writing and editing this blog post. (I’m a slow writer.)

Conclusion

I spent $38 for the domain; $16 for the certificate; a small, ongoing cost (I don’t know how much yet) for the EC2 instance; and 90 minutes of my time. The certificate is probably the cheapest item, or perhaps on par with my t2.micro server instance for a few months or the year.

Especially given the fact that I am a software engineer — not a good operations engineer/systems administrator, as I think my command lines and configuration files prove — I think it’s fair to say that any technical person who can set up an HTTP server can also set up an HTTPS server with only marginal additional cost and in only marginally more time.

Sure, sslmate could be more perfect, and hopefully nobody will run into the .deb problem again. Also, perhaps the sslmate.com people should update their documentation to show the exact commands you have to run to launch the Node JS script. But avoiding those mishaps would have saved me 20 minutes at most.

Also, there are surely better Apache configurations — I haven’t used elliptic curve ciphers, I haven’t performance-tuned it all (including a good configuration for text compression), and so on. But it works, it’s reasonable, and it took 90 minutes from start to finish. I think that’s enough of an answer to people who believe HTTPS is too difficult or too expensive to set up.

Why not use HTTPS for noncombatant.org?

An excellent question! I’m glad you asked. I am using hosted WordPress, and unfortunately they do not offer HTTPS for custom domains. I asked a tech support representative about it, and she said they may offer it in the future but do not now. They definitely could, as a technical matter, but until Windows XP dies (its Internet Explorer does not support SNI), it’s difficult given WordPress’ multi-tenant deployment.

So, I have to choose between having WordPress manage their software for me, and having a securely-transported blog. I definitely don’t want to manage WordPress myself on my own server, but perhaps I could move this blog to nonfreesoftware.org running some simpler blogging platform or as a static set of pages.

An Update

I just got an email from sslmate.com saying that they have now made packages.sslmate.com work over HTTPS, too. So that’s nice.

#ValueMusic 10 August 2014: Tumi Mogorosi

Yep, it’s time for another installment of #ValueMusic.

Tumi Mogorosi, Project Elo (Jazzman 2014): I love everything about this. South African drummer and bandleader Tumi Mogorosi brings a wide range of elements together to craft an instant classic: accessible yet ear-stretching, new but grounded in the classics. It’s another facet of what A Love Supreme made possible: this time with a choir, and a guitar.

This is a good album to get on vinyl; it has a huge dynamic range, which the bastardized “Mastered For iTunes” format would surely destroy. Mastering for vinyl typically (but by no means necessarily) involves less dynamic range compression — a key reason I am interested in buying new music on vinyl. You have to crank this record all the way up, and you’ll be rewarded with knowledge of every twitch of the bass player’s finger.

The arrangements are complex yet feel natural; every voice is in balance and in time. There is a risk that egg-head music like this will wallow in contrivance and lose sight of its inspiration, but Mogorosi avoids that fate and delivers a vibrant vision that feels classic yet new at every moment.

Credits: Tumi Mogorosi: drums; Thembinkosi Mavimbela: double bass; Sibusile Xava: guitar; Nhanhla Mahlangu: tenor sax; Malcom Jiyane: trombone; Themba Maseko: voice; Ntombi Sibeko: voice; Mary Moyo: voice; Gabsile Motuba: voice. All tracks composed by Tumi Mogorosi. Lyrics written by Gabisile Motuba and Tumi Mogorosi. Cover illustration: Kevin Neireiter; design: Andrew Symington.

#ValueMusic 3 August 2014: Metal Roundup!

This post covers my #ValueMusic goodies from last week. I’m not lazy; I was just hung up on finishing the giant post on security first.

I got 3 vinyl records at Amoeba this weekend, in various delicious flavors of metal!

Ghoul, Noothgrush/Coffins, and Arch Enemy

Ghoul, Hang Ten EP (Tankcrimes, 2014): In the tradition of Gwar and Goblin Cock, Ghoul plays curiously well-crafted goof-metal. The production is excellent, with absurdly thick guitars that don’t obscure the tight bass and drum performances. Largely instrumental, the 6 tunes tell a tale of gruesome heroism in a Mad Max-style dystopia. From the liner notes:

The state apparatus had been torn limb from limb by the Mezmetron-addled horde. The last of the missing Basilisk’s security detachments were forced into hiding along with their leader, Commandant Dobrunkum. The streets were lawless. Out of the chaos thundered The Midnight Ride of the Cannibals MC.

High points: “Hang Ten” and “It Was A Very Good Year”.

Credits:

R.A. MacLEAN as KREEG and TONY FORESTA as HENCHMAN #1 with DIGESTOR, CREMATOR, FERMENTOR, DISSECTOR and THE MORON CAVERN-SHACKLED CHOIR featuring GABE GALVER and R. LAWRENCE DiBARI with special narration by PETER SVOBODA.

Recording and Mixing: Salvador Raya at Earhammer Studios; Mastering: Dan Randall at Mammoth Sound Mastering; Cover Art: Sean Äaberg; Layout: Doktor Sewage; All music digested, cremated, fermented, and dissected by Ghoul except “It Was A Very Good Year” by Ervin Drake.

Noothgrush/Coffins split LP (Southern Lord, 2013): This record is a good introduction to 2 venerable doom/sludge-metal bands and to the genre(s) in general. Noothgrush opens the LP with classic Black Sabbath guitar and drum sounds: huge and dark. Although doom and gloom are the name of the game in this genre, Noothgrush are not above a little fun weirdness, including a sample of the Tusken Raiders cheer from Star Wars in “Jundland Wastes”, and a loopy monologue in “Thoth”. The band is, after all, named after a Dr. Seuss character. High point: “Humandemic”.

Credits: Dino Sommese: vocal; Russ Kent: guitar; Gary Niederhoff: bass; Chino Nukaga: drum. Recorded and mixed at Earhammer; mastered by Bob Boatright; artwork by Josh Graham.

Coffins take side B with awesome feedback, a ridiculously sludgey guitar tone, and a drummer who sounds like he has as large a cymbal collection as Tomas Haake. High point: “Drown In Revelation”.

Credits: Uchino: guitars/vocal; Koreeda: bass/vocal; Ryo: vocals; Satoshi: drums. Recorded and mastered at Noise Room; engineered by Shigenori Kobayashi; produced by Uchino and Coffins; all songs written by Uchino; all lyrics written by Ryo.

Arch Enemy, War Eternal (Century Media, 2014): Although Wages Of Sin is still my favorite Arch Enemy album, that’s probably just nostalgia. Arch Enemy records always have perfect production, perfect performances, and songwriting in the classic model of Iron Maiden. Even when Alissa White-Gluz is screaming her face off, which is most of the time, there’s always something to sing along to.

This is the first Arch Enemy record with new vocalist White-Gluz, who replaces the irreplaceable Angela Gossow, and Nick Cordle, who replaces Christopher Amott on second guitar. They’re both every bit as good as their predecessors, even reinvigorating.

For some reason, this is a “bootleg” vinyl, numbered from a set supposedly of 500, minimally packaged, and on transparent green vinyl. If it’s a marketing gimmick, it’s corny; I’d rather have spent the not-small sum on a properly-packaged LP with download codes.

High points: “Stolen Life” and “Time Is Black”; although the whole album is consistently solid.

Credits: Alissa White-Gluz: lead vocals; Michael Amott: lead and rhythm guitar, keyboards; Nick Cordle: lead and rhythm guitar; Sharlee D’Angelo: bass guitar; Daniel Erlandsson: drums. Per Wiberg: mellotron; Henrik Janson: orchestration, string arrangements; Ulf Janson: additional keyboards, orchestration, string arrangements; Stockholm Session Strings: strings. Produced by Arch Enemy; mixed and mastered by Jens Bogren; Staffan Karlsson, Nick Cordle, Daniel Erlandsson, Johan Örnborg, Linn Fajal: engineering. Costin Chioreanu: artwork, layout; Patric Ullaeus: photography.

Security As A Class Of Interface Guarantee

This post is an attempt to pin down my intuition that an “interface”, broadly defined, can be a productive conceptual frame for a wide variety of security problems and solutions. I can’t promise that this post makes total sense; it’s just thinking out loud at this point.

There are many ways to understand software security engineering. One (all-too-)prevalent view is of security as a cat-and-mouse game: by hook or by crook, any little thing you can do to attack or avoid being attacked counts as “security engineering”. Especially for defenders, this view leads directly to failure. It’s analogous to micro-optimizing a fragment of code (a) before profiling it to see if it’s really a hot spot; (b) without testing to see if the micro-optimizations help or hurt; and (c) without any quantified performance target.

For example, consider a web application firewall (WAF). People often buy these to “secure” their web applications, saying things like, “Hey, even if the web application is well-engineered, belt and suspenders, right?! Belt and suspenders!” But ask: How much does the WAF cost to buy? How much does it cost to install, configure, and run? Who looks at its logs and reports, and how much does that person’s time cost? (Don’t forget the opportunity cost.)

How does the WAF affect the application’s performance and reliability? Possibly not well.

How much attack surface does the WAF itself create and expose? Often, a WAF can create significant new risk. I once found an XSS vulnerability in a web application, and ran a demonstration exploit so I could document that it worked. No big surprise there. After a while, a guy came up to me and said he was the WAF operator for that app, and did these weird pop-ups he kept seeing have anything to do with my security testing? I didn’t even know the app was (supposedly) being protected by a WAF, but I had accidentally exploited both the app and the WAF in one shot.

A correct WAF configuration is equivalent to fixing the bug in the original application. Why not just do that?

I want to forget all about both belts and suspenders; instead, I want to buy pants that actually fit.

A note on terminology, for this blog post:

  • I’ll use the term interface to mean any of: user interface, programming language syntax and semantics, in-process API, system call, RPC and network protocol, or ceremony.
  • I’ll use guarantee to include design contracts with explicit non-guarantees.
  • I’ll use caller to mean any of: human programmer, human user, call-site in source code, or requesting network protocol peer.
  • A callee is a person who receives a message (e.g. an individual or the operator of a remote service), an API or library implementation or other in-process called function, or an RPC or network protocol respondent.
  • An interface definition is any programmatic function signature (including identifiers and type annotations), type semantics, visual semiotics of a GUI or CLI, et c. that attempts to communicate the meaning and guarantees of the interface to callers.
  • The primary interface definition is the immediately accessible surface of the interface itself, e.g. a function or method declaration, an IDL specification or other code generation/specification system for network protocols, the grammar of a programming language, or a user-facing GUI or CLI.
  • A secondary interface definition is supplementary material: usually documentation, annotation, post-facto errata, entries in issue trackers, commit log messages, et c.

Security Is Part Of Every Interface

I prefer to think of security as a class of interface guarantee. In particular, security guarantees are a kind of correctness guarantee. At every interface of every kind — user interface, programming language syntax and semantics, in-process APIs, kernel APIs, RPC and network protocols, ceremonies — explicit and implicit design guarantees (promises, contracts) are in place, and determine the degree of “security” (however defined) the system can possibly achieve.

Design guarantees might or might not actually hold in the implementation — software tends to have bugs, after all. Callers and callees can sometimes (but not always) defend themselves against untrustworthy callees and callers (respectively) in various ways that depend on the circumstances and on the nature of caller and callee. In this sense an interface is an attack surface — but properly constructed, it can also be a defense surface.

Here are some example security guarantees in hypothetical and real interfaces:

  • The function bool isValidEmailAddress(String address, Set knownTLDs) returns true if the email address is syntactically valid for SMTP addresses according to RFC 3696, and if the domain part is in a known top-level domain.
  • All array accesses are checked at run time; an attempt to use an index that is less than zero or greater than or equal to the length of the array causes an ArrayIndexOutOfBoundsException to be thrown. (From the Java Language Specification.)
  • DNS queries and responses can be read, copied, deleted, altered, and forged by an attacker on any network segment between client and server.
  • Within a single goroutine, the happens-before order is the order expressed by the program. (From the Go language documentation.)

The Interface Perception Gap

The true technical security guarantee that an interface’s implementation provides is not necessarily the same as the guarantee the caller perceives. I’ll call this the interface perception gap, for lack of a less-awful term. The gap could exist for many reasons, including at least:

  • the guarantee is implicit (i.e. not in the interface definition)
  • the guarantee is explicit, but the caller did not read or understand the interface definition
    • possibly because the interface definition is too complex for the caller to understand
    • possibly because the guarantee is not in the caller’s mental model of the interface or of the caller’s own requirements
  • the interface misuses terms in its own definition
  • the interface definition is so poor that the caller must imagine their own implicit definition

Gaps in contracts tend, over time, to become implicit guarantees and non-guarantees. It can be possible to assert new technical guarantees in the gaps. Consider address space layout randomization (ASLR). The executable loaders of operating systems never specified the precise location in memory of the program text, heap, stack, libraries, et c.; this freed up implementors to randomize those locations to thwart exploit developers, cat-and-mouse style. When it was invented, ASLR was a decent way to buy some time (a couple years at most) for the authors of programs written in unsafe languages to fix their bugs or port to safe languages. However, it was never going to be possible for ASLR to fully solve the problems of unsafe languages, for many reasons, including at least:

  • ASLR was a new technical guarantee retrofitted into the interface perception gap of pre-existing executable loaders that had to be compatible with existing code, and thus not all program components could be randomized with a high degree of entropy.
    • And ASLR is an all-or-nothing affair: If the attacker can reliably locate any executable code, they can almost certainly find gadgets useful for exploitation there.
  • Programs generally must be recompiled with new options, or at least with old options previously thought of as being exclusively for dynamically-loadable library code — that is, there wasn’t enough of a perception gap in the toolchains’ interfaces! As a result, the guarantee of ASLR is still not ubiquitous, more than a decade later.
  • Many program errors are still exploitable due to the limited granularity of what program parts can be efficiently randomized — there is an implicit guarantee of run-time efficiency that extreme ASLR could violate.
    • Sometimes even coarse-grained ASLR violates certain extreme performance requirements.
  • In applications that give attackers significant but not directly malicious control over run-time behavior — for example, as any dynamic programming environment like a web browser must do — the attacker can significantly reduce the effective entropy of ASLR, thus weakening the already-weak guarantee.
  • Previously low-severity bugs, like single-word out-of-bounds read errors, become information leaks that can undo all the benefits of ASLR and enable an attacker to craft a reliable exploit. The implied ‘interface’ of an out-of-bounds read primitive changes: an OOB read should be guaranteed not to happen, but its de facto ‘guarantee’ changes from “likely possible but mostly harmless” to “there goes ASLR… now all those ROP exploits are back in scope.” Oops.

Perhaps because ASLR was not (to my knowledge) clearly documented as a temporary cat-and-mouse game, engineers have come to rely on it as being the thing that makes the continued use of unsafe languages acceptable. Unsafe (and untyped) languages will always be guaranteed to be unsafe, and we should have used the time ASLR bought us to aggressively replace our software with equivalents implemented in safe languages. Instead, we linger in a zone of ambiguity, taking the (slight) performance hit of ASLR yet not effectively gaining much safety from it.

Sometimes, interface perception gaps are surfaced, and the interface and implementation change to close the gap. A classic example is the denial-of-service problem in hash tables: If an attacker can influence or completely control the keys of the pairs inserted into a hash table, they can cause the performance to degrade from the (widely perceived — but usually explicitly disclaimed!) ~ O(1) performance guarantee for hash table lookup. Defenders can either explicitly claim the performance guarantee by randomizing the hash function in a way the attacker cannot predict, or (if they specified a more abstract interface) switch to an implementation (such as a red-black tree) that does not suffer from the problem.
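
As a sketch of the first option, imagine wrapping untrusted keys so that their hash depends on a secret chosen at process start. (Illustration only: a real implementation would use a fast keyed hash such as SipHash rather than HMAC, and cache the result; the point is just that the hash is unpredictable to anyone who lacks the per-process key.)

import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch: a map key wrapper whose hashCode is a keyed hash of the underlying
// string, so an attacker who does not know the per-process key cannot
// precompute colliding keys.
final class RandomizedKey {
    private static final byte[] PROCESS_KEY = new byte[16];
    static {
        new SecureRandom().nextBytes(PROCESS_KEY);
    }

    private final String value;
    private final int hash;

    RandomizedKey(String value) {
        this.value = value;
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(PROCESS_KEY, "HmacSHA256"));
            byte[] digest = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            // Fold the first 4 bytes of the MAC into an int for hashCode.
            this.hash = ((digest[0] & 0xff) << 24) | ((digest[1] & 0xff) << 16)
                    | ((digest[2] & 0xff) << 8) | (digest[3] & 0xff);
        } catch (GeneralSecurityException e) {
            throw new AssertionError(e); // HmacSHA256 is required on all Java platforms
        }
    }

    @Override public boolean equals(Object other) {
        return other instanceof RandomizedKey
                && value.equals(((RandomizedKey) other).value);
    }

    @Override public int hashCode() {
        return hash;
    }
}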

The Importance Of Explicit Guarantees

The technical strength of a security mechanism is limited when it is not backed by an explicit contract. Explicit, understandable, tested, and enforced guarantees, which could reasonably fit into the caller’s mental model, are best.

A guarantee that is not also perceived by its callers is limited in effectiveness. Consider an interface for a map data structure: If the implementation is guaranteed to be a sorted tree, callers can trust that they can iterate over the keys in sorted order without having to do any extra work. But if they don’t understand that part of the interface definition, they might mistakenly waste time and space by extracting all the keys into an array and pointlessly re-sorting it. The problem is reversed if the interface is explicitly defined to be (say) a hash table, but the caller does not realize that.
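
A trivial Java illustration of that gap:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;

// The perception gap around a map's ordering guarantee.
final class MapOrderingExample {
    static void iterateInOrder(SortedMap<String, Integer> counts) {
        // The interface (SortedMap) guarantees sorted iteration; no extra work.
        for (String key : counts.keySet()) {
            System.out.println(key);
        }
    }

    static void iterateInOrderWastefully(SortedMap<String, Integer> counts) {
        // A caller who doesn't perceive the guarantee pays for a pointless re-sort.
        List<String> keys = new ArrayList<>(counts.keySet());
        Collections.sort(keys);
        for (String key : keys) {
            System.out.println(key);
        }
    }
}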

Similarly, a security guarantee that callers do not perceive — but which is present — can cause callers to miscalculate their risk as being higher than it is. While it might seem that is OK, because callers will “err on the side of caution”, in fact the misperception can have an opportunity cost. (In a sense, a self-denial-of-service.)

A non-guarantee that is not perceived can also become dangerous. For example, although documentation explicitly disclaims it, users often perceive that programs can maintain (e.g.) confidentiality for the user’s data even when the underlying platform is under the physical control of an attacker. Such an attacker’s capabilities tend to be well outside the users’ mental models; and in any case, documentation (a secondary interface definition) is a poor substitute for a user-visible interface definition in the GUI (a primary definition).

Interface misperceptions are sometimes widely or strongly held, and can become implicit or even explicit guarantees, and can force brittleness or even breakage into the interface. As an extreme example, consider the User Account Control feature introduced in Windows Vista. After it was released, Microsoft published a blog post (a secondary interface definition) and tried to roll back the expectations that callers developed when reading the primary definitions (the GUI and aspects of the API):

It should be clear then, that neither UAC elevations nor Protected Mode IE define new Windows security boundaries. Microsoft has been communicating this but I want to make sure that the point is clearly heard. Further, as Jim Allchin pointed out in his blog post Security Features vs Convenience, Vista makes tradeoffs between security and convenience, and both UAC and Protected Mode IE have design choices that required paths to be opened in the IL wall for application compatibility and ease of use.

Perhaps the core problem with UAC, Integrity Levels, and User Interface Privilege Isolation is that one interface, the security principal (in Windows, represented by the access token), is too hard to compose with another interface: the traditional multi-process/single principal windowing environment for presenting user interfaces. Modern platforms require a 2-part security principal (see the Background section in that document), composable with a user interface paradigm that allows users to distinguish the many cooperating principals. (Consider the EROS Trusted Windowing System as an example alternative.)

Don’t Imagine Interfaces Or Guarantees

At the beginning of this blog post, I poked a little fun at WAFs. Making fun of WAFs is traditional picnic banter in my tribe (application security engineers), so I feel it is only fair to put a little sacred cow hamburger on the grill, too. Here are 2 examples.

Constant-time array comparison to defeat timing side-channel attacks. Consider for example the HMAC defense against CSRF: token = HMAC_SHA256(secret_key, session_token + action_name). It should be computationally infeasible for the attacker to ever guess or learn the token value, but a timing side-channel, such as that introduced by a naïve byte array comparison, allows the attacker to guess the token in a feasible amount of time and number of attempts (proportional to N = the number of bits in the token). A canonical solution is to use an array comparison function that always takes the same amount of time, rather than returning as soon as it finds a mismatch.
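
The canonical version looks something like this in Java (roughly what java.security.MessageDigest.isEqual has done in recent JDKs):

// Accumulate the differences instead of returning at the first mismatched
// byte, so the loop's duration depends only on the length of the inputs.
final class ConstantTime {
    static boolean equals(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false; // leaking the length is usually considered acceptable
        }
        int difference = 0;
        for (int i = 0; i < a.length; i++) {
            difference |= a[i] ^ b[i];
        }
        return difference == 0;
    }
}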

The trouble with this is that, apart from the code being slightly subtle, there is no interface guaranteeing that the code will indeed take the same amount of time on all inputs. Several things are permissible, given the documented interfaces between the programmer and the ultimate execution context:

  • the compiler might find a way to optimize the function;
  • the CPU’s XOR instruction might not take the same amount of time to compute all inputs; or
  • the machine (real, or virtual!) might even transform and optimize the code before running it.
    • For example, some processor cores accept code from one instruction set as input, but transform it to another instruction set before running it in the processor core.

Does the expected timing guarantee still hold, given these interfaces and their non-guarantee? As Lawson says, the solution is fragile and you have to test it every time the execution environment changes.

An additional, essentially fatal problem is that many real-world applications are implemented in very high-level languages like Python and Java, where there are even more layers of abstraction and therefore even less of a constant-time interface guarantee.

An alternative solution, which I learned from Brad Hill, is to forget about trying to run in constant time, and instead to blind the attacker by making what timing information they learn useless. Rather than directly comparing the timing-sensitive tokens (say, SAML blob signatures or CSRF tokens), HMAC the received blob and the expected blob again (with a new, separate HMAC key), and then compare those HMAC outputs (with any comparison function you want, even memcmp). The attacker may indeed observe a timing side-channel — but the timing information will be random relative to the input. This is due to the straightforward, documented, and tested interface guarantee of the HMAC function as a pseudo-random function. And it works as expected in any language, on any computing substrate.
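
A sketch of that, again in Java; the randomly-keyed HMAC is what makes any leaked timing information useless:

import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Blinded comparison sketch: re-MAC both values under a key the attacker does
// not know, then compare the MACs however you like. Timing now depends on the
// unpredictable MAC outputs, not on the secret inputs.
final class BlindedComparison {
    static boolean equals(byte[] received, byte[] expected) throws GeneralSecurityException {
        byte[] blindingKey = new byte[32];
        new SecureRandom().nextBytes(blindingKey);
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(blindingKey, "HmacSHA256"));
        byte[] a = mac.doFinal(received); // doFinal resets the Mac for reuse
        byte[] b = mac.doFinal(expected);
        return Arrays.equals(a, b); // even a short-circuiting compare is fine now
    }
}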

Consider another cryptography-related security conundrum: the supposed need to clear secrets from RAM when the secrets are no longer needed, or even to encrypt the RAM (presumably decrypting it in registers?). This is supposed to ensure that live process RAM never hits the disk (as in e.g. swap space), nor is available to an attacker who can read the contents of RAM. The usual threat scenario invoked to warrant this type of defense is that of a physically-local forensic attacker, usually of relatively high capability (e.g. capable of performing a cold boot attack or a live memory dump). The goal is to not reveal secrets (e.g. Top Secret documents, passwords, encryption keys, et c.) to such an attacker.

The trouble with this goal is that there can be no interface guarantee that clearing memory in one area will fully erase all copies of the data. The virtual memory managers of modern operating systems, and the dynamic heap allocators of modern language run-times, in fact guarantee very little in the way of memory layout or deterministic behavior. Instead they provide guarantees of more-or-less high performance, which additional security guarantees could complicate or render infeasible.

  • If you realloc memory, the userland run-time or the kernel might make a copy that you can no longer reliably reference (so you can’t reliably clear it).
  • When you free memory, the kernel might not zero the pages out until the last second before giving them to the next requestor. Thus, the time window in which they are prone to discovery by the forensic attacker increases.
  • Kernel APIs like mlock, which purport to lock memory into physical RAM pages (stopping the pages from being swapped out to disk), do not necessarily work the way you expect, or even at all.
  • In a garbage-collected run-time, essentially any amount of copying, moving, and reallocating is possible. There can be no guarantee that a piece of data is stored in exactly 1 location in RAM, and that you can clear it.
  • The same holds for virtual machines, of course.

Essentially, there can be no guarantee that a high-capability forensic attacker cannot find secrets in RAM or swapped-out process memory; the more complex the operating system and run-time, the less likely it is that you can even probabilistically defeat such an attacker.

The most you can realistically do in the general case is mitigate the problems with full disk encryption and whatever degree of physical security you can get. In specific cases, such as cryptographic keys, you can keep the keys in a tamper-resistant, tamper-evident hardware security module.

“Conclusion”

This post is partly an attempt to investigate why the “security vs. convenience” dichotomy is false. I think it’s worse than a false dichotomy, really; it’s a fundamental misconception of what security is and of what an interface is — and of what “convenience” (an impoverished view of usability) is.

But also it’s an attempt to re-frame security engineering in a way that allows us to imagine more and better solutions to security problems. For example, when you frame your interface as an attack surface, you find yourself ever-so-slightly in a panic mode, and focus on how to make the surface as small as possible. Inevitably, this tends to lead to cat-and-mouseism and poor usability, seeming to reinforce the false dichotomy. If the panic is acute, it can even lead to nonsensical and undefendable interfaces, and a proliferation of false boundaries (as we saw with Windows UAC).

If instead we frame an interface as a defense surface, we are in a mindset that allows us to treat the interface as a shield: built for defense, testable, tested, covering the body; but also light-weight enough to carry and use effectively. It might seem like a semantic game; but in my experience, thinking of a boundary as a place to build a point of strength rather than thinking of it as something that must inevitably fall to attack leads to solutions that in fact withstand attack better while also functioning better for friendly callers.

The safest interface is still no interface — don’t multiply interfaces unnecessarily. But when you must expose something, expose a well-tested shield rather than merely trying to narrow your profile or hide behind a tree.

And Now, Your Moment Of Zen

https://twitter.com/rootkovska/status/498227129969295360

Concise Java

Updated: See bottom of post!

While I do love Java, I have never liked the Java culture of writing verbose, new-happy, Design Patterns-addled code. It’s painful to read, hard to use, slow, and despite all the design patterning, often surprisingly un-general and un-composable.

First, let’s look at the unnecessary verbosity. Found in the Mitro codebase:

public final class RPC {
  public static class LoginToken {
    public String email;
    public long timestampMs;
    public String nonce;
    public boolean twoFactorAuthVerified=false;
    public String extensionId;
    public String deviceId;
  }

But this works just as well:

public final class RPC { 
    static class LoginToken {
        String email;
        long timestampMs;
        String nonce;
        boolean twoFactorAuthVerified;
        String extensionId;
        String deviceId;
    }

The default scope is package scope for a reason! It’s actually usually what you want. And fields are born with well-defined initial values — no need to initialize primitives to 0 or reference types to null.

Consider a more complex example, also from Mitro: java/server/src/co/mitro/core/util/Random.java, a class to generate securely random alphanumeric strings.

public class Random {
  private static int fillCharRange(char start, char endInclusive, char[] output, int index) {
    for (char c = start; c <= endInclusive; c++) {
      output[index] = c;
      index += 1;
    }
    return index;
  }

  protected static char[] makeAlphaNumChars() {
    char[] output = new char[26+26+10];
    int index = fillCharRange('a', 'z', output, 0);
    index = fillCharRange('A', 'Z', output, index);
    index = fillCharRange('0', '9', output, index);
    assert index == output.length;
    return output;
  }

  private static final char[] ALPHANUM = makeAlphaNumChars();

  // caches SecureRandom objects because they are expensive
  private static final ConcurrentLinkedQueue<SecureRandom> RNG_QUEUE =
      new ConcurrentLinkedQueue<SecureRandom>();

  /** Returns a secure random password with numChars alpha numeric characters. */
  public static String makeRandomAlphanumericString(int numChars) {
    SecureRandom rng = RNG_QUEUE.poll();
    if (rng == null) {
      // automatically seeded on first use
      rng = new SecureRandom();
    }

    StringBuilder output = new StringBuilder(numChars);
    while (output.length() != numChars) {
      // nextInt()'s algorithm is unbiased, so this will select an unbiased char from ALPHANUM
      int index = rng.nextInt(ALPHANUM.length);
      output.append(ALPHANUM[index]);
    }

    RNG_QUEUE.add(rng);
    return output.toString();
  }
}

With less code, we can write a more general, better-documented, and more performant class:

public class RandomString {
    static final char[] Alphanumerics = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789".toCharArray();
    static final char[] Alphabetics = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".toCharArray();
    static final char[] Numerics = "0123456789".toCharArray();
    static class Lazy {
        static final SecureRandom Random = new SecureRandom();
    }

    /**
     * @param characterSet The set from which to randomly select characters.
     * @param length The length of the array to create.
     *
     * @return A securely-random character array.
     */
    public static char[] generate(char[] characterSet, int length) {
        char[] output = new char[length];
        generate(output, characterSet);
        return output;
    }

    /**
     * @param output The array to fill with random characters.
     * @param characterSet The set from which to randomly select characters.
     */
    public static void generate(char[] output, char[] characterSet) {
        for (int i = 0; i < output.length; i++) {
            output[i] = characterSet[Lazy.Random.nextInt(characterSet.length)];
        }
    }
}

The new version allows the caller to specify their own character set, provides 3 handy built-in character sets with build-time optimizable construction, allows the caller to choose between callee-allocates and caller-allocates versions (reducing news), and uses no unnecessary scope specifiers. I replaced the single-item ConcurrentLinkedQueue with a lazy initializer as a lagniappe. :)

Like C and C++, Java doesn’t have to be awful. Like C and C++, it can even be kind of nice. Cultural factors dominate language-technical ones.

Update: Thanks to Bruce Leidl for pointing out that, in this tiny class whose few methods all touch the SecureRandom object, the lazy initialization pattern is unnecessary. The Java class loader loads all classes lazily, and until a caller actually calls RandomString.generate, the SecureRandom will not be new’d. (I always want to save a new if I can, since I come from C World.) Thus the lazy initializer pattern (itself relying on the fact that the class loader also loads inner classes lazily) is overkill here — indeed, it’s bloat. :) Better code:

    static final SecureRandom Random = new SecureRandom();
    // ...
    public static void generate(char[] output, char[] characterSet) {
        for (int i = 0; i < output.length; i++) {
            output[i] = characterSet[Random.nextInt(characterSet.length)];
        }
    }

#ValueMusic 30 July 2014: Kodacrome’s Aftermaths

I’ve been longing for a full-length release from Kodacrome ever since their too-short Perla EP. The combination of Elissa’s cool and collected vocals with her and Ryan’s clean and warm production is golden. Aftermaths is understated, moody, chill, electronic pop with just enough surprising head-turns. The record is minimal yet full-sounding; the minimalism serves to clarify the musical ideas and showcase the excellent production, rather than being an affectation to hide behind.

I haven’t listened to the digital release yet, but the vinyl version really rewards high volume. The record is so spare, but you find yourself swimming in it. Awesome.

Every song in Aftermaths is a solid hit, but for me particular high points are: the spooky “Buggy Bumper”, the beautiful vocal melody in “Strike The Gold”, the harmony lesson in “Panama”, and the dramatic sound design in “Solitary And Elect”. The wonderful video singles are “Immaculada” and “Strike The Gold”.

Buy everything by Kodacrome on Bandcamp, and enjoy their YouTube channel!

My Curvy Career Trajectory

Things have been fortuitously random in my life. My first love was music, and I would like to have made it my career. But, my body needs a lot of health care, and music is no way to make money. When it came time for college, I was firmly dissuaded from majoring in music. (I would probably have made a better composer than performer, but now we’ll never know…)

So I turned to my second love, language. In high school I took AP French and did a lot of German, and had started to read books on linguistics. So I ended up double-majoring in linguistics and in French language and literature, and almost completed a minor in Latin (I got burned out). Clearly, these were not much better options for career maximization. My mom thought I could be a translator. (I did profoundly love the 2 courses on translation and stylistics, and the excellent professor who taught them. I loved them all.)

However, I did take 2 quarters of computational linguistics classes, and I fell in love with the homework assignments which consisted of writing simple Lisp programs to parse, generate, and manipulate corpora in tiny toy grammars. Of huge importance was the mentorship of two friends who had spent their childhoods learning to program while I had been practicing guitar and learning music theory. After classes on Wednesday nights we drank a lot of pizza and ate a lot of beer, and I soaked up as much as I could about Unix, discrete math, Perl, and C. I probably missed a lot.

I had also taken a job in the foreign-language building’s computer lab doing help desk and webmastering. I thereby discovered the Constructed Languages mailing list, and found myself trying to understand the Perl script someone posted to generate made-up words, and then Word Net too. I didn’t yet know what to do with it, but I knew it was going to be cool.

Then I moved to the multi-media computer lab and forced myself to use Linux as my daily computer. (A Pentium 75 running Red Hat 4.2! It might have had as many as 32 MiB of RAM…) I got addicted to its quirks, and started trying to understand C code and shell scripts. Then I bought a used and already-obsolete NeXT Color Turbo for home — the first computer of my very own, and I loved it as much as I loved my first decent guitar (a Washburn D-10 that I sold a few years back to a 14-year-old girl (same age I was when I got it) whose mom was as anxious as mine was about this kid who was about to waste their life on rock and roll). On that NeXT I learned Perl, I attempted to learn C, and I stayed up until 4 AM reading the manual pages.

And when I finally graduated from college, I was just barely employable as a web developer.

A paranoid web developer. I almost wonder if my near-total ignorance of programming contributed to my interest in security engineering. I had learned enough in college to come to believe that the difference between a global library of unrestrained, free reading and conversation, and a globally-connected Panopticon, was cryptography (not that I understood anything about it). That piqued some cranky interest. And then I joined the BUGTRAQ mailing list (the Full Disclosure of its day) and realized just how many awful security vulnerabilities I had authored just that month. Then I was really hooked.

When I finally moved out to San Francisco — in the depths of the dot-com crash, May 2001 — I got another web app development job and started focusing in earnest on learning more programming languages (Python and Java!) and on security in general and OpenBSD in particular. When the long-moribund dot-com stopped paying us on time, I got lucky and took a chance to work for much less money (but consistent money!) at the Electronic Frontier Foundation as the systems administrator. I managed to parlay that into a position as Staff Technologist, and then as Technology Manager; and by that point I had learned enough about security to move to a job as a security engineering consultant at iSEC Partners.

iSEC opened my eyes to a lot of insanity and hilarity. It was at iSEC that I got my real security engineering education. By a few years in I had developed a pretty serious eye-twitch, and a nervous tic: I couldn’t leave the house without checking, like 5 times, that the doors were locked. Sometimes I would circle the block, come back, check again. This, even though I knew that standard home door locks are trivially bump-keyed.

It was a completely irrational response to the (now rather banal) knowledge that our entire economy and society could collapse at any moment. It was all good, though; about this time I built a guitar from parts and gigged regularly with 2 bands.

Through iSEC I had been doing a lot of work at Google, mostly on the new Android operating system, and I had come to love the Google and Android engineers. Eventually, a good friend press-ganged me into coming to work at Google on Android, and my new boss was the rightfully legendary Dianne Hackborn. Even more education was had, by me, at that time.

Then I went back to the EFF, this time as Technology Director. It was a good opportunity to lead some technology projects there and assist on some of the litigation they do. In that role I learned that although I am a cat, I am not a cat herder. Perhaps that is obvious, but it wasn’t yet obvious to me. Soon I was enticed back to Google, this time with the Chrome Security team.

Best job ever.

The most important thing, what enabled me to grow and learn as quickly as I could, was community — teachers, mentors, learning-friends. Just by luck and random encounters I’ve had several truly excellent music teachers, great French and Latin teachers at all levels, linguistics friends who were also Unix wizards, great managers, great team-mates, great engineering role models. My worst times were times when I was disconnected and had no community — I moved much more slowly than I could have when I first moved to SF, and things picked up for me as I found my peeps. Peeps make it happen.

Why I Love Java

…it’s not just because I love being a contrarian.

For background see jwz’s delightful rant about Java. When I say I love “Java”, I mean only the language and perhaps sort of the security model. The standard library and the virtual machine are not what I mean. (They are truly awful.)

Basically, you can use Java to write C, but with safe and sane semantics. Integer overflow has defined behavior, out-of-bounds array accesses have defined behavior, both the compiler and the run-time know the types of objects and have defined behavior on bad casts… generally, you can write code in Java and have the nice, warm feeling of at least basic safety. This frees you up to worry about application-domain vulnerabilities, which is still painful, but at least you aren’t ice-skating through a laser-maze like you are in C and C++. I’ve been working a lot on hardening PDFium lately, and I honestly long for the sanity that this tiny amount of safety brings.

Java’s simplified integer type system also makes programming profoundly easier — very few C/C++ programmers understand the billions of integer types and typedefs, and confusion can be disastrous.
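
To see what I mean, here is a classic sketch of C integer confusion (not from any particular codebase):

#include <stddef.h>
#include <stdio.h>

/* The "obvious" backwards loop over an unsigned index. size_t can never be
 * negative, so i >= 0 is always true; when i reaches 0, i-- wraps around to
 * SIZE_MAX, and the loop runs (and reads out of bounds) forever. An empty
 * string is even worse: length - 1 wraps immediately. */
void print_backwards(const char* s, size_t length) {
    for (size_t i = length - 1; i >= 0; i--)  /* bug */
        putchar(s[i]);
}

In Java, the equivalent loop over an int index just works; there is no unsigned size type waiting to wrap around.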

Similarly, strong static typing is essentially the only reliable way to develop large software projects with large teams. And while it’s best not to have to do that, (a) sometimes problems are just plain large; and (b) even small projects and teams can benefit hugely from strong, static typing. Types are documentation, types are safety, types are lightweight formal methods — and practical!

Of course, the other baggage that comes with Java — the VM, the standard libraries, the fully-dynamic GC, the broken String type and the ugly attempts to repair it — is sad.

What we need is a language with the relatively straightforward memory- and type-safety of Java, without the poor abstractions and poor performance. Go is close, but for the GC and lack of generics; Rust is close, but for the ridiculous Ruby syntax and complex semantics; C++11 is not close at all and is evil and bad but I kinda like those new smart pointers.

Expose The Correct Interface (Even When Taking Shortcuts)

When you’re designing a data structure in C, you often want to be space-efficient. For example, consider a struct of bools:

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool isAwesome;
    bool isBiped;
    bool drinksCoffee;
    bool isFurry;
    bool hasRabies;
} Friend1;

typedef struct {
    uint8_t isAwesome:1;
    uint8_t isBiped:1;
    uint8_t drinksCoffee:1;
    uint8_t isFurry:1;
    uint8_t hasRabies:1;
} Friend2;

Clearly, Friend2 is (or, can be) more space-efficient. On my machine, sizeof(Friend1) is 5 and sizeof(Friend2) is 1. In an application that requires allocating millions of friends, this can make a big difference.
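
As a quick sanity check (a sketch; the exact sizes are implementation-defined, and the comments show the values I reported above):

#include <stdio.h>

/* Assumes the Friend1 and Friend2 definitions above. */
int main(void) {
    printf("sizeof(Friend1) = %zu\n", sizeof(Friend1));  /* 5 on my machine */
    printf("sizeof(Friend2) = %zu\n", sizeof(Friend2));  /* 1 on my machine */

    /* At, say, 10 million friends in a contiguous array (a made-up number),
     * that is roughly 50 MB versus 10 MB, before allocator overhead. */
    return 0;
}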

Similarly, consider 2 approaches to implementing a string:

#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t count;
    uint8_t* bytes;
} String1;

typedef struct {
    uint32_t count;
    uint8_t bytes[1];
} String2;

As before, String2 is much more space-efficient (8 bytes, versus String1’s 16, on my machine). Additionally, by having bytes be immediate instead of a pointer to a (potentially far away) array, we can improve data locality. (But it means that we need to be a bit trickier in the implementation of our constructor.)

Notice how, unlike with the example of Friend2, String2 significantly changes the data structure. First, on a 64-bit machine, the use of uint32_t instead of size_t means that we cannot have strings with more than 2^32 – 1 bytes. Since we’re trying to save memory, that can make sense, but it does mean we have a limitation we must enforce when interfacing with other code. (Thankfully, uint32_t will always safely cast up to a size_t on >= 32-bit machines.)
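
For example, the trickier constructor has to allocate the header and the byte storage together, and it is also the natural place to enforce the length limit. Here is a sketch (newString2 is a hypothetical name, assuming the String2 definition above):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One allocation holds both the header and the character data. The bytes[1]
 * member is the classic "struct hack"; in C99 and later, a flexible array
 * member (uint8_t bytes[];) is the cleaner spelling. Allocating
 * sizeof(String2) + count wastes a few padding bytes, but is always enough. */
String2* newString2(const uint8_t* bytes, size_t count) {
    if (count > UINT32_MAX)  /* enforce the 2^32 - 1 limit at the boundary */
        return NULL;
    String2* s = malloc(sizeof(String2) + count);
    if (s == NULL)
        return NULL;
    s->count = (uint32_t)count;
    memcpy(s->bytes, bytes, count);
    return s;
}

A nice side effect is that the whole string is a single allocation, so the caller releases it with a single free.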

As long as we expose correct and abstract interfaces to these data structures, we can freely change between implementation strategies — we can swap Friend1 and Friend2 if necessary, and we can swap String1 and String2.

To see what I mean by “correct”, consider these hypothetical and incorrect Friend constructors:

Friend* newFriend_bitfield(int options) {
    Friend* f = calloc(1, sizeof(Friend));
    // Normalize to 0 or 1 so the value survives even a 1-bit field.
    f->isAwesome = (options & F_IS_AWESOME) != 0;
    f->isBiped = (options & F_IS_BIPED) != 0;
    // ...
    return f;
}

Friend* newFriend_ints(int isAwesome,
                       int isBiped,
                       int drinksCoffee,
                       int isFurry,
                       int hasRabies)
{
    Friend* f = calloc(1, sizeof(Friend));
    f->isAwesome = isAwesome;
    f->isBiped = isBiped;
    // ...
    return f;
}

In newFriend_bitfield, we are exposing to the caller the bitfield space optimization that we used in Friend2, and requiring the caller to learn and use constants like F_IS_AWESOME — not too complicated, but it is one more thing for the programmer to learn. Although well-behaved callers will have readable call-sites such as

Friend* f = newFriend_bitfield(F_IS_AWESOME | F_IS_FURRY);

it will be possible to have nonsensical call-sites such as

Friend* f = newFriend_bitfield(42);

If an interface allows something, no matter how silly, rest assured that some programmer somewhere will indeed invoke it that way.

Similarly, newFriend_ints appears to accept any integer value, even though the implementation will immediately narrow it to a bool (or, in the bitfield implementation, to a single bit).
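
So nothing stops a nonsensical call-site like this one (the argument values are arbitrary):

Friend* f = newFriend_ints(42, -1, 0, 7, 1000);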

The correct interface is immediately understandable, and allows either implementation:

Friend* newFriend(bool isAwesome,
                  bool isBiped,
                  bool drinksCoffee,
                  bool isFurry,
                  bool hasRabies)
{
    Friend* f = calloc(1, sizeof(Friend));
    f->isAwesome = isAwesome;
    f->isBiped = isBiped;
    // ...
    return f;
}