Thoughts On Language Design Bugs

This post is an attempt to answer some pretty reasonable questions my friends and colleagues have asked me, on the topic of programming language security. If you’ve read anything else I’ve written, you know I believe the 2nd-biggest software security engineering problem is the unsafety of C and C++. What that implies and what to do about it is not necessarily obvious. So here are some lightly paraphrased questions, and my general thoughts.

(The 1st-biggest problem is all about human factors: abuse, phishing, and accessibility.)

Is JavaScript Memory-Unsafe?

Why do you call JS and Python ‘safer’? Since (e.g.) JavaScript is implemented in C++, doesn’t that make it just as memory-unsafe as C/C++?

Yes — and no. The kernels of the implementations of languages like Python, JavaScript (JS), and Java are typically in C/C++ and they certainly do exhibit memory unsafety and other C/C++ undefined behavior (UB) bugs. (Memory unsafety is a subclass of UB.) However, these languages intend to limit UB as part of their interface or contract. Minimizing or eliminating memory unsafety is a design goal.
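As a minimal illustration of that contract, here is what "limiting UB as part of the interface" looks like in practice in Python: an out-of-bounds access has a defined outcome (an exception), where the equivalent C code would be undefined behavior.

```python
# Out-of-bounds access in Python is defined behavior: the language
# contract guarantees an IndexError, never a silent memory read.
items = [10, 20, 30]

try:
    value = items[3]      # one past the end of the list
except IndexError:
    value = None          # the interface guarantees we land here

# The equivalent C (reading past the end of an array) is UB: the
# program might crash, return garbage, or be silently exploitable.
```

Any given CPython build may still have bugs in the C code that implements that bounds check, but those are implementation bugs against this contract, not part of the language's design.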

Thus, crucially, UB bugs in these languages are implementation bugs. The (e.g. Python) developers fix them once, the implementation inches closer to the ideal of the interface semantics, and the whole community adopts the improvement at scale.

In C/C++, by contrast, UB is considered a design ‘feature’, not a bug. The language design committees and compiler developers won’t fix such bugs. Even brand new features in C++ introduce new UB — it’s not considered a historical mistake to be corrected.

It might seem like Python, JS, etc. are safe wrappers around unsafe code. And they can be, and (depending on the specific implementation) more or less are. For example, if an application implemented in Python is successfully attacked, the attack is much more likely to have exploited a bug in the application logic than a use-after-free or a buffer overflow in a list comprehension or other core Python feature.

Thus, a safe language implemented in an unsafe language might be OK, to the extent that we can scale up fixing the errors in the implementation. But that’s highly variable, as the next question raises.

What’s Happening With JavaScript Security?

What about the fairly rough time JS implementations are having? They don’t seem to be getting incrementally closer to the interface ideal.

Yes, this is a notable problem. I think there are a few reasons why the problem exists and why it is so visible.

JS implementations are quite complex and large. Any large body of C/C++ code is going to have a lot of problems. By contrast, the implementations of (say) Lua and Self are notably concise, and Python is large but not huge. (And a good chunk of Python’s size is auto-generated code.) If we assume roughly equivalent bug-density per line across developers — in general we have no reason to assume otherwise — less code means fewer bugs.

Additionally, defensive and offensive security research teams are hunting night and day, en masse and at scale, for bugs in JS implementations specifically. If another language suddenly grew to equivalent prominence, it might face similar scrutiny and perhaps the known bug count would go up.

But there is a 3rd critical issue: Many of the bugs affecting JS implementations are not vanilla C/C++ UB implementation bugs. For Reasons, JS happens to face fairly intense scrutiny on raw micro-performance, which typically leads developers to cut corners on correctness. (That’s the usual justification for C/C++’s UB, too. Such an extreme performance focus can make sense in some circumstances, but in the vast majority of cases it’s the wrong trade-off.)

As part of achieving high performance, JS engines typically include several different run-time compilers (just-in-time or JIT compilers) to transform the code at run-time into a faster form. To build not 1 but several such systems into your language implementation is a significant and complex undertaking — especially when the pressure is on to go fast and save battery life on people’s phones.

For example, JS implementations often have JIT optimization bugs that go something like this: “We thought we could optimize by removing this dynamic type check, because we thought we had a solid argument that the object is guaranteed to be of type T. But, we were wrong.” (This kind of thing is quite hard to get right.) And then the JIT emits memory-unsafe object code due to erroneous assumptions during compilation. This class of bug is not due directly or uniquely to C/C++.
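The shape of that bug class can be sketched in a few lines of Python. This is a hypothetical illustration, not code from any real engine: a hot function is specialized for one type, with a dynamic guard protecting the fast path; the bug is the optimizer's belief that the guard can be elided.

```python
def generic_bits(n):
    # Slow, fully dynamic path: coerce first, then compute.
    return int(n).bit_length()

def specialized_bits(n):
    # Fast path specialized for int. The type check below is the
    # dynamic guard the optimizer is tempted to remove.
    if type(n) is int:              # the guard
        return n.bit_length()       # valid only for real ints
    return generic_bits(n)          # fall back to the dynamic path

def unguarded_bits(n):
    # What the buggy optimization effectively emits: the guard is
    # gone because "we proved n is always an int" -- but we were wrong.
    return n.bit_length()
```

In Python, calling `unguarded_bits("8")` fails safely with an `AttributeError`. In a native JIT, the analogous mistake means machine code operating on an object of the wrong type and layout, which is exactly the type confusion that turns into memory unsafety.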

Why Don’t We Rewrite Everything In Safer Languages?

Given that C/C++ UB creates so many problems — causing the implementations of languages to not live up to their designs — why aren’t the likes of Python and other languages being rewritten in memory-safe languages?

First, because it’s expensive to do that. There are whole teams working hard to make it less expensive to transition large codebases from C/C++ to modern languages, but it’s just not a cheap or easy thing to do yet. Whether it is possible to make it cheap enough at all is an open research question. Whether or not it succeeds, I hope that the work being done now, in several organizations, is made public. Even negative results would be hugely useful.

Additionally, separate from C/C++ UB, there is a claim that developers would be just as likely to make the same JIT compiler logic errors in a safer language as they do in C/C++. Switching to a safer language would get rid of the ‘simple’ or direct UB and memory-unsafety problems, but JIT compilers would still be difficult.

I hypothesize that some such compiler and interpreter logic bugs can be approached as type errors and state machine transition errors, and thus automatically detected and prevented by the implementation language’s own type semantics. (For example, consider a bug where it should never have been possible to move from state 1 to state 2 with an object of type T1, only with a T2. Encoding such constraints in the type system is known as the typestate pattern, and it might help with certain of the problems that dynamic language run-times face.)
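A rough sketch of that idea, in the terms used above (all names here are illustrative, not from any real run-time): each state is its own type, and the transition that requires a T2 simply does not exist for a T1, so a static checker such as mypy can reject the invalid transition before the code ever runs.

```python
class T1:
    """An object that is NOT allowed to drive the 1 -> 2 transition."""

class T2:
    """An object that IS allowed to drive the 1 -> 2 transition."""

class State2:
    """The machine after the transition."""

class State1:
    def advance(self, obj: T2) -> "State2":
        # The only way to reach State2 is through this method, and its
        # signature demands a T2. A type checker flags advance(T1())
        # statically; there is no run-time check to forget or elide.
        return State2()

s2 = State1().advance(T2())      # legal transition
# State1().advance(T1())         # rejected by the type checker
```

The point is not that this sketch is bulletproof, but that the invariant lives in the implementation language's types rather than in a run-time check someone might optimize away.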

Micro-Performance vs Correctness: An Ecosystem Problem

To a significant extent, though, the semantics of JS, Lua, and Python are highly dynamic — and that means there’s an inherent tension between the run-time cost of dynamic correctness checks vs. raw micro-performance. Dynamism is an awesome feature, but it comes at the cost of some machine-level performance.

I believe the right approach to this trade-off is to focus on macro-performance, and to stop worrying about micro-performance for a while.

(At an absolute level, the micro-performance of modern JS engines is stunningly good. Part of the reason we are having these problems is that the developers of these engines have already done the impossible 10 times over, and now they’re looking for some 11th win. And, who knows... they’re so good at what they do, they might very well find it. JS engine developers have effectively solved the micro-performance problem of dynamic languages. It’s impossible to overstate the excellence of that — in part because it makes safer languages that much more applicable and deployable! So we should all thank performance-oriented engineers for this safety.)

What’s wrong with JS performance — why some pages or apps make your phone warm — no longer has much to do with whether we do or don’t elide a dynamic correctness check. It’s all about JS ecosystem problems.

So What Are We Supposed To Do?

Given that we are drowning in the personal, ecosystem, and political consequences of C/C++ UB bugs and vulnerabilities, but that reimplementing is expensive and difficult, what in the actual shit are we supposed to do right now with the systems we depend on?

First and foremost, as a matter of professional ethics and responsibility, green-field development must not be done in unsafe languages. The behind-the-curve technology of the 1970s has not enabled, and will not enable, us to meet the requirements of the 2020s and 2030s. We have to put a lot of work into working around its problems, as I describe below, and we have to enter into that effort knowing that it is all repair work and not new advancement.

Complementarily, we must do everything we can to minimize the amount of maintenance and development we do in unsafe languages. That means gradually migrating old code to safer languages, developing the new features of existing systems in safer languages, replacing or removing components implemented in unsafe languages, and so on.

In the limited and blocked-off area of maintenance and development in unsafe code, there is actually a lot we can do to improve things. First, take the micro-performance heat off by exploring solutions to the macro-performance problems, whatever they might be. (Look for amplifiers at the application level. Does 1 click incur 100 requests or operations?) When the micro-heat is off, you can breathe a little and start looking into correctness and security.
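The amplifier question can be made concrete with a toy sketch (the fetch functions and counts here are hypothetical, standing in for real network calls): one “click” that fetches 100 items one request at a time, versus the same click served by a single batched request.

```python
# Toy model of a macro-performance amplifier: count simulated
# round trips per user action instead of measuring real ones.
request_count = 0

def fetch_one(item_id):
    global request_count
    request_count += 1            # one round trip per item
    return {"id": item_id}

def fetch_batch(item_ids):
    global request_count
    request_count += 1            # one round trip for the whole batch
    return [{"id": i} for i in item_ids]

ids = list(range(100))

naive = [fetch_one(i) for i in ids]   # the "1 click, 100 requests" shape
per_click_naive = request_count

request_count = 0
batched = fetch_batch(ids)            # same data, 1 request
per_click_batched = request_count
```

Fixing this kind of 100x amplification dwarfs anything a single elided dynamic check can buy, which is the sense in which macro-performance work takes the micro-performance heat off.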

Keep testing. Incentivize testing and bug finding. More. Incentivize fixing bugs, polish, quality. More. It must be possible to get promoted to and compensated at a high level for measurably improving code quality, instead of shipping new features. It does sometimes happen, but overall most software development organizations need a significant culture change.

Although C/C++ cannot be ‘fixed’, there is quite a bit we can do to minimize and avoid the problems of these languages.