18 Feb: See below for some nice updates!
Programming languages advance by introducing new constraints. A key reason we
don’t use assembly language for everything is that the lack of constraints makes
it too hard to use for everyday programming. Before goto
was considered harmful, people wrote machine code that jumped all over the
place, and programmers had to maintain a mental model of the complete machine
state and the full implications of each jump: a recipe for bugs. Then structured
programming was introduced: structured languages still compiled down to
gotos (or arbitrary jumps), but the programmer could think in terms
of more limited jumps, such as for loops. These constrained
jumps are much easier to understand; for example, when you’re reading code, you
can know that return doesn’t return just anywhere. It returns only
to the caller, as identified by a pointer on the stack. Later, language
designers added additional constrained jumps like throw/catch and virtual
function calls. (throw is a little bit too goto-y for my taste,
since you can’t tell locally where the relevant catch block is. But
that’s a story for another time.)
A key innovation of C++ was to introduce RAII, which essentially ‘piggybacks’ on the value of the stack and enriches it with a lot more power. (The additional complexity is usually manageable, and worth it.) It allows you to extend the automatic memory management that the stack provides, initializing and cleaning up complex resources instead of just primitive values like integers and floats. You can automatically close open files, release dynamic storage, and so on. And it’s deterministic.
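Rust inherits the same idea through its Drop trait. The following is a minimal sketch of scope-based, deterministic cleanup; the Guard type, its fields, and the run function are hypothetical names invented for illustration, standing in for a real resource such as a file or a lock.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource guard: records when it is acquired and released.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Guard {
    fn acquire(name: &'static str, log: Rc<RefCell<Vec<String>>>) -> Guard {
        log.borrow_mut().push(format!("acquire {name}"));
        Guard { name, log }
    }
}

impl Drop for Guard {
    // Runs automatically when the guard goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

/// Returns the sequence of events so the ordering can be inspected.
fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _file = Guard::acquire("file", Rc::clone(&log));
        let _lock = Guard::acquire("lock", Rc::clone(&log));
        log.borrow_mut().push("work".to_string());
    } // `_lock` is dropped first, then `_file`: reverse order of declaration.
    let events = log.borrow().clone();
    events
}
```

Calling run() yields the events in a fixed order on every execution, which is the "deterministic" part: cleanup happens at the closing brace, not at some later time chosen by a collector.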
But there was still the problem of the heap: a free-fire zone with no constraints, riddled with memory leaks (heap resources allocated but never released) and use-after-free bugs (heap resources re-used even after having been released).
A key innovation of Rust has been to statically constrain the lifetimes of heap resources, enabling us to more completely solve the worst remaining memory unsafety problem. (Previous solutions to the heap lifetime problem were dynamic, not static, and hence expensive in space and time — as well as being non-deterministic. These limitations reduce the applicability of dynamically-managed languages to applications and environments where these costs are affordable.)
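As a rough illustration of what "statically constrained" means in practice, here is a small sketch. The function names longer and demo are made up for this example; the commented-out alternative describes code that the compiler would reject before the program ever runs.

```rust
/// Returns the longer of two string slices. The lifetime `'a` tells the
/// compiler that the result must not outlive either input.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn demo() -> String {
    let owner = String::from("heap-allocated");
    let result;
    {
        let short = String::from("tmp");
        // Copy the chosen text into a new owned String before `short` is freed.
        result = longer(&owner, &short).to_owned();
        // Storing the borrowed `&str` itself in `result` and using it after
        // this block would not compile: `short` does not live long enough.
    } // `short`'s heap buffer is released here.
    result
}
```

No runtime bookkeeping is involved: the check that no reference outlives the data it points at happens entirely at compile time.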
And, of course, taming object lifetimes greatly eases the problem of safe, efficient concurrency. Concurrency is the key to improving performance in modern systems.
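One way to see the connection between lifetimes and concurrency is with scoped threads from the standard library (std::thread::scope, stable since Rust 1.63). This is only a sketch; parallel_sum and its chunk_size parameter are hypothetical names chosen for the example.

```rust
use std::thread;

/// Sums a slice by handing each chunk to its own scoped thread.
fn parallel_sum(data: &[u64], chunk_size: usize) -> u64 {
    thread::scope(|s| {
        // Spawn every worker first; each one only borrows its chunk.
        let handles: Vec<_> = data
            .chunks(chunk_size.max(1)) // guard against a zero chunk size
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        // Then collect the partial sums.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}
```

The threads borrow data without copying it or wrapping it in a reference count. The compiler accepts this because the scope guarantees every thread has finished before the borrow ends; an unscoped thread borrowing the same slice would be rejected.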
Beyond memory safety, Rust makes more use of typefulness than I typically see
in other mainstream languages in its niche. For example, Rust’s rich
enums and pattern matching make it easier to write state machines, the
newtype idiom makes it easier to get additional type safety (and improves
the interface-as-documentation factor), and so on. You can work to get similar
benefits in other languages, but Rust’s syntactic mechanisms and idiomatic usage
create affordances for these easier patterns.
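To make that concrete, here is a small sketch combining both ideas. Every name in it (Port, RetryCount, Connection, Event, step, MAX_RETRIES) is hypothetical and chosen only for illustration.

```rust
/// Newtype wrappers: each wraps a plain integer, but the compiler will not
/// let one be passed where the other is expected.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Port(u16);

#[derive(Debug, Clone, Copy, PartialEq)]
struct RetryCount(u32);

/// Each state carries only the data that is meaningful in that state.
#[derive(Debug, PartialEq)]
enum Connection {
    Idle,
    Connecting { port: Port, retries: RetryCount },
    Connected { port: Port },
    Failed,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    Dial(Port),
    Success,
    Timeout,
    Hangup,
}

const MAX_RETRIES: u32 = 2;

/// Consumes the old state and returns the new one.
fn step(state: Connection, event: Event) -> Connection {
    use Connection::*;
    match (state, event) {
        (Idle, Event::Dial(port)) => Connecting { port, retries: RetryCount(0) },
        (Connecting { port, .. }, Event::Success) => Connected { port },
        (Connecting { port, retries: RetryCount(n) }, Event::Timeout) if n < MAX_RETRIES => {
            Connecting { port, retries: RetryCount(n + 1) }
        }
        (Connecting { .. }, Event::Timeout) => Failed,
        (_, Event::Hangup) => Idle,
        // Any other combination leaves the state unchanged.
        (state, _) => state,
    }
}
```

Because a Connected value has no retries field at all, code cannot accidentally read a stale retry count in that state, and swapping a Port for a RetryCount is a type error rather than a silent bug.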
Another freeing constraint Rust has introduced has been to systematize and automate dependency management: the Cargo package management system. Good dependency management is a monstrously hard problem. Any dependency management system, including manual or ad hoc management, poses a variety of problems:
The NPM ecosystem provides the clearest modern illustration of these problems. (See page 11 of GitHub’s report on security, for example.)
However, for all of NPM’s problems, at least it is a package management system at all! It’s easy to pick on NPM (or predecessors like CPAN, or CTAN, or...), but even at its worst it’s a huge improvement over manually managing dependencies (such as by manually vendoring them into your source tree, or just telling the user to install such-and-such libraries before attempting to compile).
Life is better with NPM, and with Rust’s Cargo, Go’s go get, and
so on. Even when they aren’t perfect yet, they provide a framework for
improvement, by constraining where dependencies come from and how we maintain
them.
But a lot of work is still necessary. As an example of a Nice Thing Indeed, Cargo has this add-on package called supply-chain, which will show you all the packages a given package depends on. It will also estimate how many individual publishers author those dependencies. Here is what happens when you run supply-chain on itself:
    ~/src/rust/cargo-supply-chain % cargo supply-chain publishers
    The following crates will be ignored because they come from a local directory:
     - cargo-supply-chain
    The `crates.io` cache was not found or it is invalid.
      Run `cargo supply-chain update` to generate it.
    Fetching publisher info from crates.io
    This will take roughly 2 seconds per crate due to API rate limits
    Fetching data for "adler" (0/79)
    [77 items, including some surprising ones, elided...]
    Fetching data for "xattr" (78/79)
    The following individuals can publish updates for your dependencies:
     1. alexcrichton via crates: flate2, wasm-bindgen-backend, wasi, bitflags, proc-macro2, wasm-bindgen-macro, wasm-bindgen, openssl-probe, unicode-xid, wasm-bindgen-macro-support, filetime, semver, tar, unicode-normalization, libc, js-sys, bumpalo, log, wasm-bindgen-shared, cfg-if, cc, web-sys
    [55 authors elided...]
     57. zesterer via crates: spin
    Note: there may be outstanding publisher invitations. crates.io provides no way to list them.
    Invitations are also impossible to revoke, and they never expire.
    See https://github.com/rust-lang/crates.io/issues/2868 for more info.
    All members of the following teams can publish updates for your dependencies:
     1. "github:rustwasm:core" (https://github.com/rustwasm) via crates: web-sys, js-sys, wasm-bindgen-macro, wasm-bindgen-macro-support, wasm-bindgen-backend, wasm-bindgen, wasm-bindgen-shared
     2. "github:servo:cargo-publish" (https://github.com/servo) via crates: core-foundation-sys, percent-encoding, form_urlencoded, unicode-bidi, core-foundation, idna, url
     3. "github:servo:rust-url" (https://github.com/servo) via crates: percent-encoding, form_urlencoded, idna, url
     4. "github:rust-bus:maintainers" (https://github.com/rust-bus) via crates: security-framework-sys, security-framework, tinyvec
     5. "github:rust-lang-nursery:libs" (https://github.com/rust-lang-nursery) via crates: bitflags, log, lazy_static
     6. "github:serde-rs:owners" (https://github.com/serde-rs) via crates: serde_derive, serde, serde_json
     7. "github:rust-lang:libs" (https://github.com/rust-lang) via crates: libc, cfg-if
     8. "github:rust-lang-nursery:log-owners" (https://github.com/rust-lang-nursery) via crates: log
     9. "github:rust-random:maintainers" (https://github.com/rust-random) via crates: getrandom
    Github teams are black boxes. It's impossible to get the member list without explicit permission.
    ~/src/rust/cargo-supply-chain % cargo supply-chain update
    Note: this will download large amounts of data (approximately 250Mb).
    On a slow network this will take a while.
Now, that’s a lot of dependencies by a lot of publishers whom I don’t know.
(Although it’s not automated, if you dig around you’ll find that many of those
authors are well-established members of the Rust development team, so trusting
them is an easier sell.) Another bummer is that, when I built supply-chain, my
$CFLAGS broke the build (Update 18 Feb: with an
almost certainly spurious and not security-relevant warning,
-Wunused-macros). (My flags are quite persnickety:
-Weverything -Werror -std=c11. Very little code builds with these
flags. 😇) Apparently, some of supply-chain’s own dependencies depend on C code.
But that’s OK! Cargo provides a framework for working on these problems. Over time, I’d like to see things move along these lines:
unsafe. This has been happening, and will continue to, over time. (See the Safety Dance project, which is focused on reducing the use of unsafe.)
std’, so that they can appear as a single dependency with a single publishing team. This is controversial in some communities, but I think it would go a long way toward reducing the problems.
std, and ‘extended
std’ (where and if appropriate). This is also sometimes controversial, but again I think it would help.
unsafe blocks, or in C/C++/assembly? Some of these things can be more or less automatically determined, and tooling could flag packages that stand out. Fun update 18 Feb: Such a thing exists, and is called crev. Awesome!
Another good thing about Rust is its friendly community. Not all systems programming communities are as welcoming as Rust’s is. Rust, and some other communities, have taken proactive steps to maintain a healthy community. I think it’s fair to say the Rust community is doing relatively well, especially in the systems programming niche.
As with all language communities, whether of natural languages or artificial languages, the community and the body of literature and the oral tradition are what matter. In its niche, Rust looks like the option with the most momentum around a more positive, healthier community. The community and the language are probably not perfect (nothing is, if perfect is even a thing), but Rust looks like the community most open to solving its problems, and most capable of solving systems programming problems.
Thanks to Adrian Taylor for reminding me to mention typefulness, concurrency, and Safety Dance.
Thanks to Sergey Davidoff, supply-chain maintainer, for pointing me at
crev and noting that Safety Dance is more about reducing