18 Feb: See below for some nice updates!
Programming languages advance by introducing new constraints. A key reason we
don't use assembly language for everything is that the lack of constraints makes
it too hard to use for everyday programming. Before goto
was considered harmful, people wrote machine code that jumped all over the
place, and programmers had to maintain a mental model of the complete machine
state and the full implications of each jump: a recipe for bugs.
Then, structured programming was introduced: structured languages still compiled down to gotos (or arbitrary jumps), but the programmer could think in terms of more limited jumps: if, switch/case, call, return, for. These constrained jumps are much easier to understand; for example, when you're reading code, you can know that return doesn't return just anywhere. It returns only to the caller, as identified by a pointer on the stack. Later, language designers added additional constrained jumps like throw/catch, and virtual function calls. (throw is a little bit too goto-y for my taste, since you can't tell locally where the relevant catch block is. But that's a story for another time.)
A key innovation of C++ was to introduce RAII, which essentially "piggybacks" on the value of the stack and enriches it with a lot more power. (The additional complexity is usually manageable, and worth it.) It allows you to extend the automatic memory management that the stack provides, initializing and cleaning up complex resources instead of just primitive values like integers and floats. You can automatically close open files, release dynamic storage, and so on. And it's deterministic.
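To make the deterministic part concrete, here is a minimal sketch of scope-based cleanup. It is written in Rust rather than C++ (Rust's Drop trait plays the role of a C++ destructor), and the Guard type, its acquire function, and the event log are hypothetical names invented for illustration, not part of any real library.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource guard: it records events in a shared log so that
/// the order of acquisition and release can be observed.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Guard {
    fn acquire(name: &'static str, log: Rc<RefCell<Vec<String>>>) -> Guard {
        log.borrow_mut().push(format!("acquire {}", name));
        Guard { name, log }
    }
}

impl Drop for Guard {
    // Runs automatically when the guard goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

/// Acquires two resources inside a block and returns the observed events.
fn run_scope() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _file = Guard::acquire("file", Rc::clone(&log));
        let _lock = Guard::acquire("lock", Rc::clone(&log));
        log.borrow_mut().push("work".to_string());
    } // _lock is dropped first, then _file (reverse declaration order)
    let events = log.borrow().clone();
    events
}
```

Calling run_scope() should yield the two acquisitions, the work, and then the releases in reverse order, with no explicit cleanup call anywhere in the block.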
But there was still the problem of the heap: a free-fire zone with no constraints, riddled with memory leaks (heap resources allocated but never released) and use-after-free bugs (heap resources re-used even after having been released).
A key innovation of Rust has been to statically constrain the lifetimes of heap resources, enabling us to more completely solve the worst remaining memory unsafety problem. (Previous solutions to the heap lifetime problem were dynamic, not static, and hence expensive in space and time, as well as being non-deterministic. These limitations restrict dynamically-managed languages to applications and environments where these costs are affordable.)
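A small sketch of what "statically constrained" looks like in practice, under the assumption that ownership moves and lifetime annotations are the mechanisms in question. The function names longer, consume, and demo are hypothetical; the lines that the compiler would reject are left as comments so the block still builds.

```rust
/// Returns the longer of two string slices. The lifetime 'a tells the
/// compiler that the result may borrow from either argument, so it must
/// not outlive either of them.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

/// Takes ownership of a heap-allocated buffer.
fn consume(buf: Vec<u8>) -> usize {
    buf.len()
} // buf is dropped here, and its heap storage is freed

fn demo() -> (usize, String) {
    let buf = vec![1u8, 2, 3];
    let n = consume(buf);
    // println!("{:?}", buf); // rejected at compile time: use of moved value

    let outer = String::from("heap-allocated");
    let result;
    {
        let inner = String::from("short");
        // Copying the result into an owned String is fine. Storing the bare
        // reference in a variable that outlives `inner` would be rejected.
        result = longer(&outer, &inner).to_string();
    }
    (n, result)
}
```

The would-be use-after-free never reaches a running program: uncommenting the println! line turns it into a build error.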
And, of course, taming object lifetimes greatly eases the problem of safe, efficient concurrency. Concurrency is the key to improving performance in modern systems.
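As a hedged illustration of how lifetimes help here, the following sketch uses scoped threads from the standard library (std::thread::scope, available since Rust 1.63). The parallel_sum function is a hypothetical example: the compiler accepts the borrowed chunks only because the scope guarantees every thread finishes before the borrowed data can go away.

```rust
use std::thread;

/// Sums a slice by splitting it into chunks and summing each chunk on its
/// own thread. The threads borrow `data` directly; no copying or reference
/// counting is needed.
fn parallel_sum(data: &[u64], chunk_size: usize) -> u64 {
    thread::scope(|s| {
        // Spawn one thread per chunk. chunks() panics on a size of 0,
        // so clamp the size to at least 1.
        let handles: Vec<_> = data
            .chunks(chunk_size.max(1))
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        // Join every thread and add up the partial sums.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}
```

One thread per chunk is chosen for brevity; a real program would more likely size the work to the number of cores.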
Beyond memory safety, Rust makes more use of typefulness than I typically see in other mainstream languages in its niche. For example, Rust's rich enums and pattern matching make it easier to write state machines, the new type idiom makes it easier to get additional type safety (and improves the interface-as-documentation factor), and so on. You can work to get similar benefits in other languages, but Rust's syntactic mechanisms and idiomatic usage create affordances that make these patterns easier.
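Here is a minimal sketch of both ideas together, assuming a toy vending machine as the state machine. Every name in it (Cents, State, Event, step) is hypothetical and chosen only for illustration.

```rust
/// New type wrapper: it holds a u32, but the compiler will not accept a
/// bare u32 (or some other wrapper around u32) where Cents is expected.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Cents(u32);

/// Each variant carries only the data that is meaningful in that state.
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    HasMoney { paid: Cents },
    Dispensing { change: Cents },
}

#[derive(Debug)]
enum Event {
    Insert(Cents),
    Select { price: Cents },
    Done,
}

/// Computes the next state. The match must cover every combination, so a
/// forgotten transition is a compile error rather than a runtime surprise.
fn step(state: State, event: Event) -> State {
    match (state, event) {
        (State::Idle, Event::Insert(c)) => State::HasMoney { paid: c },
        (State::HasMoney { paid }, Event::Insert(c)) => State::HasMoney {
            paid: Cents(paid.0.saturating_add(c.0)),
        },
        (State::HasMoney { paid }, Event::Select { price }) if paid.0 >= price.0 => {
            State::Dispensing { change: Cents(paid.0 - price.0) }
        }
        (State::Dispensing { .. }, Event::Done) => State::Idle,
        // Any other combination leaves the state unchanged.
        (state, _) => state,
    }
}
```

For instance, inserting 100 cents and then selecting a 150-cent item leaves the machine in HasMoney, because the guard on the Select arm does not match.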
Another freeing constraint Rust has introduced has been to systematize and automate dependency management: the Cargo package management system. Good dependency management is a monstrously hard problem. Any dependency management system, including manual or ad hoc management, poses a variety of problems:
The NPM ecosystem provides the clearest modern illustration of these problems. (See page 11 of GitHub's report on security, for example.)
However, for all of NPM's problems, at least it is a package management system at all! It's easy to pick on NPM (or predecessors like CPAN, or CTAN, or...), but even at its worst it's a huge improvement over manually managing dependencies (such as by manually vendoring them into your source tree, or just telling the user to install such-and-such libraries before attempting to compile).
Life is better with NPM, and with Rust's Cargo, Go's go get, and so on. Even when they aren't perfect yet, they provide a framework for improvement, by constraining where dependencies come from and how we maintain them.
But a lot of work is still necessary. As an example of a Nice Thing Indeed, Cargo has this add-on package called supply-chain, which will show you all the packages a given package depends on. It will also estimate how many individual publishers author those dependencies. Here is what happens when you run supply-chain on itself:
    ~/src/rust/cargo-supply-chain % cargo supply-chain publishers
    The following crates will be ignored because they come from a local directory:
     - cargo-supply-chain

    The `crates.io` cache was not found or it is invalid.
    Run `cargo supply-chain update` to generate it.

    Fetching publisher info from crates.io
    This will take roughly 2 seconds per crate due to API rate limits
    Fetching data for "adler" (0/79)
    [77 items, including some surprising ones, elided...]
    Fetching data for "xattr" (78/79)

    The following individuals can publish updates for your dependencies:

     1. alexcrichton via crates: flate2, wasm-bindgen-backend, wasi, bitflags, proc-macro2, wasm-bindgen-macro, wasm-bindgen, openssl-probe, unicode-xid, wasm-bindgen-macro-support, filetime, semver, tar, unicode-normalization, libc, js-sys, bumpalo, log, wasm-bindgen-shared, cfg-if, cc, web-sys
     [55 authors elided...]
     57. zesterer via crates: spin

    Note: there may be outstanding publisher invitations. crates.io provides no way to list them.
    Invitations are also impossible to revoke, and they never expire.
    See https://github.com/rust-lang/crates.io/issues/2868 for more info.

    All members of the following teams can publish updates for your dependencies:

     1. "github:rustwasm:core" (https://github.com/rustwasm) via crates: web-sys, js-sys, wasm-bindgen-macro, wasm-bindgen-macro-support, wasm-bindgen-backend, wasm-bindgen, wasm-bindgen-shared
     2. "github:servo:cargo-publish" (https://github.com/servo) via crates: core-foundation-sys, percent-encoding, form_urlencoded, unicode-bidi, core-foundation, idna, url
     3. "github:servo:rust-url" (https://github.com/servo) via crates: percent-encoding, form_urlencoded, idna, url
     4. "github:rust-bus:maintainers" (https://github.com/rust-bus) via crates: security-framework-sys, security-framework, tinyvec
     5. "github:rust-lang-nursery:libs" (https://github.com/rust-lang-nursery) via crates: bitflags, log, lazy_static
     6. "github:serde-rs:owners" (https://github.com/serde-rs) via crates: serde_derive, serde, serde_json
     7. "github:rust-lang:libs" (https://github.com/rust-lang) via crates: libc, cfg-if
     8. "github:rust-lang-nursery:log-owners" (https://github.com/rust-lang-nursery) via crates: log
     9. "github:rust-random:maintainers" (https://github.com/rust-random) via crates: getrandom

    Github teams are black boxes. It's impossible to get the member list without explicit permission.
    ~/src/rust/cargo-supply-chain % cargo supply-chain update
    Note: this will download large amounts of data (approximately 250Mb).
    On a slow network this will take a while.
Now, that's a lot of dependencies by a lot of publishers whom I don't know. (Although it's not automated, if you dig around you'll find that many of those authors are well-established members of the Rust development team, so trusting them is an easier sell.) Another bummer is that, when I built supply-chain, my default $CFLAGS broke the build (Update 18 Feb: with an almost certainly spurious and not security-relevant warning, -Wunused-macros). (My flags are quite persnickety: -Weverything -Werror -std=c11. Very little code builds with these flags.) Apparently, some of supply-chain's own dependencies depend on C code. Alas.
But that's OK! Cargo provides a framework for working on these problems. Over time, I'd like to see things move along these lines:
- [...] unsafe. This has been happening, and will continue to, over time. (See the Safety Dance project, which is focused on reducing the use of unsafe.)
- [...] "extended std", so that they can appear as a single dependency with a single publishing team. This is controversial in some communities, but I think it would go a long way toward reducing the problems.
- [...] std, and "extended std" (where and if appropriate). This is also sometimes controversial, but again I think it would help.
- [...] unsafe blocks, or in C/C++/assembly? Some of these things can be more or less automatically determined, and tooling could flag packages that stand out. Fun update 18 Feb: Such a thing exists, and is called crev. Awesome!
Another good thing about Rust is its friendly community. Not all systems programming communities are as welcoming as Rust's is. Rust, and some other communities, have taken proactive steps to maintain a healthy community. I think it's fair to say the Rust community is doing relatively well, especially in the systems programming niche.
As in all language communities, whether of natural or artificial languages, the community, the body of literature, and the oral tradition are what matter. In its niche, Rust looks like the option with the most momentum around a more positive, healthier community. The community and the language are probably not perfect (nothing is, if perfect is even a thing), but Rust looks like the community most open to solving its problems, and most capable of solving systems programming problems.
Thanks to Adrian Taylor for reminding me to mention typefulness, concurrency, and Safety Dance.
Thanks to Sergey Davidoff, supply-chain maintainer, for pointing me at
crev and noting that Safety Dance is more about reducing unsafe
than C.