By Chris Palmer,
I originally prepared this article for the 2018 Workshop On Speculative Side-Channel Analysis, but I think/hope it has some usefulness generally.
In particular, I strongly believe that more types of applications must align their principals with security mechanisms that platforms provide. Until they do, they remain unnecessarily vulnerable to both side-channel attacks and more prosaic problems like bad ol’ memory corruption, and logic bugs leading to principal confusion.
For that to work, platforms (operating systems and hardware) must provide stronger, guaranteed, documented, efficient, and ergonomic security mechanisms. The leading edge of application security practice works and is good. But it is fragile, hard to use and maintain, and not as powerful as it could be. This is all for lack of mechanisms built in to hardware and software platforms whose roots are in the 1970s. It’s time for platforms to catch up to the needs of applications. Spectre and Meltdown were a wake-up call, but the problems have been with us for decades.
Please send comments/criticism/questions to firstname.lastname@example.org or email@example.com. Thanks!
Typically the operating system defines the principal, with support from the hardware. However, some applications define their own internal principals beyond those the OS defines. The Spectre and Meltdown vulnerabilities have exposed and exacerbated a pre-existing lack of effective confinement between application-defined principals in many applications.
Operating systems must provide mechanisms for applications to effectively isolate application-defined principals, and applications must use those mechanisms. Many existing applications can and must make better use of existing mechanisms, and platforms must provide better mechanisms.
Existing mechanisms for isolating application-defined principals fall short in various ways, including complexity, brittleness, and inefficiency.
The discovery of the Spectre and Meltdown vulnerabilities [HornSideChannel] [LippMeltdown] [KocherSpectre] showed that previously unknown micro-architectural side-channel attacks erode confidentiality guarantees across a variety of security boundaries, including hypervisor/guest, kernel/user, and user/user. Unfortunately, security boundaries inside user-space programs are trivially unable to provide confidentiality against a Spectre attacker operating inside any one of the boundaries.
Many types of user-space application (see Applications That Define Principals, below) define their own security boundaries or principals that are crucial to the confidentiality, integrity, or control of information managed by those principals. The Open Web Platform’s (OWP) same-origin policy [BarthOrigin] is a notable example.
Saltzer and Schroeder [SaltzerProtection] define a principal as follows:
The entity in a computer system to which authorizations are granted; thus the unit of accountability in a computer system.
Typically the operating system defines the principal, with support from the hardware. For example, NT and POSIX use the hardware’s memory protection features to protect processes and ‘users’ from each other, and to protect the kernel from user processes.
Some applications define their own internal principals beyond those the OS defines. This could happen as a result of porting an application to a new platform with a different conception of principals than the application’s original platform had. It can also happen when an application grows to become an application platform itself. The OWP is an example, but sometimes even web applications themselves can become platforms. An example is Facebook, which hosts third-party applications that Facebook users can use. Database servers also often support their own user account principals.
Because user applications do not run as privileged, kernel-mode code with access to the hardware’s memory protection instructions, applications can only enforce boundaries between their principals by using the facilities the OS provides to user programs. Unfortunately, it is generally the case that applications do not make use of such facilities. Instead, many or most applications that define their own principals do so entirely with in-process logic. Such applications and their (by definition mutually hostile) principals are subject to (at least) all of the problems described in Problems For Application Principals, below.
As problematic as application-defined principals currently are, they exist and are unlikely to go away. They enable platform-within-a-platform applications like the web; they enable security architecture experimentation even on legacy platforms; and applications often grow to become platforms (e.g. MS Office, Facebook) in ways that benefit people.
Therefore, platforms must provide mechanisms for applications to effectively isolate application-defined principals, and applications must use those mechanisms. Many existing applications can and must make better use of existing mechanisms, and platforms must provide better mechanisms because existing mechanisms are inadequate in various ways.
Web browser developers have deployed various mitigations against Spectre(-like) micro-architectural side-channel attacks, and other attacks. (See e.g. Loss Of Integrity, Confidentiality, And Control Via Principal Confusion, below). They are re-architecting their browsers to re-establish the boundaries between principals (origins, or sites) [ChromeRethink].
It is very likely that other application types that define their own principals, including but not necessarily limited to those that allow different principals to run code of their own authorship, should undertake efforts similar to those of web browser developers in order to re-establish the boundaries between their principals. This is true at least, but not only, for defense against side-channel attacks against confidentiality.
From the application perspective, we have to assume that the underlying platform (including the operating system kernel or hypervisor, and the hardware) has been repaired enough to re-establish confidentiality against micro-architectural side-channel attacks that the operating system’s own boundaries purport to defend against. If that is not the case, then no application principal or boundary can be effective.
Unfortunately, for now and at least into the medium-term, systems in the field are likely to be only partially repaired. Even so, it is worthwhile to plan for the time when systems are fully repaired, and to determine what mechanisms and facilities platforms will need to more completely support applications that define principals.
The side-channel crisis is also an opportunity for large-scale platform improvements.
Not only (micro-architectural and other) side-channel attacks, but also application-layer vulnerabilities, pose problems for applications trying to enforce boundaries between their own principals. From the application layer perspective, micro-architectural side-channel attacks are not necessarily different from more mundane vulnerabilities such as the circus of horrors arising from the use of memory-unsafe implementation languages. Some other side-channels, however, are in the application domain.
Any out-of-bounds (OOB) read bug, such as those due to unchecked array indices or type confusion arising from the use of unsafe implementation languages, can enable a hostile principal to read any memory in the application’s address space, including sensitive information belonging to other principals (and to the application itself).
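As a toy illustration (a simulation, not an exploit), the sketch below models one address space as a single `bytearray` shared by two hypothetical principals, `alice` and `mallory`. The names, layout, and helper functions are all assumptions made for illustration. The point is that the only boundary between the principals is an in-process check, so one forgotten check exposes everything in the address space.

```python
# Toy model: one address space (a bytearray) shared by two principals.
# The principal names, offsets, and sizes are hypothetical.
MEMORY = bytearray(64)
REGIONS = {"alice": (0, 32), "mallory": (32, 64)}  # (start, end) per principal

MEMORY[0:15] = b"alice-secret-42"


def read_checked(principal: str, offset: int, length: int) -> bytes:
    """Read relative to the principal's region, enforcing its bounds."""
    start, end = REGIONS[principal]
    if offset < 0 or length < 0 or start + offset + length > end:
        raise IndexError("read outside principal's region")
    return bytes(MEMORY[start + offset : start + offset + length])


def read_unchecked(principal: str, offset: int, length: int) -> bytes:
    """The same read with the bounds check forgotten: the bug class at issue."""
    start, _ = REGIONS[principal]
    return bytes(MEMORY[start + offset : start + offset + length])
```

Here `read_unchecked("mallory", -32, 15)` returns Alice's secret, while `read_checked` with the same arguments raises `IndexError`. Nothing but application logic stands between the two outcomes.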
This includes not just micro-architectural side-channel attacks like Spectre, but also e.g. timing side-channels in the application domain (see e.g. Cascading Style Sheets in web browsers [HabalovCSS]).
Bugs in the application’s own logic for isolating principals can result in a loss of integrity, confidentiality, or control. In the web browser context, where the origin is the principal, we call this universal cross-site scripting (UXSS) or same-origin policy bypass (SOP bypass). More generally, for any application type, we can call this principal confusion.
Application environments that give principals a rich programming interface, like (but likely not only) the OWP, often suffer from principal confusion. For a browser example, see Chromium issue 605766.
The presence of these problems, and in particular the presence of micro-architectural side-channels, poses problems not only in the application domain itself, but in exploit mitigations for the implementation.
If we assume that an attacker can achieve at least a word-sized OOB read and make use of the value (and, given micro-architectural side-channel attacks, we may have to even in the long term), certain existing exploit mitigations may lose their effectiveness. I call this the free infoleak problem.
Address space layout randomization (ASLR) can’t have meaning intra-process against an attacker with any kind of OOB read. On platforms that don’t re-randomize per process, the same is true across processes on the same host. This would seem to restrict the usefulness of ASLR to 64-bit platforms and only when defending against cross-host attacks. (ASLR on 32-bit platforms is known to be ineffective [ShachamASLR].)
It is better if the platform re-randomizes for each process, but this can degrade run-time efficiency.
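To make the arithmetic concrete, here is a minimal sketch of why a single leaked pointer undoes randomization of a whole module. The symbol offsets, the base address range, and the 28 bits of entropy are assumptions chosen for illustration, not properties of any particular platform.

```python
import secrets

PAGE_SIZE = 4096

# Hypothetical offsets of two symbols within a module. An attacker can
# compute real offsets from their own copy of the same binary.
OFFSET_OF_LEAKED_SYMBOL = 0x1A2B0
OFFSET_OF_TARGET_SYMBOL = 0x4C3D8


def randomize_base() -> int:
    """Stand-in for the loader: a page-aligned base with 28 random bits (assumed)."""
    return 0x7F0000000000 + secrets.randbits(28) * PAGE_SIZE


def recover_target(leaked_pointer: int) -> int:
    """Derive the target's address from one leaked pointer into the module."""
    module_base = leaked_pointer - OFFSET_OF_LEAKED_SYMBOL
    return module_base + OFFSET_OF_TARGET_SYMBOL


base = randomize_base()
leaked = base + OFFSET_OF_LEAKED_SYMBOL  # what a word-sized OOB read reveals
assert recover_target(leaked) == base + OFFSET_OF_TARGET_SYMBOL
```

However many random bits the loader uses, the attacker never has to guess them: one subtraction and one addition recover the target address.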
Here again, the free infoleak problem leads straightforwardly to an attacker’s ability to ‘repair’ a corrupt stack canary when exploiting an OOB write on the stack.
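The following simulation sketches the canary 'repair'. The frame layout (buffer, then canary, then return address), the `Frame` class, and the `exploit` helper are hypothetical and much simpler than a real stack. The attacker is assumed to have one OOB read and one OOB write relative to the buffer.

```python
import secrets
import struct

BUF_SIZE, WORD = 16, 8


class Frame:
    """Simulated stack frame: [buffer | canary | return address]."""

    def __init__(self) -> None:
        self.canary = secrets.token_bytes(WORD)
        self.mem = bytearray(BUF_SIZE) + self.canary + struct.pack("<Q", 0x401000)

    def oob_read(self, offset: int, length: int) -> bytes:
        """The assumed infoleak: an unchecked read relative to the buffer."""
        return bytes(self.mem[offset : offset + length])

    def overflow(self, data: bytes) -> None:
        """The assumed OOB write: an unchecked copy into the buffer."""
        self.mem[0 : len(data)] = data

    def return_address(self) -> int:
        """Function epilogue: verify the canary, then 'return'."""
        if bytes(self.mem[BUF_SIZE : BUF_SIZE + WORD]) != self.canary:
            raise RuntimeError("stack smashing detected")
        return struct.unpack_from("<Q", self.mem, BUF_SIZE + WORD)[0]


def exploit(frame: Frame, new_return: int, use_leak: bool) -> int:
    # Without a leak the attacker must guess; all zeros stands in for a guess.
    canary = frame.oob_read(BUF_SIZE, WORD) if use_leak else bytes(WORD)
    frame.overflow(b"A" * BUF_SIZE + canary + struct.pack("<Q", new_return))
    return frame.return_address()
```

With `use_leak=True` the overwritten frame passes the epilogue check and 'returns' to the attacker's address. With `use_leak=False` the check fails unless the guess happens to match the random canary.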
Certain implementations of control-flow integrity (CFI), such as some (but not all) of those that keep a ‘shadow stack’ somewhere in memory to validate returns, may be weakened given the free infoleak problem. If the attacker can locate the shadow stack, they may be able to use an OOB write to repair it in the process of exploiting an OOB write on the (real) stack. (Note that Intel CET write-protects the location of the shadow stack to avoid this [IntelCET].) This might or might not require the attacker to have a second OOB write vulnerability, depending on the freedom of the first OOB write primitive.
Since protecting returns is necessary for CFI to be effective [CarliniCFB], CFI implementations where the shadow stack is in the target’s address space and is writable would seem to be significantly weaker due to the free infoleak.
Although web browsers are relatively well-understood to define their own principals (primarily origins), they are by no means the only class of application that does so.
The most obvious type of principal for a web browser to define is the web origin. But depending on the browser design and implementation, there may be others:
Database servers need to defend themselves against hostile queries, and also to isolate different query authors.
Application servers (including, but not only, web servers) may also run code from multiple authors, and may therefore need to define principals and effectively enforce them.
Web browsers are not the only type of application to retrieve ‘documents’ from untrustworthy sources and parse, render, and execute their complex contents. A variety of other applications, such as office suites, PDF viewers, and so on, face the same risks.
A key goal for defenders is to identify a minimal number of simple mechanisms (i.e., parsimony) that defend well against a variety of problems, regardless of their root cause. Parsimony is important because each defense mechanism imposes a certain cost in run-time efficiency and complexity. Piling mechanism upon mechanism can quickly become untenable, and the interactions between mechanisms can reduce defense effectiveness or even give rise to new vulnerabilities.
One approach to defending against the problems is to maintain a 1:1 mapping between processes and principals (privilege separation) and to reduce the privilege of each process to the minimum appropriate for the principal (privilege reduction). (Privilege separation and reduction used together are often called ‘sandboxing’, although note that that term does not necessarily imply that each principal gets its own sandbox.)
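A minimal, Unix-only sketch of the pattern follows. The function name `run_as_principal` and the choice of limits are assumptions for illustration. Real sandboxes reduce privilege far more thoroughly than two resource limits can.

```python
import os
import resource


def run_as_principal(work, cpu_seconds: int = 2) -> bytes:
    """Run `work()` (which returns bytes) in its own child process."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # Child: a separate process and address space.
        status = 1
        try:
            os.close(read_fd)
            # Token privilege reduction: cap CPU time, forbid growing files.
            resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
            resource.setrlimit(resource.RLIMIT_FSIZE, (0, 0))
            os.write(write_fd, work())
            status = 0
        finally:
            os._exit(status)  # Never fall back into the parent's code.
    # Parent: collect the result over the pipe, then reap the child.
    os.close(write_fd)
    chunks = []
    while chunk := os.read(read_fd, 4096):
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

Anything `work()` writes to memory stays in the child, so the parent's state is unchanged when the call returns. Note that `fork` gives the child a copy of the parent's memory, so this sketch protects the parent's integrity but not the confidentiality of anything already loaded. A real design would start each principal's process from a clean image before handing it untrusted content.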
Process isolation is an excellent defense mechanism against the vulnerabilities we’re concerned with here, but the costs can be significant and surprising.
Each process incurs costs in time (startup latency, scheduling) and in space (the memory overhead ranges from low-but-nonzero to very high, depending on the platform). For example, one cloud service provider I talked to offers a service allowing customers to run arbitrary code on the provider’s servers. They need to scale to 10,000 or more customers per machine, at which point process isolation becomes too expensive in terms of process startup latency, kernel scheduling overhead, and memory overhead. (This particular provider uses Linux, where process overheads are very low relative to other platforms. They also report occasional spikes in process startup and scheduling latency, even though it is generally low.) As efficient as Linux is (in the common case), and as necessary as isolation is, there is a hard real-world limit that is below the needs of some service providers.
Worse, Linux is probably the best case (although I’d be interested to see if FreeBSD behaves more linearly under heavy load). Windows was designed with an expensive process/inexpensive thread programming model, and of course threads by definition do not provide the kind of isolation we need. Although Android uses Linux for its kernel, exec are not officially supported for non-platform applications; the intention is for applications to through Java platform APIs, which incur huge latency and memory overheads (due to virtual machine costs). (These costs apply even to applications that don’t need to run Java code.)
Most of the operating systems available to application developers, and certainly all of the popular ones, were not originally designed to provide privilege reduction and privilege separation (‘sandboxing’) for application-defined principals.
As a result, application developers seeking to isolate their own principals have had to develop ‘off-label’ sandboxing mechanisms using existing platform APIs (public, private, documented, and undocumented) and even quirks. Maintaining these mechanisms, and ensuring their continual functionality, imposes significant costs on application developers. This surely contributes to the relative rarity of sandboxing in applications.
For example, the Windows function is intended to enable Windows services running at high privilege (e.g. as SYSTEM or Network Service) to impersonate lower-privilege subjects to handle an incoming request, and then revert to their previous (higher) degree of privilege after the request is satisfied. However, Chromium uses this function ‘backwards’: renderers impersonate at higher privilege during sandbox warm-up, then call RevertToSelf to ‘re-drop’ privilege before handling web content.
On Linux (and hence Chrome OS), Chromium uses a variety of hacks to enable sandboxing. An example is the setuid-root helper binary. To isolate renderers from the filesystem, the helper clones the zygote while keeping the filesystem view shared, then chroots to an empty directory. (chroot is a UID 0-only system call; this zygote clone can chroot because the helper that cloned it is setuid-root.) This way, Chromium renderers can be chrooted without requiring the zygote to run as root.
(The Chromium renderer zygote (distinct from the Android concept of the same name) is a process that makes spawning renderer processes more efficient and reliable.)
The setuid helper is now deprecated in favor of user namespaces, but Chromium cannot entirely drop support for it because some Linux distributions do not support user namespaces. User namespaces are a Linux kernel feature enabling application-defined principals, but they come with the implementation flaws that inevitably come with a re-design that imposes new requirements on old code. (And they expose attack surface of their own.)
Chrome OS security engineers developed the Seccomp-BPF system for enabling processes to revoke their own access to kernel APIs. (Note in particular the “Caveats”, “Pitfalls”, and “What it isn’t” sections of that document.) With effort, they were able to merge it into upstream Linux. It works well for Chromium renderers and other callers, but is an example of the kind of complexity application developers have to take on to enforce their own principals.
Android was extended to provide the API, enabling applications to run many Service processes each isolated from each other and other principals.
Although macOS has a privilege reduction mechanism called Seatbelt, direct programmatic access to it was deprecated and is now undocumented. Apple supports preset Seatbelt profiles in Xcode, but the Seatbelt policy language is a “System Private Interface”. Other privilege reduction APIs, such as setrlimit to reduce certain resource consumption limits, appear to work but have no effect.
If a platform can provide process isolation efficient enough for an application’s needs, the remaining limitation is the granularity of isolation. The theoretical gold standard would seem to be object capability isolation, but existing platforms in the field don’t offer capability isolation for all types of resources. On most platforms there remain global resources, access to which can only be mediated in a somewhat ad hoc manner.
The in-development Zircon kernel (supporting the Fuchsia platform) aims to follow the object-capability model more closely. It should therefore be easier to achieve fine-grained isolation for application principals on this platform.
Since processes are expensive, it would be helpful if there were a more efficient way to isolate principals. For example, they could each get their own light-weight thread, but be locked to 1 or more segments of memory in the shared address space. (We could call this a segmented thread.) No widely-deployed platform directly supports this concept, but there are concepts in the literature (e.g. CHERI) and possible building blocks in leading-edge deployed platforms. Given the increasing need to isolate finer-grained principals, it would be helpful for the security of real-world applications if hardware and operating systems supported straightforward, guaranteed interfaces for this kind of isolation.
The CHERI architecture [WatsonCheriIsa] enables lower-cost application-defined principals by hardware-enforced capabilities, which are essentially memory segments. As a natural result of its design, CHERI defends against Variants 1 and 3 [WatsonCheriSpectre].
To defend against Variant 2, CHERI needs an enhancement: to expose a CHERI compartment identifier (CID) in the instruction set, allowing calling code to direct the micro-architecture as to when it can and cannot share state across compartments. In addition to being a defense against a class of vulnerability, this enhancement may improve applications’ ability to define principals in a flexible way, including trading off security and performance in important deployment scenarios (e.g. many principals using a shared runtime):
There are a various tradeoffs around the use of a CID to partition micro-architectural state. Hard partitioning of that state may lead to performance loss where prior sharing led to performance gain — e.g., where branch-predictor state was shared beneficially between two compartments sharing common code. This concern also arises in the context of harder partitioning for MMU-based process schemes: forked processes executing common code (e.g., a language runtime or server application) may benefit from branch-predictor leakage between tasks, which would be prevented by partitioning. However, architectural CIDs need not map one-to-one with software compartments: if only integrity or availability is required (and not confidentiality), then a software compartment change can take place without requiring use of an independent micro-architectural state.
CHERI or any similar hardware design is not deployed in the field now, and even if adopted immediately and universally, hardware lifecycles are long.
One can imagine microcode updates for existing hardware to support new security boundaries such as CHERI-like capabilities, but experience with microcode updates has shown that they are expensive and not without risk. Radical changes such as new security boundaries would likely be more expensive and risky. Mass-market operating system providers would still need to support upgraded and non-upgraded hardware, further increasing complexity.
Intel Memory Protection Keys (see e.g. [CorbetMPK]) enable an application to protect its own address space with up to 16 ‘keys’ for address space regions and 2 permission bits (read and write). This mechanism is supposed to be more efficient than the mprotect function. Corbet:
The problem with the current bits is that they can be expensive to manipulate. A change requires invalidating translation lookaside buffer (TLB) entries across the entire system, which is bad enough, but changing the protections on a region of memory can require individually changing the page-table entries for thousands (or more) pages. Instead, once the protection keys are set, a region of memory can be enabled or disabled with a single register write. For any application that frequently changes the protections on regions of its address space, the performance improvement will be large.
It would seem that an application could therefore use MPK to efficiently isolate the memory of up to 16 application-defined principals per process, such as by giving each principal a thread with its own protected memory, and reducing that thread’s access to the kernel.
However, the fact that the keys themselves are in the process’ address space, and that updating the control register is an unprivileged processor instruction, limits the defense value of this mechanism against attackers who can corrupt memory and/or take control of a process. (A process-per-principal solution can avoid this weakness.) If the application can ensure that the MPK-isolated principal cannot/does not issue WRPKRU (the instruction that updates the protection bits), and then continues executing code it controls, it can resolve that problem.
The mechanism might be more readily usable if the keys and the control register were managed exclusively with privileged processor instructions, but that might negate some of the performance advantage of MPK over traditional page protections.
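To illustrate both the appeal and the weakness, here is a toy Python model of the per-thread PKRU register: 16 keys with 2 bits each, where the low bit of a pair disables all access and the high bit disables writes. The `Thread` class, the page-to-key mapping, and `pkru_allowing_only` are hypothetical; this models the bookkeeping only and touches no real hardware.

```python
NUM_KEYS = 16
ACCESS_DISABLE = 0b01  # AD bit: blocks all data access to pages with this key
WRITE_DISABLE = 0b10   # WD bit: blocks writes to pages with this key


class Thread:
    """Toy model of one thread's PKRU register and its effect on access."""

    def __init__(self, page_keys: dict[int, int]) -> None:
        self.page_keys = page_keys  # page number -> protection key (0..15)
        self.pkru = 0               # every key fully accessible

    def wrpkru(self, value: int) -> None:
        """Model of the unprivileged WRPKRU instruction: any code may run it."""
        self.pkru = value & 0xFFFFFFFF

    def rights(self, key: int) -> int:
        return (self.pkru >> (2 * key)) & 0b11

    def can_read(self, page: int) -> bool:
        return (self.rights(self.page_keys[page]) & ACCESS_DISABLE) == 0

    def can_write(self, page: int) -> bool:
        return self.rights(self.page_keys[page]) == 0


def pkru_allowing_only(allowed_keys: set[int]) -> int:
    """Build a PKRU value that disables access to every key not allowed."""
    value = 0
    for key in range(NUM_KEYS):
        if key not in allowed_keys:
            value |= ACCESS_DISABLE << (2 * key)
    return value


# Page 0 belongs to principal A (key 1); page 1 belongs to principal B (key 2).
thread = Thread({0: 1, 1: 2})
thread.wrpkru(pkru_allowing_only({1}))  # confine the thread to A's memory
assert thread.can_read(0) and not thread.can_read(1)
thread.wrpkru(0)                        # hostile code simply undoes it
assert thread.can_read(1)
```

Switching principals is one register write, which is the efficiency argument. The last two lines show the catch: the same write is available to whatever code is running in the thread.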
There are also a variety of pragmatic hurdles to be overcome, such as memory management and communication with the application.
Although it is possible to mitigate some (not all) micro-architectural side-channels by controlling the machine code such that it does not rely on (e.g.) branches for critical safety checks (see e.g. [PizloSpectre] and [ChromeRethink]), it fundamentally means sacrificing much of the performance improvement that comes with out-of-order execution. The incompleteness of the solution, and the fact that it doesn’t address the other problems we want to address, makes it not cost-effective in general.
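One such technique is index masking: clamp the index with arithmetic, so that the safety property holds through data flow even if the bounds-check branch is mispredicted. The sketch below shows only the arithmetic, emulating unsigned 64-bit words in Python. The function names are hypothetical, and interpreted Python does not itself gain any speculation resistance from this; real mitigations apply the idea in compiled or JIT-generated code.

```python
BITS = 64
WORD_MASK = (1 << BITS) - 1


def index_mask(index: int, size: int) -> int:
    """All-ones if 0 <= index < size, else zero, computed without a branch.

    Assumes 0 < size <= 2**63 and 0 <= index < 2**64 (unsigned 64-bit words).
    """
    # (size - 1 - index) wraps around, setting the top bit, when index >= size.
    combined = (index | ((size - 1 - index) & WORD_MASK)) & WORD_MASK
    top_bit = combined >> (BITS - 1)  # 1 if out of bounds, else 0
    return (top_bit - 1) & WORD_MASK  # 0 -> all ones, 1 -> zero


def load_masked(table: list[int], index: int) -> int:
    """Bounds-checked load whose index is also clamped by data flow."""
    if index < len(table):  # the ordinary, predictable bounds check
        return table[index & index_mask(index, len(table))]
    return 0
```

An out-of-bounds index is forced to 0 by the mask, so even a load that runs on the wrong side of the branch reads inside the table. The extra arithmetic on every access is a small instance of the cost this approach imposes.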
That said, object code and runtime mitigations can be effective or even crucial in environments that can afford the cost, can’t afford the cost of address space isolation, have greater control over their operating environment than do (e.g.) web browsers, and/or don’t face the full suite of problems this article discusses.
To defend against all the attacks (not just OOB reads), can we get by on address space segmentation alone or do we need independent scheduling units (processes or threads) for each principal? That is likely to depend on the application and its operating environment. Some reasons to use independent scheduling units include:
Principals are often composed or compose themselves with data or code from foreign sources — other principals — and need a way to ergonomically do so in a way that does not break cross-principal isolation.
We can’t secure applications that directly include or link against hostile code. However, it may be possible to extend some application platforms to compose applications in a different, safer way. For example, iframes on the web (‘sandboxed’, and cross-process when Site Isolation is in effect) can provide a form of isolation. There have also been experiments to do something similar on Android ([PearceAdDroid] and my own unpublished work).
There may be other examples. Crucially, any such mechanism must be efficient and ergonomic, to enable rich composition.
Application-defined principals are here to stay, but applications struggle to enforce them with the tools that deployed, mainstream platforms provide.
Application developers need to recognize the principals they define, and re-engineer their applications to enforce them with effective platform mechanisms. At the moment, that usually means process isolation.
At the same time, application developers should organize to pressure platform developers to provide more efficient and effective isolation mechanisms. In particular, innovation is necessary in hardware. While there are some good experiments and proposals happening, more movement is necessary. This is difficult due to the long lifecycle of hardware. That makes it all the more important to start now.
Thanks to Jann Horn, Jorge Lucangeli Obes, Robert Sesek, and Will Harris for ideas, specific examples, and corrections. The errors that surely remain are my own.
[BarthOrigin] RFC 6454: The Web Origin Concept; Adam Barth
[CarliniCFB] Control-Flow Bending: On the Effectiveness of Control-Flow Integrity; Nicholas Carlini, Antonio Barresi, Mathias Payer, David Wagner, Thomas R. Gross
[ChromeRethink] Post-Spectre Threat Model Re-Think; Chrome Security Team
[CorbetMPK] Memory Protection Keys; Jonathan Corbet
[CrowellConfinement] The Confinement Problem: 40 Years Later; Alex Crowell, Beng Heng Ng, Earlence Fernandes, Atul Prakash
[HabalovCSS] Side-channel Attacking Browsers Through CSS3 Features; Ruslan Habalov
[HornSideChannel] Reading Privileged Memory With A Side Channel; Jann Horn
[IntelCET] Control-flow Enforcement Technology Preview; Intel
[KocherSpectre] Spectre Attacks: Exploiting Speculative Execution; Paul Kocher, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, Yuval Yarom
[LampsonConfinement] A Note On The Confinement Problem; Butler W. Lampson
[LinuxMPK] Overview Of Memory Protection Keys; Linux man-pages project
[LippMeltdown] Meltdown; Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, Mike Hamburg
[PearceAdDroid] AdDroid: Privilege Separation For Applications And Advertisers In Android; Paul Pearce, Adrienne Porter Felt, Gabriel Nunez, David Wagner
[PizloSpectre] What Spectre And Meltdown Mean For WebKit; Fil Pizlo
[ReisBrowserOS] Web Browsers As Operating Systems: Supporting Robust And Secure Web Programs; Charles Reis
[RogowskiDataOnly] Revisiting Browser Security in the Modern Era: New Data-only Attacks and Defenses; Roman Rogowski, Micah Morton, Forrest Li, Fabian Monrose, Kevin Z. Snow, Michalis Polychronakis
[SaltzerProtection] The Protection of Information In Computer Systems; Jerome H. Saltzer and Michael D. Schroeder
[ShachamASLR] On The Effectiveness Of Address-Space Randomization; Hovav Shacham, Matthew Page, Ben Pfaff, Eu-Jin Goh, Nagendra Modadugu, Dan Boneh
[WatsonCheriIsa] Capability Hardware Enhanced RISC Instructions: CHERI Instruction-set architecture; Robert N.M. Watson, Peter G. Neumann, Jonathan Woodruff, Jonathan Anderson, David Chisnall, Brooks Davis, Ben Laurie, Simon W. Moore, Steven J. Murdoch, Michael Roe
[WatsonCheriSpectre] Capability Hardware Enhanced RISC Instructions (CHERI): Notes on the Meltdown and Spectre Attacks; Robert N. M. Watson, Jonathan Woodruff, Michael Roe, Simon W. Moore, Peter G. Neumann