A brief introduction to speculative semantics
Speculative execution, aka speculation, is a form of computation that increasingly underlies every program our computers run. If your mental model of a CPU is a device that runs every instruction given to it, in the order given, with all prerequisite inputs gathered beforehand—wipe that model from your head, because we live in the world of speculation and Spectre.
Speculation (what is it, and why do we care?)
In short, speculation occurs when your computer guesses at the outcome of a particular operation before that operation actually finishes. There are many, many forms this can take. One of the most prosaic is branch prediction, in which the hardware guesses whether an if-statement condition will be true or false and begins executing on the guessed branch before it knows whether its guess was correct.
This sort of potentially-incorrect computation is allowed because the speculatively-executed instructions aren’t really being executed (according to hardware vendors, that is). During a given period of speculative execution, your computer carefully tracks what it’s done since the initial branch prediction, or other form of guess, that triggered the speculation. Once the true result of the branch condition is available, the speculatively-executed instructions are either committed (if the prediction was correct), and their results made permanent; or rolled back (if the prediction was incorrect) so the CPU can re-execute on the correct branch. In the second case, we refer to the speculatively-executed instructions as transient, because their effects are reverted after the fact.
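To make the commit-or-rollback mechanics concrete, here is a minimal Python sketch. It assumes a deliberately toy model in which all architectural state lives in a single dictionary; every name in it is my own illustrative invention, not drawn from any real semantics:

```python
import copy

def run_branch(state, branch_cond, body, prediction):
    """Toy model of speculating through an if-statement. `body` mutates
    `state`; `branch_cond` computes the branch's true outcome; and
    `prediction` is the predictor's guess."""
    checkpoint = copy.deepcopy(state)  # snapshot the architectural state
    if prediction:
        body(state)                    # transient execution begins
    actual = branch_cond(checkpoint)   # the "slow" true outcome arrives
    if prediction == actual:
        return state                   # commit: the guess was right
    state = checkpoint                 # roll back the transient work...
    if actual:
        body(state)                    # ...and re-execute correctly
    return state
```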
If you think all of that sounds harmless—after all, any incorrect instructions will be undone in the end—well, you wouldn’t be alone. Even hardware vendors didn’t think there was a real problem here before the original Spectre vulnerability, sometimes called Spectre v1, proved that transient execution has real, detectable effects on the hardware state. Spectre demonstrated that attackers can practically use those effects to learn cryptographic secrets or even take control of a program’s execution. And because transient execution falls outside traditional models of computation, even programs previously thought to be provably, mathematically safe—typically meaning that secret data cannot be leaked to attackers—are vulnerable to Spectre-type attacks.
Thus, for those of us in pursuit of formal security guarantees, the question becomes: how can we incorporate speculative execution into our computational models? More precisely, what are, or should be, the semantics of speculation?
Speculative semantics
Some notes before we dive further in: modeling speculative execution is a very new and active line of research, featuring many different and evolving approaches. Due to space constraints, and to keep this blog post approachable, I’ll be focusing on high-level concepts that inform how speculative semantics are designed. For implementation-level details, particularly as they vary across use cases and forms of speculation, I recommend checking out the papers in my references.
Why don’t “normal” semantics work for speculation?
Let’s look at an example, straight from Spectre v1:
```c
if (x < arr1_len) {           // bounds check: may be mispredicted
    y = arr2[arr1[x] * 64];   // transiently reads out of bounds, then
}                             // leaks arr1[x] via which line is cached
```
What would sequential semantics (i.e., our standard non-speculative model of how programs should run) say about this piece of code? Probably we would start with a generic semantics for `if`-statements, something like:

\[\frac {\text{if }(cond)\{stmt\} ~~~~~cond\stackrel{*}{\rightarrow}\text{true}} {\text{<exec}~stmt\text{>}}\] \[\frac {\text{if }(cond)\{stmt\} ~~~~~cond\stackrel{*}{\rightarrow}\text{false}} {\text{skip}}\]

where \(\text{<exec}~stmt\text{>}\) would be replaced by whatever it means to “execute” a statement, most likely entailing updates to some model of memory and architectural state. Then for our example, these semantics would give us:
\[\frac {\text{if }(\texttt{x < arr1_len})\{\texttt{...}\} ~~~~~(\texttt{x < arr1_len})\stackrel{*}{\rightarrow}\text{true}} {\text{<exec}~\texttt{y = arr2[arr1[x] * 64];}\text{>}}\] \[\frac {\text{if }(\texttt{x < arr1_len})\{\texttt{...}\} ~~~~~(\texttt{x < arr1_len})\stackrel{*}{\rightarrow}\text{false}} {\text{skip}}\]

Notice that under these semantics, the two branches are mutually exclusive! We execute the interior statement only if the branch condition eventually evaluates to \(\text{true}\). If the branch condition evaluates to \(\text{false}\), we do nothing, instead \(\text{skip}\)ping to the next statement. Furthermore, these two cases are completely deterministic according to the values of `x` and `arr1_len`.
What these semantics don’t tell us is what happens when the conditional branch predictor mispredicts the result of the branch condition. Let’s say that `x` is not, in fact, less than `arr1_len`. That is, \((\texttt{x < arr1_len})\stackrel{*}{\rightarrow}\text{false}\). Our semantics say that in this case, we should skip over the interior statement entirely.

But if the branch predictor guesses that `x < arr1_len` is true, then we’ll begin executing the statement anyway! Our sequential semantics can’t model that.
Tracing transient execution
What we need is a way to track transiently-executed operations in addition to sequential ones, even through rollbacks. But it’s not as simple as just tracking every instruction executed, speculatively or not, because transient execution has different and more limited effects than sequential execution. For example, while a transiently-executed memory store affects what memory addresses are kept in cache, it does not ultimately change the actual contents of memory.
One (naive) option would be to model the specific elements of hardware that retain effects of transient execution. This would require us to build whole caches, as well as any other impacted structures, into our notion of execution state. We would then also need to model all the policies associated with those caches, so that we could properly update their contents at each step; and add those updates into our sequential semantics; and do the same for any other hardware element we believe to be affected by transient execution. Clearly, this approach would be a massive undertaking. And that’s before taking into account how cache structures and policies vary between systems, and how often information about those structures is obscured by manufacturers—so we as semantics designers wouldn’t even know everything we need to build an accurate model.
Instead, speculative semantics often incorporate a notion of observations, representing what information an adversary can be expected to learn as a result of any given step in a program’s execution. What exact observations result from a specific step is encoded into each operation’s semantics. For example, an `if`-statement by itself might not provide any new information to the adversary. But a memory access inside an `if`-statement would, usually in the form of knowledge about what memory addresses are contained in cache.
We can track these observations by adding them to an observation trace at every step in a program’s execution. For example, the sequential semantics for a memory load `arr1[x]` would now append to the trace the observation that the address of `arr1[x]` is now in cache. That information then stays in the trace even if the access `arr1[x]` is later rolled back, and all its effects on the rest of the architectural state are reversed.
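As a rough illustration, we can extend the earlier rollback sketch with a trace parameter that records observations and, crucially, is never rolled back. The observation tuples and all the names below are a format I invented for this post, not standard notation:

```python
import copy

def load(state, trace, array, index):
    """A load that records its attacker-visible effect: the accessed
    address lands in cache, so it goes into the observation trace."""
    trace.append(("addr", array, index))  # survives any later rollback
    return state[array][index]

def run_branch(state, trace, branch_cond, body, prediction):
    checkpoint = copy.deepcopy(state)  # snapshot architectural state
    if prediction:
        body(state, trace)             # transient execution; trace grows
    actual = branch_cond(checkpoint)
    if prediction != actual:
        state = checkpoint             # rollback restores the state, but
        if actual:                     # the trace keeps every transient
            body(state, trace)         # observation
    return state

# Spectre v1 replay: the "valid" length of arr1 is 3, x = 9 is out of
# bounds, and the predictor has been trained to guess "taken" anyway.
# The secret value 42 sits just past the valid part of arr1.
state = {"arr1": [1, 2, 3, 0, 0, 0, 0, 0, 0, 42], "arr2": [7] * 4096, "y": 0}
trace = []

def gadget(s, t):
    s["y"] = load(s, t, "arr2", load(s, t, "arr1", 9) * 64)

state = run_branch(state, trace, lambda s: 9 < 3, gadget, prediction=True)
print(state["y"])  # 0: the transient write of 7 to y was rolled back
print(trace)       # [('addr', 'arr1', 9), ('addr', 'arr2', 2688)]
                   # and 2688 / 64 = 42: the secret leaked anyway
```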
Notice that this method of tracking attacker information doesn’t have to be unique to speculative semantics! You could just as easily use observation traces in modeling sequential execution, though depending on the use case, you might find them redundant with the raw instruction trace.
Leakage models
So now we have a way to track what an attacker learns from transient execution. But how do we know when something is observable in the first place?
The short answer is: we often don’t. The hardware security community’s understanding of attacker capabilities is constantly evolving as people uncover new forms of information leakage. Some of these result from new optimizations introduced by hardware manufacturers, while others derive from clever uses of existing mechanisms. There’s a reason the original Spectre is often called Spectre v1—there have been a huge number of related attacks developed since that first one!
In practice, semantics designers make best-effort decisions about what to assume attackers can learn. In their review of software-level defenses against Spectre, Cauligi et al. refer to this as the leakage model [1]. Researchers develop leakage models as a function of both current understandings of what a “reasonable” attacker is likely capable of and the specific intended use case of the semantics. For example, most leakage models presume that an adversary can detect which memory address is accessed in a load or store instruction, because loads and stores typically bring that address into cache, where attackers can detect its presence with relative ease. Some leakage models, but not all, might additionally assert that the attacker can detect the contents of accessed memory addresses, which is suitable when assuming a significantly more powerful adversary.
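In code terms, a leakage model is essentially the function that decides which observations each operation emits. Here is a hypothetical sketch of the two models just described, written against the `load` rule from the previous sketch (again, all names and formats are invented for illustration):

```python
def leak_address_only(array, index, value):
    """Common baseline model: a memory access reveals the address it
    touches (via the cache), but not the data stored there."""
    return [("addr", array, index)]

def leak_address_and_value(array, index, value):
    """Stronger-attacker model: the access additionally reveals the
    value that was read."""
    return [("addr", array, index), ("val", value)]

def load(state, trace, leakage, array, index):
    """The load rule, now parameterized by a leakage model."""
    value = state[array][index]
    trace.extend(leakage(array, index, value))  # the model decides what leaks
    return value
```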
Handling nondeterminism
Beyond tracking what an attacker learns if a piece of code executes transiently, we also need to somehow determine when that transient execution should occur. Unlike sequential execution, speculative execution is inherently nondeterministic from the programmer’s perspective. While you might be able to predict the outcome of a given speculation decision when given sufficient information about the hardware state, in practice, we almost never have enough information to do so.
Note that though predicting speculation decisions is challenging, controlling those decisions as an adversary is much more achievable. Attackers can exploit well-understood techniques to “train” speculation mechanisms, such as the branch predictor, into making specific incorrect predictions. For example, by executing a specific branch repeatedly in a context where the branch condition is true, an adversary could trick the predictor into guessing that condition will remain true in another context (e.g., in another user’s process running the same program). We see this concept modeled in speculative semantics as attacker directives, which state what predictions should be made at specified points during a program’s execution.
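In a semantics, directives typically show up as one more input to the rules: a stream of attacker-chosen predictions that is consumed at each prediction point. Sketched against the earlier branch rule (with the same invented names as before):

```python
import copy

def run_branch(state, trace, branch_cond, body, directives):
    """As before, except the guess now comes from an attacker-supplied
    directive stream rather than from a modeled branch predictor."""
    prediction = directives.pop(0)  # the attacker picks the guess
    checkpoint = copy.deepcopy(state)
    if prediction:
        body(state, trace)          # possibly-transient execution
    actual = branch_cond(checkpoint)
    if prediction != actual:
        state = checkpoint          # rollback; the trace is untouched
        if actual:
            body(state, trace)
    return state

# For example, directives=[True] forces a "taken" guess at the first
# branch, whether or not the condition actually holds.
```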
Even with directives, however, we cannot know exactly which predictions a real attacker would train the hardware to make. Thus we still need a way to explore the full scope of possible observations from speculative execution. Just as with modeling attacker observations, building a whole conditional branch predictor or other speculation mechanism into our semantics’ notion of state would be extraordinarily difficult. In practice, many formulations of speculative semantics follow the naive approach of exploring every possible speculative path, by assuming misprediction will occur every time.
You might wonder how this approach can be tractable. After all, it seems like even a mechanism as constrained as a conditional branch predictor—with `true` or `false` as its only two options—would see exponential growth in its search space when exploring all paths. It helps, however, that there is only one “real” path through the program: the path we would follow under sequential semantics. Every time we explore an incorrect branch, the instructions we execute there are rolled back shortly thereafter, leaving nothing but their effects on the observation trace, and allowing us to proceed with program execution along only the deterministic, sequential path.
This approach is also aided by the fact that processors only dedicate so many resources to any given round of speculation. These resource constraints create what we call a speculation window: the maximum number of cycles, instructions, or other resources that the processor can spend on speculation from a single prediction. Because speculation windows tend to be short, we can largely avoid exploding our search space by modeling those windows directly in our semantics.
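Putting those pieces together, here is a sketch of the always-mispredict strategy with a speculation window. To stay short it makes several simplifying assumptions of my own: a program is just a list of statements, only the mispredicted-taken direction is modeled, and the window counts statements rather than cycles:

```python
import copy

def explore(state, trace, program, window=8):
    """Execute `program` sequentially; at each branch whose condition is
    false, first run the wrong ("transient") path on a throwaway copy
    of the state, bounded by the speculation window."""
    for stmt in program:
        if isinstance(stmt, tuple) and stmt[0] == "if":
            _, cond, body = stmt
            if cond(state):
                for s in body:           # the one "real" sequential path
                    s(state, trace)
            else:
                ghost = copy.deepcopy(state)
                for s in body[:window]:  # transient leg, window-bounded
                    s(ghost, trace)      # observations still accumulate
                # ghost is discarded here: architectural rollback
        else:
            stmt(state, trace)
    return state
```

Because every mispredicted leg rolls back onto the same sequential path, each branch contributes at most `window` extra steps instead of a whole new subtree of executions.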
Yet even with those advantages, always assuming mispredictions is significantly less feasible with forms of speculation that are less constrained than conditional branch predictors. When the full address space is available as possible “guesses”, for example, exploring every possible mispredicted path is far from simple. As such, we more often see this approach in conceptual semantics papers than with tools intended for building verifiable, but practical, mitigations. The latter tend to make use of data flow analysis and other techniques to more narrowly select the possible range of paths to explore [1].
Proving safety
We’ve now looked at when and how to model the effects of transient execution. To actually use speculative semantics to reason about or verify a program’s execution, we also need to know what our criteria for safety should be. We can call this our security policy: a set of decisions made by semantics designers about when a program’s execution is considered secure.
As with leakage models, security policies vary by the context and use case of a given set of speculative semantics. Most, though, are founded on some form of noninterference, which aims to show that changes to the values of some set of “secrets” (e.g., a cryptographic key) are completely undetectable to attackers so long as non-secret (“public”) values are held constant. While some speculative semantics seek to enforce noninterference for all secrets, others are interested in what Cauligi et al. call relative noninterference [1]. Under relative noninterference security policies, a program is considered secure so long as speculation does not add any vulnerabilities beyond those that may already exist under sequential semantics. While this policy clearly cannot guarantee that a program is fully secure, it can be sufficient when showing that a compiler does not add new speculative vulnerabilities to the source program, for example.
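Observation traces make noninterference easy to state operationally: run the program twice with identical public inputs but different secrets, and compare what the attacker observes. A toy check in the style of the earlier sketches (the `run_program` interface is an assumption for illustration):

```python
def interferes(run_program, public, secret_a, secret_b):
    """Toy noninterference check: if changing only the secret changes
    the observation trace, the runs are attacker-distinguishable and
    the program leaks."""
    trace_a, trace_b = [], []
    run_program(public, secret_a, trace_a)
    run_program(public, secret_b, trace_b)
    return trace_a != trace_b  # True means the secret influenced the trace
```

A relative-noninterference policy would instead only require this check to pass for secret pairs whose traces already agree under the sequential semantics.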
Conclusion
I hope you enjoyed this very brief introduction to speculative semantics! I’ve covered a few ways we can model how attackers gain information from transient execution, what information they gain, nondeterminism in predictions, and what safety under speculation should actually mean. The approaches I’ve covered here are far from the only ones, however, nor could I fit the full details and nuance of these concepts into this blog post. For any readers interested in further exploring how speculative semantics are built in practice, I highly recommend checking out one or more of the papers below!
References and further reading
- [1] Cauligi et al. SoK: Practical Foundations for Software Spectre Defenses. IEEE S&P 2022.
  - Highly recommended reading for those interested in learning more; provides an overview of many recent approaches in this space.
- [2] Cauligi et al. Constant-Time Foundations for the New Spectre Era. PLDI 2020.
  - A good first paper for diving into the full math of (one version of) speculative semantics.
- [3] Guarnieri et al. Hardware-Software Contracts for Secure Speculation. IEEE S&P 2021.
  - Another good paper, formalizing how to specify the safety of a user program in terms of the hardware’s speculation and leakage mechanisms.
Alexandra Michael is a third-year PhD student at UW, working on formal methods for hardware security.