# Microarchitectural Attacks and Provable Defenses

### **Seminar ENS Rennes**

September 24th, 2025

Lesly-Ann Daniel, EURECOM





# My background



# Something I wish I knew before: How to live with impostor syndrome

- Bachelor: felt like I didn't belong as a woman
- ENS-Rennes: weak math background, no classe prépa
- Academia: very competitive environment full of smart people
  - Hard to not compare yourself to others



# What helped

- Know that you're not alone
- Be kind to yourself and others
- Very supportive environments with amazing colleagues (choose your advisors wisely)
- Do not compare to others: everyone is different!
- Amy Cuddy TED Talk Fake it Till You Make it
- For women: <u>L'Oréal-UNESCO For Women in Science</u>

# Microarchitectural Attacks and Provable Defenses

### **Seminar ENS Rennes**

September 24th, 2025

Lesly-Ann Daniel, EURECOM





# Security critical software is prevalent

### What:

- Secure communications,
- Banking transactions, ...

### Where:

- Servers, smartphones, ...
- Shared by many users (cloud)

# SECURITY CABER SECURITY CABER SECURITY COMMENT AND A SECURITY COMMEN

### How: cryptography

- Mathematical guarantees, verified implems.
- But what about their execution in the physical world?



Intel Meteor Lake – Credit <a href="https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/">https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/</a>



Intel Meteor Lake - Credit https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/

- Caches



Intel Meteor Lake - Credit <a href="https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/">https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/</a>

- Caches
- Out-of-order speculative execution



Intel Meteor Lake - Credit https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/

- Caches
- Out-of-order speculative execution
- And more [1]?

[1] Vicarte, Jose Rodrigo Sanchez, et al. "Opening pandora's box: A systematic study of new ways microarchitecture can leak private data." ISCA, 2021



Intel Meteor Lake - Credit https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/

Caches



# What about security?

### execution

- And more [1]?

[1] Vicarte, Jose Rodrigo Sanchez, et al. "Opening pandora's box: A systematic study of new ways microarchitecture can leak private data." ISCA, 2021



Intel Meteor Lake - Credit https://semianalysis.com/2022/05/26/meteor-lake-die-shot-and-architecture/

# ... Well security is not good :(



# Spectre flaws continue to haunt Intel and AMD as researchers find fresh attack method

The indirect branch predictor barrier is less of a barrier than hoped

 ♣ Thomas Claburn
 Fri 18 Oct 2024 14:01 UTC



\*non exhaustive list

# Back to the basics

### Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems

Cache-timing attacks on AES

Paul C. Kocher

Cryptography Research, Inc.
607 Market Street, 5th Floor, San Francisco, CA 94105, USA.
E-mail: paul@cryptography.com.

Abstract. By carefully measuring the amount of time required to perform private key operations, attackers may be able to find fixed Diffie-Hellman exponents, factor RSA keys, and break other cryptosystems. Against a vulnerable system, the attack is computationally inexpensive and often requires only known ciphertext. Actual systems are potentially at risk, including cryptographic tokens, network-based cryptosystems, and other applications where attackers can make reasonably accurate timing measurements. Techniques for preventing the attack for RSA and Diffie-Hellman are presented. Some cryptosystems will need to be revised to protect against the attack, and new protocols and algorithms may need to incorporate measures to prevent timing attacks.

Daniel J. Bernstein \*

Department of Mathematics, Statistics, and Computer Science (M/C 249)

The University of Illinois at Chicago

Chicago, IL 60607-7045

djb@cr.yp.to

Abstract. This paper demonstrates complete AES key recovery from known-plaintext timings of a network server on another computer. This attack should be blamed on the AES design, not on the particular AES library used by the server; it is extremely difficult to write constant-time high-speed AES software for common general-purpose computers. This paper discusses several of the obstacles in detail.

2005

# **Control-flow leaks**





- end-to-end timing
- different resource consumption
- branch predictor state
- instruction cache
- instruction prefetcher
- micro-op cache

- . . .

```
bool check_pin(char* guess) {
  for (i=0; i<4; i++)
    if (guess[i] != pin[i])
     return false;
  return true;
}</pre>
```





```
bool check_pin(char* guess) {
  for (i=0; i<4; i++)
    if (guess[i] != pin[i])
     return false;
  return true;
}</pre>
```

```
pin = 4321
```

 $0000 \rightarrow 1s$   $1000 \rightarrow 1s$   $2000 \rightarrow 1s$   $3000 \rightarrow 1s$   $4000 \rightarrow 2s$   $5000 \rightarrow 1s$ ...



```
bool check_pin(char* guess) {
  for (i=0; i<4; i++)
    if (guess[i] != pin[i])
     return false;
  return true;
}</pre>
```

```
pin = 4321
```

 $0000 \rightarrow 1s$   $1000 \rightarrow 1s$   $2000 \rightarrow 1s$   $3000 \rightarrow 1s$   $4000 \rightarrow 2s$   $5000 \rightarrow 1s$ ...



```
bool check_pin(char* guess) {
  for (i=0; i<4; i++)
    if (guess[i] != pin[i])
     return false;
  return true;
}</pre>
```

$$4000 \rightarrow 2s$$

$$4100 \rightarrow 2s$$

$$4200 \rightarrow 2s$$

$$4300 \rightarrow 3s$$

$$4400 \rightarrow 2s$$

$$4500 \rightarrow 2s$$
...



```
bool check_pin(char* guess) {
  for (i=0; i<4; i++)
    if (guess[i] != pin[i])
     return false;
  return true;
}</pre>
```

```
nin = 4321
```





```
bool check_pin(char* guess) {
  good = true;
  for (i=0; i<4; i++)
     good &= guess[i] == pin[i];
  return good;
}</pre>
```

### **Solution**

Make timing independent of secret Remove secret-dependent branch!

### Victim program

### Data cache





Attacker
Shares microarchitecture
with victim

### Victim program

x = tab[secret]





### Victim program



# Victim program

x = tab[secret]

- caches
- data pre-fetchers
- load/store dependencies
- ...



# **Solution? Constant-time programming!**



### **Unsafe instructions**

- Control-Flow
- Memory accesses
- Variable-time instr.

- Full software countermeasure
- De facto standard for crypto: BearSSL, Libsodium, HACL\*, etc.
- (Almost) Secure against micro-architectural attacks (hum...)

# Constant-time is not easy to implement

```
uint32_t select(uint32_t x, uint32_t y, bool secret) {
  if (secret) return x;
  else return y;
}
```

```
uint32_t ct_select(uint32_t x, uint32_t y, bool secret) {
   signed b = 0 - secret;
   return (x & b) | (y & ~b);
}
```

# Compilers can break constant-time!



# What can we do about it?

# **Constant-time preserving compilers**

Secure compilation of side-channel countermeasures: the case of cryptographic "constant-time"

Gilles Barthe\*, Benjamin Grégoire<sup>†</sup>, Vincent Laporte\*

\*IMDEA Software Institute, Madrid, Spain
gilles.barthe@imdea.org vlaporte@imdea.org

†Inria, Sophia-Antipolis, France
benjamin.gregoire@inria.fr

# Formal Verification of a Constant-Time Preserving C Compiler

GILLES BARTHE, MPI for Security and Privacy, Germany and IMDEA Software Institute, Spain SANDRINE BLAZY, Univ Rennes, Inria, CNRS, IRISA, France BENJAMIN GRÉGOIRE, Inria, France RÉMI HUTIN, Univ Rennes, Inria, CNRS, IRISA, France VINCENT LAPORTE, Inria, France DAVID PICHARDIE, Univ Rennes, Inria, CNRS, IRISA, France ALIX TRIEU, Aarhus University, Denmark

# Domain specific languages & compilers for crypto

### FaCT: A DSL for Timing-Sensitive Computation

Sunjay Cauligi UC San Diego, USA

Fraser Brown Stanford, USA

Benjamin Grégoire INRIA Sophia Antipolis, France Gary Soeller UC San Diego, USA

Riad S. Wahby Stanford, USA

Gilles Barthe
MPI for Security and Privacy,
Germany
IMDEA Software Institute, Spain

Deian Stefan UC San Diego, USA Brian Johannesmeyer UC San Diego, USA

> John Renner UC San Diego, USA

Ranjit Jhala UC San Diego, USA

### Jasmin: High-Assurance and High-Speed Cryptography

José Bacelar Almeida INESC TEC and Universidade do Minho, Portugal

> Arthur Blot ENS Lyon, France

Tiago Oliveira INESC TEC and FCUP Universidade do Porto, Portugal Manuel Barbosa INESC TEC and FCUP Universidade do Porto, Portugal

Benjamin Grégoire Inria Sophia-Antipolis, France

Hugo Pacheco INESC TEC and Universidade do Minho, Portugal

Pierre-Yves Strub École Polytechnique, France Gilles Barthe IMDEA Software Institute, Spain

Vincent Laporte IMDEA Software Institute, Spain

Benedikt Schmidt Google Inc.

# What can we do about it?

### **Constant-time preserving compilers**

Secure compilation of side-channel countermeasures: the case of cryptographic "constant-time"

Gilles Barthe\*, Benjamin Grégoire†, Vincent Laporte\*

# Formal Verification of a Constant-Time Preserving C Compiler

GILLES BARTHE, MPI for Security and Privacy, Germany and IMDEA Software Institute, Spain SANDRINE BLAZY, Univ Rennes, Inria, CNRS, IRISA, France BENJAMIN GRÉGOIRE, Inria, France

# Fight the compiler And verify binary code

Sunjay Cauligi UC San Diego, USA

Fraser Brown Stanford, USA

Benjamin Grégoire INRIA Sophia Antipolis, France Gary Soeller UC San Diego, USA

Riad S. Wahby Stanford, USA

Gilles Barthe
MPI for Security and Privacy,
Germany
IMDEA Software Institute, Spain

Deian Stefan UC San Diego, USA Brian Johannesmeyer UC San Diego, USA

> John Renner UC San Diego, USA

Ranjit Jhala UC San Diego, USA JOSE BACEIAI AIMEIUA
INESC TEC and
Universidade do Minho, Portugal

Arthur Blot ENS Lyon, France

Tiago Oliveira INESC TEC and FCUP Universidade do Porto, Portugal INESC TEC and FCUP
Universidade do Porto, Portugal

Benjamin Grégoire Inria Sophia-Antipolis, France

Hugo Pacheco INESC TEC and Universidade do Minho, Portugal

Pierre-Yves Strub École Polytechnique, France IMDEA Software Institute, Spain

Vincent Laporte IMDEA Software Institute, Spain

> Benedikt Schmidt Google Inc.



# **Constant-Time (a bit more) Formally**

2 executions that only differ in their secret input must be indistinguishable to an observer



# **Constant-Time (a bit more) Formally**

2 executions that only differ in their secret input must be indistinguishable to an observer



# Several approaches [1]

### **Static**

- Type systems
- Abstract interpretation
- Symbolic execution

# **Dynamic**

- Record and compare observations
- Statistical tests
- Fuzzing
- Dynamic symbolic execution

[1] Geimer, Antoine, Mathéo Vergnolle, Frédéric Recoules, Lesly-Ann Daniel, Sébastien Bardin, and Clémentine Maurice. "A systematic evaluation of automated tools for side-channel vulnerabilities detection in cryptographic libraries." In ACM CCS 2023.

# 41st IEEE Symposium on Security and Privacy

# Binary-Level Symbolic Execution for Constant-Time

BINSEC/REL: Efficient Relational Symbolic Execution for Constant-Time at Binary-Level

Lesly-Ann Daniel\*, Sébastien Bardin\*, Tamara Rezk<sup>†</sup>





```
0x0080: mul t, p, s
0x0084: add t, t, 48
0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

Can error be reached?



```
0x0080: mul t, p, s
0x0084: add t, t, 48
0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### **Symbolic store**



```
0x0080: mul t, p, s
0x0084: add t, t, 48
0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### **Symbolic store**



```
0x0080: mul t, p, s
0x0084: add t, t, 48
0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### **Symbolic store**



```
0x0080: mul t, p, s
0x0084: add t, t, 48
0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### **Symbolic store**



#### **Path constraint**

$$t_0 = 0$$



0x0080: mul t, p, s

0x0084: add t, t, 48

0x0088: beqz t error

0x008c: div t, s, t

0x0090: [...]

#### Symbolic store

 $p \mapsto p_0$  $S \mapsto S_0$ t → p<sub>0</sub> × s<sub>0</sub> - 48 pc → error

#### Path constraint

$$t_0 = 0$$

#### Can error be reached?

=> Query SMT-Solver

$$t_0 = p_0 \times s_0 - 48$$

$$\wedge t_0 = 0 \text{ is SAT?}$$

$$\sum_{\rho_0 = 6} p_0 = 6$$

$$s_0 = -8$$







## Safety vs. 2-Hypersafety

2 executions that only differ in their secret input must be indistinguishable to an observer



Secure Information Flow by Self-Composition\*

Gilles Barthe<sup>1</sup> Pedro R. D'Argenio<sup>2</sup> Tamara Rezk (corresponding author) <sup>3</sup>

**Key idea:** Turn a 2-hypersafety property of a program **P** to a safety property of a self-composed program **P;P'** 

Can re-use verification techniques/tools for safety!

```
0x0080: mul t, p, s
0x0084: add t, t, 48

0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### Is program CT?

(1) SE 
$$t_0 = p_0 \times s_0 - 48$$

```
0x0080: mul t, p, s
0x0084: add t, t, 48
0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### Is program CT?



```
0x0080: mul t, p, s
0x0084: add t, t, 48

0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### Is program CT?



```
0x0080: mul t, p, s
0x0084: add t, t, 48
0x0088: beqz t error
0x008c: div t, s, t
0x0090: [...]
```

#### Is program CT?



## **Beyond Self-Composition: Optimization for SE**

#### Relational Symbolic Execution

Gian Pietro Farina\*<sup>1</sup>, Stephen Chong<sup>†2</sup> and Marco Gaboardi<sup>‡1</sup>

<sup>1</sup>University at Buffalo, SUNY <sup>2</sup>Harvard University

- 2 execution in 1 SE instance
- Maximize sharing
- Spare queries

# BINSEC/REL: Efficient Relational Symbolic Execution for Constant-Time at Binary-Level

Lesly-Ann Daniel\*, Sébastien Bardin\*, Tamara Rezk<sup>†</sup>

\* CEA, List, Université Paris-Saclay, France † INRIA Sophia-Antipolis, INDES Project, France

 $les ly-ann. daniel @\,cea.fr,\,sebastien.bardin @\,cea.fr,\,tamara.rezk @\,inria.fr$ 

- RelSE for CT
- Optimization for binary-level

# And concretely?



# Binsec/Rel

#### **Binary**

X86-32 / 64 RISC-V 32 ARMv7/AARCH64/AMD64

#### Configuration

Concretize esp, .data, canaries, ...
Libc stubs





Boolector Bitwuzla











CT-analysis of cryptographic primitives

## Preservation of constant-time by compilers

#### 11 compiler versions

- 5 versions of clang for x86
- 5 versions of gcc for x86
- 1 version of gcc for ARM

#### **Optimization setups**

- Optimization level O1 ... O3
- Individual optimizations
  - X86-cmov-converter, if-conversion

#### **Programs**

- Analyze 34 small programs
- Total: 4148 binaries



Compile

&

Analyze with Binsec/Rel

```
LLVM ≠ Binary
                                                                                            arg 4= dword ptr 8
                                                                                                               Binary
                                                                                            push
                                                                                                   esi
                                                                                                  eax, [esp+4+arg_4]
                                                                                                   edx, [eax]
                                                                                                   esi, [eax+4]
int sort2(int *out2, int *in2) {
                                      Source
                                                                                                   ecx, [eax+4]
  signed char c;
                                                                                                   edx, esi
  c = (in2[0] < in2[1]) - 1;
                                                                                                   short loc 80483B3
  out2[0] = (\sim c \& in2[0]) | (c \& in2[1]);
  out2[1] = (\sim c \& in2[1]) | (c \& in2[0]);
                                                                                                💶 🚄 🖼
  return (in2[0] < in2[1]);
                                                                                                       esi, edx
                                                                 LLVM-IR
                                                                                            💶 🚄 🚾
define i32 @sort2(i32* nocapture %out2, i32* nocapture reador.y
  %1 = load i32* %in2, align 4, !tbaa !1
                                                                                            loc 80483B3:
  %2 = getelementptr inbounds i32* %in2, i32 1
                                                                                                   edx, [esp+4+arg_0]
                                                                                                   [edx], esi
  %3 = load i32 * %2, align 4, !tbaa !1
                                                                                                   esi, eax
                                                                                                   short loc 80483BF
    = select i1 %not., i32 %3, i32 %1
  %5 = load i32* %2, align 4, !tbaa !1
                                                                                                **
                                                                                                       esi, ecx
  %7 = select i1 %not., i32 %6, i32 %5
                                                                                           53
                                                                                               store i32 %7, i32* %8, align 4, !tbaa !1
  %9 = load i32 * %in2, align 4, !tbaa !1
  %10 = load i32* %2, align 4, !tbaa !1
                                                                                               loc 80483BF:
                                                                                                      ecx. [esi]
  %11 = icmp slt i32 %9, %10
                                                                                               mov
                                                                                                      [edx+4], ecx
  %12 = zext i1 %11 to i32
                                             Backend passes can still
                                                                                                      ecx, [eax]
                                                                                               mov
  ret i32 %12
                                                                                                      ecx, [eax+4]
                                                                                               setl
                                                                                                      al
                                               introduce violations!
                                                                                               movzx
                                                                                                      eax, al
                                                                                               pop
                                                                                                      esi
                                                                                               retn
                                                                                               sort2 endp
```

public sort2 sort2 proc near arg\_0= dword ptr

#### Clang adds secret dependent memory access

```
void sort2(i32* out, i32* in) {
    a0 = load in[0]
    a1 = load in[1]
3
    a = select (a0 < a1) a0 a1
    store a out[0]
    b1 = load in[1]
    b0 = load in[0]
    b = select (a0 < a1) b1 b0
8
    store b out[1] }
  LLVM-IR
```

```
sort2:
           esi := load (in+0)
           edi := load (in+4)
           cmp esi edi
4
           edi := cmovle esi
5
           store (out+0) edi
           ecx := in+0
           edx := in+4
           edx := cmovge ecx
           ecx := load edx
10
           store (out+4) ecx
11
```

clang-9 -m32 -O3 -march=i686

## Summary

- Constant-Time = de facto standard against microarchitectural SCA
- We can formalize CT as a 2-hypersafety
- There are tools to verify crypto primitives / find bugs
- We can find cool bugs introduced by compilers #Binsec/Rel



LLVM analysis is not sufficient!



## But constant-time is not enough!



# Spectre flaws continue to haunt Intel and AMD as researchers find fresh attack method

The indirect branch predictor barrier is less of a barrier than hoped

 ♣ Thomas Claburn
 Fri 18 Oct 2024 14:01 UTC







GoFetch





```
char array[len]
char mysecret

if (idx < len)
x = array[idx]
load(x)</pre>
```



```
char array[len]
char mysecret

if (idx < len)

x = array[idx]

load(x)</pre>
```

Predict condition true



Consider idx = len

Consider idx = len

```
char array[len]
char mysecret
if (idx < len)
x = array[idx]
x = mysecret
load(x)</pre>
```

```
char array[len]
char mysecret
                               Predict condition true
if (idx < len)
                               x = mysecret
    x = array[idx]
                               Leak mysecret to
    load(x)
                               microarchitecture!
 Consider idx = len
```

## **Mitigate Spectre**





# Part 1: How to fix existing software?

CT WASN'T HARD ENOUGH?

## Fences to block speculative execution

Branch is mispredicted to true



- fence stalls until branch is resolved
- Rollback before leak(mysecret)



## **Speculative Constant-Time (SCT)**

#### **Constant-Time Foundations for the New Spectre Era**

```
Sunjay Cauligi<sup>†</sup> Craig Disselkoen<sup>†</sup> Klaus v. Gleissenthall<sup>†</sup> Dean Tullsen<sup>†</sup> Deian Stefan<sup>†</sup> Tamara Rezk* Gilles Barthe<sup>**</sup>

<sup>†</sup>UC San Diego, USA *INRIA Sophia Antipolis, France

*MPI for Security and Privacy, Germany *IMDEA Software Institute, Spain
```

**Idea:** Security in the constant-time observation mode on a *speculative semantics* 

Many flavors of microarchitectural semantics / ways to define security (see [1])

[1] Cauligi, S., Disselkoen, C., Moghimi, D., Barthe, G., & Stefan, D. (2022, May). SoK: Practical foundations for software Spectre defenses. *SP'22* 

## Why is that hard?

**Problem.** Microarchitectural semantics with predictions and out-of-order execution

Challenge. Microarchitectural features are complex, often undocumented

**Goals.** Find suitable abstraction to reason about Spectre

- Capture all variants of Spectre
- Keep it simple

## **Modelling speculative semantics**

#### Litmus tests (328 instrutions):

- Sequential semantics
  - $\rightarrow$  14 paths
- Speculative semantics
  - → 37M paths



Modelling all transient paths explicitly is intractable

We need to be smarter

#### **RelSE for architectural semantics**



## **RelSE for Spectre-PHT (naive)**



#### Fork into 4 paths:

- 2 sequential paths
- + 2 extra transient path
- and verify constant-time

Wang, G., Chattopadhyay, S., Biswas, A. K., Mitra, T., & Roychoudhury, A. (2020). KLEESpectre: Detecting information leakage through speculative cache attacks via symbolic execution. *ACM TOSEM* 

## RelSE for Spectre-PHT (but let's be smarter)



## **Experimental evaluation**

#### Benchmark

#### **Litmus tests**

#### **Cryptographic primitives:**

- tea
- donna
- Libsodium secretbox
- OpenSSL ssl3-digest-record
- OpenSSL mee-cdc-decrypt

#### Results

#### Litmus tests

- Paths:  $1546 \rightarrow 370$
- Time:  $3h \rightarrow 15s$

#### Libsodium + OpenSSL

- Coverage: 2273 → 8634

#### **Total**

- Timeouts:  $5 \rightarrow 1$ 

## And concretely?

- Find gadgets in crypto [1,2]
- Find attacks combining Spectre variants [2,3]
- Insert Spectre protections smartly [4,5]
- Type system to protect crypto against Spectre [5]
- Find gadgets in the Linux kernel [6]
- [1] Cauligi, Sunjay, et al. "Constant-time foundations for the new spectre era." PLDI'20
- [2] Daniel, Lesly-Ann, Sébastien Bardin, and Tamara Rezk. "Hunting the haunter-efficient relational symbolic execution for spectre with haunted relse." *NDSS' 21*
- [3] Fabian, Xaver, Marco Guarnieri, and Marco Patrignani. "Automatic Detection of Speculative Execution Combinations." *CCS*'22
- [4] Vassena, Marco, et al. "Automatically eliminating speculative leaks from cryptographic code with blade." POPL'21
- [5] Shivakumar, Basavesh Ammanaghatta, et al. "Typing High-Speed Cryptography against Spectre v1." SP'23
- [6] Johannesmeyer, Brian, et al. "Kasper: scanning for generalized transient execution gadgets in the linux kernel." NDSS'22

## Mitigate Spectre

#### **Speculative constant-time:**

- That's hard!
- New speculation mechanisms?



#### **Speculative constant-time:**

- That's hard!
- New speculation mechanisms?

# Part 2: We want security for CT code Hardware can help



# **ProSpeCT Provably Secure** Speculation for the **Constant-Time Policy**

Lesly-Ann Daniel, Marton Bognar, Job Noorman, Sébastien Bardin, Tamara Rezk, Frank Piessens



KU Leuven, Inria, CEA

**USENIX'23** 

## We need Secure Speculation for Constant-Time!



Developers should not care about speculations



Hardware shall not speculatively leak secrets



But still be efficient and enable speculation

# **Hardware Secrecy Tracking**

#### Software side

- Label secrets
- Constant-time program

#### Hardware side

- Track security labels
- Secrets do not speculatively flow to unsafe instructions

ConTExT: A Generic Approach for Mitigating Spectre

SpectreGuard: An Efficient Data-centric Defense Mechanism against Spectre Attacks

Farzad Farshchi

Michael Schwarz<sup>1</sup>, Moritz Lipp<sup>1</sup>, Claudio Canella<sup>1</sup> Speculative Privacy Tracking (SPT): Leaking Information From University of Kansas **Speculative Execution Without Compromising Privacy** 

Rutvik Choudhary

UIUC, USA

Christopher W. Fletcher UIUC, USA

Jiyong Yu UIUC, USA

Incoh Fuetoe

Adam Morrison Tel Aviv University, Israel

76

Heechul Yun University of Kansas

```
char array[len]
secret char mysecret

if (idx < len)
x = array[idx]
load(x)</pre>
Developer marks secrets
```

```
char array[len]
secret char mysecret
if (idx < len)
2:     x = array[idx]
3:     load(x)</pre>
Developer marks secrets
Speculative execution
```

```
char array[len]

secret char mysecret

if (idx < len)

x = array[idx]

load(x)

Developer marks secrets

Speculative execution

x = mysecret:secret
```

```
Developer marks secrets
    char array[len]
    secret char mysecret
                                        Speculative execution
    if (idx < len)
         x = array[idx]
                                        x = mysecret:secret
3:
         load(x)
                                  Speculative execution + secret
     Consider idx = len
                                      x not forwarded to load
```

# How do I know that my defense works?



# How do I know that my defense works?



# Hardware-Software Contracts for Secure Speculation

Marco Guarnieri\*, Boris Köpf<sup>†</sup>, Jan Reineke<sup>‡</sup>, and Pepe Vila\*
\*IMDEA Software Institute †Microsoft Research ‡Saarland University



## **ProSpeCT: Generic formal processor model for HST**

Semantics of generic out-of-order speculative processor with HST

- → Abstract microarchitectural context
- → Functions *update*, *predict*, *next*

All public values are leaked / influence predictions

- → Captures all known variants of Spectre
- → And futuristic mechanisms Load Value Prediction





#### Security proof

Constant-time programs (ISA semantics) do not leak secrets (microarchitectural semantics)

```
char secret mysecret
1: x = load mysecret
2: y = x + 4
```



```
char secret mysecret

1: x = load mysecret

2: y = x + 4

Compute y = 4
```

#### **Resolve prediction:**

- if mysecret = 0: Commit and continue to line 3
- if mysecret != 0: Rollback to line 1

That leaks!

```
char secret mysecret

1: x = load mysecret

2: y = x + 4

Compute y = 4
```

#### **Resolve prediction:**

- if mysecret = 0: Rollback to line 1
- if mysecret != 0: Rollback to line 1

Always rollback when actual value is secret

#### **Proteus: An Extensible RISC-V Core for Hardware Extensions**

(RISC-V Summit '23)

Marton Bognar, Job Noorman, Frank Piessens



# A modular textbook processor to study HW extensions

- In/Out-of order pipelines
- Optimizations: branch predictors, cache, prefetchers, ...
- Configurable: #exec units, ROB size, ...
- Extensible: plugin system
- SpinalHDL □ verilog □ FPGA / simulator



# Implementation on Proteus and Evaluation



#### **Performance overhead** [1]

| Speculation/Crypto | 25/75 | 50/50 | 75/25 | 90/10 |
|--------------------|-------|-------|-------|-------|
| Precise (Key)      | 0%    | 0%    | 0%    | 0%    |
| Conservative (All) | 10%   | 25%   | 36%   | 45%   |

No overhead in SW for CT code when secrets are precisely annotated

[1] Jacob Fustos, Farzad Farshchi, and Heechul Yun. "SpectreGuard: An Efficient Data-Centric Defense Mechanism against Spectre Attacks". In: DAC. 2019

#### **Hardware Cost:**

Synthesized on FPGA

• LUTs: +17%

• Registers: +6%

Critical path: +2%

#### Conclusion



We need to move beyond CT!



Mitigating Spectre in software is hard and costly



HW-SW co-designs can improve security & performance



My belief: HW-SW contracts are promising for end-to-end security

# Many remaining challenges!



**Software:** PL support + parametric in leakage contract



New defenses: new attacks, emerging applications, platforms, etc.



Hardware verification: support defenses and scale existing techniques

Credit icons: <a href="https://www.flaticon.com/">https://www.flaticon.com/</a>