Comment by natesales

2 months ago

The verified trust boundary extends from the CPU to the GPU [1], and TLS encrypts all data between the client and the enclave, so we can't see anything in the clear.

HTTP parsing and application logic happen on the CPU like normal. The GPU runs CUDA just like any other app, after its integrity is verified by the CPU. Data on the PCIe bus is encrypted between the CPU and GPU too.

[1] https://github.com/NVIDIA/nvtrust/blob/main/guest_tools/atte...

4 comments

etaioinshrdlu 2 months ago

Could you talk more about how this works? I don't think the linked article gives enough detail on how the trust boundary extends from CPU to GPU.

Does the CPU have the ability to see unencrypted data?

natesales 2 months ago

The keys are generated on the CPU and never leave the enclave, but the data is decrypted on the CPU so it hits the registers in plaintext.
When the enclave starts, the CPU does a few things:
1. The CPU does a key exchange with the GPU (in confidential compute mode [1]) to derive a key to encrypt data over PCIe
2. The CPU verifies the integrity of the GPU against NVIDIA's root of trust [2]
[1] https://developer.nvidia.com/blog/confidential-computing-on-...
[2] https://github.com/tinfoilsh/cvmimage/blob/b65ced8796e8a8687...
edit: formatting
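
To make the two startup steps above concrete, here is a minimal toy sketch in Python. It is not the protocol NVIDIA's driver and GPU actually run: the Diffie-Hellman parameters are deliberately simplistic and insecure, the attestation check is reduced to comparing a digest against a pinned value (a real check would validate a signed report against a certificate chain), and every name in it (`keypair`, `derive_session_key`, `verify_measurement`, the `pcie-session` label, the firmware bytes) is hypothetical.

```python
import hashlib
import hmac
import secrets

# Toy parameters for illustration only. NOT secure, and not what the
# real CPU/GPU session setup uses.
P = 2**255 - 19  # a well-known prime, used here as a toy modulus
G = 2


def keypair():
    """Return a (private, public) toy Diffie-Hellman key pair."""
    priv = secrets.randbelow(P - 2) + 1  # value in 1..P-2
    return priv, pow(G, priv, P)


def derive_session_key(my_priv, peer_pub):
    """Derive a 32-byte session key from the shared DH secret."""
    shared = pow(peer_pub, my_priv, P)
    # Hypothetical context label so the key is bound to one purpose.
    return hashlib.sha256(shared.to_bytes(32, "big") + b"pcie-session").digest()


def verify_measurement(reported: bytes, expected: bytes) -> bool:
    """Constant-time comparison of a reported digest with a pinned one."""
    return hmac.compare_digest(reported, expected)


# Step 1: both sides exchange public values and derive the same key.
cpu_priv, cpu_pub = keypair()
gpu_priv, gpu_pub = keypair()
cpu_key = derive_session_key(cpu_priv, gpu_pub)
gpu_key = derive_session_key(gpu_priv, cpu_pub)
keys_match = cpu_key == gpu_key

# Step 2: the CPU side compares the GPU's reported measurement with a
# value it trusts (placeholder bytes stand in for real firmware).
expected = hashlib.sha384(b"hypothetical-gpu-firmware").digest()
reported = hashlib.sha384(b"hypothetical-gpu-firmware").digest()
gpu_trusted = verify_measurement(reported, expected)
```

The private values never leave the side that generated them; only the public values cross the bus, which is the property the "keys never leave the enclave" claim relies on.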

candiddevmike 2 months ago

You're not terminating the TLS connection from the client anywhere besides the enclave? How do you load balance or front end all of this effectively?

FrasiertheLion 2 months ago

>You're not terminating the TLS connection from the client anywhere besides the enclave?
Yes.
>How do you load balance or front end all of this effectively?
We don't, at least not yet. That's why all our model endpoints have different subdomains. In the next couple of months, we're planning to generate a keypair inside the enclave and use it with HPKE to encrypt the data, as I described in this comment: https://news.ycombinator.com/item?id=43996849