Comment by natesales

17 hours ago

The verified trust boundary extends from the CPU to GPU [1], and TLS encrypts all data to/from the enclave and client so we can't see anything in the clear.

HTTP parsing and application logic happen on the CPU like normal. The GPU runs CUDA just like any other app, after its integrity is verified by the CPU. Data on the PCIe bus is encrypted between the CPU and GPU too.

[1] https://github.com/NVIDIA/nvtrust/blob/main/guest_tools/atte...

Could you talk more about how this works? I don't think the linked article gives enough detail on how the trust boundary extends from CPU to GPU.

Does the CPU have the ability to see unencrypted data?

You're not terminating the TLS connection from the client anywhere besides the enclave? How do you load balance or front end all of this effectively?

  • >You're not terminating the TLS connection from the client anywhere besides the enclave?

    Yes.

    >How do you load balance or front end all of this effectively?

    We don't, at least not yet. That's why all our model endpoints have different subdomains. In the next couple of months, we're planning to generate a keypair inside the enclave and use it with HPKE to encrypt the data, as I described in this comment: https://news.ycombinator.com/item?id=43996849
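
To make the per-subdomain setup concrete, here is a rough client-side sketch of how it might look: because TLS terminates only inside the enclave, there is no shared front end, so the client picks the hostname for the model it wants. The hostnames, the `MODEL_HOSTS` table, and the `endpoint_for` helper are all hypothetical and not taken from the thread.

```python
from urllib.parse import urlunsplit

# Hypothetical model-to-subdomain table. The real hostnames are not given
# in the thread; each entry stands for one enclave that terminates TLS itself.
MODEL_HOSTS = {
    "model-a": "model-a.inference.example.com",
    "model-b": "model-b.inference.example.com",
}


def endpoint_for(model: str, path: str = "/") -> str:
    """Return the HTTPS URL of the enclave serving `model`.

    No load balancer sits in front of the enclaves, so the choice of
    hostname is the only routing step and it happens on the client.
    """
    try:
        host = MODEL_HOSTS[model]
    except KeyError:
        raise ValueError(f"no enclave endpoint known for model {model!r}") from None
    return urlunsplit(("https", host, path, "", ""))


if __name__ == "__main__":
    print(endpoint_for("model-a", "/health"))
```

The trade-off in this arrangement is that scaling a single model behind one hostname is left unsolved, which is the gap the planned enclave-generated key is meant to address.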