
Comment by vlovich123

4 days ago

Harvesting the pages for the JIT and somehow reusing them to prewarm the JIT state, not the heap state overall. The heap state itself is definitely solved by the simple prewarming you describe, because it initializes the various state scattered across code paths that would otherwise take time to warm up.

I'm not saying it's not helpful. I'm just flagging that JIT research is pretty clear that the performance improvements from a JIT depend hugely on actually running the realistic code paths and data types that you see over and over again. If there's divergence you get suboptimal or even negative gains, because the JIT will start generating code for mis-specialized paths you actually don't care about. If you have control of the JIT you can mitigate some of these problems, but it sounds like you don't, in which case it's something to keep in mind as a problem at scale: I think it could end up being 5-10% of global compute if all your traffic is JITed, and it would certainly hurt the latency of this code running on your service. Of course I'm sure you've got bigger technical problems to solve. It's a very interesting approach for sure. Great idea!
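To illustrate the type-feedback sensitivity described above, here is a minimal sketch (illustrative only, not from the thread, and `sumX`/`mono`/`poly` are made-up names): a V8-style JIT specializes a hot call site for the object shapes it sees during warmup, so warming with one consistent shape keeps the site monomorphic, while warming with divergent shapes makes it polymorphic and can trigger deoptimization even though the results are identical.

```javascript
// Sketch: why JIT prewarming is sensitive to realistic inputs.
// Names here are illustrative; the behavior described is V8-style
// hidden classes + inline caches.

// A hot function: the JIT specializes it for the shapes it actually
// sees while warming up.
function sumX(items) {
  let total = 0;
  for (const it of items) total += it.x; // inline cache keyed on it's shape
  return total;
}

// "Realistic" warmup: every object has the same hidden class {x},
// so the inline cache stays monomorphic -> fast specialized code.
const mono = Array.from({ length: 1000 }, (_, i) => ({ x: i }));
for (let i = 0; i < 100; i++) sumX(mono);

// "Divergent" warmup: mixed shapes ({x} vs {y, x}) make the same
// call site polymorphic; the JIT may deoptimize or emit slower code.
const poly = Array.from({ length: 1000 }, (_, i) =>
  i % 2 ? { x: i } : { y: 0, x: i });
for (let i = 0; i < 100; i++) sumX(poly);

// Both warmups produce the same answers; the difference is only in
// the machine code the JIT generates from each feedback pattern.
console.log(sumX(mono), sumX(poly));
```

The correctness is unchanged either way; the point is that prewarming with inputs the production traffic will never resemble can leave the JIT with feedback (and generated code) that is useless or actively slower.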

Thanks! I can see how that would be useful, but it sounds like it would require deep integration with the JIT. With the TinyKVM/KVMServer approach we have a well-defined boundary in the Linux system call interface to work with. It's been quite surprising to me how much is possible with such a small amount of code.

  • For sure. I think, though, you might want more non-JIT customers because a) Cloudflare and AWS have a better story there, and thus customer acquisition is more expensive, and b) you have a much stronger story for things they have to break down to WASM, as WASM has significant penalties. E.g., if I had an easy Cloudflare-like way to deploy Rust, that would be insanely productive.

    • I guess my question here is: if you are already writing Rust, do you care about per-request isolation that much? If you don't, then deploying a container to AWS Lambda or GCP Cloud Run is already pretty easy. It might be possible to offer better cold start performance with the TinyKVM approach, but that is still an unknown.

      For the Varnish TinyKVM vmod they brought up examples of running image transcoding, which is definitely something that benefits from per-request isolation given the history of exploits in those kinds of C/C++ libraries.

      It's worth noting that Cloudflare/AWS Lambda don't have per-request isolation and that's pretty important for server side rendering use cases where code was initially written with client side assumptions.

      Not sure this will ever turn into a business for me personally - my motivation is in trying to regain some of the simplicity of the CGI days without giving up the performance gains of modern software stacks. Though it would be helpful to have a production workload to improve at some point.
