Comment by chriskanan
2 days ago
I have no idea if comments actually have any impact, but here is the comment I left on the document:
I am Christopher Kanan, a professor and AI researcher at the University of Rochester with over 20 years of experience in artificial intelligence and deep learning. Previously, I led AI research and development at Paige, a medical AI company, where I worked on FDA-regulated AI systems for medical imaging. Based on this experience, I would like to provide feedback on the proposed export control regulations regarding compute thresholds for AI training, particularly models requiring 10^26 computational operations.
The current regulation seems misguided for several reasons. First, it assumes that scaling models automatically leads to something dangerous. This is a flawed assumption, as simply increasing model size and compute does not necessarily result in harmful capabilities. Second, the 10^26 operations threshold appears to be based on what may be required to train future large language models using today’s methods. However, future advances in algorithms and architectures could significantly reduce the computational demands for training such models. It is unlikely that AI progress will remain tied to inefficient transformer-based models trained on massive datasets. Lastly, many companies trying to scale large language models beyond systems like GPT-4 have hit diminishing returns, shifting their focus to test-time compute. This involves using more compute to "think" about responses during inference rather than in model training, and the regulation does not address this trend at all.
Even if future amendments try to address test-time compute, the proposed regulation seems premature. There are too many unknowns in future AI development to justify using a fixed compute-based threshold as a reliable indicator of potential risk. Instead of focusing on compute thresholds or model sizes, policymakers should focus on regulating specific high-risk AI applications, similar to how the FDA regulates AI software as a medical device. This approach targets the actual use of AI systems rather than their development, which is more aligned with addressing real-world risks.
Without careful refinement, these rules risk stifling innovation, especially for small companies and academic researchers, while leaving important developments unregulated. I urge policymakers to engage with industry and academic experts to refocus regulations on specific applications rather than broadly targeting compute usage. AI regulation must evolve with the field to remain effective and balanced.
---
Of course, I have no skin in the game since I barely have any compute available to me as an academic, but the proposed rules on compute just don't make any sense to me.
The regulation doesn't exactly make this assumption. Not only are large models stifled, but the ability to serve models via API to many users and the ability to have many researchers working in parallel on upgrading the model are stifled too. It wholesale stifles AI progress for the targeted nations.
This is an appropriate restriction on what will likely be a core part of military technology in the coming decade (e.g., drone piloting).
Look, if Russia didn't invade Ukraine and China didn't keep saying they wanted to invade Taiwan, I wouldn't have any issues with sending them millions of Blackwell chips. But that's not the world we live in. Unfortunately, this is the foreign policy reality that exists outside of the tech bubble we live in. If China ever wants to drop their ambitions over Taiwan then the export restrictions should be dropped, but not a moment sooner.
Right. China. But Switzerland? Israel? What is going on here?
Israel is a known industrial espionage threat to the US. How do you think they got nuclear weapons? Some analysts say they're the largest threat after China. Not to mention they're currently using AI in targeting systems while under investigation for war crimes.
It could be related to the 14 Eyes with modifications (Finland and Ireland, plus close Asian allies).
https://res.cloudinary.com/dbulfrlrz/images/w_1024,h_661,c_s... (from https://protonvpn.com/blog/5-eyes-global-surveillance).
Israel, Poland, Portugal and Switzerland are also missing from it.
> Switzerland? Israel?
I hope someone with a better understanding of the details can jump in, but they are both Tier 2 (not Tier 3) restricted, so maybe there are some available loopholes or Presidential override authority or something. Also I believe they can still access uncapped compute if they go via data centers built in the US.
Limiting US GPU exports to unaligned countries is completely counterproductive as it creates a market in those countries for Chinese GPUs, accelerating their development even faster. Because a mediocre Huawei GPU is better than no GPU. And it harms the revenue of US-aligned GPU companies, slowing their development.
Interesting theory. Any evidence that this is how the world really works? (And, is there a catchy name for the phenomenon?)
> Even if future amendments try to address test-time compute, the proposed regulation seems premature. There are too many unknowns in future AI development to justify using a fixed compute-based threshold as a reliable indicator of potential risk.
I'm disinclined to let that be a barrier to regulation, especially of the export-control variety. It seems like letting the perfect be the enemy of the good: refusing to close the barn door you have, because you think you might have a better barn door in the future.
> Instead of focusing on compute thresholds or model sizes, policymakers should focus on regulating specific high-risk AI applications, similar to how the FDA regulates AI software as a medical device. This approach targets the actual use of AI systems rather than their development, which is more aligned with addressing real-world risks.
How do you envision that working, specifically? Especially when a lot of models are pretty general and not very application-specific?
> It seems like letting the perfect be the enemy of the good: refusing to close the barn door you have, because you think you might have a better barn door in the future.
Am I missing something? I am not an expert in the field, but from where I sit, there literally is no barn door left to close at this point, not even too late.
> First, it assumes that scaling models automatically leads to something dangerous.
My impression was that the causation runs the other way: that a model can't be all that dangerous if it's smaller than this.
Assuming this alternative interpretation is correct, the idea may still be flawed, for the same reasons you say.
I also suspect that the only real leverage the U.S. has is on big compute (i.e., compute that requires the best chips), and that less capable chips are not as controllable.
> these rules risk stifling innovation
These rules intentionally "stifle innovation" for foreigners - this is a feature, not a bug.
You wrote three separate comments for this course. Can you just combine them?
The three comments are communicating three separate things, I think it's clearer that way.
While your critiques most likely have some validity (and I am not positioned to judge their validity), you failed to offer a concrete policy alternative. The rules were undoubtedly made with substantial engagement from industry and academic researchers, as there is too much at stake for them not to engage, and vigorously. Likely there were no perfect policy solutions, but the drafters decided not to let the perfect stop the good enough, since timeliness matters as much as or more than the policy specifics.
Doing nothing is a better alternative. These restrictions limit neutral countries' access to US GPUs, which will just encourage them to purchase Chinese GPUs. This will accelerate the growth of Chinese GPU companies and slow the growth of US-aligned ones; it's basically equivalent to the majority of nations in the world placing sanctions on Nvidia.