Comment by chriskanan

2 days ago

The most salient thing in the document is that it puts export controls on releasing the weights of models trained with more than 10^26 operations. While there may be some errors in my math, I think that corresponds to training a model on over 70,000 H100s for a month.
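For what it's worth, the arithmetic roughly checks out. A quick sanity-check sketch, assuming ~1e15 FLOP/s peak BF16 per H100 and ~50% sustained utilization (both figures are my assumptions, not from the rule):

```python
# Back-of-envelope check of the "70,000 H100s for a month" estimate.
# Assumed numbers: ~1e15 FLOP/s peak per H100, ~50% sustained utilization.
PEAK_FLOPS = 1e15                    # per-GPU peak throughput, FLOP/s (assumed)
UTILIZATION = 0.5                    # assumed sustained model-FLOPs utilization
SECONDS_PER_MONTH = 30 * 24 * 3600   # 2,592,000 s

ops_per_gpu_month = PEAK_FLOPS * UTILIZATION * SECONDS_PER_MONTH
gpus_needed = 1e26 / ops_per_gpu_month
print(f"{gpus_needed:,.0f} H100s for one month")  # on the order of 70-80k
```

Under these assumptions you get roughly 77,000 GPUs; a somewhat higher utilization figure lands you near 70,000, so the claim is in the right ballpark either way.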

I personally think the regulation is misguided, as it assumes we won't find better algorithms or architectures. There is no reason to assume that this particular level of compute is what produces the capabilities the rule is worried about.

Moreover, given the emphasis on test-time compute nowadays, and given that many companies seem to have hit a wall scaling LLMs at train time, this regulation doesn't seem especially meaningful.

Traditional export controls on advanced hardware exist because the US doesn't want its adversaries to have access to things that erode the US military advantage. But most hardware is only controlled at the high end of the market; once a technology is commoditized, the low-end stuff is usually widely proliferated. Night vision goggles are an example: only the latest-generation technology is controlled, while low-end units can be bought online and shipped worldwide.

Applying this to your thoughts about AI: as training efficiency improves, the ability to train models becomes commoditized, and those models would no longer confer an advantage and would not need to be controlled. So maybe setting the export control based on the number of operations is a good idea: it naturally allows efficiently trained models to be exported, since they wouldn't be hard to train in other countries anyway.

As computing power scales, the 10^26 limit may need to be revised, but setting the limit based on the scale of training is a good idea since it is actually measurable. You couldn't realistically set the limit based on the capability of the model, since benchmarks seem to become irrelevant every few months due to contamination.

  • I wonder what makes people believe that the US currently enjoys any kind of meaningful "military advantage" over e.g. China? After failing to defeat the Taliban and running from the Houthis especially. This seems like a very dangerous belief to have. China has 4x the population and outproduces us 10:1 in widgets (2:1 in dollars). Considering just e.g. steel, China produces about 1 billion metric tons of it per year. We produce 80 million tons. Concrete? 2.4B tons vs 96M tons. 70+% of the world's electronics. Their shipbuilding industry is 230x more productive (not a typo). Etc, etc.

    The short-term profits US businesses have been enjoying over the past 25 years came at a staggering long-term cost. The sanctions won't even slow down the Chinese MIC, and in the long run they will cause them to develop their own high-end silicon sector (obviating worldwide demand for our own). They're already at 7nm, at a low yield. That is more than sufficient for their MIC, including the AI chips used there, currently and in the foreseeable future.

    • a) just because the government has policies doesn’t mean they are 100% effective

      b) export controls aren’t expected to completely prevent a country from gaining access to a technology, just make it take longer and require more resources to achieve

      You may also be underestimating how much money China will spend to develop its semiconductor industry. Sure, they will eventually catch up to the West, but the money they spend along the way won’t be spent on fighter jets, missiles, and ships. It’s still preferable (from the US perspective) to having no export controls, with China able to import semiconductor designs, manufacturing hardware, and AI models trained using US resources. At least this way China is a few months behind and will have to spend a few billion yuan to achieve it.

The practical problem I see is that unless US AI labs have perfect security (against both cyber attacks and physical espionage), which they don’t, there is no way to prevent foreign intelligence agencies from just stealing the weights whenever they want.

  • Of course. They're mitigations, not preventions. Few defenses are truly preventative. The point is to make it difficult. They know bad actors will try to circumvent it.

    This isn't lost on the authors. It is explicitly recognized in the document:

    > The risk is even greater with AI model weights, which, once exfiltrated by malicious actors, can be copied and sent anywhere in the world instantaneously.

  • This. We put toasters on the internet and are no longer surprised when services we use send us breach notices at regular intervals. The only thing this regulation would do, as written, is add an interesting choke point for compliance regulators to obsess over.

Could be nice with some artificial pressure to use more efficient algorithms though. The current game of just throwing in more data centers and power plants may be kind of convenient for those who can afford it, but it's also intellectually embarrassing.

> The most salient thing in the document is that it put export controls on releasing the weights of models trained with 10^26 operations.

Does this affect open source? If so, it'll be absolutely disastrous for the US in the longer term, as eventually China will be able to train open weights models with more than that many operations, and everyone using open weights models will switch to Chinese models because they're not artificially gimped like the US-aligned ones. China already has the best open weights models currently available, and regulation like this will just further their advantage.

  • "consistent with its general practice, BIS will not require a license for the export of the model weights of open-weight models"

This is like saying that regulating automatic weapons is misguided because someone might invent a gun that is equally dangerous without being automatic.

  • This appears to be a very shallow take and a lazy argument that does not capture even the basic nuance of the issue at hand. For the sake of expanding it a little, and hopefully moving it in the right direction, I will point out that the BIS framework treats advanced models as dual-use goods (i.e., not automatically automatic weapons).

    edit (removed exasperated sigh; it does not add anything)

We can't let perfect be the enemy of good, regulations can be updated. Capping FLOPs is a decent starter reg.

  • Counterpoint would be the $7.25 minimum wage. It can be updated, but politicians aren't good at doing that. In both cases (FLOPS and minimum wage), at least a lower bound for inflation should be included:

    Something like: 10^26 FLOPS * 1.5^n where n is the number of years since the regulation was published.
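
    To make the proposal concrete, here is a minimal sketch of how such an indexed cap would grow over time (the 1.5x/year rate is the suggestion above, not anything from the actual rule):

    ```python
    # Proposed inflation-indexed cap: 1e26 FLOPS * 1.5^n,
    # where n is the number of years since the regulation was published.
    def flop_cap(years_since_publication: int) -> float:
        return 1e26 * 1.5 ** years_since_publication

    for n in range(6):
        print(f"year {n}: {flop_cap(n):.2e} FLOPS")
    ```

    At that rate the cap roughly quintuples every four years, which may or may not track actual compute cost declines.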

    • > Something like: 10^26 FLOPS * 1.5^n where n is the number of years since the regulation was published.

      Why would you want to automatically increase the cap algorithmically like that?

      The purpose of a regulation like this is totally different from the minimum wage's. If the point is to keep an adversary behind, you want them to stay as far behind as you can manage for as long as possible.

      So if you do increase the cap, you want to increase it only when it won't help the adversary (because they already have alternatives, for instance).

    • I don’t see an issue here, because our legislators probably care more about FLOPS than humans.