Comment by d4rkn0d3z

2 days ago

As a graduate student I was actually given tests that more closely resembled the second scenario the author described: difficult problems in GR, a whole weekend to work on them, and no limits on whom or what references I consulted.

This sounds great until you realize there are only a handful of people on earth who could offer any help, and the proofs you will write are not available in print anywhere.

I asked one of those questions of Grok 4 and its response was to issue "an error". AFAIK, in many results quoted for AI performance, filling the answer box yields full marks, but I would have received a big fat zero had I done the same.

5 comments

godelski 2 days ago

As a physics undergraduate I had similar style tests for my upper division classes (the classical mechanics professor loved these). We'd have like 3 days to do the test, open book, open internet[0], and the professor extended his office hours, but no help from peers. It really stretched your thinking. It removed the time pressure but really gave a sense of what it was like to be a real physicist.

Even though in the last decade a lot more of that complex material has appeared online, there's still a lot that hasn't. Unfortunately, I haven't seen any AI system come close to answering any of these types of questions. Some answers look right at a glance but often contain major errors pretty early on.

I wouldn't be surprised if an LLM can ace the Physics GRE. The internet is filled with the test questions and there are so few variations. But I'll be impressed when they can answer one of these types of tests. They require that you actually do world modeling (and not necessarily of the literal world, just the world that the physics problem lives in[1]). Most humans can't get these right without drawing diagrams. You've got to pull a lot of different moving information together.

[0] You were expected to report it if you stumbled on the solution somewhere. No one ever found one, though.

[1] an important distinction for those working on world models. What world are you modeling? Which physics are you modeling?

bwfan123 2 days ago
Would you mind sharing a sketch of one problem from the test you mention? I am interested in what it looks like.
- d4rkn0d3z 1 day ago
  
  Nobody will do this, there are only so many questions that can be asked.
  Generally, a test like this will ask you to derive some result and then to expand on it in several ways. I concur with other posters that the important part is how you set up the fictions you will rely on. If you get that wrong then all that follows is wrong; if you make a mistake you turn in many pages of garbage. I found one either achieves near 100% or abject failure. There is not much in between.
  The thing with very hard physics is that when you are around people who understand, you get the feeling you understand too, and maybe you do, but in the end there is a 1/r understanding potential around the people who really do understand.
- j7ake 20 hours ago
  
  The triple-star questions in the exercise sections at the back of textbook chapters will be of this calibre.
  In CS, check Knuth's book.
- godelski 2 days ago
  
  It's been a decade, so I don't have any of the actual tests anymore. But the class used Thornton and Marion's Classical Dynamics of Particles and Systems[0] and occasionally pulled from Goldstein's Classical Mechanics[1]. It was an undergrad class, so we only pulled from the second in the Classical II class.
  For these very tough physics (and math) problems, usually the most complex part is just getting started. Sure, there would always be some complex, weird calculation that needs to be done, but often by the time you get there you have a general knowledge of what actually needs to be solved, and that gives you a lot of clues. For classical mechanics we were usually concerned with deriving the Hamiltonian of the system[2]. By no means is the computation easy, but I found (and this seemed to be common) that the hardest part was getting everything set up and ensuring you have an accurate description from which to derive. Small differences can be killer, and that was often the point. There are a lot of tools that give you a kind of "sniff test" as to whether you've accounted for everything or not, but many of these are not available until you've already gotten through a good chunk of computation (or all the way!). Which, tbh, is really the hard part of doing science. It is the attention to detail, the nuances. Which should make sense, as if this didn't matter we'd have solved everything long ago, right?
  I mean in the experiment section of my optics class we also were tested on things like just setting up a laser so that it would properly lase. I was one of two people that could reliably do it in my cohort. You had to be very meticulous and constantly thinking about how the one part you're working with is interacting with the system as a whole. Not to mention the poor tolerances of our lab equipment lol.
  Really, a lot of it comes down to world modeling. I'm an AI researcher now and I think a lot of people really are oversimplifying what this term actually means. Like many of those physics problems, it looks simple at face value but it isn't until you get into the depth that you see the beauty and complexity of it all.[3]
  [0] https://www.amazon.com/Classical-Dynamics-Particles-Systems-...
  [1] https://www.amazon.com/Classical-Mechanics-3rd-Herbert-Golds...
  [2] Once you're out of basic physics classes you usually don't care about numbers. It is all about symbolic manipulation. The point of physics is to generate causal explanations, ones that are counterfactual. So you are mainly interested in the description of the system because from there you can plug in any numbers you wish. The joke is that you do this and then hand it off to the engineer or the computer.
  [3] A pet peeve of mine is that people will say "I just care that it works." I hate this because it is a shared goal no matter your belief about approach (who doesn't want it to work?! What an absurd dichotomy). The people who think the AI system needs to derive (learn) realistic enough laws of physics are driven by exactly that: they are explicitly concerned with things working. It's not so much about "theory" as it is that this is a requirement for having a generalizable solution. They understand how these subtle differences quickly cascade into big differences. I mean, your basic calculus-level physics is good enough for a spherical chicken in a vacuum, but it gets much more complex when you want to operate in the real world. Unfortunately, there are things that can't be determined purely through observation (even in a purely mechanical universe).
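
A minimal sketch of one kind of "sniff test" mentioned above, assuming a plane pendulum rather than any problem from the tests described in the thread. Once the Hamiltonian is written down, a quick numerical check that the energy stays (nearly) constant along a trajectory can flag a setup error such as a wrong sign. The function names, the unit mass and length, and the choice of a leapfrog integrator are all illustrative assumptions, not anything specified by the posters.

```python
import math


def hamiltonian(theta, p, m=1.0, l=1.0, g=9.81):
    """H = p^2 / (2 m l^2) - m g l cos(theta) for a plane pendulum (assumed example)."""
    return p * p / (2.0 * m * l * l) - m * g * l * math.cos(theta)


def step(theta, p, dt, m=1.0, l=1.0, g=9.81):
    """One leapfrog (kick-drift-kick) step of Hamilton's equations."""
    p -= 0.5 * dt * m * g * l * math.sin(theta)  # half kick: dp/dt = -dH/dtheta
    theta += dt * p / (m * l * l)                # drift:     dtheta/dt = dH/dp
    p -= 0.5 * dt * m * g * l * math.sin(theta)  # second half kick
    return theta, p


def max_energy_drift(theta0, p0, dt=1e-3, steps=10_000):
    """Largest relative deviation of H from its initial value along the trajectory."""
    e0 = hamiltonian(theta0, p0)
    theta, p = theta0, p0
    worst = 0.0
    for _ in range(steps):
        theta, p = step(theta, p, dt)
        worst = max(worst, abs(hamiltonian(theta, p) - e0) / abs(e0))
    return worst


if __name__ == "__main__":
    # Released from rest at 1 radian; a small drift suggests a consistent setup.
    print(max_energy_drift(1.0, 0.0))
```

If the sign of the force term in `step` is flipped (a typical setup mistake), the reported drift becomes large, which is the kind of late-arriving clue the comment describes: it is only available after the equations of motion have already been derived.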