Comment by noduerme

3 years ago

Fuck, imagine how many doctoral theses I could've written every time I tweaked a few lines of code to try some abstract way of recombining outputs I didn't fully understand. I missed the boat. All this jargon is absolutely for show, though. Purely intended to create the impression that there's some kind of moat to the "discovery". There are much clearer ways to express "we fucked around with putting the outputs of this black box back into the inputs", but I guess that doesn't impress the rubes.

I really, really wish this culture of expressing simple things in ornate ways would die. All it does is make knowledge less accessible.

  • I think this applies to everything right now. Papers like this are just ridiculous examples. In like, 6th grade, I won second place at the LA county science fair for coding a simulation of a coyote's life in HyperCard (with tons of graphs). Yay. Y'know what? That shit and those graphs would've been incomprehensible to the judges if it hadn't been written in plain language, in an attempt to make them understand what they were looking at. My entire career since has been an attempt to communicate, and to alleviate the pain points in communication between parties by writing software that encapsulated their descriptions of what they needed. And likewise I never pretended to be smarter or know more than my clients did: everything must be explained and comprehensible in normal-people language. People need to know how shit works, especially if they're paying for it.

    Or they should.

    Or if they don't know and don't care, they're fucking negligent.

    Especially if they say "wow that sounds smart, let's let these guys run our weapons program".

    To your point, the reason this ornate language thrives, and the reason people get away with complacency about how their own systems work, boils down to a silent pact between managers and engineers to sweep everything under the rug out of laziness and ill will. There's something blatantly mendacious and evil (in the banal way) about the arrangement: managers sign off on black boxes that were blessed by complex-sounding papers, so that upper management can wash their hands of the results.

    [edit] maybe I'm just bitter because I spent hours today pondering exactly how many engineers at Monsanto must have known about the dangers of AstroTurf, and how many raised their hand, or hid behind a spreadsheet

    https://frontofficesports.com/investigation-links-astroturf-...

  • “Engineers who like to pretend to be mathematicians”, as I once heard it put.

    • In math, this is mostly an English problem I think. Next time you find a Wikipedia math page to be an impenetrable wall of jargon, click the Wikipedia language tool and choose another language, any will do.

      Then use Chrome's tool to machine-translate the foreign-language version back to English. I've found that this invariably makes the article more coherent than the native English-language Wikipedia math page.

      It says something about the culture for sure.


  • What I find entertaining/confounding is how difficult the abstracts to these new AI papers are to understand. It feels like academia is pushing this style, so it’s hard to blame the authors since they have to play the game.

    For reference I have an undergrad degree in computer science, have been working professionally for 25 years, and am fairly data centric in my work.

    I’m hoping that when I run this through GPT-4 to get an explanation for a mortal software developer, something sensible comes out the other end.

  • "Not math-y enough"/ "Needs more math" is a very common feedback ML/AI researchers get when writing papers.

    The other day I was watching a live-stream of a doctoral defense, as the thesis was quite relevant to my work.

    So one of the committee members kept picking at and criticizing the math, asking questions like "You are supposed to be the bleeding edge on this topic, why is the math so simple? Did you research more rigorous theories to explain it?" etc. (The candidate was awarded the doctorate, though.)

    So, I dunno, if that's how things are now, it makes sense to me that authors go overboard with complicated notation, even when they could have written it much more simply. It probably makes the work seem more rigorous and legit.

    Doesn't really take that much more time, and it covers your ass from "not rigorous enough" gotchas - though at the expense of readability.

    • Go read any article from the first 200 years or so of Phil. Trans. (the Philosophical Transactions of the Royal Society). There's lots of crucial science there written in a way that doesn't have the modern trappings of the form. It's good reading. Maybe some style perturbations borrowed from earlier eras would be good.

      https://www.biodiversitylibrary.org/bibliography/62536 (menu on the right)

      Benjamin Franklin, Robert Boyle, Isaac Newton, Maxwell, Ohm, and Volta - they're all there. If that style was good enough for them ...

  • If the excuse is true and the "ornate" language really is a dense representation of information, then it should be fairly trivial to have an LLM agent unsummarize it.

    There could be a webservice that offers a parallel track of layman's translations of any paper.

  • That's literally the entire field of philosophy after the ~18th century.

    • Yep, ~18th century. Didn't Wittgenstein and/or Nietzsche say something similar? Words are inadequate for communication, and all philosophy is playing with words.

      But language is all we have to communicate, so I guess we're stuck with it.

  • I wish also. When I was young and new, I wasted so much time trying to parse 'arcane' math that was really something simple but dressed up as complicated to give it weight.

Watching the AI community rediscover automatic differentiation 20+ years after the field was considered "mature" was equal parts frustrating and fascinating. The frustrating part was watching them rewrite the history of discovery without any sort of sense or rigor ... and that was also the most fascinating part!
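For anyone who hasn't met the idea: automatic differentiation isn't numerical approximation or symbolic algebra; it computes exact derivatives by propagating them through ordinary arithmetic. A minimal sketch of the forward-mode flavor using dual numbers (all names here are illustrative, not from any particular library):

```python
# Forward-mode automatic differentiation via dual numbers.
# Each Dual carries a value and the derivative of that value,
# and the chain/product rules are applied by operator overloading.
class Dual:
    def __init__(self, val, grad=0.0):
        self.val = val    # function value
        self.grad = grad  # derivative carried alongside the value

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.grad + other.grad)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule applied automatically
        return Dual(self.val * other.val,
                    self.val * other.grad + self.grad * other.val)

    __rmul__ = __mul__


def derivative(f, x):
    # Seed the derivative slot with 1.0 and read the result off the output.
    return f(Dual(x, 1.0)).grad


# d/dx (x*x + 3x) at x = 2 is 2*2 + 3 = 7
print(derivative(lambda x: x * x + 3 * x, 2.0))  # → 7.0
```

This is essentially the technique that was already textbook material decades before backprop frameworks "rediscovered" it as reverse-mode AD.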

  • This is indeed the frustration.

    I'm waiting for some fresh group of grad students to make a breakthrough using a reinvented version of Pearl's "do" calculus, or maybe they'll make some narrow breakthrough using Bayes nets and everyone will geek out on those for a while.

    *I do think transformers (much like feed-forward networks + backprop from 2012-2018) are probably a lasting software architecture for inference applications, until we come up with new hardware and move beyond GPU-focused computing.

    It's exciting to see it all working, but disheartening how ahistorical these last few years have been in AI - with the exception of Brooks, Sutton, and a few other greybeards in the field who say similarly.

    The funny thing is that this constantly happens in every field ever; humans truly excel at repeating history without learning from the past.

    Another example:

    - HTML served by static file servers

    - HTML generated by backend

    - HTML enhanced with small JS snippets

    - HTML generated by frontend, but served by backend

    - Go to step one, not learning why anyone moved on from the previous method

    • Poor training, poor communication, & knowledge not being curated.

      When the best method of getting advice on the internet is to post the wrong answer, you know the system is broken.

I think the main motivation in ML theory that touches the current SOTA is not "expressing simple ideas with jargon for show". Jargon is necessary, however unnecessary it may seem to some (mostly very practical) engineers and software people who are used to expressing themselves quickly and pragmatically. It's a jargon for the mathematics of machine learning, which is pretty unstandardized, so to speak, so you need to define your terms yourself. And without jargon and clear proofs, what you do is brainstorming at most. The value of such work is that its statements are clear, proved, and contain hypotheses which future papers can test.

Here is an example: to explain the existence of adversarial examples, there are two jargon-free suggestions in circulation: 1) that the decision boundary is too nonlinear, and 2) that the decision boundary is too linear. These explanations contradict each other, are stated without any real proof, and can unfortunately be heard in most adversarial-example papers. If we had clear formulations of these two statements, we could have tested both claims, but unfortunately the papers that suggested these theories didn't put in the effort to define their terms and state their suggestions as clear, formal claims.
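To make the "too linear" claim concrete: for a linear model, the fast-gradient-sign perturbation shifts the output by eps times the L1 norm of the weights, which grows with input dimensionality even when each coordinate moves imperceptibly. A toy sketch (all numbers here are made up for illustration):

```python
import numpy as np

# Illustrating the "too linear" hypothesis behind adversarial examples:
# a tiny per-coordinate perturbation aligned with sign(w) moves a linear
# model's output by eps * ||w||_1, which is large in high dimensions.
rng = np.random.default_rng(0)
d = 1000                                    # input dimensionality
w = rng.choice([-1.0, 1.0], size=d) * 0.1   # weights of a linear "classifier"
x = rng.normal(size=d)                      # a clean input
eps = 0.01                                  # per-coordinate perturbation budget

logit_clean = w @ x
x_adv = x + eps * np.sign(w)                # FGSM step for a linear model
logit_adv = w @ x_adv

# Output shift is eps * ||w||_1 = 0.01 * 1000 * 0.1 = 1.0,
# even though no coordinate of x changed by more than 0.01.
print(logit_adv - logit_clean)              # → roughly 1.0
```

The opposing "too nonlinear" hypothesis makes a different testable prediction, which is exactly why formal statements of both would be worth having.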

I studied ML just over a decade ago. I actually compared MLPs to SVMs and had a similar thought to this. It does seem like there's been a regression in understanding some of the fundamentals and older tools of the trade.

I guess everyone gets focused on the newer things.

Really does seem like people rediscovering older endpoints.

  • There's been a huge flood of vanilla software engineers into ML, retconning it as "a subfield of computer science" (computability is a minor concern compared to the statistical underpinnings). They pretend to know the math because they can read the equations, then claim with utmost confidence that actually they're doing all the hard work in ML because they are experts in calling APIs and integrating into products, however useful or useless.

  • That’s science. You can’t expect everyone to know everything. It’s a preprint, so this is the first opportunity to provide feedback.

You think it's that easy, but, as is frequently the case with transformers, I believe there's more here than meets the eye.

which jargon here is "just for show"?

  • > we show that over-parameterization catalyzes global convergence by ensuring the feasibility of the SVM problem and by guaranteeing a benign optimization landscape devoid of stationary points

    does this mean 'an over-parameterized transformer problem is a convex svm problem'?

    • The irony is that your "simplification" uses even more "jargon."

      But yes, that's how I would read it, and I also see no issue at all with the language in the paper. These terms are used for precision and have meaning to those in the field. Papers are written for other experts, not laymen.


    • I read it the same way as you did, or at least it's an approximation.

      In general that's not really surprising. I remember discussions from some years ago about larger networks leading to smoother loss surfaces.
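      For anyone outside the subfield, the "SVM problem" being referred to is, glossing over the paper's specific setup (which I haven't reproduced here), the classic hard-margin formulation:

      ```latex
      % Hard-margin SVM for labeled data (x_i, y_i), y_i \in \{-1, +1\}.
      % "Feasibility" in the quoted abstract means a hyperplane satisfying
      % these constraints exists at all, i.e., the data are separable.
      \min_{w,\,b} \ \tfrac{1}{2}\lVert w \rVert^2
      \quad \text{s.t.} \quad y_i \left( w^\top x_i + b \right) \ge 1 \quad \forall i
      ```

      The abstract's claim, as quoted above, is that over-parameterization ensures an analogue of this problem is feasible, which is what makes the optimization landscape "benign".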

Or, optimistically, this is really how they think about these things, and you should simply be happy they're not trying to obfuscate their findings.

You're not wrong. Applied ML articles are not worth reading.

  • I wouldn’t go this far; applied ML articles are my favorite articles. If you’re in the arena, it’s good to see things that other people have done from a practical perspective, so you can ape it in your own work or not give it further consideration.