Comment by 0xfaded

5 months ago

I once published a method for finding the closest distance between an ellipse and a point on SO: https://stackoverflow.com/questions/22959698/distance-from-g...

I consider it the most beautiful piece of code I've ever written and perhaps my one minor contribution to human knowledge. It uses a method I invented, is just a few lines, and converges in very few iterations.

People used to reach out to me all the time with uses they had found for it. It was cited in a PhD and apparently lives in some collision plugin for Unity. I haven't heard from anyone in a long time.

It's also my test question for LLMs, and I've yet to see my solution regurgitated. Instead they generate some variant of Newton's method. ChatGPT 5.2 gave me an LM (Levenberg-Marquardt) implementation and acknowledged that Newton's method is unstable (it is, which is why I went down the rabbit hole in the first place).

Today I don't know where I would publish such a gem. It's not something I'd bother writing up in a paper, and SO was the obvious place where people who wanted an answer to this question would look. Now there is no central repository; instead everyone individually summons the ghosts of those passed in loneliness.

128 comments


erikig 5 months ago

The various admonitions to publish to a personal blog, while encouraging, don't really get at 0xfaded's request, which I'd summarize as follows:

With no one asking these technical questions publicly, where, how and on what public platform will technical people find the problems that need solving so they can exercise their creativity for the benefit of all?

Aurornis 5 months ago

> The various admonitions to publish to a personal blog, while encouraging, don't really get at 0xfaded's request
They also completely missed the fact that 0xfaded did write a blog post and it’s linked in the second sentence of the SO post.
> There is a relatively simple numerical method with better convergence than Newtons Method. I have a blog post about why it works http://wet-robots.ghost.io/simple-method-for-distance-to-ell...
keepamovin 5 months ago
Clearly we need something in between the fauxpen-access of journals and the wild west of the blogosphere, probably. Why wouldn't the faded ox publish in a paper? Idk, but I guess we need things similar to those circulars that British Royal Society members used to send to each other...except not reserved for a club. The web should be a natural at this. But it's either centralized -> monetized -> corrupted, or decentralized -> unindexed/niche -> forgotten fringe. What can come between?
- Nition 5 months ago
  
  I wonder if there could be something like a Wikipedia for programming. A bit like what the book Design Patterns did in 1994, collecting everyone's useful solutions, but on a much larger scale. Everyone shares the best strategies and algorithms for everything, and updates them when new ones come about, and we finally stop reinventing the wheel for every new project.
  To some extent that was Stack Overflow, and it's also GitHub, and now it's also LLMs, but not quite.
  May I suggest "PASTE": Patterns, Algorithms, Solutions, Techniques, and Examples. "Just copy PASTE", they'll say.
  
- lelanthran 5 months ago
  
  > Clearly we need something in between the fauxpen-access of journals and the wild west of the blogosphere, probably.
  I think GP's min-distance solution would work well as an arxiv paper that is never submitted for publication.
  A curated list of never-published papers, with comments by users, makes sense in this context. Not sure that arxiv itself is a good place, but something close to it in design, with user comments and response-papers could be workable.
  Something like RFC, but with rich content (not plain-text) and focused on things like GP published (code techniques, tricks, etc).
  Could even call it "circulars on computer programming" or "circulars on software engineering", etc.
  PS. I ran an experiment some time back, putting something on arxiv instead of github, and had to field a few comments about "this is not novel enough to be a paper" and my responses were "this is not a publishable paper, and I don't intend to submit it anywhere". IOW, this is not a new or unique problem.
  (See the thread here - https://news.ycombinator.com/item?id=44290315)
- knolan 5 months ago
  
  There is the Journal of Open Source Software perhaps:
  https://joss.theoj.org/
zahlman 5 months ago
You can (and always were encouraged to) ask your own questions, too.
And there are more sites like this (see e.g. https://codidact.com — fd: moderator of the Software section). Just because something loses popularity isn't a reason to stop doing it.
- eastbound 5 months ago
  
  StackOverflow is famously obnoxious about questions badly asked, badly categorized, duplicated…
  It’s actually a topic on which StackOverflow would benefit from AI A LOT.
  Imagine StackOverflow rebrands itself as the place where you can ask the LLM and it benefits the world, while correctly rephrasing the questions behind the scenes and creating public records for them.
  
Forgeties79 5 months ago
Seriously where will we get this info anymore? I’ve depended on it for decades. No matter how obscure, I could always find a community that was talking about something I needed solved. I feel like that’s getting harder and harder every year. The balkanization of the Internet + garbage AI slop blogs overwhelming the clearly declining Google is a huge problem.
- nerusskyhigh 5 months ago
  
  My genuine impression is that communities moved from forums to Discord. Maybe that's why they are harder to find.
  
- seb1204 5 months ago
  
  Keep using SO?
  
- HumblyTossed 5 months ago
  
  Usenet?
  
0xbadcafebee 5 months ago

> where, how and on what public platform will technical people find the problems that need solving so they can exercise their creativity for the benefit of all?
The same place people have always discovered problems to work on, for the entire history of human civilization. Industry, trades, academia, public service, newspapers, community organizations. The world is filled with unsolved problems, and places to go to work on them.
Einstein was literally a patent clerk.

sky2224 5 months ago

This is a perfect example of an element of Q&A forums that is being lost. Another thing that I don't think we'll see as much of anymore is interaction from developers that have extensive internal knowledge on products.

An example I can think of was when Eric Lippert, a developer on the C# compiler at the time, responded to a question about a "gotcha" in the language: https://stackoverflow.com/a/8899347/10470363

Developer interaction like that is going to be completely lost.

tempest_ 5 months ago
This type of thing often lives in the issues / discussion tab of a GitHub repo nowadays, for better and worse.
- dimator 5 months ago
  
  Yuck. I don't know if it's just me, but something feels completely off about the GH issue tracker. I don't know if it's the spacing, the formatting, or what, but each time it feels like it's actively trying to shoo me away.
  It's whatever the visual language equivalent of "low signal" is.
  
- skvark 5 months ago
  
  I think most of the relevant data that provides the best answers lives in GitHub. Sometimes in code, sometimes in issues or discussions. Many libs have their docs there as well. But the information is scattered and not easy to find, and often you need multiple sources to come up with a solution to some problem.
- fireflash38 5 months ago
  
  A lot of valuable information lived/lives in email threads that might or might not be publicly archived.
Philpax 5 months ago

The second answer cites Lippert's pre-existing blog post on the subject: https://ericlippert.com/2009/11/12/closing-over-the-loop-var...
I agree that there will be some degradation here, but I also think that the developers inclined to do this kind of outreach will still find ways to do it.
gessha 5 months ago

I believe the community has seen the benefit of forums like SO and we won’t let the idea go stale. I also believe the current state of SO is not sustainable with the old guard flagging any question and response you post there. The idea can/should/might be re-invented in an LLM context and we’re one good interface away from getting there. That’s at least my hope.
yaroslavvb 5 months ago

I used to look at all TensorFlow questions when I was on the TensorFlow team (https://stackoverflow.com/tags/tensorflow/info). Unclear where people go to interact with their users now....Reddit? But the tone on Reddit is kind of negative/complainy

namanyayg 5 months ago

I had a similar beautiful experience where an experienced programmer answered one of my elementary JavaScript typing questions when I was just starting to learn programming.

He didn't need to, but he gave the most comprehensive answer possible attacking the question from various angles.

He taught me the value of deeply understanding theoretical and historical aspects of computing to understand why some parts of programming exist the way they are. I'm still thankful.

If this were repeated today, an LLM would have given a surface-level answer, or worse yet would've done the thinking for me, obviating the question in the first place.

I wrote a blog post about my experience at https://nmn.gl/blog/ai-and-learning

matsemann 5 months ago
Had a similar experience. Asked a question about a new language feature in Java 8 (parallel streams), and one of the language designers (Goetz) answered my question about the intention of how to use it.
An LLM couldn't have done the same. Someone would have to ask the question and someone answer it for indexing by the LLM. If we all just ask questions in closed chats, lots of new questions will go unanswered as those with the knowledge have simply not been asked to write the answers down anywhere.
- haddr 5 months ago
  
  Would you share the link to the answer?
  
cinntaile 5 months ago

You can prompt the LLM to not just give you the answer. Possibly even ask it to consider the problem from different angles but that may not be helpful when you don't know what you don't know.
Gigachad 5 months ago

For every example of that, there were 999 instances of people having their question closed, criticised, or ignored.

jvanderbot 5 months ago

You can write a paper, submit it to arXiv, and you can also make a blog post. At any rate, I agree - SO was (is?) a wonderful place for this kind of thing.

I once had a professor mention that they knew me from SO because I posted a few underhanded tricks to prevent an EKF from "going singular" in production. That kind of community is going to be hard to replace, but SO isn't going anywhere; you can still ask a question and answer your own question for a permanent, searchable archive.

paulgerhardt 5 months ago
I would imagine the endorsement requirement reduces submissions by a few orders of magnitude.
- marcosdumay 5 months ago
  
  At this point SO seems harder to publish into than arxiv.
  

scirob 5 months ago

Has anyone tried building a modern Stack Overflow that's actually designed for AI-first developers? The core idea: question gets asked → immediately shows answers from 3 different AI models. Users get instant value. Then humans show up to verify, break it down, or add production context. But flip the reputation system: instead of reputation for answers, you get it for catching what's wrong or verifying what works. "This breaks with X" or "verified in production" becomes the valuable contribution. Keep federation in mind from day one (did:web, did:plc) so it's not another closed platform. Stack Overflow's magic was making experts feel needed. They still do—just differently now.

noduerme 5 months ago
Oh, so it wasn't bad enough to spot bad human answers as an expert on Stack Overflow... now humans should spend their time spotting bad AI answers? How about a model where you ask a human and no AI input is allowed, to make sure that everyone has everyone else's full attention?
- imcritic 5 months ago
  
  Why disallow AI input? Is it that poor? Surely it isn't.
  
cpa 5 months ago
Am I reading an AI trying to trick me into becoming its subordinate?
- dataviz1000 5 months ago
  
  In 2014, one benefit of Stack Overflow / Exchange was that a user searching for work could mention that they were a top 10% contributor. It actually had real-world value. The equivalent today is users with extensive examples of completed projects on GitHub that can be cloned and run. OP's solution, if contained in GitHub repositories, will eventually get included in training data. Moreover, the solution will definitely be used for training because it now exists on Hacker News.
  
- imcritic 5 months ago
  
  Yeah, they didn't even bother to suggest paying you with tokens for the job well done! The audacity!
  
- scirob 5 months ago
  
  hehe, damn I did let an AI fix my grammer and they promptly put the classic tell of — U+2014 in there
j45 5 months ago

AI is generally set up to return the "best" answer, defined as the most common answer, not the most correct, efficient, or effective answer, unless the underlying data leans that way.
It's why AI-based web search isn't behaving like Google-based search. People clicking on the best results really was a signal for Google on what solution was being sought. Generally, I don't know that LLMs are covering this type of feedback loop.
whilenot-dev 5 months ago
That seems like a horrible core idea. How is that different from data labeling or model evaluation?
Human beings want to help out other human beings, spread knowledge and might want to get recognition for it. Manually correcting (3 different) automation efforts seems like incredibly monotonous, unrewarding labour for a race to the bottom. Nobody should spend their time correcting AI models without compensation.
- scirob 5 months ago
  
  Great point, thanks for the reality check.
  Speaking of evals, the other day I found out that most of the people who contributed to Humanity's Last Exam https://agi.safe.ai/ got paid >$2k each. So just adding to your point.
mcintyre1994 5 months ago
I think this could be really cool, but the tricky thing would be knowing when to use it instead of just asking the question directly to whichever AI. It’s hard to know that you’ll benefit from the extra context and some human input unless you already have a pretty good idea about the topic.
- imcritic 5 months ago
  
  Presumably over time said AI could figure out if your question had already been answered and in that case would just redirect you to the old thread instead.

achille 5 months ago

thanks for sharing that, it was simple, neat, elegant.

this sent me down a rabbit hole -- I asked a few models to solve that same problem, then followed up with a request to optimize it so it runs more efficiently.

chatgpt & gemini's solutions were buggy, but claude solved it, and actually found a solution that is even more efficient. It only needs to compute sqrt once per iteration. It's more complex however.

                   yours  claude
  ------------------------------
  Time (ns/call)    40.5   38.3
  sqrt per iter        3      1
  Accuracy        4.8e-7 4.8e-7

Claude's trick: instead of calling sin/cos each iteration, it rotates the existing (cos,sin) pair by the small Newton step and renormalizes:

  // Rotate (c,s) by angle dt, then renormalize to unit circle
  float nc = c + dt*s, ns = s - dt*c;
  float len = sqrt(nc*nc + ns*ns);
  c = nc/len; s = ns/len;

See: https://gist.github.com/achille/d1eadf82aa54056b9ded7706e8f5...
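
To make the rotate-and-renormalize step concrete, here is a minimal Python sketch. It is not the code from the gist: `rotate_renormalize` is a made-up name, and the sign convention simply mirrors the snippet above, which moves the pair by roughly -dt in the usual counterclockwise sense.

```python
import math

def rotate_renormalize(c, s, dt):
    """First-order rotation of a unit (cos, sin) pair, then renormalization.

    Mirrors the snippet above: (c, s) -> (c + dt*s, s - dt*c), scaled back
    onto the unit circle. No sin/cos calls, one sqrt.
    """
    nc = c + dt * s
    ns = s - dt * c
    length = math.sqrt(nc * nc + ns * ns)
    return nc / length, ns / length

theta, dt = 0.7, 0.01
c, s = rotate_renormalize(math.cos(theta), math.sin(theta), dt)
# The result has unit length (up to rounding) and its angle is
# theta - atan(dt), which differs from theta - dt by about dt**3 / 3.
print(math.hypot(c, s), math.atan2(s, c), theta - dt)
```

Because the renormalized step lands at angle theta - atan(dt) rather than theta - dt, the approximation is only good when the step is small, which is the situation a converging iteration is in.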

p.s: it seems like Gemini has disabled the ability to share chats. Can anyone else confirm this?

0xfaded 5 months ago
Thanks for pushing this, I've never gone beyond "zero" shotting the prompt (is it still called zero shot with search?)
As a curiosity, it looks like r and q are only ever used as r/q, and therefore a sqrt could be saved by computing rq = sqrt((rx*rx + ry*ry) / (qx*qx + qy*qy)). The `if q < 1e-10` is also perhaps not necessary, since this would imply that the ellipse is degenerate. My method won't work in that case anyway.
For the other sqrt, maybe try std::hypot
Finally, for your test set, could you add some highly eccentric cases such as a=1 and b=100?
Thanks for the investigation:)
Edit: BTW, the sin/cos renormalize trick is the same as what tx,ty are doing. It was pointed out to me by another SO member. My original implementation used trig functions
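
The sqrt-saving suggestion rests on the identity sqrt(A)/sqrt(B) = sqrt(A/B) for positive A and B. A minimal Python sketch of just that step, with hypothetical helper names and made-up sample values (rx, ry, qx, qy stand for the vector components mentioned above):

```python
import math

def ratio_two_sqrts(rx, ry, qx, qy):
    # r / q with r and q computed separately: two square roots.
    r = math.hypot(rx, ry)
    q = math.hypot(qx, qy)
    return r / q

def ratio_one_sqrt(rx, ry, qx, qy):
    # Same ratio with a single square root. Assumes (qx, qy) != (0, 0).
    return math.sqrt((rx * rx + ry * ry) / (qx * qx + qy * qy))

print(ratio_two_sqrts(0.3, 1.2, 2.0, 0.5), ratio_one_sqrt(0.3, 1.2, 2.0, 0.5))
```

One trade-off worth keeping in mind: a hypot call guards against overflow and underflow in the intermediate squares, while the single-sqrt form squares the components directly, so the two can differ for extreme magnitudes.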
- achille 5 months ago
  
  Nice, that worked. It's even faster.
                     yours  yours+opt  claude
    ----------------------------------------
    Time (ns)         40.9       36.4    38.7
    sqrt/iter            3          2       1
    Instructions       207        187     241
  Edit: it looks like the claude algorithm fails at high eccentricities. Gave chatgpt pro more context and it worked for 30min and only made marginal improvement on yours, by doing 2 steps then taking a third local step.
  https://gist.github.com/achille/23680e9100db87565a8e67038797...
  

weatherlite 5 months ago

I can relate. I used to have a decent SO profile (10k+ reputation, I know this isn't crazy but it was mostly on non-low-hanging-fruit answers...it was a grind getting there). I used to be proud of my profile and even put it in my resume like people put their GitHub. Now - who cares? It would make me look like a dinosaur sharing that profile, and I never go to SO anymore.

davchana 5 months ago

I too was very active on SO around 2012; in fact, it had that counter thing showing I had visited xyz days continuously. Most of my one-liners or snippets for PHP are still the highest-voted answers. Even now, when I sometimes google something and an answer comes up, I realize it's me who asked the same question and answered it too.

banku_brougham 5 months ago

I have had this experience -- twice with the same answer. There is nothing so amusing in quite this way.
googlehater 5 months ago

I often forget just how much smaller and less siloed the internet was just ~13 years ago.

zellyn 5 months ago

Please, start a blog! Hugo + GitHub hosting makes it laughably simple. (Or pick a different stack; that’s just mine.)

Even if you’re worried it’ll be sparse and crappy, isn’t an Internet full of idiosyncratic personal blogs what we all want?

If you want help or encouragement, reach out: zellyn@ most places

Aurornis 5 months ago

> Please, start a blog!
The second sentence of the SO post is a link to their blog where it was posted originally. The blog is not a replacement for the function SO served.
0xfaded 5 months ago

It's been a long time, but here is the writeup https://blog.chatfield.io/simple-method-for-distance-to-elli...

OJFord 5 months ago

I don't disagree completely by any means, it's an interesting point, but in your SO answer you already point to your blog post explaining it in more detail, so isn't that the answer, you'd just blog about it and not bother with SO?

Then AI finding it (as opposed to already trained well enough on it, I suppose) will still point to it as did your SO answer.

Neywiny 5 months ago

Looks like solid code. My only gripe is the shadowing of x. I would prefer to see `for _ in range`. You do redefine it immediately so it's not the most confusing, but it could trip people up especially as it's x and not i or something.

0xfaded 5 months ago

Hahaha thanks, I never noticed that. If I ever print it out and frame it I'll be sure to fix it

noduerme 5 months ago

That's pretty nice ;)

I once wrote this humdinger, that's still on my mostly dead personal website from 2010... one of my proudest bits of code besides my poker hand evaluator ;)

The question was, how do you generate a unique number for any two positive integers, where x!=y, such that f(x,y) = f(y,x) but the resulting combined id would not be generated by any other pair of integers. What I came up with was a way to generate a unique key from any set of positive integers which is valid no matter the order, but which doesn't key to any other set.

My idea was to take the radius of a circle that intersected the integer pair in cartesian space. That alone doesn't guarantee the circle won't intersect any other integer pairs... so I had to add to it the phase multiple of sine and cosine which is the same at those two points on the arc. That works out to:

(x^2+y^2)+(sin(atan(x/y))*cos(atan(x/y)))

And means that it doesn't matter which order you feed x and y in, it will generate a unique float for the pair. It reduces to:

x^2+y^2+( (x/y) / (x^2+y^2) )

To add another dimension, just add it to the process and key it to one of the first...

x^2+y^2+z^2+( (x/y) / (x^2+y^2) )+( (x/z) / (x^2+z^2) )

bazzargh 5 months ago
It looks like you have typos? (x^2+y^2)+(sin(atan(x/y))*cos(atan(x/y))) reduces to x^2+y^2+( (x/y) / (x^2/y^2 + 1) ) - not the equation given? Tho it's easier to see that this would be symmetrical if you rearrange it to: x^2+y^2+( (xy) / (x^2+y^2) )
Also, if f(x,y) = x^2+y^2+( (x/y) / (x^2+y^2) ) then f(2,1) is 5.4 and f(1,2) is 5.1? - this is how I noticed the mistake. (the other reduction gives the same answer, 5.4, for both, by symmetry, as you suggest)
There's a simpler solution which produces integer ids (though they are large): 2^x & 2^y. Another solution is to multiply the xth and yth primes.
I only looked because I was curious how you proved it unique!
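
A short Python check of the points above, under stated assumptions: `key_as_posted`, `key_symmetric`, `key_trig` and `key_bits` are hypothetical names, and `key_bits` reads the power-of-two suggestion as a bitwise OR (setting bit x and bit y), since a bitwise AND of two distinct powers of two is 0. Exact fractions are used so that float rounding does not blur the comparison.

```python
import math
from fractions import Fraction

def key_as_posted(x, y):
    # x^2 + y^2 + (x/y) / (x^2 + y^2): the reduction as posted (not symmetric).
    return x * x + y * y + Fraction(x, y) / (x * x + y * y)

def key_symmetric(x, y):
    # x^2 + y^2 + xy / (x^2 + y^2): the rearranged form, symmetric in x and y.
    return x * x + y * y + Fraction(x * y, x * x + y * y)

def key_trig(x, y):
    # The original trigonometric expression, evaluated in floating point.
    t = math.atan(x / y)
    return x * x + y * y + math.sin(t) * math.cos(t)

def key_bits(x, y):
    # Integer key: set bit x and bit y (assumes x != y, as in the question).
    return (1 << x) | (1 << y)

print(float(key_as_posted(2, 1)), float(key_as_posted(1, 2)))   # 5.4 5.1
print(float(key_symmetric(2, 1)), float(key_symmetric(1, 2)))   # 5.4 5.4
print(key_bits(2, 1), key_bits(1, 2), (1 << 2) & (1 << 1))      # 6 6 0
```

The float version depends on the fractional part surviving rounding, so for large ids the integer key (or exact fractions) is the safer choice in this sketch.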
- noduerme 5 months ago
  
  Hhhhmm. Ok. So I invented this solution in 2009 at what you might call a "peak mental moment", by a pool in Palm Springs, CA, after about 6 hours of writing on napkins. I'm not a mathematician. I don't think I'm even a great programmer, since there are probably much better ways of solving the thing I was trying to solve. And also, I'm not sure how I even came up with the reduction; I probably was wrong or made a typo (missing the +1?), and I'm not even certain how I could come up with it again.
  2^x & 2^y ...is the & a bitwise operator...???? That would produce a unique ID? That would be very interesting, is that provable?
  Primes take too much time.
  The thing I was trying to solve was: I had written a bitcoin poker site from scratch, and I wanted to determine whether any players were colluding with each other. There were too many combinations of players on tables to analyze all their hands versus each other rapidly, so I needed to write a nightly cron job that collated their betting patterns 1 vs 1, 1 vs 2, 1 vs 3... any time 2 or 3 or 4 players were at the same table, I wanted to have a unique signature for that combination of players, regardless of which order they sat in at the table or which order they played their hands in. All the data for each player's action was in a SQL table of hand histories, indexed by playerID and tableID, with all the other playerIDs in the hand in a separate table. At the time, at least, I needed a faster way to query that data so that I could get a unique id from a set of playerIDs that would pull just the data from this massive table where all the same players were in a hand, without having to check the primary playerID column for each one. That was the motivation behind it.
  It did work. I'm glad you were curious. I think I kept it as the original algorithm, not the reduced version. But I was much smarter 15 years ago... I haven't had an epiphany like that in a while (mostly have not needed to, unfortunately).
  

emmelaich 5 months ago

You should write it up and submit it to some journal officially. Doesn't matter if it mostly duplicates your own (technically unpublished) work.

PeterStuer 5 months ago

SO in 2013 was a different world from the SO of the 2020s. In the latter world your post would have been classified by a moderator as a 'duplicate' of some basic textbook copy/pasted method posted by a karma-grinding CS student, and closed.

eitland 5 months ago

My experience as well:
Stack Overflow used to (in practice) be a place to ask questions and get help and also help others.
At some point it became all about some mission and not only was it not as useful anymore but it also became a whole lot less fun.

eru 5 months ago

I have a similar story about an interesting little advance in computing that I haven't formally published anywhere, but it's at https://cs.stackexchange.com/a/171695/50292

The question boils down to: can you simulate the bulk outcome of a sequence of priority queue operations (insert and delete-minimum) in linear time, or is O(n log n) necessary. Surprisingly, linear time is possible.

RustyRussell 5 months ago

On the other hand, I once implemented something to be told later it was novel and probably the optimal solution in the space.

An AI might be more likely to find it...

eviks 5 months ago

> Today I don't know where I would publish such a gem.

In the same blog you published it originally, then mentioning it on whatever social media site you use? So same?

fho 5 months ago

Then let me quickly say: thank you! I used that algorithm three times in different projects during my academic "career" :-)

rerdavies 5 months ago

Reddit is my current go-to for human-sourced info. Search for "reddit your question here". Where on reddit? Not sure. I don't post, tbh, but I do search.

Has the added benefit of NOT returning stackoverflow answers, since StackOverflow seems to have rotted out these days, and been taken over by the "rejection police".

mightybyte 5 months ago

Sounds like this should live in Wikipedia somewhere on https://en.wikipedia.org/wiki/Ellipse ... or maybe a related but more CS-focused page.

kwakubiney 5 months ago

Naive question maybe but how haven’t the models been trained on your answer if it’s on SO?

wesammikhail 5 months ago

Models are NOT search engines.
Even if LLMs were trained on the answer, that doesn't mean they'll ever recommend it. Regardless of how accurate it may be. LLMs are black box next token predictors and that's part of the issue.

jmux 5 months ago

This is a really nice method for solving that problem! I wouldn’t have thought to use the tangents but that makes perfect sense.

baq 5 months ago

If you ask me your blog post is basically a paper, I’d publish to arxiv.

userbinator 5 months ago

That algorithm reminds me of raymarching signed distance functions.

lbj 5 months ago

Really great write-up, thanks for sharing it again!

techsystems 5 months ago

Amazing work!

mmaaz 5 months ago

Very cool!

qwertox 5 months ago

Why did SO decide to do that to us? To not invest in AI and then, IIRC, claim ownership of our contributions. I sometimes go back to answers I gave, even ones where I answered my own questions.

socalgal2 5 months ago

Decide to do what?
SO didn't claim contributions. They're still CC-BY-SA
https://stackoverflow.com/help/licensing
AFAICT all they did is stop providing dumps. That doesn't change the license.
I was very active. In fact, I'm actually upset at myself for spending so much time there. That said, I always thought I was getting fair value. They provided free hosting, I got answers and got to contribute answers for others.