Comment by 0xfaded

5 months ago

I once published a method for finding the closest distance between an ellipse and a point on SO: https://stackoverflow.com/questions/22959698/distance-from-g...

I consider it the most beautiful piece of code I've ever written and perhaps my one minor contribution to human knowledge. It uses a method I invented, is just a few lines, and converges in very few iterations.

People used to reach out to me all the time with uses they had found for it; it was cited in a PhD and apparently lives in some collision plugin for Unity. I haven't heard from anyone in a long time.

It's also my test question for LLMs, and I've yet to see my solution regurgitated. Instead they generate some variant of Newton's method. ChatGPT 5.2 gave me an LM implementation and acknowledged that Newton's method is unstable (it is, which is why I went down the rabbit hole in the first place).

Today I don't know where I would publish such a gem. It's not something I'd bother writing up in a paper, and SO was the obvious place where people who wanted an answer to this question would look. Now there is no central repository; instead everyone individually summons the ghosts of those passed in loneliness.

The various admonitions to publish to a personal blog, while encouraging, don't really get at 0xfaded's request, which I'd summarize as follows:

With no one asking these technical questions publicly, where, how and on what public platform will technical people find the problems that need solving so they can exercise their creativity for the benefit of all?

  • > The various admonitions to publish to a personal blog, while encouraging, don't really get at 0xfaded's request

    They also completely missed the fact that 0xfaded did write a blog post and it’s linked in the second sentence of the SO post.

    > There is a relatively simple numerical method with better convergence than Newtons Method. I have a blog post about why it works http://wet-robots.ghost.io/simple-method-for-distance-to-ell...

  • Clearly we need something in between the fauxpen-access of journals and the wild west of the blogosphere, probably. Why wouldn't the faded ox publish in a paper? Idk, but I guess we need things similar to those circulars that British Royal Society members used to send to each other... except not reserved for a club. The web should be a natural at this. But it's either centralized -> monetized -> corrupted, or decentralized -> unindexed/niche -> forgotten fringe. What can come between?

    • I wonder if there could be something like a Wikipedia for programming. A bit like what the book Design Patterns did in 1994, collecting everyone's useful solutions, but on a much larger scale. Everyone shares the best strategies and algorithms for everything, and updates them when new ones come about, and we finally stop reinventing the wheel for every new project.

      To some extent that was Stack Overflow, and it's also GitHub, and now it's also LLMs, but not quite.

      May I suggest "PASTE": Patterns, Algorithms, Solutions, Techniques, and Examples. "Just copy PASTE", they'll say.


    • > Clearly we need something in between the fauxpen-access of journals and the wild west of the blogosphere, probably.

      I think GP's min-distance solution would work well as an arxiv paper that is never submitted for publication.

      A curated list of never-published papers, with comments by users, makes sense in this context. Not sure that arxiv itself is a good place, but something close to it in design, with user comments and response-papers could be workable.

      Something like RFC, but with rich content (not plain-text) and focused on things like GP published (code techniques, tricks, etc).

      Could even call it "circulars on computer programming" or "circulars on software engineering", etc.

      PS. I ran an experiment some time back, putting something on arxiv instead of github, and had to field a few comments about "this is not novel enough to be a paper" and my responses were "this is not a publishable paper, and I don't intend to submit it anywhere". IOW, this is not a new or unique problem.

      (See the thread here - https://news.ycombinator.com/item?id=44290315)

  • You can (and always were encouraged to) ask your own questions, too.

    And there are more sites like this (see e.g. https://codidact.com — fd: moderator of the Software section). Just because something loses popularity isn't a reason to stop doing it.

    • StackOverflow is famously obnoxious about questions badly asked, badly categorized, duplicated…

      It’s actually a topic on which StackOverflow would benefit from AI A LOT.

      Imagine StackOverflow rebrands itself as the place where you can ask the LLM and it benefits the world, while correctly rephrasing the question behind the scenes and creating a public record of it.


  • Seriously where will we get this info anymore? I’ve depended on it for decades. No matter how obscure, I could always find a community that was talking about something I needed solved. I feel like that’s getting harder and harder every year. The balkanization of the Internet + garbage AI slop blogs overwhelming the clearly declining Google is a huge problem.

  • > where, how and on what public platform will technical people find the problems that need solving so they can exercise their creativity for the benefit of all?

    The same place people have always discovered problems to work on, for the entire history of human civilization. Industry, trades, academia, public service, newspapers, community organizations. The world is filled with unsolved problems, and places to go to work on them.

    Einstein was literally a patent clerk.

This is a perfect example of an element of Q&A forums that is being lost. Another thing that I don't think we'll see as much of anymore is interaction from developers that have extensive internal knowledge on products.

An example I can think of was when Eric Lippert, a developer on the C# compiler at the time, responded to a question about a "gotcha" in the language: https://stackoverflow.com/a/8899347/10470363

Developer interaction like that is going to be completely lost.

  • This type of thing often lives in the issues / discussion tab of a GitHub repo nowadays, for better and worse.

    • Yuck. I don't know if it's just me, but something feels completely off about the GH issue tracker. I don't know if it's the spacing, the formatting, or what, but each time it feels like it's actively trying to shoo me away.

      It's whatever the visual language equivalent of "low signal" is.


    • I think most relevant data that provides best answers lives in GitHub. Sometimes in code, sometimes in issues or discussions. Many libs have their docs there as well. But the information is scattered and not easy to find, and often you need multiple sources to come up with a solution to some problem.

    • A lot of valuable information lived/lives in email threads that might or might not be publicly archived.

  • I believe the community has seen the benefit of forums like SO and we won’t let the idea go stale. I also believe the current state of SO is not sustainable with the old guard flagging any question and response you post there. The idea can/should/might be re-invented in an LLM context and we’re one good interface away from getting there. That’s at least my hope.

I had a similar beautiful experience where an experienced programmer answered one of my elementary JavaScript typing questions when I was just starting to learn programming.

He didn't need to, but he gave the most comprehensive answer possible attacking the question from various angles.

He taught me the value of deeply understanding theoretical and historical aspects of computing to understand why some parts of programming exist the way they are. I'm still thankful.

If this were repeated today, an LLM would have given a surface-level answer, or worse yet would've done the thinking for me, obviating the question in the first place.

I wrote a blog post about my experience at https://nmn.gl/blog/ai-and-learning

  • Had a similar experience. Asked a question about a new language feature in Java 8 (parallel streams), and one of the language designers (Goetz) answered my question about the intention of how to use it.

    An LLM couldn't have done the same. Someone would have to ask the question and someone answer it for indexing by the LLM. If we all just ask questions in closed chats, lots of new questions will go unanswered as those with the knowledge have simply not been asked to write the answers down anywhere.

  • You can prompt the LLM to not just give you the answer. Possibly even ask it to consider the problem from different angles but that may not be helpful when you don't know what you don't know.

  • For every example of that, there were 999 instances of people having their question closed, criticised, or ignored.

You can write a paper, submit it to arXiv, and you can also make a blog post. At any rate, I agree - SO was (is?) a wonderful place for this kind of thing.

I once had a professor mention that they knew me from SO because I posted a few underhanded tricks to prevent an EKF from "going singular" in production. That kind of community is going to be hard to replace, but SO isn't going anywhere: you can still ask a question and answer your own question for a permanent, searchable archive.

Has anyone tried building a modern Stack Overflow that's actually designed for AI-first developers? The core idea: question gets asked → immediately shows answers from 3 different AI models. Users get instant value. Then humans show up to verify, break it down, or add production context. But flip the reputation system: instead of reputation for answers, you get it for catching what's wrong or verifying what works. "This breaks with X" or "verified in production" becomes the valuable contribution. Keep federation in mind from day one (did:web, did:plc) so it's not another closed platform. Stack Overflow's magic was making experts feel needed. They still do—just differently now.

  • Oh, so it wasn't bad enough to spot bad human answers as an expert on Stack Overflow... now humans should spend their time spotting bad AI answers? How about a model where you ask a human and no AI input is allowed, to make sure that everyone has everyone else's full attention?

  • Am I reading an AI trying to trick me into becoming its subordinate?

    • In 2014, one benefit of Stack Overflow / Exchange was that a user searching for work could mention being a top 10% contributor. It actually had real-world value. The equivalent today is users with extensive examples of completed projects on GitHub that can be cloned and run. OP's solution, if contained in GitHub repositories, will eventually get included in a model's training data. Moreover, the solution will definitely be used for training because it now exists on Hacker News.


    • hehe, damn I did let an AI fix my grammer and they promptly put the classic tell of — U+2014 in there

  • AI is generally set up to return the "best" answer, defined as the most common answer, not the rightest or most efficient or effective answer, unless the underlying data leans that way.

    It's why AI based web search isn't behaving like google based search. People clicking on the best results really was a signal for google on what solution was being sought. Generally, I don't know that LLMs are covering this type of feedback loop.

  • That seems like a horrible core idea. How is that different from data labeling or model evaluation?

    Human beings want to help out other human beings, spread knowledge and might want to get recognition for it. Manually correcting (3 different) automation efforts seems like incredibly monotonous, unrewarding labour for a race to the bottom. Nobody should spend their time correcting AI models without compensation.

    • Great point, thanks for the reality check.

      Speaking of evals, the other day I found out that most of the people who contributed to Humanity's Last Exam https://agi.safe.ai/ got paid >$2k each. So just adding to your point.

  • I think this could be really cool, but the tricky thing would be knowing when to use it instead of just asking the question directly to whichever AI. It’s hard to know that you’ll benefit from the extra context and some human input unless you already have a pretty good idea about the topic.

    • Presumably over time said AI could figure out if your question had already been answered and in that case would just redirect you to the old thread instead.

thanks for sharing that, it was simple, neat, elegant.

this sent me down a rabbit hole -- I asked a few models to solve that same problem, then followed up with a request to optimize it so it runs more efficiently.

chatgpt & gemini's solutions were buggy, but claude solved it, and actually found a solution that is even more efficient. It only needs to compute sqrt once per iteration. It's more complex however.

                   yours  claude
  ------------------------------
  Time (ns/call)    40.5   38.3
  sqrt per iter        3      1
  Accuracy        4.8e-7 4.8e-7

Claude's trick: instead of calling sin/cos each iteration, it rotates the existing (cos,sin) pair by the small Newton step and renormalizes:

  // Rotate (c,s) by angle dt, then renormalize to unit circle
  float nc = c + dt*s, ns = s - dt*c;
  float len = sqrt(nc*nc + ns*ns);
  c = nc/len; s = ns/len;

See: https://gist.github.com/achille/d1eadf82aa54056b9ded7706e8f5...
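
For readers who want to see the rotate-and-renormalize idea in isolation, here is a minimal Python sketch. It is not taken from the gist: the function name `rotate_renormalize` is made up for illustration, and it assumes the counter-clockwise convention (a positive `dt` increases the angle), which may be the opposite of the sign used in the snippet above.

```python
import math

def rotate_renormalize(c, s, dt):
    """Rotate the unit vector (c, s) = (cos t, sin t) by a small angle dt
    using a first-order approximation, then pull it back onto the unit circle.

    Assumed convention: positive dt rotates counter-clockwise.
    """
    nc = c - dt * s               # cos(t + dt) ~ cos t - dt * sin t
    ns = s + dt * c               # sin(t + dt) ~ sin t + dt * cos t
    length = math.hypot(nc, ns)   # the only square root in the update
    return nc / length, ns / length

# Start at t = 0.3 rad and take a small step of 0.01 rad.
c, s = math.cos(0.3), math.sin(0.3)
c2, s2 = rotate_renormalize(c, s, 0.01)
print(math.atan2(s2, c2))         # close to 0.31, without calling sin/cos again
```

The update turns the vector by atan(dt) rather than exactly dt, so it is only a good stand-in for calling sin/cos again when the step is small, which is the situation described above.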

p.s: it seems like Gemini has disabled the ability to share chats; can anyone else confirm this?

  • Thanks for pushing this, I've never gone beyond "zero" shotting the prompt (is it still called zero shot with search?)

    As a curiosity, it looks like r and q are only ever used as r/q, and therefore a sqrt could be saved by computing rq = sqrt((rx*rx + ry*ry) / (qx*qx + qy*qy)). The if q < 1e-10 check is also perhaps not necessary, since this would imply that the ellipse is degenerate. My method won't work in that case anyway.

    For the other sqrt, maybe try std::hypot

    Finally, for your test set, could you add some highly eccentric cases such as a=1 and b=100?

    Thanks for the investigation:)

    Edit: BTW, the sin/cos renormalize trick is the same as what tx,ty are doing. It was pointed out to me by another SO member. My original implementation used trig functions
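
The code from the linked answer is not reproduced in this thread, so the following is only a hedged Python sketch of an iteration of this general shape, not a copy of it. The names tx, ty, rx, ry, qx, qy and rq follow the comments above; everything else is an assumption made for illustration, including the function name, the use of the ellipse's center of curvature (ex, ey) as the pivot, and the default iteration count. It uses the single-sqrt form of r/q suggested above and has no guard for the query point coinciding with the pivot.

```python
import math

def closest_point_on_ellipse(a, b, p, iterations=4):
    """Approximate the point on x^2/a^2 + y^2/b^2 = 1 closest to p = (px, py)."""
    px, py = abs(p[0]), abs(p[1])        # solve in the first quadrant
    tx = ty = math.sqrt(0.5)             # (cos t, sin t) starting guess at 45 degrees

    for _ in range(iterations):
        x, y = a * tx, b * ty            # current point on the ellipse
        # Center of curvature for the current point (assumed pivot).
        ex = (a * a - b * b) * tx ** 3 / a
        ey = (b * b - a * a) * ty ** 3 / b
        rx, ry = x - ex, y - ey          # pivot -> current point
        qx, qy = px - ex, py - ey        # pivot -> query point
        # r/q with one square root; divides by zero if p sits on the pivot.
        rq = math.sqrt((rx * rx + ry * ry) / (qx * qx + qy * qy))
        tx = min(1.0, max(0.0, (qx * rq + ex) / a))
        ty = min(1.0, max(0.0, (qy * rq + ey) / b))
        t = math.hypot(tx, ty)           # renormalize (tx, ty) to unit length
        tx, ty = tx / t, ty / t

    # Restore the signs of the original quadrant.
    return math.copysign(a * tx, p[0]), math.copysign(b * ty, p[1])

print(closest_point_on_ellipse(2.0, 1.0, (3.0, 2.0)))
```

Because (tx, ty) is renormalized on every pass, the returned point always lies on the ellipse; how close it is to the true nearest point depends on the iteration count and on how eccentric the ellipse is.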

I can relate. I used to have a decent SO profile (10k+ reputation, I know this isn't crazy but it was mostly on non-low-hanging-fruit answers... it was a grind getting there). I used to be proud of my profile and even put it on my resume like people put their GitHub. Now - who cares? It would make me look like a dinosaur sharing that profile, and I never go to SO anymore.

I too was very active on SO around 2012; in fact, I kept that consecutive-days counter going for xyz days. Most of my one-liners and PHP snippets are still the highest-voted answers. Even now, when I sometimes google something and an answer comes up, I realize it's me who asked the same question and answered it too.

Please, start a blog! Hugo + GitHub hosting makes it laughably simple. (Or pick a different stack; that’s just mine.)

Even if you’re worried it’ll be sparse and crappy, isn’t an Internet full of idiosyncratic personal blogs what we all want?

If you want help or encouragement, reach out: zellyn@ most places

I don't disagree completely by any means, it's an interesting point, but in your SO answer you already point to your blog post explaining it in more detail, so isn't that the answer, you'd just blog about it and not bother with SO?

Then AI finding it (as opposed to already trained well enough on it, I suppose) will still point to it as did your SO answer.

Looks like solid code. My only gripe is the shadowing of x. I would prefer to see `for _ in range`. You do redefine it immediately so it's not the most confusing, but it could trip people up especially as it's x and not i or something.

  • Hahaha thanks, I never noticed that. If I ever print it out and frame it I'll be sure to fix it
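
To make the gripe concrete, here is a small made-up Python illustration (not the code from the answer; both function names and the loop body are hypothetical). In the first version the loop counter is named x and is rebound to a coordinate on the very next line; the second uses `_` to signal that the counter is unused.

```python
def last_x_shadowed(a, tx, iterations=3):
    for x in range(iterations):   # x starts each pass as the loop counter...
        x = a * tx                # ...and is immediately rebound to a coordinate
    return x

def last_x_clear(a, tx, iterations=3):
    for _ in range(iterations):   # "_" makes it obvious the counter is never read
        x = a * tx
    return x

# Both behave identically; only the readability differs.
print(last_x_shadowed(2.0, 0.5), last_x_clear(2.0, 0.5))
```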

That's pretty nice ;)

I once wrote this humdinger, that's still on my mostly dead personal website from 2010... one of my proudest bits of code besides my poker hand evaluator ;)

The question was, how do you generate a unique number for any two positive integers, where x!=y, such that f(x,y) = f(y,x) but the resulting combined id would not be generated by any other pair of integers. What I came up with was a way to generate a unique key from any set of positive integers which is valid no matter the order, but which doesn't key to any other set.

My idea was to take the radius of a circle that intersected the integer pair in cartesian space. That alone doesn't guarantee the circle won't intersect any other integer pairs... so I had to add to it the phase multiple of sine and cosine which is the same at those two points on the arc. That works out to:

(x^2+y^2)+(sin(atan(x/y))*cos(atan(x/y)))

And means that it doesn't matter which order you feed x and y in, it will generate a unique float for the pair. It reduces to:

x^2+y^2+( (x/y) / (x^2+y^2) )

To add another dimension, just add it to the process and key it to one of the first...

x^2+y^2+z^2+( (x/y) / (x^2+y^2) )+( (x/z) / (x^2+z^2) )

  • It looks like you have typos? (x^2+y^2)+(sin(atan(x/y))*cos(atan(x/y))) reduces to x^2+y^2+( (x/y) / (x^2/y^2 + 1) ) - not the equation given? Tho it's easier to see that this would be symmetrical if you rearrange it to: x^2+y^2+( (xy) / (x^2+y^2) )

    Also, if f(x,y) = x^2+y^2+( (x/y) / (x^2+y^2) ) then f(2,1) is 5.4 and f(1,2) is 5.1? - this is how I noticed the mistake. (the other reduction gives the same answer, 5.4, for both, by symmetry, as you suggest)

    There's a simpler solution which produces integer ids (though they are large): 2^x & 2^y. Another solution is to multiply the xth and yth primes.

    I only looked because I was curious how you proved it unique!
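
The comparison in this reply can be checked with a short Python sketch. The function names are made up for illustration. Two assumptions are worth flagging: the symmetric form is evaluated with exact fractions to sidestep float rounding, and the "2^x & 2^y" suggestion is read as combining the two bits with a bitwise OR (a literal bitwise AND of two distinct powers of two would be 0), which is a guess at the commenter's intent.

```python
import math
from fractions import Fraction

def f_original(x, y):
    # The unreduced formula: x^2 + y^2 + sin(atan(x/y)) * cos(atan(x/y))
    u = x / y
    return x * x + y * y + math.sin(math.atan(u)) * math.cos(math.atan(u))

def f_posted_reduction(x, y):
    # The reduction as posted: x^2 + y^2 + (x/y) / (x^2 + y^2)
    return x * x + y * y + (x / y) / (x * x + y * y)

def f_symmetric(x, y):
    # sin(atan(u)) * cos(atan(u)) = u / (1 + u^2); with u = x/y this is xy / (x^2 + y^2)
    return x * x + y * y + Fraction(x * y, x * x + y * y)

def bitmask_id(x, y):
    # Integer alternative (assumed reading): set bit x and bit y
    return (1 << x) | (1 << y)

print(f_original(2, 1), f_original(1, 2))                  # agree up to rounding
print(f_posted_reduction(2, 1), f_posted_reduction(1, 2))  # differ
print(f_symmetric(2, 1) == f_symmetric(1, 2))              # True
print(bitmask_id(2, 1) == bitmask_id(1, 2))                # True
```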

    • Hhhhmm. Ok. So I invented this solution in 2009 at what you might call a "peak mental moment", by a pool in Palm Springs, CA, after about 6 hours of writing on napkins. I'm not a mathematician. I don't think I'm even a great programmer, since there are probably much better ways of solving the thing I was trying to solve. And also, I'm not sure how I even came up with the reduction; I probably was wrong or made a typo (missing the +1?), and I'm not even certain how I could come up with it again.

      2^x & 2^y ...is the & a bitwise operator...???? That would produce a unique ID? That would be very interesting, is that provable?

      Primes take too much time.

      The thing I was trying to solve was: I had written a bitcoin poker site from scratch, and I wanted to determine whether any players were colluding with each other. There were too many combinations of players on tables to analyze all their hands versus each other rapidly, so I needed to write a nightly cron job that collated their betting patterns 1 vs 1, 1 vs 2, 1 vs 3... any time 2 or 3 or 4 players were at the same table, I wanted to have a unique signature for that combination of players, regardless of which order they sat in at the table or which order they played their hands in. All the data for each player's action was in a SQL table of hand histories, indexed by playerID and tableID, with all the other playerIDs in the hand in a separate table. At the time, at least, I needed a faster way to query that data so that I could get a unique id from a set of playerIDs that would pull just the data from this massive table where all the same players were in a hand, without having to check the primary playerID column for each one. That was the motivation behind it.

      It did work. I'm glad you were curious. I think I kept it as the original algorithm, not the reduced version. But I was much smarter 15 years ago... I haven't had an epiphany like that in a while (mostly have not needed to, unfortunately).
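
For comparison, the "same group of players regardless of seating order" requirement can also be met with a canonical sorted tuple. To be clear, this is a different technique from the float formula discussed above, sketched in Python under assumptions: the function names and the shape of the hand records are hypothetical, not the site's actual schema.

```python
from collections import defaultdict

def group_key(player_ids):
    """Order-independent key: the sorted tuple of distinct player IDs."""
    return tuple(sorted(set(player_ids)))

def hands_by_group(hands):
    """Index hand IDs by the exact set of players involved.

    hands: iterable of (hand_id, [player_id, ...]) pairs (assumed layout).
    """
    index = defaultdict(list)
    for hand_id, players in hands:
        index[group_key(players)].append(hand_id)
    return index

hands = [(1, [3, 7]), (2, [7, 3]), (3, [3, 8]), (4, [8, 7, 3])]
index = hands_by_group(hands)
print(index[group_key([7, 3])])   # hands where exactly players 3 and 7 took part
```

A tuple key is exact and works for any group size, at the cost of not being a single number; whether that fits a SQL index as well as a numeric signature does is a separate question.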


You should write it up and submit it to some journal officially. Doesn't matter if it mostly duplicates your own (technically unpublished) work.

SO in 2013 was a different world from the SO of the 2020s. In the latter world your post would have been classified by a moderator as a 'duplicate' of some basic textbook copy/pasted method posted by a karma-grinding CS student, and closed.

  • My experience as well:

    Stack Overflow used to (in practice) be a place to ask questions and get help and also help others.

    At some point it became all about some mission and not only was it not as useful anymore but it also became a whole lot less fun.

I have a similar story about an interesting little advance in computing that I haven't formally published anywhere, but it's at https://cs.stackexchange.com/a/171695/50292

The question boils down to: can you simulate the bulk outcome of a sequence of priority queue operations (insert and delete-minimum) in linear time, or is O(n log n) necessary. Surprisingly, linear time is possible.

On the other hand, I once implemented something only to be told later it was novel and probably the optimal solution in the space.

An AI might be more likely to find it...

> Today I don't know where I would publish such a gem.

In the same blog you published it originally, then mentioning it on whatever social media site you use? So same?

Then let me quickly say: thank you! I used that algorithm three times in different projects during my academic "career" :-)

Reddit is my current go-to for human-sourced info. Search for "reddit your question here". Where on reddit? Not sure. I don't post, tbh, but I do search.

Has the added benefit of NOT returning stackoverflow answers, since StackOverflow seems to have rotted out these days, and been taken over by the "rejection police".

Naive question maybe but how haven’t the models been trained on your answer if it’s on SO?

  • Models are NOT search engines.

    Even if LLMs were trained on the answer, that doesn't mean they'll ever recommend it. Regardless of how accurate it may be. LLMs are black box next token predictors and that's part of the issue.

This is a really nice method for solving that problem! I wouldn't have thought to use the tangents, but that makes perfect sense.

If you ask me your blog post is basically a paper, I’d publish to arxiv.

Why did SO decide to do that to us? To not invest in AI and then, IIRC, claim ownership of our contributions. I sometimes go back to answers I gave, even ones where I answered my own questions.

  • Decide to do what?

    SO didn't claim contributions. They're still CC-BY-SA

    https://stackoverflow.com/help/licensing

    AFAICT all they did is stop providing dumps. That doesn't change the license.

    I was very active. In fact, I'm actually upset at myself for spending so much time there. That said, I always thought I was getting fair value. They provided free hosting, I got answers and got to contribute answers for others.