But that presumably doesn't handle the relative motion of the stars, which makes the problem even trickier, since the distances will change as you travel, no? Or is my astronomy off base here?
I think your astronomy skills are correct, but if we have to worry about actual travel then you would also have to consider things like fuel capacity, refuel opportunities, the fact that you probably don't want to actually fly through a star but around it, etc.
Note that the tour itself was found quickly using a heuristic solver (https://www.math.uwaterloo.ca/tsp/korea/computation.html), the achievement here and all the computation is to establish that this is the lower bound (assuming I understood correctly).
So, the heuristic solver worked pretty darn well :) Although, I’m not sure how close it would have been the heuristic algorithm you are describing (I suspect that it is considerably more advanced for good reasons, randomly picking will take too long to converge).
The algorithm that OP describes is more commonly known as 2-opt [0]. The heuristic used in this case is referred to as LKH which I assume means the Lin-Kernighan Heuristic [1]. The latter is sort of a meta generalisation of the former.
Iirc the (probably simplified) LKH heuristic they used:
For each iteration:
apply some randomisation
starting at each place
cut the path in 2..n places
reconnect in the most optimal way
if the new tour is the new best, save
LK heuristic is a bit more involved, but focuses on continuing to reverse the subarray on the [b, ..., d] segment, with search and backtracking involved. (I think that's refered to as sequential k-opt moves, but I think it's already quite hard to know what exactly LK is, and LKH does much more)
By focusing on the subarray, assuming distance symmetry (length from b to e is same as length from e to b, but there are correct workarounds if this does not hold), you can evaluate the cost of the new route in constant time (but with bigger k there's more moves to evaluate https://oeis.org/A001171)
It's strange that they don't mention the total distance. I understand that the point-to-point travel time is what they're solving for, but it would be interesting to know what the actual distance of travel was, if for no other reason than calculating caloric burn. But then you could also see how much it deviated from the shortest-distance path.
Proper routing is also an expensive computation. Yes you could just run A* or something on the roads but that would assume no closures, no one way roads, wouldn’t account for elevation change, ect. Using a proper routing API is almost certainly cost prohibitive
Many countries have much more used “public” spaces, and people spend much more time in them, together.
The idea of driving home to the suburbs and locking yourself into your private home is very North American.
I just got back from10 months across Europe. The number of people in public places eating, chatting and just spending time (no simply going somewhere) makes LA or Chicago look like a ghost town.
Looks like they got their hands on a dataset of every restaurant that is licensed to serve alcohol -- or at least a decent subset of such restaurants, filtered by menu or whatever.
I checked a few dots near where I live and they're all fried chicken joints. Yeah, we do love chimaek around here. :)
"Bar" doesn't mean the same thing in every country. In Spain although a bar serves alcohol of all kinds it is also where one eats breakfast and lunch and gets a coffee. They are indispensable social centers and even a tiny town of 150 has one.
After living there for about four years, my mind goes immediately to soju. Not sure if there is a connection, but that’s something I might deep dive with an LLM today.
reminds me of a question they used to ask in the Irish army back in the 60s. My Dad told me this. "How do you get from Bachelor's Walk to Collins Barracks without passing a bar?". People would spend hours and days working on the answer. In the end, the answer was "Go in to every one".
So NP is like P again. I learned in school 13 is the max and one of my algebra professors advanced it to 15 (in the 80ies). Then came 20, then came 20.000, this is 80k with proof, and at the World TSP page we see the record was 1m.
There is tons of work to do on running optimization algorithms in GPU. In its current form, Branch and Bound and Cutting Planes do not gain an advantage if implemented in CUDA. There is a new algorithm, PDLP, which is implementable in GPUs but it is still in early stages. For more, see https://blogs.nvidia.com/blog/cuopt-open-source/.
I'm impressed they found a dataset this hard, but not much harder. It's a delicate balance between beating the last Traveling Salesman hiscore (Netherlands), and never finishing your compute
In the "computations" page[1], the table lists the Netherlands computation as costing 97 CPU years with 6 months of elapsed time, while the Korean bars costs 44 years of CPU time and 3 months of elapsed time. I can't tell if the two problems were solved using the same hardware.
Branch-and-bound is an algorithm "from the book" to me. Fundamentally very simple, provided you view the LP solver as a black box, but incredibly useful.
I zoomed in on the map and discovered at least one shortcut where one could have saved a few seconds, now is that proof enough that the solution is not optimal? :P
Would be nice if they could briefly describe the algorithm. Sounds like they’ve turned the TSP into an integer linear program that they can do branch and bound on, but I’m not sure.
There was a man who documented his travel to every country in the world. Not long before he was finished, South Sudan gained independence and he had to take a special trip there to complete his journey, which apparently had already completed all the other countries in Africa long ago.
>Our computation produced a tour together with a proof that it is a shortest-possible route [...]
Proof nowhere to be found.
Waterloo-ers are nice people but I see an increasing trend of them just lying to get some cred. Come on guys, you don't have to follow the valley model that much.
Not sure what you expected to get. The Concorde TSP solver is an exact solver that uses branch and bound search, it will return either a solution with a specified bound or the optimal bound. They provide the dataset and the solution they found (and I believe their solver is open source), if you don't believe them you can go ahead and find a better tour.
People really really really need to take some time to understand the concept of "burden of proof", so they can't stop making fools of themselves in public.
If you find this impressive, take a look at the 1.33 billion stars TSP solution provided by the same authors.
- Gaia DR2 (1,331,906,450 Stars): https://www.math.uwaterloo.ca/tsp/star/gaia2.html
> "The tour is at most 1.0038 times the length of a shortest-possible route."
But that presumably doesn't handle the relative motion of the stars, which makes the problem even trickier, since the distances will change as you travel, no? Or is my astronomy off base here?
I think your astronomy skills are correct, but if we have to worry about actual travel then you would also have to consider things like fuel capacity, refuel opportunities, the fact that you probably don't want to actually fly through a star but around it, etc.
4 replies →
This also doesn't handle new bars being opened and closed as you travel. Not to mention bouncers having bad days so you will have to revisit the bar.
1 reply →
But they are so far apart and move on roughly the same trajectory that it shouldn't really matter.
If you just use the simple-minded Bell Labs probabilistic algorithm, how much worse is that result?
The classic TSP approach is:
1. Hook up all the nodes in some arbitrary path.
2. Cut the path in two places to create three pieces.
3. Rearrange those three pieces in the six possible ways and keep the shortest.
4. Iterate steps 2-3 until no improvement has been observed for a while.
This is not guaranteed to be optimal, but for most real-world problems either finds the optimal result or is very close.
Note that the tour itself was found quickly using a heuristic solver (https://www.math.uwaterloo.ca/tsp/korea/computation.html), the achievement here and all the computation is to establish that this is the lower bound (assuming I understood correctly).
So, the heuristic solver worked pretty darn well :) Although, I’m not sure how close it would have been the heuristic algorithm you are describing (I suspect that it is considerably more advanced for good reasons, randomly picking will take too long to converge).
The algorithm that OP describes is more commonly known as 2-opt [0]. The heuristic used in this case is referred to as LKH which I assume means the Lin-Kernighan Heuristic [1]. The latter is sort of a meta generalisation of the former.
[0] https://en.m.wikipedia.org/wiki/2-opt
[1] https://en.m.wikipedia.org/wiki/Lin%E2%80%93Kernighan_heuris...
1 reply →
https://www.youtube.com/watch?v=tChnXG6ulyE
Author's presentation about it
Iirc the (probably simplified) LKH heuristic they used:
n is a small number like 4 maybe 5?
2-opt: [a, b, ..., d, e]
reversing subarray from b to d is a 2-opt move.
3-opt (1 particular move):
a b c d e f
a e d c b f -- reversal from b to e
a e d b c f -- reversal from c to b
LK heuristic is a bit more involved, but focuses on continuing to reverse the subarray on the [b, ..., d] segment, with search and backtracking involved. (I think that's refered to as sequential k-opt moves, but I think it's already quite hard to know what exactly LK is, and LKH does much more)
By focusing on the subarray, assuming distance symmetry (length from b to e is same as length from e to b, but there are correct workarounds if this does not hold), you can evaluate the cost of the new route in constant time (but with bigger k there's more moves to evaluate https://oeis.org/A001171)
It's strange that they don't mention the total distance. I understand that the point-to-point travel time is what they're solving for, but it would be interesting to know what the actual distance of travel was, if for no other reason than calculating caloric burn. But then you could also see how much it deviated from the shortest-distance path.
Proper routing is also an expensive computation. Yes you could just run A* or something on the roads but that would assume no closures, no one way roads, wouldn’t account for elevation change, ect. Using a proper routing API is almost certainly cost prohibitive
I am overwhelmed with the thought of nearly 82 thousand bars within a country roughly the size of Ohio.
I checked a few and there's a lot of restaurants included.
https://www.math.uwaterloo.ca/tsp/korea/data/korea81998.xy.t...
americans always compare massive cities to empty states
Snark aside - I used this site which compares the country question to the location closest to you. I live in ohio. https://www.mylifeelsewhere.com/country-size-comparison/unit...
That country has a population of 52 million, i.e. about 5 times Ohio.
Sure, but Ohio has ~4200 bars[0]. Which is roughly 1/4 the ratio of bars to people.
[0]: https://rentechdigital.com/smartscraper/business-report-deta...
15 replies →
If this is correct, it seems like Seoul has over 40x the number of bars that Chicago has, despite having only about 4x the population
How in the hell?
Many countries have much more used “public” spaces, and people spend much more time in them, together.
The idea of driving home to the suburbs and locking yourself into your private home is very North American.
I just got back from10 months across Europe. The number of people in public places eating, chatting and just spending time (no simply going somewhere) makes LA or Chicago look like a ghost town.
1 reply →
The bars may be much smaller.
Looks like they got their hands on a dataset of every restaurant that is licensed to serve alcohol -- or at least a decent subset of such restaurants, filtered by menu or whatever.
I checked a few dots near where I live and they're all fried chicken joints. Yeah, we do love chimaek around here. :)
In korea after a certain hour every restaurant, karaoke, PCBang, and hotteok parlor is basically a bar :)
God I miss this place so much <3
NYC that is like 20 miles across has 11,000 locations that serve alcohol.
How many bars do you expect are in Ohio?
Less than 40,000
3 replies →
"Bar" doesn't mean the same thing in every country. In Spain although a bar serves alcohol of all kinds it is also where one eats breakfast and lunch and gets a coffee. They are indispensable social centers and even a tiny town of 150 has one.
Town? More like village. You can have a nearly empty church, but there's no village without a bar.
And South Korea has one of the highest rates of stomach cancer.
After living there for about four years, my mind goes immediately to soju. Not sure if there is a connection, but that’s something I might deep dive with an LLM today.
To be fair, it does sound like a pretty tough place to stomach.
reminds me of a question they used to ask in the Irish army back in the 60s. My Dad told me this. "How do you get from Bachelor's Walk to Collins Barracks without passing a bar?". People would spend hours and days working on the answer. In the end, the answer was "Go in to every one".
So NP is like P again. I learned in school 13 is the max and one of my algebra professors advanced it to 15 (in the 80ies). Then came 20, then came 20.000, this is 80k with proof, and at the World TSP page we see the record was 1m.
http://webhotel4.ruc.dk/~keld/research/LKH/
The biggest proven optimum is for 3178031 right now.
This should be really done with CUDA, not plain C, btw.
There is tons of work to do on running optimization algorithms in GPU. In its current form, Branch and Bound and Cutting Planes do not gain an advantage if implemented in CUDA. There is a new algorithm, PDLP, which is implementable in GPUs but it is still in early stages. For more, see https://blogs.nvidia.com/blog/cuopt-open-source/.
The thing is that Euclidean TSP needs a lot of data to encode hard instances.
N=15 was even considered solved in the 60s, and N=20 has never been considered large instances, especially not of Euclidean TSP.
I cannot see how anyone could say 13 is the max: you need 100k memory slots and 1M comparisons. This has been trivial for quite some time.
Yeah, I probably mixed it up with the Hamiltonian Path problem. It was a long time ago
>80k with proof
Post proof.
I'm impressed they found a dataset this hard, but not much harder. It's a delicate balance between beating the last Traveling Salesman hiscore (Netherlands), and never finishing your compute
In the "computations" page[1], the table lists the Netherlands computation as costing 97 CPU years with 6 months of elapsed time, while the Korean bars costs 44 years of CPU time and 3 months of elapsed time. I can't tell if the two problems were solved using the same hardware.
[1] https://www.math.uwaterloo.ca/tsp/korea/computation.html
Do we know they didn’t just prune problematic bars from the dataset until they found a one with a solution?
You and I don’t know. But this is hacker news so there is probably somebody here keeping them honest.
Branch-and-bound is an algorithm "from the book" to me. Fundamentally very simple, provided you view the LP solver as a black box, but incredibly useful.
OSRM lead dev here. Love to see this large of an instance being solved.
I looked at near my home, they missed a few. A issue is that in Korea a lot of the local spots are not on any public maps.
Oh no, looks like a few new bars opened up, and a few others closed. Time to recalculate.
I zoomed in on the map and discovered at least one shortcut where one could have saved a few seconds, now is that proof enough that the solution is not optimal? :P
Where?
If you spent 40 years of your life on this path, you would still be visiting 5.616 bars per day. Nuts.
Less than 6 bars a day is pretty doable! :-p
Isn't comma the decimal separator ;)
2 replies →
That’s also not considering whether they’re open or existing anymore after so much time has passed.
Has anyone done the opposite of this, finding the longest possible route?
Does it account for "pit stops"
Would be nice if they could briefly describe the algorithm. Sounds like they’ve turned the TSP into an integer linear program that they can do branch and bound on, but I’m not sure.
It's classic Lin Kernighan (http://webhotel4.ruc.dk/~keld/research/LKH/) for the primal heuristic, and optimality proof by Concorde for cutting plane generation and branching (https://www.math.uwaterloo.ca/tsp/book/index.html, or https://www.math.uwaterloo.ca/tsp/korea/computation.html for details specific to this instance), with CPLEX as the underlying LP solver.
Is that one of the problem quantum computers would resolve instantly if it is actually possible to scale them up?
Very interesting..
Is anything supposed to happen if you click on those red circles? I was hoping it will show name or other info!
It would suck to get to bar 51,248 only to find out it's now permanently closed
There was a man who documented his travel to every country in the world. Not long before he was finished, South Sudan gained independence and he had to take a special trip there to complete his journey, which apparently had already completed all the other countries in Africa long ago.
impresive, I have forgot TSP after graduated.
traveling drinking man's problem
They call it the Traveling Salesman Problem, but it sounds more like the Drunkard’s Walk to me.
I’d like to call it a stumble :)
“The locations were downloaded from a database maintained by the Korean National Police Agency.”
>Our computation produced a tour together with a proof that it is a shortest-possible route [...]
Proof nowhere to be found.
Waterloo-ers are nice people but I see an increasing trend of them just lying to get some cred. Come on guys, you don't have to follow the valley model that much.
Not sure what you expected to get. The Concorde TSP solver is an exact solver that uses branch and bound search, it will return either a solution with a specified bound or the optimal bound. They provide the dataset and the solution they found (and I believe their solver is open source), if you don't believe them you can go ahead and find a better tour.
People really really really need to take some time to understand the concept of "burden of proof", so they can't stop making fools of themselves in public.
3 replies →
I also expected to get an actual proof.
4 replies →
These claims are provisional. Until someone produces a better tour or a valid counter-proof, this stands as the best-known solution.
Kids, we're going on a road trip!
Title is incorrect. 178 days is the walking time of the optimal tour, not how long it took to solve for the best route
3 months, using 44 CPU-years, is that time.
(Submitted title was "Shortest walking tour to 81,998 bars in Korea — TSP solved in 178 days".)