Comment by KaiserPro

2 months ago

Ok so this looks like bullshit.

First things first, it's entirely possible to geolocate using just visual markers.

A bunch of startups did it around 2018 (most got bought by Facebook, e.g. Mapillary). They work by extracting keypoints from pictures and building a massive point cloud of identifiable keypoints.

But

That picture they use with supposed keypoint matching is wrong. None of those keypoints are reliable feature descriptors. They all are on foliage, which changes depending on season and wind. Geolocating that picture accurately _automatically_ using features is next to impossible.

Now, they might have a vibe-based matcher which does some basic spatial comparison, but I'm not sure how reliable that would be, especially given a large search radius.

The other interesting question is, where did they get their data from? I'm pretty sure Google spent a lot of time making it really difficult to train from Street View (lord knows we've tried).

Edit: the demo here: https://geospy.ai/ is much more what I'd recognise as a bog-standard VPS (visual positioning system). Note that the user is matching buildings. That's a far more reliable way to do feature matching.

2 comments


MontyCarloHall 2 months ago

>They all are on foliage, which changes depending on season and wind. Geolocating that picture accurately _automatically_ using features is next to impossible.

Seems plausible enough to me. The trees are evergreens in a place that doesn't get snow, and the keypoints are mostly grounded on stable parts of the trees (trunks or thick branches), which barring gale-force winds probably don't fluctuate all that much.

The part that gives me pause is the keypoints that map the hood of the car to the pavement, and the point on the far right that maps the ledge to the pavement. How can a system robust enough to map foliage also return such blatant false matches?

KaiserPro 2 months ago

> return such blatant false matches
Long answer: have a try on this demo: https://docs.opencv.org/4.x/dc/dc3/tutorial_py_matcher.html
Short answer: they are similar enough features to match. Think of them as homophones (i.e. words that sound the same but have different meanings) in language. You need context to be able to filter them out. (https://github.com/polygon-software/python-visual-odometry/b...)
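
One common form of that "context" is Lowe's ratio test, which the OpenCV tutorial linked above also walks through: a match is kept only if the best candidate is clearly closer than the second best. Below is a minimal pure-Python sketch of the idea. The function name `ratio_test_matches`, the 0.75 ratio, and the toy 2-D descriptors are assumptions made up for illustration (real descriptors have tens to hundreds of dimensions), and nothing here is claimed to be what the system under discussion does.

```python
import math


def ratio_test_matches(query, train, ratio=0.75):
    """Keep a match only if the best candidate clearly beats the runner-up."""
    kept = []
    for qi, q in enumerate(query):
        # Distance from this query descriptor to every train descriptor, nearest first.
        ranked = sorted((math.dist(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # ambiguity cannot be judged with fewer than two candidates
        (best, best_i), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            kept.append((qi, best_i))
    return kept


# Toy descriptors (hypothetical). The first two train entries are near-duplicates,
# i.e. the "homophones": a query landing between them is ambiguous.
train = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
query = [(0.05, 0.0), (5.0, 5.1)]

matches = ratio_test_matches(query, train)
print(matches)  # [(1, 2)]
```

The first query descriptor sits equally close to two candidates, so it is dropped as ambiguous; the second has one clear nearest neighbour and survives. A geometric consistency check on the surviving matches would typically be a further filtering step.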
> don't fluctuate all that much.
Over time that doesn't bear out. Good features are areas of high contrast with nice, clearly defined edges (text is great, so are buildings). Branches move, which means they create lots of different features depending on the wind, even light wind. When we were building out maps, we filtered out as much greenery as possible.
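
To make the idea of filtering out greenery concrete, here is a crude sketch that drops keypoints sitting on strongly green pixels, using an excess-green score (2G - R - B, normalised by brightness). This is only an illustration under stated assumptions: the `drop_greenery` helper, the 0.1 threshold, and the toy image are hypothetical, and it is not the pipeline described above. A production system would more plausibly mask vegetation with a semantic segmentation model.

```python
def excess_green(r, g, b):
    """Excess-green score on brightness-normalised channels; higher means greener."""
    total = r + g + b
    if total == 0:
        return 0.0
    return (2 * g - r - b) / total


def drop_greenery(keypoints, image, threshold=0.1):
    """Keep keypoints whose underlying pixel is not strongly green.

    keypoints: list of (x, y) integer pixel coordinates
    image: rows of (r, g, b) tuples, indexed as image[y][x]
    threshold: assumed cut-off, would need tuning on real imagery
    """
    return [(x, y) for x, y in keypoints if excess_green(*image[y][x]) <= threshold]


# Toy 2x2 image: left column is foliage-green, right column is grey concrete.
image = [
    [(40, 160, 50), (120, 120, 120)],
    [(30, 140, 40), (200, 200, 200)],
]
keypoints = [(0, 0), (1, 0), (0, 1), (1, 1)]

kept = drop_greenery(keypoints, image)
print(kept)  # [(1, 0), (1, 1)]
```

Only the keypoints on the grey column survive, so any later matching step works with features on stable surfaces rather than on leaves.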