Comment by mootothemax
2 years ago
Any suggestions for geolocating datacenter IPs, even very roughly? I'm analysing traceroute data, and while I have known start and end locations, it's the bit in the middle I'm interested in.
I can infer certain details from airport codes in node hostnames, for example.
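A rough sketch of the hostname idea, in Python. The lookup table and the example hostname are made up for illustration; a real version would need a full IATA/city-code dataset and per-operator naming rules, since every network encodes locations differently.

```python
import re

# Hypothetical, deliberately tiny table of IATA airport/city codes.
AIRPORT_CODES = {
    "lhr": "London",
    "lon": "London",
    "ams": "Amsterdam",
    "fra": "Frankfurt",
    "sea": "Seattle",
    "nyc": "New York",
}

def guess_locations(hostname: str) -> list[str]:
    """Return cities whose code appears as a hostname token, in order of appearance."""
    hits = []
    # Split on dots and dashes, then strip trailing digits ("lhr01" -> "lhr").
    for token in re.split(r"[.\-]", hostname.lower()):
        code = re.sub(r"\d+$", "", token)
        city = AIRPORT_CODES.get(code)
        if city and city not in hits:
            hits.append(city)
    return hits

# Example hostname is invented, not taken from a real traceroute.
print(guess_locations("xe-0-1-0.lhr01.example.net"))
```

Matching whole tokens rather than substrings keeps false positives down, but a hit is still only a hint: hostnames go stale when routers are moved or renumbered.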
It would also be possible - I guess - to infer locations from average RTTs, presuming a given node's not having a bad day.
Anyone have any other ideas?
Edit: A couple of troublesome example IPs are 193.142.125.129, 129.250.6.113, and 129.250.3.250. They come up in a UK traceroute - and I believe they're in London - but geolocate all over the world.
Those IPs are owned by Google and NTT, who both run large international networks and can redeploy their IPs around the world when they feel like it. So lookup-based geolocation is going to be iffy, as you've seen.
Traceroute to those IPs certainly looks like the networking goes to London.
The Google IP doesn't respond to ping, but the NTT/Verio ones do. I'd bet if you ping from London-based hosting, you'll get single-digit ms ping responses, which sets an upper bound on the distance from London. Ping from other hosting in the country and across the Channel, and you can confirm the lowest ping you can get is from London hosting, and there you go. It could also be that the host is elsewhere and only its connectivity runs through London; you can't really tell.
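To put a number on that upper bound: light in fibre covers roughly 200 km per millisecond (about two thirds of c), so half the RTT times that figure is the furthest the host can possibly be. A minimal sketch, with made-up RTT samples; real paths are never straight, so the true distance is usually well under the bound.

```python
# Approximate speed of light in optical fibre, in km per millisecond.
FIBRE_KM_PER_MS = 200.0

def max_distance_km(rtt_ms: float) -> float:
    """Upper bound on the one-way distance implied by a round-trip time."""
    return rtt_ms / 2.0 * FIBRE_KM_PER_MS

# Made-up samples; the minimum is the one least affected by queueing.
samples_ms = [4.1, 3.2, 9.8, 3.4]
best = min(samples_ms)
print(f"min RTT {best} ms -> at most {max_distance_km(best):.0f} km away")
```

Using the minimum rather than the average matters here: congestion can only add delay, so the smallest sample gives the tightest bound.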
Check from other vantage points, just to make sure it's not anycast. If you ping 8.8.8.8 from most networks around the world, you'll get something nearby, but these IPs give traceroutes to London from the Seattle area, so they're probably not anycast (at least at the moment; things can change).
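One way to automate the anycast check is a speed-of-light test: if two vantage points are further apart than their two RTTs could jointly cover, the replies can't be coming from a single place. A sketch under assumptions: the coordinates are approximate and the RTTs are invented, not measured.

```python
import math

FIBRE_KM_PER_MS = 200.0  # approximate speed of light in fibre

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points on Earth, in km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def looks_anycast(probes):
    """probes: list of (lat, lon, min_rtt_ms).

    True if some pair of probes is too far apart for both RTTs to
    reach one and the same location.
    """
    for i in range(len(probes)):
        for j in range(i + 1, len(probes)):
            lat1, lon1, rtt1 = probes[i]
            lat2, lon2, rtt2 = probes[j]
            # Sum of the two one-way distance bounds.
            reach_km = (rtt1 + rtt2) / 2 * FIBRE_KM_PER_MS
            if great_circle_km(lat1, lon1, lat2, lon2) > reach_km:
                return True
    return False

london = (51.5, -0.13)
seattle = (47.6, -122.3)
# Invented RTTs: low from both cities vs. low from London only.
print(looks_anycast([(*london, 2.0), (*seattle, 3.0)]))
print(looks_anycast([(*london, 2.0), (*seattle, 140.0)]))
```

A False result doesn't prove unicast, it only means these particular probes found no contradiction, so more and more widely spread vantage points make the check stronger.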
If you don't have hosting around the world, search for public looking glasses at well-connected networks that you can use for pings like this from time to time.
This looked promising:
"TULIP's purpose is to geolocate a specified target host (identified by IP name or address) using ping RTT delay measurements to the target from reference landmark hosts whose positions are well known (see map or table)."
https://tulip.slac.stanford.edu/
But the endpoint it posts to seems dead.
https://ensa.fi/papers/geolocation_imc17.pdf has some ideas.
Using RIPE Atlas probes to get RTT to the IPs from known locations is close to your idea and probably the best option anyway.
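For what it's worth, here is roughly what requesting such a measurement looks like. This is a sketch only: the endpoint, field names, and auth header are as I recall them from the RIPE Atlas v2 REST API docs, so treat them as assumptions and check the current documentation. The helper names and the probe counts are my own.

```python
import json
import urllib.request

# Assumed endpoint for creating measurements; verify against the docs.
ATLAS_URL = "https://atlas.ripe.net/api/v2/measurements/"

def build_ping_request(target, country_codes, probes_per_country=3):
    """Build a one-off IPv4 ping measurement from probes in the given countries."""
    return {
        "definitions": [{
            "target": target,
            "description": f"RTT geolocation check for {target}",
            "type": "ping",
            "af": 4,
        }],
        "probes": [
            {"requested": probes_per_country, "type": "country", "value": cc}
            for cc in country_codes
        ],
        "is_oneoff": True,
    }

def submit(payload, api_key):
    """POST the request. Not called here: it needs a real API key and credits."""
    req = urllib.request.Request(
        ATLAS_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Key {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_ping_request("129.250.6.113", ["GB", "FR", "NL"])
print(json.dumps(payload, indent=2))
```

Asking for probes by country gives you known reference locations for free, since each probe's registered position comes back with the results.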
> A couple of troublesome example IPs are 193.142.125.129, 129.250.6.113, and 129.250.3.250. They come up in a UK traceroute - and I believe they're in London - but geolocate all over the world.
If I were running a popular app/web service, I would have my own AS number, would have purchased a few blocks of IP addresses under that AS, and would advertise those addresses from multiple owned/rented datacenters around the world.
These BGP advertisements would be to my different upstream Internet service providers (ISPs) in different locations.
For a given advertisement from a particular location, if you see a regional ISP as upstream, you can make an educated guess that this particular datacenter is in that region. If these are Tier 1 ISPs who provide direct connectivity around the world, then even that guess is not possible.
You can see the BGP relationships in a looking glass tool like bgp.tools – https://bgp.tools/prefix/193.142.125.0/24#connectivity
If you have the ability to do traceroute from multiple probes sprinkled across the globe with known locations, then you could triangulate by looking at the fixed IPs of the intermediate router interfaces.
Even this is defeated if I were to use a CDN like Cloudflare to advertise my IP blocks to their 200+ PoPs and ride their private networks across the globe to my datacenters.
> If you have the ability to do traceroute from multiple probes sprinkled across the globe with known locations
Everyone who's aware of RIPE Atlas has that ability.
I have almost a billion RIPE Atlas credits. A single traceroute costs 60. I have enough credits to run several traceroutes on the entire IPv4 internet (the longest prefix generally accepted in the global BGP table is a /24, so that's at most 2^24 traceroutes per pass, and in reality even fewer).
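The arithmetic behind that, as a quick sketch. The 60-credit price is the figure quoted above; the function name is just for illustration.

```python
def sweep_cost(credits_per_traceroute: int = 60, prefix_len: int = 24) -> int:
    """Credits needed for one traceroute to every possible IPv4 prefix of this length."""
    prefixes = 2 ** prefix_len  # 2^24 possible /24s in the 32-bit address space
    return prefixes * credits_per_traceroute

print(f"{2 ** 24:,} possible /24s")
print(f"{sweep_cost():,} credits for one pass over all of them")
```

That upper bound comes to just over a billion credits per pass, so how many passes a given balance buys depends on how far the number of actually routed /24s falls below 2^24.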