Can’t send email more than 500 miles (2002)

2 years ago (web.mit.edu)

Vaguely related - this reminded me of a support call I had where similarly the real world apparently merged into the digital world.

I was doing IT support for a small Australian company back in '98. A guy called me from a remote office, and after a few pleasantries he explained that the screen saver had fallen off the monitor of his dumb terminal, bounced on one of the keys on his keyboard, and now the terminal was locked up. He wanted to know what key to press to unlock the terminal.

Eh?

I knew the guy, and although he wasn't trained in IT, he knew his way around the basics; he wasn't completely clueless.

I asked him to explain the problem again as I wasn't sure I'd understood. He repeated exactly what he'd said the first time.

I replied "What do you mean the screen saver fell off the monitor, that's impossible? Besides, it's a dumb terminal, they don't have screen savers."

After a little more fumbling around this weird upside-down world he was presenting me with, it suddenly clicked. He was talking about the physical CRT anti-glare screen filter [0] that used to be common around then, that literally hung in front of the screen. This had come unstuck and hit the scroll lock on the terminal. He called this a screen saver.

Since then the phrase "Screen saver" seems to have now morphed to mean what I used to call a desktop wallpaper, but that's a separate topic.

[0] https://dylbs6e8mhm2w.cloudfront.net/productimages/500x500/E...

I love this kind of stuff. When you're sure the thing that seems to be happening couldn't possibly be happening, and then you find out that literally the speed of light is coming into play.

We had a similar problem at one of my first jobs where I was a programmer and backup network support guy. One employee was having a problem with his CRT monitor flickering. It was very subtle, but just enough to drive him nuts.

So we replaced the monitor with one that worked fine on another machine. Same problem. We tried replacing cables, power cords, and did a bunch of other troubleshooting things. Problem persisted. Eventually we replaced his entire computer. Same problem.

Finally I put his computer and monitor on a cart with an extension cord and wheeled it out into the hallway. The problem went away. It turned out to be bad electrical shielding in his office.

  • > The problem went away. It turned out to be bad electrical shielding in his office.

    My Commodore 64 started "typing" of its own accord. We sent it to be repaired twice only for it to work perfectly when they tested it. Turned out after we got a bigger TV, we kept it too close to it, and the static electricity eventually caused the effect.

    By the time the repair people got to it, it'd presumably discharged enough for it to stop, and it worked fine for a while.

  • Heh, reminds me of a cursed user I was trying to help in the mid 90s.

    Sold a person a computer, they said it bluescreened when they used it. So we picked it up and tested it. No problems. Sent it back, bluescreening again. So they came to the office with the computer. I set it up and used it for 30 minutes with them there, not a single issue. The moment they touched the mouse the computer bluescreened. Replaced the mouse and the problem went away.

  • I once worked in a lab where each computer had its own electrical stabilizer, but they were so poor that they probably did more harm than good. When someone turned on a stabilizer, the nearest CRT monitors would distort for a second, then flicker, and their colors would be degraded.

    Luckily, my place was by the wall, so the effect was diminished, but it gave me big headaches. I lasted only 6 months at that company, this being the biggest reason.

    • The stabilizer was triggering the degausser on the CRT. Turning on speakers or putting cellphones where a call was coming in would sometimes do this too.

The best part - the consultant who patched the server is on Hacker News! He commented on his part here:

https://news.ycombinator.com/item?id=23775404

  • I'd forgotten about that! That part of the thread begins here: https://news.ycombinator.com/highlights

    • /highlights isn't on https://news.ycombinator.com/lists – is that intentional?

      I knew this existed, and I was looking for it a few weeks ago; it's an interesting page to browse through every once in a while. But I just couldn't remember the name until now.

      Having a "highlight" people can't find doesn't seem much of a "highlight" to me?

    • > We put it in /highlights..

      +1. Two years after joining HN, I’m still learning about it. This is the first time I’ve heard of the highlights section! I couldn’t find it in lists or on any other part of the site, yet it’s interesting to read comments there that do not show up in the best comments section. How exactly does this work?

  • So the article says

    >Well, the consultant came in and patched our server and rebooted it. But I called him, and he said he didn't touch the mail system.

    But the comment

    >Since my preference to wipe and reload was unacceptable - too much downtime and too many billable hours - the obvious thing to do was update sendmail

    Must be the part where

    >The story is slightly altered in order to protect the guilty

Every few years this story resurfaces and always makes me smile.

Somewhat related but not really. Back in 2007, when I worked for a large ISP as a second-line support technician for various services, ADSL was very much still in vogue. The technology, running over copper wire, had a maximum distance over which it would be stable. Some clients were on a special plan that tried to extend this distance a bit, maybe 2-3 more km; it was quite unreliable, but generally still usable for browsing the internet.

But during the summer I received a call from a client who, for almost a month, had been unable to use his IPTV service during the day without hiccups and disturbances, and whose internet was slow as a glacier from time to time. As I was measuring the equipment, packet loss and all the usual stuff, it struck me that he was very far away from his nearest telephone station. After some back and forth with a technician and lots of measuring, we came to the conclusion that, since it was so hot that summer, the line expanded just enough to become unstable during the daytime, when it was hotter outside.

We could not really do anything to help him. I do not miss the copper net.

  • What about insulating the wire from the pole to his house more?

    • Typically they wouldn't do these jobs; the copper net was already on its way out by that time in favor of fiber optics, and the entire landline system was on maintenance at that point.

      I left shortly after, and these days they have removed almost all of it.

tangentially related: 15 or maybe 20 years ago i worked at a repair shop and someone brought in a TV that they said switched to spanish every night at 5pm.

they were watching over the air channels and there was only a setting in the tv for menu language. sure enough though, at 5pm that night we watched as the tv started speaking in spanish. we tried a few more channels and found that all but one or two were in spanish.

as it turns out, some stations broadcast audio in multiple languages and some tvs allow you to change the preference. sadly for this person, the used tv they bought came from a spanish speaking country and didn't have any way to change that preference.

  • My Bluetooth speaker switched to Chinese after a few years of use. I have no idea of how it did it and no idea of how to revert it. There is no reference to it in the manuals.

    A few days before, I brought a robot vacuum home. It was made and purchased in China. When I started it for the first time, it bumped into my server and unplugged it.

    Therefore a state-sponsored cyberattack is not out of the question.

Ah, the true mother of all leaky abstractions:

The actual underlying transmission protocol of the relativistic universe shining through when trying to send an email.

Great story!

At lunch today I was just talking about Sendmail, which I can assure you is a rather rare occurrence. I was talking about the first time I set up sendmail, back in '91 or '92. I was using the bat book and nearly tore my hair out over a week getting that first setup working. I eventually came to understand and appreciate the m4 config, but I ended up moving to qmail and postfix in the mid '90s and never looked back.

Related. Others?

The case of the 500-mile email (2002) - https://news.ycombinator.com/item?id=123489 - Feb 2008 (7 comments)

I've wondered how feasible it would be to do something like this to have a website that could only be accessed when a client is within a certain physical proximity of the host. Could make for a fun CTF!

  • I’ve played CTF challenges where the latency to the host was a key factor in determining if you could get a flag or not. For those, I’ve often found it useful to spin up a cloud machine in a datacenter near the target (or, better yet, in the same datacenter if we can figure it out).

    A very common case is when the challenge has a short timeout but requires a lot of interaction, e.g. you only get ten seconds but you have to perform 10000 queries for a heap spray or something.

    The most insidious case I remember was a read() call that didn’t check the result, causing it to return short if the fragments of the input didn’t arrive fast enough.

    • So... if you're referring to a challenge that did that during one of the DDTEK years of DEFCON-CTF, that was one of mine.

      The expectation wasn't to buy time in an adjacent cloud, but to use out-of-order IP fragments or TCP segments, having the server's network stack reassemble the packets such that the read was coherent in one go.

      My goal was to teach competitors to model real world challenges of exploitation.

  • Didn't do that, but one of my earliest "dynamic" websites, ca. mid '90s, would have a CGI try to ping the client with a short timeout, and if we got an answer that indicated a leased line or something rather than dialup, we'd serve up a heavy animated version of our logo instead of a static image... But it could be used as a vague indicator of distance too.

    Trickiest part of doing that today is so many fewer hosts are reachable via icmp, so you'd probably be better off serving up an initial response with some JS to measure more accurately.

    (Another silly little thing we added was a link back to a user's own ISP from the top ten or so of our competitors based on net block - got us a worried phone call from one of them who thought we'd been hacked and wanted to make sure we didn't think he was responsible)

  • My quick hack would be to establish a websocket connection, and send a random stream of numbers to the client. If the client didn't return the number within a ping threshold, block their access.

    • hm but it would block your crazy next-door neighbor who only uses curl.

      To get a good server-client-server roundtrip with only HTTP/1.1, I'd personally try using a temporary redirect, maybe a 307.
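A minimal sketch of the challenge-and-echo idea from this subthread, in Python. It is transport-agnostic: the nonce could travel over a websocket or inside a temporary redirect. The class name `LatencyGate`, the threshold value, and the injectable clock are all hypothetical choices for illustration, and a round-trip time only gives a loose upper bound on distance, not a location.

```python
import secrets
import time


class LatencyGate:
    """Issue a random nonce and accept it back only within a time budget."""

    def __init__(self, threshold_s, clock=time.monotonic):
        self.threshold_s = threshold_s
        self.clock = clock          # injectable so the logic can be tested
        self.pending = {}           # nonce -> time it was issued

    def issue(self):
        """Create a fresh nonce and remember when it was sent."""
        nonce = secrets.token_hex(8)
        self.pending[nonce] = self.clock()
        return nonce

    def verify(self, nonce):
        """Return True only if a known nonce comes back within the threshold."""
        issued = self.pending.pop(nonce, None)  # each nonce is single-use
        if issued is None:
            return False
        return (self.clock() - issued) <= self.threshold_s
```

A server would call `issue()` when it sends the challenge and `verify()` when the client echoes it back, blocking access on `False`.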

i think about this story often and i find that the person who figured out that it was 500 miles actually deserves more credit than they get in the story. have to really think out of the box to figure that one out

Every time I read this story the part that always surprises me again is the units command. Converting from 3 millilightseconds to miles is brilliant, and I am delighted every time that the units command can do this.

  • kragen posted several excellent comments highlighting the capabilities of GNU Units a couple of months back, these two in particular:

    <https://news.ycombinator.com/item?id=36995046>

    And Trey Harris's "500 mile email" story is what clued me in to GNU units and its capabilities.

    Reminder: if you're on MacOS, or one of the BSDs, your default units is from BSD, not the GNU version, and is far less capable. GNU units can be installed on MacOS through Homebrew. The package is "gnu-units", the command is "gunits" once installed.

    Edit: Corrected Homebrew package name.

  • Anyone who likes the units command should plan an evening where they can sit in a comfortable chair with an appropriate beverage, and read all the comments in the data file in the source. It is like a novel about the history of measurement.

That's why, when you get a weird ticket, it pays to check for yourself before calling your user crazy.

I did a startup making an MP3 player that was attached to whole-house audio distribution systems. We got an angry email from a customer saying he woke up to ABBA playing full blast at 3:00am. While it was likely an integration/timer issue, he was wondering "why ABBA? what is the player trying to tell me?"

The control system was sending 'play' with nothing else, which was more of an edge case based on our UI, and so it started at the beginning of the list of artists, and ABBA was at the beginning of that list.

Other players might have started at the beginning of the list of songs, but for some reason (25 years ago) we chose the beginning of the list of artists. Later on it was configurable - random, favorite playlist, etc.

There is a blog created to collect similar stories: https://news.ycombinator.com/item?id=35708339

> our campus network at the time was that it was 100% switched

is this realistic, or a writer's license?

  • > is this realistic, or a writer's license?

    Realistic. And, believe it or not, I know of at least one organization that plans to convert an entire literal skyscraper of office space from routed networks to a single, flat switched network for all the employees of all the subcompanies. In 2023.

    Obviously everyone with a bit of braincells left tells them to not do that because it's utterly dumb, but hey, strategic decision by the holding company to save on costs...

    At least they're not using hubs. (For the younger generation: a hub is an Ethernet device that takes any packet it ingests in one port and sends it out to all other ports, with no consideration at all if the device that the packet is destined for actually is on that port - something a switch does, by maintaining a mapping of MAC addresses to ports. Extremely dumb devices, but used to be way faster and especially cheaper than switches in the 90's/early '00s)

    • I still keep an old 4-port hub in my junk-box because that way I can diagnose/snoop on network traffic... Although so much of it is encrypted these days that it's harder to see what's going on.

      P.S.: Yes, modern alternatives would be to buy a switch that can be configured to "mirror" packets onto a chosen port, or a small Ethernet network tap unit... But why buy more stuff if I don't really need to?

    • bonus fact: multicast was still being done via broadcast in some switches ~10y ago, also extremely dumb :P

  • What does this mean anyway? I tried googling but no dice.

    • I needed reminding too.

      "In a "switched" network, when Device A wants to send data to Device B, the switch directly connects these two devices so they can chat. Think of it like a train switcher that directly links Track A to Track B for a specific train, instead of sending it through a maze of tracks where other trains are moving.

      In contrast, a "hub-based" network is like a party line in old telephone systems. When Device A talks, EVERY device hears it, but only Device B cares and listens. This is less efficient and can be slower because all devices get the data, which clogs up the network.

      Another option is a "routed" network, where a router decides the best path for the data. This is like GPS choosing the best route based on current traffic conditions. It's more flexible but can introduce more delays because the data might go through multiple steps to reach its destination.

      It's called "switched" because the switch acts like a railroad switch operator, making a direct track connection from one device to another for each piece of data. It "switches" the pathway specifically for that data to make the communication as direct as possible."
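The hub-versus-switch difference described in this subthread can be sketched as a toy model in Python. This is only an illustration of the MAC-to-port mapping idea mentioned above, not how any real device is implemented; the class names, the string MAC addresses, and the port numbering are assumptions.

```python
class Hub:
    """Repeats every frame out of all ports except the one it arrived on."""

    def __init__(self, num_ports):
        self.num_ports = num_ports

    def forward(self, in_port, src_mac, dst_mac):
        return [p for p in range(self.num_ports) if p != in_port]


class LearningSwitch:
    """Learns which port each source MAC lives on and forwards selectively."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.mac_table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        # Learn (or refresh) the sender's location.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood, just like a hub would.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            # Destination is on the same port the frame came from: drop it.
            return []
        return [out_port]
```

The switch behaves like a hub only until it has seen traffic from the destination; after that, frames go out a single port.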

Does this add up? If the connection timeout is 3ms, then that means there's 3ms for a roundtrip, 1.5ms each way. So the maximum distance would actually be roughly 250 miles. But even then, packets don't actually travel at the speed of light in fiber optic cables. It also assumes that the cables are laid as the crow flies, which they aren't.
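The arithmetic behind this question can be checked with a short Python sketch. The function name `reach_miles` and the 0.67 velocity factor for fiber are assumptions for illustration (light in glass fiber travels at very roughly two thirds of its vacuum speed); the 3 ms figure is the one from the story.

```python
C_KM_PER_S = 299_792.458   # speed of light in vacuum (exact, by definition)
KM_PER_MILE = 1.609344     # international mile (exact)


def reach_miles(timeout_s, velocity_factor=1.0, round_trip=False):
    """Farthest distance a signal can cover before the timeout expires.

    velocity_factor scales the vacuum speed of light (1.0 = vacuum).
    round_trip=True halves the result, since the signal must go out and back.
    """
    distance_km = C_KM_PER_S * velocity_factor * timeout_s
    if round_trip:
        distance_km /= 2
    return distance_km / KM_PER_MILE


one_way = reach_miles(0.003)                      # about 559 miles
there_and_back = reach_miles(0.003, round_trip=True)  # about 279 miles
in_fiber = reach_miles(0.003, velocity_factor=0.67, round_trip=True)  # under 200 miles
```

So the one-way figure matches the story's number, while a strict round-trip reading of a 3 ms budget gives roughly half of it, and less again in fiber.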

Fun read! Along the way I was trying to guess the cause and my best guess was TTL-related. However I don’t quite understand the actual cause! If the connection timeout is 3ms in practice, shouldn’t that be for a packet round-trip? So ~250 miles? And wouldn’t we expect at least a small delay on the remote SMTP server to process the packet?

I love these sorts of debugging stories! It sounds like that timeout would be based on the round trip travel time to the remote host rather than the one-way distance; wouldn’t that make a 250 mile cutoff?

Does anybody think that limiting email to 500 miles might be a Good Thing?

I don’t have a very well-formed use-case in mind, but I strongly suspect there is one: suggestions welcome.

I don’t get why this story is so popular here. None of the technical details adds up.

Multiple back and forth due to protocol handshakes and router delays would add enough latency to prevent any connection from happening at all if the timeout was set to 0 as stated.

I guess people just like the “it was the time light took to travel” vibe.

_The_ classic CS story (I personally think! ;) <3 :')))) ;'D). I think it will be hard to beat this one. :')

Ah, the good ol' days....Wait, no, we're still living in 'em now! Just go to the edge of a quiet, non-hypey, but still expanding field.

Tons of fun. ;)))) <3 <3 <3 <3 :)

  • Definitely not the same level of WTF, but I worked on one of my favorite bugs I've ever seen super early on in my career.

    I joined a tiny digital agency maintaining WordPress sites and about a month in one of our customers files a ticket that their website was broken. Just a white page, no error, no nothing. I ask my boss if I should switch to work on it and he goes, "Nah, this customer does that every few months and there's never anything wrong. It's something with their hosting or something (they were self hosting a site we built and maintained). Just take a look at their site when you have time to say you did and close it as can't reproduce."

    Two days later I have some spare time so I take a look and sure enough, everything is working as expected. A few months pass, same ticket from the same customer, I pick it up a day or two later and everything is working as expected. A few months later, same thing, but this time I have nothing I'm working on so I pick the ticket up immediately and sure enough the website is broken. I immediately show my boss and he's like "well, I'll be damned" and then tells me to fix it. I poke around for a few hours, but can't figure it out, so I call it a day. When I get in the next morning, things are working as expected, so we're both like "wtf?"

    I don't remember how exactly I ended up figuring this out or even all the details of the bug, but the root of the issue ended up being that whoever wrote the code for the site had some code that revolved around the current date and they'd hardcoded that there were always 30 days in the month. Whenever the current date was the 31st of the month this code broke and took down the website, but by the time the 1st rolled around the code worked again.
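The original site code isn't shown, so the following is only a hypothetical reconstruction of this class of bug, written in Python rather than whatever the site used. The function names are made up; `calendar.monthrange` is the standard-library call that returns the real number of days in a month.

```python
import calendar
from datetime import date


def days_in_month_buggy(d):
    """The hardcoded assumption: every month has 30 days."""
    return 30


def days_in_month(d):
    """Ask the calendar instead; monthrange returns (first weekday, day count)."""
    return calendar.monthrange(d.year, d.month)[1]


def day_is_valid(d, days_fn):
    """Mimics code that rejects a day number outside the month's length."""
    return 1 <= d.day <= days_fn(d)
```

With the hardcoded version, `day_is_valid(date(2023, 1, 31), days_in_month_buggy)` is false, which is the kind of check that would fail only on the 31st and pass again on the 1st.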

For those on a Mac the millilightseconds unit probably won't work, in which case you can try:

  586 units, 56 prefixes
  You have: 3 lightyear / 365.25 / 24 / 60 / 60 / 1000
  You want: miles
      * 558.83525
      / 0.0017894361

What I wonder is, wouldn't the signal have to travel back and forth, i.e. about 1000 miles, so the sender can receive the 200 OK signal?

> I'm looking for work.

So is he now jobless because of his defunct config file that couldn't match the "speed of light"?

  • I know you're being silly, but I think Harris now works for Segovia Technology, a payment processor geared toward people living in extreme poverty.

I have a story from my current workplace which threw me and the other IT staff for a loop.

Get a support ticket from one of the staff saying that the screen of their laptop keeps "going black" when they are using the computer. So I set up a time to take a look and ventured away from my desk to see what's going on.

It's worth noting that this is a laptop, less notable is that it is a shared device as different people work this shift, regardless, all the staff that share it are adamant it only happens with this one user. So I asked them to demonstrate, they open a browser and start typing, sure enough after a sentence or two the screen goes black and the machine locks / goes to sleep.

I ask to take a look at the laptop and log back in, open notepad and start banging away on the keyboard like a cracked out chimpanzee, nothing happens. I hit every key on the keyboard, double checked hotkey / function keys, can't get it to happen. Now I had written a small exe for our staff that locks their workstations when unplugging their security keys and killed it just to make sure that wasn't causing issues, nope that's not it.

Then they tell me they even swapped laptops and the problem follows this user... So I blow out the user profile and log them back in and let it sync again, issue still persists. Now I am watching them type like a hawk trying to see if I can spot what key sequence or wizardry is happening to cause this issue.

I can't find any reason that makes sense; when I check the event logs it just says "Machine sleep due to lid closure or button press". Happens on 2 different laptops so it's unlikely to be a faulty lid switch or loose / sticky button. I notice a bracelet on this person's wrist and joke about it not being a super magnet or something and she laughed and said no, but it does use a magnet to hold it together.

I didn't think much of it at the time and ventured back to my office defeated and confused. I am going over the issues with one of my co-workers and mention the machine sleep due to lid closure or button press, then something clicked. We grabbed a spare nametag with a magnet and started waving it around the same model laptop surfaces and sure as shit the screen goes black and it goes to sleep.

The magic spot on this laptop was to the right of the trackpad, the user wore the magnetic jewelry on their right wrist and when they arranged their hands just right it triggered the lid closure sensor which on these models uses a magnet in the lid.

In real life this issue persisted for a few weeks before we figured out what was going on, I thought I was going crazy though when it was happening.

Does anyone else remember reading a longer version of this or am I mixing it with some other similar story?

> even of a relatively impoverished department like statistics

It's not a statement you would read in 2023 :)

I love those stories. I can then engineer bedtime stories. Is there a place where I can find more?

"I choked on my latte. "Come again?""

That sentence is a rather decent double entendre.

    "You waited a few DAYS?" I interrupted, a tremor tinging my voice.  "And
    you couldn't send email this whole time?"

    "We could send email.  Just not more than--"

    "--500 miles, yes," I finished for him, "I got that.  But why didn't
    you call earlier?"

    "Well, we hadn't collected enough data to be sure of what was going on
    until just now."  Right.  This is the chairman of *statistics*. "Anyway,
    I asked one of the geostatisticians to look into it--"

    "Geostatisticians..."

    "--yes, and she's produced a map showing the radius within which we can
    send email to be slightly more than 500 miles. 

Pure gold. I love that the stats department put in such rigorous testing before submitting the ticket.

This excellent tale has appeared many times on HN; here's dang, in 2021, listing some of the past threads:

https://news.ycombinator.com/item?id=29213472