The 500-mile email (2002)

11 years ago (web.mit.edu)

144 comments

folz

This one comes up every 3-4 years or so in sysadmin communities, and I read it every single time, because it's worth it.

It's one of those things that I highly doubt would have occurred to me to have even checked, or given even a moment's thought to, under normal circumstances.

tootie 11 years ago

This and the story of Mel never get old.
http://www.catb.org/jargon/html/story-of-mel.html
AceJohnny2 11 years ago
I was looking for another famous sysadmin story, where the guy who also happens to be a top Linux developer (so maybe Alan Cox?) rescues a deeply broken Linux system where even glibc is no longer accessible by manipulating inodes in a running process. Or something.
Over the years, my Google-fu has failed me. Any clue? :)
- HarryHirsch 11 years ago
  
  http://www.lug.wsu.edu/node/414 is what you are looking for.
  
- dredmorbius 11 years ago
  
  That's a classic.
  Best I can claim is zmodem transfers of uuencoded packages over a PLIP link as I tried to get ethernet support up on an old but fairly reliable box.

danbruc 11 years ago

Another email incident at Microsoft worth reading [1].

[1] http://blogs.technet.com/b/exchange/archive/2004/04/08/10962...

JonnieCache 11 years ago

My favorite version of this tale: "Free Bananas in the Kitchen!"
http://www.metafilter.com/78177/PLEASE-UNSUBSCRIBE-ME-FROM-T...
I remember it happened at NYU a couple of years ago and they turned it into a kind of ad-hoc social network/partyline. I wonder if anyone archived those emails? I suppose they deserve to remain "private."
sp332 11 years ago

One listserve (can't remember which) made up a list for people who complained like this instead of following the unsubscribe instructions. The admins would remove complainers from the normal lists and add them all to one mailing list, where the only emails they got were each other's demands to be taken off the mailing list, with unsubscribe instructions added to the beginning and the end of every single email.
MrBuddyCasino 11 years ago
Ha. There is no explanation of why the mailing lists were named "Bedlam" though, and I doubt non-native readers know what it refers to. To quote Wikipedia [0]:
"Bedlam may refer to:
Bethlem Royal Hospital, London hospital first to specialise in the mentally ill and origin of the word "bedlam" describing chaos or madness"
[0] http://en.wikipedia.org/wiki/Bedlam
- bevacqua 11 years ago
  
  I'm a non-native speaker and I know what Bedlam means. Thanks to Ultima Online and Diablo :)
AceJohnny2 11 years ago
I also found that to be evidence of pretty horrific architecture in Exchange. Two actual recipient lists with a secret internal one? Bloating headers to 13K? At the very least, it seems to me like they chose to put the distribution logic at the wrong layer...
- fragmede 11 years ago
  
  > Two actual recipient lists with a secret internal one?
  How else do you propose handling BCC and mailing lists?
AtmaScout 11 years ago

Thanks for the link. I was surprised that it was written by Larry Osterman. I enjoy listening to his stories about Microsoft. Have you seen his Channel 9 videos [0]? I really enjoy the checking in videos with Erik Meijer.
[0] http://channel9.msdn.com/tags/Larry+Osterman
Shaaan 11 years ago

Literally the exact same thing happened at Case Western this past weekend
digi_owl 11 years ago

Ah yes, the age old "reply-all" email storm.
The bit about the recipient processing bug is novel though, ouch.

IgorPartola 11 years ago

If only every bug report that I received had been processed by a geostatistician... Usually I get a "hey, I can't get X to work". One of three responses from me usually fixes it: "Is your computer on?", "are you online?", and "try hitting refresh".

I am actually surprised the sysadmin in this scenario thought it was a bad thing that the statistics department did their research and presented a well documented error.

mcguire 11 years ago

Well, technically, the geostatistician (Did I spell that right?) was doing research that was orthogonal to the actual problem and its symptoms. In this case, the results were sufficiently odd that they sort of pointed in the right direction, but I've been sent off on wild goose chases by people skillfully applying their own particular set of skills before.
On the other hand, there's the word document with nothing but a screen shot showing half of a useless error message.
vacri 11 years ago

Reminds me... when I post a support request to Google Apps, the issue description header says "in as much detail as possible"... but the field is limited to 1000 characters. When you're dealing with anything other than simple first-level support issues, a user simply can't put in a usefully descriptive amount of detail...

copperx 11 years ago

I didn't know about the units program. Is there any resource out there that lists these little *nix utility programs?

motters 11 years ago
That's the thing about unix-like systems. No matter how much you have learned there's always some command you don't know.
- creshal 11 years ago
  
  paste was my most recent "holy shit this saves so much time" discovery. I blame it on the not quite intuitive name.
  
tedunangst 11 years ago
ls /usr/bin
- andrewstuart2 11 years ago
  
  Unless `units` doesn't happen to be installed by default, which is the case at least for Arch Linux.
  Though it doesn't contain `units` either, here's a Wikipedia list of the standardized (IEEE 1003.1-2008) unix commands. http://en.wikipedia.org/wiki/List_of_Unix_commands
  
db48x 11 years ago

info coreutils is a great place to start.
jacobsenscott 11 years ago
units is nice, but there isn't much help, and the syntax isn't always easy to remember. It was fun to play with for a while, but wolframalpha.com is better.

fbnt 11 years ago

Shouldn't this account for a round trip, and the speed through copper (~ 2/3rd of the speed of light)? That would lower the radius to much less than 500 miles.

dantillberg 11 years ago
I had this thought when reading this before as well. I imagine that the "3 milliseconds" they determined from testing was a typical number, maybe the median/mean, and that the actual timeout varied considerably depending on CPU load at that particular moment. Add in a number of retries for the server to attempt sending each email, and the effective timeout might have been a few milliseconds more... or at least it must have been, because `(2 * 500 miles) / (2/3 speed of light)` works out to about 8 milliseconds (where the 2X is for the round trip, and 2/3 is a rough multiplier for the speed of light traveling in either copper or optical fiber).
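For anyone who wants to check the arithmetic in this subthread, here is a rough Python sketch. The function names are made up for illustration, the 2/3 velocity factor is just the rough multiplier quoted above, and the model ignores CPU load, retries, and switching delays entirely.

```python
# Back-of-the-envelope check of the numbers discussed above.
# Assumptions (not from the original story): the helper names, and a
# velocity factor of 2/3 for signals in copper or fiber.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by SI definition
METERS_PER_MILE = 1609.344            # international mile


def reach_radius_miles(timeout_s, velocity_factor=1.0, round_trip=False):
    """Farthest distance (miles) a signal can cover within timeout_s."""
    distance_m = SPEED_OF_LIGHT_M_PER_S * velocity_factor * timeout_s
    if round_trip:
        distance_m /= 2  # the signal has to get there and back
    return distance_m / METERS_PER_MILE


def round_trip_time_s(distance_miles, velocity_factor=1.0):
    """Time (seconds) for a signal to travel distance_miles and return."""
    speed_m_per_s = SPEED_OF_LIGHT_M_PER_S * velocity_factor
    return 2 * distance_miles * METERS_PER_MILE / speed_m_per_s


if __name__ == "__main__":
    # One-way, in a vacuum: the figure the story arrives at.
    print(f"{reach_radius_miles(0.003):.1f} miles")
    # Round trip at 2/3 c: the correction raised in this thread.
    print(f"{reach_radius_miles(0.003, 2 / 3, round_trip=True):.1f} miles")
    # Time needed for a 500-mile round trip at 2/3 c.
    print(f"{round_trip_time_s(500, 2 / 3) * 1000:.2f} ms")
```

With a 3 ms timeout, the one-way vacuum figure comes out near 559 miles, the round-trip figure at 2/3 c near 186 miles, and a 500-mile round trip at 2/3 c takes about 8 ms, which agrees with the estimate above.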
- ErrantX 11 years ago
  
  The FAQ answers this question. Basically; it was a long time ago, and the point of the story isn't in the detail. :)
  http://www.ibiblio.org/harris/500milemail-faq.html
  

bpp 11 years ago

Another of the 10,000 here - this is such a delightful story.

Also just discovered the "units" conversion program and disappointed that the default Mac library has only 586 units. And shockingly there don't seem to be compatible libraries out there.

girvo 11 years ago
`brew install gnu-units` should do it :)
Edit: You'll then want to run it with `gunits` rather than `units`
- oaktowner 11 years ago
  
  Now I know where the rapper got his name.
- bpp 11 years ago
  
  Awesome, thank you!
- tim333 11 years ago
  
  Yay! Works.

andrewchambers 11 years ago

I was so happy to discover that units command line program, then I realized that Google already does this, it just wasn't as fun.

btilly 11 years ago

Google has units, but not detailed commentary on unit definitions.
See https://futureboy.us/frinkdata/units.txt for that.
tim333 11 years ago
yeah and units on the mac terminal doesn't recognise "3 millilightseconds" whereas Google works for "0.003 light seconds to miles"
- chris_b 11 years ago
  
  works exactly as in the blog post in fedora.
  

tsaoutourpants 11 years ago

Forgot to account for the difference between traditional speed of light (in a vacuum) and speed of light traveling through copper or fiber. :)

sampo 11 years ago

And the time it takes to make a round trip.

vog 11 years ago

Better link that contains more headers (showing the email's date, and linking to a FAQ):

http://www.ibiblio.org/harris/500milemail.html

th0ma5 11 years ago

This always reminds me of the email around the world: http://phrack.org/issues/41/4.html

fabulist 11 years ago

Thanks for a good read. It's strange to think about a time when there were a myriad of incompatible networks, and their different capabilities could be exploited.

anonfunction 11 years ago

I've seen a few comments about units not having lightseconds, so here are a few ways to add the missing unit if you don't have it.

1) Add this line under the lightyear definition in /usr/share/misc/units.lib (or wherever `man units` says the standard units library is under the FILES section)

    lightsecond lightyear / 365.25 / 24 / 60 / 60

2) If you're on a mac and use homebrew just `brew install gnu-units` and then run `gunits`
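A quick sanity check on the definition in option 1, as a hedged Python sketch. It assumes the library's lightyear is the speed of light times a 365.25-day year; I'm not certain what year length any particular units.lib uses, so the second call shows how a different year (the value there is only an example) would skew the result slightly. The function name is hypothetical.

```python
from fractions import Fraction

SPEED_OF_LIGHT_M_PER_S = 299_792_458          # exact, by SI definition
METERS_PER_MILE = Fraction("1609.344")        # international mile


def lightsecond_m(year_days):
    """Length in meters of: lightyear / 365.25 / 24 / 60 / 60.

    year_days is the year length (in days) that the units library is
    ASSUMED to use when it defines lightyear as c times one year.
    """
    lightyear_m = SPEED_OF_LIGHT_M_PER_S * Fraction(year_days) * 24 * 60 * 60
    return lightyear_m / Fraction("365.25") / 24 / 60 / 60


if __name__ == "__main__":
    exact = lightsecond_m("365.25")
    # Example only: a slightly shorter year makes the derived unit too short.
    skewed = lightsecond_m("365.24219879")
    print(float(exact), float(skewed))
    # 3 millilightseconds in miles, using the exact definition.
    print(float(3 * exact / 1000 / METERS_PER_MILE))
```

If the library's year is exactly 365.25 days, the derived lightsecond is exactly 299,792,458 m and 3 millilightseconds is about 558.85 miles. With any other year length the result is off by a few parts in 100,000, which doesn't matter for this story.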

smoyer 11 years ago

That's the speed of light in a vacuum ... through fiber-optic cable the speed of light is about two-thirds that value.
chernevik 11 years ago
I did #2, then:
    sudo mv units macunits
    sudo ln -s $(which gunits) units
- lloydde 11 years ago
  
  or use brew install option --with-default-names and put your homebrew at the start of your path.

Aissen 11 years ago

Damn statisticians. They do know their job quite well.

nchelluri 11 years ago
It was a seriously accurate bug report. If only all users were so thoughtful.
- nashashmi 11 years ago
  
  > If only all users were so thoughtful.
  But then it sent him off in a direction not worth going. He literally started to map out how far emails would go if they succeeded. The whole time the error was in the timeout instead.
  

Scarbutt 11 years ago

Who thought it was going to be a TTL issue before finishing reading the story? :)

mobiplayer 11 years ago
You probably mean something else (RTT?) but definitely not TTL, which is a completely different thing :)
- motoboi 11 years ago
  
  TTL is involved when dealing with routed networks. The farther the destination, the more hops you normally get on the way. If the starting TTL is low, you won't reach the destination. So, TTL values cause problems like this, although the radius wouldn't be so precise. Damn statisticians!
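To make the TTL point concrete, here is a deliberately simplified Python sketch. The function name and the hop counts are hypothetical; real routers and real paths are messier, and hop count is only loosely related to physical distance, which is why a TTL problem would not produce a clean 500-mile radius.

```python
def delivered(routers_on_path, initial_ttl):
    """Return True if a packet survives every router on the path.

    Simplified model: each router decrements the TTL by one and drops
    the packet if the TTL reaches zero before it can be forwarded.
    """
    ttl = initial_ttl
    for _ in range(routers_on_path):
        ttl -= 1
        if ttl == 0:
            return False  # dropped in transit
    return True


if __name__ == "__main__":
    # Hypothetical hop counts: a nearby host and a distant one.
    for routers in (3, 12):
        print(routers, delivered(routers, initial_ttl=8))
```

In this model the limit is a count of routers, not a time, so it is a different mechanism from the connect timeout in the story, which limits how long the sender waits for a reply.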
  
gabeio 11 years ago

Not I, I don't know why but I was thinking IP version issue but that makes no sense.

kissickas 11 years ago

Why do I get:

> unknown unit 'millilightseconds'

Is this one of the embellishments that just makes the story more entertaining?

kaishiro 11 years ago
Not an embellishment at all.
Via 'man units': "The conversion information is read from a units data file that is called 'definitions.units' and is usually located in the '/usr/share/units' directory."
Via definitions.units (L. 223), you can see the milli- prefix: https://gist.github.com/anonymous/f06769de95e0c7f9e658#file-...
Via definitions.units (L. 1060), you can see the lightsecond unit: https://gist.github.com/anonymous/f06769de95e0c7f9e658#file-...
Maybe check it for completeness?
Edit: Spelling
- anonfunction 11 years ago
  
  Some distributions only support lightyear so adding this line to your units file (which you can find with man units) will give you support for *lightseconds:
  lightsecond lightyear / 365.25 / 24 / 60 / 60
  
frivoal 11 years ago

I don't know about what "units" supports, but if you ever need picolightseconds in your web design, CSS has got you covered:
http://dev.w3.org/csswg/css-egg/#astro-units
anonfunction 11 years ago

I had the same thing happen to me. From the manpage I gathered that units uses the definitions in /usr/share/misc/units.lib. By running `cat /usr/share/misc/units.lib | grep light` I found I only had lightyear and its shortcut ly defined. I added lightsecond, and since the milli prefix is already defined, it worked a treat.
Here's the line you'll want to add:
lightsecond lightyear / 365.25 / 24 / 60 / 60
riking 11 years ago

If you're on a Mac, try `brew install gnu-units`; the stock units is probably using a very incomplete library of units.
masklinn 11 years ago

More complete units library. Note how the original author's units has 1311 units and 63 prefixes, OSX only has 586 and 56.

tlrobinson 11 years ago

I see this story every so often, and it's a good one, but I haven't thought to verify it. Has anyone else?

ryan-c 11 years ago

This FAQ was posted in the comments of a previous posting: https://webcache.googleusercontent.com/search?q=cache:http:/...

bontoJR 11 years ago

Absolutely a good read. Sometimes this kind of reading can help with a completely different problem. Sometimes you are dealing with another problem, then you remember this story, and you figure out what's wrong because there are some similarities. I remember fixing a problem with PostgreSQL by remembering a story about Unicode and Postfix: different domain, but similar problem.

carlesfe 11 years ago

That was great out-of-the-box thinking, and I wonder if that could be used as one of these job interview questions:

Q: "Your email server for some reason is only working for addresses within 500 miles of the server. What may go wrong?"

And let the candidate think logically and reach some sane answer, even if not 100% accurate (e.g. check routers first, connectivity, DNS, timeouts...)

iopq 11 years ago
That's one of those interview questions that tests for someone reading hacker news and pretending that they figured it out all on their own...
- rmc 11 years ago
  
  "culture fit"
  
Xorlev 11 years ago

Anyone who short circuits and brings up this story should get a +1.

ilaksh 11 years ago

If you're a sysadmin and someone brings in a consultant who gets root access and upgrades the whole OS to a new operating system which then almost takes out email.. wouldn't that be a problem?

If I were the sysadmin and that happened, I would need to have a meeting with some people. What's the point of being a sysadmin if the operating system is randomly going to be completely changed without someone telling you?

I have a fair amount of built up rage. This seems like one of those situations where it is actually your responsibility to rip people a new one.

t27 11 years ago

A perfect answer to the YC application question - "Tell us something surprising or amusing that one of you has discovered" :)

laex 11 years ago

I tried the 'units' program on OSX. It seems that it does not recognise the 'millilightseconds' unit.

JulianMorrison 11 years ago
Try one L in mili?
- laex 11 years ago
  
  Didn't work.
  

kowdermeister 11 years ago

I'm wondering how many hits that email address got at the bottom of the page :)

anonu 11 years ago

Was this just a clever way to let people know he was looking for a job?

mborsuk 11 years ago

Every time I read this I am reminded of units(1) util, which is super useful and I always forget about and revert to Google. But yeah, that connect timeout to 500 mi correlation is fun too.

nathancahill 11 years ago

1 year ago: https://news.ycombinator.com/item?id=123489

dang 11 years ago

Reposts are fine when a story hasn't had significant attention in the past year or so.
https://news.ycombinator.com/newsfaq.html
Pyxl101 11 years ago
Once a year is about the right frequency. Recurring stories are one way in which a community shares and perpetuates its culture with newcomers. Some of them are a delight to read on that yearly cadence, like the SR-71 story about a pilot and his copilot becoming a crew.
That said, it's wise to consider the frequency with which such things appear, individually and in total. Too much repetition and focus on memes becomes dysfunctionally self-obsessive. Not sure what the right answer is, but I can probably deal with once per year, short time on front page, and small % of total content.
- geuis 11 years ago
  
  This is an interesting idea. Have a system where a community can mark something as important, and to have it automatically reposted at preset intervals. Community members could be allowed to additionally repost, or the system can politely say it's already archived and will be shared again on such & such date. Use it as a way to reinforce community history.
  
- carleverett 11 years ago
  
  Would you mind linking the SR-71 story? Somehow I never saw that.
  
Twirrim 11 years ago
Maybe the submitter is one of the ten thousand https://xkcd.com/1053/
Bet there are a few more that will find this submission too.
- emilioolivares 11 years ago
  
  Ha, I'm one of the ten thousand for both the XKCD and this post. Lucky me!
  
- johnaspden 11 years ago
  
  I learned about diet coke and mentos from that cartoon. It was wonderful.
- yzzxy 11 years ago
  
  This doesn't apply very well... HN is heavily archived... this comic is about being rude to people for not knowing about something, not justifying shoving the same cyclical content in people's faces repeatedly.
DiabloD3 11 years ago
And the thing is, every time I see this story (I saw it years ago before I discovered HN), I read it in its entirety, and love the shit out of it.
It really makes me sad the BOFH series of stories is over, I loved those too.
- blfr 11 years ago
  
  The Register still carries them[1]. I don't know their exact relation to the original but I think they're official.
  [1] http://www.theregister.co.uk/data_centre/bofh/
  
nocman 11 years ago

Wow, I must have bad timing. I've had an account here for almost all of those, and I think I was probably lurking for the 1 or 2 occurrences when I did not have an account, but don't remember seeing it before.
Or perhaps senility is setting in early. :-D
brudgers 11 years ago

The only significant discussion was almost five years ago. Or about the time the first iPads went on sale. And before either of us were members.
I missed it all the other times and am glad it was reposted.
simplyluke 11 years ago

I don't think this is a bad thing. It was either 1 or 2 years ago when I first read about this - newcomers to the community have to find out about things in one way or another.
obituary_latte 11 years ago

/me is unsure whether this is condemnation or validation. anyway, let's ride the wave.

ai_ja_nai 11 years ago

old but gold

elchief 11 years ago

Can we get a nice "HN Classic" tag to put beside annual stories like this? I'm fine if stories like this pop up every year, actually.

kazinator 11 years ago

> And also being a good system administrator, I had written a sendmail.cf [...]

Say what? Nobody writes a sendmail.cf from scratch, unless they are crazy.

> ... that used the nice long self-documenting option and variable names available in Sendmail 8 rather than the cryptic punctuation-mark codes that had been used in Sendmail 5

Good system administrators stick to conservative, portable subsets of configuration and scripting languages, rather than bleeding edge stuff.

When they deviate, they have a clear plan. They document their choice to use something new and shiny, and they keep it separated from the default system configuration.

Since SunOS came with Sendmail 5, the upgraded Sendmail 8 should have been installed in some custom location with its own path so that it coexists with the stock Sendmail, and is not perturbed if the OS happens to upgrade that.

A good sysadmin would stick that in some /usr/local/bin type local directory, and not overwrite /usr/bin/sendmail.

The consultant was not wrong to update the OS. People have reasons to do that. The consultant should have consulted with the sysadmin, of course. But even in that event, it might not have immediately occurred to the sysadmin what the implication would be to the sendmail setup.

TreyHarris 11 years ago

Goodness, you're determined to find fault, aren't you? (For the record in re your comment later about my "basis to call [myself] a good system admin", those claims were a) jokey, and b) fairly well-substantiated by my reputation by that time, I should think. I was published by that point and had been on several conference committees along with many who'd be reading that mailing list; I hardly needed to peacock like you seem to think I was doing.)
But I think your criticisms seem a little uninformed (or possibly over-informed by later practice to the point where you aren't considering this in the context of mid-1990's practice). Let's see...
> > And also being a good system administrator, I had written a sendmail.cf [...]
> Say what? Nobody writes a sendmail.cf from scratch, unless they are crazy.
I didn't say "from scratch". I used the m4 macros to create a cf, like everyone did at the time. Using the default file would only work if you still used email programs that read raw mbox files, had no email lists, and needed no interesting aliasing or vacation script behavior. Oh, and ran in an environment where it was reasonable to assume someone's canonical email address could be found via the equivalent of `echo "${USER}@${HOST#*.}"`.
Very few production systems could get away with that; writing a sendmail.cf was standard practice. And with m4, you usually spoke of "writing" a file where today we'd call it "configuring" a file; either way it was taking boilerplate and replacing bits with things that were right for your situation. I assume you wouldn't have had an issue with my writing that I'd "configured" the sendmail.cf. That's all I did.
> > ... that used the nice long self-documenting option and variable names available in Sendmail 8 rather than the cryptic punctuation-mark codes that had been used in Sendmail 5
> Good system administrators stick to conservative, portable subsets of configuration and scripting languages, rather than bleeding edge stuff.
Hmm, you either weren't administering SunOS in the mid-90's or you're forgetting some details. SunOS still came with Sendmail 5 *years* after best practice was to use Sendmail 8. Check out the O'Reilly Sendmail book of the time's pagecount: it was longer than the prior and the later versions because it had to document both. I'm not entirely certain SunOS (as opposed to Solaris) ever was upgraded to Sendmail 8 in the distribution; obviously the people using SunOS still so late were change-averse.
"Bleeding edge" != "the version that all but the most conservative holdouts are using". Also, remember that this was the same period we were doing the rsh/rlogin conversion to SSH. Sendmail 5 still had known security issues that were fixed in Sendmail 8. We were used to replacing system components when what the OS vendor was shipping us was literally dangerous to run.
And Sendmail 8's Sendmail 5 compatibility mode was simply there for testing; it was never intended to be used in production long-term, so using a least-common-denominator sendmail.cf wouldn't have been "conservative and portable"; it would have been risky, bordering on malpractice.
> Since SunOS came with Sendmail 5, the upgraded Sendmail 8 should have been installed in some custom location with its own path so that it coexists with the stock Sendmail, and is not perturbed if the OS happens to upgrade that.
> A good sysadmin would stick that in some /usr/local/bin type local directory, and not overwrite /usr/bin/sendmail.
Again, either you didn't run this installation in the mid-90's or you're forgetting some details. /usr/lib/sendmail (notice the "lib"! Your referring to "/usr/bin/sendmail" suggests to me you definitely weren't running SunOS 4 or have forgotten details; sendmail was never in /usr/bin) couldn't be left alone, as other tools hardcoded that path. The actual executable was there, so symlinking couldn't be used to get around that.
gabeio 11 years ago
> Say what? Nobody writes a sendmail.cf from scratch, unless they are crazy.
The point, moreover, was that he had a custom version of the config file (not just the default).
- kazinator 11 years ago
  
  Yes, sites have necessary customizations in sendmail.cf. These do not have to be rewrites that use shiny new syntax.
  My biggest problem with the author was not that he uses his admin blunders as a basis to call himself a good sysadmin, but that he assumed that the stats people were idiots who don't know anything about 'puters or networks.
  I was not surprised by the 500 mile claim. It strikes me as obvious that the 500 miles has to do with some combination of network topology and propagation delays, those being approximately the same in every direction.
  Yes, networking does work "that way": farther places take more time to reach than nearer ones, broadly speaking. (Of course, it's faster to reach something 12,000 km away with no packet switch in between than something 50 miles away with switching. That doesn't eliminate the generality.)
  It was also obvious why they didn't report the problem instantly; you cannot instantly know that mail isn't reaching beyond 500 miles without gathering data and correlating to a map, which takes time. Instantly, you can only know data points like "I can't mail to users@example.com". You know that if a stats person gives you a number, it was based on data, and not just a couple of data points. The head of the stats department isn't going to give you a number that isn't factual and backed by science. Of course stats people pride themselves on their data analysis; they are not just going to relay a couple of data points with no analysis attached.
  