Mail does not go further than 500 miles - FAQ

The story of the e-mail that would not travel more than 500 miles from the sender has long been a hoary classic. I assumed the normal reaction was simply to laugh, but quite a few people set out to prove to the author that it could not have happened, because ... In the end, the author had enough and published a whole FAQ. So here it is:



Email that didn’t go further than 500 miles



I have received many responses to the publication of “Mail does not go beyond 500 miles.” My story was reprinted many times and spread much more widely than I could have hoped. Most of the responses are thanks for the funny story and job offers (thank you for those, by the way, and please keep them coming!). However, many people looked for inaccuracies and contradictions in my story and nitpicked over trifles. Instead of answering every such attack individually, I collected the most frequently asked questions and answered them all at once.



1. Did this really happen, or is it just a story?



It really happened. At the time, I was responsible for the centralized email system on the University of North Carolina campus at Chapel Hill. I also set up e-mail for those departments that, for one reason or another, ran their own servers. The main thing in the context of this story is that I wrote the mail server configuration file, sendmail.cf, which was used by most campus servers.



2. When did this story happen?



I would really like to be able to say for sure. But even though I’m one of those anal-retentive types who carefully keep all their mail, incoming and outgoing, I cannot find a single message on this issue. As I wrote, I was told about the problem by telephone, and I replied by telephone as well. Later on, I consciously decided not to put anything in writing, mainly because the story is so good that I liked retelling it and watching the faces of people hearing it for the first time. Judging by which office I was working in at the time, which colleagues I remember telling the story to, and other irrelevant but time-related details, the story happened sometime between 1994 and 1997.



However, the time can certainly be pinned down more accurately. For example: when was sendmail 8 already “fairly stable” while Sun was still shipping sendmail 5? Eric Allman wrote to me that some features could have been backported to sendmail 5, and if you know when that happened, you can narrow the time interval significantly. In general, if you have any ideas on how to date the story more accurately, I will gratefully listen to them.



3. Did this story really happen to you, or is the identity of the main character one of those “irrelevant details” that were changed?



This really is my story. Maybe one of my colleagues warned me that “something happened in the statistics department” before I talked to the head of the department. It may even be that I called the head of the department, and not he me. Most likely, one of my colleagues was sitting next to me while I was working on the problem, because I have a habit of talking through work issues out loud as I solve them. But I’m hardly telling someone else’s story and taking someone else’s credit. Still, if you worked with me at the time and think that solving the problem was your achievement, contact me and we will work something out.



4. If you are not 100% sure of the details, then why are there so many details in the story?



Because the story is much better with the details. Do you really think anything would have changed if I had started each sentence with the words “I don’t remember exactly, but I think ...”? After all, I warned at the very beginning that some minor details had been changed and some intentionally omitted, just to make the story better.

The second important point is the venue where the story was first published. I sent it to the SAGE (System Administrators Guild) mailing list, in the "incredible challenges" thread. These were simply stories about the most incredible tasks that management sometimes sets for system administrators.



Of course, had I known that the story would spread all over the Internet, I would have written it more carefully. But the text was written for colleagues, most of whom I know personally, and who generally tend to believe me.



5. The story is funny, but the technical details at the end are wrong.



Yes, I know. Reread the answer to the previous question. First of all, I wrote a humorous story based on an incident that happened to me. It is not educational material, so it contains no more technical detail than is needed to follow the general meaning of what is happening. In fact, after writing this story I gained great respect for authors who write stories based on real events. Now I know how difficult it is to balance verisimilitude and fiction. And now I know well what a flurry of criticism awaits an author who opts for artistic license :-)



6. All right, but why don't you write it up as educational material now?



Unfortunately, that will not work even if I wanted to, because I no longer have the source data. I did not save the logs, and I no longer have the notes I made at the time. I really, really wish they had been preserved, because I realize I could make a good article out of them. Back then it all seemed trivial, worth nothing more than turning into a joke for a narrow circle of friends. And I managed that task, even without logs and notes.



Although ... in fact, there are details that I remember or can reconstruct. I will use them to answer the following questions.



7. Setting the connect() timeout to 3 ms does not make sense.



Yes, I know. But there was no such setting. In the story, it takes me about 10 minutes to get from a 500-mile limit on mail delivery to a 3 ms timeout via the speed of light. In reality the process took several hours, and my work resembled that of a detective. In the end I found a solution, ran tests, and poured myself a coffee (and I am sure it was far from the first cup). So what exactly bothers you about the “3 ms” figure?



8. Well, first of all, 3 ms is clearly not enough, because it only allows the outgoing packet to reach the recipient. You still need to get an answer back, so shouldn't the minimum delay be 6 ms?



Of course. This is just one of those details that I omitted. It's too complicated and boring for a humorous story.
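
The arithmetic behind questions 7 and 8 can be checked with a short sketch. It is only an illustration, not something the author ran: the function names are made up for this example, and it assumes a signal moving in a straight line at the vacuum speed of light.

```python
# Back-of-the-envelope check of the "3 ms" and "6 ms" figures.
# Assumption: straight-line propagation at the vacuum speed of light.
C_M_PER_S = 299_792_458      # speed of light in vacuum, metres per second
METERS_PER_MILE = 1609.344   # international mile, in metres


def light_miles(milliseconds: float) -> float:
    """Distance, in miles, that light covers in the given time."""
    return C_M_PER_S * (milliseconds / 1000.0) / METERS_PER_MILE


def one_way_ms(miles: float) -> float:
    """Time, in milliseconds, for light to cover the given distance once."""
    return miles * METERS_PER_MILE / C_M_PER_S * 1000.0


def round_trip_ms(miles: float) -> float:
    """Time, in milliseconds, for a signal to go out and a reply to come back."""
    return 2.0 * one_way_ms(miles)


if __name__ == "__main__":
    print(f"3 ms of light travel: {light_miles(3):.1f} miles")
    print(f"500 miles one way:    {one_way_ms(500):.2f} ms")
    print(f"500 miles round trip: {round_trip_ms(500):.2f} ms")
```

Under these assumptions, 3 ms corresponds to a little under 560 miles one way, and a 500-mile round trip takes a bit more than 5 ms, which is why the questioner arrives at roughly 6 ms.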



9. Or maybe the timeout should actually be 12/18/24 ms because of the three-way TCP handshake?



Maybe. Again, these are details I can’t remember because I lost all my notes. However, I think that the connect() timeout is reset once the SYN/ACK packet is received, so the TCP connection does not necessarily have to be fully established within the timeout. And even if it did, in telling the story I would still have reduced all these complicated calculations to the number "3".
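
For readers who want to experiment with connection timeouts themselves, here is a minimal sketch using Python's standard socket module. It is not sendmail's code and says nothing about how sendmail implemented its timeout; the helper name can_connect is hypothetical. On the client side, a TCP connect normally returns once the SYN/ACK has arrived, so the timeout here bounds roughly one round trip plus local processing.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float) -> bool:
    """Return True if a TCP connection to (host, port) is established
    within timeout_s seconds, False on timeout or any other socket error."""
    try:
        # create_connection applies the timeout to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False
```

Calling can_connect with a timeout of a few milliseconds against hosts at different distances is one way to see a distance-dependent failure pattern; real results depend heavily on the operating system and the network path.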



10. Network equipment introduces much greater delays in the passage of the signal than you thought.



Yes, you may be right. But I could have taken these delays into account. I’m not sure I did exactly this, but I could, for example, have pinged the nearest router (say, a router serving the network of another college of our university) to work out how much delay a router adds. Then I could have multiplied that delay by the number of hops the signal passes through on its way to the destination. That number is approximately the same for all universities on the East Coast. But even if it were not, the delay added by one extra router is several hundred microseconds, which does not affect the overall time very much.
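
The estimate described above amounts to one multiplication and one subtraction. The sketch below uses hypothetical names and purely illustrative numbers (a per-hop delay of 300 microseconds and 5 hops are assumptions, not measurements from the story).

```python
def path_overhead_ms(per_hop_delay_us: float, hops: int) -> float:
    """Total delay added by routers on the path, in milliseconds."""
    return per_hop_delay_us * hops / 1000.0


def propagation_budget_ms(timeout_ms: float, per_hop_delay_us: float, hops: int) -> float:
    """Part of the timeout left for the signal itself once router delay is subtracted."""
    return timeout_ms - path_overhead_ms(per_hop_delay_us, hops)


if __name__ == "__main__":
    # Illustrative values only: 300 us per hop, 5 hops, 6 ms round-trip budget.
    print(path_overhead_ms(300, 5))
    print(propagation_budget_ms(6.0, 300, 5))
```

With numbers of this order, one extra router shifts the budget by a few percent, which is consistent with the claim that it does not change the picture much.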



11. A funny story, but it has a fatal flaw: a signal in a copper wire does not propagate at the speed of light.



Yes, that's true: the signal travels at about ¾c. But both the campus network and the backbone were entirely fiber optic.



12. Aha! But even in optical fiber, light does not propagate at the same speed as in vacuum!



Yes, you got me there. In optical fiber the signal travels at anywhere from ⅔c (yes, slower than in copper wire) to almost c, depending on a bunch of factors. But I repeat once again: I could have taken all of this into account and, of course, I did. I pinged various hosts and recorded the ping time and the distance to each host. Comparing the figures, I derived a kind of “empirical time”, which differed slightly from the real time. However, these too are insignificant details that I omitted to make the story shorter and more interesting.
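
One way to turn such ping measurements into an empirical rule is a straight-line fit of round-trip time against distance. The sketch below is an assumption about how that could be done, not a reconstruction of the author's lost notes; the sample data are invented for illustration and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def reachable_miles(timeout_ms, slope, intercept):
    """Distance at which the fitted round-trip time equals the timeout."""
    return (timeout_ms - intercept) / slope


if __name__ == "__main__":
    # Invented (distance in miles, round-trip time in ms) pairs, for illustration only.
    samples = [(100, 2.1), (250, 4.6), (400, 7.0), (550, 9.6)]
    slope, intercept = fit_line([d for d, _ in samples], [t for _, t in samples])
    print(f"ms per mile: {slope:.4f}, fixed overhead: {intercept:.2f} ms")
    print(f"reach at a 6 ms timeout: {reachable_miles(6.0, slope, intercept):.0f} miles")
```

The slope absorbs the real propagation speed in fiber and the indirectness of the route, while the intercept absorbs fixed delays in hosts and routers, so the fit needs neither of those to be known in advance.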



13. Wait, wait, wait ... Are you saying that you first guessed that the problem was somehow related to the speed of light, and only then did the calculation (in the original: “typed it into units”, that is, used the units utility)?



Yes, exactly. I was being dense. While solving a puzzle, have you never failed to notice the correct answer? That is exactly what happened to me. Most likely it was the other way around: I first converted 500 miles into light-milliseconds, and only then fitted the explanation to that knowledge.



14. So you knew how to fix the users' problem, but did not fix it until you had figured out that it was a timeout?



No. As soon as I realized that replacing the stock SunOS sendmail with sendmail 8 solved the problem, I did it. (Even if I hadn’t known that this would solve the problem, I would have done it anyway, because sendmail 5 running with a sendmail 8 configuration is not the best setup.) But I kept the old binary so that I could still investigate the problem at my leisure.



System administrators always do this. A system never really “gets tired from running too long,” and yet rebooting often helps. First the administrator fixes the problem as best he can so that users can keep working, and later he comes back and looks for the true cause of what happened.



15. Data usually travels over the Internet along very bizarre routes, but in this story the sender always connects directly to the recipient. How so?



It doesn't. There was a zone of roughly 500 miles, give or take, beyond which mail could not be sent. Inside this zone there were also hosts to which mail was not delivered, or was delivered only intermittently.



There can be at least two reasons for this. The first is some additional delay (for example, at a firewall) that caused the timeout to expire. The second is that the path to these hosts really was convoluted, and its total length was more than 500 miles.



The University of North Carolina’s network was very well built, and the signal path to other universities on the East Coast (to which the mail was, in fact, successfully delivered) was almost direct (the original says "orthodromic"), especially at the time of this story. In those days, it was rare for a packet from Atlanta to Washington to go via San Jose.



16. And why did you find it necessary to mention in the story that your network was built almost entirely on switches?



I do not know. At the time, it seemed to me that without this remark the story would be completely implausible, although now I do not understand why. So when you re-read the story, you can mentally skip that paragraph.



A user with the nickname Hacksaw wrote the following: “Switching eliminates delays, for example those caused by collision resolution. The absence of such delays made it easier to track down the problem described, since the data to analyze were cleaner. I bet that is what you meant.”



17. Sendmail 5 does not understand the configuration file from sendmail 8.



But it did. I have already been told that the sendmail 5 available on the net does not understand it. So I am forced to assume that only the sendmail supplied by Sun as part of Solaris could do this. If you have access to its source, I would be grateful if you could check whether this was possible. But still: it happened, which means it could happen :-)



18. sendmail has default parameter values that it is compiled with; it cannot just set all uninitialized parameters to 0.



Several people wrote to me about this. That may be so today, but in those days it definitely was not. I am sure of this because a year or two after this story happened, I attended the sendmail workshop at LISA with Eric Allman. He remarked that sendmail had no default values for some of the options he was talking about (the standard sendmail.cf did set these values, but, as you remember, that does not apply to our story). I took the opportunity and told him the 500-mile story. He was practically rolling on the floor with laughter :-)



19. The units utility in SunOS does not understand units such as “light milliseconds” (the Russian translation says “take 3 milliseconds and multiply by the speed of light”, whereas the original shows the output of the units utility)



Yes. So what? On every machine I work with, I install my own units.dat with a bunch of extra units and prefixes. Besides, as far as I remember, I ran units under AIX. I don’t know whether the AIX units knows anything about light milliseconds. Check the units.dat that ships with any Linux distribution today. It probably knows about light milliseconds (millilightseconds).



20. Of course, it’s very convenient to blame “lost notes” ...



Sure. And how many of your own notes from five years ago have you kept?



21. Anyway, this story is a fiction!



Answer this question: if we ignore the technical details, can a misconfigured mail server cause mail to be delivered to nearby recipients but not to distant ones? I think the answer is yes. In fact, I know the answer is yes, because it actually happened. But even if you disregard my experience and look at the question from the outside, I think it is still possible, although at first glance it seems implausible.



If you still have questions that I have not answered, email me at trey+500mi@lopsa.org. I will add your question to the FAQ and credit you as the author. But most likely I’ll just say “I don’t know, I don’t remember, and I don’t have the data to answer.”



22. The signature says that you are looking for work. Is this still relevant? (signature removed in Russian translation)



Not anymore, but thanks for this question!


