Comment by windexh8er

13 years ago

So this is one of those times that I find myself torn between labels.

A little background... I was brought up in the network ranks. I worked as a network / sys admin in high school and ended up working for an ISP as a junior network engineer in college (while attending one of the first Cisco NetAcad baccalaureate programs, which combined network study with Cisco curriculum and certifications). For the past 10+ years I've worked in every major vertical: government, finance, healthcare, retail, telecom, etc. I always tell clients and potential employers that a network background gives me somewhat of an edge in the industry I primarily focus on, security, and I still study for and take Juniper & Cisco tests and work on labs just to stay current. Most software devs and security folks I've run into are overconfident that they truly understand IP from a debugging and troubleshooting standpoint (keep in mind there are a lot of really good folks who have a better grasp of networking than a lot of seasoned engineers do).

Case in point: I interviewed for a "Network Architect" position with a very well known online backup company (think top 4). The interview was the most bizarre I've ever had, not because it spanned more than 5 interviews, but because every time they posed a complex network problem it was generally solvable within 5 to 10 minutes of pointed questions. The software dev who was interviewing me was baffled that I could so quickly reach a reasonable solution that had, in some cases, taken them over a week. It was pretty simple: 1) I'd seen something similar and 2) that's what I studied and have had a passion for over the course of 20+ years (since I found the Internet in 1991).

Most of the time when I run across a "magical" problem it's because someone hasn't looked at it from L1 up. As this article showcases, you generally have two angles to approach the stack from: application back down to physical, or the inverse. Having been in network support, I can say that by the time you get a problem like this it's often so distorted with crazy outliers that have nothing to do with the problem that your best bet is to start from L1 and go back up through the stack. Reading into the problem the author describes, I think some key data was missed and/or misinterpreted. There most surely would have been key indicators in TCP checksum errors, and that was glossed over pretty lightly in the explanation. It's interesting that those items of interest are often cast aside when digging into something like this. Nobody in this thread has pointed out that a bit error test, or even something as simple as iperf or similar, would have been able to more accurately showcase/reproduce the problematic network condition.
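To make the checksum point concrete, here is a minimal sketch of the 16-bit one's-complement checksum that TCP uses (the RFC 1071 algorithm), plus a crude bit-error counter in the spirit of a bit error test. It is not taken from the article; the function names (`internet_checksum`, `flip_bit`, `count_bit_errors`) and the test pattern are made up for illustration, and a real TCP checksum also covers a pseudo-header that is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum (RFC 1071 algorithm)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (simulated line corruption)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def count_bit_errors(sent: bytes, received: bytes) -> int:
    """Count differing bits between a known pattern and what arrived."""
    return sum(bin(a ^ b).count("1") for a, b in zip(sent, received))


# Hypothetical test pattern: 1024 bytes of known data.
payload = bytes(range(256)) * 4
sent_sum = internet_checksum(payload)

# A single flipped bit changes the checksum, so the receiver drops the segment.
damaged = flip_bit(payload, 1234)
print("checksum mismatch:", internet_checksum(damaged) != sent_sum)
print("bit errors:", count_bit_errors(payload, damaged))

# The checksum is weak: two compensating flips in different words cancel out.
a = b"\x00\x01\x00\x00"
b = b"\x00\x00\x00\x01"
print("same checksum:", internet_checksum(a) == internet_checksum(b))
print("bit errors:", count_bit_errors(a, b))
```

The last case is why comparing a known pattern bit for bit is a more direct way to expose a flaky path than relying on checksum failures alone: some corruption passes the checksum and only shows up further up the stack.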

But back to the labels remark - I don't believe, as some people have said, that this is largely a DevOps role. I don't mean to cut down DevOps folks, because I think, at some level, if you're a jack-of-all-trades in any org then that's your role; it is what it is. However, this is a problem best suited to a professional network engineer, and you don't see much of that need in the startup space until people get into dealing with actual colo / DC type environments. Otherwise the network is often very simple and not architected with significant depth or specific use cases in mind.

Long story short: network professionals are worth the money when it comes to the design, build, and fix of issues that may seem complex to others but can be found or solved in minutes when you know what you're looking at. That being said, I'm impressed that the OP dug into it far enough to be able to ask a specific person (who was probably a network engineer / tech of some level) to validate his claim and fix the problem.
