Comment by nickpsecurity

10 years ago

"cited in my 2009 USENIX paper"

Oh hell: didn't realize I was talking to the inventor himself. Pleasant surprise. My poor memory might be getting mixed up here on timeline. Huge dissertation people kept giving me was 2012: didn't realize you were trying to push this stuff as far back as 2007. You may have been a bit ahead of some & the problem I describe was working in reverse: your info not getting to them. ;) Additionally, DDE was the newer effort. The old one I tried to remember was this:

https://www.usenix.org/legacy/event/osdi04/tech/full_papers/...

This is the bell that rang in my ear when people were first telling me about how Rump Kernels let you reuse drivers. By 2006, a German paper at Dresden had similarly reused FreeBSD drivers for disks or something. The same team continued working to build all kinds of L4 stuff and I/O virtualization, resulting in DDE around 2010-2011 range. An advantage of DDE, which yours might have too (dissertation on my backlog), is that you can use just components needed for your driver's needs. OKL4 used such tech in their "microvisor" platform with these results: native drivers on L4 kernel for minimal or security-focused deployment; minimal device wrapper for Linux drivers for efficiency; use of drivers in a full VM; virtual stubs to any of this for clients. So, depending on the When, the What of your work might have benefited from any lessons learned during these projects or even communications with group members. If their existence/accomplishments had reached you.

So, that's what was in my head. May have helped, may have not. Rather than evolution, your work sounds like independent re-invention of one capability and new invention in others. That happens, too, even in my own investigations. I got more confused as I just looked at the 2009 paper as it talks about a lot more than driver reuse. I was going to drop your example from my meme (or reverse it) but I might need to straight up read your book before I talk about the subject more. I think what a lot of people describe to me online about Rump Kernels and what your paper says are a bit different. Misinformation could've thrown off my opinion.

"I do know that even my first [rejected] paper on rump kernels from 2007 stressed the idea of not forking."

That's a key advantage of your work and worth several different lines of research. Whether academics like it or not, it was a good idea. Good job focusing on that hard problem as few even tried.

" I don't know if me being ignorant is more embarrassing for myself or the general computing community."

It could be you or them, as I've written, but my meme about leveraging prior work is not about you being ignorant: it's how our field tracks and leverages what we've learned. I know people think of what I think before me so I searched "re-use device drivers" back when idea came to me. Gave me 2004 paper. Many papers I've found have great "related work" sections with references that teach me plenty. Some were clearly a quick Google and summary that don't reflect field's accomplishments. So, culture of institution plays into it. Past that, the stuff is so spread out in so many Universities, conferences, ACM/IEEE, web articles, DEFCON's, etc it's nearly impossible to track it all unless one is obsessively dedicated like myself. ;) But the tradeoff was that I couldn't be implementing in code as many of my ideas like you did, eh?

So, I think the issue here is that good ideas were brewing in several different places yet without a way to connect them and academia as a whole not really pushing for that. I keep trying to determine a solution but it's tricky. Recurring concept is to have a non-profit collection (eg Citeseerx) of most of the stuff with tags (eg "driver re-use") and good search functionality to encourage serendipity. Plus, a series of forums that only allow people doing research, coding things, other proven capabilities, or referrals by the same. Then, you see someone's work, you can contact them directly or start a forum post related to their work that notifies them. Encourage conversations that work out hard problems or prevent wasteful re-work. Public can read them but not comment unless approved by moderation as useful. Goal is to keep quality at pro or aspiring pro level. Tricky to get this going, for sure, but I'm trying to work out the concept itself before worrying about popularity too much. Thoughts on this scheme or others?

"Yes, I absolutely believe that drivers should be separate from the OS layer. Them being bundled together is a (pre)historic mistake. I agree that we would be in a much better place if projects like OSKit and DDEKit had gotten attention early on."

Basically my point. Again, not a gripe at you or Rump kernels: just how stuff stays obscure. Your original papers actually fit into that category now that we've talked about them. Least yours got out there and took off. Great work on both the tech and making it a success. :)

Yea it took a while. I often joke that it took two weeks to get things working, and four years to make it maintainable. In reality, the first version did take two weeks, but maintainability grew with completeness and correctness. It also took several years to get people to stop telling me I've reinvented FUSE (which I did, but not with rump kernels ;). If you're interested, I recently touched on the early history and original motivation in this blog post: https://blog.xenproject.org/2015/08/06/on-rump-kernels-and-t... I am absolutely certain that we'd not be having this discussion if kernel driver development had not been the original motivation. That's the buy-in for the upstream OS.

See the community link at rumpkernel.org if you want more non-false information. One of the problems with "related work" is that you're trying to find some difference to your own work to justify the publication of your paper, and all that without fully understanding the related work. So you tend to make goal-driven assumptions, and hence "related work" generally tends to be more wrong than right. The exception is really well-known work, because mis-assumptions about it won't get through the PC. But then again, telling everyone what they already know is not that useful, is it? At least for myself, writing "related work" was always the hardest bit.

btw, my papers are somewhat obsolete, the dissertation is still mostly accurate, even if the use case descriptions are out-of-date. So, I'm sure someone could 100% accurately cite an old paper of mine, and still completely misportray the current state of the art, because there's no "obsolete by" for papers.

I never saw the LeVasseur paper as being in the same category as e.g. DDE. It was one of those "don't fix the OS, just throw (virtual) hardware at it" approaches. Not saying it's wrong or right, just completely different IMO. Though, if you're trying to get rid of the OS -- like we are now, but not back then -- it's a weird approach ;)

Not sure if you can manually manage a research database. Are we really not able to do that automatically in 2015, or is it just a question of nobody building the right kind of crawler/searcher? After all, a manually managed index is a fork of the actual information. The other problem is that most research tends to be conducted with a "graduate-and-run" method. The professor might have a more holistic vision, but the professor lacks a) time to engage in such discussion b) a grass-roots understanding. But if it can be made to work, would be quite valuable.

  • " If you're interested, I recently touched on the early history and original motivation in this blog post"

    Appreciate the write-up. Actually a good read. I did see a convergence between you and Dresden in this:

    " the external dependencies on top of which rump kernels run were discovered to consist of a thread implementation, a memory allocator, and access to whatever I/O backends the drivers need to access"

    Their papers from 2006-2010 on pushing stuff into user-mode for L4 kernels kept mentioning the same three things albeit with work. Shows that other academics trying to improve status quo in kernel-user mode tech should continue to work on making those things easier to understand, modify, debug, integrate, and so on. They keep popping up as critical to unrelated projects [for obvious reasons but still].

    " Since this is the Xen blog, we should unconventionally understand ASIC to stand for Application Specific Integrated Cloud."

    I really wish the community didn't re-invent ASIC's definition. ASC would've done nicely. This is going to screw up search results for people researching either and filtering via titles/abstracts. I know because one threw me off for around 5-10 minutes because I skipped the abstract assuming it meant a chip and was very confused with their findings lol. "How the hell did they implement Rump Kernels on an ASIC? Where's the chip boundary in this diagram? Are these PCI devices in x86 servers?" It will get worse in near future as much cutting-edge cloud stuff is on FPGA's or leverage ASIC's. I know: I gotta live with it. Just annoying as hell...

    " the ability to run unmodified POSIX-y software on top of the Xen hypervisor via the precursor of the Rumprun unikernel was born. "

    Speaks to what a good job you did on your end. Back to the HN conversation, though.

    "all that without fully understanding the related work. So you tend to make goal-driven assumptions, and hence "related work" generally tends to be more wrong than right. "

    Interesting. I'll try to keep that in mind when reading seemingly bad ones in the future.

    "btw, my papers are somewhat obsolete, the dissertation is still mostly accurate"

    I was going to read... thoroughly skim... the dissertation anyway. Thanks for the tip, though, cuz I could've gotten lazy and read the paper instead. ;) I'll just use dissertation and web site when I get around to trying to learn this stuff.

    "I never saw the LeVasseur paper as being in the same category as e.g. DDE. "

    The connection is that the paper tried to reuse unmodified drivers with new OS's and clients (eg stand-alone apps). Implied one could even use several OS's if one supported hardware X and the other hardware Y. That's what DDE did, albeit differently, along with what people told me you did with again a different implementation strategy. That's the only connection. I mean, don't you at some point have a client (user-mode app, unikernel on Xen) use a stub/function in one space that gets redirected to code in NetBSD to execute it against the hardware? Seems like a similarity. However, this conversation has shown yours to be much more advanced and portable in design/implementation.

    Similarities ended up being the main goal (driver/feature reuse), several areas of implementation (dependencies), hooking into a major BSD/Linux, and turning that into something supporting lightweight VM's. Past that, your work is totally in its own category given its specifics and the no-fork focus. Congratulations on being original in a space where that's uncommon. And the History indicates you got to originality by focusing on not being original (reusing code). Oh the ironies of life!

    "Not sure if you can manually manage a research database. Are we really not able to do that automatically in 2015, or is it just a question of nobody building the right kind of crawler/searcher? "

    You probably can. Datamining is outside of my expertise, though. I have over 10,000 papers on software engineering, formal verification, INFOSEC, tech like yours, etc. Quite a few are obscure despite totally solving a category of problem. That's a start. Maybe a tech such as the open-source, DARPA-sponsored DeepDive can be used to sort it. Gives side benefit of overly-paranoids running screaming when they see "Powered by DARPA tech" in fine print under search. :O

    http://deepdive.stanford.edu/

    "The other problem is that most research tends to be conducted with a "graduate-and-run" method. The professor might have a more holistic vision, but the professor lacks a) time to engage in such discussion b) a grass-roots understanding."

    Maybe have a solution to that. My collection was built and individual works evangelized with no participation on those groups' part. Matter of fact, some were very hard to find with me adding a few gems from late 90's just this month. The common denominator is that a description and PDF are published somewhere accessible to the public. If it's a group (eg CS department), then it's even easier to manually or automatically pull the research if they simply have a dedicated page for publications. If students don't care & professors do, then professors might be willing to send in interesting work with pre-made tags, etc. Takes little time. Could even make students do it as part of requirements with a list of tags on web-site with suggestion capability for ease of use. Others digging through the database with motivation to build can pick up abandoned ideas.
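    A rough sketch of the tagged-collection idea, only to make the concept concrete. Everything here is an assumption for illustration: the `Paper` and `PaperIndex` names, the fields, and the in-memory storage are hypothetical, not an existing service or anything Citeseerx does.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Paper:
    """One submitted work: a description plus a publicly reachable link."""
    title: str
    year: int
    url: str  # hypothetical: wherever the description/PDF is published
    tags: frozenset = field(default_factory=frozenset)


class PaperIndex:
    """Tiny in-memory tag index (hypothetical; a real one needs storage)."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, paper):
        # Index the paper under each of its tags, case-insensitively.
        for tag in paper.tags:
            self._by_tag[tag.lower()].add(paper)

    def search(self, *tags):
        """Return papers carrying every requested tag, oldest first."""
        if not tags:
            return []
        sets = [self._by_tag.get(t.lower(), set()) for t in tags]
        hits = set.intersection(*sets)
        return sorted(hits, key=lambda p: (p.year, p.title))
```

    Searching oldest-first is a deliberate choice in this sketch: the point of the collection is to surface prior work on a tag such as "driver re-use" before anyone rebuilds it.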

    As Jeremy Epstein at NSF told me, the biggest problem will probably be getting buy-in from schools outputting high-quality stuff. Without critical mass, it's unlikely to go anywhere. However, I fight with myself over whether to push it anyway given that something good might come from it, much like Citeseerx's passive collection. Even if only a few use it, something really good might come out of it and I'd hate to waste that potential. Internal struggle on that one as idealism and pragmatism rarely go in same direction in this field.

    Anyway, glad you like the concept. Still getting 3rd party input before going all out with it.

    • The community didn't reinvent "ASIC". It was my joke, but humour is difficult ... Read the conclusions of my dissertation. Then realize that the cloud is just one form of special-purpose hardware. Then be me trying to tongue-in-cheek claim that I foresaw the potential of rump kernels on the cloud 4 years earlier. Then maybe the joke will be funny. If not, well, you don't get your money back, sorry, your only consolation is that I rarely use the same joke twice.

      I still don't see too much similarity between DDE and using full-OS VM's to act as DDE backends. It's like observing that unikernels and traditional timesharing operating systems can both run applications, so they're similar. Yes, but ... Anyway, I understand what you mean, disagree, and don't think it's worth debating further.
