Comment by elorant

12 hours ago

I do a lot of work that is based on academic research, namely building a proprietary sparse embedding model. My issue with academia is that they don’t bother to solve the practical issues. They tell you how to build a PPMI model, but what about hitting a database that’s 500TB to find co-occurrence numbers? That isn’t even touched on, so you then have to go and invent a bazillion algorithms yourself to make your life easier. So while the bedrock is academic research, and we thank them for that, scaling anything requires a lot of work in uncharted territory.
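
For reference, the PPMI computation itself is the easy part the papers do cover. A minimal sketch, assuming the co-occurrence counts already fit in memory as a plain dict (the function name `ppmi` and the toy counts are made up for illustration, and nothing here addresses the 500TB case):

```python
import math
from collections import Counter

def ppmi(cooc):
    """Positive PMI from a dict mapping (word, context) -> co-occurrence count."""
    total = sum(cooc.values())
    word_totals = Counter()
    ctx_totals = Counter()
    for (w, c), n in cooc.items():
        word_totals[w] += n
        ctx_totals[c] += n

    scores = {}
    for (w, c), n in cooc.items():
        # PMI = log2( P(w, c) / (P(w) * P(c)) ), written in terms of raw counts
        pmi = math.log2(n * total / (word_totals[w] * ctx_totals[c]))
        if pmi > 0:  # keep only positive values, which is what keeps the model sparse
            scores[(w, c)] = pmi
    return scores

# Hypothetical toy counts, not real data
toy_counts = {
    ("cat", "purr"): 8,
    ("cat", "the"): 2,
    ("dog", "bark"): 8,
    ("dog", "the"): 2,
}
scores = ppmi(toy_counts)
```

On the toy counts, the pairs with "the" drop out because their PMI is not positive. The hard part is producing `cooc` and the marginal totals when the source data does not fit on one machine, and that is the part this sketch assumes away.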

Well, yeah. That's why we have "research & development" as a term.

What you're referring to is the "development" part of that. In some sense, the job you have _exists precisely because it's not part of the research phase_, and it's just as valuable as the research part. Research is the proof of concept; development is scaling it up, making it production-ready, finding small efficiencies, and so on.

From an industry perspective, it's tempting to conflate these, because that's what industry research labs are designed to do: integrated R&D. But that is not at all how academic research labs work.

But that isn't the purpose of academia. Its purpose is to discover new phenomena, not to make products. It is true that it takes a lot of work to turn a new advance into a product, whether that means software or turning biological knowledge into a drug, but without the discovery of new phenomena, new products will come to a halt. Some corporate labs, most famously Bell Labs in its heyday but also, for example, IBM's T.J. Watson and Xerox's PARC, did do basic research alongside product-focused work. This is pretty rare, though, because it is hard to justify the cost of something that may only become practical in decades and often helps your competitors as much as yourself.

> My issue with academia is that they don’t bother to solve the practical issues. They tell you how to build a PPMI model, but what about hitting a database that’s 500TB to find co-occurrence numbers?

Soon we will also blame academia for not providing iOS and Android apps.

I jest, but database design is its own subfield of computer science. Maybe look into their papers?

  • I did that too. Ended up building my own reverse index with a fixed-size vocabulary. But that's my issue: you start building one product and you end up building ten in the process to solve all the edge cases, because no one bothered to research how things scale.

The practical issue of academia is epistemological. It's about learning how a phenomenon came to exist. If you are looking for efficiency, the field of academia that studies it is computational complexity, and it works quite well.

The goal of academia isn't to be practical, "only" to learn.