Comment by bonoboTP
5 months ago
The term "machine vision" is mainly used in highly controlled, narrow industrial applications, think factory assembly lines, steel inspection, monitoring for cracks in materials, shape or size classification of items, etc. The task is usually very well defined, and the same thing needs to be repeated under essentially the same conditions over and over again with high reliability.
But many other things exist outside the "glue some GPT4o vision api stuff together for a mobile app to pitch to VCs" space. Like inspecting and servicing airplanes (Airbus has vision engineers who build tools for internal use; you don't have datasets of a billion images for that). There are also things like 3D motion capture of animals, such as mice or even insects like flies, which requires very precise calibration and proper optical setups. Or estimating the meat yield of pigs and cows on farms from multi-view images combined with weight measurements. There are medical applications, like cell counting, 3D reconstruction of facial geometry for plastic surgery, dentistry, and a million other things beyond chatting with ChatGPT about images, classifying cats vs dogs, or drawing bounding boxes around people in a smartphone video.
Thank you for your thoughtful comment! I completely agree.
It’s great to see someone emphasize the importance of mastering the fundamentals—like calibration, optics, and lighting—rather than just chasing trendy topics like LLMs or deep learning. Your examples are a great reminder of the depth and diversity in machine vision.
Thanks for the LLM response. Not sure if you meant to be clever here.
Your clever remark highlights poor emotional intelligence and weak communication skills. Sarcasm might have its place in casual conversation, but in professional discussions, it signals insecurity and a lack of respect—neither of which contribute to meaningful dialogue.
Your disdain for LLMs is equally puzzling. Are you seriously suggesting I shouldn’t use tools to improve my grammar and delivery simply because they don’t align with your engineering view? Ironically, LLM-based tools likely support your own work—whether through coding assistance, debugging, or other tasks—even if you choose not to acknowledge it.
By the way, I used an LLM to craft this reply too—who doesn’t?
Your disdain for LLMs is unfounded.
I use LLMs daily for coding. They are great. They are not a replacement for reading a book like the one linked here, or for understanding image formation, lenses, etc. Many people seem to imagine that all this stuff is now obsolete and that being a computer vision engineer nowadays just means wiring up some standard APIs and asking an LLM to glue the JSON together. Maybe even pros will self-deprecatingly say that, but after a bit of chatting it becomes obvious they have plenty of background knowledge beyond prompting vision-language models.
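To illustrate the kind of image-formation fundamentals I mean, here's a minimal pinhole-camera sketch in NumPy. The focal lengths and principal point are made-up illustrative values, not from any real camera; real calibration would also model lens distortion.

```python
import numpy as np

# Pinhole projection: map 3D points in the camera frame (metres) to pixel
# coordinates via the intrinsics matrix K. All parameters are assumed values.
fx, fy = 800.0, 800.0      # focal lengths in pixels (assumed)
cx, cy = 320.0, 240.0      # principal point, assumed 640x480 sensor
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def project(points_cam):
    """Project an Nx3 array of camera-frame points to Nx2 pixel coordinates."""
    p = points_cam @ K.T          # homogeneous image coordinates
    return p[:, :2] / p[:, 2:3]   # perspective divide by depth Z

pts = np.array([[0.0, 0.0, 2.0],    # point on the optical axis, 2 m away
                [0.1, 0.0, 2.0]])   # 10 cm to the right at the same depth
px = project(pts)
# The on-axis point lands at the principal point (320, 240); the second is
# displaced horizontally by fx * X/Z = 800 * 0.05 = 40 pixels, i.e. (360, 240).
print(px)
```

This is exactly the kind of reasoning (pixels per metre depends on focal length and depth) that underpins the calibration-heavy applications mentioned upthread, and that no amount of API gluing substitutes for.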
So it's not disdain; I'm simply trying to broaden the horizons of those who only know about computer vision from OpenAI announcements, tech news, and FOMO social media influencers.