Comment by hedgehog
1 day ago
Once you start looking at the world through the lens of the frequency domain, a lot of neat tricks become simple. I have some demo code that uses a Fourier transform on webcam video to read a heart rate off a person's face, basically looking for which frequency holds the peak energy.
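For anyone curious, the core of it fits in a few lines. A minimal sketch, not my actual demo: the hard-coded ROI, frame rate, and capture length below are placeholder assumptions, and real code would track the face and detrend the signal.

```python
# Average the green channel over a (here, hard-coded) face region each
# frame, then find the dominant frequency in the plausible heart-rate band.
import cv2
import numpy as np

FPS = 30          # assumed camera frame rate
SECONDS = 30      # capture window

cap = cv2.VideoCapture(0)
samples = []
for _ in range(FPS * SECONDS):
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[100:300, 200:400]        # placeholder face region
    samples.append(roi[:, :, 1].mean())  # green channel carries most of the pulse signal
cap.release()

signal = np.asarray(samples) - np.mean(samples)
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1.0 / FPS)

# Only consider frequencies in a plausible heart-rate range (42-180 BPM).
band = (freqs > 0.7) & (freqs < 3.0)
peak = freqs[band][np.argmax(spectrum[band])]
print(f"Estimated heart rate: {peak * 60:.0f} BPM")
```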
>Once you start looking at the world through the lens of frequency domain a lot of neat tricks become simple.
Not the first time I've heard this on HN. I remember a user commenting once that it was one of the few perspective shifts in his life that completely turned things upside down professionally.
It's effectively the underpinning of all modern lossy compression algorithms. The DCT, which underlies codecs like JPEG, H.264, and MP3, is really just a modified Fourier transform.
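To see why the DCT is such a good fit for compression, here's a toy sketch (assuming scipy, and a flat quantizer instead of JPEG's perceptual tables): most of a smooth block's energy lands in a handful of low-frequency coefficients, so coarse quantization zeroes the rest at little visual cost.

```python
# Toy illustration of DCT energy compaction, the property JPEG exploits.
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
# A smooth gradient block with a little noise, like typical image content.
block = np.outer(np.linspace(50, 200, 8), np.ones(8)) + rng.normal(0, 5, (8, 8))

coeffs = dctn(block, norm="ortho")
quantized = np.round(coeffs / 20) * 20   # crude uniform quantizer
print("nonzero coefficients:", np.count_nonzero(quantized), "of 64")

restored = idctn(quantized, norm="ortho")
print("max reconstruction error:", np.abs(restored - block).max())
```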
Inter/intra-prediction is more important than the DCT. H.264 and later codecs use simpler, degenerate integer forms of it, because that's good enough and the transform can be defined with bit-exact accuracy.
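The 4x4 core transform from H.264 is small enough to write down. A sketch of applying it (the scaling that's normally folded into quantization is omitted here):

```python
# The H.264 4x4 core transform: an integer approximation of the DCT that
# needs only adds and shifts, so encoder and decoder agree bit-for-bit.
import numpy as np

C = np.array([[1,  1,  1,  1],
              [2,  1, -1, -2],
              [1, -1, -1,  1],
              [1, -2,  2, -1]])

def forward_transform(block):
    # Y = C X C^T, all integer arithmetic; normalization is folded
    # into the quantization step in the real codec.
    return C @ block @ C.T

print(forward_transform(np.arange(16).reshape(4, 4)))
```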
There is also a loose analogy with finance: act (trade) when prices cross a certain threshold, not after a specific time.
I don't think pulsing skin (due to blood flow) is visible from a webcam though.
Plenty of sources suggest it is:
https://github.com/giladoved/webcam-heart-rate-monitor
https://medium.com/dev-genius/remote-heart-rate-detection-us...
The Reddit comments on that second one have examples of people doing it with low quality webcams: https://www.reddit.com/r/programming/comments/llnv93/remote_...
It's honestly amazing that this is doable.
My dumb ass sat there for a good bit looking at the example in the first link thinking "How does a 30-60 Hz webcam have enough samples per cycle to know it's 77 BPM?". Then it finally clicked in my head: beats per minute are indeed not to be conflated with beats per second. 77 BPM is only about 1.3 Hz, comfortably under the 15 Hz Nyquist limit of even a 30 Hz camera... :).
Non-paywalled version of the second link https://archive.is/NeBzJ
MIT was able to reconstruct speech by filming a bag of chips with a 60 FPS camera. I'd hesitate to put a limit on how much information can leak through.
https://news.mit.edu/2014/algorithm-recovers-speech-from-vib...
In high school I befriended the guy who built a Tesla coil. For his next trick he was building a laser to read sound off plate glass. The decoder was basically an AM radio, which high-school me found slightly disappointing.
It is; I've done it live on a laptop and via the front camera of a phone. I actually wrote this thing twice, once in Swift a few years back and again in Python more recently, because I wanted to remember the details of how to do it. Since a few people seem surprised this is feasible, maybe it's worth posting the code somewhere.
You would be surprised by The Unreasonable Effectiveness of cv2.calcOpticalFlowPyrLK.
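A sketch of what that looks like, assuming the default parameters are good enough (a real pulse pipeline would filter and detrend the trajectory before the FFT):

```python
# Pyramidal Lucas-Kanade tracking with OpenCV: pick corner features in the
# first frame, then track their sub-pixel motion frame to frame. The tiny
# vertical oscillations of head/skin features feed motion-based pulse
# estimation.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50,
                              qualityLevel=0.01, minDistance=10)

trajectory = []
for _ in range(300):
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good = status.ravel() == 1
    trajectory.append(new_pts[good, 0, 1].mean())  # mean vertical position
    prev_gray, pts = gray, new_pts[good].reshape(-1, 1, 2)
cap.release()
# trajectory is now ready for the same FFT peak-picking as above.
```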
Which is a special case of mathematics.
It is, but there's a lot of noise on top of it (in fact, the noise is kind of necessary; it acts as dither that keeps the tiny signal from being 'flattened out' by quantization and disappearing). The fact that it covers a lot of pixels and is relatively low bandwidth is what allows for this kind of magic trick.
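A quick synthetic check of that averaging argument (all the numbers here are made up for illustration): averaging N noisy pixels knocks the noise down by roughly sqrt(N), enough to make a sub-noise-floor pulse the dominant spectral peak.

```python
# A weak shared 1.2 Hz component buried in per-pixel noise becomes the
# clear spectral peak once thousands of pixels are averaged.
import numpy as np

fps, seconds, n_pixels = 30, 20, 10_000
t = np.arange(fps * seconds) / fps
pulse = 0.2 * np.sin(2 * np.pi * 1.2 * t)                 # ~72 BPM, tiny amplitude
rng = np.random.default_rng(1)
pixels = pulse + rng.normal(0, 5.0, (n_pixels, t.size))   # noise dwarfs the signal

mean_signal = pixels.mean(axis=0)                         # noise std now ~0.05
freqs = np.fft.rfftfreq(t.size, d=1.0 / fps)
spectrum = np.abs(np.fft.rfft(mean_signal - mean_signal.mean()))
print(f"peak at {freqs[spectrum.argmax()] * 60:.0f} BPM")  # ~72
```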
The frequency resolution must be pretty bad, though. You need a full minute of samples for a resolution of 1/60 Hz, i.e. 1 BPM. Hopefully the heart rate stays constant during that minute.
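The bin spacing falls straight out of the FFT: resolution is 1/T Hz for a T-second window, regardless of frame rate. A sketch:

```python
# FFT bin spacing is 1/T Hz, so in BPM terms a 10 s window resolves
# heart rate only to the nearest 6 BPM, a 60 s window to 1 BPM.
import numpy as np

fps = 30
for seconds in (10, 30, 60):
    n = fps * seconds
    df = np.fft.rfftfreq(n, d=1.0 / fps)[1]   # spacing between bins, Hz
    print(f"{seconds:>3} s window -> {df * 60:.1f} BPM resolution")
```

In practice the tradeoff is looser than it looks: you can interpolate the spectral peak between bins, or track the peak across shorter overlapping windows.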
It totally is. Look for motion magnification in the literature for the start of the field, and then remote PPG (photoplethysmography) for more recent work.
Sure it is. Smart watches even do it using the simplest possible "camera": an LED paired with a photodiode.
I have seen apps that use this principle for HRV: finger pressed against the phone camera.
You can do it with infrared, and webcams pick up some of it, but I'm not sure they're sensitive enough for that.