
Comment by _bin_

10 months ago

I actually went over this recently with someone who wanted to build something similar. The conclusion was that this is a very difficult problem to solve, probably intractable to some extent. You can't see the complete composition of the food with a standard camera. E.g., I make a salad which is maybe 300 calories. Then I sprinkle some croutons and bacon on top, which will mostly end up in the middle. Then I put dressing on it, and the amount is hard to estimate. That dressing hides the bacon and croutons and, since it contains a lot of oil, could seriously skew the measurement one way or the other. Now I mix it all around and the AI can't tell how much dressing was used at all.

I pick this example because I've seen specifically this cause problems for people trying to lose weight. They think they're eating a salad, not realizing they've thrown an extra 500 calories on top.
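
To put rough numbers on that, here's a quick sketch using only the figures from the example above (a ~300 calorie salad plus ~500 calories of toppings and dressing). It assumes the worst case, where the estimator counts only the base salad and misses everything on top; the function name is just something I made up for illustration.

```python
# Figures from the salad example above; both are rough estimates.
BASE_SALAD_KCAL = 300       # what the salad "looks like"
HIDDEN_TOPPINGS_KCAL = 500  # croutons, bacon, dressing

def underestimate_fraction(visible_kcal: float, hidden_kcal: float) -> float:
    """Fraction of the true total missed if only the visible part is counted."""
    total = visible_kcal + hidden_kcal
    return hidden_kcal / total

missed = underestimate_fraction(BASE_SALAD_KCAL, HIDDEN_TOPPINGS_KCAL)
total = BASE_SALAD_KCAL + HIDDEN_TOPPINGS_KCAL
print(f"True total: {total} kcal, missed: {missed:.1%}")
# prints "True total: 800 kcal, missed: 62.5%"
```

In other words, under that assumption the estimate would miss well over half of what's actually in the bowl.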

Another case: I sit down to breakfast, having made myself eggs and toast. One of the largest contributors to my calorie intake, if not the largest, will be the amount of butter on my toast. If I use four pats, that will probably exceed my calorie intake from the eggs. If I use one, not as much. I sincerely doubt it's realistic to tell the difference with any sort of precision.

I'm in the calorie identification app business, and one benchmark I like to use is the old milkshake salad. It LOOKS like a salad with some mystery white sauce, maybe a ranch dressing or some such, but it's actually a milkshake. Hah! Gotcha. Humans: 3, AI: 1.

  • How can that benchmark be useful unless real people are actually eating some lettuce covered in milkshake?

    Wait, ARE people eating that? Have I been out of the game too long?