
Comment by jules

13 years ago

The difference in error between the first and the rest is ENORMOUS.

Task 1:

    1st 0.15315 (convolutional neural net)
    2nd 0.26172
    3rd 0.26979
    4th 0.27058
    5th 0.29576
    [...]

Differences:

    0.10857
    0.00807
    0.00079
    0.02518

As you can see, the first is way ahead of the rest. The difference between the 1st and 2nd is ~11%, between the 2nd and 3rd ~1%.
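For concreteness, the gaps above are just consecutive differences of the error rates copied from the leaderboard; a quick sketch in Python:

    # Task 1 error rates, best to worst, as listed above
    errors = [0.15315, 0.26172, 0.26979, 0.27058, 0.29576]

    # Gap between each entry and the one ranked just above it
    gaps = [b - a for a, b in zip(errors, errors[1:])]

    for rank, gap in enumerate(gaps, start=2):
        print(f"gap between #{rank - 1} and #{rank}: {gap:.5f}")

    # gap between #1 and #2: 0.10857  <- the CNN's lead over the field
    # gap between #2 and #3: 0.00807
    # gap between #3 and #4: 0.00079
    # gap between #4 and #5: 0.02518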

Task 2:

    1st 0.335463 (convolutional neural net)
    2nd 0.500342
    3rd 0.536474

Same story here.

But the most exciting thing is that the results were obtained with a relatively general-purpose learning algorithm. No extraction of SIFT features, no "Hough circle transform to find eyes and noses".

The points of the paper you cite are important concerns, but this result is still very exciting.

    the results were obtained with a relatively general-purpose learning algorithm. No extraction of SIFT features, no "Hough circle transform to find eyes and noses".

This deserves even more emphasis. All of the other teams were writing tons of domain-specific code to implement fancy feature detectors that are the result of years of in-depth research and the subject of many PhDs. The machine learning only comes into play after the manually coded feature detectors have preprocessed the data.
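To make the contrast concrete, a hand-engineered pipeline of that era might look roughly like the sketch below: SIFT descriptors quantized into a bag-of-visual-words histogram, then handed to a linear SVM. This is only an illustration of the general recipe, not any particular team's code, and names like train_images / train_labels are placeholders:

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import LinearSVC

    # Hand-engineered front end: detect SIFT keypoints and describe them.
    sift = cv2.SIFT_create()

    def bag_of_words_features(images, kmeans):
        """Turn each image into a histogram over a learned visual-word codebook."""
        feats = []
        for img in images:
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            _, desc = sift.detectAndCompute(gray, None)
            hist = np.zeros(kmeans.n_clusters)
            if desc is not None:
                for word in kmeans.predict(desc):
                    hist[word] += 1
            feats.append(hist / max(hist.sum(), 1))
        return np.array(feats)

    # The learning algorithm never sees raw pixels, only the hand-crafted
    # histograms produced above (placeholder variables, for illustration):
    # kmeans = KMeans(n_clusters=1000).fit(all_training_descriptors)
    # clf = LinearSVC().fit(bag_of_words_features(train_images, kmeans), train_labels)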

Meanwhile, the SuperVision team fed raw RGB pixel data directly into their machine learning system and got a much better result.
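By contrast, the raw-pixel approach collapses the whole pipeline into one learned model. A toy sketch in PyTorch (assuming PyTorch is available; this deliberately tiny net just shows the shape of the idea and is not the actual SuperVision/AlexNet architecture, which was far deeper and trained on GPUs):

    import torch
    import torch.nn as nn

    # A toy convolutional net that consumes raw RGB pixels directly.
    # The filters that respond to edges, textures, eyes, noses, etc. are
    # learned from data rather than hand-coded.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=5, stride=2, padding=2),  # raw RGB in
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.LazyLinear(1000),  # e.g. 1000 ImageNet classes
    )

    # The whole pipeline is one differentiable function: pixels -> class scores.
    x = torch.randn(1, 3, 224, 224)   # a batch of one 224x224 RGB image
    print(model(x).shape)             # torch.Size([1, 1000])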

Lol... my bad. I did not pay attention: I thought the error was in percentages. (I was comparing with MNIST and somehow assumed this was in percentages too.) Come to think of it, what that would mean is really dumb!!