Comment by pmelendez
13 years ago
I don't think this proves the superiority of any algorithm over another, just that the SuperVision team did a great job on task 1 and task 2. I would add two things: 1) There is a No Free Lunch theorem (http://en.wikipedia.org/wiki/No_free_lunch_theorem) which has been applied to pattern recognition too, and which states that there is no significant difference in performance between most pattern recognition algorithms.
2) Performance gains are far more likely to come from the choice of features being used, and that seems to be the case here.
Many comments expressed concern about the alleged inappropriateness of the title. Even the no-free-lunch theorem has been invoked, and words like SVM have been mentioned.
However: the original title, "Neural Networks officially best at object recognition", is much more appropriate than the current title, because this is by far the hardest vision contest. It is nearly two orders of magnitude larger and harder than other contests, which is why the winner of this contest is best at object recognition. The original title is much more accurate and should be restored.
Second, the gap between the first and the second entry is so obviously huge (25% error vs 15% error) that it cannot be bridged with simple "feature engineering". Neural networks win precisely because they look at the data and choose the best possible features. The best human feature engineers could not come close to a relentless, data-hungry algorithm.
Third, there was mention of the no-free-lunch theorem and of how one cannot tell which methods are better. That theorem says that learning is impossible on data that has no structure, which is true but irrelevant. What's relevant is that on the "specific" problem of object recognition, as represented by this 1-million-image dataset, neural networks are the best method.
Finally, if somebody makes SVMs deep, they will become more like neural networks and do better. Which is the point.
This is the beginning of the neural networks revolution in computer vision.
To nitpick at the math: "No free lunch" results are asymptotic in the sense that they necessarily hold over the _entire_ domain of whatever problem you're trying to solve. Obviously, algorithms will and do perform differently over the relatively few inputs (compared to infinity...) that they actually encounter. It's similar to undecidability: just because a problem is generally undecidable doesn't mean you can't compute it for certain subsets of input, and compute it reasonably well (for some definition of reasonable).
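To make the averaging concrete, here's a toy sketch (plain Python, all names mine, not from the comment above): enumerate every possible labeling of a few unseen inputs, and any fixed predictor, no matter how clever, averages exactly chance over that entire domain.

```python
from itertools import product

# Three unseen inputs; enumerate all 2^3 possible "true" labelings.
domain = [0, 1, 2]
labelings = list(product([0, 1], repeat=len(domain)))

# Two arbitrary fixed predictors (hypothetical, for illustration).
predictors = {
    "always-1": lambda x: 1,
    "x mod 2": lambda x: x % 2,
}

# Average each predictor's accuracy over ALL labelings of the domain.
results = {}
for name, f in predictors.items():
    hits = sum(f(x) == lab[i]
               for lab in labelings
               for i, x in enumerate(domain))
    results[name] = hits / (len(labelings) * len(domain))

print(results)  # every fixed predictor averages exactly 0.5
```

The point being: NFL's "all algorithms are equal" only holds when you average over all those labelings, most of which nature never generates.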
Agreed... I was in a rush to catch the train this morning and didn't have a chance to elaborate; I shouldn't have done that.
However, my point was that most of the algorithms mentioned in that link (ANN, SVM, etc.) have similar expressive power (VC dimension) and have been shown to perform similarly in object recognition.
People normally take advantage of their specific properties rather than paying too much attention to how well the algorithm will perform (since both SVMs and ANNs are expected to perform reasonably well). I still maintain my opinion that any difference in classification performance is more likely to be related to how the team managed the data than to the chosen algorithm.
Deep convolutional learning is the difference here, and it does seem to be an interesting architecture that the current state of the art only supports for ANNs. But that doesn't mean that somebody won't come up with a strategy for deep learning with SVMs or another classification technique in the future.
Although SVMs and layered neural nets have similar expressivity, the similarity is very much like Turing completeness, i.e. it can't tell the Haskells apart from the Unlambdas. SVMs express certain functions in a manner that grows exponentially with the input, whereas a deep learner tends to be more compact. The key to being a deep learner is using unsupervised learning to seed a hierarchy of learners that learn ever more abstract representations.
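A toy illustration of the depth-vs-flatness point (the standard parity example, stdlib Python only, not from the comment above): a layered circuit computes n-bit parity with a chain of n-1 XOR gates, while a flat depth-two sum-of-minterms representation of the same function needs 2^(n-1) AND terms.

```python
from itertools import product

def parity_deep(bits):
    """Layered circuit: a chain of XOR gates, n-1 gates for n bits."""
    acc = 0
    for b in bits:
        acc ^= b
    return acc

def parity_flat_terms(n):
    """Flat depth-2 (sum-of-minterms) representation: one AND term
    per odd-parity input, so the term count is 2^(n-1)."""
    return [p for p in product([0, 1], repeat=n) if sum(p) % 2 == 1]

for n in (4, 8, 12):
    print(n, "bits:", n - 1, "deep gates vs",
          len(parity_flat_terms(n)), "flat terms")
```

Same function, wildly different representation sizes; that's the sense in which shared expressivity, like Turing completeness, tells you little about compactness.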
Also, Multilayered Kernel learners already exist.
That's why they include which features they used, which is educational.
Isn't NFL utter crap?
When you average a learning algorithm's performance over a whole bunch of domains that _NATURE WILL NEVER GENERATE_, all algorithms are equally bad.
Paying attention to the theorem is mostly defeatist and counter-productive.
Imagine some ads serving company improves their learning algorithms 10% and is making 100s of millions more dollars. Are you going to say, well, there are billions of other possible universes in which they'd be losing money, they just got lucky that we don't live in those universes?
Actually, it does, since the difference in performance between entry #1 and entry #2 is so huge (25% error vs 15% error!), and since this is by far the hardest computer vision challenge yet!
Sorry to disagree, but it seems more related to the fact that they are using deep convolutional learning than to the neural network itself. If you use an ANN with the same set of features side by side with an SVM, you will see very similar results.
I would be more inclined to agree with a title like "Deep convolutional learning outperforms traditional techniques in object recognition".
Yeah, if you used the same raw RGB features for the SVM as for the neural net, then the neural net would blow the SVMs away even more utterly.
Re #2, automatic, optimal feature selection is one of the touted advantages of neural networks (usually with the caveat that it doesn't work so well in practice).
Have to agree that this doesn't prove anything. It is only one local contest. The title is very misleading.