I don't think this proves the superiority of any algorithm over another, just that the SuperVision team did a great job on task 1 and task 2. I would just add two things: 1) There is a No Free Lunch theorem (http://en.wikipedia.org/wiki/No_free_lunch_theorem) that has been applied to pattern recognition too, and it states that there is no significant difference in performance between most pattern recognition algorithms.
2) Performance gains are far more likely to come from the choice of features being used, and that seems to be the case here.
Many comments expressed concern about the alleged inappropriateness of the title. Even the no-free-lunch theorem has been invoked, and words like SVM have been mentioned.
However: the original title, "Neural Networks officially best at object recognition", is much more appropriate than the current title, because this is by far the hardest vision contest. It is nearly two orders of magnitude larger and harder than other contests, which is why the winner of this contest is best at object recognition. The original title is much more accurate and should be restored.
Second, the gap between the first and the second entry is so obviously huge (25% error vs 15% error) that it cannot be bridged with simple "feature engineering". Neural networks win precisely because they look at the data and choose the best possible features. The best human feature engineers could not come close to a relentless, data-hungry algorithm.
Third, there was mention of the no-free-lunch theorem and of how one cannot tell which methods are better. That theorem says that learning is impossible on data that has no structure, which is true but irrelevant. What's relevant is that on the "specific" problem of object recognition, as represented by this 1-million-image dataset, neural networks are the best method.
Finally, if somebody makes SVMs deep, they will become more like neural networks and do better. Which is the point.
This is the beginning of the neural networks revolution in computer vision.
To nitpick at the math: "No free lunch" results are asymptotic in the sense that they necessarily hold over the _entire_ domain of whatever problem you're trying to solve. Obviously, algorithms will and do perform differently over the relatively few inputs (compared to infinity...) that they actually encounter. It's similar to undecidability: just because a problem is generally undecidable doesn't mean you can't compute it for certain subsets of input, and compute it reasonably well (for some definition of reasonable).
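To make that concrete, here is a toy simulation (my own sketch in Python/numpy, not from the article). Averaged over every conceivable labeling of the unseen inputs, the two learners tie exactly; on the one structured labeling that actually tends to occur, they differ wildly:

    import itertools
    import numpy as np

    X_train = np.array([0, 1, 2, 3])   # inputs we have seen
    y_train = np.array([0, 0, 1, 1])   # their labels
    X_test = np.array([4, 5])          # unseen inputs

    def threshold_learner(X, y):
        # pick the threshold that best separates the training labels
        best_t = min(range(6), key=lambda t: np.sum((X >= t) != y))
        return lambda x: (x >= best_t).astype(int)

    def always_zero_learner(X, y):
        # a deliberately dumb baseline that ignores the data
        return lambda x: np.zeros_like(x)

    accuracies = {"threshold": [], "always-zero": []}
    # enumerate every possible labeling of the unseen inputs
    for y_test in itertools.product([0, 1], repeat=len(X_test)):
        y_test = np.array(y_test)
        for name, learner in (("threshold", threshold_learner),
                              ("always-zero", always_zero_learner)):
            predict = learner(X_train, y_train)
            accuracies[name].append(np.mean(predict(X_test) == y_test))

    for name, accs in accuracies.items():
        print(name, "average accuracy over ALL labelings:", np.mean(accs))
    # Both print 0.5 -- the no-free-lunch average. On the structured labeling
    # y_test = [1, 1], the threshold learner scores 1.0 and always-zero scores 0.0.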
Agreed... I was in a rush to catch the train this morning and didn't have a chance to elaborate; I shouldn't have done that.
However, my point was that most of the algorithms used in that link (ANN, SVM, etc.) have similar expressive power (VC dimension) and have been shown to perform similarly on object recognition.
People normally take advantage of their specific properties rather than paying too much attention to how well the algorithm will perform (since both SVMs and ANNs are expected to perform reasonably well). I still maintain my opinion that any difference in classification performance is more likely related to how the team managed the data than to the chosen algorithm.
Deep convolutional learning is the difference here, and it does seem to be an interesting architecture that the current state of the art only supports for ANNs. But that doesn't mean somebody won't come up with a strategy for deep learning with SVMs or another classification technique in the future.
Isn't NFL utter crap?
When you average a learning algorithm's performance over a whole bunch of domains that _NATURE WILL NEVER GENERATE_, all algorithms are equally bad.
Paying attention to the theorem is mostly defeatist and counter-productive.
Imagine some ad-serving company improves their learning algorithms by 10% and is making hundreds of millions more dollars. Are you going to say, well, there are billions of other possible universes in which they'd be losing money, and they just got lucky that we don't live in those universes?
Actually, it does, since the difference in performance between entry #1 and entry #2 is so huge (25% error vs 15% error!), and since this is by far the hardest computer vision challenge yet!
Sorry to disagree, but it seems more related to the fact that they are using deep convolutional learning than to the neural network itself. If you use an ANN with the same set of features side by side with an SVM, you will see very similar results.
I would be more inclined to agree with a title like "Deep convolutional learning outperforms traditional techniques in object recognition".
Re #2, automatic, optimal feature selection is one of the touted advantages of neural networks (usually with the caveat that it doesn't work so well in practice).
Have to agree that this doesn't prove anything. It is only one local contest. The title is very misleading.
Hinton's team (SuperVision) uses an interesting 'dropout' technique. He gave a Google Tech Talk on this back in June.
http://www.youtube.com/watch?v=DleXA5ADG78&feature=plcp
And an older talk that covers some of what a deep convolutional net is:
http://www.youtube.com/watch?v=VdIURAu1-aU
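For anyone who wants to see the mechanics before watching the talks, here is a minimal numpy sketch of the dropout idea (the inverted-dropout variant; the function name, drop rate, and numbers are mine and purely illustrative, not taken from the talks):

    import numpy as np

    rng = np.random.default_rng(0)

    def dropout(activations, p_drop=0.5, training=True):
        # randomly silence units during training; do nothing at test time
        if not training:
            return activations
        keep = rng.random(activations.shape) >= p_drop
        # scale the survivors so the expected activation is unchanged,
        # which is why no rescaling is needed when the net is deployed
        return activations * keep / (1.0 - p_drop)

    hidden = np.array([0.2, 1.5, 0.7, 2.0, 0.9])
    print(dropout(hidden))                   # some units zeroed, the rest scaled up
    print(dropout(hidden, training=False))   # unchanged at test time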
Hinton is currently teaching a Coursera class on neural nets: https://class.coursera.org/neuralnets-2012-001/class/index
So far I've watched the first lecture and it seems like it'll be exactly the course I've been wanting: starting with the basics of machine learning but quickly diving into the state of the art for neural nets.
Sensationalist title that misrepresents the results of a competition with limited (albeit high-quality) participation. There is limited information of general value in this link.
I'm not sure you can apply winner-takes-all to such a marginal difference in error. Give it a slightly different database and things go awry.
Check out: "Unbiased Look at Dataset Bias", A. Torralba and A. Efros, CVPR 2011.
The difference in error between the first and the rest is ENORMOUS.
Task 1:
Differences:
As you can see, the first is way ahead of the rest. The difference between the 1st and 2nd is ~11%, between the 2nd and 3rd ~1%.
Task 2:
Ditto.
But the most exciting thing is that the results were obtained with a relatively general-purpose learning algorithm. No extraction of SIFT features, no "Hough circle transform to find eyes and noses".
The points of the paper you cite are important concerns, but this result is still very exciting.
the results were obtained with a relatively general-purpose learning algorithm. No extraction of SIFT features, no "Hough circle transform to find eyes and noses".
This deserves even more emphasis. All of the other teams were writing tons of domain-specific code to implement fancy feature detectors that are the result of years of in-depth research and the subject of many PhDs. The machine learning only comes into play after the manually coded feature detectors have preprocessed the data.
Meanwhile, the SuperVision team fed raw RGB pixel data directly into their machine learning system and got a much better result.
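To make the contrast concrete, here is a rough numpy sketch of what "raw pixels in" means at the first layer: a convolution applied directly to the RGB image, where the filters would be learned from data rather than hand-designed. The sizes are only illustrative and the random filters are stand-ins for learned weights; this is not the SuperVision network itself.

    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.random((224, 224, 3))                      # raw RGB pixels, no hand-coded features
    filters = 0.01 * rng.standard_normal((96, 11, 11, 3))  # stand-ins for learned filter weights

    def conv_relu(img, filt, stride=4):
        k = filt.shape[1]
        out_size = (img.shape[0] - k) // stride + 1
        out = np.zeros((out_size, out_size, filt.shape[0]))
        for i in range(out_size):
            for j in range(out_size):
                patch = img[i * stride:i * stride + k, j * stride:j * stride + k, :]
                # each output channel is one filter applied to the raw pixel patch
                out[i, j, :] = np.tensordot(filt, patch, axes=([1, 2, 3], [0, 1, 2]))
        return np.maximum(out, 0.0)                        # ReLU nonlinearity

    feature_maps = conv_relu(image, filters)
    print(feature_maps.shape)   # (54, 54, 96): feature maps computed from pixels, not engineered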
Lol.. my bad. I did not pay attention. I thought the error was in percentages. (I was comparing with MNIST and somehow assumed this too was percentages). Come to think of it, that is really dumb (what that would mean) !!
Thanks for the reference. It goes well with "Machine Learning that Matters", a paper cited by Terran Lane in his recent blog post "On leaving Academia".
I worry you may have taken a biased look at "Unbiased Look at Dataset Bias".
Not that only, I had a high variance on my bias.. ;)
Neural Networks officially best at object recognition in this particular competition of seven teams, on two of the three tasks.
Not to take away from the accomplishment of the SuperVision team, but the claim in the title seems somewhat sensationalist. Is this competition like the World Cup of object recognition or something?
I found the title of this post really ironic.
"There is now clearly an objective answer to which inductive algorithm to use"
Just to add context for newcomers: the original title of the thread was "Neural Networks officially best at object recognition", and most of the posts here argued that that title was not appropriate for the link.
Congrats to the awesome folks at ISI for scoring 1st at task 3 and 2nd at task 1! Keep rocking my world.
Why isn't there any submission for task 3 from team SuperVision with their neural nets?
*this implementation of a neural network designed for object recognition for this particular challenge
So, this is what HN posts have come to? The level of tabloid science news coverage.
The title has changed at least twice, confusing the discussion. Can we have a title history on HN posts? Mutable state stinks.