Will extremely large databases of information, starting in the petabyte level, change how we learn. It may turn out that tremendously large volumes of data are sufficient to skip the theory in order to make a predicted observation.
Google was one of the first to notice this. For instance, take Google’s spell checker. When you misspell a word when googling, Google suggests the proper spelling. How does it know this? How does it predict the correctly spelled word? It is not because it has a theory of good spelling, or has mastered spelling rules. In fact Google knows nothing about spelling rules at all.
Instead Google operates a very large dataset of observations which show that for any given spelling of a word, x number of people say “yes” when asked if they meant to spell word “y.” Google’s spelling engine consists entirely of these datapoints, rather than any notion of what correct English spelling is. That is why the same system can correct spelling in any language.
In fact, Google uses the same philosophy of learning via massive data for their translation programs. They can translate from English to French, or German to Chinese by matching up huge datasets of humanly translated material. For instance, Google trained their French/English translation engine by feeding it Canadian documents which are often released in both English and French versions. The Googlers have no theory of language, especially of French, no AI translator. Instead they have zillions of datapoints which in aggregate link “this to that” from one language to another.
The Google Way of Science
The End of Theory: The Data Deluge Makes the Scientific Method Obsolete