In a group I sporadically meet with, we got into an argument about prediction, sparked by some macho posturing around the following three papers. I'm not making an effort to link them properly because their contents are not that important.
http://jasss.soc.surrey.ac.uk/11/4/12.html, http://jasss.soc.surrey.ac.uk/12/1/9.html, http://jasss.soc.surrey.ac.uk/12/1/10.html
In this argument, I tried to make it clear that, even though I agree to some extent with the symmetry thesis (explanations are potential predictions and vice versa)1, I assert that all registration into kinds is preemptive registration [5.2] (which puts me more squarely in Epstein's camp, as devoted to agnosticism). I'd previously advocated that perspective to this group in a different form through Firestein's book Ignorance. Scientists look for black cats in dark rooms. Engineers search for their keys under lamp posts.
Models don't predict. Humans predict. And scientists are not, should not be, in the business of prediction. That's left up to engineers, policy wonks, and psychics. What Thompson, Derr, and Troitzsch do, like many of us, is lift the vernacular use of the word "prediction" to also apply jargonally to the testing part of the various training-testing methods for VV&A (into which I include the typical ML - machine learning - methods like holdout, bootstrapping, etc.). But anyone whose engaged in significant VV&A work tacitly knows that type of "prediction" is fundamentally different from what the laity mean by the word.
A compromise is often made by thresholding (or clustering) to register a dimensional variable into a categorical one, hedged with things like statistical power, p-values, confidence intervals, etc. For example, one might predict a “75% chance that there will be a new wave peaking in March or April, with a peak at least half again as high as the preceding trough.” The categorization here is:
- no new wave vs. a new wave
- peaking in March-April vs. at any other concrete time
- new peak < 1.5× vs. ≥ 1.5× the preceding trough
But I maintain that any such pseudo-categorization is, at best, tentative, hedged, reversible. What's required for the vernacular sense of “prediction” is a commitment to that categorization, a preemptive registration of the three categories that disallows any other categorization. It should be clear that there's not much of a commitment to the categories above. E.g., if the peak turns out to be 1.4999 times the preceding trough, that will satisfy the criterion well enough.
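To make the weakness of the commitment concrete, here is a hypothetical sketch of such a registration (the function, field names, and cutoffs are mine, purely for illustration). As long as the underlying dimensional quantities are kept around, the categorization stays reversible:

```python
def register(peak_height, trough_height, peak_month):
    """Threshold dimensional quantities into the three tentative categories."""
    ratio = peak_height / trough_height
    return {
        "new_wave": peak_height > trough_height,
        "in_window": peak_month in ("March", "April"),
        "half_again": ratio >= 1.5,  # hard cutoff: 1.4999 falls outside it
    }

# A ratio of 1.4999 fails the categorical criterion even though, vernacularly,
# it "satisfies it well enough" - evidence that the cutoff carries no real
# commitment, and the dimensional value can always be consulted to reverse it.
print(register(1.4999, 1.0, "March"))
```

The point is not the code but the reversibility: nothing here preemptively disallows re-registering the same numbers under different cutoffs after the fact.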
Further, preemptive categorization is merely necessary for prediction, not sufficient. A hallmark of prediction is that it must be, in some sense, concrete. It must provide some special detail, some binding. In categorizing the ambient milieu in which we bathe from quantity into quality, the concreteness requirement implies the "what it's like" (qualia) aspect seemingly unique to the predicting device/human. Category registration is a prerequisite for concreteness. Then some kind of "I am here" binding has to be irreversibly2 placed somewhere inside the categories; one particular category has to be chosen. For example, when the weatherperson says “There's a 90% chance of rain tomorrow,” that's not a prediction. But if she says “It will rain tomorrow,” that is. It binds the state of the world contextually3. That concrete "what it's like" or "I am here" quality is a critical element of prediction. And any attempt to preserve the dimensional quantity from which the categorization was made (i.e. to preserve reversibility) reveals it as a test of the explanation, not a prediction. So the “90% chance” shows it is NOT a prediction in the vernacular sense; it is a testing criterion for some (partially unstated) model.
The purpose of the testing part of training-testing is not to say what one should expect. It is not a prediction. Its purpose is to falsify (or validate) a hypothetical mechanism, a potential explanation. What the psychics, engineers, and policy wonks then do with those explanations is their business.
I even took the time to lay out my rhetoric about how compressibility and algorithmic depth tie explanation and prediction into a spectrum rather than a dichotomy, to little avail.
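I won't reproduce the whole rhetoric here, but the compressibility half of it can at least be gestured at with a crude sketch - my own illustration, using zlib as a cheap stand-in for Kolmogorov-style compressibility, not any formalism from the argument itself. A regular series admits an explanation much shorter than the data; an incompressible one does not:

```python
import random
import zlib

random.seed(1)

# A highly regular series: a good candidate for a compact explanation,
# i.e. a short program that regenerates it.
regular = bytes(i % 16 for i in range(1000))

# An (effectively) incompressible series: no explanation much shorter
# than the data itself.
noisy = bytes(random.randrange(256) for _ in range(1000))

# zlib as a crude upper bound on description length.
print(len(zlib.compress(regular)) < len(zlib.compress(noisy)))  # True
```

On this framing, explanation and prediction sit on a spectrum of description length rather than forming a dichotomy.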
Overused examples of irreversible (arrow-of-time) processes like the measurement problem in QM, catastrophic collapse, canalizing deposition, and perhaps even staged freezing may suggest ways to build algorithms that actually predict in the vernacular sense.
Obviously, there's still some ambiguity in the criterion. Rain is a spectrum from misting to thunderstorm. Precipitation is spread (percentaged) over space as well as time. Etc. But this doesn't refute my claim; it strengthens it. Algorithms don't abstract away details willy-nilly the way humans do. And scientists are closer to algorithms in their hedging against premature registration, which is why science popularization is so problematic.