Hi, Beige Bandicoot, thanks for reading and commenting! I agree that it is impossible to falsify a hypothesis with certainty. In particular, it is impossible to determine the direction of causality with full confidence. That is why I like probabilistic descriptions within a Bayesian framework. If we want to choose between “sun drives rooster”, or “rooster drives sun”, we can evaluate the ratio Prob(sun drives rooster | data) / Prob(rooster drives sun | data). According to Bayes’ rule, and assuming equal priors (which, admittedly, can be questioned), this ratio is equal to Prob(data | sun drives rooster) / Prob(data | rooster drives sun). These two probability distributions are not delta-like, since both hypothesis have some probability to produce the data that we actually observe. But the data turn out to be overwhelmingly more easily linked with “sun drives rooster” than with “rooster drives sun”. In more technical terms:
Hypothesis 1 is: “The rooster wakes up the world”
If then we observe that the world is not following the rooster’s crow (because it does not), we may conclude that (a) Hypothesis 1 is false, or (b) a very large number of exotic phenomena A1, A2, …, Ak required for the world to look as it does even if H1 is true must have happened. Since (b) is kind of unlikely (because there are a huge number of degrees of freedom involved in “the world waking up”), we take P(data | rooster drives sun) to be small.
Hypothesis 2 is: “The sun wakes up the world”
If then we observe that the world does follow the sun (because everything indicates it does), we may conclude that (a) Hypothesis 2 is true, or (b) a very large number of exotic phenomena B1, B2, …, Bk required for the world to look as it does even if H2 is false must have happened. Since (b) is unlikely, we take P(data | sun drives rooster) to be large.
Ultimately, this is a counting argument. How many unobserved facts do we have to assume to be true for H1 to hold given the data, as compared to H2 to hold given the data? The requirement for the system to be dissipative is imposed for there be many degrees of freedom involved, so that these two numbers be very different. In the original system, before the degrees of freedom are separated into thermal and classical, these numbers are not different.
Let me know if your question was in this direction, I am not sure whether I am answering your question, or just rambling. And yes, there are good many ideas that I have not cited. I just submitted the essay in the last minute. I have now looked at my reference list and it looks kind of a weird selection. But well, there was no time to improve it, I could either submit it as it was, or not submit at all ¯_(ツ)_/¯
Thanks again!