Tommaso,
Yes, m(x) and Chaitin's Omega are deeply connected; in fact, as you noticed, the former contains the latter. Chaitin's Omega is the halting probability, while m(s) is the output probability (over halting programs running on a prefix-free universal Turing machine), so knowing the latter gives you the former.
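To make the connection explicit (using the standard definitions, where U is the prefix-free universal Turing machine and |p| is the length in bits of a program p):

    \Omega = \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|}

    m(s) = \sum_{p \,:\, U(p) = s} 2^{-|p|}

    \Omega = \sum_{s} m(s)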
Yes, we (jointly with Jean-Paul Delahaye) have been able to measure (to some limited extent) the statistical correlations between several real-world datasets (binary strings) and an empirical, purely algorithmically generated m(s).
As you say, m(s) is 'lower semi-computable', which means one can (with a lot of computational resources) approximate it but never entirely compute it (because of the halting problem). But halting runtimes are known for (relatively) large spaces of abstract machines, for example for Turing machines thanks to the busy beaver game. So one of the approaches we undertook was to calculate an experimental m(s) over the spaces of Turing machines for which busy beaver values are known.
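In outline, the procedure looks like the following sketch (a simplified Python illustration only; the machine simulator is passed in as a callable, and the busy beaver value for the machine space gives the step bound that lets non-halting machines be discarded with certainty):

    from collections import Counter
    from fractions import Fraction

    def empirical_m(machines, run, max_steps):
        """Tally output frequencies over a finite enumeration of Turing
        machines. `run(tm, max_steps)` must return the output string if
        the machine halts within max_steps steps, or None otherwise; the
        busy beaver value for the space guarantees the bound is enough."""
        counts = Counter()
        halting = 0
        for tm in machines:
            output = run(tm, max_steps)
            if output is not None:  # the machine halted
                counts[output] += 1
                halting += 1
        # Normalise over halting machines to obtain the empirical
        # output distribution m(s).
        return {s: Fraction(n, halting) for s, n in counts.items()}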
An important side result is that if you have a way (even a limited one) to calculate m(s), then you have a way to calculate C(s), the Kolmogorov-Chaitin complexity of the string s, by way of the Chaitin-Levin coding theorem! And that is where the applicability of our research goes beyond the statistical comparison between real-world datasets and artificial/digital datasets (to test the computational hypothesis).
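The coding theorem in question says that, up to an additive constant depending only on the choice of universal machine,

    C(s) = -\log_2 m(s) + O(1)

so an estimate of m(s) translates directly into an estimate of C(s).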
The calculation we performed produced enough data to build an empirical m(s) for relatively short strings (from small Turing machines of up to 4 states), for which we could then evaluate C(s). This had never been done before because of its difficulty: the usual way to approximate C(s) is with compression algorithms, but these fail for short strings for obvious reasons (compressors have a hard time finding patterns in strings that are too short, so the values they return are too unstable).
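As a rough illustration of why compressors are unhelpful at these lengths (a minimal sketch using Python's zlib; the exact overhead depends on the compressor):

    import zlib

    for s in [b"0101", b"01010101", b"0101010101010101"]:
        compressed = zlib.compress(s, 9)
        # For strings this short, the compressor's fixed format overhead
        # dominates, so the compressed size tells us little about the
        # string's actual algorithmic complexity.
        print(len(s), "->", len(compressed))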
You ask where one would look for a dataset to test a real-world dataset against Levin's distribution. The answer is here: http://arxiv.org/abs/1101.4795, and we can of course provide the full tables.
You also ask whether this view is an internal or external view of the universe. I have trouble placing myself in this dichotomy at the moment; I should think further about it. I think, however, that our view may be a bird's-eye view (even at a level above the physical). In fact m(s) is also called the universal distribution because it assumes nothing but the use of a universal Turing machine, so it dominates (as proven by Levin himself) every other semi-computable distribution (it has also been called a 'miraculous' distribution; see http://www.springerlink.com/index/867P162741726288.pdf).
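Domination here means that for every lower semi-computable semimeasure \mu there is a constant c > 0, independent of s, such that

    m(s) \ge c \cdot \mu(s) for all s.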
So if one were, for example, to create candidate universes, one should probably first look at whether each universe is capable of approaching the empirically calculated m(s), which would be an indication that the universe can produce enough complexity, both structured complexity and apparent randomness, distributed as m(s) prescribes, and then look at whether the universe fulfills all the other physical properties (such as Lorentz invariance).
You also ask another interesting question about assuming a prior for the distribution of strings (I guess strings acting as initial conditions for the programs). The beauty of m(s) is that it basically does not matter where (or from what) you start; you end up with the same distribution, because what matters is the computational filter: the random distribution of programs. I think only the distribution of programs themselves would have an impact on m(s) (our experiments confirm this as well). An interesting question is indeed whether one can impose restrictions on the distribution of programs, for example restrictions imposed by physical laws that one might model with game theory (something close to Kevin Kelly's criticism of Bayesian approaches to learning theory, and connected to the old problem of induction).
But in fact, as a consequence of our research, m(s) is no longer a prior: our experimental m(s) (with the caveat that it has a limited scope) is no longer Bayesian but an empirical (hence posterior) distribution, which according to Kevin Kelly would give our approach, and the algorithmic complexity approach in general, greater legitimacy as a theory. One can see complexity emerging in our experimental distributions just as m(s) and C(s) were believed to operate.
Sincerely.