Hi Lorraine,
Although I agree with Robert, I think I can also see where you're coming from.
Just to make sure I'm on the same page as you guys, I'd like to give an example of what I think information is. Hopefully we all agree on at least this part.
If I receive three distinct symbols in the form of sentences such as "Hello.", "How are you today?", and "Lovely weather, don't you think?", the information content per symbol (in natural units) is [math]S = \ln(3)[/math]. If I'm understanding correctly, according to Robert's definition that's pure gold, no dirt.
If I concatenate the sentences and then break them down into their roots (individual characters), I get a different value for the information content per symbol because there are now repetitions. Counting spaces and punctuation, only 22 of the 58 total symbols are distinct, which gives [math]S \approx 2.85[/math]. Since the pure gold measure would be [math]\ln(58) \approx 4.06[/math], this means that there's a lot of dirt.
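For anyone who wants to check the arithmetic, here is a minimal Python sketch of both counts. It assumes the three sentences are joined with single spaces (which is what gives 58 characters) and that the apostrophe is a plain ASCII one; the helper name `entropy_nats` is just my own label for the usual frequency-based entropy formula.

```python
from collections import Counter
from math import log


def entropy_nats(symbols):
    """Entropy per symbol in natural units, estimated from observed frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * log(n / total) for n in counts.values())


sentences = ["Hello.", "How are you today?", "Lovely weather, don't you think?"]

# Case 1: each whole sentence is one symbol, and all three are distinct.
s_sentences = entropy_nats(sentences)

# Case 2: individual characters are the symbols.
# Assumption: the sentences are joined with single spaces.
text = " ".join(sentences)
s_chars = entropy_nats(text)

print("total symbols:", len(text), "distinct symbols:", len(set(text)))
print("per-sentence entropy:", s_sentences, "vs ln(3):", log(3))
print("per-character entropy:", s_chars, "vs ln(58):", log(len(text)))
```

Under those assumptions the per-character value comes out near 2.85, in line with the figure above, and well below the no-repetition ceiling of ln(58).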
With all of that aside: Is it an accurate guess for me to think that you're searching for the most "broken down" description of physics, where the most fundamental of root symbols are all that's to be considered? Or is it kind of the opposite, where you're looking for the highest-level symbols (leaves, for lack of a better word for the opposite of root)?
Robert kind of brings up a good point about the JPEG image data versus the JPEG decompression data, insofar as they're both just data (although the image tends to vary often and the algorithm not so much, and so the images have a good chance of bearing greater information). Is it accurate to guess that whatever you're searching for (root or leaves) would be a synthesis of the "image" data and "algorithm" data, or would it be just the "algorithm" itself?
This was a very thought-provoking essay. Thank you for sharing it.
- Shawn