Some links about entropy (information) in physics:

http://ls.poly.edu/~jbain/physinfocomp/lectures/03.BoltzGibbsShannon.pdf

http://galileo.phys.virginia.edu/classes/752.mf1i.spring03/DensityMatrix.htm

http://www.cs.berkeley.edu/~vazirani/s07quantum/notes/qinfo.pdf

From what I have read, the calculation of the von Neumann entropy essentially boils down to an equation of the same form as the ones given by Boltzmann, Gibbs, and Shannon. Also from what I've read, a mixed state consists of many distinct pure states, and the probabilities of those distinct pure states are what enter the entropy calculation. If the probabilities are all equal, then it's referred to as a "maximally mixed state", in which case the entropy is just S = ln(number of distinct pure states in the mixed state).
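
To make the parallel concrete, here is a minimal sketch (my own, not taken from the linked notes) of that shared formula, S = -sum p_i ln(p_i), computed over a list of state probabilities; for a maximally mixed set of N equiprobable states it reduces to S = ln(N).

#include <cmath>
#include <cstdio>
#include <vector>

// Shannon/Gibbs-style entropy of a discrete probability distribution,
// in natural units (nats).
double entropy(const std::vector<double>& p)
{
    double s = 0.0;
    for (double pi : p)
        if (pi > 0.0)              // zero-probability states contribute nothing
            s -= pi * std::log(pi);
    return s;
}

int main()
{
    std::vector<double> maximallyMixed(4, 0.25);   // four equiprobable pure states
    std::printf("S = %f, ln(4) = %f\n", entropy(maximallyMixed), std::log(4.0));
    return 0;
}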

My language may be sloppy, but I am not in any way trying to say that ALL physicists and ALL biologists are wrong. I'm saying that the ones who are wrong are the ones who do not recognize that the Shannon entropy and the von Neumann entropy are effectively measuring the same thing -- probabilities of distinct states -- whether a state is defined by spin, charge, etc., or by a fixed-size block code (e.g. ASCII), or by a variable-length Huffman code, or whatever. A distinct state is a distinct state is a distinct state.

    So, it stands to reason that the entropy of a single pure state, as a whole, alone, is S = ln(1) = 0.

    This is just like how the letter-state 'a' all by itself has an entropy of S = ln(1) = 0.

    Entropy emerges only when there are multiple, distinct states under consideration.

    How informative would English be if it only had one letter? Not informative at all -- it would tell you that the person you're communicating with is there, making sounds or scribbles on paper, but that's about it.

    We were talking about how the universe is or is not like a computer, and I had mentioned 't Hooft's model of a black hole from his paper Dimensional Reduction in Quantum Gravity. I have no idea if it's a correct model or not, but it does illustrate a good point about the higher orders of entropy.

    Essentially, the black hole's event horizon is made up of N spin-like Boolean degrees of freedom (bits), where N is related to the number of distinct microscopic states by N = ln(number of distinct states)/ln(2). Barring some topology differences, this means that the black hole's event horizon is effectively the same thing as an N-bit integer, which also has 2^N distinct states. Of course, the probabilities of the distinct states in both the black hole model and the integer model are assumed to be all the same -- 1/(2^N) -- which is why the calculation of N is so simple (no summing required, the answer is already known).

    To be clear: The measure N is the first order entropy (in binary units). In the black hole model it's the von Neumann entropy, in the integer model it's the Shannon entropy. They're effectively measuring the same thing -- the logarithm of the number of distinct equiprobable states.

    This is not the end of the story though, because 't Hooft continues on in his paper to describe the beginnings of a cellular automaton rule that would govern the evolution from state to state. He also uses the word data a whole lot. Bonus.

    What does his assumption that a cellular automaton rule is required immediately tell me, without even looking into the technical details? It tells me that he assumes the higher orders of entropy are likely NOT maximal. Look at it from the opposite point of view: if the evolution from one state to the next were entirely random at all orders, then all possible combinations of time-adjacent states (pairs, triples, etc.) would have equal probability, and so the entropy would be maximal at every order. That kind of randomness is the definitive anti-rule -- no cellular automaton rule is needed for it to occur, because for each bit you can simply ignore all of the neighbouring bits (again, which bits are neighbours depends on the topology) and just flip the bit at random.

    It seems to me that determinism in this model would be indicated by less-than-maximal entropy at some or all of the higher orders. I don't think he states it quite like that, but it seems to be true.
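
    As a toy check of that reasoning (a sketch of my own, not anything from 't Hooft's paper), one can compute the block entropies of two bit sequences: one driven by the "anti-rule" of random flipping, and one driven by a trivial deterministic rule. The random sequence saturates the order-k entropy at k bits; the deterministic one falls well short of maximal at the higher orders.

    #include <cmath>
    #include <cstdio>
    #include <map>
    #include <random>
    #include <string>
    #include <vector>

    // Entropy (in bits) of the distribution of length-k blocks of
    // time-adjacent symbols in the sequence s.
    double blockEntropy(const std::vector<int>& s, std::size_t k)
    {
        std::map<std::string, long> counts;
        long total = 0;
        for (std::size_t i = 0; i + k <= s.size(); ++i, ++total)
        {
            std::string block;
            for (std::size_t j = 0; j < k; ++j)
                block += char('0' + s[i + j]);
            ++counts[block];
        }
        double h = 0.0;
        for (const auto& kv : counts)
        {
            const double p = double(kv.second) / double(total);
            h -= p * std::log2(p);
        }
        return h;
    }

    int main()
    {
        const std::size_t n = 100000;
        std::mt19937 rng(12345);
        std::vector<int> randomBits(n), ruleBits(n);
        for (std::size_t i = 0; i < n; ++i)
        {
            randomBits[i] = int(rng() & 1u);   // the anti-rule: ignore the neighbours
            ruleBits[i]   = int(i & 1u);       // a trivial deterministic rule: alternate
        }
        for (std::size_t k = 1; k <= 3; ++k)
            std::printf("order %zu: random %.3f bits, rule %.3f bits (maximum %zu)\n",
                        k, blockEntropy(randomBits, k), blockEntropy(ruleBits, k), k);
        return 0;
    }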

    Of course, entropy is information.

      Surely I am not the first to try this line of reasoning in order to illustrate the general importance of the second- and higher-order entropies. I am just not aware of prior work because I do not read every single paper that comes out, and I would love to read about it if someone has shown the reasoning to be true or false. This kind of analysis is used very often on English text, where each state is a distinct letter from an alphabet, so it's not like it's an alien thought.

      Needless to say, when I read a sentence such as this one from http://cfpm.org/jom-emit/1998/vol2/wilkins_js.html :

      ... "Memes are those units of transmitted information that are subject to selection biases at a given level of hierarchical organization of culture. Unlike genes, they are not instantiated in any exclusive kind of physical array or system, although at base they happen to be stored in and expressed from neurological structures."

      I come to realize that people are confusing data with information. A meme (symbol/sign) is a unit of data, not a unit of information. The information arises only in the context of multiple memes, their probabilities of occurring, their probabilities of being time-adjacent, etc.

      Perhaps I'm misreading the intent of the author, but as soon as I hear unit of information, I have to wonder...

        At least the guy acknowledges, indirectly, that the data content per meme is flexible, and it's all but independent of the information content ... up to the point where there's a defined less-than operator, which is all that's needed to produce a Boolean test for equality/non-distinctness vs inequality/distinctness.

        If you do not understand my meaning about the less-than operator, then look up the use of the STL 'set' container and encapsulation/blackboxing via classes and private data members in C++.
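
        For what it's worth, here is a tiny sketch of that point (the "Meme" class is hypothetical, purely for illustration): std::set only ever calls operator< on the stored type, and it treats two values as non-distinct exactly when !(a < b) && !(b < a), no matter what the private data content looks like.

        #include <cstdio>
        #include <set>
        #include <string>
        #include <utility>

        class Meme
        {
        public:
            explicit Meme(std::string symbol) : symbol_(std::move(symbol)) {}

            // The only operation std::set needs: a strict weak ordering.
            bool operator<(const Meme& other) const { return symbol_ < other.symbol_; }

        private:
            std::string symbol_;   // the data content stays blackboxed behind the class
        };

        int main()
        {
            std::set<Meme> distinct;
            distinct.insert(Meme("a"));
            distinct.insert(Meme("b"));
            distinct.insert(Meme("a"));   // not distinct from the first insert, so ignored
            std::printf("distinct states: %zu\n", distinct.size());   // prints 2
            return 0;
        }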


        :) May the force be with you.

        I left this message for Robert McEachern on his essay page:

        Hi Robert,

        Here's another attempt at answering your question... "So what is the big deal? What makes this so significant?"

        After reading:

        - Your essay

        - 'The Heisenberg Uncertainty Principle and the Nyquist-Shannon Sampling Theorem' by Pierre Millette

        - 'An Introduction to Information Theory: Symbols, Signals and Noise' by John Pierce

        - 'Communication in the Presence of Noise' by Claude Shannon

        I am left with the impression that Shannon and Pierce predicted that the holographic principle would become a naturally accepted concept in physics. They detail how the volume of the signal space "creeps" away from the origin of the space as the dimension of the space increases, and how there is dimensional reduction in the message space when compensating for phase "differences" (same message, different phase) that can arise when sampling the signal. At first glance this seems to hint at how to get rid of singularities at the centres of black holes.

        Perhaps it's not quite the same thing. On the other hand, if it's the same thing, then that's quite significant. In any case, I note that Shannon was not directly referenced in 't Hooft's first paper called 'Dimensional Reduction in Quantum Gravity'.

        - Shawn

        P.S. The book 'An Introduction to Information Theory: Symbols, Signals and Noise' by John Pierce makes the distinction that I was making earlier by referring to the difference between the information (referred to as bits) and the data in the message (referred to as just "binary digits").

        Let's give the content sent as "binary digits" a name: data.

        So, you send x bits of data, it has y bits of information, and the redundancy in the data is x - y.

        Now, if someone wishes to say that "the information is physical", and wishes to take that to the extreme, then you can say that the redundancy will never be greater than 1 bit. In that case, nature would automatically implement variable-length quantum Huffman codes.
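
        A small worked sketch of that bookkeeping (the symbol probabilities here are made up purely for illustration): four symbols sent as a fixed-size block code cost x = 2 binary digits per symbol, the Shannon entropy of the source gives y bits of information per symbol, and the redundancy is x - y. A Huffman code for this particular source (code lengths 1, 2, 3, 3) would bring the data cost down to 1.75 binary digits per symbol, matching y exactly.

        #include <cmath>
        #include <cstdio>
        #include <vector>

        int main()
        {
            // Hypothetical four-symbol source; a fixed-size block code spends
            // x = log2(4) = 2 binary digits on every symbol regardless of probability.
            const std::vector<double> p = {0.5, 0.25, 0.125, 0.125};
            const double x = std::log2(double(p.size()));

            // y = information per symbol (Shannon entropy, in bits).
            double y = 0.0;
            for (double pi : p)
                if (pi > 0.0)
                    y -= pi * std::log2(pi);

            std::printf("x = %.2f binary digits, y = %.2f bits of information, redundancy x - y = %.2f\n",
                        x, y, x - y);
            return 0;
        }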

          What they call a message, I would call a composite datum (a symbol, made up of binary symbols).

          Dear Shawn,

          This is a very interesting idea you are proposing. I have two questions.

          1. Have you thought about inflation in the early universe in this context? It immediately comes to mind after reading your essay as something your ideas could possibly explain. Some physicists have proposed an as-of-yet unidentified "inflaton field" that drove inflation but seems to be absent today. From the perspective of your idea, one could hypothesize that the decrease in energy density (and hence gravitational interaction) "turned off" or "damped out" the inflaton field after the initial expansion.

          2. There is a theoretical energy limit (called the GZK limit) for cosmic rays from distant sources, based on the hypothesized interaction of the particles with background radiation along their trajectories. However, cosmic rays have been observed with energies above this limit. This seems to be another piece of data your idea could possibly explain: along most of their trajectories, cosmic rays from distant sources would be in regions of low gravitational interaction, and hence would not interact with the background radiation. Thus, they would preserve more of their energy than conventionally predicted. Have you thought about this possibility?

          If you like, take a look at my essay On the Foundational Assumptions of Modern Physics. It has a completely different viewpoint, but it's possible you may get some interesting ideas from it as I did from yours. Take care,

          Ben Dribus

            Hi Ben,

            It's really hard for me to say whether the model is just a toy or would have a desirable effect on the real physics. I have tried to think of what things would have been like near the Big Bang in the context of the model, but I don't have any crystal clear thoughts on the matter.

            It was actually the GZK limit that got me started on this. It started out as numerology (centred around the energy scale 10^19 eV), and then the creation and annihilation idea came a while later. I have thought about how this would affect the propagation of cosmic rays, but again, nothing crystal clear.

            To be honest, it's been a month now since I thought about the whole thing for more than a couple of minutes at a time. I'm bored with it.

            At worst, it is a toy model that can make for a possibly useful video game idea.

            - Shawn

            Oh yeah... do notice that the number of data bits x per message is always an integer, and that y (the information) and r (the redundancy) are not necessarily integers unless the number of messages is a power of two and all messages are equiprobable.

            Notice that when you analyze the classical binary messages, the mean radial distance increases as the message size grows, but the standard deviation decreases.

            It does kind of seem, at first glance, like a "spherization" of the positions in the message (state) space.

            The C++ code is attached.

            In the following table, the "max message size" is the number of bits per message. So the first row analyzes just the two 1-bit messages, the second row analyzes the four 2-bit messages, and so on. I have to redo the code so that it does not store the radii in an array, since it inevitably runs out of memory on this system when the "max message size" gets to about 25. I can just analyze the radii twice: the first pass to get the mean, the second to get the standard deviation.

            The n-bit messages of course "live" in a discrete n-dimensional space that has only two positions (0, 1) for each dimension.
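
            Here is a minimal sketch of the two-pass rework described above (my own rewrite, not the attached file): the radius of a message is taken to be the Euclidean distance from the origin, i.e. the square root of the number of 1 bits, and nothing is stored between the two passes.

            #include <bitset>
            #include <cmath>
            #include <cstdint>
            #include <cstdio>

            int main()
            {
                for (int n = 1; n <= 25; ++n)
                {
                    const std::uint64_t count = std::uint64_t(1) << n;

                    // Pass 1: mean radius over all 2^n messages.
                    double sum = 0.0;
                    for (std::uint64_t m = 0; m < count; ++m)
                        sum += std::sqrt(double(std::bitset<64>(m).count()));
                    const double mean = sum / double(count);

                    // Pass 2: population standard deviation about that mean.
                    double sumSq = 0.0;
                    for (std::uint64_t m = 0; m < count; ++m)
                    {
                        const double d = std::sqrt(double(std::bitset<64>(m).count())) - mean;
                        sumSq += d * d;
                    }
                    const double stddev = std::sqrt(sumSq / double(count));

                    std::printf("max message size: %d\n", n);
                    std::printf("min radius: 0\n");
                    std::printf("max radius: %g\n", std::sqrt(double(n)));
                    std::printf("mean radius: %g -/+ %g\n\n", mean, stddev);
                }
                return 0;
            }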

            max message size    min radius    max radius    mean radius
                           1    0             1             0.5 -/+ 0.5
                           2    0             1.41421       0.853553 -/+ 0.521005
                           3    0             1.73205       1.12184 -/+ 0.491409
                           4    0             2             1.33834 -/+ 0.456989
                           5    0             2.23607       1.52183 -/+ 0.428974
                           6    0             2.44949       1.68313 -/+ 0.408759
                           7    0             2.64575       1.82867 -/+ 0.394921
                           8    0             2.82843       1.96247 -/+ 0.385622
                           9    0             3             2.08713 -/+ 0.379348
                          10    0             3.16228       2.20439 -/+ 0.37503
                          11    0             3.31662       2.31552 -/+ 0.371968
                          12    0             3.4641        2.42143 -/+ 0.369719
                          13    0             3.60555       2.52281 -/+ 0.368006
                          14    0             3.74166       2.62022 -/+ 0.366657
                          15    0             3.87298       2.7141 -/+ 0.365562
                          16    0             4             2.80482 -/+ 0.364652
                          17    0             4.12311       2.89268 -/+ 0.36388
                          18    0             4.24264       2.97793 -/+ 0.363215
                          19    0             4.3589        3.0608 -/+ 0.362634
                          20    0             4.47214       3.14148 -/+ 0.362121
                          21    0             4.58258       3.22012 -/+ 0.361665
                          22    0             4.69042       3.29689 -/+ 0.361256
                          23    0             4.79583       3.37191 -/+ 0.360886
                          24    0             4.89898       3.44529 -/+ 0.360552
                          25    0             5             3.51713 -/+ 0.360246

            Attachment #1: radius.txt

              Attached is an image of the radii for different spaces of n bits, where n = 10, 18, 26. The radii are normalized by sqrt(n), binned, and then drawn. The lighter coloured lines (bins) have more messages in them.

              The radii slowly creep together as n increases.

              Attachment #1: shell.jpg

              ... This is all equivalent to saying:

              The set of n-bit messages contains 2^n messages total. The bits in each message are Cartesian coordinates in the message space, and the radial distance squared of each message (the count of the bits with the value 1 in the message) can be any one of the integer values 0 through n.

              The number of n-bit messages with the radial distance squared x is

              f(n, x) = n! / [x! (n - x)!]

              which is a way of counting the combinations (see binomial coefficient).

              Summing the various f(n, x) for x from 0 through n, the grand total is 2^n.
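
              As a quick numerical check of that restatement (again just a sketch of mine), the counts f(n, x) can be accumulated from the recurrence C(n, x+1) = C(n, x) (n - x) / (x + 1); they sum to 2^n, and weighting sqrt(x) by those counts reproduces the mean radius quoted earlier (e.g. 2.20439 for n = 10).

              #include <cmath>
              #include <cstdio>

              int main()
              {
                  const int n = 10;
                  double total = 0.0;          // should end up equal to 2^n
                  double weightedRadius = 0.0;
                  double binom = 1.0;          // f(n, 0) = C(n, 0)
                  for (int x = 0; x <= n; ++x)
                  {
                      total += binom;
                      weightedRadius += binom * std::sqrt(double(x));
                      binom = binom * double(n - x) / double(x + 1);   // advance to C(n, x + 1)
                  }
                  std::printf("sum of f(%d, x) = %.0f (2^%d = %.0f)\n", n, total, n, std::pow(2.0, n));
                  std::printf("mean radius = %g\n", weightedRadius / total);
                  return 0;
              }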

              5 days later

              Hello,

              Thanks for your comment. Have a good day.

              - Shawn

              After studying about 250 essays in this contest, I now realize how I can assess the level of each submitted work. Accordingly, I rated some essays, including yours.

              Good luck.

              Sergey Fedosin


                Hi Sergey,

                Thanks for your rating, whatever it may have been. Good luck in the contest as well.

                - Shawn