Hi, Ganesh, I am responding here to a question you asked on my page:

"I will have to ponder over the idea of ascribing goals to any entropy reduction in a system. I am wondering if that is too narrow a definition. After all, a (conscious) observer should be capable of ascribing a system as performing a computation (and hence the goal of performing that computation,) even with no entropy change(?)"

If the computation is invertible, then the output is equal to the input, except for a change of names. I believe that computations are interesting only when they are non-invertible. But perhaps I am missing something...

I saw your essay as soon as it came out and was impressed, but did not follow all the details. Today I gave it a second look, and I am still impressed, above all because this strikes me as an original contribution, which I have found only very rarely in this forum. Moreover, within neural network theory I've had enough of gradient-descent learning rules that come out of the blue; your proposal is so much more physical. I confess I must still give it proper thought - or perhaps find the time to do the calculations myself - because I intend to take these ideas very seriously. I hope you publish this work as a paper soon; this essay contest does not seem to be the best environment. The work is probably a bit too technical given the contest rules, the length is too constrained, and the audience could be better targeted. I hope that you will consider presenting these ideas to the computational neuroscience community. They may not share your physical-computational background, but they will surely be interested in the conceptual result.

Congratulations!

inés.

    Hi Ines,

    Thanks for your kind comments and encouragement. Yes, I have had issues with a wide variety of gradient-descent-based learning rules, which is why I wanted something more physically grounded. I am working on a more formal paper as we speak, where I will have the space to discuss the details. This is a continuously evolving idea, and after receiving some great feedback I have realized I need to make some things clearer and provide better explanations for others. I am an engineer by training and I intend to leverage the ideas here to build something, but I do also intend to present some of these results to the (computational) neuroscience field, especially the ones connected to the critical brain hypothesis.

    I will reply to your comments here, since I get notifications when you reply on my page.

    I agree that bijective, identity-like mappings which lose no information are not interesting. Computation has been defined by the characteristic loss of information from input to output. Let me clarify what I was thinking about. Consider a physical system containing 4 orthogonal distinguishable states A, B, C and D. The system evolves so as to achieve the identity function, and we are left with 4 orthogonal states and no entropy change. A (conscious) observer is capable of associating the logical state 0 with the final state A and the logical state 1 with the final states B, C and D, and claiming that this physical evolution represents an OR gate if the initial 4 states correspond to the inputs 00, 01, 10 and 11. I would refer to this as an unfaithful implementation of the abstract logical OR gate, but nonetheless the observer will claim that this physical evolution with zero entropy change has achieved the goal of being an OR gate. Hence, while I agree that there is a relationship between goals and entropy-reducing processes, I wonder if, with the right observer, goals can be ascribed to a non-entropy-reducing process. In fact, I have questioned whether this ability to imbue zero-entropy-change or entropy-increasing processes with goals is a defining characteristic of conscious (and definitely intelligent) observers. After all, while we are able to perform input-output computing at will (without requiring it to be entropy reducing), our computers' outputs have computational value only because we as conscious (intelligent) observers interpret them as such. Please let me know if you have any thoughts for/against this.

    This idea of computational faithfulness of a physical implementation of logical mappings is discussed in detail here if you would like to know more.

    Anderson, Neal G. "On the physical implementation of logical transformations: Generalized L-machines." Theoretical Computer Science 411.48 (2010): 4179-4199.

    Neal is my PhD adviser and has been very influential in my thinking. He has been working recently on addressing the importance of observers in determining information as a physical quantity. This paper discusses that in detail and I think you might like it.

    Anderson, Neal G. "Information as a Physical Quantity." (2016)

    His paper does not state what a conscious observer would be, but using the ideas presented in the essay, I have some initial thoughts on how to address that and define such observers in a physically grounded manner.

    I am looking forward to hearing your thoughts. Thanks.

    Cheers

    Natesh

    Hi, Ganesh, I am afraid I do not understand. "the observer will claim that this physical evolution with zero entropy change has achieved the goal of being an OR gate." Why do you say that the evolution has no entropy change, if the observer has made the association A -> 0, and B, C, D -> 1? This association is entropy-reducing, isn't it? I will wait for your reply before elaborating more.

    Great to know you are on the way to publishing! Your essay is new raw material, so the natural evolution is: get it published. As a neuroscientist, I was more surprised by the learning part of your essay than by the criticality one, but mind you, I am not truly mainstream, so just take it as one opinion out of many. To me, the learning part is thought-provoking; I have the impression that new paradigms and new understanding may come out of it. The criticality claim seems to be everywhere, but I do not gain much from it, apart from classifying the process as critical. Anyway, surely I am missing something...

    best!

    ines.

    Hi Ines,

    Consider the evolution of a system with 4 initial distinguishable states A, B, C and D to 2 orthogonal states 0 and 1, with A evolving to 0 and B, C, D evolving to 1. There is clearly a reduction in the physical entropy of this system, and an observer with access to this evolution might decide to associate the OR operation with it. We will call this a faithful physical realization of the OR operation in a system.

    Now consider the evolution of a system with 4 initial distinguishable states A, B, C and D to 4 orthogonal end states 0, 1, 2 and 3, with a one-to-one evolution. There is no reduction in the physical entropy of this system, and another observer with access might decide to associate the physical state 0 with the logical state '0' and the physical states 1, 2 and 3 with the logical state '1'. Such an observer will associate the OR operation with this evolution (this is the principle behind reversible computing, where there is no minimum dissipation) and will not be wrong. The difference is that this is what we refer to as an unfaithful physical realization of the OR operation.
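    To make the entropy bookkeeping concrete, here is a rough Python sketch (assuming, purely for illustration, that the four initial states are equally likely):

    from math import log2

    def shannon_entropy(probs):
        # Shannon entropy (in bits) of a discrete distribution.
        return -sum(p * log2(p) for p in probs if p > 0)

    initial = [0.25, 0.25, 0.25, 0.25]   # states A, B, C, D assumed equally likely: 2 bits

    # Faithful realization: A -> 0 and B, C, D -> 1,
    # so the end states carry probabilities 1/4 and 3/4.
    faithful_final = [0.25, 0.75]

    # Unfaithful realization: one-to-one evolution to 0, 1, 2, 3;
    # the physical entropy is unchanged, and only the observer's
    # labeling {0} -> '0', {1, 2, 3} -> '1' is lossy.
    unfaithful_final = [0.25, 0.25, 0.25, 0.25]

    print(shannon_entropy(initial))           # 2.0
    print(shannon_entropy(faithful_final))    # ~0.81, a reduction of ~1.19 bits
    print(shannon_entropy(unfaithful_final))  # 2.0, no reduction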

    I was trying to point out that it is possible to associate an interesting computation with a system evolution in which there is no change in the physical entropy of the system. There might be a reduction in the entropy of the observer, though I am not sure there has to be. Perhaps I am missing/misunderstanding something? Are you saying that the reduction in the entropy of the observer (and not necessarily the system) is enough to imbue the system with a goal? I was contesting the idea that entropy reduction in the system alone is enough to achieve that.

    Yes, the equations that I have obtained are themselves well established in the Information Bottleneck method (used in clustering and machine learning), and my main contribution is tying it all together in a physical sense. I was pointing out the criticality part since the idea, though popular, is still debated, as there seems to be no clear theoretical foundation for why the brain needs to be a critical system. Most criticality arguments are made from observing neuronal avalanches in EEGs and other experimental data, which can be explained using critical behavior. And calculations of the expected branching parameter give values much lower than what is seen in the critical brain. Being able to view different cognition states as phase transitions in input-signal mapping can allow us to bypass these past hurdles, I think. But I have to give it much more thought.
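    For reference, the Information Bottleneck problem I mentioned is usually written as the variational problem

        \min_{p(t|x)} \; I(X;T) - \beta \, I(T;Y),

    where T is a compressed representation of the input X, Y is the relevance variable one wants to stay informative about, and \beta sets the trade-off between compression and prediction.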

    And please call me Natesh. Ganesh is my father. We have a whole different first name, last name system.

    Cheers

    Natesh

    Hi Natesh,

    sorry about the names! I am a slave to rhymes, and tend to map together everything that sounds similar. Actually it's even worse: I also cluster faces together. By all means, I must learn to represent information more injectively...

    Yes, sure, as I see it, it may well happen in the brain of the observer. There are many possible settings, which I discuss below. But in all settings, ascribing agency (as I understand it) requires an entropy-reducing mapping. If the mapping is injective, it may still be interesting or useful for the observer, and he or she may still make a valuable acquisition in their life by learning the mapping. But I can hardly relate this operation with perceiving a system that seeks to achieve a "goal". For me, a goal is something that tends to be reached irrespective of factors that tend to interfere with its accomplishment. That is why I require the non-injectivity: the putative obstacles are a collection of input conditions that are supposed to be overcome by the goal-seeking agent.

    Now for the possible settings.

    One option is: I am the observer, and I receive information about the behavior of some external system (say, a replicating DNA molecule outside me), and by defining the system under study in some specific way, I conclude that there is some goal-oriented behavior in the process.

    Another option is: I am given some information which I represent in my head, and I perform a certain computation with that information, for example the OR gate you mention. The computation happens inside my head. But I also have observers inside my head that monitor what other parts of my head are doing. For one such observer, what the other end of the brain is doing acts as "external" (in all computations except for self-awareness, which is the last part of my essay). One such observer can assign agency (if it wants to!) to the computation, and conclude that the OR gate (inside my head!) "tends" to reduce the numbers 1, 2, 3 to the digit 1 (or whichever implementation we choose). A weird goal for an agent, but why not.

    All I am claiming is: goal-directed behavior does not exist without an observer that tends to see things in a very particular way: as non-injective mappings. I am not claiming that this is the only thing that can be learned by a plastic system (one-to-one mappings can also be learned). I am not claiming that the only thing that can be done with a non-injective mapping is to arrogate it with agency and goals. There are many more things happening in the world that may be good to learn, as well as in the brain of the observer. And there are many more computations to do, other than arrogating agency. My only point is: if we see a goal (inside or outside us), then we have trained ourselves to interpret the system in the right manner for the goal to emerge. The goal is not intrinsic to nature, it is a way of being seen.

    Not much, just one tiny point. Or perhaps, just a definition. If you look through the essays, there are as many definitions of "goal", "agent" and "intention" as authors...

    > my main contribution is tying it all together in a physical sense

    Yes, and that is precisely why it is so great! And perhaps you are right, within your framework, what so far has been presented as a mere description of the critical brain, now can be seen as the natural consequence when certain physical conditions are assumed. I do appreciate that.

    Anyhow, time to rest! buenas noches!

    inés.

      Sorry, one more point: when an observer learns a one-to-one mapping, arrogating agency does not make much sense, because there is no entropy loss in the mapping. The process of learning the mapping, though, can be arrogated with agency: the observer tends to learn. But here there is a meta-observer observing the first observer, right? It is the learning observer that may be arrogated with agency, not the lower-level system under study. And now yes, I go. Sorry for the long speeches!

      Hi Ines,

      Thanks for the long, detailed response. You have given me much to think about and definitely introduced me to a new point of view.

      "Yes, sure, as I see it, it may well happen in the brain of the observer. There are many possible settings, which I discuss below. But in all settings, ascribing agency (as I understand it) requires an entropy-reducing mapping. If the mapping is injective, it may still be interesting or useful for the observer, and he or she may still make a valuable acquisition in their life by learning the mapping. But I can hardly relate this operation with perceiving a system that seeks to achieve a "goal". For me, a goal is something that tends to be reached irrespective of factors that tend to interfere with its accomplishment. That is why I require the non-injectivity: the putative obstacles are a collection of input conditions that are supposed to be overcome by the goal-seeking agent."

      -->Agreed.

      "All I am claiming is: goal-directed behavior does not exist without an observer that tends to see things in a very particular way: as non-injective mappings."

      --> I agree again on that. I think the observer systems I have described in this essay are of just that kind: they tend to associate goals with their sensory inputs.

      "The goal is not intrinsic to nature, it is a way of being seen."

      --> Yes, you are completely right about that. Goals are purely subjective.

      "when an observer learns a one-to-one mapping, arrogating agency does not make much sense, because there is no entropy loss in the mapping. The process of learning the mapping, though, can be arrogated with agency: the observer tends to learn. But here there is a meta-observer observing the first observer, right? It is the learning observer that may be arrogated with agency, not the lower-level system under study."

      --> I see your point but I need more time to think about this.

      Thanks again for a delightful exchange. Please let me know if you have other questions/comments related to the idea in general. I would be happy to answer.

      Cheers

      Natesh

      Hi Dear Ganesh

      I have read your article (as we usually put it!) and I will simply tell you that I am somewhat skeptical about its possible success. I see that your approach is presented with a clear logical flow, but I am skeptical because it rests on hypotheses. Maybe you are quite right, but who can say so certainly and definitively today? Thus, your essay seems to me to be interesting ideas presented in a nice form and with impressive narration. I hope you can understand my point (and maybe come to agree with me somewhat!) if you find time to check my work. Then we can continue the conversation on my page, if you see that we may have some common views.

      Best regards

        Hi George,

        Thank you for your comments. Skepticism is good and an important quality of a good scientist. I welcome it and your criticism, for it will help me grow as a researcher. As a PhD student with deadlines, I am a little busy but I will have a chance to read your work slowly and in detail over the weekend. I shall get back to you once I have understood what you have to say in your submission. Thanks.

        Natesh

        Hi Shaikh,

        Thank you for your comments.

        "How matter learns?"

        --> This is what I address in section 2 of my submission. I think that minimally dissipative systems naturally and necessarily learn in an unsupervised manner. There are hints in the derivations of how reinforcement and supervised learning will be covered as well.

        "Can we say a cyclone is a goal-directed system?"

        --> Good question! We have to differentiate between us ascribing a goal to a cyclone and the cyclone having a goal for itself and a sense of agency. That is an important distinction. We can of course project one of our goals onto a cyclone, but unless the cyclone is a minimally dissipative system (with a hierarchical implementation), which it is not, it does not have a sense of its own agency, goal-oriented or not. I would recommend reading Dan Bruiger's submission here, which makes this distinction between teleology and teleonomy very clearly.

        "There is periodicity of want-and-hunt/intentions in every living being, how that can be designed?"

        --> I do not know the answer to that yet, that is, how to build systems like the ones I describe in my submission. I am not even sure if 'design' is the right way to go about it. All the systems that I refer to have been self-organized. Perhaps we should look to creating the conditions/constraints for such systems to emerge and let physical law do its thing. This is how I imagine we would achieve the new way of computing, 'thermodynamic computing', that is being theorized about.

        Hope I have satisfactorily answered your questions.

        Cheers

        Natesh

        Dear Natesh,

        I very much enjoyed reading your most impressive essay. Since you have read mine and commented, I will look at possible correlations, based on the assumption that one of us is actually representative of reality. In fact, even if my essay is correct about the universal nature of awareness, your model may well 'simulate' awareness and may describe realistic governing constraints on the dynamics of learning. For example, you model a sense of agency as "the awareness of an action being performed as it is being performed." This is compatible with your definition of agency as "pre-reflective subjective awareness that one is initiating, executing, and controlling one's own volitional actions in the world." The key word is of course 'subjective', and that is the great question underlying whether or not the singularity is possible.

        Let me first say that the qualms I have about quantum mechanics are based on common interpretations of physical reality. I have no problem at all with your usage of QM in your essay. It is interesting, however, that Crooks' fluctuation theorem of non-equilibrium thermodynamics is essentially a classical, not a quantum, analysis. Part of this, I believe, is that work is not an observable in quantum mechanics, and the relevant work averages are given by time-ordered correlation functions of the exponentiated Hamiltonian rather than by expectation values of an operator representing the work as a pretended observable. [Talkner, Lutz, and Hanggi] I'm not familiar enough with England's approach, but from what you present of it, it appears to be essentially classical.
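        For concreteness, the classical statement of Crooks' theorem relates the work distribution of the forward driving protocol to that of its time reverse,

            \frac{P_F(+W)}{P_R(-W)} = e^{\beta (W - \Delta F)},

        with \beta = 1/k_B T and \Delta F the equilibrium free-energy difference; nothing in this relation requires treating work as a quantum observable.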

        Although I did not elaborate in my essay, I have in response to questions on my page noted that the field as I hypothesize it senses (and affects) momentum density, and this is very relevant. One could say to me: "You claim that the consciousness field interacts with ions in axons and vesicles flowing across synaptic gaps. Why then would not the same field interact with electrons flowing in circuitry, since the momentum density of an electron is greater than that of an ion or a vesicle?"

        An excellent question. Part of the answer is that the charge-to-mass ratio of ions and vesicles makes them less susceptible to EM fields. But the key answer is that momentum density flow in the brain (and even the blood) is in 3-D and the consciousness field exists in 3-D, and our subjective awareness of 3-D is very strongly linked to these facts. Current circuitry (see my paper FPGA Programming: step-by-step) is 2-D, and even the 2-D arrangements of circuits are designed to optimize timing. There is no spatial aspect to computer circuitry, of the sort we find in the brain. If (and it's a big if) we ever reach the point where circuitry (say a nanotube network) could span the 3-D volume (with suitable I/O: see FPGA Design from the Outside In) then I would think it might be possible that a 'super brain' could be built, but this is contingent on the existence of the consciousness field as the seat of awareness! Doing without the field and without 3-D (as opposed to computations of 3-D) is one heck of a task.

        In addition to the work I've done on pattern recognition and learning (hinted at in my endnotes), I also covered Stephen Grossberg's mathematical model of neural circuits [The Automatic Theory of Physics (my ref. 5)]. I hope you are so close to finishing your PhD that you have no use for any of this information, but, given your familiarity with my microprocessor systems design, you would at least find the material readable, and perhaps even a source of ideas. I hope this discussion stimulates useful thoughts for you.

        I would be very surprised if your essay does not win one of the prizes. It is an exceptional essay, and I wish you well in this field.

        My very best regards,

        Edwin Eugene Klingman

          Dear Natesh Ganesh,

          Thank you for your beautifully written and rigorously argued essay. I agree that your "minimal dissipation hypothesis" is a very good indicator of intent and that goal-directed agency emerges from, as you put it, systems that dissipate minimally.

          Just as a quick question, have you followed some of Charlie Bennett's work on computational efficiency from thermodynamic considerations? I had the privilege of spending some time with him about a decade ago and found him to be a great source of talent and inspiration in that regard.

          Good luck in the contest, I have rated your essay and thoroughly enjoyed reading it.

          Regards,

          Robert

            Dear Edwin,

            Sorry for the delayed response. I had a conference paper deadline and finally have some time to myself. Thank you for your detailed response and encouraging comments. It fills me with greater confidence to keep working hard at a solution.

            I really enjoyed your discussion of the effect of the consciousness field on electrons vs ions. It is an interesting point you make. Some colleagues of mine in the department are working on ion-based memristor devices, which might actually serve as a better substrate for interacting with a consciousness field than an electronic device. Furthermore, I completely agree with you about a 3D structure rather than the traditional 2D architecture. I too am convinced that any system capable of a comparable consciousness should have some kind of 3D structure. Interestingly, I am in discussion with them about possibly constructing a 3D array of sorts with these ionic memristors, with the type of constraints that I talk about in my essay (if we figure out how to impose them), and just letting it run in an input environment to see what it does. It should be very interesting, I think.

            I am about 6-7 months from finishing and in full writing mode, but I will definitely take a look at the resources you mentioned (especially the ones on pattern recognition). One can never learn enough and I am sure they will provide some new insight for me. Thanks.

            Natesh

            PS: I finally got around to rating your essay. I would appreciate it if you rate mine, if you haven't already. If you have already, thank you very much!

            Dear Robert,

            Thank you for your encouraging reply. I am happy to hear that you liked my submission. Yes, I do think that "minimal dissipation" might provide a sufficient condition for the emergence of goal-oriented agency.

            Yes, I have come across Bennett's work! I think he has been one of the most influential thinkers of our time!! I study the fundamental thermodynamic limits of computing as part of my dissertation and I often use the works of Landauer and Bennett. I also like his work on reversible computing, and am hoping the field will gain more momentum. My favorite paper of his is "Dissipation-error tradeoff in proofreading." (Apologies for my long-winded rant.)

            Good luck on the contest. I will definitely take a look at your submission. Thanks.

            Natesh

            PS: My title is actually a play on the title of Landauer's famous paper "Information is Physical".

            Dear Natesh,

            I have now rated you (10). Past experience has indicated that there may be turbulence in the final hours, so I had planned to hold off to help you then, but perhaps increased visibility will help now. Some earlier essays that I pushed up for visibility were immediately given '1's by whatever trolls lurk in low places.

            The final decisions are made by FQXi judges, and I think they will judge your work well.

            I am very glad that you agree about the 3-D structure. What you say about ionic memristors is very interesting! I'm glad to hear this. I hope we stay in touch.

            Best,

            Edwin Eugene Klingman

            Dear Natesh --

            Let me ask a very basic question. Say I take a simple Newtonian system, two planets orbiting around each other.

            I hit one with a rock, and thereby change the orbital parameters. There's a map from the parameters that describe the incoming rock to the resulting shift in the system. The system appears to have "learned" something about the environment with minimal (in fact, zero) dissipation.

            If I let the rock bounce off elastically, then there is strictly no change in entropy. I could probably arrange the environment in such a way that the system would show decreasing amounts of change in response to rocks flying in, at random times, from a particular direction. In general, there will be nice correlations between the two systems.

            Why is this open system not an inferential agent?

            I suppose I'm trying to get a sense of where the magic enters for you. I think you're cashing out efficiency in terms of KL distance between "predictor" at t and world at time t+1, presumably with some mapping to determine which states are to correspond. This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail. Perhaps because the notion of computational complexity doesn't appear?
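            As a toy illustration of the kind of quantity I have in mind (the distributions and the state correspondence below are entirely made up), a small Python sketch:

            from math import log2

            def kl_divergence(p, q):
                # D_KL(p || q) in bits; assumes q > 0 wherever p > 0.
                return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

            # Hypothetical distributions: the world's state at time t+1 and the
            # "predictor" at time t, after some chosen mapping between their states.
            world_t_plus_1 = [0.7, 0.2, 0.1]
            predictor_t = [0.6, 0.3, 0.1]

            print(kl_divergence(world_t_plus_1, predictor_t))  # ~0.04 bits: a close match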

            Thank you for a stimulating read. It's a pleasure to see the Friston work cited alongside (e.g.) Jeremy England.

            Yours,

            Simon

              Dear Simon,

              Thank you for your comments and questions. This is a nice coincidence: I just finished reading about the Borgesian library and am currently on section 2, "the physics of the gap". It is a great piece of writing, and I will reach out on your page once I am done reading, re-reading and digesting it.

              "Why is this open system not an inferential agent?"

              --> Yes, it technically is, for that very particular environment providing those particular input signals. If those planets saw ONLY the type of conditions that allowed the system to maintain a specific macrostate at minimal dissipation, then we might have to entertain the possibility that it is an inferential agent in that environment. In section 2 of my submission, I introduced the notion of a link between minimal dissipation and learning. I added section 4 not only to show the link to England's work, but also to explain why we should focus on systems that are minimally dissipative for all the input signals from their environment that they might encounter as they maintain their macrostate. For example, if we considered a system that was minimally dissipative for one input signal but not the rest, I would think that system is not an inferential agent, unless the probability of that particular signal goes to 1.

              "This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail. Perhaps because the notion of computation complexity doesn't appear?"

              --> Can you please give me a simple example? I don't seem to be following here. If you mean that it is possible to construct simple cases of systems that are minimally dissipative in a particular environment and do not learn anything, my first guess is that such a system does not possess sufficient complexity to do so, hence not satisfying that constraint of the hypothesis. After all, there are long stretches of blissful, dreamless, unconscious sleep in which we don't learn or infer anything either, which would be explained by changes to our computational complexity while maintaining minimal dissipation.

              On a side note, I do wonder whether our finite computational complexity, if our brain is indeed a minimally dissipative system, might serve to explain why there are some computational problems that our brain simply cannot solve by itself.

              I agree that both Friston's and England's works are very influential, and they drove me to look for a link between the two. Hopefully I have satisfactorily answered your great questions. If I have not, please let me know and I will take another crack at it.

              Cheers

              Natesh

              PS: I am continuing to work on an updated version of the essay to better clarify and explain myself without the constraints of a word limit. The questions you have asked are very useful, and I will include explanations in that version to better address them.

              Dear Edwin,

              Thank you for your kind rating. Yes, I agree with you about the sad trolling that has been going on, which I fear is hurting the contest overall. I was hit with 5 consecutive 1's without any feedback, which sent my essay into freefall and left me disheartened earlier. Hopefully I will have the opportunity to have the work judged by the FQXi panel. Good luck in the contest; I would very much like to stay in touch. Thanks.

              Natesh

              Dear Natesh -- thank you for your very thoughtful response.

              You asked me about this remark:

              "I think you're cashing out efficiency in terms of KL distance between "predictor" at t and world at time t+1, presumably with some mapping to determine which states are to correspond. This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail."

              Saying "Can you please give me a simple example."

              So an example would be running two deterministic systems, with identical initial conditions, and with one started a second after the first. The first machine would be a fantastic predictor and learner. There is correlation, but some kind of causal connection, once the initial conditions are fixed, is missing from the pair. Minimally dissipative.
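              A back-of-the-envelope sketch of what I mean (the update rule is just a stand-in; any fixed deterministic rule would do):

              def step(x):
                  # Some fixed deterministic update rule (a logistic map as a stand-in).
                  return 3.7 * x * (1.0 - x)

              x0 = 0.42            # shared initial condition
              A = [x0]             # the first system
              B = [None, x0]       # an identical system, started one step later

              for t in range(10):
                  A.append(step(A[-1]))
                  B.append(step(B[-1]))

              # A at time t exactly matches B at time t+1: a "perfect predictor"
              # with zero dissipation and no causal coupling between the two systems.
              print(all(A[t] == B[t + 1] for t in range(10)))  # True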

              Another example (more complicated, but it works for probabilistic/non-deterministic evolution) would be the Waterfall (or Wordstar) problems. With a lot of work, I can create a map between any two systems. It might require strange disjunctive unions of things ("System 1 State A corresponds to System 2 State B at time t, C at time t+1, W or X at time t+2...") and be very hard to compute, but it's there. I'm not sure how dissipative the two could be, but my guess is that it's hard to rule out the possibility that the coarse-grained state spaces the maps imply could have low dissipation.

              (Scott Aaronson has a nice piece on computational complexity and Waterfall problems: http://www.scottaaronson.com/papers/philos.pdf)

              You see a version of this in the ways in which deep learning algorithms are able to do amazing prediction/classification tasks. System 1, it turns out, with a lot of calculation and work, really does predict System 2. But if System 1 is the X-ray image of an aircraft part and System 2 is in-flight airplane performance, does it really make sense to say that System 1 has "learned", or is inferring, or doing anything agent-like? Really, the effort is in the map-maker.

              Yours,

              Simon