Hi George,

Thank you for your comments. Skepticism is an important quality of a good scientist. I welcome it and your criticism, for it will help me grow as a researcher. As a PhD student with deadlines I am a little busy, but I will have a chance to read your work slowly and in detail over the weekend. I shall get back to you once I have understood what you have to say in your submission. Thanks.

Natesh

Hi Shaikh,

Thank you for your comments.

"How matter learns?"

--> This is what I address in section 2 of my submission. I think that minimally dissipative systems necessarily learn in an unsupervised manner. The derivations also hint at how reinforcement and supervised learning can be covered.

"Can we say a cyclone is a goal-directed system?"

--> Good question! We have to differentiate between us ascribing a goal to a cyclone and the cyclone having a goal for itself, along with a sense of agency. That is an important distinction. We can of course project a goal from us onto a cyclone, but unless the cyclone is a minimally dissipative system (with a hierarchical implementation), which it is not, it does not have a sense of its own agency, goal-directed or not. I would recommend reading Dan Bruiger's submission here, which makes this distinction between teleology and teleonomy very clearly.

"There is periodicity of want-and-hunt/intentions in every living being, how that can be designed?"

--> I do not yet know how to make systems like the ones I describe in my submission. I am not even sure if 'design' is the right way to go about it. All the systems that I refer to are self-organized. Perhaps we should look to creating the conditions/constraints for such systems to emerge and let physical law do its thing. This is how I imagine we would achieve the new way of computing, 'thermodynamic computing', that is being theorized about.

Hope I have satisfactorily answered your questions.

Cheers

Natesh

Dear Natesh,

I very much enjoyed reading your most impressive essay. Since you have read mine and commented, I will look at possible correlations, based on the assumption that one of us is actually representative of reality. In fact, even if my essay is correct about the universal nature of awareness, your model may well 'simulate' awareness and may describe realistic governing constraints on the dynamics of learning. For example, you model a sense of agency as "the awareness of an action being performed as it is being performed." This is compatible with your definition of agency as "pre-reflective subjective awareness that one is initiating, executing, and controlling one's own volitional actions in the world." The key word is of course 'subjective', and that is the great question underlying whether or not the singularity is possible.

Let me first say that the qualms I have about quantum mechanics are based on common interpretations of physical reality. I have no problem at all with your usage of QM in your essay. It is interesting, however, that Crooks' fluctuation theorem of non-equilibrium thermodynamics is essentially a classical, not a quantum, analysis. Part of this, I believe, is that work is not an observable in quantum mechanics, and the relevant work averages are given by time-ordered correlation functions of the exponentiated Hamiltonian rather than by expectation values of an operator representing the work as a pretended observable. [Talkner, Lutz, and Hanggi] I'm not familiar enough with England's approach, but from what you present of it, it appears to be essentially classical.

Although I did not elaborate in my essay, I have in response to questions on my page noted that the field as I hypothesize it senses (and affects) momentum density, and this is very relevant. One could say to me: "You claim that the consciousness field interacts with ions in axons and vesicles flowing across synaptic gaps. Why then would not the same field interact with electrons flowing in circuitry, since the momentum density of an electron is greater than that of an ion or a vesicle?"

An excellent question. Part of the answer is that the charge-to-mass ratio of ions and vesicles makes them less susceptible to EM fields. But the key answer is that momentum density flow in the brain (and even the blood) is in 3-D and the consciousness field exists in 3-D, and our subjective awareness of 3-D is very strongly linked to these facts. Current circuitry (see my paper FPGA Programming: step-by-step) is 2-D, and even the 2-D arrangements of circuits are designed to optimize timing. There is no spatial aspect to computer circuitry, of the sort we find in the brain. If (and it's a big if) we ever reach the point where circuitry (say a nanotube network) could span the 3-D volume (with suitable I/O: see FPGA Design from the Outside In) then I would think it might be possible that a 'super brain' could be built, but this is contingent on the existence of the consciousness field as the seat of awareness! Doing without the field and without 3-D (as opposed to computations of 3-D) is one heck of a task.

In addition to the work I've done on pattern recognition and learning (hinted at in my endnotes), I also covered Steven Grossberg's mathematical model of neural circuits [The Automatic Theory of Physics (my ref. 5)]. I hope you are so close to finishing your PhD that you have no use for any of this information, but given your familiarity with my microprocessor systems design, you would at least find the info readable, and perhaps even a source of ideas. I hope this discussion stimulates useful thoughts for you.

I would be very surprised if your essay does not win one of the prizes. It is an exceptional essay, and I wish you well in this field.

My very best regards,

Edwin Eugene Klingman

    Dear Natesh Ganesh,

    Thank you for your beautifully written, and rigorously argued essay. I agree that your "minimal dissipation hypothesis" is a very good indicator of intent and that goal-directed agency emerges from, as you put it, systems that dissipate minimally.

    Just as a quick question, have you followed some of Charlie Bennett's work on computational efficiency from thermodynamic considerations? I had the privilege of spending some time with him about a decade ago and found him to be a great source of insight and inspiration in that regard.

    Good luck in the contest; I have rated your essay and thoroughly enjoyed reading it.

    Regards,

    Robert

      Dear Edwin,

      Sorry for the delayed response. I had a conference paper deadline and finally have some time to myself. Thank you for your detailed response and encouraging comments. It fills me with greater confidence to keep working harder at a solution.

      I really enjoyed your discussion on the effect of the consciousness field on electrons vs ions. It is an interesting point you make. Some colleagues of mine in the department are working on ion-based memristor devices, which might actually serve as a better substrate to interact with a consciousness field than an electronic device. Furthermore, I completely agree with you on the need for a 3D structure rather than the traditional 2D architecture. I too am convinced that any system capable of a comparable consciousness should have some kind of 3D structure. Interestingly, I am in discussions with them about possibly constructing a 3D array of sorts with these ionic memristors, with the type of constraints that I talk about in my essay (if we figure out how to impose them), and just letting it run in an input environment to see what it does. Should be very interesting, I think.

      I am about 6-7 months from finishing and in full writing mode, but I will definitely take a look at the resources you mentioned (especially the ones on pattern recognition). One can never learn enough and I am sure they will provide some new insight for me. Thanks.

      Natesh

      PS: I finally got around to rating your essay. I would appreciate it if you rate mine, if you haven't already. If you have already, thank you very much!

      Dear Robert,

      Thank you for your encouraging reply. I am happy to hear that you liked my submission. Yes, I do think that "minimal dissipation" might provide a sufficient condition for the emergence of goal-oriented agency.

      Yes, I have come across Bennett's work! I think he has been one of the most influential thinkers of our time!! I study the fundamental thermodynamic limits to computing as part of my dissertation and I often use the works of Landauer and Bennett. I also like his work on reversible computing, and am hoping the field will gain more momentum. My favorite paper of his is "Dissipation-error tradeoff in proofreading." (Apologies for my long-winded rant.)

      Good luck on the contest. I will definitely take a look at your submission. Thanks.

      Natesh

      PS: My title is actually a play on words on Landauer's famous paper "Information is Physical".

      Dear Natesh,

      I have now rated you (10). Past experience has indicated that there may be turbulence in the final hours, so I had planned to hold off to help you then, but perhaps increased visibility will help now. Some earlier essays that I pushed up for visibility were immediately given '1's by whatever trolls lurk in low places.

      The final decisions are made by FQXi judges, and I think they will judge your work well.

      I am very glad that you agree about the 3-D structure. What you say about ionic memristors is very interesting! I'm glad to hear this. I hope we stay in touch.

      Best,

      Edwin Eugene Klingman

      Dear Natesh --

      Let me ask a very basic question. Say I take a simple Newtonian system, two planets orbiting around each other.

      I hit one with a rock, and thereby change the orbital parameters. There's a map from the parameters that describe the incoming rock to the resulting shift in the system. The system appears to have "learned" something about the environment with minimal (in fact, zero) dissipation.

      If I let the rock bounce off elastically, then there is strictly no change in entropy. I could probably arrange the environment in such a way that the system would show decreasing amounts of change in response to rocks flying in at random times from a particular direction. In general, there will be nice correlations between the two systems.

      Why is this open system not an inferential agent?

      I suppose I'm trying to get a sense of where the magic enters for you. I think you're cashing out efficiency in terms of KL distance between "predictor" at t and world at time t+1, presumably with some mapping to determine which states are to correspond. This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail. Perhaps because the notion of computational complexity doesn't appear?

      Thank you for a stimulating read. It's a pleasure to see the Friston work cited alongside (e.g.) Jeremy England.

      Yours,

      Simon

        Dear Simon,

        Thank you for your comments and questions. This is a nice coincidence. I just finished reading about the Borgesian library and am currently on section 2, "the physics of the gap". It is a great piece of writing and I will reach out on your page once I am done reading, re-reading, and digesting it.

        "Why is this open system not an inferential agent?"

        --> Yes, it technically is, for that very particular environment providing those particular input signals. If those planets saw ONLY the type of conditions that allowed them to maintain a specific macrostate at minimal dissipation, then we might have to entertain the possibility that the system is an inferential agent in that environment. In section 2 of my submission, I introduced the notion of the link between minimal dissipation and learning. I added section 4 not only to show the link to England's work, but also to explain why we should focus on systems that are minimally dissipative for all the input signals they might encounter from their environment as they maintain their macrostate. For example, if we thought about a system that was minimally dissipative for one input signal but not for the rest, I would think that system is not an inferential agent, unless the probability of that particular signal goes to 1.
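        To put that last point in symbols (purely illustrative notation, not the notation of the essay): if the expected dissipation over the input distribution is <D> = sum_x p(x) D(x), then making D(x*) small for a single input x* barely lowers <D> unless p(x*) -> 1, which is why the hypothesis has to hold over the whole distribution of inputs the system encounters rather than for any one signal.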

        "This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail. Perhaps because the notion of computation complexity doesn't appear?"

        --> Can you please give me a simple example? I don't seem to be following here. If you mean that it is possible to construct simple cases of systems that are minimally dissipative in a particular environment and do not learn anything, my first guess is that such a system does not possess sufficient complexity to do so, hence not satisfying that constraint of the hypothesis. After all, there are long periods of blissful, dreamless, unconscious sleep where we don't learn or infer anything either, which would be explained by changes to our computational complexity while maintaining minimal dissipation.

        On a side note, I do wonder whether our finite computational complexity, if our brain is indeed a minimally dissipative system, might serve to explain why there are some computational problems that our brain simply cannot solve by itself.

        I agree that both Friston's and England's works are very influential, and they drove me to look for a link between the two. Hopefully I have satisfactorily answered your great questions. If I have not, please let me know and I will take another crack at it.

        Cheers

        Natesh

        PS: I am continuing to work on an updated version of the essay to better clarify and explain myself without the constraints of a word limit. The questions you have asked are very useful, and I will include explanations in that version to better address them.

        Dear Edwin,

        Thank you for your kind rating. Yes, I agree with you about the sad trolling that has been going on, which I fear is hurting the contest overall. I was hit with 5 consecutive 1's without any feedback, which sent my essay into freefall and left me disheartened earlier. Hopefully I will have the opportunity to have the work judged by the FQXi panel. Good luck on the contest and I would very much like to stay in touch. Thanks.

        Natesh

        Dear Natesh -- thank you for your very thoughtful response.

        You asked me about this remark:

        "I think you're cashing out efficiency in terms of KL distance between "predictor" at t and world at time t+1, presumably with some mapping to determine which states are to correspond. This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail."

        Saying "Can you please give me a simple example."

        So an example would be running two deterministic systems, with identical initial conditions, and with one started a second after the first. The first machine would be a fantastic predictor and learner. There's correlation there, but some kind of causal connection, once initial conditions are fixed, is missing from the pair. Minimally dissipative.

        Another example (more complicated, but works for probabilistic/non-deterministic evolution) would be the Waterfall (or Wordstar) problems. With a lot of work, I can create a map between any two systems. It might require strange disjunctive unions of things ("System 1 State A corresponds to System 2 State B at time t, C at time t+1, W or X at time t+2...") and be very hard to compute, but it's there. I'm not sure how dissipative the two could be, but my guess is that it's hard to rule out the possibility that the coarse-grained state spaces the maps imply could have low dissipation.

        (Scott Aaronson has a nice piece on computational complexity and Waterfall problems--http://www.scottaaronson.com/papers/philos.pdf)

        You see a version of this in the ways in which deep learning algorithms are able to do amazing prediction/classification tasks. System 1, it turns out, with a lot of calculations and work, really does predict System 2. But if System 1 is the X-ray image of an aircraft part and System 2 is in-flight airplane performance, does it really make sense to say that System 1 has "learned", or is inferring, or doing anything agent-like? Really the effort is in the map-maker.

        Yours,

        Simon

        Dear Simon,

        I address your comments/questions below:

        "So an example would be running two deterministic systems, with identical initial conditions, and with one started a second after the first. The first machine would be a fantastic predictor and learner. There there's correlation, but some kind of causal connection, once initial conditions are fixed, is missing from the pair. Minimally dissipative."

        --> Please bear with me, but I take my time in understanding all the tiny details completely. Correct me if I am wrongly characterizing what you are saying: if the two systems are run in the manner that you describe, are you saying that the joint system is minimally dissipative, or just the second system? If the joint system is minimally dissipative, then the correlation between the two would be plastic, as expected. I mention this at the start of section 5, where I discuss how subsystem relationships should be plastic if the joint system is minimally dissipative. The correlation will hence vary depending upon the input provided. Does that answer your point?

        "Another example (more complicated, but works for proabilistic/non-deterministic evolution) would be the Waterfall (or Wordstar) problems."

        --> Let me get back to you on this once I have a firmer grasp on what these problems are exactly. I remember reading about them on Aaronson's blog a while ago and I need to revisit it. Thank you for that particular link. I am an avid fan of his blog and work, and the updated version of the essay has references to his blog post on Integrated Information Theory.

        "You see a version of this in the ways in which deep learning algorithms are able to do amazing prediction/classification tasks. System 1 it turns out, with a lot of calculations and work, really does predict System 2. But if System 1 is the X-ray image of an aircraft part and System 2 is in-flight airplane performance, does it really make sense to say that System 1 has "learned", or is inferring, or doing anything agent-like? Really the effort is in the map-maker."

        --> I agree that while deep learning networks learn in a manner similar to us, there are large differences between us and deep learning algorithms. Along the lines of John Searle's Chinese room argument, I would argue that such algorithms are only syntactical and there are no semantics there. Furthermore, running such algorithms on von Neumann architecture GPUs (as they traditionally are) means these are not minimally dissipative systems. I think plastic subsystem connections are needed for any system to be minimally dissipative, and the von Neumann architecture does not have that. If we went to systems with a neuromorphic architecture, then it becomes a lot more interesting, I think.

        I agree with you that the effort is really with the map-making, and this is why I am very interested in unsupervised learning with an array of devices called memristors (look for Prof. Yang's group at UMass Amherst; they are doing cool things like this). Short of starting with an artificial primordial soup and evolving/self-organizing an artificial brain on silicon in an accelerated manner, I think such an approach is the best way to test my ideas and build an agent remotely close to us (since we know some things about the final product, aka our brain, we can cheat and start with an array of memristors, since they can behave as neurons and synapses; how to impose other thermodynamic constraints on this array is something I am thinking about now). We just set up the array of physical devices without any preprogramming or map making, let it run and supply it with inputs, and it is allowed to make its own maps and provide outputs. If such a system is able to answer questions about flight performance based on the X-ray image of the airplane, I think a) that would be amazing and b) we would have to seriously entertain the possibility that it is an agent like us (I am not touching the question of whether such an agent is conscious or not with a 10-foot pole, haha).

        I hope I didn't miss anything and have answered your questions. Let me know if I need to further clarify anything.

        Cheers

        Natesh

        PS: In all of this, I think I might have to seriously step back and see if there is some fundamental difference between self-organized systems and those systems which are designed by another 'intelligent' system, and whether that changes things.

        Hi Natesh,

        The posts in this blog are as interesting a conversation as any in this contest. In particular, your conversation with Ines Samengo is most interesting. More on that in a moment.

        The wording of FQXi.org's contest is nebulous, unless you realize it is about the MUH of Tegmark. Tegmark's emphasis is about Mathematics. Landauer's emphasis is about Information. Your emphasis is about Intention. My emphasis is about how we choose. I would make a hierarchy as shown below:

        "Mathematics is Physical"...........Tegmark

        "Information is Physical"..............Landauer

        "Intention is Physical"..................Ganesh

        "Choice (intention from a personal viewpoint) is Physical (maybe), but we can never know it"......Limuti

        I did read your essay and honestly I had trouble following it (I did however spot the insulated gate mosfet structures :))

        The image your essay triggered in me was Valentino Braitenberg's book "Vehicles: Experiments in Synthetic Psychology". It is easy to make the vehicles look as if they had "emergent" goals.

        Non-equilibrium thermodynamics as treated by you and Ines was interesting. Ines brought out the memory clearing needed by Maxwell's demon to control the entropy (I think I got that right). Perhaps this memory clearing is why we can't know how we choose. For example, move your finger. How did you do that? Do not point to MRIs or brain function. I maintain that you have no direct experiential record (knowledge or memory) of how you moved your finger. I believe the answer is that you moved your finger, but you do not know directly how you did that. Was Maxwell's demon involved? I know this is a bit esoteric, but I would like to know what you think.

        In my essay I hoped to get across how convoluted the language of determinism and free will is. Don and Lexi each took a side. However, each also unconsciously used the other viewpoint during the conversation.

        You forced me to think....minor miracle. Therefore this is a super essay!

        Thanks,

        Don Limuti

          Dear Natesh --

          It's fun to go back and forth on this.

          If the time-delayed system is indeed learning according to your scheme, this seems to be a problem for your scheme. Two independently-evolving systems should not be described as one "learning" the other. Of course, it is a limit case, perhaps most useful for pointing out what might be missing in the story, rather than claiming there's something bad about the story.

          The machine learning case provides a different challenge, I think. You seem to agree that the real difficulty is contained in the map-making. But then this makes the prediction/learning story hard to get going without an external goal for the map-maker. Remember, without some attention to the mapping problem, the example of X-ray images predicting in-flight behavior implies that the X-ray images themselves are predicting/learning/in goal directed relationship to the in-flight behavior; not the algorithm, which is just the discovery of a mapping. More colloquially, when my computer makes a prediction, I have to know how to read it off the screen (printout, graph, alarm bell sequence). Without knowledge of the code (learning or discovered post hoc) the prediction is in theory only.

          You write, "In all of this I think I might have to seriously step back and see if there is some fundamental difference between self-organized systems and those systems which are designed by another 'intelligent' systems, and if that changes things." I think that might be the main point of difference. I'm happy to use the stories you tell to determine whether an engineered system is doing something, and this seems like a really interesting criterion. Yet I'm just not sure how to use your prescriptions in the absence of (for example) a pre-specified agent who has desires and needs satisfied by the prediction.

          Thank you again for a provocative and interesting essay.

          Yours,

          Simon

          Hi Don,

          Thank you for your very kind comments. I am glad to see that you liked the essay. Ines's work was outstanding and it was very insightful to discuss ideas with her.

          "The image your essay triggered in me was Valentino Braitenber's book "Vehicles, Experiments in Synthetic Psychology". It is easy to make the vehicles look as if they had "emergent" goals. "

          --> I will check this book out.

          "In my essay I hoped to get across how convoluted the language of determinism and freewill is. Don and Lexi each took a side. However, each also used Unconsciously the other viewpoint during the conversation."

          --> Ha!! Wonderful. I did not immediately get that but it adds much more to your submission. Thanks.

          Cheers

          Natesh

          Dear Simon,

          "It's fun to go back and forth on this."

          -->Agreed.

          "If the time-delayed system is indeed learning according to your scheme, this seems to be a problem for your scheme. Two independently-evolving systems should not be described as one "learning" the other."

          --> I think I misunderstood the problem you had presented (a simple case of lost in translation, I guess). If the two systems are evolving independently and there are no inputs being presented to either one of them, then I am not sure what it is that they can learn in the first place. But then again, if this is a limiting case of no inputs at all, then I must think about this further. Since my derivations start with the assumption that there are some external inputs affecting the physical system in question, I would say that a system just evolving without being affected by external inputs and dissipating minimally is not learning anything. This is further captured by the fact that the mutual information complexity measure can serve as a measure of memory/history in the system.

          "Remember, without some attention to the mapping problem, the example of X-ray images predicting in-flight behavior implies that the X-ray images themselves are predicting/learning/in goal directed relationship to the in-flight behavior; not the algorithm, which is just the discovery of a mapping."

          --> I agree that the X-ray image itself cannot predict, but a minimally dissipative system which is presented with the X-ray image as an input that affects its state transition might be capable of learning and predicting from the input image.

          "Yet I'm just not sure how to use your prescriptions in the absence of (for example) a pre-specified agent who has desires and needs satisfied by the prediction."

          --> I argue that my constraints specify which systems could be goal-oriented agents in the first place, and that goals and desires are created and evolve as such systems interact with their input environment.

          "Thank you again for a provocative and interesting essay."

          --> Thanks for a very stimulating discussion. I am pretty convinced that I should rename the minimal dissipation hypothesis to something like the "dissipation-complexity" tradeoff principle to reduce the confusion.

          Cheers

          Natesh

          Dear Natesh --

          I'm just finishing up an article on learning and thermodynamic efficiency (using the Still et al. framework of driving), so I think my head's full of a set of ideas that are competing and overlapping with your insights here. To be clear, I think this is a fantastic piece, and one of the most provocative in a (very good) bunch.

          I hope we see more cross-over work at the interface of the origin of life, thermodynamics, and machine learning, and I encourage you to publish a version of this in a journal (you might consider Phys. Rev., or perhaps the journal Entropy).

          Yours,

          Simon

          Dear Natesh,

          thanks for your kind comments on my page, which led me to your interesting essay.

          I'm afraid, you lose me on page 2. What is [math]\mathcal{S}\text{?}[/math] A Hilbert space? What are the [math]\sigma\text{?}[/math] A basis for this Hilbert space? Similarly, what are [math]\mathcal{R}[/math] and the [math]\hat{x}\text{?}[/math] What does [math]\mathcal{R}_0\mathcal{R}_1[/math] mean? Is that some kind of product? The transition mappings [math]\mathcal{L}\text{,}[/math] are they unitary, stochastic, or...? You write that some time evolution is governed by a Schrödinger equation, what's the corresponding Hamiltonian? How is this Hamiltonian related to the [math]\mathcal{L}\text{?}[/math]

          Or maybe we can go back one step, away from the technical details: What does it mean that a system has "constraints on its finite complexity"? And can I think of dissipation as energy transfer from the system to the heat bath?

          Sorry for so many questions; I just feel I can't get the message when I don't even understand the terminology on the first few pages.

          Cheers, Stefan

          PS: Sorry for the rendering - I don't know how to do inline math here. Each equation-tag causes a linebreak. :-(

            Hi Stefan,

            No problem at all. I had the same problem and pretty much gave up on using LaTeX in this forum :D. Given the word limit, I could not get into explaining all the terms you listed in detail, but here is a paper with all the details: Ganesh, Natesh, and Neal G. Anderson. "Irreversibility and dissipation in finite-state automata." Physics Letters A 377.45 (2013): 3266-3271. Let me know if you are having trouble accessing it.

            The paper was written for deterministic automata but the extensions to stochastic mappings hold. The entire universe of referent-system-bath evolves unitarily, but the system evolution can be non-unitary (and probably is). The short version: S is the system in which the FSA is instantiated, with states \sigma. R = R0R1 is the joint system of past inputs R0 and present input R1, with x being a string drawn from that distribution of inputs (in the classical case, all of these are essentially random variables). L is the transition mapping, for which the corresponding Hamiltonian of the global joint system can be constructed so as to achieve the necessary state transition.

            "What does it mean that a system has constraints on its finite complexity"?

            --> If the complexity of the system can be captured by a mutual information measure, then a finite-state automaton with a finite number of states can only have finite complexity. When we optimize one variable while keeping another condition constant, we call it constrained optimization, and we call that condition a constraint.
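            To make the finite-complexity point concrete, here is a rough toy sketch in Python (my own illustration, not the formalism or notation of the Physics Letters A paper): a two-state automaton that simply latches its last input bit. The mutual information between its state and the most recent input, estimated from samples, comes out to 1 bit, which is the maximum a two-state machine can hold, so the complexity is necessarily finite.

# Toy sketch (illustrative only): estimate I(state; last input) for a 2-state automaton.
import random
from collections import Counter
from math import log2

def step(state, x):
    # Hypothetical toy transition: the state simply stores the last input bit.
    return x

def mutual_information(pairs):
    # Plug-in estimate of I(X;Y) in bits from a list of (x, y) samples.
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

random.seed(0)
state, samples = 0, []
for _ in range(100000):
    x = random.randint(0, 1)       # present input (the R1 part)
    state = step(state, x)         # transition mapping acting on (state, input)
    samples.append((x, state))     # record (input, resulting state)

print(f"I(state; last input) ~ {mutual_information(samples):.3f} bits (at most log2(2) = 1 bit)")

            A latch is the extreme case; a richer automaton driven by the actual input statistics would sit somewhere between 0 and log2(number of states) bits, but it can never exceed that bound, which is what I mean by the complexity being finite and acting as a constraint in the optimization.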

            "And can I think of dissipation as energy transfer from the system to the heat bath?"

            --> Yes! That's exactly what it is. Details are in that paper again.

            Thanks for your questions. I wish I had more space to explain all the terms in detail. I am working on a more formal paper now and hopefully I can be a lot more detailed in that so as to avoid confusion. Let me know if there are any more points to be clarified and I shall be happy to do it.

            Cheers

            Natesh

            Hi Natesh,

            Very interesting and well-written essay! I liked the idea of the minimal dissipation hypothesis, and how you applied it to learning dynamics, the emergence of goal-oriented agency, and the biological evolutionary process.

            Best regards,

            Cristi