Essay Abstract

In this essay, I will present the fundamental relationship between energy dissipation and learning dynamics in physical systems. I will use this relationship to explain how intention is physical, and present recent results from non-equilibrium thermodynamics to unify individual learning with dissipation-driven adaptation. I will conclude the essay by establishing the connection between the ideas presented here and the critical brain hypothesis, and its implications for cognition.

Author Bio

Natesh Ganesh is a PhD student in the Electrical and Computer Engineering Dept. at UMass Amherst. Research interests include fundamental limits of energy efficiency in computing systems, neuromorphic computing architectures, machine learning algorithms, a physical basis for learning, coherent definitions of information and consciousness, and the philosophy of mind.


Dear Natesh,

This is an impressive piece of work.

I have not had time to try to work through the details, but for me two things are important. First, you have set a context of predictive estimation. That is a key issue, and I agree whole-heartedly. So there has to be a structure that underlies the existence of this function, and the key issue is where this structure came from. That cannot be via non-equilibrium thermodynamics alone.

Second, you say Agency is the capacity of a system/entity/agent/organism to act on its environment. Is the Moon an agent in that respect? (After all, it causes tides on the Earth.) "We will define sense of agency (SA) as the pre-reflective subjective awareness that one is initiating, executing, and controlling one's own volitional actions in the world." That is already assuming key elements of psychology that do not arise in any simple way from physics.

I will try to reflect more on what you have written in due course. Your principle may well be important at the physical level, when the rest of the context is given.

Best wishes

George

    Professor Ellis,

    Thank you for your comments. Since we seem to be dealing with two different conversation threads with different points raised in both, I will try and keep my answers separate to avoid confusion as much as possible.

    "Second, you say Agency is the capacity of a system/entity/agent/ organism to act on it's environment. Is is the Moon an agent n that respect? (after all it causes tides on the Earth)."

    --> Correct me if I am wrong, but you seem to prescribe to a definition of agency rooted in psychology? The agency I am talking of (takes the definition from philosophy and it) is simply the capacity to act. To act involuntarily, unconsciously or consciously with a purpose will all fall under it. The moon with the ability to act on earths waters makes it an agent, but doesn't have to fall under the category of making it a purposeful one for the moon. It is very possible to think of physical systems that have no agency, can change their state based on the influence of external systems but do not have the ability to 'act' and affect its environment. It is also possible to have systems that have 'agency' as I define but not have a purpose or intent for that agency. I limit purpose and intent only to a small set of systems that fall under my hypothesis. And a sense of agency as defined in the essay, need not be present in all agents. I have it there to show that a minimal dissipative system with an hierarchical structure can have something like that because of the physics of the structure alone.

    "So there has to be a structure that underlies the existence of this function, and the kay issue is where this structure came from. That cannot be via non-equlibrium thermodynamics alone."

    --> For the emergence of the structures themselves, I used England's dissipation driven adaptation to explain that. Your statement on structures not coming from non-equilibrium thermodynamics alone is an assumption I would argue. While the biological explanation of selection mechanisms are more explanatory and necessary, it is possible these selection mechanisms themselves are manifestations of deeper thermodynamic principles, which is what England argue for in his papers. While some aspects of his hypothesis seems to have been misunderstood and his results are specialized, I used it to clarify anyone who immediately dismisses the essay that minimal dissipation and dissipation driven are obviously contradicting (something that I struggled with for a while and some of my colleagues pointed out). Section 4 was dedicated to better clarify the assumptions in England's hypothesis and how it complements my derivations (Two sides of the same coin).

    Thanks for a delightful exchange. I am enjoying myself!!

    Natesh

    Natesh Ganesh asks, "Can minimal dissipation alone be a sufficient condition for learning?"

    Two biological systems might fit this description.

    First, in Charles Gallistel's book The Organization of Learning, Chapter 11 tells of foraging experiments. Consider fish (are neurons like fish?). Given probabilistic feeding stations, the school divides itself proportionally-- which from the perspective of game theory looks like a Nash equilibrium: no single fish can increase its payoff by leaving one station to feed at another. In this situation, no food energy is "dissipated" because every scrap is eaten. So according to the hypothesis, learning must be involved. Indeed, if this were a probability learning experiment with just one fish and food energy given per play of the game, we would see probability learning (also as in Chapter 11). In probability learning food is "dissipated." Perhaps because of fear of leaving one group and joining another, the visiting behavior that's seen in probability learning is attenuated when the school forages as above; and then no food energy is dissipated.
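    To make the contrast concrete, here is a small illustrative simulation (a sketch of my own, not taken from Gallistel): a school that splits between two stations in proportion to the feeding rates wastes nothing, while a single probability-matching fish leaves a predictable fraction of the food uneaten. The station probabilities are invented for the illustration.

```python
import random

# Two feeding stations; on each trial one food item appears at station 0
# with probability 0.75 and at station 1 with probability 0.25.
P = [0.75, 0.25]
TRIALS = 10_000

# (a) Group foraging: a school of 100 fish splits in proportion to the
# feeding rates (the "ideal free distribution"). Expected food per fish
# is then equal at both stations, so no fish can do better by switching,
# and every food item lands where fish are waiting -- nothing is wasted.
school = [round(100 * p) for p in P]
per_fish = [P[i] / school[i] for i in range(2)]
print("food per fish at each station:", per_fish)     # equal at both

# (b) Probability matching by a single fish: it visits station 0 on ~75%
# of trials and station 1 on ~25%. Food appearing at the unvisited
# station goes uneaten -- it is "dissipated".
eaten = wasted = 0
for _ in range(TRIALS):
    food_at = 0 if random.random() < P[0] else 1
    visited = 0 if random.random() < P[0] else 1
    if visited == food_at:
        eaten += 1
    else:
        wasted += 1
print("fraction of food eaten :", eaten / TRIALS)      # ~0.625
print("fraction of food wasted:", wasted / TRIALS)     # ~0.375
```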

    Second, in Goranson and Cardier's A two-sorted logic for structurally modeling systems, there is a review of apoptosis (programmed cell death, which according to the paper happens in the human body about a million times a second). In the olfactory system, apoptosis of neurons amounts to learning new smells, which involves consciousness.

    In Barwise and Seligman's Information Flow: The Logic of Distributed Systems, the example of a flashlight is given. The system-- a flashlight in this case-- supports different "local logics" and therefore different languages between which there may be perfect or imperfect translations. The language in this essay could be one such language. But there are others, as the example of the flashlight shows (including the specification language that specifies the purpose of the flashlight). Goranson describes software by which many different languages, local logics, or situations like this can be organized.

      Hi Natesh,

      I think this essay is fantastic and basically completely correct. I love that you took the time to make explicit connections between the Landauer limit, which many biochemical processes have been shown to asymptotically approach, and the importance of a predictor circuit and feedback between sensing and acting, and you even bring in the fluctuation theorems at the end in discussing the problem of assigning probabilities to brain states. I think it's wonderful and very informed with regard to current research in stat mech, neuroscience, and machine learning. You have the diversity of background required to address this question, which is at the intersection of so many fields.

      I hope you might take the time to peruse my submission, entitled "A sign without meaning." I took a very different approach and went with an equation-free text in the hopes of being as accessible as possible, but I think you'll find that we agree on a great number of issues, and I'm glad that the question is being addressed from multiple perspectives but with the right foundation in statistical mechanics.

      Best of luck to you in the competition. I think you wrote a hell of an essay!

      --Joe Brisendine

        Quite interesting, Natesh. The emergence of intention, purpose, goals, and learning is automatically achieved with England's restructuring and replication thesis as dissipation occurs -- but humanly done with purpose and goals? Your emphasis on computer modeling seems to blur the distinction between the human and the machine, but that is probably my failure to view it after one quick read.

        Impressive study.

        Jim

          Natesh - You did a good job on your essay and I like how you've been able to incorporate math, which was suggested in the essay rules. Well done! I gave you a good community rating which I hope helps to give your essay the visibility/rating it deserves.

            Natesh, I can't find an equation that defines "dissipation" in the essay. Is it in the references? I can find "the lower bound on dissipation in this system as it undergoes a state transition..." But that seems specific to finite state machines, which are not equivalent in power to Turing machines. Is it just "delta E", where E is energy lost from something like a thermodynamic engine?

              Dear Bloomquist,

              Thanks for your comments and very interesting links. I will have to look through them in detail as soon as I can. I am hoping we can find many more examples of biological systems that satisfy the idea presented in this essay. Here are some of my thoughts on your comments:

              I have not thought about the behavior of a larger system comprising many small minimally dissipative parts in detail, but if I have to venture a guess, I would think some sort of cooperative behavior would emerge. Also with respect to the fish example, the fish is a system that can act on its environment and thus its behavior is a tradeoff between 'exploration vs exploitation' under this idea and would not just be a form of predictive learning that we would see in a system that cannot act on its environment.

              While there may be many languages, some providing a more detailed and useful definition of purpose, the language in this essay would explain the emergence of what these other languages describe.

              Thanks again for your comments.

              Natesh

              Dear Bloomquist,

              The dissipation by the system S into the bath B is captured by the \Delta E expression for the bath. This is the change in the average energy of the bath, and since only S can exchange energy with B, the increase in the energy of the environment is due to the dissipation by S during a state transition. A much more detailed treatment of the methodology I use is available in the reference cited in the submission.

              Addressing your comment about finite state automata (FSA) vs. Turing machines: you are right that Turing machines have greater power, but the Markov FSA as I have defined it in the essay is still very general and will allow for a wide range of scenarios. Furthermore, I have heard arguments that biological organisms need not be Turing machines capable of computing all computable functions. Having said that, I recognize that the model prescribed here can be vastly improved. I am currently working on something a little more general than the FSA I have prescribed here (still not a Turing machine yet) that will still allow for some insightful takeaways.
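              For concreteness, here is a minimal illustrative sketch of what a Markov (stochastic) finite state machine looks like: the next state is drawn from a probability distribution conditioned on the current state and the input symbol. The states, inputs, and probabilities are invented for the illustration; this is not the exact construction in the essay.

```python
import random

# Illustrative stochastic FSM: P(next state | current state, input).
TRANSITIONS = {
    ("s0", "a"): {"s0": 0.2, "s1": 0.8},
    ("s0", "b"): {"s0": 0.9, "s1": 0.1},
    ("s1", "a"): {"s0": 0.5, "s1": 0.5},
    ("s1", "b"): {"s0": 0.1, "s1": 0.9},
}

def step(state, symbol):
    """Draw the next state from the conditional distribution."""
    dist = TRANSITIONS[(state, symbol)]
    states, probs = zip(*dist.items())
    return random.choices(states, weights=probs)[0]

state = "s0"
for symbol in "abba":
    state = step(state, symbol)
    print(symbol, "->", state)
```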

              I want to note that I substitute the dissipation bound for the actual dissipation, since for the biological processes we are interested in, the bounds are good approximations of the actual dissipation. Furthermore, though the dissipation bound has a Landauer-like essence to it, it is more rigorously derived and overcomes some of the objections that critics have raised against the hand-wavy calculations in Landauer's original paper. You can view it as the energy lost by a very specialized type of thermodynamic engine.

              Thank you again for the comments. Please let me know if I missed addressing anything else you brought up.

              Natesh

              Dear Jim,

              I am glad you find it interesting. Yes, while England's ideas have been a big step forward in the right direction, there are some caveats in his hypothesis, and I illustrate those points and present a way to unify individual learning and evolutionary processes under the fluctuation theorems.

              "but humanly done with purpose and goals?" I am sorry but I fail to understand your question. Can you help me out here?

              I might have used finite state automata/machines because they are popular in computer engineering and, being a computer engineer, I am very familiar with them. Their popularity in computer engineering does not reduce their general applicability.

              Thanks for your comments. Let me know if there are other things I can clarify if you get a chance to view it in detail.

              Natesh

              Dear Jeff,

              Thank you for your encouraging comments and kind rating. It gives me greater confidence to carry on and to work harder. Yes, it was tough, but after several edits I think I managed to find a good balance of math vs. no math. And the language of math is always beautiful and adds so much to the discussion, wouldn't you agree? I am hoping more people will read this essay.

              Natesh

              Dear Joe,

              Thank you for your very kind and encouraging comments. It inspires me to work harder. I am glad that I managed to communicate the ideas in the essay coherently to you. Yes, the topic of this essay is at a unique intersection of so many different fields. I wish I wasn't right at the word limit and had more room to discuss a bunch of other things. There is a much-needed discussion of semantics, consciousness, and the implications of the ideas presented here for the philosophy of mind that I would have loved to delve into.

              The title of your essay is very intriguing. I am caught up at a conference for the next two days but I will definitely read your essay in detail over the weekend and get back to you with questions/comments. I look forward to reading your thoughts on this problem. Thanks a lot again for your encouragement.

              Natesh

              • [deleted]

              Thank you for your patience with me, Natesh. I see you are at a nanotechnology lab, so I imagine that "I squared R" (I^2 R) heating dissipates a lot of heat that is of concern. How much of the dissipation in your hypothesis is from I^2 R?

              From my work experience in software, implementing an FSM with an array with two indexes, one for the current state and one for the current input signal, storing at those indexes the next state and output signal, is much quicker and easier to debug than leaving the FSM in a lot of "if-then" statements-- i.e., using data instead of code increases the speed of execution and reduces debugging significantly. Then if the rest of the code is needed for the full Turing machine, most of the speed loss and complexity in my coding experience would come from the full Turing machine, not the FSM. That's why I ask. And, in your line of work, it seems to me that the von Neumann architecture would be more relevant for I^2 R dissipation than more abstract models for computing like co-algebras, streams, or Chu spaces-- for example, as in Samson Abramsky's Big Toy Models.
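              For example, here is a minimal sketch of that table-driven approach in Python (the states and signals are invented for illustration):

```python
# Table-driven finite state machine: the transition table is indexed by
# (current state, input signal) and stores (next state, output signal),
# replacing a chain of if-then statements with a single lookup.
NEXT = {
    ("idle",    "start"): ("running", "ack"),
    ("idle",    "stop"):  ("idle",    "nop"),
    ("running", "start"): ("running", "nop"),
    ("running", "stop"):  ("idle",    "done"),
}

def step(state, signal):
    """One FSM step: look up the next state and output signal."""
    return NEXT[(state, signal)]

state = "idle"
for signal in ["start", "start", "stop"]:
    state, output = step(state, signal)
    print(signal, "->", state, output)
```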

              In engineering terms, apoptosis is a quality-control mechanism in animals (it never occurs in plants) that seems like a way to minimize energy loss and therefore may be relevant to your dissipation hypothesis. In apoptosis, defective cells that would be energy-wise inefficient are destroyed and their components re-used to build other cells, thus holding onto the potential energy in the sub-assemblies and again reducing energy loss. But I don't see how I^2 R heat loss plays a role in apoptosis.

              If you had an equation for dissipation rather than a verbal explanation, it might give me something more general than I^2 R to think about-- especially regarding apoptosis as an example of minimizing dissipation of energy. What are your thoughts?

              Regarding your references-- I will try to find them online. Are they on arXiv? The closest big universities where I could photocopy these papers are hours away from me.

                Sorry, the previous post was me. Don't know how that happened.

                Lee Bloomquist

                Hi Lee,

                I am happy to answer all questions. I want to point out again that I use the dissipation bound as a (good) approximation of the actual dissipation in the processes that we are interested in. The dissipation bound expression contains entropy (\Delta S) and mutual information (I) terms. The bound is fundamental and relates to the dissipation associated with irreversible information loss, and it is implementation independent. When you talk about the I^2 R expression for dissipation, you are thinking about wires and charges moving through those wires, so that expression will not hold for, say, spin-based architectures. However, irrespective of implementation, if there is information loss, then there will be fundamental dissipation associated with it, characterized by the entropy and information terms. Hence their presence in the expression.

                For a long time, these bounds from fundamental law (informally called Landauer bounds) were many orders of magnitude lower than the I^2 R and leakage dissipation and were not significant. But decades later, thanks to Moore's law, we will hit such Landauer limits in a decade or so. At the most fundamental level, we define a broad physical implementation of a finite state machine that is independent of architecture or technology (like CMOS), and the bound in the essay is of that nature: it depends only on the machine's definition being realized physically. We can always add the dissipation of the architecture and circuit on top of this.

                Unfortunately the paper in the references that would make this much clearer is not on arXiv. If you can access them on a college campus somewhere, here are some other papers that will make a lot of this clearer and show what exactly I am talking about, especially the second one:

                Anderson, Neal G. "On the physical implementation of logical transformations: Generalized L-machines." Theoretical Computer Science 411.48 (2010): 4179-4199.

                Anderson, Neal G., Ilke Ercan, and Natesh Ganesh. "Toward nanoprocessor thermodynamics." IEEE Transactions on Nanotechnology 12.6 (2013): 902-909.
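                To give a rough sense of the scales involved, here is a quick back-of-envelope sketch. The ~1 fJ figure for a present-day CMOS switching event is an assumed ballpark for illustration, not a number from the essay or the papers above.

```python
import math

k_B = 1.380649e-23                 # Boltzmann constant, J/K
T = 300.0                          # room temperature, K

# Landauer bound: minimum dissipation per bit of information erased.
E_landauer = k_B * T * math.log(2)
print(f"Landauer bound at 300 K: {E_landauer:.2e} J per bit")   # ~2.9e-21 J

# Assumed ballpark switching energy of a present-day CMOS gate (~1 fJ).
E_cmos = 1e-15
print(f"orders of magnitude above the bound: {math.log10(E_cmos / E_landauer):.1f}")
```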

                Thanks.

                Natesh

                Natesh. All I can get on the second paper is the abstract and a bit of the intro. The rest is behind a paywall.

                You wrote, "When you talk about I^2*R expression for dissipation, you are thinking about wires and charges moving through those wires. So that expression will not hold for spin based architectures." It's Joule heating, as in this article:

                http://poplab.stanford.edu/pdfs/Grosse-GrapheneContactSJEM-nnano11.pdf

                And, you wrote about "entropy":

                "I want to point out again that I use the dissipation bound as a (good) approximation of the actual dissipation in the processes that we are interested in. The dissipation bound expression as entropy \delta S and mutual information terms I."

                But the abstract talks about "energy dissipation":

                ****

                Abstract:

                A hierarchical methodology for the determination of fundamental lower bounds on energy dissipation in nanoprocessors is described. The methodology aims to bridge computational description of nanoprocessors at the instruction-set-architecture level to their physical description at the level of dynamical laws and entropic inequalities. The ultimate objective is hierarchical sets of energy dissipation bounds for nanoprocessors that have the character and predictive force of thermodynamic laws and can be used to understand and evaluate the ultimate performance limits and resource requirements of future nanocomputing systems. The methodology is applied to a simple processor to demonstrate instruction- and architecture-level dissipation analyses.

                ...I. Introduction

                Heat dissipation threatens to limit performance gains achievable from post-CMOS nanocomputing technologies, regardless of future success in nanofabrication. Simple analyses suggest that the component of dissipation resulting solely from logical irreversibility, inherent in most computing paradigms, may be sufficient to challenge heat removal capabilities at the circuit densities and computational throughputs that will be required to supersede ultimate CMOS.

                For 10^10 devices/cm^2, each switching at 10^13 s^-1 and dissipating at the Landauer limit E_min ≈ k_B T, we have P_diss = 414 W/cm^2 at T = 300 K. Comprehensive lower bounds on this fundamental component of heat dissipation, if obtainable for specified nanocomputer implementations in concrete nanocomputing paradigms, will thus be useful for determination of the ultimate performance capabilities of nanocomputing systems under various assumptions regarding circuit density and heat removal capabilities.

                ****
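                As a quick arithmetic check, the power-density figure in that excerpt follows directly from the quoted numbers:

```python
k_B = 1.380649e-23        # Boltzmann constant, J/K
T = 300.0                 # K
E_min = k_B * T           # Landauer-limit energy per switching event, as in the excerpt

density = 1e10            # devices per cm^2
rate = 1e13               # switching events per device per second

P_diss = density * rate * E_min
print(f"P_diss = {P_diss:.0f} W/cm^2")   # ~414 W/cm^2
```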

                Are you saying the heat dissipation in post-CMOS devices is just due to spin, and none of that heat is due to Joule heating?

                Best Regards,

                Lee

                Natesh, thank you for your reply! You wrote:

                "Also with respect to the fish example, the fish is a system that can act on its environment and thus its behavior is a tradeoff between 'exploration vs exploitation' under this idea and would not just be a form of predictive learning that we would see in a system that cannot act on its environment."

                Is what you wrote, above, about probability-learning foraging fish implied by the definitions of terms in your hypothesis? That is, do the definitions of the terms in your hypothesis (like "open," "constraints," etc.) imply what you have written, above?

                Your hypothesis: "Open physical systems with constraints on their finite complexity, that dissipate minimally when driven by external fields, will necessarily exhibit learning and inference dynamics."

                Or, is more than this required to understand your hypothesis-- more than just the above statement of your hypothesis together with definitions of the terms used in your hypothesis?

                Dear Ganesh,

                I wish you all the best with your in-depth analysis of how intentions govern reality. I welcome you to read "there are no goals as such", in which I propose that consciousness is the fundamental basis of existence and that intent is the only true content of reality, and also that we can quantify consciousness using the Riemann sphere and achieve artificial consciousness, as per the article "Representation of qdits on Riemann Sphere". I saw that you are also arriving at the study of consciousness in physical systems in the conclusion of your essay. Also please see all the diagrams I have attached in my essay.

                Love,

                I.

                  Dear Ganesh,

                  Thank you for the good essay on "Intention is Physical".

                  Your observations are excellent, like... "fundamental limits on energy efficiency in new computing paradigms using physical information theory as part of my dissertation"

                  I have some questions here: do you mean to say that for every external input, our brain predicts and makes a correction? It won't be acting directly by itself....

                  Probably if we make an energy-efficient computer, it will become super intelligent. Probably we may require some software also....

                  My essay (Distances, Locations, Ages and Reproduction of Galaxies in our Dynamic Universe) is not related to brain functions; it is on COSMOLOGY....

                  With axioms like... No Isotropy; No Homogeneity; No Space-time continuum; Non-uniform density of matter (Universe is lumpy); No singularities; No collisions between bodies; No Black holes; No wormholes; No Big Bang; No repulsion between distant Galaxies; Non-empty Universe; No imaginary or negative time axis; No imaginary X, Y, Z axes; No differential and integral equations mathematically; No General Relativity, and the model does not reduce to General Relativity under any condition; No creation of matter as in Big Bang or steady-state models; No many mini Big Bangs; No Missing Mass; No Dark Matter; No Dark Energy; No Big Bang generated CMB detected; No Multi-verses, etc.

                  The Dynamic Universe Model gave many results that are otherwise difficult to explain.

                  Have a look at my essay on Dynamic Universe Model and its blog also...

                  http://vaksdynamicuniversemodel.blogspot.in/

                  Best wishes................

                  =snp. gupta