Dear Ganesh,
I suppose you would like a critical examination of your essay. I must confess that I could not entirely follow the mathematical derivation, maybe due to my own limitations. But I will grant the conclusions you draw from those mathematical expressions. I read this essay twice over a fortnight.
I take the following statement as your motivation: "Open physical systems with constraints on their finite complexity, that dissipate minimally when driven by external fields, will necessarily exhibit learning and inference dynamics."
In Fig. 1b, at the first stage we see the external input coming in; it is compared with the prediction of that input coming down from the higher level, and the 'prediction error' goes up. This is fine, but from the next stage onwards we see that each predictive estimator (processor/comparator) receives only the prediction error from the lower level and the feedback prediction from the next higher level. A prediction from a higher level cannot be compared with the prediction error generated at the lower level; that would make no sense. A predictive estimator must receive an appropriate value derived (or predicted) from the lower level's observation in order to compare and generate a prediction error. I suppose the direction of flow is incorrect. In fact, a predictive estimator should generate a prediction error internally from the predictions coming from both sides, and use that error to produce predictions for the next higher level as well as for the lower level, each appropriate for its own side. Natesh, with processing systems, always take the limiting cases to test the hypothesis. For example, when the system makes its first observation, at the lowest level there is no prediction coming back from the higher level with which to compute an error. Similarly, at the highest level there is no prior action to correct with only an incoming prediction error. Furthermore, note that in any realistic system a module may receive input from multiple modules and send its output to multiple modules.
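To make the flow I have in mind concrete, here is a minimal sketch in Python. This is my own illustration, not code from your essay; the class name PredictiveEstimator, the simple additive update, and the rate lr are assumptions chosen only to show the structure: each estimator receives a representation from below and a prediction from above, forms its error internally, and handles the limiting cases where one side is absent.

```python
import numpy as np

class PredictiveEstimator:
    """One level of the hierarchy, as I am proposing it should work."""

    def __init__(self, size, lr=0.1):
        self.state = np.zeros(size)   # this level's current estimate
        self.lr = lr                  # update rate (illustrative only)

    def step(self, bottom_up, top_down=None):
        # bottom_up: representation arriving from the level below
        #            (the raw observation at the lowest level).
        # top_down:  prediction of this level's state from the level above,
        #            or None at the highest level / on the first observation.
        upward_error = bottom_up - self.state
        downward_error = (top_down - self.state) if top_down is not None else 0.0
        # The prediction error is formed internally from both sides; either
        # side may be absent in the limiting cases discussed above.
        self.state += self.lr * (upward_error + downward_error)
        # The updated estimate is sent upward as this level's representation,
        # and downward as its prediction of the level below.
        return self.state.copy()
```

At the very first observation, the lowest level would be called with only the external input and top_down=None, which is exactly the limiting case I raise above.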
"The joint system SA is a quantum system with two components." From this I also gather a classical system might not be able to achieve what a quantum system does, otherwise, there was no need to classify it as quantum. But then, later on I notice that you identify neo-cortex as S, and A as motor-cortex. I trust, you are equating neo-cortex and motor-cortex as quantum systems, a hard to gulp inference.
"Agency is the capacity of a system/entity/agent/organism to act on it's environment." And if all physical entities satisfy this definition of agency, then I do not see the need of a separate definition taking the attention of some readers on the side of psychology. Being a part of environment, any reaction to the physical context is equivalent to altering the environment. But when you say, "(I am imbuing system A with agency, but not with a specific goal or purpose)", it is as if there could be a system without agency. As you defined earlier, all physical entities are natural agents. So, by stating this, you are priming a reader with certain preconceived notion of agency. Again when you say, "The optimal encoding of R0 in SA is a trade-off between exploiting known information and exploration", where does the exploration come from? I understood that A would simply react physically as per the input from S. But this reaction is aimless. The term 'exploration' also achieves the same goal of priming the reader with certain kind of agency, reinforcing the sense. "While the state of A depends upon balancing exploration with prediction", further enhances this sense.
Even in cases where the system SA evolves to predict the incoming input correctly, it is still just a prediction of the system R; where do purposeful goals such as self-sustenance come into the picture? Therefore, I suppose one has to design an extra element into the SA system such that S tries to optimize some parameter and signals A to act in a particular manner. Otherwise, why would S set the task of throwing a ball in any manner, let alone trying to dunk? The purpose also has to be artificially coded in.
"Due to these past inputs, let the state of the system A (motor cortex) that is most likely given the prediction-exploration trade off, corresponds to the action "throw the ball." How did the first input come, and what would be any reason to throw the ball at all?
"We will define sense of agency (SA) as the pre-reflective subjective awareness that one is initiating, executing, and controlling one's own volitional actions in the world."
"Thus the joint state of SA=("see ball being thrown","throw ball") as the ball is thrown will explain the sense of agency, the awareness of an action being performed as it being performed." The association of an awareness of an action being performed to the system SA is in your/our mind. I do not see where and how exactly this sense of awareness is represented in SA. Then you rain statements like, "For example, in the case of visual perception of a face, the higher levels make predictions corresponding to say, 'seeing a face'." I can accept that the system may have representation of all the parts described, but I do not see how 'seeing a face' is represented.
The masterpiece of all statements is, "Similarly predictions made in the higher levels of the hierarchical model in SA, under the minimal dissipation hypothesis, would correspond to the higher level intention of the action-sense of agency (like say "win game" in our example)...."
As I said about your system, goals and purposes will not arise unless specifically coded into the system, and the same applies to all systems. In a system like the brain, such coding is achieved by the process of natural evolution in the Darwinian sense. You may quote me on any statement here.
Then comes the attribution of ownership: "Crucial to this process though, is a sense of ownership that the system will learn over time about what is within the system's control and what is beyond that." Natesh, what you can see as a logical extension from your own perspective of the relation between an actor and its acts, you assign to the system.
"... arguments have been made for inherent intentionality in every perception event [12]. We can view the upper levels of the hierarchical model in the brain as the source of only intentions and make a strong case that intention is physical."
Imagine if we say instead that "intention is a specification of information represented in a physical system"; then intention itself is no longer physical, yet it has its origins in physical systems. But if we insist on the paradigm 'intention is physical', then there must be a way to measure it. I trust what you may have meant is that 'intention' arises from the physical functioning of the universe and does not require or depend on any non-physical phenomena.
As a concluding remark, consider a stone as a system S embedded in a surrounding heat bath, the air, in thermal equilibrium. A puff of wind blows and applies a certain force on the stone, but the stone remains undisplaced and no exchange of heat (energy) takes place, i.e., the stone dissipates minimally. In such a scenario, what learning has taken place in S that would let it predict anything about the wind? Any development from the minimal dissipation hypothesis must conform to this limiting case. I suppose you may require some other constraint in addition.
In an exchange with Ellis, you wrote, "Thanks for a delightful exchange. I am enjoying myself!!" I consider you a system like SA, so which component, S or A, is referring to itself as the enjoyer, and which component is being enjoyed? And why would both be claimed as oneself?
I am inclined to give a reasonably good rating for the clever use of terms, employed so skillfully that the reader might come away with the notion that goals emerge from the minimal dissipation hypothesis. Mr Natesh, you are a magician too.
Rajiv