Biomolecular Renormalization: A New Approach to Protein Chemistry
Terry Bollinger, 2018-03-05
Abstract. In every cell in your body, hundreds of proteins with very diverse purposes float in the same cytosol fluid, and yet somehow rapidly and efficiently carry out their equally diverse tasks including synthesis, analysis, demolition, replication, and movement. Based on an earlier 2017 FQXi Essay contest mini-essay on the importance of renormalization and the Nature paper below, I propose here that the many protein chemistry pathways that go on simultaneously in eukaryotic and prokaryotic cells are enabled, made efficient, and kept isolated by a multi-scale biomolecular renormalization process that breaks each interaction into scale-dependent steps. I conclude by discussing ways in which this concept could be applied both to understanding and creating new biomolecules.
----------------------------------------
NOTE: A mini-essay is my attempt to capture an idea, approach, or prototype theory inspired by interactions with other FQXi Essay contestants. This mini-essay was inspired by:
1. What does it take to be physically fundamental by Conrad Dale Johnson
2. What if even the Theory of Everything isn't fundamental by Paul Bastiaansen
3. The Laws of Physics by Kevin H Knuth
4. The Crowther Criteria for Fundamental Theories of Physics
5. The Illusion of Mathematical Formality by Terry Bollinger (mini-essay)
Non-FQXi References
6. Extreme disorder in an ultrahigh-affinity protein complex, March 2018, Nature 555(7694):61-66. Article in ResearchGate project Novel interaction mechanisms of IDPs
7. Extreme disorder in an ultrahigh-affinity protein complex, March 2018, Nature 555(7694):61-66. NOTE: This article is behind a (large) paywall.
----------------------------------------
Background: Scale-Dependent Protein Interactions
In the March 2018 Nature paper Extreme disorder in an ultrahigh-affinity protein complex, the authors provide a fascinating and extremely detailed description of how certain classes of "intrinsically disordered proteins" (IDPs) can bind together based initially on large-scale charge interactions that are then followed by complex and remarkably disorderly bindings at smaller size scales. The purpose of this essay is not to analyze this specific paper in detail -- this excellent paper does that very well for its intended biochemistry audience -- but to show how an external, physics-derived, scale-dependent renormalization framework can be used not only to provide an alternative way to look at the interactions of these proteins, but also to understand a broad range of large-molecule interactions in a new and potentially more unified and analytical fashion. This broader framework could in principle lead to new approaches to both understanding and designing proteins and enzymes for specific objectives, such as how to bind to a wide range of flu viruses.
The Importance of Approximation-At-A-Distance
The initial approach of two IDPs via a simple, large-scale difference of electrical charge appears to be an example of biological multi-scale physics-style "renormalization." By that I mean that the proteins are interacting in a hierarchical fashion in which large, protein-level charge attractions initiate the process while the proteins are still at some distance from each other and details are irrelevant due to charge blurring. This is the central concept of renormalization in, say, the QED theory of electron charge: You can at large distances (scales) approximate the electron charge as a simple point, much as you are approximating the complex protein charge as a "lump charge" in the first stage.
As the proteins approach, more detailed patterns grow close enough to become visible, and the initial lump-protein-charge model fails. One must at this point "renormalize," that is, drop down to a smaller, more detailed scale that allows analysis in terms of smaller patterns within smaller regions of the protein. In the case of the dynamic and exceptionally disorganized IDPs, these later stages result in surprisingly strong bindings between the proteins. More will be discussed later about this intriguing feature, which I believe can be reinterpreted as a more complicated process that only appears to be random and disorganized from an outside perspective. It is at least possible, based on a renormalization analysis, that this "randomness" is actually a high-density, multi-level transfer of data. This transfer would be enabled by the large number of mobile components of the protein behaving more like cogs and wheels in a complicated machine than like truly random parts. Alternatively, however, if binding truly is the top priority for the proteins, the moving parts could also accomplish that without using the resulting bindings as data.
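The "lump charge" idea can be made concrete with a small toy model. The sketch below is only an illustration under loud assumptions: a protein surface is reduced to a one-dimensional list of per-residue charges (+1, -1, or 0), and "charge blurring" is modeled as summing those charges over blocks. The function name coarse_grain and the eight-residue pattern are hypothetical and are not taken from the Nature paper.

```python
from typing import List


def coarse_grain(charges: List[int], block: int) -> List[int]:
    """Sum per-residue charges over consecutive blocks of `block` residues.

    A large block mimics the blurred view seen from a distance; block=1
    returns the full per-residue detail.
    """
    return [sum(charges[i:i + block]) for i in range(0, len(charges), block)]


# Hypothetical 8-residue charge pattern (an assumption for illustration).
protein = [+1, +1, -1, +1, 0, +1, +1, -1]

# Walk from the most blurred view down to full detail.
for block in (8, 4, 2, 1):
    print(block, coarse_grain(protein, block))
```

At block size 8 the whole pattern collapses into a single net charge, which is all a distant partner would respond to in this toy picture. Each smaller block size corresponds to one "renormalization" step that exposes more of the underlying pattern.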
Broadening the Model: Multi-Level Attraction and Rejection
However, even more interesting than detailed binding when proteins grow closer is the possibility that the interactions at that level reject rather than encourage further interactions. Such cases might also be very common, possibly even dominant. You would have a "dating service" that allows the proteins to spend a small amount of time and mobility resource to check out a potential match, but then quickly (and this is important) realize at low cost the match will not work. Amplify such low-cost rejections by huge numbers of protein types and individual instances, and the result is a very substantial overall increase in cellular efficiency.
If however the next level of charge-pattern detail does encourage closer attraction, the result would be to head down the path of repeated downward renormalization of scale, as individual sheets and strands move close enough to "see" more detail. If the proteins were exact matches to begin with, then renormalization (which in this context just means "scaling down to see greater levels of charge pattern detail") would proceed all the way down to the atomic charge level. The "dating service" would be a success, and the match accomplished. But more importantly, it would be accomplished with high efficiency by avoiding getting into too much detail too quickly.
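The "dating service" logic of early, low-cost rejection can be sketched in a few lines. This is a minimal toy, not a model of real protein electrostatics: charges are integers on a line, two surfaces "match" at a given scale when their block charges cancel exactly, and the number of block pairs examined stands in for time and mobility cost. All names and example patterns are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def coarse_grain(charges: List[int], block: int) -> List[int]:
    """Sum charges over consecutive blocks (the blurred view at that scale)."""
    return [sum(charges[i:i + block]) for i in range(0, len(charges), block)]


def complementary(a: List[int], b: List[int]) -> bool:
    """Toy criterion: every block of `a` is cancelled by the facing block of `b`."""
    return all(x + y == 0 for x, y in zip(a, b))


def multiscale_match(a: List[int], b: List[int],
                     scales: Sequence[int] = (8, 4, 2, 1)) -> Tuple[bool, int]:
    """Compare from the coarsest scale down, stopping at the first mismatch.

    Returns (matched, comparisons), where comparisons counts the block
    pairs examined at every scale visited.
    """
    comparisons = 0
    for block in scales:
        ca, cb = coarse_grain(a, block), coarse_grain(b, block)
        comparisons += len(ca)
        if not complementary(ca, cb):
            return False, comparisons  # low-cost rejection
    return True, comparisons


target = [+1, +1, -1, +1, 0, +1, +1, -1]
exact = [-q for q in target]                  # perfect partner
wrong_net = [+1, +1, +1, -1, 0, -1, +1, +1]   # wrong overall charge
near_miss = [-1, 0, -1, 0, -1, 0, 0, 0]       # right coarse charge, wrong detail

for name, partner in (("exact", exact), ("wrong_net", wrong_net),
                      ("near_miss", near_miss)):
    print(name, multiscale_match(target, partner))
```

In this toy, a partner with the wrong overall charge is dismissed after a single comparison, while only the exact partner is examined all the way down to per-residue detail. A true match costs more comparisons this way than one flat pass would, so the scheme pays off only if non-matches greatly outnumber matches, which is the situation the essay assumes for a crowded cytosol.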
Broader Implications of Multi-Scale Protein Interactions
There are a number of very interesting potentials in such a renormalization interpretation of protein-to-protein binding. Importantly, most of these potentials apply to pretty much any form of large-bio-molecule binding, including (emphatically) DNA and (to me even more interesting) enzymatic creation of novel molecules. These potentials include:
o Efficient, low-time-cost, multi-stage elimination of non-matches.
Proteins (or DNA) would be able to approach at the first scale level based on gross charge, then quickly realize there is no match, and so head off to find the right "machinery" for their tasks. The efficiency issue is huge: Repeated false matches at high levels of detail would be very costly, causing the entire cell to become very inefficient.
o Increased probability of correct protein surface matchups.
Or, conversely: Lower probabilities of protein matchup errors. A huge but somewhat subtle advantage of multi-scale attraction is that it gives each new level of smaller detail a chance to "reorient" its components to find a better local match. One way to think of this advantage is that the earlier larger-scale attractions are much like trip directions that tell you which interstate highway to take. You don't need detail at that level, since there will in general be only one interstate (one "large group area match") that gets you to the general region you need for a more detailed matchup. Only after you "take that interstate" and approach more closely do the detailed "maps" show up and become relevant.
o Complex "switch setting" during the multi-scale matchup process.
Since proteins are not just static structures but nano-scale machines that can have complex levels of local group mobility (more later on the implications of that), such lower-scale matchups can be more than just details showing up at the finer scales. They can also re-orient groups and structures, which in turn can potentially "activate" or "change the mode" of one or both proteins, much like turning a switch once you get close enough to do so. These "switches" would of course themselves be multi-scale, ranging e.g. from early large-scale reorientations of entire beta sheets down to later fine-scale rotations of side groups of individual amino acids. What is particularly interesting about this idea is that you potentially could program remarkably complex sequences in time and space of how such switches would be reset. There is potential in multi-scale, multi-time switch setting for a remarkable degree of relevant information to be passed between proteins.
o Multi-scale adjustment of both specificity and "stickiness".
As with gecko feet, if the goal of the protein is aggressive "grabbing" of a range of some broad class of proteins, this can be programmed in fairly easily via the multi-scale model. It works like this: If the purpose of the protein is to bind an entire class of targets based on overall large-scale charge structure (and please note the relevance of this idea to e.g. ongoing efforts for universal flu vaccines), then the next lower level of scale in the protein should be characterized by extreme mobility of the groups that provide matching, so that they can quickly rotate and translate into positions that allow them to match essentially any pattern in the target molecule.
Conversely, if certain patterns at lower scales indicate that the target is wrong, then those parts of the protein should present a rigid, immobile charge pattern upon closer approach. Mobility of groups thus becomes a major determinant of both how specific the protein is and how tightly it will bind to the target.
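The trade-off between mobility and specificity can also be sketched as a toy. The assumptions here are strong and purely illustrative: a "mobile" binder may rearrange its charged groups freely within a block, so only the set of charges in each block must cancel the target's, while a "rigid" binder must cancel the target position by position. The function names and charge patterns are hypothetical.

```python
from typing import List


def blocks(charges: List[int], size: int) -> List[List[int]]:
    """Split a charge pattern into consecutive blocks of `size` residues."""
    return [charges[i:i + size] for i in range(0, len(charges), size)]


def binds(binder: List[int], target: List[int], block: int, mobile: bool) -> bool:
    """Toy fine-scale binding test.

    rigid:  each binder charge must cancel the target charge facing it.
    mobile: binder groups may rearrange within a block, so only the
            collection of charges per block has to cancel the target's.
    """
    for b, t in zip(blocks(binder, block), blocks(target, block)):
        needed = [-q for q in t]
        if mobile:
            if sorted(b) != sorted(needed):
                return False
        elif b != needed:
            return False
    return True


binder = [-1, -1, +1, -1]
# Three hypothetical targets sharing the same large-scale charge (+2).
family = [[+1, +1, -1, +1], [+1, -1, +1, +1], [-1, +1, +1, +1]]

print("mobile:", [binds(binder, t, 4, mobile=True) for t in family])
print("rigid: ", [binds(binder, t, 4, mobile=False) for t in family])
```

Under these assumptions the mobile binder grabs every member of the target family, whereas the rigid binder accepts only the one target whose detailed pattern it mirrors. A target with a different block-level charge, such as four positive residues, is rejected in both modes, since no rearrangement changes the net charge of a block.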
o Energetic activation of low-probability chemical reactions.
This is more the enzyme interpretation, but it's relevant because multi-scale provides a good way to "lock in" just the right sequence of group repositions to create highly unlikely binding scenarios. Imagine first large groups then increasingly smaller and more specific groups all converging in attraction down to a point where some small group of atoms is forced into an uncomfortable positioning that normally would never occur. (This is a version of the multi-level-switch scenario, actually.)
At that point a good deal of energy is available due to the higher-level matchups that have already occurred; the target atoms are under literal pressure to approach in ways that are not statistically likely. And so you get a reaction that is part of the creation of some very unlikely molecule. This is really quite remarkable given the simplicity and generally low-overall-energy level of amino acid based sequences, yet it comes about fairly easily via the multi-level model.
Another analogy can be used here: Imagine trying to corral wild horses who have a very low probability of walking into your corral spontaneously. Multi-scale protein matchup energetics then are like starting with large-scale events, say helicopters with loudspeakers, as the first and largest-scale way of driving the horses into a certain region. After the horses get within a certain smaller region, the encirclement process is then scaled down (renormalized) to use smaller ground vehicles. The process continues until "high energy" processes such as quickly setting up physical barriers come into play, ending with full containment.
o Enablement and isolation of diverse protein reaction systems within the same cytosol medium.
The idea that molecules can reject interactions not relevant to their purpose both immediately and at low cost is another way of saying that even if a huge variety of molecules with very diverse purposes are distributed within the same cytosol, they can behave in effect as if they do not "see" any other molecules except the ones with which they are designed to react. These subnetworks thus can focus on their tasks with efficiency and relative impunity against cross-reactions.
There is a fascinating and I think rather important corollary to this idea of multi-scale enabled isolation of protein chemistry subnetworks, which is this: It only works if the proteins are pre-structured to stay isolated. That is, on average I would guess that high levels of mutual invisibility between protein reaction subnetworks are not likely to arise by chance, and that the subnetworks must in advance agree to certain "multi-scale protocols" about how to distinguish them from each other. This distinction would begin and be most critical at the largest and most efficient scales of charge blurring, the same levels that the Nature paper's abstract describes.
So, a prediction even: Careful analysis of the charge profiles of the many types of proteins found in eukaryotic cells (and in prokaryotic cells, which are likely more accessible) will reveal multi-scale isolation of multiple subnetworks of interactions, based first on high-level, "blurred" charge profiles between the proteins, with additional isolation at lower scales. It will be shown statistically that the overall level of isolation between the subnetworks is extremely unlikely without all such reaction paths sharing the charge-profile equivalent of a registry in which each reaction subgroup has its own multi-scale "charge address".
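One hedged way to picture how such a statistical claim could be checked is a label-shuffling (permutation) test on coarse charge signatures. The sketch below is only a toy under stated assumptions: the twelve "proteins", their two-block signatures, the subnetwork labels A, B, and C, and the exact-cancellation attraction rule are all invented for illustration, and a real analysis would need measured charge profiles and a far more careful null model.

```python
import random
from itertools import combinations
from typing import List, Sequence, Tuple


def attracts(a: Sequence[int], b: Sequence[int]) -> bool:
    """Toy coarse-scale attraction: signatures cancel block by block."""
    return all(x + y == 0 for x, y in zip(a, b))


def cross_talk(signatures: List[Tuple[int, int]], labels: List[str]) -> int:
    """Count attracting pairs whose members sit in different subnetworks."""
    return sum(
        1
        for i, j in combinations(range(len(signatures)), 2)
        if labels[i] != labels[j] and attracts(signatures[i], signatures[j])
    )


def permutation_p_value(signatures: List[Tuple[int, int]], labels: List[str],
                        trials: int = 2000, seed: int = 0) -> Tuple[int, float]:
    """Estimate how often shuffled labels give cross-talk <= the observed value."""
    rng = random.Random(seed)
    observed = cross_talk(signatures, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if cross_talk(signatures, shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (trials + 1)


# Hypothetical data: each subnetwork uses its own coarse "charge address".
signatures = [(2, 1), (-2, -1), (2, 1), (-2, -1),    # subnetwork A
              (1, 2), (-1, -2), (1, 2), (-1, -2),    # subnetwork B
              (3, 0), (-3, 0), (3, 0), (-3, 0)]      # subnetwork C
labels = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

observed, p = permutation_p_value(signatures, labels)
print("cross-talk:", observed, "estimated p-value:", p)
```

In this invented data set no attracting pair crosses a subnetwork boundary, and random relabelings almost never reproduce that degree of isolation, so the estimated p-value comes out small. That is the qualitative signature the prediction expects from a shared "charge address" registry, though the toy says nothing about whether real proteomes behave this way.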
o Possible insights into the protein folding problem.
Finally, it is worth noting that the hierarchical guidance concept that underlies biomolecular renormalization could well have relevance to the infamous multi-decade protein folding problem, which is this: How does a simple string of amino acids fold itself into a large and precisely functioning protein "machine"? This feat is roughly equivalent to a long chain of about twenty different link types somewhat magically folding itself into some form of complicated machine with moving parts.
Either directly through multi-scale attractions or indirectly through helper molecules, it is at least plausible that biomolecular renormalization may play a role in this folding process. With regard to helper molecules, one intriguing hypothesis (nothing more) is that previously folded proteins of that same type could provide some form of multi-scale guidance for how to fold new proteins.
While an intriguing idea, it is also frankly unlikely for the following reason: Such assistance would almost certainly require the existence of some class of "form transfer" helper molecules that would look at the existing molecules and from them find and present that information to the folding process. It is hard to imagine that such a system could exist and not have already been noticed.
Nonetheless, the concept of folding-begets-folding has an intriguing appeal from a simple information transfer perspective. And in one area it would offer a very interesting resolution to a long-term mystery of large biomolecules: prions. Prions are proteins that have folded or refolded into destructive forms. Once established, these incorrectly folded proteins show a remarkable, even heretical ability to reproduce themselves by somehow "encouraging" correctly folded proteins to instead adopt the deleterious prion folding.
Folding-begets-folding would help to explain this mysterious process by making it a broken version of some inherent mechanism that cells have for reproducing the folding structures of proteins. Whether any of this is possible, and whether if so it is related to biomolecular renormalization, is an entirely open question.
Conclusions and Future Directions
As a concept, biomolecular renormalization appears to have good potential as a framework not only for understanding known and recently uncovered protein behaviors, but also for providing a more theory-based approach to designing proteins and enzymes. It may also provide insights into cell-level biological processes that previously have seemed opaque or mysterious under other forms of analysis.