Essay Abstract

Mathematical concepts compress solutions to problems involving natural and other phenomena. Hundreds of generations of human society have tested those solutions for their accuracy and utility and have continuously refined them. Those concepts likely reflect fundamental aspects of the universe, including phenomena not yet modeled, perhaps not yet identified. Mathematics therefore has predictive power. The accumulation of solved mathematical problems that we possess represents distilled intelligence that enormously exceeds that of any individual human being.

Author Bio

Since completing law school in 1976, I have continued to learn about mathematics and physics.


Bob,

I like this idea that math has some predictive power as to what new physics will be. I am also keen on the idea of "problem solving capacity" you discuss in the paper. I was wondering whether you had any thoughts on whether we gain benefit from "free-thinking," allowing individuals to choose what problems they want to solve, or whether it should be more directed. Some might argue that allowing someone to think of all the potential plot consequences of an episode of "Friends" or "Star Trek" is time wasted, but then one might counter that such thinking may have helped people in their approaches to problems they encounter in their own lives. Curious as to your thoughts.

Best,

Harlan

    Dear Bob Shour,

    A most fascinating essay. Most essays are such that it is not difficult to find significant areas of agreement, and specific details upon which we disagree. Yours is hard to disagree with, in general. From your key assumption that "brains networked over time have more problem-solving capacity than individual brains", combined with the idea that 'compression' over time is a powerful means of increasing efficiency, you move to inference sets (with examples), Hayek (optimal allocation), and a fascinating discussion of language compression (with examples).

    Then your examples of relative numbers of neurons, in ants, in humans, in ant societies, and in human societies, lay the framework for the hypothesis:

    "Maybe networked neurons don't care where they are located when they exercise their degrees of freedom."

    With this first-order approximation you proceed to analyze society's problem-solving rate. Again, fascinating!

    You tie concepts of entropy from Shannon to Clausius, to analyze the function F(n) which relates the problem solving capacity of a society of n individuals to that of one individual.

    You are to be congratulated on a most interesting and most enjoyable essay, one certain to do well in this contest.

    My best regards,

    Edwin Eugene Klingman

      Dear Harlan Swyers,

      Thank you for your comments. Free thinking has two aspects. Think about anything and see where it leads. A directed question invites a diversity of responses, one of which may add insights. Both seem important to me.

      Regards,

      Bob Shour

      Dear Edwin Eugene Klingman,

      Thank you for taking the time to read, comment on, and summarize my essay. Feedback is appreciated, and I am grateful when it is positive. I am glad you enjoyed the essay. I hope to have the chance in the future to improve the description and presentation of the ideas in the essay.

      Best wishes,

      Bob Shour

      You mention ants.

      The context of this contest suggests an interesting observation about ants.

      Humanity started building the perfect arch in the 1500s or so. This took considerable math ability, physics ability, and many other human intellectual abilities.

      Ants, for all their lack of neurons, have been building perfect arches for millions of years. They substitute rules and evolution to do the job.

      Their rules are: (1) everybody starts to build a mound of mud; (2) after a while, groups of 3-5 identify the tallest mound and build on that while abandoning the others; (3) after a while, they start adding mud to build toward the closest mounds; (4) when the mounds meet, they fill in the wall. As the tensions work themselves out, some mud falls but some stays to make the perfect arch.

      Mankind should not get too enamored with its methods.

      5 days later

      Bob,

      A very interesting essay. Thank you for sharing these ideas.

      I guess it is true ... two heads, or perhaps, 2.7 heads, are better than one.

      I'm not sure if there is a word for it, but the fact that you have used a mathematical model to quantify why mathematical models are useful and effective is ... well, simply delicious.

      The use of the number of words in a language as a surrogate for the complexity of a society is quite clever, I think. There could be some second-order interactions that cause error, though. For example, in the German language it is common to make a new word by combining two other words ... as an example, consider "thought-experiment".

      I have to wonder, though: is it possible for technology to cause a step change in the effectiveness of a society at solving problems? For example, was there a step change when the printing press was invented or when the library was conceived? Possibly we will see such a change due to the computer or parallel processing? I will tell you, from my own experience, my ability to create, understand, and solve new problems is not limited by computational capacity. It is limited by my ability to have insights into the problem being considered.

      Best Regards,

      Gary Simpson

        Dear Gary D. Simpson,

        Thank you for your comments.

        On 'step changes': the biologist Stephen Jay Gould similarly noted that evolution is not smooth but proceeds by punctuated equilibria. Whatever common thermodynamic principles underlie emergence might provide an answer to both points.

        Suppose that there is a steady supply of energy to an ecosystem, a stable average use of energy by organisms within it, and a steady average rate of mutation. Some mutations are neutral, some harmful, but some incrementally more energy efficient. Improved efficiency improves survivability. Natural selection may favor a changed organism, an emergent 'step change.' The outcome looks like a punctuation to us but arises from the steady supply of energy to the ecosystem. We use language to categorize a newly emergent form. The new word category is a step change; the rate of energy use is steady.

        Apply the same analysis to abstraction. Person A's brain contains, among other inference sets (or abstractions), the laws of motion, calculus, limits, and quaternions. Suppose A, after much learning and thought, formulates a problem: is there a calculus of quaternions? (That seems to be an inference set in your essay.) This question is a new inference set. Suppose A (a theoretical construct) thinks about this problem for some time: a week, a month, or years. A develops ideas that fail, or lead to dead ends, but A is on a steady average basis thinking about this problem. Unsuccessful points of view are discarded and improve the chance of solution because they eliminate unfruitful lines of inquiry. At some point A hits on the idea of taking the limit of a scalar applied to the quaternion, a new inference set. This new inference set is the emergent result of energy (problem solving) applied to networking the abstractions of other people (quaternions, limits, and calculus) with A's inference set, the calculus of quaternions. This seems to the outside observer to be a step change. But it is the emergent result of energy steadily applied to the network of abstractions.

        Moreover, the input inference sets calculus, limits, and quaternions represent an enormous amount of energy spent by thousands and in some cases millions of people discovering, improving, and compressing those concepts, improving notation, and finding better ways to write about and teach these inference sets. Thus A brings to bear on A's problem an enormous amount of legacy energy in addition to A's own. The same may be said about each of the legacy inference sets, and so on. I read the quote at the end of my essay in that light.

        Bob Shour

        Bob,

        Your analysis of my approach to solving problems is amazingly accurate. If I were paranoid, I would think you were a fly on the wall in the room where I work. The main emphasis is upon legacy knowledge.

        Any idea that produces accurate predictions must contain some element of truth. You may have opened the door for better usage of mathematics by the less technical disciplines.

        Best Regards,

        Gary Simpson

        7 days later

        I enjoyed your essay. In my view, the extrapolation you are making is interesting because it opens questions.

        One of them is: why do you stop calculating collective intelligence at life forms? For example, you introduce the collective intelligence of an ant colony.

        There are billions of computations happening on Earth every second. They are not life forms; they are physical processes. Why can they not be considered intelligent computation too?

        Dear Christophe Tournayre,

        Thank you for your comments. I agree with you that the essay raises those questions.

        Three reasons for not calculating with ants: 1) confining the essay to the relationship between physics and math as artifacts of collective problem solving aims at thematic consistency and relevance to the contest theme; 2) a short essay is less work to read; 3) I do not know of any studies that measure the mean path length for ants to transmit information, or what measurable ant colony collective problem solving might look like. Can it be done? Theory predicts a mean path length of (4/3)e, where e is the base of the natural logarithm, for an ant network receiving information.

        There have been attempts, notably around 1996 by the American Psychological Association (Intelligence: Knowns and Unknowns), to define what is measured by IQ tests (e.g., "ability to understand complex ideas"), but I don't know if there is a consensus. I was forced to consider the rate of problem solving in 2005-2009 with my hypothesis that average IQs increase due to improving ideas. I agree that a rate of problem solving definition opens questions. For non-life forms, I would characterize the situation as: what principle common to non-life forms and human problem solving makes some natural phenomena resemble intelligent computation (your terminology)?

        Regards

        Bob Shour

        Dear Joe Fisher,

        Perhaps some math and physics is more complicated than necessary and so seems incomprehensible because we have not yet figured out nature's simpler approach. This Einstein quote seems similar to your point: "as far as the propositions of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality."

        On your wording correction: I agree that it is important to keep in mind that mathematics deals with idealized abstractions.

        Thank you for your kind comments. Regards.

        Bob Shour

          Dear Mr. Shour,

          I have been trying to follow your essay, and have some problems understanding the calculation you undertake on p. 3. You write:

          "Suppose that the average individual problem solving rate in a society of n individuals is x problems per time unit, and the whole society's problem solving rate is X problems per time unit. Is there a function F such that X = F(n)x?"

          On the assumption that every problem solved is solved by someone, then F(n) is just n: the total number of solved problems is the number solved (on average) per individual (which you call x) times the number of individuals. Perhaps you mean that there is a function x = F(n) which describes how the efficiency of problem solving by individuals is determined by the number of people in the society? Then your equation should be X = F(n)n, not X = F(n)x.

          I am also confused about the meaning of μ. You say μ is the number of other nodes (people) that a given person communicates with in a given time (on average). If that's right, then μ is just a number that characterizes the communication web of the society, and must be determined empirically as data. But you go on to worry about somehow solving for μ. Solving on the basis of what?

          It is also not clear why you make the assumption of isotropy. In a practical context, it would mean that each individual in a society communicates randomly: there are no groups that communicate strongly in-group and weakly out-group. That is not at all a realistic assumption. It does imply that one can rely on results from random graph theory, but random graphs are not reasonable models of actual human communications in a society.

          Maybe you can clarify a bit what you have in mind here.

          Regards,

          Tim Maudlin

          Dear Tim Maudlin,

          Thank you for engaging with the essay. I will try to address your points.

          Forget about F for now. Start with X. X is a rate, not a number. For example, in 1657 there were 200,000 English words, and in 1989 there were 616,500. Assume criteria are the same for both counts. Then average English lexical growth is 3.39% per decade. The whole of English speaking society is involved in appraising (voting) on which words get used and which get discarded before making it into a dictionary. We have one possible collective rate of improvement in ideas. Can we corroborate the rate?

          The labor cost of lighting from 1750 BCE to 1992 improved at an average 3.41% per decade. All of networked society collectively decides which lighting improvements are favored and lays the foundation for the next technological improvement.

          Third, average IQs, though over a shorter time period, improve at about the same rate.

          Conclude: ideas that reflect widespread collective problem solving (per lighting, lexicon, average IQs) get better at the same average rate of about X = 3.4% per decade. The conclusion depends on interlinking and mutually corroborating concepts. This is a skeleton outline.

          Apply economic analysis. Just as the invisible hand of networked market interactions results in a market price for goods and services, assume the invisible hand results in an energy price for collective problem solving, so that the information payoff for the energy input required by collective problems is comparable. Then mathematical ideas used by society should also be improving at about 3.4% per decade.

          If today there were 100 English words, given society's collective problem solving rate of 3.4% per decade, we would expect there to be 103.4 words ten years hence.
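          A quick numerical check of that arithmetic (a minimal sketch in Python; it assumes the word counts quoted above and treats the rate as continuously compounded):

```python
import math

# Lexical growth from the two dictionary counts cited above.
words_1657, words_1989 = 200_000, 616_500
decades = (1989 - 1657) / 10                       # 33.2 decades
rate = math.log(words_1989 / words_1657) / decades
print(f"lexical growth: {rate:.2%} per decade")    # ~3.39% per decade

# Projecting 100 words one decade ahead at ~3.4% per decade.
print(f"100 words -> {100 * 1.034:.1f} words")     # 103.4
```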

          That is the end of what might be called Part One. If one accepts the existence of a collective problem solving rate, one has laid the foundation for a much harder problem: is there an F such that X = F(n)x? In effect, this asks: how much smarter is society than an individual (what is F)? It turns out that one has to find F for each of two networks, one of brains and one of ideas.

          That leaves outstanding the issue of F, which is the number of degrees of freedom RELATIVE to the network's mean path length. Suppose, for example, the mean path length of physics professors is 3 and there are 9 professors. Then F for 9 people is 2, because one step of information transmission can (has the capacity to, not does) reach 3 people, and one more step (each of the 3 can pass information on to a colleague) reaches 9 (if we don't duplicate the information paths). Why isotropy? That is a natural consequence of working with the mean path length. Since the average is the same for everybody (is that tautological?), every average node has the capacity to transmit to the same number of average nodes per information iteration. By working with averages, isotropy is necessarily along (so to speak) for the ride.
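          A minimal sketch of that professor example (the numbers are the illustrative ones above; F is computed as the log of the population n to base μ, the mean path length):

```python
import math

# F = log_mu(n): degrees of freedom relative to the mean path length.
mu, n = 3, 9
F = math.log(n, mu)   # math.log(x, base)
print(F)              # 2.0: one transmission step reaches 3 people, a second reaches 9
```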

          The mean path length is the average distance between members of society in STEPS. If you know the head of your department H and I don't but I know you, then H and I are 2 steps apart. How many mean path lengths does it take to span a system from one end to the other? The answer is log_μ(n), where n is the population and μ is the mean path length. Yes, but the mean path length μ is also the average distance from one end to the other.

          This implies that the same energy it takes to go one mean path length confers a CAPACITY to traverse log_μ(n) mean path lengths. One energy unit cannot both go one mean path length and at the same time several. So log_μ(n) must be measuring the degrees of freedom of population n relative to the mean path length μ.

          So F is a measure of capacity. Capacity is linearly related to F. If a brain has a larger network of concepts (e.g., 100 ways to prove the Pythagorean theorem compared to 1), then that brain's ability to solve geometry problems is greater.

          In effect F is the entropy of a network.

          My essay compresses many years' work. I attempted to simplify the essay by way of a succinct sketch. I hope this reply helps. What have I left out in this explanation?

          Regards.

          Bob Shour

          Dear Tim Maudlin,

          This is a supplement to my initial reply.

          You mentioned strong in-group connections, etc. The calculation of F is of an emergent meta-quality of a network. The clustering coefficient C is the average connectedness of nodes, which averages the effect of strong and weak in-groups. So the C log(n) formula characterizes the entire network by utilizing averages. By using averages, an enormous amount of information relating to individuals is compressed into only two parameters, C and the mean path length μ.
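          As an illustration of those two parameters (a sketch using the networkx library on a generated small-world graph; the graph and its numbers are illustrative, not data from the essay):

```python
import networkx as nx

# An illustrative small-world network of 1,000 nodes, each wired to ~6
# neighbors, with 10% of links randomly rewired.
G = nx.connected_watts_strogatz_graph(n=1000, k=6, p=0.1, seed=1)

C = nx.average_clustering(G)               # clustering coefficient C
mu = nx.average_shortest_path_length(G)    # mean path length, in steps
print(f"C = {C:.3f}, mean path length = {mu:.2f}")
```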

          Consider the distribution or use of a finite amount of energy per time period (a rate): electricity for 10 million households, or energy via food for the 100 billion neurons in a human brain or for 350 million brains. Use electricity distribution as a paradigm. We can calculate electrical use per capita per time period for 10 million households. Consider that as 10 million degrees of use freedom, one per household. Now suppose each of the 10 million households has a maximum rate of energy use during a year. The electrical distribution system does not have the capacity to deliver electricity to those 10 million households if they are all simultaneously at peak usage. But peak usage per household varies in time of occurrence, so the system need only have capacity to meet the greatest average use. If a networked electrical distribution system has 10,000 supply nodes supplying 10 million households, the flexibility of capacity arises from the 10,000 degrees of distribution freedom.
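          A toy simulation of that capacity argument (a sketch with made-up numbers: each household draws a 1 kW base load and peaks at 10 kW for one randomly chosen hour of the day):

```python
import random

random.seed(1)
households, hours = 10_000, 24
peak_kw, base_kw = 10.0, 1.0

# Aggregate hourly load when household peaks are spread over random hours.
load = [0.0] * hours
for _ in range(households):
    peak_hour = random.randrange(hours)
    for h in range(hours):
        load[h] += peak_kw if h == peak_hour else base_kw

print(f"sum of individual peaks: {households * peak_kw:,.0f} kW")
print(f"actual aggregate peak:   {max(load):,.0f} kW")   # far smaller
```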

          Similarly, for a brain's neurons, problem solving output capacity is affected by the degrees of freedom in the neuronal (or synaptic) paths available for problem solving. If there are 10 ways to get from a house to the office, the traveler can vary the route to minimize energy use. Ten ways to solve a problem give more capacity to the problem solver than one way. To measure that capacity, a log function relative to a mean path length gives a network capacity (as opposed, for example, to a simple per capita energy use rate).

          Humans collectively aim for the most efficient solution.

          Regards,

          Bob Shour

          11 days later

          Hello. The question of collective intelligence is an interesting one, though I'm not sure it has much to do with the official topic of this contest (it would have fit better with last year's contest).

          However, I am skeptical about the possibility of modeling it and establishing any explicit formula for it. All that I consider clear is that increasing the population also increases the rate of technological progress, but more slowly, at a relative rate that falls as the population grows (i.e., the advantage of a more numerous population vanishes once the population is already big enough that being bigger no longer changes much).

          About your formula, I disagree with your way of deriving it. First, if we accept your assumptions (that any individual has a rate of solving problems independent of the rest of the population, that all problems solved by individuals are actually different problems rather than repetitions of each other, and that the solutions spread in the way you describe), then we get a number of solved problems proportional to n (the population), produced at a time and taking time log(n) to spread. We would then seem to get a flow of solutions that is not log(n), as you wrote, but rather n/log(n) (dividing the number n of problems solved in parallel by the time log(n) it takes them to spread). But even that is not right, because log(n) is not a factor slowing down the flow but a delay after which the solutions reach the whole population. During this delay, the flow of solutions goes on. Over a large time interval (much larger than the spreading time), if solutions add up, do not repeat each other, and all eventually spread, then the slowing factor is irrelevant, so the factor is n instead of n/log(n).
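          A toy model of that flow-versus-delay distinction (a sketch with illustrative numbers: n solutions produced per tick, each taking about log(n) ticks to spread):

```python
import math

n = 1000                  # population; also solutions produced per tick
d = round(math.log(n))    # spreading delay, ~7 ticks

# Solutions finishing their spread at tick t are those produced at tick t - d,
# so the delay shifts the flow's onset without dividing its steady rate.
completions = [n if t >= d else 0 for t in range(30)]
print(f"delay d = {d} ticks")
print(f"steady-state flow = {completions[-1]} solutions/tick (n, not n/log n)")
```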

          But I would also question your assumptions. It is common in science to have co-discoverers of important results, so that adding more researchers in parallel does not help. I discussed some disadvantages of overpopulation, including for speed of progress, in this text.

          Also, I consider that your model of spreading is not realistic. Seriously, do you know any example of solutions to problems that spread in this way? Human attention cannot be extended endlessly to simultaneously verify many solutions found by others. The only process I see roughly behaving in the way you describe is Darwinian evolution, with the results of beneficial genetic mutations spreading across a population.

          Instead, if things happen ideally in an ideal world, any solution, once found and verified by a few people, comes to be published and thus instantaneously spreads across the world, bypassing peer-to-peer communication.

          But we are not in an ideal world, so in many cases solutions, once found, are not spread at all and remain unknown. This is especially because intelligence is not common and intelligent people are not always well connected to each other, so even after someone finds a solution, and even once it is explained to a few people, the information might go no further, as the media space is occupied instead by a lot of rubbish that looks much more interesting in the eyes of less intelligent people. I reported my experience of this here.

          Also, there are cases where a solution becomes known to the relevant community, but a large separate community persists in ignorance of the solution. For example, the problem of how to understand fundamental physics for many purposes was solved by the community of physicists in the first half of the 20th century, with general relativity and quantum physics. Yet a large community of cranks persists, including the majority of authors of essays in this very contest, who failed to understand these solutions because they are not good at maths; and, so as not to feel ashamed of their failure to understand maths, they need to treat the known mathematical solutions as wrong for no real reason other than that these solutions are mathematical and therefore obscure in their eyes.

          Similarly, from the 19th century until the 1930s there were big monetary instabilities. Economists found ways to stabilize money through central banks. However, the Bitcoin community has still never heard of that problem and its solution, and keeps blindly believing that the instability of the value of bitcoin is just due to its not being popular enough and that it will naturally become stable by the magic of popularity. Because the usual currencies have been relatively stable for a few decades now, the very existence of the problem remains completely ignored by people who specialize in cryptography and have no clue about finance.

          It is also possible that a larger population is a direct obstacle to innovation, as the larger number of people forces a standardization of work in which any deviation from the standard becomes impossible. This particularly happens with the teaching system.

          On the other hand, there are cases where a large population induces a collective form of problem solving in which solutions work without being understood by individual members (or at least not by any majority of members). This especially happens with the "invisible hand" of the free market, which provides a sort of global optimization that does not need to be understood by any individual in order to work.

          Dear Sylvain Poirier,

          On your first paragraph. As a body of knowledge, math concepts and methods (which still grow) took much more energy to develop than any one person could spend in a lifetime, and hence represent a disembodied higher intelligence which problem solvers use when they solve problems. Math is effective because, in a sense, it is smarter than individuals.

          On your second. A bigger population changes more slowly; a log function is not inconsistent with your point.

          On your third, the formula uses statistics about the entire network; the independence of individuals is not a required assumption. Co-discovery (your fourth) and redundancy are irrelevant; the rate of increase is of the entire network. Problem flow is analyzed by considering the networking of people and ideas, which have been measured.

          On your fifth, I suggested language and lighting. Ideas made collectively observably improve. What is a metric for improvement?

          On your sixth. A broadcast does not depend on peer-to-peer communication. The paper considers network effects, not just broadcasts.

          On your seventh and eighth: I respect the contributed essays so far because:

          (1) One can learn about (network with) other people's ideas, such as those about Bell's theorem, Kolmogorov complexity, and the calculus of quaternions, and that is a valuable thing in itself.

          (2) Even though one may not understand or agree with all the essays, or even with the whole of any one essay, some other people might benefit or get an idea, or in future might helpfully use some of those ideas. The forum allows ideas to meet, so to speak.

          (3) Even if none of the essays is entirely correct, and even if some are in parts entirely wrong, essay remarks, questions, and hypotheses may lead someone to a good idea, even if, for example, it is an opposite hypothesis.

          (4) Estimation of someone's idea by a single person, and even by a substantial majority, may be and often is incorrect. What seems wrong today may be right in the future. That is the value of diversity of opinion, and of the opportunity to network with different people and ideas.

          Regards,

          Bob Shour

          Dear Sir,

          Your concepts of the development of language and mathematics are questionable. While the evolution of information is well established, there is no proof of the evolution of intelligence.

          Without defining intelligence, you cannot even remotely equate ants with human beings. At any moment, our sense organs are bombarded by a multitude of stimuli. But at any instant only one of them is given a clear channel to go up to the thalamus and then to the cerebral cortex, so that, like photographic frames, we perceive one discrete frame at every instant but, due to the high speed of their reception, mix them up, so that perception appears continuous. Unlike the sensory agencies, which are subject specific (eyes can only receive electromagnetic radiation, ears only sound, etc.), the transport system within the body functions for all types of sensory impulses. The same carrier transports the external stimuli from the sensory agencies to the cerebral cortex and back as a command. This carrier is the mind. The existence of mind is inferred from the knowledge, or lack of it, about external stimuli. Only if the mind transports different external impulses to the brain for mixing and comparison with the stored data do we (the Self) know about them (for the first impulse received about something, there is no definite 'knowledge'). It requires an agent to mix these signals, convert them to electro-chemical information, and submit them to a conscious agent (operator) to cognize and utilize. In perception, this task is done by a transitory neural activity in the brain called intellect. Though it is not directly perceptible, it is inferred from its actions: firing of positrons in specific areas of the brain during perception. Each individual can develop his intelligence by learning from others, but there is nothing like collective intelligence, unlike a group performing a physical task, which is linearly additive.

          The people who constructed the pyramids were not primitive. Before 3500 BCE, in India each letter of the alphabet was pronounced in at least 18 different ways, and the meaning of a word depended on the specific pronunciation. Hence it was not written. There are highly codified formulas for this, which are still available. One book, written by Panini, is referred to by computer professionals even today for developing programming. All modern Indian grammars follow those methods in a much diluted format. Those people developed the number system, including zero, and also had mathematical treatises, including highly developed geometry (the Shulva Sootra). Around the 4th century BCE, Chanakya in India compiled the earlier works on statecraft and economics, which are treated as authoritative even today. Please do not denigrate our ancestors.

          Regards,

          basudeba