First, Occam's Razor is very likely a kind of least energy principle. That is, it suggests the least energy solution to a problem, viz., that the best solution is the one that creates the fewest new terms, hypotheses, etc., in explaining a phenomenon. In this sense, the preferred hypothesis is thermodynamically efficient when compared to the other hypotheses which purport to explain the same phenomenon. Applying the Razor is a process of comparing the energy costs of the competing hypotheses.
But what does it mean to "explain"? That is the problem here in a nutshell. What, for that matter, does "understand" mean in neurophysiological/mental terms? If we look at the model of evolution, we see this in action. Evolution states simply that all life comes from previously existing life, by means of variations among the individuals of a species, which transmit their advantageous characteristics to their offspring through genes. Some of those individual characteristics are more efficient/effective than others, and acting through them leads, by the compound interest rule of 72, to an evolutionary advantage over generations. That advantage was formerly called fitness, but is now cast as a least energy, thermodynamic rule instead. Evolutionary concepts are used because they widely and efficiently explain how species developed, in the same, analogous way that Newton's laws of motion efficiently explained the basics of mass, momentum, and energy. The relationship to creativity is at once apparent, too.
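The compound interest "rule of 72" invoked above can be made concrete with a little arithmetic. The rule says that a quantity growing by x percent per period doubles in roughly 72/x periods. The sketch below compares that shortcut with the exact doubling time, under the simplifying assumption (not stated in the original) that a lineage holds a constant relative advantage per generation. The function names and the sample percentages are purely illustrative.

```python
import math


def doubling_time_rule_of_72(percent_per_generation: float) -> float:
    """Approximate generations needed to double, using the rule of 72."""
    return 72.0 / percent_per_generation


def doubling_time_exact(percent_per_generation: float) -> float:
    """Exact generations needed to double under constant compound growth."""
    rate = percent_per_generation / 100.0
    return math.log(2.0) / math.log(1.0 + rate)


# Hypothetical per-generation advantages, chosen only for illustration.
for pct in (1.0, 2.0, 8.0):
    approx = doubling_time_rule_of_72(pct)
    exact = doubling_time_exact(pct)
    print(f"{pct:.0f}% per generation: about {approx:.0f} generations "
          f"by the rule of 72, {exact:.1f} exactly")
```

Even a small advantage, on this simplified picture, compounds into a doubling of relative representation within a modest number of generations, which is the point the rule of 72 is being used to make.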
The best explanation is therefore the most energy-efficient one compared to the other possible explanations. We get a lot of good explanations from the idea of evolution compared to what its rivals provide.
Interestingly enough, recognition also has the characteristic that it's efficient when compared to other possible recognitions. It follows the least energy principle in many cases, too: it explains/comprehends the event to be recognized using the least energy to do so.
This point should be taken into consideration when trying to quantify and understand Occam's Razor.
Feynman's diagrams, and how drastically they saved time in figuring particle interactions compared to Schwinger's exhaustive calculations, are also an instance of the least energy rule.
Regarding category mathematics, the same holds: hierarchies and categories create problems for mathematics, but not for verbal thinking.
Consider how protons/electrons give rise to atoms, which give rise to compounds, for instance, or how carbon atoms' bonding gives rise to polymers and protein chains, and those to enzymes, the latter an even higher category. That is tough for math to scale up, but easy using language.