Epistemology is the branch of philosophy concerned with the nature of human knowledge: what are truth, understanding, and meaning?
Many different approaches have been taken to this most central of philosophical questions, going back at least to the ancient Greeks; see Wikipedia for an overview.
The approach taken here is based on the following principles:
1. The only primary truth is subjective. This is the only thing we can know for sure, without any further contingencies.
2. Across many people’s brains, across time, there is strong evidence for an objective reality outside of our subjective experience. However, our own individual knowledge of this objective external world is only obtained through our own subjective experience, so it is necessarily secondary to that primary subjective experience. In other words, the objective world and everything we believe to be true about it is fundamentally contingent on the subjective state of each individual’s brain – there is no “direct”, “absolute”, “pure” access to any “true” facts about the objective external world.
3. Distinct individual humans (or other similarly “intelligent” agents) cannot directly share their own subjective experience, which is, by definition, subjective and thus available exclusively to each individual. All sharing across individuals is routed through the objective external reality (otherwise, under a fully solipsistic worldview, the “other” does not actually exist), and is contingent on extensive cultural learning about language and social conventions that enable communication across brains.
4. Therefore, all shareable knowledge about the objective external world is fundamentally contingent on a stack of assumptions that must also be shared by the individuals entering into such a sharing relationship. The truth value of these assumptions is not directly verifiable, because they are assumptions about the nature of objective reality, and we do not have any primary, privileged access to that.
5. The “validity” (truth value) of these assumptions derives instead from the work that they accomplish in helping individuals come to a mutually consistent understanding of the objective world. For example, the scientific method has proven its value as a procedure for generating a consistent body of understanding across individuals, across time. See the discussion of truth-likeness below.
By way of analogy, the validity of this stack of assumptions is based on a kind of end-to-end backpropagation process, where many people keep using the same stack across time and across different domains of understanding; if these assumptions continue to enable the development of mutually consistent, stable, “individually satisfying” levels of understanding across these individuals, across time, then the effective “truth value” of these assumptions increases.
But, per above, there is no point at which they obtain anything like “absolute truth”. Instead, it is essential that each individual recognize the presence and role of this stack of assumptions in shaping their subjective feelings of understanding and knowledge, and fundamentally acknowledge that these assumptions are not in any way independently valid beyond their evident utility as defined above.
So what is this stack of assumptions? Well, the first entries are articulated above. Yep, this is a fully recursive enterprise. You gotta start somewhere.
Some other important tools in the assumption stack are:
• Parsimony: simpler explanations that account for the same amount of data are generally to be favored, because they are likely to generalize better to novel situations. A classic example is the heliocentric model of the solar system, which is fundamentally very simple compared to the epicycle theory, which required lots of complex assumptions and corrections to account for the same data.
• Consistency: Logically, if something is reliably true about the objective world, then everyone should experience the same effects of it in their own subjective worlds. Thus, we should always be on the lookout for things that are consistent across individuals, and across time, as indicators of what is real. This is a cornerstone of the scientific method. Furthermore, as we develop theories about the nature of objective reality, obviously we want these theories to be self-consistent, not mutually contradictory. Thus, we seek a fully self-consistent stack of assumptions and more detailed theories.
• Levels of analysis: It is possible to describe the same physical reality at multiple different levels of abstraction or analysis. At the bottom level, it is most parsimonious to assume that there is just a kind of bare physical reality, operating independently and autonomously according to the most fundamental laws of physics. Everything else is just an abstraction that we impose onto this bare reality in order to understand it more efficiently and effectively. The principle of emergence and associated discussion there is essential for understanding the “reality” of emergent levels above the most fundamental physics level.
In brief: levels of analysis and emergent phenomena are sufficiently reliable cognitive tools (assumptions) that they provide an essential part of how we understand objective reality, without which we would be limited to only talking about specific configurations of fundamental particles operating under basic physical laws. The reality of an emergent phenomenon is validated whenever a cognitive system compares across multiple different physical systems that share a common emergent-level organization, or in describing the stable aspects of the behavior of a single such system across time, in the face of substantial thermal noise and other underlying changes (Figure 1).
Neither of these operations is available outside of a cognitive system, so again they are not an intrinsic part of that base-level “raw” physical reality. But they are absolutely “real” in the sense that they enable shared, mutually-consistent, accurate predictive understanding of the behavior of such physical systems (i.e., they are truth-like).
Skepticism and the desire for higher standards of truth
The foundations of this approach to epistemology are notably weak compared to what perhaps one might otherwise desire. However, the claim would be that anything stronger is just delusional, and it is better to have an accurate understanding of weak foundations rather than a delusional understanding of stronger ones.
Another potential reaction is to throw up one’s hands and give up on the entire enterprise, given how weak its foundations are. How is it possible to ever make progress when everything is so contingent? This is the perennial role of the skeptic, which is certainly easy to defend, but it also runs the risk of defeating the potential for progress where such potential may actually exist.
Comparison, similarity, and categorization in a one-off world
Figure 1:
A single instance of a “thing” is isomorphic with its base-level physical reality. It just is, and doesn’t require any kind of abstract description. Once you introduce two instances, however, the very assertion that these are two “instances” of a “thing” immediately requires some way of comparing across the two instances, which in turn requires some basis for such a comparison: does shape matter? If so, what aspects? Does material matter? Thus, any attempt to assert any level of description above the raw base-level physical reality is subject to assumptions about these criteria, and requires a cognitive system capable of performing the necessary comparisons to test for these criteria.
Any attempt to posit that “this thing” is the “same” as some “other thing” violates the bare physical reality that everything is its own unique blob of matter (Figure 1). No two physical entities (composed of fermions) can ever be completely identical (they at least have different spatiotemporal timelines), and the individual constituents are exceptionally unlikely to be precisely comparable in any way (measurement issues aside) as the systems get to the macroscopic sizes that we can actually perceive and directly understand.
Furthermore, any complex combination of molecules is exceptionally unlikely to ever be identical to itself at a prior moment in time, given the ever-evolving nature of the physical components and their relationships to each other, under the constant influence of thermal noise. This is apparently an issue that philosophers have wrestled with. A “thing” is not really a “thing” at the bare physical level.
Thus, any attempt to assert that two things are “the same” necessarily must be qualified relative to an abstract, mental category based on some kind of criteria, that always involves at least some “rounding error”: differences that we just ignore because they are small relative to the main dimensions of interest.
We can posit that two objects have the same mass, or size, or color, or various other such properties, but it is literally impossible for any two objects to be the same in all respects, and most of these other more limited claims are only approximately true, at least due to the limitations of quantum measurements, if not due to more prosaic macroscopic measurement errors.
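The point that “the same” is only ever true relative to chosen criteria and a chosen tolerance can be made concrete with a tolerance-based comparison. A minimal Python sketch (the masses and tolerances are invented for illustration):

```python
import math

# Two measured masses (kg) of nominally "identical" parts; invented values.
mass_a = 1.0000004
mass_b = 1.0000009

# "Same mass" is never absolute -- it holds only relative to a tolerance.
print(math.isclose(mass_a, mass_b, rel_tol=1e-6))  # True at micro precision
print(math.isclose(mass_a, mass_b, rel_tol=1e-9))  # False at nano precision
```

Tightening the tolerance is exactly the process of shrinking the “rounding error” we are willing to ignore, and at some point any two physical objects cease to be “the same.”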
Thus, any attempt to postulate natural categories of things in the world is subject to these limitations: all such categories are necessarily approximate (except at the most basic physical level: it is highly likely that all electrons are identical except for certain well-specified properties such as their spatiotemporal location).
The category of human brains or people is a rough collection of individuals, each entirely unique in their spatiotemporal timeline and physical configuration. Although we are prone to want to generalize across such categories, it is important to be aware of this fundamental imprecision. And any thought experiment that requires two “identical” people except for some specific property is entirely suspect.
All of these points play a critical role in the foundational issue of emergence, which depends critically on the ability to compare across multiple different physical instantiations of complex physical systems. It is only when we engage in this comparison that we can establish that an emergent phenomenon depends on some higher-level properties of physical systems, and is not directly reducible to just the “sum” of its parts. Furthermore, a defining property of emergence is that multiple different physical systems can give rise to the “same” emergent phenomenon, as in the example of a system of gears composed from different materials.
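The gears example can be sketched in code: two gear trains built from different materials but sharing the same tooth-count organization exhibit identical emergent-level behavior. This is a toy sketch; all names and numbers are invented for illustration.

```python
def output_speed(input_rpm, tooth_counts):
    """Output speed of a simple gear train: each meshing pair scales
    the speed by the ratio of driving to driven tooth counts."""
    speed = input_rpm
    for driving, driven in tooth_counts:
        speed *= driving / driven
    return speed

# Same tooth counts (the emergent-level organization), different substrates.
steel_train = {"material": "steel", "gears": [(20, 40), (10, 30)]}
nylon_train = {"material": "nylon", "gears": [(20, 40), (10, 30)]}

s1 = output_speed(100.0, steel_train["gears"])
s2 = output_speed(100.0, nylon_train["gears"])
assert s1 == s2  # identical emergent behavior despite different matter
```

The comparison in the final line is only possible for a cognitive system that abstracts over the material: at the level of the raw matter, the two trains share nothing.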
Thus, we conclude that emergent phenomena are entirely a construction that we impose onto the physical world, and they do not exist as such at the bare physical level where no such comparisons are possible. But again, they are not arbitrary, and are subject to all the constraints of objective knowledge per above.
And yet, due to the constantly changing nature of any macroscopic physical thing, everything that we think about at this macroscopic level is effectively an emergent phenomenon: there are certain properties that we use to define the relevant nature of the thing in question, and we consider it to be the “same” thing so long as those properties still hold, as an invariant property of the thing relative to its ever-changing underlying physical substrate.
Truth-likeness
Figure 2:
Definition of truth-likeness in terms of an approximate description of the behavior of a real system under a given theoretical framework, that has an explicit model implementation, which then allows detailed comparisons between the behavior of the model vs. the real system. Only the idealized model can provide a precise, 100% accurate implementation of a given theory. The accuracy of the mappings between the model and the real system determines the truth-likeness of the theory. From Psillos (2005) via Alex Petrov.
A simple framework for capturing a graded notion of the accuracy or truth-value (termed truth-likeness) of a given scientific theory is shown in Figure 2 (Psillos, 2005). A verbal theory can in principle be fully realized via a working model, and thus the relationship between the theory and this model could be considered one of perfect truth (although this is obviously not directly truth about the objective world). The accuracy of the mappings between the model and the real system determines the truth-likeness of the theory.
To the extent that the model provides highly accurate predictions of the behavior of the real system, the truth-likeness of the overarching theory increases. As discussed in the prior section, the model and the theory are always an idealization of the real system, because, at the level of the bare physical reality, the real system is isomorphic with the actual atomic and molecular stuff that the real system is made of, at each moment in time, whereas the theory and model represent an attempt to capture what is common across “identically prepared” instances, and across time.
Typically, per the assumptions of parsimony and levels of analysis, the theory and the model represent a much more abstract, simpler level of description that intentionally eliminates a large number of irrelevant sources of variance across instances of a given object of interest, and only captures what is most functionally relevant: what actually makes the system function in the manner of interest.
In the language of mathematics, by way of analogy, a theory and its corresponding model are attempting to capture the principal components of the function of a real system — the core elements that capture the most “variance” about how the system behaves. With such elements in place, the truth-likeness score in terms of predictive accuracy can be optimized in terms of the greatest accuracy given the least amount of model complexity. This ideal is reflected in the minimum description length (MDL) principle and the Bayesian information criterion (BIC) framework, for example.
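To make the BIC trade-off between accuracy and complexity concrete, here is a small sketch on synthetic data (the underlying law, noise level, and model degrees are all invented for illustration): a parsimonious model and an overly complex one fit the same data, and the complexity penalty typically favors the simpler one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "real system": a simple linear law plus thermal noise.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

def bic(y, y_hat, k):
    """Bayesian information criterion under Gaussian residuals:
    BIC = n*ln(RSS/n) + k*ln(n); lower is better."""
    n = y.size
    rss = np.sum((y - y_hat) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

# Compare a parsimonious (degree-1) model against a complex (degree-9) one.
for degree in (1, 9):
    coefs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coefs, x)
    print(degree, bic(y, y_hat, k=degree + 1))
# The degree-9 fit has a slightly lower residual error, but the k*ln(n)
# complexity penalty favors the parsimonious degree-1 model here.
```

The simpler model wins not because it fits this data better, but because its accuracy comes with far fewer assumptions, which is precisely the parsimony principle in quantitative form.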
In practice, it is not always possible to exactly quantify such things, but these are at least helpful at the level of analogy for the factors upon which theories and models are typically evaluated. And a significant amount of subjectivity enters into such evaluations, based on the specific aspects of the real system one is attempting to capture. If the goal is to understand metabolic costs of neural function in the brain, that dramatically changes the target phenomena of interest relative to a different goal of understanding particular cognitive, information-processing functions.
For example, when trying to understand how the brain works, one can make models of neurons at many different levels of biological realism, from very elaborate attempts to capture the detailed connectivity and physical structure of actual neurons in an actual brain, to much more abstract, simplified versions of neurons that admit to closed-form mathematical analysis. The abstract neurons may do a fine job of capturing certain abstract information-processing properties of the brain, but more biologically-detailed neurons may be required to understand other aspects of brain function.
It is the job of each theory and model to clearly articulate what the target phenomena are, and why each aspect included in the model is necessary to capture a given element of the target phenomena. If something in the model is truly superfluous relative to the target phenomena, clearly it should be omitted from the model, or the domain of target phenomena should be expanded to motivate its inclusion.
Thus, again, there is no single right level of analysis, or one single model that is the best at capturing everything about the real system, except for the real system itself.
The evident fact across many domains of objective reality is that it is almost always possible to create highly simplified, highly abstract models that accurately capture meaningful target phenomena of interest. This strongly supports the reality of emergent phenomena and the value of considering different levels of analysis. As long as everyone is aware of all the contingencies and stacks of assumptions at work, such abstract models provide entirely “valid” and “true” ways for human brains to understand the complexities that have emerged within the objective reality that we find ourselves in, here on planet Earth, with its high levels of evolved biological complexity.
Empirical epistemology
If you follow the full stack of assumptions to its “logical” conclusion, the only way to really understand the nature of human understanding is to understand in detail how the human brain functions, because that is where our knowledge somehow emerges. This then defines a purely empirical objective domain of epistemology, as opposed to the philosophical stack of assumptions outlined above.
According to the current state of cognitive neuroscience, e.g., as detailed in compcogneuro, the brain is made of neurons, which function by integrating and sending chemical and electrical signals to each other. Current AI models are based on the same principles of neural computation, and demonstrate the functionalist principle that systems of such neuron-like processing elements can capture essential features of cognition, whether they are implemented in silicon and software, or lipids, proteins and ions. The emergent behavior can be the same, despite differences in the underlying hardware. This is another vote in favor of the power of levels of analysis and abstractions.
From an epistemological standpoint, the critical thing about networks of neurons is that they are not based on any kind of formal logic. Instead, they operate by creating hierarchies of increasingly abstract categories, for example “dog”, “chair”, or “truth”. The fundamental learning mechanism driving the formation of these categories is predictive learning, which basically means that we form categories that allow us to make accurate predictions about the future state of the world, much as in the truth-likeness framework.
This is basically an empirically-validated version of the levels of analysis assumption: our brains naturally organize the world into more abstract categories, to the extent that doing so allows us to better predict what will happen next. The scientific method and scientific theories are just a more formalized extension of this same fundamental process.
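The claim that category-level abstraction earns its “reality” through predictive accuracy can be sketched as a toy simulation. Everything here — the features, the categories, and the behaviors — is invented purely for illustration: each observed instance is physically unique, yet what happens next depends only on its category, so predicting at the category level succeeds even for never-before-seen instances.

```python
import random

random.seed(1)

# Toy world: each "thing" is a noisy feature vector, and its behavior
# (what happens next) depends only on its category, not its exact details.
def make_instance(category):
    base = {"dog": 0.8, "chair": 0.2}[category]
    return [base + random.gauss(0, 0.05) for _ in range(4)], category

def behavior(category):
    return {"dog": "barks", "chair": "stays_put"}[category]

# Learn category prototypes (mean feature vectors) from experience.
train = [make_instance(c) for c in ["dog", "chair"] * 20]
protos = {}
for c in ("dog", "chair"):
    feats = [f for f, cat in train if cat == c]
    protos[c] = [sum(col) / len(col) for col in zip(*feats)]

def predict(features):
    # Categorize a novel instance by its nearest prototype, then predict
    # its behavior from the category: prediction via abstraction.
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(features, p))
    category = min(protos, key=lambda c: dist(protos[c]))
    return behavior(category)

# A never-before-seen dog: no stored instance matches it exactly, yet
# the category-level abstraction still predicts its behavior correctly.
novel, _ = make_instance("dog")
print(predict(novel))  # "barks"
```

The abstraction (the prototype) discards all the instance-specific noise, and it is exactly that discarding which makes accurate prediction of novel cases possible.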
Why do our brains do this? Because it works: it allows us to better survive and out-compete other beings that aren’t doing this. Therefore, these abstractions are real in the sense that the nature of objective reality is consistent with them.
Thus, evolution and survival-of-the-fittest is the ultimate adjudicator of truth. This does not mean that “might makes right”, because in fact, as we have seen in the course of human cultural evolution, societies that develop greater scientific understanding of our objective reality can then use that knowledge to make tools that create far greater strength and power than raw physical strength. Knowledge is power, and ultimate knowledge is the ultimate power. See meaning of life for further discussion.
Soft, distributed categories
The categories that emerge in the brain are fundamentally graded (“soft”) and distributed (multi-dimensional, partially-overlapping). These properties are critical for efficient search through the exponentially-huge representation space that must be navigated through learning and online processing of the world around us. But they are a further departure from anything like the discrete, hard symbols of formal logic.
Instead, we think in terms of “central tendencies” and similarity-based analogies. The goal of self-consistency is hard to accomplish, because everything is so squishy and overlapping — as you undoubtedly have experienced, people are easily capable of holding many seemingly contradictory beliefs at the same time. Each such belief is just a different set of neurons with their own separate set of connections — the ability to specialize knowledge for different situations is essential for learning and efficient processing with partial, incomplete understanding.
But it makes the process of trying to establish a fully self-consistent, all-encompassing understanding of the nature of objective reality a rather challenging task, requiring many iterations and waves of incoherent neural activity sloshing around in the brain for years, before any kind of stable, consistent understanding can emerge.
There is nothing automatic or “natural” about that process: it takes work. What is natural is a kind of effortless perception of things that seem obvious, until you try to really connect them all and explain it to someone else, in a way that actually makes sense. That is essentially what I’m trying to do here in this website, and as always, it is a work in progress, but the process of trying to do so is what drives that progress.
Criteria for understanding
There are various philosophical traditions that have tried to clarify the criteria for when a person thinks they understand something. These include three main criteria (Fleisher, 2022):
• Belief: some kind of mental representation about the content of what is being understood (the “content”).
• Success: a way of validating the truth value of the belief, as in the truth-likeness discussed above.
• Justification: an ability to appeal to normative principles or rules that justify the belief.
This provides an elaboration of the basic elements of the truth-likeness framework: the belief component corresponding to the theoretical description, and the predictive validity providing the success criterion. The justification provides an additional dimension, closely related to the consistency factor discussed above.