
Why Large Language Models Will Not Understand Human Language

A machine that can perfectly understand and respond to human language has been the paradigm of success in artificial intelligence since Turing. The last decade has featured an ongoing revolution in language-focused machine learning, enabled primarily by new architectures and skyrocketing scale. A single model with billions of parameters, like OpenAI’s GPT-3, can complete a huge variety of tasks, from answering questions to generating text, based on few or even zero examples. Yet this revolution will not continue unhindered. I argue that large language models (LLMs) are structurally incapable of achieving human-level language abilities. By applying critical insights from cognitive science to deep learning, I develop both empirical and theoretical arguments to temper the overzealous excitement that LLMs will soon ‘solve’ natural language processing (NLP). There is more to language than LLMs can grasp. Finally, I sketch some more tractable pathways to language understanding by machines.

Crucially, my thesis is distinct from other long-standing and contentious debates in cognitive science. I do not defend the related but much more heavyweight assertions that artificial intelligence in general is impossible or that connectionism is false. Rather, I muster evidence against the specific approach of LLMs, showing that this method is extremely unlikely to match the human ability to understand and use language. These problems with LLMs are structural, and not merely incidental quirks of certain models or inadequacies due to insufficient scale or incomplete data. The issues stem from the constitutive features of LLMs, like applying statistical learning to text data.

Why should we care if LLMs can truly “think” or “understand”? Often, unarticulated fears and anxieties get entangled with this question – worries about our obsolescence as humans, of AI taking over our jobs and systems, of AI becoming sentient and evil, of being unable to keep up with our own creation. These are all fears worth exploring in their own right. Sometimes they are dismissed with a simple “oh, but it’s not really thinking.” This is not enough. We need to dig into these problems and their implications. But in this essay, I aim to bypass these fears somewhat by focusing on the structural and technical features of LLMs, and the philosophy and cognitive science of language understanding.

1. The Structure of Large Language Models (LLMs)

A language model (LM) is a system for predicting strings in a sequence. Formally, an LM is simply a probability distribution over sequences of words. From this distribution, the model derives the conditional probability of each word appearing next, given the words that precede it. Early research centered on N-gram LMs, which just use the relative frequency of short word sequences in the training data to predict what words come next. In contrast, LLMs are the state of the art: models with (at least) millions of parameters that train an artificial neural network (ANN) to predict strings. Simply put, you can give an LLM a fragment of text, and it can tell you what is most likely to come next based on its statistical model of human language. Once the system gets good enough at predicting what comes next, it can generate new patterns of text that match the structure of its corpus.
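To make the N-gram idea concrete, here is a minimal sketch of a bigram model that estimates the probability of the next word purely from relative frequencies. The toy corpus and the function names (train_bigram, next_word_probs) are illustrative assumptions, not taken from any particular library.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Relative-frequency estimate of P(next word | previous word)."""
    total = sum(counts[prev].values())
    if total == 0:
        return {}  # the model has never seen this context
    return {word: c / total for word, c in counts[prev].items()}


# Toy corpus (hypothetical); real N-gram models train on far more text.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
probs = next_word_probs(model, "the")  # "cat" follows "the" twice, "mat" once
```

An LLM replaces these counts with a neural network, but the training objective is of the same kind: assign high probability to the word that actually comes next.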

The GPT-3 Architecture, on a Napkin
The architecture of Transformer-based LLMs, the most common structure for LLMs (including GPT-3).

These models start by converting words to vectors (embeddings), where semantically similar words are closer together in the representational space. Then, they train an ANN to predict an output label (the next word in text) given some input vector (the context). Almost all modern LLMs involve transformers, an architecture that uses self-attention mechanisms to weight the significance of each word in the input and takes advantage of parallelization to process all the input data at once (Vaswani et al, 2017). Due to a perfect storm involving huge open-access databases of internet text, an outpouring of investments from tech giants, and faster compute with new cloud-based TPU and GPU processors, these LLMs have become the golden child of deep learning.
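As a rough illustration of self-attention, the sketch below computes scaled dot-product attention over a few toy word vectors. It is a simplification under stated assumptions: the two-dimensional embeddings are made up, and the learned query, key, and value projections and the multiple heads of a real Transformer are omitted, so each vector serves as its own query, key, and value.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Scaled dot-product attention with no learned projections."""
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        # Score every position against the current one, scaled by sqrt(dim).
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all input vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(row, vectors)) for j in range(dim)]
        )
    return outputs, weights


# Hypothetical embeddings: the first two words are similar, the third is not.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(toy_embeddings)
```

In this toy case the first word attends more strongly to the second (similar) word than to the third, which is the sense in which attention weights the significance of each word in the input.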

2. Evaluating language understanding

I asked GPT-NeoX, an open-source version of GPT, if LLMs have human-level language abilities.[1] Here are some of the more coherent sentences from its response:

It is a complex task to define the human ability of language. As a community, we do not have a consensus…it is not a simple task to make an argument that the human ability of language is not contained in large NLP models.

– GPT-NeoX

I hold that this model is just unthinkingly reproducing patterns from its training data and that no LLM understands language. While limited tests can make LLMs seem successful, more thorough and extended imitation games will eventually betray the system’s inadequacy. Even Yoshua Bengio, one of the godfathers of deep learning, stated that the field “hasn’t delivered yet on systems that can discover high-level representations—the kind of concepts we use in language” (Saba, 2022). Just producing human-like text is not enough, as having human-level language abilities is a symptom of understanding extra-linguistic representations like concepts and situations.

How do we know if LMs have human-level language abilities? Current research relies on both intrinsic measures to calculate the LM’s theoretical accuracy, and extrinsic measures to evaluate the model’s performance on concrete tasks. These metrics illustrate both the successes and the blind spots of LLMs. Perplexity is one key entropy-based intrinsic measure for determining how well an LM predicts an unseen test set. Lower perplexity indicates higher predictive power and accuracy. A perplexity of 10-12 is considered human-level (Shen et al, 2017), and GPT-3 achieves a word-level perplexity of 20.5 (Brown et al, 2020). While this is an impressive result, it is still a long way from human performance – and it required a model with 175B parameters, 45TB of training data, and $12 million in compute costs (Wiggers, 2020). The most comprehensive benchmark for evaluating LMs on actual language tasks is SuperGLUE, a composite of many tests from reading comprehension to recognizing words in context to causal reasoning (Wang et al, 2019). GPT-3 achieved a total SuperGLUE score of 71.8%, where the human baseline is 89.8%. GPT-3 also reached an accuracy of 80.1% on the Winograd schema challenge, a difficult assessment that requires the model to reason about its world-knowledge to resolve an ambiguity in a statement (Levesque & Morgenstern, 2012). Are these remarkable empirical results enough to show that LLMs have already matched or exceeded human language abilities?
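For readers unfamiliar with perplexity, here is a small sketch of the standard definition: the exponential of the average negative log-probability the model assigns to each token of the test text. The probabilities below are invented for illustration; in practice they would come from a trained model.

```python
import math


def perplexity(token_probs):
    """Exponential of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# A model that gives every actual next word probability 0.1 is, on average,
# as uncertain as if it were choosing uniformly among 10 words.
uniform_ten = perplexity([0.1] * 5)

# Hypothetical per-token probabilities for a confident and a hesitant model.
confident = perplexity([0.5, 0.4, 0.6, 0.5])
hesitant = perplexity([0.05, 0.1, 0.02, 0.08])
```

On this reading, a perplexity of 20.5 means the model is roughly as uncertain as a uniform choice among about twenty words at each step, while the human-level figure corresponds to about ten to twelve.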

A closer analysis reveals serious flaws in these findings. LMs learn by finding co-occurrence patterns in the streams of symbols from the input data. LLMs trained on a huge corpus of internet text have almost certainly encountered sentences very similar to the test prompts. This is the duplication problem: testing a model on information it has already been exposed to, like giving a student an exam they have a cheat-sheet for or have practiced many times. Unlike humans, who can understand and reason about the underlying representations connected to words, GPT-3 just looks back on its many terabytes of training data to assess how often these tokens occur together. Therefore, LMs can simulate language comprehension, although more probing always reveals the lacuna of basic understanding behind this illusion. An LLM is like one of Searle’s Chinese rooms, except no philosophical arguments are needed to establish its blindness to meaning – it is enough to just understand the model and interact with it.

A diagram of Searle’s Chinese Room. Source: Wikicomms.

In fact, the creators of GPT-3 admit that the model has an “increased potential for contamination and memorization” (Brown et al, 2020, p. 7) because the tests are present in the training data. The authors tried to reduce this contamination by eliminating exact matches of the test cases from the input data. However, GPT-3 can still exhibit (illusory) high performance by exploiting similar patterns from the training data to answer questions, even if the exact answer is not present. Even more damning is the finding that the frequency of terms in an LLM’s input data is linearly correlated with its performance on related tests, suggesting these models mostly rely on memorization-like mechanisms (Razeghi et al, 2022). Other studies found that LLM performance can be fully “accounted for by exploitation of spurious statistical cues in the dataset” (Niven & Kao, 2019), and their accuracy results from employing heuristics that only work for frequent example types (McCoy et al, 2019). Thus, LLMs cannot make generalizable, novel, and robust linguistic inferences beyond statistical associations.
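The limits of exact-match filtering can be sketched with a simple n-gram overlap check. This is an illustrative assumption about how such a filter might work, not a reproduction of the GPT-3 authors’ procedure; the example sentences, the trigram window, and the function names are hypothetical.

```python
def ngrams(text, n):
    """Set of all n-word windows in the text, lowercased."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_fraction(test_item, training_text, n=3):
    """Share of the test item's n-grams that also occur in the training text."""
    test_grams = ngrams(test_item, n)
    if not test_grams:
        return 0.0  # test item shorter than n words
    train_grams = ngrams(training_text, n)
    return len(test_grams & train_grams) / len(test_grams)


# Hypothetical training snippet and two test items.
training_text = "the trophy does not fit in the suitcase because it is too big"
exact_copy = "the trophy does not fit in the suitcase"
paraphrase = "the trophy would not fit inside the brown suitcase"

exact_score = overlap_fraction(exact_copy, training_text)       # flagged
paraphrase_score = overlap_fraction(paraphrase, training_text)  # slips through
```

The verbatim copy is fully flagged, while the close paraphrase shares no trigram with the training text and passes the filter untouched, even though a model trained on the original sentence has effectively seen the answer.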

Abstract art based on an AI in Searle’s Chinese Room, generated with Midjourney.

Viewing LLMs as sophisticated search engines that scan through their training data can help explain why they perform well on some tasks and fail on others. Many examples show that GPT-3 lacks conceptual understanding and has no idea what the words it uses mean. For instance, it fails to keep track of objects and characters in stories, thinks a swimsuit is appropriate attire for a courtroom when “clean” is in the sentence, states that grape juice is poison when “sick” appears nearby, and wanders into irrelevant nonsense in any response longer than a few sentences (Marcus and Davis, 2020). The original paper also shows that the model cannot infer basic logical relationships between two sentences and cannot do any math but simple arithmetic that can be memorized from tables in the training data (Brown et al, 2020). Further, the same tests used to assess human linguistic capabilities reveal the failures of LLMs. On an extensive set of psycholinguistic tests, BERT (a major LLM) struggles with pragmatic inferences, shows context insensitivity, cannot predict clearly implied events, fails to prefer true over false completions for sentences on category membership, and generally only succeeds when it can exploit loopholes in the training data (Ettinger, 2020). Whenever uniquely human language abilities are tested, LLMs malfunction.

Ultimately, this analysis shows that even the most advanced LLMs do not understand language. Further, their successes are only possible with the aid of a human intelligence to cue the model with well-designed prompts, collect enough relevant training data, and scan through the generations for appropriate responses. This makes LLMs a human-in-the-loop system, which cannot exhibit linguistic capabilities without a person to guide the model. LLMs may employ computational structures similar to those of human language, for as Futrell et al (2019) find, the “behavior of neural language models reflects the kind of generalizations that a symbolic grammar-based description of language would capture” (p. 1). However, the fact that LLMs fail on unfamiliar or untrained prompts suggests that they use a simpler and more rigid grammar than human language, where “even slight changes may cause the [program] to fail” (Granger, 2020, p. 27). Larger models simply allow the LLM to hide its inability to understand for longer intervals. Of course, LLMs also lack a basic element of language – communicative intent. They do not express meaningful intentions or try to interact with their prompters, instead just babbling about what they are prompted to babble about, often in seemingly random and contradictory directions. Therefore, LLMs can be seen as a kind of sophisticated search engine that crawls over its input data for matches. They can memorize and recall but do not reason or understand.

3. Why LLMs will not understand language, and how other models could

Beyond this evidence on the limitations of current LLMs, more theoretical arguments show that this kind of system cannot reach human-level language understanding. Critics of connectionism have long argued that language relies upon an underlying “language of thought,” involving representations with systematicity and combinatorial structure (Fodor & Pylyshyn 1988; Fodor 1998). Although these are important considerations, my arguments do not depend on these claims and are targeted specifically at LLMs. The fundamental problem is that deep learning ignores a core finding of cognitive science: sophisticated use of language relies upon world models and abstract representations. Systems like LLMs, which train on text-only data and use statistical learning to predict words, cannot understand language for two key reasons: first, even with vast scale, their training and data do not have the required information; and second, LLMs lack the world-modeling and symbolic reasoning systems that underpin the most important aspects of human language.

The data that LLMs rely upon has a fundamental problem: it is entirely linguistic. All LMs receive are streams of symbols detached from their referents, and all they can do is find predictive patterns in those streams. But critically, understanding language requires having a grasp of the situation in the external world, representing other agents with their emotions and motivations, and connecting all of these factors to syntactic structures and semantic terms. Since LLMs rely solely on text data that is not grounded in any external or extra-linguistic representation, the models are stuck within the system of language, and thus cannot understand it. This is the symbol grounding problem: with access to just a formal symbol system, one cannot figure out what these symbols are connected to outside the system (Harnad, 1990). Syntax alone is not enough to infer semantics. Training on just the form of language can allow LLMs to leverage artifacts in the data, but “cannot in principle lead to the learning of meaning” (Bender & Koller, 2020). Without any extralinguistic grounding, LLMs will inevitably misuse words, fail to pick up communicative intents, and misunderstand language.

Art based on the symbol grounding problem for AI, generated by Midjourney.

Research on language acquisition shows that how children learn is strikingly different from the LLM training process. Infants learn language by drawing on a wide range of cues, while LMs only train on the tiny slice of the world in their input texts. When children are forced to use a more LLM-like learning process, limited to a single input modality and deprived of social interaction, they fail to learn language. For instance, Kuhl (2007) shows that infants quickly learned Mandarin words with live exposure to native speakers but learned almost nothing from TV or audio alone.

Further, statistical learning alone is not enough to ‘crack the speech code,’ as children need varied and frequent interactions with other agents in social situations to grasp the meanings of symbols (Kuhl, 2011). Indeed, it seems that a critical mechanism for language learning is joint attention, when a child and a teacher are focusing on the same thing and both aware of this (Baldwin & Moses, 1994). Recent research shows that how much babies follow other people’s gaze when speaking predicts their vocabulary comprehension 7-8 months later (Brooks & Meltzoff, 2005). Language is a system for communicating intents to real people in the real world, and the lexical similarity and syntactic structure of raw text are not enough to learn this system. LLMs are missing some key aspects of human language: these models are not part of a linguistic community, they have no perception or model of the world beyond language, they do not act as agents or express intentions, and they do not form beliefs about propositions. (At least as far as we know – and given their structure, attributing mental properties like beliefs and intentions to LLMs is not warranted unless we have very strong evidence to do so.)

In defense of LLMs, some argue for the scaling hypothesis, the idea that high-level abilities like language can arise just by increasing the number of basic computational elements. As Granger (2020) argues, intelligence may be mostly a product of allometric scaling of brain size, and “human-unique and ubiquitous abilities, very much including language, arise as a (huge and crucial) qualitative difference originating from a (colossal) quantitative change” (p. 32). Indeed, the performance of neural LLMs has scaled in a power-law relationship with model size, data size, and the amount of compute used for training (Kaplan et al, 2020). However, I do not need to dispute this hypothesis. It may be the case that sheer scale – the right quantity of the right kind of basic computational elements – is sufficient for human-level language understanding, but the kind of scale and the type of circuitry involved in LLMs will not singlehandedly achieve this milestone. Further, LLMs already have enormous scale – the recent Megatron-Turing model has 530 billion parameters and took the equivalent of 1558 trans-American flights in energy costs for training (Simon, 2021). Human babies access enormous amounts of high-definition data in many modalities, and LLMs cannot match either the quality or quantity of this information. How many more resources do we need to throw into scaling before we realize the LLM approach will not achieve full language understanding?
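To show what a power-law relationship means in practice, the sketch below fits a curve of the form loss = a * N^(-alpha) by ordinary least squares in log-log space, where a power law becomes a straight line. The model sizes and losses are synthetic, generated from hypothetical constants (a = 5.0, alpha = 0.05); they are not measurements from Kaplan et al or any real model.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-alpha) via a least-squares line in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(loss) for loss in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, alpha)


# Synthetic data from assumed constants, for illustration only.
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.05 for n in sizes]

a, alpha = fit_power_law(sizes, losses)
# With alpha = 0.05, each tenfold increase in size multiplies loss by 10**-0.05.
tenfold_factor = 10 ** -alpha
```

Under these assumed constants, every tenfold increase in model size buys only about an 11% reduction in loss, which illustrates why steady power-law gains can still demand enormous resources.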

4. Conclusions and alternate approaches

Machine language understanding is still possible in principle, even if this popular current approach is a dead end. For example, deep learning could take inspiration from the 4E framework for cognition: to reach human-level understanding, machines must be embodied, embedded, enactive, and extended (Borghi et al, 2013). One way to implement this approach would be to integrate an LLM as just one module in a larger system, like a robotic agent in a reinforcement-learning environment. The agent could use the LLM’s powerful text processing capabilities as needed but could also use other modules to process different sensory modalities, interact with other agents, and take actions. Embedded in a more realistic environment, the agent may over time learn how to draw on all of these sub-systems for optimal behavior – potentially including language understanding. Building up a model of the world could allow the agent to connect the word embeddings from the LLM to external referents like objects and actions. Thus, embodied cognition could be a fruitful approach to solving the grounding problem for machine language understanding.

A diagram showing how neuro-symbolic AI and deep neural networks (as in LLMs) could be combined into an integrated system. Source: Knowable Magazine.

Another approach argues that LLMs only need to be augmented with modules for symbolic reasoning and world modeling. Advocates for this approach argue that LLMs function like the System 1 of human cognition, performing fast, heuristic, but often flawed inferences, and just need to be supplemented with a more deliberative System 2 module. This System 2 could represent and update world knowledge with hierarchical Bayesian modeling, a promising approach to cognition (Tenenbaum et al, 2011). As human children seem to use built-in templates like intuitive psychology and physics to learn, these models will likely need to be preprogrammed with some basic theories informed by scientific research (Lake et al, 2017). For instance, one MIT team combined GPT-3 with a symbolic world state model to dramatically improve the coherence of the LLM’s text generation (Nye et al, 2021). These neuro-symbolic systems can harness the power of deep learning while rectifying its shortcomings.
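The dual-system idea can be sketched in a few lines: a fast generator proposes candidate continuations, and a slower symbolic module rejects any that contradict the tracked world state. This is only a toy illustration under stated assumptions, not the implementation from Nye et al; the fixed candidate list stands in for sampling from an LLM, and the facts, names, and triple format are hypothetical.

```python
def system1_candidates(prompt):
    """Stand-in for LLM sampling: returns fixed candidates as fact triples."""
    return [
        ("Anna", "is_in", "garden"),
        ("Anna", "is_in", "kitchen"),
    ]


def is_consistent(world_state, fact):
    """System 2 check: a fact may not contradict what is already known."""
    subject, relation, value = fact
    known = world_state.get((subject, relation))
    return known is None or known == value


def dual_system_generate(prompt, world_state):
    """Return the first candidate that fits the world state, then record it."""
    for fact in system1_candidates(prompt):
        if is_consistent(world_state, fact):
            subject, relation, value = fact
            world_state[(subject, relation)] = value
            return fact
    return None  # every candidate contradicted the world model


# The story so far has established that Anna is in the kitchen.
world = {("Anna", "is_in"): "kitchen"}
chosen = dual_system_generate("Anna walked to the fridge. She", world)
```

Here the first candidate is discarded because it contradicts the stored fact, so the consistent second candidate is chosen. The text generator never has to understand where Anna is; the symbolic module carries that burden.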

In conclusion, LLMs alone cannot solve language. However, I may be wrong about this. It is risky to be an AI skeptic, as many naysayers have already been proven wrong. This paper does not make an unfalsifiable philosophical argument, but a prediction about the future of AI. Larger LMs will be highly impactful, but banal. These models may allow us to automate many routine linguistic tasks, but will not understand language or be “smart” in a way current LLMs are not. If it turns out LLMs can reach human-level language abilities, this will teach us a great deal, indicating that we can learn everything for language understanding by simply training on text data. Progress on LLMs can inform our theory of language, and psychology and linguistics should inform the development of LLMs. This interdisciplinary process is our best hope of instilling human language understanding in machines.

Appendix: The Aftermath of ChatGPT

I wrote this essay before ChatGPT was released and GPT-3 was improved with the new text-davinci-003 model. I stand by the arguments here, but of course I’ve updated my thinking after the undeniably incredible performance of GPT-3. It has achieved extraordinary capabilities with a lot of scale and tons of compute. However, as Murray Shanahan writes in the excellent paper Talking About Large Language Models:

“It doesn’t matter what internal mechanisms it uses, a sequence predictor is not, in itself, the kind of thing that could, even in principle, have communicative intent, and simply embedding it in a dialogue management system will not help. We know that the internal mechanisms of LLMs are not sensitive to things like the truth of the word sequences it predicts. It does not refer to any external “ground” for evaluating the meaning of these words.”

– Shanahan, Talking About Large Language Models, page 5.

One of the key points here is that humans can make a causal connection between words and phenomena in the real world. On the other hand, LLMs can only make a correlation between words and other words. For instance, when you ask ChatGPT “what country is to the east of Yemen,” it will answer correctly – “Oman.” However, this is not because it has built a sophisticated model of the geography of the world, or because it has developed the belief that Oman is east of Yemen. Rather, it’s just that tokens like ‘Yemen,’ ‘east,’ and ‘Oman’ were paired the most frequently in that order and context in ChatGPT’s text corpus. The model answers this question in the same way it would answer the prompt “twinkle twinkle” with “little star.” Both are simply correlation-based statistical predictions. To a human, these are two distinct types of question. One is about the real world, and one is simply pairing some text with the most likely completion. To an LLM, they’re not different.

These correlations are accurate enough that ChatGPT is almost always right. But there are hundreds of examples where the model simply confabulates, hallucinates, or otherwise bullshits the answer. It cannot think from first principles, make epistemic judgements based on experience, or compare its beliefs to a model of the world. This is why its text-based educated guessing of what word should come next can become unhinged from reality. Language is learned by talking to other language-users while immersed in a shared world and engaged in joint activity. Without this, the LLM cannot develop human-level language abilities. One of the most dangerous abilities of ChatGPT is its ability to make up plausible-sounding bullshit that looks right but is in fact wrong – for example, coming up with fake info security answers, or creating regex answers with subtle flaws.

However, it might be the case that LLMs alone can achieve language understanding. It is possible that in the process of trying to perform sequence prediction, the LLM stumbled upon emergent mechanisms that warrant higher-level descriptions like “knowledge,” “belief,” or “understanding.” Perhaps in all of this large-scale statistical learning, the LLM discovered that it could predict tokens better if it also represented the connections between words and stored a latent representation of the world described by these words. Maybe all that is needed to understand human language is contained within language itself. In other words, maybe human language is reducible to next token prediction. If we create a performant enough LLM with enough training data, it will be able to perfectly simulate language understanding. This essay argues that this outcome is unlikely for structural reasons. But it is still possible – we have been shocked by AI research before.

How can we tell if an LLM really does understand? There is no way to prove its ability to understand human language, as even millions of successful examples where it seems to understand could be disproven by a single edge case where it clearly does not understand. ChatGPT can clearly pass something like the Turing Test in most cases, but it fails (showing its true colors as an AI) in some other cases. What success rate is enough to justify calling it “understanding”? These are all important questions, and the answers are still unclear.

Works Cited

Granger, R. (2020). Toward the quantification of cognition. arXiv preprint arXiv:2008.05580.

Borghi, A. M., Scorolli, C., Caligiore, D., Baldassarre, G., & Tummolini, L. (2013). The embodied mind extended: using words as social tools. Frontiers in psychology, 4, 214.

Nye, M., Tessler, M., Tenenbaum, J., & Lake, B. M. (2021). Improving coherence and consistency in neural sequence models with dual-system, neuro-symbolic reasoning. Advances in Neural Information Processing Systems, 34.

Tenenbaum, J. B., Kemp, C., Griffiths, T. L., & Goodman, N. D. (2011). How to grow a mind: Statistics, structure, and abstraction. Science, 331(6022), 1279-1285.

Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2017). Building machines that learn and think like people. Behavioral and brain sciences, 40.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.

Bender, E. M., & Koller, A. (2020, July). Climbing towards NLU: On meaning, form, and understanding in the age of data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 5185-5198).

Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1-3), 335-346.

Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., … & Amodei, D. (2020). Scaling laws for neural language models. arXiv preprint arXiv:2001.08361.

Kuhl, P. K. (2011). Early language learning and literacy: neuroscience implications for education. Mind, brain, and education, 5(3), 128-142.

Kuhl, P. K. (2007). Is speech learning ‘gated’ by the social brain? Developmental science, 10(1), 110-120.

Baldwin, D. A., & Moses, L. J. (1994). Early understanding of referential intent and attentional focus: Evidence from language and emotion. Children’s early understanding of mind: Origins and development, 133-156.

Ettinger, A. (2020). What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models. Transactions of the Association for Computational Linguistics, 8, 34-48.

Minsky, M. L. (1974). A framework for representing knowledge. MIT-AI Laboratory Memo 306.

Marcus, G., & Davis, E. (2020). GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about. MIT Technology Review. https://www.technologyreview.com/2020/08/22/1007539/gpt3-openai-language-generator-artificial-intelligence-ai-opinion/

Futrell, R., Wilcox, E., Morita, T., Qian, P., Ballesteros, M., & Levy, R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state. arXiv preprint arXiv:1903.03260.

Fodor, J. A., & Pylyshyn, Z. W. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28(1-2), 3-71.

Fodor, J. A. (1998). Concepts: Where cognitive science went wrong. Oxford University Press.

Saba, W. (2022). AI Cannot Ignore Symbolic Logic, and Here’s Why. ONTOLOGIK on Medium. https://medium.com/ontologik/ai-cannot-ignore-symbolic-logic-and-heres-why-1f896713525b

Shanahan, M. (2022). Talking About Large Language Models. arXiv preprint arXiv:2212.03551.

Shen, X., Oualil, Y., Greenberg, C., Singh, M., & Klakow, D. (2017). Estimation of Gap Between Current Language Models and Human Performance. In INTERSPEECH (pp. 553-557).

Simon, J. (2021). Large Language Models: A New Moore’s Law? HuggingFace. https://huggingface.co/blog/large-language-models

Wiggers, K. (2020). OpenAI’s massive GPT-3 model is impressive, but size isn’t everything. VentureBeat.com. https://venturebeat.com/2020/06/01/ai-machine-learning-openai-gpt-3-size-isnt-everything/

Wang, A., Pruksachatkun, Y., Nangia, N., Singh, A., Michael, J., Hill, F., … & Bowman, S. (2019). Superglue: A stickier benchmark for general-purpose language understanding systems. Advances in neural information processing systems, 32.

Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot reasoning. arXiv preprint arXiv:2202.0720

Black, S., Biderman, S., Hallahan, E., Anthony, Q., Gao, L., Golding, L., … & Weinbach, S. (2022). Gpt-neox-20b: An open-source autoregressive language model. arXiv preprint arXiv:2204.06745.


Reclaiming Slurs through Conceptual Engineering

Images generated by MidJourney AI, based on a prompt about a conceptual hammer destroying an ideological structure.

Introduction

Ideology can leave us “stuck in a cage, imprisoned among all sorts of terrible concepts.”[1] Slurs are linked to an especially harmful kind of concept. Successfully reclaiming slur terms requires understanding and rejecting these concepts. Linguistic reclamation of slur terms, when combined with critique of the underlying concept, can put an oppressive weapon out of action and help liberate us from pernicious conceptual cages.

My analysis will not focus on the semantic theory of slurs or slur reclamation. Constructing a natural language semantics of slurs is primarily a matter for empirical linguistic research, not philosophy. Indeed, Cappelen (2017) argues that semantics should be left to specialists with the expertise to conduct empirical study and formal analysis of linguistic phenomena.[2] Of course, findings in linguistics will be very relevant for philosophers, and it is certainly within the purview of philosophy to interpret these findings and investigate the theoretical foundations of linguistics. The substantial philosophical literature on the semantics of slurs also demonstrates that philosophers can use interdisciplinary approaches to make meaningful progress in semantics. Developing theoretical semantic accounts of slurs has proven valuable. However, validating these theories will require empirical study of linguistic patterns in natural language use. Then, we can evaluate how operationalized forms of these semantic theories can explain the observed patterns. Ultimately, settling the differences between semantic theories of slurs requires linguistic research.

However, conceptual engineering and conceptual ethics are matters for philosophy. The task of philosophers is not just to describe linguistic tools, but to assess the representational features of these tools and find ways to fix their defective or harmful aspects. Therefore, instead of conducting descriptive semantics, this paper focuses on the concepts underpinning slur terms. Section 1 describes the concepts connected to slurs and explicates their normative flaws. Section 2 argues that fully successful reclamations of slurs must involve conceptual engineering, not just lexical change. Finally, section 3 addresses some important objections to this conceptual view of slurs.

1. Slurring Concepts

Slur lexical items are connected to underlying concepts (representational devices), which we can call slurring concepts. These concepts are defective and harmful in virtue of their key characteristics: they are thick, essentializing, reactive, and subordinating.

First, slurring concepts are thick concepts, with both descriptive and normative features. Slurs make a negative evaluation of some social group.

Second, slurring concepts are essentializing. As Neufeld describes, slurs designate an essence that is causally connected to negative stereotypical features of some social group.[3] This essence is a failed natural kind. For example, the N-word posits a “blackness essence” that is supposed to be causally responsible for negative features of Black people. The evidence for this semantic view is substantial, as it can explain features of slurs in natural language that other theories do not account for. For instance, it explains a systematic linguistic pattern: slurs are always nouns. This is because nouns are unique lexical devices that predicate things into enduring, essential categories like natural kinds. Neufeld’s theory has many other successful predictions and explanatory benefits. However, our primary concern is not to identify the correct semantic theory, but to understand slurring concepts and their defects. It is sufficient to say that slurs must make use of essentializing concepts to refer to a targeted group in a stable way and to warrant negative inferences about this group.

Essentializing concepts are epistemically flawed ways to describe social groups. Using essentializing concepts for real natural kinds like rock and atom is appropriate. However, social groups like races, religions, and sexual orientations are not immutable essences with strict natural boundaries, and they cannot justify attributing inherent properties to their members. Essentializing social categories produces cognitive mistakes and bad inferences.[4] Furthermore, essentializing concepts have normative harms, as they encourage dehumanization and harmful stereotypes. Treating members of a targeted group as determined by their group membership, without the autonomy of a person, is clearly dehumanizing. Empirical research shows that essentializing concepts, like a biological conception of race, result in increased stereotyping and discrimination.[5] For example, people who endorse an essentializing biomedical concept of mental illness distance themselves more from those seen as mentally ill, perceive them as more dangerous, have lower expectations of their recovery, and show more punitive behavior.[6] Simply making an essentializing concept salient can cause members of the essentialized group to perform worse on various activities, even if the stereotypes associated with the group are neutral or positive.[7] These defects alone are strong reasons to reject the use of essentializing concepts for social groups.

Third, slurring concepts are reactive, as described by Braddon-Mitchell: a reactive concept automatically tokens a reactive representation, which is a representation that shortcuts the belief-desire system and includes a motivation for action.[8] For instance, the reactive concept kike can trigger a representation of Jews that encourages prejudicial actions against them, and includes a negative view of Jews that justifies these actions. Indeed, one study demonstrated that “category representations immediately and automatically activate representations of the related stereotype features.”[9] This makes slurs uniquely dangerous forms of linguistic propaganda, as they can bypass conscious processing to produce discriminatory representations and behaviors.

Finally, slurring concepts are subordinating. They are thick concepts with a specific kind of normative component: a negative evaluation that ranks the targeted group as inferior and legitimates discriminatory behavior toward the group.[10] This represents members of the target group in ways that justify derogating, intimidating, abusing, or oppressing them. Due to their specific features, slurring concepts do not just cause subordination, they constitute subordination. This constitutive claim is surprising: if a representation is just held mentally and does not manifest in any harmful actions, how can it be subordinating?

The act of conceptualizing a social group in an essentializing, negative way creates reactive representations that result in subordinating stereotypes and inferences. Because our social reality is shaped by the way others see us, being surrounded by people who represent you as inferior or subhuman is a kind of subordination itself, even if their representations do not lead to tangible actions. Furthermore, slurring concepts are so closely tied to subordinating effects that it is not sensible to separate this kind of representation from its consequences. Holding a slurring concept leads to unconscious, automatic discriminatory behaviors, and even members of the targeted group experience inhibitions and impaired performance when a slurring concept is salient.[11] Ultimately, whether slurring concepts are constitutive of subordination or only cause subordination, the vital point is that they are subordinating.

2. Slur Reclamation as Conceptual Engineering

Reclaiming slurs is often an intentional project carried out by oppressed groups to resist their oppression and to co-opt a tool of subordination for purposes of liberation. Taking ownership of a slur and imbuing it with positive associations is an act of “weapons control” that diminishes the word’s subordinating power, effectively putting the slur out of action.[12] For example, in the 1980s, LGBT activists applied the slur “queer” to themselves in positive and pride-evoking ways, and they were largely successful in changing the word’s connotation.[13] However, I argue that changing a lexical item’s meaning is insufficient for slur reclamation. Lexical change is not an effective form of weapons control because it fails to challenge the most dangerous weapon: the slurring concept remains intact. Fully successful slur reclamation requires conceptual change, and not just linguistic change. The slurring concept connected to the lexical item must be critiqued and dismantled.

2.1 Partial vs. Full Slur Reclamations

How can we explain slur reclamation? Under semantic theories of slurs like Croom’s,[14] one might describe reclamation as the process of adding positive properties to a term that become more salient than any negative properties. This explanation cannot account for slur reclamations that do not change the valence of a term but instead detach it from an essentializing social kind. For example, the term “gypsy” as it is used in the U.S. is disconnected from the Roma social group, but the term is still attached to negative properties and used as a pejorative. At least in the American cultural context, this slur has been neutralized – it no longer is linked to an essentializing concept. However, because it still has derogating force, “gypsy” has not been reclaimed.

In contrast, Camp’s perspectival theory holds that regardless of what perspective an individual holds when using a slur, the slur is still connected to a slurring perspective.[15] However, it is empirically clear that slurs can be detached from derogating perspectives through individual and collective linguistic actions. Camp’s theory cannot explain this reclamation without substantial revisions. Regardless, her perspectival approach is insightful in emphasizing that slurs are linked to a near-automatic, integrated way of thinking about a targeted group. Rather than interpreting slurs as signaling allegiance to a somewhat vague ‘perspective,’ we interpret slurs as uses of slurring concepts. As a result of the specific features of slurring concepts, their properties are similar to Camp’s perspectives.

Finally, under Neufeld’s account, just as a slur is created when a failed natural kind is causally connected to negative properties, a slur can be unmade when the kind is disconnected from these negative properties. For instance, the reclaimed slur “queer” is still used to refer to roughly the same social kind (people with non-conforming sexual and gender identities), but it is disconnected from negative properties, and instead is even attached to positive properties. In this case, the social kind connected to the term remained the same, but the valence associated with it was neutralized or reversed. In the “gypsy” case discussed above, the opposite occurred in the US – the negative properties of the word remained, while it was disconnected from the essentializing concept (of the Roma as a social kind). Neufeld’s explanation of derogatory variation can explain both kinds of slur reclamation: holding the level of essentialization fixed, more negative slurs are more derogating, while holding the negativity fixed, more essentializing slurs are more derogating. Disconnecting slurs from essentializing concepts and reducing their pejorative force are therefore two ways to carry out reclamation projects.

All of these theories fail to directly account for the importance of confronting the underlying concept in slur reclamation. If a mental representation like a slurring perspective or concept is critical to the meaning and force of a slur, then it follows that complete slur reclamation must fix these mental representations and not merely the lexical item. Indeed, Neufeld holds a meta-semantic view where terms inherit their linguistic meaning from the mental concepts we associate with them.[16] Partial reclamations can occur when a positive or neutral version of the slur term achieves linguistic uptake, or when the lexical item is no longer associated with an essentialized social group. However, this kind of reclamation is limited and insufficient. It only decouples a lexical item from a slurring concept and does not subvert the slurring concept itself. The most dangerous weapon, the slurring concept, remains at large, and will continue to manifest in other lexical items.

Partial reclamations can thereby constitute illusions of change. They play ‘whack-a-mole’ with lexical items while failing to address the root cause. Full reclamation involves not just lexical change, but a successful dismantling of the slurring concept. The importance of the underlying concept means that “ameliorative attempts that focus exclusively on the language used are unlikely to have much success in the long run.”[17] For example, the descriptive term for intellectually deficient individuals has been changed many times, from “moron” to “idiot” to “mentally retarded.” When they were initially introduced, these were non-pejorative descriptive terms, but all were rapidly adopted as slurs for people with intellectual disabilities. This shows the insufficiency of merely changing language without critique and rejection of the slurring concept.

2.2 Conceptually Engineering Slurs

Reclaiming slurs therefore requires addressing the slurring concept. One fruitful method for carrying out full reclamation is conceptual engineering: the process of assessing our representational devices, reflecting on how to improve them, and implementing these improvements. As we have already diagnosed the flaws of slurring concepts, how can we go about fixing these representations? One obvious approach is to eliminate the slurring concept entirely. However, this is just elimination, not reclamation. It is also not clear how to eliminate a slurring concept. The characteristic features of slurring concepts give us a few lines of attack. For instance, we can reject the negative normative component of the thick concept and encourage adoption of either a pure descriptive concept (e.g. person of color) or a thick concept with a positive normative component (e.g. queer). However, this approach risks “reinforcing an essentialist construction of the group identity,”[18] as it maintains an essentializing concept of the targeted group. The slur can easily be reactivated and weaponized against its targets by reversing its valence, making this type of reclamation very precarious.

Another possible approach is to reduce the reactivity of slurring concepts. For example, perhaps training people to consciously recognize how slurs prompt automatic reactive representations of the targeted group can curb the impact of reactive concepts. Indeed, there is some evidence that implicit bias training can work to a limited degree.[19] However, this only mitigates the slurring concept’s effects. Additionally, slurring concepts are reactive because they are essentializing. Essentialism about social kinds is what leads to automatic, reactive processing about the groups targeted by slurs.[20] Likewise, attempting to undermine the subordinating force of slurring concepts starts at the end of the process, as it fails to address the features that make these concepts subordinating. Ultimately, all approaches to engineering slurring concepts lead us back to the same source: essentialism.

Disarming and rehabilitating a slurring concept therefore must start by rejecting essentialism. Failing to critique the essentializing concept leaves the conceptual foundations of the slur intact. In this sense, concepts like woman, race, mental illness, and homosexual are proto-slurring concepts. By essentializing a social category, these concepts function to lay the groundwork for slurs, making the essentialized group a target for oppression and subordination. Successful critiques of essentializing concepts can remove the ground that slurs stand upon. For example, Haslanger argues that woman is a failed natural kind used to mark an individual as someone who should occupy a subordinate social position based on purported biological features.[21] Shifting the meaning of “woman” to be more in line with its real social function can unmask this underlying ideology. Instead of conceptualizing womanhood as an essential biological category, we should treat woman as a folk social concept used to subordinate. In the same vein, Appiah critiques the essentializing concept of race, arguing that there is no biological or naturalistic basis for treating races as real categories.[22] Finally, many thinkers including Szasz and Foucault argue that mental illness is a failed natural kind used to justify social exclusion practices.[23] Conceptual engineering projects like these can undermine the essentialist foundations of slurs.

3. The Importance of Social Practice in Slur Reclamation

One objection to anti-essentialist conceptual engineering projects is that partial slur reclamations are successful precisely because they enable positive identification and solidarity within an essentialized group. For example, the N-word is a way for Black people to express solidarity and camaraderie as members of an essentialized and oppressed social category.[24] Rejecting the essentializing race concept could have at least two harmful consequences: (1) it precludes organizing and expressing solidarity along racial lines, and (2) it can lead to false consciousness, pretending that the essentialized categories do not continue to have real social effects simply because we have rejected the essentializing concept. However, solidarity does not require essentialism. Instead of treating race as an essential category, one can treat race as a social construction used to target groups for subordination. People within the targeted groups can then express solidarity not as common members of a real natural kind, but as fellow targets of arbitrary social oppression. Indeed, the liberatory, reclaimed form of the N-word does not require treating Blackness as an essential category. The reclamation can reject the essentializing concept while emphasizing the way this concept is still used to oppress and conveying solidarity and resistance amongst members of the targeted group.

However, why try to reclaim slurs at all? Why not introduce a new lexical item to communicate a new, liberating, non-essentializing concept, instead of using a term tainted by being a former slur? It seems paradoxical to intentionally choose a lexical item that one considers deeply flawed. Slur terms might also have direct lexical effects, where the word itself produces negative cognitive reactions even if its meaning is changed.[25] (For example, the word “Hitler” has negative lexical effects regardless of its conceptual content or usage.) This gives a prima facie reason to avoid the lexical item. However, there are important reasons why conceptual engineering projects should reclaim the slur word by associating it with a new concept, rather than abandoning it entirely. First, maintaining the original lexical item allows us to put an oppressive weapon out of action, and to actually turn it against the oppressors. Once reclaimed, the word no longer has its subordinating power. Instead, it can be used as a vehicle for liberatory, non-essentializing concepts that replace the slurring concept. Second, language has an important role in shaping social reality. Reclaiming terms with preexisting impacts can allow us to ameliorate or even reverse these impacts on social reality, while introducing a new term will require building its social impact from the ground up.[26] The benefits of co-opting slur terms are sufficient to outweigh the costs of lexical effects.

Finally, one especially potent objection to concept-focused slur reclamation projects is that they prioritize changing representations over changing practices. As Táíwò emphasizes, our analysis of propaganda should focus not just on mental representations, but on how these representations influence practice and action.[27] Even if a person does not hold a slurring concept, they can still act upon a public practical premise, treating members of the targeted group in essentializing and subordinating ways. On this view, the important feature of slurs is not the concept, but the way these slurs feature in oppressive social structures and license harmful actions. Therefore, it is misguided to emphasize mental representations, and our primary concern in reclamation projects should not be changing concepts. Rather, we should focus on the social structures and practices that give slurring concepts their power. Conceptual engineering is far too abstract and ideal, placing our priorities in the wrong places and failing to recognize the importance of practice. We need reality engineering, not conceptual engineering.

This objection is well taken, and I agree with Táíwò’s practice-first approach. Any attempt to fully reclaim a slur must coincide with material changes to prevent oppressive practices. However, harmful representations can be oppressive in themselves. Slurring concepts represent their targets as essentially subordinate kinds, and result in oppressive and limiting mindsets. Lifting the blinders of a slurring concept can itself be liberatory. Additionally, conceptual engineering is not mutually exclusive with practical reform, and it can help enable and guide material changes. Furthermore, a key feature of slurring concepts is that they are reactive. This makes slurring concepts action-engendering, as they automatically motivate and encourage discriminatory action. Focusing only on the harmful actions associated with a slurring concept treats a symptom, not the underlying conceptual disease. Finally, slurring concepts are integrated within larger oppressive conceptual systems that can be aptly characterized as ideologies. Therefore, reclaiming slurs and critiquing slurring concepts functions as a form of ideology critique. Conceptual engineering can make the essentializing, subordinating ideology more visible, discouraging complacency and false consciousness while promoting actions to resist this ideology.

Conclusion

Dismantling slurring concepts is an essential step in fully successful slur reclamation. This paper emphasizes the critical role of slurring concepts. I began by describing the key features of slurring concepts that enable slurs to serve their harmful function. Then, I argued that full reclamation requires not just lexical change but conceptual engineering, and that rejecting essentializing thinking is the key to disarming slurs. Finally, I addressed some objections and complications in the engineering of slurring concepts. Reclaiming slur terms and critiquing slurring concepts can serve a vital role in critiquing and resisting oppressive ideologies.

Bibliography

Appiah, Kwame Anthony. The ethics of identity. Princeton University Press, 2010.

Bolinger, Renee (forthcoming). The Language of Mental Illness. In Justin Khoo & Rachel Katharine Sterken (eds.), Routledge Handbook of Social and Political Philosophy of Language. Routledge.
PhilArchive copy v1: https://philarchive.org/archive/BOLTLO-7v1

Braddon-Mitchell, David. “Reactive Concepts: Engineering the Concept CONCEPT.” In Conceptual Engineering and Conceptual Ethics. Oxford University Press.

Camp, Elisabeth. “Slurring perspectives.” Analytic Philosophy 54, no. 3 (2013): 330-349.

Cappelen, Herman, “Why philosophers shouldn’t do semantics,” Review of Philosophy and Psychology 8, no. 4 (2017): 743-762.

Cappelen, Herman. Fixing language: An essay on conceptual engineering. Oxford University Press, 2018.

Carnaghi, Andrea, and Anne Maass. “In-group and out-group perspectives in the use of derogatory group labels: Gay versus fag.” Journal of Language and social Psychology 26, no. 2 (2007): 142-156.

Croom, Adam M. “Slurs.” Language Sciences 33, no. 3 (2011): 343-358.

Fawaz, Ramzi, and Shanté Paradigm Smalls. “Queers Read This! LGBTQ Literature Now.” GLQ: A Journal of Lesbian and Gay Studies 24, no. 2-3 (2018): 169-187.

Habgood-Coote, Joshua. “Fake news, conceptual engineering, and linguistic resistance: reply to Pepp, Michaelson and Sterken, and Brown.” Inquiry (2020): 1-29.

Herbert, Cassie. “Precarious projects: the performative structure of reclamation.” Language Sciences 52 (2015): 131-138.

Jeshion, Robin. “Pride and Prejudiced: on the Reclamation of Slurs.” Grazer Philosophische Studien 97, no. 1 (2020): 106-137.

Khoo, Justin. “Code words in political discourse.” Philosophical topics 45, no. 2 (2017): 33-64.

Langton, Rae. “Speech acts and unspeakable acts.” Philosophy & Public Affairs (1993): 293-330.

Maitra, Ishani. “Subordinating speech.” Speech and harm: Controversies over free speech (2012): 94-120.

Neufeld, Eleonore. An essentialist theory of the meaning of slurs. Ann Arbor, MI: Michigan Publishing, University of Michigan Library, 2019.

Nguyen, Hannah-Hanh D., and Ann Marie Ryan. “Does stereotype threat affect test performance of minorities and women? A meta-analysis of experimental evidence.” Journal of applied psychology 93, no. 6 (2008): 1314.

Podosky, Paul-Mikhail Catapang. “Ideology and normativity: constraints on conceptual engineering.” Inquiry (2018): 1-15.

Pritlove, Cheryl, Clara Juando-Prats, Kari Ala-Leppilampi, and Janet A. Parson. “The good, the bad, and the ugly of implicit bias.” The Lancet 393, no. 10171 (2019): 502-504.

Richard, Mark, A. Burgess, H. Cappelen, and D. Plunkett. “The A-project and the B-project.” Conceptual Engineering and Conceptual Ethics (2018).

Rieger, Sarah. “Facebook to investigate whether anti-Indigenous slur should be added to hate speech guidelines.” CBC News. Oct 24, 2018.

Stanley, Jason. How propaganda works. Princeton University Press, 2015.

Táíwò, Olúfémi O. “The Empire Has No Clothes.” Disputatio 1, no. ahead-of-print (2018).

Táíwò, Olúfẹmi. “Beware of Schools Bearing Gifts.” Public Affairs Quarterly 31, no. 1 (2017): 1-18.

  1. Nietzsche, Friedrich. The twilight of the idols. Jovian Press, 2018. Pg. 502.
  2. Cappelen, Herman, “Why philosophers shouldn’t do semantics,” Review of Philosophy and Psychology 8, no. 4 (2017): 743-762.
  3. Neufeld, Eleonore, An essentialist theory of the meaning of slurs, Ann Arbor, MI: Michigan Publishing, University of Michigan Library, 2019.
  4. Wodak, Leslie, and Rhodes, “What a loaded generalization: Generics and social cognition,” (2015).
  5. Prentice and Miller, “Psychological essentialism of human categories,” (2007).
  6. See Haslam (2011), Mehta and Farina (1997), Lam, Salkovskis, and Warwick (2005), Phelan (2005).
  7. Nguyen, Hannah-Hanh D., and Ann Marie Ryan, “Does stereotype threat affect test performance of minorities and women? A meta-analysis of experimental evidence,” Journal of applied psychology 93, no. 6 (2008): 1314.
  8. Braddon-Mitchell, “Reactive Concepts,” Conceptual Engineering and Conceptual Ethics (2020): 79.
  9. Neufeld, pg. 21. Quote is from a summary of a study by Carnaghi & Maass (2007).
  10. See Maitra “Subordinating speech,” (2012).
  11. See empirical evidence in Carnaghi and Maass (2007); Nguyen and Ryan (2008).
  12. Jeshion, Robin, “Pride and Prejudiced: on the Reclamation of Slurs,” Grazer Philosophische Studien 97, no. 1 (2020): 106-137.
  13. Fawaz, Ramzi, and Shanté Paradigm Smalls, “Queers Read This! LGBTQ Literature Now,” GLQ: A Journal of Lesbian and Gay Studies 24, no. 2-3 (2018): 169-187.
  14. Croom, Adam M, “Slurs,” Language Sciences 33, no. 3 (2011): 343-358.
  15. Camp, Elisabeth, “Slurring perspectives,” Analytic Philosophy 54, no. 3 (2013): 330-349.
  16. Neufeld, An essentialist theory of the meaning of slurs, pg. 3 (in footnote 8).
  17. Renee Bolinger, “The Language of Mental Illness,” in Justin Khoo & Rachel Katharine Sterken (eds.), Routledge Handbook of Social and Political Philosophy of Language (forthcoming).
  18. Herbert, Cassie, “Precarious projects: the performative structure of reclamation,” Language Sciences 52 (2015): 131-138. Pg. 133.
  19. Pritlove, Cheryl, Clara Juando-Prats, Kari Ala-Leppilampi, and Janet A. Parson, “The good, the bad, and the ugly of implicit bias,” The Lancet 393, no. 10171 (2019): 502-504.
  20. Prentice and Miller (2007).
  21. Sally Haslanger, “Going on, not in the same way,” Conceptual engineering and conceptual ethics (2020): 230.
  22. Kwame Anthony Appiah, The ethics of identity, Princeton University Press, 2010.
  23. See Jeremy Hadfield, “The Conceptual Engineering of Mental Illness,” jeremyhadfield.com (2020) for a review.
  24. Robin Jeshion, “Pride and Prejudiced: on the Reclamation of Slurs,” Grazer Philosophische Studien 97, no. 1 (2020): 106-137.
  25. See Cappelen, “Fixing Language,” (2018).
  26. Herman Cappelen, “Conceptual Engineering: The Master Argument,” Conceptual engineering and conceptual ethics, Oxford University Press (2019).
  27. Olúfémi Táíwò, “The Empire Has No Clothes,” Disputatio 1, no. ahead-of-print (2018).

How Imagining Can Set Us Free

Imagining gives us our freedom. Or so I argue in this paper, which aims to describe the neural basis of imagination and its role in free will perceptions. First, I review imagination on all three of Marr’s levels of analysis: its computational function, its algorithmic structure, and its neural implementation (Marr, 1982). Then, I argue that the capacity to imagine alternative possibilities is essential to perceiving oneself as acting freely. I also show that the imagination is not free and unconstrained, but has systematic constraints, and these limit our ability to act volitionally and choose among possibilities. Further, expanding the imagination results in a greater perception of free will. By imagining we can make ourselves even more free.

1. What is Imagination?

The term “imagination” presents a challenge for researchers, since it is used colloquially in a variety of ways, its meaning is the subject of intense debates in philosophy and other fields, and “imagination” does not have an agreed-upon formalized definition in mathematics, computer science, or cognitive science. However, a minimal shared concept of imagination allows us to gloss over the differences between the many sub-types of imagination, from mental imagery of visual scenes to propositional supposing, and focus on the commonalities.

Specifically, imagination is mental simulation: the ability to simulate non-occurrent possibilities, representing something in the mind without aiming to capture things as they actually are in the moment. In other words, imagination is to “represent without aiming at things as they actually, presently, and subjectively are” (Liao and Gendler, 2011). Thus, imagination can be understood as a form of “attention to possibilities,” where potential realities are projected, simulated, and operated upon internally (Williamson 2016, 4). In computational terms, imagination refers to a system’s processing or manipulation of information that is not directly present to the system’s sensors (Marques, 2009). This ‘imaginative’ processing occurs offline (when the agent is not receiving new sensory data or is not connected to its environment), covertly (without immediate consequences for the agent’s actions), and/or internally (occurring within the agent’s own latent models). Generally, the defining feature of imagination is the simulation of non-occurrent sensory states with internal representations or models.

The human capability to imagine is vital to a wide range of cognitive processes, including memory, predicting the future, conjuring alternate worlds, simulating and empathizing with other minds, spatial navigation, and inventing novel combinations of images and objects (Mullaly & Maguire, 2014). As expected of a process with so many diverse roles, imagination varies on several dimensions.[1] Specifically, imaginings can be voluntary (e.g. creative generation) or involuntary (e.g. daydreaming). Some even apply the dual-process framework to imagination, dividing imaginative processes into (1) the unconscious, uncontrolled, spontaneous, non-volitional, and effortless and (2) the conscious, controlled, volitional, and effortful (Stuart, 2021). Further, imagination can be sensory, as in mental imagery, or cognitive and non-sensory, as in imagining yourself with alternative beliefs or traits (Dokic & Arcangeli, 2014). This sensory aspect can implicate any modality, from vision and touch to smell and sound. Finally, the complexity of the imagined object can vary dramatically. Imagination often involves representing a complete situation, “a configuration of objects, properties, and relations” rather than a single isolated object (Berto, 2018). It can even involve constructing an entire imaginary world. This paper will touch on imagination in all its aspects but will focus especially on the voluntary generation and manipulation of imagined possibilities.

2. The Computational Role of Imagination

At the computational level, we ask what imagination does, and why: the role imagination plays in the human information system and its adaptive function in our environment (Bechtel & Shagrir, 2013). As stated above, imagination’s primary role is world-simulation: it generates a model of the world or a part of the world and simulates it in internal experience. Imagination takes sensory data, memories, and our implicit and explicit models of the world as input, and it outputs an imagining – most often, a consciously experienced internal simulation.

Imagination has myriad adaptive functions. Critically and perhaps most obviously, imagining is crucial for decision-making. It allows us to simulate the consequences of actions, letting our imaginings ‘die for us’ so that we do not have to make choices by costly trial-and-error alone (Kielak, 2019). Further, the imagination can facilitate escaping local maxima in a decision environment: situations where no single choice could put you in a better position, but a series of choices (potentially through a long ‘desert’ of low reward) could improve your situation significantly. If we were unable to imagine the high-value oasis at the end of the low-reward desert, it would be far more difficult to escape these sub-optimal, least-bad situations. Gaesser (2013) also shows that imagination has a crucial role in enabling empathy and social cognition, supporting theory of mind, and encouraging prosocial behavior: more vivid and detailed imaginings of a person more effectively promote altruistic actions toward them. Earlier in human evolution, imagination was likely crucial to creating novel tools and traps, mentally planning attack strategies, and improving tribal cohesion with religion, myth, and art (Vyshedskiy, 101). Imagination is therefore vital to human behavior.
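
The point about local maxima can be made concrete with a toy sketch. Everything here is an illustrative assumption rather than a model drawn from the cited literature: the reward list, the function names, and the `horizon` parameter are hypothetical. One agent only ever takes a step that is immediately better; the other first simulates every position within its horizon internally and then commits to the best one it can foresee.

```python
# Hypothetical 1-D decision environment: the reward available at each position.
# Index 1 is a local peak; index 6 is the "oasis" beyond a low-reward "desert".
REWARDS = [1, 3, 2, 0, 0, 1, 9]


def greedy_walk(start, rewards):
    """Step to a neighbouring position only if it is immediately better."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(rewards)]
        best = max(neighbours, key=lambda p: rewards[p])
        if rewards[best] <= rewards[pos]:
            return pos  # no single step improves things: stuck
        pos = best


def imaginative_walk(start, rewards, horizon):
    """Simulate all positions within `horizon` steps, then commit to the best."""
    pos = start
    while True:
        lo = max(0, pos - horizon)
        hi = min(len(rewards) - 1, pos + horizon)
        target = max(range(lo, hi + 1), key=lambda p: rewards[p])
        if rewards[target] <= rewards[pos]:
            return pos  # nothing better is foreseeable from here
        pos = target  # cross whatever lies between, desert included


if __name__ == "__main__":
    print(greedy_walk(0, REWARDS))
    print(imaginative_walk(0, REWARDS, horizon=5))
```

In this toy landscape the greedy agent settles on the local peak, while the agent with a long enough horizon reaches the oasis. With `horizon=1` the second agent behaves exactly like the first, which is the sense in which a constrained imagination constrains choice.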

More passive modes of imagining, like daydreaming and dreaming, may allow us to draw connections and integrate disparate information, incubate potentially creative ideas, and move experiences to long-term memory (Malinowski & Horton, 2015). Further, they may even serve a ‘defensive activation’ role: dreaming keeps visual areas in the brain active and engaged to prevent them from being reduced or replaced (Eagleman & Vaughn, 2021). Among the many functions of imagination, this paper focuses on two that are especially vital to making choices and perceiving oneself as free: imagination’s role in creativity and modal cognition.

1.1 Imagination and creativity

While imagination is closely connected to creativity, it is a separate process. Creativity is the process of producing ideas, artifacts, or concepts that are both novel and valuable. Imagination is the ability to produce and/or simulate new objects, sensations, or ideas in the mind, and can be understood as both a sub-process within creativity and as a semi-separate capacity that supports creative generation. As the initial generative step in creativity, imagination produces the creative possibilities that are then considered, evaluated, and implemented by other systems. Imagination produces internal representations that will not necessarily be novel or useful (creative), but that can provide a fertile starting point for the creative process. Ellamil et al (2012) find that the two phases of generation and evaluation involved in creativity implicate distinct neural systems: creative generation recruits primarily medial temporal lobe regions like the hippocampus, while evaluation co-recruits the default mode and executive control networks. Further, these networks are competitive: “the more successfully [participants] were able to engage in creative generation while avoiding evaluative processes, the more they recruited MTL regions associated with creative generation” (p. 6). This neuroscientific evidence supports the hypothesized computational role of imagination in creativity: it generates loose, unrefined ideas to be evaluated, modified, and polished by other cognitive processes. Free choice relies on this relatively divergent, unstructured initial step to produce creative options, which can then be winnowed down and selected from in a convergent process.

1.2 Imagination and modal cognition

Imagination is closely tied to modal cognition – thinking about possibilities. Modal cognition relies on imagination to represent situations and generate potential alternatives. Just as in creativity, imagination is the initial step in modal cognition, as it generates the possibilities for consideration. The possibilities in the generated consideration set can then be partitioned into a more limited set of relevant possibilities, and ordered based on some criteria, like value or probability. Considering the ways a captain could have prevented a ship from sinking, for instance, requires mentally simulating this scenario and varying its features to produce alternative possibilities. If we were unable to generate and represent the alternative possibilities for a given situation, it would be difficult or impossible to see ourselves as free. Section 4 expands on the importance of imagination in modal cognition for free will perceptions.

2. The Algorithm of Imagination: Generative Models

The representational or algorithmic level asks how information is organized, encoded, and processed in the imagination, transforming representations into an imagined output. I argue that generative models serve as the fundamental algorithm of imagination. Imagination uses rules and implicit models of the world learned through perception to generate a limitless variety of possibilities. A generative model estimates the probability distribution of an observed variable given a target variable, in contrast to discriminative models that estimate a target variable’s probability distribution based on observed variables. In other words, a generative model simulates the interactions among unobserved variables that might generate the observed variables. Rather than just creating input-output mappings or categorizing a signal, generative algorithms attempt to figure out how the data was generated to classify it, asking which target category is most likely to have produced the observation. By understanding the methods of generation, these models can also create new data similar to the observed data.
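A minimal sketch may help make the contrast concrete. The class below is a hypothetical toy and not a model drawn from the literature cited here: it assumes a single numeric observation and a Gaussian distribution for each category, and the data are invented. It learns how each category tends to produce observations, classifies by asking which category most plausibly generated a new observation, and then reuses the same learned distribution to produce observations it has never seen.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


class TinyGenerativeModel:
    """Learns p(observation | category) and p(category) for one numeric observation."""

    def fit(self, observations, labels):
        self.params = {}
        for label in set(labels):
            xs = [x for x, y in zip(observations, labels) if y == label]
            mean = sum(xs) / len(xs)
            # Guard against a zero standard deviation when all values are identical.
            std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs)) or 1e-6
            prior = len(xs) / len(labels)
            self.params[label] = (mean, std, prior)
        return self

    def classify(self, x):
        """Ask which category is most likely to have produced the observation."""
        def joint(label):
            mean, std, prior = self.params[label]
            return gaussian_pdf(x, mean, std) * prior
        return max(self.params, key=joint)

    def imagine(self, label, rng):
        """Generate a new, unobserved instance of a category."""
        mean, std, _ = self.params[label]
        return rng.gauss(mean, std)


# Made-up shoulder heights in centimetres, for illustration only.
heights = [20, 25, 30, 22, 150, 160, 170, 155]
labels = ["cat"] * 4 + ["horse"] * 4

model = TinyGenerativeModel().fit(heights, labels)
rng = random.Random(0)
print(model.classify(28), model.classify(158))
print(round(model.imagine("horse", rng), 1))
```

A purely discriminative classifier would stop at `classify`. The `imagine` method is possible only because the model represents how observations are produced, which is the property the argument in this section relies on.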

Williams (2020) provides detailed arguments to show that imagination and perception are best described as generative models. Discriminative models are unable to explain top-down effects in perception (where higher-level representations impact processing of early information) or endogenously generated percepts like mental imagery and dreams (which have no clear inputs for classification). Generative models fit naturally within the widely accepted predictive processing framework in neuroscience, as they are prolific expectation-generators that allow continuous predictions of incoming sensory information based on estimates of their external causes. The brain likely uses temporal generative models, which use current observations and perceptual history to make inferences and find dependencies in input patterns that appear in timed order, to predict its future sensory stream.

The imagination co-opts this predictive capacity of perception and re-uses its core representational architecture, modifying our implicit, learned representations of the dynamics of the real world to generate imagined worlds with new or altered dynamics. Extensive evidence supports the theory that the brain uses a hierarchical generative model to “minimize prediction error in the cascade of cortical processing,” and higher-level areas can use these generative models to drive lower neural populations into predicted patterns and produce internal perception (Clark, 2013). Thus, the cortex likely implements a generative model to explain, predict, and learn about sensory data, and then cross-applies this model to synthesize rich visual representations without external input.

Treating imagination as a generative model is valuable for a few additional reasons. First, imagination is generally governed by principles of generation: a set of (implicit or explicit) rules that guide our imaginings (Walton, 1990, p. 53). For example, in Harry Potter, “Latin words and wands create magic” is a principle of generation that readers can consistently use to simulate the imagined world. The imagination generates a set of possibilities guided by context-relevant principles, like graphics rendering algorithms that unfold an artificial world procedurally using algorithmic rules. Treating imagination as a generative model also explains imaginative mirroring: our imagination defaults to follow the rules of the real world unless prompted otherwise by principles of generation (Leslie, 1994). If a cup ‘spills’ in an imaginary tea party, the participants will treat the spilled cup as empty, following the normal physics of reality. This occurs because perception involves generative models, using processes we derive from experience to simulate the physical world and predict its behavior. Imagination involves running a generative model ‘alongside’ or ‘on top’ of this internal simulation of reality. Some processes are modified in the imagining, but the ones that are not modified are ‘filled in’ by our default generative model of the real world.
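The ‘filling in’ described above can be sketched in a few lines. The rule table and function names here are hypothetical simplifications: a real generative model would encode dynamics rather than a lookup table. The sketch shows only the layering, in which principles of generation override a handful of rules and the default model of reality supplies everything else.

```python
# Hypothetical default world model: regularities learned through perception.
REAL_WORLD_RULES = {
    "tipped cup": "cup is empty",
    "dropped object": "falls down",
    "spoken Latin": "nothing happens",
}


def imagine_outcome(event, principles_of_generation):
    """Use the imagined world's rule for an event if it has one;
    otherwise fall back on the default model of reality."""
    rules = {**REAL_WORLD_RULES, **principles_of_generation}
    return rules[event]


tea_party = {}  # pretend play that changes no rules
harry_potter = {"spoken Latin": "magic happens"}  # one altered principle

print(imagine_outcome("tipped cup", tea_party))
print(imagine_outcome("spoken Latin", harry_potter))
print(imagine_outcome("dropped object", harry_potter))
```

The imagined tea party inherits ordinary physics because it overrides nothing, while the fictional world changes a single rule and leaves gravity untouched, which is the pattern that imaginative mirroring describes.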

The efficacy of generative models for explaining the imagination is demonstrated by computational models that simulate imagination. Generative models based on artificial neural networks (ANNs) can visualize objects that the network has never seen before, replicating the correctness, coverage, and compositionality of the human imagination (Lee et al, 2008). An ANN can learn the structure of an environment and then simulate or hallucinate it internally, but this process relies on creating an efficient, compressed, thorough, and interpretable model of the world (Ha & Schmidhuber, 2018). For instance, Testolin & Zorzi (2016) show that human perception is analogous to graphical models implemented with generative ANNs, which build high-level representations and extract statistical regularities from the environment in an unsupervised way and use feedback connections to carry top-down expectations. These generative models have psychologically and biologically plausible properties, like unsupervised learning and interactions between feedback and feed-forward activity.

Reichert et al (2013) demonstrate that generative models can explain human internal imagery, showing that the cortical dynamics of spontaneous hallucinations in Charles Bonnet syndrome (CBS) can be simulated and explained by an ANN-based generative model. In CBS, partial blindness deprives early processing stages of visual input, resulting in spontaneous activity in the cortex. The authors show that recurrent connections between layers in ANNs are similar to reciprocal synaptic connections between layers in the neural visual processing hierarchy and enable simulating the balance between bottom-up sensory information and top-down internal priors that occurs in the brain. When the trained ANN is given empty or corrupted input, this results in realistic artificial hallucinations that can be strikingly decoupled from the input image. These examples are fascinating demonstrations of the potential of using generative models to facilitate progress in understanding the mechanisms of hallucinations, mental imagery, and perception in the human brain.

In sum, the generative model framework offers a fruitful way to understand the imagination. It also suggests the algorithmic components that the imagination involves, which likely correspond to somewhat separate neural correlates. For instance, the imagination requires a sensory system to collect information about the world and support simulations of it. It also needs a memory system to consolidate these experiences into representations that can be accessed for future imaginings. Then, some sub-system must support compressing a huge number of observations of reality into a generative world model, so that imagination can use this model to create realistic and task-relevant simulations. Finally, there must be some internal workspace that allows the mind to produce, combine, and manipulate imagined objects.

3. The Neural Correlates and Mechanisms of Imagination

Any complete model of imagination must accurately and comprehensively describe how imaginings are produced by complex interactions of neuron assemblies, regions, and networks in the human brain. How do neural circuits create an experienceable representation of an object that is not currently present in the subject’s sensory environment?

3.1 Perception and imagination

Imagination has many parallels with perception. This is unsurprising given our theoretical framework that suggests perception and imagination both involve similar generative models. For instance, research showed that in people with visual disorders, imagination is disabled in the same way as perception – e.g. people with hemispatial neglect cannot imagine things on the neglected side, suggesting that imagery and perception use the same machinery (Koch, 2004, p. 99). Furthermore, more vivid imaginings dilate your pupil more, suggesting that the imagination activates very early perceptual processes (Laeng, 2014). The excitability of the visual cortex also predicts imagery strength (Keogh & Pearson, 2020). Additionally, binocular rivalry experiments conducted by Tartaglia et al (2009) demonstrate that, just as perceiving something (like an oriented line) can improve your visual sensitivity to that thing, imagining visual content improves your sensitivity to that content. This priming effect indicates that imagination involves processes similar to perception. Finally, the contents of imaginings can be mostly decoded with activity in the early visual cortices like V1 and V2 (Vetter et al, 2014), showing that the representations of imagined objects are partially realized in early sensory areas.

As described in Pearson (2017), top-down imagination functions like a weak version of perception with a “reverse visual hierarchy” (p. 2). Imagining begins with an initial conscious choice to create a mental image in the frontal lobe, producing a cascade of activity that runs ‘backwards’ in the brain: stored information and memories are retrieved in medial temporal areas, and finally sensory and spatial representations of the imagery are created in the parietal and occipital lobes.[2] Additionally, the hippocampus can recruit long-term memories to help give richness and spatial coherence to complex, large imaginings (Buckner, 2010). After all, imagination is closely linked to memory, and more vivid imagery is linked to better performance in visual working memory tasks (Keogh & Pearson, 2014). While perception involves feed-forward information propagating upward from early visual areas, imagination involves a feedback cascade that begins in frontal regions and then recruits memories and visual areas to produce imaginings.

Finally, frontal regions play an executive role in guiding the imagination, but do not produce the actual imagined content. As the patterns of activity associated with imagination move up from V1 to frontal areas, they become increasingly similar to the neural patterns of perception (Pearson 2017, p. 3). This is likely because the executive control mechanisms and high-level processes involved in triggering imagination are nearly indistinguishable from the ones involved in processing, modeling, and manipulating feed-forward visual information. Attention to perceptual realities and attention to possibilities therefore seem to implicate the same neural mechanisms in frontal-parietal areas.

3.2 Imagination and the Default Mode Network

There is a growing consensus that remembering the past, imagining the future, counterfactual thinking, and simulating possible experiences all involve similar neural mechanisms in the default mode network (DMN) (Hassabis & Maguire 2017; Mullaly & Maguire, 2014; Pearson, 2019; Addis et al, 2007; Spreng et al, 2009). The DMN is a collection of brain areas often activated during wakeful rest and internal mental activity, and includes the medial prefrontal cortex, the posterior cingulate cortex or precuneus, and the angular gyrus, among other regions (Raichle, 2015). Winlove et al (2018) reviewed 40 neuroimaging studies in a meta-analysis of the correlates of visual imagery and identified 11 consistently activated regions, finding that the superior parietal lobule was involved in top-down control of imagery, the inferior frontal sulcus in semantic processing and working memory, and the frontal eye fields and V1 in supporting internal visual depictions. Further, Whittingstall et al (2014) show that the posterior cingulate cortex (PCC) is a crucial hub for integrating occipital, parietal, and temporal areas together during visuospatial imagery.

Imagining future events and prospective thinking involved the same generation processes and areas in the right frontopolar cortex and left ventrolateral prefrontal cortex, showing that the episodic memory system is involved in imagining the future and vice versa (Addis & Schacter, 2007). More specifically, future-oriented and counterfactual thinking engages the posterior DMN (pDMN), centered around the posterior cingulate cortex (Xu et al, 2016). Researchers showed this by asking participants in an fMRI scan to make choices about their present situation, and then prospective choices about their future. Their findings demonstrated that people often engage vivid mental imagery in future-oriented thinking, and that this process activates the pDMN while reducing its connectivity with the anterior DMN. This provides a candidate neural process that underlies imaginative generation of possibilities. However, imagination requires not just the DMN, but organized interactions between the DMN, executive control network (ECN), and salience networks to create controlled, meaningful, and actionable imaginings (Gotlieb et al, 2018). The DMN may be essential to generating images, ideas, and possibilities, while other networks allow us to modify, select amongst, and move our attention between them.

3.3 Imagining as binding-by-synchrony

A key cognitive ability that underlies imagination is prefrontal synthesis (PFS), the ability to create novel mental images by combining experienced or remembered objects. The binding-by-synchrony hypothesis claims that this process is performed in the lateral prefrontal cortex (LPFC), which likely acts as an executive controller that synchronizes a network of neuronal ensembles (NEs) that represent familiar objects, synthesizing these objects into a new imaginary experience (Vyshedskiy, 2019). Familiar objects are encoded in the brain by neuronal ensembles, and the sensory component of objects is physically encoded in “the posterior cortical hot zone” (Koch & Tononi, 2016). Remembering or imagining objects requires synchronous resonant activity of the object-encoding neuronal ensembles, and when this synchrony occurs in the posterior cortical hot zone it causes the object to come to consciousness.

Imagining novel things, then, is the process of synchronizing independent object-NEs through conscious attention. Objects can then be imaginatively modified by desynchronizing parts of an object-NE from the whole (called prefrontal analysis). The LPFC acts as a puppeteer in this process, flexibly synchronizing object-NEs to manufacture an unlimited number of novel mental images (Vyshedskiy, 2019, p. 99). For example, an ensemble representing Bill Clinton and one representing a lion can be synthesized by synchronizing their firing activity in the same phase, creating a mental image of Clinton holding a lion (Vyshedskiy & Dunn, 2015). Any arbitrary type and number of ensembles can be synchronized in the mental workspace, limited by working memory, experience, and focus. Imagination can be either top-down and intentional, driven by the prefrontal synchronization of lower-level neuronal assemblies, or bottom-up and unintentional, when lower-level ensembles synchronize non-volitionally and without a puppeteer, spontaneously producing dreams, hallucinations, or sudden insights and images. Children acquire PFS around 3 to 4 years of age, along with other imaginative abilities like mental rotation, storytelling, and advanced pretend play (Vyshedskiy, 2019, p. 101). While further study is needed, it is plausible that development of PFS is associated with mature modal cognition, advanced creative abilities, and generating more sophisticated imaginings.

Creative thought relies on the ability to manipulate internal representations flexibly in a mental workspace. Schlegel et al (2013) corroborate Winlove et al’s finding that 11 regions are consistently activated in imagination, including the occipital cortex, PPC, precuneus, posterior inferior temporal cortex, DLPFC, and frontal eye fields. However, this research also showed that maintenance and manipulation of imagined objects involved separate sub-networks, where maintaining involved a dense network integrated by the MTL and manipulation involved a sparser network with a hub in the precuneus. This supports the hypothesis of separate neural mechanisms for imaginative synthesis (forming and maintaining a mental image or object) and analysis (applying operations, filters, or decompositions to the imagined objects). Imagination relies on dynamically synchronizing neural assemblies in the mental workspace.

Finally, imagination is an example of type 3 qualia: the temporary binding of simple sensory objects (type 1 and 2 qualia) through endogenous attention (Tse, 2017). In daydreaming, mental imagery, and imagination, we can simultaneously experience the contents of the iconic buffer (our current sensory state) in the attentional background, and the contents of the working memory buffer in the foreground (Tse, 2017, p. 17). In contrast, dreams are an example of experiencing type 3 qualia alone, without external inputs or basic sensory qualia. Binding-by-synchrony, and the idea that imaginings are type 3 qualia, also explains why imagination only seems to represent an object while your attention is currently focused on it. Unlike perception, in which sensory areas are activated by external inputs, top-down imagination requires the constant, effortful synchronization of neural ensembles to maintain mental objects. When your attention moves, the synchronization collapses, and the imagined object vanishes.

4. Imagination and Free Will Perception

Imagination is fundamental to seeing oneself as a free agent. Here, I do not take a position on the complex and contentious debates on free will, compatibilism, and determinism in philosophy. I do not argue that imagination is a literal precursor to free will in any deep metaphysical sense, but rather that it is indispensable to our perception of free will, bracketing away the question of whether this perception is an illusion or not. I support this position with several arguments. The most central argument claims that to represent or see oneself as choosing freely, one must be able to represent alternative possibilities for actions. Representing alternative possibilities requires imagination. Thus, imagination is required for free will perception. Additionally, this implies that the systematic constraints on what possibilities we imagine restrict the choices we can make and limit our sense of free will.

“Free will also includes the creative ability to imagine,” as we can choose to apply our attention to internally generated qualia (Tse, 2013, p. 238). This enables freedom in two ways. First, imagining and mentally ‘playing out’ possible scenarios to form a plan for action is precisely what empowers us to make decisions. By creating an internal virtual reality, imagination allows us to pre-select and pre-experience actions, an essential part of human decision-making. Second, imagination itself involves choices among internal representations, even if this does not manifest in external actions. The imagination enables freedom in the sense that it supports generating a large number of possibilities with a “high degree of disorder or chance amounting to a kind of ‘freedom’” (Krausz & Bardsley, 2009, p. 133). In this sense, the very manner of representation and the kinds of mental operations involved in imagination are fundamentally intertwined with freedom. While memory and imagination involve similar processes, prospection (simulating the future) is less constrained and less subject to ‘reality checks’ than retrospection (Kane et al, 2008, p. 132). Future-looking imagination is vital to volitional action.

4.1 Imagining alternative possibilities and its constraints

Research on modal cognition shows that imagining alternative possibilities is not free and boundless but has important constraints. By default, we only consider a systematically limited subset of the imaginable possibilities. Imagination produces a series of possibilities, and then during decision-making we sample from this distribution of imagined options in an adaptive way, constrained by relevant factors. The set of possibilities we consider is limited systematically by the sampling process (Morris, Phillips, and Cushman 2019). Under the theory of the psychological representation of modality developed by Phillips et al, the set of possibilities we consider is limited by the constraints of probability/normality, physics, and morality (Phillips & Knobe, 2018). For instance, both children and adults under severe time constraints tend to consider immoral options (e.g. stealing or lying) or unlikely and irregular options (e.g. painting polka dots on an airplane) as impossible (Phillips, Morris, & Cushman, 2017). Our perceptions of ourselves as freely acting are systematically limited by the possibilities we can imagine. Although we may be able to choose among possibilities, we do not have complete control over the pool of possibilities that are consciously available to us. Through imagination, we can modify and expand this pool of options, supporting a greater sense of agency and freedom.
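The generate-then-filter picture above can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the scenario (someone who needs to reach the airport), the probability values, the threshold, and the idea that deliberate reflection relaxes the moral and normality constraints but not the physical one. It is not the model proposed by Phillips and colleagues, only a toy that shares its three constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Possibility:
    description: str
    probability: float  # how normal or likely the action seems, from 0 to 1
    physical: bool      # consistent with physical law?
    moral: bool         # consistent with moral norms?


def consideration_set(possibilities, min_probability=0.1, deliberate=False):
    """Return the possibilities that come to mind, most probable first.

    By default, options that are physically impossible, immoral, or highly
    irregular are filtered out. With deliberate=True (an assumed stand-in for
    unhurried reflection) only the physical constraint is kept.
    """
    if deliberate:
        kept = [p for p in possibilities if p.physical]
    else:
        kept = [
            p for p in possibilities
            if p.physical and p.moral and p.probability >= min_probability
        ]
    return sorted(kept, key=lambda p: p.probability, reverse=True)


# Hypothetical options with invented values.
options = [
    Possibility("take a taxi", 0.60, physical=True, moral=True),
    Possibility("ask a friend for a ride", 0.30, physical=True, moral=True),
    Possibility("steal a car", 0.05, physical=True, moral=False),
    Possibility("paint polka dots on the airplane", 0.01, physical=True, moral=True),
    Possibility("teleport", 0.00, physical=False, moral=True),
]

default_set = consideration_set(options)
reflective_set = consideration_set(options, deliberate=True)
print([p.description for p in default_set])
print([p.description for p in reflective_set])
```

The agent can only choose among the members of the returned list, so widening the filter widens the space of choices without changing anything about the selection step itself.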

We tend to judge agents as free when we can represent alternative possibilities for their action. In a sense, a failure to imagine can preempt free will perceptions, as one cannot choose to act upon a possibility that one does not represent, and one cannot see an option as ‘freely’ chosen if no other possibilities are represented. Indeed, people use judgements of possibility to inform judgements about whether an agent is free (Phillips & Knobe 2018). Generally, if we are able to imagine situations where the action could be different, we judge the agent as free. When participants generated more possibilities, imagining more alternative decisions a ship captain could have made, they were more likely to make the judgement that he was free and not forced (Phillips, Luguri, and Knobe, 2015). Unpublished data from the Dartmouth PhilLab supports this finding, suggesting that as people imaginatively generate more possibilities, these options become less constrained by the norms of probability, normality, morality, and rationality.[3] This may imply that possibilities become more divergent, unconventional, novel, or surprising as the quantity of ideas generated increases. Therefore, imagination is essential to free will judgements, and imagination enhances the sense of freedom by expanding the set of accessible options.

The ability to imaginatively project alternative possibilities may therefore underlie individual differences in free will perceptions: if a person imagines many more available options, they see themselves as freer. Simply imagining more possibilities may engender a feeling of more freedom. Developmental research provides strong support for this claim.

Children tend to resist, or fail to generate, impossible and improbable imaginings. Kushnir (2018) shows that free will beliefs originate in the ability to understand intentional action, inferring when agents are free to do otherwise and when they are constrained. Young children are often unable to imagine alternatives to improbable, irregular, or immoral events, and so tend to see them as impossible. Children’s imaginations are thus surprisingly reality-constrained: children (age 2-8) protest against pretense that contradicts their knowledge of regularity, expecting imaginary things to have ordinary properties (Friedman et al, 2017; Vandervoort and Friedman, 2017). Even when pretending, children expect lions to roar and pigs to oink, and they resist imagining otherwise. Furthermore, 82% of the time, children extend fantasy stories with realistic events rather than fantastic events, while adults extend fantasy stories with fantastic events (Weisberg et al, 2013). Young children imagine along ordinary lines even when primed with fantastical contexts, filling in typical and probable causes for fantastical imaginary events (Lane et al, 2016). Children show a strong typicality bias in completing fictional stories, favoring additions to the story that match their regular experiences in reality (Thorburn et al, 2020). This evidence shows that children’s imaginations are limited by typicality, morality, and their understanding of the physical world. It suggests that they use simpler constraints, quick heuristics, and a more basic model of the world to effortlessly generate possibilities.

Most conclusively, an experiment by Flanagan and Kushnir (2019) found that performance on a task that involved generating ideas within an imagined fantasy world was the best predictor of children’s free will judgements: the more fluent the children were in this imagination task, the more likely they were to judge themselves as free. As the authors speculate, “one potential mechanism is a direct pathway from idea generation to judgments of choice and possibility” (p. 5). In my view, the pathway is not completely direct, as existing research indicates that after possibility-generation (imagination) we also evaluate the relevance of possibilities and rank them. However, the initial generation is crucial, and the nature and quantity of generated possibilities has demonstrable impacts on how people think about possibilities, freedom, and choice. Constraints on the imaginative process lead to downstream effects on our choices and our perceptions of freedom.

As children develop, the constraints on their imagination relax, leading to less restricted generation of possibilities. They become able to imagine more alternatives, and when they do endorse a choice as free rather than forced they often cite imagined alternatives to the scenario as an explanation (Kushnir, 2018). Older children are more likely to imagine improbable and physically impossible phenomena (Lane et al, 2016, p. 6). Explicitly prompting children to generate more possibilities leads them to imagine more like older children, producing possibilities less constrained by probability and regularity (Goulding & Friedman, 2020). Cultural contexts mediate this developmental process. For example, American children are more likely than Nepalese and Singaporean children to judge that they are free to act against cultural and moral norms (Chernyak and Kushnir, 2019). This is likely because children in cultures with stronger or more restrictive norms find it harder to generate evaluatively wrong possibilities or see these possibilities as relevant. As free will judgements depend on representing alternative possibilities, these children see themselves as less free to pursue possibilities that violate evaluative norms. When imaginative flexibility increases with age and experience, we can represent a wider range of possibilities for action and cultivate a broader conception of our own free will.

Viewing imagination as a generative model furthers fruitful interpretations of this research. When imagining, young children apply a generative model with the same rules of generation used in perception to produce expectations about reality. This early imagination may use simple constraints and empirical heuristics to allow effortless and rapid generation of possibilities. For instance, if the child regularly encounters an event, they are more likely to imagine this event (Goulding & Friedman, 2020). In later development and adulthood, the imagination generates possibilities in a more deliberative and analytical way. This suggests a dual process model of imagination (Stuart, 2019). Children may use a more uncontrolled, effortless, and unconscious imagination based on simple heuristics and experience-derived rules of generation. In contrast, adults use a more controlled, effortful and conscious imagination that generates possibilities based on relatively sophisticated and principled rules. 

In sum, imagination by default resists generating possibilities that violate physical laws, irregular or unlikely possibilities, and immoral or evaluatively bad possibilities. Experimental results reveal that the imaginations of young children are limited by precisely these constraints. Adults are able to deliberately generate more possibilities, and less constrained ones. With very limited time or significant cognitive pressure, adult imaginations may resemble the imaginations of young children. However, just as adults can treat immoral possibilities as irrelevant, imaginative resistance shows that the adult imagination is inhibited against immoral possibilities. Finally, individual differences in openness to experience, creativity, and imaginative ability may predict some of the variation in judgements of possibility and freedom. For instance, people who are naturally more imaginative (and thus generate more possibilities) will be more likely to judge agents as free rather than forced.

4.2 Imagination and existential freedom

To imagine enables free consciousness, because it allows you to get beyond the real, developing a broader perspective on the world by imagining beyond it and escaping from it to some degree (Turner, 1968). The imagination is a radical break from the surrounding world, a negation of present circumstances, making-present something that is not there by making-absent what is ‘really’ there (Sartre, 2010). As Husserl writes, free phantasy (imagination) allows one to see more possibilities and attain a wider-ranging knowledge of experience (Husserl 2012, §70). Our sensory lives give us access only to a small selection of possible experiences, and thus we need imagination to explore the immensity of conceivable configurations of experiences, choices, and perceptions. The imagination can supplement our experience, and in turn we can use experiences to pollinate the imagination and enable coherent world-simulations.

This sheds light on an important debate in existentialism. Sartre claimed that human consciousness is able to transcend any given situation by pursuing the possibilities we imagine. He thought that we are radically, infinitely free to choose our possibilities (2015, p. 112). We can define our identity with negation, through the set of possibilities we reject. In contrast, Heidegger had a much more limited view of human freedom. He thought that our world, and our set of available possibilities, is defined by social structures that are out of our control. The They (Das Man), or basically our social context, limits the set of possibilities we are capable of considering (Heidegger, 2010, §27). Certain possibilities will never be available to us, not just because we cannot factually achieve them, but because we cannot even conceive them. For him, freedom is the process of personally appropriating one of these socially given options, and authenticity consists in becoming one’s possibilities. Both perspectives are true to some extent: we have immense freedom to generate and choose amongst our imagined possibilities, but these possibilities are also limited by our social context and our cognitive abilities.

Furthermore, imagination is essential to the process of flexible identity-construction: developing a sense of oneself, seeing aspects of one’s identity, and moving towards a hopeful future self (Gotlieb et al, 2018). A sometimes forgotten aspect of free will is that to perceive yourself as acting freely, you must already perceive yourself as an agent. By allowing us to picture ourselves in the future, counterfactually vary features of our identity, and imagine a constant thread of who we are through our lifetimes, imagination supports the cross-temporal identity that is essential to seeing oneself as a free agent. Finally, imagination may enable a kind of existential creativity: an individual’s attitude of exploring life’s possibilities and experimenting with life-plans and versions of herself (Loi et al, 2020). Imagination “permits one to take the paths of many varied and opposed ways of thinking,” creating “the excess that gives to the free spirit the dangerous privilege of living for experiments and of being allowed to offer itself to adventure” (Nietzsche, 1996, #4). Imagining alternative future tracks and ways of life, and narratively constructing an identity that persists through these disparate pasts and possibilities, is crucial to a person’s ability to forge a meaningful life.

4.3 Disorders of imagination

It may even be the case that certain disorders increase free will perceptions by amplifying imaginative abilities, facilitating unexpected connections and more unpredictable mental pathways. While at its extreme this can lead to psychosis, it also amplifies the exploratory processes essential to generating alternative possibilities for choice. Both bipolar disorder and ADHD are associated with significantly higher openness to experience (Van Dijk et al, 2017; Quilty, 2009). Openness is linked to trait creativity, is even used as a measure of creativity, and is associated with higher volume in brain regions that inhibit control and reduce constraint (Li, 2015). The highly-open personalities of patients with disorders like ADHD and bipolar disorder may facilitate highly associative, fluent, and originative brainstorming of possibilities.

Looser cognitive limitations, weakened top-down control, and more unconstrained thinking may also potentiate imagination and free will perceptions in certain disorders. Creative tasks benefit from a state of hypofrontality, in which reduced PFC activation enables more spontaneous, bottom-up thought patterns (Ramey & Chrysikou, 2014). Bipolar patients exhibit disruptions in the frontoparietal control network which reduce top-down constraints, and mania involves hypofrontality, a “significant attenuation of task-related activation of right lateral orbitofrontal function” that results in disinhibition and distractibility (Altshuler et al, 2005). Further, individuals with ADHD have impaired executive inhibition, which reduces the person’s ability to suppress creative but unconventional ideas – and ADHD patients exhibit improved performance on tasks like the Unusual Uses Test (White & Shah, 2006). People with mental disorders associated with impulsivity like bipolar disorder, Tourette’s, and ADHD often have more fluent and vivid imaginations, and are biased toward generation over evaluation (Ellamil et al, 2012). Therefore, the disinhibited imaginations of people with certain disorders may allow them to brainstorm more actionable possibilities. However, this may also explain the pathological aspect of these disorders: it may be harder to select appropriate actions given a far larger pool of possibilities, including irrelevant, unfit, or harmful ones. Limiting the number of projected possibilities is therefore likely adaptive – imaginative constraints are the bonds that set us free.

Finally, aphantasia is a well-documented disorder that involves the absence of a ‘mind’s eye,’ where otherwise normal, healthy individuals report a complete lack of visual experience when they attempt to imagine something (Keogh & Pearson, 2018). There are also degrees of aphantasia – it can involve an impaired imagination with reduced strength, control, or vividness, rather than a complete lack of imaginings. At the opposite end of the imaginative spectrum is hyperphantasia, an exceptional strength, control, and vividness of imagination. One of the testable predictions of my theory is that if imagination does indeed play a crucial role in the perception of freedom, then there will be significant differences in the free will perceptions and judgements of aphantasic and hyperphantasic individuals. Specifically, higher scores on tests of imaginative ability will correlate with greater perceptions of free will.

4.4 Imagining to increase agency

Imagination can also be thought of as a trainable ability, which can be practiced to improve self-efficacy, self-control, and agency. For example, individuals who are more skilled at counterfactual thinking are more easily able to self-restrain and delay gratification in the service of later reward (Mischel et al, 2011). Imagination breaks our constant present-orientation and task-focus, moving us into a more flexible, open, and future-oriented mode of internal reflection that is crucial for long-term decision-making (Gotlieb et al, 2018). Being able to imagine the future allows us to resist current temptations and focus on long-term goals. Imagination supports the perception of free will, and in turn, an increased belief in free will changes the way persons imagine their futures – promoting a focus on personal agency and interpersonal connection in prospective imaginings (Nagelmann, 2019). Imaginative skill thus promotes a sense of freedom.

Finally, social norms are one of the most powerful sources of constraints on the imagination. Just as a child who is only given certain props and stories will naturally shape their pretend play around these objects and narratives, adults mold their imaginings by their socio-cultural environment. The existing structures of the world can congeal in our minds, ossifying until they seem almost unshakeable, and are not even recognized as constraints. So perhaps, for example, “it is easier to imagine the end of the world than it is to imagine the end of capitalism” (Fisher, 2009, p. 8). Overcoming the constraints of collective imagination, the rigid social orthodoxies that tell us what is and is not possible, can have transformative force – liberating entire populations to act more freely because they realize their actions are not as constrained as they thought (Dey & Mason, 2018). Disruptive truth-telling, courageous speech, and utopian imagination in the style of MLK, Gandhi, or Mandela can therefore literally enhance our sense of free will by increasing the number of possibilities for action and boosting their cognitive availability.

Conclusion

The ability to imagine is a core component of consciousness. “Imagination is a specifically human form of conscious activity” (Vygotsky, 1967) which distinguishes us from other organisms, supporting our ability to generate complex mental representations and reconfigure them into innumerable combinations. Imagination dominates consciousness both in duration and degree. The average person spends between 30% and 50% of their waking time daydreaming (McMillan et al, 2013), and even more conscious time is occupied by prefrontal synthesis, dreaming, operating in the mental workspace, or simulating the future. Further, imagination represents a peak of consciousness, where endogenous attention is actively and volitionally applied to synchronize lower-level neural ensembles into complex internal simulations. Research into the neural correlates of consciousness must therefore treat the imagination as a central question.

In this paper, I have reviewed the neural basis of the imagination, from its computational role and diverse functions in supporting creativity, decision-making, and modal cognition, to its algorithmic structure in generative models, to its implementation in the brain through reversed perceptual processes, the default mode network, and binding-by-synchrony. Further, I have argued that imagination is central to the perception of free will. Without imagination, free will is unimaginable. Without it, we would not be able to represent alternative possibilities, simulate the consequences of our actions, or construct an identity through time and envisage different ways of living and being. Imagination also explains some of the systematic ‘soft’ limitations on our free will, which prevent us from acting on options that we cannot or do not imagine. By building our imaginative capabilities we can transform our personal and societal futures.

Works Cited

Abraham, A. (Ed.). (2020). The Cambridge handbook of the imagination. Cambridge University Press.

Addis, D. R., Wong, A. T., & Schacter, D. L. (2007). Remembering the past and imagining the future: common and distinct neural substrates during event construction and elaboration. Neuropsychologia, 45(7), 1363-1377.

Altshuler, L. L., Bookheimer, S. Y., Townsend, J., Proenza, M. A., Eisenberger, N., Sabb, F., … & Cohen, M. S. (2005). Blunted activation in orbitofrontal cortex during mania: a functional magnetic resonance imaging study. Biological psychiatry, 58(10), 763-769.

Bechtel, W., & Shagrir, O. (2015). The non‐redundant contributions of Marr’s three levels of analysis for explaining information‐processing mechanisms. Topics in Cognitive Science, 7(2), 312-322.

Berto, F. (2021). Taming the runabout imagination ticket. Synthese, 198(8), 2029-2043.

Buckner, R. L. (2010). The role of the hippocampus in prediction and imagination. Annual review of psychology, 61, 27-48.

Chernyak, N., Kang, C., & Kushnir, T. (2019). The cultural roots of free will beliefs: How Singaporean and US Children judge and explain possibilities for action in interpersonal contexts. Developmental psychology, 55(4), 866.

Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and brain sciences, 36(3), 181-204.

Cook, C., & Sobel, D. M. (2011). Children’s beliefs about the fantasy/reality status of hypothesized machines. Developmental Science, 14(1), 1-8.

Dokic, J., & Arcangeli, M. (2014). The heterogeneity of experiential imagination. Open MIND. Frankfurt am Main: MIND Group.

Eagleman, D. M., & Vaughn, D. A. (2021). The defensive activation theory: REM sleep as a mechanism to prevent takeover of the visual cortex. Frontiers in neuroscience, 15.

Ellamil, M., Dobson, C., Beeman, M., & Christoff, K. (2012). Evaluative and generative modes of thought during the creative process. Neuroimage, 59(2), 1783-1794.

Fisher, M. (2009). Capitalist realism: Is there no alternative?. John Hunt Publishing.

Flanagan, T., & Kushnir, T. (2019). Individual differences in fluency with idea generation predict children’s beliefs in their own free will. In CogSci, pp. 1738-1744.

Gaesser, B. (2013). Constructing memory, imagination, and empathy: a cognitive neuroscience perspective. Frontiers in psychology, 3, 576.

Gotlieb, R. J., Hyde, E., Immordino-Yang, M. H., & Kaufman, S. B. (2018). Imagination is the seed of creativity. The Cambridge Handbook of Creativity. New York, NY: Cambridge University Press.

Goulding, B. W., & Friedman, O. (2020). Children’s beliefs about possibility differ across dreams, stories, and reality. Child development, 91(6), 1843-1853.

Ha, D., & Schmidhuber, J. (2018). Recurrent world models facilitate policy evolution. arXiv preprint arXiv:1809.01999.

Hassabis, D., Kumaran, D., & Maguire, E. A. (2007). Using imagination to understand the neural basis of episodic memory. Journal of neuroscience, 27(52), 14365-14374.

Heidegger, M. (2010). Being and time. Suny Press.

Hunt, M., & Fenton, M. (2007). Imagery rescripting versus in vivo exposure in the treatment of snake fear. Journal of Behavior Therapy and Experimental Psychiatry, 38(4), 329-344.

Husserl, E. (2012). Ideas: General introduction to pure phenomenology. Routledge.

Kane, J., McGraw, A. P., & Van Boven, L. (2008). Temporally asymmetric constraints on mental simulation: Retrospection is more constrained than prospection. The handbook of imagination and mental simulation, 131-149.

Keogh, R., & Pearson, J. (2014). The sensory strength of voluntary visual imagery predicts visual working memory capacity. Journal of vision, 14(12), 7-7.

Keogh, R., & Pearson, J. (2018). The blind mind: No sensory visual imagery in aphantasia. Cortex, 105, 53-60.

Keogh, R., Bergmann, J., & Pearson, J. (2020). Cortical excitability controls the strength of mental imagery. eLife, 9, e50232.

Kielak, K. (2019). Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning. arXiv preprint arXiv:1904.13255.

Koch, C., Massimini, M., Boly, M., & Tononi, G. (2016). Neural correlates of consciousness: progress and problems. Nature Reviews Neuroscience, 17(5), 307-321.

Koch, C. (2004). The quest for consciousness: a neurobiological approach. Roberts & Co.

Kushnir, T. (2018). The developmental and cultural psychology of free will. Philosophy Compass, 13(11), e12529.

Laeng, B., & Sulutvedt, U. (2014). The eye pupil adjusts to imaginary light. Psychological science, 25(1), 188-197.

Lane, J. D., Ronfard, S., Francioli, S. P., & Harris, P. L. (2016). Children’s imagination and belief: Prone to flights of fancy or grounded in reality?. Cognition, 152, 127-140.

Lee, H., Ekanadham, C., & Ng, A. Y. (2008). Sparse deep belief net model for visual area V2. Advances in Neural Information Processing Systems, 20.

Leslie, A. M. (1994). Pretending and believing: Issues in the theory of ToMM. Cognition, 50(1-3), 211-238.

Liao, S. Y., Strohminger, N., & Sripada, C. S. (2014). Empirically investigating imaginative resistance. British Journal of Aesthetics, 54(3), 339-355.

Liao, S. Y., & Gendler, T. (2011). Imagination. The Stanford Encyclopedia of Philosophy.

Loi, M., Viganó, E., & van der Plas, L. (2020). The societal and ethical relevance of computational creativity. arXiv preprint arXiv:2007.11973.

Malinowski, J. E., & Horton, C. L. (2015). Metaphor and hyperassociativity: the imagination mechanisms behind emotion assimilation in sleep and dreaming. Frontiers in Psychology, 6, 1132.

Marques, H. G., & Holland, O. (2009). Architectures for functional imagination. Neurocomputing, 72(4-6), 743-759.

Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W.H. Freeman.

McMillan, R., Kaufman, S. B., & Singer, J. L. (2013). Ode to positive constructive daydreaming. Frontiers in psychology, 4, 626.

Mischel, W., Ayduk, O., Berman, M. G., Casey, B. J., Gotlib, I. H., Jonides, J., et al. (2011). Willpower over the life span: decomposing self-regulation. Soc. Cogn. Affect. Neurosci. 6, 252–256.

Mullally, S. L., & Maguire, E. A. (2014). Memory, imagination, and predicting the future: a common brain mechanism?. The Neuroscientist, 20(3), 220-234.

Nagelmann, S. (2019). How do attitudes on free will relate to students’ imagination of the future?. Bachelor’s thesis, University of Twente.

Nietzsche, F. (1996). Human, all too human: A book for free spirits. Cambridge University Press.

Pearson, J. (2019). The human imagination: the cognitive neuroscience of visual mental imagery. Nature Reviews Neuroscience, 20(10), 624-634.

Phillips, J., & Knobe, J. (2018). The psychological representation of modality. Mind & Language, 33(1), 65-94.

Phillips, J., Morris, A., & Cushman, F. (2019). How we know what not to think. Trends in cognitive sciences, 23(12), 1026-1040.

Phillips, Luguri, and Knobe (2015). “Unifying morality’s influence on non-moral judgments: The relevance of alternative possibilities.” Cognition 145, 30-42.

Quilty, L. C., Sellbom, M., Tackett, J. L., & Bagby, R. M. (2009). Personality trait predictors of bipolar disorder symptoms. Psychiatry Research, 169(2), 159-163.

Raichle, M. E. (2015). The brain’s default mode network. Annual review of neuroscience, 38, 433-447.

Ramey, C. H., & Chrysikou, E. G. (2014). “Not in their right mind”: the relation of psychopathology to the quantity and quality of creative thought. Frontiers in psychology, 5, 835.

Reichert, D. P., Series, P., & Storkey, A. J. (2013). Charles Bonnet syndrome: evidence for a generative model in the cortex?. PLoS computational biology, 9(7), e1003134.

Sack, A. T., van de Ven, V. G., Etschenberg, S., Schatz, D., & Linden, D. E. J. (2005). Enhanced vividness of mental imagery as a trait marker of schizophrenia?. Schizophr. Bull., 31, 97–104.

Sartre, J. P. (2010). The imaginary: A phenomenological psychology of the imagination. Routledge.

Sartre, J. P. (2015). Being and nothingness. Central Works of Philosophy: The twentieth century: Moore to Popper, 4, 155.

Schlegel, A., Kohler, P. J., Fogelson, S. V., Alexander, P., Konuthula, D., & Tse, P. U. (2013). Network structure and dynamics of the mental workspace. Proceedings of the National Academy of Sciences, 110(40), 16277-16282.

Shine, J. M. et al. (2014). Imagine that: elevated sensory strength of mental imagery in individuals with Parkinson’s disease and visual hallucinations. Proc. R. Soc. B, 282, 20142047.

Shtulman, A., & Phillips, J. (2018). Differentiating “could” from “should”: Developmental changes in modal cognition. Journal of Experimental Child Psychology, 165, 161-182.

Spreng, R. N., Mar, R. A., & Kim, A. S. (2009). The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: a quantitative meta-analysis. Journal of cognitive neuroscience, 21(3), 489-510.

Stuart, M. T. (2021). Towards a dual process epistemology of imagination. Synthese, 198(2), 1329-1350.

Tartaglia, E. M., Bamert, L., Mast, F. W., & Herzog, M. H. (2009). Human perceptual learning by mental imagery. Current Biology, 19(24), 2081-2085.

Testolin, A., & Zorzi, M. (2016). Probabilistic models and generative neural networks: Towards an unified framework for modeling normal and impaired neurocognitive functions. Frontiers in Computational Neuroscience, 10, 73.

Thorburn, R., Bowman-Smith, C. K., & Friedman, O. (2020). Likely stories: Young children favor typical over atypical story events. Cognitive Development, 56, 100950.

Tse, P. (2013). The neural basis of free will: Criterial causation. MIT Press.

Turner, R. (1968). Sartre and Ryle on the Imagination. South African Journal of Philosophy, 20-28.

Van de Vondervoort, J. W., & Friedman, O. (2017). Young children protest and correct pretense that contradicts their general knowledge. Cognitive Development, 43, 182-189.

Van Dijk, F. E., Mostert, J., Glennon, J., Onnink, M., Dammers, J., Vasquez, A. A., … & Buitelaar, J. K. (2017). Five factor model personality traits relate to adult attention-deficit/hyperactivity disorder but not to their distinct neurocognitive profiles. Psychiatry research, 258, 255-261.

Vetter, P., Smith, F. W., & Muckli, L. (2014). Decoding sound and imagery content in early visual cortex. Current Biology, 24(11), 1256-1262.

Vygotsky, L. S. (2016). Play and its role in the mental development of the child. International Research in Early Childhood Education, 7(2), 3-25.

Vyshedskiy, A., & Dunn, R. (2015). Mental synthesis involves the synchronization of independent neuronal ensembles. Research Ideas and Outcomes, 1, e7642.

Vyshedskiy, A. (2019). Neuroscience of imagination and implications for human evolution. Preprint DOI: 10.31234/osf.io/skxwc.

Walton, K. L. (1990). Mimesis as make-believe: On the foundations of the representational arts. Harvard University Press.

Weisberg, D. S., Sobel, D. M., Goodstein, J., & Bloom, P. (2013). Young children are reality-prone when thinking about stories. Journal of Cognition and Culture, 13(3-4), 383-407.

White, H. A., & Shah, P. (2006). Uninhibited imaginations: creativity in adults with attention-deficit/hyperactivity disorder. Personality and individual differences, 40(6), 1121-1131.

Williams, D. (2021). Imaginative constraints and generative models. Australasian Journal of Philosophy, 99(1), 68-82.

Williamson, T. (2016). Knowing by imagining. Knowledge through imagination, 113-23.

Winlove, C. I., Milton, F., Ranson, J., Fulford, J., MacKisack, M., Macpherson, F., & Zeman, A. (2018). The neural correlates of visual imagery: A co-ordinate-based meta-analysis. Cortex, 105, 4-25.

Xu, X., Yuan, H., & Lei, X. (2016). Activation and connectivity within the default mode network contribute independently to future-oriented thought. Scientific reports, 6, 21001.

Figures and images

Figure 1 – the hierarchy of types of imagination, split by top-down and bottom-up imagination.

Figure 2 – an illustration of the ‘reverse perception’ process of imagination.

Figure 3 – An illustration of the binding by synchrony hypothesis.

Figure 4 – Chart showing that as more possibilities are generated, the possibilities increasingly deviate from the constraints of morality, normality, probability, and rationality. For instance, the 1st item generated is given a probability rating of about 6.3, while the 8th item generated is given a probability rating of about 5. This shows the importance of the quantity of ideas generated for escaping constraints and conceptual limitations during the brainstorming process. Based on unpublished data from the Dartmouth PhilLab, project by Jonathan Phillips, Eliza Jane, Margaret Garrard, and Maeen Arslan.

  1. See Figure 1.

  2. See Figure 2.

  3. See Figure 3, from unpublished data in the Dartmouth PhilLab.

Categories
Essays Philosophy Religion & Spirituality

The Critique of Spiritual Reason

Introduction

Growing up Mormon, I often heard people talk about spiritual experiences. There is a near-endless variety of these experiences, from the classic example of Joseph Smith’s First Vision to strange dreams of angels, to exorbitant narratives of Mormon garments deflecting bullets in a warzone, to small simple feelings in church, to tingly sensations, to a more abstract and conceptual sense of confirmation. For Mormons (LDS people), these experiences are created by the Holy Ghost, and are evidence of God’s presence on Earth. I found these stories fascinating to listen to (although there were many that seemed trite and cliched). Even after leaving the church, I’ve been intrigued by fringe experiences in the human condition: transformative experiences, moments of inspiration, paradigm-breaking realizations, imaginings, visions, dreams, and beyond. Like William James, I’m amazed at the varieties of religious and spiritual experience, and I want to participate in the monumental project of exploring, documenting, and explaining them. What are they? How do they happen? What do they mean, and how can we interpret them? What can they tell us?

I have had many powerful experiences that challenged my understanding of reality, inspired me, and left me reeling to understand. In my own internal language, they are window-shattering moments. For me, these experiences create aporia: an impasse, a quandary one cannot resolve, a state of puzzlement, a doubting and bewilderment, a being-at-a-loss, a dazzling of the mind by the intricacy of existence. (I discuss aporia and similar topics in depth in Why Literature Matters: The Aporetic Approach). At first, because I had spent hundreds of hours in Mormon Sunday schools and seminaries, and I was indoctrinated into this religious tradition, I couldn’t help but interpret these experiences in the only way I knew how – in a way that reinforced the LDS belief system.

However, I started to realize there are serious problems with the way LDS people, and perhaps religious people in general, understand, talk about, and make inferences from these spiritual experiences. As an ever-skeptical and philosophical kid, I couldn’t take the experiences for granted and just naively subsume them into the LDS worldview. Further, I had a few experiences that broke the mold that Mormonism sets. I cannot hope to describe them in detail, but their context explains a lot. Many of them occurred in India. One was at a Hindu ceremony on the bank of the river Ganges, one was reading the Bhagavad Gita & Fight Club during a 17-hour train ride from New Delhi to Varanasi, and one was practicing the salah (Muslim form of prayer) and reading the Qur’an. Later on, I had several episodes of bipolar mania that felt like a continuous chain of all-consuming, overwhelming, and beautiful spiritual experiences lasting for days or weeks at a time. These experiences were much more intense and undeniable than the relatively mild experiences I felt in a Mormon context.

This gave me many questions. If, as I was told, spiritual emotions while reading the Book of Mormon meant that the LDS Church was true, then does that mean that Islam or Hinduism are true because of these experiences? How can this be the case, when these religions have clearly contradictory beliefs, prescriptions, and interpretations of the world? How am I supposed to distinguish between “true” or “valid” spiritual experiences and episodes of mental illness, when they feel extremely similar, and the manias are often even more acute, prolonged, and even more structured and sensible? Since these experiences don’t have a clear, obvious, reflexive, or undeniable interpretation, and don’t merely ‘explain themselves’ or stand on their own, they must be interpreted somehow. What inferences should we make from them? What beliefs should we hold or what actions should we take based on them? What do they justify and support? What information do they give us or fail to give us?

I began to ask these questions. But my teachers, mentors, and even trained LDS religious scholars did not truly understand or engage with my objections. Instead of seriously and closely listening, reflecting on my questions, and giving me genuine and thoughtful responses, they often denied my experience or told me I should not even ask the questions. They often told me I was thinking too much, should “just have faith,” or should avoid reading any non-Mormon scholarly literature or thoughts. This just made me more skeptical and interested in digging into the questions. The more I thought, the more it seemed clear that the Church was wrong about spiritual experience, and that people were making far too many inferences from these experiences and resting far too much on them. Their faith was on a fundamentally shaky foundation, and instead of investigating the foundations they simply ignored them. Take out the crucial load-bearing keystone of their interpretation of spiritual experiences, and the entire belief system collapses. Ultimately, this was the primary reason I decided to leave Mormonism at 15, although there are many other reasons to reject the religion’s claims. Here, almost a decade later, I will try to dig into these questions and explain the problems.

Ostler on Faith

One of the few people to engage with the questions of spiritual experiences in detail and in good faith is Blake Ostler, a prominent LDS theologian. In Faith, Reason, & Spiritual Experience, he gives an argument for why spiritual experiences provide justification for faith. Ostler is not at his best as a philosopher here though. He fails to state his premises and assumptions in detail, show most of his logical work, or address the obvious objections to his arguments. I wish that LDS people and apologists understood their doubters more and engaged with them more directly and thoroughly. Unfortunately, dialogue on Mormonism is often more like ships passing in the night.

Still, it is refreshing to see Mormon theology addressed in a more rigorous philosophical way, including how these ideas interface with common problems in epistemology. Ostler begins with the foundational framing that “no argument can prove spiritual experiences, because the direct encounter with the divine will always be more basic and grounded —and frankly more compelling—than any other evidence or argument.” This is difficult to respond to. How can you know if you are having a genuine ‘encounter with the divine’? How can you know what to take from this encounter? Many people claim (and I accept they have truly had) these kinds of experiences, and they take radically different conclusions from them. Further, why do these experiences take epistemic priority? Why are they the most basic form of knowledge?

Ostler is making a massive claim here without any elaboration or sufficient justification. In philosophy, debates on what form of knowledge is most fundamental have taken up thousands of years and millions of pages, and he cannot simply bypass any of these questions by stating his claim without justification to an accepting audience. Further, in a talk that is supposed to show how spiritual experiences support faith, he seems to be skipping to the end, begging the question, undermining his own case, and making a circular argument. If his initial premise is that spiritual experiences are unprovable but override all other forms of argument or evidence, then what is the point of continuing to make the argument that spiritual experiences justify faith in Mormonism?

I appreciate that Ostler describes the epistemic structure of a Mormon spiritual experience with 6 characteristics – (1) cognitive and affective, (2) non-volitional, (3) familiarity, (4) presence of a loving being, (5) indescribable positive emotion (joy/peace/sweetness), (6) re-orienting all other experience. What constitutes a spiritual experience is rarely described in detail and is usually extremely vague (often intentionally, so that anything can be classed as a spiritual experience that supports the Church’s truth-claims). Part of the reason I appreciate his detailed description is that it makes it easier to understand and address the epistemic problems with this view of spiritual experiences. See below:

In Mormon epistemic practice, the experience of the spiritual knowledge often is described as including some or all of the following facets:

1. The experience cannot be reduced to a mere emotion or feeling. It involves a cognitive component essentially. Doctrine & Covenants 9:7-8 suggests that a precursor to such experiences requires studying out the questions at issue: “Behold, you have not understood; you supposed that I would give it [the answer to your questions] unto you, when you took no thought save it were to ask me. But, behold, I say unto you that you must study it out in your mind. . .” In addition, one must “ask me if it be right.” The scripture then predicts the form that the spiritual response will take: “. . . if it is right I will cause that your bosom shall burn with you; therefore, you shall feel that it is right. But if it be not right you shall have no such feelings, but you shall have a stupor of thought that shall cause you to forget the thing which is wrong” (9:8-9). The experience is both cognitive and affective; both head and heart. As Doctrine & Covenants 8:2 clarifies: “I will tell you in your mind and in your heart, by the Holy Ghost, which shall come upon you and which shall dwell in your heart.” The burning in the bosom, or heart, or very center of the human soul, is affective and involves feelings, but it also involves a sense of pure knowledge and enlightenment. Most often the experience of sensing the truthfulness of the message comes in the midst of such a search. The answers often come in conjunction with sincere study, searching and thoughtful pondering.

2. The spiritual experience cannot be produced at will but is experienced as coming as a grace in the midst an honest search for the truth.

3. It involves a sense of having always known – it is deeply familiar.

4. It involves more than just cognitive or discursive knowledge (sapere); it also involves interpersonal knowledge or conoscere and associated with a sense of the presence of a loving and personal being and being accepted in a relationship. This “knowing God as an interpersonal presence in one’s own life and being” is, at least theologically, the most important spiritual aspect of the experience because to “know God” in this sense is life eternal. Indeed, to know that we are accepted into relationship with God and to invite God to reside in our hearts is a moment of justification by grace through faith and the beginning of the life of sanctification in which the spirit enters into us and Christ takes up abode in us in the process of Christification, or being conformed to the image of Christ, and culminating in deification.

5. The feeling of a “burning” in the heart includes a feeling of indescribable joy, peace and sweetness.

6. The experience re-orients all other experience. Everything is seen in a new light through the lens of the experiential knowledge.

— Blake Ostler, Faith, Reason, & Spiritual Experience – Vol 5

There are many problems with this framework.

First, Ostler isn’t clear whether these criteria are necessary, sufficient, or both for constituting a spiritual experience. Must genuine spiritual experiences have all six, some of them, or just one? It is not clear exactly what he is claiming.

Second, many of the spiritual experiences I’ve heard described in an LDS context don’t meet these criteria. Things often talked about in testimony meetings, like finding your keys after praying and feeling relieved, feeling a vague sense of calmness or peace while reading the BoM, or feeling intense joy after carrying a handcart up a hill – these aren’t enough, and aren’t really valid or complete spiritual experiences for Ostler. Many people in the church have *never* had a spiritual experience that sufficiently meets these criteria, which under Ostler’s own framework would mean they don’t have a sufficient basis for faith in the church. Yet I doubt he would tell these people that they should not have faith, which seems to bare an inconsistency in his worldview.

Third, these criteria are far too permissive to support faith in the LDS church specifically. People often have experiences that match these criteria, in an innumerable variety of contexts. Billions of people have spiritual experiences that they then interpret very differently, supporting their faith in radically different worldviews. Ostler must not only show how spiritual experiences can justify faith in general. He also must show that spiritual experiences can justify faith in Mormonism specifically and solely and exclude faith in other contradictory or competitive religious or faith systems. Of course, he fails to do this — likely because it is not possible.

Additionally, many kinds of experiences, including episodes of mental illness, psychedelic experiences, responses to art and music, and even feelings in survival circumstances or under intense physical exertion (like runner’s highs), could fulfill these criteria. Certain chemical substances and drugs, especially the classical psychedelics like LSD, DMT, 5-MeO-DMT, and psilocybin, can reliably (almost always) induce experiences that fulfill Ostler’s criteria to a remarkable degree (Barrett & Griffiths 2017). They can do so far more reliably than LDS-related practices like prayer, Church worship, scripture reading, and the temple. Does this mean that psychonauts are justified in having faith in the conclusions they take from their often-strange and reality-bending experiences, or that bipolar people in manic episodes are justified in having faith in the content of their manias? Since Ostler’s criteria fail to exclude these types of experience, he is logically committed to defending these claims. He has to accept that they are equally justified in their faith based upon spiritual experience as LDS people are in theirs. Further, if one’s degree of faith should be proportional to the intensity of the experience or the degree to which it fulfills the criteria, he may have to accept that these people are even more justified in their faith.

In the end, Ostler fails to make a satisfying or complete case. He concludes with an argument that there is “no way to distinguish between the phenomenal nature of experiences directly caused by God and knowledge based on memory or sensory experience.” I was surprised to see this statement, because it clearly undermines Ostler’s entire case. If there is no way to distinguish between God-caused experiences and normal sensory experiences, then there is no way to know if an experience is a genuine religious inspiration or not. The true promptings from the Holy Ghost, the true messages from God, are indistinguishable from meaningless sensory impressions and thoughts, like pebbles of gold lost in the fast-moving, endless stream of sensory and mental experiences. Thus, it is impossible to know what to have faith in. Trying to build a foundation of faith on this slippery, shifting, swirling surface is unworkable.

Panning for Gold, by Richard DuBois

Ostler then describes how faith is fundamentally subjective and internal to the individual, because it is (1) passionate, (2) has a unique subjective interpretative stance, (3) is a choice, (4) is a matter of the heart, and (5) is a non-willed gift from God. This roughly matches Kierkegaard’s conception of faith, which I will discuss and critique shortly. However, here Ostler sets himself up for failure. Faith is meant to be something that can support decisions, providing a basis for a life-path and a belief system. It is also supposed to allow many people to arrive at the same conclusions independently (e.g. that the LDS Church is true). But if it is this radically subjective, emotional, intense form of knowing, one which is chosen in a kind of arbitrary leap, it seems extremely doubtful that faith can fulfill these desiderata.

Further Questions

I have only touched on the surface of these questions and addressed one of the more popular defenses of spiritual experiences in Mormonism. Here are a few more questions that I find crucially important and that are often ignored or misunderstood:

Faith as a choice or leap

Let’s say faith is a choice. This is a common claim in religion, an oft-repeated phrase in Mormonism, and a key premise in Ostler’s argument. But there are many things that one could choose to have faith in. Many of these things are contradictory or competitive, and you cannot have faith in all of them at the same time. Thus, one must choose between them. On what basis do you make that choice?

Even Kierkegaard, the most prominent philosopher of faith and one of my favorite thinkers, fails to give an adequate analysis of what to have faith in. He claims that we cannot have faith by virtue of reason and must suspend our reason to believe in something higher than reason (SEP). This is a form of fideism, the epistemological position that faith is independent of reason. (Though Kierkegaard is characteristically contradictory here, as he uses reason to justify his claim that we cannot have faith in virtue of reason.) Faith is simply a leap into the unknown and unknowable, a radical decision made in response to the absurdity and ambiguity of the human condition. He sees faith as an unexplainable miracle, where eternal truth enters time in an instant and witnesses of God.

But where should one leap? In what direction? How does one tell between the eternal truths worth leaping for, and the contingent or potentially false ideas and beliefs that should be avoided or more tentatively walked into? He seems to take it as a foregone conclusion that if one has faith, it would be in Christianity. Why, if not just his upbringing and cultural biases? Why not take a leap of faith into Hinduism, Buddhism, Islam, new-age spiritualism, fascism, Marxism, scientism, or any other system of belief? Is the choice just arbitrary, random, and unfounded?

Further, why leap into any culturally created system of belief at all, rather than just having faith in disconnected ideas or principles? After all, the “will to a system is a lack of integrity” (Nietzsche). Once you decide to embrace a complete system, the experiences, evidence, and ideas that do not fit into your system will be rejected, ignored, or forced to fit into the system. You are no longer an authentic and serious investigator, seeking to interpret reality, understand experiences for their own sake, and make an inquiry into life and the world in a pursuit of truth. You are now attached to a rigid, preexisting belief system and set of concepts. When this system becomes ingrained in your mind by decades of acceptance and practice, it defines your social relations, and it shapes how you live your life every day, it becomes a herculean task to leave it. The cognitive dissonance becomes too powerful. There is a very real sense in which your brain will not let you escape the system. Further, the social and personal costs of leaving a system you have built your life around can be too much to bear. This is why Nietzsche urges us to avoid systems, to always be open to new evidence and experience, and to be able to hold contradictions in your mind without forcing them to resolve into one side or the other to fit some static framework.

Leap of Faith, painting by Brady Nielson.

It is telling that Kierkegaard’s paradigm example of a leap of faith (a phrase he does not actually use in his corpus) is the story of Abraham nearly sacrificing his son, Isaac. He does this precisely because it is such a radical decision (killing your child), made entirely on the basis of Abraham’s faith and his spiritual experience of God speaking to him. He uses this story to show that the religious sphere of existence transcends or exceeds rationality and morality. But what if instead of Abraham, it was Osama bin Laden, who believed on the basis of his Islamic faith and his spiritual experiences that he should orchestrate the 9/11 attacks? This case differs from Abraham in degree, and in its cultural associations, but it does not seem fundamentally different in kind. There are an infinite number of actions that one could take on the basis of one’s faith or spiritual experiences, where many of these actions any reasonable person would consider completely unacceptable, but a faith-based person would see as potentially acceptable or even essential. Why take one specific faith-inspired action over any other? Further, a constitutive component of faith for Kierkegaard is that it is totally embraced, without doubt or rational criticism. Abraham does not stop to ask how he knows that God is speaking to him, to think that perhaps he is hallucinating or mishearing, or to wonder if he is misinterpreting what God’s intentions are. These would be unfaithful questions. Instead, he just acts. Thus, the idea of a leap of faith leaves us open to amoral, irrational actions that can result in atrocities and catastrophes. Personally, I would hate to live in a world where people acted on this idea.

Rembrandt, Abraham and Isaac

Perhaps Kierkegaard’s view of faith is more understandable when you understand his idea of truth. See below:

“An objective uncertainty held fast in an appropriation process of the most passionate inwardness is the truth, the highest truth attainable for an existing individual…The truth is precisely the venture which chooses an objective uncertainty with the passion of the infinite.  I contemplate the order of nature in the hope of finding God, and I see omnipotence and wisdom; but I also see much else that disturbs my mind and excites anxiety.  The sum of all this is an objective uncertainty.  But it is for this very reason that the inwardness becomes as intense as it is, for it embraces this objective uncertainty with the entire passion of the infinite …Without risk there is no faith.  Faith is precisely the contradiction between the infinite passion of the individual’s inwardness and the objective uncertainty.”

Kierkegaard, Concluding Unscientific Postscript, p. 182.

This is an extremely complex passage that only explains Kierkegaard’s view in part. But in short, he sees life as objectively uncertain: the world is ambiguous, hard to interpret, and there is not an entirely objective or verifiable system that can tell us exactly what the truth is and what we should believe and do. However, any existing, living person must still take actions and have beliefs. We cannot simply sit back in a passive, agnostic, unmoving state of purgatory and suspension, like Dostoevsky’s Underground Man. Not only is that a life-negating and depressing outcome (a resignation to the deathworld), the passivity and agnosticism is also itself a kind of action that implies certain beliefs (e.g. that no action is worth taking). Taking no actions whatsoever, holding no beliefs at all, is impossible for human beings. Therefore, we must make choices and hold beliefs despite our uncertainties. Kierkegaard’s solution is faith. We must choose some uncertainty to hold fast with a passionate conviction. We make our choices based on this conviction, acting and believing despite the risk.

This is a compelling and profound description of the human condition, and I mostly agree with it. However, Kierkegaard runs into the same problem again here: faith provides no decision criterion, no way to choose between beliefs. It only provides a way to hold an already-decided-upon belief. Faith cannot stand on its own. By itself, it is only a teetering, aimless toddler, without direction or foundation. As the case of Abraham and Osama above shows, it matters which uncertain beliefs and ideas we decide to appropriate into our being and hold with the most passionate inwardness. We cannot just seize upon any old uncertainty and develop an intense faith in it – or at least we should not. So how do we choose? Whatever we choose, there will be an implicit criterion or framework at play, something that determines how we evaluate our options and decide amongst them.

What evaluative framework should we use to choose what to have faith in, what to believe? This is an enormous question, the fundamental question of epistemology, and I cannot answer it here. But I would argue that some of the best frameworks include logic and reason, the evidence of experience, and authentic love for others. The answer is certainly not faith, because that would just leave us where we started again. It is circular and arbitrary to choose what to have faith in on the basis of faith. Ultimately, faith is just a thin cover for other systems. Once you peel back the cover and see behind the curtains, it turns out faith is empty. It is a group of other systems in a trench coat. Upon investigation and reflection, faith dissolves into nothingness or devolves into other things, providing no real guidance. While faith may be necessary for practical action, and it is important for living a meaningful and coherent life, it does not itself provide us with any reasons to choose. It is the conclusion of the decision process, the step that must be taken once you have already decided what uncertainties to believe in.

Faith in the good

Should we have faith in ‘what is good’? Let’s say a paradigm religious or LDS person, call her Sophie, wants to have faith because she believes what she has faith in is “good” or morally valuable. But how does she know that what she has faith in is good, and how does she know what constitutes the Good itself? In other words, where does Sophie get her normative and moral beliefs? By saying that she has faith in what is good, she is applying her pre-existing moral framework (her understanding of what is good). This assumes that she already has a sufficient basis for that moral framework, and that it can stand independently from her faith. But I think the answer is that Sophie, the typical religious person, gets most or all of her moral beliefs from faith as well – from her religion or her spiritual experiences. Thus, she is using the moral beliefs she got from faith to assess her faith. This is circular logic. She is using the system of faith to evaluate itself. A faith based upon the good is an illusory solution that swallows its own tail.

A common phrase in Mormonism is “what is good is of God” (call this Alma’s formula). This comes from Alma 5:40 in the Book of Mormon, where the prophet Alma declares: “For I say unto you that whatsoever is good cometh from God, and whatsoever is evil cometh from the devil.” Countless times, when I asked about faith, or questioned how we can know if something is a genuine spiritual experience or a real inspiration from God, this was the answer I was given. However, this is a vacuous non-answer. After all, it presumes that we already know what is good and what is evil or bad. In philosophical terms, it assumes that we have reliable normative knowledge. But we don’t. Morality and decision-making are complex, and it is often unclear what is good, or even what it means for something to be good. People have conflicting moral intuitions, beliefs, and reasoning processes, and it is not obvious who is right (see the disagreement on trolley problem dilemmas). In other words, this logic of faith bypasses an entire field of philosophy (metaethics) and ignores an enormous and vital area of human thought, brushing it aside without reflection.

What is good may be of God, but how do we know what is good? Does God tell us somehow? If so, then how do we know if the things we think are God’s guidance are truly of God? Perhaps we can be mistaken about what is good, and what is of God. For instance, I think most people would agree that Osama bin Laden was mistaken that he was guided by God, and that the papal legate Arnaud Amalric was mistaken when he ordered all the heretics of Béziers to be slaughtered, saying “Kill them all. God will know his own.” (See The Ideology of Anti-Heretical Crusade in the 1838 Mormon War). The problem of figuring out what is good is not trivial and cannot be bypassed, and it is a matter of life or death what answers we choose. Alma naively ignores this problem.

Additionally, the formula “what is good is of God” reveals how empty or useless faith and spiritual experience are as decision criteria. In most cases where people are seeking answers from God (or whatever spiritual/religious entity they believe in), they are asking precisely because they are unsure what to do, and don’t know what the good option or right choice is. They are in a state of aporia or uncertainty with respect to the good. If they knew what was good already, they wouldn’t be asking God. For instance, perhaps someone is unsure about what college or job would be right for them. How does “what is good is of God” help there? This creates a dilemma with two sharp horns, a catch-22. In any serious and difficult case where a decision must be made, faith and spiritual experience are completely unhelpful. Since you don’t know what is good, Alma’s formula is almost insultingly pointless. Trying to determine what experiences might be genuinely spiritual (“of God”) is impossible, since you can only tell what is of God if you already know what is good. On the other hand, in the easier, more trivial and obvious cases where it is already clear what the good option is, you don’t need to resort to faith or call upon spiritual experiences at all. You can avoid consulting them entirely and leave them by the wayside, like blind guides at a crossroads that cannot help any lost traveler. Therefore, faith is either useless or irrelevant for decision-making.

The Road Not Taken by Michael Bosnar. The good path might seem obvious, but in real life this is rarely the case – and the thrust of Robert Frost’s poem is that the ‘good road’ is only clear in retrospect.

Conclusion

These are a sample of some of my most important recurring thoughts on faith and spiritual experiences. I’ve been thinking many of these things since I was 13 or 14, but this is the first place I have written them down in detail. I hope this helps explain why I find faith and spiritual experiences utterly unsatisfying, and why I am frustrated and disappointed by most discussions on them. For the things that serve as the foundations for many people’s entire lives, that shape every decision they make, it is shocking how rarely these people reflect deeply and thoroughly on faith and spirituality. If they did, I think they would reach the same conclusions I have. Faith is important, and spiritual experiences are a powerful and beautiful dimension of human life. They are both valuable in many ways. However, they cannot serve as the foundations of a belief system or as a criterion for decisions and actions. Spiritual experiences alone do not justify faith, and faith alone does not provide a way to make choices and hold beliefs in an uncertain world.

Why We Need Emotion to Interpret the World

Heidegger's Being and Time will be cited as BT with marginal pagination. 

Disclosing the world is a precondition for any engagement or concern with the world, as it makes the ready-to-hand “accessible for circumspective concern” (BT 76). Something must light up the world, making its totality of references, assignments, and tools available to us. But how is the world lit up or disclosed? Through the inseparably connected components of the care-structure, including attunement, understanding, fallenness, and discourse. This essay focuses on attunement, perhaps the most fundamental part of the care-structure, as it is what makes things matter to Dasein in the first place (BT 137). Section 1 reconstructs Heidegger’s account of attunement and moods in the context of his broader existential analytic. Section 2 addresses some major methodological concerns for his account. Ultimately, Heidegger’s analysis of attunement illuminates key ontological structures of our experience and remains relevant even in a modern scientific context.

1. Attunement and Mood

Heidegger distinguishes between two concepts: an attunement or state-of-mind (Befindlichkeit), and a mood (Stimmung).[1] Unfortunately, Heidegger does not explicitly delineate these terms, and often uses them interchangeably. One interpretation is that attunement is the ontological existentiale, while mood is the ontic manifestation of attunement. In less technical terms, attunement is the fundamental condition that allows us to experience the world as meaningful and ‘mooded.’ Mood is the term for more specific modes of attunement, like fear, anxiety, joy, anger, or focus. Moods are therefore derivative from attunement. Perhaps Heidegger does not need to distinguish between the two. After all, we never experience some abstract, free-floating, or content-free attunement. Instead, we are always experiencing a specific, concrete mood. Attunement is a concept for describing the character of moods in general, as they all share a common structure. What are the characteristics of this structure?

An intuitive view is that moods are occasional, transient emotional experiences that affect us temporarily. One can be more or less moody, or feel a particularly strong mood, but moods are not constant features of our experience. For Heidegger, moods are far more fundamental. We are always already in a mood, and “we are never free from moods” (BT 136). Dasein is Being-in-the-world: it is always absorbed in and engaged with a web of references and assignments that make a totality of equipment ready-to-hand (BT 76). Moods make things accessible to us as equipment, making them meaningful. For instance, a mood like “focus” reveals this laptop as a tool for-the-sake-of the project of writing this essay. I am able to encounter only what a mood has already disclosed to me. Moods thereby disclose the worldhood of the world.

Moods allow us to “encounter something that matters to us” (BT 138). In this sense, moods color the world. However, this metaphor is misleading, as it suggests attunement simply tinges or tints objects that are already revealed. As Schopenhauer writes, “subjective mood—the affection of the will—communicates its color to the purely viewed surroundings.”[2] For Heidegger, moods are not just tinted lenses that give already-revealed objects some emotional color. Attunement, the structure of mood, is more like an atmosphere than a tinted lens: moods are always present, even if not visible, and are necessary for any experience of the world whatsoever.[3] Attunement is how the world opens up to me – whether it is opened up as a burden, a fearful place, or a wonderland. For instance, fearfulness is the mood which allows me to discover threatening objects (BT 138). Furthermore, a mood is not from inside or outside the mind, “but arises out of Being-in-the-world” (BT 176). Heidegger again rejects the distinction between subject and object, as it “splits the phenomenon asunder” (BT 132). Moods are neither inner nor outer, within nor without, objective nor subjective. Rather, moods condition the way we encounter things within the unitary phenomenon of Being-in-the-world.

Lee, “Stillwinds #8”, acrylic on canvas. For Heidegger, art has a unique ability to communicate a mood.

Heidegger’s reasoning about attunement could fit into the pattern of a transcendental argument: (1) Being-in-the-world is the basic structure of experience as Dasein; (2) in Being-in-the-world, things are disclosed as meaningful and ready-to-hand; (3) there must be some way these things are disclosed and made meaningful; (4) attunement is a name for the way things are disclosed and made meaningful to Dasein.[4] Therefore, attunement is an ontological precondition for our experience of the world. As Heidegger puts it, “only because the ‘there’ has already been disclosed in a state of mind [attunement] can immanent reflection come across ‘experiences’ at all” (BT 136). Moods are not just a kind of experience or a way of being intentionally directed. Instead, moods are a condition that makes experience possible, making it “possible first of all to direct oneself toward something” (BT 137). This is why attunement is necessary for experience in general, and not just affective or emotional experience.

2. Methodological Problems for Heidegger’s Analysis

The first problem for Heidegger’s concept of attunement is a methodological one. If we are always already in a mood, it follows that even Heidegger’s existential analytic must be carried out in some mood. Therefore, we can ask what makes his mood, or any mood, existentially authoritative. Since moods condition experience in different ways, perhaps Dasein will reveal itself differently depending on the mood of the phenomenologist. Is there a ‘right’ mood for uncovering the real ontological structures of Dasein?

Initially, it is clear that Heidegger rejects the idea of a ‘pure’ phenomenology devoid of mood. For example, through the neutrality modification, Husserl aimed to “suspend everything connected to the will” to achieve a purer phenomenological method.[5] Heidegger argues that this is misguided. There is no pure, mood-free experience of objects, as mood is a precondition for being receptive to objects at all. Not “even the purest theory has left all moods behind it” (BT 138). We cannot get outside of moods and observe them from some external vantage point. Every investigation must have some mood that makes the objects of investigation accessible and meaningful.

Heidegger emphasizes that this does not mean we “surrender science ontically to ‘feeling’” (BT 138), but it does seem methodologically problematic for an existential analytic if ‘universal’ ontological structures are only visible in certain moods. One can understand why phenomenologists seek neutrality, to avoid this methodological subjectivity. A defender of Heidegger’s approach can make several responses. First, even if we only “see the ‘world’ unsteadily and fitfully in accordance with our moods” (BT 138), this may be the only way to analyze being as it truly manifests itself. If the investigation of being turns out to be mood-dependent and tumultuous, then so be it. We should not falsify our experience and create artificial uniformity, treating Dasein as always present-at-hand, just because this would make phenomenology seem more objective. Second, the existentiales Heidegger identifies are present regardless of mood: in “every state-of-mind…Being-in-the-world should be fully disclosed” (BT 191). Even if we are not explicitly aware of structures like understanding, Self, or the World, they still condition our experience. Indeed, Being will often be disguised and “covered up” to us (BT 35). Perhaps an in-depth analysis can reveal structures that are not visible in our average everydayness, but that are always present as ontological structures. Presumably, these structures will be recognizable in every mood, although in different ways and to different degrees.

Furthermore, not all moods are equal in their disclosure of Dasein. Information about Dasein is accessible to us through attunements, and more primordial attunements offer a greater possibility of accurately interpreting Dasein’s Being (BT 185). Heidegger argues that anxiety (angst) is the most primordial and disclosive attunement. Unlike fear in the face of some extant entity, we have anxiety in the face of Being-in-the-world as such, which is indefinite, unknown, and nowhere. Just as we become aware of our tools as present-at-hand objects when they break, we become aware of our world as a world when it breaks down. Through anxiety, we see the networks of meaning we are normally absorbed in, realize our individuality and being-thrown, and recognize our freedom to live inauthentic or authentic possibilities. Anxiety also provokes feelings of uncanniness and homelessness in our once-familiar world. Thus, we usually flee from it, absorbing ourselves in projects and entities to “dim down” or tranquilize the anxiety (BT 189). Our ceaseless avoidance reveals the constant presence and primordiality of anxiety, showing that Dasein is anxious in the “very depths of its Being” (BT 190). Anxiety is therefore a primordial mood that can encourage authenticity and enable the analysis of Dasein.

Digital art by Kyle Kerr. Angst is a mood that can disclose our authentic being and open up our possibilities.

However, Heidegger leaves serious methodological questions unanswered. Despite using the term “primordial” 371 times in B&T, he never offers a method for determining whether a phenomenon is more primordial than another. His evidence that anxiety is a primordial attunement rests on the claim that we are always fleeing from it. However, even if this is accepted as a phenomenologically apt description, it is not clear why this implies that anxiety is more primordial. Even more critically, Heidegger suggests that anxiety as a primordial mood is more disclosive – it offers us privileged epistemic access to Dasein and the worldhood of the world. Why does the fact that we flee from an attunement imply that it is primordial, and why does its primordiality imply that the attunement is more disclosive? In claiming that anxiety discloses primordial Being, Heidegger seems to be begging the question: he presupposes some significant knowledge of primordial Being. Without this preexisting knowledge, it is hard to see how Heidegger could claim that anxiety discloses more of the reality or primordiality of Being.[6] While perhaps we have an implicit awareness of Being that enables us to begin an investigation of Dasein (BT 7), Heidegger is assuming a much richer understanding of Being here.

Furthermore, it is not clear why a phenomenon like fallenness is not more primordial than anxiety. After all, it is almost universally present, and being-fallen is the mode of being that we occupy proximally and for the most part. In contrast, “‘real’ anxiety is rare” (BT 190). We flee toward fallenness, and away from anxiety (BT 189). Why should the phenomena we flee away from be more primordial than the phenomena we flee toward? Often, it seems that Heidegger labels a phenomenon “primordial” to communicate normative preferences rather than descriptive claims about the reality of Being. This leaves serious concerns: how can we resolve epistemic disputes about the primordiality of phenomena? More generally, why should we accept Heidegger’s characterizations of Being? The primary method he employs is a description of phenomena in our experience, combined with logical analysis that draws conclusions about Being from these phenomena. At least to some degree, Heidegger relies on the aptness and explanatory power of his descriptions of our experience. Thus, the validity of his “fundamental ontology” is dependent on the resonance of his words in describing the human condition, and seems to be an aesthetic activity analogous to that of a novelist or fiction writer.

Shoes, Van Gogh (painting). Heidegger describes this painting as disclosing an entire life-world. Perhaps his own theory can be taken as an artistic depiction of the nature of Being, and not a rigorous ontological investigation.

Finally, in Heidegger’s time, the “psychology of moods” was a new, undeveloped field which “still lies fallow” (BT 134). Today, it has grown into the far more mature field of affective science. However, Heidegger would likely criticize even a more advanced, scientific, and explanatorily successful psychology as having critical problematic assumptions and a deeply flawed starting point. The sciences treat Dasein as a present-at-hand object which can be understood in a detached theoretical attitude, and this approach inevitably falsifies the phenomena. Empirical science is a restricted mode of disclosing being, and it is not epistemologically prior. Indeed, the existentiales that Heidegger elucidates are “a priori conditions for the objects which biology takes for its theme,” and the structures examined by any science can only be understood if they are first seen as structures of Dasein (BT 58). For instance, attunements are the fundamental conditions that render the world intelligible to us, making possible logical or theoretical investigation. Ontological structures like attunement must be presupposed by the sciences and can never be fully explained by present-at-hand analysis.

As it happens, many of Heidegger’s explanations of Being have proved fruitful in the sciences, and his work influences entire research areas like embodied cognition. The existential analytic of Dasein has been ‘naturalized,’ tested, and applied as a model of the extant human brain. For example, Ratcliffe (2002) argues that Heidegger’s account is “actually required as an interpretive backdrop for neuropsychological cases,” and provides a powerful framework for modern affective science.[7] Recent findings show that moods determine how the world is opened up to us, enabling cognitive processing, decision-making, and successful reasoning. These findings show that Heidegger’s analysis has explanatory power in science as well as phenomenology. Additionally, as they reveal the inextricability of emotion from cognitive processes like logic, these findings challenge the ‘purity’ of many theoretical methods and undermine the epistemological assumptions of the sciences.

However, attempting to use science to add credibility to Heidegger’s views implicitly accepts that his claims are legitimately interpretable and even testable in a scientific context. This implies that empirical sciences can offer meaningful knowledge about Dasein, a claim Heidegger would likely reject. If the existential analytic truly has ontological priority, then it does not require empirical validation through the study of present-at-hand beings, and it cannot be treated as a merely ontic science. In the process of applying Heidegger’s ideas, the sciences therefore may violate some of his most essential philosophical principles. However, the problems discussed above raise questions for Heidegger’s own methods. These methods may not be able to fulfill his own desiderata, as they do not reveal the phenomena in a sufficiently originary way and are not clearly epistemologically prior. Instead, Heidegger’s approach, insofar as it aims for explanatory power in its description of consciousness and being, could be interpreted as continuous with the natural sciences. After all, a strict division between the study of Dasein and the present-at-hand would commit a cardinal Heideggerian sin by splitting up unitary phenomena. Just as the sciences are not a privileged conduit to reality, perhaps the existential analytic of Dasein is just one limited but insightful way of disclosing Being.

Bibliography

Elpidorou, Andreas, and Lauren Freeman. “Affectivity in Heidegger I: Moods and emotions in Being and Time.” Philosophy Compass 10, no. 10 (2015): 661-671.

Heidegger, Martin. The fundamental concepts of metaphysics: World, finitude, solitude. Indiana University Press, 1995.

Heidegger, Martin. Basic Problems of Phenomenology. Albert Hofstadter, trans. Indiana University Press, 1988.

Heidegger, Martin. Being and Time. Trans. John Macquarrie & Edward Robinson. Harper Reprint, 2008.

Husserl, Edmund. Ideas for a pure phenomenology and phenomenological philosophy: First book: General introduction to pure phenomenology. Hackett Publishing, 2014.

Schopenhauer, Arthur. The World as Will and Idea – Vol. 2. Project Gutenberg, 2015.

Polt, Richard. Heidegger: an introduction. Routledge, 2013.

Ratcliffe, Matthew. “Heidegger’s attunement and the neuropsychology of emotion.” Phenomenology and the Cognitive Sciences 1, no. 3 (2002): 287-312.

  1. I will use “attunement” for Heidegger’s term Befindlichkeit, and “mood” for Stimmung. Many translators agree these English terms most accurately communicate Heidegger’s concepts. See Andreas Elpidorou and Lauren Freeman, “Affectivity in Heidegger I: Moods and emotions in Being and Time,” Philosophy Compass 10, no. 10 (2015): 661-671.
  2. Arthur Schopenhauer, The World as Will and Idea – Vol. 2, Project Gutenberg, 2015. Pg. 400.
  3. Heidegger, The fundamental concepts of metaphysics, pg. 45.
  4. Of course, attunement is not the only way things are disclosed – it is part of the whole care-structure.
  5. Husserl, Ideas I, §109, pg. 213.
  6. Ratcliffe, Matthew. “Heidegger’s attunement and the neuropsychology of emotion.” Phenomenology and the Cognitive Sciences 1, no. 3 (2002): 287-312.
Categories
Bipolar Cognitive Science Essays Neuroscience Science

The Sparks of Generative Creativity in Mental Disorders

“Almost everywhere it was madness which prepared the way for the new idea, which broke the spell of a venerated usage and superstition. Do you understand why it had to be madness which did this?”

— Nietzsche, Daybreak #14

How could ‘madness’ be helpful for idea-generation, brainstorming, artistic expression, and other creative processes? Mental disorders, as neural-cognitive differences that often misalign with a social context, may enable the kinds of divergence that contribute to creativity. I argue that conditions like bipolar disorder, Tourette’s syndrome, and ADHD (the ‘C-disorders’) share features that substantially increase generative creativity. Although they may not have a common etiology, the C-disorders have important shared cognitive styles and neural patterns. Part 1 provides a theoretical framework, describing generative creativity within a dual-process model, defending its value, and considering how it can be effectively studied. Part 2 analyzes the empirical evidence indicating that the cognitive styles and neural correlates of generative creativity are exceptionally exhibited in the C-disorders. I conclude by tying together these threads and calling for a new approach to treating the C-disorders that takes these findings into account.

1. The Dual-Process Model and Generative Creativity

1.1 — What is Creativity?

Creativity lies at the intersection of novelty and value. To be creative, an idea, invention, artwork, or other product must be both useful and new. Of course, this definition is vague and subject to difficult questions. For instance, what does it mean for a creative product to be valuable? This question is subject to social and evaluative norms. Often, the standards used are social consensus, scientific-technological innovation, or material-economic benefits, but it is not clear these are necessary or sufficient. The definition of creativity, and the best construct to measure and describe it, remains hotly disputed (Ford & Harris, 1992). In part to bypass some of these theoretical issues, this essay will be restricted to a specific sub-component: generative creativity.

1.2 — The Dual Process Model of Creativity

Diagram from “A Dual-state Model of Creative Cognition for Supporting Strategies that Foster Creativity in the Classroom” by Howard-Jones (2002).

Creative thinking proceeds in phases—an initial phase of unconstrained generation or brainstorming, and a subsequent more-constrained and systematic evaluation. Under this dual-process model, creativity starts by generating a wide range of initial ideas and associations and finishes by exploring these crude options with evaluation and testing. Generative creativity is the first process. Computationally, a generatively creative system is one that creates new patterns regardless of their estimated benefit to the system, while evaluative or adaptive creativity involves creating patterns that fulfill established value-functions (Bown, 2012, pg. 364). Evidence suggests that the dual-process model has a basis in the brain, as the two phases of generative and evaluative creativity involve distinct neural systems: creative generation recruits primarily medial temporal lobe regions like the hippocampus, while evaluation co-recruits the default mode and executive control networks (Ellamil et al, 2012). Furthermore, this study finds that the generative and evaluative networks were somewhat competitive: “the more successfully [participants] were able to engage in creative generation while avoiding evaluative processes, the more they recruited MTL regions associated with creative generation.” These phases are both vital to successful creative production, but they are underpinned by diverging cognitive styles and neural correlates.

Indeed, the two processes often conflict. Brainstorming demonstrates the importance of quarantining the generative process from critical, evaluative, goal-directed, convergent mental processes. Listening to the critic in one’s head is the fastest way to make a brainstorming session crash on the runway. Unfettered generation is especially critical because “the more creative concepts you have to choose from, the better” (Adams, 2001, pg. 22). If one does not take the time for unconstrained, generation-focused, divergent thinking, it is far more likely that the creative process will be prematurely mired in conceptual blocks and arbitrary limitations. Effective brainstorming entails avoiding premature evaluations and quality checks, and instead focusing on ideational speed and fluency – producing a large number of new concepts, designs, or ideas. When it comes to creativity, quantity has a quality all its own.
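The computational distinction drawn above, between generating patterns regardless of their estimated benefit and selecting patterns that satisfy a value function, can be illustrated with a toy sketch. This is only an illustration under stated assumptions, not a model from Bown (2012) or Ellamil et al (2012): the function names `generate`, `evaluate`, and `toy_value`, the seed words, and the scoring rule are all hypothetical.

```python
import random
from typing import Callable, List


def generate(seeds: List[str], n_ideas: int, rng: random.Random) -> List[str]:
    """Generative phase: pair seed concepts at random, with no value check."""
    return [" + ".join(rng.sample(seeds, 2)) for _ in range(n_ideas)]


def evaluate(ideas: List[str], value_fn: Callable[[str], float], keep: int) -> List[str]:
    """Evaluative phase: score each distinct idea and keep the top few."""
    unique = list(dict.fromkeys(ideas))  # drop duplicates, preserve order
    return sorted(unique, key=value_fn, reverse=True)[:keep]


def toy_value(idea: str) -> float:
    """Hypothetical value function: rewards ideas with more distinct letters."""
    return float(len(set(idea.replace(" ", "").replace("+", ""))))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    seeds = ["umbrella", "violin", "bicycle", "lantern", "teapot"]
    pool = generate(seeds, n_ideas=20, rng=rng)   # quantity first
    print(evaluate(pool, toy_value, keep=3))      # quality check second
```

Keeping the two phases in separate functions mirrors the point about quarantining generation from evaluation: `generate` never consults `value_fn`, so nothing is filtered out before the pool is full.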

1.3 — Modal Cognition and Generative Creativity

Research on modal cognition also has important theoretical import for generative creativity. Under the theory of the psychological representation of modality developed across multiple papers by Phillips et al, the initial set of possibilities we consider is limited by the constraints of probability, physics, and morality (Phillips & Knobe 2018). With limited time, we default to only considering a systematically limited subset of possibilities. For instance, both children and time-constrained adults tend to consider immoral options (e.g. stealing or lying) or unlikely and irregular options (e.g. painting polka dots on an airplane) as impossible (Phillips, Morris, & Cushman, 2017). Indeed, experimental data from the PhilLab suggests that as people generate more options, these options become less constrained by norms of probability, normality, morality, and rationality.[1] This may imply that possibilities become more divergent, unconventional, novel, or surprising as the quantity of ideas generated increases.

Figure 1: Chart showing that as more possibilities are generated, the possibilities increasingly deviate from the constraints of morality, normality, probability, and rationality. For instance, the 1st item generated is given a probability rating of about 6.3, while the 8th item generated is given a probability rating of about 5. This shows the importance of the quantity of ideas generated for escaping constraints and conceptual limitations during the brainstorming process. Based on unpublished data from the Dartmouth PhilLab, project by Jonathan Phillips, Eliza Jane, Margaret Garrard, and Maeen Arslan.

Using simple heuristics to delimit the most relevant and useful possibilities is computationally cheap, quick, and often adaptive. But for generative creativity, one must minimize constraint and mental friction to produce maximal options. Phillips theorizes that there are two processes in modal cognition: the default and the deliberative representations of possibility (2017). Perhaps generative creativity relies on the deliberative representation: as more possibilities are generated in a creative flow state, the ordinary restraints loosen, and the consideration set expands. Mental disorders may facilitate surpassing the default modal limitations, allowing unconstrained generation.
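As a rough worked example, the two probability ratings quoted in the Figure 1 caption can be turned into an average change per additional item generated. The sketch below assumes a roughly linear decline between the 1st and 8th items, which the caption does not state; the function name `average_drop_per_item` is hypothetical, and the inputs are the approximate values from the caption rather than the underlying data.

```python
def average_drop_per_item(pos_a: int, rating_a: float, pos_b: int, rating_b: float) -> float:
    """Average fall in rating per additional item between two list positions."""
    return (rating_a - rating_b) / (pos_b - pos_a)


# Approximate ratings quoted in the Figure 1 caption (assumed, not raw data).
drop = average_drop_per_item(1, 6.3, 8, 5.0)

if __name__ == "__main__":
    print(f"{drop:.2f} rating points per additional item")
```

On these two data points alone, each further possibility generated is rated on average about 0.19 points less probable than the one before it.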

1.4 — Methodological Considerations in Creativity Studies

Empirical research on mental disorders and generative creativity should keep a few important considerations in mind. First, research should be constrained to adults. Including children and adolescents would introduce too many confounding variables, as neuroplasticity and other features of the developing brain likely influence generative creativity and interfere with attempts to isolate the effect of the C-disorders. Second, it should avoid an excessive focus on the DSM-5 constructs, which are unlikely to map perfectly onto brain differences, are subject to change, and have serious conceptual and methodological problems (Hadfield, 2020). Instead, I emphasize the neuro-cognitive patterns exemplified in these disorders.

Finally, a key problem with creativity research is its focus on ‘demonstrated creativity’: concrete observable outcomes valued in a social context. For instance, creative professions, eminence, and forms of creative output are used as proxies for creativity. However, this paper is more concerned with creative processes than outcomes. Demonstrated creativity is a very ‘noisy’ measure, as actual generative creativity is filtered through social, economic, and pragmatic pressures. Therefore, it may be systematically biased against people with socioeconomic disadvantages, the mentally ill, and others for whom it is particularly difficult to conform to social criteria and fit within existing systems. Similarly, it would be misguided to measure intelligence (g) by financial or academic achievements alone. I will concentrate on measures of generative creativity that are process-based rather than outcome-based.

2. Review of Empirical Evidence

This cannot be a comprehensive research review. Rather, it is a sampling of some available evidence to provide preliminary support for the view that C-disorders increase generative creativity. The C-disorders are united by being approach-based rather than avoidance-based psychopathologies (like anxiety and depression), and meta-analyses have shown that approach disorders are associated with creativity (Baas et al, 2016). Compellingly, a DTI analysis found that there is “specific white matter architecture underlying the normal variance of divergent thinking, openness, and psychotic-spectrum traits,” which supports the idea of a continuum between creativity and psychopathology (Jung et al, 2010). The C-disorders share some specific cognitive styles and neural correlates connected to generative creativity.

2.1 — The C-Disorders and Ideational Speed

First, the C-disorders are associated with increased ideational fluency, racing thoughts, and some measures of cognitive speed. This could result in a higher pace of generation that outputs more ideas. A subjective acceleration of consciousness and an overproduction of ideas are involved in both adult ADHD and hypomania (Martz, 2021). ADHD symptoms like hyperactivity and impulsivity are associated with enhanced divergent thinking, originality, and cognitive flexibility, and improved performance on open-ended generation tasks (Boot et al, 2017). People with ADHD also generated more original ideas than controls when under competition, although they had trouble constraining ideas by practicality (Boot et al, 2020). Additionally, manic patients exhibited higher fluency scores, producing more novel word associations, and their associational fluency increased after discontinuing lithium (Johnson et al, 2012, pg. 8). A catalytic combination of ideational speed, fluency, and an excess of thought could allow people with C-disorders to brainstorm at an exceptional allegro-like tempo. The neural correlates of these processes are unclear, but possible candidates are dopaminergic hypersensitivity and potentially even a higher rate of synaptic transmission throughout the brain.

2.2 — The C-Disorders, Openness, and Divergent Thinking

Second, the C-disorders increase divergent thinking and openness, resulting in unexpected connections and more unpredictable mental pathways. While at its extreme this can lead to psychosis, it also amplifies the exploratory processes essential to generativity. Both bipolar and ADHD are associated with significantly higher openness to experience (Van Dijk et al, 2017; Quilty, 2009). Openness is linked to trait creativity, is even used as a measure of creativity, and is associated with higher volume in brain regions that inhibit control and reduce constraint (Li, 2015). The highly-open personalities of C-disorder patients seem to facilitate highly associative, fluent, and originative brainstorming.

Painting “Mania” by Florencio Yllana

Furthermore, mania risk is associated with divergent thinking (Johnson et al, 2012). The more adaptive symptoms of mania – reduced need for sleep, hyperactivity, excitement, motivation, and enhanced mental speed – are particularly related to generative creativity, while more damaging symptoms like hypersexuality, anger, and poor judgement were not helpful (Johnson et al, pg. 12). However, even seemingly negative symptoms of mania like impulsivity and distractibility can be essential to generative creativity, as they can enable expression with reduced constraint and cognitive control. Bipolar also correlates with many measures of demonstrated creativity: this review finds that mean occupational creativity and lifetime ratings of creative accomplishment are significantly higher in bipolar patients, and the disorder is over-represented in eminent creatives like famous writers and artists (Johnson, pg. 6). As a whole, the kinds of cognitive and neural divergence seen in the C-disorders are valuable for generative creativity.

2.3 – The C-Disorders and Weakened Constraints

Third, the C-disorders are linked to looser cognitive limitations, weakened top-down control, and more unconstrained thinking. Creative tasks benefit from a state of hypofrontality, in which reduced PFC activation enables more spontaneous, bottom-up thought patterns. Bipolar I patients exhibit disruptions in the frontoparietal control network which reduce top-down constraints (Ramey & Chrysikou, 2014). Mania involves hypofrontality, a “significant attenuation of task-related activation of right lateral orbitofrontal function” that results in disinhibition and distractibility (Altshuler et al, 2005). Further, individuals with ADHD have impaired executive inhibition, which reduces the person’s ability to suppress creative but unconventional ideas – and ADHD patients exhibit improved performance on tasks like the Unusual Uses Test (White & Shah, 2006). All of the C-disorders involve similar neuro-cognitive disinhibitions.

2.4 – The C-Disorders and Creative-Expressive Motivation

Fourth, one important driver of creativity in the C-disorders may simply be motivation: a desire to express and be creative. My personal experience with bipolar has involved strange, unusual, and difficult-to-explain conscious experiences like free-wheeling hallucinations, the sense that my imagination is bleeding into reality, and profound states of inspired joy. This has instilled an intense motivation to try and communicate these experiences and convert the imaginative richness of mania into some real, sharable artifact. Similarly, Tourette’s syndrome is highly correlated with musical creativity, perhaps in part because artistic expression is an enjoyable and effective way to manage tics (Espert et al, 2017). Sacks describes how, for one friend, “the half-convulsive excitement of Tourette’s continually stimulates his perception and imagination, producing a ceaseless stream of extraordinary images” (1992). A rushing river of creative thought can evoke an inspired motivational state that drives people to actualize ideas. Indeed, a desire to act creatively is connected to dopaminergic modulation of a mesolimbic pathway altered in ADHD (Boot et al, 2017). Often, those with C-disorders pursue generative creativity as an autonomous interpretative response to their experiences.

2.5 – A Note on Tourette’s

Tourette’s syndrome (TS) is mentioned sparingly here because it is the least-studied of the three: the most comprehensive review to date called Tourette’s connection to creativity an ‘uncharted topic’ (Colautti et al, 2021). Although it is understudied, this review still shows that TS results in higher generative creativity and is associated with higher openness to experience and divergent thinking. The neural structures implicated in TS correspond to the systems involved in creativity, and “it has been postulated that the excess of dopamine characterizing TS can enhance creative thinking” (Colautti et al). In short, it seems that Tourette’s facilitates rapid mental associations through hypersensitivity in postsynaptic dopamine receptors and reduced executive control via altered PFC circuitry.

“Spectrum Of Tourettes,” by Kevin Gavaghan

3. Conclusion

Viewed holistically, this evidence establishes the initial plausibility of the hypothesis that the C-disorders (TS, BD, and ADHD) involve similar mental and neural mechanisms that result in enhanced generative creativity. Specifically, the disorders are connected to an increased rate of ideational production, augmented divergent thinking, reduced constraints, and higher motivation toward creative expression. These cognitive styles and brain differences form a loosely grouped cluster of traits that are remarkably valuable for the generation-focused initial steps of the creative process, like brainstorming.

While the core aim of this paper is to construct a hypothesis and ground it in existing empirical evidence, the findings reviewed here also have important practical implications. First, the C-disorders are not entirely pathological and have demonstrable and impactful benefits. This provides support for a neurodiversity approach, where psychiatry seeks to support patients with managing their conditions, channeling their creativity, and adjusting to society, rather than trying to ‘cure’ the disorders. However, existing treatments for ADHD, BD, and TS result in a state of diminished creativity that many patients find unpleasant. For instance, lithium, one of the most common medications for bipolar, produces well-documented creativity deficits (Rothenberg, 2001). Antidopaminergic medications for Tourette’s syndrome have also been documented to reduce creativity (Thenganatt & Jankovic, 2016). Psychiatric treatment should not be exclusively oriented toward mitigating all symptoms. Instead, it should aim to enhance the positive and creative features of these disorders, while minimizing the negative symptoms in line with patients’ wishes.

Second, this research suggests that cooperative, neurodiverse communities are essential for a maximally fruitful creative process. This is fundamentally based on the fact that the systems underlying generative and evaluative creativity are rivalrous. Exceptional generative and evaluative creativity, or remarkable talent in both divergent and convergent thinking, are therefore very unlikely to appear simultaneously in a single brain. The best creative solutions will be social. Highly generative, unconstrained thinkers can help break the ice of social norms, shatter conceptual blocks, and produce a gamut of novel ideas, but they will need the help of more structured, analytic, evaluative thinkers to turn the ideas into something valuable. Attempting to confine the creative process to a single individual’s mind is outdated, misguided, and mythologically rather than scientifically rooted. Instead, creativity operates in an extended way through multiple minds and in connection with external tools. Combining generative and evaluative processes through interpersonal synergy mixes together sparks of novelty and value that can light an inferno of creativity.

Photo by Hari Nandakumar. An artistic representation of neurodiverse cooperation?

Works Cited

Adams, J. L. (2019). Conceptual blockbusting: A guide to better ideas. Hachette UK.

Altshuler, L. L., Bookheimer, S. Y., Townsend, J., Proenza, M. A., Eisenberger, N., Sabb, F., … & Cohen, M. S. (2005). Blunted activation in orbitofrontal cortex during mania: a functional magnetic resonance imaging study. Biological psychiatry, 58(10), 763-769.

Baas, M., Nijstad, B. A., Boot, N. C., & De Dreu, C. K. (2016). Mad genius revisited: Vulnerability to psychopathology, biobehavioral approach-avoidance, and creativity. Psychological bulletin, 142(6), 668. https://doi.org/10.1037/bul0000049

Boot, N., Baas, M., van Gaal, S., Cools, R., & De Dreu, C. K. (2017). Creative cognition and dopaminergic modulation of fronto-striatal networks: Integrative review and research agenda. Neuroscience & Biobehavioral Reviews, 78, 13-23. https://doi.org/10.1016/j.neubiorev.2017.04.007

Boot, N., Nevicka, B., & Baas, M. (2017). Subclinical symptoms of attention-deficit/hyperactivity disorder (ADHD) are associated with specific creative processes. Personality and Individual Differences, 114, 73-81. https://doi.org/10.1016/j.paid.2017.03.050

Boot, N., Nevicka, B., & Baas, M. (2020). Creativity in ADHD: goal-directed motivation and domain specificity. Journal of attention disorders, 24(13), 1857-1866. doi: 10.1177/1087054717727352.

Bown, O. (2012). Generative and adaptive creativity: A unified approach to creativity in nature, humans and machines. In Computers and creativity (pp. 361-381). Springer, Berlin, Heidelberg.

Ellamil, M., Dobson, C., Beeman, M., & Christoff, K. (2012). Evaluative and generative modes of thought during the creative process. Neuroimage, 59(2), 1783-1794. https://doi.org/10.1016/j.neuroimage.2011.08.008

Espert, R., Gadea, M., Alino, M., & Oltra-Cucarella, J. (2017). Neuropsychology of Tourette’s disorder: cognition, neuroimaging and creativity. Revista de neurologia, 64(s01), S65-S72.

Ford, D. Y., & Harris, J. J. (1992). The elusive definition of creativity. The Journal of Creative Behavior, 26(3), 186–198. https://doi.org/10.1002/j.2162-6057.1992.tb01175.x

Hadfield, Jeremy (2020). The Conceptual Engineering of Mental Illness. Retrieved 9 June 2021, from https://jeremyhadfield.com/the-conceptual-engineering-of-mental-illness/.

Johnson et al (2012). Creativity and bipolar disorder: touched by fire or burning with questions?. Clinical psychology review, 32(1), 1-12. https://doi.org/10.1016/j.cpr.2011.10.001

Jung, R. E., Grazioplene, R., Caprihan, A., Chavez, R. S., & Haier, R. J. (2010). White matter integrity, creativity, and psychopathology: disentangling constructs with diffusion tensor imaging. PloS one, 5(3), e9818. https://doi.org/10.1371/journal.pone.0009818

Li, W., Li, X., Huang, L., Kong, X., Yang, W., Wei, D., … & Liu, J. (2015). Brain structure links trait creativity to openness to experience. Social cognitive and affective neuroscience, 10(2), 191-198. https://doi.org/10.1093/scan/nsu041

Martz, E., Bertschy, G., Kraemer, C., Weibel, S., & Weiner, L. (2021). Beyond motor hyperactivity: racing thoughts are an integral symptom of adult Attention Deficit Hyperactivity Disorder. Psychiatry Research, 113988. https://doi.org/10.1016/j.psychres.2021.113988

Partridge, D., & Rowe, J. (1994). Computers and creativity. Intellect Books.

Phillips, J., & Cushman, F. (2017). Morality constrains the default representation of what is possible. Proceedings of the National Academy of Sciences, 114(18), 4649-4654. https://doi.org/10.1073/pnas.1619717114

Phillips, J., & Knobe, J. (2018). The psychological representation of modality. Mind & Language, 33(1), 65-94. https://doi.org/10.1111/mila.12165

Phillips, J., Morris, A., & Cushman, F. (2019). How we know what not to think. Trends in cognitive sciences, 23(12), 1026-1040. https://doi.org/10.1016/j.tics.2019.09.007

Poincaré, H. (1908). L’invention mathématique, conférence faite à l’Institut général psychologique. Au siège de la Société.

Quilty, L. C., Sellbom, M., Tackett, J. L., & Bagby, R. M. (2009). Personality trait predictors of bipolar disorder symptoms. Psychiatry Research, 169(2), 159-163. https://doi.org/10.1016/j.psychres.2008.07.004

Ramey, C. H., & Chrysikou, E. G. (2014). “Not in their right mind”: the relation of psychopathology to the quantity and quality of creative thought. Frontiers in psychology, 5, 835. https://doi.org/10.3389/fpsyg.2014.00835

Rothenberg, A. (2001). Bipolar illness, creativity, and treatment. Psychiatric Quarterly, 72(2), 131-147.

Sacks, O. (1992). Tourette’s syndrome and creativity. BMJ: British Medical Journal, 305(6868), 1515.

Thenganatt, M. A., & Jankovic, J. (2016). Recent advances in understanding and managing Tourette syndrome. F1000Research, 5.

White, H. A., & Shah, P. (2006). Uninhibited imaginations: creativity in adults with attention-deficit/hyperactivity disorder. Personality and individual differences, 40(6), 1121-1131.

Van Dijk, F. E., Mostert, J., Glennon, J., Onnink, M., Dammers, J., Vasquez, A. A., … & Buitelaar, J. K. (2017). Five factor model personality traits relate to adult attention-deficit/hyperactivity disorder but not to their distinct neurocognitive profiles. Psychiatry research, 258, 255-261. https://doi.org/10.1016/j.psychres.2017.08.037


The Psychological Representation of Imagination

Imagining plays a key role in thinking about possibilities. Modal terms like “could,” “should,” and “might” prompt us to imagine possible scenarios. I argue that imagination is the first step in modal cognition, as it generates the possibilities for consideration. The possibilities in the consideration set can then be partitioned into a more limited set of relevant possibilities, and ordered on some criteria, like value or probability.[1] Yet even imagination is not free, boundless, and unlimited. There are systematic constraints on imaginings. The three considerations that determine which possibilities are considered — physical possibility, probability or regularity, and morality — also influence which scenarios are imaginable or easier to imagine.

Ultimately, the evidence indicates that imagination uses a representation similar to the psychological representation of modality,[2] and operates under the constraints that apply to modal cognition in general. This paper has two key goals: (1) to strengthen the theory of a common underlying psychological representation of modality by applying it to imagination, and (2) to understand the imagination and its constraints better by illuminating the psychological representation it has in common with modal cognition.

1. Imagination as the Initial Generative Step of Modal Cognition

Modal cognition will be used as an umbrella term for any kind of thinking about possibility, including counterfactual thinking, causal selection, free will judgements, and more. Imagination is a sub-concept under modal cognition, as it is a form of “attention to possibilities.”[3] There are many types of imagination, but we can afford to gloss over most of the distinctions and instead use a broad definition. Imagination is to “represent without aiming at things as they actually, presently, and subjectively are.”[4] In other words, imagination is mental simulation. Since imagination is about non-occurrent possibilities – like fictional scenarios, images of the future, or counterfactuals – it is necessarily modal. But is modal cognition necessarily imaginative? In short: yes.

After all, we cannot represent possibilities based on a single proposition. Merely varying some proposition’s meaning or truth-value is a simple logical process that cannot characterize modal cognition in general, especially the rich kind of modal cognition involved in decision-making, causal judgements, and counterfactual reasoning. In modal cognition, we must conceive of a full scenario and then consider alternatives (possible worlds) for that scenario. This sounds a lot like imagination, which involves representing a situation: “a configuration of objects, properties, and relations.”[5] Considering the ways a captain could have prevented a ship from sinking, for instance, requires mentally simulating this scenario and varying its features to produce alternative possibilities.[6] Modal cognition relies on imagination to represent situations and generate their alternatives.

More precisely, imagination fits into modal cognition as the initial generative step: it produces the possibilities that are later considered and evaluated. This is inspired by the distinction between discriminative models and generative models in machine learning.[7] A discriminative model uses observed variables to identify unobserved target variables – for example, to find the probable causes of sensory inputs. These models often use a hierarchy of mappings between variables to represent an overall input-output mapping. In contrast, a generative model simulates the interactions among unobserved variables that might generate the observed variables. For example, graphics rendering programs can follow a set of processes to simulate a physical environment. Williams (2020) provides detailed evidence showing that both perception and imagination are best described as generative models.
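The two directions can be made concrete with a toy sketch. It is only an illustration under stated assumptions: the two hidden causes, their mean sensor readings, and the function names `generate` and `discriminate` are hypothetical and are not taken from Williams (2020) or any other cited model.

```python
import random

# Hypothetical toy world: a hidden cause produces a noisy sensor reading.
# The causes and their mean readings are made up for illustration.
CAUSE_MEANS = {"cat": 2.0, "dog": 8.0}

def generate(cause, rng):
    """Generative direction: simulate an observation from a hidden cause."""
    return rng.gauss(CAUSE_MEANS[cause], 1.0)

def discriminate(observation):
    """Discriminative direction: map an observation straight to a likely cause."""
    return min(CAUSE_MEANS, key=lambda c: abs(observation - CAUSE_MEANS[c]))

rng = random.Random(0)  # seeded so the simulation is repeatable
simulated = [generate("dog", rng) for _ in range(5)]  # imagined "dog" readings
guessed = [discriminate(x) for x in simulated]        # inferred causes
print(simulated)
print(guessed)
```

The generative function runs from cause to appearance, which is the direction imagination is claimed to work in; the discriminative function runs from appearance back to cause.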

While I will not repeat Williams’s arguments here, treating imagination as a generative model is valuable for a few additional reasons. First, imagination is governed by principles of generation: a set of (implicit or explicit) rules that guide our imaginings.[8] For example, in Harry Potter, “Latin words and wands create magic” is a principle of generation that readers can consistently use to simulate the imagined world. Rather than a graphics rendering program that deterministically yields a given outcome by following certain processes, the imagination generates a set of possibilities guided by the relevant principles of generation. However, imagination, like rendering, is a generative model that uses certain processes to produce (and explain) a set of phenomena.

Second, treating imagination as a generative model explains imaginative mirroring: unless prompted otherwise by principles of generation, our imagination defaults to follow the rules of the real world. If a cup ‘spills’ in an imaginary tea party, the participants will treat the spilled cup as empty, following the physics of reality.[9] In perception, we are always running a generative model of reality, using processes we derive from experience to simulate the physical world and predict its behavior.[10] Imagination involves running a generative model on top of this simulation of reality. Some processes are modified in the imagining, but the ones that are not modified are ‘filled in’ by our default generative model of reality. Further, we quarantine imaginative models from perceptual models, so that events in the imagining are not taken to have effects in the real world – imagined spills do not make the real table wet. Treating imagination as a generative model running separately but based upon a reality-based perceptual model is useful in explaining these effects.
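One way to picture mirroring and quarantining is as a layered lookup. The sketch below is a loose analogy, not a cognitive model: the rule names and the helper `imagine` are hypothetical, and Python's standard `collections.ChainMap` stands in for "an imagining running on top of the default model of reality."

```python
from collections import ChainMap

# Hypothetical default model of reality (rules derived from experience).
reality = {"spilled cup": "empty", "table": "dry", "lion": "roars"}

def imagine(modified_rules):
    """Layer an imagining over reality: modified rules win, the rest is filled in."""
    return ChainMap(dict(modified_rules), reality)

tea_party = imagine({"teapot": "full of pretend tea"})

# Mirroring: an unmodified rule falls through to the reality layer.
print(tea_party["spilled cup"])  # empty

# Quarantining: changes made inside the imagining stay in its own layer.
tea_party["table"] = "wet"
print(tea_party["table"])  # wet
print(reality["table"])    # dry
```

`ChainMap` searches its layers in order and writes only to the first one, so the imagined spill never reaches the `reality` dictionary.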

Finally, the generative model view explains the systematic constraints on imagination and their function. Imaginings are not utterly free and boundless. Rather, imagination changes some aspects of the world, and then unfolds the impacts of these changes in a constrained way based on specific rules of generation. Later, I will show that imagination by default follows our world’s laws of physics and probability. We also resist imaginings that break normative limitations set by morality. If imagination is a generative model, then the constraints are the rules that determine how the generation process is carried out, analogous to rendering algorithms in animations or games. Imagination’s constraints allow it to serve a valuable and adaptive function in generating possibilities relevant to our real world.

In Kratzer semantics, a modal anchor is the element from which a set of possible worlds is projected.[11] In simpler terms, the anchor is the thing held constant in modal projection. For example, in the statement “people could jump off this roof,” the modal anchor is the situation of the roof. We project a domain of possible worlds that all include this roof and determine if people jump off the roof in at least one possible world. Imagination is the cognitive function that carries out modal projection, as it generates the possibilities prescribed by the modal anchor and its context. The modal anchor defines the processes of the generative model. Alternatively, modal anchors correspond to “props” in the philosophy of imagination, where a prop is the thing that prescribes what is to be imagined and the principles of generation to be used in imagining.[12] The modal anchor functions as a linguistic prop, prescribing an imagining that generates a set of possibilities relevant to the anchor.
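A toy version of modal projection, using the roof example, might look like the following. This is a deliberately simplified sketch rather than Kratzer's formal semantics: worlds are plain dictionaries, and the feature names and the helpers `project` and `could` are assumptions introduced for illustration.

```python
from itertools import product

def project(anchor, free_features):
    """Project a domain of possible worlds that all contain the anchor."""
    names = list(free_features)
    worlds = []
    for values in product(*(free_features[n] for n in names)):
        world = dict(anchor)              # the anchor is held constant
        world.update(zip(names, values))  # everything else is free to vary
        worlds.append(world)
    return worlds

def could(worlds, proposition):
    """'Could' holds if the proposition is true in at least one projected world."""
    return any(proposition(w) for w in worlds)

anchor = {"roof": "this roof"}
free_features = {
    "someone_is_on_roof": [True, False],
    "someone_jumps": [True, False],
}
worlds = project(anchor, free_features)

def jumps_off_roof(w):
    return w["someone_is_on_roof"] and w["someone_jumps"]

print(len(worlds))                    # 4
print(could(worlds, jumps_off_roof))  # True
```

Every generated world keeps the roof fixed, and "people could jump off this roof" comes out true because one of the four projected worlds contains someone on the roof who jumps.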

This sets the stage for a comprehensive picture of modal cognition. First, some prop or modal anchor elicits thought about possibilities and triggers the start of the process. Second, imagination acts as a generative model, creating a set of possibilities based on the rules of generation prescribed by the modal anchor. This produces the consideration set, the group of possibilities under consideration. Third, the generated possibilities can then be narrowed down further and partitioned into a relevance set.[13] Finally, the possibilities are ordered according to some criteria, so the possibilities most relevant to the task at hand are ranked the most highly.
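The four steps can be sketched as a small pipeline. The options come from the airport example discussed in section 2.1, but everything else is an assumption made for illustration: the numeric scores are invented, the hand-written lookup in `generate` merely stands in for imagination, and the filter and ranking criteria are one choice among many.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Possibility:
    description: str
    physically_possible: bool
    probability: float  # hypothetical score in [0, 1]
    moral_value: float  # hypothetical score in [-1, 1]

def generate(anchor):
    """Step 2: a stand-in for imagination, producing the consideration set."""
    imagined = {
        "stuck at the airport": [
            Possibility("hail a taxi", True, 0.6, 0.0),
            Possibility("teleport", False, 0.0, 0.0),
            Possibility("sell his car", True, 0.1, 0.0),
            Possibility("sneak onto public transit", True, 0.3, -0.5),
        ]
    }
    return imagined[anchor]

def relevance_set(consideration_set):
    """Step 3: keep possibilities that are physically possible and not immoral."""
    return [p for p in consideration_set
            if p.physically_possible and p.moral_value >= 0]

def rank(relevant):
    """Step 4: order the relevant possibilities, here by probability."""
    return sorted(relevant, key=lambda p: p.probability, reverse=True)

anchor = "stuck at the airport"       # Step 1: the modal anchor
consideration = generate(anchor)
relevant = relevance_set(consideration)
ordered = rank(relevant)
print([p.description for p in ordered])  # ['hail a taxi', 'sell his car']
```

The strict sequencing here is a simplification for readability; nothing in the sketch requires the steps to run as discrete stages in actual cognition.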

While it is conceptually helpful to separate these steps, I do not claim the steps occur in a sequential, discrete order. These components can happen synchronously and are often blurred together. Steps two and three are especially entangled, as I will show that generation through imagination also involves constraints that winnow down the considered possibilities. The rest of this paper will examine step two in detail. I will focus on how the imagination is constrained, and how its constraints indicate that it involves the psychological representation of modality.

2. The Psychological Representation of Modality and Imagination

2.1 Constraints in Modality

A growing body of research shows that a common psychological representation underlies many kinds of thinking about possibilities. Using certain constraints, this representation supports quick, effortless, computationally cheap, and often unconscious modal cognition. The constraints of physics, morality, and probability influence which possibilities are considered relevant.[14] For instance, in counterfactual reasoning, we mostly consider probable events, evaluatively good events, and physically normal events. Evidence also indicates that a common psychological capacity underlies our judgements of moral permissibility and physical possibility.[15] Evaluative concerns and prescriptive norms play an especially critical role in constraining possibilities.

Phillips, Luguri, and Knobe (2015) show that morality plays a key role in limiting the set of relevant possibilities for many types of judgement. For instance, people are less likely to agree that a captain on a sinking ship was forced to throw his wife overboard than that he was forced to throw cargo overboard. With the added support of several other studies, researchers demonstrated this effect occurs because immoral possibilities are considered less relevant. Critically for my thesis, the researchers also showed that prompting participants to generate more possibilities led to significant effects on their judgements.[16] When participants imagined decisions the captain could have made, they were more likely to judge that he was free and not forced. This demonstrates the importance of the initial generative step.

Further, Phillips and Cushman (2017) found that both children and adults under time constraints tend to judge immoral events as impossible.[17] Non-reflective modal judgements are “ought-like,” and exclude immoral possibilities from consideration. Given time to deliberate, adults can differentiate types of modal judgement and make more reasoned judgements about possibility. In this study, participants were presented with events and were asked to judge which events were possible. For example, for a person stuck at an airport, participants were asked if he can hail a taxi, teleport, sell his car, or sneak onto public transit. Importantly, the generative step was performed by the researchers. The participants did not have to imaginatively generate the options. Instead, they were given the options and asked to evaluate their possibility. This skips step 2 of modal cognition, and instead focuses on step 3. However, in most natural situations, we have to generate the available options ourselves.

In general, research on modal cognition overlooks the mechanism that generates possibilities. Existing studies often ask participants to evaluate already-generated possibilities. This experimental design systematically misses the effects of the process that generates possibilities in the first place. One exception is Kushnir and Flanagan (2019), which tested whether a person’s ability to generate possibilities predicted their judgement that they have free will.[18] We tend to judge agents as free when we can represent alternative possibilities for their action. Thus, simply generating more possibilities may lead us to judge that agents are freer. Indeed, this experiment found that children’s fluency in generating ideas predicted their evaluation of their own free will. Performance on a task that involved generating ideas within an imagined world was the best predictor of a child’s judgements: the more fluent the children were in this imagination task, the more likely they were to judge themselves as free.

The researchers speculated that there may be a “direct pathway from idea generation to judgments of choice and possibility.”[19] In my view, the pathway is indirect, as existing research indicates that after possibility-generation we also evaluate the relevance of possibilities and rank them. However, the studies discussed above underscore the importance of the imagination as the initial generative step. The nature and quantity of the generated possibilities has demonstrable impacts on modal judgements. Furthermore, there may be important constraints on this generation process that lead to downstream effects on later processes in modal cognition.

2.2 Constraints in Imagination

The same constraints apply to both modality and imagination. This is surprising, as intuitively imagination seems far freer and more limitless than normal reasoning. We can easily imagine worlds where magic violates physical laws or where improbable events occur often. However, I argue that the default representation of imagination results in resistance to imagining possibilities that violate physical laws, irregular or unlikely possibilities, and immoral or evaluatively bad possibilities. Experimental results reveal that the imaginations of young children are limited by precisely these constraints. Adults are able to deliberately generate more and less constrained possibilities. However, just as adults can treat immoral possibilities as irrelevant, imaginative resistance shows that the adult imagination is inhibited against immoral possibilities. In short, the imagination shows a startling resemblance to the psychological representation of modality.

Investigations of modal cognition often use developmental research to show constraints on children’s reasoning about possibilities, indicating a default representation of modality that is especially visible during early childhood.[20] Similarly, the imaginations of young children (ages 2-8) are surprisingly reality-constrained. Children tend to resist, or fail to generate, impossible and improbable imaginings. When prompted to imagine hypothetical machines, children judge that familiar machines could be real, but are reluctant to imagine possible machines that operate very differently from any object they have regular experience with.[21] Children also protest against pretense that contradicts their knowledge of regularity, expecting imaginary entities to have ordinary properties.[22] Even when pretending, kids expect lions to roar and pigs to oink, and they resist imagining otherwise. 

Furthermore, 82% of the time, children extend fantasy stories with realistic events rather than fantastic events, while adults extend fantasy stories with fantastic events.[23] Young children imagine along ordinary lines even when primed with fantastical contexts, filling in typical and probable causes for fantastical imaginary events.[24] Children show a strong typicality bias in completing fictional stories, favoring additions to the story that match their regular experiences in reality.[25] For example, even if an imagined character can teleport or ride dragons, a young child will say the character gets to the store by walking and arrives at school on a bus. Children’s bias toward adding regular events persisted even after experimental manipulations designed to encourage children to notice a pattern of atypicality in the story.[26] This is surprising: popular wisdom dictates that children are exceptionally and fantastically imaginative. However, this research shows that children have simple, limited, and relatively mundane imaginations that are constrained by regularity, probability, and typical reality. 

The imaginations of young children are not as free & creative as you might expect. (Image source: Kelly Sikkema)

Evaluative concerns are an additional constraint on the imagination. My theory predicts that children’s imaginations will show a bias toward generating evaluatively good possibilities, and a resistance to imagining possibilities that they see as evaluatively wrong. Some studies indicate that this is the case. For example, American children are more likely than Nepalese and Singaporean children to judge that they are free to act against cultural and moral norms.[27] This is likely because children in cultures with stronger or more restrictive evaluative norms find it harder to generate evaluatively wrong possibilities or see these possibilities as relevant. As free will judgements depend on representing alternative possibilities, these children see themselves as less free to pursue possibilities that violate evaluative norms. This means that morality is an additional constraint on the imagination, especially in early childhood. However, more research is needed to validate this hypothesis.

As children develop, the constraints on their imagination relax, leading to less restricted generation of possibilities. Older children are more likely to imagine improbable and physically impossible phenomena.[28] Explicitly prompting children to generate more possibilities leads them to imagine more like older children, producing possibilities less constrained by probability and regularity.[29] This shows that the initial generative step may underlie observed developmental changes in modal cognition. The imaginations of older children generate more total possibilities, including more irregular possibilities, and they are therefore more likely to judge irregular events as possible.

Viewing imagination as a generative model allows productive interpretations of this research. When imagining, young children apply a generative model with the same rules of generation used in perception to produce expectations about reality. This early imagination may use simple constraints and empirical heuristics to allow effortless and rapid generation of possibilities. For instance, if the child regularly encounters an event, they are more likely to imagine this event.[30] In later development and adulthood, the imagination generates possibilities in a more deliberative and analytical way. This suggests a dual process model of imagination.[31] Children may use a more uncontrolled, effortless, and unconscious imagination based on simple heuristics and experience-derived rules of generation. In contrast, adults use a more controlled, effortful and conscious imagination that generates possibilities based on relatively sophisticated and principled rules. 

Although adults can more easily imagine irregular events or events that violate physical laws, the developed imagination is still constrained by moral norms. Imaginative resistance refers to a phenomenon where people find it difficult to engage in prompted imaginative activities. For example, if a fiction prompts us to imagine that axe-murdering is morally good, we resist this imagining. Unfortunately, there are few empirical tests of imaginative resistance. In one study conducted by Liao, Strohminger, and Sripada (2014), participants exhibited resistance to imagining morally deviant scenarios.[32] For example, participants reported difficulty in imagining that it was morally right for Hippolytos to trick Larisa in the Greek myth “The Rape of Persephone,” even though Zeus declared the trickery was morally right.[33] Their imaginative difficulty was significantly correlated with their evaluation that this trickery was morally wrong. This effect was replicated in a second experiment with a different story. The experiments also showed that imaginative resistance was modulated by context and genre. Participants more familiar with Greek myth were less likely to resist imagining that Hippolytos’ trickery was right, and participants were more willing to imagine that child sacrifice is permissible in an Aztec myth than in a police procedural. Context-specific variation in imaginative resistance may explain some of the variation in modal judgements.

Further research has demonstrated the empirical reality of imaginative resistance. In one study, adults were asked to imagine morally deviant worlds, where immoral actions are morally right within the imagined world.[34] Most participants found morally deviant worlds more difficult to imagine than worlds where unlikely events occurred often, but easier to imagine than worlds with conceptual contradictions. Participants classified these morally deviant worlds as improbable, not impossible, although a subset reported an absolute inability to imagine a morally deviant world. Another study employed a unique design to avoid the effects of authorial authority and variation in prompts, asking participants to create morally deviant worlds themselves and describe these imagined worlds in their own words.[35] Participants still exhibited resistance to imagining morally deviant worlds, even when they were the authors of the world. Disgust sensitivity was correlated with imaginative resistance, while need for cognition and creativity were correlated with ease of imagining. Finally, Black and Barnes (2017) constructed an imaginative resistance scale to support future research on this phenomenon and its correlations with individual differences.

Taken as a whole, the research discussed above provides strong support for the view that imagination and thinking about possibilities involve the same psychological representation. This default representation is most visible in early childhood, but it still operates in adulthood, especially under time-constraints or in scenarios involving immoral possibilities. Imaginative resistance shows that the primacy of morality in limiting the imagination corresponds to the primacy of morality in limiting which possibilities are considered relevant. Overall, this shows that generation of possibilities through imagination and evaluations of possibility relevance both involve a common psychological representation that is present at all stages of modal cognition.

2.3 Neuroscience of Imagination & Modal Cognition

This paper primarily aims to describe imagination and modal cognition on Marr’s computational and algorithmic levels of analysis, without delving into the neural implementation. However, any complete model of modal cognition will describe the neural implementational details. Furthermore, an implication of my view is that interactions between imagination and modal cognition will be visible on a neural level. My view would be falsified by evidence that these two processes do not interact or that they involve very distinct neural pathways. As such, the limited review of the neuroscientific evidence below is meant only to establish the plausibility of two key claims: (1) modal cognition involves imagination, and (2) imagination and modal cognition use similar neural mechanisms.

Neuroscientific evidence shows that modal cognition and imagination involve the same neural correlates. There is a growing consensus that remembering the past, imagining the future, and counterfactual thinking all involve similar neural mechanisms in the default mode network (DMN).[36] Several studies show that the DMN is involved in simulating possible experiences, imagining, and counterfactual thinking.[37] At the outset, this indicates that modal cognition and imagination use the same parts of the brain. But more specifically, future-oriented and counterfactual thinking engages the posterior DMN (pDMN), centered around the posterior cingulate cortex.[38] Researchers showed this by asking participants in an fMRI scan to make choices about their present situation, and then prospective choices about their future. Their findings demonstrated that people often engage vivid mental imagery in future-oriented thinking, and that this process activates the pDMN while reducing its connectivity with the anterior DMN. This provides a candidate neural process that underlies imaginative generation of possibilities.

One prominent neuroscientific theory of the imagination. See "The Neurobiology of Imagination: Possible Role of Interaction-Dominant Dynamics and Default Mode Network." 

Furthermore, a key cognitive ability that underlies imagination is prefrontal synthesis (PFS), the ability to create novel mental images. This process is performed in the lateral prefrontal cortex (LPFC), which likely acts as an executive controller that synchronizes a network of neuronal ensembles that represent familiar objects, synthesizing these objects into a new imaginary experience.[39] Children acquire PFS around 3 to 4 years of age, along with other imaginative abilities like mental rotation, storytelling, and advanced pretend play.[40] Similarly, young children tend to lack a distinction between immoral, impossible, and irregular counterfactuals – they often conflate “could” and “should.”[41] While further study is needed, it is plausible that development of PFS is associated with mature modal cognition, making modal distinctions, and generating more sophisticated imaginings. 

3. Conclusion

This essay constructs a broad theory of modal cognition in which imagination plays a critical role. Namely, imagination serves as an initial step that generates the possibilities for consideration for later steps. Imagination is best described algorithmically as a generative model which operates based on rules of generation prescribed by a modal anchor. Furthermore, the evidence discussed in section 2 indicates that imagination and thinking about possibilities both use a default psychological representation with the same fundamental constraints. While this psychological representation is not always visible in adulthood, it is clear in early childhood, and it still has observable effects in adult cognition. The psychological representation of modality and imagination enables us to think about possibilities in rapid, effortless, and useful ways.

This theory also yields testable predictions that could be explored by future empirical research. For example, it predicts that young children will exhibit more imaginative resistance to violations of morality than adults. They will be more likely to classify morally deviant worlds as impossible or show a total inability to imagine these worlds.[42] Under time pressure, adults will exhibit more imaginative resistance, and they will be more likely to imagine valuable scenarios than dis-valuable scenarios – just as people are more likely to generate valuable possibilities under time pressure.[43] Correspondingly, people given more time and opportunity to engage the imagination might exhibit more willingness to imagine morally deviant worlds. With very limited time or significant cognitive pressure, adult imaginations may resemble the imaginations of young children. Finally, individual differences in openness to experience, creativity, and imaginative ability may predict some of the variation in possibility judgements, through differences in the generation of possibilities. For instance, people who naturally generate more possibilities will be more likely to judge agents as free rather than forced.

Existing research has not explicitly drawn this connection between the imagination and the psychological representation of modality. Even if this proposed model is not correct as a whole, I hope this paper can help integrate disconnected research projects on modal cognition and imagination in cognitive science, neuroscience, and philosophy.

Bibliography

Addis, Donna Rose, Alana T. Wong, and Daniel L. Schacter. “Remembering the past and imagining the future: common and distinct neural substrates during event construction and elaboration.” Neuropsychologia 45, no. 7 (2007): 1363-1377.

Barnes, Jennifer, and Jessica Black. “Impossible or improbable: The difficulty of imagining morally deviant worlds.” Imagination, Cognition and Personality 36, no. 1 (2016): 27-40.

Black, Jessica E., and Jennifer L. Barnes. “Measuring the unimaginable: Imaginative resistance to fiction and related constructs.” Personality and Individual Differences 111 (2017): 71-79.

Black, Jessica E., and Jennifer L. Barnes. “Morality and the imagination: Real-world moral beliefs interfere with imagining fictional content.” Philosophical Psychology 33, no. 7 (2020): 1018-1044.

Berto, Francesco. “Taming the runabout imagination ticket.” Synthese (2018): 1-15.

Bowman-Smith, Celina K., Andrew Shtulman, and Ori Friedman. “Distant lands make for distant possibilities: Children view improbable events as more possible in far-away locations.” Developmental psychology 55, no. 4 (2019): 722.

Cook, Claire, and David M. Sobel. “Children’s beliefs about the fantasy/reality status of hypothesized machines.” Developmental Science 14, no. 1 (2011): 1-8.

Cushman, Fiery. “Action, outcome, and value: A dual-system framework for morality.” Personality and social psychology review 17, no. 3 (2013): 273-292.

Gaesser, Brendan. “Constructing memory, imagination, and empathy: a cognitive neuroscience perspective.” Frontiers in psychology 3 (2013): 576.

Goulding, Brandon W., and Ori Friedman. “Children’s beliefs about possibility differ across dreams, stories, and reality.” Child development (2020).

Kind, Amy. “Imagining under constraints.” Knowledge through imagination (2016): 145-59.

Kratzer, Angelika. “Modality for the 21st century.” In 19th International Congress of Linguists, pp. 181-201. 2013.

Lane, Jonathan D., Samuel Ronfard, Stéphane P. Francioli, and Paul L. Harris. “Children’s imagination and belief: Prone to flights of fancy or grounded in reality?” Cognition 152 (2016): 127-140.

Leslie, Alan M. “Pretending and believing: Issues in the theory of ToMM.” Cognition 50, no. 1-3 (1994): 211-238.

Liao, Shen-yi, and Tamar Gendler. “Imagination.” The Stanford Encyclopedia of Philosophy (Summer 2020 Edition). Edward N. Zalta (ed.). <https://plato.stanford.edu/archives/sum2020/entries/imagination/>.

Liao, Shen-yi, Nina Strohminger, and Chandra Sekhar Sripada. “Empirically investigating imaginative resistance.” British Journal of Aesthetics 54, no. 3 (2014): 339-355.

Moulton, Samuel T., and Stephen M. Kosslyn. “Imagining predictions: mental imagery as mental emulation.” Philosophical Transactions of the Royal Society B: Biological Sciences 364, no. 1521 (2009): 1273-1280.

Parikh, Natasha, Luka Ruzic, Gregory W. Stewart, R. Nathan Spreng, and Felipe De Brigard. “What if? Neural activity underlying semantic and episodic counterfactual thinking.” NeuroImage 178 (2018): 332-345.

Pearson, Joel. “The human imagination: the cognitive neuroscience of visual mental imagery.” Nature Reviews Neuroscience 20, no. 10 (2019): 624-634.

Phillips, Jonathan, Adam Morris, and Fiery Cushman. “How we know what not to think.” Trends in cognitive sciences 23, no. 12 (2019): 1026-1040.

Phillips, Jonathan, and Fiery Cushman. “Morality constrains the default representation of what is possible.” Proceedings of the National Academy of Sciences 114, no. 18 (2017): 4649-4654.

Phillips, Jonathan, and Joshua Knobe. “The psychological representation of modality.” Mind & Language 33, no. 1 (2018): 65-94.

Phillips, Jonathan, Jamie B. Luguri, and Joshua Knobe. “Unifying morality’s influence on non-moral judgments: The relevance of alternative possibilities.” Cognition 145 (2015): 30-42.

Schubert, Torben, Renée Eloo, Jana Scharfen, and Nexhmedin Morina. “How imagining personal future scenarios influences affect: Systematic review and meta-analysis.” Clinical Psychology Review 75 (2020): 101811.

Shtulman, Andrew, and Jonathan Phillips. “Differentiating “could” from “should”: Developmental changes in modal cognition.” Journal of Experimental Child Psychology 165 (2018): 161-182.

Shtulman, Andrew, and Lester Tong. “Cognitive parallels between moral judgment and modal judgment.” Psychonomic bulletin & review 20, no. 6 (2013): 1327-1335.

Spreng, R. Nathan, Raymond A. Mar, and Alice SN Kim. “The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: a quantitative meta-analysis.” Journal of cognitive neuroscience 21, no. 3 (2009): 489-510.

Stuart, Michael T. “Towards a dual process epistemology of imagination.” Synthese (2019): 1-22.

Thorburn, Rachel, Celina K. Bowman-Smith, and Ori Friedman. “Likely stories: Young children favor typical over atypical story events.” Cognitive Development 56 (2020): 100950.

Van de Vondervoort, Julia W., and Ori Friedman. “Preschoolers can infer general rules governing fantastical events in fiction.” Developmental psychology 50, no. 5 (2014): 1594.

Van de Vondervoort, Julia W., and Ori Friedman. “Young children protest and correct pretense that contradicts their general knowledge.” Cognitive Development 43 (2017): 182-189.

Vyshedskiy, Andrey. “Neuroscience of imagination and implications for human evolution.” (2019). Preprint DOI: 10.31234/osf.io/skxwc.

Weisberg, Deena Skolnick, and David M. Sobel. “Young children discriminate improbable from impossible events in fiction.” Cognitive Development 27, no. 1 (2012): 90-98.

Weisberg, Deena Skolnick, David M. Sobel, Joshua Goodstein, and Paul Bloom. “Young children are reality-prone when thinking about stories.” Journal of Cognition and Culture 13, no. 3-4 (2013): 383-407.

Williams, Daniel. “Imaginative Constraints and Generative Models.” Australasian Journal of Philosophy (2020): 1-15.

Williamson, Timothy. “Knowing by imagining.” Knowledge through imagination (2016): 113-23.

Winlove, Crawford IP, Fraser Milton, Jake Ranson, Jon Fulford, Matthew MacKisack, Fiona Macpherson, and Adam Zeman. “The neural correlates of visual imagery: A co-ordinate-based meta-analysis.” Cortex 105 (2018): 4-25.

Xu, Xiaoxiao, Hong Yuan, and Xu Lei. “Activation and connectivity within the default mode network contribute independently to future-oriented thought.” Scientific reports 6 (2016): 21001.

  1. Phillips, Jonathan, Adam Morris, and Fiery Cushman, “How we know what not to think,” Trends in cognitive sciences 23, no. 12 (2019): 1026-1040.

  2. Phillips, Jonathan, and Joshua Knobe, “The psychological representation of modality,” Mind & Language 33, no. 1 (2018): 65-94.

  3. Williamson, Timothy, “Knowing by imagining,” Knowledge through imagination (2016): 113-23. Pg. 4.

  4. Liao, Shen-yi and Tamar Gendler, “Imagination,” The Stanford Encyclopedia of Philosophy.

  5. Berto, Francesco. “Taming the runabout imagination ticket.” Synthese (2018): 1-15.

  6. Phillips, Luguri, and Knobe. “Unifying morality’s influence on non-moral judgments: The relevance of alternative possibilities,” Cognition 145 (2015): 30-42.

  7. The difference between discriminative and generative models is (roughly) similar to the distinction between model-free and model-based reinforcement learning – see Cushman (2017).

  8. Walton, Kendall L, Mimesis as make-believe: On the foundations of the representational arts, Harvard University Press, 1990. Pg. 53.

  9. Leslie, Alan M, “Pretending and believing: Issues in the theory of ToMM,” Cognition 50, no. 1-3 (1994): 211-238.

  10. Williams, “Imaginative Constraints and Generative Models,” 2020.

  11. Kratzer, Angelika, “Modality for the 21st century,” In 19th International Congress of Linguists, pp. 181-201. 2013.

  12. Walton, Mimesis as Make-believe, pg. 47.

  13. Phillips, Morris, and Cushman, “How we know what not to think,” (2019).

  14. Phillips and Knobe (2018).

  15. Shtulman, Andrew, and Lester Tong, “Cognitive parallels between moral judgment and modal judgment,” Psychonomic bulletin & review 20, no. 6 (2013): 1327-1335.

  16. This was shown in the second “manipulation” studies for each type of judgement (1b, 2b, 3b, and 4b).

  17. Phillips and Cushman (2017).

  18. Flanagan, Teresa, and Tamar Kushnir, “Individual differences in fluency with idea generation predict children’s beliefs in their own free will,” Cognitive Science, pp. 1738-1744. 2019.

  19. Flanagan and Kushnir, pg. 5.

  20. For instance, see Shtulman, Andrew, and Jonathan Phillips, “Differentiating “could” from “should”: Developmental changes in modal cognition,” Journal of Experimental Child Psychology 165 (2018): 161-182.

  21. Cook and Sobel, “Children’s beliefs about the fantasy/reality status of hypothesized machines,” Developmental Science 14, no. 1 (2011): 1-8.

  22. Van de Vondervoort, Julia W., and Ori Friedman, “Young children protest and correct pretense that contradicts their general knowledge,” Cognitive Development 43 (2017): 182-189.

  23. Weisberg et al, “Young children are reality-prone when thinking about stories,” Journal of Cognition and Culture 13, no. 3-4 (2013): 383-407. Pg. 386.

  24. Lane et al, “Children’s imagination and belief: Prone to flights of fancy or grounded in reality?,” Cognition 152 (2016): 127-140. Pg. 131.

  25. Thorburn, Bowman-Smith, and Friedman, “Likely stories: Young children favor typical over atypical story events,” Cognitive Development 56 (2020): 100950.

  26. Thorburn, Bowman-Smith, and Friedman (2020).

  27. See Chernyak, Kang, and Kushnir (2019) and Chernyak et al (2013).

  28. Lane et al, pg. 6.

  29. See Lane et al, pg. 8; Goulding and Friedman, “Children’s beliefs about possibility differ across dreams, stories, and reality,” Child development (2020); and Bowman-Smith et al, “Distant lands make for distant possibilities: Children view improbable events as more possible in far-away locations,” Developmental psychology 55, no. 4 (2019): 722.

  30. Goulding and Friedman (2020).

  31. Stuart, Michael T, “Towards a dual process epistemology of imagination,” Synthese (2019): 1-22.

  32. Liao, Shen-yi, Nina Strohminger, and Chandra Sekhar Sripada, “Empirically investigating imaginative resistance,” British Journal of Aesthetics 54, no. 3 (2014): 339-355.

  33. Liao, Strohminger, and Sripada (2014), pg. 10.

  34. Barnes and Black (2016), “Impossible or improbable: The difficulty of imagining morally deviant worlds,” pg. 8.

  35. Black, Jessica E., and Jennifer L. Barnes, “Morality and the imagination: Real-world moral beliefs interfere with imagining fictional content,” Philosophical Psychology 33, no. 7 (2020): 1018-1044.

  36. Mullally, Sinéad L., and Eleanor A. Maguire, “Memory, imagination, and predicting the future: a common brain mechanism?” The Neuroscientist 20, no. 3 (2014): 220-234.

  37. Pearson (2019); Gaesser (2013); Addis et al (2007); Spreng et al (2009); and Winlove et al (2018).

  38. Xu, Xiaoxiao, Hong Yuan, and Xu Lei, “Activation and connectivity within the default mode network contribute independently to future-oriented thought,” Scientific reports 6 (2016): 21001.

  39. Vyshedskiy, Andrey. “Neuroscience of imagination and implications for human evolution.” (2019). Preprint DOI: 10.31234/osf.io/skxwc.

  40. Vyshedskiy, “Neuroscience of Imagination.”

  41. Shtulman, Andrew, and Jonathan Phillips. “Differentiating “could” from “should”: Developmental changes in modal cognition.” Journal of Experimental Child Psychology 165 (2018): 161-182.

  42. See Barnes and Black (2016).

  43. Phillips, Jonathan, and Fiery Cushman, “Morality constrains the default representation of what is possible,” Proceedings of the National Academy of Sciences 114, no. 18 (2017): 4649-4654.

Categories
Cognitive Science Essays Philosophy

The Conceptual Engineering of Mental Illness

How could the concept of mental illness be engineered? Should it be abolished, ameliorated, or reformed in some way? Can the existing concept be vindicated? This is a preliminary exploration to scout the territory and identify questions for further research in the conceptual engineering of mental illness. This project is not simply an attempt to characterize the current semantic content of mental illness. The issue is not what we happen to mean, but rather what we should mean, given the concept’s immense roles in our social, political, and scientific practices. Inquiry into mental illness must involve conceptual ethics, not just conceptual analysis.

Therefore, this essay proceeds in three steps: conceptual analysis, conceptual ethics, and conceptual engineering. These steps roughly map onto Thomasson’s pragmatic method for normative conceptual work: (1) reverse engineering the concept to identify its current content and function, (2) identifying the function the concept should fulfill, and (3) actually engineering the concept to better serve this function.[2] Part 1 contains conceptual analysis of mental illness, addressing descriptive issues about the concept’s definition, content, current function, and conceptual history. Part 2 handles normative questions in conceptual ethics, assessing what function mental illness should have and critiquing the existing concept from both epistemic and practical perspectives. Finally, part 3 engages in conceptual engineering, constructing and evaluating a series of ameliorative options.

Mental illness will be underlined when specifically referring to the concept, will be in scare quotes when referring to the lexical item "mental illness," and will be left alone when referring to the colloquial meaning or phenomena of mental illness. 

1. Conceptual Analysis

1.1 What is mental illness?

For the purposes of this essay, “mental illness,” “psychological/psychiatric disorder,” and “mental disorder” will all be considered labels for the same concept, mental illness. These terms vary in connotation but have similar intensions and extensions. Additionally, mental illness is a type concept: it specifies a category that includes many other token concepts, like bipolar disorder and autism. I will abstract away from the token concepts here and concentrate on the broader type concept.[3]

This paper focuses on mental illness as defined by the 5th Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5): a behavioral or psychological pattern in an individual that results in clinically significant distress or disability and reflects an underlying dysfunction.[4] This is a theoretical concept rather than a folk concept,[5] although the theoretical mental illness concept defined in the DSM-5 heavily influences the commonly used folk concept. The DSM-5 also lists several caveats, including that the behavior must not be simply social deviance or an expectable response to events. Mental illness should also have clinical utility, helping clinicians diagnose and treat patients. Essential to this definition is that mental illness is a medical concept, one intended to facilitate treatment in a clinical setting.

This definition prompts several questions. What is dysfunction? Is it deviation from norms, a divergence from evolutionary role, or maybe a harmful difference in neurobiology? The DSM-5 specifies that dysfunction can be psychological, biological, or developmental. But each of these options taken individually suggests different contents. If the dysfunction must be demonstrably biological or developmental, then most existing disorders would be excluded because researchers have not identified their neurobiological basis.[6] The definition states that mental illnesses must reflect an underlying dysfunction, but the DSM-5 does not state the etiology of any listed disorder. How can it then establish that underlying dysfunctions cause the harmful psychological or behavioral patterns? This reflects an inconsistency between the approach of the DSM-5, which does not identify underlying dysfunctions, and the definition of mental illness, which requires an underlying dysfunction.

Further, how much distress or dysfunction is enough to qualify a pattern as a mental illness? It is plausible that a personality trait like openness to experience could lead to significant distress and impairment, as it is strongly associated with harmful risk-taking behaviors.[7] The DSM-5 does not clarify these issues. Perhaps a mental illness must be harmful on balance, or must cause ‘net distress,’ without significant benefits that offset the harms. The effects of personality traits depend on the context, and arguably no trait is on-balance harmful. For example, openness has substantial benefits including higher creativity.[8] Personality traits may also lack clinical utility because they cannot be treated effectively and are difficult to diagnose precisely. Clearly, normative and practical concerns are at play here, not just descriptive and theoretical concerns.

The DSM-5 may be intentionally broad to include many types of dysfunction and distress. Vagueness is not necessarily a problem. After all, the DSM-5 clarifies that mental illness is more of a dimensional concept than a categorical concept: there is a continuous spectrum between pathologies and non-pathologies rather than a rigid distinction.[9]

Further, mental illness is a thick concept: it has both descriptive and normative features. It describes a set of behaviors, psychological conditions, and neurobiological states. But it also contains a normative judgement: these conditions are harmful, non-valuable, or negative, causing distress and dysfunction. A value-neutral account of dysfunction is unachievable, as it requires some normative reasoning to explain why a certain kind of function is more positive or better than others. Some token mental illness concepts may be thicker than others, but all involve evaluative components.[10] Mental illness combines both fact and value, although it may be difficult or impossible to disentangle fact from value.[11]

In sum, this conceptual analysis has shown that mental illness is a type concept, a thick concept, and a dimensional concept. The next section will address the function of the concept in our existing conceptual scheme.

1.2 System function

Thomasson defines system function as the capacity a concept serves in the system it is embedded within.[12] What role does mental illness play in our current system? The DSM-5 specifies that its definition was developed for clinical, public health, and research purposes.[13] Thus, one aspiration of the concept is to improve health and scientific understanding. The concept may serve this role to some extent. However, it also has other current functions.

For instance, mental illness has substantial economic, political, legal, and scientific functions. Over forty thousand psychiatrists in the US rely on the concept to some extent.[14] The global psychiatric market is valued at over $197 billion,[15] while the global market for psychiatric drugs is worth over $88 billion.[16] The DSM itself originated to provide a way for insurance and law to evaluate psychological damages.[17] Furthermore, mental illness is essential to legal concepts like the insanity and diminished capacity defenses, disability evaluations under the ADA,[18] civil competencies, and personal injury lawsuits.[19] The mental illness concept is also indispensable to certain structures of power. Psychiatric power is remarkable in that it seems to even transcend political sovereignty – “madness is, in essence, the ultimate exclusion.”[20] For example, when King George III was diagnosed as insane, he was removed from his authority and placed in isolation.[21] The 25th Amendment also enshrines a provision that could theoretically remove a president diagnosed with a mental illness.[22] Finally, mental illness guides research efforts in psychiatry, sociology, and other scientific fields.

1.3 Does conceptual history matter?

Some of the most compelling critiques of psychiatric concepts have been historical.[23] These genealogies often trace mental illness to defective, objectionable, or harmful origins. However, does the history of a concept matter in evaluating its present form? Some might worry that conceptual history is misguided, commits the genetic fallacy, or merely addresses descriptive issues in history without a normative critique of the concept. After all, concepts in chemistry can be traced to alchemy, but this conceptual history alone is not a meaningful critique of these concepts.

Plunkett (2016) argues that conceptual history can provide descriptive information to evaluate which concepts improve the success of inquiry.[24] If a concept emerged due to irrational, problematic, or contingent historical processes that are not responsive to our aims, this gives us a prima facie reason to worry about the concept—especially if our justification for using the concept relies on its history. Additionally, the past performance of a concept can indicate its value as a representational tool. If concepts have been unjust or unsuccessful in the past, this informs their likelihood to succeed in the present.

Conceptual history is especially important for thick concepts like mental illness, because it allows us to gain distance from the concept and see how ideology or normativity have merged into descriptive concepts. History can also reveal hidden features of a concept, show that the ostensible role of a concept isn’t in line with its actual function, and identify alternate concepts which can serve similar functions. Therefore, conceptual history does matter, especially for thick concepts with complex social histories like mental illness. Delving into this conceptual history can be essential to conceptual analysis, providing key descriptive information.

2. Conceptual Ethics

What ought to be the function of mental illness? It is difficult to create a complete definition of its ideal function. However, the concept should do at least two things. First, it should fulfill an epistemic role: describing and providing knowledge about the phenomena that correspond to mental illness. The concept should be coherent, fruitful, accurate, and predictive, and it should be essential to good explanations of phenomena that need explaining.[25] Second, it should fulfill a normative role: upholding our practical and ethical aims, including promoting well-being, improving public health, and fostering a just society. The epistemic and normative conceptual goals are analogous to the DSM-5’s aims: increasing scientific understanding and public health. At minimum, mental illness should live up to these aims. Furthermore, the concept should be modified or replaced if alternative concepts can better fulfill these functions. This section addresses epistemic and normative criticisms of mental illness.

Simion (2010) argues that the epistemic role of concepts should be prioritized, and concept amelioration should be limited to “revisions that do not result in epistemic loss.”[101] Engineering projects should not leave us with concepts that fail us epistemically. Otherwise we might be left with concepts that are essentially “noble lies,” optimized for positive normative effects but failing to represent reality. Of course, often epistemic deficiencies will lead to a concept’s negative effects. But ultimately some conceptual changes will involve tradeoffs between epistemic and normative benefits, and in these cases there is a strong argument for avoiding epistemic losses.

2.1 Epistemic Critiques

2.1.1 Natural Kind?

Many epistemic critiques revolve around the question of whether mental illness is a natural kind. Broadly put, a natural kind is a grouping that reflects the structure of the natural world rather than just human interests or actions, like the chemical elements.[26] There are several competing notions of what constitutes a natural kind. For simplicity, I will use Dupré’s account, in which a natural kind is not a set that shares a specific essential property, but a dense cluster of properties in the natural world.[27] Whether mental illness is a natural kind is a critical issue in determining the epistemic validity of the concept.

Cooper argues that at least some mental illnesses are natural kinds in the same sense as weeds.[28] Classifying a plant as a weed depends on judging the weed as normatively dis-valuable for one’s purposes (e.g. gardening). But the plants themselves are natural kinds, as they are empirically classified into species based on objective natural properties. Like the weed concept, mental illness depends on a normative judgement of a natural kind. However, the behavioral and neurobiological conditions that correspond to a mental illness can be grouped based on natural properties.

In contrast to Cooper, Hacking argues that mental illness is an interactive human kind.[112] Describing a mental illness results in social processes that alter the very properties under study. The concept changes when it is described, and thus is interactive and not indifferent to its description. Cooper responds that mental illnesses can still be natural kinds even if they are affected by social processes. After all, classifying a bacterial species often leads to treatment efforts that change the bacteria, but this does not imply that the species is not a natural kind. Social processes like changing diagnoses of autism may lead to changes in the symptoms of autism, but this does not change the fact that autism reflects an underlying natural condition with biological causes.

While Cooper’s arguments are valid, she only shows that mental illness as a descriptive phenomenon may be a natural kind. But mental illness is a thick concept: a normative judgement on descriptive phenomena. Perhaps the properties associated with mental illness are grouped closely enough to call this collection of phenomena a natural kind; this is an empirical question that has not yet been settled. But the key point is that these collections of phenomena alone do not constitute a mental illness concept. The normative aspects of mental illness are inevitably social creations, not features of the natural world. Even if certain neurobiological, behavioral, or psychological differences are natural kinds, the mental illness concept remains a social kind.

However, even if mental illness is not a natural kind, it may be a practical kind – a grouping that is useful enough to support effective induction and ground explanations and predictions.[29] For instance, results from taxometric studies, neurobiology, and experimental psychology seem to show that individuals with major depression form a distinct group.[30] If this holds for mental illness in general, it may vindicate the concept. However, critics of this approach argue that statistical clusters of symptoms may simply reflect folk descriptions of distress or common responses and should not be called “illnesses.”[31] Clearly, any resolution to this debate must involve deep empirical and philosophical work.

2.1.2 Scientific Problems

Psychiatrists routinely argue that there are neurobiological dysfunctions like ‘chemical imbalances’ underlying mental illnesses. However, psychiatry has failed to demonstrate that these biological differences exist and are tied to behavioral differences. The largest and most recent umbrella review[32] of biomarkers for mental disorders found that “no convincing evidence supported the existence of a trans-diagnostic biomarker.”[33] Although the DSM-5’s biomedical mental illness concept implies an underlying neurobiological dysfunction, 175 years of research has failed to show a neurobiological basis for any mental illness.[34] Neurobiology is not used in psychiatric diagnosis, and there are no validated clinical tests for mental disorders.[35] Davidson notes psychiatric research is characterized by an “obsession with brain anatomy coupled with the constant admission of its theoretical and clinical uselessness.”[36] Despite ongoing promises, psychiatry has not identified clear etiologies, biological aberrations, or clinical tests for mental illnesses. Thus, the biomedical concept fails to satisfy its own desiderata.

Adding to these deficiencies, mental illness diagnoses are notoriously unreliable. Most DSM categories lack construct validity and have little predictive power.[37] A cascade of studies in the 1970s demonstrated that psychiatrists only agreed upon diagnoses about 50% of the time.[38] A more recent quantitative review of 311 taxometric findings concluded that there was almost no replicated evidence for discrete psychiatric categories.[39] This unreliability can be traced to serious conceptual problems. Mental illnesses are often defined tautologically or incoherently. For instance, psychiatrists may claim that the cause of a person’s mood swings is bipolar disorder, and that the evidence the person has bipolar disorder is her mood swings. This response can only escape tautology if some clear external cause can be identified, like a specific neural aberration, but no mental illness has a firmly identified etiology. Many diagnoses are also extremely vague or ambiguous. For instance, what constitutes “excessive anxiety”? The general concept of mental illness is also vague, as addressed in section 1.1. While some categories may be useful, the current approach to mental illness has not resulted in accurate categorization.

Psychiatry also has a bad track record of modifying mental illness when it fails to describe the world accurately. Despite explanatory failures, a lack of pathological neuroanatomy, and ethical harms, psychiatry retained since-debunked mental illnesses like ‘sexual perversions,’ homosexuality, and female hysteria.[40] These constructs were only scrapped after persistent social pressures from outside psychiatry.[41] This conceptual history riddled with epistemic failures casts some doubt on the validity of the existing understanding of mental illness. While the problematic disorders have been removed, the overarching mental illness concept that resulted in these failures has hardly changed.

Finally, mental illness as defined in the DSM-5 assumes that mental illnesses are relatively universal if not culturally invariant. This results in prioritizing Western understandings of mental health and illness and “homogenizing the way the world goes mad.”[42] However, mental illnesses vary dramatically across cultures, and some clusters of symptoms exist only in specific times or places.[43] In Hong Kong, symptoms of anorexia did not appear until Western psychiatry exported the concept; in Zanzibar, schizophrenia in the American form replaced existing symptoms; in Japan, the Western concept of depression was marketed by multinational pharmaceutical corporations and quickly replaced the indigenous disorder called yuutsu.[44] Ethan Watters’ detailed studies of these phenomena show that “culturally designated pathological states are often the flipside of states a culture values.”[45] Treating certain behavioral patterns as diseases inevitably reflects the norms of specific societies, and mental illness primarily reflects Anglo-American values. Exporting this culturally specific concept may be a form of psychiatric colonialism that results in both epistemic inaccuracies and negative impacts.

2.2 Normative Critiques

2.2.1 Treatment Failures

The epistemic defects of mental illness may impair the success of treatments based on this concept. In line with this prediction, psychiatry has serious practical failures in helping those it intends to treat. The life expectancy for patients with mental illnesses has declined since the 1950s.[46] Suicide rates for patients with schizophrenia have increased by over 10 times.[47] Psychiatric treatment failed to improve outcomes for schizophrenic patients in 37 countries, and 66% of subjects found that antipsychotic medications completely lacked effectiveness.[48] Analysis of data from 1990 to 2015 in high-income countries found that “despite substantial increases in the provision of treatment” the prevalence of mood and anxiety disorders and their symptoms has not decreased.[49] Another large cross-national study found that on five out of six dimensions, mentally ill patients in developed countries had significantly worse outcomes than those in developing countries.[50] Developing countries have less adoption of the mental illness concept addressed here, fewer psychiatrists, and less access to pharmaceutical treatments. The fact that patients have better outcomes in these countries is not a good sign for the mental illness concept, and it casts doubt on psychiatry in general.

The development of psychiatric categories also faces serious methodological problems. Almost 50% of research on drugs is ghostwritten by non-experts or otherwise abnormally written.[104] In many psychiatry journals, more than 90% of authors receive research funding from drug companies.[105] Furthermore, 70% of the DSM-5 task force members had direct ties to the pharmaceutical industry.[106] It is also hard to argue that a 480% increase in the number of mental disorders over fifty years is merely the result of rigorous and unbiased scientific discovery.[107] Given the rapid growth of mental illness diagnosis and treatment “we may soon reach a point when it is statistically deviant not to be taking one of these medications,” and strange to not be diagnosed with a mental illness.[108] Can we trust mental illness categories to adequately describe the world when their development is so influenced by these factors?

However, some treatments for mental illnesses may be effective. For instance, one review of 94 meta-analyses compared psychiatric drugs to medical drugs and found that psychiatric medications were not generally less effective.[51] Lithium, for example, was associated with a reduction in bipolar relapse rates from 61% to 40%. Ultimately, whether or not psychiatric treatments are effective is a difficult empirical question that cannot be resolved here. However, psychiatry’s effectiveness is certainly not spectacular, and its remarkable failures cast doubt on the value of psychiatric concepts.

2.2.2 Social Costs: Oppression, Stigmatization, Marginalization

Mental illness may also have serious ethical harms that justify revising or rejecting the concept. For example, people judged as mentally ill can be involuntarily committed and are often deprived of freedom in psychiatric wards.[52] The concept is also often used to deny employment, legal rights, equal treatment, and epistemic status to those who are seen as mentally ill. In this way, classifying people as mentally ill may function as a mechanism of social control, “a cunning way of excluding certain people or certain patterns of behavior.”[53] Perhaps the concept gives pseudo-medical authority to practices of ostracism and moral condemnation.[54] Some argue that the concept should be changed or abolished to prevent these normative harms.

Some argue madness is fundamentally a failure to coordinate one’s behavior correctly with society, or a failure to conform to social and economic norms. Under this view, the DSM is a device to evaluate and improve the administration of human capital, and to predict “risks connected to the future exploitation of such capital.”[109] Psychiatry often “provides concepts and languages for marketers to use,” and the mental illness concept is essential for the pharmaceutical industry, which often “markets diseases in the expectation that sales of the pills will follow.”[110] Major depression linked a wide range of common symptoms to a purported natural kind, which was an “enormously profitable gift to the pharmaceutical industry,” making SSRIs the bestselling drug category in the US, with almost 10% of the population using them.[111] If it is the case that mental illness functions to justify arbitrary discrimination against those who infringe socio-economic norms, then it may not be a concept worth keeping.

Mental illness does often serve to legitimate the rejection, dismissal, or marginalization of ‘mentally ill’ people. This often employs weaponized uses of the mental illness concept, like “crazy,” “insane,” “loony,” and at least 250 other stigmatizing labels.[55] Bolinger argues that these terms are slurs, as they insult the target based on their group membership, reinforcing “the assumption that people with mental illnesses ought to be generally dismissed as epistemic agents” and representing mentally ill people as deserving bad treatment.[56]

Stigmatization is a major cost of the concept of mental illness. Internalized stigma explained 74% of the variance in suicide risk for individuals with schizophrenia,[57] and correlates with higher symptom severity.[58] Even after multivariate analysis, internalized stigma is associated with more suicidal ideation, suicidal risk, number of suicide attempts, and depression.[59] A longitudinal research design also found that self-stigma was significantly associated with suicidal ideation.[60] Education on mental illness does not improve outcomes – it tends to worsen them. Psychoeducational programs are associated with increased suicidality, and awareness of illness is related to suicide risk.[62] Adolescents who self-labeled as mentally ill had higher ratings of self-stigma and depression.[61] Another study found that developing insight into having a mental illness increased depression.[63] Mental illness serves to promote stigmatizing views, and therefore this concept may be more harmful than helpful.

Leslie argues that certain linguistic constructions like generic concepts can encourage essentializing social kinds, leading to both cognitive mistakes and harmful stereotyping.[64] In line with this argument, extensive surveys and experiments have shown that essentialist thinking about mental illness is linked to stigma. Both laypeople and clinicians tend to believe that mental disorders are discrete, biologically based, and have inherent causes and properties, showing that essentialism dominates psychiatry and folk thinking.[65] People who endorse the biomedical mental illness concept distance themselves more from those seen as mentally ill, perceive them as more dangerous, have lower expectations of their recovery, and show more punitive behavior.[66]

Finally, a key idea of Hacking’s work is that “people spontaneously come to fit their categories,” and categorization creates new kinds of people.[67] It is not just that ‘what is measured can be managed,’ but what is measured can be created. For example, Hacking shows that the classification of multiple personality disorder in 1875 created a rush of people who exhibited the syndrome.[68] Diagnostic categories also create corresponding identities. People tend to ‘have’ physical illnesses but ‘be’ mental illnesses. For example, diagnosed individuals have extreme difficulty de-labeling from psychiatric disorders like “bipolar,” “anorexic,” or “OCD.”[69] As one patient said, “we start to define ourselves in a way that’s hard to break because we really believe that’s who we are.”[70] This may lead persons to adopt a ‘sick role’ that hinders their recovery and flourishing.

Diagnosed individuals tend to understand their own behavior in terms of dysfunction, and often identify as disordered their entire lives. People adapt to the concepts used to represent them. For instance, oppositional defiant disorder stigmatizes defiance as an illness, resulting in discipline practices that disproportionately harm young Black men – and if ODD is an interactive kind, those diagnosed with the disorder may “respond to their classification by exhibiting closer approximations to it.”[71] Clearly, mental illness can create group identities or new kinds of people. If this identity-creation has negative results, this is a reason to reject or modify mental illness.

3. Conceptual Engineering

3.1 Why engineer mental illness?

Mental illness is uniquely amenable to conceptual engineering. First, it is easier to engineer than many other concepts. Unlike concepts like woman, the meaning of mental illness is heavily influenced by a central body (the DSM-5), and thus its intension can be more easily changed by convincing the central body to revise its definition. Mental illness is no stranger to conceptual engineering. Previous efforts have successfully changed the concept, e.g. modifying the intension to exclude social deviances and removing homosexuality from the extension. In the 1950s the Renard School of psychiatry helped restructure mental illness from a psychoanalytic to a biomedical concept.[72] Of course, we should try to improve even the most difficult-to-change concepts if we have good normative reasons to do so. But mental illness is low-hanging fruit that can serve as a proving ground for conceptual engineering efforts.

Additionally, as argued in section 2.1.1, mental illness is more of a human kind than a natural kind. Natural kinds can retain meaning despite changes in use. For instance, the extension and use of “number” have changed to include imaginary numbers and more, but the meaning of “number” itself remains the same.[73] If it is true that natural kinds have non-plastic meanings, they may be difficult to re-engineer. Human kinds are more tractable for engineering projects because their meaning is largely defined by their use in social contexts. As Simion argues, “when it comes to concepts representing social rather than natural kinds, by conceptually engineering, we would be, in effect, changing the world.”[74] Insofar as mental illness is a social/human kind, and language is constitutive of social reality, changing the concept may change the world itself. But this is a double-edged sword: changing the concept may also require changing the structures of the social world (reality engineering).[75]

This project also has importance beyond mental illness. As Cappelen points out, a general theory of conceptual engineering can guide specific projects, and these practical projects can inform the theory.[76] Exploring or implementing changes can improve our understanding of how conceptual engineering works in practice. It can also uncover issues and approaches that apply to other conceptual engineering initiatives. These features of mental illness make it a vital area for conceptual engineering.

Proposals to modify mental illness will generally fall into three categories that correspond to Cappelen’s varieties of ameliorative strategies.[77] First, abandonment proposals argue the concept should be eliminated entirely. Second, meaning change proposals argue for keeping the lexical item “mental illness” while its meaning is revised. Third, some proposals argue that both the lexical item and the meaning of mental illness should be revised. I do not exhaustively survey possible proposals but construct examples of each type.

All conceptual engineering proposals, especially the third type, will tangle with difficult issues in topic continuity. If mental illness is altered, how do we know if we are still addressing the same idea and have not simply changed the subject? Conceptual engineers can use several replies to the topic continuity objection that are addressed elsewhere.[78] The proposals below are united in that they (a) address the same existing mental illness concept, and (b) attempt to fulfill the function of this concept in better ways. More radical proposals not discussed here may argue that inquiry into mental illness should be abandoned entirely, its function completely discarded.

3.1 Abandonment

3.1.1 Complete abolition

Advocates of abolition might argue that the concept’s epistemic deficits and normative harms are so substantial that we would be better off without it. These approaches may or may not provide an alternative to fill the conceptual vacuum. However, these proposals will face serious challenges. Without mental illness, how can the functions of this concept be pursued? How can psychiatric inquiry proceed? Will those with neurological or mental disorders be left without hope of treatment? These challenges are daunting enough that very few propose the abolition of mental illness.

However, in Abolishing the Concept of Mental Illness, Richard Hallam takes these challenges on. He argues that “psychiatry does not have to base itself on a presumption of pathology,” and that “if the concept of mental illness were to be abolished, our response to woes would have to be thought through anew.”[79] Mental and behavioral differences should be referred to in “a more neutral way” that allows individuals to construct “non-illness identities.”[80] These differences can still be studied and treated (if the individual chooses). However, they should not be pathologized. Abolition may therefore avoid many of the harms of stigmatization and negative identity-creation, while allowing scientists to study and develop treatments for neural differences in a less biased way.

3.1.2 Abolish overarching concept, keep (some) sub-concepts

Others argue that individual diagnostic categories are worth keeping, but we do not need a type concept mental illness and it should be abolished. After all, medicine does not need to define a unitary and generalized disease concept to effectively study and treat specific physical ailments.[81] Can a single representation of mental illness really be useful in the immense variety of contexts it is applied to? As Jaspers writes, “we do not need the concept of ‘illness in general’ at all and we now know that no such general and uniform concept exists.”[82] As such, psychiatry should jettison the abstract, all-encompassing definition of mental illness and the finite lists of illnesses grouped under this concept. Individual token concepts, like autism, will only be kept if they are valuable.

Some type concepts under the overarching mental illness concept might be particularly problematic. For example, Charland argues that cluster-B personality disorders are filled with moral judgements masked by clinical descriptive language.[102] Antisocial and narcissistic personality disorders, for instance, are defined by clearly normative concepts like dishonesty and recklessness. They require essentially moral treatment that changes the individual’s moral character. Perhaps concepts like narcissistic personality disorder should be abolished entirely. Instead of treating these conditions like biomedical concepts, they should simply be described with moral concepts. However, clusters A and C are less normative, as they are defined by descriptive empirical conditions. For example, schizoid personality disorder is specified by anhedonia, lack of close friends, and solitary activities — qualities that are in principle empirically observable. These concepts may be kept. This example shows how it could be possible to eliminate or alter token concepts under the overarching mental illness concept, without altering the type concept itself.

Proponents list several benefits for this kind of conceptual move. First, abolishing the overarching concept may have epistemic benefits, allowing researchers to accurately represent the natural world and make progress in understanding specific mental conditions. While individual mental disorders like bipolar and autism may be natural kinds, the mental illness concept itself is not a natural kind, as it is a collection of distinct conditions with no defining natural properties in common.[83] Grouping phenomena into mental illness might be useful if this category allowed us to see high-level patterns, but this is not the case; there are no general patterns or features that unite all these mental conditions. Scientific advances in psychiatry, neurobiology, and genetics indicate that there are “inherently fuzzy boundaries between disorder and non-disorder.”[84] Instead, this overarching concept may encourage generalizations and bad inferences about all of its sub-concepts.

Second, as section 2.2 shows, grouping people under mental illness also allows for oppression and stigmatization. Removing the overarching concept could help prevent the generalization that sustains these harmful social effects.

3.2 Keep lexical item, change meaning

3.2.1 Haslangerian Amelioration

Haslanger argues that we should change the meaning of certain concepts to achieve ethical aims like social justice. The key question is “whether tracking, communicating, and coordinating around” the concept is a good idea.[85] Given the concept’s role in oppression, perhaps we could construct an ameliorative new definition of mental illness in a Haslangerian fashion:

A group G is “mentally disordered” or “mentally ill” (in context C) iff_df G’s members exhibit similar behaviors, thoughts, or psychologies (in C); are subject to negative treatments including but not limited to subordinate status, reduced agency, and ignored speech and thought; and the members are “marked” by the dominant ideology (in C) as a target for these negative treatments by neurobiological or behavioral features presumed to be evidence of diminished or flawed mental capacities.

Would this definition be emancipatory? At the very least, this definition reveals “features of our meanings that we were mostly unaware of,”[86] as it exposes an ideology of marginalizing the mentally ill. Perhaps this new concept “cuts at the social joints” more effectively by explaining how a group is oppressed based on certain marks.[87] This amelioration might also reduce oppression, as “mentally ill” would no longer imply that someone is less deserving of equal treatment, but rather that they happen to be marginalized based on a mental feature. It would also allow the “mentally ill” to organize around the shared condition of being oppressed by sanism[88] or ableism. What needs changing is not the individual, but the social structures that oppress and fail to accommodate the individual.

This proposal is vulnerable to many of the same criticisms that have been aimed at Haslanger’s projects. First, this amelioration may be a topic change—we are no longer talking about mental illness. Second, this amelioration is extremely difficult to achieve. Why fight two battles: (a) showing how a group is oppressed, and (b) attempting to change the use of words for this group in counter-intuitive ways?[89] Under (a), instead of revising the concept we can improve our understanding of the existing concept, realizing that mental illness functions to oppress and marginalize certain groups. It seems simpler to only attempt (a), and perhaps more effective.

Finally, a critic might respond that mental illness is not analogous to race and gender, because it is actually the case that people deserve different treatment (e.g. less epistemic trust) based on certain mental features. For instance, if a person has severe brain damage, or is currently in schizophrenic psychosis, perhaps we shouldn’t give their statements exactly the same weight as those of a normal epistemic agent. However, it seems better to evaluate statements based on their merits rather than the issuing agent. And often mental illness is simply applied to agents whose speech one would like to reject.

3.2.2 Descriptive Reformulation

The descriptive reformulation project argues that we should revise all mental illnesses so that they depend entirely on nonmoral concepts and conditions that can be identified empirically.[90] Under this project, mental illness would essentially be used to refer to physical illnesses of the brain and nervous system that lead to dysfunction that can be described in an evaluatively neutral way—e.g. without appealing to social norms or moral standards. Advocates claim this project can both avoid normative judgement and set psychiatry on stronger scientific and epistemic grounds.

For instance, some researchers argue that psychiatry should adopt a ‘stratified medicine’ approach toward mental illness, aimed at identifying biomarkers or cognitive tests that stratify each mental disorder phenotype “into a finite number of treatment-relevant subgroups.”[91] Some major recent projects have attempted to create strictly biological classifications of mental disorders which do not map onto existing DSM-5 diagnoses.[92] This project may argue that if a candidate “mental illness” does not correspond to an identified neurobiological dysfunction, then it is not a mental illness.

The primary objection to this project is that it is not possible. First, scientific evidence casts doubt on the existence of descriptive properties like biomarkers that can qualify something as a mental illness. Second, there is no way to call something a “mental illness” without using normative concepts of some kind. Even in medicine, health requires a standard of well-being or functioning that requires normative judgements. Some proponents may argue that dysfunction can be evaluated descriptively. For example, perhaps we can identify evolutionary dysfunctions that correspond to mental illnesses. But this still involves normatively disvaluing evolutionary dysfunction. Furthermore, many mental illnesses have adaptive benefits,[93] and evolution alone cannot entail that any particular use of a trait is ‘more functional.’ It seems that any evaluation of dysfunction requires normativity.

However, this project could be salvaged by altering it slightly. Perhaps we should abandon the normative notion of dysfunction as well. Psychiatry should instead develop value-neutral classifications of behavioral, psychological, and neurobiological conditions, each associated with treatments that individuals can select if they choose.[94] None of these conditions would be considered dysfunctions or classified into illnesses.

3.3 Change both lexical item and meaning

3.3.1 Replace with Reclaimed Term

Some may argue that abnormal psychologies are not negative, and thus should not be called ‘mental illnesses.’ As one schizophrenic individual wrote:

“I consider myself the luckiest of individuals and I am most pleased with this mind…My life is an adventure, not necessarily safe or comfortable, but at least an adventure.”[95]

Many ‘mentally ill’ people agree. Advocates of this revision argue that just as being disabled is not having a “broken or defective body,” but simply a minority body, perhaps ‘mental illness’ is just having a minority brain.[96] This is not a bad-difference, but a mere-difference. For disabled people, it is the “experience of being disabled that is itself constitutive of some of the goods in their lives.”[97] In the same way, mental illness can be essential to certain goods—for instance, “Madness might represent another possible way of seeing.”[98]

Thus, some advocate a neutral or positive concept for these states. First, the revisionist could keep “mental illness,” changing its meaning so that it has no negative evaluation or connotation. For example, terms like “queer” and “crip” were reclaimed not by changing who the term applied to, but by changing the “affective, expressive component in the concept.”[99] However, this kind of revision is difficult when the term “illness” carries almost inbuilt negative evaluations and connotations. Second, the revisionist could abandon “mental illness,” and replace it with a new lexical item with a new meaning. This concept could be (1) a reclaimed term with existing negative connotations, like “mad,” “crazy,” or “insane,” (2) a currently positive or neutral term like “shaman” or “neurodivergent,” or even (3) a neologism. Replacing mental illness with a more positive conception may improve our social practices towards psychological difference.

4. Conclusion

Conflicts over the meaning of mental illness are proxy battles, the linguistic site of an underlying struggle over the purposes of psychiatry. The concept has immense impact. Falling within the extension of mental illness can enable access to treatment and insurance, legal protection, and entry to support and advocacy groups. It can also lead to involuntary commitment, social stigma, and exclusion. Given its significance, ensuring that mental illness fulfills our epistemic and ethical aims is critical. If the concept is defective, it could lead scientific efforts astray; if it has negative ethical effects, revising, replacing, or even abandoning it could help prevent harm.

Most researchers recognize that the concepts and terms of psychiatry can be revised: “as a linguistic sign, madness becomes available for our critical manipulation.”[100] What is not clear is how these concepts function currently, what they should mean, and how we can change them. Through the process of conceptual analysis, conceptual ethics, and conceptual engineering, this essay explores these issues. By introducing the fruitful methodology of conceptual engineering to psychiatry, philosophers can develop and clarify their descriptions, critiques, and proposals for conceptual improvement.

Bibliography

“Psychiatrists Market By Segmentation (Mental Disorder Type, Patient Type, Geography), By Trends, By Restraints, By Drivers, By Major Competitors – Global Forecasts To 2023.” The Business Research Company. January 2020.

Ahn, Woo-kyoung, Elizabeth H. Flanagan, Jessecae K. Marsh, and Charles A. Sanislow. “Beliefs about essences and the reality of mental disorders.” Psychological Science 17, no. 9 (2006): 759-766.

American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). American Psychiatric Pub, 2013. Pg. 671.

Banicki, Konrad. “Personality disorders and thick concepts.” Philosophy, Psychiatry, & Psychology 25, no. 3 (2018): 209-221.

Beck, Angela J., Cory Page, J. Buche, Danielle Rittman, and Maria Gaiser. “Estimating the Distribution of the US Psychiatric Subspecialist Workforce.” Ann Arbor: University of Michigan School of Public Health Workforce Research Center (2018).

Bird, Alexander, and Emma Tobin. “Natural kinds.” Stanford Encyclopedia of Philosophy (2008).

Bolinger, Renee (forthcoming). The Language of Mental Illness. In Justin Khoo & Rachel Katharine Sterken (eds.), Routledge Handbook of Social and Political Philosophy of Language. Routledge.

Boyd, Jennifer E., Emerald P. Adler, Poorni G. Otilingam, and Townley Peters. “Internalized Stigma of Mental Illness (ISMI) scale: a multinational review.” Comprehensive Psychiatry 55, no. 1 (2014): 221-231.

Brückl, Tanja M., Victor I. Spoormaker, Philipp G. Sämann, Anna-Katharine Brem, Lara Henco, Darina Czamara, Immanuel Elbau et al. “The biological classification of mental disorders (BeCOME) study: a protocol for an observational deep-phenotyping study for the identification of biological subtypes.” BMC psychiatry 20 (2020): 1-25.

Cappelen, Herman. Fixing language: An essay on conceptual engineering. Oxford University Press, 2018.

Carballo, Alejandro Pérez. “Conceptual evaluation: epistemic.” In Alexis Burgess, Herman Cappelen & David Plunkett (eds.), Conceptual Ethics and Conceptual Engineering. Oxford, UK: Oxford University Press (2020). Pg. 304-332.

Carvalho, André F., Marco Solmi, Marcos Sanches, Myrela O. Machado, Brendon Stubbs, Olesya Ajnakina, Chelsea Sherman et al. “Evidence-based umbrella review of 162 peripheral biomarkers for major mental disorders.” Translational Psychiatry 10, no. 1 (2020): 1-13.

Colton CW, Manderscheid RW (2006). Congruencies in increased mortality rates, years of potential life lost, and causes of death among public mental health clients in eight states. Prevention of Chronic Disease. www.cdc.gov/pcd/issues/2006/apr/05_0180.htm.

Cooper, Rachel. Classifying Madness: A Philosophical Examination of the Diagnostic and Statistical Manual of Mental Disorders. Vol. 86. Springer Science & Business Media, 2006.

Cosgrove, Lisa, and Harold J. Bursztajn. “Toward credible conflict of interest policies in clinical psychiatry.” (2009).

Davidson, Arnold I. “Diseases of sexuality and the emergence of the psychiatric style of reasoning.” Meaning and Method: Essays in Honor of Hilary Putnam (1990): 295.

Demazeux, Steeves, and Patrick Singy. The DSM-5 in Perspective. New York, NY: Springer, 2015. http://dx.doi.org/10.1007/978-94-017-9765-8.

Drescher, Jack. “Out of DSM: Depathologizing homosexuality.” Behavioral Sciences 5, no. 4 (2015): 565-575.

Dupré, John. “Natural kinds and biological taxa.” The Philosophical Review 90, no. 1 (1981): 66-90.

Foucault, Michel, Peter Stastny, and Deniz Şengel. “Madness, the absence of work.” Critical inquiry 21, no. 2 (1995): 290-298.

Foucault, Michel. Madness and civilization: A history of insanity in the age of reason. Vintage, 1988.

Goldberg, Ann. Sex, Religion, and the Making of Modern Madness: The Eberbach Asylum and German Society, 1815-1849. Oxford University Press on Demand, 1999.

Greenough, Patrick. “Conceptual Engineering via Reality Engineering.” Unpublished, under review. 2020.

Hacking, Ian. “Historical ontology.” In the Scope of Logic, Methodology and Philosophy of Science, pp. 583-600. Springer, Dordrecht, 2002.

Hacking, Ian. Mad travelers: Reflections on the reality of transient mental illnesses. University of Virginia Press, 1998.

Hacking, Ian. Rewriting the soul: Multiple personality and the sciences of memory. Princeton University Press, 1998.

Harper, Marjory, ed. Migration and Mental Health: Past and Present. Springer, 2016.

Read, John, and Niki Harré. “The role of biological and genetic causal beliefs in the stigmatisation of ‘mental patients’.” Journal of mental health 10, no. 2 (2001): 223-235.

Haslam, N. “Genetic essentialism, neuroessentialism, and stigma: Comment on Dar-Nimrod and Heine.” Psychological Bulletin 17: 819 – 824 (2011).

Haslam, Nick, Elise Holland, and Peter Kuppens. “Categories versus dimensions in personality and psychopathology: a quantitative review of taxometric research.” Psychological medicine 42, no. 5 (2012): 903-920.

Healy D, Harris M, Tranter R, Gutting P, Austin R, Jones-Edwards G, Roberts AP (2006). Lifetime suicide rates in treated schizophrenia: 1875–1924 and 1994–1998 cohorts compared. British Journal of Psychiatry 188, 223–228.

Healy D, Savage M, Michael P, Harris M, Hirst D, Carter M, Cattell D, McMonagle T, Sohler N, Susser E (2001). Psychiatric bed utilisation: 1896 and 1996 compared. Psychological Medicine 31, 779–790.

Healy, David, and Michael E. Thase. “Is academic psychiatry for sale?” The British Journal of Psychiatry 182, no. 5 (2003): 388-390.

Healy, David. Mania: A short history of bipolar disorder. JHU Press, 2008.

Horwitz, Allan V., and Jerome C. Wakefield. The loss of sadness: How psychiatry transformed normal sorrow into depressive disorder. Oxford University Press, 2007.

Howard, Jenna. “Negotiating an exit: Existential, interactional, and cultural obstacles to disorder disidentification.” Social Psychology Quarterly 71, no. 2 (2008): 177-192.

Jablensky, Assen, Norman Sartorius, Gunilla Ernberg, Martha Anker, Ailsa Korten, John E. Cooper, Robert Day, and Aksel Bertelsen. “Schizophrenia: manifestations, incidence and course in different cultures A World Health Organization Ten-Country Study.” Psychological Medicine Monograph Supplement 20 (1992): 1-97.

Jablensky, Assen. “Does psychiatry need an overarching concept of ‘mental disorder’?” World Psychiatry 6, no. 3 (2007): 157.

Jaspers, Karl. General psychopathology. Vol. 2. JHU Press, 1997.

Kapur, Shitij, Anthony G. Phillips, and Thomas R. Insel. “Why has it taken so long for biological psychiatry to develop clinical tests and what to do about it?.” Molecular psychiatry 17, no. 12 (2012): 1174-1179.

Karagianis, Jamie, D. Novick, Jan Pecenak, Josep Maria Haro, M. Dossenbach, T. Treuer, W. Montgomery, R. Walton, and A. J. Lowry. “Worldwide‐Schizophrenia Outpatient Health Outcomes (W‐SOHO): baseline characteristics of pan‐regional observational data from more than 17,000 patients.” International Journal of Clinical Practice 63, no. 11 (2009): 1578-1588.

Kincaid, Harold, and Jacqueline A. Sullivan, eds. Classifying psychopathology: Mental kinds and natural kinds. MIT Press, 2014.

Kingdon, David, and Allan H. Young. “Research into putative biological mechanisms of mental disorders has been of no value to clinical psychiatry.” The British Journal of Psychiatry 191, no. 4 (2007): 285-290.

Kirk, Stuart A., David Cohen, and Tomi Gomory. “DSM-5: The delayed demise of descriptive diagnosis.” In The DSM-5 in perspective, pp. 63-81. Springer, Dordrecht, 2015.

Lam, Danny CK, Paul M. Salkovskis, and Hilary MC Warwick. “An experimental investigation of the impact of biological versus psychological explanations of the cause of ‘mental illness’.” Journal of Mental Health 14, no. 5 (2005): 453-464.

Leucht, Stefan, Sandra Hierl, Werner Kissling, Markus Dold, and John M. Davis. “Putting the efficacy of psychiatric and general medicine medication into perspective: review of meta-analyses.” The British Journal of Psychiatry 200, no. 2 (2012): 97-106.

McPherson, Tristram, and David Plunkett. “Conceptual ethics and the methodology of normative inquiry.” Conceptual Engineering and Conceptual Ethics. 2020.

Mehta, S., and A. Farina. “Is being sick really better? Effect of the disease view of mental disorder on stigma.” Journal of Social and Clinical Psychology 16: 405 – 419 (1997).

Moses, Tally. “Self-labeling and its effects among adolescents diagnosed with mental disorders.” Social Science & Medicine 68, no. 3 (2009): 570-578.

Oexle, Nathalie, Nicolas Rüsch, Sandra Viering, Christine Wyss, Erich Seifritz, Ziyan Xu, and Wolfram Kawohl. “Self-stigma and suicidality: a longitudinal study.” European archives of psychiatry and clinical neuroscience 267, no. 4 (2017): 359-361.

Patel, Vikram, Shekhar Saxena, Crick Lund, Graham Thornicroft, Florence Baingana, Paul Bolton, Dan Chisholm et al. “The Lancet Commission on global mental health and sustainable development.” The Lancet 392, no. 10157 (2018): 1553-1598.

Phelan, Jo C. “Geneticization of deviant behavior and consequences for stigma: The case of mental illness.” Journal of health and social behavior 46, no. 4 (2005): 307-322.

Plunkett, David. “Conceptual history, conceptual ethics, and the aims of inquiry: a framework for thinking about the relevance of the history/genealogy of concepts to normative inquiry.” Ergo, an Open Access Journal of Philosophy 3 (2016).

Pols, Jan. “The Politics of Mental Illness: Myth and Power in the Works of Thomas S. Szasz.” Trans. Mira de Vries (1984/2005). Nijmegen, 1976. Pg. 178.

Preston, Beth. 1998. Why is a Wing like a Spoon? A Pluralist Theory of Function. The Journal of Philosophy 95 (5):215–54.

Prinzing, Michael. “The revisionist’s rubric: conceptual engineering and the discontinuity objection.” Inquiry 61, no. 8 (2018): 854-880.

Putnam, H. (2002). The collapse of the fact/value dichotomy and other essays. Cambridge, MA: Harvard University Press.

Rose, Diana, Graham Thornicroft, Vanessa Pinfold, and Aliya Kassam. “250 labels used to stigmatise people with mental illness.” BMC health services research 7, no. 1 (2007): 97.

Scott, Charles, ed. DSM-5® and the Law: Changes and Challenges. Oxford University Press, 2015.

Scull, Andrew. Madness in Civilization: A Cultural History of Insanity, from the Bible to Freud, from the Madhouse to Modern Medicine. Princeton University Press, 2015.

Sharaf, Amira Y., Laila H. Ossman, and Ola A. Lachine. “A cross-sectional study of the relationships between illness insight, internalized stigma, and suicide risk in individuals with schizophrenia.” International journal of nursing studies 49, no. 12 (2012): 1512-1520.

Simion, Mona. “The ‘should’ in conceptual engineering.” Inquiry 61, no. 8 (2018): 914-928.

Stein, Dan J., Katharine A. Phillips, Derek Bolton, K. W. M. Fulford, John Z. Sadler, and Kenneth S. Kendler. “What is a mental/psychiatric disorder? From DSM-IV to DSM-V.” Psychological medicine 40, no. 11 (2010): 1759-1765.

Stretton, Serina. “Systematic review on the primary and secondary reporting of the prevalence of ghostwriting in the medical literature.” BMJ open 4, no. 7 (2014): e004777.

Surís, Alina, Ryan Holliday, and Carol S. North. “The evolution of the classification of psychiatric disorders.” Behavioral Sciences 6, no. 1 (2016): 5.

Szasz, Thomas. Manufacture of madness: A comparative study of the inquisition and the mental health movement. Syracuse University Press, 1997.

Testa, Megan, and Sara G. West. “Civil commitment in the United States.” Psychiatry (Edgmont) 7, no. 10 (2010): 30.

Tan, Chee‐Seng, Xiao‐Shan Lau, Yian‐Thin Kung, and Renu A/L. Kailsan. “Openness to experience enhances creativity: The mediating role of intrinsic motivation and the creative process engagement.” The Journal of Creative Behavior 53, no. 1 (2019): 109-119.

Tcherpakov, Marianna. “Drugs for Treating Mental Disorders: Technologies and Global Markets.” BCC Publishing. January 2011.

Thomasson, A. “A pragmatic method for normative conceptual work.” Conceptual Engineering and Conceptual Ethics. OUP (2020).

Touriño, R., Acosta, F. J., Giráldez, A., Álvarez, J., González, J. M., Abelleira, C., Benítez, N., Baena, E., Fernández, J. A., & Rodriguez, C. J. (2018). Suicidal risk, hopelessness and depression in patients with schizophrenia and internalized stigma. Actas Españolas de Psiquiatría, 46(2), 33–41.

Watters, Ethan. Crazy like us: The globalization of the American psyche. Simon and Schuster, 2010.

Zachar, Peter. “Psychiatric disorders are not natural kinds.” Philosophy, Psychiatry, & Psychology 7, no. 3 (2000): 167-182.

Appendix

Foucault, Szasz, problems with madness

Foucault cites the influential French psychiatrist Pinel, who argued that the mad should be treated as morally ill and not imprisoned. Insane people should be freed from their shackles, and treated with (a) silence, (b) encouragements to see their own reflection and recognize their madness, and (c) perpetual judgement by their caretakers to encourage more sane behavior. Foucault argues that this “liberation” is really a form of subjugation, meant to afflict the mad person with constant shame. Their punishment is made invisible and used to mold the ‘mad’ people into “disciplined bodies.” Mental illness is diagnosed by conduct but treated biologically.

Footnotes

  1. Thomasson, A. “A pragmatic method for normative conceptual work.” Conceptual Engineering and Conceptual Ethics. OUP (2020).
  2. However, the token concepts will be relevant examples to clarify the type concept. This also presents some issues in conceptual engineering, as concepts within the type are not uniform, and it’s possible some token mental illness concepts are better than others. I will address these issues in section 3.
  3. Stein, Dan J., Katharine A. Phillips, Derek Bolton, K. W. M. Fulford, John Z. Sadler, and Kenneth S. Kendler. “What is a mental/psychiatric disorder? From DSM-IV to DSM-V.” Psychological medicine 40, no. 11 (2010): 1759-1765.
  4. As defined in: McPherson, Tristam and David Plunkett. “Conceptual ethics and the methodology of normative inquiry.” Conceptual Engineering and Conceptual Ethics. 2020.
  5. See the evidence discussed in section 2.1.
  6. Marco Lauriola, Irwin P Levin, Personality traits and risky decision-making in a controlled experimental task: an exploratory study. Personality and Individual Differences, Volume 31, Issue 2, 2001, Pages 215-226, ISSN 0191-8869, https://doi.org/10.1016/S0191-8869(00)00130-6.
  7. Tan, Chee‐Seng, Xiao‐Shan Lau, Yian‐Thin Kung, and Renu A/L. Kailsan. “Openness to experience enhances creativity: The mediating role of intrinsic motivation and the creative process engagement.” The Journal of Creative Behavior 53, no. 1 (2019): 109-119.
  8. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). American Psychiatric Pub, 2013. Pg. 671.
  9. Banicki, Konrad. “Personality disorders and thick concepts.” Philosophy, Psychiatry, & Psychology 25, no. 3 (2018): 209-221.
  10. Putnam, H. (2002). The collapse of the fact/value dichotomy and other essays. Cambridge, MA: Harvard University Press.
  11. Thomasson (2020), pg. 444. Thomasson here is following Preston (1998) and Millikan (1984).
  12. American Psychiatric Association, pg. 62.
  13. Beck, Angela J., Cory Page, J. Buche, Danielle Rittman, and Maria Gaiser. “Estimating the Distribution of the US Psychiatric Subspecialist Workforce.” Ann Arbor: University of Michigan School of Public Health Workforce Research Center (2018).
  14. “Psychiatrists Market By Segmentation (Mental Disorder Type, Patient Type, Geography), By Trends, By Restraints, By Drivers, By Major Competitors – Global Forecasts To 2023.” The Business Research Company. January 2020.
  15. Tcherpakov, Marianna. “Drugs for Treating Mental Disorders: Technologies and Global Markets.” BCC Publishing. January 2011.
  16. Fulford, 96.
  17. Americans with Disabilities Act.
  18. Scott, Charles, ed. DSM-5® and the Law: Changes and Challenges. Oxford University Press, 2015.
  19. Serres, Michel. “The geometry of the incommunicable: madness.” In Davidson, Arnold Ira. Foucault and his interlocutors. University of Chicago Press (1997). Pg. 30.
  20. Foucault 2003, lecture of November 14, 1973.
  21. Link, Arthur S., and James F. Toole. “Presidential disability and the twenty-fifth amendment.” JAMA 272, no. 21 (1994): 1694-1697.
  22. See Foucault (1988), Goldberg (1999), Hacking (1998), Szasz (1997), and more.
  23. Plunkett, David. “Conceptual history, conceptual ethics, and the aims of inquiry: a framework for thinking about the relevance of the history/genealogy of concepts to normative inquiry.” Ergo, an Open Access Journal of Philosophy 3 (2016).
  24. Carballo, Alejandro Pérez. “Conceptual evaluation: epistemic.” (2020) In Alexis Burgess, Herman Cappelen & David Plunkett (eds.), Conceptual Ethics and Conceptual Engineering. Oxford, UK: Oxford University Press. Pg. 304-332.
  25. Bird, Alexander, and Emma Tobin. “Natural kinds.” Stanford Encyclopedia of Philosophy (2008).
  26. Dupré, John. “Natural kinds and biological taxa.” The Philosophical Review 90, no. 1 (1981): 66-90. He is following Quine (1969)’s account.
  27. Cooper, Rachel. Classifying Madness: A Philosophical Examination of the Diagnostic and Statistical Manual of Mental Disorders. Vol. 86. Springer Science & Business Media, 2006. Pg. 11.
  28. Zachar, Peter. “Psychiatric disorders are not natural kinds.” Philosophy, Psychiatry, & Psychology 7, no. 3 (2000): 167-182.
  29. Kincaid, “Defensible Natural Kinds,” in Kincaid and Sullivan (2014). Pg. 161.
  30. Hallam, Richard. Abolishing the concept of mental illness: Rethinking the nature of our woes. Routledge, 2018. Pg. 60.
  31. An umbrella review is a meta-analysis of meta-analyses.
  32. Carvalho, André F., Marco Solmi, Marcos Sanches, Myrela O. Machado, Brendon Stubbs, Olesya Ajnakina, Chelsea Sherman et al. “Evidence-based umbrella review of 162 peripheral biomarkers for major mental disorders.” Translational Psychiatry 10, no. 1 (2020): 1-13.
  33. Kingdon, David, and Allan H. Young. “Research into putative biological mechanisms of mental disorders has been of no value to clinical psychiatry.” The British Journal of Psychiatry 191, no. 4 (2007): 285-290.
  34. Kapur, Shitij, Anthony G. Phillips, and Thomas R. Insel. “Why has it taken so long for biological psychiatry to develop clinical tests and what to do about it?.” Molecular psychiatry 17, no. 12 (2012): 1174-1179.
  35. Davidson, Arnold I. “Diseases of sexuality and the emergence of the psychiatric style of reasoning.” Meaning and Method: Essays in Honor of Hilary Putnam (1990): 295.
  36. Kincaid, Harold, and Jacqueline A. Sullivan, eds. Classifying psychopathology: Mental kinds and natural kinds. MIT Press, 2014. Pg. 51.
  37. Scull (2015), pg. 385.
  38. Haslam, Nick, Elise Holland, and Peter Kuppens. “Categories versus dimensions in personality and psychopathology: a quantitative review of taxometric research.” Psychological medicine 42, no. 5 (2012): 903-920.
  39. Davidson (1990), pg. 312.
  40. Drescher, Jack. “Out of DSM: Depathologizing homosexuality.” Behavioral Sciences 5, no. 4 (2015): 565-575.
  41. Harper, Marjory, ed. Migration and Mental Health: Past and Present. Springer, 2016.
  42. Hacking, Ian. Mad travelers: Reflections on the reality of transient mental illnesses. University of Virginia Press, 1998.
  43. Watters, Ethan. Crazy like us: The globalization of the American psyche. Simon and Schuster, 2010.
  44. Watters (2010), pg. 176.
  45. Colton and Manderscheid (2006).
  46. Healy et al (2006).
  47. Karagianis et al (2009).
  48. Patel et al (2018).
  49. Jablensky et al (1992).
  50. Leucht et al (2012).
  51. Testa and West (2010).
  52. Foucault, “Madness, the absence of work,” Critical inquiry (1995).
  53. Foucault 1988, pg. 498-501.
  54. Rose et al (2007).
  55. Bolinger, Renee (forthcoming). The Language of Mental Illness. In Justin Khoo & Rachel Katharine Sterken (eds.), Routledge Handbook of Social and Political Philosophy of Language. Routledge.
  56. Sharaf, Ossman, and Lachine (2012).
  57. Boyd et al (2014).
  58. Touriño et al (2018).
  59. Oexle et al (2017).
  60. Moses (2009).
  61. Cunningham Owens, D. G., A. Carroll, S. Fattah, Z. Clyde, I. Coffey, and E. C. Johnstone. “A randomized, controlled trial of a brief interventional package for schizophrenic out‐patients.” Acta Psychiatrica Scandinavica 103, no. 5 (2001): 362-369.
  62. Rathod, Shanaya, David Kingdon, Peter Smith, and Douglas Turkington. “Insight into schizophrenia: the effects of cognitive behavioural therapy on the components of insight and association with sociodemographics—data on a previously published randomised controlled trial.” Schizophrenia research 74, no. 2-3 (2005): 211-219.
  63. Wodak, Daniel, Sarah‐Jane Leslie, and Marjorie Rhodes. “What a loaded generalization: Generics and social cognition.” Philosophy Compass 10, no. 9 (2015): 625-635.
  64. Ahn, Woo-kyoung, Elizabeth H. Flanagan, Jessecae K. Marsh, and Charles A. Sanislow. “Beliefs about essences and the reality of mental disorders.” Psychological Science 17, no. 9 (2006): 759-766.
  65. See Haslam (2011), Mehta and Farina (1997), Lam, Salkovskis, and Warwick (2005), Phelan (2005), and Read and Harré (2001).
  66. Hacking, Ian. Historical Ontology. Harvard University Press, 2004. Pg. 108.
  67. Hacking, Ian. Rewriting the soul: Multiple personality and the sciences of memory. Princeton University Press, 1998. Pg. 16.
  68. Howard (2008).
  69. Howard (2008), pg. 7.
  70. Potter, Nancy Nyquist. “Oppositional defiant disorder: Cultural factors that influence interpretations of defiant behavior and their social and scientific consequences.” Classifying Psychopathology: Mental Kinds and Natural Kinds 175 (2014).
  71. Surís, Alina, Ryan Holliday, and Carol S. North. “The evolution of the classification of psychiatric disorders.” Behavioral Sciences 6, no. 1 (2016): 5.
  72. Greenough, Patrick. “Conceptual Engineering via Reality Engineering.” Forthcoming.
  73. Simion, Mona. “The ‘should’ in conceptual engineering.” Inquiry 61, no. 8 (2018): 914-928.
  74. Greenough (forthcoming).
  75. Cappelen (2018), chapter 4, pg. 51-53.
  76. Cappelen (2018), chapter 2, pg. 23.
  77. See Cappelen (2018), part 3; Thomasson (2020); and Prinzing (2018).
  78. Hallam, Abolishing the Concept of Mental Illness, pg. 16.
  79. Hallam, pg. 105.
  80. Jablensky, Assen. “Does psychiatry need an overarching concept of ‘mental disorder’?.” World Psychiatry 6, no. 3 (2007): 157.
  81. Jaspers, Karl. General psychopathology. Vol. 2. JHU Press, 1997.
  82. Zachar, in Kincaid and Sullivan (2014). Pg. 87.
  83. Jablensky, “Does psychiatry need an overarching concept of ‘mental disorder’?” (2007).
  84. Haslanger, Sally. “Going On, But Not in the Same Way.” In Alexis Burgess, Herman Cappelen & David Plunkett (eds.), Conceptual Ethics and Conceptual Engineering. Pg. 236.
  85. Haslanger (2020), pg. 253.
  86. Richard (2020), pg. 356.
  87. discrimination and oppression against a mental trait or condition a person has or is judged to have.
  88. Richard (2020), pg. 370.
  89. Banicki (2018).
  90. Kapur et al (2012). Pg. 1176.
  91. Brückl, Tanja M., Victor I. Spoormaker, Philipp G. Sämann, Anna-Katharine Brem, Lara Henco, Darina Czamara, Immanuel Elbau et al. “The biological classification of mental disorders (BeCOME) study: a protocol for an observational deep-phenotyping study for the identification of biological subtypes.” BMC psychiatry 20 (2020): 1-25.
  92. Nesse, Randolph M. Good reasons for bad feelings: insights from the frontier of evolutionary psychiatry. Penguin, 2019.
  93. Kingdon and Young (2007), pg. 2.
  94. Cooper, Rachel. Psychiatry and philosophy of science. Routledge, 2014. Pg. 26.
  95. Barnes, Elizabeth. The minority body: A theory of disability. Oxford University Press, 2016. Pg. 7.
  96. Barnes (2016), pg. 111.
  97. Scull (2015), pg. 30.
  98. Richard (2020), pg. 373.
  99. Kirk, Stuart A., David Cohen, and Tomi Gomory. “DSM-5: The delayed demise of descriptive diagnosis.” In The DSM-5 in perspective, pp. 63-81. Springer, Dordrecht, 2015. Pg. 67.
  100. Simion, Mona. “The ‘should’ in conceptual engineering.” Inquiry 61, no. 8 (2018): 914-928.
  101. Charland, L. C. (2006). Moral nature of the DSM-IV cluster B personality disorders. Journal of Personality Disorders, 20, 116–25.
  102. Pols, Jan. “The Politics of Mental Illness: Myth and Power in the Works of Thomas S. Szasz.” Trans. Mira de Vries (1984/2005). Nijmegen, 1976. Pg. 178.
  103. Stretton, Serina. “Systematic review on the primary and secondary reporting of the prevalence of ghostwriting in the medical literature.” BMJ open 4, no. 7 (2014): e004777.
  104. Healy, David, and Michael E. Thase. “Is academic psychiatry for sale?” The British Journal of Psychiatry 182, no. 5 (2003): 388-390.
  105. Cosgrove, Lisa, and Harold J. Bursztajn. “Toward credible conflict of interest policies in clinical psychiatry.” (2009).
  106. Howard (2008).
  107. Hallam, Abolishing the Concept of Mental Illness, pg. 13.
  108. Fulford, Kenneth WM, Martin Davies, Richard Gipps, George Graham, John Sadler, Giovanni Stanghellini, and Tim Thornton, eds. The Oxford handbook of philosophy and psychiatry. OUP Oxford, 2013. Pg. 95.
  109. Healy, David. Mania: A short history of bipolar disorder. JHU Press, 2008. Pg. 227.
  110. Horwitz, Allan V. “11 The Social Functions of Natural Kinds: The Case of Major Depression.” Classifying Psychopathology: Mental Kinds and Natural Kinds (2014): 209.
  111. Hacking, Ian. “The looping effects of human kinds.” In D. Sperber, D. Premack, & A. J. Premack (Eds.), Symposia of the Fyssen Foundation. Causal cognition: A multidisciplinary debate (p. 351–394). Clarendon Press/Oxford University Press.

The Paradoxes of Joy and Suffering Abolition in Nietzsche

Note: All of Nietzsche’s works will be cited in paragraph citations with their standard abbreviations (e.g. BT for Birth of Tragedy) and their section, page, or aphorism numbers, while the translation/versions will be listed in the bibliography. All other sources will appear in footnotes. 

Two key paradoxes are built into Nietzsche’s views of suffering and joy. First, Nietzsche propounds the art and discipline of suffering while simultaneously praising happiness. This is the joy paradox. Second, Nietzsche denounces the wholesale abolition of suffering, but he also seeks to eliminate meaningless suffering. This is the suffering abolition paradox. I argue that Nietzsche has a complex, multifaceted account of suffering and joy that accounts for these apparent paradoxes. The first part of this paper reconstructs Nietzsche’s view of suffering, from its origins to his defense of its value. I also address several objections to this view, including the argument that some kinds of suffering are purely destructive and irredeemable. The second part traces Nietzsche’s less well-known view of the nature of joy and how it can be sought. Finally, the third part attempts to resolve the contradiction between these two aspects and outlines the prospect of a Nietzschean transhumanism.

1. Suffering

a. The Nature of Anguish

As per usual, Nietzsche begins in conversation with Schopenhauer and the Greeks. For Schopenhauer, life consists of endlessly chasing desires that can never be satisfied, making “life an unprofitable episode, disturbing the blessed calm of non-existence.”[1] He thus affirms the wisdom of Silenus: that it is best to have never existed, and second-best to die soon (BT §3). The constant source of suffering is not external, but within the individual’s will. The only liberation from this cycle of suffering-filled desire is the aesthetic contemplation that “lifts us out of real existence and transforms us into disinterested spectators of it.”[2] In these moments, the “fierce pressure of the will” is briefly extinguished, and we can experience sublime joy without desire.[3] The logical consequence of this view is that the complete cessation of the will would be ideal. Schopenhauer, with the Buddha, sought to eliminate the desires at the root of suffering.

Nietzsche accepts the noble truth[4] that life is suffering, but his response to this fact is different: like the tragic Greeks, he affirms both the will and the suffering it causes. The Greeks realized that suffering is inevitable in the fragile, imperiled, and chaotic human condition; they “knew and felt the terror and horror of existence” (BT §3). To even endure this terrible understanding, the Greeks had to invent art, myth, and the Olympian gods. The beautiful Apollonian dream-vision is related to the painful Dionysian reality in the same way “as the rapturous vision of the tortured martyr is to his suffering” (BT §3). The martyr envisions a salvation to redeem his pain, just as tragedy creates a beautiful narrative to instill meaning into suffering. Tragedy is not just a numbing drug or palliative, but an invigorating experience that brings exuberant health in the face of the worst suffering. Even if the tragedy’s plot is a series of disasters, it brings these events together to transfigure them into a joyful experience. The Hellenic pantheon also reflected human life rather than some other world, so the Greeks saw themselves glorified and made gods: beneath the “bright sunshine of such gods, existence is felt to be worth attaining” (BT §3). Greek myth-makers and tragic writers made life worth living despite its inherent suffering.

Raphael, The Transfiguration. Nietzsche uses this painting as an example in The Birth of Tragedy.

In this way the pain-prone, sensitive Greeks were able to courageously affirm their existence. Just as Raphael’s Transfiguration depicts “luminous hovering in purest bliss” above a world of woe and strife, the Greeks transfigured their pain into life-affirming tragic art (BT §4). The “hidden substratum of suffering” is not just a sideshow, but essential to creating beauty (BT §4). As Nietzsche exclaims, “how much must these people have suffered to be able to become so beautiful!” (BT §21). Ultimately, the cheerfulness of the Greeks did not rest on a contented freedom from suffering, but a powerful affirmation of it. Nietzsche continues to uphold the value of tragedy in his last works — “I promise a tragic age: the highest art in saying Yes to life, tragedy, will be reborn.”[5]

b. In Defense of Suffering

In Nietzsche’s view, “the problem is that of the meaning of suffering,” and not merely suffering (WP #1052). Man is accustomed to pain and “does not repudiate suffering as such; he desires it, he even seeks it out, provided he is shown a meaning for it” (GM §3 #28). With this fundamental understanding, Nietzsche develops concepts that will imbue suffering with meaning — and not just any meaning, but a life-affirming meaning that will bring genuine health.

Condemning value-systems centered on pleasure and pain as shallow and naive, Nietzsche urges hedonists, utilitarians, pessimists, and Epicureans to look for higher values (BGE #255). He has a “higher compassion which sees further,” recognizing that these value-systems make man smaller in the long term (BGE #255). Nietzsche saw the British utilitarians of his time as seeking only a soporific, comfortable, mediocre, ‘herd animal’ kind of happiness (BGE #228). Those who “experience suffering and displeasure as evil, worthy of annihilation and as a defect of existence” merely subscribe to a “religion of comfortableness” (GS #338). Eliminating our species-preserving suffering would leave humanity anemic and unable to change, adapt, or resist, undermining the long-term future of mankind.

Most of all, Nietzsche condemns utilitarianism because of its “harmful consequences for the exemplary human being” (WP #399). He rejects the idea that the “ultimate goal” is the “greatest happiness of all” (SE §6). Nietzsche argues that the individual can “receive the highest value, the deepest significance” only by “living for the good of the rarest and most valuable exemplars, and not for the good of the majority” (SE §6). This reflects a critical idea: Nietzsche may not be speaking to all people, and his defenses of suffering may not apply to everyone. His intended audience may only be these extraordinary individuals. For these brave and creative individuals, pleasure and pain are always epiphenomena and not ultimate values. To achieve anything, we must seek out both.

While the hedonists may want to “do away with suffering” with some fantastic means, Nietzsche’s higher souls want it “higher and worse than it ever was!” (BGE #255). Well-being as the hedonists understand it would be a contemptible endpoint for humanity. After all, he asks:

The discipline of suffering, of great suffering – don’t you realize that up to this point it is only this suffering which has created every enhancement in man up to now? That tension of a soul in misery which develops its strength, its trembling when confronted with great destruction, its inventiveness and courage in bearing, holding out against, interpreting, and using unhappiness…

(BGE #255)

Man contains both chaotic, formless clay and the hammer to shape this rough clay into something more. We cannot have pity for the clay, for the parts of ourselves that must and “should suffer” to achieve positive transformation (BGE #255). The creature in us must suffer so that the creator in us can persevere and grow. For instance, by imposing the suffering of asceticism on himself, the philosopher “affirms his existence” (GM §3 #8). He strengthens his dominant instinct — to spirituality, knowledge, or insight — by rejecting small pleasures and sensualities. Furthermore, if we value the overcoming of resistance (the will to power), then we must also value the resistance itself – and the suffering it entails.

Human clay can be shaped by suffering.

While we moderns mostly know agony through fantasy, the ancients trained themselves in real suffering. For them, Nietzsche argues, pain was less painful. Meanwhile, our lack of habituation to pain explains why “inevitable mosquito bites” seem to us like an objection against life as a whole (GS #48). The solution to this kind of oversensitive suffering may therefore be more suffering, so that we can become whole and strong enough to withstand the unavoidable ills of existence. Then we may welcome any kind of suffering because it will strengthen us, just as distress makes a bow tauter (GM §1 #12). Like our ancestors, we might even begin to see suffering as a virtue and as a genuine enchantment to life rather than an argument against life.

In line with Nietzsche’s arguments, Haidt argues that some kinds of suffering can create posttraumatic growth.[6] This is also known as anti-fragility,[7] and it is more than just stable resilience, as it can result in positive transformation and improvements from the previous state. Posttraumatic growth has been empirically documented in many circumstances, including in refugees, Holocaust survivors, cancer patients, and prisoners.[8] This research finds that an individual’s posttraumatic growth is often predicted by their ability to make the experience meaningful. As Nietzsche provides an abundance of tools for meaning-making, he encourages growth and enables more anti-fragility.

Furthermore, certain kinds of truth and knowledge are inextricably connected to suffering. Characters like Prometheus and Faust, who steal knowledge from beyond the human world and are thus tortured for eternity, represent this fundamental fact: truth has a price. This type of exemplary individual “voluntarily takes upon himself the suffering inherent in truthfulness” to create a complete revolution in himself (SE §4). This heroic individual who tries “to transcend the curse of individuation” and “attain universality” inevitably suffers from experiencing the hidden primordial contradictions of existence (BT §9). The value of an individual can even be assessed by how much truth they can endure (GM §3 #19; BGE #39).

Enduring suffering is especially critical for the free spirits that Nietzsche considers his audience: “we first had to experience the most varied and contradictory states of distress and happiness in our souls and bodies, as the adventurers and circumnavigators of that inner world called ‘man’” (HH #7). Nietzsche implores these knowledge-seeking free spirits to “collect the honey of knowledge from diverse afflictions, disturbances, illnesses,” exploring all types of experience while “despising nothing, losing nothing, savoring everything.”[9] As voyagers in the state-space of consciousness, the free spirits must test the entire complex palette of human experiences, learning their nature and their interrelations. Therefore, extraordinary truth-seeking individuals cannot value knowledge without also valuing suffering.

In sum, Nietzsche defends suffering as a kind of transformative experience.[10] Suffering can be personally transformative in helping us develop ourselves, recognize our authentic aims, and become stronger, more life-affirming, and more anti-fragile beings. Suffering can also be epistemically transformative. At the very least, suffering provides knowledge about certain qualia: it tells us what some kinds of experiences are like. But pain can also provide previously inaccessible knowledge, restructuring our entire worldview. Some knowledge worth pursuing may be inseparable from suffering. As suffering is an inherent feature of existence, Nietzsche argues that we should affirm it and make it meaningful rather than avoid it.

c. Critiques of Suffering

I will address three primary critiques of Nietzsche’s defense: (1) some responses to suffering are negative, (2) simply affirming suffering because it exists commits the genetic fallacy, and (3) some forms of suffering are inherently negative and irredeemable.

First, Nietzsche agrees that there are many negative reactions to suffering. He makes it clear that there are both positive (creative) and negative (destructive) reactions to suffering. Positive reactions include sublimation, virtue development, meaning-making, and creativity. Negative reactions include ressentiment, pity, and collapse. Ressentiment consists of swallowing anger, fear, hatred, or other negative emotions and letting them fester.[11] The resentful individual cannot forget some past wrong or suffering, and becomes nasty, filled with rancor, consumed by a desire to rectify or avenge a past event. This is just one example of a negative reaction. However, our harmful responses to suffering alone are not an argument against suffering itself.

Like mold, suffering can fester, grow, and turn into hateful ressentiment.

Furthermore, some interpretations of suffering are negative. For instance, in Nietzsche’s view, Christianity tells those searching for something to blame for their suffering that “you alone are to blame for it!” (GM §2 #15). This provides some meaning – we suffer because we are sinful and infinitely guilty. As we are desperate for meaning, we cling to this interpretation: “any meaning is better than none at all” (GM §3 #28). But ultimately this meaning only brings “deeper, more inward, more poisonous, more life-destructive suffering” (GM §3 #28). For the Christian then redirects his ressentiment back onto himself, lashing himself for his guilt. Christianity thus encourages moralistic thinking that increases guilt and suffering.

Nietzsche also rejects slave morality in part because of its association with misery. Slave values reflect the ressentiment of the weak and suffering (GM §1 #16). Slave morality itself was created to relieve suffering, an upwelling of a long-impotent bitterness that finally finds expression in the revolt against the master. Clearly Nietzsche does not believe all suffering is positive, for he argues that “the preponderance of feelings of displeasure over feelings of pleasure is the cause of this fictitious morality and religion” (AC #15). The slaves compensate themselves for the suffering inflicted upon them by the masters with a psychological revenge: negating the values of the masters and painting them as evil. Under their revaluation of morals, “the suffering, deprived, sick, and ugly alone are pious,” while the “powerful and noble” are painted as evil (GM §1 #7). The slave is induced to follow these inverted values because of the promise of heaven as the reward for a true believer who suffers at the hands of evil. The priest thus manipulates the slaves’ sufferings to wreak revenge on the masters.

While Christianity, slave morality, and afterworlds all affirm or provide meaning for suffering in their own ways, Nietzsche opposes them all. They end up only making suffering worse, harming exemplary individuals, negating life, and damaging even their adherents. In contrast to the guilt-manufacturing Christianity, the Greeks vindicated humanity by making the gods guilty, as these gods “took upon themselves, not the punishment, but what is nobler—the guilt” (GM §2 #23). As the gods were the source of wickedness, man was liberated from self-loathing and guilt. Nietzsche sees guilt and shame as pathologies that can be overcome by cultivating a critical awareness, a sense of generosity and self-respect, and an unashamed affirmation of life. Clearly, Nietzsche recognizes that reactions to and interpretations of suffering are not all equal, and the value of suffering will often be dictated by our response to it.

Second, some may argue that Nietzsche’s affirmation of suffering commits the genetic fallacy. Even if it is true that humans were shaped by evolutionary or historical forces to both suffer and see suffering as a virtue, this does not imply we must keep suffering. To infer otherwise would be to assume, fallaciously, that the origins of a concept should dictate its current use.[12] Furthermore, the claim that ‘life is suffering’ cannot entail the conclusion that ‘suffering ought to be affirmed.’ As Hume showed, descriptive claims cannot imply normative claims; an ‘ought’ cannot be derived from an ‘is.’[13] Finally, a critic of suffering might argue that all the ‘goods’ of suffering are circular and non-transferable. Yes, suffering may help us develop certain skills, including the capacity to respond to unpredictable suffering, to revise goals in calamity, and to move past loss. But these skills are only beneficial insofar as suffering exists: they are essentially ‘virtues of dealing with suffering,’ or methods of getting used to it. It seems circular to claim that suffering should exist because of the virtues it produces while these virtues are themselves justified by the existence of suffering.

Third, some suffering seems unaffirmable. Purely destructive agony can cause only harm, undermining health, strength, joy, and preventing the affirmation of life, and is therefore antithetical to Nietzsche’s own values. While Nietzsche’s defense emphasizes the growth and transformation enabled by suffering, he seems to ignore the kind of suffering that falls outside this description.[14] Some suffering does not even involve resistance or overcoming – sometimes, it is just powerlessness, subjection, and destruction. These painful states are a form of “hermeneutical death,” as they destroy the victim’s abilities to interpret suffering or make meaning from it.[15] As Levinas writes, this kind of suffering “rends the humanity of the suffering person,” and “intrinsically, it is useless, ‘for nothing.’”[16] Critics may argue that Nietzsche’s praise of suffering ignores the existence of this purely destructive and life-negating suffering.

Sisyphus by Titian. Sisyphus is the existentialist symbol of pointless, repetitive suffering.

However, Nietzsche is not committed to the position that all pain develops us. His passages do not claim that all suffering should be unequivocally affirmed, and he even objects to senseless suffering. As he writes, “what really arouses indignation against suffering is not suffering as such but the senselessness of suffering” (GM §1 #7). What Nietzsche rejects is the “mortal hatred for suffering in general” (BGE #202), a position that universally rejects all kinds of negative experience. Nietzsche’s view is more multidimensional, affirming some kinds of upbuilding suffering while rejecting other kinds of destructive suffering (e.g. the festering, passive suffering that leads to ressentiment). He clearly supports the suffering that forges individuals from chaotic fragments into stronger, more creative beings, but nowhere defends purely destructive agony. He also implies that disciplined and voluntary suffering is more likely to be positively transformative than the forced and externally imposed suffering that tends to be destructive (GS #48, BGE #62). Critiques of Nietzsche’s views that rely on the existence of extreme and pointless suffering are therefore strawman arguments, attacking a position that Nietzsche does not even defend. Of course, one could still argue that Nietzsche’s views of suffering have a key blind spot, as they fail to explicitly address useless, extreme suffering.

2. Joy

Despite his advocacy for transformative suffering, Nietzsche also extols emotional states that seem to be the opposite of pain: well-being, joy, happiness, and jubilation. He proclaims that the future needs “a new health, stronger, more seasoned, tougher, more audacious, and gayer than any previous health,” and praises the ideal of “a superhuman well-being and benevolence” (GS #382). He dreams of “human beings distinguished as much by cheerfulness…more fruitful human beings, happier beings!” (GS #283) He urges poets, artists, and philosophers to “let your happiness too shine out,” instead of “painting all things a couple of degrees darker than they are” (D #561). He testifies that joy is “deeper yet than agony,” for “woe implores: Go! / but all joy wants eternity” (TSZ pg. 340). He calls for us to “share not suffering but joy” (GS #338) and to “harken to all cheerful music” (GS #302), for “life is a well of joy” (TSZ pg. 208). He declares that it is a lack of joy that brings degradation and decay, for the “mother of dissipation is not joy but joylessness” (MM #77). Nietzsche concludes that “man has felt too little joy: that alone, my brothers, is our original sin” (TSZ pg. 200).

How can we reconcile Nietzsche’s exuberant praises of joy with his embrace of suffering? Part III will address this apparent paradox. This section will extract some of Nietzsche’s core views of joy: (a) the defining features that characterize happiness, and (b) the methods and processes which produce or prevent joy.

a. The Nature of Happiness

What unites positive affective states is that “happiness…no matter what the sort, confers air, light, and freedom of movement” (D #136), and that it contains an “abundance of feeling and high-spiritedness” (D #439). Happiness for Nietzsche is closely associated with the expression of the will to power: “what is happiness? The feeling that power increases—that a resistance is overcome” (AC #2). Indeed, he states that happiness can be “understood as the liveliest feeling of power” (D #113). There are also two kinds of happiness: “the feeling of power and the feeling of surrender” (D #60). This is similar to the distinction “between the impulse to appropriate and the impulse to submit” (GS #188). The appropriating impulse feels joy in desiring and in transforming things into functions, while the submitting impulse feels joy in being-desired and becoming a function. Often, it is the “people who strive most feverishly for power” who most want to “tumble back into a state of powerlessness,” like mountain climbers who dream of effortlessly rolling back downhill (D #271). Power is an essential but hidden aspect of happiness.

Nietzsche’s conception of joy is antithetical to his era’s moderate ideas of ‘cheerfulness,’ ‘comfort,’ and ‘happiness.’ Zarathustra calls this pitiful, polluted, and stale conception of happiness “wretched contentment” (TSZ pg. 125). This mass-produced kind of pleasure only hinders the achievement of true joy. The rabble poisons life’s well of joy, and “when they called their dirty dreams ‘pleasure,’ they poisoned the language too” (TSZ pg. 208). The Last Man is the symbol of a self-satisfied and stable society that has given up on any ideal beyond wretched contentment, and that is “increasingly suspicious of all joy” (GS #259). This society teaches its members to live by the “ticktock of a small happiness” and to develop only those virtues that “get along with contentment” (TSZ pg. 281). The crowd embraces the Last Man, who preaches of acceptable levels of pleasure, rather than the ideal of the overman and great health. However, for Nietzsche, anything like this mild Epicurean satisfaction “is out of the question. Only Dionysian joy is sufficient” (WP #1029). Nice, pleasurable feelings are not enough, for “happiness ought to justify existence itself” (TSZ pg. 125). Joy, like suffering, must be transfigured into meaning.

Nietzsche rebukes the contemporary cheerleaders for the simple ideas of wretched contentment, the 19th-century Last Men, for they

do not even perceive the sufferings and monsters that as thinkers they pretend to perceive and fight, and their cheerfulness provokes displeasure simply because it deceives, for it seeks to seduce one into believing that a victory has been won. For basically there is cheerfulness only where there is victory.

(SE §2)

This mediocre happiness will only depress and torment the insightful thinkers who recognize that it is founded upon a lie. The Last Men are not concerned with the long-term joy of humanity, and instead “want to cheat it out of its future for the sake of a painless, comfortable present” (HH #434). Nietzsche even argues that “the primal suffering of modern culture” is a result of the degeneration of “authentic art” into mere “superficial entertainment” (BT #19). Those with a more “delicate taste for joy” see this kind of “crude, musty, brown pleasure” as repulsive (GS Preface #4). Genuine cheerfulness arises not from deceptions but from a hard-fought victory over a difficult problem confronted honestly. While the weak consume opiate-like pleasures to numb and console themselves, the stronger spirits attempt to overcome challenges worthy of jubilation and actually build a life worthy of joy.

b. The Joyful Science

How can genuine joy be achieved? Unfortunately, there are no foolproof methods. Just as no medicine can cure all patients, no philosophy can guarantee happiness. Whether a philosophy produces happiness is no argument for or against it. As hunting for joy-guaranteeing wisdom is futile, “may each of us be fortunate enough to discover that philosophy of life which enables him to realize his greatest measure of happiness” (D #345). Universal laws cannot lead the individual to happiness, because each person’s happiness “springs from one’s unknown laws,” and “external precepts can only hinder and check it” (D #108). Forcing all people to abide by a single law to achieve happiness is as irrational as a tyrannical individual stamping his idiosyncratic, narrow, and personal way of suffering “as an obligatory law” upon all others (GS #370).

Despite this reality, one of humankind’s great errors is the belief that happiness can come from passive submission to prescribed rules or ideals. The classic moral refrain is “do this and that, refrain from this and that – then you will be happy!” (TI §6 #2). Nietzsche rejects this formulation. In his view, virtue does not cause happiness; happiness causes virtue. In reality, “a well-balanced human being, a ‘happy one,’ must perform certain actions and shrink instinctively from other actions,” and this virtue is a consequence of his happiness (TI §6 #2). Morality is also not the way to happiness. Indeed, morality has “opened up such abundant sources of displeasure” that we can conclude it is a wellspring of more profound misery and not a source of joy (D #106). Whenever following moral precepts causes “unhappiness and misery to set in instead of the vouchsafed happiness,” the moralists will claim that the person overlooked some rule or practice (D #21). The idea that those who disobey morality cannot experience happiness is absurd, for “evil people have a hundred types of happiness about which the virtuous have no clue” (D #468). Subscribing to a set of moral norms is no way to achieve joy.

Additionally, individuals who are stuck in the ‘it was,’ constantly tormented by the past, cannot experience joy. Happiness relies on limited horizons, restricting one’s view to the present and forgetting the past:

Anyone who cannot forget the past entirely and set himself down on the threshold of the moment, anyone who cannot stand, without dizziness or fear, on one single point like a victory goddess, will never know what happiness is; worse, he will never do anything that makes others happy.

(HL §1)

Without the ability to forget, living is impossible. While stronger natures may be able to incorporate more of the past without being stifled, every person has necessary limits. Without these limits, the past can “become the gravedigger of the present” (HL §4). Individuals seeking happiness must give up their “profound insight,” their over-satiated sagacity and exhaustive knowledge of their own past, in exchange for the “divine joy of the creative and helpful person” (HL §4). Furthermore, the will must become its own “liberator and joy-bringer” by embracing past events, converting all “‘it was’ into a ‘thus I willed it’” in a demonstration of amor fati (TSZ pg. 253). Forgetting is vital for joy and creation.

Happiness requires limits and fixed horizons – created by the ability to forget.

Nietzsche also emphasizes the hedonic paradox, which states that pursuing happiness directly will only reduce happiness. At the fountain of pleasure, “often you empty the cup again by wanting to fill it. And I must still learn to approach you more modestly: all-too-violently my heart still flows toward you” (TSZ pg. 210). Seeking fulfillment of pleasures will empty you of them. After all, “joy is only a symptom of the feeling of attained power…one does not strive for joy…joy accompanies; joy does not move” (WP #688). This paradox has also been validated by modern empirical research.[17] Pursuing joy directly is ineffectual. This may be why Zarathustra declares “am I concerned with my happiness? I am concerned with my work!” (TSZ pg. 258) He implores his listeners that “one shall not wish to enjoy,” for enjoyment is a bashful thing that does not want to be sought — it would be better to seek out suffering! (TSZ pg. 311) This also suggests that simple hedonists have a deficient understanding of human psychology: seeking out pleasure will only reduce it, and often pursuing pain is more beneficial.

Furthermore, just as suffering provides epistemic access to some knowledge, some truths are only available during immense joy. The primordial unity (das Ur-Eine) is experienced through a form of Dionysian ecstasy, rapturous enthrallment or intoxication (Rausch). This Dionysian experience is characterized by a “mystical, jubilant shout” (BT §16), filled with “exuberant fertility” (BT §17) and an “immeasurable, primordial delight in existence” (BT §17). The Dionysian destroys individuation “so that the mere shreds of it flutter before the mysterious primordial unity” (BT §1). Ecstasy is required to apprehend the primordial unity, and the feeling of oneness with all of nature generates immense joy. The connection between joy and knowledge runs deep. This may be why philosophers from Plato and Aristotle to Descartes and Spinoza agreed that seeking knowledge “constitutes the highest happiness” for humans (D #500). However, “there is no preestablished harmony between the furthering of truth and the well-being of humanity” (HH #517), and knowledge or truth do not necessarily generate happiness.

Ultimately the search for joy amounts to a search for personal meaning and a way to express one’s will to power. The individual need not ask why the “world” or “humanity” exists, or even why she personally exists. Instead, the individual must “try to justify the meaning of your existence a posteriori, as it were, by setting yourself a purpose…a lofty and noble ‘reason why’” (HL §9).[18] Each individual must cross the stream of life alone and cannot be simply carried by another. Nietzsche urges the young soul to look back on the things they have truly loved, that have dominated their soul while “simultaneously making it happy,” for this series of revered objects can reveal the “fundamental law of your authentic self” (SE §1). By throwing all of her abilities and powers in the direction of this life-path, the individual can reach the highest joys possible for her.

3. The Paradoxes

a. Joy and Suffering

There is an apparent conceptual tension between Nietzsche’s defense of the discipline of intense suffering and his praise of joy. However, this paradox dissolves when one stops seeing pain and pleasure as antitheses. The “breadth of space between highest happiness and deepest despair has been established only with the aid of imaginary things” (D #7). We may also overestimate this distance between suffering and joy because language exaggerates the gap. We have words primarily for superlative, extreme states, while “the milder middle degrees” are left unnamed (D #433). As we cannot apply labels to the myriad emotional states between suffering and joy, we are unable to conceptualize a continuum between the two extremes. Once again, the human obsession with dichotomous thinking prevents us from seeing the complexity of spectrums and interconnected networks.

The extreme hedonic states are not opposites. Indeed, pleasure must always include pain and may itself be the overcoming of pain: “one could describe pleasure in general as a rhythm of little unpleasurable stimuli” (WP #697). Nietzsche conceptualizes happiness as a kind of overcoming, and overcoming requires resistance, which is experienced as suffering. This means that happiness is necessarily connected to suffering. As such, Nietzsche felt that the most sublime “happiness could be invented only by a man who was suffering continually” (GS #45). As he wonders,

What if pleasure and displeasure were so tied together that whoever wanted to have as much as possible of one must also have as much as possible of the other — that whoever wanted to learn to jubilate up to the heavens would also have to be prepared for depression unto death?

(GS #12)

We must choose between either “as little displeasure as possible,” or “as much displeasure as possible as the price for the growth of an abundance of subtle pleasures and joys that have rarely been relished yet” (GS #4). In contrast, the “comfortable and benevolent” Last Men know nothing of human happiness, for they do not understand that “happiness and unhappiness are sisters and even twins that either grow up together or, as in your case, remain small together” (GS #338). Suffering is essential to experience the height of joy. The two emotional poles cannot be separated from each other; they are two aspects of the same process.

However, Nietzsche does not claim that happiness justifies suffering. It is not a matter of a simple felicific calculus, where the positive valence in the world outweighs the negative valence. He rejects this utilitarian summation. It is not that the happiness vindicates the suffering, but rather that humans create joy despite the suffering:

Right beside the sorrow of the world and often upon its volcanic ground, human beings have laid out their little gardens of happiness…everywhere they will find some happiness sprouting beside the misfortune – and indeed, the more happiness, the more volcanic the ground was – only it would be ridiculous to say that the suffering itself could be justified by this happiness.

(HH #591)

This oft-overlooked passage demonstrates that Nietzsche does not merely think our sufferings are ‘justified’ by some happiness, but that we create happiness in response to suffering: “Perhaps I know best why man alone laughs: he alone suffers so deeply that he had to invent laughter” (WP #91). In more plain terms, “the sorrow in the world has caused human beings to suck a sort of happiness from it.”[19] It is not that Nietzsche fails to see the unjustifiable badness of some suffering — like Schopenhauer, he has a devastating understanding of the sufferings of the world, but Nietzsche also sees the necessity to create joy and meaning despite the anguish.

Finally, the eternal recurrence transmutes the eternal return of suffering into something worth joyfully embracing. Nietzsche’s eternal recurrence is “a formula for the highest affirmation, born of fullness, of overfullness, a Yes-saying without reservation, even to suffering,” and it represents the “ultimate, most joyous, most wantonly extravagant Yes to life” (EH pg. 272). The affirmer of life doesn’t desire the eternal recurrence because she wants suffering, but because she does not simply weigh pain against pleasure to determine life’s value. This kind of calculus is misguided because life is not a series of discrete events. Rather, all events are deeply interconnected by complex causal chains. (See The Calm and the Cataract for the connections between eternal recurrence and the Buddhist concept of interbeing). In affirming any single event, we affirm the whole. If each “individual” thing is connected to all other things, then when you say yes to one moment you say yes to all moments. If “all things are chained and entwined together,”[20] then we affirm the entire chain when we affirm a single link. If we say yes to one moment of joy, then we also say yes to all the suffering intertwined with this moment.

Every moment is wired together, connected by the spiderweb of the universe.

b. Suffering Abolition

The question is not just what Nietzsche means to us, but what we would mean to him, how he might evaluate our contemporary situation, “how our epoch would appear to his thought.”[21] To answer this question, this section brings Nietzsche into conversation with the modern transhumanist philosopher David Pearce, who upholds The Hedonistic Imperative: to “abolish suffering throughout the living world” through technological means like genetic engineering.[22] While Nietzsche’s critiques of hedonism remain relevant and compelling, his thought may be surprisingly adaptable to this kind of transhumanist project.

After all, Nietzsche’s philosophical project is motivated by his desire “to take away from human existence some of its heartbreaking and cruel character.”[23] This suggests that Nietzsche himself is engaged in the suffering abolition project. Nietzsche may “still be in the business of abolishing precisely the helplessness, the interpretive vacuum, that gives suffering its sting.”[24] After all, if meaninglessness is constitutive of suffering, then suffering interpreted well is no longer suffering. Many philosophers define suffering as an unpleasant experience S conjoined with the desire that S not be occurring.[25] By increasing the meaningfulness and value of suffering, Nietzsche’s work can reduce our desire to avoid suffering, making it a positive good. Suffering on its own is helpless and does not inevitably create growth. However, we can give it a value by making it constitutive of growth, creativity, and positive transformation. If we will our suffering, we are no longer helpless – it becomes an ‘I willed it,’ not a mere ‘it was’ out of our control. As Nietzsche writes about his trials, “I have never suffered from all this, for what is necessary does not hurt me” (EH pg. 332). This is abolition in a radically different sense than the simple elimination of suffering, the comfort-making that the hedonists of his time advocated. Nietzsche’s suffering-abolition focuses on filling the interpretative vacuum of suffering.

Transhumanists may be skeptical that we can really conjure suffering out of existence merely by coloring it with a kind of life-affirming interpretation. They may doubt Nietzsche’s exorbitant claim that he never suffered from necessary things. Furthermore, transhumanism can critique Nietzsche as stuck in his time. The technology to overcome suffering, end aging, or re-engineer human biology did not exist in the 1800s. Therefore, Nietzsche affirmed suffering as it existed because his best available option was to make our inevitable sufferings meaningful and beneficial. The transhumanist claims that we now have the technological ability to reform suffering dramatically or eliminate it. Maybe it is only a contingent fact that pain and pleasure are tied together, and not a necessary principle—and maybe this knot can be untied through technologies like neurobiological and genetic engineering. Indeed, the Qualia Research Institute is developing an understanding of the fundamental nature of pain and pleasure to lay the foundation for super-happiness. Nietzsche agrees that evolution “does not have happiness in view,” but only evolution itself (D #108). Why should we accept the haphazard consequences of evolution instead of guiding it towards joy? Perhaps life is suffering, but it does not have to be.

However, section 2 demonstrates that a core Nietzschean aim is to bring about immense joy, well-being, and great health for humanity. Ultimately, if happiness and suffering come into conflict, Nietzsche’s priority may be joy: “I may have done this and that for sufferers; but always I seemed to have done better when I learned to feel better joys” (TSZ pg. 200). Nietzsche also argues that “man is something that must be overcome,” and man is just “a bridge and no end,” a bridge that may be “the way to new dawns” (TSZ pg. 310). This, along with Nietzsche’s revulsion at the Last Man who is complacent in humanity’s current level of contentment, suggests that he is not satisfied with merely human happiness and instead strives for superhuman joy. This seems deeply compatible with Pearce’s supplication that we use all available technologies to create “information-sensitive gradients of superhuman bliss.”[26] Furthermore, section 1c shows that Nietzsche does not explicitly defend pointless, destructive suffering, but only the kind of transformative suffering that enhances extraordinary individuals. If Nietzsche saw modern innovations, he might encourage some kinds of transhumanism that reduce our gratuitous and futile suffering, making humans stronger and more joyous.

Despite this essential agreement about some core ideas, Nietzsche’s critiques of the hedonistic imperative would be deep and numerous; a few can be addressed here. First, the transhumanist proposal fails to evaluate all values. It may reject the value of the “natural,” but it does not question most other values and is primarily a continuation of humanist morality. Nietzsche would not accede to this form of simple, egalitarian, utilitarian transhumanism. Second, Nietzsche would likely question the kind of happiness that transhumanism advocates. Will it be the numbing, anesthetic, decadent contentment of the Last Man, who blithely believes “we have invented happiness”? (TSZ pg. 129) For Nietzsche argues this kind of happiness will only throw humanity into a rut it can no longer escape, making our souls “poor and domesticated” so we no longer have enough chaos to “give birth to a dancing star” (TSZ pg. 129). Nietzsche would reject this type of transhumanism, for it uses “the holy pretext of ‘improving’ mankind, as the ruse for sucking the blood of life itself” (EH pg. 342). While not all forms of transhumanism are vulnerable to these critiques, Nietzsche would likely urge caution so we do not stumble into the trap of the Last Man.

Finally, transhumanism may simply be a form of afterworldliness. Some long for an afterworld, a dreamed-of place where suffering will be miraculously relieved, as a desperate flight away from the painful human world we live in. These afterworlds are created as phantasmic compensations for the real suffering of the world: “It was suffering and incapacity that created all afterworlds” (TSZ pg. 143). The inability to deal with or affirm the existing world leads the weary sufferer to abandon this world and dream of another, higher world, a “dehumanized inhuman world which is a heavenly nothing” (TSZ pg. 144). These afterworlds are rooted in a desire to lie about reality that comes from a sense of suffering from reality. But placing supreme value on this afterworld devalues earthly life and makes it meaningless, producing further endless suffering.

Transhumanists may respond that they are not afterworldly, for their proposals are not ideal dreams but can actually be implemented through concrete human actions. Transhumanism may even imbue life with more meaning, for it strives for the kind of brilliant, hopeful future that makes all current efforts immensely important. In consonance with this idea, Zarathustra urged his students to fix all that is mere “dreadful accident” in man, to “work on the future and to redeem with their creation all that has been” (TSZ pg. 310). However, he cautions against manipulative idealistic visions, and condemns the idea of immortality as a “big lie.”[27] In short, some kinds of transhumanism may be sickly, sweet, dishonest, and dripping with impossible idealism. But a more realistic transhumanism that does not passively dream of contentment in some afterworld may be more congruous with Nietzsche’s aspirations for the future.

Transhumanism should avoid being ensnared in dreams of some perfect, unblemished, utopian afterworld.

In conclusion, Nietzsche’s ideas may be compatible with some kinds of transhumanist suffering abolition. But he cautions against the dream of an eventual technological utopia based on the ideal of the cessation of suffering. The plausibility of this utopia is a difficult empirical question; if it is even possible, suffering abolition is tenuous and distant. A significant part of Nietzsche’s rejection of suffering abolition may rest on its implausibility. In the meantime, the dream of the end of suffering can become a passive afterworldliness, and the ideals of the afterworld can vilify the existing world. After all, the transhumanist abolitionists do not fill the interpretative vacuum; they just eliminate the actual suffering. In the process of abolishing suffering, we might undermine our interpretative ability to justify life despite its suffering, and thereby fall into nihilism. Transhumanism cannot instantly abolish suffering, and while we wait, we must make suffering meaningful.

In response, the transhumanist may argue that if we justify suffering too much, we might excessively affirm our existing condition. If we cling to the way we happen to suffer currently, we may be rendered unable to become more than human. Our strict commitment to Nietzschean suffering-affirmation could condemn us to the condition of the Last Man, preventing radical new futures and thwarting the overcoming of man. Making suffering meaningful can function to defend suffering and reduce motivation to prevent extreme, pointless, irredeemable suffering. The solution may be a synthesis: Insofar as suffering exists, we should sublimate it and make our experience of it more positive and growth-producing. But we should also strive to abolish extreme, pointless suffering wherever possible.

Conclusion

There are deep conceptual tensions in Nietzsche’s work: his defense of suffering contrasts with his accolades for joy, and he critiques the abolition of suffering while engaged in a kind of suffering abolition himself. This paper has attempted to explore and resolve these tensions. Just as Nietzsche withdraws his faith in morality “out of morality” (D #4), he withdraws his support for endless joy out of a desire for joy. Happiness alone is not enough, for suffering and joy are not antithetical but symbiotic. Both must be affirmed and sought after together. While suffering abolition is a dangerous proposition, Nietzsche may support some forms of abolition that focus on our pointless suffering. Regardless of the correct answer, probing these paradoxes reveals profound complexities in Nietzsche’s work—and in the human condition.

Bibliography

Beauvoir, Simone de. The Second Sex. New York, NY: Vintage Books, 1949. Trans. by Borde and Chevallier.

Bain, David, Michael Brady, and Jennifer Corns, eds. Philosophy of Suffering: Metaphysics, Value, and Normativity. Routledge, 2019.

Chan, K. Jacky, Marta Y. Young, and Noor Sharif. “Well-being after trauma: A review of posttraumatic growth among refugees.” Canadian Psychology/psychologie canadienne 57, no. 4 (2016): 291.

Carel, Havi, and Ian James Kidd. “Suffering as transformative experience.” Philosophy of Suffering: Metaphysics, Value, and Normativity (2019): 165.

Davis, C. G., Nolen-Hoeksema, S., & Larson, J. (1998). Making sense of loss and benefiting from the experience: Two construals of meaning. Journal of Personality and Social Psychology, 75(2), 561–574.

Hanh, Thich Nhất. The Heart of Understanding: Commentaries on the Prajñaparamita Heart Sutra. Berkeley, California: Parallax Press, 1998. Print.

Hauskeller, Michael. “Nietzsche, the Overhuman and the posthuman: A reply to Stefan Sorgner.” Journal of Evolution and Technology 21, no. 1 (2010): 5-8.

Higgins, Kathleen Marie. Nietzsche’s Zarathustra. Lexington Books, 2010.

Kroo, A., & Nagy, H. (2011). “Posttraumatic Growth Among Traumatized Somali Refugees in Hungary.” Journal of Loss and Trauma, 16(5), 440–458.

Sartre, Jean-Paul. Existentialism is a Humanism. Yale University Press, 2007.

Honderich, Ted, ed. The Oxford Companion to Philosophy. OUP Oxford, 2005.

Elderton, Anna, Alexis Berry, and Carmen Chan. “A systematic review of posttraumatic growth in survivors of interpersonal violence in adulthood.” Trauma, Violence, & Abuse 18, no. 2 (2017): 223-236.

Fosse, Magdalena J. Posttraumatic growth: The transformative potential of cancer. Massachusetts School of Professional Psychology, 2005.

Levinas, Emmanuel. “Useless Suffering.” The Provocation of Levinas (2002): 168-179.

Medina, José. “Varieties of Hermeneutical Injustice.” In The Routledge Handbook of Epistemic Injustice. Routledge, 2017.

Meyerson, David A., Kathryn E. Grant, Jocelyn Smith Carter, and Ryan P. Kilmer. “Posttraumatic growth among children and adolescents: A systematic review.” Clinical psychology review 31, no. 6 (2011): 949-964.

May, Simon. “Why Nietzsche is still in the morality game.” Cambridge University Press (2011).

Nietzsche, Friedrich. “Beyond Good and Evil.” Trans. Walter Kaufmann. Basic writings of Nietzsche (1966).

Nietzsche, Friedrich. On the Genealogy of Morals and Ecce Homo. Trans. Walter Kaufmann and R. J. Hollingdale. New York: Vintage Books, 1967.

Nietzsche, Friedrich Wilhelm. Unfashionable observations. Vol. 2. Stanford University Press, 1998.

Nietzsche, Friedrich. “Daybreak: Thoughts on the prejudices of morality.” Cambridge University Press (1997).

Nietzsche, Friedrich Wilhelm. The Twilight of the Idols; or, How to Philosophize with the Hammer. The Antichrist. Good Press, 2019.

Nietzsche, Friedrich Wilhelm. “Human, All Too Human, I.” Stanford University Press (1997).

Nietzsche, Friedrich Wilhelm. The antichrist. Trans. Walter Kaufmann. Knopf, 1924.

Park, Crystal L., Donald Edmondson, Juliane R. Fenster, and Thomas O. Blank. “Meaning making and psychological adjustment following cancer: the mediating roles of growth, life meaning, and restored just-world beliefs.” Journal of consulting and clinical psychology 76, no. 5 (2008): 863.

Paul, Laurie Ann. Transformative experience. OUP Oxford, 2014.

Schopenhauer, Arthur. The Essays of Arthur Schopenhauer; Studies in Pessimism. Good Press, 2019.

“The Imperative To Abolish Suffering. David Pearce Interviewed By Sentience Research (Dec. 2019).” 2020. Hedweb.Com. https://www.hedweb.com/hedethic/sentience-interview.html.

Appendix

1. The Birth of Suffering

Nietzsche also tells a story about the origins of suffering and its value. Under intense conditions, prehistorical humans developed the view that “voluntary suffering, self-chosen torture, is meaningful and valuable” (D #18). Too much well-being invited mistrust, while hard suffering encouraged confidence. The community’s moral exemplars were those who had the “virtue of the most frequent suffering” (D #18). These individuals needed voluntary suffering, both to inspire belief and to believe in themselves. The practice of pain was a demonstration of overflowing strength and was viewed as a festive spectacle for the sacrifice-loving gods. Nietzsche realizes that we have not yet “freed ourselves completely from such a logic of feeling” (D #18). Even now, every step towards free thought and toward shaping one’s life has to be paid for with spiritual and bodily suffering. Prehistorical eras forged humankind’s character, and this character has not changed since. These eras saw suffering as a virtue, and this is a human instinct that has only been suppressed through civil society.

“Enclosed within the walls of society,” early humans felt that “suddenly all their instincts were disvalued” (GM §2 #16). They were unable to cope with even the easiest challenges in this new world. Civilization undermined the trustworthy instinctual guides that had once provided strength and joy. As he could not trust instincts that were only well-adapted to wilderness, man had to rely on his “most fallible organ,” the conscious mind (GM §2 #16). But his old instincts still needed expression. Thus, they were turned inward. Man’s will to power, hostility, cruelty, joy in attacking, and drive to adventure were directed against himself, creating the “bad conscience” (GM §2 #16). This introduced a new appalling plague: “man’s suffering of man, of himself” (GM §2 #16). Of course, some may question whether this narrative is anthropologically or historically plausible. But even taken as a fable, it reflects important ideas about the nature of suffering.

2. Nietzsche’s Meaning-Making

Even if he had never touched pen to paper, Nietzsche’s ability to affirm his life in the face of immense pain would be a testament to his meaning-making ability. Nietzsche exemplifies the unity of suffering and joy in himself, for he felt that “my health is disgustingly rich in pain,” and despite the near-constant affliction he kept “contemplating life with joy.”[28] In Ecce Homo, he expresses gratitude for his sickness, because it allowed him to develop the skill of “looking from the perspective of the sick toward healthier concepts and values” (EH pg. 233). He attributes his capacity to instigate the revaluation of all values to this ability to reverse perspectives. Nietzsche writes that if “my sickness had not forced me to see reason,” he may have abandoned his great task and become a mere pathetic specialist (EH pg. 239). Both his and Wagner’s incredible creative gifts were enabled only by their capability to endure profound suffering (EH pg. 250). His existence was filled with physical and mental suffering, isolation, excruciating trials, and unknown efforts, and yet he had unabashed love for his fate, and did not give in to yearning for something different, much less for some ideal afterworld.

We continue to live only through illusions: the “pleasure in understanding,” “art’s seductive veil of beauty,” or through some “metaphysical solace” (BT §18). What matters is not the truth of these artistic illusions but their life-affirming nature.

3. Ennui-Stricken Youths

Sometimes ennui-stricken youths have a desire for suffering because it gives them a motive “for doing something” (GS #56). Their imaginations invent monsters “so that they may afterwards be able to fight with a monster” (GS #56). The problem with these “distress-seekers” is that they cannot create distress internally to motivate action, but instead need some external menace – “they always need others!” (GS #56). This desperate need for troubles from outside is ultimately a form of “the nihilistic question ‘for what?’ which is rooted in the old habit of supposing that the goal must be put up, given, demanded from outside—by some superhuman authority.”[29]

4. Buddhism & Suffering

Nietzsche praises Buddhism over Christianity, as it “is a hundred times more austere, more honest, more objective. It no longer has to justify its pains…it simply says, as it simply thinks, ‘I suffer.’”[30] Buddhism does not create a glorious, moralizing, or anesthetic story for suffering. It simply describes suffering without condemning it as the result of sin. It sometimes even affirms suffering in a Nietzschean style. As Zen Buddhist thinker Thich Nhat Hanh writes, “Touch your suffering. Face it directly, and your joy will become deeper.”[31] In Nietzsche’s stated utopia, the “troubles of life will be meted out to those who suffer least from them,” so that those “who are most sensitive to the highest and most sublimated kinds of suffering” will be freed from unnecessary suffering (HH #462).

5. The Jews & Suffering

The Jews are exemplars of this discipline of suffering, as they have converted crisis and oppression into spiritual strength, cultural depth, and moral, ethical, and aesthetic masterworks. As a result of terrible centuries of education, “the psychological and spiritual resources of the Jews today are extraordinary,” and every Jew can look up to exemplars who exhibit courage, endurance, and heroism in the face of the worst situations (D #205). Their suffering has only strengthened their virtue and their conviction in a higher calling.

6. Nietzsche & Levinas

However, Nietzsche’s proposals are not the only ways to make meaning from suffering. Levinas argues that pointless pain can only be made meaningful when it becomes a suffering for the suffering of someone else.[32] Suffering becomes meaningful when the individual recognizes the call to help a fellow-sufferer gratuitously, without any concern for reciprocity. Nietzsche may see this view as a “debilitation and cancellation of the individual” for the sake of the herd, “adapting the individual to fit the needs of the throng” (D #132). Nietzsche’s nuanced and numerous critiques of pity cannot be enumerated here. But other interpretations of suffering, like Levinas’ view, may have other aims and values. It is not clear that Nietzsche’s responses to suffering are the ideal responses for all individuals – and he would likely not defend this claim himself.

7. Responses to Parfit

Derek Parfit argues that “when Nietzsche tried to believe that suffering is good, so that his own suffering would be easier to bear, Nietzsche’s judgment was distorted by self-interest.”[33] However, Nietzsche does not simply assert that suffering is good. As discussed in 1b, Nietzsche is not clearly committed to defending all types of suffering, but only the kind of suffering that promotes meaning, growth, or positive transformation for the kinds of individuals he is concerned with. Thus, Parfit begins with an inaccurate premise.

Furthermore, Nietzsche recognizes that at its core, life is suffering, and the harm of suffering primarily stems from its meaninglessness. He then claims that an individual (and perhaps only some individuals) can imbue their pointless suffering with meaning to affirm existence and make life worth living. Nietzsche would likely admit that he has a vested interest in affirming and making life meaningful even if it does not have an inherent meaning. He does not aim to be an indifferent, unbiased spectator who investigates suffering from a neutral perspective. In fact, he recognizes that for this kind of indifferent spectator, the wisdom of Silenus would be overwhelming and life would seem to be not worth living. In the end, Nietzsche does not claim that he is an unbiased evaluator of life, but instead acts as a deeply interested creator of values seeking to redeem life. Therefore, Parfit’s claim of bias is largely insignificant. However, even if his example of Nietzsche does not hold, Parfit does accurately diagnose a cognitive bias: we tend to overestimate suffering’s value because we need it to be valuable in order to live. Nietzsche could be construed as doubling down on this bias, rendering suffering as supremely meaningful to promote the affirmation of life.

8. Responses to Vinding

Magnus Vinding argues that while meaning and purpose can help keep suffering at bay and make it more bearable, their “ability to reduce suffering should not lead us to consider them positive goods that can justify the creation of more suffering.”[34] First, however, Nietzsche does not accept an overriding imperative to eliminate suffering. Instead, he sees some kinds of suffering as worth experiencing, and focuses on values far beyond pain and pleasure: perhaps great health, the affirmation of life, or the development of the overman.

Second, he may also argue that a constitutive aim of any value-system is to imbue life with meaning, and thus meaning-making is not merely a side pursuit. Having some kind of meaning or reason to live is nearly a prerequisite to any human action, and so this is a strong argument that finding meaning must necessarily be an intrinsic positive good. Perhaps in Nietzsche’s view, the value of meaning or purpose does not reduce to the amount of suffering they prevent. They are intrinsic positive goods beyond just suffering-prevention. Why? Well, the simple argument is this:

  • P1. All thought, action, and ethics require living beings to carry them out. In other words, an action cannot be performed without a being to perform it.
  • P2. Living beings, or at least humans, require some kind of meaning or purpose to remain alive.
  • C1. Without meaning or purpose, thought, action, and ethics cannot be carried out. Thus meaning or purpose are necessary prerequisites to all thought, action, and ethics. Meaning/purpose are therefore ‘ethical priors’ in that without them, one cannot have an ethics.
  • C2. Meaning/purpose are prerequisites for all other goods. It would therefore be logically contradictory for ethics to deny that meaning/purpose are goods.

One could contest P2, arguing that individuals can live without a purpose. This may be the case. However, the premise can be strengthened by adding some caveats: (a) consciously lacking a purpose, (b) while having both the ability and mental capability to end one’s life, (c) will often lead to the end of a person’s life, (d) if the person is in a circumstance that leads to the desire to end their own life (e.g. suffering). Arguably, most humans live with an implicit purpose of some kind, or are unaware of their lack of a purpose (cf. Sartre’s idea of bad faith). Still, P2 certainly remains open to critique.

If the argument above holds, it may be justified in principle to produce some kinds of suffering to develop meaning. This is especially true if meaning is an inherent positive good. However, (a) this will likely only be justified if an individual is producing suffering for themselves voluntarily, and (b) this does not include extreme suffering – especially because under my understanding of extreme suffering, it is meaningless and destructive of purpose almost by definition. Nietzsche’s views are compatible with both (a) and (b).

Finally, insofar as meaninglessness is an essential feature of suffering, adding purpose will always reduce suffering. This makes meaning and purpose such indispensable instrumental goods that they can be functionally treated as inherent goods.

“Have I not changed? Has not bliss come to me as a storm? My happiness is foolish and will say foolish things: it is still young, so be patient with it. I am wounded by my happiness: let all who suffer be my physicians.”[35]

“Like a cry and a shout of joy I want to sweep over wide seas, till I find the blessed isles where my friends are dwelling. And my enemies among them! How I now love all to whom I may speak! My enemies too are part of my bliss.”[36]

The eternal recurrence means that “every pain and every joy and every thought and sigh and everything unutterably small or great in your life will have to return to you.”[37] As Zarathustra asks, “are not all things knotted together so firmly that this moment draws after it all that is to come?”[38] We cannot separate pain and pleasure from each other because they are two aspects of the same process.

Even though “in all ages barbarians were happier,” we fear the return to barbarism because we value knowledge so much that we cannot “value happiness without knowledge.”[39]

A simple hedonic calculus would squander these exemplary individuals. These individuals see beyond immediate consequences and focus on “more distant aims,” even at the “expense of the suffering of others.”[40] For example, they seek knowledge even if this freethinking will make others feel doubt or distress.

The Atonement may assuage his suffering temporarily by making him feel he will not be punished. But ultimately, it will only increase a key cause of suffering: guilt. After all, mankind was already infinitely guilty, and the Atonement makes us also guilty for the death of the son of God.

To buy the sublime happiness of the Greeks, “the most precious shell that the waves of existence have ever yet washed on the shore,” one must be capable of immense suffering.[41]

These afterworlds need not be religious – in Nietzsche’s lifetime, political ideologies like nationalism would also dream up utopian ideals of collective redemption.

Furthermore, vice does not destroy or decay a people, but destruction and decay produce vice as a symptom of this “degeneration of instinct.”[42]

Footnotes

  1. Psychological Observations, pg. 20. In Schopenhauer, Arthur. The Essays of Arthur Schopenhauer; Studies in Pessimism. Good Press, 2019.
  2. Schopenhauer, Psychological Observations, pg. 25.
  3. Ibid, pg. 26.
  4. In Buddhism, the first noble truth is that ‘life is suffering’ or that dukkha (suffering) is an inherent feature of life in samsara (the cycle of earthly existence).
  5. EH, ‘The Birth of Tragedy,’ §34. Pg. 274.
  6. Haidt, Jonathan. The happiness hypothesis: Finding modern truth in ancient wisdom. Basic books, 2006.
  7. Taleb, Nassim Nicholas. Antifragile: Things that gain from disorder. Vol. 3. Random House Incorporated, 2012.
  8. See Kroo and Nagy (2011); Fosse (2005); Chan, Young, and Sharif (2016); Elderton et al. (2017); Meyerson et al. (2011); Davis et al. (1998); Park et al. (2008).
  9. Notes to HH, fall 1885-86, in the Stanford translation of HH.
  10. See Paul, Laurie Ann. Transformative experience. OUP Oxford, 2014. Also see Carel & Kidd, “Suffering as Transformative Experience,” in Bain, David, Michael Brady, and Jennifer Corns, eds. Philosophy of Suffering: Metaphysics, Value, and Normativity. Routledge, 2019.
  11. Bain, David, Michael Brady, and Jennifer Corns, eds. Philosophy of Suffering: Metaphysics, Value, and Normativity. Routledge, 2019.
  12. “Genetic Fallacy.” In Honderich, Ted, ed. The Oxford companion to philosophy. OUP Oxford, 2005.
  13. Hume, David. A Treatise on Human Nature: 1. Longmans, 1874. Pg. 335.
  14. Coronado, Amena. “Suffering & The Value of Life.” PhD diss., UC Santa Cruz, 2016. Pg. vi.
  15. Medina, José. “Varieties of Hermeneutical Injustice.” In The Routledge Handbook of Epistemic Injustice. Routledge, 2017. Pg. 41.
  16. Levinas, pg. 157.
  17. Gleibs, Ilka H., Thomas A. Morton, Anna Rabinovich, S. Alexander Haslam, and John F. Helliwell. “Unpacking the hedonic paradox: A dynamic analysis of the relationships between financial capital, social capital and life satisfaction.” British Journal of Social Psychology 52, no. 1 (2013): 25-43.
  18. This proto-existentialist maxim came before Sartre’s statement that “existence precedes essence” in Existentialism is a Humanism, but it conveys a similar idea.
  19. Notes to HH, in the Stanford edition of HH pg. 343.
  20. TSZ, pg. 333.
  21. Zizek, Slavoj. First As Tragedy, Then As Farce. Verso Books, 2009. Pg. 6.
  22. Pearce, David. Hedonistic Imperative. David Pearce., 1995.
  23. Letter to von Stein, as cited in Higgins, Kathleen Marie. Nietzsche’s Zarathustra. Lexington Books, 2010. Pg. 8.
  24. May, Simon. “Why Nietzsche is still in the morality game.” Cambridge University Press (2011).
  25. Carel, Havi, and Ian James Kidd. “Suffering as transformative experience.” Philosophy of Suffering: Metaphysics, Value, and Normativity (2019): 165.
  26. “The Imperative to Abolish Suffering. David Pearce Interviewed by Sentience Research (Dec. 2019).” 2020. Hedweb.Com. https://www.hedweb.com/hedethic/sentience-interview.html.
  27. Hauskeller, Michael. “Nietzsche, the Overhuman and the posthuman: A reply to Stefan Sorgner.” Journal of Evolution and Technology 21, no. 1 (2010): 5-8.
  28. Letter of January 22, 1879. In footnote, Portable Nietzsche, pg. 110.
  29. Nietzsche, Friedrich. Sämtliche Briefe: Kritische Studienausgabe. Walter de Gruyter GmbH & Co KG, 2015. 12:9[43]. Pg. 355.
  30. The Antichrist, #23.
  31. Hanh, Thich Nhất. The Heart of Understanding: Commentaries on the Prajñaparamita Heart Sutra. Berkeley, California: Parallax Press, 1998. Print.
  32. Levinas, Emmanuel. “Useless Suffering.” The Provocation of Levinas (2002): 168-179. Pg. 163.
  33. Parfit, Derek. On what matters. Vol. 1. Oxford University Press, 2011. Chapter 126.
  34. Vinding, Magnus. “Suffering-Focused Ethics: Defense and Implications.” Ratio Ethica (2020). Pg. 147.
  35. TSZ, pg. 196.
  36. TSZ, pg. 196.
  37. GS, #341.
  38. TSZ, pg. 270.
  39. Daybreak, Book IV, #429.
  40. Daybreak, Book II, #146. See also D, Book IV, #467: “You will cause a lot of people pain that way.- I know it; and know as well that I will suffer doubly for it, once from compassion with their suffering and then from the revenge they will take on me. Nevertheless, it is no less necessary to act as I am acting.”
  41. GS, #302.
  42. Ibid.