Modelling the outcomes of animal welfare interventions: One possible approach to the trade-offs between subjective experiences

May 27, 2024

Updated: Oct 11, 2024

This article describes the framework we at Animal Ask use to quantify the subjective experiences of animals. We publish it so we can be transparent about our research methods, allowing others to see how we reach particular conclusions and to improve on our methods.

Author: Ren Ryba

Executive summary

At Animal Ask, we spent most of 2023 conducting prioritisation research to identify the most promising goals and strategies for the animal advocacy movement. The aim of that project was to identify goals and strategies that could be more impactful than the current leading campaigns (e.g. cage-free campaigns). If we succeed, we can unlock new opportunities for the movement to help even more animals.

This article outlines one component of the methodology we used for that research. Specifically, to guide our research on new interventions for the animal advocacy movement, we needed a framework that allows us to quantify the subjective experiences of animals. For example, if we were comparing two campaigns—say, a) phasing out fast-growing breeds to reduce suffering in broiler chickens and b) implementing more humane pesticides to reduce suffering in wild insects killed on agricultural land—we would need to produce a quantitative estimate of the potential impact of these two campaigns.

In essence, our framework allows us to systematically compare different campaign opportunities in how good they are for animals—and these comparisons can be made across different species, across different intensities of experience (e.g. mild vs extreme suffering), and across both positive and negative experiences. The framework allows us to be clear and transparent about the worldviews, philosophical positions, and/or empirical assumptions that are used in drawing conclusions about particular campaigns.

In this report, we outline our framework. The key characteristics of the framework are:

The framework is cumulative, meaning that it considers the total duration of an animal's experience over time. In fact, our framework is heavily based on the existing Cumulative Pain framework, which was developed by the researchers Wladimir J. Alonso and Cynthia Schuck-Paim.
The framework allows experiences to be adjusted. We can place different moral weights on more intense experiences—for example, we might think that preventing extreme suffering is more important than preventing mild suffering, all else being equal.
The framework incorporates the fact that uncertainty is high, and society has simply not made much progress on many of the core questions relating to trade-offs between different subjective experiences.

We advise serious caution when applying the ideas from our framework (or any framework on this topic). As any statistician knows, no model is a perfect representation of reality. When we're looking at the specific question of moral trade-offs between different subjective experiences, things get seriously murky. We view this framework as a tool to help us think quantitatively, subject to the serious limitations to society's knowledge.

We chose to describe our framework in this article so we can be transparent about our research methods, allowing others to see how we reach particular conclusions and to improve on our methods. We discuss the many limitations of our framework at the end of this article. Due to these limitations, this framework is only a small part of our research process. When evaluating animal advocacy interventions, we consider a range of other information, much of which we were already using before this point. This includes qualitative factors (e.g. weight-of-evidence, tractability, scalability, and so on) and quantitative measures (e.g. individual animals helped, animal-years improved). The framework is mostly derived from components that are already published or well-established in the animal advocacy movement; for readers already familiar with research methods in animal advocacy, this article mostly revisits familiar ground.

The three areas of research we draw upon (e.g. Rethink Priorities' welfare ranges, and Welfare Footprint Project's thinking on intensity-duration tradeoffs) are still topics of active scientific and philosophical development, so there will probably be substantial advances over the next couple of years that will require us to modify our framework. (In fact, this article was written in early 2023 and posted in early 2024, so there might be important, recent developments that are not included in this article.)

Summary of our framework

Our framework operates as follows:

We begin with a long list of ideas for new or existing animal advocacy campaigns.
For each campaign, we calculate the cumulative time in pain that would be prevented for an animal affected by the campaign, and the cumulative time in pleasure that would be created. This closely follows the Cumulative Pain method published by Alonso and Schuck-Paim, except we have added pleasure categories to complement the existing pain categories. Pain and pleasure are each considered as four categories of different intensities (eight categories total).
We multiply these numbers by the reach of the campaign (i.e. number of animals who would be directly affected) to calculate the total cumulative time in pain prevented, and pleasure caused, by the campaign.
We consider the moral weightings associated with each pain or pleasure category. In practice, we consider numerous sets of weightings, each associated with a particular worldview about the moral value of different intensities of pain and pleasure. This lets us convert the four pain categories and the four pleasure categories to a single, common metric.
We consider our credence in each of these worldviews, as well as a range of "soft", qualitative factors (e.g. tractability, scalability, and so on) to arrive at final rankings of campaign ideas.
By including established, high-impact campaigns (e.g. cage-free hens, slow-growing broiler chickens) in this process, we can see how new campaign ideas compare to these existing campaigns.
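
The second and third steps above (per-animal time in each category, scaled by reach) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Animal Ask's actual tooling: the function name `total_hours_prevented` and every number are hypothetical, and the pleasure categories are left out for brevity.

```python
from typing import Dict

# Pain category names follow Alonso and Schuck-Paim.
PAIN_CATEGORIES = ("Annoying", "Hurtful", "Disabling", "Excruciating")


def total_hours_prevented(hours_per_animal: Dict[str, float], reach: int) -> Dict[str, float]:
    """Scale per-animal hours of pain prevented by the number of animals reached."""
    return {
        category: hours_per_animal.get(category, 0.0) * reach
        for category in PAIN_CATEGORIES
    }


# Hypothetical campaign: all numbers are made up for illustration.
per_animal = {"Annoying": 20.0, "Hurtful": 5.0, "Disabling": 0.5}
totals = total_hours_prevented(per_animal, reach=1_000)
print(totals)
```

The result keeps the four categories separate; collapsing them into a single number only happens once a set of moral weightings is chosen.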

The framework is summarised visually in the following flowchart. (If you would prefer to see the framework demonstrated using an example, see the final section of this report: "A toy example").

Our framework in detail

In this section, we describe the background to our framework and provide some additional details.

To guide our research on new interventions for the animal advocacy movement, we need a framework that allows us to quantify the subjective experiences of animals. For example, if we were comparing two campaigns, say a) phasing out fast-growing breeds of broiler chickens and b) implementing more humane pesticides for use on crops, we would need to produce a quantitative estimate of the potential impact of these two campaigns.

Note that we also consider many qualitative factors—this quantitative estimation represented by our framework is only one piece of information among many that we use.

To produce these quantitative estimates, we wanted to find a framework for quantifying the subjective experiences of animals that met the following criteria:

Captures the intensity of animals' experiences. More intense experiences seem to matter more than less intense experiences, all else being equal. Also, a single experience might have a different intensity at different points in time.
Captures the duration of animals' experiences. Longer experiences seem to matter more than shorter experiences, all else being equal.
Allows us to consider many different animal species. Our research involves both species whose physiology and welfare are well understood (e.g. chickens) and species for which they are poorly understood (e.g. black soldier flies). We do not want to be limited to only the well-understood species.
Allows us to consider many different interventions for any particular animal species. For example, if we were looking at farmed carp, we might want to compare improving water quality to implementing humane slaughter. These have very different intensities and durations, but we need to compare both.
Captures both negative and positive welfare. Some interventions focus entirely on preventing suffering (e.g. implementing humane slaughter), while other interventions might also create pleasure (e.g. environmental enrichment). We want to consider both of these aspects, even if we might not place the same moral importance on creating pleasure versus preventing suffering (see the next point).
Allows us to incorporate multiple, competing worldviews in a transparent way. There are major debates around what the "best" moral worldviews are, and even how to make decisions in light of this moral uncertainty (1,2). For our purposes, we think the best way to account for this uncertainty is "worldview diversification", in which we systematically incorporate information from all major/plausible moral worldviews. So, our framework needs to allow for this. For example, different worldviews might make different tradeoffs between the intensity and duration of pain, or between the value of pleasure versus pain. If a campaign idea looks particularly strong regardless of these tradeoffs (3), then we can have more confidence in that idea.
Allows us to place different moral weight on different animal species. It is plausible that the experiences of some animal species might have more or less moral importance than those of other animal species, though this is a topic of active research and debate. For example, see the work by Rethink Priorities (4) and Open Philanthropy (5).
Can produce estimates for both welfare improvements and lives averted. For example, we might want to compare improving chicken welfare, which improves the lives of chickens, versus reducing chicken consumption, which means that fewer chickens are brought into existence.
Has existing estimates for leading animal advocacy campaigns. Our current goal as an organisation is to identify campaigns that might have an impact comparable to, or greater than, existing high-impact animal advocacy campaigns, so having access to estimates for these leading campaigns can provide a useful benchmark against which to compare new candidate campaigns.
Can be applied relatively quickly. During our research process, we will examine hundreds of campaign ideas each year.
Can be challenged and debated as new empirical data arises. Since most animal species and campaign ideas are poorly studied, it is unavoidable that our estimates will often be based on limited information. However, it is important that these estimates are transparent, so they can be criticised by others and updated as new data emerges.

Our framework was designed to meet these criteria and the needs of our research process at Animal Ask. Other researchers in animal advocacy and/or effective altruism probably have different needs, so it would make sense for them to use a different framework. Some examples of other frameworks are described by Charity Entrepreneurship, an organisation that conducts research similar to the research project at Animal Ask that is motivating the present report (6).

Our framework is cumulative, adjusted, and incorporates both pleasure and pain. We will now describe what we mean by each of these three aspects.

Cumulative Pain: The basis of our framework

The basis of our framework is the Cumulative Pain metric. This metric is, itself, based on the Pain-Track framework. These tools have been developed and published by Wladimir J. Alonso and Cynthia Schuck-Paim, who run the Welfare Footprint Project.

The key article on Cumulative Pain is available as a preprint (7), which also refers to the publication on Pain-Track (8). We will summarise some points that are most relevant to our purposes:

Cumulative Pain is, itself, based on Pain-Track. Pain-Track describes pain as a time series, where a particular experience can be visualised by the intensity of pain at each point in time. This tracks the evolution of a pain experience over time. If you add up the time spent at each intensity level of pain, you get the Cumulative Pain for each intensity level.
"Pain" is defined broadly to mean any negative affective state, so it can include physical pain as well as psychological pain (e.g. fear, anxiety, frustration).
Cumulative Pain and Pain-Track consider four categories of pain intensity: Annoying, Hurtful, Disabling, and Excruciating. We give the full definitions of these categories in the table below. The four categories were "grounded on (evolutionary) principles that should be common to most pain experiences: the disruptive character of the pain experience and its effectiveness to promote adaptive behaviours." The authors give evolutionary and physiological reasons to justify why "more unpleasant sensations should be in general more disruptive". The names of the categories themselves are chosen to "evoke an empathic appreciation of intensity". Though the authors give four categories, they point out that these could be divided into even more categories in the future.
This definition of pain, and the four categories, provide a common biological meaning that allows comparisons to be made between different species and different interventions. Likewise, the framework allows pain to be considered for both an individual animal and a population (by considering the prevalence of a particular welfare issue, or the scale and reach of a particular campaign).
The authors say that it might be possible to combine the categories of pain into a single metric, but "such an exercise requires understanding the numerical relationship among the intensity categories in terms of the aversiveness they cause: how much worse is the hurtful experience compared to an annoying or disabling pain?; or how long should an individual endure an annoying pain to make it equivalent to a few minutes of excruciating pain?". Therefore, the authors do not attempt to combine the categories of pain in this way—but we do.
In the same way, the authors mention pleasure, but they point out that incorporating pleasure is "hindered by the challenges to establish the equivalence between positively and negatively valenced states (e.g., how much time of pleasure would be necessary to compensate an hour of torture-like pain?)". The authors do not attempt to incorporate pleasure—but we do.
Like any framework, Cumulative Pain and Pain-Track have limitations: "noise, biases and confounding factors will blur access to a realistic and accurate depiction of the pain experience", and the validity and reliability have yet to be tested. In fact, the numbers assigned using this framework to a particular experience should be considered as an initial hypothesis, which "can be made more precise as knowledge becomes available, or [...] challenged".
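
The step from a Pain-Track time series to Cumulative Pain (adding up the time spent at each intensity level) can be illustrated as follows. This is a simplified sketch, assuming an episode is represented as a list of (duration in hours, category) segments; that representation, the function name `cumulative_pain`, and the numbers are hypothetical and are not taken from the Welfare Footprint Project's materials.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed representation: one segment = (duration in hours, intensity category).
Segment = Tuple[float, str]


def cumulative_pain(segments: Iterable[Segment]) -> Dict[str, float]:
    """Sum the time spent in each intensity category across an episode."""
    totals: Dict[str, float] = defaultdict(float)
    for duration_hours, category in segments:
        totals[category] += duration_hours
    return dict(totals)


# Hypothetical episode whose intensity changes over time.
episode = [
    (0.25, "Excruciating"),
    (2.0, "Disabling"),
    (6.0, "Hurtful"),
    (2.0, "Disabling"),
    (24.0, "Annoying"),
]
result = cumulative_pain(episode)
print(result)
```

Note that the two separate Disabling segments are merged into one total, which is what makes the metric cumulative rather than a description of the episode's shape.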

This table gives the full definition of each category of pain, as expressed by Alonso and Schuck-Paim in their preprint article on Cumulative Pain (7).

Pleasure: Adding positive welfare

As Alonso and Schuck-Paim pointed out, they chose not to include categories for pleasure. In practice, we also don't pay much attention to pleasure, since pleasure simply doesn't come up much during our typical animal advocacy research. However, we'll briefly talk about pleasure in this section, and we'll make some suggestions for how you could incorporate pleasure into a framework like ours, if you were so inclined.

Pleasure might be relevant for some campaign ideas. For example, some campaigns involve providing environmental enrichment to farmed animals, and one function of environmental enrichment may be to give the animals opportunities for pleasure. Therefore, we want our framework to incorporate pleasure.

The value of pleasure compared to pain varies depending on your worldview, which in turn depends on your philosophical position and empirical beliefs. For example, some worldviews suggest that creating pleasure can compensate or "make up" for the existence of pain, while other worldviews think that preventing pain is far more important than creating pleasure. Most people do believe that, all else being equal, more pleasure is a good thing. Many experiments have shown that farmed animals often pay a cost in exchange for pleasure, suggesting that some forms of pleasure might outweigh, at least, some mild forms of pain. So, it is reasonable to expect that there might be some moral value to pleasure, even if this is not necessarily guaranteed (see next section for more details on these trade-offs).

Alonso and Schuck-Paim, in developing the categories of pain, use the following principles: "the disruptive character of the pain experience and its effectiveness to promote adaptive behaviours". These principles were chosen because they are, it was argued, common to all pain experiences.

Therefore, we might draw on similar principles in developing the categories of pleasure. Like pain, pleasure does appear to promote adaptive behaviours. But rather than indicating harm or danger as pain does, pleasure indicates usefulness. As such, pleasure does not disrupt—it attracts. So, for developing categories of pleasure, we might use these principles: "the attractive character of the pleasure experience and its effectiveness to promote adaptive behaviours". (The evolutionary purpose of pleasure is a complex topic, and we do not do justice to the nuanced literature here, but one useful source is the seminal article by Cabanac (9)). And, as with the names given by Alonso and Schuck-Paim to the different pain categories (Annoying, Hurtful, Disabling, Excruciating), we can also pick "terms that evoke an empathic appreciation of intensity".

We make two further decisions in developing the categories of pleasure. Firstly, we choose to use four pleasure categories. Secondly, we define each pleasure category as morally equivalent to a pain category. For example, we might hypothesise that there exists a level of pleasure (let's say "Agreeable") that an animal would choose to experience for one hour, even if it meant experiencing one hour in the Annoying pain category. Roughly speaking, an individual in the Agreeable category is as happy as an individual in the Annoying category is sad—and so on, for the other three pairs of pain and pleasure categories. We could therefore hypothesise the existence of four different levels of pleasure, each of which is equivalent to a pain category in this way.

Note that we are not definitively claiming that all of these levels of pleasure exist, or that they have any particular moral importance. Philosophically, many authors have argued that some pains are so extreme that they cannot be compensated for by any amount of pleasure. Future empirical work might, for example, show experimentally that animals would never pay Excruciating-category pain for any pleasure, no matter how blissful. Or we might discover that pleasure and pain might be fundamentally incomparable, as some authors have suggested. On the other hand, other authors have argued that pleasure and pain are roughly symmetrical. Since we want to consider all of these competing worldviews (see the following subsection), it is important to allow for all of these possibilities in our framework. The framework still allows for worldviews that do not place much importance on pleasure, as those worldviews can simply assign a small or zero weighting to particular categories of pleasure.

The way that we've constructed our pleasure categories is based on the fact that we, at Animal Ask, are developing this framework for our particular purpose (conducting prioritisation research to identify new, promising campaigns for the animal advocacy movement that might be highly effective). This approach is rather simplistic and ad hoc. We are not claiming that this is the overall best way to come up with pleasure categories. Indeed, we would be very happy for somebody to come up with a better way.

Therefore, here are the four pleasure categories that could be used in a framework like ours:

Adjusted: Making trade-offs between different intensities of pain and pleasure

Alonso and Schuck-Paim did not attempt to combine the different categories of pain into a single, composite number. For a review of the challenges in doing so, see the recent post by Schuck-Paim et al. One way to circumvent this problem is offered by Hamilton in a separate post.

However, we would very much like to combine the different categories of pain into a single, composite number. If we are comparing an intervention that prevents a large amount of Annoying pain or a small amount of Excruciating pain, we need a way to estimate how morally bad each of those outcomes might be.

To make quantitative trade-offs between different intensities of pain (and pleasure), we need to assign numerical weightings to the different intensities of pain. For example, if somebody thinks that preventing one hour of Hurtful pain is morally worth the same as preventing 10 hours of Annoying pain, then we could express the weighting on the Hurtful category as 10.

Nobody knows what the "correct" weightings are, if such a thing even exists. In assigning these weightings, we are moving from describing something empirical (the time spent in pain) to making normative, ethical judgments (whether we should prevent one experience or another), though such ethical judgments can be guided by empirical evidence. These weightings will depend heavily on people's philosophical values (e.g. is it morally acceptable to prevent a large amount of mild pain, if this means allowing a small amount of extreme pain to exist? (10)) and people's empirical hypotheses about how minds work (e.g. how unpleasant is the worst possible pain, compared to mild pain? (11–13)). Evidence testing these empirical hypotheses is, unfortunately, quite lacking.

Our approach is to use worldview diversification. There are numerous worldviews, each of which answers these questions in different ways. As such, if we take the perspective of a particular worldview, we could therefore assign a particular set of weightings to the different categories of pain (and pleasure). If we do this for every worldview, we will then end up with a whole collection of weightings. We can incorporate all of these different weightings based on how plausible we find their corresponding worldviews. In other words, although nobody knows what the "correct" weightings are, we can still develop different sets of weightings that seem plausible and incorporate them in a transparent way.

Firstly, some definitions:

When we say "worldview", we mean something roughly like "a belief about what experiences matter morally and how much they matter." For example, one worldview might be "pleasure and pain both matter", while a different worldview might be "only intense pain matters". You can typically arrive at a particular worldview from a variety of initial philosophical positions and/or empirical hypotheses.
When we say "philosophical position", we mean one of the formal positions that have been defined in the academic literature in philosophy. For example, we think of traditional utilitarianism and negative utilitarianism as two philosophical positions.
When we say "empirical hypothesis", we mean a view about how the world works. Empirical hypotheses in this context might relate to how the brain processes positive and negative experiences, or about whether particular experiences are common in the world.
When we say "a set of weightings", we mean a collection of numbers that defines how a worldview might make numerical tradeoffs between different intensities of pleasure and pain, using the categories defined above.
For pain, we construct our weightings using disabling pain as the baseline. For a particular category of pain, if we assign some number X to that category, then we are saying that preventing X hours of pain of that category and 1 hour of disabling pain are equivalent. For example, placing a weight of 10 on the Hurtful category would be saying that preventing 10 hours of Hurtful pain is morally the same as preventing 1 hour of Disabling pain.
In the same way, for a particular category of pleasure, if we assign the weighting Y to that category, then we are saying that causing Y hours of pleasure in that category and 1 hour of pleasure in the Delightful category are equivalent.
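
The weighting convention for pain described above can be made concrete with a short sketch. Under that convention, a weighting of X means X hours in a category are equivalent to 1 hour of Disabling pain, so hours are divided by X to reach a common unit. The function name `disabling_equivalent_hours` and the specific weightings are hypothetical, chosen only to show the arithmetic; `None` is an assumed way to encode a category that a worldview treats as having no moral relevance.

```python
from typing import Dict, Optional

# Hypothetical set of weightings for one worldview (numbers are made up).
# Convention from the text: X hours in a category are equivalent to
# 1 hour of Disabling pain. None means "no moral relevance" (an assumption
# about how to encode such a category).
WEIGHTINGS: Dict[str, Optional[float]] = {
    "Annoying": 100.0,
    "Hurtful": 10.0,
    "Disabling": 1.0,
    "Excruciating": 0.01,
}


def disabling_equivalent_hours(
    hours: Dict[str, float], weightings: Dict[str, Optional[float]]
) -> float:
    """Convert hours in each pain category to Disabling-equivalent hours."""
    total = 0.0
    for category, category_hours in hours.items():
        x = weightings.get(category)
        if x is None:
            continue  # category missing or given no moral relevance
        if x <= 0:
            raise ValueError(f"weighting for {category} must be positive")
        total += category_hours / x
    return total


hours_prevented = {"Annoying": 200.0, "Hurtful": 30.0, "Disabling": 2.0}
print(disabling_equivalent_hours(hours_prevented, WEIGHTINGS))  # 7.0
```

With these illustrative numbers, 200 hours of Annoying pain, 30 hours of Hurtful pain and 2 hours of Disabling pain contribute 2, 3 and 2 Disabling-equivalent hours respectively.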

What philosophical positions do we want to consider? Here is a short list of popular philosophical positions. We focus more on the practical implications of these positions, rather than giving precise philosophical definitions. Definitions for the first three positions listed here are adapted from the articles by Toby Ord (14) and Simon Knutsson (15).

Traditional utilitarianism: "Equal weight to happiness and suffering". (Note: Knutsson emphasises that this is one possible definition of traditional utilitarianism, not the only one.)
Absolute negative utilitarianism: "Only suffering counts." (Note: This also includes lexical negative utilitarianism, under which suffering and happiness both count, but no amount of happiness can outweigh any amount of suffering. For our purposes, these views are basically the same, because our research aims to advise funding decisions, and any money spent on improving happiness is money that is not spent on reducing suffering.)
Weak negative utilitarianism, or negative-leaning utilitarianism: "Suffering and happiness both count, but suffering counts more. [...] perhaps some nonlinear function which shows how much happiness would be required to outweigh any given amount of suffering." We do not consider this philosophical position further—in the context of animal advocacy, where very few interventions focus on generating happiness rather than preventing or alleviating suffering, we feel that this worldview would not add much useful information beyond that contributed by the above two worldviews.
Additionally, many people think that the most urgent thing is to reduce the suffering of the worst-off. For these people, preventing suffering is always more important than creating happiness, and preventing extreme suffering is always more important than preventing mild suffering. We consider this to be an additional worldview because there could be many philosophical and/or empirical positions that lead to this type of worldview, some of which are described by Animal Ethics (16). This type of worldview can give us a few sets of weightings depending on a) where we draw the line between mild and extreme suffering, and b) how much we weigh different intensities of extreme suffering. We don't want to get hung up on the precise philosophical definition of this worldview; this is more a matter of practical application. That is, under this view, extreme suffering is so widespread that the distinction doesn't matter for practical decision-making in 2024.

There are also a couple of empirical hypotheses that might guide our weightings. Intuitively, many lay people might think that the most intense pain is only several times more awful than mild pains (e.g. 5-10x more awful). By contrast, Andrés Gómez-Emilsson (11) argues that "the most intense pains are orders of magnitude more awful than mild pains (and symmetrically for pleasure)". This latter view is closer to the (informal) views of many members of the animal advocacy community (see below). This is also similar to the view that "intense suffering can be in some ways qualitatively different and more serious than lesser suffering in a way that isn’t really captured by a linear pain scale" (17). So, across our worldviews, we consider two main empirical hypotheses:

One order of magnitude: Compared to disabling pain, the moral value of each other category of pain is within one order of magnitude. This roughly approximates the view that more intense suffering is only a bit more unpleasant than mild suffering (or, conversely, that more intense happiness is only a bit more pleasant than mild happiness).
Many orders of magnitude: Compared to disabling pain, the moral value of each other category of pain spans multiple orders of magnitude. This roughly approximates the view that more intense suffering is much, much more unpleasant than mild suffering (or, conversely, that more intense happiness is much, much more pleasant than mild happiness).

In the following table, we show what some rough sets of weightings might look like for different worldviews. We do not claim that these sets of weightings are the most accurate representations of these worldviews—rather, these are simply plausible interpretations, based on our intuitions. This is why we give multiple sets of weightings for each worldview. Please let us know if you think any of these worldviews (or any other major worldview you are aware of) could be more accurately represented in this table.

Also, the differences between sets of weightings don't matter too much. In our approach, we rank each campaign idea according to each worldview, then we compare the rankings—we do not compare the "total utility" of campaign ideas across worldviews. It is the relative differences between numbers within a set of weightings that matter.
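
The idea of ranking campaigns within each worldview, and then comparing rankings rather than totals, can be sketched as below. The worldview labels, campaign names and scores are all hypothetical, and `robust_top_k` is just one assumed way of flagging campaigns that look strong under every worldview; it is not a description of how final decisions are made.

```python
from typing import Dict, List, Set


def rank_campaigns(scores: Dict[str, float]) -> List[str]:
    """Return campaign names ordered from highest to lowest score."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


def robust_top_k(rankings: Dict[str, List[str]], k: int) -> Set[str]:
    """Campaigns that appear in the top k under every worldview."""
    top_sets = [set(ranking[:k]) for ranking in rankings.values()]
    return set.intersection(*top_sets) if top_sets else set()


# Hypothetical scores. Units differ between worldviews, so only the
# ordering within each worldview is compared, never the raw totals.
scores_by_worldview = {
    "worldview_1": {"campaign_a": 120.0, "campaign_b": 80.0, "campaign_c": 15.0},
    "worldview_2": {"campaign_a": 3.0, "campaign_b": 0.5, "campaign_c": 9.0},
}
rankings = {w: rank_campaigns(s) for w, s in scores_by_worldview.items()}
print(rankings)
print(robust_top_k(rankings, k=2))  # {'campaign_a'}
```

Here `campaign_a` is first under one worldview and second under the other, so it is the only idea that stays in the top two regardless of which set of weightings is used.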

The numbers in the below table are based on our intuitive interpretations of the worldviews that we considered. There are almost no studies that can be used to rigorously derive empirical estimates of these sets of weightings. As such, we are forced to depend on our own intuitive interpretations of the worldviews that we considered, which are guided by two main sources of information:

Firstly, we were strongly guided by Schuck-Paim et al's interpretation of the pain study conducted by Wallenstein et al (18). Alonso and Schuck-Paim calculate a set of weightings that would be inferred by the results of that study, resulting in 94–496 hours for Annoying pain and 8–64 hours for Disabling pain (with no estimate of Excruciating pain available). According to Schuck-Paim et al, there are almost no studies beyond Wallenstein et al (18) that a) can be used to infer a set of weightings in this way and b) were conducted on people actually experiencing pain, rather than imagining hypothetical scenarios.
Secondly, we were weakly guided by weightings that other people have come up with informally. Most of these are from an informal survey we conducted of the well-informed people in the shared office where we work. A couple of these are taken from informal guesstimates made in online posts (19–22). We use these to loosely guide our intuitions in the absence of more rigorous empirical evidence, certainly not as reliable estimates that could only be produced by formal empirical studies. We are very enthusiastic about replacing our informal, intuitive weightings with stronger evidence once such studies become available.

In the following table, we construct our pain weightings using disabling pain as the baseline. For a particular category of pain, if we assign some number X to that category, then we are saying that preventing X hours of pain of that category and 1 hour of disabling pain are equivalent. In the same way, for a particular category of pleasure, if we assign the weighting Y to that category, then we are saying that causing Y hours of pleasure in that category and 1 hour of pleasure in the Delightful category are equivalent. A dash (-) indicates that a particular worldview considers a particular category of pain or pleasure to have no moral relevance. In worldview N, the zero indicates that only that category of pain has moral relevance.

Some notes on this table:

In practice, we don't really use the "pleasure" categories, as they're less relevant to the research that we conduct on a daily basis. I included them for illustrative purposes.
The last column gives an idea for how plausible I (Ren) find this worldview, based on perhaps ~50 hours of reading and reflection on this question. These are not the views of the rest of my team at Animal Ask. I expect that there would be significant divergence in views within my team and, more broadly, within the effective animal advocacy community—this is an illustration.

How do we make comparisons across different animal species?

Welfare ranges

To make comparisons of subjective experiences across species, the most comprehensive and relevant evidence comes from the Moral Weight Project conducted by Rethink Priorities (23) and the associated welfare range estimates (4). As the authors explain, "Welfare ranges allow us to convert species-relative welfare assessments, understood as percentage changes in the portions of animals’ welfare ranges, into a common unit." Welfare ranges express "the relative peak intensities of different animals’ valenced states at a given time"—it is repeatedly emphasised that this is a very different thing to the all-encompassing "strength of our moral reasons to benefit members of one species rather than another."

Rethink Priorities does provide actual estimates of the welfare ranges for a number of species relevant to animal advocacy. These estimates are placeholders: "Our view is that the estimates we’ve provided should be seen as placeholders—albeit, we submit, the best such placeholders available" (4). The authors emphasise that these estimates are influenced by the availability of evidence and that these estimates are likely to change over time as the evidence grows. For this reason, we do not believe that it is appropriate to uncritically apply these estimates (a view shared by the authors themselves).

However, the authors do draw some conclusions that are important for our purposes:

"No one should be very confident in any estimate of a nonhuman animal’s welfare range. We know far too little for that. However, we’re reasonably confident about some things. Given hedonism and conditional on sentience, we think (credence: 0.7) that none of the vertebrate nonhuman animals of interest have a welfare range that’s more than double the size of any of the others. [...] Given hedonism and conditional on sentience, we think (credence 0.6) that all the invertebrates of interest have welfare ranges within two orders of magnitude of the vertebrate nonhuman animals of interest. Invertebrates are so diverse and we know so little about them; hence, our caution." (4)

So, we don't really have enough evidence to produce reliable, species-specific welfare ranges. We are currently operating at a "low resolution" understanding of animals' welfare ranges—we can't reliably assign a different number to each species, but we do understand a little bit about the difference between vertebrate animals and invertebrate animals.

As the extract above shows, the authors believe that the "true" welfare ranges of different vertebrate species tend to be similar, and that the "true" welfare ranges of invertebrate animals are likely to be no less than 0.01x those of vertebrate animals. This is the conclusion that we will apply in our framework.

Essentially, this means that we may need to apply a "discount" to the welfare range of invertebrate animals that we consider. For example, if we are choosing between improving chicken welfare and improving beetle welfare, this evidence may be a reason to think that chickens are capable of having more intense experiences (though, as should be clear, there are numerous other factors that can tip the balance in favour of either of those two interventions).

Let us turn our attention to the weightings that we place on different categories of pain/pleasure (above section, "Adjusted: Making trade-offs between different intensities of pain and pleasure"). The categories themselves relate to the intensity of experience. However, the weightings on those categories are expressed as the number of hours of a particular category of pain/pleasure that would be required for the average individual of a species to be indifferent between that experience and 1 hour of Disabling pain/Delightful pleasure. Therefore, our pain/pleasure weightings are expressing preferences, not intensities.

The work by Rethink Priorities typically refers to a continuous scale of pain/pleasure intensity, whereas we have something different—namely, discrete categories of pain/pleasure intensity that are only related quantitatively through preferences. What is 10% of Excruciating-level pain? We don't know.

With this challenge in mind, if we want to make comparisons between invertebrate animals and vertebrate animals using our framework, we need to integrate the conclusions of Rethink Priorities with our pain/pleasure categories. Due to the fact that a) we are operating at a "low resolution" understanding of welfare ranges and that b) we only have a small number of discrete pain/pleasure categories, any solution to this challenge will necessarily be coarse. Here is a short, non-exhaustive list of ways that we could proceed:

We could assume that the welfare ranges of invertebrates are actually close to, or the same as, those of vertebrates (i.e., roughly the same rather than spanning a full two orders of magnitude). The authors of the welfare range report express concern about pro-vertebrate bias, and we believe that such bias may arise when scientists conduct studies on animal behaviour (24).
We could assume that invertebrate welfare ranges have an "upper limit" that is not present in vertebrate animals. For example, this could mean that Excruciating pain in invertebrates is actually equivalent to Disabling pain in vertebrates, but that the other pain categories are unaffected.
We could assume that invertebrate experiences are systematically less intense than comparable experiences in vertebrates. For example, we could assume that Excruciating pain in invertebrates is actually equivalent to Disabling pain in vertebrates, that Disabling pain in invertebrates is actually similar to Hurtful pain in vertebrates, and so forth for all of the pain/pleasure categories.

In practice, we will probably apply all three of those approaches, enabling us to select interventions that appear robust regardless of the assumption we choose.
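The three approaches listed above can be sketched as simple mappings from invertebrate pain categories onto vertebrate pain categories. This is only an illustration: the function names are hypothetical, pleasure categories are left out for brevity, and the third function makes an assumption that the text does not settle, namely that invertebrate Annoying pain (which has no category below it) is dropped.

```python
# Pain categories ordered from least to most intense.
CATEGORIES = ["annoying", "hurtful", "disabling", "excruciating"]


def same_as_vertebrates(hours):
    """Approach 1: invertebrate categories count the same as vertebrate ones."""
    return {c: hours.get(c, 0.0) for c in CATEGORIES}


def capped_at_disabling(hours):
    """Approach 2: invertebrate Excruciating pain counts as Disabling pain."""
    out = {c: hours.get(c, 0.0) for c in CATEGORIES}
    out["disabling"] += out["excruciating"]
    out["excruciating"] = 0.0
    return out


def shifted_down_one(hours):
    """Approach 3: every invertebrate category counts as the next one down.

    Assumption of this sketch: invertebrate Annoying pain has no category
    below it to map to, so it is dropped.
    """
    out = {c: 0.0 for c in CATEGORIES}
    for i in range(1, len(CATEGORIES)):
        out[CATEGORIES[i - 1]] = hours.get(CATEGORIES[i], 0.0)
    return out


# Hypothetical hours of pain prevented for one invertebrate.
beetle_hours = {"annoying": 10.0, "hurtful": 5.0, "disabling": 2.0, "excruciating": 1.0}
for approach in (same_as_vertebrates, capped_at_disabling, shifted_down_one):
    print(approach.__name__, approach(beetle_hours))
```

Running a campaign comparison once under each mapping is one way to check whether a ranking is robust to the choice of assumption.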

Sentience

The authors of the welfare range report also point out that we need to consider one more thing: whether or not we think that particular species of animals are sentient. The presence or absence of sentience in various groups of non-human animals is the subject of scientific debate. While recent progress has been impressive (e.g. 25), we still need to exercise caution (26), and the academic debate remains fierce (27).

We take it as given that at least some animals are sentient, as this is a key assumption underpinning the animal advocacy movement. Nevertheless, it may be possible that some species of animals are sentient while others are not. To illustrate, the authors of the welfare range report separate the vertebrate species they studied from the invertebrates:

"[...] There is deep uncertainty about consciousness generally and sentience specifically. In the face of that uncertainty, we think there’s no good argument for assigning a credence below 0.3 (30%) to the hypothesis that normal adult pigs, chickens, carp, and salmon are sentient. Likewise, we think there’s no good argument for assigning a credence below 0.01 (1%) to the hypothesis that normal adult members of the invertebrate species of interest are sentient." (4)

Likewise, Luke Muehlhauser conducted a shallow review of evidence relating to consciousness and drew similar conclusions. (Consciousness and sentience are subtly distinct, but closely related.) Muehlhauser typically assigned a credence of 0.60 – 0.85 (60% – 85%) to the probability of consciousness in various vertebrate species, and 0.07 – 0.10 (7% – 10%) for invertebrate species (5).

Essentially, we do not place too much emphasis on these exact numbers. But it does appear that we may want to be more confident in the proposition that vertebrate animals are sentient than that invertebrate animals are sentient—if only for the sake of making sure we select interventions that are robust. We intend to include this as a qualitative factor.

How do we compare improving welfare vs. averting lives?

Some animal advocacy interventions improve the welfare of animals, but do not stop those animals from coming into existence (whether on a farm or in the wild). For example, cage-free campaigns lead to better lives for chickens, but the chickens are still farmed. Other interventions prevent animals from being born into farms or into the wild entirely. For example, vegan outreach campaigns may cause fewer people to purchase meat, causing the farming industry to produce fewer animals to begin with.

How does our framework deal with the difference between welfare improvements and lives averted? The Cumulative Pain scale, on which our framework is constructed, provides us with a solution that is elegant if imperfect.

In the Cumulative Pain scale, non-existence is not explicitly compared to existence—only painful experiences (and, in our adaptation, pleasurable experiences) are assigned values.

Therefore, if you prevent a being from coming into existence, you could measure the effect of that action by tallying up the total pain and pleasure that the being would have experienced. This way, you end up with an estimate of the pain prevented and the pleasure forgone as a result of your actions, and this estimate can be directly compared to estimates of welfare improvements.

This is simple in principle, though it will be challenging in practice to tally up the total pain and pleasure throughout the lifespans of typical farmed (not to mention wild) animals. This is also a topic of considerable moral uncertainty—is coming into existence good (28–30)? For beings who do not yet exist, should we be valuing pain and pleasure differently (31)? Can non-existence and existence even be compared (32)? We expect that these major uncertainties will remain with us for some time.
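As a rough illustration of the tallying logic described above, the following Python sketch treats an averted life as a special case of a welfare change: the "after" profile is simply empty. All names and numbers are hypothetical, and the sketch sets aside the moral uncertainties raised in the previous paragraph.

```python
def change_in_hours(before, after):
    """Hours in each category after the intervention minus hours before it.

    Negative values mean fewer hours of that experience.
    """
    categories = sorted(set(before) | set(after))
    return {c: after.get(c, 0.0) - before.get(c, 0.0) for c in categories}


# Hypothetical lifetime totals (hours) for one farmed animal.
baseline_life = {"pain_hurtful": 300.0, "pain_disabling": 50.0, "pleasure_delightful": 20.0}
# The same animal's lifetime totals under a hypothetical welfare reform.
reformed_life = {"pain_hurtful": 200.0, "pain_disabling": 10.0, "pleasure_delightful": 30.0}

# Welfare improvement: the animal still exists, but with a different profile.
welfare_improvement = change_in_hours(baseline_life, reformed_life)

# Life averted: the animal never exists, so every category drops to zero.
life_averted = change_in_hours(baseline_life, {})

print(welfare_improvement)
print(life_averted)
```

Because both results are expressed as changes in hours per category, they can be passed through the same weightings and compared on one scale.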

How do we make comparisons across time?

Another factor that we will consider qualitatively, rather than quantitatively, is the time scale at which various interventions bring about impact. We mention this factor here due to its importance.

We can distinguish between interventions depending on when their impact is actually realised, i.e. when the benefit to animals actually materialises. We can think about interventions as short-term (within the next few years from today, e.g. distributing humane slaughter equipment to fish farmers to begin using immediately), medium-term (within the next decade or two, e.g. campaigning for retailers to switch to higher-welfare supply chains), and long-term (centuries or millennia from now, i.e. the time-frames typically thought of as "longtermism").

Below, we will list the top few reasons in favour of each of these three time scales. These lists are non-exhaustive.

In favour of short-term interventions (next couple of years):

Faster interventions make it much easier to have confidence about whether the intervention is bringing the desired benefit for animals or not (e.g. 33).
We might hypothesise that society will soon transform so substantially that the movement's actions today have only a few years in which to take effect. If so, medium-term interventions (e.g. corporate campaigns) would now be too slow to help animals (34,35).

In favour of medium-term interventions (next couple of decades):

Interventions that can affect large numbers of farmed animals typically operate at this time scale, such as those interventions seeking to establish new legislation or to secure new corporate commitments.
Having many years to build relationships—whether among the public, politicians, industry members, or other stakeholders—can help improve support for a campaign and increase its chance of success.
The animal advocacy movement, and the people who make it up, has a large amount of experience conducting campaigns at this time scale, so interventions at this time scale may be more likely to succeed.

In favour of long-term interventions (centuries, millennia and beyond):

The expected number of animals in the far future could be simply enormous, and most animals might exist in the long-term future. This means that considering the lives of animals in the far future could have a large impact.
There are very few resources dedicated to helping animals at this time scale, meaning that there may be many easy opportunities for immense impact (low-hanging fruit).

For more information on these longtermist views for animal advocacy, I invite you to consult my list of resources on this topic (36). There has also been more recent work on the EA Forum since I wrote that list.

Limitations of the framework

As any statistician knows, no model can perfectly capture reality. Our framework is no different: like all similar frameworks, it suffers from numerous limitations. We apply our framework with these limitations at the front of our minds. Here is a short summary of the main ones:

It should go without saying that this framework will not be perfectly accurate. We are aiming for a level of accuracy that is comparable to other, similar frameworks—that is, sufficiently accurate for us to make the best possible decisions about animal advocacy campaigns given the available information. In animal advocacy (and, indeed, in any field), policy decisions are typically hindered by information availability.
Most of the general limitations raised by Holden Karnofsky (37) and Saulius Šimčikas (38) also apply to our framework, and we encourage you to read those articles if you would like more details about those limitations.
The most pernicious limitation, in our view, is how we address indirect effects—that is, any impact of an intervention beyond what we can capture in our framework or model. These are sometimes known as flow-through effects, knock-on effects or second-order effects, though these terms can also mean different things. Flow-through effects are critically important to understanding the effects of any policy on the lives of both animals and humans (39–43). But flow-through effects are also relevant to all policy in all fields, having been analysed in energy efficiency, criminal behaviour, and pandemic preparedness, just to name a few (44–46). When it comes to accounting for indirect effects in decision-making, there is no single, universal method. The way that we will account for indirect effects will depend on the context, and may even vary between interventions, but typically we will consider many different qualitative factors and focus on robust interventions (39,42).

A toy example: Trilobites and dodos

Finally, we would like to illustrate how the logic of our framework operates by way of an example.

This is a fictional example, purely for illustration. The point of this section is to show how the framework operates—we don't want readers to focus on the numbers themselves, so we are using a toy example with extinct animals.

Our setting is the make-believe country of Fakeland, where extinct animals still roam the land and the sea. We are considering two different campaign options. We want to figure out which of these two campaign options would be a better choice as we work to improve the lives of animals.

The structure of this section closely follows the earlier dot-point summary of our framework (see above section: "Summary of our framework").

1. First, we begin with a long list of ideas for new or existing animal advocacy campaigns. In Campaign X (trilobites), we would lobby local governments to treat ocean-polluting wastewater, thus improving water quality for wild trilobites living on the coastal sea floor. Trilobites are extinct sea creatures somewhat similar to crabs or insects. In Campaign Y (dodos), we would conduct outreach to farmers in the dodo farming industry, helping them to reduce dodo stocking densities and provide environmental enrichments. Dodos are extinct, flightless birds.
2. Second, for each campaign, we calculate the cumulative time in pain that would be prevented for an animal affected by the campaign, and the cumulative time in pleasure that would be created. These estimates could be derived from a variety of sources, such as scientific literature or expert judgement. In section 2 of the table below, we give our pretend estimates of the average time in pain prevented (and pleasure created) for each animal affected by the campaign. These estimates relate to the change in pain (and pleasure) across the entire lifespan of the average individual affected by the campaign.
3. Third, we multiply these numbers by the reach of the campaign (i.e. number of animals who would be directly affected) to calculate the total cumulative time in pain prevented, and pleasure caused, by the campaign. In section 3A of the table below, we give our pretend estimates of the number of animals who would be affected by the campaign for each year that the campaign runs. In section 3B of the table, we multiply the time per animal by the total number of animals to give the total number of hours of pain prevented (and pleasure caused) by the campaign for each year that the campaign runs. This is expressed as billions of hours.

4. Fourth, we consider the moral weightings associated with each pain or pleasure category. In practice, we consider numerous sets of weightings, each associated with a particular worldview about the moral value of different intensities of pain and pleasure. This lets us convert the four pain categories and the four pleasure categories to a single, common metric.

We use the same worldviews as given in the main body of this report, labelled A through N. For each worldview, the time in pain (or pleasure) associated with each campaign can be converted to a single, common metric: equivalent of Disabling pain. These metrics are shown in 4A in the table below.

Then, for each worldview, we can see which campaign is preferred. In 4B in the table below, we use a simple ordering of the campaign options (first vs second). It would be straightforward to consider how strongly each worldview prefers each campaign option, but we don't do this here.

We also calculate the percentage weighting that we want to assign to each worldview. In 4C in the table below, we simply weight each worldview in direct proportion to our intuitive credence of that worldview. Other systems for weighting each worldview could also be used.

5. Fifth, we consider our credence in each of these worldviews, as well as a range of "soft", qualitative factors (e.g. tractability, scalability, and so on) to arrive at final rankings of campaign ideas. In the table below, we find that Campaign X (trilobites) is preferred 65% of the time, and Campaign Y (dodos) 35% of the time.

In this case, we would select Campaign X (trilobites) as the more impactful option. However, this ranking would be tempered and further informed by a range of qualitative factors outside of this quantitative framework (see below).

In practice, we would also spend time considering a large range of qualitative factors. For example, one campaign might be easier to achieve than another, or we might have doubts about whether one campaign could be scaled easily (47). We do not consider such factors for this example, but they would form a large part of our judgement—for each campaign option included in our research, we expect to spend about the same amount of time considering those qualitative factors as we do the quantitative framework we have outlined in this report.

We also note that we did not apply the methods for comparing across vertebrate and invertebrate species that we identified above (see above section, "How do we make comparisons across different animal species?"). In practice, we would conduct this ranking a few times, each under a different one of the assumptions listed in that section.
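For readers who prefer code to tables, here is a minimal Python sketch of the quantitative steps in this toy example (steps 2 to 5, without the qualitative factors). Every number in it is made up for the sketch and does not reproduce our table; it uses two hypothetical worldviews rather than A through N, and it leaves out pleasure for brevity. The function and variable names are likewise hypothetical.

```python
# Step 2: hypothetical hours of pain prevented per animal, by category.
HOURS_PER_ANIMAL = {
    "X (trilobites)": {"annoying": 10.0, "hurtful": 2.0, "disabling": 0.0, "excruciating": 0.0},
    "Y (dodos)": {"annoying": 100.0, "hurtful": 50.0, "disabling": 10.0, "excruciating": 0.0},
}
# Step 3A: hypothetical number of animals affected per year.
REACH = {"X (trilobites)": 1_000_000_000, "Y (dodos)": 1_000_000}

# Step 4: hypothetical worldviews. Each weighting is the number of hours
# equivalent to 1 hour of Disabling pain; None means no moral relevance.
WORLDVIEWS = {
    "all pain counts": {
        "weights": {"annoying": 100.0, "hurtful": 10.0, "disabling": 1.0, "excruciating": 0.01},
        "credence": 0.6,
    },
    "only severe pain counts": {
        "weights": {"annoying": None, "hurtful": None, "disabling": 1.0, "excruciating": 0.01},
        "credence": 0.2,
    },
}


def total_hours(per_animal, reach):
    """Step 3B: hours per animal multiplied by the number of animals."""
    return {c: h * reach for c, h in per_animal.items()}


def disabling_equivalent(hours, weights):
    """Step 4A: convert category hours to Disabling-equivalent hours."""
    return sum(h / weights[c] for c, h in hours.items() if weights[c] is not None)


def preference_shares(hours_per_animal, reach, worldviews):
    """Steps 4B, 4C and 5: credence-weighted share of worldviews preferring each campaign."""
    total_credence = sum(w["credence"] for w in worldviews.values())
    shares = {campaign: 0.0 for campaign in hours_per_animal}
    for w in worldviews.values():
        scores = {
            campaign: disabling_equivalent(
                total_hours(hours_per_animal[campaign], reach[campaign]), w["weights"]
            )
            for campaign in hours_per_animal
        }
        preferred = max(scores, key=scores.get)  # ties go to the first campaign listed
        shares[preferred] += w["credence"] / total_credence
    return shares


print(preference_shares(HOURS_PER_ANIMAL, REACH, WORLDVIEWS))
```

With these made-up inputs, the first worldview prefers the campaign with the larger reach and the second prefers the campaign that prevents Disabling pain, which shows how the final shares depend on both the weightings and the credences.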

References

1. Karnofsky H. Open Philanthropy. 2016. Worldview Diversification. Available from: https://www.openphilanthropy.org/research/worldview-diversification/

2. MacAskill M, Bykvist K, Ord T. Moral uncertainty. 2020; Available from: https://library.oapen.org/handle/20.500.12657/42728

3. Tomasik B. Center on Long-Term Risk. 2015 [cited 2021 Oct 4]. Charity Cost-Effectiveness in an Uncertain World. Available from: https://longtermrisk.org/charity-cost-effectiveness-in-an-uncertain-world/

4. Fischer B. Effective Altruism Forum. 2023. Rethink Priorities’ Welfare Range Estimates. Available from: https://forum.effectivealtruism.org/posts/Qk3hd6PrFManj8K6o/rethink-priorities-welfare-range-estimates

5. Muehlhauser L. Open Philanthropy. 2021 [cited 2021 Oct 6]. 2017 Report on Consciousness and Moral Patienthood. Available from: https://www.openphilanthropy.org/2017-report-consciousness-and-moral-patienthood

6. Charity Entrepreneurship. Charity Entrepreneurship. 2018 [cited 2021 Oct 26]. Is it better to be a wild rat or a factory farmed cow? A systematic method for comparing animal welfare. Available from: https://www.charityentrepreneurship.com/blog/is-it-better-to-be-a-wild-rat-or-a-factory-farmed-cow-a-systematic-method-for-comparing-animal-welfare

7. Alonso WJ, Schuck-Paim C. Cumulative Pain: An evidence-based, easily interpretable and interspecific metric of welfare loss [Internet]. Preprints. 2022. Available from: https://www.preprints.org/manuscript/202208.0247/v1

8. Alonso WJ, Schuck-Paim C. Pain-Track: a time-series approach for the description and analysis of the burden of pain. BMC Res Notes. 2021 Jun 5;14(1):229.

9. Cabanac M. Pleasure: the common currency. J Theor Biol. 1992 Mar 21;155(2):173–200.

10. Vinding M. Suffering-Focused Ethics: Defense and Implications. Independently published; 2020. 312 p.

11. Gómez-Emilsson A. Qualia Research Institute. 2019. Logarithmic Scales of Pleasure and Pain. Available from: https://qri.org/blog/log-scales

12. Bain D, Corns J, Brady M. The Philosophy of Pain - Introduction. In: Bain D, Corns J, Brady M, editors. The Philosophy of Pain. London: Routledge;

13. Thong ISK, Jensen MP, Miró J, Tan G. The validity of pain intensity measures: what do the NRS, VAS, VRS, and FPS-R measure? Scand J Pain. 2018 Jan 26;18(1):99–107.

14. Ord T. Why I’m Not a Negative Utilitarian [Internet]. 2013. Available from: https://www.amirrorclear.net/academic/ideas/negative-utilitarianism/

15. Knutsson S. Simon Knutsson. 2016 [cited 2021 Oct 5]. What is the difference between weak negative and non-negative ethical views? Available from: https://www.simonknutsson.com/what-is-the-difference-between-weak-negative-and-non-negative-ethical-views/

16. Animal Ethics. Animal Ethics. Prioritarianism. Available from: https://www.animal-ethics.org/prioritarianism/

17. Gómez-Emilsson A. Effective Altruism Forum. 2019. Get-Out-Of-Hell-Free Necklace. Available from: https://forum.effectivealtruism.org/posts/8Sed33q54kdhZ4M9m/get-out-of-hell-free-necklace

18. Wallenstein SL, Heidrich G 3rd, Kaiko R, Houde RW. Clinical evaluation of mild analgesics: the measurement of clinical pain. Br J Clin Pharmacol. 1980 Oct;10 Suppl 2(Suppl 2):319S–327S.

19. Springlea R. Effective Altruism Forum. 2023. Reminding myself just how awful pain can get (plus, an experiment on myself). Available from: https://forum.effectivealtruism.org/posts/xtcgsLA2G8bn8vj99/reminding-myself-just-how-awful-pain-can-get-plus-an

20. Grilo V. Effective Altruism Forum. 2022. Comparison between the hedonic utility of human life and poultry living time. Available from: https://forum.effectivealtruism.org/posts/eomJTLnuhHAJ2KcjW/comparison-between-the-hedonic-utility-of-human-life-and

21. Grilo V. Effective Altruism Forum. 2022. Corporate campaigns for chicken welfare are 10,000 times as effective as GiveWell’s Maximum Impact Fund? Available from: https://forum.effectivealtruism.org/posts/nDgCKwjBKwFvcBsts/corporate-campaigns-for-chickenwelfare-are-10-000-times-as

22. Šimčikas S. Comment on “EAA is relatively overinvesting in corporate welfare reforms” [Internet]. 2022. Available from: https://forum.effectivealtruism.org/posts/kHdKWmTcS3FfcYAZj/?commentId=cfQ2Dox9GktW6EMpn

23. Fischer B. Effective Altruism Forum. 2022. THE MORAL WEIGHT PROJECT SEQUENCE. Available from: https://forum.effectivealtruism.org/s/y5n47MfgrKvTLE3pw

24. Mikhalevich I, Powell R. Minds without spines: Evolutionarily inclusive animal ethics. Animal Sentience [Internet]. 2020 [cited 2022 Dec 14];29(1). Available from: https://scholarworks.rit.edu/article/1993/

25. Gibbons M, Crump A, Barrett M, Sarlak S, Birch J, Chittka L. Chapter Three - Can insects feel pain? A review of the neural and behavioural evidence. In: Jurenka R, editor. Advances in Insect Physiology. Academic Press; 2022. p. 155–229.

26. Knutsson S, Munthe C. A Virtue of Precaution Regarding the Moral Status of Animals with Uncertain Sentience. J Agric Environ Ethics. 2017 Apr 1;30(2):213–24.

27. Freeling BS, Connell SD. Animal Minds, Social Change, and the Future of Fisheries Science. Frontiers in Marine Science [Internet]. 2021;8. Available from: https://www.frontiersin.org/articles/10.3389/fmars.2021.684841

28. Holtug N. On the Value of Coming into Existence. J Ethics. 2001 Dec 1;5(4):361–84.

29. Bykvist K. The Benefits of Coming into Existence. Philos Stud. 2007 Sep 1;135(3):335–62.

30. Luper S. Never existing. Mortality. 2018 Apr 3;23(2):173–83.

31. Benatar D. Better Never to Have Been: The Harm of Coming into Existence. OUP Oxford; 2006. 256 p.

32. Herstein OJ. Why “nonexistent people” do not have zero wellbeing but no wellbeing at all. J Appl Philos. 2013;30(2):136–45.

33. Karnofsky H. Effective Altruism Forum. 2018. Update on Cause Prioritization at Open Philanthropy. Available from: https://forum.effectivealtruism.org/posts/PsiiMHvHoHqzGu6Qm/update-on-cause-prioritization-at-open-philanthropy

34. Cook T. Effective Altruism Forum. 2022. Neartermists should consider AGI timelines in their spending decisions. Available from: https://forum.effectivealtruism.org/posts/ebYdBNpGnshhm2Gkq/neartermists-should-consider-agi-timelines-in-their-spending

35. Springlea R. Should all global health + animal advocacy people change strategies to doing good in the immediate future? 2023.

36. Springlea R. Effective Altruism Forum. 2023. Longtermism and animals: Resources + join our Discord community! Available from: https://forum.effectivealtruism.org/posts/cziz5YGLnSnpa5qzS/longtermism-and-animals-resources-join-our-discord-community

37. Holden. The GiveWell Blog. 2016 [cited 2021 Oct 4]. Why we can’t take expected value estimates literally (even when they’re unbiased). Available from: https://blog.givewell.org/2011/08/18/why-we-cant-take-expected-value-estimates-literally-even-when-theyre-unbiased/

38. Šimčikas S. Effective Altruism Forum. 2019. Corporate campaigns affect 9 to 120 years of chicken life per dollar spent. Available from: https://forum.effectivealtruism.org/posts/L5EZjjXKdNgcm253H/corporate-campaigns-affect-9-to-120-years-of-chicken-life

39. CE Team. Charity Entrepreneurship. 2018. Why You Should Care About Indirect Effects. Available from: https://www.charityentrepreneurship.com/post/why-you-should-care-about-indirect-effects

40. Šimčikas S. Effective Altruism Forum. 2023. Why I No Longer Prioritize Wild Animal Welfare (edited). Available from: https://forum.effectivealtruism.org/posts/saEQXBgzmDbob9GdH/why-i-no-longer-prioritize-wild-animal-welfare-edited

41. Holden. The GiveWell Blog. 2013. Flow-through effects. Available from: https://blog.givewell.org/2013/05/15/flow-through-effects/

42. Wildeford P. Effective Altruism Forum. 2016. Five Ways to Handle Flow-Through Effects. Available from: https://forum.effectivealtruism.org/posts/mnMHkMRiHMTyBzKmb/five-ways-to-handle-flow-through-effects

43. Springlea R. Effective Altruism Forum. 2022. A direct way to reduce the catch of wild fish. Available from: https://forum.effectivealtruism.org/posts/xwvQpjQHEMDqxwmHj/a-direct-way-to-reduce-the-catch-of-wild-fish

44. Drago F, Galbiati R. Indirect Effects of a Policy Altering Criminal Behavior: Evidence from the Italian Prison Experiment. Am Econ J Appl Econ. 2012 Apr;4(2):199–218.

45. Banerjee A, Sudlow C, Lawler M. Indirect effects of the pandemic: highlighting the need for data-driven policy and preparedness. J R Soc Med. 2022 Jul;115(7):249–51.

46. Peters B, McWhinnie SF. On the rebound: estimating direct rebound effects for Australian households. Aust J Agric Resour Econ. 2018 Jan;62(1):65–82.

47. Charity Entrepreneurship. Charity Entrepreneurship. Weighted Factor Model. Available from: https://www.charityentrepreneurship.com/weighted-factor-model