Disability Representation in Text-to-Image AI Models

Kelly Avery Mack
6 min read · Apr 9, 2024


In this post, we summarize the findings of our research paper “‘They only care to show us the wheelchair:’ Disability Representation in Text-to-Image AI Models”, which was accepted to the ACM Conference on Human Factors in Computing Systems (CHI) 2024. This paper was authored by Kelly Avery Mack, Rida Qadri, Remi Denton, Shaun K. Kane* and Cynthia L. Bennett*. (*Both authors contributed equally to this work).

Three AI-generated images: a person with light skin and short hair eating breakfast; a person with light skin in a suit walking in a crosswalk with a guide dog and a white cane (the cane passes through the dog's harness in an unrealistic way); and a person with light skin and a prosthetic leg running outside.
Examples of positive disability representation in AI-generated images where disabled people are doing everyday activities.

Generative artificial intelligence (GAI) is becoming increasingly popular, with text-to-image (T2I) models becoming more powerful in the types of representations they can produce. Text-to-image models take a text prompt as input and generate an image corresponding to that prompt. However, prior work has found that GAI models can reproduce problematic biases and stereotypes that exist in the real world and in the data they were trained on [1–2, 4–5]. We wanted to investigate how T2I models represented people with disabilities, and specifically whether they produced images that perpetuated bias or ones that showed positive portrayals of disability.

To understand how disabled communities feel about the representations of disabled people that T2I models generate, we conducted eight focus group sessions with 25 people. In these sessions, we asked participants to share their thoughts on images we generated about topics that interested them, and we also asked them to prompt the models themselves and respond to the results. The images used in this study came from three publicly available models in the summer of 2023: Midjourney, DALL-E 2, and Stable Diffusion 1.5.
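To make the setup concrete, below is a minimal sketch of how one might prompt the one open-weights model in this set, Stable Diffusion 1.5, programmatically through the Hugging Face diffusers library. This is an illustrative assumption rather than the exact pipeline used in the study; the checkpoint ID, prompt, and generation settings are placeholders, and Midjourney and DALL-E 2 are only available through their own services.

```python
# Minimal sketch (not the study's exact setup): generating an image from a
# short disability-related prompt with Stable Diffusion 1.5 via the
# Hugging Face diffusers library.
import torch
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed public checkpoint ID for Stable Diffusion 1.5 on the Hugging Face Hub.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

# One of the simple prompts discussed in this post.
prompt = "a person with a disability"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("person_with_a_disability.png")
```

Note how little the prompt specifies: the model fills in nearly every visual detail itself, which is where the representational choices discussed below come from.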

Models produced negative tropes about disability

Our participants identified many negative stereotypes of disability that were common in these models. Our paper contains a full list, but here we share a few of the most common or notable themes. It is important to note that these problematic representations have real world impacts for people with disabilities. As one of our participants commented: “These stereotypes can limit opportunities for people with disabilities in … education, employment, and social interactions by shaping negative perceptions about their capabilities.”

Too many wheelchairs

Models overwhelmingly produced images of people using wheelchairs when we prompted them with topics about disability, a trend also found by researchers studying the outputs of large language models [4]. While some people with disabilities do use wheelchairs, there is also a wide variety of other experiences of disability that rarely appeared in the model outputs unless we explicitly prompted for them. This trend perpetuates the misconception that almost all people with disabilities use wheelchairs.

Three AI-generated images: a person in a wheelchair sitting outside, cropped so that only the right side of the person and chair from the waist down is visible; a person with light skin in a wheelchair wearing jeans outside, cropped the same way; and a person sitting cross-legged with their hands in their lap, looking up thoughtfully, with stacks of grey bodies at all angles behind them.
AI-generated images demonstrating the over-representation of wheelchairs in response to the prompt “a person with a disability.”

Over-emphasis on negative emotions

Our participants were also unhappy that most of the people in the models’ images looked sad, lonely, or in pain. One participant commented, “what struck me is that we always concentrate on the bad,” and explained that, while some elements of disability might be hard, there are many neutral to positive experiences of living with a disability as well. People were frustrated that there were not “more smiles,” and that the models overwhelmingly showed people looking stationary and upset.

Dehumanizing imagery

Finally, the models output images that were dehumanizing. Some images had near horror-movie aesthetics, which participants found unsettling. These images included faceless heads, stacks of grey bodies, faces with unrealistically exaggerated decaying skin and sunken eyes, faces covered in shadows, and tattered or dirty clothes. One participant stated that “this image should have required a long and detailed prompt to achieve something so bizarre,” rather than being returned for a simple prompt like “a person with depression.”

Three AI-generated images: a person in a wheelchair sitting outside, cropped so that only the right side of the person and chair from the waist down is visible; a person with light skin in a wheelchair wearing jeans outside, cropped the same way; and a person sitting cross-legged with their hands in their lap, looking up thoughtfully, with stacks of grey bodies at all angles behind them.
AI-generated images that are dehumanizing because they crop humans out of the frame to show only the assistive technologies (left and center) or show unsettling imagery (right).

Another form of dehumanizing representation was images that cropped the person out of the frame. These images highlighted the assistive technology at the cost of removing the person’s face from the photo.

People wanted images of disabled people doing everyday activities

Instead of images of wheelchair users looking sad or lonely, our participants wanted to see representations of disabled people in community with nondisabled people, doing everyday activities like cooking dinner, playing sports, or parenting a child. They emphasized the desire to see a diverse range of disabilities, races, genders, and ages represented in the images, since “[disability] looks like you and you and me. And that person over there and that person over there.”

Three AI-generated images: a parent and child with light skin, both happy and engaged, looking at a book that is emitting light; two kids, one with light skin and one with brown skin, sitting at a classroom table looking at a paper (one student has no visual indicators of disability, the other uses a manual wheelchair; both look engaged and pleasant); and a person with short hair jumping athletically in a desert with a frisbee flying in the air in front of them.
Examples of positive disability representation in AI-generated images where disabled people are doing everyday activities.

Some topics were more complicated

Our participant’s feedback was not always so straightforward. Sometimes they discussed their own conflicting priorities in representation or directly disagreed with each other. Here are a few of the top issues where our participants had conflicting opinions:

  1. How can we represent disabilities that are less visually discernible in images? Some disabilities are more or less visible than others in different contexts [3]. For example, our participants were split about how to represent people with chronic or mental health conditions, people with Autism, or people with epilepsy in a single image. Participants wanted to be able to show these disabilities in images, but were unsure of how to represent them in a single image without falling back on reductive representations (e.g., showing a person with depression looking sad).
  2. How do models choose what details to add to an image? No prompt can ever fully define what an image should look like; the model needs to fill in some of the blanks. For example, if a person enters the prompt “a person playing basketball,” the model needs to decide if the person is playing with other people, if they are playing inside or outside, and what kind of clothes they are wearing. Sometimes, our participants did not approve of the way the model filled in the blanks in their prompts. For example, images of “a person using a walker in a park” showed only the disabled person rather than a busy public park, which participants felt made the person look lonely.
  3. How can model filters protect users without offending them? We found that the filtering mechanisms put in place to help ensure generated images follow content policies sometimes filtered out certain groups altogether. For example, when we entered the prompt “a person with bipolar disorder” into three models, two returned results and one returned an error message stating: “It looks like this request may not follow our content policy.” With this response, the model suggests that the user is incorrect or bad for trying to prompt about a stigmatized disability (in this case, bipolar disorder), which further alienates people with this disability. Models use filters to try to keep users safe by preventing them from seeing offensive content. But when those filters block, or cannot produce, respectful images related to our participants’ experiences, they contribute to making topics like disability taboo at a time when communities want them to be normalized.

In summary, text-to-image AI models are biased in how they represent people with disabilities. They skew toward showing people in wheelchairs and people who look sad, incapable, or scary, portrayals that do not reflect the vast diversity of disabilities and experiences that exist. Instead, participants wanted realistic portrayals of disabled people doing everyday things to help normalize disability in society. We argue that AI model developers need to continue to engage with people with disabilities to limit the harm that these models produce. We invite you to read our paper for more details.

References

[1] Bender, Emily M., et al. “On the dangers of stochastic parrots: Can language models be too big?🦜.” Proceedings of the 2021 ACM conference on fairness, accountability, and transparency. 2021.

[2] Bianchi, Federico, et al. “Easily accessible text-to-image generation amplifies demographic stereotypes at large scale.” Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency. 2023.

[3] Faucett, Heather A., et al. “(In)visibility in disability and assistive technology.” ACM Transactions on Accessible Computing (TACCESS) 10.4 (2017): 1–17.

[4] Gadiraju, Vinitha, et al. “‘I wouldn’t say offensive but…’: Disability-centered perspectives on large language models.” Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency. 2023.

[5] Qadri, Rida, et al. “AI’s regimes of representation: A community-centered study of text-to-image models in South Asia.” Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency. 2023.
