Unreasonable Accommodation - Web Experiences as a Blind Person

Keywords

website design, accommodations, memory limitations, artificial intelligence

1. Problem and Background

The many advances we have made toward a more equitable world for blind people have paralleled those made for women's rights and the rights of racial minorities. The difference is that for these other disadvantaged groups, the only change that is needed to give equity of opportunity is to remove artificial barriers. This obviously ignores the different starting points that past disadvantage caused, but removing an artificial barrier is much simpler (at least in theory) than overcoming an intrinsic one, which requires functional teaching materials to be translated into materials that can be understood by someone who cannot see pictures or the spatial information that most complex systems of thought rely upon. For instance, mathematics uses the position of the different parts of an equation to convey how they relate to one another. Putting the entire formula on one line is a huge hurdle. The Nemeth code for mathematics achieves this, but the human brain is very limited in how much of that linearized equation it can read and remember at one time (5). Blind humans do not have computer-like random access memory. Reading more involved equations spanning several Braille lines can be like trying to remember the positions of all the pieces on a chess board. Yes, it is technically possible to play chess completely in your head, and some people (including blind people) have done so. However, this is not a common skill, and expecting savant powers for such tasks is no excuse to withhold accessibility.
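
The cost of linearization can be seen with a familiar formula. The sketch below is only an illustration: the one-line version is written in plain ASCII with explicit grouping, as an assumption for readability, and is not actual Nemeth code.

```latex
% Two-dimensional layout: position on the page carries the structure.
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]

% The same formula forced onto one line: every grouping must be spelled
% out and held in memory by the reader until the matching bracket closes.
% (Plain ASCII for illustration only, not Nemeth code.)
\[
  \texttt{x = (-b +- sqrt(b*b - 4*a*c)) / (2*a)}
\]
```

In the first form the fraction bar and the radical show at a glance what belongs together. In the second, a reader working serially has to track three levels of nesting before reaching the denominator.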

Demanding that a good system for communicating about mathematics be reworked to make it easily understandable in the absence of vision may seem like a petty thing to ask. Thankfully, researchers continue to make progress in this field through technologies such as MathML (6, 7). However, the example leads us to consider that the ADA sets a standard where only "reasonable" accommodations are required, which raises the question: at what point do accommodations become "unreasonable"?

This thought experiment taken to an extreme becomes infeasible. If we must accommodate someone who is both deaf and blind, this requires more resources than if the person is just blind or just deaf. We can grow the number of disabilities that our hypothetical learner has, until we eventually have a brain with no senses or outputs attached (locked-in syndrome) that needs to be taught math via electrodes implanted into its gray matter directly.

Instead of arguing that the adaptations are just 'for' disability, we should broaden the scope of what we consider a disability. Many learning disabilities are associated with sensory input and output (3, 4). I experience this because sound is a two-way road of information. If I am required to hear and understand two things at once, I personally don't get either. If I am required to use verbal working memory and then have to talk, I have the human equivalent of an "All Interrupt Requests" fault. Keeping verbal information in my memory works OK until I have to verbalize an output, at which point the stored information is at odds with the information being sent to the voice (output). Consider that vision is a sense with no 'output'. Sighted people have no difficulty looking and talking at the same time. The same is not true of hearing and vocalizing, which use the same parts of the brain and interfere with one another when presented as simultaneous tasks.

2. Challenges

A number of accessibility "features" have been built with little regard for human memory limitations. A picture is considered worth a thousand words. For the first time, I am getting some idea of how polluted the visual world is. Instead of finding an unlabeled picture, I am confronted with a long 'word salad' problem, and the digestion isn't easy. Pictures and graphs are placed seemingly randomly in the body of text. It is like hearing several conversations that randomly switch from one to the next. I am carried along by the analog stream of audio coming from the computer's speech like I am in a kayak on a rapidly flowing river and will go at the same speed 'forward' for as long as the speech runs. Transitions happen unceremoniously. A page might be interrupted by a picture and preceded and followed by repetitive headers and footers.

To make things even weirder, some websites now label pictures with AI systems. When I first tried to use these, I heard a lot of guesswork coming from the computer, with a particularly entertaining "paramecium" as one of the things the AI ventured as a guess about the water it was seeing. It would produce long lists of words, which I was supposed to filter for the "reality check" and decide what was relevant. Since those early days, the descriptions have gotten more accurate, but even without the outright wrong guesses, they are now bloated with the 'objective facts' about the picture. Instead of unlabeled images, websites may now be filled with descriptions of every image, e.g. the color of every person's outfit. I cannot focus on the aspects of a person's appearance that might be relevant to me. I would like to know things like young/old or even attractive/unattractive (though this might have political implications). Instead, as someone who has never experienced color, I am told about the style of clothing and the color that the person chose to wear. I can understand that training AI is hard and intensive in both computing and energy. The AI trainer is going for the "low-hanging fruit" rather than asking what the description is supposed to do. The description is needed to give the user a clear idea of why the picture is there and how it is relevant to the overall presentation. At no point is my comprehension of a scene improved by knowing that a "black sleeveless top" is involved.

The problem is that the image is now a clutter of irrelevant information. In the midst of trying to understand something, I am presented with this extraneous information that breaks my flow of thought and comprehension of the work I am doing.

This is similar to the experience of being on a site with random popup ads, except I cannot simply block the descriptions of the pictures, as they might actually have useful information. So to make this analogy more interesting, the popups have vital information interspersed among the really annoying ads. This is not just for fun; the work I do on the computer is important to me and to other blind people. Upping the ante, these popups aren't just happening when the user is doing fun stuff and they are accepting that their video is free as long as they watch the ads. No, this is a case where my work is being impeded by random information that adds nothing, until… it says something vitally important. It is like listening to a stream of consciousness description of someone's dream while trying to get work done.

3. Outcomes and Future Perspective

It can seem that rather than solving problems, the AI introduces noise. Sighted people overlook this because they can skim the picture, the description, or both. If they were forced to read exactly what was on the page serially, they would place a much higher premium on the relevance of the information being presented.

Furthermore, this exposes a disconnect between developers and designers on one side and their users on the other. Much of this confusion could come from a mismatch between the cascading style sheets (CSS) and the underlying document object model (DOM) of the website. Style sheets enable visual layouts that do not match the underlying structure (2), further complicating the job of the screen reader and the person using it.
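
One concrete way a style sheet can separate what is seen from what is read is the CSS `order` property, which repositions the children of a flex container without changing the DOM. The sketch below is a simplified model under stated assumptions: the `FlexItem` type and the function names are hypothetical, and it assumes the screen reader follows source order, which is the usual case but not guaranteed for every tool.

```typescript
// Hypothetical, simplified model of the children of one flex container.
interface FlexItem {
  label: string; // what a screen reader would announce
  order: number; // value of the CSS `order` property (default 0)
}

// Visual order: ascending `order`; ties keep their source (DOM) order.
function visualOrder(items: FlexItem[]): string[] {
  return items
    .map((item, index) => ({ item, index }))
    .sort((a, b) => a.item.order - b.item.order || a.index - b.index)
    .map(({ item }) => item.label);
}

// Reading order: assumed here to be the source order of the DOM.
function readingOrder(items: FlexItem[]): string[] {
  return items.map((item) => item.label);
}

// True when a sighted user and a screen reader user meet the content
// in different sequences.
function orderMismatch(items: FlexItem[]): boolean {
  const visual = visualOrder(items);
  return readingOrder(items).some((label, i) => label !== visual[i]);
}

const samplePage: FlexItem[] = [
  { label: "Advertisement", order: 3 },
  { label: "Main article", order: 1 },
  { label: "Navigation", order: 0 },
];

console.log("Read aloud:", readingOrder(samplePage).join(" > "));
console.log("Seen:", visualOrder(samplePage).join(" > "));
console.log("Mismatch:", orderMismatch(samplePage));
```

In this example the sighted user sees the navigation first and the advertisement last, while the listener hears the advertisement first. Real pages add positioning, grids, and floats, so a practical checker would need far more than this.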

While we continue to advocate for best web practices, we should be applying AI to this problem to help struggling users. Specifically, we need a web tool that is able to reduce this clutter and better understand the visual layout without relying on just the document structure. Some websites may already be using AI to generate accessible content or to rearrange content to improve its readability, but existing markup, such as the alt attribute on an image, does not specify whether the description was generated by an AI or a human.
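
As a rough illustration of what clutter reduction could mean on the user's side, the sketch below drops clauses of an image description that mention terms the reader has chosen to ignore. It is a deliberately crude word-list filter, not the AI-based tool proposed above; the function name `trimDescription` and the term list are hypothetical.

```typescript
// Hypothetical user preference: terms this reader does not want announced.
const ignoredTerms: string[] = [
  "black", "red", "blue", "sleeveless", "top", "shirt", "dress",
];

// Split a description into clauses and keep only those that contain
// none of the ignored terms (whole-word, case-insensitive match).
function trimDescription(description: string, ignored: string[]): string {
  const clauses = description
    .split(/[,;.]\s*/)
    .map((clause) => clause.trim())
    .filter((clause) => clause.length > 0);

  const kept = clauses.filter((clause) => {
    const words = clause.toLowerCase().split(/\W+/);
    return !words.some((word) => ignored.indexOf(word) !== -1);
  });

  return kept.join(", ");
}

const rawDescription =
  "A young woman standing at a podium, wearing a black sleeveless top, smiling at the audience";

console.log(trimDescription(rawDescription, ignoredTerms));
```

A word list like this will also discard useful clauses (for example "the top of a mountain"), which is exactly why relevance has to be judged from the purpose of the image and not from vocabulary alone.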

There are as yet no standards for AI and web accessibility, and the W3C could consider creating new standards to address the use of AI, both by the user and by the developer, to generate descriptive content or otherwise change the presentation of an app or website. Recent discussion from W3C members has considered the benefits and limitations of AI in this space, with the goal of addressing accuracy and reliability issues through conformance evaluation (1). Providing more transparency about the use of AI on websites will empower all users, while promoting trust in this new and emerging technology.
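
To make the idea of transparency concrete, the sketch below shows one possible shape for a provenance record attached to an image description. No current standard defines these fields: the `DescribedImage` type, the `source` values, and the `announce` function are all assumptions made for illustration.

```typescript
// Hypothetical provenance label; not part of any existing specification.
type DescriptionSource = "human" | "ai" | "unknown";

interface DescribedImage {
  description: string;
  source: DescriptionSource;
}

// Build the text a screen reader might speak, stating where the
// description came from so the listener can judge how far to trust it.
function announce(image: DescribedImage): string {
  switch (image.source) {
    case "ai":
      return `Image, machine-generated description: ${image.description}`;
    case "human":
      return `Image: ${image.description}`;
    default:
      return `Image, description of unknown origin: ${image.description}`;
  }
}

console.log(announce({ description: "A river at dusk", source: "ai" }));
```

With a label like this, a user could also choose to skip or shorten machine-generated descriptions while always hearing those written by a person.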

References

  1. S. Abou-Zahra, J. Brewer, and M. Cooper (2018) Artificial Intelligence (AI) For Web Accessibility: Is Conformance Evaluation A Way Forward? In Proceedings of the 15th International Web for All Conference (pp. 1-4).
  2. S. C. Baker (2014) Making It Work For Everyone: HTML5 And CSS Level 3 For Responsive, Accessible Design On Your Library's Web Site. Journal Of Library & Information Services In Distance Learning 8.3-4: 118-136.
  3. K. Banai, D. Abrams, and N. Kraus (2007) Sensory-based Learning Disability: Insights From Brainstem Processing Of Speech Sounds. International Journal Of Audiology 46.9: 524-532.
  4. M. Du Feu and K. Fergusson (2003) Sensory Impairment And Mental Health. Advances In Psychiatric Treatment 9.2: 95-103.
  5. T. Klingberg (2009) The Overflowing Brain: Information Overload And The Limits Of Working Memory. Oxford University Press.
  6. N. Soiffer and S. Noble (2019) Mathematics And Statistics. In Web Accessibility (pp. 417-443). Springer, London.
  7. J. J. White (2020) The Accessibility Of Mathematical Notation On The Web And Beyond. Journal Of Science Education For Students With Disabilities 23.1: n1.