Introduction
As an industry, we are constantly seeking scalable solutions to the critical challenge of image accessibility, and generative AI has emerged as a particularly powerful, dual-natured tool. Effectively integrating AI, however, requires us to understand its inherent limitations and carefully guide its alt text authoring. This article will examine why simply implementing automated AI alt text authoring is insufficient, demonstrating that designing Human-in-the-Loop (HITL) workflows is essential to ensure that image descriptions meet both technical compliance and functional user needs for true equity and context.
The world of alt text authoring has become curiouser and curiouser with the introduction of Generative AI as an author. AI promises to solve image accessibility at scale, offering instant, affordable alt text.
During a recent talk at the New York Public Library Accessible Technology Conference, Scribely’s Chief Product Officer, Erin Coleman, discussed AI’s dual nature as a powerful yet still-developing alt text author by examining its strengths and weaknesses. She also shared how to build resilient workflows with human-in-the-loop strategies that create virtuous alt text cycles, where expert human input goes beyond simple error correction to help AI become a wiser, more discerning alt text author.
This article explores the three biggest flaws in AI-generated alt text, using the iconic images and narrative of Alice in Wonderland to show why a human partner is non-negotiable.
What Are the Main Flaws in AI-Generated Alt Text?
Like the nonsensical rules of Wonderland, automated AI descriptions can fail at the three essentials of meaningful access: context, accuracy, and equity.

1. Lack of Context: The Grin Without the Cat
The Problem: AI alt text can lack context. It sees an image in isolation, completely divorced from your brand, your website, or the image's purpose.
In Wonderland, Alice asks the Cheshire Cat, "Would you tell me, please, which way I ought to go from here?" His answer: "That depends a good deal on where you want to get to."
This fundamental question of "where you want to get to," the intent, is precisely where AI alt text struggles. AI can describe the "what" of an image but not the "why": the reason the image matters in its context. Without context, AI generates a description that is like the Cheshire Cat's grin floating in mid-air: a technically present feature, but one that's disembodied from any real meaning or purpose.
Alt text that lacks context is functionally useless. It fails to tell the user why the image is there. A well-constructed alt text workflow, built on context engineering, needs to give the AI alt text author the information behind the image it is describing, such as image intent, surrounding circumstances, and brand tone.

2. Lack of Accuracy: "Always Tea Time"
The Problem: AI alt text can be factually inaccurate. Because of AI models’ architecture and the statistically driven fluency of pre-trained language models, an AI alt text author can "hallucinate," a term for when it confabulates, fabricates, or states a delusion as fact.
This means the AI may misidentify objects or invent details that simply aren't there.
At the Mad Tea Party, the Mad Hatter insists it's "always tea time," a frustrating, looping inaccuracy. In the same way, an AI can confidently misidentify a product, invent a person in a landscape, or, like the March Hare offering Alice wine when there was none, describe details that don't exist in the visual truth of the image.

3. Lack of Equity: "Sentence First, Verdict Afterwards!"
The Problem: AI alt text authoring can mirror and amplify societal biases. AI models are trained largely on internet content, a dataset filled with imperfect, biased human-generated material.
This means an AI's output can easily replicate and scale harmful stereotypes, resulting in descriptions that are inequitable, reinforce narrow cultural norms, or use damaging language.
Like the Queen of Hearts' mandate of "Sentence first—verdict afterwards!", the AI often generates a description based on arbitrary, biased patterns it learned from its training data. As human alt text managers, we must know when to assert ourselves in the workflow—like Alice standing up to the Queen's absurdity—to ensure fairness and create reason.
What is Context Engineering for AI Alt Text?
To get quality AI alt text authoring we must build a workflow. We call this approach Context Engineering: the method of capturing and feeding essential knowledge to the AI before it writes the first draft.
We must tell the AI alt text author the “why” of an image within its context so that the resulting descriptions stay focused on that context.
It’s about engineering the AI’s steps to get a better description, rather than leaving the process up to chance.
What Information Does AI Need for Quality Alt Text?
To move beyond a simple "grin," an AI needs the "cat." This information includes:
- Authorial Intent: Why was this image placed here? (e.g., “This image is a testimonial to show a happy customer.”)
- Brand Guidelines: How should the tone sound? (e.g., Professional, playful, direct.)
- Equity Constraints: What language should be used or avoided when describing people?
- Experience-Level Context: What is the caption, the page title, the surrounding text, or the product description?
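One way to make these four inputs concrete is to collect them in a single record before any drafting starts. The sketch below is only illustrative: the class name `AltTextContext`, its field names, and the `missing` helper are assumptions made for this example, not part of any particular platform.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AltTextContext:
    """Hypothetical container for the context an AI alt text author receives."""

    authorial_intent: str = ""   # why the image was placed here
    brand_tone: str = ""         # e.g. "professional", "playful", "direct"
    equity_constraints: List[str] = field(default_factory=list)  # language rules for describing people
    page_title: str = ""         # experience-level context
    caption: str = ""            # experience-level context
    surrounding_text: str = ""   # experience-level context

    def missing(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


ctx = AltTextContext(
    authorial_intent="This image is a testimonial to show a happy customer.",
    brand_tone="professional",
    page_title="Customer stories",
)
print(ctx.missing())  # ['equity_constraints', 'caption', 'surrounding_text']
```

Checking for empty fields before drafting gives a human the chance to supply the missing "cat" instead of letting the AI describe only the "grin."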
How Do You Implement Context Engineering?
Alongside the image, context information can be provided to the AI alt text model to control the output. This can be done by:
- Writing a well-formed prompt whose strong contextual guidance constrains the model’s vast output space. Including the surrounding text or a summary of the intended purpose in the prompt guides the AI model toward contextually appropriate descriptions.
- Providing the model with associated labeled metadata, which allows it to generate far more detailed, accurate, and grounded information than it could from an image alone.
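As a rough illustration of both techniques, the sketch below assembles a text prompt from the intended purpose, brand tone, surrounding text, and labeled metadata. The function name `build_alt_text_prompt`, the prompt wording, and the 150-character limit are all assumptions chosen for this example. No specific model or vendor API is implied, and the resulting string would be sent alongside the image by whatever client you use.

```python
def build_alt_text_prompt(intent, tone, surrounding_text, metadata=None, max_chars=150):
    """Assemble a text prompt to accompany an image (hypothetical format)."""
    lines = [
        "Write alt text for the attached image.",
        f"Purpose of the image: {intent}",
        f"Brand tone: {tone}",
        f"Surrounding page text: {surrounding_text}",
    ]
    # Labeled metadata grounds the description in facts the image alone cannot supply.
    for label, value in sorted((metadata or {}).items()):
        lines.append(f"Known fact - {label}: {value}")
    # max_chars is an arbitrary illustrative limit, not a standard.
    lines.append(
        f"Describe only what is visible or stated above, in at most {max_chars} characters."
    )
    return "\n".join(lines)


prompt = build_alt_text_prompt(
    intent="Testimonial showing a happy customer",
    tone="professional",
    surrounding_text="Hear from people who use our teapots every day.",
    metadata={"product_name": "Stoneware teapot", "color": "blue"},
)
print(prompt)
```

Keeping the prompt assembly in one function makes it easier to review and adjust the contextual guidance separately from the model call itself.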
What is a Human-in-the-Loop (HITL) Workflow for Alt Text?

A Human-in-the-Loop (HITL) workflow is a resilient process that strategically combines AI's speed with essential human skill. It's not just about error correction; it's about creating a virtuous cycle that makes the AI smarter and more aligned with your goals over time.
To prevent the AI from generating "false appearances"—like the playing cards' nonsensical task of painting white roses red—this workflow ensures a human is there to "paint" the context correctly.
The Steps of a Resilient HITL Workflow
- Ingest & Contextualize: A human provides the image(s) and the contextual inputs (from your Context Engineering process).
- First Draft: The AI generates a first draft based on those specific human inputs.
- Review & Refine: A human expert evaluates the draft for accuracy, context, and equity, making necessary edits.
- Approval & Publication: The human-approved alt text is published.
- Feedback Loop: This is the most crucial step. The human-corrected version and its context are fed back into the system to fine-tune the AI model, making it a wiser, more discerning author for future images.
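The five steps above can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: `run_hitl`, `FeedbackRecord`, `fake_model`, and `fake_reviewer` are hypothetical names, the model and the reviewer are stand-in functions, and fine-tuning itself is not shown. The feedback step here only records the draft and the approved text so they could later feed a fine-tuning or evaluation process.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FeedbackRecord:
    """One reviewed image: what the AI wrote and what the human approved."""

    image_id: str
    context: str
    ai_draft: str
    approved_text: str

    @property
    def was_edited(self) -> bool:
        return self.ai_draft != self.approved_text


def run_hitl(
    image_id: str,
    context: str,
    generate_draft: Callable[[str, str], str],
    review: Callable[[str, str], str],
    feedback_log: List[FeedbackRecord],
) -> str:
    # 1. Ingest & Contextualize: the caller supplies the image and its context.
    # 2. First Draft: the AI writes from those human inputs.
    draft = generate_draft(image_id, context)
    # 3. Review & Refine: a human expert edits for accuracy, context, and equity.
    approved = review(draft, context)
    # 5. Feedback Loop: keep the draft and the correction for later model improvement.
    feedback_log.append(FeedbackRecord(image_id, context, draft, approved))
    # 4. Approval & Publication: only the human-approved text is returned for publishing.
    return approved


def fake_model(image_id: str, context: str) -> str:
    """Stand-in for an AI model call (assumption for this sketch)."""
    return "A woman smiling."


def fake_reviewer(draft: str, context: str) -> str:
    """Stand-in for a human reviewer's edit (assumption for this sketch)."""
    return "A customer smiles while holding her new teapot."


log: List[FeedbackRecord] = []
published = run_hitl(
    "img-001", "Testimonial showing a happy customer", fake_model, fake_reviewer, log
)
print(published)
print(log[0].was_edited)
```

Passing the model and reviewer in as functions keeps the human step explicit: nothing is published unless it has gone through `review`.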
The Future of AI Alt Text: Augmentation, Not Replacement
The lesson from Wonderland is clear: Generative AI is a powerful tool, but without human input, it can easily lead users astray.
True innovation in AI image description lies not in replacing humans, but in augmenting their expertise. As Alice muses, “And what is the use of a book... without pictures or conversations?” Images are critical to our conversations, and we must build systems that make them truly accessible for everyone.
Ready to build your resilient alt text workflow? Scribely’s Context Engineering approach and platform are designed to help you get started.








