Semantic HTML & the Accessibility Tree

Key takeaways

The accessibility tree is built from semantic HTML and ARIA — every interactive element needs a role, an accessible name, and correct state; native HTML elements provide all three for free, while ARIA-on-divs requires manual wiring that is easy to get wrong.
Landmark regions and heading hierarchy create the navigational skeleton that screen reader users rely on; label duplicate landmarks with aria-label and never choose heading levels for their visual size.
The first rule of ARIA holds: prefer native HTML over custom ARIA implementations — a <button> provides keyboard behavior, focus management, and implicit semantics that a <div role="button"> requires significant effort to replicate.
Test the accessibility tree directly in browser DevTools, automate with axe-core in CI, and always validate with keyboard-only navigation and at least one screen reader before shipping.
WCAG 2.2 AA is the current practical and ethical baseline (legal requirements vary by jurisdiction, and some still cite earlier versions); target it rather than the outdated 2.0, and treat APCA as a supplementary perceptual lens rather than an adopted compliance requirement.

The full lesson

Semantic HTML is not a formatting preference — it is the contract between your markup and every assistive technology on the planet. When browsers parse your HTML, they build two parallel trees. The first is the DOM tree, which drives rendering. The second is the accessibility tree, which drives screen readers, voice control software, switch devices, and other assistive technologies. Understanding both trees — what feeds them, how they differ, and how ARIA layers on top — turns accessibility from a compliance checklist into a design superpower.

What the Accessibility Tree Actually Is

Every modern browser exposes its parsed document to the operating system’s accessibility APIs. On Windows that’s MSAA/UIA, on macOS it’s NSAccessibility, and on Linux it’s AT-SPI. The OS then passes that structured data to assistive technology. The accessibility tree is the intermediate representation the browser builds for this purpose.

Each node in the accessibility tree carries four core properties:

Property	What it communicates	Example
Role	What kind of thing this is	`button`, `heading`, `list`
Name	A human-readable label	“Submit order”
State	Current condition	`expanded: false`, `checked: true`
Properties	Relational or descriptive metadata	`required`, `haspopup`, `describedby`

Not every DOM node makes it into the accessibility tree. Browsers prune elements with display: none, visibility: hidden, and aria-hidden="true". Decorative images with an empty alt="" are excluded too. This pruning is intentional — flooding assistive technology with irrelevant structure makes navigation harder, not easier.
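As a rough illustration of the pruning described above, the fragment below marks which nodes would typically be exposed and which would be dropped. The text content and image file name are hypothetical, and the exact tree output varies somewhat between browsers.

```html
<!-- Exposed: a button node with the name "Save draft" -->
<button type="button">Save draft</button>

<!-- Pruned: display: none removes the element and its descendants -->
<div style="display: none">
  <button type="button">Hidden action</button>
</div>

<!-- Pruned: visibility: hidden also removes it from the accessibility tree -->
<p style="visibility: hidden">Not announced</p>

<!-- Pruned for assistive technology only: still rendered on screen -->
<span aria-hidden="true">★</span>

<!-- Decorative image: the empty alt tells assistive technology to skip it -->
<img src="divider.png" alt="">
```

Opening the accessibility pane in browser DevTools on a page like this is a quick way to confirm which nodes survive.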

Why Semantic HTML Beats ARIA Every Time

ARIA (Accessible Rich Internet Applications) can add role, name, state, and property information to any element. That sounds like a universal fix — and that is exactly the trap. The W3C’s own guidance states the first rule of ARIA: if a native HTML element already has the semantics and behavior you need, use it. Native semantics give you several things for free:

Keyboard behavior included. A <button> gets focus, activates on Space and Enter, and joins the tab order automatically. A <div role="button"> gets none of that unless you also add tabindex="0" and write keyboard event handlers by hand.
Browser-managed state. A <details>/<summary> pair exposes expanded state to the accessibility tree without a single line of JavaScript.
Implicit ARIA roles. Every semantic element maps to a default ARIA role. <nav> is role="navigation", <main> is role="main", <h2> is role="heading" aria-level="2". You get these automatically.
More robust under churn. Developers constantly add, remove, and refactor JavaScript. Native HTML semantics survive component rewrites. Hand-authored ARIA often does not.

The fundamental mistake is treating ARIA as “accessibility sprinkles” you scatter after building a feature with <div> soup. This creates compounding technical debt. Every JavaScript interaction you wire up also needs ARIA state updates, focus management, and keyboard handling — things browsers handle for free with semantic elements.
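To make that cost concrete, here is a minimal sketch comparing a native button with a <div> that tries to imitate one. The saveDraft function and the element IDs are hypothetical placeholders, and the imitation still leaves out details such as a disabled state.

```html
<!-- Native: focusable, in the tab order, activates on Enter and Space -->
<button type="button" id="native-save">Save draft</button>

<!-- Imitation: role, tabindex, and key handling must all be added by hand -->
<div role="button" tabindex="0" id="fake-save">Save draft</div>

<script>
  // Hypothetical action shared by both controls.
  function saveDraft() {
    console.log("Draft saved");
  }

  document.getElementById("native-save").addEventListener("click", saveDraft);

  const fake = document.getElementById("fake-save");
  fake.addEventListener("click", saveDraft);
  fake.addEventListener("keydown", (event) => {
    // A native button handles these keys itself; a div does not.
    if (event.key === "Enter" || event.key === " ") {
      event.preventDefault(); // stop Space from scrolling the page
      saveDraft();
    }
  });
</script>
```

The native version needs one line of wiring; the imitation needs a role, a tabindex, and a key handler before it even approaches parity.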

Do

Use <button> for clickable actions, <a href> for navigation, <input type="checkbox"> for toggle choices, and <select> for option lists. Let the browser own the interaction model.

Don't

Build interactive controls from <div> or <span> elements with click handlers and then patch them with role="button" or role="checkbox". You will spend 10x the effort and still miss edge cases that native elements handle automatically.

Landmark Regions

Landmark regions are semantic containers that let screen reader users jump directly to major page sections, skipping the header navigation on every page load. Think of them as the assistive-technology equivalent of a table of contents.

Here are the HTML landmark elements and their implicit ARIA roles:

HTML element	Implicit ARIA role	Notes
`<header>` (top-level)	`banner`	Only when not a descendant of `<article>`, `<aside>`, `<main>`, `<nav>`, or `<section>`
`<nav>`	`navigation`	Label with `aria-label` when multiple nav elements exist
`<main>`	`main`	One per page; skip-nav link should target this
`<aside>`	`complementary`	Related but secondary content
`<footer>` (top-level)	`contentinfo`	Only when not a descendant of `<article>`, `<aside>`, `<main>`, `<nav>`, or `<section>`
`<section>` with accessible name	`region`	An unlabeled `<section>` gets no landmark role
`<form>` with accessible name	`form`	An unlabeled `<form>` gets no landmark role
`<search>`	`search`	New in HTML Living Standard; use for site search inputs

The common failure is having multiple <nav> elements with no distinguishing labels. A screen reader user who opens the landmarks list sees “navigation, navigation, navigation” and has no idea which is the primary nav, the breadcrumb, or the footer links. Fix this with a distinct aria-label on each one, such as aria-label="Primary", aria-label="Breadcrumb", and aria-label="Footer". Leave the word “navigation” out of the label, because screen readers already announce the role.
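A page skeleton along these lines shows how the landmark elements and labels fit together. It is a sketch under assumptions: the page content, IDs, and label wording are invented for illustration, and the labels are kept short on the expectation that the screen reader appends the role itself (“Primary, navigation”).

```html
<body>
  <a href="#main-content">Skip to main content</a>

  <header>
    <nav aria-label="Primary">...</nav>
  </header>

  <nav aria-label="Breadcrumb">...</nav>

  <main id="main-content">
    <h1>Order history</h1>
    <!-- A section becomes a region landmark only because it is named -->
    <section aria-labelledby="recent-heading">
      <h2 id="recent-heading">Recent orders</h2>
    </section>
  </main>

  <aside aria-label="Related help articles">...</aside>

  <footer>
    <nav aria-label="Footer">...</nav>
  </footer>
</body>
```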

Heading Hierarchy: Structure Over Style

Headings do two jobs at once. They communicate document structure to assistive technology, and they let screen reader users navigate by heading level, a feature that gets used constantly. WebAIM's screen reader user surveys consistently report heading navigation as the most common way respondents find information on a page. A broken heading hierarchy is disorienting in the same way a document with scrambled chapter numbers would be.

Rules for heading hierarchy in 2026:

There is exactly one <h1> per page, representing the main topic.
Headings nest in order: <h2> for major sections, <h3> for subsections of <h2>, and so on. Never skip a level going down.
Heading level communicates structure, not visual size. If an <h4> looks too small visually, restyle it with CSS; do not swap it for an <h2> or <h3> just to get larger text.
Design systems should map heading scale tokens to visual size independently of semantic level. An <h2> inside a compact card widget may legitimately appear smaller than an <h3> on the marketing homepage — that is fine as long as the document outline stays logical.

The habit to eliminate: choosing heading tags for their default browser styling. Picking <h3> because its default rendering (bold, roughly 19px in most browsers) looks right, rather than for document structure, creates heading outlines that jump from h1 to h4 to h2. That is confusing to navigate and is commonly flagged under WCAG 1.3.1 (Info and Relationships).
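One way to keep level and size independent is to style headings through utility classes rather than through the tag. The class names and sizes below are hypothetical tokens, not part of any particular design system; indentation is only there to make the outline visible.

```html
<style>
  /* Hypothetical size tokens, applied independently of heading level */
  .text-lg { font-size: 1.5rem; }
  .text-sm { font-size: 1rem; }
</style>

<h1>Account settings</h1>
  <h2 class="text-lg">Profile</h2>
    <h3>Display name</h3>
    <h3>Avatar</h3>
  <h2 class="text-lg">Security</h2>
    <!-- Compact card: still an h3 in the outline, just styled smaller -->
    <h3 class="text-sm">Two-factor authentication</h3>
```

The outline reads h1, h2, h3, h3, h2, h3 regardless of how large any individual heading is drawn.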

Images, Icons, and the Name Computation

Every meaningful image needs an accessible name. Every decorative image needs to be explicitly hidden. There is no acceptable middle ground.

The accessible name computation algorithm, defined in the W3C Accessible Name and Description Computation (AccName) specification, determines what a screen reader announces for an element. The browser checks sources in this priority order for most elements:

aria-labelledby (references another element’s text by ID)
aria-label (an inline string override)
Native HTML naming: <img alt>, a <label> associated with an <input>, <button> text content, <figure><figcaption>, etc.
title attribute (last resort — tooltip-dependent and unreliable)

For images specifically:

Informative images: Describe the content and meaning, not just the appearance. For example: “Bar chart showing 40% increase in Q2 revenue.”
Decorative images: Use alt="" — the empty string explicitly tells assistive technology to skip the element entirely.
Functional images (inside links or buttons): Alt text should describe the destination or action. Use “Search” for a magnifying-glass icon inside a <button>.
Inline SVG icons: Add aria-label on the wrapping <button> or <a>, then add aria-hidden="true" on the SVG itself to stop the browser from exposing raw SVG structure.
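The four cases above might look like this in markup. File names, link targets, and the “Acme” brand are placeholders invented for the example.

```html
<!-- Informative: describe the meaning, not just the appearance -->
<img src="q2-revenue.png" alt="Bar chart showing 40% increase in Q2 revenue">

<!-- Decorative: empty alt so assistive technology skips it -->
<img src="flourish.png" alt="">

<!-- Functional image inside a link: describe the destination -->
<a href="/"><img src="logo.png" alt="Acme home page"></a>

<!-- Inline SVG icon: the name lives on the button, the SVG is hidden -->
<button type="button" aria-label="Search">
  <svg aria-hidden="true" width="16" height="16" viewBox="0 0 16 16">
    <circle cx="7" cy="7" r="5" fill="none" stroke="currentColor" />
    <line x1="11" y1="11" x2="15" y2="15" stroke="currentColor" />
  </svg>
</button>
```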

Forms: Labels, Errors, and Grouping

Forms are where accessibility failures cause the most direct harm — users cannot complete a purchase, file a claim, or book an appointment. WCAG 2.2 criteria 1.3.1, 3.3.1, and 3.3.2 all converge here.

Labels: Every form control must have a programmatically associated label. The right method is pairing <label for="email"> with a matching id="email" on the input. Using placeholder as the only label risks failing WCAG 3.3.2 (Labels or Instructions) and creates a real usability problem: placeholder text disappears when the user starts typing, leaving them with no reminder of what the field expects. This is one of the clearest cases where an outdated habit, placeholder-as-label, conflicts with both accessibility standards and usability research.
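Side by side, the two approaches look like this. The id value is arbitrary and chosen only for the example.

```html
<!-- Avoid: the placeholder is the only hint and vanishes on input -->
<input type="email" placeholder="Email address">

<!-- Prefer: a visible label tied to the input through for/id -->
<label for="email-signup">Email address</label>
<input id="email-signup" type="email" autocomplete="email">
```

Clicking the label in the second version also moves focus to the input, which enlarges the touch target at no extra cost.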

Error messages: When a field fails validation, the error message must be both visually visible and programmatically linked. Use aria-describedby to point the input at its error element, aria-invalid="true" to signal the error state, and role="alert" on the error message so it is announced immediately when injected into the DOM:

<label for="email">Email address</label>
<input
  id="email"
  type="email"
  aria-describedby="email-error"
  aria-invalid="true"
/>
<span id="email-error" role="alert">
  Enter a valid email address, for example name@example.com
</span>

Specific error messages with a concrete example — like the one above — dramatically outperform generic “Invalid input” messages in both usability and accessibility. The modern pattern is inline validation on blur, not on submit only.
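A minimal sketch of validation on blur, assuming the browser's built-in constraint validation is acceptable for the field. The IDs and the message text are illustrative, and a production form would usually also validate on submit.

```html
<label for="contact-email">Email address</label>
<input
  id="contact-email"
  type="email"
  required
  aria-describedby="contact-email-error"
/>
<span id="contact-email-error" role="alert"></span>

<script>
  const input = document.getElementById("contact-email");
  const error = document.getElementById("contact-email-error");

  input.addEventListener("blur", () => {
    // checkValidity() applies the built-in type="email" and required checks.
    const valid = input.checkValidity();
    input.setAttribute("aria-invalid", String(!valid));
    // Changing the text of the role="alert" element triggers the announcement.
    error.textContent = valid
      ? ""
      : "Enter a valid email address, for example name@example.com";
  });
</script>
```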

Grouping: Group related controls with <fieldset> and <legend>. This is the expected pattern for radio groups and checkbox groups, and strongly recommended for multi-field address inputs. Screen readers announce the legend when focus enters the group (some repeat it for every control), providing essential context. Without it, screen reader users hear “Yes” and “No” radio buttons with no indication of what question those options answer.
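For the Yes/No case, a grouped radio pair could be written as follows. The question wording, names, and IDs are invented for illustration.

```html
<fieldset>
  <legend>Do you want order updates by text message?</legend>

  <input type="radio" id="sms-yes" name="sms-updates" value="yes">
  <label for="sms-yes">Yes</label>

  <input type="radio" id="sms-no" name="sms-updates" value="no">
  <label for="sms-no">No</label>
</fieldset>
```

The shared name attribute makes the two inputs one radio group for keyboard users, while the legend supplies the question they answer.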

WCAG 2.2 additions that affect forms: Success Criterion 3.3.7 (Redundant Entry, Level A) requires that information a user has already provided is either auto-populated or available for them to select when it is needed again in the same process. SC 3.3.8 (Accessible Authentication (Minimum), Level AA) prohibits requiring a cognitive function test, such as solving a puzzle, transcribing distorted text, or recalling a password without support, unless an alternative method or an assistive mechanism (for example, password manager and paste support) is available.
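One possible way to address both criteria in a checkout flow is sketched below: a “same as shipping” control reuses an earlier answer, and the password field is left open to password managers and paste. Field names and IDs are hypothetical, and a real form would copy every address field rather than one.

```html
<label for="shipping-street">Shipping street address</label>
<input id="shipping-street" name="shipping-street" autocomplete="shipping street-address">

<input type="checkbox" id="same-as-shipping">
<label for="same-as-shipping">Billing address is the same as shipping</label>

<label for="billing-street">Billing street address</label>
<input id="billing-street" name="billing-street" autocomplete="billing street-address">

<!-- Sign-in: autocomplete lets a password manager fill the field; paste is not blocked -->
<label for="password">Password</label>
<input id="password" name="password" type="password" autocomplete="current-password">

<script>
  const shipping = document.getElementById("shipping-street");
  const billing = document.getElementById("billing-street");
  const sameAsShipping = document.getElementById("same-as-shipping");

  // Offer the earlier answer instead of asking the user to retype it.
  sameAsShipping.addEventListener("change", () => {
    if (sameAsShipping.checked) {
      billing.value = shipping.value;
    }
  });
</script>
```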

Interactive Components and ARIA Authoring Patterns

Sometimes native HTML cannot satisfy a design requirement: a tab panel, a combobox with custom filtering, or a tree view. In those cases, the ARIA Authoring Practices Guide (APG) describes the expected keyboard behavior and ARIA semantics. The APG is guidance with reference examples, not a normative specification and not a production code library.

Every custom interactive component must fulfill a three-part contract:

Role: Declare the widget role on the element that plays that part, for example role="tablist" on the tab container, role="combobox" on the text input, or role="tree" on the tree root.
State and properties: Keep ARIA state attributes synchronized with actual UI state in JavaScript. aria-expanded="false" on a collapsed panel must flip to aria-expanded="true" when the panel opens. Stale ARIA state is worse than no ARIA — it actively misleads assistive technology.
Keyboard interaction: Implement the keyboard contract for that widget role. For tabs: arrow keys move between tabs, Enter or Space activates. For combobox: arrow keys navigate options, Escape closes. Deviating from APG patterns confuses screen reader users who have learned the conventions.
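The three-part contract can be sketched with a two-tab widget. This is a simplified illustration rather than a full APG implementation: it uses the automatic-activation variant (selection follows focus), omits Home and End handling, and all IDs and labels are invented.

```html
<div role="tablist" aria-label="Account sections">
  <button role="tab" id="tab-profile" aria-selected="true"
          aria-controls="panel-profile">Profile</button>
  <button role="tab" id="tab-billing" aria-selected="false"
          aria-controls="panel-billing" tabindex="-1">Billing</button>
</div>
<div role="tabpanel" id="panel-profile" aria-labelledby="tab-profile" tabindex="0">
  Profile settings
</div>
<div role="tabpanel" id="panel-billing" aria-labelledby="tab-billing" tabindex="0" hidden>
  Billing settings
</div>

<script>
  const tabs = Array.from(document.querySelectorAll('[role="tab"]'));

  function selectTab(selected) {
    tabs.forEach((tab) => {
      const isSelected = tab === selected;
      // Keep ARIA state and the roving tabindex in sync with the visible panel.
      tab.setAttribute("aria-selected", String(isSelected));
      tab.tabIndex = isSelected ? 0 : -1;
      document.getElementById(tab.getAttribute("aria-controls")).hidden = !isSelected;
    });
    selected.focus();
  }

  tabs.forEach((tab, index) => {
    tab.addEventListener("click", () => selectTab(tab));
    tab.addEventListener("keydown", (event) => {
      let next = null;
      if (event.key === "ArrowRight") next = tabs[(index + 1) % tabs.length];
      if (event.key === "ArrowLeft") next = tabs[(index - 1 + tabs.length) % tabs.length];
      if (next) {
        event.preventDefault();
        selectTab(next); // automatic activation: selection follows focus
      }
    });
  });
</script>
```

Because the tabs are real <button> elements, Enter and Space already fire click, so only the arrow keys need custom handling.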

One critical pattern for 2026: focus management in dialogs. When a modal dialog opens, focus must move into it. When it closes, focus must return to the trigger element. A native <dialog> opened with showModal() handles both and makes the rest of the page inert automatically. For custom dialogs, or a <dialog> shown through the open attribute (which is non-modal), use the inert attribute, now supported across all major browsers, to keep focus inside the dialog. It marks all outside content as non-interactive and invisible to the accessibility tree at the same time, replacing the fragile tabindex loop hacks that were the previous standard:

<!-- When a dialog shown without showModal() is open, mark the rest of the page inert: -->
<main inert>...</main>
<dialog open>...</dialog>
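Where a native modal is an option, showModal() covers most of this without manual inert handling. The sketch below assumes a simple confirm dialog; the IDs, labels, and the explicit focus restore are illustrative choices.

```html
<button type="button" id="open-dialog">Delete account</button>

<dialog id="confirm-dialog" aria-labelledby="confirm-title">
  <h2 id="confirm-title">Delete your account?</h2>
  <!-- method="dialog" closes the dialog and records the button's value -->
  <form method="dialog">
    <button value="cancel">Cancel</button>
    <button value="confirm">Delete</button>
  </form>
</dialog>

<script>
  const trigger = document.getElementById("open-dialog");
  const dialog = document.getElementById("confirm-dialog");

  // showModal() moves focus into the dialog and makes the rest of the page inert.
  trigger.addEventListener("click", () => dialog.showModal());

  // Browsers generally restore focus to the previously focused element on close;
  // setting it explicitly here is a defensive assumption, not a requirement.
  dialog.addEventListener("close", () => {
    trigger.focus();
    console.log("Dialog closed with:", dialog.returnValue);
  });
</script>
```

Pressing Escape also closes a modal opened this way, which fires the same close event.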

Testing the Accessibility Tree End-to-End

Automated tools catch only a minority of WCAG failures; commonly cited estimates are roughly 30–40%. The rest require manual testing.

Automated baseline: Use axe DevTools (Deque's browser extension) for local checks and axe-core integrated in your CI pipeline. These reliably catch missing alt text, color contrast failures, duplicate IDs, missing form labels, and invalid ARIA usage. Run them on every pull request.
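For a quick look at what axe-core reports, the library can be loaded directly into a test page and run against the document. The script path below is an assumption about where a local install places the bundled file; CI setups typically drive the same engine through a test-runner integration instead.

```html
<!-- Path is an assumption: point it at wherever axe-core's bundled script lives -->
<script src="node_modules/axe-core/axe.min.js"></script>
<script>
  // axe.run() resolves with a results object; violations is an array of failed rules.
  axe.run(document).then((results) => {
    results.violations.forEach((violation) => {
      console.log(violation.id, violation.impact, violation.nodes.length + " node(s)");
    });
    if (results.violations.length > 0) {
      console.error("Accessibility violations found: " + results.violations.length);
    }
  });
</script>
```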

Manual keyboard testing: Navigate your entire user flow using only Tab, Shift+Tab, Enter, Space, and arrow keys. Every interactive element must be reachable, operable, and self-explanatory from its focused state alone. Focus must never be lost or trapped unintentionally. Check that focus indicators are visible (SC 2.4.7, Focus Visible) and that the focused element itself stays in view: WCAG 2.2 SC 2.4.11 (Focus Not Obscured (Minimum), AA) requires that focused components are not entirely hidden by sticky headers, cookie banners, or chat widgets.

Screen reader testing: The minimum viable testing matrix for 2026:

Screen reader	Browser	Platform
NVDA (free)	Firefox or Chrome	Windows
JAWS	Chrome or Edge	Windows
VoiceOver	Safari	macOS / iOS
TalkBack	Chrome	Android

Navigate by landmarks, by headings, and by interactive elements using each screen reader’s browse-mode shortcuts. Verify that dynamic content updates — alerts, status messages, live regions — are announced without disrupting reading flow.
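A small page for exercising the status-message case is sketched below. It assumes a polite live region is the right politeness level for a cart update; the IDs and message text are invented.

```html
<!-- Polite status region: announced without interrupting current speech -->
<div role="status" id="cart-status"></div>

<button type="button" id="add-to-cart">Add to cart</button>

<script>
  const status = document.getElementById("cart-status");
  let count = 0;

  document.getElementById("add-to-cart").addEventListener("click", () => {
    count += 1;
    // The region already exists in the DOM; only its text changes.
    status.textContent = count + " item(s) in cart";
  });
</script>
```

With a screen reader running, activating the button should produce the updated count while focus stays on the button.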