Taxonomies, Thesauri & Ontologies
Master the three knowledge organization structures that sit beneath every navigable, searchable product — and learn when to reach for each one.
Information that cannot be found might as well not exist. Behind every well-organized navigation system, every search that returns useful results, and every filter panel that doesn’t leave users empty-handed, there are invisible knowledge structures at work.
Taxonomies, thesauri, and ontologies are the three most important of those structures. Most product designers either confuse them or ignore them entirely, reaching for flat tag clouds when a controlled vocabulary would serve users far better. Understanding the differences — and knowing when each one is the right tool — is foundational to designing products where content stays findable as it grows.
Why Knowledge Organization Structures Matter in UX
When a product has fewer than a hundred content items, almost any labeling scheme will work. Past a few hundred items, the cracks appear. Navigation labels that made sense on day one now cover too much ground. Search returns irrelevant results because the same concept has three different names. Filter panels fail because items haven’t been consistently categorized.
These are not content problems. They are knowledge organization problems.
Taxonomies, thesauri, and ontologies come from library science and knowledge management, disciplines that were organizing information long before the web existed. Each one solves a distinct problem:
- Taxonomies define hierarchy and membership — where does this item belong?
- Thesauri define equivalence and relatedness between terms — what else should we search or browse for?
- Ontologies define typed relationships between concepts — how does this thing relate to that thing, and what kind of relationship is it?
UX designers typically own the first, collaborate on the second, and need to understand the third even if they don’t build it themselves.
Taxonomies: Hierarchy and Membership
A taxonomy is a hierarchical classification of concepts or content. The word comes from the Greek taxis (arrangement) and nomia (method). In its purest form, it is a tree: a root category divides into parent categories, which divide into children, which divide into leaves. Every item belongs to exactly one node.
The classic example is biological classification: Kingdom → Phylum → Class → Order → Family → Genus → Species. In UX, taxonomies appear as site category structures (“Men’s Clothing → Tops → T-Shirts”), topic tag hierarchies (“Technology → Software → Design Tools”), and content type trees (“Resources → Templates → Email Templates”).
Flat vs. Hierarchical vs. Polyhierarchical
Taxonomies exist on a structural spectrum:
| Structure | Description | Best for |
|---|---|---|
| Flat list | All terms at one level, no hierarchy | Small, stable domains (under ~50 terms) |
| Hierarchical | Multi-level tree, single parent per node | Navigation menus, breadcrumb trails, site structure |
| Polyhierarchy | Multi-level tree, multiple parents allowed | Large catalogs where items cross categories (e.g. a camera that is both “Photography” and “Video Production”) |
Polyhierarchy is often the right answer for e-commerce and content-rich products. But it introduces complexity: if an item lives in three places at once, breadcrumb trails become ambiguous, URL structures become contested, and analytics attribution gets messy. Those are solvable problems — but you must design for them explicitly.
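To make the breadcrumb ambiguity concrete, here is a minimal sketch of a polyhierarchy stored as a child-to-parents map. The category names, the `PARENTS` table, and the "first parent listed is canonical" rule are all assumptions made for illustration, not a standard convention. The sketch also assumes the data contains no cycles.

```python
from typing import Dict, List

# Hypothetical polyhierarchy: each term maps to its parent terms.
# Assumption for this sketch: the first parent listed is the canonical one.
PARENTS: Dict[str, List[str]] = {
    "Mirrorless Camera": ["Photography", "Video Production"],
    "Photography": ["Electronics"],
    "Video Production": ["Electronics"],
    "Electronics": [],
}


def all_paths(term: str) -> List[List[str]]:
    """Return every root-to-term path through the polyhierarchy."""
    parents = PARENTS.get(term, [])
    if not parents:
        return [[term]]
    paths = []
    for parent in parents:
        for path in all_paths(parent):
            paths.append(path + [term])
    return paths


def canonical_breadcrumb(term: str) -> List[str]:
    """Follow only the first-listed parent at each level."""
    trail = [term]
    while PARENTS.get(trail[0]):
        trail.insert(0, PARENTS[trail[0]][0])
    return trail


if __name__ == "__main__":
    for path in all_paths("Mirrorless Camera"):
        print(" > ".join(path))
    print("canonical:", " > ".join(canonical_breadcrumb("Mirrorless Camera")))
```

One way to read this: `all_paths` is what navigation and filters can use, while `canonical_breadcrumb` gives URLs and analytics a single agreed answer. Which parent counts as canonical is a design decision that the data has to record explicitly.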
Depth vs. Breadth Trade-offs
A taxonomy that is too deep forces users to drill through many layers. That increases the cognitive cost of finding anything. A taxonomy that is too broad buries content without giving users any meaningful orientation.
A common rule of thumb, often attributed to Peter Morville and Louis Rosenfeld’s information architecture work and to card-sorting research, is that navigation structures tend to work best when they stay between two and four levels deep, with no more than roughly seven to nine sibling categories at any level. The sibling limit is usually linked to how many categorical distinctions people can hold in working memory at once. Treat these numbers as a starting point to test with your own users, not as hard rules.
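The depth and breadth guideline can be turned into a quick automated check. The sketch below assumes a single-parent taxonomy stored as a parent-to-children map, counts the root as level 1, and uses the thresholds from the text as adjustable defaults. The `TREE` data and the `audit_taxonomy` name are hypothetical.

```python
from typing import Dict, List

# Hypothetical single-parent taxonomy: each category maps to its children.
TREE: Dict[str, List[str]] = {
    "Clothing": ["Men's Clothing"],
    "Men's Clothing": ["Tops"],
    "Tops": ["T-Shirts"],
}


def audit_taxonomy(
    tree: Dict[str, List[str]],
    root: str,
    max_depth: int = 4,
    max_siblings: int = 9,
) -> List[str]:
    """Return sorted warnings for nodes that are too deep or too broad.

    Assumption: the root counts as level 1.
    """
    warnings = []
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        children = tree.get(node, [])
        if depth > max_depth:
            warnings.append(f"{node}: depth {depth} exceeds {max_depth}")
        if len(children) > max_siblings:
            warnings.append(f"{node}: {len(children)} children exceeds {max_siblings}")
        for child in children:
            stack.append((child, depth + 1))
    return sorted(warnings)


if __name__ == "__main__":
    print(audit_taxonomy(TREE, "Clothing") or "no warnings")
```

A warning from a check like this is a prompt to run a tree test on that branch, not proof that the structure is wrong.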
Thesauri: Equivalence and Relatedness
A thesaurus in the knowledge-organization sense is not a synonym finder. It is a controlled vocabulary with declared relationships between terms. The defining standard is ISO 25964, which replaced the older ISO 2788 and ISO 5964 standards in 2011–2013. A thesaurus has three core relationship types:
- Broader/Narrower (BT/NT): hierarchical relationships (“Footwear” is broader than “Sneakers”)
- Related (RT): associative relationships that are not hierarchical (“Sneakers” relates to “Athletic Apparel”)
- Use/Used For (UF): equivalence relationships linking a preferred term to its synonyms (“Running shoes USE Sneakers”: “Sneakers” is the preferred term and “Running shoes” is the entry term)
The Use/Used For relationship is what makes thesauri so powerful for search. When a user queries “running shoes” and your index only contains the term “sneakers,” they get zero results — a classic search failure. A thesaurus maps “running shoes” to the preferred term “sneakers” automatically, so the search system expands the query and finds what the user is actually looking for.
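Here is a minimal sketch of that query expansion. It assumes equivalence mappings are stored as an entry-term-to-preferred-term table. The `USE` table, the extra entry term “trainers,” the toy `INDEX`, and the SKU identifiers are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical equivalence mappings: entry term -> preferred term ("USE").
USE: Dict[str, str] = {
    "running shoes": "sneakers",
    "trainers": "sneakers",  # extra entry term added for illustration
}

# Hypothetical search index that only knows the preferred term.
INDEX: Dict[str, List[str]] = {"sneakers": ["SKU-1", "SKU-2"]}


def preferred(term: str) -> str:
    """Resolve a term to its preferred form (or itself if unmapped)."""
    key = term.strip().lower()
    return USE.get(key, key)


def expand_query(term: str) -> Set[str]:
    """Return the preferred term plus every entry term that points to it."""
    pref = preferred(term)
    return {pref} | {entry for entry, target in USE.items() if target == pref}


def search(term: str) -> List[str]:
    """Look up every expanded term in the index and collect the hits."""
    hits: List[str] = []
    for expanded in sorted(expand_query(term)):
        hits.extend(INDEX.get(expanded, []))
    return hits


if __name__ == "__main__":
    print(search("Running Shoes"))
```

Without the `USE` table, the same query would look up “running shoes” directly in the index and come back empty.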
Thesauri in Practice: Where They Live
In modern products, thesauri power several systems that designers often touch:
- Search query expansion: The search engine uses the thesaurus to broaden queries with synonyms and related terms. This can substantially improve recall, the share of all relevant items that the search actually returns.
- Auto-suggest and autocomplete: Typing “sofa” surfaces “couch,” “settee,” and “loveseat” as alternatives.
- Faceted navigation: A product tagged with the preferred term “Sneakers” appears in a filter for “Running shoes” because the thesaurus declares the relationship.
- Tag normalization in CMS platforms: When content editors tag articles, the thesaurus enforces that “COVID-19,” “coronavirus,” and “SARS-CoV-2” all resolve to a single canonical term. This prevents fragmented content silos.
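Tag normalization, the last item above, is the easiest of these to sketch. The example assumes a lookup table from lowercase variants to one canonical label. Choosing “COVID-19” as the canonical term and rejecting unknown tags outright are assumptions for illustration; a real CMS might queue unknown tags for review instead.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical mapping from lowercase variants to the canonical label.
CANONICAL: Dict[str, str] = {
    "covid-19": "COVID-19",
    "coronavirus": "COVID-19",
    "sars-cov-2": "COVID-19",
}


def normalize_tags(raw_tags: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split editor-entered tags into (canonical tags, unknown tags).

    Canonical tags are de-duplicated and keep first-seen order.
    """
    accepted: List[str] = []
    rejected: List[str] = []
    for raw in raw_tags:
        canonical = CANONICAL.get(raw.strip().lower())
        if canonical is None:
            rejected.append(raw)
        elif canonical not in accepted:
            accepted.append(canonical)
    return accepted, rejected


if __name__ == "__main__":
    print(normalize_tags(["Coronavirus", " SARS-CoV-2", "pandemic"]))
```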
Do
- Define a preferred term (the canonical label your system will use) and map all synonyms, abbreviations, and regional variants to it.
- Distinguish between true synonyms (map to the same preferred term) and near-synonyms with meaningful distinctions (keep as separate terms, declare as Related).
- Store the thesaurus as a structured, machine-readable file (SKOS-formatted RDF, a database table, or a headless CMS taxonomy field) so it can feed search, navigation, and tag normalization simultaneously.
- Review and update the thesaurus as language evolves — preferred terms get outdated, new terms emerge, and regional usage shifts.
Don't
- Use a flat tag cloud where every editor invents their own tags — this creates synonym fragmentation at scale and makes search recall plummet.
- Treat all relationships as hierarchical when many are associative — forcing “related” terms into a parent/child structure distorts meaning and confuses navigation.
- Build a thesaurus as a static document or spreadsheet that lives outside the systems it is meant to control — drift between the document and the live system is the most common thesaurus failure mode.
- Conflate thesaurus management with taxonomy management — they are complementary but distinct tools and should be governed separately.
Ontologies: Typed Relationships and Rich Semantics
An ontology takes the relational expressiveness of a thesaurus and multiplies it. A thesaurus has three relationship types (hierarchical, associative, and equivalence). An ontology can have dozens or hundreds of named, typed relationships, each carrying specific semantic meaning.
Here is a simple example. A product database for a design tools company might declare:
- Figma is-a Design Tool
- Figma has-feature Component Library
- Figma competes-with Sketch
- Figma integrates-with Storybook
- Component Library is-part-of Design System
These typed relationships let a system answer complex queries: “Show me all design tools that integrate with Storybook and have a component library feature.” A flat taxonomy cannot answer that question. A thesaurus cannot answer that question. An ontology can.
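A minimal sketch of how such a query could run, assuming the declarations are stored as subject-predicate-object triples in memory. The `TRIPLES` set mirrors the example list, plus one extra “Sketch is-a Design Tool” triple added for illustration. A production system would more likely use a graph database or an RDF store.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical in-memory triple store mirroring the example declarations.
TRIPLES: Set[Triple] = {
    ("Figma", "is-a", "Design Tool"),
    ("Figma", "has-feature", "Component Library"),
    ("Figma", "competes-with", "Sketch"),
    ("Figma", "integrates-with", "Storybook"),
    ("Component Library", "is-part-of", "Design System"),
    ("Sketch", "is-a", "Design Tool"),  # added for illustration
}


def subjects(predicate: str, obj: str) -> Set[str]:
    """Return every subject that has the given typed relationship to obj."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def design_tools_with(feature: str, integration: str) -> List[str]:
    """Design tools that have a feature AND integrate with a given product."""
    matches = (
        subjects("is-a", "Design Tool")
        & subjects("has-feature", feature)
        & subjects("integrates-with", integration)
    )
    return sorted(matches)


if __name__ == "__main__":
    print(design_tools_with("Component Library", "Storybook"))
```

The query is just a set intersection over three relationship types, which is why it only works once the relationships are named. “Sketch” is excluded here because this toy data declares no features or integrations for it, not because of any claim about the real product.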
When Ontologies Appear in UX Work
Most UX designers do not build ontologies. That work typically belongs to information architects, knowledge engineers, or data architects working on large-scale content systems, enterprise knowledge bases, or semantic platforms. But designers encounter the consequences of ontological decisions constantly:
- Recommendation systems that surface “people also viewed” content often draw on ontological relationships between items, alongside behavioral data.
- Knowledge graphs, such as those behind Google Search, Amazon product pages, and Wikidata (the structured-data project that serves Wikipedia), are built on ontologies and power rich search result features.
- AI-powered search (vector search, retrieval-augmented generation) still benefits enormously from explicit ontological relationships. An LLM that knows how concepts relate makes better retrieval decisions than one that only knows what concepts exist.
- Enterprise content management systems that link policies, procedures, roles, and regulations rely on ontologies to surface “you might also need” related content.
The Spectrum: Choosing the Right Structure
The three structures exist on a spectrum of expressiveness and implementation cost:
| Structure | Relationships | Implementation effort | When to reach for it |
|---|---|---|---|
| Taxonomy | Parent/child, sibling | Low — a tree structure | Navigation, site architecture, breadcrumbs, category filters |
| Thesaurus | BT/NT/RT/UF | Medium — a controlled vocabulary store | Search query expansion, tag normalization, synonym handling |
| Ontology | Typed, domain-specific | High — schema design, graph store | Recommendations, knowledge bases, complex faceted search, AI retrieval |
A common mistake is reaching for an ontology when a taxonomy would do, or deploying a flat tag cloud when a thesaurus is warranted. The right call depends on two questions: how many items does the system need to organize, and how many distinct dimensions will users search or filter on?
For a 50-page marketing site, a two-level taxonomy is almost certainly enough. For a 500,000-SKU e-commerce catalog with price, brand, material, color, size, and use-case dimensions, faceted navigation backed by a controlled vocabulary thesaurus is the minimum viable structure. For a healthcare knowledge base that must connect symptoms, treatments, drugs, contraindications, and patient demographics, an ontology is the appropriate tool.
Controlled Vocabularies: The Bridge Concept
Controlled vocabulary is an umbrella term that covers all three structures. It simply means: a defined, managed, approved set of terms used to index and retrieve content. A taxonomy is a controlled vocabulary with hierarchy. A thesaurus is a controlled vocabulary with declared relationships. An ontology is a controlled vocabulary with typed, domain-specific relationships.
In day-to-day design and content work, “controlled vocabulary” is the most useful concept to reach for when advocating for this kind of structure to stakeholders. It is concrete (“we need a list of approved tags that editors must use”) without requiring a full explanation of the IA theory behind it.
SKOS: The Interoperability Standard
The W3C’s Simple Knowledge Organization System (SKOS) has been a W3C Recommendation since 2009. It provides a common data model for expressing taxonomies, thesauri, and classification schemes as linked data. SKOS uses RDF and defines concepts, labels (preferred, alternative, hidden), semantic relationships (broader, narrower, related), and mapping properties for linking concepts across vocabularies (exactMatch, closeMatch, broadMatch, narrowMatch, relatedMatch).
For design teams, SKOS matters mainly as a signal. If your organization’s content engineers are using SKOS-formatted taxonomy data, your navigation and search systems can consume and reuse the same structure without custom mappings. It is worth knowing the term when scoping technical requirements.
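To show roughly what SKOS-formatted data looks like, the sketch below writes one concept as Turtle, a common text syntax for RDF. The `skos:` properties are from the SKOS vocabulary. The `ex:` namespace, the concept identifiers, and the `concept_to_turtle` helper are hypothetical. It builds strings by hand and does not escape quotes in labels, so treat it as a reading aid, not a replacement for an RDF library.

```python
from typing import List

# The skos: namespace is the real one; ex: is a placeholder namespace.
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/vocab/> .\n"
)


def concept_to_turtle(
    concept_id: str,
    pref_label: str,
    alt_labels: List[str],
    broader: List[str],
    related: List[str],
) -> str:
    """Serialize one thesaurus concept as a SKOS Turtle statement."""
    lines = [f"ex:{concept_id} a skos:Concept ;"]
    lines.append(f'    skos:prefLabel "{pref_label}"@en ;')
    for label in alt_labels:
        lines.append(f'    skos:altLabel "{label}"@en ;')
    for parent in broader:
        lines.append(f"    skos:broader ex:{parent} ;")
    for other in related:
        lines.append(f"    skos:related ex:{other} ;")
    # The final property ends the statement with "." instead of ";".
    lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)


if __name__ == "__main__":
    print(PREFIXES)
    print(
        concept_to_turtle(
            "sneakers", "Sneakers", ["Running shoes"], ["footwear"], ["athletic-apparel"]
        )
    )
```

The mapping to thesaurus terms is direct: the preferred term becomes `skos:prefLabel`, entry terms become `skos:altLabel`, BT becomes `skos:broader`, and RT becomes `skos:related`.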
Governance: The Part Designers Often Skip
Knowledge organization structures are not set-and-forget artifacts. They decay. New content gets created using informal terms that drift outside the controlled vocabulary. Business categories change. User language evolves: the everyday word for the device users called a “smartphone” in 2010 may well be different today. Regional and multilingual variants emerge.
Effective taxonomy governance requires four things:
- An owner (often the information architect, a content strategist, or someone in a librarian role on larger teams) who reviews and updates the structure on a defined schedule.
- An editorial process for proposing and approving new terms, with clear criteria for when a synonym becomes a preferred term.
- Analytics integration so that zero-result searches and high-bounce category pages trigger vocabulary reviews. Behavioral data is the most reliable signal that the structure has drifted from user language.
- Version control — taxonomy changes can break existing content metadata, navigation links, and search indexes, so changes need to be tracked and communicated to engineering.
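The analytics integration point can start very small. The sketch below assumes a search log of (query, result count) pairs and a set of terms the vocabulary already knows. It surfaces repeated zero-result queries as candidates for review. The log format, the `min_count` threshold, and the function name are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def review_candidates(
    search_log: Iterable[Tuple[str, int]],
    known_terms: Set[str],
    min_count: int = 3,
) -> List[Tuple[str, int]]:
    """Return (query, miss count) pairs that deserve a vocabulary review.

    A query qualifies if it returned zero results at least min_count times
    and is not already a known preferred or entry term.
    """
    misses = Counter(
        query.strip().lower()
        for query, result_count in search_log
        if result_count == 0
    )
    return [
        (query, count)
        for query, count in misses.most_common()
        if count >= min_count and query not in known_terms
    ]


if __name__ == "__main__":
    log = [("Trainers", 0), ("trainers", 0), ("trainers ", 0), ("sneakers", 12), ("kicks", 0)]
    print(review_candidates(log, known_terms={"sneakers", "running shoes"}))
```

Each candidate still needs a human decision by the vocabulary owner: map it to an existing preferred term, add it as a new term, or ignore it as noise.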
Putting It Together: A Practical Design Workflow
When starting a new product or doing an IA overhaul, follow this sequence:
1. Content inventory and audit first. You cannot design a taxonomy for content you haven’t mapped. Audit what exists, identify key content types, and note the informal labels already in use.
2. Run generative card sorting to discover how users mentally group and label content. This gives you the raw material for your taxonomy and preferred terms.
3. Draft the taxonomy: two to four levels, tested with tree testing before committing.
4. Identify synonym clusters from your card sort and search log analysis. These become your thesaurus equivalence mappings.
5. Define the facets that users will filter on. This shapes whether you need a flat taxonomy (single facet) or a multi-dimensional controlled vocabulary.
6. Validate with tree testing and iterate before handing off to engineering.
7. Establish governance: a named owner, a review cadence, and a process for incorporating new terms from zero-result search queries.
Steps 1–6 are design work. Step 7 is often treated as “someone else’s problem” — and that is exactly why knowledge structures so often decay within eighteen months of launch.