Taxonomy Tools: Requirements and Capabilities

Taxonomy Basics

Taxonomy means different things to different people in different disciplines. 

Biological taxonomy, for example, is intended to place organisms in one and only one category, similar to the call number or Dewey Decimal classification that identifies the one location for a book on the shelf of a library. 

Such unitary classification is convenient but limiting, because things often belong in more than one category. For example, in the Linnaean classification dogs are the species Canis familiaris, genus Canis, family Canidae, order Carnivora, class Mammalia, phylum Chordata, and kingdom Animalia. But dogs are also pets and farm animals. We call taxonomies that are designed to provide multiple contexts for things “faceted taxonomies,” and the classification of things into multiple categories “polyhierarchy.”

Figure 1 – Comparison of biological and faceted taxonomies

Taxonomy Tools Presentation Download

Types of Schemes for Organizing Concepts

There are other types of schemes for organizing concepts: the things that you don’t hold in your hands, but in your head. Some are simpler than taxonomies, and some are more complex or more expressive, meaning that they can convey more nuanced contexts. Table 1 provides short descriptions of various category schemes, and Figure 2 presents these schemes on a continuum from simple to complex.

Table 1 – Types of semantic schemes

Category Scheme | Description
Synonym Ring | A set of words/phrases that can be used interchangeably for searching, e.g., Hypertension or High blood pressure
Controlled Vocabulary | A list of preferred and variant terms, which may have defined hierarchical and associative relationships. A taxonomy is a type of controlled vocabulary.
Authority Files | A controlled vocabulary typically used for names of individuals, organizations, countries, and other named entities.
Taxonomy | A hierarchical scheme that organizes concepts into “is a” or “part of” trees that may be mono- or poly-hierarchical or faceted into discrete divisions
Classification Scheme | An arrangement of knowledge that does not follow taxonomy rules but is usually enumerated, e.g., the Dewey Decimal Classification
Thesaurus | A tool that controls synonyms and identifies the semantic relationships among terms
Ontology | Resembles a faceted taxonomy but uses richer semantic relationships among terms and attributes and strict specification rules

Figure 2 – Semantic schemes: Simple to complex

Another way to think about taxonomy is as a set of fields called a metadata scheme, with controlled values that are used to describe what content is about and why it is important. 

The taxonomy is used to tag content with categories to make content easier to find, to provide ways to group large sets of search results (called search filtering), and to enable web services like RSS feeds and personalization. This has been called “taxonomic metadata.” A taxonomy breaks up a long list of topics into groupings that are easy and natural for different audiences to use to tag and find information.

Steve Papa, the former CEO of Endeca, an early faceted search engine, coined the term “guided navigation” to describe the process of refining a rich metadata search result. Metadata-controlled vocabularies don’t need to be that large or complex to provide the granularity to accomplish this task; four metadata-controlled vocabularies of 10 values each have the same discriminatory power as one taxonomy of 10,000 values. 
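The arithmetic behind this claim is combinatorial: each independent vocabulary multiplies the number of distinct tag combinations. Here is a minimal sketch in Python, assuming each content item takes exactly one value from each vocabulary. The facet names and values are hypothetical placeholders.

```python
from itertools import product
from math import prod

# Hypothetical facets: four controlled vocabularies of 10 values each.
facets = {
    "audience": [f"audience-{i}" for i in range(10)],
    "content_type": [f"type-{i}" for i in range(10)],
    "location": [f"location-{i}" for i in range(10)],
    "topic": [f"topic-{i}" for i in range(10)],
}

# Total number of values an editor has to maintain across all vocabularies.
values_to_maintain = sum(len(values) for values in facets.values())

# Distinct combinations when an item gets one value per facet.
combinations = prod(len(values) for values in facets.values())

# Enumerating every combination confirms the product.
assert combinations == sum(1 for _ in product(*facets.values()))

print(values_to_maintain)  # 40
print(combinations)        # 10000
```

Forty maintained values distinguish as many contexts as a single list of 10,000 categories, which is why a handful of small vocabularies can be enough for search refinement.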

Figure 3 is a simple example of taxonomic metadata. Broad and shallow taxonomies (visualize 10 groups with 4 categories each) have great utility, and are easier to build, maintain, and apply than narrow and deep taxonomies (visualize 4 groups that are 10 levels deep). 

Busch’s Golden Rule: Four metadata-controlled vocabularies of 10 values each have the same discriminatory power as one taxonomy of 10,000 values.

Figure 3 – Taxonomic metadata is a simple metadata scheme with just a few controlled vocabularies

Business Taxonomy Problems

Many brick-and-mortar and online businesses carry a large assortment of products. It is a challenge to devise a high-level product taxonomy that organizes products effectively for merchandising while remaining scalable and maintainable.

The product taxonomy should also be designed to specify the set of product attributes that need to be associated with products in that category. Complete and consistent product attributes are key to picking the specific product you want or need from a large product assortment. 

How does taxonomy translate into a front-end interface? Taxonomic metadata powers the web search interface. Figure 5 shows how on bluefly.com (as on most clothing shopping sites) category, brand, size, and color are the key attributes available to narrow down your search in just a few clicks.

Figure 5 – Searching for shoes on bluefly.com

How can a customer pick from more than 29,000 types of faucets without giving up? Figure 4 shows how customers can refine their search for faucets on homedepot.com by product attributes such as: Category, Price, Brand, Color/Finish, Number of Handles, Series Name, Water Filter, Faucet Spray, Handle Shape, Soap Dispenser, etc.

Figure 4 – Searching for faucets on Homedepot.com
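The refinement behavior on these sites can be sketched as repeated filtering on product attributes, with a count shown next to each remaining value. This is a minimal illustration only: the product records and attribute names are hypothetical and do not reflect how either site is implemented.

```python
from collections import Counter

# Hypothetical product records; attribute names are illustrative only.
products = [
    {"name": "Faucet A", "brand": "Acme", "finish": "Chrome", "handles": 1},
    {"name": "Faucet B", "brand": "Acme", "finish": "Bronze", "handles": 2},
    {"name": "Faucet C", "brand": "Brook", "finish": "Chrome", "handles": 2},
    {"name": "Faucet D", "brand": "Brook", "finish": "Chrome", "handles": 1},
]

def refine(items, **selected):
    """Keep only items whose attributes match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]

def facet_counts(items, facet):
    """Count remaining items per value, as shown next to each filter link."""
    return Counter(item[facet] for item in items)

chrome = refine(products, finish="Chrome")
chrome_single = refine(chrome, handles=1)

print([p["name"] for p in chrome])         # ['Faucet A', 'Faucet C', 'Faucet D']
print(dict(facet_counts(chrome, "brand")))  # {'Acme': 1, 'Brook': 2}
print([p["name"] for p in chrome_single])  # ['Faucet A', 'Faucet D']
```

Each click narrows the result set, and the counts tell the customer how many products remain under each option before they choose it.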

Technical Standards 

There are several technical standards related to taxonomies. These are important because standards enable systems to talk to each other, or interoperate without the need to do custom programming. 

Taxonomy Definitions

So far, we’ve talked about taxonomy from an end user’s perspective—how and why one would want to use this particular type of controlled vocabulary. 

In this section we talk about taxonomy from a technical perspective, in the context of data standards. First, some fundamental definitions related to taxonomy.

  • Concept. A real or imaginary object that is expressed as Terms in the taxonomy.
  • Controlled Vocabulary. A list of terms that have been explicitly enumerated. The terms are controlled and published by a designated authority or authoritative source. If multiple terms are used to mean the same thing, one of the terms is identified as the “Preferred Term” in the Controlled Vocabulary and the other terms are listed as synonyms or aliases.
  • Facet. A grouping of Concepts of the same inherent category. Examples of categories that may be used for grouping Concepts into facets are: Audience, Channels, Components, Content Types, Functions, Industries, Intentions, Lifecycle, Location, Organization, Products, etc.
  • Taxonomy. The core metadata elements and the Controlled Vocabularies required to find, use, and manage content in a collection.

Here are some definitions related to terms in a taxonomy, or a taxonomy data dictionary.

  • UID. The unique identifier for the Concept.
  • Entry Term. The Preferred Term that is used to label a Concept. An entry term is also known as a Descriptor. 
  • Broader Term (BT). A term to which another term or multiple terms is subordinate in a hierarchy. 
  • Narrower Term (NT). A term that is subordinate to another term or to multiple terms in a hierarchy. 
  • Used For Term (UF). A non-preferred term that is equivalent to the Entry Term. Used For Terms may be synonyms, aliases (such as abbreviations) and quasi-synonyms (such as more specific terms).
  • Related Term (RT). A term that is associatively but not hierarchically linked to another term in a Controlled Vocabulary. 
  • Scope Note (SN). A note following a term explaining its source, rationale, coverage, specialized usage, or rules for assigning it. 

Finally, these are the common taxonomy term relationships.

  • Associative Relationship. A relationship between or among terms that leads from one term to other terms that are related to or associated with it. An Associative Relationship is a Related Term (RT) or cross-reference relationship.
  • Equivalence Relationship. A relationship between or among terms in a Controlled Vocabulary that leads to one or more terms that are to be used instead of the term from which the reference is made. An Equivalence Relationship is a Used For Term (UF) relationship.
  • Hierarchical Relationship. A relationship between or among terms in a Controlled Vocabulary that depicts broader (generic) to narrower (specific) or whole-part relationships. A Hierarchical relationship is a Broader Term (BT) to Narrower Term (NT) relationship.

Figure 6 is a simple example of a Concept (IBM) and some of its terms and their relationships.

Figure 6 – A simple example of a Concept, and its terms and relationships
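The definitions above map naturally onto a small data structure. The following Python sketch is one possible representation, not a standard schema: the class names, field names, and sample concepts are assumptions made for illustration, and the sample data is not taken from the figure.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """One taxonomy concept, using the data-dictionary fields defined above."""
    uid: str                                       # UID
    entry_term: str                                # Preferred Term / Descriptor
    broader: list = field(default_factory=list)    # BT: UIDs of parent concepts
    used_for: list = field(default_factory=list)   # UF: non-preferred terms
    related: list = field(default_factory=list)    # RT: UIDs of associated concepts
    scope_note: str = ""                           # SN

class Vocabulary:
    def __init__(self):
        self.concepts = {}

    def add(self, concept):
        self.concepts[concept.uid] = concept

    def narrower(self, uid):
        """NT is the inverse of BT, so derive it instead of storing it twice."""
        return sorted(c.uid for c in self.concepts.values() if uid in c.broader)

    def resolve(self, term):
        """Equivalence relationship: map an entry term or UF term to its concept."""
        wanted = term.casefold()
        for c in self.concepts.values():
            labels = [c.entry_term] + c.used_for
            if wanted in (label.casefold() for label in labels):
                return c
        return None

# Hypothetical sample data.
vocab = Vocabulary()
vocab.add(Concept("c1", "Organizations"))
vocab.add(Concept("c2", "Technology companies", broader=["c1"]))
vocab.add(Concept("c3", "IBM", broader=["c2"],
                  used_for=["International Business Machines"],
                  scope_note="Use for the corporation, not its products."))

print(vocab.resolve("international business machines").entry_term)  # IBM
print(vocab.narrower("c1"))  # ['c2']
```

Because `broader` is a list, a concept can have more than one parent, which is how polyhierarchy would be expressed in this sketch.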

Taxonomy Development Process

There are a variety of methods for developing a taxonomy as summarized in Table 2. 

Taxonomy Strategies has used all of these methods at one time or another over many years, adapting them for each project based on our judgment of what will be most effective given the requirements and timeframe for the project as well as the organizational culture. Generally, we adopt a hybrid or best of breed approach to taxonomy development. 

Table 2 – Taxonomy development methods

Method | Description | Pros & Cons
Automated Analysis | Analyze content using automated methods to identify key concepts. | Very good for testing, but not very good for taxonomy construction.
Workshopping | Guide stakeholder group in activities to identify key concepts. | Can be good for building up a team and getting buy-in, but is not a fast method.
Strawman | Prepare a best guess, then bring it to the table to discuss. | Can speed discussions; however, a strawman developed before any client input has been received will always be off-target.
Adapt Existing Vocabularies | Customize internal terminology, industry standards, etc. | A fast method that can reduce some of the acceptance problems of the strawman approach. However, existing vocabularies developed for one purpose (such as recognizing revenue across product lines) may be ill-suited for other purposes (such as allowing customers to search a website for product information).
Hybrid | Combination of some or all of these methods. | Allows a solution that builds on the advantages and minimizes the disadvantages mentioned above. However, it relies on having experienced consultants in order to make the proper choice of methods.

Figure 7 shows the key components of a successful taxonomy development process. 

Figure 7 – Taxonomy development process

  • Taxonomy Team. Projects require a team that will be dedicated to working on the project over a period of time. This should include a business sponsor and internal stakeholders, as well as a project manager and technical team who will do the bulk of the work. 
  • Identify the Business Case. Agreement on the business goals of the project is critical to obtaining executive sponsorship for your taxonomy project. 

Examples of business goals:

  • Improve search and browsing to reduce the amount of time employees spend looking for information. 
  • Reduce business silos, foster collaboration and content reuse, and reduce redundant work.
  • Reduce the amount of time employees spend emailing basic information to each other. 
  • Build confidence that employees are getting the most up to date information, and increase employee loyalty by helping them stay “up to date” on the company.

Ask yourself: How will this taxonomy project help you save money or make money and mitigate risk? Identify the key costs and benefits, and build a simple model to calculate the return on investment. 

For example, if you make content findable, how many minutes will that save per employee every day? How much is avoiding an inappropriate information disclosure worth in organizational credibility? How much is it worth to add useful years to expensive industrial equipment through proper operation and maintenance according to the manufacturer’s specification?
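A simple return-on-investment model of the kind described here can fit in a few lines. The sketch below covers only the time-savings benefit, and every input value is a hypothetical placeholder to be replaced with your own estimates.

```python
def annual_time_savings(employees, minutes_saved_per_day, workdays, hourly_cost):
    """Value of time no longer spent looking for information, per year."""
    hours_saved = employees * minutes_saved_per_day / 60 * workdays
    return hours_saved * hourly_cost

def simple_roi(annual_benefit, annual_cost):
    """Return on investment as a fraction: (benefit - cost) / cost."""
    return (annual_benefit - annual_cost) / annual_cost

# All inputs below are hypothetical placeholders, not benchmarks.
benefit = annual_time_savings(
    employees=500,
    minutes_saved_per_day=6,
    workdays=230,
    hourly_cost=50.0,
)
cost = 250_000.0  # assumed yearly cost of building and maintaining the taxonomy

print(f"Annual benefit: ${benefit:,.0f}")        # Annual benefit: $575,000
print(f"ROI: {simple_roi(benefit, cost):.0%}")   # ROI: 130%
```

Risk-related benefits, such as an avoided disclosure, are harder to quantify and would be added as separate terms with their own stated assumptions.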

  • Planning and Research. Collecting quantitative and qualitative information about content and user behavior, and analysis of how users interact with content, is the foundation for a successful taxonomy project. 
    • Identify the specific target content that is to be focused on. 
    • Identify and gather a representative sample of content items. 
    • Gather any query logs, usage statistics (analytics), and usability surveys. 
    • Collect any existing user research.
    • Collect any documentation related to audience personas, content organization, metadata, keywords, and any other guidelines or standards.
    • Identify and gather any internal classifications (org charts, sales regions, records retention schedule, code of conduct, product lists, etc.); and any relevant industry standard classifications (UNSPSC, NAICS, USPS, regulated activities, etc.).
  • Interview Stakeholders. Just like healthcare professionals, information professionals need you to tell them about your pain points. One-on-one interviews or group workshops with business, information management, IT representatives and sometimes customers are an efficient and effective way to gather information and documentation. 
    • Recruit people from business-critical functions such as marketing, public relations, product marketing, legal, etc. Include people who have credibility, are early adopters, hold large amounts of content, and are “squeaky wheels” or “fans.”
    • Conduct 10-20 interviews.
    • The goal is for stakeholders to be the review board during the taxonomy development process and beyond.
  • Define Use Cases. Think through both the strategic and practical goals of the taxonomy to help define its scope. A use case can be one or more sentences in the language of the user that describes what the user needs to do. Use cases can be helpful in later validating whether the taxonomy will address its intended purpose and making adjustments during the process to refocus the work. Table 3 compares some intranet and public website use cases.

Table 3 – Use case examples

Intranet Use Case Examples | Public Website Use Case Examples
Content related to business areas or facilities by geographic location, by type, by specific facility, by access restrictions, by audience, etc. | Web content managers by content type, by topic, by location, etc.
Use Case: Create a safety policies and procedures website for facilities organized by state. | Use Case: Find and recall all public-facing pages that describe a specific safety tip.
Use Scenario: Find all safety policies and procedures related to facilities located in Ohio. | Use Scenario: Find and recall all public-facing pages that discuss gas safety.
Company-wide content by business function, topic, access rights, etc. | Public users seeking information by topic, location, etc.
Use Case: Locate any content that has policies and procedures around a particular topic. | Use Case: Provide search for dividend schedules, earnings statements, and stock splits, and the corresponding press releases for a specific time period.
Use Scenario: A policy regarding smoking company-wide has changed and references to outdated policies should be removed. Find official policies, as well as newsletters related to the smoking policy company-wide. | Use Scenario: An investor who recently sold stock is preparing taxes and would like to do a concise search so that they can find historical information about their holdings.

  • Build High-Level Taxonomy. Once research and requirements gathering has been completed, build a high-level outline of the taxonomy and review it with the core stakeholder team. Instead of a single hierarchy, adopt a faceted approach with distinct divisions for key contexts. 

A business taxonomy should have no more than 6-10 broad divisions.

  • Identify the types of actors (audiences, roles & access rights)
  • Identify the types of content 
  • Identify the types of activities (business processes, applications & uses)
  • Identify the types of named entities (products, services, projects, organizations, locations, etc.)
  • Topics will be everything else.

Plan to reuse existing (especially internal) vocabularies for as many of the facets as possible. Plan to develop fully custom taxonomies for “Content Types” and “Topics.” 

The Oracle taxonomy (Figure 8) is built entirely around their list of products; it has no explicit topics, only actors, content types, and named entities. For marketing purposes, these products are grouped by Product Line (Oracle Cloud Infrastructure, Oracle Cloud Applications, and Hardware and Software), Technology (Middleware, Database Technology, and Security), Applications (Customer Relationship Marketing, Retail, and Manufacturing), and Industry (Financial Services, Healthcare, and Automotive). At Oracle, product names are also carefully edited to be consistent when they are mentioned in marketing collateral. By simply recognizing a product name in content text, the content item can be categorized by the appropriate Technology, Application, and/or Industry. However, the Singapore Government taxonomy (Figure 9) is much more focused on topics.

Figure 8 – Oracle.com high-level taxonomy 

Figure 9 – Singapore Government Taxonomy 

  • Build-Out Taxonomy Detail. If the core stakeholders approve, then build out the detailed taxonomy. Reuse any existing terminology resources that you can, because these will be familiar to users and will smooth adoption of the new taxonomy.
    • Get agreement on the broad divisions first, then build out the detailed taxonomy.
    • Use existing terminologies whenever they are available for business functions, locations, products and services, etc., or consider adapting publicly available or published taxonomies. See: BARTOC.org (Basic Register of Thesauri, Ontologies & Classifications). Licensing a pre-existing taxonomy will cost less than developing a taxonomy from scratch, but a pre-existing taxonomy will rarely fit an organization’s needs and may require extensive customization. Table 4 lists free sources for common taxonomies.
    • Only build a vocabulary when no alternative authoritative source exists.
    • Only create categories for which there already is content, or for which there is likely to be content soon.
    • Keep the taxonomy broad and shallow. 
    • Roll up more specific terms into broader categories.

Table 4 – Free sources for eight common taxonomies

Taxonomy | Definition | Potential Sources
Organization | Organizational structure | SP 800-87, U.S. Government Manual, Your organizational structure, etc.
Content Type | Structured list of the various types of content being managed or used | Dublin Core Type Vocabulary, AGLS Document Type, Your records management policy, etc.
Industry | Broad market categories such as lines of business, life events, or industry codes | SIC, NAICS, Your market segments, etc.
Location | Place of operations or constituencies | GNIS, ISO 3166, UN Statistics Div, US Postal Service, your sales regions, etc.
Business Activity | Business activities or functions performed to accomplish mission and goals | Federal Enterprise Architecture Business Reference Model, enterprise ontology, your business functions, etc.
Topic | Business topics relevant to your mission and goals | Federal Register Thesaurus, NAL Agricultural Thesaurus, your research areas, etc.
Audience | Subset of constituents to whom a piece of content is directed or is intended to be used by | ERIC Thesaurus, IEEE LOM, your psycho-graphics or personas, etc.
Products & Services | Names of products/programs and services | ERP system, your products and services, etc.

The NASA Taxonomy shown in Figure 10 is an example of an enterprise taxonomy because it’s intended to cover information management organization-wide.

Figure 10 – NASA Taxonomy 2.0 (https://webarchive.loc.gov/all/20111207224736/http://nasataxonomy.jpl.nasa.gov/2.0/)

  • Validation Testing. Test drive the taxonomy with users to validate that they can tag content completely and consistently, and that they can easily find content. The use cases can provide test scenarios. Table 5 summarizes various taxonomy validation testing methods.

Table 5 – Validation testing and review summary

Method | Process | Who | Requires | Validation
Walk-through | Show & explain | Taxonomist; SME; Team | Rough taxonomy | Approach; Appropriateness to task
Walk-through | Check conformance to editorial rules | Taxonomist | Draft taxonomy; Editorial Rules | Consistent look and feel
Usability Testing | Contextual analysis (card sorting, scenario testing, etc.) | Users | Rough taxonomy; Tasks & Answers | Tasks are completed successfully; Time to complete task is reduced
User Satisfaction | Survey | Users | Rough Taxonomy; UI Mockup; Search prototype | Reaction to taxonomy; Reaction to new interface; Reaction to search results
Tagging Samples | Tag sample content with taxonomy | Taxonomist; Team; Indexers | Sample content; Rough taxonomy (or better) | Content ‘fit’; Fills out content inventory; Training materials for people & algorithms; Basis for quantitative methods

  • Migrate Content. Existing content may need to be retrofitted according to the new taxonomy. This is usually accomplished with a combination of automated and editorial efforts. Best practices include:
    • Identify and dispose of Redundant, Obsolete, and Trivial content (ROT).
    • Prioritize content to be tagged.
    • Use business rules to automate content tagging. For example, tag landing pages of major sections, then lower-level pages inherit tags from top-level pages.
    • Use workflow to enforce tagging. For example, require entry of simple tagging in order to submit an item into the content management system.
    • Use templates to guide user tagging. Pre-populate template fields whenever possible. Use context-sensitive pick lists. Link to a taxonomy tool for more complex controlled vocabularies.
    • Provide tagging feedback. For example, set goals and display statistics on how many pages a user has tagged.
  • Maintain and Evolve. Implement methods to gather and handle taxonomy change requests according to an agreed service level. Evaluate how the taxonomy is performing by monitoring query logs and collection analytics. 
  • Review and Revise. Taxonomy is not a “one and done” activity. It requires maintenance and management to adapt to organizational changes.
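The tag-inheritance rule mentioned under Migrate Content (tag the landing pages, then let lower-level pages inherit) can be sketched as follows. This is an illustrative approach that assumes URL paths mirror the site hierarchy; the paths and tags are hypothetical.

```python
def inherited_tags(path, explicit_tags):
    """Collect tags set on a page and on every ancestor section of its path."""
    tags = set()
    parts = path.strip("/").split("/")
    # Walk from the top-level section down to the page itself.
    for depth in range(1, len(parts) + 1):
        ancestor = "/" + "/".join(parts[:depth])
        tags |= explicit_tags.get(ancestor, set())
    return tags

# Hypothetical site: only landing pages were tagged by hand.
explicit_tags = {
    "/safety": {"Topic: Safety"},
    "/safety/gas": {"Topic: Gas safety"},
    "/investors": {"Audience: Investors"},
}

print(sorted(inherited_tags("/safety/gas/leak-checklist", explicit_tags)))
# ['Topic: Gas safety', 'Topic: Safety']
```

Editors then only need to review the inherited tags and add page-specific ones, rather than tagging every page from scratch.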

Taxonomy Construction Tools

There are different kinds of tools related to taxonomy. These are for taxonomy management, content tagging, and content management.

  • Taxonomy Management. A taxonomy tool is an application for building, maintaining and governing changes made to a taxonomy scheme. These tools include Data Harmony, MultiTes, PoolParty, Web Protégé, Synaptica KMS, VocBench, and others. Table 6 is a list of taxonomy editing tools and vendors and their key characteristics.
  • Content Tagging. Tagging tools are designed to populate metadata with taxonomy terms manually, automatically, or with some combination of manual and automated processes. Sometimes this is referred to as “enriching” content with metadata. These tools include Data Harmony, Expert.ai, Megaputer, PoolParty, SAS, SmartLogic, and others.
  • Content Management. Content management applications combine a database with workflow to create, edit, collaborate, publish, and store digital content. These applications include Drupal, OpenText, SharePoint, WordPress, and many others.

Table 6 – Taxonomy Editing Tools

Vendor | Taxonomy Editing Tool | Key Characteristics | URL
Access Innovations, Inc. | Data Harmony | Complete platform for taxonomy management & automated tagging | https://www.accessinn.com/data-harmony/
Cambridge Semantics | Anzo | Complete platform for knowledge graphs | https://cambridgesemantics.com/anzo-platform/
Microsoft | Excel | Simple tool that everyone has | https://www.microsoft.com/en-us/microsoft-365/excel
Mondeca | Intelligent Taxonomy Manager | Complete platform for taxonomy management | https://mondeca.com/software/
Multites | Multites Pro | Inexpensive thesaurus management tool | https://multites.net/
Semantic Web Company | PoolParty | Complete platform for knowledge graphs | https://www.poolparty.biz/taxonomy-thesaurus-management
SmartLogic | Semaphore | Complete platform for taxonomy management & automated tagging | https://www.smartlogic.com/
Stanford University | Protege | Inexpensive ontology tool | https://protege.stanford.edu/
Synaptica | Graphite; KMS | Complete platform for vocabulary management or knowledge graphs | https://www.synaptica.com/
TopQuadrant | TopBraid EDG-VM | Complete platform for vocabulary management | https://www.topquadrant.com/vocabulary-management
Università degli Studi di Roma ‘Tor Vergata’ | VocBench | Open source vocabulary management platform | http://vocbench.uniroma2.it/

Taxonomy Editing Tool Functions

These are some basic taxonomy editing functions that all tools should provide:

  • Standard and Custom Fields. Standard fields are those that are specified by the relevant Technical Standards such as Z39.19 and ISO 25964 described above. These include the standard thesaurus fields such as preferred term and scope note (SN). It should also be possible to define custom fields to be used locally in a particular taxonomy, for example, Term Source or Editorial Note.
  • Standard and Custom Relations (Intra-Vocabulary Relations). Standard relations are those that are specified in the relevant Technical Standards such as broader (BT), narrower (NT), and related (RT) terms. It should also be possible to define custom relations to be used locally such as IsA, PartOf, HasA, etc.
  • Data Typing and Restrictions. Data typing is the capability to define certain characteristics of a field, such as requiring an alphabetic value or a numeric value. Restrictions are the capability to specify valid values, for example from a list of predefined values such as True or False, or from a range of values such as “1/1/2000 to 12/31/2000”.
  • Consistency Enforcement. Enforcement is the capability to require that a value be entered, and to specify whether a field accepts only a single value or multiple values.
  • Flexible Reporting. There are standard formats that should be output by the tool. These should include generic formats such as CSV or XML, as well as specific displays such as thesaurus record or hierarchical tree.
  • Flexible Importing. There should be standard import formats accepted by the tool, including a CSV import format where the top row specifies the data attributes, and the remaining rows elaborate the entries. There should also be an XML import format that conforms to SKOS and/or OWL.
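To make the CSV import and the enforcement functions above concrete, here is a minimal Python sketch of an importer that reads the field names from the top row and validates each entry. The column names, the "|" separator for multiple values, and the status values are all assumptions for illustration; an actual tool defines its own import template.

```python
import csv
import io

# Hypothetical column names; a real tool defines its own import template.
REQUIRED = {"uid", "entry_term"}
MULTI_VALUED = {"broader", "used_for"}   # several values separated by "|"
ALLOWED_STATUS = {"approved", "candidate"}

def import_terms(text):
    """Parse CSV whose first row names the fields; validate each entry."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    terms, errors = [], []
    for line_no, row in enumerate(reader, start=2):  # row 1 is the header
        # Consistency enforcement: required fields must have a value.
        for name in sorted(REQUIRED):
            if not (row[name] or "").strip():
                errors.append(f"row {line_no}: {name} is required")
        # Restriction: status must come from a predefined list.
        status = (row.get("status") or "").strip()
        if status and status not in ALLOWED_STATUS:
            errors.append(f"row {line_no}: invalid status {status!r}")
        # Multi-valued fields become lists.
        for name in MULTI_VALUED & set(row):
            row[name] = [v.strip() for v in (row[name] or "").split("|") if v.strip()]
        terms.append(row)
    return terms, errors

sample = """uid,entry_term,broader,used_for,status
t1,Topic,,,approved
t2,Privacy,t1,Access to Medical Records,approved
t3,,t1,,pending
"""
terms, errors = import_terms(sample)
print(terms[1]["used_for"])  # ['Access to Medical Records']
print(errors)
```

Collecting errors instead of stopping at the first one lets an editor fix a whole batch in a single pass.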

These are more advanced taxonomy editing functions that most, but not all, tools should provide:

  • Unicode. The tool should use and support the Unicode Standard for consistent encoding, representation, and handling of text in most languages, including multiple languages in the same vocabulary.
  • Multiple Vocabulary Support. The tool should support the capability to build and maintain multiple discrete vocabularies in the same platform.
  • Inter-Vocabulary Relations. The tool should support creating and managing relationships between concepts in separate vocabularies that may be on the same or different platforms.
  • Unique IDs. The tool should generate and manage unique and persistent identifiers (URIs).
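As an illustration of unique identifiers and inter-vocabulary relations, the sketch below mints opaque URIs and writes one concept as SKOS in Turtle syntax. The base namespace, the function names, and the sample concepts are hypothetical; only the SKOS property names come from the SKOS vocabulary itself.

```python
import uuid

# Assumed base namespace (hypothetical); use a domain the organization controls.
BASE = "https://example.org/taxonomy/"
PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"

def mint_uri(vocabulary):
    """Mint an opaque URI once, when the concept is created.

    The URI does not encode the label, so renaming a term never changes it.
    """
    return f"{BASE}{vocabulary}/{uuid.uuid4()}"

def to_turtle(uri, pref_label, alt_labels=(), broader=(), exact_match=()):
    """Serialize one concept as SKOS Turtle (labels are assumed to need no escaping)."""
    lines = [f"<{uri}> a skos:Concept", f'    skos:prefLabel "{pref_label}"@en']
    lines += [f'    skos:altLabel "{label}"@en' for label in alt_labels]
    lines += [f"    skos:broader <{target}>" for target in broader]
    lines += [f"    skos:exactMatch <{target}>" for target in exact_match]
    return " ;\n".join(lines) + " .\n"

topic = mint_uri("topics")
privacy = mint_uri("topics")
turtle = to_turtle(
    privacy,
    "Privacy",
    alt_labels=["Access to Medical Records"],
    broader=[topic],
    # Inter-vocabulary link to a concept in another (hypothetical) vocabulary.
    exact_match=["https://example.com/other-vocabulary/privacy"],
)
print(PREFIX + turtle)
```

In practice the minted URI would be stored with the concept at creation time and never regenerated, which is what makes it persistent.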

These are advanced taxonomy editing functions that not all tools provide:

  • Workflow. The tool should provide a workflow engine or some other mechanism for users with specified roles to define and manage taxonomy entities throughout their lifecycle.
  • Voting. The tool should provide a mechanism for users who have specified taxonomy management roles to review and approve or disapprove of taxonomy changes.
  • Change Request Management. The tool should provide a mechanism to gather and prioritize taxonomy requests, and to report on their disposition to all identified stakeholders.
  • Stylistic Rules Enforcement. The tool should have the capability to define more advanced rules to enforce term consistency, for example, ensuring terms are in direct vs. inverted order, or terms are spelled out rather than abbreviated, etc. 
  • Programmability. The tool should provide an application programming interface (API) for integrating with other applications using JavaScript, C++, Visual Basic, Perl, Python, or some other scripting or programming language.

Table 7 groups taxonomy tool functions by functional area.

Table 7 – Types of taxonomy functions

Functional area | Functions
Taxonomy Development | Create a taxonomy; Create a namespace for a taxonomy
Taxonomy Maintenance | Add, edit, move, or delete items; Assign or modify ownership of one concept or a group of concepts; Log activities
Taxonomy Governance | Set up an approval workflow for additions and changes; Assign user roles and permissions
Metadata Controlled Vocabulary | Assign attributes to a category; Associate a controlled vocabulary with a metadata field; Implement thesaurus capabilities (lookup, make equivalents, scope up or down, make associations)
User Interface | Search and browse; Drag and drop; Expose & group functions, open new tabs/windows
Reporting | Create templates for alphabetical, hierarchical, & other views; Generate visualizations; Import & export taxonomies
Application Integration | Alerts; APIs (WSDL, Scripts, Java, etc.); Application integration (CMS, DMS, search engine, etc.)

Taxonomy Tool Examples

Figure 11 is the Synaptica KMS taxonomy tool showing the hierarchy browser on the left and standard term information on the right. Synaptica KMS is one of the most scalable taxonomy tools. It has been used to manage extremely large vocabularies in the medical domain. 

Figure 11 – Synaptica KMS taxonomy tool

Figure 12 is the PoolParty taxonomy tool. PoolParty supports modeling collections of interlinked descriptions of entities, for example Exact, Close, and Broader matching concepts. This type of taxonomy is called a knowledge graph. Synaptica also offers a more modern taxonomy tool called Graphite designed for modeling interlinked entities to build knowledge graphs.

Figure 12 – PoolParty taxonomy tool

A much simpler and less expensive tool is MultiTes Pro. This is a Z39.19 compatible taxonomy editor that supports standard thesaurus fields. MultiTes offers these self-study tutorials:

  • Getting Started with MultiTes Pro
  • Navigating your thesaurus
  • Importing data from text files
  • Working with Subject Categories
  • Working with Multilingual Thesauri

Figure 13 is the MultiTes Pro taxonomy tool. Since MultiTes was designed as a thesaurus tool, the main MultiTes display is an alphabetical list rather than a tree browser. The term record is a pop-up window. The term’s hierarchical context is viewed in a tab in the pop-up window. 

Figure 13 – MultiTes taxonomy tool

Importing a new taxonomy into MultiTes is very easy. A thesaurus display can be reformatted as shown in Figure 14 to import it into MultiTes.

ACA
  USE: Affordable Care Act

ACA Reassignment Notice
  UF: Affordable Care Act Reassignment Notice
  BT: Medicare Documents

Access to Medical Records
  USE: Privacy

Accountable Care Organization
  SN: A group of health care providers who give coordinated care, chronic disease management, and thereby improve the quality of care patients get. The organization’s payment is tied to achieving health care quality goals and outcomes that result in cost savings. Read a fact sheet about accountable care organizations. (http://www.healthcare.gov/glossary/04262011a.pdf)
  BT: Topic
  RT: Fee for Service

Actuarial Value
  SN: The percentage of total average costs for covered benefits that a plan will cover. For example, if a plan has an actuarial value of 70%, on average, you would be responsible for 30% of the costs of all covered benefits. However, you could be responsible for a higher or lower percentage of the total costs of covered services for the year, depending on your actual health care needs and the terms of your insurance policy.
  BT: Topic

Acupuncture
  BT: Condition and Treatment

Address Change
  USE: Moving Residence

Figure 14 – MultiTes import format
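A thesaurus display in this shape is also easy to read programmatically. The following Python sketch groups each tagged line (USE, UF, BT, NT, RT, SN) under the term that precedes it. It is inferred only from the sample shown in Figure 14, not from MultiTes documentation, so the tag list and the one-line-per-entry assumption may not cover every case, such as multi-line scope notes.

```python
import re

# Relationship tags seen in the sample, plus NT; an assumption, not a full spec.
TAG_LINE = re.compile(r"^(USE|UF|BT|NT|RT|SN):\s*(.*)$")

def parse_thesaurus(text):
    """Group tagged lines under the most recent untagged line (the term)."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no information
        match = TAG_LINE.match(line)
        if match is None:
            # An untagged line starts a new term record.
            current = records.setdefault(line, {})
        elif current is None:
            raise ValueError(f"relationship before any term: {line!r}")
        else:
            tag, value = match.groups()
            current.setdefault(tag, []).append(value)
    return records

sample = """
ACA
USE: Affordable Care Act

ACA Reassignment Notice
UF: Affordable Care Act Reassignment Notice
BT: Medicare Documents

Acupuncture
BT: Condition and Treatment
"""
records = parse_thesaurus(sample)
print(records["ACA"])   # {'USE': ['Affordable Care Act']}
print(sorted(records))  # ['ACA', 'ACA Reassignment Notice', 'Acupuncture']
```

A check like this is a quick way to confirm that every USE target and BT value refers to a term that actually exists before attempting an import.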

MultiTes can export the thesaurus as a set of HTML files that operate as a website. This can be used to generate simple websites that can be styled further (see, for example, the NASA Taxonomy in Figure 10). 

Figure 15 is an example of a MultiTes-generated website with the taxonomy developed for healthcare.gov. The earlier nomenclature was the federally facilitated exchange, or FFE (for health insurance). Using an out-of-the-box MultiTes feature, you can search healthcare.gov with any term in the FFE taxonomy.

Figure 15 – MultiTes was used to build this healthcare.gov taxonomy

Scenarios to Evaluate Taxonomy Tools

These are some scenarios that are useful for evaluating taxonomy tools:

  • Database Definition. How is the database created? Where is it stored? Is it Z39.19 and ISO 25964 compliant? Does the database, e.g., Ontotext, need to be licensed?
  • Importing/Exporting Data. How are data imported and exported? What file formats are supported? Can data files be uploaded sequentially in batches?
  • Add, Edit, Delete Categories. How easily are categories added, edited, or deleted? Can categories be added, edited, or deleted in batches?
  • Relationship Types. How are relationship types defined? What types are supported? How is polyhierarchy handled?
  • Add, Edit, Delete Relationships. How easily are relationships added, edited, or deleted? Can relationships be added, edited, or deleted in batches? Does a change propagate to all instances?
  • Reporting. How does the taxonomy tool report new, edited, deleted taxonomies and categories; new, edited, deleted relationship types and relationships; and mapped taxonomies and categories? How are the reports presented? What audit logs are available? Can changes be traced to users who suggested them? Is an “approval” step for changes available for administrators?
  • User Access. Can the taxonomy tool integrate user accounts with existing authentication systems, e.g., Active Directory? Is there support for role-based access or defined group membership with configurable access? Is there a workflow to approve changes? What functionality is available or restricted based on a user’s security privileges?