Use of technology to organize content into groups so it can be retrieved when needed. The result of automatic classification is either a content collection clustered into groups (possibly a candidate taxonomy), or content categorized according to a pre-existing taxonomy. The best results come from a defined business process that combines manual and automated processing, so that the technology does as much of the work as it reliably can and human editorial effort is spent where it adds the most value.
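As a rough illustration of categorizing content against a pre-existing taxonomy, the sketch below scores a piece of text by keyword overlap and hands low-scoring items to a human editor. The category names, keyword sets, and the `min_hits` threshold are hypothetical; real classifiers typically use statistical or linguistic methods rather than plain keyword matching.

```python
import re

# Hypothetical taxonomy: category name -> indicative keywords (illustrative only).
TAXONOMY = {
    "Snack foods": {"chips", "pretzels", "popcorn"},
    "Beverages": {"juice", "soda", "coffee"},
}

def categorize(text, taxonomy=TAXONOMY, min_hits=2):
    """Return (category, hits) for the best match.

    Returns (None, hits) when the best score is below min_hits, which
    signals that the item should be routed to manual review.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(words & keywords) for cat, keywords in taxonomy.items()}
    best = max(scores, key=scores.get)
    if scores[best] < min_hits:
        return None, scores[best]
    return best, scores[best]
```

Here the `None` result stands in for the manual step of the process: the automated pass handles the clear cases, and anything ambiguous is left for an editor.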
A set of 15 metadata elements (the Dublin Core Metadata Element Set) used to describe and catalog content so it can be discovered and retrieved. The Dublin Core is the de facto standard for cataloging web content.
Automated methods to analyze, classify, search for, and retrieve text. The basic principles of information retrieval (IR) are based on research done in the 1940s and 1950s. The key observation was that word frequency provides a useful measure of significance. Many refinements have been made to this simple observation using statistics, linguistics, logic, and combinations of these methods.
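The word-frequency observation can be sketched in a few lines. This is a minimal example of raw term counting, assuming a tiny illustrative stopword list; it is not a reconstruction of any particular IR system, and the function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real systems use much larger ones.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "on"}

def term_frequencies(text):
    """Count how often each non-stopword term occurs in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def top_terms(text, n=3):
    """Return the n most frequent terms as a rough guess at what the text is about."""
    return [term for term, _ in term_frequencies(text).most_common(n)]
```

Later refinements generally adjust these raw counts, for example by discounting terms that are common across the whole collection.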
A common set of attributes that contain critical information to describe and catalog content. The basic concept behind metadata has been used to organize content since the beginning of clay tablet and papyrus scroll collections 3000 years ago. Card and book catalogs and bibliographic databases have used a commonly understood metadata standard to organize large collections.
Dublin Core metadata example:
|Dublin Core Elements|
The Who, Where and When
|Title, Creator, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language|
The What and Why
|Subject, Description, Coverage|
Links between Assets
|Relation|
How to Monetize Assets
|Rights|
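A record using the 15 elements of the Dublin Core Metadata Element Set might be built and serialized as in the sketch below. The `dc` namespace URI is the one published for the element set; the `metadata` wrapper element, the `to_dc_xml` helper, and any sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]

def to_dc_xml(record):
    """Serialize a dict of Dublin Core element -> string value as XML.

    The <metadata> wrapper is an assumption made for this sketch.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name in DC_ELEMENTS:  # emit elements in a stable order
        if name in record:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = record[name]
    return ET.tostring(root, encoding="unicode")
```

For example, `to_dc_xml({"title": "Snack food catalog", "language": "en"})` yields a small XML fragment with `dc:title` and `dc:language` children.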
Overall scheme for organizing content to solve a business problem such as improving search, browsing for content on an enterprise-wide portal, enabling business users to syndicate content, and otherwise providing the basis for content re-use. The basic idea behind taxonomy is to provide a controlled vocabulary for metadata attributes, and to specify relationships between terms in the controlled vocabulary. The simplest relationships are broader, narrower, and related, but relationships can be much more specific and complex.
UNSPSC Taxonomy example:
|Prepared and preserved foods||Broader term|
|Corn chips||Narrower term|
|Potato chips||Narrower term|
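The broader, narrower, and related relationships can be held in a small data structure. The sketch below is one possible representation, not a standard API; the `Taxonomy` class and its method names are hypothetical, and the sample terms are taken from the example above and linked directly for simplicity.

```python
from collections import defaultdict

class Taxonomy:
    """Controlled vocabulary with broader/narrower/related links (illustrative)."""

    def __init__(self):
        self.broader = {}                   # term -> its broader term
        self.narrower = defaultdict(list)   # term -> its narrower terms
        self.related = defaultdict(set)     # term -> related terms

    def add(self, term, broader=None):
        """Add a term, optionally placing it under a broader term."""
        if broader is not None:
            self.broader[term] = broader
            self.narrower[broader].append(term)

    def relate(self, a, b):
        """Record a symmetric 'related' link between two terms."""
        self.related[a].add(b)
        self.related[b].add(a)

    def ancestors(self, term):
        """Walk up the broader-term chain, nearest ancestor first."""
        path = []
        while term in self.broader:
            term = self.broader[term]
            path.append(term)
        return path

def example_taxonomy():
    """Build the small example shown above."""
    t = Taxonomy()
    t.add("Prepared and preserved foods")
    t.add("Corn chips", broader="Prepared and preserved foods")
    t.add("Potato chips", broader="Prepared and preserved foods")
    t.relate("Corn chips", "Potato chips")
    return t
```

Storing both directions of each link makes it cheap to browse down from a broad term or to expand a search upward from a narrow one.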
Data models expressed in XML. XML schemas provide a means of defining and implementing a consistent structure (syntax) and semantics for XML documents, allowing machines to carry out rules made by people. A faceted taxonomy provides the names of metadata elements and a consistent set of attribute values, or vocabularies, for filling the elements in an XML schema.
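The idea of a taxonomy supplying both element names and allowed values can be sketched as a simple check on an XML record. This stands in for real schema validation and does not use the XML Schema language itself; the `FACETS` table, the `record` wrapper, and the `validate` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical facets: element name -> controlled vocabulary of allowed values.
FACETS = {
    "type": {"Text", "Image", "Dataset"},
    "language": {"en", "fr", "de"},
}

def validate(xml_text, facets=FACETS):
    """Return a list of problems found in the record's child elements.

    An empty list means every element is known and every value comes
    from that element's controlled vocabulary.
    """
    errors = []
    for elem in ET.fromstring(xml_text):
        name = elem.tag
        value = (elem.text or "").strip()
        if name not in facets:
            errors.append(f"unknown element: {name}")
        elif value not in facets[name]:
            errors.append(f"{name}: '{value}' not in controlled vocabulary")
    return errors
```

For example, `validate("<record><type>Text</type><language>en</language></record>")` returns an empty list, while a record with an unlisted value or element returns one message per problem.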