Everybody’s an Ontologist

Clay Shirky is right. “Here comes everybody.” WordPress has just released an amazing version that makes it easy for anyone to build a high-quality website that includes hierarchical search through topics. That means everybody can enrich their content by enriching the concepts associated with their content and pages. The WordPress categorization widgets have several nice features:

• Anyone who has the patience to play with the categorization widgets in the dashboard can build a topical or indented navigation that is more intuitive.

• Concepts are not exposed until content is added, so “form follows content.”

• As best I can tell from some research searches, the more specific the concept tags, the more likely the content is to be retrieved in a search. That is, the more specific I am as a searcher, the more likely I am to find good content, and the more specific you are as the content provider or blogger, the more likely you are to be found. This is an old indexer’s rule: “index to the narrowest term.”

• You can set the postings so the parent category does or does not retrieve the child categories, so you can choose whether to enforce inheritance.

• If you index to concepts from different facets, you have implemented a faceted search.
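The inheritance choice in the list above can be illustrated with a small sketch. This is not WordPress code: the category names, the `PARENT` and `POSTS` tables, and the `posts_in` helper are all hypothetical, chosen only to show how a parent category might or might not retrieve content filed under its children.

```python
# Hypothetical category tree: child -> parent (None marks a top-level category).
PARENT = {
    "Services": None,
    "Elder Care": "Services",
    "Home Health Aides": "Elder Care",
    "Special Education": "Services",
}

# Hypothetical posts, each indexed to the narrowest applicable category.
POSTS = {
    "Choosing a home health aide": "Home Health Aides",
    "Elder care checklist": "Elder Care",
    "IEP meeting tips": "Special Education",
}


def descendants(category):
    """Return the category plus every category beneath it in the tree."""
    found = {category}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def posts_in(category, include_children=True):
    """List post titles filed under a category, optionally including its children."""
    wanted = descendants(category) if include_children else {category}
    return sorted(title for title, cat in POSTS.items() if cat in wanted)
```

With inheritance on, `posts_in("Elder Care")` also picks up the post filed under "Home Health Aides"; with `include_children=False` it returns only the post filed directly under "Elder Care".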

There’s been a raging debate about the value of taxonomies: whether they should be implemented in RDF, custom XML, SOX, or even an RDBMS, and whether they have any value whatsoever. It doesn’t matter. Taxonomies are agnostic about implementation. They have a fundamental hierarchical structure. But the next step is to move taxonomies toward ontology, and to teach everyone to be an ontologist.

Why does this matter? Because semantic technologies don’t have to belong to the privileged. Anyone who understands the <subject>-<predicate>-<object> construction learned in English class can start to build a model and enrich their content. WordPress has a clever implementation of something called XFN that will help build networks of FOAF files to find out who is linking to whom, but there need to be many more templates to help locate critical information for our everyday lives. For example, take an information need as critical as elder care. For the sandwich generation, it is important to find out <who> <provides> <service> for an aging parent, as well as to rate the quality of the provider. For parents dealing with special education needs, what about content that is categorized by <who> <provides> <services>, so that it is easier to share information about expertise and programs that can support students with a range of individual needs? What if it was easier to link a teacher to <resources> <in a> <content area>? What if an excellent teacher could put their lesson plans online and receive royalties by doing what publishers do: adding better indexing concepts to their metadata?
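To make the <who> <provides> <service> pattern concrete, here is a minimal sketch of statements stored as three-part records and queried by any part. It uses plain Python rather than an RDF library, and every provider name, predicate, and helper (`Triple`, `match`) is a made-up illustration, not a real vocabulary or a WordPress feature.

```python
from collections import namedtuple

# A statement is a (subject, predicate, object) triple, as in RDF.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical statements; none of these providers are real.
TRIPLES = [
    Triple("Maple Street Home Care", "provides", "respite care"),
    Triple("Maple Street Home Care", "serves", "Springfield"),
    Triple("Riverside Seniors Center", "provides", "meal delivery"),
    Triple("Riverside Seniors Center", "provides", "respite care"),
    Triple("Ms. Alvarez", "provides", "reading intervention"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that was supplied."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


# Who provides respite care?
providers = [t.subject for t in match(TRIPLES, predicate="provides", obj="respite care")]
print(providers)
```

Leaving a field unspecified turns it into the question being asked: omit the subject to ask "who," or omit the object to ask "what does this provider offer."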

Taxonomy support in WordPress is very exciting but basic. WordPress allows hierarchical categories but has no synonym support. For example, you cannot treat concepts like “groups,” “organizations,” “communities,” and “clubs” as synonyms. You cannot add “Home Energy Efficiency” as a synonym for “Weatherization” (or the British spelling, “Weatherisation”). “Home energy efficiency” would have to be a child of the term “weatherization.” And voilà, if you can work synonyms or related concepts into your categorization in this way, you can probably improve the status of your content in your community.
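For comparison, this is roughly what synonym support could look like if a tool offered it: a lookup that maps variant terms to one preferred category. The sketch below is an assumption about how such a feature might work, not something WordPress provides; the `PREFERRED` table and `preferred_term` helper are hypothetical.

```python
# Hypothetical synonym table: variant term (lowercase) -> preferred category.
PREFERRED = {
    "weatherisation": "Weatherization",
    "home energy efficiency": "Weatherization",
    "organizations": "Groups",
    "communities": "Groups",
    "clubs": "Groups",
}


def preferred_term(term):
    """Map a variant to its preferred category; unknown terms pass through trimmed."""
    key = " ".join(term.lower().split())  # ignore case and stray whitespace
    return PREFERRED.get(key, term.strip())
```

A searcher who types “Home Energy Efficiency” would then land on the same content as one who types “Weatherization,” without the blogger having to file the post twice.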

However, even a modest level of categorization can achieve sophisticated results. By simply ensuring that content is categorized to multiple relevant facets, such as a <who> category, a <services> category, geography, or any other attribute that matters in your domain, the content is enriched and potentially more “findable.” A fully formed taxonomy should be categorized and have links between categories, as in RDF. That’s not part of the taxonomist’s vocabulary, though it is what ontologists do.
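A faceted search over content categorized this way amounts to keeping only the items that match the chosen value in each facet. The following is a minimal sketch under that assumption; the post titles, facet names, and the `faceted_search` helper are hypothetical and do not reflect how WordPress stores categories.

```python
# Hypothetical posts, each categorized under several facets.
POSTS = [
    {"title": "Respite care options in Ohio",
     "facets": {"who": {"nonprofit"}, "services": {"respite care"}, "geography": {"Ohio"}}},
    {"title": "Hiring a private aide in Ohio",
     "facets": {"who": {"private agency"}, "services": {"home health"}, "geography": {"Ohio"}}},
    {"title": "Respite care grants in Texas",
     "facets": {"who": {"nonprofit"}, "services": {"respite care"}, "geography": {"Texas"}}},
]


def faceted_search(posts, **selected):
    """Return titles of posts carrying the selected value in every requested facet."""
    return [
        post["title"]
        for post in posts
        if all(value in post["facets"].get(facet, set())
               for facet, value in selected.items())
    ]
```

Each added facet narrows the result: `faceted_search(POSTS, services="respite care")` matches two posts, and adding `geography="Ohio"` narrows that to one.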

One impediment to progress will be the inability to import and export taxonomies between applications. Like everything else, taxonomies are a commodity, so vendors need to work on data interchange in the tools they provide. But there are existing standards that can help engineers develop these interchange standards. Many years ago, I worked with the IMS (Instructional Management Systems) specifications for interchange of educational content objects, but the XML technology had not yet matured. Since then, I have been through many painful migrations of educational learning objects between learning platforms, and have watched the exorbitant increases in textbook costs. The time has come to help encourage the exchange of information. It does not diminish the work of categorizing and analysis; it just shifts the work from one cost center to “everybody.”

The content management tools and blogs are evolving quickly. For example, Drupal is reportedly evolving to allow semantic support of RDF values. O’Reilly’s Kurt Cagle has pointed out that content enrichment is probably the hot trend in 2009. O’Reilly has taken a great step toward the librarian’s belief in metadata and descriptive cataloging by adding metadata to its publications in RDF and aligning the RDF with Dublin Core metadata standards.

But the next step is to improve metadata and keyword tagging so content, regardless of format, is findable.

Categorization is an important step on the way to analysis, empathy and knowledge, so be intellectually adventurous and find out what happens when you try to categorize tags. All of this takes a bit of thought: researching the current state of information in a field (market research), finding out what the language of the field is and how users search for information, and then some creativity to brainstorm the possible terms and categorize them.

So for everybody to succeed, we need to talk about content and concept enrichment. It no longer takes any great skill to build a taxonomy, but it does take a little bit of thought and patience, some interface testing and some valuable content that is worth the analysis to preserve and share.