Taxonomy Accordion in Drupal

The open source community at Drupal is quickly catching up on how to use its taxonomy module. The latest code module creates a Taxonomy Accordion, also known as faceted navigation. What Taxonomy Accordion shares with faceted navigation is good: a taxonomy accordion lets a user know at a glance what a website is about, how to find information, and what won’t be found.

A Taxonomy Accordion does more than a faceted navigation (plus Taxonomy Accordion is a great name):

  • Using color and shade, you can graduate the display so parent terms have one shade and children have another
  • Hierarchies collapse and expand much like a venetian blind or an elegant fan
  • The code is modular and can be integrated as part of the Drupal Taxonomy module
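
As a rough illustration of the first point, the sketch below computes a lighter shade for each level of a hierarchy. It is written in Python rather than Drupal's own PHP, and the function name, base color, and step size are assumptions made up for this example, not part of any Drupal module.

```python
def shade_for_depth(base_rgb, depth, step=0.25):
    """Blend base_rgb toward white; deeper levels get lighter shades."""
    # Cap the blend so very deep levels never become pure white.
    blend = min(depth * step, 0.9)
    channels = [round(c + (255 - c) * blend) for c in base_rgb]
    return "#{:02x}{:02x}{:02x}".format(*channels)


BASE = (0, 70, 140)  # hypothetical base color for top-level (parent) terms

parent_shade = shade_for_depth(BASE, 0)      # parents keep the base color
child_shade = shade_for_depth(BASE, 1)       # children are one step lighter
grandchild_shade = shade_for_depth(BASE, 2)  # and so on down the hierarchy
```

The resulting hex strings could then be applied to each level of the accordion in a stylesheet.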

But, as with other open source projects, you need to plan for and invest in the work that goes on “under the covers.” Here are some of the dirty little secrets, the “work” that tunes a taxonomy accordion or any faceted navigation:

  • Pay attention to user-centered design and validation: The fundamental choice of categories has to make sense to users. Even if you make up an initial set of categories, use a validation process to ensure that the taxonomy makes sense to users. Validation is a two-step process. Part one is an open process, sometimes called an open card sort, in which terms are collected from users, content, and other sources and then organized into a draft taxonomy. Part two is closed: users are asked to find content using the navigation scheme to test whether the classes and hierarchy are useful or need to be refined. More importantly, by making validation part of the plan, you become more aware of and attentive to user needs.
  • Use this opportunity to improve tagging and metadata management: Content has to be tagged with terms from the taxonomy, so you need a back-end business process and a metadata store (for example, a database design) to hold the tags and pointers to the associated content. This back-end metadata record can also help optimize your search engine, especially an engine that supports faceted search, such as Solr.
  • Understand restrictions and attributes: Some facets are not larger super-classes but attributes (sometimes also called “slot facets” or “datatype properties”) that are used to restrict or narrow a search. In an ecommerce application, these restrictions might be facets such as “Measurement, Color, Availability.” In a content or digital asset application, they might be “Content Type, Publication Date, Format.” Grouping these terms helps reduce permutations and complexity in interface design and in writing queries.
  • Foster distributed environments and local control: This is hard to understand, but faceted design is not authoritarian. If the faceted design is based on user needs and a validation process, then it is likely to reflect shared values. It still allows local organizations to develop and manage their own information, and it makes it easier to map that information to process and workflow. For example, a music company might have all its artists map their music to shared facets such as genre. A local social service agency might be asked to map its services to a common public service metadata scheme. Allowing local agencies to update their metadata, tag content, and suggest terms for the taxonomy is a great way to identify user needs and changing requirements.
  • Change and improve: Once categories are established, a change management process needs to be in place to monitor user queries and make sure that the categories and terms remain current and useful. Setting baseline thresholds, or vital statistics (to be discussed in next month’s post), can help in recognizing changing markets, technologies, or user needs.
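
To make the tagging and restriction points more concrete, here is a minimal sketch of a tag index and a facet filter. It is plain Python, not Drupal or Solr code; the item ids, the `build_index` and `restrict` helpers, and the sample values are all hypothetical, with only the facet names borrowed from the content example above.

```python
from collections import defaultdict

# Hypothetical content records: each item is tagged with one term per facet.
ITEMS = {
    "doc-1": {"Content Type": "Article", "Format": "PDF", "Publication Date": "2009"},
    "doc-2": {"Content Type": "Article", "Format": "HTML", "Publication Date": "2010"},
    "doc-3": {"Content Type": "Video", "Format": "HTML", "Publication Date": "2010"},
}


def build_index(items):
    """Map each (facet, term) pair to the set of item ids tagged with it."""
    index = defaultdict(set)
    for item_id, tags in items.items():
        for facet, term in tags.items():
            index[(facet, term)].add(item_id)
    return index


def restrict(index, all_ids, selections):
    """Narrow all_ids to the items that match every selected facet value."""
    result = set(all_ids)
    for facet, term in selections.items():
        result &= index.get((facet, term), set())
    return result


index = build_index(ITEMS)
matches = restrict(index, ITEMS, {"Content Type": "Article", "Publication Date": "2010"})
```

Each extra selection is just one more set intersection, which is the sense in which grouping attributes into facets keeps queries simple.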

An open source faceted navigation should allow implementation at a lower cost. Even with an open source solution like Drupal, which offers flexible options, it pays to invest some attention in understanding the taxonomy business process, because doing so leads to a more efficient implementation and a more efficient back-end process.

The return on investment (ROI) justifications include not only user interface improvements (fewer clicks to the right content) but also programming cost efficiencies, such as simpler back-end queries. Validation work dovetails with the work of marketing and customer relations, so consider integrating taxonomy validation and governance into existing work processes. Some organizations roll taxonomy management into a knowledge management function that oversees the entire process, from organizing knowledge categories to managing content acquisition and monitoring.

Drupal’s development community is working on some very sophisticated features that will become available in the coming years, including ways to visualize and cluster linked data using RDFa. Developing faceted navigation and taxonomies is a great way to get ready for an exciting future of visually interesting interfaces that better help users find and share information in complex organizations.

Don’t let the simplicity of the Taxonomy Accordion fool you. Use the accordion as an opportunity to understand user needs and how users look for information, and to make the underlying production, tagging, and databases more efficient and more focused on user needs and high-quality information.

~ Marlene Rockmore


Everybody’s an Ontologist

Clay Shirky is right. “Here comes everybody.” WordPress has just released an amazing version that makes it easy for anyone to build a high-quality website that includes hierarchical search through topics. That means everybody can enrich their content by enriching the concepts associated with their content and pages. There are several nice features in the WordPress categorization widgets:

  • Anyone who has the patience to play with the categorization widgets in the dashboard can build a topical or indented navigation that is more intuitive.

  • Concepts are not exposed until content is added, so “form follows content.”

  • As best as I can tell from some research searches, the more specific the concept tags, the more likely the content is to be retrieved in a search. That is, the more specific I am as a searcher, the more likely I am to find good content, and the more specific you are as the content provider or blogger, the more likely you are to be found. This is an old indexer’s rule: “index to the narrowest term.”

  • You can set the postings so the parent category does or does not retrieve the child categories, so you can choose whether to enforce inheritance.

  • If you index to concepts from different facets, you have implemented a faceted search.
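
The inheritance choice can be pictured with a small sketch. This is not WordPress code (WordPress is written in PHP); it is a Python illustration in which the category tree, the post ids, and the `posts_in` helper are all hypothetical.

```python
# Hypothetical category tree: child -> parent (None marks a top-level category).
PARENT = {
    "Services": None,
    "Elder Care": "Services",
    "Home Care": "Elder Care",
    "Special Education": "Services",
}

# Hypothetical posts, each indexed to its narrowest category.
POSTS = {
    "post-1": "Home Care",
    "post-2": "Elder Care",
    "post-3": "Special Education",
}


def descendants(category):
    """Return the category plus every category beneath it."""
    found = {category}
    changed = True
    while changed:  # keep sweeping until no new child is discovered
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def posts_in(category, include_children=True):
    """List posts in a category, optionally inheriting from child categories."""
    wanted = descendants(category) if include_children else {category}
    return sorted(post for post, cat in POSTS.items() if cat in wanted)


with_inheritance = posts_in("Elder Care")
without_inheritance = posts_in("Elder Care", include_children=False)
```

With inheritance on, a post indexed to the narrowest term is still found from its parent, so indexing narrowly costs nothing in retrieval.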

There’s been a raging debate about the value of taxonomies: whether they should be implemented in RDF, custom XML, SOX, or even an RDBMS, or whether they have any value whatsoever. It doesn’t matter. Taxonomies are agnostic. They have a fundamental hierarchical structure. But the next step is to have taxonomies move toward ontology, and to teach everyone to be an ontologist.

Why does this matter? Because semantic technologies don’t have to belong to the privileged. Anyone who understands the <subject>-<predicate>-<object> construction, much like the subject-verb-object sentences you learned in English, can start to build a model and enrich their content. WordPress has a clever implementation of something called XFN that will help build networks of FOAF files to find out who is linking to whom, but there need to be many more templates to help locate critical information for our everyday lives. For example, take an information need as critical as elder care. For the sandwich generation, it is important to find out <who> <provides> <service> for an aging parent, as well as to rate the quality of the provider. For parents who have special education needs, what about content that is categorized by <who> <provides> <services>, which could help share better information about expertise and programs that support students with a range of individual needs? What if it was easier to link a teacher to <resources> <in a> <content area>? What if an excellent teacher could put their lesson plans online and receive royalties by doing what publishers do and adding better indexing concepts in their metadata?
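
A minimal sketch of the <who> <provides> <service> idea, in Python, might look like the following. It uses the subject, predicate, object order that RDF uses. The provider names and the `who` helper are invented for illustration, and a real implementation would more likely use an RDF library than a plain list.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical statements; every name here is made up for the example.
TRIPLES = [
    Triple("Maple Street Home Care", "provides", "elder care"),
    Triple("Riverside Learning Center", "provides", "special education tutoring"),
    Triple("Ms. Alvarez", "teaches", "algebra"),
]


def who(predicate, obj):
    """Answer '<who> <predicate> <object>?' by matching the stored triples."""
    return [t.subject for t in TRIPLES if t.predicate == predicate and t.object == obj]


providers = who("provides", "elder care")
```

Even a flat list of statements like this can answer a "who provides what" question that a simple hierarchy of categories cannot.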

Taxonomy support in WordPress is very exciting but basic. WordPress allows hierarchical categories but has no synonym support. For example, you cannot set up synonyms for concepts like “groups,” “organizations,” “communities,” and “clubs.” You cannot add “Home Energy Efficiency” as a synonym for “Weatherization” (or the British spelling, “Weatherisation”). “Home energy efficiency” would have to be a child of the term “weatherization.” And voilà, if you could add synonyms for the concepts in your WordPress categorization, you could probably improve the status of your content in your community.
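
Synonym support of the kind described here could be as simple as a lookup from each variant to one preferred term. The sketch below is an assumption about how that might work, not a WordPress feature; the `PREFERRED` table and the `normalize` helper are hypothetical.

```python
# Hypothetical synonym table: each variant points to one preferred term.
PREFERRED = {
    "weatherization": "weatherization",
    "weatherisation": "weatherization",
    "home energy efficiency": "weatherization",
    "organizations": "organizations",
    "groups": "organizations",
    "communities": "organizations",
    "clubs": "organizations",
}


def normalize(tag):
    """Map a tag to its preferred term; unknown tags pass through in lowercase."""
    key = " ".join(tag.lower().split())  # fold case and tidy whitespace
    return PREFERRED.get(key, key)


tags = [" Home Energy Efficiency", "Weatherisation", "Clubs"]
preferred_terms = [normalize(tag) for tag in tags]
```

Content tagged with any variant would then be filed, and found, under the same concept.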

However, even a modest level of categorization can achieve sophisticated results. Simply ensuring that content is categorized to multiple relevant facets, such as the <who> category, the <services> category, geography, or another attribute (or any category that matters in your domain), enriches the content and makes it potentially more “findable.” A fully formed taxonomy should be categorized and have links between categories, as in RDF. That’s not part of the taxonomist’s vocabulary, though that’s what the ontologists do.

One impediment to progress will be the inability to import and export taxonomies between applications. Like everything else, taxonomies are a commodity, so there needs to be work on data interchange in the tools provided by vendors. But there are existing standards that can help engineers in developing these interchange standards. Many years ago, I worked with the IMS (Instructional Management Systems) standards for interchange of educational content objects, but the XML technology had not yet matured. Since then, I have been through many painful migrations of educational learning objects between learning platforms, and have watched the exorbitant increases in textbook costs. The time has come to encourage the exchange of information. This does not diminish the work of categorization and analysis; it just shifts the work from one cost center to “everybody.”
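
As a small illustration of what import and export between applications involves, the sketch below round-trips a taxonomy through JSON. JSON is chosen only for brevity and is not one of the interchange standards mentioned above; the taxonomy contents and both function names are hypothetical.

```python
import json

# Hypothetical taxonomy: each term lists its narrower (child) terms.
taxonomy = {
    "Services": ["Elder Care", "Special Education"],
    "Elder Care": ["Home Care"],
    "Special Education": [],
    "Home Care": [],
}


def export_taxonomy(tax):
    """Serialize the taxonomy to a JSON string that another tool could read."""
    return json.dumps({"terms": tax}, indent=2, sort_keys=True)


def import_taxonomy(text):
    """Rebuild the taxonomy from a previously exported JSON string."""
    return json.loads(text)["terms"]


exported = export_taxonomy(taxonomy)
round_tripped = import_taxonomy(exported)
```

The hard part in practice is agreeing on the shared format, not writing the few lines that read and write it.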

The content management tools and blogs are evolving quickly. For example, Drupal is evolving to allow semantic support of RDF values. O’Reilly’s Kurt Cagle has pointed out that content enrichment is probably the hot trend in 2009. O’Reilly has taken a great step toward the librarian’s belief in metadata and descriptive cataloging by adding metadata to its publications in RDF and aligning the RDF with Dublin Core metadata standards.

But the next step is to improve metadata and keyword tagging so content, regardless of format, is findable.

Categorization is an important step on the way to analysis, empathy, and knowledge, so be intellectually adventurous and find out what happens when you try to categorize tags. All of this takes a bit of thought: research into the current state of information in a field (market research), into the language of the field, and into how users search for information, followed by some creativity to brainstorm the possible terms and then categorize them.

So for everybody to succeed, we need to talk about content and concept enrichment. It no longer takes any great skill to build a taxonomy, but it does take a little bit of thought and patience, some interface testing and some valuable content that is worth the analysis to preserve and share.