Search Patterns and Faceted Taxonomies

Peter Morville and Jeffery Callender have produced a beautiful manifesto for better search, called Search Patterns: Design for Discovery (O’Reilly, 2010). It is an ode to making complex data beautiful and navigable in user interfaces. It’s nice to see O’Reilly produce a book with visual flair.

But once you journey through the many beautiful interfaces and design principles on how to present data, you realize that there is still a need to understand that data presentation is related to data organization. Morville hints at how data is organized to facilitate these interfaces. In Chapter 2, on the anatomy of search, the authors write that sites should “embrace faceted navigation… Global facets might include topic, format, date and author.” Morville downplays the role of formal hierarchies, focusing instead on the user experience of multiple interactions, from “pearl growing” to browsing to managing your data, to work towards a more immediate user experience. Faceted navigation is described as “arguably the most significant search innovation of the past decade” (p. 95), but there is only one short chapter, called Engines for Discovery, that discusses how to create faceted navigation.

The data organization that combines the product taxonomy with other facets is called “unified discovery.” In the chapter on the engines of this discovery (Chapter 6), and this is where we get into the expanded role of the taxonomist, the task is to add facets for:

  • Category: broad classifications that vary by application
  • Topics: the smaller areas of common interest, such as specific cars or books or recipes
  • Format: how data is formatted, whether as content, video, or idea
  • Audience: the fundamental activity of understanding the needs of who might need the data, from scholar and expert to novice browser

This global “one size fits all” recommendation leaves out Time and Chance: time is when an object was produced, and chance is the likelihood that it is highly respected and relevant to the needs of users. Date and date range are an important global facet. Whether there is an “out of the box” global taxonomy is probably up for debate. Facets, including how many there are and how they are labeled, need to be validated by user need, application and content. A global model is a good starting point, but will probably need to be tuned. Search across health care policies, for example, probably requires facets on diseases, symptoms and treatments, and additional resources. Determining the top categories can take some time, because these categories should reflect common shared knowledge and vocabulary. The top facets do not have to number 5, or 7 plus or minus 2, but rather whatever is needed by the application and its users, and to organize the content. Get over fixed universality rules and instead collect more data about user needs and content.
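As a rough illustration of tuning a global model, the sketch below starts from the facets named above and adds domain facets for the health care example. The function name `tune_facets` and the exact facet labels are assumptions made for illustration, not anything prescribed by the book.

```python
# Hypothetical global starting point, using the facets discussed above
# plus Date, which this review argues belongs in the global set.
GLOBAL_FACETS = ["Category", "Topic", "Format", "Audience", "Date"]


def tune_facets(global_facets, add=(), drop=()):
    """Return the facet names for one application.

    Keeps the global facets that were not dropped, then appends any
    domain-specific facets that are not already present.
    """
    kept = [name for name in global_facets if name not in drop]
    extra = [name for name in add if name not in kept]
    return kept + extra


# Health care policy search: keep the global model, add domain facets.
health_policy_facets = tune_facets(
    GLOBAL_FACETS,
    add=["Disease", "Symptom", "Treatment", "Additional resources"],
)
print(health_policy_facets)
```

Which facets to add or drop would come from user research and content analysis, not from the code; the function only records the decision.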

These navigations rely on separate and distinct data structures which allow users to navigate and refine queries before they are passed to the underlying database or data structures. These data structures need to be maintained, governed and analyzed. Over time, the richer this conceptual metadata, the better the search experience, and better techniques for creating and using metadata are just around the corner.
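A minimal sketch of that refinement step, assuming each record simply carries its facet values as metadata. The sample records and the names `refine` and `facet_counts` are hypothetical; a production system would typically delegate this work to a search engine.

```python
from collections import Counter

# Hypothetical records; each carries facet metadata alongside its title.
ITEMS = [
    {"title": "Braised Leeks", "Topic": "recipes", "Format": "video", "Audience": "novice"},
    {"title": "Knife Skills", "Topic": "recipes", "Format": "text", "Audience": "expert"},
    {"title": "Hybrid Sedans", "Topic": "cars", "Format": "text", "Audience": "novice"},
]


def refine(items, selections):
    """Keep only the items that match every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)


# The user picks Topic = recipes; the interface can then show how many
# of the remaining results sit under each Format value.
results = refine(ITEMS, {"Topic": "recipes"})
counts = facet_counts(results, "Format")
print([item["title"] for item in results], dict(counts))
```

The counts are what let an interface show users how many results sit behind each refinement before they click.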

On taxonomies and ontologies, the authors specifically argue that there may be other approaches to disambiguating terms (like Java the programming language from Java the island) based on clues like user and context rather than vocabularies:

“It’s not that there’s no value in parsing sentences for meaning or developing thesauri (or ontologies) that map equivalent, hierarchical, and associative relationships.  These approaches can add value, especially within verticals with limited formal vocabularies, like medicine, law and engineering.  It’s just that less obvious approaches like employing query-query reformulation and post-query click data to drive autosuggest – may deliver better results at lower costs. And we should be wary of claims that computers “understand meaning,” at least until they get a whole lot better at filtering spam.” (p. 162)

While these ideas are valid, the argument loses the essential wisdom of why librarians adopted taxonomies and spent so long building a body of standards for taxonomy creation. One thing librarians have long known about taxonomies is that they have a shelf-life beyond a specific application: they can be used to share data across applications, across communities and across the globe.

If we are to move the beauty of Morville and Callender’s interfaces to uses beyond e-commerce and towards accessible, lower cost applications, we are going to have to understand the data structures behind these beautiful designs, and reach some shared understandings about how they should be built. Search-side approaches to search are wise, but they depend on a good design for faceted navigation, one whose categories have been validated against users’ needs. The skills of the taxonomist can be applied to search-side information design.

One discussion I enjoyed was on the under-appreciated role of color as a “quick way to reference the major categories and key players.” (p. 15) I have often thought that it might be useful to have a color attribute when defining a facet or category, so that all the terms and concepts within a facet share the same color. That would help in the visual sorting of ideas, which Morville and Callender explore further on the following pages. Sites that have no visual library of photos, only ideas and concepts, could become more visual through the use of color-coding. It would be useful if blogs and databases looked at ways of adding color so that similar concepts in a facet or category are also grouped by a shared color.
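The color idea could be sketched as a single extra attribute on the facet definition. The `Facet` class, the `color` field and the hex values below are assumptions made for illustration, not something taken from the book.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Facet:
    name: str
    color: str  # assumed attribute: one hex color shared by every term in the facet
    terms: List[str] = field(default_factory=list)


def color_index(facets: List[Facet]) -> Dict[str, str]:
    """Map each term to the color of the facet that contains it."""
    return {term: facet.color for facet in facets for term in facet.terms}


facets = [
    Facet("Topic", "#1f77b4", ["cars", "books", "recipes"]),
    Facet("Audience", "#2ca02c", ["scholar", "expert", "novice"]),
]

# Any tag, label or search result can now look up its display color.
colors = color_index(facets)
print(colors["recipes"], colors["novice"])
```

Keeping the color on the facet rather than on each term means a relabeled or newly added term picks up the shared color automatically.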

To move to the next level, where we carry search patterns from e-commerce to other uses, such as health care or better access to government information, and more widely adopt better and more visual search designs, we have to broaden the understanding of how to create and validate faceted navigation and categories, and of what the supporting data structures need to be. Perhaps O’Reilly’s next book should be on the common data structures for design for discovery, such as the art of taxonomy and ontology.

Search Patterns is a valuable little book to stimulate creative juices. The link to buy Search Patterns is at http://searchpatterns.org/

Thank you to Andy Oram, a mensch of an editor at O’Reilly.

~ Marlene Rockmore

Book Review: Organising Knowledge by Patrick Lambe

Although the interest in and applications of taxonomies have grown in recent years, there are still not many books on the subject. Most of the information on taxonomies currently resides in online discussion group archives, blogs, wikis, conference presentations, white papers and reports (the latter at quite a premium price), but not much yet in easily accessible books. A search on Amazon.com on “taxonomies” yields numerous books of specific taxonomies, but very few on the art of creating taxonomies in general. Even the “books” page on the Taxonomy Community of Practice Wikispace lists mostly books on information architecture, a classic book on classification theory, chapters of books on broader topics, and high-priced research reports. There is just one book listed with a focus on taxonomies: Organising Knowledge: Taxonomies, Knowledge and Organisational Effectiveness by Patrick Lambe (Oxford, England: Chandos Publishing, 2007).

Indeed, as its title and subtitle suggest, taxonomies are presented within a broader view of how knowledge is organized. The book is neither a simple “how to” book, nor a scholarly treatment of the subject, but in fact combines both: practical advice on how to create taxonomies along with thoroughness in covering the field of knowledge organization and analysis of various ideas and previous literature on the subject, with many footnotes and a lengthy bibliography.

The author, Patrick Lambe, is a Singapore-based consultant in the field of knowledge management who can base his ideas on his own business experience. Yet Lambe also has the academic credentials of an information scientist, a Master’s degree in Information Studies and Librarianship and experience teaching as an adjunct professor. Thus, he aptly bridges both sides of taxonomies, the traditional library science side and the newer corporate knowledge management side, although it is the latter that is the subject of this book. What I appreciate in this book is that Lambe writes based on both his research and his experience, and based on these he has developed a number of his own ideas.

While common definitions of taxonomies often limit them to hierarchies, Lambe prefers a broader definition. The forms of taxonomies that Lambe presents, along with a detailed explanation for each, are: lists, trees, hierarchies, polyhierarchies, matrices, facets, and system maps. Stretching the definition and boundaries of what taxonomies are and can do is a central theme of Organising Knowledge. Lambe states: “Taken together, it becomes clear that taxonomy work holds a wider range of application and use than simply as a tool of information retrieval.” (p. 95).

Organising Knowledge presents a number of real world examples, scenarios, and case studies of the application of taxonomies in their broadest sense. These include implementations by the U.S. Department of Homeland Security, Unilever, and Club Med. These examples illustrate the wide range of uses for taxonomies. Among business activities, Lambe says that taxonomies can support the areas of risk recognition and response, cost control, customer and market management, and innovation.

Lambe does not simply describe taxonomies and their use. In this in-depth book he discusses their varied roles, how they are understood, and trends in their implementation. He describes how different kinds of taxonomies can either (1) structure and organize (both things and processes), (2) establish common ground, (3) span boundaries between groups, (4) help in sense-making, or (5) aid in the discovery of risk and opportunity.

Several later chapters turn to the practical steps of preparing, designing, and implementing a taxonomy project. Lambe breaks out the process into ten steps, the first six of which are all still part of the preparation stage. Among the topics presented in the preparation phase are taking technology into consideration and communicating well with the taxonomy sponsor and stakeholders. While it is appreciated that technology and computer systems are mentioned, I would have liked to learn more about them. It becomes quite evident that different situations require different approaches and different kinds of taxonomies, namely the kinds that Lambe describes earlier in the book. My only point of disagreement here is the continual distinction between tree taxonomies and faceted taxonomies, since taxonomies often exhibit both characteristics at the same time.
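To make that last point concrete, here is a minimal sketch of a taxonomy that is faceted and tree-structured at once: one facet holds a small hierarchy while another is a flat list. The facet names, the terms and the helpers `ancestors` and `matches` are hypothetical and are not drawn from Lambe’s book.

```python
# Hypothetical faceted taxonomy in which one facet is itself a tree.
# Each facet maps a term to its parent term (None marks a top-level term).
FACETS = {
    "Topic": {"vehicles": None, "cars": "vehicles", "hybrid cars": "cars"},
    "Format": {"text": None, "video": None},  # flat facet: no hierarchy
}


def ancestors(facet, term):
    """Walk up the tree inside one facet, nearest parent first."""
    path = []
    parent = FACETS[facet][term]
    while parent is not None:
        path.append(parent)
        parent = FACETS[facet][parent]
    return path


def matches(item_terms, facet, selected):
    """True if the item's term in this facet is the selected term or one of its descendants."""
    term = item_terms.get(facet)
    if term is None:
        return False
    return term == selected or selected in ancestors(facet, term)


item = {"Topic": "hybrid cars", "Format": "text"}
print(ancestors("Topic", "hybrid cars"))
print(matches(item, "Topic", "vehicles"), matches(item, "Format", "video"))
```

Selecting a broad term in the hierarchical facet also retrieves items tagged with its narrower terms, while the flat facet behaves like a simple filter.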

The book is well written and relatively easy to follow, but it is not a “light” read. It has a number of helpful tables and diagrams. Particularly useful is the table (two and a half pages long) comparing the uses and issues for each of the seven forms of taxonomies: lists, trees, hierarchies, polyhierarchies, matrices, facets, and system maps.

I highly recommend this book of great breadth and depth to anyone who works on taxonomies or is interested in working on taxonomies. The intended audience of the book is indeed limited to knowledge management and taxonomy professionals. Even those with considerable experience working in taxonomies will find this book informative and enlightening.

– Heather Hedden

This review is based on a longer book review written by Heather Hedden and published in Key Words, the Bulletin of the American Society for Indexing, Vol. 15, No. 4, October-December 2007, pp. 130-132.