The Right Prescription for a Crowd-source Experiment

My last post was an experiment in using remote online card sorting as a way to build a taxonomy. And why start small? My sample data was the picklist used on www.medicare.gov when you search on “What does Medicare Cover?” For my experiment, I used websort.net as the remote card sorting tool.

First, let’s start with the good news. Online tools are a very cool way to bring together remote groups where it would be too expensive or politically impossible to connect. That’s the promise.

But a successful remote card sort requires preliminary planning and work. Here are my lessons learned:

  • Keep the test under 20 minutes: Online card sorting is a time-consuming task for the participant, so for the experiment to be successful, you need to make sure that participants have the time and that the number of terms to be sorted is not overwhelming. Joseph Busch of Taxonomy Strategies and Dave Cooksey of saturdave.com suggest 20 minutes and 25 terms at most. My comprehensive test of all 132 picklist terms from the Medicare site was too big.
  • Pretest the taxonomy: Since the card-sorting activity is a one-time opportunity to engage testers, some prior testing of the taxonomy should occur. Remote card sorting is better suited to a closed experiment, where a taxonomy has already been designed, than to an open card sort, where the goal is to discover categories and facets. The best practice is to run some tests of the taxonomy before the online experiment. Have a trusted expert do the test, and then throw away obvious problems. If the pre-test doesn’t go well, try again. Testers in an online setting have a low tolerance for obvious problems, so the test needs to be about validating a good design.
  • Choose online tools carefully: The tool I used, websort.net, had a major problem. It allowed a term to be classified under one and only one category. This proved frustrating to users. For example, users wanted to classify durable medical equipment under the category for Equipment but also under the category for the Disease or Chronic Condition. Dave Cooksey, who tracks tools, says remote tools are improving all the time, so evaluate tools and choose wisely.
  • Be sure to thank the participants: We all feel manipulated by many of the group activities we attend in the face-to-face world, and that can happen in the remote world as well.   Being authentic and courteous is important. Provide a thank you and be sure to share results or feedback.  If possible, consider some kind of compensation such as a gift card.

So given that a test that seems so simple on the surface requires work to set up, what is the value of this work? The purpose of a taxonomy is to determine top-level facets that can be used to organize and search for information. If we look at a topic like Medicare, we know that we have a national problem determining standards for insurance policies. It is difficult to compare policies, and it is also time-consuming to manage the costs.

In designing good remote crowdsourced card sorting tests, Dave and Joseph have the following recommendations:

  • Pay attention to the sample size
  • Recruit carefully to be sure the sample has a balance of perspectives
  • Run tests prior to online activity. Have experts try the test.
  • Remember the goal of a taxonomy test is to find the higher-level categories that overlap between technical expertise and general understanding.
  • The result is a better analysis of shared group understanding – shared mental models of how we collectively categorize concepts,  not individual understanding

In the scheme of a trillion-dollar problem like health care, a project to set up well-designed remote card sorts that can compare how different user groups sort fundamental Medicare concepts seems like a small investment. A well-run test with good recruitment could be a very good way to jumpstart better designs of websites such as Medicare that deliver clearer information about benefits and choices.

Using Taxonomies to Sort through Health Care Reform

I am very interested in the health care reform debate, so I wanted to know what a public option might look like. I was told by my sources that a robust public option might look a bit like Medicare. So off I went to the Medicare.gov website to find out what was covered. In the middle of the home page, in the second column, there is a link to “Find Out What is Covered,” which leads to an advanced search criteria page. The search page includes a picklist of about 143 topics, just the right size for a sample set of candidate terms for a card sort.

This month, I am offering a small interactive experiment in online card sorting. Taxonomies are collections of facets, which are created by organizing concepts into categories. Card sorting is one of the best ways to identify categories: groups of users create categories in controlled tests, and those categories are validated through repeated tests until there is a consensus. In health care reform, taxonomies might be useful in creating consumer-friendly interfaces for searching across the national insurance exchanges.

A card sort method uses the following steps:

  • Collect a sample set of candidate concepts
  • Group or cluster terms into categories
  • Refine the design iteratively until there is a set of facets, groups of categories that have similar properties
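One common way to look at the grouping step is to count how often participants put two terms in the same pile. The sketch below is only an illustration of that idea, not the analysis Websort.net performs; the function name `co_occurrence` and the sample sorts are invented for the example.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Count how many participants placed each pair of terms in the same group."""
    pairs = Counter()
    for groups in sorts:                # one participant's complete sort
        for terms in groups.values():   # the terms placed in one category
            # Sort the terms so each pair is always stored in the same order.
            for a, b in combinations(sorted(terms), 2):
                pairs[(a, b)] += 1
    return pairs


# Hypothetical sorts from two participants (invented data, not real results).
sorts = [
    {"Equipment": ["Wheelchairs", "Walkers"], "Services": ["Flu shots"]},
    {"Mobility": ["Walkers", "Wheelchairs"], "Prevention": ["Flu shots"]},
]

pairs = co_occurrence(sorts)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs that are grouped together by most participants are candidates for the same category, even when participants used different category names.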

I’ve put 130+ topics from Medicare into an online card sorting tool called Websort.net. The topics have not been formatted or massaged; they are just as they appear in the Medicare search picklist. Websort.net suggests that I use a closed card sort, where participants sort terms into predetermined categories. So to get started, I’ve come up with about 20 starter categories. Some of these categories will become subtopics in a faceted design.

The experiment is open to the first 10 participants who want to take the time to try this task.   To try the card sort, link to

http://websort.net/s/80CDD6/

Please feel free to assign terms to multiple categories or to suggest other categories.

Last month, Joseph Busch blogged about the judicious use of online web sorting tools: they may not be the most cost-effective way to build taxonomies. One of his arguments is that the sample set of users will not be random. That’s true. This blog has a small readership with an interest in taxonomies, and probably a consumer’s interest in health care reform. Let me know what you think of websort.net.

This little experiment could help demonstrate some bigger observations. Government may be looking to advanced high-volume technologies such as clustering or semantic technologies to identify categories and to map claims data. Perhaps one of the applications will be to build interfaces that will help consumers search across the national exchanges. But at the core of these technologies, there will be a need for well-designed taxonomies to help analyze text and build better interfaces to access health care information.

A well-designed taxonomy with facets and linking relationships can

  • Group information into useful categories
  • Identify gaps in coverage
  • Help point to important related information

Let’s find out if taxonomy design can help us sort through health care reform.

Thanks to Andy Oram and the Sunlight Foundation for introducing me to this tool and to Dave Cooksey who is virtually updating my card-sorting skills.

What’s wrong with crowdsourcing the design of public websites?

A blog post from Sunlight Labs on “Redesigning the FCC: Getting Organized” suggests an experiment that employs a public card-sorting program, websort.net, to help redesign the Federal Communications Commission (FCC) website.  The FCC has a notoriously convoluted web site, hard to navigate and hard to search.  Sunlight Labs invites anyone interested in helping the FCC to this open card-sorting activity, which organizes about 60 terms into categories related to the FCC. But is a public web sort the right approach to redesigning a government website?

Should we crowdsource the design of a public website?

Here are some considerations:

  • First, the success of any design process depends on who sits at the table. Site designers have not succeeded over the years by roping in anyone who happens to be around. Rather, carefully identifying the right participants for any design activity is very important. Engaging busy professionals and bureaucrats in order to derive the maximum impact with the minimum effort is a tricky business. One of the most cutting critiques of the Wikipedia has been that the editorial perspective is overwhelmingly white-male twenty-something—not necessarily the authority of choice for everyone else.
  • Second, open processes tend to be very time-consuming, which works in your favor for some kinds of crowdsourcing but not for selecting terms and categories. Unless the sample is large and controlled, the emerging pattern from crowdsourced card sorting may not be helpful because experts with limited time will be overrun by people with lots of time and a fast hand on the keyboard, no matter how much or how little they know. Some types of crowdsourcing (such as prediction markets) work because the errors of ignorant participants cancel each other out and allow the experts to win out—but card sorting is entirely different and results in just chaos.
  • Third, it would be much quicker for the FCC to suggest a model for organizing its content based on its expertise than to crowdsource the design. There are standard ways to organize things, including website content, which people can learn even if they are not entirely natural. We learn about brand, price, size, color, material, and fit because they help us find the stuff we want to buy, not necessarily because there is a shopping gene in our DNA.
  • Fourth, the users of these sites, such as broadcasters, regulators, website publishers, and ordinary people, are not always interested in the same things. The FCC will have to comply with legislative and executive branch imperatives that may be of little interest to many people in the crowd.

A better way to approach website design and redesign focuses on the backend nomenclature—buckets and categories, which are called facets and vocabularies. These form the basis of a useful taxonomy.

So when can crowd-sourcing be used effectively? If the FCC engaged in the process of designing facets and vocabularies, the crowd could be useful as a follow-up. First, it can be helpful in validating a design. After all, the test of a taxonomy is whether it helps people find information. One of the appropriate roles for crowd sourcing in taxonomy is to observe how the users access a collection of items over time, the searches they use, and the click paths they follow. The taxonomy can then be tuned based on how the activity distributes among the categories—splitting and merging categories as warranted.

Another place for crowdsourcing is to allow users to add free-text “tags” to the content. Those tags can then be evaluated to either map them to existing taxonomy categories, or to suggest changes to the taxonomy. In this case the crowd and the taxonomy work together in synergy. Users typically add a tag to only a fraction of the pages, so in most cases these terms will be synonyms or equivalents to existing categories.
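As a rough illustration of that evaluation step, the sketch below splits user tags into those that match a known synonym of a taxonomy category and those left over for a taxonomist to review. The synonym table, the category names, and the function `map_tags` are all assumptions made up for this example.

```python
def map_tags(tags, synonyms):
    """Split free-text tags into mapped ones and ones that need human review."""
    mapped, unmapped = {}, []
    for tag in tags:
        key = tag.strip().lower()       # normalize case and stray whitespace
        if key in synonyms:
            mapped[tag] = synonyms[key]
        else:
            unmapped.append(tag)        # candidate for a taxonomy change
    return mapped, unmapped


# Hypothetical synonym table: normalized tag -> taxonomy category.
synonyms = {
    "dme": "Durable Medical Equipment",
    "wheelchair": "Durable Medical Equipment",
    "flu shot": "Preventive Services",
}

mapped, unmapped = map_tags(["DME", "Flu Shot ", "hospice"], synonyms)
print(mapped)
print(unmapped)
```

Tags that land in the unmapped list are the interesting ones: each is either a new synonym to record or a hint that the taxonomy is missing a category.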

Finally, a card-sorting exercise can be useful after the field is carefully constrained by the experts who know the site. The true test of any card-sorting activity is whether people can actually find what they are looking for afterward. Mapping a tag as a synonym of an existing taxonomy category effectively applies that tag to all the content already in that taxonomy category. This synergy is one method that can help improve access to information.

Here are several techniques that are intuitive and natural for people to use with little or no training, allowing them to validate a taxonomy. These techniques are much faster than open card sorts, and provide results that are easier to interpret.

  • Classifying some content
  • Conducting walk-throughs
  • Closed card sorting

Classifying some content

In this exercise, people are presented with a representative subset of content from the site and are asked to tag it. You can select it randomly or try to include examples of the site’s primary content types, as well as content you think may be hard to tag, find, or use. Plotting the number of items tagged into each taxonomy category, you should expect to see 80% of the content fall into 20% of the categories.
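The 80/20 expectation can be checked with a few lines of code once the tagging is done. This is a minimal sketch under the assumption that each tagged item is recorded as one category name in a list; the function name `top_share` and the sample data are invented.

```python
import math
from collections import Counter


def top_share(assignments, fraction=0.2):
    """Return the share of items that fall into the busiest `fraction` of categories."""
    counts = Counter(assignments)
    if not counts:
        raise ValueError("no tagged items to analyze")
    ranked = sorted(counts.values(), reverse=True)
    # Always look at one category or more, even for very small taxonomies.
    top_n = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Hypothetical tagging results: 20 items spread over 5 categories.
assignments = ["Equipment"] * 16 + ["Services", "Drugs", "Tests", "Supplies"]

share = top_share(assignments)
print(f"{share:.0%} of items fall in the top 20% of categories")
```

A share far below 80% may mean the categories are too fine-grained or overlapping; a share near 100% may mean a few categories are too broad to be useful.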

Conducting Taxonomy Walk-Throughs

One-on-one and group presentations to stakeholders, showing and explaining or walking through the taxonomy, are an effective way to extract specific comments and sometimes overall approval. During walk-throughs, standard questions should be asked about the category structure, as well as about problematic categories, to gather feedback on the taxonomy. Delphi walk-throughs are done using a stack of cards. It is not a set of raw terms, however, as in the FCC exercise. Instead, the cards are already marked with categories chosen by the experts. Reviewers are asked to mark changes to the category labels on the cards. Each subsequent reviewer is given their walk-through using the cards with the label mark-up from the previous session. The process usually stabilizes after a few sessions, indicating that the categories are appropriate. According to Dave Cooksey, Founder and Principal of saturdave, 20 sessions will usually result in a consensus taxonomy revision, and this method provides results without any further analysis.

Closed Card Sorting

Closed card sorting, where categories are in predefined buckets, can be used to test whether stakeholders and end users consistently sort categories into the correct taxonomy facets. The categories to test should be a set of important topics, such as the most frequently searched words and phrases from the search engine logs. The test can be done using actual cards, or using a simple grid with categories to be tested down the left column and the taxonomy facets across the top. Paper card sorts work well enough for up to 20 trials.
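The grid described above lends itself to a simple consistency measure: for each tested term, find the facet most participants chose and the fraction who chose it. The sketch below assumes results are recorded as (participant, term, facet) rows; the function name `agreement` and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def agreement(placements):
    """For each term, return (most chosen facet, fraction of participants choosing it)."""
    by_term = defaultdict(Counter)
    for _participant, term, facet in placements:
        by_term[term][facet] += 1
    result = {}
    for term, counts in by_term.items():
        facet, votes = counts.most_common(1)[0]
        result[term] = (facet, votes / sum(counts.values()))
    return result


# Hypothetical closed card sort results from three participants.
placements = [
    ("p1", "Wheelchairs", "Equipment"),
    ("p2", "Wheelchairs", "Equipment"),
    ("p3", "Wheelchairs", "Condition"),
    ("p1", "Flu shots", "Services"),
    ("p2", "Flu shots", "Services"),
    ("p3", "Flu shots", "Services"),
]

result = agreement(placements)
for term, (facet, share) in result.items():
    print(f"{term}: {facet} ({share:.0%} agreement)")
```

Terms with low agreement are the ones to revisit, either by relabeling the facet or by allowing the term to live under more than one facet.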

Websort.net is a good tool when you need a larger, distributed closed card sort test. If users can’t map terms to the categories, the designers will know that they have to adjust their design. But our experience shows that pre-analysis captures about 80% of the common categories and use cases. Sunlight Labs has undertaken a commendable task in seeking to improve the FCC web site’s layout. By carrying out a card sort too quickly, they’ll just get their signals crossed. Performing some professional taxonomy work first will channel public efforts in the right direction.

Submitted by – Joseph A. Busch, Founder and Principal, Taxonomy Strategies,  Sept  8, 2009
