Looking Ahead to Faceted Searching - Part 2
In the last post we discussed the history of library search technology as a lead-up to our forthcoming addition of faceted search to the library catalogue.
But we didn't say all that much about what faceted search is. So what is faceted searching and why is it exciting for improving the library catalogue?
Faceted Search Defined
A search of the web will turn up quite a few results for the question "what is faceted search?"; I like the definition offered by the Association for Computing Machinery's Special Interest Group on Information Retrieval (SIGIR does a lot of work in areas of computer technology of specific interest to libraries):
- Navigational search uses a hierarchy structure (taxonomy) to enable users to browse the information space by iteratively narrowing the scope of their quest in a predetermined order, as exemplified by Yahoo! Directory, DMOZ, etc.
- Direct search allows users to simply write their queries as a bag of words in a text box. This approach has been made enormously popular by Web search engines, such as Google and Yahoo! Search.
Faceted search, by contrast, lets users "navigate a multi-dimensional information space" by "combining text search with a progressive narrowing of choices in each dimension". The sections below take those two phrases in turn.
An Old Idea in the Library World
Faceted search as an idea is related to (though not identical with) the concept of faceted classification, a fairly old idea in the library world. See the Bliss Classification System or the Colon Classification System, both developed by librarians who considered the Dewey Decimal Classification System insufficient for describing and categorizing the richly varied world of information.
"Navigate a multi-dimensional information space"
A piece of information (let's say a book from here on out for the sake of convenience) has many different possible points of access that might be of interest to someone looking for it. This is where the "facets" terminology comes from--each possible access point is one "facet" of the whole piece of information.
Some of these are "flat", such as the name of an author or the title of a book, but for others it may be possible to identify a hierarchy from general to specific, such as for geographic area of coverage:
- Earth > North America > Canada > Ontario > Toronto
A huge range of possible books exists within the geographic coverage of "Earth". A narrower subset of that range geographically covers "North America", and a narrower subset within that covers "Canada". And so on... You could also consider more granular hierarchies, such as having "Western Hemisphere" between "Earth" and "North America".
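For readers who like to see ideas in code, here is a minimal sketch of that hierarchy in Python. The titles, the `coverage` field, and the `within` helper are all made up for illustration and say nothing about how our catalogue or any vendor's product actually stores records. Each book carries its geographic coverage as a path from general to specific, and narrowing is just a prefix match on that path.

```python
# Hypothetical records: titles and coverage paths are invented for illustration.
BOOKS = [
    {"title": "A Planet in Profile", "coverage": ("Earth",)},
    {"title": "Rivers of a Continent", "coverage": ("Earth", "North America")},
    {"title": "Canada Coast to Coast",
     "coverage": ("Earth", "North America", "Canada")},
    {"title": "Ontario Back Roads",
     "coverage": ("Earth", "North America", "Canada", "Ontario")},
    {"title": "Toronto Street Names",
     "coverage": ("Earth", "North America", "Canada", "Ontario", "Toronto")},
]


def within(books, path):
    """Return the books whose coverage path starts with the given path."""
    path = tuple(path)
    return [b for b in books if b["coverage"][:len(path)] == path]


# Every book falls somewhere within "Earth"...
earth = within(BOOKS, ["Earth"])
# ...while only a narrower subset falls within "Canada".
canada = within(BOOKS, ["Earth", "North America", "Canada"])
canada_titles = [b["title"] for b in canada]
```

Each step down the path can only keep or shrink the set of matching books, which is what makes a hierarchical facet feel like zooming in.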
But hierarchical subject browsing based on a subject heading system such as the Library of Congress' has been a feature of some online library catalogues in the past. The real power of faceted searching comes with...
"Combining text search with a progressive narrowing of choices in each dimension"
You may already use faceted search and not realize it. The ability to start with a free-text search and then narrow down your results within various dimensions is a common one on e-commerce sites:
The screenshots to the right show the websites of Canadian Tire and eBay using faceted search to narrow within a free-text search.
You get a lot of power from this ability to search freely and then progressively narrow your search by the available facets of the retrieved results. Ideally you get the best of both worlds in a user-friendly manner--you can look for whatever you want, but the system will then progressively guide you through its particular information structure to improve precision, eliminate false hits, and help you find information that's on target.
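The search-then-narrow pattern can be sketched in a few lines of Python. This is only a toy under stated assumptions: the records, the field names (`format`, `language`, `subject`), and the three helper functions are hypothetical, and a real search engine does far more (ranking, stemming, indexing). The point is the shape of the interaction: a free-text search produces a result set, the system counts the facet values found in that set, and each choice narrows the set further.

```python
from collections import Counter

# Hypothetical catalogue records, invented for illustration.
CATALOGUE = [
    {"title": "Gardening in Toronto", "format": "Book",
     "language": "English", "subject": "Gardening"},
    {"title": "Toronto: A History", "format": "Book",
     "language": "English", "subject": "History"},
    {"title": "Toronto, une histoire", "format": "Book",
     "language": "French", "subject": "History"},
    {"title": "Toronto Then and Now", "format": "DVD",
     "language": "English", "subject": "History"},
    {"title": "Ontario Birds", "format": "Book",
     "language": "English", "subject": "Nature"},
]


def text_search(records, query):
    """Naive free-text search: every query word must appear in the title."""
    words = query.lower().split()
    return [r for r in records if all(w in r["title"].lower() for w in words)]


def facet_counts(records, facet):
    """Count how many of the current results carry each value of a facet."""
    return Counter(r[facet] for r in records)


def narrow(records, facet, value):
    """Keep only the results that have the chosen value for a facet."""
    return [r for r in records if r[facet] == value]


# Step 1: search freely.
hits = text_search(CATALOGUE, "toronto")
# Step 2: the system offers the facets present in those results.
format_counts = facet_counts(hits, "format")
# Step 3: pick a facet value, then see what choices remain.
books = narrow(hits, "format", "Book")
language_counts = facet_counts(books, "language")
# Step 4: keep narrowing along other dimensions.
english_history = narrow(narrow(books, "language", "English"),
                         "subject", "History")
```

Because the counts are recomputed from the current results, the system only ever offers choices that lead somewhere, so a narrowing step cannot end in an empty list.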
If you've asked a librarian to look up a book (we have 99 branches to do this at if you feel the need) you've probably seen them pull relevant results very quickly, because librarians have extensive training in (among other things) the particular way in which catalogue records are organized.
A big part of what the web team hopes to do with faceted search is leverage our existing structured records (subject headings and other access points in the catalogue record) to make searching easier for people who don't already know how the information is organized.
Faceted Search Technology and the Library
For an example of faceted search working in a library catalogue, you can visit the North Carolina State University Library.
The specific faceted search technology we'll be using is made by Endeca. An interview with one of the founders in 2008 gives some insights into the origins of the technology (and it warms my librarian heart to see the acknowledgement of S.R. Ranganathan as one of the original thinkers of faceted search).
The web team aims to have faceted search technology in place for Toronto Public Library by late summer. Watch this space for further announcements.
Also: what about "adjacent searching", such as is used by Lexis/Nexis, where words must be within a certain number of words of each other (same paragraph, w/20, w/10, etc.)?
Posted by: Dean Tudor | May 20, 2009 at 17:25
Hi Dean,
Interesting question! I used Lexis/Nexis in library school and always found the proximity searching very useful for doing full text article searches.
Endeca is very robust search technology and definitely has implicit adjacency searching (like Google and other search engines) to improve search results.
I expect there's a way to be explicit about your adjacency as well, though I'm not sure how necessary it would be for most catalogue searches, which tend to search short fields like title, author, subject, etc. Was there a particular way you thought adjacency searching would be useful for the catalogue?
Posted by: Alan H. | May 21, 2009 at 11:15
It's funny that you don't mention it, going all the way to North Carolina, but faceted search was recently implemented in the UofT library system interface.
Posted by: Nathaniel | May 25, 2009 at 13:46
Hi Nathaniel,
Neat! Wasn't aware that U of T had recently implemented faceted searches, and they appear to be using Endeca as their platform for this as well. North Carolina State was one of the first libraries to implement Endeca-based faceted searching, so it's kind of the "default" example, but great to know U of T has taken this step as well.
Faceted searching lets you discover things like the Victoria University Library owning five of Roger Zelazny's novels from the personal collection of Northrop Frye, annotated by Frye.
Posted by: Alan H. | May 26, 2009 at 09:17
As a former librarian, I am just interested in having as many facets exposed as possible. I think it would be less expensive and demanding to do adjacency searching with shorter fields, such as in a library catalogue. In my days at Lib School (1965/67), we were lucky to know what KWIC was. Data processing still meant a lot of typing or keyboarding.
Posted by: Dean Tudor | May 28, 2009 at 17:17
Great post, Alan. The two-part series really explains why the problems exist...
Posted by: DM | May 31, 2009 at 14:39
With apologies for this being off-topic, I would appreciate knowing how to access the JSTOR database when in a branch. It seems to have been dropped from the "Title List of Databases Available in this Branch" (the last few times I've looked), and I haven't had an answer through Answerline.
Many thanks for this blog and for your help.
Posted by: James | Jun 17, 2009 at 12:53
Hi James,
I'm doing this from North York Central Library (on a public computer) and I see JSTOR as available from the A-Z list here.
Which branch do you usually try to access JSTOR from?
Posted by: Alan H. | Jun 18, 2009 at 14:19
Hi Alan,
I agree with your points completely. Faceted search unfortunately is not that commonly offered, but where it is it really makes a site stand out, and makes you want to come back as long as the content is good or the offerings are good values. Basically, it maximizes the speed of finding what you are looking for. Sites like this are likely to attract repeat visitors given good content or offerings.
Posted by: Stewart Engelman DNI Services | Oct 11, 2009 at 22:14