Evaluating Quality on the Net

final version – March 2003.
This paper was originally created in 1995 with the title “Evaluating the Quality of Information on the Internet or Finding a Needle in a Haystack” as a presentation delivered at the John F. Kennedy School of Government, Harvard University, Cambridge, Massachusetts, September 6, 1995. I believe the philosophy found in this paper should continue to be of value, but I will no longer be updating the examples. Please refer to my husband Walt Howe’s Quality section : http://www.walthowe.com/navnet/quality.html for more current information.

I see most of my talk as pure common sense from a librarian’s standpoint. We need to use the same critical evaluative skills in looking for information on the Internet that we would use with a book, a paper index, a musical score, or an online commercial database. The content of the Internet is only more diverse because of the potential of interaction with more media. By media, I mean not just audio and video but all forms of technology-assisted communication.

With the growth of information on the Internet and the development of more sophisticated searching tools, it is now more likely that you will find information and answers to real questions. But within the morass of networked data are both valuable nuggets and an incredible amount of junk.

How should users today approach searching on the net and critically evaluating the data they find?

You need a systematic approach to evaluating the tools you will use for searching: what they will cause you to receive and what they will keep you from receiving. You also need a systematic approach to evaluating the document or result that your search returns. As information professionals, we are in the best position to determine and expand the relevance of existing criteria to new and future formats.

What this paper will address:


 

How should we look at Internet information?

Consider the continuum of information on the net as opposed to the continuum in print. Is it really any different? And if so, what makes that difference?

In print: vanity to very scholarly/specific

On the Net: vanity to very scholarly/specific but with more variation and with the inclusion of promotion/advertising which may be more difficult to differentiate on the net than in print or mass media/television.

The “home page” may be nothing more than a form of vanity or self publishing. Within what I might characterize as vanity would be the sites where an individual decides to share working papers or information they have been working on for a dissertation. Many home pages have been through a rigorous review process and should not be equated with the term “vanity.”

Vanity publishing

A vanity work may be a very specific document that has information of great value, but it hasn’t been through the peer review process intrinsic to scholarship or it hasn’t been disseminated by the trade publishing industry. Heretofore, vanity and short-run specialty publishing have been possible in print and can be “quality” in nature, although their value may not be as easy to determine without analysis. Such a work will not have some of the visual clues which facilitate the viewer’s critical analysis.

My grandfather had my grandmother’s childhood memoirs published and distributed to family and friends. I always thought of it as a very entertaining and pretty well written story of a little girl growing up as part of an acting troupe in the midwest. The title was “A Little Girl Goes Barnstorming.” Reading it now, I think it belongs in the history of the American stage in the late nineteenth century. How did it really differ from regular publishing? It was carefully edited but no publisher was involved. We look to publishers to give us assurance of added value and to provide quality control, both editorial review and adherence to standards.

While the term vanity press is a derogatory one, the content of what comes out of a vanity press may not be bad. But it is, from an information professional’s standpoint, much more suspect. It lacks any of the trappings that scholarly publishing affords.

Grey literature is another category: pamphlets, preprints, technical reports. I am not sure the Internet is any better or worse in its indexing than were the subject-based vertical files of my early library career years. ERIC has played a valuable role in giving us access to some of the grey literature for the education and library professions. I would think anything that is submitted to ERIC today probably could find its way onto the Web as well, and probably should.

Professional associations have played a historical role in the indexing of hard-to-find materials within their scope. For instance, in 1972 the American Gas Association formed the Library Services Committee to participate in information sharing among members, including the preparation of bibliographies of concern to the industry, a directory of gas industry libraries, and a union list of reference tools and services. (Shirk, Virginia R. and Davis, Marc L. “Gas Libraries: An industry-wide network,” Science & Technology Libraries, vol 1 no. 2 (1981), 15-22). Distribution of those tools was limited to members of that association not so much by their choice but by feasibility.

Today, a group of professionals such as the Australian Firenet can share their information with the world, for better or worse. Firenet : http://www.csu.edu.au/firenet/firenet.html, hosted by the Australian National University, is a cooperative set of World Wide Web servers for discipline specialists in the field of fire management and fire ecology. In this case librarians have not been involved. FIRENET’s specialized publications are locally mounted and managed and distributed via the Internet. Among other awards, they have been honored with the 911 Fire Police Medical Web Page First Alarm Site Award. In this case, I would consider a professional award much more telling than one from one of the many Internet awarding bodies.

The role of professional associations can already be seen. Contrast FIRENET with the American Mathematical Society : http://www.e-math.ams.org, which I would put on the scholarly end of the spectrum. Access is provided to MathSciNet, a web-accessible subscription database of the data in Mathematical Reviews (MR) and Current Mathematical Publications (CMP), which index and review the mathematics research literature from 1940 to the present. Only bibliographic data is available from 1940 to 1979, and from 1980 to the present both bibliographic data and review texts are available. Items listed in the annual indexes of Mathematical Reviews but not given an individual review are also included. Those in Mathematical Reviews appear first in Current Mathematical Publications. Institutional site licenses are the primary way that users get access. The cost for an individual can be steep, but MathSci Online is offered as an option via commercial services such as Dialog and CompuServe. In this case the web is integrated with the association’s publishing program and can be seen as just another distribution medium, to meet the needs of their customers.

Current experimentation by all types of publishers includes parallel publishing with print and/or supplementary publishing, which puts some information on the Internet but holds something back for the print publication. The Internet gives us access to large volumes of data. One of the earliest research projects that the net facilitated was the Genome Project : http://www.genome.gov. The net also allows us to manage materials that many libraries have not collected before, such as the statistics site Statlib : http://www.stat.cmu.edu at Carnegie Mellon.

Advertising and Public Relations as an Additional Category

At the original 1995 NEASIS presentation, Clifford Lynch brought up this category, which I had not originally put in my list. Since then marketing has taken a front seat on the Internet, and I certainly agree it belongs as a category of its own. Internet publishing categories include promotion, from self-publishing to the commercial variety. Along with providing information about products, it is perfectly natural for companies to promote them. Consider the automobile sites which describe all the features of this year’s models. There is nothing wrong with this information being available and I certainly want to have access to it, but as an information professional, I also want to be aware of the bias of what I am viewing. This is no different from the need to understand what you are reading in a 10K document filed with the SEC and to contrast that with the role of a company’s annual report.

A perfect example of the added value that a promotional site can bring can be seen in the bookstore sites, such as Amazon Book Store : http://www.amazon.com. Not only can you find bibliographic citations and order books, but here are comments from authors and unsolicited reviews of books by anyone who wants to contribute them, both good and bad, as well as professional reviews. Amazon compiles a wealth of information on its site to encourage anyone to return and **by the way** <smiley-face> to order a book or two because it is such an easy and cost-effective way to get what you need. What is most impressive is the level of customer service provided and speed of delivery.

Amazon is not alone; its competitor Barnes and Noble has partnered with sites such as the Northern Light Search engine to provide search for books and CDs once you have finished searching for articles on a topic.

There are a growing number of sites that may have started out because some people felt that the content belonged on the web, but now these sites need to support themselves. An example is the excellent Internet Movie Database : http://us.imdb.com. The commercial label is blurred, and the important thing to pay attention to is whether a site has valuable content and whether its presentation or content biases make any difference in terms of what you need to get out of it.

Multimedia Issues

Given the continuum of Internet “publishing”, additional criteria must be added to reflect the multimedia nature of the medium. Quality of sound is still pretty early in its evolutionary cycle. Sound files of any size may take an unreasonable time to transfer, but that is getting better and I have confidence video will be improving as well. Multimedia can bring immediate access to bird images and sounds or animation of a bird in flight. I am not a proponent of the medium for its own sake, but where it is used effectively, it can provide an enhanced product.

For example, the National Geographic River Wild–Running the Selway : http://www.nationalgeographic.com/selway/index.html is an excellent example of merging sound and graphics with print content to enhance the educational and recreational experience. However, there is the caveat that you need to have the right technology (hardware and software) to be able to take advantage of the sound, in this case, a sound card and Real-Audio software. The multimedia technology is not sufficiently developed that the browsers have everything you need built in.

Print publishers can run the gamut of quality as well, and as information professionals we have generally gleaned something about a whole line of a publisher’s works and the care with which titles are brought out. In the Internet publishing field, for instance, there are currently some shops that are known to move books out so fast that you can expect typos and errata, which will be corrected if there are later printings or which can be tracked down with some effort by going to their web site.

Some publishers are known to be advocates or supporters of different causes and their biases are part of what we keep in mind when we evaluate them. Consider the Sierra Club : http://www.sierraclub.org. Their publications are slanted in a particular direction, just as I would expect of campaign literature or any other form of advocacy or activist publishing. This translates on to the Internet and we must look at the viewpoint of the site. The viewpoint may be explicit in a scope statement, or you may not be able to confirm your suspicions except by analyzing the point of view of the contents of the site.

The Internet has enabled a vast new group to enter the world of publishing – those who didn’t learn the culture of the print publishing trade. And we need to have them use the right information so that we can evaluate their sites. So we have a responsibility to explain the rules to new publishers, just as the Internet community tells new users the Internet netiquette rules of the road.

So how do you come to terms with quality be it vanity or grey literature or scholarly? I take a pragmatic view of quality. At the very least, I want my facts accurate, current, and the bias and authority of authors clear.


Just to look at some of the issues to consider in evaluation of a web site, take a look at a site I think very highly of: Gilbert and Sullivan Archive : http://diamond.boisestate.edu/gas.

There is a clear table of contents and very good navigation. It is designed to be viewed both by text browsers (Lynx) and graphics browsers (Netscape Navigator and Microsoft Internet Explorer). Graphics load quickly.

The G&S Photo Gallery : http://diamond.boisestate.edu/gas/html/galindex.html displays black and white photographs, which show best on monitors with high resolution. A collection of public domain photographs of the stars and other principals of the original Gilbert & Sullivan productions has been scanned. Some, such as the picture of Alice Barnett as Queen of the Fairies in Iolanthe, have some text, while others are just the picture and the name of the star.

The Midi and Mpeg audio files are particularly appropriate and well done for this site. Since this is for aficionados, the karaoke nature of the midi files is designed for the members who want to sing the parts. The mpeg files, such as the Mikado March by John Philip Sousa, are not as easy to play, because even though the format was set as a helper application, it insisted that I download the file to play directly with the mp2 format player, while the midi files play directly. This represents an existing problem, solvable, but a hurdle to overcome.

What is the authority of the site? The webmaster Alex Feldman is Associate Professor in the Department of Mathematics and Computer Science at Boise State University, Boise, Idaho, which hosts the web site. The curator of the archive is Jim Farron who is a computer and electronic publishing specialist with the U.S. government. They are joined by a number of others who participate in making this such a rich site. For instance, interested individuals are contributing libretti, diaries of festivals, and additional audio files. One member is compiling a complete discography of all G&S that has been recorded based on his own collection as well as that of others. The peer review process for a site such as the G&S Archive is the care and attention of its contributors. Just as with any print or other types of resource, the viewer must bring his or her own critical evaluative questioning to the content.

How complete is the Gilbert & Sullivan archive? What can one expect to find here? The web site archive has grown from the initial files such as the photo gallery and a couple of libretti which had been on the FTP site to at least one libretto for each of the operas. They have now moved on to adding works by either Gilbert or Sullivan individually.

The content includes libretti in the public domain, and sources are identified.

While there is minimal dating of entries as a whole, there is a What’s New : http://diamond.boisestate.edu/gas/#new archive of past years, pointed to from the What’s New section.


 

Generic Criteria for Evaluation

  • Stated criteria for inclusion of information
  • Authority of author or creator
  • Comparability with related sources
  • Stability of information
  • Appropriateness of format
  • Software/hardware/multimedia requirements

Keep in mind that you must understand the current state of the Internet to determine how you best identify the quality of an Internet resource in this volatile, continually changing environment.


Current State of Evaluation Tools on the Net

  • How and when are you best served by an intermediary tool, such as one of the review guides that describes the resource and puts its stamp of approval on it by a number of stars? An example was the former Bschool business school rating guide, originally the Marr-Kirkwood business school rating guide.
  • Where do tools that help you identify like resources so that you can compare them fit in? One inclusive directory I would include here is Yahoo : http://www.yahoo.com. This is a territory that the search engines have jumped into, e.g. Altavista, Hotbot, and Lycos. They are looking to be “portals” with their own directories or a licensed directory as well as a search engine. Check to see which companies the search engine you are using is partnering with for added services.
  • When are you best served by the basic search engines and evaluating the results for yourself? In a lot of cases it makes more sense to search a popular search engine to go directly to material on your specific term than it does to browse through directories or review tools. While Google, Lycos, Altavista, HotBot, Infoseek or Excite used in this way may bring up lots of screens of bad hits, that really does not matter if you get to what you want on the first screen of hits.

Popular Search Engines listed alphabetically. Here you are searching using the value of description rather than that of evaluation. Most search engines today have some sort of associated portal. Danny Sullivan’s Search Engine Watch : http://www.searchenginewatch.com and Greg Notess’ Search Engine Showdown : http://www.notess.com/search are two current tools for keeping abreast of search engine developments. There is no good advice as to which ONE search engine is best. They are constantly changing. At this time Google and Northern Light are the first two I try. It is good to check back with each of the engines on a regular basis because of the amount of change.

  • Altavista: http://www.altavista.com
  • Fast: http://www.alltheweb.com
  • Google: http://www.google.com
  • HotBot: http://www.hotbot.com
  • Lycos: http://www.lycos.com

Search Engine Partners listed alphabetically. Each adds value depending upon its niche.

Directory Partners Check to see if your search engine has licensed one of these or is creating its own.

Sites are springing up that purport to provide “evaluations” of Internet resources. The next thing that is needed is to evaluate their track records to determine the value of their evaluations. While there are criteria in each case, the implementation of the evaluations is frequently subjective or biased. Note that this is really no different from what we have lived with in the print environment, except that now it is digital!

General Guides and Directories

Among the general guides are a number of sites that purport to be THE site you should start with.

You will want to compare in terms of value to you the level of specificity in Yahoo and the WWW Virtual Library and the newer general directories versus the set of categories in the various directories of the search engines.

Specialized Guides

More traditional library resources (fee-based and generally worth the cost)

When Sharyn Ladner and I wrote our first book surveying the Internet use of special librarians in 1991 and 1992, we noted that

“the Internet allows all types of publishing in the broadest sense–much of the information contained in Internet resident discussion groups is transitory–and this network of networks will continue to expand exponentially so that bibliographic control will continue to be out of reach. There is no Dialog superstructure to create a “dialindex” of indexes, and one is not likely to exist in the future because of the distributed nature of the system and the ephemeral quality of much of the information posted to network repositories. Librarian skill at creating specialized indexes or other retrieval tools will be needed.” (Sharyn J. Ladner and Hope N. Tillman, Internet and Special Librarians: Use, Training, and the Future. Washington, D.C.: Special Libraries Association, 1993, p. 58)

What a difference a couple of years makes. Our crystal ball was not very good. There is the potential for a whole lot more bibliographic control today; and at the same time there is increasing complexity. I still believe in the importance of information professionals’ contributing their skill to develop the searching tools for whatever the Internet is going to become.

General Guides

Argus Clearinghouse: http://www.clearinghouse.net

What started as the University of Michigan ClearingHouse project now is the Argus Clearinghouse. It is now truly separate in name as well as management. It has had growing pains. There is now a tighter process to ensure the quality of their guides. An early flaw that is being remedied is that many of these developed as student projects, and after the end of the year, the students left. After that there was a staff to do the reviews. Then it closed in 2002. It is a model for good quality reviewing.

Not all guides are done by students, and Internet gurus including John December and Diane Kovacs have been among the contributors. Guides not updated in the past year are listed in a separate file.

Several years ago, I did a review of the ClearingHouse project handling of business resources for the Journal of Business and Finance Librarianship. Since it has been over two years, I have removed it from my web site as out of date.

The original project leader, Lou Rosenfeld, began the ClearingHouse project while a Ph.D. candidate at the University of Michigan library school. With Peter Morville, he currently heads a business, Argus Associates. In Fall 1995, to improve the value of the guide, a plan : http://www.clearinghouse.net/ratings.html was put into effect to rate guides according to 4 criteria:

  1. Level of resource description: descriptive information providing users with an objective sense of what an Internet resource covers
  2. Level of resource evaluation: evaluative information providing users with a subjective sense of the quality of an Internet resource
  3. Organizational schemes, or how the guide is organized (by subject, format, audience, or other)
  4. Level of meta-information, or information about the other information. For instance, information about the authors, their professional or institutional affiliations and their knowledge or experience with the subject; how the guide was researched and constructed; and the mission of the guide.

The guides are organized within the following categories:

  • Arts & Entertainment
  • Business & Employment
  • Education
  • Engineering & Technology
  • Environment
  • Government & Law
  • Health & Medicine
  • Humanities
  • News & Publishing
  • Regional Information
  • Science
  • Social Sciences & Social Issues

A Digital Librarian’s Award: http://www.clearinghouse.net/dla.html was given monthly for the best guide of that month, and for select guides the rating system may be seen.

last checked by Clearinghouse: May 27, 1997

Overall Rating: (rated 7/97)
Resource Description: 5
Resource Evaluation: 4
Guide Design: 5
Organization Schemes: 5
Guide Meta-information: 5

This gives an excellent set of characteristics to frame how to look at that particular site and what to expect from it. In this case, it is interesting that the weak link is the resource evaluation of what the site points to. The site gives a great view of the universe of education available via the Internet. However, its annotations about the resources it points to are no more than one-liners. I will not have unwarranted expectations about the evaluations but will expect the site to have an excellent organizational structure. The biggest value of the ratings is in leading you to explore the strengths of a work.
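As a rough illustration of reading such a rating, the sub-scores quoted above can be put in a small table and the weakest and strongest criteria picked out. This is only a sketch: the scores are the ones shown above, but the function names are hypothetical, and nothing here reflects how the Clearinghouse itself arrived at its overall rating.

```python
# Sub-scores as quoted in the rating above (rated 7/97).
ratings = {
    "Resource Description": 5,
    "Resource Evaluation": 4,
    "Guide Design": 5,
    "Organization Schemes": 5,
    "Guide Meta-information": 5,
}


def weakest(scores: dict) -> tuple:
    """Return the (criterion, score) pair with the lowest score."""
    return min(scores.items(), key=lambda item: item[1])


def strongest(scores: dict) -> list:
    """Return every criterion that shares the highest score."""
    top = max(scores.values())
    return [name for name, score in scores.items() if score == top]


if __name__ == "__main__":
    name, score = weakest(ratings)
    print(f"Weak link: {name} ({score})")
    print("Strengths:", ", ".join(strongest(ratings)))
```

Read this way, the rating tells you where to lower your expectations (the evaluations) and where to rely on the guide (its description and organization).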

Gale’s Cyberhound Guide — an early casualty

Gale has been in the directory service business for a long time, as its many library customers will attest. It looked to leverage its indexing skills to help those looking for information on the Internet. However, its web-accessible endeavor was shortlived, as it has pulled the plug on the Cyberhound, formerly at http://www.cyberhound.com/, and will just be providing print reviews.

“Searching for the best sites on the web, 24 hours a day, 365 days a year has Cyberhound completely fried. (No wonder you never catch him without his shades.) He’s retiring from the Internet spotlight to pursue his writing career. From now on, please access Cyberhound reviews in one of his quality softcover reference volumes.”

Given the ability to update information on the web (if done), I certainly could not expect a print publication to be timely and that is a major requirement of an Internet evaluation tool.

Internet Tools of the Profession, 2nd edition, 1997 by Tillman & Ladner

This book served its purpose and went through two editions. Until 2003, the 2nd edition of this resource guide had a web site where URLs of the reviewed titles could be updated, and chapter authors could add new sites, as needed.

Specialized Guides

A good resource for identifying the best of these is to use the searching feature of the Clearinghouse: http://www.clearinghouse.net website described above.

  • An early specialized guide on rating business schools was called the Marr-Kirkwood Guide to Business School Webs

I have particularly enjoyed the development of this specialized evaluation site, which rated business school web sites for several years. This site uses a table to display the criteria by which the business schools’ sites are evaluated, so that not only is it clear whether or not they have met a particular criterion, but you can “click” on that category and see its display at the specific site.

The table formatting is particularly effective as a way to see the comparison between the business schools’ web sites.

Directories

Yahoo: http://www.yahoo.com

  • Yahoo also started as a project of its two co-authors to share their well-organized Web bookmarks. Originally their only basis for authority was that they were graduate students at Stanford and Stanford was sponsoring their subject listings. With their success, they moved away from Stanford, hired a staff to help them, and have grown their directory to become a major Internet tool. They even hire cataloging librarians with MLSes, e.g. Anne Callery. Their chief ontologist or Director of Surfing spoke at Computers in Libraries in 1996 and will be speaking at Internet Librarian in November. Yahoo solicits URLs, categorizes them, and adds them to its database at its own discretion. They guarantee neither the quality of the sites nor that they will add all that come their way.
  • See their categories from the Yahoo home page: http://www.yahoo.com.
  • Yahoo has grown its list very quickly using people and technology to assist. Those who submit URLs are forced to select from among existing categories; there is a place to recommend alternate or new categories. Its categories are home grown using a variety of techniques but no different than any library list of subject headings, with its own set of biases developed because of the nature of Yahoo and what they have looked at. For instance, they poll automatically to see if sites are up or available. They may not catch forwarding addresses with this technique.
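To make the polling idea concrete, here is a minimal sketch of what an automatic availability check might look like. It is not Yahoo’s actual method, which I do not know; the names LinkStatus, classify, and poll are hypothetical, and it uses only Python’s standard urllib library.

```python
from dataclasses import dataclass
import urllib.error
import urllib.request


@dataclass
class LinkStatus:
    requested_url: str  # the address listed in the directory
    final_url: str      # where the server actually delivered us
    status: int         # HTTP status code, or 0 if there was no response


def classify(link: LinkStatus) -> str:
    """Label a polled link as 'dead', 'moved', or 'ok'."""
    if link.status == 0 or link.status >= 400:
        return "dead"
    if link.final_url.rstrip("/") != link.requested_url.rstrip("/"):
        return "moved"  # the server redirected us to another address
    return "ok"


def poll(url: str, timeout: float = 10.0) -> LinkStatus:
    """Request only the headers of a page and record the outcome."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows HTTP redirects; geturl() reports the final address.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return LinkStatus(url, response.geturl(), response.status)
    except urllib.error.HTTPError as err:
        return LinkStatus(url, url, err.code)  # e.g. 404 Not Found
    except OSError:
        return LinkStatus(url, url, 0)  # no answer, DNS failure, timeout
```

A check like this notices a site that is down and a server-level redirect, but a page that simply displays a “we have moved” notice still answers normally and would be labeled “ok”. That is the kind of forwarding address an automatic poll can miss, and why a human reviewer is still needed.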

WWW Virtual Libraries Project: http://www.w3.org/hypertext/DataSources/bySubject/Overview.html

  • The W3 Virtual Libraries’ initial approach to subject guides to the Internet purported to be a scholarly one. CERN solicited subject experts to develop annotated lists of sites in their fields, both broadly and narrowly. The problem has become the uneven quality of the guides and even the different approaches which grew out of the creativity of their developers. While there are clues on the pages, some have not been maintained and represent an initial or periodic effort rather than an ongoing one. Others are very up-to-date and complete. As the web has exploded, keeping up with these types of subject guides has become much more complex and difficult.
  • Of particular interest is the WWW Virtual Library disclaimer: