Dangers on the Digital Frontier

At first, Roy Rosenzweig thought he had unearthed a treasure. While clicking through the Internet five years ago, the George Mason University history professor found a letter posted on a Web site that was purportedly written by future president Martin Van Buren in 1829.

In the letter, Mr. Van Buren -- whom historians recognize as a champion of free markets -- appeared to implore then-President Andrew Jackson to shift policies and start regulating "a new form of transportation known as 'railroads.'"

For a historian, such a letter was like gold. It seemed to reveal a whole new side of a political figure whom academics thought they had pegged long ago. An excited Mr. Rosenzweig spent days huddled with colleagues trying to further decipher the import of his find.

After more study, however, Mr. Rosenzweig realized that the letter could not be genuine. Subtle but incriminating errors, such as the word "unemployment," a term not used until much later in the 19th century, gave the ruse away. "People who know history" could tell the letter was a fake, he says.

Mr. Rosenzweig's discovery illustrates just one of the thorny issues presented to scholars as they ponder the vast new world of resources available on the Internet. Their options multiply daily, as Internet titans, libraries and educational institutions team up to convert new troves of printed information into digitized, easily searchable form online.

Scan-Do

For instance, there's Google Book Search, a project of Google Inc. that is scanning and posting online tens of millions of volumes from the libraries of Oxford, Stanford and Harvard universities, the University of Michigan and the New York Public Library. Similarly, Amazon.com Inc. has its "Search Inside" program, which allows anyone to search and read millions of pages from books on its site. Yahoo Inc. and Microsoft Corp., meanwhile, are part of the Open Content Alliance, a project scanning material from sources including the university libraries of California and Toronto and the British national archives.

Scanned books, though, represent only part of the online materials that researchers are finding so valuable. While scholars once had to troll through stacks of academic journals, now they can search through huge numbers of publications instantly. General tools such as Google Scholar, an index of published academic work, make this possible, as do more specialized sites such as arXiv.org, a repository of physics papers maintained by Cornell University in Ithaca, N.Y.

More exciting is the growing amount of primary sources such as journals, letters, photographs and other original documents. Mr. Rosenzweig says his ability to sift through 19th-century newspapers digitized and indexed at ProQuest.com has shaved months from the time it used to take him to complete research projects. ProQuest Co., based in Ann Arbor, Mich., assembles databases of newspapers, periodicals and other materials for researchers, libraries and schools.

Historical material heading onto the Web isn't all documents and images, either. Last November, 5,000 digitized wax-cylinder recordings dating back to 1895 were posted online by the Cylinder Preservation and Digitization Project at the University of California at Santa Barbara. Among the recordings: Tin Pan Alley music, vaudeville performances and advertisements from that time.

Seldom Heard

Rick Altman, a professor of cinema and comparative literature at the University of Iowa, says that the digitized cylinders have been a blessing for his research work. He recently downloaded routines by Russell Hunting, a comedian around the turn of the 20th century whose recordings, until now, were nearly inaccessible. Mr. Altman has written extensively about silent-movie-era performers who specialized in making sounds to match the action on the screen -- from chirping birds to foreign accents -- and says that many of these performers modeled their styles after Mr. Hunting's.

"I had to write about this without ever having heard him," Mr. Altman says. "Now I'll have a better sense of what people were looking for."

Fred Turner, an assistant professor of communication at Stanford University, says he made a seminal discovery while browsing through The Sixties Project, a site hosted by the University of Virginia. What he found was a manifesto from a 1960s student group at the University of California at Berkeley that called itself the Free Speech Movement.

In the manifesto, the students wrote that they felt like mindless cogs in a machine, something that gave Mr. Turner the idea for his forthcoming book -- a study of how 1960s protesters and utopians went from criticizing the dehumanizing nature of machines to celebrating the Internet's ability to connect people.

For secondary sources, Mr. Turner favors two nonprofit subscription-based services, JSTOR.org and Project MUSE (muse.jhu.edu). New York-based JSTOR.org began with grants from the Mellon Foundation but now is supported by academic publishers in the humanities and sciences who link their papers to the site in return for a share of the revenue from subscriptions sold to schools and libraries.

Project MUSE posts published papers from a variety of fields and is sponsored by various grants, publishers and academic libraries. Based in Baltimore, Project MUSE is owned and operated by Johns Hopkins University Press in collaboration with the Sheridan Libraries of Johns Hopkins University and 70 participating publishers.

Geoffrey Bowker, who studies communication history as executive director of the Center for Science, Technology and Society at Santa Clara University, says a Web site he has found "enormously useful" is VictorianWeb.org, a database about Britain's Victorian era, where he has found material ranging from poetry to contemporary ads.

The Abyss

Mr. Bowker, however, does see dangers on the digital frontier. One that he mentions: the risk of shrinking horizons.

"If you have a digital catalog with 75% of the books in it in the field, and a card catalog with 100%, people will still choose the digital," Mr. Bowker says. The untapped 25% gets pushed further into the abyss.

"There's material that used to be looked at that's not being looked at," Mr. Bowker says. "Many of us are hoeing the same kind of territory now."

Other scholars disagree. John King, dean of the School of Information at the University of Michigan, says Google Book Search, for example, will serve to broaden knowledge. "Stuff that's been lost from the foreground of understanding will be moved back," he says.

At Northwestern University, Pablo Boczkowski, an associate professor of communication studies, says that an Internet temptation his peers fall prey to is a simple failure to get up out of their chairs. "Faculty don't go to libraries very often anymore," Mr. Boczkowski laments. What they therefore miss out on, he says, is the serendipity of accidental discoveries: riffling through sources in search of one thing, and finding unexpected gems.

It was purely by chance, he says, that he happened on the topic for his first book, titled "Digitizing the News," a study of ways newspapers use technology to develop new products. "I just stumbled upon it," Mr. Boczkowski says, "and couldn't leave it for seven years."

Where Is Relevance?

Mr. Turner, for his part, complains that what search engines do find is listed in order of the most heavily used sites first, not by what is most relevant. This "doesn't work as well in a scholarly setting," he says, adding that hits listed on Google Scholar do not quite match his own understanding of what is important in his field.

Another danger Mr. Bowker sees is the Web's potential harm to the peer-review process. This time-honored ritual, whereby scholars judge one another's work without knowing who the author is, is essential for advancement in academia. Shielding an author's identity prevents rivals from gratuitously trashing their work. But using a search engine, Mr. Bowker says, it is temptingly easy to enter a few key phrases from the work under review and be led directly to the author's Web site.

There is also the potential to be led astray. The bogus Van Buren letter, for example, seems to have first appeared in the early 1980s in a newspaper advertisement for coal interests in the Washington Post. The fraud was quickly discovered and was reported by the Post itself.

But according to Mr. Rosenzweig, at the time that he found the letter posted on a Web site, news about the hoax had not yet made it onto the Web, at least not in any widely accessible form. No harm done, in this particular case. Indeed, the Web makes it possible for "group scrutiny" to provide a kind of self-policing function, he says, though outing fakes takes some time. Meanwhile, Mr. Rosenzweig continues to strongly believe that the burgeoning amounts of research material available online should be welcomed, despite the dangers.

"The irony is that even if not everything is good, it helps the aggregate get better," he says. Today, he adds, if one performs a Google search with the terms Van Buren, Jackson and railroads, the first two results that come up are exposés on the fraud.

Source: The Wall Street Journal, 2/13/06
