Cautions About Using Google Books

From an essay in the Chronicle of Higher Education. (HT: John Bales)



3 comments

  1. Oh the horror: users of Google Book Search will actually have to look at the pages that contain the publication data instead of having Google do that bit of research for them. The article in question doesn’t fault Google for omitting those pages from the online copies of the books; it complains only that the associated metadata is bad. So now we have scholars complaining that Google hasn’t done all their work for them, only most of it. I’m suspicious of scholars who whine.

    The article complains about What Maisie Knew. The first search result has 1897 in the metadata, while the title page says M DCCC XCVIII, which is 1898. So Google’s metadata is wrong, but it’s easy enough to see what is right by simply looking at the title page. What also seems strange is that the date of 1848 is not returned on any of the first 15 pages of results when searching for “What Maisie Knew” in Google Book Search. Even after adding “Henry James” to the search criteria, 1848 doesn’t appear in the first 5 pages of search results. So while there may be an edition in Google Book Search with 1848 as the publication date in the metadata, one must be diligent to find it.

    Then, if one considers how OCR works, it’s easy to understand how this could happen: if 1898 in Indo-Arabic numerals were scanned and processed via OCR, then depending on the typeface and any extraneous marks on the page, the 9 could easily be read as a “4”, rendering the date as 1848.

    It leads me to wonder whether, in physical libraries, scholars are actually pulling the books off the shelves or just recording the info as it appears in what used to be the “card catalogue” (now online). So now, instead of having to walk to the stacks and pull the book in question off the shelf, the scholar simply has to browse to the first page of the Google online book and see the actual scanned publication page, and he still whines. So much for the idea of painstaking work.

    So, stop the whining, and just turn to the first couple of pages and examine the scanned pages directly when doing research via Google book search.

    The world could really come to an end, and someone could start a wiki of corrected metadata for Google book search.
