Some comments on Monday's post, as well as some correspondence, have made clear that my understanding of Google Scholar was outdated. For example, as one commenter noted, Google Scholar appears to pick up citations in journal articles behind paywalls, and since journal articles are overwhelmingly online these days (even if behind paywalls), that means Google Scholar has a huge database from which to draw. In addition, starting in 2005, Google began digitizing the entire University of Michigan library collection, one of the larger university collections in the United States, meaning that almost all that material is now part of the database. Of course, there are differential citation rates, as I noted, but as Peter Carruthers (Maryland) wrote to me:
I believe your closing remark about Google Scholar is partly off-base.
It does pick up only citations in publications accessible online. But this now includes all journal articles from the last twenty years or more, as well as a great many books (not only e-books released by the original publishers, but also books that Google itself has been placing online).
I think the main factor explaining different Google scholar citation scores among sub-disciplines of philosophy is not whether people in those sub-disciplines do or do not place their own work online, but rather the citation practices common in those sub-disciplines. In philosophy of psychology and other sciences, for example, it is common to cite other people's work frequently -- much more frequently than a normal paper in metaphysics, say, or ethics.
Some other readers wrote me making similar points, and I think they are correct.
Finally, another reader gave me a clear explanation of the H-Index that Google Scholar calculates, which I had not understood:
The h-index is the largest number n such that your nth most cited publication is cited at least n times. It strongly correlates with a well-published *and* influential academic career (as opposed to one or two articles or books that hit it big). Obviously much older academics have higher h-indices. It's very hard to move from an h-index of n to an h-index of n+1, and there are very few philosophers with an h-index of 20 or over (you would need at least 20 publications, and each would have to be cited at least 20 times).
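For readers who want that definition pinned down, here is a minimal Python sketch of the calculation. The function name h_index and the citation counts are made up for illustration; this is not Google Scholar's own code, just the definition as the reader describes it.

```python
def h_index(citations):
    """Return the largest h such that at least h publications
    have h or more citations each."""
    # Rank publications from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The rank-th most cited work still has at least `rank` citations.
            h = rank
        else:
            # Every later work has fewer citations and a higher rank, so stop.
            break
    return h


# Hypothetical citation counts for seven publications.
example_counts = [50, 30, 22, 8, 5, 3, 1]
print(h_index(example_counts))
```

In this made-up example, the fifth most cited work has 5 citations but the sixth has only 3, so the index stops at 5. Moving to 6 would require six works with at least 6 citations each, which is why each step up is harder than the last.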
Indeed, one finds that distinguished philosophers with long careers of 40+ years do have very high H-Index numbers (e.g., Dennett with 62, and Block with 32), while leading younger philosophers also have high H-Indices but not as high (e.g., DeRose with 20, and Stanley with 20). Someone relatively young, like Chalmers, with a 41 H-Index is quite unusual, but that certainly correlates well with the substantial impact of his work.
UPDATE: A reader points out that Dennett's second most-cited reference is actually a review of a book, and 99% of the citations are to the book, not the review (Google Scholar obviously can't discriminate, since the title of the book is in the title of the review). Even if that were removed, it wouldn't affect, I suspect, Dennett's H-Index. But apparently one can remove content from the Google Scholar results.
ANOTHER: A reader points out another interesting indicator, the i10 index, especially for the last five years: this shows how many of an author's papers have been cited at least ten times in the last five years. A quick survey turns up some big and striking differences here.
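The i10 index is simpler to compute than the h-index. A minimal Python sketch, following the reader's description above: the function name i10_index and the sample numbers are hypothetical, and for the five-year version I am assuming one feeds in only the citations each paper received during that window.

```python
def i10_index(citations, threshold=10):
    """Count the publications with at least `threshold` citations."""
    return sum(1 for count in citations if count >= threshold)


# Hypothetical per-paper citation counts, restricted to the last five years.
recent_counts = [40, 12, 10, 9, 3, 0]
print(i10_index(recent_counts))
```

Here three of the six made-up papers reach ten citations, so the index is 3. Unlike the h-index, the bar stays fixed at ten no matter how many papers clear it.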
AND ANOTHER: Mohan Matthen (Toronto) writes with more observations about interpreting Google Scholar data:
I just wanted to say a word about the h-index. Each step on this index is a bar that is higher than the previous. So think of Dave Chalmers. His h-index is 41, three times mine at 14. But this means that he has 41 works that have been cited at least 41 times! How many works do I have that have been cited that often? Only 3! So he is by that way of looking at things, nearly 14 times better than me.
You might think that Chalmers gets a lot of attention for the "hard problem", ergo a lot of citations. And this would be true. But this does not account for the breadth of his cited works, which is what the h-index measures. For an illuminating contrast, think of Gettier. The Gettier problem is, of course, extremely well-known. Fifty years of epistemology rotated around it. It is probably better known, at least within philosophy, than Chalmers' hard problem. But what is Gettier's h-index? 1. He has just one work that is cited at least once. In other words, his h-number is the same as that of a young scholar who has just received his first citation.
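The contrast between total citations and breadth can be made concrete with a short Python sketch. Both profiles below are invented for illustration (they are not anyone's actual Google Scholar data), and h_index is a hypothetical helper implementing the definition quoted earlier.

```python
def h_index(citations):
    """Largest h such that at least h publications have h or more citations."""
    ranked = sorted(citations, reverse=True)
    # With counts in descending order, the test count >= rank holds for
    # exactly the first h ranks, so counting the passes gives the h-index.
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


# One enormously cited paper versus ten moderately cited ones (made-up numbers).
concentrated = [3000]
broad = [45, 40, 38, 30, 25, 22, 20, 18, 15, 12]

for name, profile in [("concentrated", concentrated), ("broad", broad)]:
    print(name, "total citations:", sum(profile), "h-index:", h_index(profile))
```

The concentrated profile has more than ten times the total citations of the broad one, yet its h-index is 1 against 10, which is the sense in which the index rewards breadth of cited work over a single famous piece.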