Hello, my name is Nigel Stanger and I am a lecturer in the Department of Information Science at the University of Otago. Since I am not sure who best to contact at your institution regarding my query, I am sending it to the administrative address for your institutional repository (IR). I would be much obliged if you could forward this message on to the appropriate person if necessary.

I am writing to ask for your help with a research project that I am leading. We are investigating how effective IRs are at exposing their content to the wider world via various Internet search engines (enhancing "discoverability", if you like). To that end, we have three main goals (detailed below), and we would be most grateful if you could assist with any of them.


Goal 1: Compare download rates, growth rates and other statistics across a selection of IRs.

We have already undertaken some preliminary work for this goal. In particular, if your site runs EPrints and uses the Tasmania Statistics package, we may already have harvested some of your publicly visible statistics for use in high-level comparisons (an example may be found in [1]), and we thank you for making your data publicly accessible in this manner. However, such data are fairly coarse and we would like to use more detailed data where possible.

To that end, would you be willing to share with us the web server logs from your IR (the preferred option), or, if you use the Tasmania Statistics package, the contents of your statistics database? We would, of course, keep any data that you send us in the strictest confidence. One option would be to anonymise the log files by removing usernames, etc., although we would prefer that the request IP addresses remain intact so that we can analyse the distribution of countries from which downloads have occurred. A suitable confidentiality agreement could be another option.

The results from our analyses will be made available to participating institutions at the end of the study. We plan to submit a journal article detailing the results of this analysis, and would welcome any of the contributors as co-authors. This is partly to thank them for their valuable assistance, and partly to contribute to Goal 3 (see below).


Goal 2: Run an experiment in which we identify a collection of different documents across a selection of IRs, construct suitable search queries for those articles (including key words and phrases from the title, abstract and full text), then submit those queries to several search engines and compare the results.

We have already completed some work towards this goal and produced some interesting preliminary results. However, the results of this experiment are somewhat limited in that it is difficult to compare the results of searching for a document in one IR with the results of searching for another, completely different, document in a different IR. The results should be generalisable within a single IR, but not across multiple IRs. It would thus be more interesting to be able to search for the *same* paper across multiple IRs. Hence:


Goal 3: Repeat the experiment from Goal 2, but rather than a collection of different documents, we want to test the discoverability of the *same* document when stored in several different IRs.

This is where the paper produced from Goal 1 comes into play. Since it is co-authored by people at several different institutions, it can quite legitimately be archived in each of those institutions' IRs. This then provides the opportunity to compare search results for the same item across multiple IRs.

Alternatively, we could convince a collection of IRs to deposit a document for us, but this seems somewhat artificial, and such a request could quite reasonably be turned down. Producing an actual article with co-authors from several institutions provides a natural mechanism for injecting the article into all relevant IRs.



[1] Stanger, N. & McGregor, G. (2007) "EPrints makes its mark". OCLC Systems & Services: International Digital Library Perspectives, 23(2):133-141.