- \documentclass[a4paper]{article}
-
- \usepackage{mathpple} \usepackage[margin={1in,0.5in}]{geometry}
- \usepackage{graphicx}
-
- \title{School of Business Publications Repository \\
- (DRAFT: not for circulation)}
- \author{Nigel Stanger\thanks{Department of Information Science, email
- \texttt{nstanger@infoscience.otago.ac.nz}.}}
-
- \def\BibTeX{{\rm B\kern-.05em{\sc i\kern-.025em b}\kern-.08em
- T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}
-
- \begin{document}
-
- \maketitle
-
- \section{Executive Summary}
-
- A database-managed repository for storing publications (primarily
- research publications) authored by staff within the School of Business
- is currently in the early stages of development, under the auspices of
- the School's Information Technology Policy Committee. Such a
- repository provides several important benefits, including:
- \begin{itemize}
-
- \item A single, well-managed, flexible repository for storing
- details on publications within the School.
-
- \item Easy publishing of publication details on the web, including
- downloadable copies of papers where appropriate.
-
- \item Elimination (or at least reduction) of duplicated publication
- data in multiple locations, thus enhancing consistency.
-
- \item A searchable database of publications spanning the entire
- School, accessible via the web. The repository will also be
- available to major web search engines, such as Google and Yahoo.
-
- \item The ability for individual departments, research groups or staff
- members to generate web pages of their publications using whatever
- ``look and feel'' they desire.
-
- \item Improved workflow when forwarding publication details to
- Research, Enterprise and International (RE\&I) for inclusion in
- the annual list of University publications, and for the
- Performance-Based Research Fund (PBRF).
-
- \end{itemize}
-
- The basic engine of such a system is currently being implemented.
-
-
- \section{Why would such a system be useful?}
-
- There are several reasons why such a system would be useful. First, it
- provides a single, consistent, flexible way of disseminating publication
- details via the web. Second, it will reduce the amount of duplication of
- publication details that currently exists. Third, it will improve the
- workflow associated with forwarding publication details to Research,
- Enterprise and International.
-
-
- \subsection{Web access}
-
- Consider a person from outside the University wanting to find all
- publications by a particular staff member in a particular department
- within the School. For most departments they will typically find a list
- of publications in chronological order, perhaps subdivided by
- publication type. To find all publications by a particular staff member,
- they will have to scan manually through every publications web page.
- If they are lucky, publication lists may be
- available on individual staff members' web pages, but this is by no
- means certain, and these lists are not usually comprehensive.
-
- It would obviously be more effective to simply enter the name of the
- author of interest into a search field, and quickly retrieve
- only the publications by that author. Doing so effectively requires an
- underlying database and associated software. However, of all the
- departments in the School, only the Department of Marketing has such a
- system in use. The remaining departments use static, manually created
- web pages that cannot easily be searched and are difficult to keep up to
- date. (The author of this document is the coordinator of the Department
- of Information Science Discussion Paper Series, and has first-hand
- experience of the issues associated with this approach.)
-
- The typical state of affairs for most departments is illustrated in
- Figure~\ref{fig:current}. Considering only the left-hand side of the
- diagram for the moment, we see that authors produce publications, which
- are submitted to some publication venue. Details of publications are
- typically forwarded to a ``Publications Person'' within the department,
- who organises placing those details on the department's web site. This
- is usually a manual process, and may only occur once or twice per year.
-
- \begin{figure}[htb]
- \includegraphics[width=\columnwidth,keepaspectratio]{PublicationsCurrent}
- \caption{Typical state of affairs for publications in most departments.}
- \label{fig:current}
- \end{figure}
-
- Contrast this with the situation shown in Figure~\ref{fig:repository}.
- Authors load details of their publications directly into the new
- publications repository. Once these details are verified by the
- ``Publications Person'', the publication immediately becomes visible on
- the web. The whole process is streamlined considerably, and the
- ``Publications Person'' is spared the work of manually updating web
- pages. The web pages generated by the repository will be template-based,
- making it easy to customise web pages for specific purposes, and to
- quickly change the ``look and feel'' of the entire system.
-
- \begin{figure}[htb]
- \includegraphics[width=\columnwidth,keepaspectratio]{PublicationsRepository}
- \caption{The proposed publications repository.}
- \label{fig:repository}
- \end{figure}
-
- Many departments currently provide downloadable versions of papers
- (where copyright allows), and this will obviously also be a feature of
- the proposed repository. With a static web site it can be difficult to
- determine whether a particular document has been downloaded, how many
- times it has been downloaded, and by whom. With a dynamic web site
- driven from the publications repository, it will be easy to track the
- number of downloads for each publication. The system can even ask the
- reader if they would like to enter their details, which will then be
- automatically emailed to the author, enabling them to contact readers of
- their publications and enhancing the possibilities for future
- collaborations.
-
- The repository will also be made visible to the major Internet search
- engines such as Google and Yahoo, which will enhance the visibility of
- the School's research output. It should also be possible to
- automatically ``plug in'' to specialised publication search engines in
- various disciplines (for example, CiteSeer).
-
-
- \subsection{Single point of storage}
-
- Publication details often appear in multiple locations under the current
- regime (for example, in the department's full publication list and on
- the author's personal web page). This can obviously lead to problems if
- some detail of a publication needs to be changed---you might change one
- entry, but miss another, resulting in inconsistencies. The repository
- addresses this by creating a single point of storage for all
- publications within the School. Changing a publication's details in the
- database will change it everywhere that it appears.
-
- It is envisioned that the repository will be a central resource for the
- School, rather than being run on a department-by-department basis. It
- will be run on a central server and be accessible by all. Authors will
- be able to log in to the repository in order to enter their
- publications, and each department will have a designated ``Publications
- Person'' who verifies the details of new publications and makes them
- visible to the outside world (more on this person's responsibilities
- shortly).
-
-
- \subsection{Publications workflow}
-
- Referring again to Figure~\ref{fig:current}, we see that the major flow
- of data relating to publications is from authors to RE\&I. This flow is
- usually mediated by a ``Publications Person'' within a department. This
- person has access to the ResearchMaster database, and ensures that staff
- publications are entered into this database in the correct format, and
- with all required details. This is typically a manual process that might
- take place once or twice a year. The annual University publications list
- is produced directly from the ResearchMaster database.
-
- PBRF has introduced a second parallel database: the Performer database,
- which stores details of staff members' research performance, including
- publication details. These details can be extracted from the existing
- ResearchMaster database, so no further consideration of the Performer
- database is required here.
-
- Now consider Figure~\ref{fig:repository}. Once a new publication has
- been verified by the ``Publications Person'', the details of this
- publication will be immediately available for entry into the
- ResearchMaster database. There are at least four ways that this could
- occur, in roughly descending order of preference:
- \begin{enumerate}
-
- \item The publication details are automatically loaded directly into
- ResearchMaster.
-
- \item RE\&I periodically query the publications repository for
- new publications.
-
- \item At the end of each year, the ``Publications Person'' generates
- a list of new publications in some suitable format, and forwards
- this list to RE\&I for entry into ResearchMaster.
-
- \item At the end of each year, the ``Publications Person'' generates
- a text file of new publications, and copies and pastes the details
- into the ResearchMaster web interface.
-
- \end{enumerate}
- The last option is probably only a slight variation on what happens at
- present (staff email publication details to the ``Publications Person'',
- and these are copied and pasted into the web interface). It is likely
- that more than one of these options will be implemented in the
- publications repository, but technical considerations to do with
- interfacing the two systems could potentially rule out the first option.
-
-
- \subsection{Responsibilities of the ``Publications Person''}
-
- The last thing anyone wants to do is to burden the ``Publications
- Person'' with any more work than they are undertaking at present. The
- publications repository is in fact intended to reduce the amount of work
- these people have to do, by streamlining and semi-automating many of the
- processes that currently exist.
-
- At present, the ``Publications Person'' primarily acts as a combination
- of a publication information collator (ensuring that all required
- details have been collected, and querying authors for any information
- that is missing) and a data entry operator (manually entering these
- details into ResearchMaster, and also any departmental database that
- might exist). Some also manage the dissemination of publication details
- on the web, usually by manually editing web pages. Most also have
- other responsibilities, related or unrelated.
-
- With the publications repository in place, this person's
- responsibilities would normally comprise the following:
- \begin{itemize}
-
- \item Verifying new entries in the repository to ensure that the
- publication is valid and all important details have been included.
-
- \item Making verified publications visible to the outside world
- (this should just be a matter of checking a box on a web form).
-
- \item Possibly transferring data from the repository to
- ResearchMaster (depending on how this link is implemented, as noted
- earlier).
-
- \end{itemize}
- There are two important points to note here. First, the ``Publications
- Person'' does not enter new publications into the repository. Rather,
- this is done by authors directly. Entry of required details (which will
- vary according to the type of publication) will be enforced by the
- repository's web interface. Verification will therefore become more of a
- quality control process than an exercise in data gathering. Second, the
- only thing that the ``Publications Person'' needs to do to make a
- publication visible on the web is to check a box to indicate that the
- publication has been verified. No manual editing of web pages is
- necessary.
-
- The combination of getting authors to directly enter their own
- publications and automated web publishing should reduce the amount of
- work undertaken by the ``Publications Person''. The only aspect of the
- process that might not change (as noted earlier) is the submission of
- publication details to RE\&I.
-
-
- \section{System requirements}
-
- The following are the original requirements as set forth by the School's
- IT Policy Committee in late 2002. They have been lightly edited for
- clarity and consistency, and additional comments have been included in
- [brackets].
- \begin{enumerate}
-
- \item The repository will store electronically various research
- publications produced by staff (and students?) within the School of
- Business.
-
- [Obviously the repository does not have to be restricted to only
- research publications. Also, it will not be possible to store some
- publications in the database because of copyright constraints.]
-
- \item The repository content will be sortable by type (technical
- report or conference paper), author, department (Information
- Science, Marketing) and subject keyword (interesting to see
- inter-disciplinary research).
-
- [Date is another important criterion. Much of this requirement will
- be taken care of by the search feature of the repository. It should
- be possible to search on combinations of criteria (e.g.,
- publications on ``data mining'' by Nigel Stanger published within
- the last three years).]
-
- \item Abstracts should be selectable.
-
- \item The repository should also be able to format a listing as
- required by the University's ``Publications'' document.
-
- [This could be as simple as including an output format selector on
- the search form. Multiple output formats could be supported:
- ResearchMaster, Otago CV, \BibTeX, Refer format (for import into
- EndNote), XML, plain text, etc.]
-
- \item The site should be accessible from every department's home
- page.
-
- [This should just be a matter of including a link on the home page
- that performs a search on ``department = `XXX'\,''. A similar
- principle can be applied to individuals and research groups.]
-
- \item Each time a paper is downloaded, the author(s) will be
- automatically and electronically (email?) notified of the event and
- of the paper downloaded and who downloaded it. This is to allow for
- the author to make contact with the person downloading the paper and
- to possibly develop collaborations with that person.
-
- [An obvious concern here is that authors of popular papers will be
- bombarded with an endless stream of download messages (download
- spam?). Given that there is no automatic way of determining who
- downloaded a paper, these messages would be essentially useless. We
- can solve the spam problem by limiting emails about ``anonymous''
- downloads to a monthly report detailing which of an author's papers
- were downloaded and how many times. We can solve the anonymity
- problem by asking downloaders if they would like to send their
- contact details (at least their name and email address) to the
- author, and presenting them with a form to do so. These details
- could perhaps also be stored in the database for future reference.
-
- The inverse of this feature could also be useful. That is, the
- ability for visitors to place a ``watch'' on particular documents or
- authors, so that they can be automatically notified of updates. This
- would require some sort of registration subsystem, and is not
- currently considered a core requirement.]
-
- \item The system will have the capability for individuals to simply
- upload their papers directly from their desktop, using a process similar
- to that used by Blackboard for uploading documents. [Note that this
- is a standard feature provided by web browsers, and is not peculiar
- to Blackboard.] The system serves as a vehicle for distributing the
- School's research. It is not intended for verification that the
- paper is a published paper. If verification is required for, say,
- end-of-year reporting to RE\&I by the department, a secure field could
- be included in the database that allows an appointed member of staff
- [the ``Publications Person''] to verify that the papers have been
- published, etc.
-
- \item All Tech Reports and Discussion Papers should still go through
- a Department's own reviewing process before being uploaded to the
- site.
-
- [This is really a procedural rather than a technical issue.]
-
- \end{enumerate}
-
- An important point that also needs to be considered is that Marketing
- already have a publications database. Any new system should therefore be
- compatible with the database used by Marketing in order to ease the
- transfer of data between the two systems. Note that this is not meant to
- imply that the repository will necessarily replace Marketing's existing
- database; merely that the two should be compatible so that data can be
- moved in either direction as necessary.
-
- The repository will be run on the School's existing servers and is being
- developed using freely available (open source) software, so no
- additional hardware or software will need to be purchased. The only
- costs that will be incurred are associated with system infrastructure
- development.
-
-
- \section{Summary}
-
- The School of Business IT Policy Committee has set forth the
- requirements for a publications repository for the School, and
- development work is currently under way. The proposed repository will
- streamline several processes associated with management of publication
- details. In particular, it will provide a single point of storage for
- details of all publications within the School. This will enhance the
- consistency of publication details on departmental web sites, and will
- automate the generation of publication web pages for departments,
- research groups and individuals. The repository will be able to produce
- output in multiple formats, and should also improve the workflow for
- submitting publication details to RE\&I.
-
- The basic infrastructure for the repository has been completed, and a
- simple prototype system has been demonstrated to the committee. Work to
- further enhance the prototype is currently progressing.
-
-
- \vspace*{1cm}
- \noindent Nigel Stanger \\
- Project Manager \\
- Department of Information Science
-
-
- \end{document}