diff --git a/OARiNZ/DIY/DIY_spec.tex b/OARiNZ/DIY/DIY_spec.tex index 5a1a6aa..6083314 100755 --- a/OARiNZ/DIY/DIY_spec.tex +++ b/OARiNZ/DIY/DIY_spec.tex @@ -38,20 +38,26 @@ \section{Introduction} -Implementing a digital repository, using a typical open source solution such as GNU EPrints or DSpace, is currently a complex proposition that requires a reasonable level of technical expertise in order to find, download and install all the required software, then separately configure these components appropriately for the target operating system. This process can be simplified, in particular removing the need to manually find, download, install and configure multiple separate components. Instead, a single installer could manage the entire process from start to finish. +Implementing a digital repository, using a typical open source solution such as GNU EPrints or DSpace, is currently a complex proposition that requires a reasonable level of technical expertise in order to find, download and install all the required software, then separately configure these components appropriately for the target operating system. Ongoing maintenance of the repository configuration can also be complex. Both tasks can be simplified, in particular removing the need to manually find, download, install and configure multiple separate components. Instead, separate higher-level installer and configuration tools could manage these tasks. -Objective 7 of the OARiNZ project aims to address this need. This objective aims to produce a freely distributable, easy to install CD-ROM containing either pre-configured or self-configuring open source software for use by institutions looking for entry-level assistance with developing their own shareable digital repository. This document outlines a specification for such a solution. +Objective 7 of the OARiNZ project aims to address this need. 
The stated aim of this objective is to ``produce a freely distributable, easy to install CD-ROM containing pre-configured (or self-configuring) open source software for use by institutions looking for entry-level assistance with developing their own shareable digital repository''\footnote{\url{http://www.oarinz.ac.nz/objectives.php#seven}}. This document outlines a specification for such a solution. -The nature of currently available repository software means that it is unlikely that we can completely eliminate the need for some technical expertise. Several installation and configuration tasks require administrator level access, for example, so the solution cannot be fully automated. Regardless, the solution will enable repository implementers to quickly install and configure a complete digital repository, either from ``bare metal'' on a new server or on an existing system. In addition, the level of required technical expertise and the complexity of the installation and configuration process will be reduced, thus lowering the bar for implementing a digital repository. +The nature of currently available open source repository software makes it unlikely that we can completely eliminate the need for some technical expertise. Most such software targets a LAMP environment (Linux/Unix, Apache, MySQL, Perl/PHP), and several installation and configuration tasks require administrator level access, so the solution cannot be fully automated. Regardless, the solution will enable repository implementers to quickly install and configure a complete digital repository, either from ``bare metal'' on a new server or on an existing system (i.e., one that already has an operating system installed and preferably the required LAMP components). In addition, the level of required technical expertise and the complexity of the installation and configuration process will be reduced, thus lowering the bar for implementing a digital repository. 
-In the spirit of ``lowering the bar'', a key aim is to automate or abstract as much of the repository installation and configuration process as possible, focusing attention instead on only those elements that \emph{require} human intervention. In other words, repository implementers will not be forced to type in arcane commands unless it is absolutely unavoidable, nor will they be forced to read many pages of dense and obscure documentation before they start or be burdened with byzantine installation procedures. A laudable (but perhaps overly optimistic) goal would be to make the installation process as easy as installing software under Mac OS X or Windows. +In the spirit of ``lowering the bar'', a key aim should be to automate or abstract as much of the repository installation and configuration process as possible, focusing attention instead on only those elements that \emph{require} human intervention. In other words, repository implementers will not be forced to type in arcane commands unless it is absolutely unavoidable, nor will they be forced to read many pages of dense and obscure documentation before they start, or be burdened with byzantine installation procedures. A laudable (but perhaps overly optimistic) goal would be to make the installation process as easy as installing software under Mac OS X or Windows. -The following two key deliverables are therefore proposed: +With regard to repository maintenance, the ideal would be to produce a high-level configuration tool that is able to configure a repository (or collection of repositories) without requiring the administrator to manually edit configuration files. This will certainly be feasible with repositories that are installed by the proposed installer tool, and may be feasible, with some restrictions, for manually installed repositories. 
+ +The following key deliverables are therefore proposed: \begin{enumerate} - \item A ``bare metal'' installer for creating completely new repositories on new hardware, that includes both an operating system and all the required repository software. + \item A ``bare metal'' installer for creating completely new repositories on new hardware, that includes an operating system, the required LAMP components and all the required repository software. - \item A standalone tool for installing and configuring an EPrints repository on an existing server. + \item A standalone tool for installing an EPrints repository on an existing server, i.e., a server with an already installed operating system and preferably the required LAMP components. + + \item A pre-packaged EPrints distribution in \texttt{.deb} (and possibly \texttt{.rpm}) format for use by the two previous deliverables. + + \item A standalone tool for configuring an EPrints repository. \end{enumerate} Both of these deliverables would be distributed in the form of a CD-ROM (or equivalent medium) containing all the required software and a ``shell'' for managing the installation and configuration process. Downloadable disk images would also be made available. @@ -69,11 +75,13 @@ \subsection{Operating systems} -EPrints repositories are typically run on Unix-based systems (e.g., Linux, BSD, Mac OS X), and we have experience at Otago with installing EPrints on Debian Linux, FreeBSD, Mac OS X and Ubuntu Linux. Unix-based systems will therefore be our primary target for implementation. Note that the EPrints web site currently states that there are ``no plans for a version to run under Microsoft Windows''. +EPrints repositories are typically run on Unix-based systems (e.g., Linux, BSD, Mac OS X), and we have experience at Otago with installing EPrints on Debian Linux, FreeBSD, Mac OS X and Ubuntu Linux. Unix-based systems will therefore be our primary target for implementation. 
Note that the EPrints web site currently states that there are ``no plans for a version to run under Microsoft Windows''\footnote{\url{http://www.eprints.org/documentation/tech/php/intro.php#what_will_it_run_on}}. -For bare metal installations, a complete operating system distribution will also be required. It is clearly not possible to provide an installation disk for every possible Unix platform, nor for proprietary operating systems such as Mac OS X. The bare metal installer can therefore realistically only support one operating system platform. The easiest way to achieve this is to pick a Unix-based operating system that provides a bootable ``live CD''. +For bare metal installations, a complete operating system distribution will also be required. It is clearly not feasible to provide an installation disk for every possible Unix platform, nor for proprietary operating systems such as Mac OS X. The bare metal installer can therefore realistically only support one operating system platform. The easiest way to achieve this is to pick a Unix-based operating system that provides a bootable ``live CD''. -We have experience at Otago with installing EPrints repositories under Ubuntu Linux\footnote{\url{http://www.ubuntu.com/}}, which provides a live CD feature, so this is an obvious choice. The Ubuntu live CD is also easily customisable, so a custom live CD could be created that included not only the base operating system but also the required packages for installing EPrints and our configurator software. (Note that installation of the repository software would be incorporated into the operating system installation process, so the standalone repository installer would not be required for bare metal installs.) +We have experience at Otago with installing EPrints repositories under Ubuntu Linux\footnote{\url{http://www.ubuntu.com/}}, which provides a live CD feature, so this is an obvious choice. 
The Ubuntu live CD is also easily customisable, so a custom live CD could be created that installed not only the base operating system but also the required packages for installing EPrints and our configurator software. We will limit ourselves to the x86 architecture in order to keep things simpler. + +Installation of the repository software could be incorporated directly into the operating system installation process, implying that the standalone repository installer would not be required for bare metal installs. Alternatively, the standalone installer could be provided on a separate CD. The OS installer could then say something like ``please insert the CD labelled `EPrints Installer'{}'' and simply call the standalone installer once the CD has been inserted. The latter option should be easier to achieve and avoids any potential duplication of effort in both the bare metal and standalone installers. \subsection{Package installation} @@ -82,6 +90,10 @@ It therefore needs to be considered whether the standalone repository installer for existing systems should use the native package management software (e.g., Red Hat's \texttt{rpm} or Debian's \texttt{dpkg}), or independent installer software. If the native route is taken, the installer will need to detect the operating system version and then look for appropriate package management tools, which of course makes implementation more complex. The non-native route will lead to a simpler implementation, but would lose the significant advantage of having packages managed by the operating system, which is particularly useful for dependency management and upgrades. The native option is therefore preferred. +Another consideration is how to handle pre-existing EPrints installations, whether they be installed manually or by our installer. 
For ongoing sustainability, the installer should be able to install in a way that supports installers for future versions, thus enabling upgrades to the EPrints software in a reasonably transparent manner. Things become a bit murky when installing over the top of a pre-existing manual EPrints installation, however. EPrints does use a standard directory and file structure, so as long as the installation has not been radically restructured, it should in theory be feasible to install over the top. + +A much safer option in general, however, would be for the installer to create a new installation alongside the existing one, then perhaps offer to copy across any customised files. This would give the repository administrator the opportunity to thoroughly test the new installation before going live, which would probably just be a matter of swapping the ``new'' and ``old'' installation directories. + \subsection{Repository installation and configuration interface} @@ -152,7 +164,9 @@ \item Unix-based operating systems in general - \item Ubuntu Linux (server distribution) for the bare metal install option + \item Ubuntu Linux (server distribution, x86 architecture) for the bare metal install option + + \item Base OS installer to call standalone repository installer (on separate CD) \end{itemize} @@ -161,6 +175,8 @@ \begin{itemize} \item Use native package management tools provided by the operating system wherever possible + + \item For prior EPrints installations, install alongside rather than overwrite \end{itemize} @@ -186,9 +202,9 @@ \item Downloadable disk images in standard formats - \item One disk (or set of disks) for bare metal installs: base operating system + configurator + \item At least two disks for bare metal installs: base operating system + repository software (including configurator) - \item One disk (or set of disks) for existing system installs: repository installer + configurator + \item One disk (or set of disks) for existing system installs: 
repository software only (including configurator) \end{itemize}