diff --git a/OARiNZ/DIY/DIY_spec.tex b/OARiNZ/DIY/DIY_spec.tex index 6083314..8a588c3 100755 --- a/OARiNZ/DIY/DIY_spec.tex +++ b/OARiNZ/DIY/DIY_spec.tex @@ -90,45 +90,56 @@ It therefore needs to be considered whether the standalone repository installer for existing systems should use the native package management software (e.g., Red Hat's \texttt{rpm} or Debian's \texttt{dpkg}), or independent installer software. If the native route is taken, the installer will need to detect the operating system version and then look for appropriate package management tools, which of course makes implementation more complex. The non-native route will lead to a simpler implementation, but would lose the significant advantage of having packages managed by the operating system, which is particularly useful for dependency management and upgrades. The native option is therefore preferred. -Another consideration is how to handle pre-existing EPrints installations, whether they be installed manually or by our installer. For ongoing sustainability, the installer should be able to install in a way that enables future version installers, thus enabling future upgrades to the EPrints software in a reasonably transparent manner. Things become a bit murky when installing over the top of a pre-existing manual EPrints installation, however. EPrints does use a standard directory and file structure, so as long as the installation has not been radically restructured, it should in theory be feasible to install over the top. +Another consideration is how to handle pre-existing EPrints installations, whether they be installed manually or by the DIY installer. For ongoing sustainability, the installer should be able to install in a way that enables future version installers, thus enabling future upgrades to the EPrints software in a reasonably transparent manner. Things become a bit murky when installing over the top of a pre-existing manual EPrints installation, however. 
EPrints does use a standard directory and file structure, so as long as the installation has not been radically restructured, it should in theory be feasible to install over the top (it could make sense to limit any such capability to a particular range of EPrints versions, however). A much safer option in general, however, would be for the installer to create a new installation alongside the existing one, then perhaps offer to copy across any customised files. This would give the repository administrator the opportunity to thoroughly test the new installation before going live, which would probably just be a matter of swapping the ``new'' and ``old'' installation directories. \subsection{Repository installation and configuration interface} -The kind of interface to present to the person performing the repository installation and configuration process also needs to be considered. Possible options include: +The kind of interface to present to the person performing the repository installation and configuration process also needs to be considered. Since the installation and configuration tools will be separate, it probably also makes sense to present separate interfaces for each phase (this also fits with the typical ``install then configure'' model that occurs with most software). Note that this discussion applies only to the standalone repository installer and configuration tools, not the bare metal operating system installer. + +There are three obvious options for the installation step: \begin{description} - \item[Use operating system-provided installer] The native installer program supplied by the operating system could be used (if such exists), such as the Mac OS X installer application. 
While this would provide an installation experience that is consistent with the user's interface expectations, this would almost certainly require the development of separate installers for each operating system platform, with consequent increase in development and maintenance complexity. It is also unclear whether such tools would also be able to effectively implement the configuration step, and they may or may not be able to integrate with any native OS package management tools (this is certainly not the case for the Mac OS X installer, for example). + \item[Operating system-provided installer] The native installer program supplied by the operating system could be used (if such exists), such as the Mac OS X installer application. While this would provide an installation experience that is consistent with the user's interface expectations, this would almost certainly require the development of separate installers for each operating system platform, with consequent increase in development and maintenance complexity. It is also unclear whether such tools would also be able to integrate with any native OS package management tools. - \item[Cross-platform GUI-based installer and configurator] There are many cross-platform installer tools available that could be used to build an installation tool. Many of these tools are written in Java, which could enable the installation user interface to look reasonably ``native'' for each platform. Non Java-based tools may impose a particular look and feel which could be visually jarring on different platforms. As with the native installer option, it is also currently unclear whether any of these tools could provide a GUI for the configuration step, and they may or may not be able to integrate with the native operating system package management tools. + \item[Cross-platform GUI installer] There are many cross-platform installer tools available that could be used to build the repository installer. 
Many of these tools are written in Java, which could enable the installation user interface to look reasonably ``native'' for each platform. Non Java-based tools may impose a particular look and feel which could be visually jarring on different platforms. As with the native installer option, it is also currently unclear whether any of these tools are able to integrate with the native operating system package management tools. - \item[Web-based installer and configurator] A web interface could be used to manage the installation and configuration process. This would require an active web server with some sort of back-end scripting support, so an embedded web server may be necessary for the initial installation step. There is also the issue of gaining administrator level access in order to install and configure many of the components. This is not insurmountable, however, as web-based system administration tools like Webmin can do this. The big advantage of using a web browser is that it should work on almost any platform if web standards are adhered to, and it will provide a reasonably ``native'' user interface experience in all cases. - - \item[Shell-based installer with text interface] This is the lowest common denominator for all Unix-based systems. Almost any Unix-based system will have some variant of C-shell available, or at least something compatible. The interface will not be very ``pretty'', but will be relatively simple to implement, and can handle both the installation and configuration steps without any difficulty, including prompting for administrator-level access. If implemented in a modular fashion, the installer/configurator should be readily portable to other Unix-based operating systems. Furthermore, a shell-based configurator could even act as a back-end application layer behind a web-based front end, solving two problems at once. + \item[Shell-based installer] This is the lowest common denominator for all Unix-based systems. 
Almost any Unix-based system will have some variant of the Bourne shell (\texttt{sh}) available, or at least something compatible. The interface will not be very ``pretty'', but will be relatively simple to implement, and can easily handle issues like integrating with package managers and prompting for administrator-level access. If implemented in a modular fashion, the installer should be readily portable to other Unix-based operating systems. \end{description} +The main issue with the first two options is clearly the ability to interface with native package management tools. Any installer tool that is able to do so would be a suitable candidate, but if no such tool can be found, then a shell-based installer may be the only option. -The web-based option provides the best compromise between a truly ``native'' user interface and the flexibility required to provide a cross-platform solution that can interface with native package management tools, especially when combined with the shell-based option. +There are two obvious options for the configuration step: +\begin{description} + + \item[Web-based interface] A web interface could be used to manage the configuration process. This would require an active web server with some sort of back-end scripting support. There is also the issue of gaining administrator-level access in order to install and configure many of the components. This is not insurmountable, however, as web-based system administration tools like Webmin can already do this. The big advantage of using a web browser is that it should work on almost any platform if web standards are adhered to, and it will provide a reasonably ``native'' user interface experience in all cases. + + \item[Shell-based interface] As described above. If implemented in a modular fashion, the configurator should be readily portable to other Unix-based operating systems. 
Furthermore, a shell-based configurator could even act as a back-end application layer behind a web-based front end, solving two problems at once. + +\end{description} +The web-based option provides a more consistent cross-platform user experience with the flexibility required to provide a cross-platform solution that can interface with native package management tools (especially when combined with the shell-based option). + +In all cases, consideration should be given to alternate language interfaces (M\={a}ori in particular). Regardless of the interface method used, users should be able to easily select their preferred language. Some installer tools provide this capability already, and the web-based configuration interface should be designed in such a way as to support language templates. \subsection{Distribution media} While the discussion so far has been about distribution on CD-ROMs, there is no particular reason to limit the solution to only this medium. For example, the solution could also be made available in DVD form and as downloadable disk images. This will provide repository implementers with a choice of installation media to suit the vagaries of their particular installation environment. -Furthermore, it is likely that the CD-ROM version would actually comprise more than just a single CD-ROM. A bare metal install would not only need the operating system files, but also pre-compiled versions of all the prerequisite software in a package format appropriate for that operating system. Similarly, an existing system install would need to include duplicate copies of all of the prerequisite software in appropriate formats for the various supported package management tools. This could easily run to at least two CD-ROMs, but would definitely fit onto a single DVD. +Furthermore, it is likely that the CD-ROM version would actually comprise more than just a single CD-ROM. 
A bare metal install would not only need the operating system files, but also pre-compiled versions of all the EPrints prerequisite software in a package format appropriate for that operating system. An existing system install could reasonably assume a pre-existing functional LAMP installation, but would still need to include copies of other EPrints prerequisites such as libraries, Perl modules, etc., in appropriate formats for the various supported package management tools. Combined, this could easily run to at least two CD-ROMs, but would definitely fit onto a single DVD. It is also recommended that there should be separate disks for the bare metal install and the existing system install options, for the following reasons: \begin{itemize} \item People with existing systems would not want to download an unnecessary operating system distribution in order to get the just repository software. - \item The bare metal installer would only need the base operating system installer and the repository configurator, as the repository software installation will be incorporated into the base operating system installation process. + \item The bare metal installer would minimally need only the base operating system installer and the repository configurator, as the repository software installation could be incorporated into the base operating system installation process. \item Keeping the two separate simplifies the installation instructions. If the disks were combined the instructions might read something like this: ``If you want to install a complete operating system and repository from scratch, boot from this CD and follow the instructions. If you want to install the repository on an existing system, insert the CD and run XXX.'' This is long-winded and potentially confusing. 
- With separate disks, the instructions would read more like this: ``To install the operating system and repository software, boot from this CD and follow the instructions'' (bare metal install disk), and ``To install the repository software, insert the CD and run XXX'' (existing system install disk). + With separate disks, the instructions could read more like this: ``To install the operating system and repository software, boot from this CD and follow the instructions'' (bare metal install disk), and ``To install the repository software, insert the CD and run XXX'' (existing system install disk). \item A combined installer would probably not fit on one CD-ROM, whereas a separate CD-ROM for each installer might be feasible. @@ -136,16 +147,24 @@ \subsection{Items to be configured} +\label{sec-configure} The basic repository configuration includes things like its internal identifier, domain name, HTTP port number and so on. All of these items are required as part of the base configuration and will need to be included in the configurator. Configuration of the Tasmania EPrints statistics software would also be included here. -In addition to these compulsory items, there are also numerous optional aspects of EPrints itself that can be configured, such as enabling the editorial buffer, required document formats, etc. These will be included as optional items within the configuration process, accessed via an ``advanced configuration'' page. The list of advanced configuration items should be easily extensible, probably via some form of XML specification, so as to cater for future developments. (This mechanism could also be used to specify compulsory configuration items.) +In addition to these compulsory items, there are also numerous optional aspects of EPrints itself that can be configured, such as enabling the editorial buffer, required document formats, etc. 
These will be included as optional items within the configuration process, accessed via an ``advanced configuration'' page. The list of advanced configuration items should be easily extensible, probably via some form of XML specification, so as to cater for future developments. (The same mechanism could also be used to specify compulsory configuration items.) -One optional configuration item of particular relevance to the OARiNZ project is configuration of the EPrints OAI-PMH interface. While it is recommended that this remain an optional configuration, an unconfigured OAI-PMH subsystem should be prominently highlighted within the configurator interface, preferably on the main page. This gives repository implementers the option to forgo initial configuration of OAI-PMH, while gently encouraging them to eventually do so. +One optional configuration item of particular relevance to the OARiNZ project is configuration of the EPrints OAI-PMH interface. While it is recommended that this remain an optional configuration (as some thought is required to set it up properly), an unconfigured OAI-PMH subsystem should be prominently highlighted within the configurator interface, preferably on the main page. This gives repository implementers the option to forgo initial configuration of OAI-PMH, while gently encouraging them to eventually do so. On this note, there is no reason why the configurator should be limited to once-only use when the repository software is first installed. Rather, it should be installed alongside the repository software and used as a general management tool for creating and configuring repositories on that server. The configurator should keep an internal record of the configuration settings for each repository that it creates, which will make it easier to re-configure repositories at any time. The configurator should probably also check the saved configuration against the actual configuration files when opened, in case someone manually edits them. 
-The configurator will not assist with the process of customising the look and feel of the repository web pages, simply because there are too many possible permutations of how to modify the look and feel. The configurator could, however, provide information on which files need to be changed in order to achieve this. +Another consideration is whether the configurator should be able to configure pre-existing manual EPrints installations. We have excluded this from the solution on the grounds that it would introduce considerable complexity. For example, the configurator would need to be able to detect the version of EPrints that was installed and keep a database of which configuration items apply to which version. Additional problems would arise if the pre-existing EPrints was installed in a non-standard manner. We may consider this capability for a future version of the configurator. + +The configurator will not assist with the process of customising the look and feel of the repository web pages, simply because there are too many possible permutations of how to modify the look and feel. The configurator could, however, provide information on which files need to be changed in order to achieve this. This information would also be included in the OARiNZ knowledge base wiki. + + +\subsection{Other items} + +The repository installer will include the M\={a}ori and Pacific Island language packs for EPrints that were developed at Wintec. No special handling is required for these; they will simply be included as standard components in the EPrints installation. 
\subsection{Summary of design recommendations} @@ -162,7 +181,7 @@ \begin{itemize} - \item Unix-based operating systems in general + \item Unix-based operating systems that have functional Apache, MySQL and Perl/PHP components already installed \item Ubuntu Linux (server distribution, x86 architecture) for the bare metal install option @@ -186,9 +205,11 @@ \item Shell-based option (ideally usable as a back-end CGI script), as the ultimate fallback - \item Web-based installation interface (if feasible) + \item Cross-platform GUI installation interface (if feasible) \item Web-based configuration interface + + \item Alternate language options \end{itemize} @@ -202,9 +223,9 @@ \item Downloadable disk images in standard formats - \item At least two disks for bare metal installs: base operating system + repository software (including configurator) + \item At least two disks for bare metal installs: base operating system (disk 1) + repository software (disk 2, including configurator) - \item One disk (or set of disks) for existing system installs: repository software only (including configurator) + \item One disk (or set of disks) for existing system installs: repository software only (including installer and configurator) \end{itemize} @@ -222,6 +243,14 @@ \end{itemize} +\subsubsection*{Other items} + +\begin{itemize} + + \item M\={a}ori and Pacific Island language packs for EPrints to be included + +\end{itemize} + \section{Typical usage scenarios} @@ -236,7 +265,7 @@ \end{center} \end{figure} -In this scenario, shown in Figure~\ref{fig-bare-metal}, a repository implementer wishes to bootstrap a complete repository installation on new hardware. They boot from the repository live CD (\ding{'300}), which installs the Ubuntu operating system along with all the required packages for EPrints (\ding{'301}). The latter will probably also include the repository configurator and configuration items list, as implied by the dashed arrows at bottom right. 
After the base installation completes (a reboot may be required), the operating system (\ding{'302}) and repository configurators (\ding{'303}) are executed in sequence. The repository configuration is saved for future reference. +In this scenario, shown in Figure~\ref{fig-bare-metal}, a repository implementer wishes to bootstrap a complete repository installation on new hardware (this includes virtualisation environments such as VMware or Virtual PC). They boot from the repository live CD (\ding{'300}), which installs the Ubuntu operating system along with all the required packages for EPrints (\ding{'301}). The latter will probably also include the repository configurator and configuration items list, as implied by the dashed arrows at bottom right. After the base installation completes (a reboot may be required), the operating system (\ding{'302}) and repository configurators (\ding{'303}) are executed in sequence. The repository configuration is saved for future reference. \subsection{Installation on existing system} @@ -249,9 +278,9 @@ \end{center} \end{figure} -In this scenario, shown in Figure~\ref{fig-existing}, a repository implementer wishes to install a repository on an existing server, which already has an operating system and associated software installed. They insert the installation CD and launch the installer bootstrap application (\ding{'300}). The bootstrap application launches a CGI-enabled instance of Apache from the CD (\ding{'301}), then opens the repository installer application in a web browser (\ding{'302}). The installer installs all the necessary packages to support EPrints (those that are not installed already), including the repository configurator and configuration items list, as implied by the dashed arrows at bottom right. After the installation completes, the repository configurator is executed (\ding{'303}). The repository configuration is saved for future reference. 
+In this scenario, shown in Figure~\ref{fig-existing}, a repository implementer wishes to install a repository on an existing server, which already has an operating system and associated software installed (including the various LAMP components). They insert the installation CD and launch the repository installer application (\ding{'300}). The installer installs all the necessary packages to support EPrints (those that are not installed already), including the repository configurator and configuration items list, as implied by the dashed arrows at bottom right. After the installation completes, the installer opens the repository configurator in a web browser (\ding{'301}). The repository configuration is saved for future reference. -Note the separation of the installer into a web user interface and back-end shell for executing installation tasks. A similar architecture will be used for the repository configurator, but this is not shown here. +Note the separation of the repository configurator into a web user interface and back-end shell for executing installation tasks. \subsection{Reconfiguring an existing system} @@ -264,9 +293,13 @@ \end{center} \end{figure} -In this scenario, shown in Figure~\ref{fig-reconfigure}, a repository administrator wishes to reconfigure their existing installation, for example, to create new repository or to change the settings of an existing repository. They launch the repository configurator that was installed on the server during the original installation process (\ding{'300}). This reads the existing repository configuration (\ding{'301}) and the configuration items list (\ding{'302}) and uses these to initialise the configurator. When complete, the new configuration is saved for future reference. 
+In this scenario, shown in Figure~\ref{fig-reconfigure}, a repository administrator wishes to reconfigure their installation\footnote{As noted in Section~\ref{sec-configure}, this only applies to EPrints installations that were installed by the DIY installer, not to manual EPrints installations.}, for example, to create a new repository or to change the settings of an existing repository. They launch the repository configurator that was installed on the server during the original installation process (\ding{'300}). This reads the existing repository configuration (\ding{'301}) and the configuration items list (\ding{'302}) and uses these to initialise the configurator. When complete, the new configuration is saved for future reference. +\subsection{Updating to a new version} + +\ldots{}blah blah\ldots + \section{Implementation plan}