Oxford University Computing Services
Installation of the BNC for Network Access
The British National Corpus (World Edition) is supplied on two CD-ROMs in compressed form, with an installation kit which enables it to be set up for single-user access on any machine running a 32-bit Microsoft Windows system. The same CD-ROMs may also be used to install the corpus for networked use on any UNIX system. This document describes briefly how to install the corpus and its access software in a networked UNIX environment.
If you don't plan to use the SARA software, you only need to carry out the first step. If you plan to use the SARA server but to access it via your own client (e.g. via the web using a simple CGI script), you don't need to carry out the last step.
First decide where you will put everything. You will need at least 8 Gb (eight gigabytes) of free disk space. For simplicity, we recommend you create a top-level directory to hold each of the five major components of the system (each of which has its own hierarchy): by default the installation assumes you will call this BNC-world but the name is up to you. Create the top-level directory if necessary and make it current.
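Before unpacking anything, it may be worth confirming that the chosen filesystem really has the space. The following is a minimal sketch, not part of the installation kit: the function names free_kb and enough_space and the variable REQUIRED_KB are illustrative, and the threshold is simply the 8 gigabyte figure given above expressed in kilobytes. Run it from the directory you have just made current.

```shell
#!/bin/sh
# Sketch: check that the current filesystem has room for the corpus.
# REQUIRED_KB is an assumption: the 8 gigabytes quoted above, in kilobytes.
REQUIRED_KB=$((8 * 1024 * 1024))

# Print the free space, in kilobytes, of the filesystem holding directory $1.
free_kb() {
    # df -P selects the POSIX output format; -k reports 1024-byte blocks.
    # The fourth column of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if $1 (available KB) is at least $2 (required KB).
enough_space() {
    [ "$1" -ge "$2" ]
}

avail=$(free_kb .)
if enough_space "$avail" "$REQUIRED_KB"; then
    echo "OK: ${avail} KB free"
else
    echo "Not enough space: ${avail} KB free, ${REQUIRED_KB} KB needed"
fi
```

The script only reports; it does not stop the installation, so treat a "Not enough space" message as a prompt to choose another location.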
Everything needed to run the server is on the CDs in compressed form. Refer to the table below for a list of the directories to be unpacked, the CD on which to find them, and the directory name that should result. For example, if your root directory is called BNC-world, the result of unpacking etc.tar.gz (which is on CD 2) should be a directory called BNC-world/Etc.
Here is an example shell script to do the complete installation. The commands for mounting and dismounting CD-ROMs (in particular) may vary from system to system: what you see below has been tested under Red Hat Linux 6.1.
$ mkdir BNC-world
$ cd BNC-world
# insert CD-1 #
$ mount /dev/cdrom /mnt/cdrom
$ tar -xzf /mnt/cdrom/texts.tar.gz
$ mkdir Index
$ cd Index
$ tar -xzf /mnt/cdrom/index0.tar.gz
$ umount /mnt/cdrom
# remove CD-1 and insert CD-2 #
$ mount /dev/cdrom /mnt/cdrom
$ tar -xzf /mnt/cdrom/index1.tar.gz
$ tar -xzf /mnt/cdrom/index2.tar.gz
$ tar -xzf /mnt/cdrom/index3.tar.gz
$ cd ..
$ tar -xzf /mnt/cdrom/etc.tar.gz
$ tar -xzf /mnt/cdrom/sara.tar.gz
$ cp -r /mnt/cdrom/Doc .
$ cp -r /mnt/cdrom/SGML .
$ umount /mnt/cdrom
$ ls -l
You don't need to unpack anything else, but you may find the documentation in the directories Doc (on CD1) and SGML (on CD2) useful. The other folders on CD1 contain material specific to the single-user Windows licence.
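Once unpacking has finished, a quick check that the expected directories exist can save time later. The sketch below is illustrative only: check_dirs is a hypothetical helper, BNC_ROOT is a hypothetical environment variable naming your top-level directory (defaulting to the current directory), and the list Etc, Index and SARA contains only directory names mentioned in this document, so extend it to match what your installation actually produced.

```shell
#!/bin/sh
# Sketch: report which expected sub-directories are missing under a root.
# Usage: check_dirs ROOT NAME...
check_dirs() {
    root=$1
    shift
    missing=0
    for name in "$@"; do
        if [ -d "$root/$name" ]; then
            echo "found:   $root/$name"
        else
            echo "missing: $root/$name"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}

# BNC_ROOT is a hypothetical variable; the directory list is an assumption
# based on names that appear in this document.
if check_dirs "${BNC_ROOT:-.}" Etc Index SARA; then
    echo "all listed directories are present"
else
    echo "some directories are missing"
fi
```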
sarad -c /home/SARA/myCorpus.prm
or you can default the name, in which case the server will look for a file called corpus.prm in the directory from which it is executed.
$ corpadm -p /home/SARA/myCorpus.prm
or you can default the name, in which case the server will look for a file called corpus.prm in the directory from which it is executed.
At the corpadm> prompt, type add guest and then fill in the remaining prompts: when prompted, give the password guest. This will create the username which the solve program expects to use by default. You may also wish to add other usernames, with rather less transparent passwords.
solve fishcake
If everything is properly installed, you will get a result like the following:
Connected! 1 solutions OK KD7 412 44 8 AJ0 if they haven't got any scampi get an extra fish cake and an extra spring roll
If you get the message
Cannot Connect!
check that the name of the server and the port you are using are correctly given in the source file you compiled. If you get the message
Connected! 0 solutions
check that the index and text files are readable and that there are no unexplained error messages in the testlog.txt file.
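The readability check can be automated. What follows is a sketch under stated assumptions rather than a tool supplied with SARA: unreadable_files is a hypothetical helper, and you must pass it the directories that hold your index and text files, run as the user under which the server runs.

```shell
#!/bin/sh
# Sketch: list regular files under the given directories that the
# current user cannot read.
# Usage: unreadable_files DIR...
unreadable_files() {
    for dir in "$@"; do
        find "$dir" -type f | while IFS= read -r f; do
            # test -r checks read permission for the user running the script.
            [ -r "$f" ] || echo "$f"
        done
    done
}

# When the script is given directory names as arguments, report on them.
# With no arguments nothing is checked.
if [ "$#" -gt 0 ]; then
    bad=$(unreadable_files "$@")
    if [ -z "$bad" ]; then
        echo "all files are readable"
    else
        echo "unreadable files:"
        echo "$bad"
    fi
fi
```

Note that the superuser can read files regardless of their permission bits, so the check is only meaningful when run under the server's own account.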
In the directory SARA, as well as the source code for the server and utilities, you will find a ZIP archive containing Windows executables for the SARA 32-bit Windows client. The file is called sara98.zip. Transfer this file to a Windows machine connected to your server by a TCP/IP network, and then unzip its contents into an appropriate directory (e.g. c:\program files\sara98).
When it starts up, the client will need to be configured to connect to your server. You configure it by entering the IP name or number of the server (e.g. myserver.mydomain.org) and the port on which the server listens for calls (by default 7000). Refer to the client documentation at http://www.natcorp.ox.ac.uk/tools/sara/ for more information on how to use it.
This version of the BNC has been substantially revised from the first (1.0) release. The User Reference Guide in the Doc directory contains a section outlining the nature of the changes made in the text and its encoding.
This version of the SARA software has been substantially enhanced. Note in particular that earlier versions of the server software cannot be used with the new index files. Both server and client software are incompatible with earlier versions.
Please consult the BNC website at http://www.natcorp.ox.ac.uk/ for updated information about the BNC and SARA. For news and notices specifically about SARA, please consult http://www.natcorp.ox.ac.uk/tools/sara/