BNC

British National Corpus User Reference Guide

1. Introduction

  Author: edited by Lou Burnard (revised LB) Date: (revised 19-22 Nov 2003)

This manual describes BNC-baby, a four-million-word sample of the 100-million-word British National Corpus. It briefly describes the design of this sample and explains how it is encoded, including the XML document type definition (DTD) used. It also includes a list giving brief bibliographic details for each text in the sample.

The present document is derived from the User Reference Guide provided as a part of the BNC World Edition, first released in October 2000, and available on the web. Definitive information about the design principles, sampling methods, and encoding principles of the BNC should be taken from that manual. Further information about the BNC is also available from its World Wide Web server at http://www.natcorp.ox.ac.uk

The BNC was originally created by an academic-industrial consortium.

Creation of the corpus was funded by the UK Department of Trade and Industry and the Science and Engineering Research Council under grant number IED4/1/2184 (1991-1994), within the DTI/SERC Joint Framework for Information Technology. Additional funding was provided by the British Library and the British Academy.

After the completion of the first edition of the BNC, a phase of tagging improvement was undertaken at Lancaster University with funding from the Engineering and Physical Sciences Research Council (Research Grant No. GR/F 99847). This tagging enhancement project was led by Geoffrey Leech, Roger Garside and Tony McEnery. Correction and validation of the bibliographic and contextual information in all the BNC Headers was also carried out for this second version of the corpus, known as the BNC World Edition.

BNC-baby was produced at OUCS by Lou Burnard, Martin Wynne, and Ylva Berglund. It is the first version of the BNC to be distributed entirely in XML.
