BNCW01: BNC Document Management System
<author>Lou Burnard
<date>10th September 91
</FRONT><body>
<div1><head>Online storage of BNC documents
All BNC documents received by OUCS in electronic form will be stored in the directory [NATCORP.DOX], initially on the OUCS VAX, to which all project members have READ access. Files will be named by the document number, with an extension indicating the format (e.g. .SGML for SGML, .TEX for LaTeX, .WP5 for WordPerfect, .ASC for `plain ASCII'). Only the most recent version of each document will be kept online. When a new document (or a new version of an existing one) is acquired, a notice will be sent to all project members by electronic mail.
No special arrangements are envisaged for storage or circulation of documents in paper form, but all documents received by OUCS will be numbered and registered in the document register. The register itself is held online in the file [NATCORP.DOX]REGISTER, and a copy of its current state is appended to this document.
<div1><head>Form of document numbers
<div2><head>Document Provenance
Documents are classified primarily by the part of the project responsible for them, as follows:
<ul>
<li>BNC - project wide
<li>TGA - Task Group A (corpus design)
<li>TGB - Task Group B (copyright clearance)
<li>TGC - Task Group C (corpus encoding and storage)
<li>TGD - Task Group D (corpus enrichment)
<li>PC - Project Committee
<li>EC - Exploitation Committee
<li>AB - Advisory Board
<li>EXT - external to the project
</ul>
It may prove desirable to add further codes for documents internal to one of the participants. Thus documents circulating only within OUCS might be given the prefix OUCS, those within Longman the prefix LONG, etc. Such documents will not, however, be included (by definition) in the project's document register and are therefore not considered further here.
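As an illustration only, the storage convention and the provenance codes described above might be sketched in Python as follows. The names file_spec, provenance_of, PROVENANCE and EXTENSIONS are hypothetical and do not belong to any project software; the sketch assumes only the directory [NATCORP.DOX], the example extensions and the codes listed above, and it treats the document number as an opaque string.

```python
# Hypothetical sketch: these names are illustrative, not project software.

# Provenance codes as listed in this section.
PROVENANCE = {
    "BNC": "project wide",
    "TGA": "Task Group A (corpus design)",
    "TGB": "Task Group B (copyright clearance)",
    "TGC": "Task Group C (corpus encoding and storage)",
    "TGD": "Task Group D (corpus enrichment)",
    "PC": "Project Committee",
    "EC": "Exploitation Committee",
    "AB": "Advisory Board",
    "EXT": "external to the project",
}

# Format extensions given as examples in the storage section.
EXTENSIONS = {
    "sgml": ".SGML",
    "latex": ".TEX",
    "wordperfect": ".WP5",
    "ascii": ".ASC",
}

DIRECTORY = "[NATCORP.DOX]"


def file_spec(doc_number, fmt):
    """Build the online file name for a document held in a given format."""
    return DIRECTORY + doc_number.upper() + EXTENSIONS[fmt.lower()]


def provenance_of(doc_number):
    """Return the provenance description for a document number, or None."""
    number = doc_number.upper()
    # No listed code is a prefix of another, so the first match is the only one.
    for code, description in PROVENANCE.items():
        if number.startswith(code):
            return description
    return None


print(file_spec("BNCW01", "sgml"))   # [NATCORP.DOX]BNCW01.SGML
print(provenance_of("TGCW05"))       # Task Group C (corpus encoding and storage)
```

A document number whose prefix is not in the list gives None from provenance_of, which may be a convenient way of spotting entries that need checking against the register.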
<div2><head>Type of document
The document number also indicates the kind of document concerned, as follows:
<ul>
<li>A -- meeting agendas
<li>N -- informal notes e.g. of meetings
<li>M -- minutes or other formal record of meetings
<li>P -- published or formally presented papers
<li>R -- short formal reports other than minutes
<li>W -- working drafts and proposals
<li>X -- formal letters, publicity material, newspaper stories etc.
</ul>
A full document number thus consists of a three- or four-letter prefix (the provenance code followed by the type letter) and a two-digit sequential number. The current document is a working draft relating to the project as a whole, and thus has the prefix BNCW. The minutes of the Project Committee will have the prefix PCM, and so on.
The document number should be allocated at the time the document is created (i.e. requested or proposed) by the group responsible for it. Documents are numbered sequentially within the group, not within group+type. Note that the earlier numbers may not be in chronological sequence, because I have numbered things as they turned up rather than in order of date of composition.
<back>
<div1><head>Current Document Register
This is the state of the document register as of May 1st, 1991. An updated version may be found in the file [NATCORP.DOX]REGISTER.
<div2><head>Project Wide Documents
<gl>
<gt>BNCW01<gd><hi>Burnard</hi> <citn>BNC Document management system</citn> (1 May 91)
<gt>BNCW02<gd><hi>Clear</hi> <citn>Planned uses of the National Corpus</citn> (11 Apr 91)
<gt>BNCX04<gd>Mission statement
<gt>BNCP05<gd>Consortium Agreement
<gt>BNCW06<gd><hi>Summers</hi> <citn>The Spoken Corpus</citn> (Apr 91; includes draft transcription scheme for spoken texts)
<gt>BNCW07<gd><hi>Clear</hi> <citn>Task Groups</citn> (11 Apr 91)
</gl>
<div2><head>
Project Committee documents
<gl>
<gt>PCM01<gd>Minutes of the National Corpus Initiative Consortium Meeting held 29 Aug 90
<gt>PCM02<gd>Minutes of the Project Committee Meeting held 4 Feb 91
<gt>PCR03<gd>Progress Reports tabled at PC meeting held 17 Apr 91
</gl>
<div2><head>Advisory Council documents
<gl>
<gt>ACM01<gd>Minutes of Advisory Council Meeting held 7 Mar 91
</gl>
<div2><head>Task Group specific documents
<div3><head>Task Group A: Corpus Design
<gl>
<gt>TGAN01<gd>Report on BNC TG A Meeting of 10 Apr 91
<gt>TGAW02<gd><hi>Atkins, Clear & Ostler</hi> <citn>Corpus design criteria</citn> (Pisa paper)
<gt>TGAW03<gd><hi>Summers</hi> <citn>Longman/Lancaster English Language Corpus - criteria and design</citn> (March 1991)
<gt>TGAW04<gd><hi>Clear</hi> <citn>Corpus Design Specification</citn> (24 May 91)
<gt>TGAW04<gd><hi>Clear</hi> <citn>Written Corpus Design Specification</citn> (2 September 91)
<gt>TGAW05<gd><hi>Burnage</hi> <citn>Corpus Design: OUCS Comments</citn> (31 May 91)
<gt>TGAW06<gd><hi>Leech</hi> <citn>Comments on the OUP Corpus Design Document and the OUCS Response</citn> (3 June 91)
<gt>TGAW07<gd><hi>Crowdy</hi> <citn>Longman reply to BNC design specification</citn> (14 May 91)
<gt>TGAW08<gd><hi>Crowdy</hi> <citn>BNC: Corpus Design Specification: Longman comments on draft paper</citn> (10 June 91)
<gt>TGAM01<gd><citn>Agenda for meeting of 5th June</citn>
<gt>TGAM02<gd><hi>Burnage</hi> <citn>Minutes of meeting of 5th June</citn> (7 June 91)
<gt>TGAW09<gd><hi>Burnard</hi> <citn>Queries for meeting of TGA on spoken texts</citn> (15 August 91)
<gt>TGAW10<gd><hi>Dunlop</hi> <citn>Matters for discussion at BNC task group A meeting of 29th August</citn> (28 August 91)
</gl>
<div3><head>Task Group C: Encoding and Storage
<gl>
<gt>TGCW01<gd><hi>Burnard</hi> <citn>Markup scheme for the BNC</citn> (25 April 91)
<gt>TGCW02<gd><hi>Leech</hi> <citn>Basic grammatical tagset initial proposal</citn> and <citn>Penn Treebank Tagset</citn> (summarises mapping of MM's reduced tagset
onto the 66 Lancaster word class tag set)
<gt>TGCW03<gd>CPH Appendix A 13 Jan 89 (Longman's original proposals for encoding corpus materials)
<gt>TGCW04<gd><hi>Clear</hi> <citn>Markup for the Oxford Pilot Corpus</citn>
<gt>TGCW05<gd><hi>Burnage</hi> <citn>Database Design Specification</citn> (29 May 91)
<gt>TGCW06<gd><hi>Dunlop</hi> <citn>Text Submission Guidelines</citn> (29 May 91)
<gt>TGCW07<gd><hi>Dunlop</hi> <citn>Encoding the Oxford Milton</citn> (31 May 91)
<gt>TGCM01<gd><citn>Agenda for meeting of June 5th</citn>
<gt>TGCM02<gd><hi>Burnard</hi> <citn>Minutes of BNC Task Group C Meeting, 5 June</citn> (7 June 91)
<gt>TGCW08<gd><hi>Dunlop</hi> <citn>Text Submission Guidelines --- Rekeyed or scanned materials from OUP</citn> (24 June 91)
<gt>TGCW09<gd><hi>Crowdy</hi> <citn>Spoken Corpus: Selective side discourse categories and types; Draft spoken corpus transcription scheme</citn>
<gt>TGCN01<gd><hi>Dunlop</hi> <citn>Notes on meeting held at OUCS, 23rd August</citn> (27 August 91)
<gt>TGCW10<gd><hi>Dunlop et al</hi> <citn>British National Corpus development DTD</citn> (22 August 91)
<gt>TGCW11<gd><hi>OUCS</hi> <citn>Marked up spoken corpus sample</citn>
<gt>TGCW12<gd><hi>OUCS</hi> <citn>Marked up written corpus sample</citn> (The Wimbledon Poisoner)
<gt>TGCW13<gd><hi>Dunlop</hi> <citn>UNIX manual page for vm2 parser</citn> (20 August 91)
<gt>TGCW14<gd><hi>Du Bois</hi> <citn>Transcription design principles for spoken discourse research</citn> (5 March 91)
</gl>
<div3><head>Task Group D: Corpus Enrichment
<gl>
<gt>TGDW01<gd><hi>Leech</hi> <citn>British National Corpus --- Basic Grammatical Tagset</citn> (3 September 91)
<gt>TGDW01<gd><hi>Leech</hi> <citn>British National Corpus --- Basic Grammatical Tagset</citn> (alternative tags --- 5 September 91)
<gt>TGDW02<gd><hi>Langendoen</hi> <citn>Proposal for TEI-Conformant Encoding of Basic Grammatical Tagset</citn> (3 September 91)
<gt>TGDW03<gd><hi>Dunlop</hi> <citn>Mail to task group D</citn> (points for meeting of 5 September 91)
<gt>TGDW04<gd><hi>Dunlop</hi> <citn>Lexical tagging: Position following meeting of 5th September</citn> (6 September 91)
<gt>TGDW05<gd><hi>Clear</hi> <citn>Notes on Basic Grammatical Tagset</citn> (4 September 91)
<gt>TGDA01<gd><citn>Agenda for task group D meeting of 5th September</citn>
<gt>TGDM01<gd><hi>Bryant</hi> <citn>Task Group D -- Corpus Processing --- Minutes of First Meeting</citn> (5 September 91)
</gl>
<div2><head>External Documents of related interest
<gl>
<gt>EXTW01<gd><hi>Leech</hi> <citn>Corpus annotation schemes</citn> (Pisa paper; includes Garside and Leech `Running a Grammar Factory' to be published in Johansson & Stenstrom)
<gt>EXTW02<gd><hi>Biber</hi> <citn>Representativeness in corpus design</citn> (Pisa paper)
<gt>EXTW03<gd><hi>Sampson</hi> <citn>Needed - a grammatical stocktaking</citn> (Pisa paper)
<gt>EXTW04<gd><hi>Johansson</hi> <citn>Some thoughts on the encoding of spoken texts in machine-readable form</citn> (Pisa paper, including examples of spoken texts)
<gt>EXTP05<gd><hi>Leech & Johansson</hi> <citn>LOB Coding Manual</citn> (an extract)
<gt>EXTX06<gd><hi>Doreen King</hi> `Corpus dialecti. Oxford's oceanographers of language set sail' <citn>Oxford Times</citn>, 20 Jul 90
<gt>EXTX07<gd><hi>Brian Keaney</hi> `The keeper of the living language' <citn>Guardian</citn>, 25 Apr 91
<gt>EXTW08<gd><hi>And Rosta</hi> <citn>System of preparation and annotation of I.C.E. texts</citn> (12 Dec 90)
<gt>EXTW09<gd><hi>Akiva Quinn and Gerry Nelson</hi> <citn>I.C.E. Markup Manual for written texts</citn> (March 91)
<gt>EXTW10<gd><citn>TEI AI 1W2: List of common morphological features for inclusion in the TEI starter set of grammatical-annotation tags</citn> (14 June 91)
</gl>
</Back></LDOC>