Formal Specification of the BNC XML schema

The structure of the XML edition of the British National Corpus is described by a single XML schema, which is expressed in three different schema languages: the traditional DTD language which XML inherits from SGML; the more recently defined ISO schema language known as RELAX NG; and the schema language defined by the W3C (W3C XML Schema). The three schema files are all generated from the same TEI-conformant XML source file, which is also used to generate the present documentation.

This section of the document contains the TEI-conformant reference specification for all components of the BNC schema. These include definitions for attribute classes, model classes, and macro patterns, as well as definitions for elements and their associated attributes and possible value lists. A full description of these concepts and how they are used to define and document XML encoding schemes is given by the TEI Guidelines (in particular, in chapter TD); the following summary provides only basic information about them.

When several elements in a schema share attributes of the same name, with values drawn from a common set, they are considered to form an attribute class. The members of such a class can then all reference the same class definition rather than each repeating the same information. In the BNC, for example, the elements <bibl>, <corr>, <div>, <head>, <hi>, and several others all have the same attribute rend, which takes a coded value from the same short list of possibilities. Rather than repeat this definition for each element, the relevant elements are all said to be members of a class att.rendered, which is defined independently of those elements (but includes a list of its members). In the same way, the <w> and <mw> elements, as members of the att.c5coded class, share the same definition for the possible CLAWS5 codes specified by their c5 attribute. Note however that the element <c>, although it has an attribute c5, is not a member of this class because the possible values for this attribute on this element are entirely different.
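
The idea can be illustrated with a small Python sketch. This is not part of the BNC distribution: the names ATT_CLASSES and inherited_attributes are hypothetical, and only the member lists are taken from the class definitions given later in this section.

```python
# Hypothetical, simplified model of two BNC attribute classes.
# Each class declares its attribute once and lists its member elements.
ATT_CLASSES = {
    "att.rendered": {
        "attribute": "rend",
        "members": {"bibl", "corr", "div", "head", "hi", "item",
                    "l", "label", "list", "p", "quote", "stage"},
    },
    "att.c5coded": {
        "attribute": "c5",
        "members": {"w", "mw"},
    },
}


def inherited_attributes(element_name):
    """Return the attribute names an element inherits from its classes."""
    return sorted(
        spec["attribute"]
        for spec in ATT_CLASSES.values()
        if element_name in spec["members"]
    )


print(inherited_attributes("hi"))  # ['rend']
print(inherited_attributes("w"))   # ['c5']
print(inherited_attributes("c"))   # [] because <c> declares its own c5
```

The empty result for <c> reflects the point made above: its c5 attribute is declared on the element itself, not inherited from att.c5coded.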

In any reasonably large schema, and particularly one derived from the TEI model, several elements are likely to have very similar content models, since it will often be the case that at a given point in the document hierarchy any one of several possible elements will be permissible. The specific subset of elements (<w>, <mw>, <c> and a few others) which may appear within an <s> element in the BNC is different from the subsets of elements which may appear within a <p> or <div> element. However, there are several elements which can appear in the same places as a <p>. Following TEI practice, we call the set of elements which can appear together (in sequence or alternation) at a specific place in the document hierarchy a model class. For example, since <l>, <lg>, <list>, <p>, <quote>, and <sp> are all permitted as immediate components of a <div> element, we define a class model.divPart, of which these six elements are all members. Wherever convenient, content models are defined in terms of these model classes.
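
As a rough illustration, a model class can be treated as a named set of element names that is consulted wherever a content model refers to it. The sketch below uses Python's standard xml.etree.ElementTree; MODEL_DIV_PART and div_part_children are hypothetical names, and the sample <div> is invented for the example.

```python
import xml.etree.ElementTree as ET

# The six elements named above as members of model.divPart.
MODEL_DIV_PART = {"l", "lg", "list", "p", "quote", "sp"}


def div_part_children(div):
    """Return the tags of immediate children that belong to model.divPart."""
    return [child.tag for child in div if child.tag in MODEL_DIV_PART]


# Invented sample: <head> is a child of <div> but not a model.divPart member.
sample = ET.fromstring(
    "<div><head>Title</head><p>One</p>"
    "<list><item>x</item></list><p>Two</p></div>"
)
print(div_part_children(sample))  # ['p', 'list', 'p']
```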

As noted above, this usage of model classes is a distinctive and pervasive feature of the TEI encoding scheme. Because the BNC derives from the TEI scheme, it uses the same names and (as far as is practicable) the same model classes throughout. Although this introduces an occasionally redundant degree of indirection in the resulting schema, it also makes clearer the relationship between the components defined for the BNC and their origins in the TEI scheme.

Finally, we define here a few macros for commonly encountered content models. These are also taken from the TEI encoding scheme, though in a few cases with different meanings. In the TEI, for example, the macro macro.phraseSeq is defined as a mixture of various ‘phrase level’ elements and plain text; in the BNC scheme, it has been redefined as plain text only. The places where this macro is referenced, however, are unchanged; in this respect, therefore, the BNC schema is a proper subset of the full TEI schema.

The remainder of this section lists in alphabetical order all of the attribute classes, model classes, elements, and macros defined for the BNC encoding scheme, using a method of display similar to that of the full TEI Guidelines. For each component, we give a brief description and, where available, a usage example. Note that many of the elements listed here appear only in the corpus header rather than in the texts, and may thus be safely disregarded by applications which operate on the texts alone or in isolation.

Classes defined

Class att.ascribed

provides attributes for elements representing speech or action that can be ascribed to a specific individual.

Attributes: In addition to global attributes
who
indicates the person, or group of people, to whom the element content is ascribed.

Class: (none)

Members: change  event  setting  sp  u  vocal 

Module: tei

Class att.authorialIntervention

provides attributes describing the nature of an authorial intervention.

Attributes: In addition to global attributes
hand
signifies the hand of the agent which made the addition or performed the deletion.
status
may be used to indicate faulty deletions, e.g. strikeouts which include too much or too little text, or erroneous additions, e.g., an insertion which duplicates some of the text already present. Sample values include:
duplicate
(all of the text indicated as an addition duplicates some text that is in the original, whether the duplication is word-for-word or less exact.)
duplicate-partial
(part of the text indicated as an addition duplicates some text that is in the original)
excessStart
(some text at the beginning of the deletion is marked as deleted even though it clearly should not be deleted.)
excessEnd
(some text at the end of the deletion is marked as deleted even though it clearly should not be deleted.)
shortStart
(some text at the beginning of the deletion is not marked as deleted even though it clearly should be.)
shortEnd
(some text at the end of the deletion is not marked as deleted even though it clearly should be.)
unremarkable
(the deletion is not faulty.)
type
classifies the type of addition or deletion using any convenient typology.

Class: (none)

Members:

Module: tei

Class att.c5coded

elements which carry a CLAWS 5 part-of-speech code

Attributes: In addition to global attributes
c5
supplies the CLAWS 5 code associated with this word. Legal values are:
AJ0
Adjective (general or positive) (e.g. good, old, beautiful)
AJC
Comparative adjective (e.g. better, older)
AJS
Superlative adjective (e.g. best, oldest)
AT0
Article (e.g. the, a, an, no)
AV0
General adverb: an adverb not subclassified as AVP or AVQ (see below) (e.g. often, well, longer (adv.), furthest)
AVP
Adverb particle (e.g. up, off, out)
AVQ
Wh-adverb (e.g. when, where, how, why, wherever)
CJC
Coordinating conjunction (e.g. and, or, but)
CJS
Subordinating conjunction (e.g. although, when)
CJT
The subordinating conjunction that
CRD
Cardinal number (e.g. one, 3, fifty-five, 3609)
DPS
Possessive determiner-pronoun (e.g. your, their, his)
DT0
General determiner-pronoun: i.e. a determiner-pronoun which is not a DTQ or an AT0.
DTQ
Wh-determiner-pronoun (e.g. which, what, whose, whichever)
EX0
Existential there, i.e. there occurring in the there is ... or there are ... construction
ITJ
Interjection or other isolate (e.g. oh, yes, mhm, wow)
NN0
Common noun, neutral for number (e.g. aircraft, data, committee)
NN1
Singular common noun (e.g. pencil, goose, time, revelation)
NN2
Plural common noun (e.g. pencils, geese, times, revelations)
NP0
Proper noun (e.g. London, Michael, Mars, IBM)
ORD
Ordinal numeral (e.g. first, sixth, 77th, last)
PNI
Indefinite pronoun (e.g. none, everything, one [as pronoun], nobody)
PNP
Personal pronoun (e.g. I, you, them, ours)
PNQ
Wh-pronoun (e.g. who, whoever, whom)
PNX
Reflexive pronoun (e.g. myself, yourself, itself, ourselves)
POS
The possessive or genitive marker 's or '
PRF
The preposition of
PRP
Preposition (except for of) (e.g. about, at, in, on, on behalf of, with)
TO0
Infinitive marker to
UNC
Unclassified items which are not appropriately considered as items of the English lexicon.
VBB
The present tense forms of the verb BE, except for is, 's: i.e. am, are, 'm, 're and be [subjunctive or imperative]
VBD
The past tense forms of the verb BE: was and were
VBG
The -ing form of the verb BE: being
VBI
The infinitive form of the verb BE: be
VBN
The past participle form of the verb BE: been
VBZ
The -s form of the verb BE: is, 's
VDB
The finite base form of the verb DO: do
VDD
The past tense form of the verb DO: did
VDG
The -ing form of the verb DO: doing
VDI
The infinitive form of the verb DO: do
VDN
The past participle form of the verb DO: done
VDZ
The -s form of the verb DO: does, 's
VHB
The finite base form of the verb HAVE: have, 've
VHD
The past tense form of the verb HAVE: had, 'd
VHG
The -ing form of the verb HAVE: having
VHI
The infinitive form of the verb HAVE: have
VHN
The past participle form of the verb HAVE: had
VHZ
The -s form of the verb HAVE: has, 's
VM0
Modal auxiliary verb (e.g. will, would, can, could, 'll, 'd)
VVB
The finite base form of lexical verbs (e.g. forget, send, live, return) [Including the imperative and present subjunctive]
VVD
The past tense form of lexical verbs (e.g. forgot, sent, lived, returned)
VVG
The -ing form of lexical verbs (e.g. forgetting, sending, living, returning)
VVI
The infinitive form of lexical verbs (e.g. forget, send, live, return)
VVN
The past participle form of lexical verbs (e.g. forgotten, sent, lived, returned)
VVZ
The -s form of lexical verbs (e.g. forgets, sends, lives, returns)
XX0
The negative particle not or n't
ZZ0
Alphabetical symbols (e.g. A, a, B, b, c, d)
AJ0-AV0
Probably AJ0 (adjective), but maybe AV0 (adverb)
AJ0-NN1
Probably AJ0 (adjective), but maybe NN1 (singular noun)
AJ0-VVD
Probably AJ0 (adjective), but maybe VVD (verb past tense)
AJ0-VVG
Probably AJ0 (adjective), but maybe VVG (-ing verb)
AJ0-VVN
Probably AJ0 (adjective), but maybe VVN (verb past participle)
AV0-AJ0
Probably AV0 (adverb), but maybe AJ0 (adjective)
AVP-PRP
Probably AVP (adverb particle), but maybe PRP (preposition)
AVQ-CJS
Probably AVQ (wh- adverb), but maybe CJS (subordinating conjunction)
CJS-AVQ
Probably CJS (subordinating conjunction), but maybe AVQ (wh- adverb)
CJS-PRP
Probably CJS (subordinating conjunction), but maybe PRP (preposition)
CJT-DT0
Probably CJT ("that" as conjunction), but maybe DT0 (determiner)
CRD-PNI
Probably CRD (number), but maybe PNI (indefinite pronoun)
DT0-CJT
Probably DT0 (determiner), but maybe CJT ("that" as conjunction)
NN1-AJ0
Probably NN1 (singular noun), but maybe AJ0 (adjective)
NN1-NP0
Probably NN1 (singular noun), but maybe NP0 (proper noun)
NN1-VVB
Probably NN1 (singular noun), but maybe VVB (verb)
NN1-VVG
Probably NN1 (singular noun), but maybe VVG (-ing verb)
NN2-VVZ
Probably NN2 (plural noun), but maybe VVZ (-s verb)
NP0-NN1
Probably NP0 (proper noun), but maybe NN1 (singular noun)
PNI-CRD
Probably PNI (indefinite pronoun), but maybe CRD (number)
PRP-AVP
Probably PRP (preposition), but maybe AVP (adverb particle)
PRP-CJS
Probably PRP (preposition), but maybe CJS (subordinating conjunction)
VVB-NN1
Probably VVB (verb), but maybe NN1 (singular noun)
VVD-AJ0
Probably VVD (verb past tense), but maybe AJ0 (adjective)
VVD-VVN
Probably VVD (verb past tense), but maybe VVN (verb past participle)
VVG-AJ0
Probably VVG (-ing verb), but maybe AJ0 (adjective)
VVG-NN1
Probably VVG (-ing verb), but maybe NN1 (singular noun)
VVN-AJ0
Probably VVN (verb past participle), but maybe AJ0 (adjective)
VVN-VVD
Probably VVN (verb past participle), but maybe VVD (verb past tense)
VVZ-NN2
Probably VVZ (-s verb), but maybe NN2 (plural noun)

Class: (none)

Members: mw  w 

Module: module-from-bncxml
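
The hyphenated values in the list above pair a preferred code with a less likely alternative. A minimal sketch of how an application might separate the two is given below; split_c5 is a hypothetical helper, and the sample sentence is invented (real <w> elements also carry hw and pos attributes, omitted here).

```python
from collections import Counter
import xml.etree.ElementTree as ET


def split_c5(code):
    """Split a c5 value into (preferred code, alternative code or None)."""
    first, sep, second = code.partition("-")
    return (first, second if sep else None)


print(split_c5("AJ0-NN1"))  # ('AJ0', 'NN1')
print(split_c5("NN1"))      # ('NN1', None)

# Invented sample sentence, counted by preferred code only.
sentence = ET.fromstring(
    '<s n="1">'
    '<w c5="AT0">The </w>'
    '<w c5="AJ0-NN1">light </w>'
    '<w c5="NN1-VVB">work </w>'
    '</s>'
)
preferred = Counter(split_c5(w.get("c5"))[0] for w in sentence.iter("w"))
print(preferred["NN1"])  # 1
```

Counting by the preferred code alone discards the ambiguity information; an application that needs it can keep the second member of the tuple.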

Class att.datePart

(attributes for temporal expression) attributes for component elements of temporal expressions involving dates and time

Attributes: In addition to global attributes
value
supplies the value of a date or time in a standard form.
Example

Examples of W3C date, time, and date & time formats.

 <date value="1945-10-24">24 Oct 45</date>
 <date value="1996-09-24T07:25Z">September 24th, 1996 at 3:25 in the morning</date>
 <time value="1999-01-04T20:42-05:00">Jan 4 1999 at 8 pm</time>
 <time value="14:12:38">fourteen twelve and 38 seconds</time>
 <date value="1962-10">October of 1962</date>
 <date value="--06-12">June 12th</date>
 <date value="---01">the first of each month</date>
 <date value="--08">August</date>
 <date value="2006">MMVI</date>
Example

Examples of time formats with reduced precision.

 <date value="2006-05-18T10:03+09:00">a few minutes after ten in the morning on Thu 18 May</date>
 <time value="03:00">3 A.M.</time>
 <time value="12">around noon</time>

Software intended for use with W3C XML Schema datatypes may be unable to properly process times expressed with reduced precision.

Example

A usage example of <date>.

This list begins in the year 1632, more precisely on Trinity Sunday, i.e. the Sunday after Pentecost, in that year the <date calendar="Julian" value="1632-06-06">27th of May (old style)</date>.
Example

A usage example of <time>.

He likes to be punctual. I said <q><time value="12">around noon</time></q>, and he showed up at <time value="12:00:00">12 O'clock</time> on the dot.
dur
(duration) indicates the length of this element in time.
Example

Examples of W3C durations.

 <distance dur="PT45M">forty-five minutes</distance>
 <distance dur="P1DT12H">a day and a half</distance>
 <distance dur="P7D">a week</distance>
 <distance dur="PT0.02S">20 ms</distance>

Note: In providing a ‘regularized’ form, no claim is made that the form in the source text is incorrect; the regularized form is simply that chosen as the main form for purposes of unifying variant forms under a single heading.

Class: (none)

Members:

Module: tei
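
The dur values shown in the examples for this class follow the W3C duration notation. The following sketch shows one way such values might be converted for computation. It is deliberately partial: parse_duration is a hypothetical helper that handles only the day, hour, minute, and second fields used in the examples above, and rejects year and month fields because they have no fixed length.

```python
import re
from datetime import timedelta

# Days, then an optional time part introduced by "T".
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?"
    r"(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(value):
    """Convert a day/time duration such as 'P1DT12H' to a timedelta."""
    match = _DURATION.match(value)
    parts = {}
    if match is not None:
        parts = {name: float(text)
                 for name, text in match.groupdict().items()
                 if text is not None}
    if not parts:
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(**parts)


print(parse_duration("PT45M"))    # 0:45:00
print(parse_duration("P1DT12H"))  # 1 day, 12:00:00
print(parse_duration("P7D"))      # 7 days, 0:00:00
```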

Class att.editLike

Attributes: In addition to global attributes
resp
indicates the agency responsible for the intervention or interpretation, for example an editor or transcriber.

Class: (none)

Members: corr  gap 

Module: tei

Class att.identifiable

the class of elements which describe other elements by means of their generic identifiers

Attributes: In addition to global attributes
ident
supplies an element's generic identifier, or one of the codes * (meaning all elements), or name() meaning that the name of the referenced element is to be used rather than its value.
ns
supplies the namespace within which the generic identifier is to be found.

Note: The values * and name() are used for ident as well.

Class: (none)

Members: attDef  attributePolicy  elementPolicy  gi  ident  valItem  valList  valSource  xairaItem 

Module: module-from-bncxml

Class att.interpLike

provides attributes for elements which represent a formal analysis or interpretation.

Attributes: In addition to global attributes
resp
indicates who is responsible for the interpretation.
type
indicates what kind of phenomenon is being noted in the passage. Sample values include:
image
(identifies an image in the passage. )
character
(identifies a character associated with the passage. )
theme
(identifies a theme in the passage. )
allusion
(identifies an allusion to another text. )
inst
points to instances of the analysis or interpretation represented by the current element.

Class: (none)

Members:

Module: tei

Class att.personal

(attributes for components of personal names) common attributes for those elements which form part of a personal name.

Attributes: In addition to global attributes
type
provides more culture-, linguistic-, or application-specific information used to categorize this name component.
full
indicates whether the name component is given in full, as an abbreviation or simply as an initial. Legal values are:
yes
(the name component is spelled out in full.)
abb
(the name component is given in an abbreviated form.)
init
(the name component is indicated only by one initial.)
sort
specifies the sort order of the name component in relation to others within the personal name.

Class: (none)

Members:

Module: namesdates

Class att.rendered

the class of elements whose rendition has been recorded intermittently in the BNC

Attributes: In addition to global attributes
rend
a code briefly characterising the way the element content was originally presented. Legal values are:
bo
bold weight font
bx
boxed
hi
superscript
ib
italic and bold
ih
italic superscript
il
italic subscript
it
italic font
iu
italic and underlined
lo
subscript
qc
centre-aligned
ro
roman within italic
st
strike-out
ub
bold underlined
ul
underlined
xx
crossed-out

Class: (none)

Members: bibl  corr  div  head  hi  item  l  label  list  p  quote  stage 

Module: module-from-bncxml
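
Because rend takes a closed list of codes, an application can expand or check them with a simple lookup. The sketch below copies the codes and glosses from the list above; REND_CODES and describe_rend are hypothetical names.

```python
# Codes and glosses copied from the att.rendered value list.
REND_CODES = {
    "bo": "bold weight font",
    "bx": "boxed",
    "hi": "superscript",
    "ib": "italic and bold",
    "ih": "italic superscript",
    "il": "italic subscript",
    "it": "italic font",
    "iu": "italic and underlined",
    "lo": "subscript",
    "qc": "centre-aligned",
    "ro": "roman within italic",
    "st": "strike-out",
    "ub": "bold underlined",
    "ul": "underlined",
    "xx": "crossed-out",
}


def describe_rend(code):
    """Return the gloss for a rend code, rejecting values outside the list."""
    try:
        return REND_CODES[code]
    except KeyError:
        raise ValueError(f"unknown rend code: {code!r}") from None


print(describe_rend("it"))  # italic font
```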

Class att.spanning

provides attributes for elements which delimit a span of text by pointing mechanisms rather than by enclosing it.

Attributes: In addition to global attributes
spanTo
indicates the end of a span initiated by the element bearing this attribute.

Note: The span is defined as running in document order from the start of the content of the pointing element (if any) to the end of the content of the element pointed to by the spanTo attribute (if any). If no value is supplied for the attribute, the assumption is that the span is coextensive with the pointing element.

Class: (none)

Members:

Module: tei

Class att.tableDecoration

provides attributes used to decorate rows or cells of a table.

Attributes: In addition to global attributes
role
indicates the kind of information held in this cell or in each cell of this row. Suggested values include:
label
(labelling or descriptive information only.)
data
(data values.)
rows
indicates the number of rows occupied by this cell or row.
cols
indicates the number of columns occupied by this cell or row.

Class: (none)

Members:

Module: tei

Class att.timed

Attributes: In addition to global attributes
dur
(duration) indicates the duration of the element in minutes.

Class: (none)

Members: event  pause  unclear  vocal 

Module: tei

Class att.typed

Attributes: In addition to global attributes
type
characterizes the element in some sense, using any convenient classification scheme or typology.

Class: (none)

Members:

Module: tei

Class att.uniqueId

the class of elements which carry an identifier which is unique across the whole corpus.

Attributes: In addition to global attributes
xml:id
provides the unique identifier for this element.

Class: (none)

Members: bncDoc  category  person  recording  setting  taxonomy 

Module: module-from-bncxml

Class model.assertLike

the class of elements concerning which assertions are made, for example as parts of a biographical element.

Class: model.personPart

Members: model.persStateLike  [age  dialect  occupation  persName  persNote  ]

Module: tei

Class model.biblLike

groups elements containing a bibliographic description.

Class: model.inter: model.common

Members: bibl 

Module: tei

Class model.blockLike

groups segmenting elements.

Class: (none)

Members:

Module: tei

Class model.castItemPart

elements used within an entry in a cast list, such as dramatic role or actor's name.

Class:

Members:

Module: tei

Class model.catDescPart

groups elements which may be used inside catDesc and appear multiple times

Class: (none)

Members:

Module: tei

Class model.complexVal

(complex values) groups elements which express complex feature values in feature structures.

Class: model.featureVal

Members:

Module: tei

Class model.dateLike

(dates and date ranges) groups elements containing a date specification.

Class: model.pPart.data: model.recordingPart

Note: This class allows certain content models to allow either a single date or a date-range element.

Members: date 

Module: tei

Class model.datePart

(temporal expression) groups component elements of temporal expressions involving dates and time.

Class: (none)

Members:

Module: tei

Class model.divPart

groups elements which can occur between, but not within, paragraphs and other chunks.

Class: model.common

Note: This element class does not include members of the inter class, which can appear either within or between chunks. Unlike elements of that class, chunks cannot occur within chunks.

Members: l  lg  list  note  p  quote  sp 

Module: tei

Class model.divPart.spoken

groups those elements which appear at the component level in spoken texts only.

Class: (none)

Members: event  pause  shift  trunc  u  vocal 

Module: spoken

Class model.divWrapper

(top-of-div elements) groups elements which can occur at the start of any division class element.

Class: (none)

Members: head 

Module: tei

Class model.divWrapper.bottom

(Bottom-of-division elements) groups elements which can occur at the end of a text division; for example, trailer, byline, etc.

Class: (none)

Members:

Module: tei

Class model.editorialDeclPart

groups elements which may be used inside editorialDecl and appear multiple times

Class: (none)

Members:

Module: header

Class model.encodingPart

groups elements which may be used inside encodingDesc and appear multiple times

Class: (none)

Members: classDecl  editorialDecl  projectDesc  refsDecl  samplingDecl  tagsDecl  xairaSpecification 

Module: header

Class model.frontPart.drama

groups elements which appear at the level of divisions within front or back matter of performance texts only.

Class: model.frontPart

Members:

Module: tei

Class model.gLike

groups elements which are interspersed with normal text, representing non-Unicode items.

Class: (none)

Members:

Module: tei

Class model.global

(global inclusions ) groups empty elements which may appear at any point within a TEI text.

Class: (none)

Members: model.global.edit  [gap  ] model.milestoneLike  [pb  ]

Module: tei

Class model.global.edit

groups empty elements which perform a specifically editorial function, for example by indicating the start of a span of text added, deleted, or missing in a source.

Class: model.global

Note: Members of this class can appear anywhere within a document, between or within components or phrases.

Members: gap 

Module: tei

Class model.glossLike

groups elements which provide an alternative name, explanation, or description for any markup construct.

Class: (none)

Members: desc 

Module: tei

Class model.headerPart

groups elements which may be used inside teiHeader and appear multiple times

Class: (none)

Members: encodingDesc  profileDesc 

Module: header

Class model.hiLike

groups phrase-level elements related to highlighting.

Class: model.phrase

Members: hi 

Module: tei

Class model.imprintPart

groups the bibliographic elements which occur inside imprints.

Class: model.biblPart

Members: pubPlace  publisher 

Module: tei

Class model.inter

Attributes: Global attributes only

Class: (none)

Members: model.biblLike  [bibl  ] model.listLike  [list  ] model.noteLike  model.oddRef  model.qLike  [lg  quote  ] model.stageLike  [stage  ]

Module: tei

Class model.lLike

groups elements representing metrical components such as verse lines.

Class:

Members: l 

Module: tei

Class model.listLike

groups all list-like elements.

Class: model.inter: model.common

Members: list 

Module: tei

Class model.milestoneLike

(reference system elements) groups milestone-style elements used to represent reference systems

Class: model.global

Members: pb 

Module: tei

Class model.nameLike

(names of people, places, or organizations, or referring strings) groups those elements which name or refer to a person, place (man-made or geographic), or organization

Class: model.addrPart: model.pPart.data

Note: A superset of the naming elements that may appear in datelines, addresses, statements of responsibility, etc.

Members: model.nameLike.agent  [name  ]

Module: tei

Class model.nameLike.agent

groups elements which contain names of individuals or corporate bodies.

Class: model.nameLike

Note: This class is used in the content model of elements which reference names of people or organizations.

Members: name 

Module: tei

Class model.noteLike

groups all note-like elements.

Class: model.inter: model.common

Members:

Module: tei

Class model.oddRef

(ODD reference class) groups elements which reference declarations in some markup language in ODD documents.

Class: model.common: model.inter

Members:

Module: tei

Class model.pLike

The class of elements which are paragraphs for the purpose of interchange.

Class: (none)

Members: p 

Module: tei

Class model.pLike.front

(Front matter chunk elements) groups elements which can occur as direct constituents of front matter, when a full title page is not given.

Class: (none)

Members:

Module: tei

Class model.pPart.data

groups phrase-level elements containing names, dates, numbers, measures, and similar data.

Class: model.phrase

Members: address  model.dateLike  [date  ] model.nameLike  [model.nameLike.agent  ]

Module: tei

Class model.pPart.edit

groups phrase-level elements for simple editorial correction and transcription.

Class: model.phrase

Members: corr  unclear 

Module: tei

Class model.persNamePart

(components of personal names) groups those elements which form part of a personal name.

Class: (none)

Members:

Module: namesdates

Class model.persStateLike

the class of elements describing changeable characteristics of a person which have a definite duration, for example occupation, residence, name... These characteristics of an individual are typically a consequence of their own action or that of others.

Class: model.assertLike

Members: age  dialect  occupation  persName  persNote 

Module: tei

Class model.personLike

the class of elements used to provide information about people and their relationships.

Note: This class is referenced in the header module, but is not populated unless the namesdates module is loaded.

Class: (none)

Members:

Module: tei

Class model.personPart

groups elements which describe characteristics of the people referenced by a text, or participating in a language interaction.

Note: This class is used to define the content model for the <person> and <personGrp> elements.

Class: (none)

Members: model.assertLike  [model.persStateLike  ]

Module: tei

Class model.phrase

Attributes: Global attributes only

Class: (none)

Members: model.hiLike  [hi  ] model.pPart.data  [address  model.dateLike  model.nameLike  ] model.pPart.edit  [corr  unclear  ] model.ptrLike  [align  ] model.segLike  [c  mw  s  w  ]

Module: tei
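
Since model classes may themselves be members of other model classes, finding the elements actually admitted by a class such as model.phrase means expanding its membership recursively. The sketch below shows one way to do this; the MODEL_CLASSES table is transcribed from the member lists in this section, while the table layout and the expand function are hypothetical.

```python
# Direct members of model.phrase and its subclasses, as listed in this section.
MODEL_CLASSES = {
    "model.phrase": ["model.hiLike", "model.pPart.data", "model.pPart.edit",
                     "model.ptrLike", "model.segLike"],
    "model.hiLike": ["hi"],
    "model.pPart.data": ["address", "model.dateLike", "model.nameLike"],
    "model.dateLike": ["date"],
    "model.nameLike": ["model.nameLike.agent"],
    "model.nameLike.agent": ["name"],
    "model.pPart.edit": ["corr", "unclear"],
    "model.ptrLike": ["align"],
    "model.segLike": ["c", "mw", "s", "w"],
}


def expand(class_name):
    """Return the set of element names reachable from a model class."""
    elements = set()
    for member in MODEL_CLASSES.get(class_name, []):
        if member.startswith("model."):
            # A nested class: recurse (unknown or empty classes add nothing).
            elements |= expand(member)
        else:
            elements.add(member)
    return elements


print(sorted(expand("model.phrase")))
# ['address', 'align', 'c', 'corr', 'date', 'hi', 'mw', 'name', 's', 'unclear', 'w']
```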

Class model.physDescPart

specialised descriptive elements constituting the physical description of a manuscript or similar written source.

Class:

Members:

Module: tei

Class model.placeNamePart

(place name components) groups those elements which form part of a place name.

Class: (none)

Members:

Module: tei

Class model.profileDescPart

groups elements which may be used inside profileDesc and appear multiple times

Class: (none)

Members: langUsage  particDesc  settingDesc  textClass 

Module: header

Class model.ptrLike

groups elements used for purposes of location and reference

Class: model.phrase

Members: align 

Module: tei

Class model.publicationStmtPart

(publication statement elements) groups the children of publicationStmt

Class: (none)

Members: address  availability  date  distributor  idno  pubPlace  publisher 

Module: tei

Class model.qLike

groups elements related to highlighting which can appear either within or between chunk-level elements.

Class: model.inter: model.common

Members: lg  quote 

Module: tei

Class model.quoteLike

(quote and similar elements) groups elements used to directly contain quotations.

Class:

Members:

Module: tei

Class model.recordingPart

(dates and date ranges) groups elements used to describe details of an audio or video recording

Class: (none)

Members: model.dateLike  [date  ]

Module: tei

Class model.respLike

groups elements which are used to indicate intellectual responsibility, for example within a bibliographic element.

Class: model.biblPart: model.msItemPart

Members: author  editor 

Module: tei

Class model.segLike

Class: model.phrase

Attributes: Global attributes only

Members: c  mw  s  w 

Module: tei

Class model.settingPart

elements used to describe the setting of a linguistic interaction.

Class:

Members: activity  locale  placeName 

Module: tei

Class model.singleVal

(atomic values) group elements used to represent atomic feature values in feature structures.

Class: model.featureVal

Members:

Module: tei

Class model.sourceDescPart

groups elements which may be used inside sourceDesc and appear multiple times

Class: (none)

Members: recordingStmt 

Module: header

Class model.stageLike

Class: model.divPart.stage: model.inter

Attributes: Global attributes and those inherited from [model.divPart.stage ]

Members: stage 

Module: tei

Class model.textDescPart

elements used to categorise a text for example in terms of its situational parameters.

Class:

Members:

Module: tei

Class model.titlepagePart

(Title page elements) groups those elements which can occur as direct constituents of a title page (docTitle, docAuth, docImprint, epigraph, etc.)

Class: (none)

Members:

Module: tei

Elements defined

<activity>

(activity) contains a brief informal description of what a participant in a language interaction is doing other than speaking, if anything.

Class: model.settingPart

Declaration

element activity { attribute spont { text }?, macro.phraseSeq }

Attributes: In addition to global attributes
spont
level of spontaneity. Values are:
H
high
M
medium
L
low
X
not applicable or unknown

Example

 <activity>driving</activity>

Module: corpus

<address>

contains a postal or other address, for example of a publisher, an organization, or an individual.

Class: model.pPart.data: model.publicationStmtPart

Declaration

element address { macro.phraseSeq }

Attributes: Global attributes only

Example

 <address>natcorp@oucs.ox.ac.uk</address>

Example

 <address>13 Banbury Road, Oxford OX2 6NN,UK</address>

Module: core

<age>

specifies the age in years of a recorded participant at the time of the recording in which they participate.

Class: model.persStateLike

Declaration

element age { macro.phraseSeq }

Attributes: Global attributes only

Example

 <age>25</age>

Module: namesdates

<align>

marks a temporal alignment point within transcribed speech

Class: model.ptrLike

Declaration

element align { attribute with { data.pointer }, empty }

Attributes: In addition to global attributes

Example

 <u who="PS6SF">
   <s n="12">
     <w c5="VVB" hw="tell" pos="VERB">tell </w>
     <w c5="NP0" hw="billy" pos="SUBST">Billy </w>
     <align with="KSWLC001"/>
     <w c5="DT0" hw="that" pos="ADJ">that </w>
   </s>
 </u>
 <u who="PS6SJ">
   <s n="13">
     <align with="KSWLC001"/>
     <w c5="ITJ" hw="hello" pos="INTERJ">Hello </w>
     <w c5="AV0" hw="now" pos="ADV">now </w>
     <w c5="VM0" hw="can" pos="VERB">can </w>
     <w c5="PNP" hw="you" pos="PRON">you </w>
     <w c5="VVI" hw="hear" pos="VERB">hear </w>
     <w c5="PNP" hw="i" pos="PRON">me</w>
   </s>
 </u>

Module: module-from-bncxml
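
In the example above, two utterances by different speakers contain an <align> element with the same with value. A minimal sketch of how an application might collect the utterances that share an alignment point is shown below. The function name utterances_by_align_point is hypothetical, the sample is an abbreviated copy of the example, and the enclosing <div> is added only so that the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def utterances_by_align_point(root):
    """Map each alignment point to the speakers whose utterances contain it."""
    points = defaultdict(list)
    for u in root.iter("u"):
        for align in u.iter("align"):
            points[align.get("with")].append(u.get("who"))
    return dict(points)


# Abbreviated version of the example above, wrapped in a <div> for parsing.
sample = ET.fromstring("""
<div>
  <u who="PS6SF">
    <s n="12">
      <w c5="NP0" hw="billy" pos="SUBST">Billy </w>
      <align with="KSWLC001"/>
      <w c5="DT0" hw="that" pos="ADJ">that </w>
    </s>
  </u>
  <u who="PS6SJ">
    <s n="13">
      <align with="KSWLC001"/>
      <w c5="ITJ" hw="hello" pos="INTERJ">Hello </w>
    </s>
  </u>
</div>
""")
print(utterances_by_align_point(sample))
# {'KSWLC001': ['PS6SF', 'PS6SJ']}
```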

<attDef>

(attribute definition) provides the definition for a single attribute.

Class: att.identifiable

Declaration

element attDef { att.identifiable.attributes, att.identifiable.attribute.ident, ( desc*, valList? ) }

Attributes: Global attributes and those inherited from [att.identifiable ]
att.identifiable.attribute.ident

Module: tagdocs

<attList>

contains documentation for all the attributes associated with this element, as a series of attDef elements.

Declaration

element attList { attDef+ }

Attributes: Global attributes only

Module: tagdocs

<attributePolicy>

specifies the indexing policy to be used for one or more attributes.

Class: att.identifiable

Declaration

element attributePolicy { att.identifiable.attributes, attribute ident { data.name }?, att.identifiable.attribute.ns, attribute type { "none" | "jointo" | "joinfrom" | "taxonomy" }?, ( nameList?, joinTo? ) }

Attributes: In addition to global attributes and those inherited from [att.identifiable ]
ident
identifies the attribute to which the indexing policy applies
att.identifiable.attribute.ns

Module: module-from-bncxml

<author>

in a bibliographic reference, contains the name of the author(s), personal or corporate, of a work; the primary statement of responsibility for any bibliographic item.

Class: model.respLike

Declaration

element author { attribute domicile { text }?, attribute n { text }?, attribute born { text }?, macro.phraseSeq }

Attributes: In addition to global attributes
domicile
main country of residence where known
n
internal identifier
born
year of birth where known

Example

 <author n="AubreC1" domicile="Britain">Aubrey, Crispin</author>

Module: core

<availability>

supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, etc.

Class: model.publicationStmtPart

Declaration

element availability { ( text | para )* }

Attributes: Global attributes only

Example

 <availability> This  material is protected by international copyright laws and  may not be copied or redistributed in any way. Consult the BNC Web  Site at http://www.natcorp.ox.ac.uk for full licencing and distribution conditions.</availability>

Module: header

<bibl>

(bibliographic citation) contains any bibliographic reference, occurring either within the header of a written corpus text, in which case it has a fixed substructure, or within the body of a corpus text, in which case it contains only s elements.

Class: att.rendered: model.biblLike

Declaration

element bibl { att.rendered.attributes, att.rendered.attribute.rend, ( s+ | ( title+, ( editor | author )*, imprint, pp? ) ) }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Example

 <bibl>   <title>British intelligence services in action. </title>   <author n="LindsK1" born="1924">Lindsay, Kennedy</author>   <imprint n="DUNROD1">    <publisher>Dunrod Press</publisher>    <pubPlace>Dundalk, Ireland</pubPlace>    <date value="1980">1980</date>   </imprint>   <pp>74-176</pp>  </bibl>

Module: core
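
The fixed substructure of a header <bibl> (title, editor or author, imprint, pp) lends itself to being flattened into a simple record. The Python sketch below is one possible way of doing this with the standard library; the function name and the keys of the resulting dictionary are choices made for this illustration, not part of the BNC schema.

```python
import xml.etree.ElementTree as ET

# The header-style <bibl> example from above, reproduced as a string.
BIBL = """
<bibl>
  <title>British intelligence services in action. </title>
  <author n="LindsK1" born="1924">Lindsay, Kennedy</author>
  <imprint n="DUNROD1">
    <publisher>Dunrod Press</publisher>
    <pubPlace>Dundalk, Ireland</pubPlace>
    <date value="1980">1980</date>
  </imprint>
  <pp>74-176</pp>
</bibl>
""".strip()


def header_bibl_record(bibl):
    """Flatten a header-style <bibl> element into a dictionary."""
    date = bibl.find("imprint/date")
    return {
        "titles": [(t.text or "").strip() for t in bibl.findall("title")],
        "authors": [(a.text or "").strip() for a in bibl.findall("author")],
        "editors": [(e.text or "").strip() for e in bibl.findall("editor")],
        "publisher": bibl.findtext("imprint/publisher"),
        "place": bibl.findtext("imprint/pubPlace"),
        # Prefer the standardized value attribute over the element content.
        "year": date.get("value") if date is not None else None,
        "pages": bibl.findtext("pp"),
    }


record = header_bibl_record(ET.fromstring(BIBL))
print(record)
```

A <bibl> occurring in the body of a text contains only s elements, so this function would return empty lists and None values for it.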

<bncDoc>

contains a distinct document within the corpus, either spoken or written.

Class: att.uniqueId

Declaration

element bncDoc { att.uniqueId.attributes, att.uniqueId.attribute.xmlid, ( teiHeader, ( wtext | stext ) ) }

Attributes: Global attributes and those inherited from [att.uniqueId ]
att.uniqueId.attribute.xmlid

Module: module-from-bncxml

<c>

(character) contains a significant punctuation mark as identified by the CLAWS tagger.

Class: model.segLike: att.segLike

Declaration

element c { model.segLike.attributes, attribute c5 { "PUN" | "PUL" | "PUR" | "PUQ" }, text }

Attributes: In addition to global attributes and those inherited from [att.segLike ]
c5
the CLAWS 5 code associated with this punctuation mark. Legal values are:
PUN
any separating punctuation mark
PUL
opening round or square parenthesis
PUR
closing round or square parenthesis
PUQ
any quotation mark

Example

 <c c5="PUN">?</c>

Note: Character data. Should only contain a single character or an entity that represents a single character.

Module: analysis
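
Because the c5 attribute of <c> takes one of only four legal values, a processing script can check them cheaply. The following Python sketch shows one way this might be done; the function name is hypothetical, and the code "XXX" in the sample is deliberately invalid and does not occur in the corpus.

```python
import xml.etree.ElementTree as ET

# Legal values of c/@c5 as listed in the declaration above.
LEGAL_PUNCTUATION_CODES = {"PUN", "PUL", "PUR", "PUQ"}

# Constructed sample: the last <c> carries an invalid code on purpose.
SAMPLE = (
    '<s n="1">'
    '<c c5="PUL">(</c>'
    '<w c5="NN1" hw="actor" pos="SUBST">Actor</w>'
    '<c c5="PUR">)</c>'
    '<c c5="XXX">!</c>'
    "</s>"
)


def illegal_punctuation(root):
    """Return (text, code) for every <c> whose c5 value is not legal."""
    return [
        (c.text, c.get("c5"))
        for c in root.iter("c")
        if c.get("c5") not in LEGAL_PUNCTUATION_CODES
    ]


problems = illegal_punctuation(ET.fromstring(SAMPLE))
print(problems)
```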

<catDesc>

(category description) provides a description for one category within the text taxonomies provided in the corpus header.

Declaration

element catDesc { macro.phraseSeq }

Attributes: Global attributes only

Example

 <category xml:id="ACPROSE">   <catDesc>Academic prose</catDesc>  </category>

Module: header

<catRef>

(category reference) provides a list of codes identifying the categories to which this text has been assigned, each code referencing a category element declared in the corpus header.

Declaration

element catRef { attribute targets { data.pointers }, empty }

Attributes: In addition to global attributes
targets
identifies the categories concerned

Example

 <catRef    targets="WRI ALLTIM3 ALLAVA2 ALLTYP3 WRIAAG0 WRIAD1 WRIASE1 WRIATY3 WRIAUD3 WRIDOM7 WRILEV1 WRIMED1 WRIPP5 WRISAM2 WRISTA2 WRITAS3"/>

Module: header
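
The targets attribute of <catRef> holds a whitespace-separated list of category codes. A minimal Python sketch for turning it into a list is given below; it uses the example above as input, and the function name is chosen for this illustration only.

```python
import xml.etree.ElementTree as ET

CATREF = (
    '<catRef targets="WRI ALLTIM3 ALLAVA2 ALLTYP3 WRIAAG0 WRIAD1 WRIASE1 '
    'WRIATY3 WRIAUD3 WRIDOM7 WRILEV1 WRIMED1 WRIPP5 WRISAM2 WRISTA2 WRITAS3"/>'
)


def category_codes(catref):
    """Split catRef/@targets into its individual category codes."""
    return catref.get("targets", "").split()


codes = category_codes(ET.fromstring(CATREF))
print(len(codes), codes[0], codes[-1])
```

Each code in the resulting list is expected to match the xml:id of a <category> element declared in the corpus header.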

<category>

(category) defines a single category within a taxonomy of texts.

Class: att.uniqueId

Declaration

element category { att.uniqueId.attributes, att.uniqueId.attribute.xmlid, catDesc }

Attributes: Global attributes and those inherited from [att.uniqueId ]
att.uniqueId.attribute.xmlid

Example

 <category xml:id="FICTION">   <catDesc>Fiction and verse</catDesc>  </category>

Module: header
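
To resolve category codes to their descriptions, the <category> elements can be indexed by their xml:id values. The Python sketch below assumes the standard library's ElementTree, which exposes xml:id under the XML namespace URI. Placing the two example categories in a single <taxonomy> is an assumption made for the illustration, as are the function names.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the xml:id attribute under this qualified name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The two <category> examples from this document, grouped for illustration.
TAXONOMY = """
<taxonomy>
  <category xml:id="ACPROSE"><catDesc>Academic prose</catDesc></category>
  <category xml:id="FICTION"><catDesc>Fiction and verse</catDesc></category>
</taxonomy>
""".strip()


def category_descriptions(root):
    """Map each category identifier to the text of its <catDesc>."""
    return {
        category.get(XML_ID): category.findtext("catDesc")
        for category in root.iter("category")
    }


def describe(codes, descriptions):
    """Pair each code with its description, or None if it is not declared."""
    return [(code, descriptions.get(code)) for code in codes]


descriptions = category_descriptions(ET.fromstring(TAXONOMY))
print(describe(["FICTION", "UNDECLARED"], descriptions))
```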

<change>

summarizes a particular change or correction made to a particular version of an electronic text which is shared between several researchers.

Class: att.ascribed

Declaration

element change { att.ascribed.attributes, attribute date { data.temporal }?, att.ascribed.attribute.who, macro.phraseSeq }

Attributes: In addition to global attributes and those inherited from [att.ascribed ]
date
supplies the date of the change in standard form, i.e. yyyy-mm-dd.
att.ascribed.attribute.who

Example

 <change date="2006-10-21" who="#OUCS">Tag usage updated for BNC-XML</change>

Note: Changes should be recorded in a consistent order, for example with the most recent first.

Module: header

<classCode>

(classCode) contains the classification code used for this text in some standard classification system.

Declaration

element classCode { attribute scheme { data.pointer }, macro.phraseSeq }

Attributes: In addition to global attributes

Example

 <classCode scheme="#DDC12">410</classCode>

Module: header

<classDecl>

(classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.

Class: model.encodingPart

Declaration

element classDecl { taxonomy+ }

Attributes: Global attributes only

Module: header

<collate>

supplies any additional ICU-conformant collating rules to be used when sorting words in the corpus.

Declaration

element collate { text }

Attributes: Global attributes only

Note: The format for collating rules is defined at http://icu.sourceforge.net/userguide/Collate_Customization.html

Module: module-from-bncxml

<corr>

(correction) contains the correct form of a passage apparently erroneous in the copy text.

Class: att.rendered: att.editLike: model.pPart.edit

Declaration

element corr { att.rendered.attributes, att.editLike.attributes, attribute sic { text }?, att.rendered.attribute.rend, attribute resp { data.pointer }?, ( w | c | mw | gap )* }

Attributes: In addition to global attributes and those inherited from [att.rendered att.editLike ]
sic
contains verbatim text which has been corrected, or an empty string if the correction consists of an addition.
att.rendered.attribute.rend
resp
a code identifying the agency responsible for making the correction.

Example

 <corr sic="existant">   <w c5="AJ0" hw="existent" pos="ADJ">existent </w>  </corr>

Module: core
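
Since <corr> keeps the uncorrected reading in its sic attribute and the corrected reading as content, both can be recovered by a script. The following Python sketch shows one way of doing so; the function name and the keys of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

# The <corr> example from above.
CORR = (
    '<corr sic="existant">'
    '<w c5="AJ0" hw="existent" pos="ADJ">existent </w>'
    "</corr>"
)


def corr_readings(corr):
    """Return the original and corrected readings of a <corr> element."""
    corrected = "".join(
        token.text or "" for token in corr.iter() if token.tag in ("w", "c")
    ).strip()
    sic = corr.get("sic")
    return {
        "original": sic,
        "corrected": corrected,
        # An empty sic value means that the correction is an addition.
        "is_addition": sic == "",
    }


readings = corr_readings(ET.fromstring(CORR))
print(readings)
```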

<creation>

contains information about the creation of a text.

Declaration

element creation { attribute date { data.temporal }?, macro.phraseSeq }

Attributes: In addition to global attributes
date
supplies the year of original composition, if known; or 0000-00-00 if the date is unknown.

Example

 <creation date="0000-00-00"> Origination/creation date not known </creation>

Example

 <creation date="1986"> Original publisher: A &amp; C Black (Publishers) Ltd, London </creation>

Module: header
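
Software reading the date attribute of <creation> needs to treat the placeholder 0000-00-00 as "unknown" rather than as a real date. A minimal Python sketch of this is given below; it assumes that any other value begins with a four-digit year, as in the examples above, and the function name is chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder used in the corpus when the creation date is not known.
UNKNOWN_DATE = "0000-00-00"


def creation_year(creation):
    """Return the year of composition as an int, or None if unknown."""
    date = creation.get("date")
    if date is None or date == UNKNOWN_DATE:
        return None
    # Assumption: known values start with a four-digit year.
    return int(date[:4])


known = ET.fromstring('<creation date="1986"> Original publisher </creation>')
unknown = ET.fromstring(
    '<creation date="0000-00-00"> Origination/creation date not known </creation>'
)
print(creation_year(known), creation_year(unknown))
```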

<date>

contains a date in any format.

Class: model.dateLike: model.publicationStmtPart

Declaration

element date { attribute value { data.temporal }?, macro.phraseSeq }

Attributes: In addition to global attributes
value
supplies a standardized representation of the date

Example

 <date value="1991-02-16">1991-02-16</date>

Example

 <date value="1989">1989</date>

Module: core

<defaultVal>

specifies the default declared value for an attribute.

Declaration

element defaultVal { text }

Attributes: Global attributes only

Example

 <defaultVal>#IMPLIED</defaultVal>

Note: any legal declared value or TEI-defined keyword

Module: tagdocs

<desc>

(description) supplies explanatory text associated with a category or other component defined in the corpus header

Class: model.glossLike: att.translatable

Declaration

element desc { macro.phraseSeq }

Attributes: Global attributes and those inherited from [att.translatable ]

Example

 <desc>contains a brief description of the purpose and application for  an element, attribute, attribute value, class, or entity.</desc>

Note: TEI convention requires that this be expressed as a finite clause, beginning with an active verb.

Module: core

<dialect>

contains an informal description of the regional variety of English used by a participant in a spoken text.

Class: model.persStateLike

Declaration

element dialect { macro.phraseSeq }

Attributes: Global attributes only

Example

 <dialect>Home Counties</dialect>

Module: module-from-bncxml

<distributor>

supplies the name of a person or other agency responsible for the distribution of a text.

Class: model.biblPart: model.publicationStmtPart

Declaration

element distributor { macro.phraseSeq }

Attributes: Global attributes and those inherited from [model.biblPart ]

Example

 <distributor>Distributed under licence by Oxford University Computing Services on behalf of the BNC Consortium.</distributor>

Module: header

<div>

(text division) contains a subdivision of the front, body, or back of a text.

Class: att.rendered

Declaration

element div { att.rendered.attributes, attribute n { text }?, attribute decls { data.pointers }?, attribute level { data.count }?, attribute type { text }?, att.rendered.attribute.rend, ( ( model.divWrapper | model.global )*, ( ( ( model.divPart ), ( model.divPart | model.global )* ) | ( ( model.divPart.spoken ), ( model.divPart.spoken | model.global )* ) )?, div* ) }

Attributes: In addition to global attributes and those inherited from [att.rendered ]
n
for a spoken text, identifies the tape corresponding to this division.
decls
for a spoken text, identifies the declarations (for setting, recording etc.) in the header which apply to this division.
level
specifies the hierarchic level of this division as a number between 1 (outermost or largest division) and 4 (innermost or smallest).
type
identifies the type or function of the division (for a written text). Values are:
advertisement
advertisement section or insert
appendix
appendix
article
single article in a journal
blurb
any kind of promotional front matter
cartoon
cartoon
chapter
chapter of a novel etc.
column
newspaper column, regular feature etc.
compo
composite material
contents
table of contents
front
any kind of front matter
leaflet
free-standing leaflet or pamphlet
paper
an academic paper in a collection
part
subdivision of a chapter
recipe
separate recipe in a cookbook
section
any subdivision
sidebar
sidebar or displayed paragraph e.g. in a news story
story
distinct story in a periodical or collection
subsection
smaller subdivision of any kind
att.rendered.attribute.rend

Example

 <div level="1" n="1" type="chapter">   <head rend="it">    <s n="1">     <w c5="AV0" hw="so" pos="ADV">So </w>     <w c5="PNP" hw="you" pos="PRON">you </w>     <w c5="VVB" hw="want" pos="VERB">want </w>     <w c5="TO0" hw="to" pos="PREP">to </w>     <w c5="VBI" hw="be" pos="VERB">be </w>     <w c5="AT0" hw="an" pos="ART">an </w>     <w c5="NN1" hw="actor" pos="SUBST">Actor</w>     <c c5="PUN">?</c>    </s>   </head>   <p>    <s n="2">     <w c5="PNI" hw="everyone" pos="PRON">Everyone </w>     <w c5="PNQ" hw="who" pos="PRON">who </w>     <w c5="VVZ" hw="want" pos="VERB">wants     ... </w>    </s>   </p>  </div>

Note: any sequence of low-level structural elements, possibly grouped into lower subdivisions.

Module: textstructure
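
The level and type attributes of <div> make it possible to derive an outline of a written text. The Python sketch below is an illustration under stated assumptions: the sample is constructed rather than taken from the corpus, its empty <div> elements are permitted by the declaration above but unlikely in practice, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Constructed sample: two chapters, the first containing two sections.
SAMPLE = """
<wtext>
  <div level="1" n="1" type="chapter">
    <div level="2" type="section"/>
    <div level="2" type="section"/>
  </div>
  <div level="1" n="2" type="chapter"/>
</wtext>
""".strip()


def outline(root):
    """List (level, type) for every <div>, in document order."""
    entries = []
    for div in root.iter("div"):
        level = int(div.get("level"))
        # The schema documentation restricts level to the range 1 to 4.
        if not 1 <= level <= 4:
            raise ValueError("div level out of range: %d" % level)
        entries.append((level, div.get("type")))
    return entries


entries = outline(ET.fromstring(SAMPLE))
for level, div_type in entries:
    print("  " * (level - 1) + str(div_type))
```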

<edition>

(Edition) describes the particularities of one edition of a text.

Class: model.biblPart

Declaration

element edition { attribute n { data.count }?, macro.phraseSeq }

Attributes: In addition to global attributes and those inherited from [model.biblPart ]
n
supplies an identifying number for the edition

Example

 <editionStmt>   <edition>BNC XML Edition, December 2006</edition>  </editionStmt>

Module: header

<editionStmt>

(edition statement) groups information relating to one edition of a text.

Declaration

element editionStmt { edition }

Attributes: Global attributes only

Example

 <editionStmt>   <edition>BNC XML Edition, December 2006</edition>  </editionStmt>

Module: header

<editor>

(editor) contains a secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization (or of several such) acting as editor, compiler, translator, etc.

Class: model.respLike

Declaration

element editor { attribute n { text }?, macro.phraseSeq }

Attributes: In addition to global attributes
n
supplies a number for the editor where multiple editors are specified for a single text

Example

 <editor n="2">Boileau, John</editor>

Module: core

<editorialDecl>

(editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text.

Class: model.encodingPart: att.declarable

Declaration

element editorialDecl { ( text | para )* }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <editorialDecl>   <para>Material included in the BNC was produced by   several different agencies ...</para>  </editorialDecl>

Note: This element is supplied in the BNC corpus header only

Module: header

<elementPolicy>

specifies the Xaira indexing policy to be used for one or more elements.

Class: att.identifiable

Declaration

element elementPolicy { att.identifiable.attributes, att.identifiable.attribute.ident, att.identifiable.attribute.ns, attribute type { "none" | "children" | "content" | "markup" }?, nameList? }

Attributes: In addition to global attributes and those inherited from [att.identifiable ]
att.identifiable.attribute.ident
att.identifiable.attribute.ns

Module: module-from-bncxml

<encodingDesc>

(Encoding description) documents the relationship between an electronic text and the source or sources from which it was derived.

Class: model.headerPart

Declaration

element encodingDesc { model.encodingPart* }

Attributes: Global attributes only

Example

 <encodingDesc>   <projectDesc>    <para>The British National Corpus (BNC) Consortium was formed in     1990...</para>   </projectDesc>   <samplingDecl>    <para>Definitive information on the sampling policies...     </para>   </samplingDecl>   <editorialDecl>    <para>Material included in the BNC was produced by    several different agencies ...</para>   </editorialDecl>   <refsDecl>    <para>Canonical references to the BNC should ...</para>   </refsDecl>   <classDecl>    <taxonomy xml:id="DLee">     <desc>David Lee's register and domain classification</desc>    </taxonomy>...    </classDecl>   <xairaSpecification>   ...</xairaSpecification>  </encodingDesc>

Note: Used in corpus header only

Module: header

<event>

(Event) any phenomenon or occurrence, not necessarily vocalized or communicative, for example incidental noises or other events affecting communication.

Class: model.divPart.spoken: att.timed: att.ascribed

Declaration

element event { att.timed.attributes, att.ascribed.attributes, attribute desc { text }?, att.timed.attribute.dur, empty }

Attributes: In addition to global attributes and those inherited from [att.timed att.ascribed ]
desc
provides a brief description of the event
att.timed.attribute.dur

Example

 <event desc="music playing"/>

Module: spoken

<extent>

specifies the approximate size of the text, in orthographic words, w elements, and s elements

Class: model.biblPart

Declaration

element extent { macro.phraseSeq }

Attributes: Global attributes and those inherited from [model.biblPart ]

Example

 <extent>432434 tokens; 432859 w-units; 26215 s-units</extent>

Module: header
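
The content of <extent> is free text, but in the example above it follows a regular "number label" pattern separated by semicolons. Assuming that pattern holds, which the declaration itself does not guarantee, the counts might be extracted with a regular expression as in this Python sketch. The function name is hypothetical.

```python
import re

# Matches a run of digits followed by a label such as "tokens" or "w-units".
COUNT_PATTERN = re.compile(r"(\d+)\s+([\w-]+)")


def parse_extent(text):
    """Turn '432434 tokens; 432859 w-units; ...' into a dict of ints."""
    return {label: int(number) for number, label in COUNT_PATTERN.findall(text)}


counts = parse_extent("432434 tokens; 432859 w-units; 26215 s-units")
print(counts)
```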

<fileDesc>

(File Description) contains a full bibliographic description of an electronic file.

Declaration

element fileDesc { macro.fileDescPart, sourceDesc+ }

Attributes: Global attributes only

Note: The major source of information for those seeking to create a catalogue entry or bibliographic citation for an electronic file. As such, it provides a title and statements of responsibility together with details of the publication or distribution of the file, of any series to which it belongs, and detailed bibliographic notes for matters not addressed elsewhere in the header. It also contains a full bibliographic description for the source or sources from which the electronic text was derived.

Module: header

<gap>

(omitted material) indicates a point where material has been omitted from the transcription.

Class: model.global.edit: att.editLike

Declaration

element gap { att.editLike.attributes, attribute desc { text }?, attribute reason { text }?, att.editLike.attribute.resp, empty }

Attributes: In addition to global attributes and those inherited from [att.editLike ]
desc
briefly describes the material which has been omitted.
reason
gives further details of the reason for omission.
att.editLike.attribute.resp

Example

 <gap desc="address" resp="OUP"/>

Module: core

<gi>

(generic identifier) contains the name (generic identifier) of an element.

Class: att.identifiable

Declaration

element gi { att.identifiable.attributes, att.identifiable.attribute.ident, att.identifiable.attribute.ns, text }

Attributes: Global attributes and those inherited from [att.identifiable ]
att.identifiable.attribute.ident
att.identifiable.attribute.ns

Module: tagdocs

<head>

(heading) contains any type of heading, for example the title of a section or a poem.

Class: att.rendered: model.divWrapper

Declaration

element head { att.rendered.attributes, attribute type { "MAIN" | "SUB" | "BYLINE" }?, att.rendered.attribute.rend, ( s | gap | pb )+ }

Attributes: In addition to global attributes and those inherited from [att.rendered ]
type
Legal values are:
MAIN
a major heading.
SUB
any sub-heading.
BYLINE
a sub-heading providing the name of a journalist or other source of a newspaper report.
att.rendered.attribute.rend

Example

 <head rend="ub" type="MAIN">   <s n="93">    <w c5="VDB" hw="do" pos="VERB">Do </w>    <w c5="PNP" hw="i" pos="PRON">I </w>    <w c5="VVI" hw="need" pos="VERB">need </w>    <w c5="DT0" hw="any" pos="ADJ">any </w>    <w c5="NN1" hw="training" pos="SUBST">training</w>    <c c5="PUN">?</c>   </s>  </head>

Note: The <head> element is used for headings at all levels; software which treats (e.g.) chapter headings, section headings, and list titles differently must determine the proper processing of a <head> element based on its structural position. A <head> occurring as the first element of a list is the title of that list; one occurring as the first element of a <div> is the title of that chapter or section.

Module: core
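
Headings can be selected by their type attribute and reduced to plain text by concatenating the content of their <w> and <c> children. The following Python sketch illustrates this with the example above; the function names are introduced for the illustration only.

```python
import xml.etree.ElementTree as ET

# The <head> example from above.
HEAD = """
<head rend="ub" type="MAIN">
  <s n="93">
    <w c5="VDB" hw="do" pos="VERB">Do </w>
    <w c5="PNP" hw="i" pos="PRON">I </w>
    <w c5="VVI" hw="need" pos="VERB">need </w>
    <w c5="DT0" hw="any" pos="ADJ">any </w>
    <w c5="NN1" hw="training" pos="SUBST">training</w>
    <c c5="PUN">?</c>
  </s>
</head>
""".strip()


def plain_text(element):
    """Concatenate the text of all <w> and <c> descendants."""
    return "".join(
        token.text or "" for token in element.iter() if token.tag in ("w", "c")
    ).strip()


def headings_of_type(root, head_type):
    """Return the plain text of every <head> with the given type."""
    return [
        plain_text(head)
        for head in root.iter("head")
        if head.get("type") == head_type
    ]


main_headings = headings_of_type(ET.fromstring(HEAD), "MAIN")
print(main_headings)
```

Only the text of <w> and <c> is used, so the whitespace between elements in the serialized XML does not affect the result.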

<hi>

(highlighted) marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made.

Class: att.rendered: model.hiLike

Declaration

element hi { att.rendered.attributes, att.rendered.attribute.rend, macro.paraContent }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Example

 <s n="2211">   <hi rend="it">    <w c5="NN1" hw="apple" pos="SUBST">Apple </w>   </hi>   <w c5="VBZ" hw="be" pos="VERB">is </w>   <w c5="PRP" hw="to" pos="PREP">to </w>   <hi rend="it">    <w c5="NN0" hw="fruit" pos="SUBST">fruit </w>   </hi>   <w c5="CJS-PRP" hw="as" pos="CONJ">as </w>   <hi rend="it">    <w c5="NN1" hw="dog" pos="SUBST">dog </w>   </hi>   <w c5="VBZ" hw="be" pos="VERB">is </w>   <w c5="PRP" hw="to" pos="PREP">to </w>   <hi rend="it">    <w c5="ZZ0" hw="x" pos="SUBST">X </w>   </hi>   <c c5="PUN">.</c>  </s>

Module: core

<ident>

contains an identifier or name for an object of some kind in a formal language

Class: att.identifiable

Declaration

element ident { att.identifiable.attributes, att.identifiable.attribute.ident, att.identifiable.attribute.ns, text }

Attributes: Global attributes and those inherited from [att.identifiable ]
att.identifiable.attribute.ident
att.identifiable.attribute.ns

Note: In running prose, this element may be used for any kind of identifier in any formal language.

Module: tagdocs

<idno>

(identifying number) supplies an identifying code for a text.

Class: model.biblPart: model.publicationStmtPart

Declaration

element idno { attribute type { data.enumerated }?, text }

Attributes: In addition to global attributes and those inherited from [model.biblPart ]
type
categorizes the code number used.

Example

 <idno type="bnc">KD7</idno>  <idno type="old"> XMa0KP </idno>

Module: header
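
When a text carries several <idno> elements, indexing them by their type attribute gives direct access to each identifier. The Python sketch below shows one way of doing this; the wrapping <publicationStmt> is an assumption suggested by the class membership listed above, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The <idno> examples from above, wrapped to give a single root element.
FRAGMENT = (
    "<publicationStmt>"
    '<idno type="bnc">KD7</idno>'
    '<idno type="old"> XMa0KP </idno>'
    "</publicationStmt>"
)


def identifiers_by_type(root):
    """Map idno/@type to the identifier, with surrounding space removed."""
    return {
        idno.get("type"): (idno.text or "").strip()
        for idno in root.iter("idno")
    }


identifiers = identifiers_by_type(ET.fromstring(FRAGMENT))
print(identifiers)
```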

<imprint>

groups information relating to the publication or distribution of a bibliographic item.

Declaration

element imprint { attribute n { text }?, ( pubPlace | publisher | date | pp )* }

Attributes: In addition to global attributes
n
internal identifier

Example

 <imprint n="JOHNMU1">   <publisher>John Murray (Publishers) Ltd</publisher>   <pubPlace>London</pubPlace>   <date value="1989">1989</date>  </imprint>

Module: core

<item>

contains one component of a list.

Class: att.rendered

Declaration

element item { att.rendered.attributes, att.rendered.attribute.rend, ( model.pLike | model.qLike | model.listLike | s | model.global )+ }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Example

 <list>   <item>    <s n="516">     <w c5="VVB" hw="substitute" pos="VERB">Substitute </w>     <w c5="AJ0-NN1" hw="plain" pos="ADJ">plain </w>     <w c5="NN2" hw="biscuit" pos="SUBST">biscuits </w>     <w c5="PRP" hw="for" pos="PREP">for </w>     <w c5="AJ0-VVN" hw="filled" pos="ADJ">filled </w>     <w c5="CJC" hw="or" pos="CONJ">or </w>     <w c5="AJ0" hw="chocolate-covered" pos="ADJ">chocolate-covered </w>     <w c5="NN2" hw="one" pos="SUBST">ones</w>...</s>   </item>   <item>    <s n="517">     <w c5="VVB" hw="try" pos="VERB">Try </w>     <w c5="VVG" hw="eat" pos="VERB">eating </w>     <w c5="AT0" hw="a" pos="ART">a </w>     <w c5="AJ0" hw="small" pos="ADJ">small </w>     <w c5="NN1" hw="amount" pos="SUBST">amount </w>...</s>   </item>  </list>

Module: core

<joinTo>

supplies a list of element names carrying an attribute which has been specified with the Xaira "joinTo" indexing policy.

Declaration

element joinTo { gi+ }

Attributes: Global attributes only

Module: module-from-bncxml

<keywords>

(Keywords) contains a list of keywords or phrases identifying the topic or nature of a text.

Declaration

element keywords { attribute scheme { data.pointer }?, term+ }

Attributes: In addition to global attributes
scheme
identifies the controlled vocabulary within which the set of keywords concerned is defined.

Example

 <keywords scheme="COPAC">   <term>Fluid dynamics</term>   <term> Fluids. Dynamics</term>  </keywords>  <keywords/>

Module: header

<l>

(verse line) contains a single, possibly incomplete, line of verse.

Class: att.rendered: model.divPart: model.lLike

Declaration

element l { att.rendered.attributes, att.rendered.attribute.rend, ( s | gap | pb )+ }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Example

 <l>   <s n="3287">    <w c5="ORD" hw="next" pos="ADJ">Next </w>    <w c5="NN1" hw="day" pos="SUBST">Day </w>    <w c5="PRP" hw="at" pos="PREP">at </w>    <w c5="CRD" hw="six" pos="ADJ">Six </w>    <w c5="CJS" hw="before" pos="CONJ">before </w>    <w c5="AT0" hw="the" pos="ART">the </w>    <w c5="NN1" hw="gate" pos="SUBST">Gate </w>    <w c5="VVZ" hw="appear" pos="VERB">appears</w>    <c c5="PUN">,</c>   </s>  </l>  <l>   <s n="3288">    <w c5="AT0" hw="the" pos="ART">The </w>    <w c5="NN1" hw="wretch" pos="SUBST">Wretch </w>    <w c5="VVN" hw="divide" pos="VERB">divided </w>    <w c5="PRP" hw="by" pos="PREP">by </w>    <w c5="DPS" hw="he" pos="PRON">his </w>    <w c5="NN2" hw="hope" pos="SUBST">Hopes </w>    <w c5="CJC" hw="and" pos="CONJ">and </w>    <w c5="NN2-VVZ" hw="fear" pos="SUBST">Fears</w>    <c c5="PUN">.</c>   </s>  </l>

Module: core

<label>

contains the label associated with an item in a list; in glossaries, marks the term being defined.

Class: att.rendered

Declaration

element label { att.rendered.attributes, att.rendered.attribute.rend, ( s | gap | pb )+ }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Example

 <label>   <s n="8176">    <w c5="NN1-VVB" hw="amount" pos="SUBST">Amount</w>    <c c5="PUN">:</c>   </s>  </label>  <item>   <s n="8177">    <w c5="CRD" hw="52153" pos="ADJ">52153 </w>    <w c5="NN2" hw="pound" pos="SUBST">Pounds</w>   </s>  </item>  <label>   <s n="8178">    <w c5="NN1-VVB" hw="date" pos="SUBST">Date </w>    <w c5="NN1" hw="award" pos="SUBST">Award </w>    <w c5="VVD" hw="begin" pos="VERB">Began</w>    <c c5="PUN">:</c>   </s>  </label>  <item>   <s n="8179">    <w c5="CRD" hw="01" pos="ADJ">01 </w>    <w c5="NP0" hw="january" pos="SUBST">January </w>    <w c5="CRD" hw="1992" pos="ADJ">1992</w>   </s>  </item>

Module: core
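
In a list built from alternating <label> and <item> elements, each label applies to the item that follows it. The Python sketch below pairs them up, using the example above wrapped in a <list>; the function names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# The <label> example from above, wrapped in the <list> that would contain it.
LIST = """
<list>
  <label><s n="8176"><w c5="NN1-VVB" hw="amount" pos="SUBST">Amount</w><c c5="PUN">:</c></s></label>
  <item><s n="8177"><w c5="CRD" hw="52153" pos="ADJ">52153 </w><w c5="NN2" hw="pound" pos="SUBST">Pounds</w></s></item>
  <label><s n="8178"><w c5="NN1-VVB" hw="date" pos="SUBST">Date </w><w c5="NN1" hw="award" pos="SUBST">Award </w><w c5="VVD" hw="begin" pos="VERB">Began</w><c c5="PUN">:</c></s></label>
  <item><s n="8179"><w c5="CRD" hw="01" pos="ADJ">01 </w><w c5="NP0" hw="january" pos="SUBST">January </w><w c5="CRD" hw="1992" pos="ADJ">1992</w></s></item>
</list>
""".strip()


def plain_text(element):
    """Concatenate the text of all <w> and <c> descendants."""
    return "".join(
        token.text or "" for token in element.iter() if token.tag in ("w", "c")
    ).strip()


def label_item_pairs(list_element):
    """Pair each <item> with the <label> preceding it, or None if absent."""
    pairs = []
    pending_label = None
    for child in list_element:
        if child.tag == "label":
            pending_label = plain_text(child)
        elif child.tag == "item":
            pairs.append((pending_label, plain_text(child)))
            pending_label = None
    return pairs


pairs = label_item_pairs(ET.fromstring(LIST))
print(pairs)
```

For a list consisting of items only, every pair has None as its first member.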

<labelGen>

specifies the label to be generated for the parent reference.

Declaration

element labelGen { attribute change { "onStart" | "within" }?, text }

Attributes: In addition to global attributes

Module: module-from-bncxml

<langUsage>

(language usage) describes the languages, sublanguages, registers, dialects etc. represented within a text.

Class: model.profileDescPart: att.declarable

Declaration

element langUsage { language+ }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <langUsage>   <language ident="en-GB">The language of the British National Corpus is modern British English. ...</language>  </langUsage>

Note: Appears only in the corpus header.

Module: header

<language>

characterizes a single language or sublanguage used within a text.

Declaration

element language { attribute ident { data.language }, macro.phraseSeq }

Attributes: In addition to global attributes
ident
Supplies a language code constructed as defined in RFC 3066 (or its successor) which is used to identify the language documented by this element, and which is referenced by the global xml:lang attribute.

Example

 <language ident="en-GB">The language of the British National Corpus is modern British English. ...</language>

Note: Particularly for sublanguages, an informal prose characterization should be supplied as content for the element.

Module: header

<lg>

(line group) contains a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.

Class: model.qLike: model.divPart

Declaration

Attributes: Global attributes only

Example

 <lg>   <l>    <s n="463">     <w c5="AV0" hw="too" pos="ADV">Too </w>     <w c5="AJ0-VVD" hw="jellied" pos="ADJ">jellied</w>     <c c5="PUN">, </c>     <w c5="AJ0" hw="viscous" pos="ADJ">viscous</w>     <c c5="PUN">, </c>     <w c5="VVG" hw="float" pos="VERB">floating </w>     <w c5="AT0" hw="a" pos="ART">a </w>     <w c5="NN1" hw="condition" pos="SUBST">condition</w>    </s>   </l>   <l>    <s n="464">     <w c5="TO0" hw="to" pos="PREP">to </w>     <w c5="VVI" hw="inspire" pos="VERB">inspire </w>     <w c5="DT0" hw="more" pos="ADJ">more </w>     <w c5="NN1" hw="action" pos="SUBST">action </w>     <w c5="CJS" hw="than" pos="CONJ">than </w>     <w c5="AT0" hw="a" pos="ART">a </w>     <w c5="NN1" hw="sigh" pos="SUBST">sigh </w>     <c c5="PUN">—</c>    </s>   </l>...</lg>

Note: contains verse lines or nested line groups only, possibly prefixed by a heading.

Module: core

<list>

contains any sequence of items organized as a list.

Class: att.rendered: model.listLike: model.divPart

Declaration

element list { att.rendered.attributes, att.rendered.attribute.rend, ( ( model.divWrapper | model.global )*, ( ( item, model.global* )+ | ( label, model.global*, item, model.global* )+ ), model.divWrapper.bottom* ) }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Module: core

<locale>

(locale) contains a brief informal description of the nature of a place, for example a room, a restaurant, a park bench, etc.

Class: model.settingPart

Declaration

element locale { macro.phraseSeq }

Attributes: Global attributes only

Example

 <locale>a fashionable restaurant</locale>

Module: corpus

<mw>

contains a multi-word unit as identified by CLAWS, that is, a sequence of individual tokens which function as a single unit and can be given a single part of speech code.

Class: model.segLike: att.c5coded

Declaration

element mw { model.segLike.attributes, att.c5coded.attributes, att.c5coded.attribute.c5, w+ }

Attributes: Global attributes and those inherited from [att.c5coded ]
att.c5coded.attribute.c5

Example

 <mw c5="PRP">   <w c5="PRP" hw="in" pos="PREP">in </w>   <w c5="NN1" hw="response" pos="SUBST">response </w>   <w c5="PRP" hw="to" pos="PREP">to </w>  </mw>

Note: In CLAWS output the components of a <mw> are given ‘ditto’ tags inherited from the parent <mw>. In BNC they have been given the same code as elsewhere in the corpus.

Module: module-from-bncxml
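
Applications which count or search tokens may wish to treat a <mw> as a single unit carrying the code of the <mw> itself rather than the codes of its component words. The following Python sketch shows one possible approach; the sentence wrapped around the <mw> example is constructed for the illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The <mw> example from above inside a constructed <s> element.
SENTENCE = """
<s n="1">
  <mw c5="PRP">
    <w c5="PRP" hw="in" pos="PREP">in </w>
    <w c5="NN1" hw="response" pos="SUBST">response </w>
    <w c5="PRP" hw="to" pos="PREP">to </w>
  </mw>
  <w c5="DT0" hw="that" pos="ADJ">that </w>
  <c c5="PUN">.</c>
</s>
""".strip()


def tokens(sentence):
    """Return (text, c5) pairs, treating each <mw> as one token."""
    result = []
    for child in sentence:
        if child.tag == "mw":
            text = "".join(w.text or "" for w in child.iter("w")).strip()
            result.append((text, child.get("c5")))
        elif child.tag in ("w", "c"):
            result.append(((child.text or "").strip(), child.get("c5")))
    return result


sentence_tokens = tokens(ET.fromstring(SENTENCE))
print(sentence_tokens)
```

This sketch looks only at the direct children of <s>; tokens nested inside other elements such as <hi> or <corr> would need additional handling.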

<name>

(name, proper noun) contains a proper noun or noun phrase.

Class: model.nameLike.agent: att.naming

Declaration

element name { macro.phraseSeq }

Attributes: Global attributes and those inherited from [att.naming ]

Example

 <name>Longman </name>

Note: This element is used only in the header.

Module: core

<nameList>

supplies a list of element names or attribute identifiers

Declaration

element nameList { ( gi | ident )+ }

Attributes: Global attributes only

Module: module-from-bncxml

<namespace>

supplies the formal name of the namespace to which the elements documented by its children belong.

Declaration

element namespace { attribute name { data.namespace }, tagUsage+ }

Attributes: In addition to global attributes
name
the full formal name of the namespace concerned.

Note: This element is not used in the current release of the BNC: all elements belong to the empty namespace.

Module: header

<note>

contains a note or annotation.

Class: model.divPart: att.placement

Declaration

element note { attribute place { text }?, attribute n { text }?, s+ }

Attributes: In addition to global attributes and those inherited from [att.placement ]
place
Values are:
FOOT
footnote
SIDE
side note
END
endnote
n
internal identifier

Example

 <note place="SIDE">   <s n="477">    <w c5="AT0" hw="the" pos="ART">The </w>    <w c5="AJ0-NN1" hw="short" pos="ADJ">short </w>    <w c5="VBZ" hw="be" pos="VERB">is </w>    <w c5="AT0" hw="a" pos="ART">a </w>    <w c5="NN1" hw="film" pos="SUBST">film </w>    <w c5="PRP" hw="about" pos="PREP">about </w>    <w c5="NN1-VVG" hw="sailing" pos="SUBST">sailing</w>    <c c5="PUN">.</c>   </s>...</note>

Module: core

<occupation>

contains an informal description of a person's trade, profession or occupation.

Class: model.persStateLike

Declaration

element occupation { macro.phraseSeq }

Attributes: Global attributes only

Example

 <occupation>student</occupation>

Module: namesdates

<p>

(paragraph) marks paragraphs in prose.

Class: att.rendered: model.pLike: model.divPart

Declaration

element p { att.rendered.attributes, attribute type { text }?, att.rendered.attribute.rend, macro.paraContent }

Attributes: In addition to global attributes and those inherited from [att.rendered ]
type
indicates how the paragraph is displayed. Values are:
caption
the paragraph is displayed as a caption
caption:byline
the displayed paragraph contains a byline
caption:display
the paragraph is displayed as a floating caption
caption:attached
the paragraph is displayed as an attached caption
att.rendered.attribute.rend

Example

 <p type="caption">   <s n="7234">    <w c5="VVB" hw="brave" pos="VERB">BRAVE</w>    <c c5="PUN">: </c>    <w c5="NP0" hw="louise" pos="SUBST">Louise</w>   </s>  </p>

Example

 <p>   <s n="7244">    <w c5="AJ0" hw="jobless" pos="ADJ">JOBLESS </w>    <w c5="NP0" hw="darren" pos="SUBST">Darren </w>    <w c5="NP0" hw="st" pos="SUBST">St    </w>    <w c5="NP0" hw="john" pos="SUBST">John </w>    <w c5="VVD" hw="gobble" pos="VERB">gobbled </w>    <w c5="NN0" hw="5lb" pos="SUBST">5lb </w>    <w c5="PRF" hw="of" pos="PREP">of </w>    <w c5="NN2" hw="strawberry" pos="SUBST">strawberries </w>    <w c5="PRP" hw="in" pos="PREP">in </w>    <w c5="CRD" hw="two" pos="ADJ">two </w>    <w c5="NN2" hw="pint" pos="SUBST">pints </w>    <w c5="PRF" hw="of" pos="PREP">of </w>    <w c5="AJ0" hw="chilli-flavoured" pos="ADJ">chilli-flavoured </w>    <w c5="NN1" hw="gravy" pos="SUBST">gravy </w>    <w c5="TO0" hw="to" pos="PREP">to </w>    <w c5="VVI" hw="raise" pos="VERB">raise </w>    <w c5="NN0" hw="£450" pos="UNC">£450 </w>    <w c5="PRP" hw="for" pos="PREP">for </w>    <w c5="NN1" hw="charity" pos="SUBST">charity </w>    <w c5="PRP" hw="at" pos="PREP">at </w>    <w c5="NP0" hw="henley" pos="SUBST">Henley</w>    <c c5="PUN">, </c>    <w c5="NP0" hw="oxon" pos="SUBST">Oxon</w>    <c c5="PUN">.</c>   </s>  </p>

Module: core

<para>

contains descriptive text appearing within components of a TEI header

Declaration

element para { ( text | hi | list )* }

Attributes: Global attributes only

Example

 <para>For information, the conditions of the Standard License Agreement are as  follows:</para>

Module: module-from-bncxml

<particDesc>

(participation description) describes the identifiable speakers, voices, or other participants in a linguistic interaction.

Class: model.profileDescPart: att.declarable

Declaration

element particDesc { attribute n { text }?, person+ }

Attributes: In addition to global attributes and those inherited from [att.declarable ]
n
internal identifier

Example

 <particDesc n="C114">   <person     ageGroup="Ag4"     xml:id="PS1US"     role="unspecified"     sex="m"     soc="UU"     dialect="NONE"     educ="X">    <age>45</age>    <persName>Terry</persName>    <occupation>british rail employee</occupation>   </person>...</particDesc>

Module: corpus

<pause>

a pause either between or within utterances.

Class: model.divPart.spoken: att.timed

Declaration

element pause { att.timed.attributes, att.timed.attribute.dur, empty }

Attributes: Global attributes and those inherited from [att.timed ]
att.timed.attribute.dur

Example

 <s n="199">   <w c5="UNC" hw="erm" pos="UNC">Erm </w>   <pause dur="10"/>   <w c5="AV0" hw="right" pos="ADV">right </w>   <w c5="AV0" hw="now" pos="ADV">now</w>   <c c5="PUN">, </c>...</s>

Module: spoken

<pb>

(page break) marks the boundary between one page of a text and the next in a standard reference system.

Class: model.milestoneLike

Declaration

element pb { attribute n { text }?, empty }

Attributes: In addition to global attributes
n
gives the number of the page beginning here

Example

 <pb n="15"/>

Module: core

<persName>

(personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including any or all of the person's forenames, surnames, honorifics, added names, etc.

Class: model.persStateLike: model.nameLikeAgent

Declaration

element persName { macro.phraseSeq }

Attributes: Global attributes and those inherited from [model.nameLikeAgent ]

Example

 <persName>Norman</persName>

Module: namesdates

<persNote>

contains any additional information supplied about a participant in a spoken text

Class: model.persStateLike

Declaration

element persNote { macro.phraseSeq }

Attributes: Global attributes only

Example

 <person ageGroup="X">   <persNote>May well be an actor portraying a Davidian</persNote>  </person>

Module: module-from-bncxml

<person>

provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source.

Class: att.uniqueId

Declaration

element person { att.uniqueId.attributes, attribute ageGroup { text }?, attribute dialect { text }?, attribute firstLang { "XX-XXX" | "DE-DEU" | "FR-FRA" | "EN-GBR" | "EN-USA" | "XX-IND" }?, attribute n { text }?, attribute educ { "Ed0" | "Ed1" | "Ed4" | "X" }?, attribute soc { "AB" | "C1" | "C2" | "DE" | "UU" }?, attribute sex { "m" | "f" | "u" }?, attribute role { text }?, att.uniqueId.attribute.xmlid, ( model.pLike+ | model.personPart* ) }

Attributes: In addition to global attributes and those inherited from [att.uniqueId ]
ageGroup
specifies the age group to which the participant belongs. Values are:
Ag0
Under 15 years
Ag1
15 to 24 years
Ag2
25 to 34 years
Ag3
35 to 44 years
Ag4
45 to 59 years
Ag5
Over 59 years
X
Unknown
dialect
specifies the dialect or accent of a participant's speech, as identified by the respondent. Values are:
CAN
Canadian
NONE
No accent recorded
XDE
German
XEA
East Anglian
XFR
French
XHC
Home Counties
XHM
Humberside
XIR
Irish
XIS
Indian subcontinent
XLC
Lancashire
XLO
London
XMC
Central Midlands
XMD
Merseyside
XME
North-east Midlands
XMI
Midlands
XMS
South Midlands
XMW
North-west Midlands
XNC
Central Northern England
XNE
North-east England
XNO
Northern England
XOT
Other or unidentifiable
XSD
Scottish
XSL
Lower south-west England
XSS
Central south-west England
XSU
Upper south-west England
XUR
European
XUS
American (US)
XWA
Welsh
XWE
West Indian
firstLang
specifies the country of origin of the participant, as identified by the respondent. Legal values are:
XX-XXX
Unknown
DE-DEU
German
FR-FRA
French
EN-GBR
British English
EN-USA
North American English
XX-IND
Unknown Indian language
n
internal identifier
educ
specifies the age at which the participant ceased full-time education. Legal values are:
Ed0
Still in education
Ed1
Left school aged 14 or under
Ed4
Education continued until age 19 or over
X
Unknown
soc
specifies the social class of the participant. Legal values are:
AB
Higher management: administrative or professional
C1
Lower management: supervisory or clerical
C2
Skilled manual
DE
Semi-skilled or unskilled
UU
Social class unknown
sex
specifies the sex of the participant. Legal values are:
m
male
f
female
u
unknown
role
describes the relationship or role of this participant with respect to the respondent.
att.uniqueId.attribute.xmlid

Example

 <person    ageGroup="Ag4"    xml:id="PS1V0"    role="unspecified"    sex="f"    soc="UU"    dialect="NONE"    educ="X">   <age>55</age>   <persName>Nola</persName>   <occupation>british rail employee</occupation>  </person>

Note: May contain either a prose description organized as paragraphs, or a sequence of more specific demographic elements drawn from the model.personPart class.

Module: namesdates
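
Because the <person> declaration enumerates the legal values of firstLang, educ, soc, and sex, a simple check against those lists is possible without a full schema validator. The Python sketch below shows one such check. The function name invalid_attributes and both sample records are hypothetical; only the value lists are copied from the declaration above.

```python
import xml.etree.ElementTree as ET

# Enumerated values copied from the <person> declaration.
ENUMERATED = {
    "firstLang": {"XX-XXX", "DE-DEU", "FR-FRA", "EN-GBR", "EN-USA", "XX-IND"},
    "educ": {"Ed0", "Ed1", "Ed4", "X"},
    "soc": {"AB", "C1", "C2", "DE", "UU"},
    "sex": {"m", "f", "u"},
}


def invalid_attributes(person):
    """Return {attribute: value} for values outside the enumerated lists."""
    problems = {}
    for name, allowed in ENUMERATED.items():
        value = person.get(name)
        # All four attributes are optional, so absence is not an error.
        if value is not None and value not in allowed:
            problems[name] = value
    return problems


# Hypothetical records used only to exercise the check.
good = ET.fromstring('<person sex="f" soc="UU" educ="X" role="unspecified"/>')
bad = ET.fromstring('<person sex="female" soc="C1"/>')

print(invalid_attributes(good))
print(invalid_attributes(bad))
```

Attributes declared as free text (ageGroup, dialect, n, role) are deliberately left out of the check, since the declaration does not constrain them.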

<placeName>

(place name) contains an absolute or relative place name.

Class: model.settingPart

Declaration

element placeName { macro.phraseSeq }

Attributes: Global attributes only

Example

 <placeName>North Yorkshire: York </placeName>

Module: namesdates

<pp>

supplies page numbers for a bibliographic citation.

Class: model.biblPart

Declaration

element pp { macro.phraseSeq }

Attributes: Global attributes and those inherited from [model.biblPart ]

Example

 <bibl>   <title>Misfortunes of Nigel. </title> ... <pp>67-173</pp>  </bibl>

Module: module-from-bncxml

<profileDesc>

(text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.

Class: model.headerPart

Declaration

element profileDesc { creation?, model.profileDescPart* }

Attributes: Global attributes only

Example

 <profileDesc>   <creation date="1992"/>   <textClass>    <catRef      targets="WRI ALLTIM3 ALLAVA2 ALLTYP5 WRIAAG0 WRIAD0 WRIASE0 WRIATY2 WRIAUD3 WRIDOM5 WRILEV3 WRIMED3 WRIPP5 WRISAM0 WRISTA0 WRITAS0"/>    <classCode scheme="DLEE">W hansard</classCode>    <keywords>     <term> Parliamentary debates </term>    </keywords>   </textClass>  </profileDesc>

Module: header

<projectDesc>

(project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.

Class: model.encodingPart: att.declarable

Declaration

element projectDesc { para+ }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <projectDesc>   <para>The British National Corpus (BNC) Consortium was formed in 1990, and   started work in 1991 on the three-year task of producing a   hundred-million word corpus of modern British English for use in   commercial and academic research. The first edition was published in   1994.</para>  ...</projectDesc>

Module: header

<pubPlace>

contains the name of the place where a bibliographic item was published.

Class: att.naming: model.imprintPart: model.publicationStmtPart

Declaration

element pubPlace { macro.phraseSeq }

Attributes: Global attributes and those inherited from [att.naming ]

Module: core

<publicationStmt>

(publication statement) groups information concerning the publication or distribution of an electronic or other text.

Declaration

element publicationStmt { model.pLike+ | model.publicationStmtPart+ }

Attributes: Global attributes only

Example

 <publicationStmt>   <distributor>    <availability> This material is protected by international copyright laws and may not be copied or redistributed in any way.    Consult the BNC Web Site at http://www.natcorp.ox.ac.uk for full licencing and distribution conditions.</availability>    <idno type="bnc">HHV</idno>    <idno type="old"> HansrA </idno>   </distributor>  </publicationStmt>

Module: header

<publisher>

provides the name of the organization responsible for the publication or distribution of a bibliographic item.

Class: model.imprintPart: model.publicationStmtPart

Declaration

element publisher { macro.phraseSeq }

Attributes: Global attributes only

Example

 <imprint>   <pubPlace>Oxford</pubPlace>   <publisher>Clarendon Press</publisher>   <date>1987</date>  </imprint>

Module: core

<quote>

(quotation) contains a phrase or passage attributed by the narrator or author to some agency external to the text.

Class: model.qLike: model.divPart: att.rendered

Declaration

element quote { att.rendered.attributes, att.rendered.attribute.rend, ( bibl?, model.divPart+, bibl? ) }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Example

 <quote>   <p>    <s n="1426">     <w c5="NN1-NP0" hw="thrift" pos="SUBST">Thrift </w>     <w c5="VHZ" hw="have" pos="VERB">has </w>     <w c5="VVN" hw="go" pos="VERB">gone </w>     <mw c5="PRP">      <w c5="AVP" hw="out" pos="ADV">out </w>      <w c5="PRF" hw="of" pos="PREP">of </w>     </mw>     <w c5="NN1" hw="fashion" pos="SUBST">fashion</w>     <c c5="PUN">.</c>    </s>   </p>  </quote>

Note: Any bibliographic source or reference provided for the quotation may be included within the quote element.

Module: core

<recording>

(recording event) details of an audio or video recording event used as the source of a spoken text, either directly or from a public broadcast.

Class: att.uniqueId

Declaration

element recording { att.uniqueId.attributes, attribute date { data.temporal }?, attribute n { text }?, attribute time { text }?, attribute type { text }?, attribute dur { data.count }?, att.uniqueId.attribute.xmlid, macro.phraseSeq }

Attributes: In addition to global attributes and those inherited from [att.uniqueId ]
date
date of the recording in standardized form.
n
tape number.
time
time of day the recording was made.
type
kind of recording. Values are:
dat
recording made directly to Digital Audio tape.
walkman
recording made to Walkman tape.
dur
duration of the recording in minutes.
att.uniqueId.attribute.xmlid

Example

 <recording n="087902" date="1993-04-30" type="DAT"/>

Module: header

<recordingStmt>

(recording statement) describes a set of recordings used in transcription of a spoken text.

Class: model.sourceDescPart

Declaration

element recordingStmt { model.pLike+ | recording+ }

Attributes: Global attributes only

Module: header

<refsDecl>

(references declaration) provides documentation for the reference system applicable to the corpus.

Class: model.encodingPart: att.declarable

Declaration

element refsDecl { para+ }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <refsDecl>   <para>Canonical references to the BNC should be   constructed by taking the value of the n   attribute of the <bncDoc> element containing the target text, and   concatenating a dot separator, followed by the value of the n attribute of the   target <s> element containing the material to be referenced.</para>  ...</refsDecl>

Module: header
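
The rule quoted in the <refsDecl> example (the n value of the <bncDoc>, a dot, then the n value of the target <s>) can be sketched in Python as follows. The miniature document, its identifier XYZ, and the function name canonical_refs are hypothetical; the sample also omits the header that a real <bncDoc> would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, reduced to what the rule needs.
SAMPLE = (
    '<bncDoc n="XYZ">'
    '<wtext type="FICTION"><div><p>'
    '<s n="1"><w c5="VVB" hw="come" pos="VERB">Come </w></s>'
    '<s n="2"><w c5="AVP" hw="in" pos="ADV">in</w></s>'
    '</p></div></wtext>'
    '</bncDoc>'
)


def canonical_refs(bnc_doc):
    """Build one reference per <s>, in document order."""
    doc_n = bnc_doc.get("n")
    return [f"{doc_n}.{s.get('n')}" for s in bnc_doc.iter("s")]


print(canonical_refs(ET.fromstring(SAMPLE)))
```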

<resp>

contains a phrase describing the nature of a person's intellectual responsibility.

Declaration

element resp { macro.phraseSeq }

Attributes: Global attributes only

Example

 <respStmt>   <resp>compiler</resp>   <name>Edward Child</name>  </respStmt>

Module: core

<respStmt>

(statement of responsibility) supplies a statement of responsibility for someone responsible for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.

Declaration

element respStmt { ( name | resp )+ }

Attributes: Global attributes only

Example

 <respStmt>   <resp>Text enrichment</resp>   <name>Unit for Computer Research into the English Language,   University of Lancaster</name>  </respStmt>

Module: core

<revisionDesc>

(revision description) summarizes the revision history for a file.

Declaration

element revisionDesc { change+ }

Attributes: Global attributes only

Example

 <revisionDesc>   <change date="2006-10-21" who="#OUCS">Tag usage updated for BNC-XML</change>...</revisionDesc>

Note: Record changes with most recent changes at the top of the list.

Module: header

<s>

(s-unit) contains a sentence-like division of a text.

Class: model.segLike

Declaration

element s { model.segLike.attributes, attribute n { text }, ( model.global | model.phrase | model.divPart.spoken )+ }

Attributes: In addition to global attributes
n
sequence number

Example

 <s n="1">   <w c5="VVB" hw="come" pos="VERB">Come </w>   <w c5="AVP" hw="in" pos="ADV">in</w>   <c c5="PUN">.</c>  </s>

Module: analysis

<samplingDecl>

(sampling declaration) contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.

Class: model.encodingPart: att.declarable

Declaration

element samplingDecl { ( text | para )* }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <samplingDecl>   <para>Definitive information on the sampling policies   applied during construction of the BNC is provided in the associated documentation...</para>  </samplingDecl>

Module: header

<setting>

(setting) describes one particular setting in which a language interaction takes place.

Class: att.uniqueId: att.ascribed

Declaration

element setting { att.uniqueId.attributes, att.ascribed.attributes, attribute n { text }?, att.uniqueId.attribute.xmlid, att.ascribed.attribute.who, ( date | model.settingPart )* }

Attributes: In addition to global attributes and those inherited from [att.uniqueId att.ascribed ]
n
an internal identifier for a setting
att.uniqueId.attribute.xmlid
att.ascribed.attribute.who

Example

 <setting n="090910" who="PS1YR PS1YS">   <placeName>Strathclyde: Glasgow </placeName>   <locale> doctor's surgery </locale>   <activity> medical consultation </activity>  </setting>

Note: If the who attribute is not supplied, the setting is assumed to be that of all participants in the language interaction.

Module: corpus
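
The note on the who attribute implies a small piece of logic: a setting applies to a participant either because that participant is listed, or because no list is given at all. A minimal Python sketch of that reading follows. The function name settings_for and the second setting (n="000000", with no who attribute) are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET

# The first setting is abbreviated from the example above; the second,
# which has no who attribute, is hypothetical.
SAMPLE = (
    '<settingDesc>'
    '<setting n="090910" who="PS1YR PS1YS">'
    '<placeName>Strathclyde: Glasgow </placeName>'
    '</setting>'
    '<setting n="000000">'
    '<placeName>Unknown</placeName>'
    '</setting>'
    '</settingDesc>'
)


def settings_for(setting_desc, participant):
    """Return the n values of the settings that apply to a participant."""
    result = []
    for setting in setting_desc.iter("setting"):
        who = setting.get("who")
        # who holds white-space-separated identifiers; if it is absent,
        # the setting is taken to apply to every participant.
        if who is None or participant in who.split():
            result.append(setting.get("n"))
    return result


desc = ET.fromstring(SAMPLE)
print(settings_for(desc, "PS1YR"))
print(settings_for(desc, "PS000"))
```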

<settingDesc>

(setting description) describes the setting or settings within which a language interaction takes place, either as a prose description or as a series of setting elements.

Class: model.profileDescPart: att.declarable

Declaration

element settingDesc { setting+ }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <settingDesc>   <setting n="104701" who="PS000 PS302 PS303 PS304 PS305 PS306 PS307 HYFPS000">    <placeName>Unknown</placeName>    <activity> analysts meeting speech </activity>   </setting>  </settingDesc>

Module: corpus

<shift>

(Shift) marks the point at which some paralinguistic feature of a series of utterances by any one speaker changes.

Class: model.divPart.spoken

Declaration

element shift { attribute new { data.enumerated }?, empty }

Attributes: In addition to global attributes
new
specifies the new state of the paralinguistic feature specified.

Example

 <u who="PS09K">   <s n="606">    <shift new="laughing"/>    <w c5="ITJ" hw="yeah" pos="INTERJ">Yeah </w>    <shift/>   </s>  </u>

Module: spoken

<sourceDesc>

supplies a description of the source text(s) from which an electronic text was derived or generated.

Class: att.declarable

Declaration

element sourceDesc { bibl | recordingStmt | para+ }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <sourceDesc>   <recordingStmt>    <recording      xml:id="KC1RE000"      n="040701"      dur="18"      date="1992-02-21"      time="19:30+"      type="Walkman"/>    <recording      xml:id="KC1RE001"      n="040702"      dur="526"      date="1992-02-21"      time="19:30+"      type="Walkman"/>...</recordingStmt>  </sourceDesc>

Example

 <sourceDesc>   <bibl>    <title>The worst poverty: a history of debt and debtors. </title>    <author n="BartyH1" domicile="ESussex">Barty-King, Hugh</author>    <imprint n="ALANSU1">     <publisher>Alan Sutton Publishing Ltd</publisher>     <pubPlace>Gloucester</pubPlace>     <date value="1991">1991</date>    </imprint>    <pp>85-203</pp>   </bibl>  </sourceDesc>

Module: header
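
Since the dur attribute of <recording> is documented as a duration in minutes, the total recorded time for a text can be obtained by summing it over the <recordingStmt>. The Python sketch below does this for the two recordings shown in the first example above, leaving out the elided content. The function name total_minutes is an assumption.

```python
import xml.etree.ElementTree as ET

# The two recordings shown in the first <sourceDesc> example.
SAMPLE = (
    '<recordingStmt>'
    '<recording xml:id="KC1RE000" n="040701" dur="18" '
    'date="1992-02-21" time="19:30+" type="Walkman"/>'
    '<recording xml:id="KC1RE001" n="040702" dur="526" '
    'date="1992-02-21" time="19:30+" type="Walkman"/>'
    '</recordingStmt>'
)


def total_minutes(recording_stmt):
    """Sum the dur values of all recordings that supply one."""
    total = 0
    for recording in recording_stmt.iter("recording"):
        dur = recording.get("dur")
        if dur is not None:  # dur is optional in the declaration
            total += int(dur)
    return total


print(total_minutes(ET.fromstring(SAMPLE)))  # 544
```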

<sp>

(speech) An individual speech in a performance text, or a passage presented as such in a prose or verse text.

Class: model.divPart: att.ascribed

Declaration

element sp { att.ascribed.attributes, att.ascribed.attribute.who, ( model.global*, ( speaker, model.global* )?, ( ( model.lLike | lg | model.pLike | model.blockLike | model.stageLike ), model.global* )+ ) }

Attributes: Global attributes and those inherited from [att.ascribed ]
att.ascribed.attribute.who

Example

 <sp>   <speaker>    <s n="1627">     <w c5="NP0" hw="mr." pos="SUBST">Mr. </w>     <w c5="NP0" hw="speaker" pos="SUBST">Speaker</w>    </s>   </speaker>   <p>    <s n="1628">     <w c5="PNP" hw="i" pos="PRON">I </w>     <w c5="VVB" hw="call" pos="VERB">call </w>     <w c5="NP0" hw="mr." pos="SUBST">Mr. </w>     <w c5="NP0" hw="dennis" pos="SUBST">Dennis </w>     <w c5="NP0" hw="turner" pos="SUBST">Turner</w>     <c c5="PUN">.</c>    </s>   </p>  </sp>

Module: core

<speaker>

A specialized form of heading or label, giving the name of one or more speakers in a dramatic text or fragment.

Declaration

element speaker { ( s | gap | pb )+ }

Attributes: Global attributes only

Example

 <sp>   <speaker>    <s n="1627">     <w c5="NP0" hw="mr." pos="SUBST">Mr. </w>     <w c5="NP0" hw="speaker" pos="SUBST">Speaker</w>    </s>   </speaker>   <p>    <s n="1628">     <w c5="PNP" hw="i" pos="PRON">I </w>     <w c5="VVB" hw="call" pos="VERB">call </w>     <w c5="NP0" hw="mr." pos="SUBST">Mr. </w>     <w c5="NP0" hw="dennis" pos="SUBST">Dennis </w>     <w c5="NP0" hw="turner" pos="SUBST">Turner</w>     <c c5="PUN">.</c>    </s>   </p>  </sp>

Note: In the BNC, used only for speaker labels in dramatic texts, or Hansard.

Module: core

<stage>

(stage direction) contains any kind of stage direction within a dramatic text or fragment.

Class: att.rendered: model.stageLike

Declaration

element stage { att.rendered.attributes, model.stageLike.attributes, att.rendered.attribute.rend, macro.paraContent }

Attributes: Global attributes and those inherited from [att.rendered ]
att.rendered.attribute.rend

Example

 <stage>   <s n="8004">    <w c5="DT0" hw="several" pos="ADJ">Several </w>    <w c5="AJ0" hw="hon." pos="ADJ">Hon. </w>    <w c5="NN2" hw="member" pos="SUBST">Members </w>    <w c5="VVD" hw="rise" pos="VERB">rose</w>   </s>  </stage>

Module: core

<stext>

contains a single spoken text, i.e. a transcription or collection of transcriptions from a single source.

Declaration

element stext { attribute type { "CONVRSN" | "OTHERSP" }, ( model.divPart.spoken*, div* ) }

Attributes: In addition to global attributes

Module: module-from-bncxml

<tagUsage>

(tagUsage) supplies information about the usage of a specific element within a text.

Declaration

element tagUsage { attribute gi { data.name }, attribute occurs { data.count }?, macro.phraseSeq }

Attributes: In addition to global attributes
gi
the name (generic identifier) of the element indicated by the tag.
occurs
specifies the number of occurrences of this element within the text.

Example

 <tagUsage gi="c" occurs="41685"/>

Module: header

<tagsDecl>

(tagging declaration) provides information about the XML elements actually used within a BNC text

Class: model.encodingPart

Declaration

element tagsDecl { namespace* }

Attributes: Global attributes only

Example

 <tagsDecl>   <namespace name="">    <tagUsage gi="bibl" occurs="17"/>    <tagUsage gi="c" occurs="4348"/>...</namespace>  </tagsDecl>

Module: header
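
As an illustration of how the <tagUsage> counts might be read, the following Python sketch collects them into a dictionary keyed by element name. The sample keeps only the two entries visible in the example above; the function name tag_counts is an assumption.

```python
import xml.etree.ElementTree as ET

# The two entries shown in the example; the elided ones are left out.
SAMPLE = (
    '<tagsDecl><namespace name="">'
    '<tagUsage gi="bibl" occurs="17"/>'
    '<tagUsage gi="c" occurs="4348"/>'
    '</namespace></tagsDecl>'
)


def tag_counts(tags_decl):
    """Map each element name (gi) to its occurrence count."""
    counts = {}
    for usage in tags_decl.iter("tagUsage"):
        occurs = usage.get("occurs")
        # occurs is optional in the declaration, so it may be absent.
        counts[usage.get("gi")] = int(occurs) if occurs is not None else None
    return counts


print(tag_counts(ET.fromstring(SAMPLE)))
```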

<taxonomy>

(taxonomy) defines a typology used to classify texts either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy.

Class: att.uniqueId

Declaration

element taxonomy { att.uniqueId.attributes, att.uniqueId.attribute.xmlid, ( desc?, ( category+ | model.biblLike ) ) }

Attributes: Global attributes and those inherited from [att.uniqueId ]
att.uniqueId.attribute.xmlid

Example

 <taxonomy xml:id="textMode">   <desc>Text mode</desc>   <category xml:id="WRI">    <catDesc>Written</catDesc>   </category>   <category xml:id="SPO">    <catDesc>Transcribed speech</catDesc>   </category>  </taxonomy>

Module: header
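
The category identifiers defined in a <taxonomy> are the codes that a <catRef> lists in its targets attribute. The Python sketch below builds a lookup table from the taxonomy in the example above and resolves a list of codes against it. The function names category_labels and resolve are assumptions, and returning None for a code defined in some other taxonomy is a choice made for this illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = (
    '<taxonomy xml:id="textMode">'
    '<desc>Text mode</desc>'
    '<category xml:id="WRI"><catDesc>Written</catDesc></category>'
    '<category xml:id="SPO"><catDesc>Transcribed speech</catDesc></category>'
    '</taxonomy>'
)


def category_labels(taxonomy):
    """Map each category identifier to its description."""
    return {
        category.get(XML_ID): category.findtext("catDesc")
        for category in taxonomy.iter("category")
    }


def resolve(targets, labels):
    """Look up each white-space-separated code; unknown codes give None."""
    return {code: labels.get(code) for code in targets.split()}


labels = category_labels(ET.fromstring(SAMPLE))
print(resolve("WRI ALLTIM3", labels))
```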

<teiHeader>

(TEI Header) supplies the descriptive and declarative information making up an electronic title page prefixed to every TEI-conformant text.

Declaration

element teiHeader { fileDesc, model.headerPart*, revisionDesc? }

Attributes: Global attributes only

Module: header

<term>

contains a word or phrase used to describe the topic or nature of a text.

Declaration

element term { macro.phraseSeq }

Attributes: Global attributes only

Example

 <keywords>   <term> Parliamentary debates </term>  </keywords>

Note: Used to specify a single keyword or phrase

Module: core

<textClass>

(text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.

Class: model.profileDescPart: att.declarable

Declaration

element textClass { catRef, classCode*, keywords* }

Attributes: Global attributes and those inherited from [att.declarable ]

Example

 <textClass>   <catRef targets="ALLTIM3 ..."/>   <classCode scheme="DLEE">W hansard</classCode>   <keywords>    <term> Parliamentary debates </term>   </keywords>  </textClass>

Module: header

<title>

contains the full title of a work of any kind.

Declaration

element title { attribute level { text }?, macro.phraseSeq }

Attributes: In addition to global attributes
level
indicates the bibliographic level of this title. Values are:
a
the title is an analytic title, rather than a monographic one

Example

 <title>Amnesty  International meeting. Sample containing about 15274 words speech  recorded in public context</title>

Example

 <bibl>   <title>An awfully big adventure. </title>   <author n="BainbB1" domicile="England">Bainbridge, B</author>   <imprint n="DUCKWO1">    <publisher>Duckworth &amp; Company Ltd</publisher>    <pubPlace>London</pubPlace>    <date value="1990">1990</date>   </imprint>   <pp>49-192</pp>  </bibl>

Module: core

<titleStmt>

(title statement) groups information about the title of a work and those responsible for its intellectual content.

Declaration

element titleStmt { title+, ( author | editor | respStmt )* }

Attributes: Global attributes only

Example

 <titleStmt>   <title> So you want to be an actor?. Sample containing about 35817 words from a book (domain: arts) </title>   <respStmt>    <resp> Data capture and transcription </resp>    <name> Oxford University Press </name>   </respStmt>  </titleStmt>

Module: header

<tokenize>

supplies any additional ICU-conformant rules to be used when tokenization is performed by xaira rather than by explicit XML markup.

Declaration

element tokenize { text }

Attributes: Global attributes only

Module: module-from-bncxml

<trunc>

contains one or more truncated words in transcribed speech.

Class: model.divPart.spoken

Declaration

element trunc { ( w | mw | gap | unclear )+ }

Attributes: Global attributes only

Example

 <s n="1377">   <trunc>    <w c5="UNC" hw="the" pos="UNC">The </w>   </trunc>   <c c5="PUN">, </c>   <w c5="AV0" hw="then" pos="ADV">then </w>   <w c5="PNP" hw="he" pos="PRON">he </w>   <trunc>    <w c5="UNC" hw="bo" pos="UNC">bo </w>   </trunc>   <w c5="VVD" hw="bowl" pos="VERB">bowled </w>  </s>

Module: module-from-bncxml

<u>

(utterance) a stretch of speech usually preceded and followed by silence or by a change of speaker.

Class: att.ascribed: model.divPart.spoken

Declaration

element u { att.ascribed.attributes, att.ascribed.attribute.who, ( text | model.gLike | model.phrase | model.divPart.spoken | model.global )* }

Attributes: Global attributes and those inherited from [att.ascribed ]
att.ascribed.attribute.who

Example

 <u who="PS0KU">   <s n="414">    <w c5="VM0" hw="shall" pos="VERB">shall </w>    <w c5="PNP" hw="i" pos="PRON">I </w>    <w c5="VVI" hw="get" pos="VERB">get </w>    <w c5="PNP" hw="it" pos="PRON">it </w>    <w c5="CJC" hw="or" pos="CONJ">or </w>    <w c5="XX0" hw="not" pos="ADV">not</w>    <c c5="PUN">?</c>   </s>   <s n="415">    <w c5="PNP" hw="i" pos="PRON">I </w>    <w c5="VDB" hw="do" pos="VERB">do</w>    <w c5="XX0" hw="not" pos="ADV">n't </w>    <w c5="VVI" hw="know" pos="VERB">know </w>    <w c5="DTQ" hw="what" pos="PRON">what </w>    <w c5="TO0" hw="to" pos="PREP">to </w>    <w c5="VDI" hw="do" pos="VERB">do</w>   </s>  </u>  <u who="PS0KR">   <s n="416">    <w c5="ITJ" hw="yes" pos="INTERJ">Yes </w>    <w c5="VVB" hw="get" pos="VERB">get </w>    <w c5="PNP" hw="it" pos="PRON">it</w>   </s>  </u>  <u who="PS0KP">   <s n="417">    <w c5="ITJ" hw="eh" pos="INTERJ">eh</w>    <c c5="PUN">, </c>    <w c5="PNP" hw="i" pos="PRON">me </w>    <w c5="CJC" hw="and" pos="CONJ">and    </w>    <w c5="DPS" hw="you" pos="PRON">your </w>    <w c5="NN1" hw="mother" pos="SUBST">mother </w>    <pause/>   </s>  </u>

Note: In the BNC, each change of speaker is marked by a new <u> element.

Module: spoken
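
Because every change of speaker opens a new <u>, per-speaker figures can be gathered by grouping on the who attribute. The Python sketch below counts <w> elements for each speaker. The sample is abbreviated from the example above, and the enclosing <div> and the function name words_per_speaker are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Abbreviated from the example above; the <div> wrapper is hypothetical.
SAMPLE = (
    '<div>'
    '<u who="PS0KU"><s n="414">'
    '<w c5="VM0" hw="shall" pos="VERB">shall </w>'
    '<w c5="PNP" hw="i" pos="PRON">I </w>'
    '<c c5="PUN">?</c>'
    '</s></u>'
    '<u who="PS0KR"><s n="416">'
    '<w c5="ITJ" hw="yes" pos="INTERJ">Yes </w>'
    '</s></u>'
    '</div>'
)


def words_per_speaker(root):
    """Count <w> elements under each <u>, keyed by the who attribute."""
    counts = Counter()
    for u in root.iter("u"):
        counts[u.get("who")] += sum(1 for _ in u.iter("w"))
    return counts


print(words_per_speaker(ET.fromstring(SAMPLE)))
```

Punctuation (<c>) is excluded here; whether it should count as a word depends on the purpose of the tally.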

<unclear>

contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source.

Class: att.timed: model.pPart.edit

Declaration

element unclear { att.timed.attributes, att.timed.attribute.dur, empty }

Attributes: Global attributes and those inherited from [att.timed ]
att.timed.attribute.dur

Example

 <u who="PS000">   <unclear/>  </u>

Module: core

<valItem>

(value definition) contains a single value and gloss pair for an attribute.

Class: att.identifiable

Declaration

element valItem { att.identifiable.attributes, att.identifiable.attribute.ident, att.identifiable.attribute.ns, desc }

Attributes: Global attributes and those inherited from [att.identifiable ]
att.identifiable.attribute.ident
att.identifiable.attribute.ns

Module: tagdocs

<valList>

(value list) contains one or more valItem elements defining possible values for an attribute.

Class: att.identifiable

Declaration

element valList { att.identifiable.attributes, attribute copyOf { data.pointer }?, attribute type { "closed" | "semi" | "open" }?, att.identifiable.attribute.ident, att.identifiable.attribute.ns, valItem+ }

Attributes: In addition to global attributes and those inherited from [att.identifiable ]
copyOf
supplies the identifier of a previously-defined value list to be used at this point
type
specifies the extensibility of the list of attribute values specified. Legal values are:
closed
(only the values specified are permitted.)
semi
(all the values specified should be supported, but other values are legal and software should have appropriate fallback processing for them. )
open
(the values specified are sample values only.)
att.identifiable.attribute.ident
att.identifiable.attribute.ns

Module: tagdocs
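
The three values of the type attribute describe how strictly a value list is to be enforced. One possible reading of those descriptions is sketched below in Python. The function name check_value and the four result labels are assumptions introduced for this illustration; the schema defines only the list types themselves.

```python
def check_value(value, listed, list_type):
    """Classify a value against a value list of the given type.

    list_type is "closed", "semi", or "open", as on <valList>.
    """
    if list_type not in ("closed", "semi", "open"):
        raise ValueError(f"unknown list type: {list_type!r}")
    if value in listed:
        return "listed"
    if list_type == "closed":
        # Only the values specified are permitted.
        return "invalid"
    if list_type == "semi":
        # Other values are legal, but software needs fallback processing.
        return "fallback"
    # "open": the listed values are samples only.
    return "unlisted"


SEX = {"m", "f", "u"}  # value list of the sex attribute on <person>
print(check_value("m", SEX, "closed"))
print(check_value("x", SEX, "closed"))
print(check_value("x", SEX, "semi"))
print(check_value("x", SEX, "open"))
```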

<valSource>

specifies where the xaira indexer is to find a value.

Class: att.identifiable

Declaration

element valSource { att.identifiable.attributes, att.identifiable.attribute.ident, att.identifiable.attribute.ns, attribute type { "element" | "attribute" | "pseudo" }, attribute caseFold { text }?, ( nameList?, ( defaultVal | labelGen )? ) }

Attributes: In addition to global attributes and those inherited from [att.identifiable ]
att.identifiable.attribute.ident
att.identifiable.attribute.ns

Module: module-from-bncxml

<vocal>

(Vocalized semi-lexical) any vocalized but not necessarily lexical phenomenon, for example voiced pauses, non-lexical backchannels, etc.

Class: model.divPart.spoken: att.timed: att.ascribed

Declaration

element vocal { att.timed.attributes, att.ascribed.attributes, attribute desc { text }?, att.timed.attribute.dur, att.ascribed.attribute.who, empty }

Attributes: In addition to global attributes and those inherited from [att.timed att.ascribed ]
desc
provides a brief description of the vocal event
att.timed.attribute.dur
att.ascribed.attribute.who

Example

 <u who="PS000">   <vocal desc="laugh"/>  </u>

Module: spoken

<w>

(word) represents a grammatical (not necessarily orthographic) word.

Class: att.c5coded: model.segLike

Declaration

element w { att.c5coded.attributes, model.segLike.attributes, attribute pos { "ADJ" | "ADV" | "ART" | "CONJ" | "INTERJ" | "PREP" | "PRON" | "STOP" | "SUBST" | "UNC" | "VERB" }, attribute hw { text }, att.c5coded.attribute.c5, text }

Attributes: In addition to global attributes and those inherited from [att.c5coded ]
pos
supplies a simplified part-of-speech code. Legal values are:
ADJ
adjective
ADV
adverb
ART
article
CONJ
conjunction
INTERJ
interjection
PREP
preposition
PRON
pronoun
STOP
punctuation
SUBST
substantive
UNC
unclassified or non-lexical word
VERB
verb
hw
specifies the headword under which this lexical unit is conventionally grouped, where known.
att.c5coded.attribute.c5

Example

 <w c5="PNP" hw="i" pos="PRON">I </w>  <w c5="VDB" hw="do" pos="VERB">do</w>  <w c5="XX0" hw="not" pos="ADV">n't </w>  <w c5="VVI" hw="care" pos="VERB">care </w>

Module: analysis
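
The example above shows why <w> marks grammatical rather than orthographic words: the single orthographic word "don't" is split into two <w> elements, each with its own headword. The Python sketch below recovers the surface text, the headwords, and a tally of the simplified pos codes. The enclosing <s n="1"> wrapper and the three function names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The <w> elements are taken from the example above; the enclosing
# <s> element is a hypothetical wrapper.
SAMPLE = """<s n="1">
<w c5="PNP" hw="i" pos="PRON">I </w>
<w c5="VDB" hw="do" pos="VERB">do</w>
<w c5="XX0" hw="not" pos="ADV">n't </w>
<w c5="VVI" hw="care" pos="VERB">care </w>
</s>"""


def surface_text(s_unit):
    """Concatenate the text of <w> and <c> elements in document order."""
    return "".join(
        el.text or "" for el in s_unit.iter() if el.tag in ("w", "c")
    )


def headwords(s_unit):
    """List the hw value of each <w>."""
    return [w.get("hw") for w in s_unit.iter("w")]


def pos_counts(s_unit):
    """Tally the simplified part-of-speech codes."""
    return Counter(w.get("pos") for w in s_unit.iter("w"))


s_unit = ET.fromstring(SAMPLE)
print(repr(surface_text(s_unit)))
print(headwords(s_unit))
print(pos_counts(s_unit))
```

White space is carried inside the element content, so no separator needs to be added when the pieces are joined.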

<wtext>

contains a single written text.

Declaration

element wtext { attribute type { "ACPROSE" | "FICTION" | "NEWS" | "NONAC" | "OTHERPUB" | "UNPUB" }, ( ( model.divPart | model.global )*, ( div, ( div | model.global )* )? ) }

Attributes: In addition to global attributes

Module: textstructure

<xairaItem>

provides data needed to define one part of a xaira specification.

Class: att.identifiable

Declaration

element xairaItem { att.identifiable.attributes, att.identifiable.attribute.ident, att.identifiable.attribute.ns, attribute type { "element" | "form" | "addKey" | "lemmaScheme" | "region" | "textRef" | "scopeRef" | "unitRef" | "indexPol" | "defaultLang" | "langRules" }, ( desc*, ( ( valSource, labelGen? ) | attList | nameList | elementPolicy | attributePolicy | tokenize | collate )? ) }

Attributes: In addition to global attributes and those inherited from [att.identifiable ]
att.identifiable.attribute.ident
att.identifiable.attribute.ns

Module: module-from-bncxml

<xairaList>

contains a list of xaira parameters of a particular type

Declaration

element xairaList { attribute type { "elementSpec" | "keySpec" | "regionSpec" | "lemmaSpec" | "refSpec" | "indexSpec" | "langSpec" }, xairaItem+ }

Attributes: In addition to global attributes

Module: module-from-bncxml

<xairaSpecification>

specifies additional information needed by xaira.

Class: model.encodingPart

Declaration

element xairaSpecification { xairaList+ }

Attributes: Global attributes only

Module: module-from-bncxml

<bnc>

(TEI corpus) contains the whole of a TEI encoded corpus, comprising a single corpus header and one or more TEI elements, each containing a single text header and a text.

Declaration

element teiCorpus { teiHeader, bncDoc+ }

Attributes: Global attributes only

Example

 <teiCorpus>   <teiHeader> <!-- header for corpus -->   </teiHeader>   <TEI>    <teiHeader> <!-- header for first text -->    </teiHeader>    <text> <!-- content of first text -->    </text>   </TEI>   <TEI>    <teiHeader> <!-- header for second text -->    </teiHeader>    <text> <!-- content of second text -->    </text>   </TEI> <!-- more TEI elements here -->  </teiCorpus>

Note: Must contain one TEI header for the corpus, and a series of <TEI> elements, one for each text. This element is mandatory when applicable.

Module: core

Macros defined

Macro data.count

defines the range of attribute values used for a non-negative integer value used as a count

Declaration

data.count = xsd:nonNegativeInteger

Note: Only non-negative integer values (zero or greater) are permitted

Module: tei

Macro data.enumerated

defines the range of attribute values expressed as a single word or token taken from a list of documented possibilities

Declaration

data.enumerated = token

Note: Typically, the list of documented possibilities will be provided (or exemplified) by a value list in the associated element specification. If the value contains whitespace, it must be normalised: neither leading or trailing sequences of whitespace characters nor internal sequences of more than one whitespace character are allowed.
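The normalisation rule in this note can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and whitespace is assumed to mean the four XML whitespace characters (space, tab, carriage return, line feed).

```python
import re

# Runs of XML whitespace characters: space, tab, carriage return, line feed.
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")

def normalise_token(value):
    """Collapse each whitespace run to one space and trim both ends."""
    return XML_WHITESPACE.sub(" ", value).strip(" ")

def is_normalised(value):
    """True if the value already satisfies the rule described in the note."""
    return value == normalise_token(value)

print(repr(normalise_token("  lemma\t scheme \n")))
print(is_normalised("lemmaScheme"))
```

A validating processor performs this normalisation itself; a check of this kind is mainly useful when values are generated or compared outside a validator.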

Module: tei

Macro data.language

defines the range of attribute values used to identify a particular combination of human language and writing system

Declaration

data.language = xsd:language

Note: The values for this attribute are language ‘tags’ as defined in RFC 3066 or its successor. Examples include
sn
Shona
zh-TW
Taiwanese
en-SL
English as spoken in Sierra Leone
pl
Polish
es-MX
Spanish as spoken in Mexico
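As a rough syntactic check, the lexical pattern that W3C XML Schema Part 2 gives for xsd:language can be applied directly. The Python sketch below uses that pattern on the example tags listed above; the function name is hypothetical, and the check says nothing about whether a tag is actually registered.

```python
import re

# Lexical pattern given for xsd:language in XML Schema Part 2: Datatypes.
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")

def looks_like_language_tag(value):
    """Syntactic check only: does not consult any language registry."""
    return LANGUAGE_PATTERN.fullmatch(value) is not None

for tag in ["sn", "zh-TW", "en-SL", "pl", "es-MX", "en_GB"]:
    print(tag, looks_like_language_tag(tag))
```

The last value uses an underscore rather than a hyphen and is therefore rejected.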

Module: tei

Macro data.name

defines the range of attribute values expressed as an XML name or identifier

Declaration

data.name = xsd:Name

Note: Attributes using this datatype must contain a single word which follows the rules defining a legal XML name: for example they cannot include whitespace or begin with digits.

Module: tei

Macro data.namespace

(an XML namespace) defines the range of attribute values used to indicate XML namespaces as defined by the W3C Namespaces in XML technical recommendation

Declaration

data.namespace = xsd:anyURI

Note: The range of syntactically valid values is defined by RFC 2396 Uniform Resource Identifier (URI) Reference

Module: tei

Macro data.pointer

defines the range of attribute values used to provide a single pointer to any other resource, either within the current document or elsewhere

Declaration

data.pointer = xsd:anyURI

Note: The range of syntactically valid values is defined by RFC 2396 Uniform Resource Identifier (URI) Reference

Module: tei

Macro data.pointers

defines the range of attribute values used to provide a list of pointers to other resources, either within the current document or elsewhere

Declaration

data.pointers = list { data.pointer+ }

Note: A white-space delimited list of values, defined by the datatype data.pointer
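A value of this type might be split into its individual pointers as in the following Python sketch. The function name and the sample values are hypothetical; the requirement of at least one pointer follows from the + in the declaration.

```python
import re

def split_pointers(value):
    """Split a data.pointers value into its individual pointers."""
    # Split on runs of XML whitespace and drop the empty strings that
    # leading or trailing whitespace would otherwise produce.
    pointers = [p for p in re.split(r"[ \t\r\n]+", value) if p]
    if not pointers:
        raise ValueError("data.pointers requires at least one pointer")
    return pointers

# Hypothetical attribute value containing three pointers.
print(split_pointers("#PS000 #PS001\thttp://example.org/doc.xml#x"))
```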

Module: tei

Macro data.temporal

defines the range of attribute values expressing a temporal expression such as a date, a time, or a combination of them

Declaration

data.temporal = xsd:date | xsd:gYear | xsd:gYearMonth

Note: A normalized form of temporal expression conforming to the W3C XML Schema Part 2: Datatypes Second Edition, except that times may be expressed with reduced precision (i.e., to the minute or the hour). Software intended for use with W3C XML Schema datatypes may be unable to properly process times expressed with reduced precision. If it is likely that the value used is to be compared with another, then a time zone indicator should always be included, and only the dateTime representation should be used.
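The three alternatives in the declaration can be told apart by their lexical shape. The Python sketch below is a simplified illustration under stated assumptions: it checks only the overall form (digits, hyphens and an optional time zone) and does not verify that the month or day is in range. The names used are hypothetical.

```python
import re

# Optional time zone indicator: "Z" or a signed hh:mm offset.
_TZ = r"(Z|[+-]\d{2}:\d{2})?"

# Simplified lexical shapes of the three datatypes in the declaration.
TEMPORAL_PATTERNS = {
    "xsd:date": re.compile(r"-?\d{4,}-\d{2}-\d{2}" + _TZ),
    "xsd:gYearMonth": re.compile(r"-?\d{4,}-\d{2}" + _TZ),
    "xsd:gYear": re.compile(r"-?\d{4,}" + _TZ),
}

def temporal_type(value):
    """Return the name of the first matching datatype, or None."""
    for name, pattern in TEMPORAL_PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return None

for sample in ["1994-05-20", "1994-05", "1994", "20 May 1994"]:
    print(sample, temporal_type(sample))
```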

Module: tei

Macro data.word

defines the range of attribute values expressed as a single word or token

Declaration

data.word = token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" }

Note: Attributes using this datatype must contain a single ‘word’ which contains only letters, digits, punctuation characters, or symbols: thus it cannot include whitespace.
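Python's standard re module does not support the \p{...} property classes used in this pattern, so the sketch below approximates it by testing the Unicode general category of each character instead. The function name is hypothetical.

```python
import unicodedata

def is_data_word(value):
    """True if value is non-empty and consists only of letters (L),
    numbers (N), punctuation (P) or symbols (S)."""
    return bool(value) and all(
        # The first letter of the general category gives the major class.
        unicodedata.category(ch)[0] in "LNPS" for ch in value
    )

for sample in ["NN1", "AJ0-NN1", "two words", ""]:
    print(repr(sample), is_data_word(sample))
```

Whitespace characters fall into the separator (Z) or control (C) categories and so are rejected, as the note requires.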

Module: tei

Macro macro.fileDescPart

(file description elements) groups elements which occur inside fileDesc and biblFull

Declaration

macro.fileDescPart = titleStmt, editionStmt?, extent?, publicationStmt

Module: tei

Macro macro.paraContent

(paragraph content) defines the content of paragraphs and similar elements.

Declaration

macro.paraContent = ( model.phrase | model.inter | model.global )+

Module: tei

Macro macro.phraseSeq

(phrase sequence) defines a sequence of character data and phrase-level elements.

Declaration

macro.phraseSeq = text

Module: tei

Macro mix.spoken

(mixed-base spoken-text components) contains a string used in constructing the definition of macro.component used in the mixed base tag set.

Declaration

mix.spoken = model.divPart.spoken

Module: spoken
