BNC Workshop, Paris, 6 February 1998

The saga of how I came to be invited to give this workshop is a long one, but in essence it came about because the Sorbonne 3 Institut du Monde Anglophone had begun to investigate the BNC last year and were interested in the use that they might make of it. They felt, however, that they needed more information and some guidance in using the corpus and the SARA software, so I was invited over by Beatrice Vautherin-Fiala, who did a wonderful job of organising the workshop.

The audience of about twenty people came from various parts of the Sorbonne, the Ecole Normale Supérieure, and the University of Poitiers. Most were academics who were linguists, teachers of English language, or both. There were also some graduate students. Only a very few of them had used the BNC at all, although some had experience of other corpora.

In the morning session I gave a presentation about the BNC and SARA and demonstrated SARA, which was running on their own server at ENS Fontenay. This worked very well, largely thanks to the persistence and hard work of Benoit Habert and his team. In the afternoon they ran the UNIX client XKWIK alongside SARA so that their different uses could be compared. This made the SARA server somewhat unreliable, since both were running on the same machine, but it was otherwise a good idea, since it showed everyone that different types of software can be used on the BNC. The afternoon session was spent discussing the particular research interests of some of the participants and trying to work on them using SARA. This was not always entirely successful, since several participants wanted either to search by suffix, ie for all words ending with ‘ing’, or to look for a pattern of parts of speech, eg any noun followed by any verb. These are the very things that SARA finds difficult. However, we partially solved this problem by searching for specific examples of words and specifying the desired part of speech.
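To make the two problem queries concrete, here is a minimal Python sketch of what they amount to when run over a list of tagged words rather than through SARA. The sample tokens, the function names and the tag prefixes (written in the style of the BNC's C5 tagset) are all assumptions made for illustration; this says nothing about how SARA or XKWIK work internally.

```python
# Hypothetical hand-tagged sample, invented for illustration (not BNC text).
# Each token is a (word, part-of-speech tag) pair.
TOKENS = [
    ("the", "AT0"), ("dog", "NN1"), ("barks", "VVZ"),
    ("during", "PRP"), ("the", "AT0"), ("morning", "NN1"),
    ("while", "CJS"), ("children", "NN2"), ("sing", "VVB"),
    ("something", "PNI"), ("amazing", "AJ0"),
]


def words_with_suffix(tokens, suffix):
    """Return every word form that ends with the given suffix."""
    return [word for word, _tag in tokens if word.endswith(suffix)]


def noun_then_verb(tokens):
    """Return adjacent word pairs where a noun tag is followed by a verb tag.

    Assumes noun tags start with "NN" and verb tags start with "V".
    """
    pairs = []
    for (w1, t1), (w2, t2) in zip(tokens, tokens[1:]):
        if t1.startswith("NN") and t2.startswith("V"):
            pairs.append((w1, w2))
    return pairs


if __name__ == "__main__":
    print(words_with_suffix(TOKENS, "ing"))
    print(noun_then_verb(TOKENS))
```

On this sample the suffix search returns ‘during’, ‘morning’, ‘sing’, ‘something’ and ‘amazing’, none of which is a verbal ‘-ing’ form, which is one reason why combining a specific word with a desired part of speech, as we did in the workshop, can give more useful results than a bare suffix match.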

Many of the participants were interested in using the corpus to test hypotheses they had about the way English works. For example, is it true that you cannot say or write a phrase like ‘I would/wouldn't like having’? Although we were all able to make up convincing sentences like ‘I wouldn't like having my mother-in-law living with me’, we found no instances of this collocation in the corpus; instead the infinitive is used, eg ‘I wouldn't like to have…’. This delighted all concerned and proved, if proof were needed, the usefulness of corpus data. We also compared usages like ‘do you mind me/my smoking’ with ‘do you mind if I smoke’ and found the latter was by far the more common. The BNC also shows that the word ‘if’ and the phrase ‘in case’ are increasingly being used interchangeably, despite a difference in meaning.

The format of a whole-day workshop divided into two sessions worked particularly well, allowing plenty of time to investigate the corpus and giving me as much time as I needed to go into detail about the BNC and SARA in my presentation. All in all this was, from my point of view at least, a great success.