WEEKLYSUMMARY(1l)      MISC. REFERENCE MANUAL PAGES      WEEKLYSUMMARY(1l)

NAME
     weeklySummary - summarize corpus throughput in last week (or
     other period)

SYNOPSIS
     weeklySummary [-D database] [-b period_start] [-e period_end]
     [-p period]

DESCRIPTION
     WeeklySummary produces a report on the number of words in texts
     received, processed and forwarded by OUCS over the last week for
     which full figures are available (or some other period), and in
     total. It is currently run automatically on Monday mornings to
     report on activity during the week ending eight days earlier.
     (See NOTES for an explanation of the delay.)

     Categories reported at present are:

          Texts received from data capture agencies by OUCS, but
          rejected as not meeting corpus design criteria (Rejected).

          Texts received from data capture agencies, but returned as
          requiring rework by data capture agency before resubmission
          to OUCS (Awaiting rework).

          Texts currently accepted by OUCS (Net to OUCS).

          Texts forwarded to Lancaster but rejected as requiring
          rework by OUCS before resubmission to Lancaster (Awaiting
          rework).

          Texts forwarded to Lancaster by OUCS (Net to Lancs).

          Texts received from Lancaster by OUCS (From Lancs).

          Texts accessioned to corpus (Complete).

     All categories are subdivided according to data collection
     agency (Longman, OUP etc.) and the text category (written
     published, spoken demographic etc.). Rows in which all figures
     would be zero are not printed.

OPTIONS
     -b period-start
          Earliest date for totals is period-start, rather than
          period-end date minus period. (See -e and -p below.)
          Period-start must be a valid date for Ingres. The -b and
          -p options are mutually exclusive.

     -D database
          Use database instead of the default database, bnc, in
          compiling the report.

     -e period-end
          Latest date for totals is period-end, rather than eight
          days ago. (See NOTES.) Period-end must be a valid date for
          Ingres.

     -p period
          Report on throughput in the interval period preceding
          24:00 on period-end, rather than the default seven days.
          (See -e above.) Period must be a valid time interval for
          Ingres, and will almost certainly need to be quoted to
          protect internal whitespace from the shell.

DIAGNOSTICS
     Invalid arguments for -b, -e, and -p will elicit Ingres error
     messages.

ENVIRONMENT
     II_SYSTEM
          Location of Ingres files. Defaults to /usr/local.

FILES
     ~natcorp/bin/weeklySummary
          The program itself. The man page is embedded: hand the
          program to nroff -man.

AUTHOR
     Dominic Dunlop

SEE ALSO
     perl(1), Ingres/SQL Reference Manual.

NOTES
     The automatic database update procedures observe a grace period
     between the appearance of a new version of a text and its
     registration in the database, and hence its potential to
     contribute to this report. Grace periods are as follows:

          Submissions, bounces     One working day
          CDIF checked             Three working days
          Word-class tagged        Six working days

     The intention is to allow mistakes to be corrected in
     CDIF-checked and word-class tagged files before the database
     registers the files' presence. However, it also means that the
     picture given by this report is incomplete for these two
     categories if the end date of the period being reported is close
     to the current date: grace periods may still be expiring for
     some files which would otherwise contribute to the totals
     reported. For this reason, the default end date for the report
     is seven days before the date on which it is being run, this
     representing the latest date for which a complete picture can be
     given.

     The previous version of this report gave figures for gross
     deliveries from data-capture agencies to OUCS. This column
     counted the words in a text each time that it was resubmitted
     after having been bounced.
     This resulted in accurate figures for week-by-week gross
     receipts, but was confusing in the context of total deliveries
     since the beginning of the project. Consequently, the
     information is no longer supplied.

     If a text returns from rework containing more words than it did
     at the time it was bounced, the result can be that the total
     number of words in texts requiring rework appears to be
     negative. As this is potentially confusing, and the effect is
     small, such negative totals are shown as zero.

BUGS
     For bounces from Lancaster to OUP, only the first bounce for a
     given text is taken into account in the net figures.

     The report does not show the effect of reworking of previously
     word-class tagged files by Lancaster.

     It's not an Ingres report: I'll have to learn the language
     sometime...

Sun Release 4.Last change: TGCW40: 19 November, 1993
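
The default reporting window described under OPTIONS and NOTES, and the zero floor applied to rework totals, can be illustrated with a short sketch. This is not the program's own code (weeklySummary is a Perl script working against Ingres); the function names default_window and awaiting_rework are hypothetical. It assumes the window ends at 24:00 on the day eight days before the run date and covers seven whole days, which is one reading that reconciles "eight days ago" with "seven days before the date on which it is being run". It also assumes the rework total is simply words bounced minus words returned.

```python
from datetime import date, timedelta


def default_window(run_date, period_days=7, lag_days=8):
    """Return (first_day, last_day) of the default reporting period.

    Assumption: the period ends at 24:00 on the day `lag_days` before
    `run_date` and covers `period_days` whole calendar days.
    """
    last_day = run_date - timedelta(days=lag_days)
    first_day = last_day - timedelta(days=period_days - 1)
    return first_day, last_day


def awaiting_rework(words_bounced, words_returned):
    """Words still awaiting rework, never shown as less than zero.

    Assumption: the total is words bounced minus words returned. A text
    that comes back longer than it left would make this negative, so
    the figure is floored at zero, as described in NOTES.
    """
    return max(0, words_bounced - words_returned)


if __name__ == "__main__":
    # A run on Monday 22 November 1993 covers Monday 8 to Sunday 14.
    first, last = default_window(date(1993, 11, 22))
    print(first, last)
    # A text bounced at 1200 words that returns with 1500 shows as 0.
    print(awaiting_rework(1200, 1500))
```

Under these assumptions a Monday run always reports a complete Monday-to-Sunday week, which matches the behaviour stated in DESCRIPTION.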