Extracts: .....
"...when I began to look at the available data about the UK outbreak and where it was situated, it very quickly became apparent to me that the information was incomplete, inaccurate, inconsistent and difficult to use. ....
"...The Imperial team - Professor Anderson and his colleagues from Imperial College were well known to the Chief Scientist but were not among the groups originally asked to take part. They had experience of BSE modelling and various (mostly human) epidemic diseases. The team had recently moved (Nov 2000) from Oxford to a new department researching Human Health headed by Professor Anderson at Imperial College. Their only experience of FMD was that Ms Donnelly had co-authored a paper in July 2000 where the data from the FMD epidemics in UK 1967 and Taiwan in 1997 was run through epidemic simulations. ....
"....Imperial model - adapted by Neil Ferguson and Christl Donnelly from calculations that used the transmission of human sexually transmitted diseases to model spread together with knowledge gleaned from their work on BSE....... Simple model - generalised animal species, and constant infectivity assumed. ...
.....relies heavily on the 'contact tracing' carried out by MAFF in the first 3 weeks and discovered from this information that farms within 3 kms of an IP appeared to be at greater risk of infection. There was no differentiation between different species although information about the different infectivity of sheep, cattle and pigs to this particular strain of FMD was readily available. There was another generalised assumption that infectivity is constant from day 3 after infection to day 11.
Presumably the 3km spread was assumed to be via windborne transmission, although it was known by the Pirbright team that this strain of virus did not spread that way over more than 200 metres......"... data was widely published in the newspapers and on the BBC. Almost as soon as the data was published the inconsistencies began to show. .....
..."when the blood tests came back as negative, the cases were left on the list of infected premises and remained there. Contiguous farms or farms on a 'D' notice were still culled out or kept on movement restrictions - taking up much needed veterinary and surveillance resources. .....
"During May I found one of the most curious anomalies - still there on both tables: Case 1557 has two different names and addresses (many miles apart) .....
"by that point I had become aware that a considerable proportion of the Infected Premises had tested negative and had probably not had FMD at all. ....
"It would have been considered a failure on their part to suggest that a return to manual methods would perhaps produce better results. So they carried on - with the enormous glare of publicity I expect anyone with doubts as to the accuracy of what they were doing kept their mouths firmly shut. .....
Scientists should have questioned the conclusions and methods more - unfortunately politicians have different agendas. Many do not really understand scientific method and the need for debate to validate and discuss different views. And FMD is a political disease. ....
".....there was pressure from political sources to come up with a quick solution. The data available for the mathematical modelling was such that it would have been better to use other more traditional methodology right at the beginning to get control of what was happening to the disease. Modelling should have been used as a simple adjunct to other methods - not to shape the policies.
SUBMISSION TO THE ROYAL SOCIETY OF EDINBURGH
INQUIRY INTO FOOT AND MOUTH
By
Valerie Lusmore
The White House
Orchard Rise
Pwllmeyric
Monmouthshire
NP16 6JT
Tel: 01291 623573
Email: val_lusmore@hotmail.com
AUTHOR
I have degrees in Mathematics and Applied Mathematics from the University of Natal and I have worked in the computer industry since 1969. I have worked mainly as a consultant with large international companies, and have been involved in computer modelling on many occasions. In the past 10 years I have evolved into a data specialist. This role developed as I realised that many of the problems in complex computer systems lie in the data rather than the system. The problems are exacerbated by databases containing very large numbers of records, some of which are not consistent with the original assumptions of how the information would be. This leads to anomalies and calculations that do not work as they have been designed.
During the 2001 FMD outbreak I became involved with the National Foot and Mouth Group. When people kept telling me that they could not understand the daily data published by MAFF/DEFRA publicly on their FMD website , I made a daily analysis and published regular summaries for those who had an interest. I kept as complete a record as I could so that other people could use my data as a resource to get accurate and meaningful figures.
INDEX
1. Introduction
2. Executive summary
3. Mathematical Models
3.1 Pirbright scientists and modelling FMD
3.2 The mathematical modelling teams
3.3 The computer modelling tools
3.4 Papers written by the modelling teams - inconsistencies
3.5 Major problems with modelling
4. Information Collection
4.1 Initial data setup
4.2 Commentary on data integrity
4.3 Summarisation of Data
4.4 Example of Inconsistency and Inaccuracy in MAFF statistics
4.5 Collection History for IP data
4.6 County Tables for Slaughter Statistics
5. Examples of Data Problems
5.1 Analysis of Statistics
5.2 Name and Address anomalies
5.3 Data validation
5.4 Constituency Tables - why do they exist?
5.5 Changing the Tables
5.6 Analysis of the new County List by Local Authority
6. Other Methodology
6.1 Other modelling methods- volunteers
6.2 Maps
6.3 Tracings
6.4 Picture of Epidemic spread using more traditional methods
6.5 Results of Picture
6.6 Local Spread?
7. Conclusions
8. Appendices
8.1. Bibliography
8.2. County and Local Authority Lists - a comparison
1. Introduction
The policies used during the FMD crisis of 2001 were mainly driven by epidemiologists and bio-mathematicians. These policies were brought in hurriedly at the end of the fifth week into the crisis when there were concerns that the disease was already out of hand. The main policy that the modellers developed was that of the contiguous cull.
Along with many other people with an interest in the subject, I tried to understand the factors behind this model. I found it difficult to understand why this particular approach had been used, especially as much information was available on the Internet about other methods of controlling the disease that were used in other parts of the world.
In particular, when I began to look at the available data about the UK outbreak and where it was situated, it very quickly became apparent to me that the information was incomplete, inaccurate, inconsistent and difficult to use. This led me to develop consistent ways of reporting the MAFF/DEFRA information so that it was clear and simple to understand.
2. Executive summary
This paper covers the mathematical models - and what I consider important about them.
This includes brief descriptions of the groups of people involved, their experience of the subject; the computer modelling tools available. There are a few comments on the scientific papers produced by the different groups justifying their work and a section on the main problems with the modelling itself.
I then move on to the major importance of the information needed to drive the models, how it was set up, examples of what information was available; discussion of the management and control needed for such information; and a commentary on the integrity and accuracy of the data itself.
The section on the data problems covers simple examples which led me to believe that the quality of the data was such that it was impossible for the modellers to give an accurate picture of what was happening.
The next section then postulates whether other methods could and should have been used. It describes some traditional techniques I used with the information available to the modellers at the time and the pictures I created of how the epidemic was spreading.
My conclusions follow - the main one being that there was pressure from political sources to come up with a quick solution. The data available for the mathematical modelling was such that it would have been better to use other more traditional methodology right at the beginning to get control of what was happening to the disease. Modelling should have been used as a simple adjunct to other methods - not to shape the policies.
3. Mathematical Models
3.1 Pirbright scientists and modelling FMD
The scientists at Pirbright, the world reference laboratory for FMD, had been working with mathematical models to study various aspects of foot and mouth disease since the 1970s. The EUFMD research group discussed research papers on modelling at their annual conferences in both 1999 and 2000 as well as in earlier years. The group was led by Alex Donaldson and Paul Kitching, who were world class specialists. Both had worked on the EUFMD research group for a number of years and knew most of the other international specialists on this subject.
The EUFMD group had a great deal of knowledge of the pan-Asian O strain of FMD and there had been increasingly serious discussions for at least two years of how to cope with this strain when it arrived in Europe. There had also been discussions of how to cope with the logistics of slaughter and disposal of large numbers of carcasses as it was felt that increasingly this would lead to a public outcry in many countries.
In his 1999 paper Alex Donaldson wrote about the earliest model. Through collaboration between IAH, Pirbright, and the UK Meteorological Office, a computer-based model was developed during the 1970s for assessing the risk of airborne spread of FMD. It was created by bringing together data on the aerobiology of FMD with data on the physical behaviour of particles in the atmosphere under different climatic conditions. The model was shown to be capable of giving a prediction, within a few hours of the confirmation of an outbreak of FMD, of whether there was a risk of spread and, if so, which farms were in jeopardy. The model could predict accurately up to a distance of 10 km from the source. Any farms considered to be at risk could be placed under intensive surveillance so that suspected cases could be quickly identified and eliminated. The model was used successfully under operational conditions during the outbreaks of FMD on Jersey and the Isle of Wight in March 1981.
3.2 The mathematical modelling teams
MAFF invited 3 teams of mathematical modellers to assist with analysis and prediction of the outbreak - a fourth team from Imperial College independently created their own model.
I will refer to them as:
THE VLA team - Professor Wilesmith from the State Veterinary Labs agency was backed up by colleagues from Massey University in New Zealand - they had worked together previously on the BSE problem.
The Cambridge team - Professor Grenfell from Cambridge and his colleagues - they were very experienced at analysing data from an epidemic against various factors to see if fresh insights could be obtained (eg measles, soay sheep, etc)
The Edinburgh team - Professor Woolhouse and colleagues - had the most expertise in Geographic Information systems and were called upon by the other groups in this area.
The Imperial team - Professor Anderson and his colleagues from Imperial College were well known to the Chief Scientist but were not among the groups originally asked to take part. They had experience of BSE modelling and various (mostly human) epidemic diseases. The team had recently moved (Nov 2000) from Oxford to a new department researching Human Health headed by Professor Anderson at Imperial College. Their only experience of FMD was that Ms Donnelly had co-authored a paper in July 2000 where the data from the FMD epidemics in UK 1967 and Taiwan in 1997 were run through epidemic simulations. The conclusion of that simulation exercise was that it was imperative that herds be slaughtered on the day that disease was confirmed and that resources should be available to implement this policy should an outbreak occur.
3.3 The computer modelling tools
Epiman database - developed at Massey University in the early 1990s and used to track and manage outbreaks. Had been adapted for EU conditions and tested by various European groups of FMD researchers. Purchased by MAFF some years before the 2001 UK outbreak but not set up with data. Needs time to be set up.
Quote from a Dutch team of researchers in the mid-1990s 'The EpiMAN(EU) GIS application was user friendly and provided the user with good tools to facilitate certain tasks in the control of a FMD outbreak. The system could be used in the Netherlands, and has potential for other countries as well. However, digitized data has to be available in advance of an outbreak, which is not completely the case yet in the Netherlands. To fully use the possibilities of a DSS such as EpiMAN(EU), a permanent, updated database with farm full information, including farm locations, is necessary.'
This database provided the information for the associated Interspread model, which was used for predictions and modelling.
Cambridge model - more complex model - contains more detail, in terms of describing transmission between individual farms, as a random process, allowing for more heterogeneity: differences between farms, in terms of numbers of animals, different species.
Imperial model - adapted by Neil Ferguson and Christl Donnelly from calculations that used the transmission of human sexually transmitted diseases to model spread together with knowledge gleaned from their work on BSE. Simple model - generalised animal species, and constant infectivity assumed.
Rimpuff Model together with GIS system - developed by University of Denmark in 1990s and adapted for use to predict plumes of virus and spread where predicted weather factors are included.
3.4 Papers written by the modelling teams - inconsistencies
There were a number of scientific papers written by the teams of modellers whose work was used by the government to determine their policies.
The paper published by the Imperial team in May describes the 'model' they used which led government policy to use the contiguous cull of all livestock within 3kms of an Infected Premise.
This model relies heavily on the 'contact tracing' carried out by MAFF in the first 3 weeks and discovered from this information that farms within 3 kms of an IP appeared to be at greater risk of infection. There was no differentiation between different species although information about the different infectivity of sheep, cattle and pigs to this particular strain of FMD was readily available. There was another generalised assumption that infectivity is constant from day 3 after infection to day 11.
Presumably the 3km spread was assumed to be via windborne transmission, although it was known by the Pirbright team that this strain of virus did not spread that way over more than 200 metres.
In the subsequent paper published by this team in October, significant bias in the contact tracing was uncovered and it was suggested that 'local' spread may have been via personnel or vehicles; it was also discovered, by analysing what had happened, that there were significant differences in infectivity between different sizes of farm and types of animal.
In a paper by the Edinburgh and Cambridge teams analysing the epidemic data afterwards, it was again discussed that there was a bias in the contact tracing data towards 'local' contacts without discovering how the disease was spread. They also suggest that in reality there are epidemic dynamics within a farm which means that infectivity changes over time. These became more significant as delays in culling infected animals (as well as all the contiguous stock) built up. Analysis found significant differences in infectivity between species with cattle being more liable to infection and sheep being relatively little affected.
All these facts were known by the Pirbright team BEFORE this epidemic occurred - but were not fully taken account of by the modellers.
3.5 Major problems with modelling
The main problem with all the models was not the methodology or the assumptions, but the quality and integrity of the available data. This was not of a standard consistent with modern practice - there are comments in most of the papers by the modelling teams about the data. These range from the relatively mild comments from the Cambridge team about 'lacunae' in the data, to Anderson of Imperial's comments to the Parliamentary agriculture committee that several of the farms were, according to MAFF's figures, situated in the North Sea.
From my own experience of the data, checked and gathered every single day since I became involved in this epidemic, I know that the data which MAFF/ DEFRA published is extremely inaccurate. Any experienced data analyst would have realised that there was no point in continuing with the modelling unless significant effort was put into validating, verifying and correcting the available data. Until the data validity was improved only extremely simplistic models could be run.
4. Information Collection
4.1 Initial data setup
The information required for the models is in two sections:
Firstly the general information of the overall picture of the locale - both for farming and geography of the areas
Secondly, the particular details of each IP in the FMD outbreak - by location and type of livestock involved
Ideally the general information should already have been available and fed into a Database (presumably that was why the Epiman database was purchased by the MAFF two years previously). This would be useful and appropriate under contingency procedures for ANY outbreak of disease.
Once the foot and mouth disease outbreak was identified in Britain, four New Zealand experts were sent immediately to work with British colleagues to get the system up and running urgently. Professor Morris of Massey University said that loading all the data and getting the program running was done in four days, by cutting corners to get available data into the system as quickly as possible, "warts and all", rather than methodically and calmly as intended.
In this case, the data had to be hurriedly gathered and converted from many different sources - not all of which were compatible. There was very little time for checking the accuracy or consistency of the available data on the 144000 holdings in the UK. Data was collected from:
June 2000 agricultural census
Local MAFF office databases (Vetnet)
Database created for Swine fever outbreaks
From these sources a database of all farms was created to be used in conjunction with the data collected from the Infected Premises. Basic data from this database was then immediately available to be passed to the Epiman database of information about the IPs and used in conjunction with the Interspread model.
4.2 Commentary on data integrity
Unfortunately this data was flawed from the start:
The June census figures are not consistently collected and the numbers of animals on holdings change considerably during the year, especially in February when the number of sheep on each holding needs to be maximised for the sheep premium.
Overall geographical mapping data loaded from current sources did not necessarily match the Vetnet data collected from a number of sources over a period of years. This data was often seriously out of date with regard to crucial factors such as map coordinates, addresses, postcodes, local authorities and counties.
Holding data had not been removed for farms that had stopped farming many years before. Holding numbers were not unique. I was told that there were no computer systems available in local MAFF offices at the start and inexperienced personnel were drafted in to copy all this data in from the Vetnet system, and that inadequate procedures were followed for checking the data for accuracy.
All of these items were crucial for accurate modelling. The data for the infected premises was not sufficiently accurate or consistent to model clearly - nor was it particularly accurate for the MAFF officials who needed to go round the local area to put on the D notices and movement restrictions. This was certainly observed at almost every stage throughout the epidemic and must have been even worse at the beginning.
Data on the outbreak was gathered by MAFF but, as they were not ready to deal with the outbreak, the accuracy and consistency of that data were also poor. This became very apparent as soon as data was published on the MAFF website daily for public viewing.
This data was widely published in the newspapers and on the BBC.
Almost as soon as the data was published the inconsistencies began to show.
4.3 Summarisation of Data
One of the main decisions to be made when setting up data of this type, is the levels to which the data may be aggregated for various purposes. There may be several different data fields that are stored merely so that data can be aggregated - for example, counties, type of holding, which office they report to, etc etc. The decision must be taken when setting up the data as to exactly which lists are to be used, otherwise all data previously loaded has to be changed whenever decisions are made to use a different categorisation.
In the case of the county lists on which the data was based this was never sorted out and even today some of the data is aggregated to one set of 'counties' (local authorities) and other information is aggregated to the 'counties' as they were in 1979. This may seem in itself a small point - but it ensures that it is impossible to completely reconcile the information in the two sets of tables!
This has been true throughout the running of the epidemic - the 'local authority' table used to tell the general public where the Infected Premises were located, has been amended constantly since last year, and has only just (Jan 18th 2002) been corrected.
Because the data has been changed so often, the total number of IPs (Infected premises) assigned to a particular county on the Totals page is often different from the number of Names and addresses on the actual Table; the total number of all these sub-totals is often different from the actual number of IPs at any point in time.
When producing large amounts of data in a computer it is common practice to produce 'control totals' which ensure that all the data is actually entered into the system. At any point, adding up details of the data gives a check as to whether all the data is included. This principle has been totally ignored throughout the course of this epidemic - the summary tables often did not add up to the number of cases in the database.
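A control-total check of the kind described above can be sketched in a few lines of Python. This is only an illustration: the function name `check_control_totals`, the record layout and the sample figures are invented for the example and are not taken from the MAFF/DEFRA systems.

```python
from collections import Counter


def check_control_totals(records, published_totals, key="county"):
    """Compare counts derived from the detail records with the
    published summary figures and return any groups that disagree
    as {group: (published, derived)}."""
    derived = Counter(r[key] for r in records)
    mismatches = {}
    for group in sorted(set(derived) | set(published_totals)):
        d = derived.get(group, 0)
        p = published_totals.get(group, 0)
        if d != p:
            mismatches[group] = (p, d)
    return mismatches


# Hypothetical detail table and summary page (illustrative values only).
records = [
    {"case": 1, "county": "Cumbria"},
    {"case": 2, "county": "Cumbria"},
    {"case": 3, "county": "Devon"},
]
published_totals = {"Cumbria": 2, "Devon": 2}

mismatches = check_control_totals(records, published_totals)
# The sum of the sub-totals should also equal the number of detail records.
grand_total_ok = sum(published_totals.values()) == len(records)

print(mismatches, grand_total_ok)
```

Run on every update, a check like this would flag a summary page whose county sub-totals no longer agree with the names and addresses actually listed.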
4.4 Example of Inconsistency and Inaccuracy in MAFF statistics
A telling example is the slaughter statistics - three different sets of numbers are published on the DEFRA website every day which give the following details:
List 1: the number to date of animals slaughtered rounded to thousands - categorised by cattle, sheep, pigs, goats, deer and 'other' - this is over all premises whatever the status - (infected, direct contact, contiguous and 'slaughtered on suspicion')
List 2: a complete table by counties of the actual animals slaughtered with one row each for cattle, sheep, pigs, goats, deer and 'other' - and one column each for the 4 different kinds of premises
List 3: a summary OVER all counties of the data on list 2
Logically these 3 lists should match each other but they do not. To demonstrate this here is an example from a random day taken recently.
Examples below: (taken from DEFRA as at 14/1/2002 published on 15/1/2002)
List 1: from DEFRA as at 17:00 on 14/1/2002 published on 15/1/2002
4,050,000 animals recorded as slaughtered (594,000 cattle, 3,310,000 sheep, 142,000 pigs, 2,000 goats, 1,000 deer, 1,000 other animals slaughtered)
List 2: base slaughter data by county TOTALLED over all counties - from DEFRA as at 17:00 on 14/1/2002 published on 15/1/2002
Animals              IPs        DC  Non-contiguous       SOS     Total
Cattle Total      304934    193267           82151     14356    594708
Sheep Total       953989    978262         1270834    110902   3313987
Pigs Total         20204     48944           70714      2543    142405
Goats Total          934       664             544       293      2435
Deer Total            25       578             411         3      1017
Other Total          283       306               0         3       592
Grand Total      1280369   1222021         1424654    128100   4055144
List 3 - from DEFRA summary of data over all counties as at 17:00 on 14/1/2002 published on 15/1/2002
Total animals     Infected  DC Contiguous  DC Non contiguous  Slaughter on    Grand
slaughtered       premises       premises           premises     suspicion    Total
Cattle              304934         193267              82151         14356   594708
Sheep               953989         979505            1270834        110902  3315230
Pigs                 20204          51594              70714          2543   145055
Goats                  934            666                544           293     2437
Deer                    25            578                411             3     1017
Other                  283            306                  0             3      592
Grand Total        1280369        1225916            1424654        128100  4059039
COMPARISON OF DATA ON LISTS 1, 2 AND 3 ABOVE
Looking at the above 3 Lists the assumption must be that the data on List 2 is the most accurate (as the others are only summaries and List 2 contains ALL the basic data by county)
List 3 has anomalies for sheep, pigs and goats in the DC contiguous premises column (979505, 51594 and 666, against 978262, 48944 and 664 on List 2) and the same anomalies in the Grand Total column.
List 1 has anomalies in both the details for sheep and cattle. This data should be:
(amended List 1)
4,055,000 animals recorded as slaughtered (595,000 cattle, 3,314,000 sheep, 142,000 pigs, 2,000 goats, 1,000 deer, 1,000 other animals slaughtered)
Note: On 20th February 2002, List 3 has finally been corrected to match the totalled information over all counties on List 2 - however the rounded data on List 1 is still different from the details on the other list by some 5000.
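The comparison of the three lists can be reproduced mechanically. The sketch below uses the figures quoted above for 14/1/2002; the variable and function names are my own, and it assumes that the 'DC' column of List 2 corresponds to the 'DC Contiguous premises' column of List 3.

```python
# Column order for both tables: IPs, DC (contiguous), non-contiguous, SOS.
COLUMNS = ["IP", "DC", "non_contiguous", "SOS"]

# List 2: county data totalled over all counties.
list2 = {
    "cattle": [304934, 193267, 82151, 14356],
    "sheep": [953989, 978262, 1270834, 110902],
    "pigs": [20204, 48944, 70714, 2543],
    "goats": [934, 664, 544, 293],
    "deer": [25, 578, 411, 3],
    "other": [283, 306, 0, 3],
}

# List 3: DEFRA's own summary over all counties.
list3 = {
    "cattle": [304934, 193267, 82151, 14356],
    "sheep": [953989, 979505, 1270834, 110902],
    "pigs": [20204, 51594, 70714, 2543],
    "goats": [934, 666, 544, 293],
    "deer": [25, 578, 411, 3],
    "other": [283, 306, 0, 3],
}

# List 1: headline figures rounded to thousands, as published.
list1 = {"cattle": 594000, "sheep": 3310000, "pigs": 142000,
         "goats": 2000, "deer": 1000, "other": 1000}


def row_totals(table):
    """Total each species over the four premises types."""
    return {species: sum(values) for species, values in table.items()}


def round_to_thousands(n):
    """Round a non-negative integer to the nearest thousand (half up)."""
    return (n + 500) // 1000 * 1000


def cell_differences(base, other):
    """Return {(species, column): other - base} for every cell that differs."""
    diffs = {}
    for species in base:
        for col, x, y in zip(COLUMNS, base[species], other[species]):
            if x != y:
                diffs[(species, col)] = y - x
    return diffs


totals2 = row_totals(list2)
list3_vs_list2 = cell_differences(list2, list3)
expected_list1 = {s: round_to_thousands(t) for s, t in totals2.items()}
list1_vs_list2 = {s: (list1[s], expected_list1[s])
                  for s in list1 if list1[s] != expected_list1[s]}

print("List 3 cells differing from List 2:", list3_vs_list2)
print("List 1 (published, expected from List 2):", list1_vs_list2)
print("Grand totals:", sum(totals2.values()), sum(row_totals(list3).values()))
```

The only differences it reports are the three DC cells on List 3 and the cattle and sheep figures on List 1, which is the same result as the manual comparison.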
4.5 Collection History for IP data
There are 2 different tables on the MAFF/ DEFRA website that list all the Infected Premises - one in numerical order and one list of similar information by county. This latter table has always been rather strange and caused much confusion. Let me explain what happened chronologically:
March 2001
During the first month of the outbreak the only list that became available was the list of IPs by county.
The information on this table is (by column)
Case number
Date confirmed
Name and address of Owner and premises
April 2001
Sometime during April the List of All Infected Premises became available and contained an extra column - the number and type of animals (apparently this information was provided from the valuation details, which would account for why so many of them appeared to have nicely rounded numbers such as 2000 sheep).
As I had collected all the county data in my own spreadsheet, I extracted the list of the numbers of animals so I would have the most complete data available. It soon became apparent to me, when checking the data from one list to the other, that the names and addresses from the two lists were slightly and subtly different - as if typed by different people.
Typing data twice would normally seem just a waste of time and resources - BUT in Information Technology terms it is a disaster. Data should only be saved in one place so that it can remain consistent - and then when changes need to be made they only need to be made in one place. At the beginning I thought this must just be due to a shortage of staff in a pressurised department - but as time went on, it became just another factor demonstrating that MAFF really had no proper controls and plan for how to organise and validate the information needed.
This was clearly true in the main computer database they used for ALL the data about the farms. IACS holding numbers are not unique - anecdotal evidence from farmers indicated there were often several farms with the same (supposedly unique) holding number - probably because numbers were generated by local offices and not centrally - a throwback to old manual pre-computerisation methods.
May 2001
It had become apparent that there were other inconsistencies with the way data was collected.
At least two cases were 'downgraded' from IPs to 'slaughter on suspicion' and the numbers re-used for later cases - as you see from the data below extracted from the 2 different sets of tables:
County               IP no  Date      Premises                                                        Cattle      Sheep
Dumfries & Galloway  229*   14/03/01  Mr D Stoddart High Law Lockerbie Dumfries and Galloway          296 cattle  145 sheep
Devon                590*   25/03/01  Mr B Drury Winstode Farm Crediton Devon EX17 5HQ
Northumberland       590    02/04/01  Mr W Aynsley Whiteside Law Hallington Newcastle Northumberland  600 cattle  71 sheep
Cumbria              229    11/04/01  M/S R W & J J Steel Lesson Hall Wigton Cumbria CA7 0EA          89 cattle   224 sheep
The two original cases 229 and 590 were revoked and the numbers reused later (note dates!), so sorting the list into numerical order does not give it in date order.
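Both effects (numbers used twice, and numerical order no longer being date order) are easy to detect automatically. The sketch below uses the four entries from the extract above; the function names are illustrative and the two-digit years are assumed to mean 2001.

```python
from collections import defaultdict
from datetime import datetime

# (IP number, date confirmed, county) taken from the extract above.
cases = [
    (229, "14/03/01", "Dumfries & Galloway"),
    (590, "25/03/01", "Devon"),
    (590, "02/04/01", "Northumberland"),
    (229, "11/04/01", "Cumbria"),
]


def parse_date(text):
    """Parse a dd/mm/yy date as published on the tables."""
    return datetime.strptime(text, "%d/%m/%y").date()


def reused_numbers(rows):
    """Return {IP number: [(date, county), ...]} for numbers used more than once."""
    seen = defaultdict(list)
    for number, date_text, county in rows:
        seen[number].append((parse_date(date_text), county))
    return {n: sorted(uses) for n, uses in seen.items() if len(uses) > 1}


def date_order_breaks(rows):
    """Keep the latest holder of each number, sort by number, and report
    neighbouring pairs where the higher number has the earlier date."""
    current = {}
    for number, date_text, _county in rows:
        d = parse_date(date_text)
        if number not in current or d > current[number]:
            current[number] = d
    ordered = sorted(current.items())
    return [(a, b) for (a, da), (b, db) in zip(ordered, ordered[1:]) if db < da]


reused = reused_numbers(cases)
breaks = date_order_breaks(cases)
print(sorted(reused), breaks)
```

A case number that is meant to identify one premises should never appear in `reused`, and a list that is meant to be chronological should give an empty `breaks`.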
However in all other cases, when the blood tests came back as negative, the cases were left on the list of infected premises and remained there. Contiguous farms or farms on a 'D' notice were still culled out or kept on movement restrictions - taking up much needed veterinary and surveillance resources.
4.6 County Tables for Slaughter Statistics
The list of counties was changed several times during the outbreak - mostly trying to get the data more consistent. However, near the end of May, tables of slaughter statistics became available.
These consisted, for each county, of the number of:
Cattle, sheep, pigs, goats, deer and other animals killed - plus a total
with a column for each type of premise:
IPs, Contiguous Premises, Direct Contacts, Slaughter on Suspicion plus a total
Definition of type of premises (see definition from DEFRA website below)
Infected Premises (IPs): premises where foot and mouth disease has been confirmed.
Dangerous Contacts (DCs): premises where animals have been subject to direct contact with infected animals or have in any way been exposed to infection. The figures in the 'other DCs' column exclude data separately recorded for contiguous premises (CPs).
Contiguous Premises (CPs): a category of dangerous contacts where animals may have been exposed to infection on neighbouring infected premises.
Slaughtered on Suspicion (SOS): premises where a veterinary inspection detects some symptoms of disease, but these are insufficient to confirm that foot and mouth disease is present. Animals are culled and samples taken to confirm the presence/absence of disease. The tables include only those animals where samples have proven negative or remain unconfirmed; SOS cases giving positive results, or that are subsequently confirmed on clinical grounds, are classified as IPs, and slaughtered animals are recorded in the IP column.
Example of section of slaughter statistics taken from DEFRA slaughter statistics:
County  Data        IP  Contiguous  Non Contiguous     SOS   Total
Avon    Cattle     116        1291               0       6    1413
Avon    Sheep       40         829               0     534    1403
Avon    Pigs      5491           9               0    2058    7558
Avon    Goats        8           3               0       0      11
Avon    Deer         0           0               0       0       0
Avon    Other        0           0               0       0       0
Avon    Total     5655        2132               0    2598   10385
The county list for these statistics is very different from that used to describe the IPs, and was described on the DEFRA website (during early January 2002) with the words 'These statistics are classified into county boundaries as previously determined in 1979'.
It is difficult to cross reference this list of counties to the original list classifying the IPs.
Much of the data provided by DEFRA as the answers to various Parliamentary questions is in this format.
5. Examples of Data Problems
5.1 Analysis of Statistics - Original County List
The list of counties was totally uncontrolled and inconsistent - people doing their best but with no direction and management. Quite often the total of the number of IPs by county (on the page at the front so you could access data by county) did not add up to the total number of cases to date.
It was apparent (from the OIE reports provided to the EU by the chief vet every fortnight) that the totals by county are an international convenience for management and reporting - especially as the management of many government people who need to be involved eg Trading Standards officers, footpath management (to name but two), are all managed by the network of local authorities.
However, instead of providing a full list of local authorities so that the data entry clerks could enter the data correctly and consistently, it was left to the data entry people to work it out for themselves. This they tried to do from the addresses. However, much of the data on the database created by MAFF was out of date and, especially since the new Unitary Authorities arrived in the 1990s, the local authority is no longer part of most people's postal address.
The counties and local authorities had changed several years before, and post codes (if available) had also been extensively changed in the last few years. Names and addresses had been extracted from internal MAFF databases and did not tie up at all with current accurate information.
This was obvious to me from the beginning - from the data published, it was difficult to locate the farms involved on the online mapping systems - very quickly I realised that, as with so many enterprises of this nature, the data was not very good and was being badly managed.
5.2 Name and Address anomalies
During May I found one of the most curious anomalies - still there on both tables: Case 1557 has two different names and addresses (many miles apart)!
Current numeric list of all IPs
1557 May 5 JN & GM Hadwin, High Aulthurstside Farm, Woodland
Broughton in Furness, Cumbria, LA20 6AE Cattle 32
Current list for Cumbria (January 2002)
FMD 2001/1557 Stoddart, Hillside, Wigton, Cumbria
06/05/2001 10:39 CUMBRIA
5.3 Data validation
To show what I mean:
An example that demonstrates the differences and lack of checking is the following set of information extracted from the two different parts of the database a few weeks apart:
Source list   County         IP no  Date      Premises                                                                   Cattle    Sheep
Numeric list  Herefordshire  535    23/03/01  M/S SN & SJ Gibbens Llwynbrain Farm Llanigon Hay on Wye Hereford HR3 5QF   41 cows   220 sheep
County list   Devon          536*   24/03/01  M/S AF & SJ Gibbens North Down Farm Lewdown Okehampton Devon EX20 4EB
Numeric list  Devon          536    24/03/01  M/S AF & SG Loud & Sons North Down Farm Lewdown Okehampton Devon EX20 4EB  480 cows
Clearly the person entering the data for case number 536 on the county list started correctly with 'M/S AF & ' and was then interrupted. This person then continued typing 'SJ Gibbens' as in the previous address, then continued again, presumably after another break, with the remainder of the address for the Loud family, without ever checking that the correct name and address had been entered.
I picked up these differences because one name and address was on the County list, and the other was on the 'all cases' list. The case 536* later disappeared as the correction was made, unlike many of the other mistakes, which remain to this day.
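The comparison I did by eye can be sketched in a few lines: key both lists by IP number and report any case whose premises text differs. This is only an illustration using the entries quoted in this section; the function names are my own and nothing here reflects how the MAFF/DEFRA database actually worked.

```python
# Entries quoted in this section, keyed by IP number.
numeric_list = {
    535: "M/S SN & SJ Gibbens Llwynbrain Farm Llanigon Hay on Wye Hereford HR3 5QF",
    536: "M/S AF & SG Loud & Sons North Down Farm Lewdown Okehampton Devon EX20 4EB",
}
county_list = {
    536: "M/S AF & SJ Gibbens North Down Farm Lewdown Okehampton Devon EX20 4EB",
}


def normalise(text: str) -> str:
    """Collapse whitespace and ignore case so only real differences are reported."""
    return " ".join(text.upper().split())


def mismatches(list_a: dict, list_b: dict) -> list:
    """Return IP numbers present in both lists whose premises text differs."""
    shared = sorted(set(list_a) & set(list_b))
    return [ip for ip in shared if normalise(list_a[ip]) != normalise(list_b[ip])]


print(mismatches(numeric_list, county_list))
```

A check of this kind only flags disagreements between the two lists; it cannot say which entry is the correct one.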
5.4 Constituency Tables - why do they exist?
Another table created in May 2001 (and still updated regularly) is Infected Premises by Parliamentary Constituency - this table has columns for
Constituency name
Constituency MP's name
Party to which the MP belongs
Number of IPs in that Constituency
This table was 23 cases short when first published and has been ever since, however often it was updated. My explanation for this is that presumably the person who was given the task of creating this table (relatively easy on the Internet as long as you have the exact post codes for all the IPs) extracted the data at a particular point. Once the table had been created, it was a few days later and there were 23 more cases of Foot and Mouth - however whoever was in charge did not check that the total number of cases added up to that on the IPs. After that, whoever updated this table on the daily list merely updated for the new cases that day. So that table is still 23 cases short!
This table was last updated around Christmas and has finally had the totals corrected.
It worries me as to who this table was for - there are fewer than 700 MPs, and this specialised group of people certainly did not need some of the limited resources to be spent creating this table for them. It concerns me that this continued to be updated and corrected right up until the beginning of 2002.
5.5 Changing the Tables
In January 2002 the website data for the Infected Premises by County was transformed once more, into a list which classified each IP according to the Local Authority which controls it.
The OLD County tables looked like this:

FMD 2001/nn  Date confirmed  Owner/premises
1470         25 Apr          Mr W J Francis, Goitre Coed Farm, Abercynon, Mountain Ash, Mid Glamorgan, CF45 4EN
1426         21 Apr          Mr EH Leyshon, Pentwyn Farms, Crymlyn Road, Skewen, Neath, West Glamorgan, SA10 6NL
1108         8 Apr           Mr M A J Jarrold, Parc Farm, Nelson, Caerphilly, Mid Glamorgan, CF46 6DR
This data is now on 3 different data tables (1 table for each unitary authority) and looks like this:

INFECTED PREMISES (IP) FMD NUMBER  OWNER NAME & IP ADDRESS                                            CONFIRMATION DATE  COUNTY NAME                                   DESCRIPTION
FMD 2001/1108                      Jarrold, Park Farm, Nelson, Caerphilly, Mid Glam                   07/04/01 20:10     CAERFFILI - CAERPHILLY                        Unitary Authority
FMD 2001/1426                      Leyshon, Pentwyn Farms, Crymlyn Road, Skewen, West Glasmorgan      21/04/01 17:53     CASTELL-NEDD PORT TALBOT - NEATH PORT TALBOT  Unitary Authority
FMD 2001/1470                      Francis, Goitre Coed Farm, Abercynon, Mountain Ash, Mid Glamorgan  25/04/01 13:10     RHONDDA, CYNON, TAF - RHONDDA, CYNON, TAFF    Unitary Authority
You will notice that the new tables have additional data, the name of the Local authority and a column that describes what type of authority it is.
However, the names and addresses have been retyped yet again and, in doing so, have lost both the owners' initials and the Post Code. This makes the farms difficult to locate for anyone interested in studying the outbreak.
The minor spelling mistakes and the changes in dates demonstrate yet again that there is insufficient checking and validation of data.
5.6 Analysis of the new County List by Local Authority
I spent some time trying to match the December County Tables to the new lists. The numbers appeared to match fairly reasonably until I looked up specific data and found that, even on a first rough comparison, the new lists contained as many anomalies as the old lists.
When I tried to find whether the new list matched up for counties I knew well, I found that the information as recoded is more accurate in some places and less accurate in others - there are complications with Hereford, Monmouthshire, Powys, Shropshire. Some of the old IPs have been corrected but new ones have now been moved into the 'wrong' counties.
Take case 1 (Cheale Meats abattoir) and case 2 (A Cheales farm), which is right next door - literally a few hundred metres away, as is the farm of case 3 - Mr Gemmill. These all used to be in Essex - now cases 1 and 3 are in Essex and case 2 is in Thurrock.
Shropshire used to have 12 cases in the old list - it now has 15 cases, of which 3 cases are in the old Powys list (and are still in Montgomeryshire, which is in Wales and hence definitely not in Shropshire), 10 cases are from the old Shropshire list and 2 cases are from the old Staffordshire list. Two cases that used to be on the Shropshire list have moved over to Staffordshire!
When I tried to sort out the relationship of the old and new Powys lists I found links to cases previously in Monmouthshire, Herefordshire, and Shropshire- it was just too complex to sort it all out in a hurry.
So the new lists do not seem any more accurate than the old ones - and with even less data available, it is more difficult to find the locations of individual IPs.
6. Other Methodology
6.1 Other modelling methods- volunteers
On the Internet there were several other people with different experience of modelling who have been involved in trying to predict how the foot and mouth epidemic behaved. Some of these used mathematical modelling techniques such as logistic equations (used in population biology) and simple parameterised models. Others used traditional mapping techniques using standard computerised maps which are easily available on the Internet.
Most of these techniques were of a less complex nature than those created by the teams of bio-mathematicians employed by MAFF and because of the inherent simplicity of the techniques it was often easier to 'see' how changing the parameters produced different results.
Hence these independent researchers were able to validate their 'models' against the predictions made at various stages and check the assumptions against the actual results. Many of these people published extensively during the outbreak - because of their concerns at the way the epidemic was being tackled and the reliance on complex mathematical models to determine policy.
Web discussion groups all over the world became involved in trying to find a better way forward. Researchers volunteered and shared knowledge; volunteers worked very hard doing whatever they could to assist.
6.2 Maps
Several volunteers mapped all the outbreaks using computerised tools. Others tried to predict the spread via these maps. Many theories abounded as to what caused the spread.
6.3 Tracings
The tracing information created by MAFF/DEFRA did not become available to the general public until the outbreak was finished. However, when I tried to use it to understand the information in the paper by the Cambridge group, I soon found the data in the scientific paper did not tie up to the tracing information published to the Parliamentary agriculture committee by Jim Scudamore. Again the anomalies - this time in the number of abattoirs, markets shown. And of course, by that point I had become aware that a considerable proportion of the Infected Premises had tested negative and had probably not had FMD at all. So I abandoned that particular line of investigation until I have more accurate information.
6.4 Picture of Epidemic spread using more traditional methods
When trying to comment on the mathematical modelling techniques available and used in the early days of the epidemic, I tried to put myself in the position of having only the data that was available at the time the models were created in March, and the ordinary tools available at the time.
Naturally the modellers had the advantage over me in that they had more complete data on each infected premises and on any neighbouring premises, as well as their models.
However, I had the complete list of cases as and when they were reported, with all the public details available, and I also had a simple road map. I would have preferred to use the standard Ordnance Survey maps as (especially the older ones) they show nearly all farms, but then I would have needed the sort of 'war-room' size table that the Army and logistics people presumably used.
Anyway, I just took the Infected Premises and numbered small round white labels with the case numbers and coloured the labels differently for weeks 1,2,3 and 4 so that the pattern emerged on the map. I then located the IPs as accurately as I could (more difficult with the early ones because Post Codes were not at that stage included on the MAFF website)
This gave me a similar sort of information to that available to the logistics teams AND the modellers. The following table translates the little dots on my map into figures for the first 4 weeks.
Table 6.4.1 Weekly cases of FMD over first 4 weeks analysed as IPs, Counties and Areas

Week number   1         2        3         4         First 4 weeks
Ending Sun    25th Feb  4th Mar  11th Mar  18th Mar
Weekly
  IPs         7         61       96        161       325
  Counties    3         16       7         4         30
  Areas       3         26       20        16        65
Cumulative total
  IPs         7         68       164       325       325
  Counties    3         19       26        30        30
  Areas       3         29       49        65        65
Key: IPs = confirmed cases ie infected premises
Counties are mostly as defined in the postal address (some anomalies)
Areas are defined by groups of dots showing several cases round a local geographical area
Week 1 was characterised by just a few cases BUT already it was clear that these were in 3 very geographically dispersed areas - Essex, Devon and Northumberland.
At the end of Week 2 there were 68 confirmed IPs in 19 counties, with 29 different areas of the country affected. This week was the worst for spread.
Week 2 and week 3 show the results of spread from markets and increasing recognition of the disease.
At the beginning of week 4, although the number of cases continued to rise, the stopping of markets and movements began to take effect. The cases in the 4th week were mainly due to spread from other IPs or 'not known'.
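The cumulative rows of Table 6.4.1 are simply running totals of the weekly rows. A minimal sketch of that arithmetic, assuming (as the published cumulative figures imply) that the weekly Counties and Areas rows count only the newly affected counties and areas in each week:

```python
from itertools import accumulate

# Weekly figures from Table 6.4.1 (weeks 1 to 4).
weekly = {
    "IPs": [7, 61, 96, 161],
    "Counties": [3, 16, 7, 4],
    "Areas": [3, 26, 20, 16],
}

# Running totals: each week's cumulative figure is the sum of all weeks so far.
cumulative = {name: list(accumulate(counts)) for name, counts in weekly.items()}

for name, totals in cumulative.items():
    print(name, totals)
```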
Adding in the data from the Tracing document (for first 325 cases) and grouping them by what MAFF believed to be the method of spread we get the following table:
Table 6.4.2 Weekly cases analysed by Tracing document for method of spread of FMD

Tracing From   Week 1  Week 2  Week 3  Week 4  Total 1st 4 weeks
Index case     1       -       -       -       1
Markets        -       27      21      6       54
Abattoirs      3       1       6       3       13
Dealers        -       17      13      2       32
Previous IP    3       15      43      82      143
Unknown        -       -       5       42      47
Investigating  -       -       5       22      27
Not Listed     -       1       3       4       8
Total          7       61      96      161     325
This is clearer when we take the data on the previous table and categorise the method of spread as Unknown, Professional (ie via markets, abattoirs, and via dealers) and then from other Infected premises and then use this as percentages of the whole.
Table 6.4.3 Categorisation of Table 6.4.2 as a % of types of spread

                  Week 1  Week 2  Week 3  Week 4  Total 1st 4 weeks
%s of above       %       %       %       %       %
Origin not known  0       2       14      42      26
Professional      43      73      42      7       30
Other IP          57      25      44      51      44
Total             100     100     100     100     100
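The grouping behind Table 6.4.3 can be sketched as follows. Two things here are assumptions rather than statements from the tracing document: the week in which each blank cell of Table 6.4.2 falls is inferred from the weekly totals, and the grouping of 'Investigating' and 'Not Listed' with 'Unknown', and of the Index case with 'Previous IP', is inferred from the published percentages. Whole-number rounding may differ by a point from the published table.

```python
# Weekly counts from Table 6.4.2 (weeks 1 to 4); blank cells are taken as zero
# and their positions are inferred from the weekly totals (an assumption).
tracing = {
    "Index case": [1, 0, 0, 0],
    "Markets": [0, 27, 21, 6],
    "Abattoirs": [3, 1, 6, 3],
    "Dealers": [0, 17, 13, 2],
    "Previous IP": [3, 15, 43, 82],
    "Unknown": [0, 0, 5, 42],
    "Investigating": [0, 0, 5, 22],
    "Not Listed": [0, 1, 3, 4],
}

# Assumed grouping of tracing sources into the three categories.
groups = {
    "Origin not known": ["Unknown", "Investigating", "Not Listed"],
    "Professional": ["Markets", "Abattoirs", "Dealers"],
    "Other IP": ["Index case", "Previous IP"],
}


def category_shares(week: int) -> dict:
    """Percentage of that week's cases falling in each category (week 0 = Week 1)."""
    total = sum(row[week] for row in tracing.values())
    return {
        group: 100 * sum(tracing[source][week] for source in members) / total
        for group, members in groups.items()
    }


for week in range(4):
    shares = category_shares(week)
    print(f"Week {week + 1}:", {group: round(value) for group, value in shares.items()})
```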
Week 1 was fairly evenly divided between the spread in Essex believed to be from the 'Index case' (ie first case of FMD) together with other 'local' spread, and the long distance spread via dealers and markets and the wind.
Week 2 has most cases being caused by long distance spread via markets and dealers and abattoirs in the previous weeks before movement restrictions were enforced.
Week 3 and 4 have a larger number of infections that had not been traced at the time this tracing document was published in December - but the infections have spread to 30 counties - and we know now that the virus only spread eventually to 33 counties.
Admittedly counties are a rather blunt instrument - some, like North Yorkshire, cover a huge area. And of course, political boundaries are not respected by the virus in its spread - just look at Cumbria and the Dumfries and Galloway cases - so close together.
In the work I did above a finer and more exact picture would be provided by using part of the Postcodes - eg for postcode NP16 3JT use the first part NP16 together with the numeric part of the second part 3 - to give NP163 - this indicates a fairly close locality (usually a postman's route).
This 'postcoding' to show how close farms are to each other is a reasonable approximation - I used it after late April to try and track the outbreak - but the post code data was not always available at the beginning of the outbreak. It is only an approximation - but better than counties for tracking spread!
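The 'postcoding' rule is simple enough to sketch. The function names are illustrative only, and the sketch assumes a complete postcode whose second part is the usual three characters; the example premises use post codes from the old county tables quoted earlier.

```python
from collections import defaultdict


def postcode_sector(postcode: str) -> str:
    """First part of the post code plus the digit of the second part,
    e.g. 'NP16 3JT' gives 'NP163'."""
    compact = "".join(postcode.split()).upper()
    if len(compact) < 5:
        raise ValueError(f"not a full postcode: {postcode!r}")
    # The second part of a full UK post code is always its last three characters.
    return compact[:-3] + compact[-3]


def group_by_sector(premises: dict) -> dict:
    """Group IP numbers by locality so that close neighbours fall together."""
    groups = defaultdict(list)
    for ip_number, postcode in premises.items():
        groups[postcode_sector(postcode)].append(ip_number)
    return dict(groups)


# IP number -> post code, as shown on the old county tables.
example = {1470: "CF45 4EN", 1426: "SA10 6NL", 1108: "CF46 6DR"}
print(postcode_sector("NP16 3JT"))
print(group_by_sector(example))
```

Any IP without a post code simply cannot be grouped this way, which is exactly the limitation I met with the early cases.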
This was the very best data that would have been available at the time of the meetings with MAFF about modelling. The actual data available was probably rather worse than the above - undoubtedly the tracing data was even less complete than the above data extracted from the tracing data discussed by Jim Scudamore in December at the Brussels conference.
6.5 Results of the Picture method
However the locations of the IPs were available and could be mapped, traditionally with a map or using the computers as a tool. What was clearly visible from the maps was that the outbreaks were contained at that point in 30 counties and 65 local areas of the country.
The movement restrictions in the whole county did not necessarily take account of geographical features and boundaries which were natural limiters - the virus paid no heed to 'political' boundaries and many 'areas' straddled county boundaries.
The tendency to spread along major routes once the first case occurred in an area was noticeable by the mapping even then. This factor could well have been due to spread via regular visitors to local farms, especially milk tankers, feed lorries, and once under surveillance, vets. Virus is shed for several days before the signs of FMD show - and the virus can be carried in the throats of humans. It appeared that some cases appeared along routes used by cull lorries taking carcasses to be burnt or buried.
Conventional FMD procedure at this point in an epidemic in many countries outside Europe is to stop all movements in the infected areas, cull ALL stock on each IP, and make regular observations on all stock within the local area after risk assessment. Suppressive or prophylactic vaccination can also be applied at this point often in a ring, working inwards from the boundary, so that the virus 'has no place to go'. Whether this vaccination is vaccination to cull or vaccination to live, is a political decision. It does stop the virus very quickly.
This might have been possible in week 3 or even week 4, if the amount of vaccine had been available and personnel could be drafted in. It would probably have been no more difficult than finding slaughtermen and people to deal with the subsequent piles of carcasses. Certainly the foci for vaccination in many areas were quite visible.
6.6 Local Spread ?
The paper in the Veterinary Record by the VLA team on their mathematical model refers to 'local' spread as a reason for spread of FMD - without any other explanation. This was either spread through close contact of animals with infected animals (not airborne except for a very short distance) or spread by contact with a person who had contact with infected animals.
The computer videos created with the paper by Woolhouse demonstrate a phenomenon I noticed much later in the epidemic, when I had time to use the maps to check where the daily cases appeared - the tendency to spread down roads from one location to the next indicated to me that this was spread (probably unwittingly) by visitors moving along roads. The Cumbria county website has maps of cases over time that also show this phenomenon.
7. Conclusions
Trying to look back at the modellers and when they did this work in mid-March, the data was not really adequate. However, unless any of them were expert in the particular data areas (such as post codes, county data), they may not have realised quite how bad the data was.
I said previously that I personally would have felt that with that quality of data it was impossible to model accurately.
Hindsight is a wonderful thing, and it is difficult to know what went on with all these disparate teams of 'experts'. Certainly there was a great deal of pressure from politicians as well as other people in positions of power to come up with a solution. There were other agendas as well as just coping with the disease.
In this situation it would have needed a very strong and resolute character to admit that their results might not be particularly good. With everyone in such a hurry and 'making it up as they went along' there was no system in place to run the elementary data checks that were necessary as information was gathered. It would have been considered a failure on their part to suggest that a return to manual methods would perhaps produce better results. So they carried on - with the enormous glare of publicity I expect anyone with doubts as to the accuracy of what they were doing kept their mouths firmly shut.
Scientists should have questioned the conclusions and methods more - unfortunately politicians have different agendas. Many do not really understand scientific method and the need for debate to validate and discuss different views. And FMD is a political disease.
Much of the comment hidden in the later papers suggests that more research is necessary into areas which are not yet clear. This may well be of use in the future, except that accurate information and testing are not available for many of the Infected Premises, and hence the data is still not of sufficiently high quality for further analysis of what they call the 'biggest epidemic of FMD in the world'. This is very unfortunate.
There was one place where modelling could and SHOULD have been used - I don't know if it was. This was in the management of the logistics of the whole effort rather than the spread of the disease- attempting to predict the personnel, vehicles, vaccine, test facilities required versus the constraints and limitations of what was available and how fast it could be made available.
This sort of model exists in many business packages - classical game theory type variants are readily available for most computers. Modelling these areas should have paid off and allowed the managers to recognise the limitations of what they could achieve very quickly.
Resources were always going to be limited - modelling what one could do with the resources available was very necessary.
8. APPENDIX
8.1 Bibliography
Number. Title (Group)
1. Requirements of a geographical information system to be used during a foot-and-mouth disease outbreak - M Nielen, AW Jalvingh, AA Dijkhuizen, R Lattuada - published in mid 1990s (Wageningen, Netherlands)
2. Airborne spread of foot-and-mouth disease - Alex Donaldson - Microbiology Today, vol 26, Aug 1999 (Pirbright)
3. Risk and economic consequences of exotic animal disease outbreaks: Computer simulation to help set priorities in policy making - Dr. Suzan Horst, Wageningen University, Farm Management - EUFMD meeting in Maisons-Alfort, France, Sept 1999 (Wageningen, Netherlands)
4. The importance of immediate destruction in epidemics of foot and mouth disease - Howard and Donnelly - published October 2000 (Imperial)
5. Report of the Session of the Research Group of the Standing Technical Committee of EUFMD - Preliminary results from the EUFMD Expert Elicitation Workshop on risk of introduction of FMD to Europe - John Ryan and Lisa Gallagher - Borovets, Bulgaria, September 2000 (EUFMD)
6. Foot and Mouth Disease 2001 - epidemiological forecasts - MAFF/DEFRA, 23rd March 2001 (MAFF)
7. Managing foot-and-mouth - Woolhouse and Donaldson - Nature, 29th March 2001 (Edinburgh & Pirbright)
8. The Foot-and-Mouth Epidemic in Great Britain: Pattern of Spread and Impact of Interventions - Ferguson, Donnelly and Anderson - Science, vol 292, May 2001 (Imperial)
9. Relative risks of the uncontrollable (airborne) spread of FMD by different species - Donaldson, Alexandersen, Sorenson, Mikkelson - Veterinary Record, May 12, 2001 (Pirbright)
10. Relative resistance of pigs to infection by natural aerosols of FMD virus - Donaldson, Alexandersen - Veterinary Record, May 12, 2001 (Pirbright)
11. Predictive spatial modelling of alternative control strategies for the foot and mouth disease epidemic in Great Britain 2001 - Morris, Stern, Stevenson, Wilesmith and Sanson - Veterinary Record, vol 149, August 2001 (VLA and Massey)
12. Dynamics of the 2001 UK Foot and Mouth Epidemic: Stochastic Dispersal in a Heterogeneous Landscape - Keeling, Woolhouse, Shaw, Matthews, Chase-Topping, Haydon, Cornell, Kappey, Wilesmith, Grenfell - Science, vol 294, October 2001 (Cambridge & Edinburgh)
13. Transmission intensity and impact of control policies on the foot and mouth epidemic in Great Britain - Ferguson, Donnelly and Anderson - Nature, vol 413, October 2001 (Imperial)
14. Links between FMD cases - Scudamore - Parliamentary Agriculture Committee, Nov 2001, and Brussels conference, December 2001 (MAFF)
15. Descriptive epidemiology of the 2001 foot and mouth disease epidemic in Great Britain: the first five months - Gibbens, Sharpe, Wilesmith, Mansley, Michalopoulou, Ryan, Hudson - Veterinary Record, Dec 15, 2001 (VLA)
Appendix 8.2: INCONSISTENT COUNTY AND LOCAL AUTHORITY LISTS
8.2 (a) OLD County Classification as at DECEMBER 2001 - 33 counties
Number  County Name       IPs
1       Anglesey          13
2                         0
3                         11
4                         16
5                         93
6                         4
7                         893
8                         8
9                         173
10                        176
11                        11
12                        5
13                        76
14                        43
15                        5
16                        53
17                        6
18                        26
19                        1
20                        87
21                        134
22                        2
23                        69
24                        11
25                        8
26                        48
27                        5
28                        6
29                        2
30                        6
31                        9
32                        26
33      Northern Ireland  4
Total                     2030
Source: DEFRA website
8.2 (b) NEW County and Local Authority Classification - as at JANUARY 2002 - 44 COUNTIES

Number  Name                                          IPs*  Description
1       Bradford District                             5     Metropolitan District
2       Caerffili - Caerphilly                        2     Unitary Authority
3       Casnewydd - Newport                           3     Unitary Authority
4       Castell-Nedd Port Talbot - Neath Port Talbot  1     Unitary Authority
5       Cheshire County                               16    County
6       City of Bristol                               1     Unitary Authority
7       Cornwall County                               4     County
8       County of Herefordshire                       44    Unitary Authority
9       Cumbria County                                893   County
10      Darlington                                    8     Unitary Authority
11      Derbyshire County                             8     County
12      Devon County                                  173   County
13      Dumfries and Galloway                         176   Unitary Authority
14      Durham County                                 85    County
15      Essex County                                  9     County
16      Gloucestershire County                        72    County
17      Greater London Authority                      1     Greater London Authority
18      Kent County                                   3     County
19      Lancashire County                             53    County
20      Leeds District                                1     Metropolitan District
21      Leicestershire County                         6     County
22      Medway                                        2     Unitary Authority
23      Newcastle Upon Tyne District                  6     Metropolitan District
24      North Yorkshire County                        133   County
25      Northamptonshire County                       1     County
26      Northumberland County                         88    County
27      Oxfordshire County                            2     County
28      Powys - Powys                                 70    Unitary Authority
29      Rhondda, Cynon, Taf - Rhondda, Cynon, TAFF    1     Unitary Authority
30      Scottish Borders                              11    Unitary Authority
31      Shropshire County                             15    County
32      Sir Fynwy - Monmouthshire                     23    Unitary Authority
33      Sir Ynys Mon - Isle of Anglesey               13    Unitary Authority
34      Somerset County                               8     County
35      South Gloucestershire                         3     Unitary Authority
36      Staffordshire County                          45    County
37      Stockton-on-Tees                              4     Unitary Authority
38      Telford and Wrekin                            1     Unitary Authority
39      Thurrock                                      1     Unitary Authority
40      Warrington                                    1     Unitary Authority
41      Warwickshire County                           2     County
42      Wigan District                                1     Metropolitan District
43      Wiltshire County                              9     County
44      Worcestershire County                         22    County
Total number of Infected Premises                     2026
Source: DEFRA website
8.2 (c) List of Counties on Slaughter Statistics - 59 counties (48 with non-zero data)

County number  Slaughter County (as in 1979)  IPs
1              Aberdeenshire                  0
2              Avon                           4
3              Ayrshire                       0
4              Bedfordshire                   0
5              Berkshire                      0
6              Berwickshire                   5
7              Buckinghamshire                0
8              Cambridgeshire                 0
9              Cheshire                       17
10             Cleveland                      2
11             Clwyd                          1
12             Cornwall                       4
13             Cumbria                        892
14             Derbyshire                     8
15             Devonshire                     172
16             Dumfriesshire                  140
17             Durham                         94
18             Dyfed                          0
19             East Lothian                   0
20             Essex                          9
21             Gloucestershire                72
22             Greater London - East          2
23             Greater Manchester             1
24             Gwent                          25
25             Gwynedd                        12
26             Hereford & Worcester           66
27             Inverness-shire                0
28             Kent                           5
29             Kircudbright                   21
30             Lancashire                     53
31             Leicestershire                 6
32             Lincolnshire                   0
33             Mid Glamorgan                  2
34             Moray                          0
35             Norfolk                        0
36             North Yorkshire                133
37             Northamptonshire               1
38             Northumberland                 88
39             Nottinghamshire                0
40             Oxfordshire                    2
41             Powys                          74
42             Ross & Cromarty                0
43             Roxburgh                       6
44             Selkirk                        0
45             Shropshire                     12
46             Somerset                       9
47             South Glamorgan                0
48             South Yorkshire                0
49             Staffordshire                  47
50             Suffolk                        0
51             Surrey                         0
52             Sutherland                     0
53             Tyne & Wear                    6
54             Warwickshire                   2
55             West Glamorgan                 1
56             West Midlands                  0
57             West Yorkshire                 8
58             Wigtown                        15
59             Wiltshire                      9
Source: DEFRA website