Flood Damage Data



Flood Damage in the United States, 1926-2003
A Reanalysis of National Weather Service Estimates

5. Accuracy of Damage Estimates

In general, estimates of damage contain a high degree of uncertainty. Ideally, estimation errors would be measured by systematically comparing estimates with actual costs, which often are not known until long after a flood event. Unfortunately, actual cost data are seldom collected in a form that can be compared with estimates made at the time of the flood. This section examines the accuracy of flood damage estimates in two ways: (1) by comparing estimates with actual costs in one large flood disaster, and (2) by comparing pairs of estimates from different sources for many flood events.

A. Errors in Early Damage Estimates

NWS flood damage estimates are usually compiled within three months after a flood event, long before the actual costs can be known. Until recently, even in serious disasters, actual total damage costs were not systematically compiled by any agency. There was no way of checking the accuracy, or even the reasonableness, of most damage estimates.

In recent years, however, FEMA has systematically collected cost data for the programs it administers – admittedly only a fraction of total disaster costs. Beginning in 1992, FEMA instituted a computerized system for recording and tracking applications for federal assistance in presidentially declared disasters. State and county governments have gradually developed the capabilities to link to this system. The damage estimates submitted by local officials to FEMA probably represent the best available early estimates under disaster conditions. A team visits each damage site to view the extent of losses and make preliminary estimates. Thus, in some disasters and some jurisdictions, it is now possible to systematically compare early damage estimates with actual costs. Data from FEMA’s Public Assistance Program are particularly appropriate for our purposes because a large portion of the losses involve physical damage to property. Public assistance covers damage to public facilities such as roads and bridges, schools, government buildings, and nonprofit agencies.

In the aftermath of a natural disaster, damage information is assembled according to guidelines established by FEMA. The following stages are described by FEMA (1998) and Michael Sabbaghian [1] of the California Office of Emergency Services (OES) (personal communication 8/30/00).

  1. Initial Damage Estimate (IDE): Local officials provide estimates of physical damage based on early reports and descriptions, without necessarily visiting the damage sites.
  2. Preliminary Damage Assessment (PDA): A team including local, state, and FEMA officials visits the damage sites to do a “windshield estimate,” perhaps viewing the sites from a car window or walking around. The PDA estimates are used to decide whether federal assistance is needed. If so, they are submitted to FEMA as part of the governor’s request for a presidential disaster declaration.
  3. Damage Survey Report (DSR): Applicants submit requests for public assistance with detailed worksheets estimating the cost of repairs. FEMA or the state performs inspections (physical surveys) for each large project and “verify documentation on a portion of the small projects” (FEMA 1998). The DSR is used to obligate federal and state disaster assistance funds. The DSR obligations change as bids are received to accomplish the repair work, and computer records are updated accordingly.
  4. Actual Cost: Final total costs when all projects are completed and the DSR is closed. For large disasters, closure might not occur until 4 to 5 years after the disaster event.

Descriptions of the NWS procedures for obtaining flood damage estimates suggest that, in most cases, the estimates have been qualitatively similar to the IDE and certainly no better than the PDA. Indeed, NWS field offices obtain some of their estimates from FEMA’s survey teams (Section 2). Only in the largest floods (notably, the widespread flooding of the upper Mississippi basin in 1993) have extensive efforts been made to update the damage estimates over an extended period.

Therefore, to estimate the errors in early damage estimates that can be expected under good conditions (that is, from officials who have systematically viewed the damage), we use FEMA records from a recent flood disaster as a case study. In February 1998, winter storms with heavy rains led to widespread flooding in California. The president declared a major disaster in 41 counties, designated the “1998 California El Niño” disaster (FEMA-1203-DR). Table 5-1 shows the IDE and PDA estimates for each county under the public assistance program. It also shows the funds that had been obligated in the FEMA database as of June 1, 2001. Although the DSR has not been closed at the time of this writing, it is expected that nearly all costs have been obligated; therefore we will treat these figures as the “actual costs.”

The bottom line of Table 5-1 shows that total public assistance costs in the state were approximately $316 million. The PDA underestimated the total costs by only 6% ($19 million). Because no IDE was provided for several counties, the total IDE of $240 million should be compared with the total actual cost of $279 million from the matching 33 table entries. On that basis, the IDE underestimated total costs by about 14% ($39 million).

Estimates for smaller units (individual counties and the “state agencies” category) are much less accurate, however. Errors in the IDE are particularly large, ranging from underestimation by $26 million (82%) in Los Angeles County to overestimation by $20 million (316%) in San Benito County. In the PDA, errors range from underestimation by $16 million (52%) in the state agencies category to overestimation by $23 million (304%) in San Bernardino County.
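The proportions and percentage errors quoted above are simple ratios of estimate to actual cost. As a minimal sketch (the function name `proportion_and_error` is ours and is not part of any FEMA or NWS procedure), the arithmetic can be reproduced from the Table 5-1 figures, in thousands of dollars:

```python
def proportion_and_error(estimate, actual):
    """Return (estimate / actual, signed percent error relative to actual)."""
    proportion = estimate / actual
    percent_error = (estimate - actual) / actual * 100.0
    return proportion, percent_error


# Figures in thousands of current dollars, taken from Table 5-1.
cases = {
    "PDA, statewide total": (297204, 315929),
    # The IDE total is compared with the roughly $279 million actual cost
    # of the 33 table entries that have an IDE.
    "IDE, 33 matching entries": (239500, 279000),
    "IDE, San Benito County": (26870, 6455),
}

for label, (estimate, actual) in cases.items():
    proportion, percent_error = proportion_and_error(estimate, actual)
    print(f"{label}: proportion {proportion:.2f}, error {percent_error:+.0f}%")
```

A negative error denotes underestimation and a positive error denotes overestimation.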

Table 5-1. California 1998 El Niño Disaster: Estimated and actual public assistance costs, in thousands of current dollars.

County            Actual Cost   IDE        IDE Prop.   PDA        PDA Prop.
                  (by 6/1/01)   Estimate   of Actual   Estimate   of Actual
State Agencies    30091         7129       0.24        14497      0.48
Alameda           18471         12971      0.70        8176       0.44
Amador            258           235        0.91        176        0.68
Butte             1726          665        0.39        706        0.41
Calaveras         131           -          -           162        1.24
Colusa            4652          25000      5.37        1829       0.39
Contra Costa      5631          3885       0.69        4760       0.85
Del Norte         271           -          -           461        1.70
Fresno            1701          820        0.48        1052       0.62
Glenn             3802          21250      5.59        9884       2.60
Humboldt          7748          1049       0.14        1753       0.23
Kern              12312         -          -           10306      0.84
Lake              1889          1395       0.74        3044       1.61
Los Angeles       31229         5660       0.18        35516      1.14
Marin             6449          3319       0.51        5447       0.84
Mendocino         2836          4259       1.50        3846       1.36
Merced            2327          490        0.21        734        0.32
Monterey          26182         20181      0.77        11822      0.45
Napa              468           720        1.54        448        0.96
Orange            12617         3992       0.32        16720      1.33
Riverside         3130          -          -           5964       1.91
Sacramento        2366          -          -           3066       1.30
San Benito        6455          26870      4.16        10595      1.64
San Bernardino    7525          -          -           30429      4.04
San Diego         6977          -          -           9180       1.32
San Francisco     3859          12300      3.19        3703       0.96
San Joaquin       2657          655        0.25        3155       1.19
San Luis Obispo   4006          772        0.19        4915       1.23
San Mateo         21951         16110      0.73        26328      1.20
Santa Barbara     15816         75         0.00        12954      0.82
Santa Clara       13638         9846       0.72        13310      0.98
Santa Cruz        12459         13673      1.10        6320       0.51
Solano            3346          3628       1.08        8564       2.56
Sonoma            11779         11180      0.95        4127       0.35
Stanislaus        2122          -          -           909        0.43
Sutter            1039          1582       1.52        758        0.73
Tehama            881           20000      22.70       616        0.70
Trinity           1091          1970       1.81        975        0.89
Tulare            2149          -          -           919        0.43
Ventura           20391         3302       0.16        14350      0.70
Yolo              909           4321       4.75        4484       4.93
Yuba              592           196        0.33        249        0.42
Total             315929        239500     0.86*       297204     0.94

A dash (-) indicates that no IDE was provided.

* Proportion of actual cost ($279 million) of cases with an IDE.

Figures 5-1(a,b) show scatterplots of (a) the IDE vs. actual costs and (b) the PDA vs. actual costs. Logarithmic scales are used on the axes to highlight proportional differences between estimates and actual costs. The solid diagonal line represents perfect agreement. Data points outside of the two dashed lines are cases in which the estimate differs from the actual costs by more than a factor of two. Clearly the IDE is less accurate than the PDA: the points are much more scattered. (Correlations between the logs of estimates and actual costs are r = 0.46 for the IDE and 0.88 for the PDA.)
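The correlations quoted above are computed on the logarithms of the estimates and actual costs. The sketch below shows one way to compute such a correlation with the Python standard library. The helper name `log_correlation` is ours, and the six rows are only the first six Table 5-1 entries that have both an IDE and a PDA, so the printed values will not match the full-sample figures of 0.46 and 0.88.

```python
import math


def log_correlation(estimates, actuals):
    """Pearson correlation between log(estimate) and log(actual)."""
    x = [math.log(e) for e in estimates]
    y = [math.log(a) for a in actuals]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


# Subset of Table 5-1 (thousands of dollars): State Agencies, Alameda,
# Amador, Butte, Colusa, Contra Costa. Illustration only.
actual = [30091, 18471, 258, 1726, 4652, 5631]
ide = [7129, 12971, 235, 665, 25000, 3885]
pda = [14497, 8176, 176, 706, 1829, 4760]

print(f"IDE vs actual (subset): r = {log_correlation(ide, actual):.2f}")
print(f"PDA vs actual (subset): r = {log_correlation(pda, actual):.2f}")
```

Working in logs means that an estimate of twice the actual cost and an estimate of half the actual cost count as errors of equal size, which matches the logarithmic axes of the scatterplots.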

Since the Initial Damage Estimates are based on rather superficial damage descriptions, it is not surprising that large errors are the norm: Over half of the IDEs (18 out of 33) are off by at least a factor of two, and 13 of them are off by more than a factor of four. As a percentage of the actual costs, the IDE errors can be enormous, ranging from a 99.5% underestimate in Santa Barbara County to a 2170% overestimate in Tehama County. The Preliminary Damage Assessments are somewhat better, yet over one-third (15 out of 42) are off by at least a factor of two and 3 of them are off by more than a factor of four. The PDA errors range from a 77% underestimate in Humboldt County to a 393% overestimate in Yolo County.
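The counts in the preceding paragraph can be checked directly against Table 5-1. The following sketch transcribes the table (in thousands of dollars, with `None` where no IDE was provided) and counts the estimates whose ratio to actual cost lies beyond a given factor in either direction. The names `TABLE_5_1` and `count_off_by` are ours.

```python
# (county, actual cost, IDE, PDA) in thousands of dollars, from Table 5-1.
# None marks an entry with no Initial Damage Estimate.
TABLE_5_1 = [
    ("State Agencies", 30091, 7129, 14497), ("Alameda", 18471, 12971, 8176),
    ("Amador", 258, 235, 176), ("Butte", 1726, 665, 706),
    ("Calaveras", 131, None, 162), ("Colusa", 4652, 25000, 1829),
    ("Contra Costa", 5631, 3885, 4760), ("Del Norte", 271, None, 461),
    ("Fresno", 1701, 820, 1052), ("Glenn", 3802, 21250, 9884),
    ("Humboldt", 7748, 1049, 1753), ("Kern", 12312, None, 10306),
    ("Lake", 1889, 1395, 3044), ("Los Angeles", 31229, 5660, 35516),
    ("Marin", 6449, 3319, 5447), ("Mendocino", 2836, 4259, 3846),
    ("Merced", 2327, 490, 734), ("Monterey", 26182, 20181, 11822),
    ("Napa", 468, 720, 448), ("Orange", 12617, 3992, 16720),
    ("Riverside", 3130, None, 5964), ("Sacramento", 2366, None, 3066),
    ("San Benito", 6455, 26870, 10595), ("San Bernardino", 7525, None, 30429),
    ("San Diego", 6977, None, 9180), ("San Francisco", 3859, 12300, 3703),
    ("San Joaquin", 2657, 655, 3155), ("San Luis Obispo", 4006, 772, 4915),
    ("San Mateo", 21951, 16110, 26328), ("Santa Barbara", 15816, 75, 12954),
    ("Santa Clara", 13638, 9846, 13310), ("Santa Cruz", 12459, 13673, 6320),
    ("Solano", 3346, 3628, 8564), ("Sonoma", 11779, 11180, 4127),
    ("Stanislaus", 2122, None, 909), ("Sutter", 1039, 1582, 758),
    ("Tehama", 881, 20000, 616), ("Trinity", 1091, 1970, 975),
    ("Tulare", 2149, None, 919), ("Ventura", 20391, 3302, 14350),
    ("Yolo", 909, 4321, 4484), ("Yuba", 592, 196, 249),
]


def count_off_by(pairs, factor):
    """Count (estimate, actual) pairs off by at least `factor` in either direction."""
    return sum(
        1
        for estimate, actual in pairs
        if estimate / actual >= factor or estimate / actual <= 1.0 / factor
    )


ide_pairs = [(ide, actual) for _, actual, ide, _ in TABLE_5_1 if ide is not None]
pda_pairs = [(pda, actual) for _, actual, _, pda in TABLE_5_1]

for label, pairs in (("IDE", ide_pairs), ("PDA", pda_pairs)):
    print(
        f"{label}: N = {len(pairs)}, "
        f"off by a factor of 2 or more: {count_off_by(pairs, 2)}, "
        f"off by a factor of 4 or more: {count_off_by(pairs, 4)}"
    )
```

No ratio in the table falls exactly on a threshold, so "at least" and "more than" give the same counts here.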

The population of some California counties exceeds that of many small states. So estimation errors in the larger counties are indicative of the error levels to be expected in many states. For example, Los Angeles County, with a 1990 population of 8.9 million, is larger than 42 of the states. Table 5-1 shows that, in this disaster, the IDE underestimated actual costs by 82%.

To check for systematic bias in these early damage estimates, we used a statistical paired-comparison test. A systematic tendency to underestimate might be expected if some types of damage cannot be observed without careful inspection. On the other hand, we wondered if there might be a tendency for local officials to overestimate damage in order to increase the chance of being considered for federal aid. The IDE and PDA estimates were compared with actual costs, as follows:

Let e_i = estimated damage and a_i = actual cost for case i. We wish to test the null hypothesis that the geometric mean of e_i/a_i = 1. This is equivalent to the hypothesis that mean[log(e_i) - log(a_i)] = 0. We tested the hypothesis twice, first letting e_i represent the IDE values in Table 5-1 (N = 33), then letting e_i represent the PDA values (N = 42). A t-test is appropriate, even in these small samples, because the sample values log(e_i) - log(a_i) are approximately normally distributed. For the IDE, t = -1.27, and for the PDA, t = -1.10, neither of which is statistically significant at a 95% confidence level. Though there may be a tendency to underestimate the amount of damage, the bias is not statistically significant.
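The paired test described above reduces to a one-sample t statistic on the log differences. The sketch below is one way to compute it with the Python standard library. The function name `paired_log_t` is ours, and the six rows are an illustrative subset of Table 5-1, so the result is not expected to reproduce the reported values of t.

```python
import math
import statistics


def paired_log_t(estimates, actuals):
    """Return (t, degrees of freedom) for H0: mean[log(e_i) - log(a_i)] = 0."""
    diffs = [math.log(e) - math.log(a) for e, a in zip(estimates, actuals)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1


# Subset of Table 5-1 (thousands of dollars): State Agencies, Alameda,
# Amador, Butte, Colusa, Contra Costa. Illustration only.
actual = [30091, 18471, 258, 1726, 4652, 5631]
ide = [7129, 12971, 235, 665, 25000, 3885]

t_stat, dof = paired_log_t(ide, actual)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

A negative t indicates a tendency to underestimate. With N = 33 (32 degrees of freedom) the two-sided 5% critical value is roughly 2.04, so a statistic of -1.27 falls well inside the acceptance region.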

In summary, this example indicates that positive and negative estimation errors tend to average out when estimates are highly aggregated in a large flood event (over $300 million damage in 1998 dollars, in this case). The initial rough estimates (IDE) tended to underestimate actual damage and the more careful PDA estimates were reasonably accurate. It shows, however, that in smaller flood events ($30 million damage or less in 1998 dollars), which involve substantially less aggregation, the errors can be extremely large. Half of the PDA estimates were in error by more than a factor of 1.5; and half of the IDEs were in error by more than a factor of 2 (with many off by more than a factor of 4).

Figure 5-1. Estimated flood damage in California counties in the 1998 El Niño disaster, compared with actual costs as of June 1, 2001: (a) Initial Damage Estimate; (b) Preliminary Damage Assessment.

Given the methods used by NWS field offices to obtain flood damage estimates (described in Section 2), it is unlikely that the NWS estimates are much better than the IDEs examined here. Thus, when an annual flood damage estimate for a state is less than about $30 million, one should not expect the NWS estimate to depict actual losses accurately. However, the above analysis does not indicate systematic bias in the individual estimates, and errors tend to average out when the estimates are summed.

From the above results, we conclude that aggregation of many damage estimates in floods that have caused high levels of damage ($300 million or more in 1998 dollars) provides reasonably good estimates of total damage. However, estimates at a low level of aggregation ($30 million or less) often are in error by factors of 2 or more. Such small estimates should be used with great caution: Direct comparisons of individual estimates are likely to be misleading.

B. Comparison of Damage Estimates from NWS and States

Appropriate data are not available for comparing NWS estimates with actual flood damage costs. However, comparable estimates are available from independent state sources, which allows an assessment of typical estimation variability.

Every state in the U.S. has an emergency management agency. In July 2000, we wrote to the head of the emergency management agency in each state asking for historical data on flood damage in their state. The letter was followed by a phone call to the appropriate administrator if a response was not received within three weeks. Twenty-one states responded [2], but many of them could provide damage information only after 1990 and only related to losses covered by FEMA. Five states either had published historical summaries of flood damage or were able to compile, from their files, flood damage estimates covering at least 20 years that were based on criteria similar to those used by the NWS.

  1. California: A report (Montane 1999) describes disasters from 1950 through 1998 including for each disaster a brief description, general location, estimated damage, number of deaths, and whether a presidential disaster declaration was issued. We selected the disasters that involved flood, heavy rainfall, or severe storms for this comparison.
  2. Colorado: The state has formally collected flood data since 1937. A report (McLaughlin Water Engineers, Ltd. 1998) summarizes flood history and provides damage estimates for major floods since 1864.
  3. Michigan: A report (Michigan Dept. of State Police 1999) summarizes the 14 floods during 1975-1998 that resulted in a disaster declaration by either the governor or the president. Damage estimates are given for all of the floods that received a presidential declaration and four that received only a gubernatorial declaration.
  4. Virginia: Damage estimates in presidentially-declared flood disasters during 1977-1999 were provided by Michael Cline, State Coordinator of the Virginia Dept. of Emergency Services (personal communication 2000).
  5. Wisconsin: One report on the 1993 Midwest flood summarizes flood losses in Wisconsin from 1973 through 1992 (FEMA 1993), and another report provides loss estimates for the 1993 flood (Wisconsin Dept. of Natural Resources 1993).

In the state reports, the loss estimates are provided for each major flood event, sometimes with two or more events occurring in a given year. To match the annual loss estimates provided by NWS-HIC, we added up the flood losses in each state for each year, using calendar years during 1955-1982 and fiscal years (Oct-Sep) during 1983-1998 to match the time periods used in the NWS estimates [3]. Our comparison covers a total of 155 years in the 5 states: 44 years each in California and Colorado (1955-1998), 24 years in Michigan (1975-1998), 22 years in Virginia (1977-1998), and 21 years in Wisconsin (1973-1993).
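The annual aggregation described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function names are ours, the event list is hypothetical, and events in Oct-Dec 1982 are assigned to fiscal year 1983, which is only one possible way of resolving the overlap between calendar year 1982 and fiscal year 1983.

```python
from collections import defaultdict
from datetime import date

# Assumed switch-over point: events on or after this date are assigned to
# Oct-Sep fiscal years; earlier events are assigned to calendar years.
FISCAL_START = date(1982, 10, 1)


def reporting_year(event_date):
    """Map an event date to the year used for annual totals."""
    if event_date < FISCAL_START:
        return event_date.year
    # Fiscal year N runs from October of year N - 1 through September of year N.
    return event_date.year + 1 if event_date.month >= 10 else event_date.year


def annual_totals(events):
    """Sum event losses by reporting year; events are (date, loss) pairs."""
    totals = defaultdict(float)
    for event_date, loss in events:
        totals[reporting_year(event_date)] += loss
    return dict(totals)


# Hypothetical flood events for one state: (date, loss in millions of dollars).
events = [
    (date(1975, 11, 3), 12.0),
    (date(1975, 4, 20), 8.0),
    (date(1990, 10, 15), 30.0),
    (date(1991, 6, 2), 5.0),
]
print(annual_totals(events))
```

In this example the two 1975 events fall in calendar year 1975, while the October 1990 and June 1991 events both fall in fiscal year 1991.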

Of course, the state estimates are subject to the same types of error as the NWS estimates – neither is assumed a priori to be more accurate. The intent of this section is to investigate large discrepancies between estimates from different sources in order to understand how estimates of the same event vary and to determine whether some floods are overlooked. In the following analysis, all loss estimates are reported in inflation-adjusted 1995 dollars.

When estimates are very low or missing:

Table 5-2 provides a comparison of the estimates in all 155 years, with cases along the diagonal (from upper-left to lower-right) showing the closest agreement. An obvious difference between the NWS and state estimates is in the amount of missing data – a result of different purposes of the data. NWS flood loss estimates are collected every year, with relatively small losses included; hence, estimates are missing or zero in only 28 years and are below $5 million in 56 years. In contrast, the state reports focus on more serious floods, so years of relatively low flood loss are not included. The states did not report losses in 91 cases, and included losses below $5 million in only 6 cases [4]. The threshold for reporting appears to be somewhat higher in California, where the lowest reported loss was $15 million.

We conclude that these five states do not attach great importance to floods that cause less than $5 million in damage; therefore, annual losses below that threshold will be described as “low” flood losses. Lumping the low and missing categories together, the NWS and states agree that 78 (50%) of the 155 cases involved little or no flood damage. Disagreements arise, however, when at least one estimate is above $5 million.

Table 5-2. Crosstabulation of flood damage estimates from the NWS and five states. Estimates are in millions of 1995 dollars.

                          NWS Estimate
State Estimate            Missing    Est < 5    5 < Est < 50   50 < Est < 500   500 < Est   Total
Missing                   26         48         14             2                1           91 (59%)
Est < 5 (Low)             0          4          2              0                0           6 (4%)
5 < Est < 50 (Moderate)   2          4          13             3                0           22 (14%)
50 < Est < 500 (High)     0          0          5              16               1           22 (14%)
500 < Est (Major)         0          0          0              1                13          14 (9%)
Total                     28 (18%)   56 (36%)   34 (22%)       22 (14%)         15 (10%)    155
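The marginal totals and agreement counts cited in this section follow directly from the cells of Table 5-2. The sketch below transcribes the table and recomputes them. The names `COUNTS` and `categorize` are ours, and the handling of values that fall exactly on a category boundary (assigned to the higher category) and of zero values (treated as missing) is an assumption, since the table does not specify it.

```python
CATEGORIES = ["Missing", "Low", "Moderate", "High", "Major"]

# Rows: state estimate category; columns: NWS estimate category (Table 5-2).
COUNTS = [
    [26, 48, 14, 2, 1],
    [0, 4, 2, 0, 0],
    [2, 4, 13, 3, 0],
    [0, 0, 5, 16, 1],
    [0, 0, 0, 1, 13],
]


def categorize(estimate):
    """Map an annual estimate (millions of 1995 dollars, or None) to a category."""
    if estimate is None or estimate == 0:
        return "Missing"
    if estimate < 5:
        return "Low"
    if estimate < 50:
        return "Moderate"  # assumption: exactly 5 is placed here
    if estimate < 500:
        return "High"
    return "Major"


row_totals = [sum(row) for row in COUNTS]         # by state category
col_totals = [sum(col) for col in zip(*COUNTS)]   # by NWS category
grand_total = sum(row_totals)

# Cases where both sources report missing or low damage.
both_low_or_missing = sum(COUNTS[i][j] for i in (0, 1) for j in (0, 1))
# Cases where the NWS estimate is missing or low.
nws_low_or_missing = col_totals[0] + col_totals[1]

print("State totals:", dict(zip(CATEGORIES, row_totals)))
print("NWS totals:  ", dict(zip(CATEGORIES, col_totals)))
print(f"Both low or missing: {both_low_or_missing} of {grand_total}")
print(f"NWS low or missing:  {nws_low_or_missing}")
```

The difference between the last two figures gives the number of cases in which a state reported more than $5 million in damage while the NWS estimate was missing or low.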

Disagreement #1: State estimate above $5 million, NWS estimate missing or low.

California describes flood losses of $50 million in 1979 and $15 million in 1984, both years in which the NWS provides no loss estimate. In addition, states claim moderate losses in four years when the NWS estimate is low (< $5 million): Colorado 1969 and 1983 ($20 and $24 million, respectively), California 1972 ($29 million), and Virginia 1998 ($13 million). Because these floods were cited as significant in the state reports, it seems likely that the damage was considerably greater than the NWS estimates would indicate. The differences between estimates range from a factor of 6 in the 1998 Virginia case to a factor of 169 in the 1983 Colorado case.

Out of 84 cases in which the NWS indicated flood losses were low or missing, 78 (93%) were in reasonable agreement with the state reports; but 6 cases in which over $5 million damage was claimed by a state were either overlooked entirely by the NWS or underestimated by a large factor.

Disagreement #2: NWS estimate above $5 million, state estimate missing or low.

The top row of Table 5-2 shows 17 cases, not mentioned in the state reports, in which the NWS indicates flood losses over $5 million. In all but one case, the NWS estimate is below $51 million. We assume that some flood damage probably occurred, but the state did not include it in their report. Four of these cases are in Virginia and would have been omitted because they did not receive a Presidential disaster declaration. Excluding Virginia, the three largest NWS estimates are for California, where flood losses are generally high and a $50 million loss might be considered relatively unremarkable.

In one case, however, the NWS estimate is very high: $806 million in Michigan in 1981. This is contradicted by Michigan’s report (Michigan Dept. of State Police 1999), which lists eight floods since 1975 and describes the 1986 flood (with losses of about $400 million) as the most damaging, but makes no mention of a flood in 1981. This blatant error casts doubt on the NWS estimates for 1980-1982, which were derived from broad damage categories in Storm Data, apparently with little or no verification. (See also Section 3 on 1980-1982 damage estimates.)

Comparisons of estimates:

For California, Figures 5-2(a,b) show cases in which at least one estimate is greater than $50 million. For the other states, Figures 5-2(c-f) show cases in which at least one estimate is greater than $5 million. Visually, the graphs are dominated by the major floods (over $500 million), where most of the disagreements appear to be relatively small (except for the erroneous estimate we have already noted for Michigan in 1981). At the moderate-to-high damage levels ($5-500 million), however, some differences are proportionately large. For example, estimates differ by more than a factor of two in California in 1965, 1973, 1976 and 1993, Colorado in 1984 and 1995, Michigan in 1982 and 1998, Virginia in 1979, 1984, 1992 and 1996, and Wisconsin in 1973, 1978, 1980 and 1986.

Figure 5-2. Comparison of National Weather Service flood damage estimates with estimates obtained from five states: (a) California, 1955-1977; (b) California, 1978-1998; (c) Colorado, 1955-1998; (d) Michigan, 1975-1998; (e) Virginia, 1977-1998; (f) Wisconsin, 1973-1993.

Figure 5-3 is a scatterplot of all cases that have estimates from both NWS and the state. Logarithmic scales are used on the axes to highlight proportional differences in the estimates. The solid diagonal line represents perfect agreement between the estimates. Data points outside of the two dashed lines are cases in which the estimates differ by more than a factor of two. Seventeen cases are above the upper dashed line, representing state estimates more than twice as large as the NWS estimate. Six cases are below the lower dashed line, with NWS estimates more than twice as large as the state estimate.

The closest agreement between state and NWS estimates occurred in floods involving major damage (over $500 million). At the other extreme, the largest proportional disagreements (cases farthest outside the dashed lines) occurred when both sources indicated that flood damage was low or moderate (under $50 million).

From the standpoint of the NWS estimates, when the NWS damage estimate was:

  1. moderate ($5-50 million), then 55% of state estimates differed by a factor of 2 or more;
  2. high ($50-500 million), then 30% of state estimates differed by a factor of 2 or more;
  3. major (over $500 million), then none of the differences exceeded a factor of 1.4.

There are many plausible explanations why agreement might improve as total damage increases. First, the crisis of a major flood spurs studies by numerous agencies. Collection of damage information is more likely to be systematic and complete in a major flood than in a smaller one. Second, agencies are more likely to share information about major floods (which would lead to increased agreement, but does not guarantee greater accuracy). In smaller floods, on the other hand, collection of damage information is likely to be haphazard and there is less interest in checking and correcting early damage estimates. Third, the damage in large floods is aggregated from many individual damage estimates so that random errors tend to cancel out. Small floods involve less aggregation and, hence, relatively larger errors.
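The third explanation, that random errors tend to cancel under aggregation, can be illustrated with a small simulation. This is a sketch under assumptions of our own and is not drawn from the data in this report: each unit has a true cost of 1, each estimate carries an independent multiplicative lognormal error with arithmetic mean 1 (no systematic bias), and the spread `sigma` and the number of units are arbitrary choices.

```python
import math
import random
import statistics


def simulate(n_units, sigma, n_trials=2000, seed=0):
    """Return median absolute log error for single units and for their total."""
    rng = random.Random(seed)
    unit_errors = []
    total_errors = []
    for _ in range(n_trials):
        # True cost of every unit is 1.0. The mean of the underlying normal
        # is shifted by -sigma^2 / 2 so each estimate has arithmetic mean 1.
        estimates = [
            math.exp(rng.gauss(-sigma ** 2 / 2, sigma)) for _ in range(n_units)
        ]
        unit_errors.extend(abs(math.log(e)) for e in estimates)
        # Aggregate estimate compared with the aggregate true cost (n_units).
        total_errors.append(abs(math.log(sum(estimates) / n_units)))
    return statistics.median(unit_errors), statistics.median(total_errors)


unit_err, total_err = simulate(n_units=40, sigma=0.8)
print(f"Typical error factor, single unit:    {math.exp(unit_err):.2f}")
print(f"Typical error factor, 40-unit total:  {math.exp(total_err):.2f}")
```

Under these assumptions the typical error factor for the aggregated total is much closer to 1 than that for a single unit. The cancellation depends on the errors being unbiased and independent; a shared systematic bias would carry through to the total.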

C. Accuracy: Summary and Conclusions

The following conclusions are drawn from the analysis of accuracy and consistency presented in Sections 4 and 5.

  1. The collection and processing of flood damage data by the NWS has been reasonably consistent from 1934 to the present, except during the period 1976-1982. Errors are probably somewhat larger in the first few years after data collection resumed in 1983.

    Data from NWS files and other sources made it possible to reconstruct state and national flood damage estimates for 1976-1979. However, few data were collected during 1980-1982 and large errors were discovered in estimates developed later for that period. As a result, the years 1980-1982 have been excluded from the reanalyzed data sets. Annual compilation of damage estimates resumed in 1983, but depended mainly on information from Storm Data in the first few years. Particularly in 1983-1984, omissions are more likely and estimates probably contain somewhat larger errors because of the use of damage categories.

    Figure 5-3. Scatterplot of National Weather Service flood damage estimates versus estimates obtained from five states, in millions of 1995 dollars.

  2. Individual damage estimates for small floods or for local jurisdictions within a larger flood area tend to be extremely inaccurate.

    It is rare to have actual cost data to compare with damage estimates. The above analysis of one large flood disaster indicates that, in cases where actual costs are less than $30 million, a large proportion of estimates are off by at least a factor of two and sometimes much more. When damage in a state is estimated to be less than $50 million, estimates from NWS and other sources frequently disagree by more than a factor of two.

  3. Damage estimates become more accurate at higher levels of aggregation. Thus NWS estimates totaled over large geographic areas or many years are likely to be fairly reliable (within about a 50% margin of error).

    Errors tend to average out, as long as the local estimates are not systematically biased. For example, the sum of estimates from many counties in a large flood area is found to be quite close to the actual total costs for the area as a whole. When damage in a state is estimated to be greater than $500 million, disagreements between estimates from NWS and other sources are relatively small (40% or less). The relatively close agreement between NWS and state estimates in years with major damage is reassuring, since the most costly floods are of greatest concern and make up a large proportion of total flood damage.

  4. Floods causing moderate damage are occasionally omitted, or their damage greatly underestimated, in the NWS data sets.

    When discrepancies between NWS and state estimates are large, most often the state estimate is the higher one. Occasionally, NWS estimates are missing for floods in which the state claims as much as $50 million damage. Such omissions would have little effect on national total damage estimates. However, they might be important in analyses of damaging floods at the state or river basin level. Researchers studying flood damage in states or river basins should be aware that the NWS estimates occasionally overlook some locally significant damage.

  1. Michael Sabbaghian, Deputy Public Assistance Officer for the California OES, manages disaster recovery activities for infrastructure and is responsible for grant management. He explained the process for estimating and recording losses in presidentially declared disasters. He also provided the damage estimates and cost data for the 1998 California El Niño disaster, which are used in this section.
  2. States that responded were AL, CA, CO, FL, GA, HI, IL, IN, LA, MA, MI, MN, MO, OH, OR, SC, TX, VA, WA, WV, WY.
  3. Estimates for 1980-82 were included at this stage of the analysis. California flood damage in Dec 1982 could be attributed differently by the two sources because of the overlap in definition of calendar year 1982 and fiscal year 1983. The other four states did not report losses in Oct-Dec 1982.
  4. During 1955-98, California reported losses in 26 years (59%), while Colorado reported losses in only 13 years (30%). The other three states reported losses in 33-41% of the years covered by their reports (8 years in Michigan, 9 years in Virginia, and 8 years in Wisconsin).
