Water Data Boot Camp: Fall 2019
Our team hydrologist suggested that one method for evaluating the impacts of dam construction is to monitor changes in flood return intervals. Falls Lake is a flood control reservoir, so it should decrease the amount of downstream flooding.
*Figure: Reservoirs should moderate downstream flows. There is a flood control pool to hold flood waters that can be released slowly over time. There is also a conservation pool that holds water that can be released downstream during drier conditions to meet minimum streamflow requirements.*
Flood insurance policy is built around the concept of the 100-year flood event. The housing industry has been working to explain what that risk actually means to homeowners with 30 year mortgages. Understanding the flood risk relative to mortages is helpful for insurance companies to know. Has Falls Lake decreased the flood risk for downstream homes?
Reservoirs decrease the likelihood of downstream flooding, but that often means development occurs in areas that would have been frequently flooded prior to the reservoir. We’ve seen examples of this just quite recently with Hurricanes Harvey and Florence.
We will use Leopold’s (1994) flood frequency curve and the Weibull equation to calculate the recurrence interval. Here the return interval is computed as where is the number of years of data and is the rank of the year from largest to smallest (see this link for more info).
* NOTE: The accuracy of a return interval is highly affected by the length of the time series.
So, for us to do this analysis, we need to first compute maximum annual discharge, i.e. extract the largest discharge observed from each water year. Then we sort and rank our data on max annual discharge and then compute a regression line from which we can determine the discharge of a 100 and 500 year flood.
Excel's pivot tables are one of it's more powerful features, allowing you to easily extract and cross tabulate various summaries from your data. We'll use to sift through discharge records on a per-year basis and identify the largest one.
Highlight your EDA table - all the columns up to and including "Mean flow (cms)", headers and all.
From the Insert menu, select Pivot Table, choosing to place the report in a New Worksheet
.
Rename this worksheet "Flood".
In the new worksheet created, click on the Pivot Table if the Pivot Table Fields
dialog is not shown.
In the Pivot Table Fields dialog:
Water year
field into the Rows
sectionMean discharge (cms)
field into the Values
sectionthe Mean discharge (cms)
item now in the Values
section, change the Value Fields Settings
to summarize the field by the Max
value. We now have the data we want. Next we'll compute rankings and then sort the data. However, to do this we need to copy the data from the dynamic Pivot Table.
Copy contents of your pivot table
Paste the values of the contents right click
+s
+v
.
Delete the last row in the pasted values (the grand total)
Rename columns if needed (e.g. replace "Row Labels" with "Water Year")
Sort the data from largest to smallest:
Compute rankings in a new column named rank
Type in a few numbers, e.g., 1, 2, 3.
Select these numbers and double click the lower right hand corner of the selected range.
Alternatively, you can use the RANK.EQ
function
*Note: How do these methods differ for those years with the same maximum values (ties)?
Recall that the return interval is computed from the sorted and ranked time series data. Now that we have our data sorted and ranked, we compute from the number of years of data we have (plus 1), divided by the ranking of a given year's discharge.
Calculate Return Intervals in a new column named RI
Calculate the Annual Exceedance Probability in a new column named Pe
Create a scatterplot of max discharge (y-axis) vs. recurrence interval (x-axis)
Note: Excel will default to setting the left most column in your table to the X-axis and the other as the Y-axis, meaning your scatterplot will default to setting the X-axis to Max Discharge and the Y-axis to Recurrence Interval. You'll have to manually switch this using the
Select Data
tool.
In the Select Data Source dialog, Edit the Series Name
category so that it points to the cell that is the header of the Max Flow (cms) column, and swap the Series X values
and Series Y values
so that X is pointing to the values in the Pe
column and Y is pointing to the Max Flow (cms)
column. The result should appears similar to this:
Place the x-axis on a log scale and add minor tick marks.
Add a regression line to your plot
Set the aesthetics of your plot...
Using the regression equation you just calculated and added to your chart, estimate the discharge for the 100, 500, and 1000 year events, i.e., compute y
for x
= 100, 500, and 1000, respectively, somewhere in your Excel worksheet. These are the estimated discharge at 100, 500, and 1000 year flood events.
To do this, we'll create a table, listing the years for which we want to explore (100, 500, 1000):
Finally, we can add these points to our original plot
X Values
, Y Values
are the values computed from the regression.Repeat the above analysis only using data prior to 1980 to calculate the return interval.
Plot the new dataset on top of the original plot.
Design
> Select Data
> Add
How big is the difference between the 100 and 500 year estimates?
Calculate the discharge for different return periods and exceedance probabilities
Plot annual discharge and calculate the number of times the 100 year flood was surpassed for both the POR and prior to 1980
Look at the first plot you did of streamflow on the EDA spreadsheet to look at the distribution of peak events. These events are all hurricanes. How does it change your understanding of why Falls Lake doesn’t seem to impact flood frequency?
What happens to your answer if you remove those three points?
We've now discovered how data can be transformed, or "Pivoted" into various summary formats and how, via Excel's plotting functionality, we can compute quick regressions of our data.