The method for installing and loading libraries, as well as downloading data from the USGS, are explained in the LoadStreamflowDescription.Rmd
file. LoadStreamflowDescriptions.Rmd
file.
library(dataRetrieval); library(EGRET)
library (ggplot2); library(trend); library(lubridate)
library(dplyr); library(magrittr); library(leaflet)
(The data should be saved in the data folder one up from the current R script…)
#set working directory
swd <- "../data/"
#read in the csv files
sites <- read.csv(paste0(swd,"station.csv"), sep=',',header=TRUE); dim(sites)
## [1] 91 36
sites[1:3,]
## OrganizationIdentifier OrganizationFormalName
## 1 USGS-NC USGS North Carolina Water Science Center
## 2 USGS-NC USGS North Carolina Water Science Center
## 3 USGS-NC USGS North Carolina Water Science Center
## MonitoringLocationIdentifier MonitoringLocationName
## 1 USGS-0209330990 BROOKS LAKE TRIB NR BROWNS SUMMIT, NC
## 2 USGS-0209331325 CANDY CR AT SR2700 NR MONTICELLO, NC
## 3 USGS-0209437825 SMITH BRANCH UPPER TRIB NR MONTICELLO, NC
## MonitoringLocationTypeName MonitoringLocationDescriptionText
## 1 Stream
## 2 Stream
## 3 Stream
## HUCEightDigitCode DrainageAreaMeasure.MeasureValue
## 1 3030002 0.0600
## 2 3030002 1.1000
## 3 3030002 0.0075
## DrainageAreaMeasure.MeasureUnitCode
## 1 sq mi
## 2 sq mi
## 3 sq mi
## ContributingDrainageAreaMeasure.MeasureValue
## 1 NA
## 2 NA
## 3 NA
## ContributingDrainageAreaMeasure.MeasureUnitCode LatitudeMeasure
## 1 NA 36.22791
## 2 NA 36.23402
## 3 NA 36.21597
## LongitudeMeasure SourceMapScaleNumeric
## 1 -79.72198 NA
## 2 -79.66170 NA
## 3 -79.66031 NA
## HorizontalAccuracyMeasure.MeasureValue
## 1 1
## 2 1
## 3 1
## HorizontalAccuracyMeasure.MeasureUnitCode HorizontalCollectionMethodName
## 1 seconds Interpolated from MAP.
## 2 seconds Interpolated from MAP.
## 3 seconds Interpolated from MAP.
## HorizontalCoordinateReferenceSystemDatumName
## 1 NAD83
## 2 NAD83
## 3 NAD83
## VerticalMeasure.MeasureValue VerticalMeasure.MeasureUnitCode
## 1 NA
## 2 NA
## 3 NA
## VerticalAccuracyMeasure.MeasureValue
## 1 NA
## 2 NA
## 3 NA
## VerticalAccuracyMeasure.MeasureUnitCode VerticalCollectionMethodName
## 1
## 2
## 3
## VerticalCoordinateReferenceSystemDatumName CountryCode StateCode
## 1 US 37
## 2 US 37
## 3 US 37
## CountyCode AquiferName FormationTypeText AquiferTypeName
## 1 81 NA NA NA
## 2 81 NA NA NA
## 3 81 NA NA NA
## ConstructionDateText WellDepthMeasure.MeasureValue
## 1 NA NA
## 2 NA NA
## 3 NA NA
## WellDepthMeasure.MeasureUnitCode WellHoleDepthMeasure.MeasureValue
## 1 NA NA
## 2 NA NA
## 3 NA NA
## WellHoleDepthMeasure.MeasureUnitCode ProviderName
## 1 NA NWIS
## 2 NA NWIS
## 3 NA NWIS
dim(sites)
## [1] 91 36
Load in the site data and plot a map to see where water quality data were present from this portal.
Let’s play with the data and see what it looks like to subset the data by removing small streams with a drainage area of less than 25 mi2.
#plot sites
map <- leaflet(data=sites) %>%
addProviderTiles("CartoDB.Positron") %>%
addCircleMarkers(~LongitudeMeasure,~LatitudeMeasure,
color = "red", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~MonitoringLocationName)
map
When we do this, we find there are only 10 sites remaining. This seems really small, so we plot these new sites on the map (blue), overlaying the orginal sites in red.
# subset to only include those with a drainage area exceeding 25 square miles
sites2 <- subset(sites, DrainageAreaMeasure.MeasureValue >= 25); dim(sites2)
## [1] 10 36
#where are sites located?
leaflet(data=sites) %>%
addProviderTiles("CartoDB.Positron") %>%
addCircleMarkers(~LongitudeMeasure,~LatitudeMeasure,
color = "red", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~MonitoringLocationName) %>%
addCircleMarkers(~sites2$LongitudeMeasure,~sites2$LatitudeMeasure,
color = "blue", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~MonitoringLocationName)
#Notice for Jordan Lake... the two values we will be most interested in are: Haw River at SR 1943 and
#All the names on Jordan Lake are "Jordan Lake"... change popup name to see options
We take a closer look at the drainage measurement values and see there are numerous NA values. Clearly, filtering by drainage area will not look. The main point I want to make here is don’t be afraid to play with the data.
Let’s replot the site map but now change the popup name to see which site names are located within each branch of Jordan Lake.
leaflet(data=sites) %>%
addProviderTiles("CartoDB.Positron") %>%
addCircleMarkers(~LongitudeMeasure,~LatitudeMeasure,
color = "red", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~MonitoringLocationIdentifier)
#reduce columns to those needed
sites <- sites[,c(1,3:4,8:9,12:13)]
Once we load in the measurements data, we need to do some cleaning. The first step is to filter the data to what we want to use. For example,
We want to make sure we are using routine samples and not extreme event sampling (biased for specific occasions and not for estimating annual average load).
We specify what type of nitrogen we want to use. From the literature we found that regulations for nitrogen include: nitrate, nitrite, ammonia and organic forms. Doing some reading about the WQX standards, you learn that nitrogen, mixed froms incorporates all of the above forms of nitrogen.
We also want to make sure we are looking at total Nitrogen, so make sure the Results Sample Fraction Text only includes those with Total.
A filtered dataset is provided for you as the results.csv
in the data folder. We’ll load this in and resume analysis…
results <- read.csv(paste0(swd,"result.csv"), sep=',',header=TRUE); dim(results)
## [1] 98590 63
names(results)
## [1] "OrganizationIdentifier"
## [2] "OrganizationFormalName"
## [3] "ActivityIdentifier"
## [4] "ActivityTypeCode"
## [5] "ActivityMediaName"
## [6] "ActivityMediaSubdivisionName"
## [7] "ActivityStartDate"
## [8] "ActivityStartTime.Time"
## [9] "ActivityStartTime.TimeZoneCode"
## [10] "ActivityEndDate"
## [11] "ActivityEndTime.Time"
## [12] "ActivityEndTime.TimeZoneCode"
## [13] "ActivityDepthHeightMeasure.MeasureValue"
## [14] "ActivityDepthHeightMeasure.MeasureUnitCode"
## [15] "ActivityDepthAltitudeReferencePointText"
## [16] "ActivityTopDepthHeightMeasure.MeasureValue"
## [17] "ActivityTopDepthHeightMeasure.MeasureUnitCode"
## [18] "ActivityBottomDepthHeightMeasure.MeasureValue"
## [19] "ActivityBottomDepthHeightMeasure.MeasureUnitCode"
## [20] "ProjectIdentifier"
## [21] "ActivityConductingOrganizationText"
## [22] "MonitoringLocationIdentifier"
## [23] "ActivityCommentText"
## [24] "SampleAquifer"
## [25] "HydrologicCondition"
## [26] "HydrologicEvent"
## [27] "SampleCollectionMethod.MethodIdentifier"
## [28] "SampleCollectionMethod.MethodIdentifierContext"
## [29] "SampleCollectionMethod.MethodName"
## [30] "SampleCollectionEquipmentName"
## [31] "ResultDetectionConditionText"
## [32] "CharacteristicName"
## [33] "ResultSampleFractionText"
## [34] "ResultMeasureValue"
## [35] "ResultMeasure.MeasureUnitCode"
## [36] "MeasureQualifierCode"
## [37] "ResultStatusIdentifier"
## [38] "StatisticalBaseCode"
## [39] "ResultValueTypeName"
## [40] "ResultWeightBasisText"
## [41] "ResultTimeBasisText"
## [42] "ResultTemperatureBasisText"
## [43] "ResultParticleSizeBasisText"
## [44] "PrecisionValue"
## [45] "ResultCommentText"
## [46] "USGSPCode"
## [47] "ResultDepthHeightMeasure.MeasureValue"
## [48] "ResultDepthHeightMeasure.MeasureUnitCode"
## [49] "ResultDepthAltitudeReferencePointText"
## [50] "SubjectTaxonomicName"
## [51] "SampleTissueAnatomyName"
## [52] "ResultAnalyticalMethod.MethodIdentifier"
## [53] "ResultAnalyticalMethod.MethodIdentifierContext"
## [54] "ResultAnalyticalMethod.MethodName"
## [55] "MethodDescriptionText"
## [56] "LaboratoryName"
## [57] "AnalysisStartDate"
## [58] "ResultLaboratoryCommentText"
## [59] "DetectionQuantitationLimitTypeName"
## [60] "DetectionQuantitationLimitMeasure.MeasureValue"
## [61] "DetectionQuantitationLimitMeasure.MeasureUnitCode"
## [62] "PreparationStartDate"
## [63] "ProviderName"
#find out what options are present for the Hydrologic Events and Characteristic Name
unique(results$HydrologicEvent);
## [1] Routine sample Not Determined (historical)
## [3] Flood Storm
## [5] Drought Spring breakup
## [7]
## 7 Levels: Drought Flood Not Determined (historical) ... Storm
unique(results$CharacteristicName);
## [1] Kjeldahl nitrogen
## [2] Ammonia and ammonium
## [3] Inorganic nitrogen (nitrate and nitrite)
## [4] Organic Nitrogen
## [5] Orthophosphate
## [6] Phosphate-phosphorus
## [7] Phosphorus
## [8] Nitrogen, mixed forms (NH3), (NH4), organic, (NO2) and (NO3)
## [9] Nitrate
## [10] Nitrite
## [11] Nitrogen
## [12] Ammonia-nitrogen
## [13] Orthophosphate as P
## [14] Ammonia-nitrogen as N
## [15] Phosphate-phosphorus as P
## [16] Ammonia
## [17] Inorganic nitrogen (nitrate and nitrite) as N
## [18] Orthophosphate as PO4
## 18 Levels: Ammonia Ammonia-nitrogen ... Phosphorus
#filter data
nitrogen <-results %>%
filter(ActivityMediaSubdivisionName=="Surface Water") %>%
filter(HydrologicEvent != "Storm" & HydrologicEvent != "Flood" & HydrologicEvent != "Spring breakup" & HydrologicEvent != "Drought") %>%
filter(CharacteristicName=="Nitrogen, mixed forms (NH3), (NH4), organic, (NO2) and (NO3)") %>%
filter(ResultSampleFractionText=="Total")
dim(nitrogen)
## [1] 4040 63
#check that filters worked
unique(nitrogen$HydrologicEvent); unique(nitrogen$ActivityMediaSubdivisionName)
## [1] Routine sample Not Determined (historical)
## 7 Levels: Drought Flood Not Determined (historical) ... Storm
## [1] Surface Water
## Levels: Bulk deposition Surface Water Wet Fall Material
unique(nitrogen$CharacteristicName); unique(nitrogen$ResultSampleFractionText)
## [1] Nitrogen, mixed forms (NH3), (NH4), organic, (NO2) and (NO3)
## 18 Levels: Ammonia Ammonia-nitrogen ... Phosphorus
## [1] Total
## Levels: Bed Sediment Dissolved Suspended Total
#reduce to needed columns
nitrogen <- nitrogen[,c(1,7,22,31:35,59:61)]
You may have noticed that many sample sites state “not detected”. This is important data that are not currently being represented. Create a new column and set the value equal to the results, unless it is below the detection limit - in which case set it equal to ½ of the detection limit.
You may also have noted that the total nitrogen was sometimes reported as mg/l or mg/l NO3. We want mg/l. To convert to mg/l, we know the atomic weight of nitrogen is 14.0067 and the molar mass of nitrate anion (NO3) is 62.0049 g/mole. Therefore, to convert between units:
#set data below detection limits equal to 1/2 the detection limit
nitrogen$TotalN <- ifelse(nitrogen$ResultDetectionConditionText=="Not Detected", nitrogen$DetectionQuantitationLimitMeasure.MeasureValue/2, nitrogen$ResultMeasureValue)
#convert mg/l as No3 to mg/l as N
nitrogen$TotalN <- ifelse(nitrogen$ResultMeasure.MeasureUnitCode=="mg/l NO3", nitrogen$TotalN*0.2259, nitrogen$TotalN)
nitrogen$TotalN <- ifelse(nitrogen$DetectionQuantitationLimitMeasure.MeasureUnitCode=="mg/l NO3", nitrogen$TotalN*0.2259, nitrogen$TotalN)
Now that we have explored and tidied our sites
and results
datasets pulled from the Water Quality Data portal, we’ll merge them together and generate our visualizations. Recalling our objective is to reveal trends in water quality, particularly in repsonse to water quality rules issued in 2009, we’ll want to constrain (i.e. filter) our analyses to sites with data collected before and after 2009. It’s these remaining sites that we’ll construct plots of total nitrogen: line plots of N over time, box plots by year, and box plots by month. We’ll also plot our concentrations on a map.
Additionally, we’ll convert our N concentrations to annual load (lbs/year), which will require pulling in discharge data (as we did in previous sessions). We’ll do this for a few sites, comparing the findings to allowable levels to answer: are these sites in compliance with Total Daily Maximum Loads (TMDLs)?
Currently, site and measurement data are not connected together. However, we may want to show the nitrate values on a map. To do this, we merge the data together based on a unique identifier shared between the two data sets. In this case, it is the MonitoringLocationIdentifier column.
nitrogen.data <- merge(sites, nitrogen, by.x="MonitoringLocationIdentifier", by.y="MonitoringLocationIdentifier"); dim(nitrogen.data)
## [1] 4040 18
#create year and month categories. Make a table to see how many samples were collected each year/month
nitrogen.data$Year <- year(nitrogen.data$ActivityStartDate); table(nitrogen.data$Year)
##
## 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993
## 4 44 48 50 82 154 183 74 124 128 408 242 100 58 53
## 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008
## 54 40 33 32 27 41 51 94 115 142 109 90 90 94 114
## 2009 2010 2011 2012 2013 2014 2015 2016 2017
## 114 115 97 204 196 104 73 78 181
nitrogen.data$Month <- month(nitrogen.data$ActivityStartDate); table(nitrogen.data$Month)
##
## 1 2 3 4 5 6 7 8 9 10 11 12
## 139 284 153 555 278 608 268 623 281 517 110 224
With the data merged, we now want to know how many sites are collecting data of interest. We find there are 22 sites. Of those sites, we want to know which were collecting data after the Jordan Lake Rules were passed and how much data are being collected at this site.
Based on this, we see several sites stopped collecting data prior to 2009, when the first iteration of Jordan Rules were passed. We also see that some of these sites collected only a few years of data. Remove those sites and plot the remaining sites on the map.
#ED to know which sites to keep or remove
unique.sites <- unique(nitrogen.data$MonitoringLocationIdentifier)
n.sites <- length(unique.sites)
#Create dataframe of information on: n samples, start year, end year
site.info <- nitrogen.data %>%
group_by(MonitoringLocationIdentifier) %>%
summarise(n=n(), StartYear=min(Year), EndYear=max(Year))
site.info <- as.data.frame(site.info)
site.info$Range <- site.info$EndYear-site.info$StartYear
site.info
## MonitoringLocationIdentifier n StartYear EndYear Range
## 1 USGS-0209330990 151 1985 1990 5
## 2 USGS-0209331325 236 1985 1990 5
## 3 USGS-0209437825 270 1985 1990 5
## 4 USGS-0209437850 260 1985 1990 5
## 5 USGS-02096842 50 1979 1981 2
## 6 USGS-02096846 263 1988 2017 29
## 7 USGS-0209684980 253 1989 2017 28
## 8 USGS-02096960 179 1981 2004 23
## 9 USGS-0209699999 255 1991 2017 26
## 10 USGS-02097314 69 1982 2001 19
## 11 USGS-0209741955 75 1982 2002 20
## 12 USGS-02097464 265 1988 2017 29
## 13 USGS-0209749990 252 1988 2017 29
## 14 USGS-02097517 83 1982 2014 32
## 15 USGS-0209768310 262 1992 2017 25
## 16 USGS-0209771550 99 1991 2017 26
## 17 USGS-0209781125 90 2012 2017 5
## 18 USGS-0209782609 106 1999 2017 18
## 19 USGS-0209791010 89 2012 2017 5
## 20 USGS-0209799150 355 1991 2017 26
## 21 USGS-0209801100 256 1991 2017 26
## 22 USGS-02098198 122 1980 1985 5
#Limit data to those with more than 10 years data and still operating
site.info <- filter(site.info, Range>=10 & EndYear>=2017); site.info
## MonitoringLocationIdentifier n StartYear EndYear Range
## 1 USGS-02096846 263 1988 2017 29
## 2 USGS-0209684980 253 1989 2017 28
## 3 USGS-0209699999 255 1991 2017 26
## 4 USGS-02097464 265 1988 2017 29
## 5 USGS-0209749990 252 1988 2017 29
## 6 USGS-0209768310 262 1992 2017 25
## 7 USGS-0209771550 99 1991 2017 26
## 8 USGS-0209782609 106 1999 2017 18
## 9 USGS-0209799150 355 1991 2017 26
## 10 USGS-0209801100 256 1991 2017 26
df <- nitrogen.data[nitrogen.data$MonitoringLocationIdentifier %in% unique(site.info$MonitoringLocationIdentifier), ]; dim(df)
## [1] 2366 20
#Where are they located
leaflet(data=df) %>% addProviderTiles("CartoDB.Positron") %>%
addCircleMarkers(~LongitudeMeasure,~LatitudeMeasure,
color = "red", radius=5, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~MonitoringLocationIdentifier)
###Plot total nitrogen over time, by month and by year for each site.
####Plot nitrogen over time at each site
unique.site <- unique(df$MonitoringLocationIdentifier)
#convert activity date into a data
df$Date <- as.Date(df$ActivityStartDate, "%Y-%m-%d")
df$Year <- year(df$Date); df$Month <- month(df$Date)
i=1
for (i in 1:length(unique.site)){
zt <- filter(df, MonitoringLocationIdentifier == unique.site[i])
#group by date - take average of a day
foo <- zt %>% group_by(Date) %>%
summarise(AverageN = mean(TotalN, na.rm=TRUE), Year=mean(Year), Month=mean(Month))
foo <- as.data.frame(foo)
foo$color <- rgb(0.596,0.961,1,0.5)
foo$color <- ifelse(foo$Month==1, rgb(0.478,0.772,0.804,0.8), foo$color)
foo$color <- ifelse(foo$Month==2, rgb(0.325,0.525,0.545,0.8), foo$color)
foo$color <- ifelse(foo$Month==3, rgb(0.792,1,0.439,0.8), foo$color)
foo$color <- ifelse(foo$Month==4, rgb(0.635,0.804,0.352,0.8), foo$color)
foo$color <- ifelse(foo$Month==5, rgb(0.333,0.420,0.184,0.8), foo$color)
foo$color <- ifelse(foo$Month==6, rgb(1,0.447,0.337,0.8), foo$color)
foo$color <- ifelse(foo$Month==7, rgb(0.804,0.357,0.114,0.8), foo$color)
foo$color <- ifelse(foo$Month==8, rgb(0.545,0,0,0.8), foo$color)
foo$color <- ifelse(foo$Month==9, rgb(1,0.765,0.059,0.8), foo$color)
foo$color <- ifelse(foo$Month==10, rgb(0.804,0.584,0.047,0.8), foo$color)
foo$color <- ifelse(foo$Month==11, rgb(0.545,0.396,0.031,0.8), foo$color)
#create a plot
par(mfrow=c(1,1))
layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE), widths=c(2,2), heights=c(1.5,1.5))
par(mar = c(2,5,2,2)) #set plot margins
#Plot single graph on top row
plot(foo$Date, foo$AverageN, type='n', yaxt="n", main=zt$MonitoringLocationIdentifier[i], ylab="Total Nitrogen (mg/l)", xlab = '', cex.lab=0.9, ylim=c(0,6))
axis(2, las=2, cex.axis=0.8)
lines(foo$Date, foo$AverageN, col=rgb(0,0,0,0.4))
points(foo$Date, foo$AverageN, col=foo$color, cex=0.8, pch=19)
legend_order <- matrix(1:12,ncol=4,byrow = TRUE)
legend("topright", c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),
pch=19, col=c(rgb(0.478,0.772,0.804,0.8),rgb(0.325,0.525,0.545,0.8),rgb(0.792,1,0.439,0.8),rgb(0.635,0.804,0.352,0.8),rgb(0.333,0.420,0.184,0.8),
rgb(1,0.447,0.337,0.8),rgb(0.804,0.357,0.114,0.8),rgb(0.545,0,0,0.8),rgb(1,0.765,0.059,0.8),rgb(0.804,0.584,0.047,0.8),
rgb(0.545,0.396,0.031,0.8), rgb(0.596,0.961,1,0.5)), ncol=4)
#Add blank months to foo in case
blank.months <- as.data.frame(matrix(nrow=12,ncol=5)); colnames(blank.months) <- colnames(foo)
months.tab <- as.data.frame(table(foo$Month)); colnames(months.tab)<-c("Month","Freq");
k=1
for (j in 1:12){
check <- j;
check.zt <- subset(months.tab, Month==j)
if(dim(check.zt)[1]==0){
blank.months$Month[k]=check; blank.months$AverageN[k]=0
k=k+1
}
}
blank.months <- blank.months[!is.na(blank.months$Month),]
blank.months
foo.mo <- rbind(foo, blank.months)
boxplot(foo.mo$AverageN~foo.mo$Month, yaxt="n", ylab="Total Nitrogen (mg/l)", ylim=c(0,10), yaxs='i',
col=c(rgb(0.478,0.772,0.804,0.8),rgb(0.325,0.525,0.545,0.8),rgb(0.792,1,0.439,0.8),rgb(0.635,0.804,0.352,0.8),rgb(0.333,0.420,0.184,0.8),
rgb(1,0.447,0.337,0.8),rgb(0.804,0.357,0.114,0.8),rgb(0.545,0,0,0.8),rgb(1,0.765,0.059,0.8),rgb(0.804,0.584,0.047,0.8)), pch=19, cex=0.7)
axis(2, las=2)
mtext("Nitrogen by Month", side=3, line=-2)
abline(h=0.01, col="black", lwd=1)
#Add blank years to foo in case
year.tab <- as.data.frame(table(foo$Year)); colnames(year.tab)<-c("Year","Freq");
year.tab$Year <- as.numeric(as.character(year.tab$Year))
row.no <- max(year.tab$Year)-min(year.tab$Year)
blank.years <- as.data.frame(matrix(nrow=row.no,ncol=5)); colnames(blank.years) <- colnames(foo)
k=1
for (j in min(year.tab$Year):max(year.tab$Year)){
check <- j;
check.zt <- subset(year.tab, Year==check)
if(dim(check.zt)[1]==0){
blank.years$Year[k]=check; blank.years$AverageN[k]=0
k=k+1
}
#print(check)
}
blank.years <- blank.years[!is.na(blank.years$Year),]
blank.years
foo.yr <- rbind(foo, blank.years)
boxplot(foo.yr$AverageN~foo.yr$Year, yaxt="n", ylab="Total Nitrogen (mg/l)", ylim=c(0,6), col="grey", pch=19, cex=0.7)
axis(2, las=2)
mtext("Nitrogen by Year", side=3, line=-2)
abline(h=10, col="red", lwd=2, lty=4)
#print(i)
}
Let’s show the amount of Nitrogen coming from each site based on the size of the circle.
nitrogen.data$Year <- year(nitrogen.data$ActivityStartDate)
last.year <- filter(nitrogen.data, Year==2017)
last.year.n <- last.year %>%
group_by(MonitoringLocationIdentifier) %>%
summarize(AveN = mean(TotalN, na.rm=T), LongitudeMeasure = mean(LongitudeMeasure), LatitudeMeasure = mean(LatitudeMeasure))
last.year.n <- as.data.frame(last.year.n)
last.year.n$AveN <- round(last.year.n$AveN,3)
leaflet(data=last.year.n) %>%
addProviderTiles("Esri.WorldTopoMap") %>%
addCircleMarkers(~LongitudeMeasure,~LatitudeMeasure,
color = "red", radius=~AveN*5, stroke=FALSE,
fillOpacity = 0.6, opacity = 0.8,
popup=~paste0(MonitoringLocationIdentifier, "<br>",
"Average 2017 N: ", AveN, " mg/l"))
The water quality data reports Nitrogen as mg/l. In order to convert to an annual load (lbs/yr), we need to know the volume of water flowing through each site. Go to the NWIS Mapper to find which USGS gauges are closest to the Haw River Arm (1 site) and the New Hope Arm (3 sites).
#USGS-0209699999 is the arm of the Haw River and there is a USGS stream gauge site nearby: USGS 02096960
#Identify gauge to download
siteNo <- '02096960'
#Identify parameter of interest: https://help.waterdata.usgs.gov/code/parameter_cd_query?fmt=rdb&inline=true&group_cd=%
pcode = '00060' #discharge (cfs)
#Identify statistic code for daily values: https://help.waterdata.usgs.gov/code/stat_cd_nm_query?stat_nm_cd=%25&fmt=html
scode = "00003" #mean
#Identify start and end dates
start.date = "1970-10-01"
end.date = "2017-12-31"
#Load in data using the site ID
q <- readNWISdv(siteNumbers = siteNo, parameterCd = pcode, statCd = scode, startDate=start.date, endDate=end.date)
q <- renameNWISColumns(q); colnames(q)
## [1] "agency_cd" "site_no" "Date" "Flow" "Flow_cd"
q.siteInfo <- attr(q, "siteInfo")
q$Year <- year(q$Date); q$Month <- month(q$Date)
haw.n <- filter(df, MonitoringLocationIdentifier == "USGS-0209699999")
#where are sites located?
leaflet(data=q.siteInfo) %>%
addProviderTiles("Esri.WorldTopoMap") %>%
addCircleMarkers(~dec_lon_va,~dec_lat_va,
color = "blue", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~station_nm) %>%
addCircleMarkers(~haw.n$LongitudeMeasure[1],~haw.n$LatitudeMeasure[2],
color = "red", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~haw.n$MonitoringLocationName[1])
To calculate the annual load we need to convert cfs to MGD. Then we use pipes and dplyr to calculate the total annual flow at Haw River. Next, we calculate the average Nitrogen load for samples taken during each year. This is a rough proxy. A finer analysis can be undertaken by summarizing monthly flow and water quality, aggregating to the year as the last step. The annual load is then the: Total Flow * average Nitrogen * 8.34 lbs per gallon
We can then plot the annual load with the threshold of 2.567 Million pounds per year.
#Calculate total volume of water each year
q$Flow_MGD <- 646317*q$Flow/1000000
q.year <- q %>%
group_by(Year) %>%
summarise(Total_MGD = sum(Flow_MGD, na.rm=T), n=n())
q.year <- as.data.frame(q.year)
q.year <- filter(q.year, n>=350)
#calculate the average Nitrate concentration
n.year <- haw.n %>%
group_by(Year) %>%
summarise(AveN = mean(TotalN, na.rm=T), n=n())
n.year <- as.data.frame(n.year)
#For days with a water quality measurement - what was the daily flow?
haw.n <- merge(n.year, q.year, by.x="Year", by.y="Year")
lbspergal <- 8.34
haw.n$Pounds <- haw.n$Total_MGD*haw.n$AveN*lbspergal
#convert to million pounds
haw.n$MPounds <- haw.n$Pounds/1000000
par(mfrow=c(1,1))
par(mar = c(2,5,2,2)) #set plot margins
#Plot single graph on top row.
plot(haw.n$Year, haw.n$MPounds, type='n', yaxt="n", main="Haw River Branch", ylab="Annual Nitrogen Load (million lbs)", xlab = '', cex.lab=0.9)
axis(2, las=2, cex.axis=0.8)
lines(haw.n$Year, haw.n$MPounds, col=rgb(0,0,0,0.4))
points(haw.n$Year, haw.n$MPounds, col=rgb(0,0,0,0.6), cex=1, pch=19)
abline(h=2567000/1000000, col="red", lty=4, lwd=2)
Here there are 3 upstream gauges for New Hope Creek. We will download that information into a single file. We repeat the above analyis with the New Hope stream gauges.
#USGS-0209768310 is the arm of the Haw River and there are three usgs gauges upstream on different tributaries
#Identify gauges to download
siteNo <- c('02097314', '02097517','0209741955')
pcode = '00060' #discharge (cfs)
scode = "00003" #mean
start.date = "1970-10-01"
end.date = "2017-12-31"
#Load in data using the site ID
q <- readNWISdv(siteNumbers = siteNo, parameterCd = pcode, statCd = scode, startDate=start.date, endDate=end.date)
q <- renameNWISColumns(q); colnames(q)
## [1] "agency_cd" "site_no" "Date" "Flow" "Flow_cd"
q.siteInfo <- attr(q, "siteInfo")
q$Year <- year(q$Date); q$Month <- month(q$Date)
#If flow is unknown set to NA
q$Flow <- ifelse(q$Flow <0,NA, q$Flow)
#grab new hope monitoring station
newhope.n <- filter(df, MonitoringLocationIdentifier == "USGS-0209768310")
#where are sites located?
leaflet(data=q.siteInfo) %>%
addProviderTiles("Esri.WorldTopoMap") %>%
addCircleMarkers(~dec_lon_va,~dec_lat_va,
color = "blue", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~station_nm) %>%
addCircleMarkers(~newhope.n$LongitudeMeasure[1],~newhope.n$LatitudeMeasure[2],
color = "red", radius=3, stroke=FALSE,
fillOpacity = 0.8, opacity = 0.8,
popup=~newhope.n$MonitoringLocationName[1])
#Calculate total volume of water each year
q$Flow_MGD <- 646317*q$Flow/1000000
q.year <- q %>%
group_by(Year) %>%
summarise(Total_MGD = sum(Flow_MGD, na.rm=T), n=n())
q.year <- as.data.frame(q.year)
q.year <- filter(q.year, n>=350*3)
#calculate the average Nitrate concentration
n.year <- newhope.n %>%
group_by(Year) %>%
summarise(AveN = mean(TotalN, na.rm=T), n=n())
n.year <- as.data.frame(n.year)
#For days with a water quality measurement - what was the daily flow?
newhope.n <- merge(n.year, q.year, by.x="Year", by.y="Year")
lbspergal <- 8.34
newhope.n$Pounds <- newhope.n$Total_MGD*newhope.n$AveN*lbspergal
newhope.n$KPounds <- newhope.n$Pounds/1000
par(mfrow=c(1,1))
par(mar = c(2,5,2,2)) #set plot margins
#Plot single graph on top row
plot(newhope.n$Year, newhope.n$KPounds, type='n', yaxt="n", main="New Hope Arm", ylab="Annual Nitrogen Load (thousand lbs)", xlab = '', cex.lab=0.9, ylim=c(0,800))
axis(2, las=2, cex.axis=0.8)
lines(newhope.n$Year, newhope.n$KPounds, col=rgb(0,0,0,0.4))
points(newhope.n$Year, newhope.n$KPounds, col=rgb(0,0,0,0.6), cex=1, pch=19)
abline(h=641021/1000, col="red", lty=4, lwd=2)
Notice that the annual nitrogen load is far below the threshold level. Why or why not is this unexpected? What do you think could be causing this underestimate?
It is likely due to missing several tributary contributions. Jordan Lake has a 1,689 mi2 drainage area. The Haw River Branch is 1,275 mi2. A tributary downstream of newhope.n is 12 mi2. The upstream drainage area of the three sites are 76.9, 21.1, and 41 mi2 (how do you think I found these numbers?). This leaves 402 mi2 unaccounted for. Let’s assume 25% of this left overis downstream of the monitoring site. In hydrology, flow is often linked to drainage area. Let’s adjust the Nitrogen entering the New Hope Arm by drainage area.
site.da <- 76.9+21.2+41
remaining.da <- 1689-1275-12
fraction <- 0.75
adjust.da <- remaining.da*fraction
newhope.n$adj.pounds <- newhope.n$KPounds*adjust.da/site.da
plot(newhope.n$Year, newhope.n$adj.pounds, type='n', yaxt="n", main="New Hope Arm", ylab="Annual Nitrogen Load (thousand lbs)", xlab = '', cex.lab=0.9, ylim=c(0,1000))
axis(2, las=2, cex.axis=0.8)
lines(newhope.n$Year, newhope.n$KPounds, col=rgb(0,0,0,0.4))
points(newhope.n$Year, newhope.n$KPounds, col=rgb(0,0,0,0.6), cex=1, pch=19)
lines(newhope.n$Year, newhope.n$adj.pounds, col=rgb(0.5,0,0,0.4), lty=2)
points(newhope.n$Year, newhope.n$adj.pounds, col=rgb(0.5,0,0,0.4), cex=1, pch=19)
abline(h=641021/1000, col="red", lty=4, lwd=2)
mtext(paste0("Adjust by ", fraction, " drainage area fraction"), side=3, line=-2)
##Save out the new files for entry into Tableau
#write csv
write.csv(df, paste0(swd,'../tableau/nitrogen.csv'),row.names=FALSE)
write.csv(haw.n, paste0(swd, '../tableau/hawbranch.csv'),row.names=FALSE)
write.csv(newhope.n, paste0(swd, '../tableau/newhope.csv'),row.names=FALSE)