I am a physical oceanographer interested in how ocean water is mixed and transformed. I am currently working as a Research Scientist at the Bedford Institute of Oceanography in Halifax, Nova Scotia.

Recent Posts:

Using the oce package to make nice maps


Making maps is a pretty important part of doing, and presenting, ocean data analyses. Except for very small domains, using map projections is crucial to ensure that the map is not distorted. This is particularly true for polar and high latitude regions, such as the Arctic (where I do much of my work).

In this post I will give a brief introduction to making projected maps with the oce package, including not just the land/coastline but also various ways of plotting the bathymetry. For the latter I will discuss the handy marmap package.

Making a polar stereographic map of the Canadian Arctic

To make “projected” maps with oce, you can use the mapPlot() function. The easiest approach is to call mapPlot() with a coastline object to set the projection you want, and then add other elements (such as bathymetry, etc) using functions like mapLines(), mapPoints(), mapImage(), etc. You may need to re-draw the coastline at the end to clean up anything that plotted over land that you don’t want.

The coastlineWorldFine dataset from the ocedata package is pretty good for sub-regions such as the Canadian Arctic.

library(oce)
## Loading required package: testthat
## Loading required package: gsw
library(ocedata) # for the coastlineWorldFine data

To get the projection you want, it must be passed in the projection= argument as a character string using “proj4” syntax. You can read up on the syntax and the available projections in the help – i.e. ?mapPlot, or have a look at:


For polar maps, the most commonly used is the stereographic projection:

## Save it to a function to make it easy to re-run
mp <- function() {
    mapPlot(coastlineWorldFine, projection="+proj=stere +lon_0=-90 +lat_0=90",
            longitudelim = c(-120, -60),
            latitudelim = c(60, 85), col='grey')
}
mp()

plot of chunk unnamed-chunk-2

In this example, the +lon_0 parameter defines the longitude at the center of the projection, and the +lat_0 is the latitude at the center of the projection. The mapPlot() arguments longitudelim and latitudelim are used to control the extent of the map.
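To give a sense of the proj4 syntax, here are a few strings for projections that often come up (a sketch only; the parameter values are illustrative, not recommendations):

```r
# A few example proj4 strings (parameter values are illustrative only)
p_stere <- "+proj=stere +lon_0=-90 +lat_0=90"          # polar stereographic
p_merc  <- "+proj=merc"                                # Mercator (low latitudes)
p_lcc   <- "+proj=lcc +lat_1=40 +lat_2=60 +lon_0=-60"  # Lambert conformal conic
p_moll  <- "+proj=moll"                                # Mollweide (whole world)
```

Any of these (the variable names are just for illustration) can be passed directly, e.g. mapPlot(coastlineWorldFine, projection=p_stere).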

As for adding bathymetry, there are a few options:

  1. oce includes a low-res version of the etopo dataset, called topoWorld that can be used, either as a contour plot or as an image plot.

  2. The marmap package allows easy downloading of subsets of the full-resolution etopo data, which can be used to make nicer plots. One downside is that because of the “triangle” shape of a polar stereographic projection you have to download a lot more data than you really need.

Using topoWorld

A quick way of adding bathymetry is to simply use the mapContour() function (analogous to the base-R contour()).

mp() # redraw the base map
mapContour(topoWorld, levels=-c(500, 1000, 2000, 4000))

plot of chunk unnamed-chunk-3

Contour plots can be tricky, because of the labeling and the choice of which contours to plot. Another option is to use the mapImage() function:

mp() # redraw the base map
mapImage(topoWorld, col=oceColorsGebco, breaks=seq(-4000, 0, 500))
mapPolygon(coastlineWorldFine, col='grey')

plot of chunk unnamed-chunk-4

But sometimes using mapImage() shows off the “blockiness” of the low-res topoWorld dataset – especially near the poles where the shape of the cells (evenly divided in lon/lat space) gets really skinny.
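To see why the cells get skinny, note that the east-west width of a lon/lat grid cell shrinks with the cosine of latitude. A quick back-of-envelope calculation (assuming a spherical Earth of radius 6371 km and a hypothetical 0.5-degree cell):

```r
R_earth <- 6371          # Earth radius, km (spherical approximation)
dlon <- 0.5 * pi / 180   # a 0.5 degree cell width, in radians

# east-west width of the cell at a given latitude, km
cell_width <- function(lat) R_earth * dlon * cos(lat * pi / 180)

cell_width(0)    # ~55.6 km at the equator
cell_width(80)   # ~9.7 km at 80N -- nearly 6x narrower
```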

One fix is to use a higher res data set (so the “boxes” are smaller – see next section). Another is to use the fillContour argument in mapImage() to plot filled contours rather than the individual grid cells.

mp() # redraw the base map
mapImage(topoWorld, col=oceColorsGebco, breaks=seq(-4000, 0, 500), filledContour = TRUE)
mapPolygon(coastlineWorldFine, col='grey')

plot of chunk unnamed-chunk-5

Using marmap

Here we load the marmap package and download the bathymetry. Note that because we have the North Pole in view, we basically have to download an entire half hemisphere. It takes a little while to download, but if you use the keep=TRUE argument the data are saved to a local file, so subsequent calls will just reload it from disk.

However, because there is so much more data, the plotting can be quite a bit slower.

library(marmap)
## Attaching package: 'marmap'
## The following object is masked from 'package:oce':
##     plotProfile
## The following object is masked from 'package:grDevices':
##     as.raster
b <- as.topo(getNOAA.bathy(-180, 0, 55, 90, keep=TRUE))
## File already exists ; loading 'marmap_coord_-180;55;0;90_res_4.csv'
mp() # redraw the base map
mapImage(b, col=oceColorsGebco, breaks=seq(-4000, 0, 500))
mapPolygon(coastlineWorldFine, col='grey')

plot of chunk unnamed-chunk-6

Looks pretty good, though!

The (ocean) physics of The Ocean Cleanup's System 001


The Ocean Cleanup, brainchild of Dutch inventor Boyan Slat, was in the news again this past week after announcing that in addition to the fact that their system is unable to collect plastic as intended, it suffered a mechanical failure. “Wilson” is currently being towed to Hawaii, where it will undergo repairs and upgrades, presumably to be towed back out to the garbage patch for a second trial.

I am not a mechanical engineer, so I don’t intend to comment on the details of their mechanical failure. I am, however, a sea-going oceanographer, which means that I am used to the sorts of situations with scientific research equipment that were so succinctly summarized by Dr. Miriam Goldstein:

“The ocean is strong and powerful, and likes to rip things up.”

Dr. Miriam Goldstein, prescient oceanographer

In short – the ocean is a difficult place to work. There are literally CONFERENCES dedicated to the engineering of putting things out to sea and having them survive (see the MTS Buoy Workshop, which I have participated in). There is a saying in oceanographic fieldwork: if you get your gear back, it was a successful program. If it recorded data – that’s icing on the cake.

Designing for physics

But beyond the engineering, there is the question of what physics TOC is relying on for their system to be successful. Some of you may recall that the original design was to moor (i.e. anchor) their device in 6000m (20000 feet) of water, and let existing ocean currents sweep garbage into the U-shaped structure. Thankfully, they realized the challenges associated with deep-ocean moorings, and abandoned that idea.

The latest design iteration (misleadingly called “System 001”, as though they haven’t built and tested any other previous to it), is to have a freely-drifting system, avoiding the use of anchors. TOC claim that under the influence of current, wind, and waves, their design will drift faster than the plastic – causing it to accumulate in the U, making for easy pickup. They summarize the concept with a little explainer video on their website, with a representative screen shot below:

Nice how the wind, waves, and current all are going in the same direction!!!

Based on a quick Twitter rant that I had after thinking about all this for a few minutes (see here), I wanted to lay out the various points that have either a) been missed by the TOC design team, or b) been deliberately excluded from their rosy assessment of how they expect their system to actually collect garbage. What follows is a “first stab” at a physical oceanographic assessment of the basic idea behind “System001”, and what TOC would need to address to convince the community (i.e. scientists, conservationists, etc) that their system is actually worth the millions of dollars going into development and testing.

The premise

As outlined in the video, the premise of System001 as a garbage collection system is that through the combined action of wind, waves, and currents, the U-shaped boom will travel faster through the water than the floating plastic, thereby collecting and concentrating it for eventual removal. This appears to be based on the idea that while both the boom and the plastic will drift with the current, because the boom protrudes from the water (like a sail), it will actually move faster than the surface water by catching wind.

There are some issues with this premise. Or, at least, there are some real aspects of oceanography that have either been ignored or missed in thinking that such a system will behave in the predictable way described by TOC. I’ll try and outline them here.

Stokes drift

Any of you who may have had an introduction to ocean waves may have heard that during the passage of a wave, the water particles move in little circles (often called wave orbital motion). While not a bad “first-order” description, it turns out that for real ocean waves there is also some drift in the direction of wave propagation. This drift is named after George Gabriel Stokes, who first described it mathematically in 1847 (see wikipedia article here).

Stokes drift, from https://www.researchgate.net/publication/315739116_Breaching_of_Coastal_Barriers_under_Extreme_Storm_Surges_and_Implications_for_Groundwater_Contamination_Application_of_XBeach_in_Coastal_Flood_Propagation/figures?lo=1

The amount of drift depends nonlinearly on both the amplitude and the wavelength of the wave. For example, for a 0.5m amplitude wave with a wavelength of 10m and period of 10s (something like typical ocean swell), the drift velocity is about 10 cm/s right at the surface.
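The surface value quoted above can be checked with the standard deep-water formula for Stokes drift, u_s = ω k a² (where a is the amplitude, k = 2π/λ the wavenumber, and ω = 2π/T the angular frequency; the drift decays with depth as e^(2kz)). Plugging in the numbers from the example:

```r
a <- 0.5                # wave amplitude, m
lambda <- 10            # wavelength, m
Tp <- 10                # wave period, s
k <- 2 * pi / lambda    # wavenumber, 1/m
omega <- 2 * pi / Tp    # angular frequency, 1/s
us <- omega * k * a^2   # surface Stokes drift, m/s
us                      # ~0.099 m/s, i.e. about 10 cm/s
```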

Of course, the Stokes’ solution describes the motion of the water parcels being moved by the wave. For those water parcels to then have an effect on anything in the water, one would need to consider the various components of force/impulse/momentum (i.e. our buddy Sir Isaac Newton). Needless to say, it seems obvious that a smallish piece of neutrally buoyant plastic will respond to the Stokes drift much more readily than a 600m long floating cylinder with a large mass (and therefore large inertia).

This alone could be enough to quash the idea of a passive propagating collection system.

Ekman currents

While we’re talking about long-dead European fluid mechanics pioneers, any study of the effect of winds and currents wouldn’t be complete without a foray into the theories proposed by Swedish oceanographer Vagn Walfrid Ekman in 1905. What Ekman found was that when the wind blew over the surface of the ocean, the resulting current (forced by friction between the air and the water) didn’t actually move in the same direction as the wind. The reason for this is the so-called “Coriolis effect”, whereby objects moving on the surface of the Earth experience an “acceleration” orthogonal to their direction of motion that appears to make them follow a curved path (for those who want to go down the rabbit hole, the Coriolis acceleration is essentially a “fix” for the fact that the surface of the Earth is a non-inertial reference frame, and therefore doesn’t satisfy the conditions for Newton’s laws to apply without modification).

Anyway – the consequence is that in an ideal ocean, with a steady wind blowing over the surface, the surface currents actually move at an angle of 45 degrees to the wind direction! Whether it’s to the left or right of the wind depends on which hemisphere you are in – I’ll leave it as an exercise to determine which is which. And what’s cooler, is that the surface current then acts like a frictional layer to the water just below it, causing it to move at an angle, and so on, with the effect being that the wind-forced flow actually makes a SPIRAL that gets smaller with depth. This is known as the Ekman spiral.

Ekman spiral, from http://oceanmotion.org/html/background/ocean-in-motion.htm

The actual depth that the spiral penetrates to depends on a mysterious ocean parameter called Az, which describes the vertical mixing of momentum between the layers – kind of like the friction between them. What is clear though, is that a small particle of plastic floating close to the surface and a 3m deep floating structure will likely not experience the same wind-forced current, and therefore won’t move in the same direction. Hmmm … that’s going to make it hard to pick up pieces of plastic.
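The classical solution can be sketched in a few lines. This is only a sketch: the values of Az, the Coriolis parameter f, and the surface speed V0 below are assumed for illustration, with z negative downward and the Ekman depth D = π·sqrt(2·Az/f):

```r
f  <- 1e-4                   # Coriolis parameter, 1/s (mid-latitude, assumed)
Az <- 0.01                   # vertical eddy viscosity, m^2/s (assumed)
D  <- pi * sqrt(2 * Az / f)  # Ekman depth, ~44 m with these values
V0 <- 0.1                    # surface current speed, m/s (assumed)

z <- seq(0, -D, length.out = 50)   # depth levels down to one Ekman depth
# Northern Hemisphere solution, wind blowing in +x:
u <-  V0 * exp(pi * z / D) * cos(pi / 4 + pi * z / D)
v <- -V0 * exp(pi * z / D) * sin(pi / 4 + pi * z / D)

# at the surface, the current is 45 degrees to the right of the wind:
atan2(v[1], u[1]) * 180 / pi   # -45
```

The speed decays exponentially and the direction rotates with depth, tracing out the spiral.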

What is a “Gyre” anyway?

The final point I wanted to make in this article (I have more, which I’ll summarize at the end for a possible future article) is to try and give a sense of what currents in the ocean (including in the “gyre” or in the region often referred to as the “Great Pacific Garbage Patch”) actually look like. The conception that there is a great swirling current thousands of km across is true only when the currents are averaged over a very long time. At any given instant, however, the ocean current field is a mess of flows at various space and time scales. An appropriate term for describing typical ocean flow fields is “turbulent”, as in an oft-viewed video made by NASA from satellite ocean current data.

To illustrate this, I took some screenshots of current conditions from the wonderful atmosphere/ocean visualization tool at earth.nullschool.net showing: ocean currents, surface waves, and wind.

Ocean currents

Ocean waves


These images illustrate the potential problem with TOC’s idea, by highlighting the fact that the wind, wave, and current fields of the ocean (including even in the “quiet” garbage patch) are highly variable in space and time, and are almost never aligned at the same moment. What’s more, the currents and waves at a given time and location are not always a result of the wind at that location. Eddies in the ocean are generated through all kinds of different processes, and can propagate across ocean basins before finally dissipating.

Similarly, surface waves have been measured to cross oceans (i.e. the famous “Waves across the Pacific” study pioneered by the transformative oceanographer Walter Munk).

Other issues

Following the “rule of three”, I tried to hit what I consider to be the biggest concerns with TOC’s system design and principle, from my perspective as a physical oceanographer. However, there are other issues that should be addressed, if the system as designed is really believed by the TOC team to be capable of doing what they say. And really, it seems like a crazy waste of time for everyone involved to have spent this much time on something if they aren’t sure it will even work theoretically … not to mention the money spent thus far. So, part of me has to believe that all the dozens of people involved care deeply about making something that might actually work, and that they have studied and considered all the effects and potential issues I (and others) have raised.

Anyway, the other issues are:

  • What is the actual response of the system to a rapid change in wind/wave direction? Wind can change direction pretty quickly, especially compared to ocean currents. What’s to prevent a bunch of accumulated plastic getting blown out the open end of the U after a 180 degree shift in wind but before the system can re-orient?

  • What about wave reflection from the boom structure itself? It is well known that objects (even floating ones) can reflect and “scatter” waves (scattering redirects wave energy into other directions and wavelengths), and it seems like this could create a wave field in the U that might actually cause drift out of the system.

  • The idea that all wildlife can just “swim under” the skirt (because it’s impermeable) is not supported by anything that I consider to be rigorous fluid mechanics, aside from the fact that much of what actually lives in the open ocean is non-motile or “planktonic”. There are a lot of communities in the open ocean that float and drift at the surface, and I see no way that, if the System collects floating plastic as designed, it won’t just sweep up all those species too. The latest EIA brushed off the effect of the System on planktonic organisms by stating that they “are ubiquitous in the world’s oceans and any deaths that occur as a result of the plastic extraction process will not have any population level effects”. But that doesn’t take into account that the stated mission is to deploy 60 such systems, which are estimated to clean the garbage patch of surface material at a rate of 50% reduction every 5 years. It stands to reason that they would also clean the Pacific of its planktonic communities by the same amount.

Recording and replaying plots with the `recordPlot()` function

This post is not going to focus on anything oceanographic, but on a little trick that I just learned about using base graphics in R – the recordPlot() function.

R plot systems

First, for those who either don’t use R or who have been living under a rock, there are (in my opinion) two major paradigms for producing plots from data in R. The first is the original “base graphics” system – the sequence of functions bundled with R that are part of the graphics package which is installed and loaded by default.

The second is the ggplot2 package, written by Hadley Wickham, which uses the “grammar of graphics” approach to plotting data and is definitely not your standard plot(x, y) approach to making nice-looking plots. To be fair, I think ggplot is quite powerful, and I never discourage anyone from using it, but because I don’t use it in my own work (for many reasons too complicated to get into here) I don’t tend to actively encourage it, either.

Storing ggplots in objects

Anyway, one thing that I’ve always liked about the ggplot approach is that the components of the plot can be saved in objects, and built up in pieces by simply adding new plot commands to the object. A typical use case might be like this (using the built-in iris dataset):

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length)) + geom_point()

plot of chunk unnamed-chunk-1

Note how the foundation of the plot is created with the ggplot() function, but the points are added through the + operator with the geom_point() function.

However, for more complicated plots, the components are often saved into an object which can have other geom_* bits added to it later on. Then the final plot is rendered by “printing” the object:

pl <- ggplot(iris, aes(x=Sepal.Length, y=Petal.Length))
pl <- pl + geom_point() # add the points
pl <- pl + geom_density2d() # add a 2d density overlay
pl # this "prints" the plot and renders it on the screen

plot of chunk unnamed-chunk-2

Admittedly, this is pretty cool, not least because you can always re-render the plot just by printing the object.

Recording a base graphics plot

While not quite the same, I recently discovered that it’s possible to “record” a base graphics plot to save in an object, allowing you to re-render the same plot. The case that led me to stumble onto this was where I had a complicated bit of code that made a plot that fit a model to data, and then I wanted to step through various iterations of removing certain points, adding new ones, to see what the effect on the fit would be.

I often do this using the pdf() function in R, so that each new plot can become another page in the pdf file that can be stepped through. However, another use case that I thought of after is in writing Rmarkdown documents (like this one!), where you’d like to keep showing a base plot but add different elements to it consecutively. Because of the way Rmarkdown works, the graphics from each code chunk are rendered independently, so it’s not possible to, say, generate a plot in one chunk (using plot()), and then add to it with points() or lines() in another chunk.

Let’s see an example. I’ll make a plot that has a bunch of pieces, and then save it to an object with recordPlot():

plot(iris$Sepal.Length, iris$Petal.Length,
     xlab='Sepal Length', ylab='Petal Length',
     pch=round(runif(nrow(iris), max=25)))
title('My base plot')

plot of chunk unnamed-chunk-3

pl <- recordPlot()
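One caveat worth knowing: recordPlot() only works when the current graphics device has its display list enabled. Interactive devices enable it by default, but file devices like pdf() do not, so in a non-interactive script you may need dev.control(). A minimal sketch:

```r
pdf(NULL)              # open a throwaway device (no file is written)
dev.control("enable")  # turn on the display list so the plot is recorded
plot(1:10)
p <- recordPlot()      # snapshot of everything drawn so far
dev.off()
inherits(p, "recordedplot")  # TRUE -- replay later with replayPlot(p)
```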

Now, if I want to redo the plot exactly as I already did, I just “print” the object:

pl
plot of chunk unnamed-chunk-4

So now if I want to redo the plot, but add different pieces, I can redo the plot as above, and add whatever I want with the normal base graphics functions:

pl # start with the original plot
m <- lm(Petal.Length ~ Sepal.Length, data=iris)
abline(m, lwd=2, col=2)
title(c('', '', 'with a subtitle!'), col.main=2)

plot of chunk unnamed-chunk-5

pl <- recordPlot()

And to keep going I just keep recording the plot and starting each chunk with it:

pl # start with the recorded plot
II <- iris$Petal.Length > 4
points(iris$Sepal.Length[II], iris$Petal.Length[II], col=3, pch=19)
sl <- seq(4, 8, length.out=100)
pred <- predict(m, newdata=list(Sepal.Length=sl), interval='confidence')
lines(sl, pred[,2], lty=2, col=2, lwd=2)
lines(sl, pred[,3], lty=2, col=2, lwd=2)

plot of chunk unnamed-chunk-6

Bootstrapping uncertainties with the boot package

People often ask me what I like about R compared to other popular numerical analysis software commonly used in the oceanographic sciences (coughMatlabcough). Usually the first thing I say is the package system (including the strict rules for package design and documentation), and how easy it is to take advantage of work that others have contributed in a consistent and reproducible way. The second is usually about how well-integrated the statistics and statistical methods are in the various techniques. R is fundamentally a data analysis language (by design), something that I’m often reminded of when I am doing statistical analysis or model fitting.

Fitting models, with uncertainties!

Recently I found myself needing to estimate both the slope and x-intercept for a linear regression. Turns out it’s pretty easy for the slope, since it’s one of the directly estimated parameters from the regression (using the lm() function), but it wasn’t as clear how to get the uncertainty for the x-intercept. In steps the boot package, which is a nice interface for doing bootstrap estimation in R. I won’t get into the fundamentals of what bootstrapping involves here (the linked wikipedia article is a great start).

Ok, first, a toy example (which isn’t all that different from my real research problem). We make some data following a linear relationship (with noise):

x <- 0:20
set.seed(123) # for reproducibility
y <- x + rnorm(x)
plot(x, y)

plot of chunk unnamed-chunk-1

We can easily fit a linear model to this using the lm() function, and display the results with the summary() function:

m <- lm(y ~ x)
plot(x, y)
abline(m) # add the fit line
summary(m)

plot of chunk unnamed-chunk-2

## Call:
## lm(formula = y ~ x)
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8437 -0.5803 -0.1321  0.5912  1.8507 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.37967    0.41791   0.909    0.375    
## x            0.97044    0.03575  27.147   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.992 on 19 degrees of freedom
## Multiple R-squared:  0.9749,	Adjusted R-squared:  0.9735 
## F-statistic:   737 on 1 and 19 DF,  p-value: < 2.2e-16

I love lm(). It’s so easy to use, and there’s so much information attached to the result that it’s hard not to feel like you’re a real statistician (even if you’re an imposter like me). Check out the neat little table, showing the estimates of the y-intercept and slope, along with their standard errors, t values, and p values.

So how to get the error on the x-intercept? Well, one way might be to propagate the slope and y-intercept uncertainties through a rearrangement of the equation, but for anything even a little complicated this would be a pain. Let’s do it instead with the boot package.

We need to create a function that takes the data (as the first argument), with the second argument being an index that can be used by the boot() function to run the function with a subset of the data. Let’s demonstrate first by writing a function to calculate the slope, and see how the bootstrapped statistics compare with what comes straight from lm():

library(boot)
slope <- function(d, ind) {
    m <- lm(y ~ x, data=d[ind,])
    coef(m)[[2]] # return the slope
}
slope_bs <- boot(data.frame(x, y), slope, 999)
slope_bs
## Call:
## boot(data = data.frame(x, y), statistic = slope, R = 999)
## Bootstrap Statistics :
##      original        bias    std. error
## t1* 0.9704362 -0.0009770872   0.0391621

The bootstrap estimate decided (using 999 subsampled replicates) that the value of the slope should be 0.9704362, while that straight from the linear regression gave 0.9704362 (i.e. exactly the same!). Interestingly the standard error is slightly higher than from lm(). My guess is that it would get closer to the real value with more replicates.

Ok, now to do it for the x-intercept, we just supply a new function:

xint <- function(d, ind) {
    m <- lm(y ~ x, data=d[ind,])
    -coef(m)[[1]]/coef(m)[[2]] # xint = -a/b
}
xint_bs <- boot(data.frame(x, y), xint, 999)
xint_bs
## Call:
## boot(data = data.frame(x, y), statistic = xint, R = 999)
## Bootstrap Statistics :
##       original      bias    std. error
## t1* -0.3912359 -0.02204408    0.412457

So the bootstrap estimate of the x-intercept is -0.3912359, with a standard error of 0.412457.
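Beyond the standard error, the boot package can also give confidence intervals directly with boot.ci(). A sketch, repeating the setup from above (the percentile method is just one of several types boot.ci supports):

```r
library(boot)

# same toy data as above
x <- 0:20
set.seed(123) # for reproducibility
y <- x + rnorm(x)

# x-intercept statistic, as before
xint <- function(d, ind) {
    m <- lm(y ~ x, data=d[ind,])
    -coef(m)[[1]]/coef(m)[[2]] # xint = -a/b
}
xint_bs <- boot(data.frame(x, y), xint, 999)

boot.ci(xint_bs, type = "perc") # 95% percentile confidence interval
```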

Predicting tides in R

This entry is actually a re-post of a great blog I found written by Marcus Beck. It was such a great summary of the tidal analysis capabilities built in to the oce package that I thought it would make a great addition to the (growing) library of posts here. The original post can be found here; I’ve reproduced the Rmarkdown in its entirety with Marcus’ permission (with a few minor format tweaks).


Skip to TL/DR….

Water movement in estuaries is affected by many processes acting across space and time. Tidal exchange with the ocean is an important hydrodynamic process that can define several characteristics of an estuary. Physical flushing rates and water circulation are often controlled by tidal advection, whereas chemical and biological components are affected by the flux of dissolved or particulate components with changes in the tide. As such, describing patterns of tidal variation is a common objective of coastal researchers and environmental managers.

Tidal predictions are nothing new. A clever analog approach has been around since the late 1800s. The tide-predicting machine represents the tide as the summation of waves with different periods and amplitudes. Think of a continuous line plot where the repeating pattern is linked to a rotating circle. Representing the line in two dimensions from the rotating circle creates a sine wave with the amplitude equal to the radius of the circle. A more complex plot can be created by adding the output of two or more rotating disks, where each disk varies in radius and rate of rotation. The tide-predicting machine is nothing more than a set of rotating disks linked to a single graph as the sum of the rotations from all disks. Here’s a fantastic digital representation of the tide-predicting machine:


Tides are caused primarily by the gravitational pull of the sun and moon on the earth’s surface. The elliptical orbits of both the moon around the earth and the earth around the sun produce periodic but unequal forces that influence water movement. These forces combined with local surface topography and large-scale circulation patterns from uneven heating of the earth’s surface lead to the variation of tidal patterns across the globe. Although complex, these periodic patterns can be characterized as the summation of sine waves, where one wave represents the effect of a single physical process (e.g., diurnal pull of the moon). Describing these forces was the objective of the earlier tide-predicting machines. Fortunately for us, modern software (i.e., R) provides us with a simpler and less expensive approach based on harmonic regression.


We’ll create and sum our own sine waves to demonstrate complexity from addition. All sine waves follow the general form, y as a function of time t:

y(t) = α + β sin(2πt/f + φ)

where the amplitude of the wave is β and the period (or 1/frequency) is f. The parameters α and φ represent scalar shifts in the curve up/down and left/right, respectively. We can easily create a function in R to simulate sine waves with different characteristics. This function takes the parameters from the above equation as arguments and returns a sine wave (y) equal in length to the input time series (time_in). The α and β are interpreted in units of wave height (e.g., meters) and f and φ are in hours.

# function for creating sine wave
waves <- function(time_in, alpha = 0, beta = 1, freq = 24, phi = 0){

  # timestep per hour
  time_step <- 60 / unique(diff(time_in))
  # set phi as difference in hours from start of time_in
  phi  <- min(time_in) + phi * 3600
  phi<- as.numeric(difftime(phi, min(time_in)))
  phi <- phi / time_step
  # get input values to cos func
  in_vals <- seq(0, length(time_in), length = length(time_in))
  in_vals <- in_vals / time_step
  in_vals <- 2 * pi * in_vals * 1 / freq

  # wave
  y <- alpha + beta * sin(in_vals + phi)
  return(y)
}

The default arguments will return a sine wave with an amplitude of one meter and frequency of one wave per 24 hours. Two additional time series are created that vary these two parameters.

# input time series for two weeks, 15 minute time step
x <- as.POSIXct(c('2017-04-01', '2017-04-15'))
x <- seq(x[1], x[2], by = 60 * 15)

# get three sine waves
# a: default
# b: amplitude 0.5, 48 hour period
# c: amplitude 2, 12 hour period
a <- waves(x)
b <- waves(x, beta = 0.5, freq = 48)
c <- waves(x, beta = 2, freq = 12)

We can combine all three waves in the same data object, take the summation, and plot to see how it looks.

# for data munging and plotting
library(dplyr)
library(tidyr)
library(ggplot2)

# get sum of all y values, combine to single object
yall <- rowSums(cbind(a, b, c))
dat <- data.frame(x, a, b, c, yall) %>% 
  gather('var', 'val', -x)

# plot
ggplot(dat, aes(x = x, y = val)) + 
  geom_line() + 
  facet_wrap(~var, ncol = 1)
plot of chunk unnamed-chunk-3

The important piece of information we get from the plot is that adding simple sine waves can create complex patterns. As a general rule, about 83% of the variation in tides is created by seven different harmonic components that, when combined, lead to the complex patterns we observe from monitoring data. These components are described as being of lunar or solar origin and relative periods occurring either once or twice daily. For example, the so-called ‘M2’ component is typically the dominant tidal wave caused by the moon, twice daily. The periods of tidal components are constant across locations but the relative strength (amplitudes) vary considerably.
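For reference, the periods of the seven main constituents are well established (the values below are rounded to two decimal places; semi-diurnal constituents sit near 12 h, diurnal near 24 h):

```r
# Principal tidal constituents and their periods (hours, rounded)
periods <- c(M2 = 12.42, S2 = 12.00, N2 = 12.66, K2 = 11.97,
             K1 = 23.93, O1 = 25.82, P1 = 24.07)
sort(periods)
```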

The oce package in R has a nifty function for predicting up to 69 different tidal constituents. You’ll typically only care about the main components above but it’s useful to appreciate the variety of components included in a tidal signal. We’ll apply the tidem function from oce to predict the tidal components on a subset of SWMP data. A two-week period from the Apalachicola Bay Dry Bar station is used.


library(SWMPr) # provides qaqc(), setstep(), and the apadbwq dataset

# clean, one hour time step, subset, fill gaps
dat <- qaqc(apadbwq) %>% 
  setstep(timestep = 60) %>% 
  subset(subset = c('2013-01-01 0:0', '2013-12-31 0:0'), select = 'depth') %>% 
  na.approx(maxgap = 1e6)

The tidem function from oce requires a ‘sealevel’ object as input. Plotting the sealevel object using the plot method from oce shows three panels; the first is the complete time series, second is the first month in the record, and third is a spectral decomposition of the tidal components as cycles per hour (cph).

datsl <- as.sealevel(elevation = dat$depth, time = dat$datetimestamp)
plot(datsl)

plot of chunk unnamed-chunk-5

We can create a model to estimate each of the main constituents using tidem. Here, we estimate each component separately to extract predictions for each, which we then sum to estimate the complete time series.

# tidal components to estimate
constituents <- c('M2', 'S2', 'N2', 'K2', 'K1', 'O1', 'P1')

# loop through tidal components, predict each with tidem
preds <- sapply(constituents, function(x){
    mod <- tidem(t = datsl, constituent = x)
    pred <- predict(mod)
    pred - mean(pred)
})
# combine prediction, sum, add time data
predall <- rowSums(preds) + mean(datsl[['elevation']])
preds <- data.frame(time = datsl[['time']], preds, Estimated = predall) 

##                  time           M2           S2          N2
## 1 2013-01-01 00:00:00 -0.111578526 -0.020833606 0.000215982
## 2 2013-01-01 01:00:00 -0.118544835 -0.008940681 0.006428260
## 3 2013-01-01 02:00:00 -0.095806627  0.005348532 0.011088593
## 4 2013-01-01 03:00:00 -0.049059634  0.018205248 0.013072149
## 5 2013-01-01 04:00:00  0.009986414  0.026184523 0.011900172
## 6 2013-01-01 05:00:00  0.066540974  0.027148314 0.007855534
##              K2            K1         O1            P1 Estimated
## 1 -0.0048417234  0.0911501572 0.01312209  0.0381700294  1.463683
## 2 -0.0093752262  0.0646689921 0.03909021  0.0340807303  1.465686
## 3 -0.0113830570  0.0337560517 0.06274939  0.0276811946  1.491713
## 4 -0.0103243372  0.0005294868 0.08270543  0.0194051690  1.532812
## 5 -0.0064842694 -0.0327340223 0.09778235  0.0098135843  1.574727
## 6 -0.0008973087 -0.0637552642 0.10709170 -0.0004434629  1.601819

Plotting a month of the estimated data shows the results. Note the variation in amplitude between the components. The M2, K1, and O1 components are the largest at this location. Also note the clear spring/neap variation in range every two weeks for the combined time series. This complex fortnightly variation is created simply by adding the separate sine waves.

# prep for plot
toplo <- preds %>% 
  gather('component', 'estimate', -time) %>% 
  mutate(component = factor(component, levels = c('Estimated', constituents)))

# plot one month
ggplot(toplo, aes(x = time, y = estimate, group = component)) + 
  geom_line() + 
  scale_x_datetime(limits = as.POSIXct(c('2013-07-01', '2013-07-31'))) + 
  facet_wrap(~component, ncol = 1, scales = 'free_y')


All tidal components can of course be estimated together. By default, the tidem function estimates all 69 tidal components. Restricting the output to our components of interest gives amplitudes consistent with those estimated above.

# estimate all components together
mod <- tidem(t = datsl)

# get components of interest, sorted by amplitude
amps <- data.frame(mod@data[c('name', 'amplitude')]) %>% 
  filter(name %in% constituents) %>% 
  arrange(amplitude)
amps
##   name  amplitude
## 1   K2 0.01091190
## 2   N2 0.01342395
## 3   S2 0.02904518
## 4   P1 0.04100388
## 5   O1 0.11142455
## 6   M2 0.12005114
## 7   K1 0.12865764

And of course comparing the model predictions with the observed data is always a good idea.

# add predictions to observed data
dat$Estimated <- predict(mod)

# plot one month
ggplot(dat, aes(x = datetimestamp, y = depth)) + 
  geom_point() + 
  geom_line(aes(y = Estimated), colour = 'blue') + 
  scale_x_datetime(limits = as.POSIXct(c('2013-07-01', '2013-07-31'))) + 
  scale_y_continuous(limits = c(0.9, 2))


The fit is not perfect, but this could be for several reasons, none directly related to the method: instrument drift, fouling, water movement from non-tidal sources, and so on. The real value of the model is that we can use it to fill missing observations in tidal time series or to predict future observations. We also get reasonable estimates of the main tidal components, i.e., which physical forces are actually driving the tide and how large their contributions are. For example, our data from Apalachicola Bay showed that the tide is driven primarily by the M2, K1, and O1 components, each with a relative amplitude of about 0.1 meters. This is consistent with general patterns of micro-tidal systems in the Gulf of Mexico. Comparing tidal components in other geographic locations would produce very different results, both in the estimated amplitudes and in the dominant components.
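Gap-filling and forecasting both come down to passing new times to predict. A minimal sketch of the idea, assuming oce is installed; rather than the SWMP data, it fits tidem to a synthetic series containing only an M2 wave (so the object names and amplitudes here are illustrative, not from the analysis above), then predicts 30 days beyond the fitted record:

```r
library(oce)

# synthetic hourly series: mean level 1.5 m plus a pure M2 tide (12.42 h, 0.5 m)
t0 <- as.POSIXct('2013-01-01', tz = 'UTC')
times <- t0 + 3600 * seq(0, 24 * 60)   # 60 days, hourly
eta <- 1.5 + 0.5 * sin(2 * pi * as.numeric(times - t0, units = 'hours') / 12.42)

# fit the harmonic model, then predict at times outside the fitted record
sl <- as.sealevel(elevation = eta, time = times)
mod <- tidem(t = sl)
future <- t0 + 3600 * seq(24 * 60, 24 * 90)  # the following 30 days
pred <- predict(mod, newdata = future)
```

The same `predict(mod, newdata = ...)` call works for timestamps inside the record where observations are missing, which is how the model can patch gaps in a tidal time series.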


Here’s how to estimate the tide from an observed time series. The data are from SWMPr and the tidem model is from oce.


# clean input data, one hour time step, subset, fill gaps
dat <- qaqc(apadbwq) %>% 
  setstep(timestep = 60) %>% 
  subset(subset = c('2013-01-01 0:0', '2013-12-31 0:0'), select = 'depth') %>% 
  na.approx(maxgap = 1e6)

# get model
datsl <- as.sealevel(elevation = dat$depth, time = dat$datetimestamp)
mod <- tidem(t = datsl)

# add predictions to observed data
dat$Estimated <- predict(mod)

# plot
ggplot(dat, aes(x = datetimestamp, y = Estimated)) + 
  geom_line()
