# Welcome!

I am a physical oceanographer interested in how ocean water is mixed and transformed. I am currently working as a Research Scientist at the Bedford Institute of Oceanography in Halifax, Nova Scotia.

# A Makefile for knitr documents

One of the best things I’ve found about using R for all my scientific work is its powerful, easy-to-use facilities for generating dynamic reports, particularly with the knitr package. The seamless integration of text, code, and the resulting figures (or tables) is a major step toward fully reproducible research, and I’ve even found that it’s a great way of doing “exploratory” work, because it lets me keep my own notes and code in the same document.

Being a fan of a “Makefile” approach to working with R scripts, as well as an Emacs/ESS addict, I find the easiest way to automatically run and compile my knitr LaTeX documents is with a Makefile. Below is a template I adapted from here:

```make
all: pdf

MAINFILE  := **PUT MAIN FILENAME HERE**
RNWFILES  :=
RFILES    :=
TEXFILES  :=
CACHEDIR  := cache
FIGUREDIR := figures
LATEXMK_FLAGS :=
##### Explicit Dependencies #####
################################################################################
RNWTEX = $(RNWFILES:.Rnw=.tex)
ROUTFILES = $(RFILES:.R=.Rout)
RDAFILES= $(RFILES:.R=.rda)
MAINTEX = $(MAINFILE:=.tex)
MAINPDF = $(MAINFILE:=.pdf)
ALLTEX = $(MAINTEX) $(RNWTEX) $(TEXFILES)

# Dependencies
$(RNWTEX): $(RDAFILES)
$(MAINTEX): $(RNWTEX) $(TEXFILES)
$(MAINPDF): $(MAINTEX) $(ALLTEX)

.PHONY: pdf tex clean

pdf: $(MAINPDF)

tex: $(RDAFILES) $(ALLTEX)

%.tex:%.Rnw
	Rscript \
	  -e "library(knitr)" \
	  -e "knitr::opts_chunk[['set']](fig.path='$(FIGUREDIR)/$*-')" \
	  -e "knitr::opts_chunk[['set']](cache.path='$(CACHEDIR)/$*-')" \
	  -e "knitr::knit('$<','$@')"

%.R:%.Rnw
	Rscript -e "Sweave('$^', driver=Rtangle())"

%.Rout:%.R
	R CMD BATCH "$^" "$@"

%.pdf: %.tex
	latexmk -pdf $<

clean:
	-latexmk -c -quiet $(MAINFILE).tex
	-rm -f $(MAINTEX) $(RNWTEX)
	-rm -rf $(FIGUREDIR)
	-rm *tikzDictionary
	-rm $(MAINPDF)
```


# Making section plots with oce and imagep()

section objects in the oce package are a convenient way of storing a series of CTD casts together – indeed, the object name derives from the common name for such a series of casts collected from a ship during a single campaign.

At its heart, a section object is really just a collection of ctd objects, with some other metadata. The CTD stations themselves are stored as a list of ctd objects in the @data slot, like:

```
List of 124
 $ :Formal class 'ctd' [package "oce"] with 3 slots
 $ :Formal class 'ctd' [package "oce"] with 3 slots
 $ :Formal class 'ctd' [package "oce"] with 3 slots
 $ :Formal class 'ctd' [package "oce"] with 3 slots
 $ :Formal class 'ctd' [package "oce"] with 3 slots
 $ :Formal class 'ctd' [package "oce"] with 3 slots
 $ :Formal class 'ctd' [package "oce"] with 3 slots
 $ :Formal class 'ctd' [package "oce"] with 3 slots
  [list output truncated]
```


Just to prove it, we can make a standard ctd plot of one of them, by accessing it directly with the [[ accessor syntax. Let’s plot the 100th station:
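A minimal sketch of that step, assuming the object is the `section` dataset that ships with oce (which also holds 124 stations); the variable name `stn` is only illustrative:

```r
library(oce)

# Example section bundled with oce
data(section)

# Pull out the 100th station; this is one of the ctd objects
# held in the list section@data$station
stn <- section[["station", 100]]

# Standard multi-panel ctd plot (profiles, TS diagram, location map)
plot(stn)
```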

## Making nice plots of the sections themselves

The main advantage of a section object is to be able to quickly make plots summarizing all the data in the section. This is accomplished using the plot method for section objects, which you can read about by doing ?"plot,section-method". For example, to make a contour plot of the temperature:
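As a sketch, again assuming the `section` dataset bundled with oce, a temperature contour plot could be produced like this:

```r
library(oce)
data(section)

# Contours are the default rendering for section plots (ztype = "contour")
plot(section, which = "temperature")
```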

Ok, cool. But what about some colors? Use the ztype='image' argument!
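Something along these lines should give the colored version (same assumed `section` dataset as above):

```r
library(oce)
data(section)

# ztype = "image" fills the panel with colors instead of drawing contour lines
plot(section, which = "temperature", ztype = "image")
```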

## Finer control over the section plot

To get finer control over the section plot than is possible with the section plot() method, one trick I sometimes use is to extract the data I want from the section as a gridded matrix, and then plot the matrix directly using the imagep() function.

First, we “grid” the section so that all the stations share the same pressure levels:
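A minimal sketch of the gridding step, assuming the bundled `section` dataset and letting `sectionGrid()` pick its default pressure levels (an explicit vector can be passed through its `p` argument instead); `s`, `p1` and `p2` are illustrative names:

```r
library(oce)
data(section)

# Interpolate every station onto a common set of pressure levels
s <- sectionGrid(section)

# After gridding, any two stations report the same pressure vector
p1 <- s[["station", 1]][["pressure"]]
p2 <- s[["station", 2]][["pressure"]]
all.equal(p1, p2)
```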

Now, we can loop through the station fields, extracting the data as we go.
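The loop might look roughly like the following. This is a sketch under the same assumptions as before (bundled `section` dataset, default gridding); the names `stations`, `temp`, `nstation` and `np` are my own choices:

```r
library(oce)
data(section)
s <- sectionGrid(section)

stations <- s[["station"]]            # list of gridded ctd objects
nstation <- length(stations)
p <- stations[[1]][["pressure"]]      # common pressure levels
np <- length(p)

# Empty (all-NA) matrix: one row per station, one column per pressure level
temp <- array(NA_real_, dim = c(nstation, np))
for (i in seq_len(nstation)) {
    temp[i, ] <- stations[[i]][["temperature"]]
}
```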

Basically, what we’re doing here is creating an empty matrix, then filling each row with the data from the section stations. We can make a quick plot with imagep():
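For the quick plot, a sketch along these lines should work. To keep it simple it puts the station index on the x axis (an assumption of this example; along-section distance is the more natural choice) and uses `oceColorsJet` for the colors:

```r
library(oce)
data(section)
s <- sectionGrid(section)

# Build the station-by-pressure temperature matrix
stations <- s[["station"]]
p <- stations[[1]][["pressure"]]
temp <- array(NA_real_, dim = c(length(stations), length(p)))
for (i in seq_along(stations)) {
    temp[i, ] <- stations[[i]][["temperature"]]
}

# flipy = TRUE puts the surface at the top of the plot
imagep(seq_along(stations), p, temp, col = oceColorsJet, flipy = TRUE,
       xlab = "Station index", ylab = "Pressure [dbar]",
       zlab = "Temperature")
```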

Or we can do some fancier things, like use the colormap() function and plot some filled contours:
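A sketch of the fancier version follows. The break values (every 2 degrees from 0 to 28) and the use of station index on the x axis are assumptions of this example, not values from the original analysis:

```r
library(oce)
data(section)
s <- sectionGrid(section)

# Build the station-by-pressure temperature matrix
stations <- s[["station"]]
p <- stations[[1]][["pressure"]]
temp <- array(NA_real_, dim = c(length(stations), length(p)))
for (i in seq_along(stations)) {
    temp[i, ] <- stations[[i]][["temperature"]]
}

# Explicit map between temperature values and colors
cm <- colormap(temp, breaks = seq(0, 28, 2), col = oceColorsJet)

# Filled contours, with levels and colors taken from the colormap
imagep(seq_along(stations), p, temp, colormap = cm, flipy = TRUE,
       filledContour = TRUE,
       xlab = "Station index", ylab = "Pressure [dbar]",
       zlab = "Temperature")
```

Because the breaks live in `cm`, the same object can be reused for other panels so that colors mean the same thing everywhere.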

# Using the oce colormap function in R

When I talk to colleagues about why I use R as my language of choice for scientific data analysis, I typically point out all the advantages, and because I’m honest, the disadvantages.

Typically the biggest disadvantage, especially for those coming from the Java-GUI world of Matlab, is the non-interactive graphics. Now, I’ve managed to convince myself that I actually prefer making plots this way (because it forces me to script rather than noodling around with a mouse, the final plot is predictable, etc.), but there are always a few things that I wish were easier.

One of those is handling colors in “image” plots and in scatter plots. The former is usually handled pretty easily using the oce function imagep(..., col=oceColorsJet), but the latter tends to be trickier. There is no base R functionality for automatically coloring points by some other attribute. I believe this is relatively easy to do with ggplot2, but that of course requires using ggplot2 (nothing against ggplot2, it just really isn’t an option for me – perhaps the subject of a future blog post).

## The colormap() function

With that in mind, Dan and I set out to create a function that could be used to make an explicit “map” between colors and values to facilitate making plots, but also to ensure that the results of the plot are correct. The concept of a “colormap”, as implemented in Matlab, where the information connecting colors to values is inherent in the plot attributes, doesn’t exist in R. One can plot any colors one would like without thinking twice about whether they mean anything. On the one hand, this can be an advantage because it makes it easier to have multiple colormaps in a single figure. The downside is that using colors to represent numerical values requires some care.

The basic idea of colormap() is that it creates an object that connects a series of colors with values, which can be passed to various plotting functions to ensure that the color-mapping is done correctly. Probably the best way to illustrate the various options is through some examples. In most cases the colormap is communicated through the use of a “palette”, which is either drawn implicitly by the plotting function, or through an explicit call to oceDrawPalette().

## imagep() plots

The imagep() function is a tweaked and customizable version of the base image() function. It is used for making pseudo-color maps of matrix-style data. A nice example comes from the included argo dataset:
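A sketch of such a plot, assuming the `argo` dataset bundled with oce. The profiles are first interpolated onto a fixed set of pressure levels with `argoGrid()`; the levels in `pLevels` and the 0 to 20 color range are choices made for this example:

```r
library(oce)
data(argo)

# Put all profiles on common pressure levels so the field is a regular matrix
pLevels <- seq(0, 2000, by = 25)
g <- argoGrid(argo, p = pLevels)

# Colormap connecting temperature values to colors
cm <- colormap(zlim = c(0, 20), col = oceColorsJet)

# The temperature matrix is [level, profile]; transpose so time runs along x
imagep(g[["time"]], pLevels, t(g[["temperature"]]), colormap = cm,
       flipy = TRUE, ylab = "Pressure [dbar]", zlab = "Temperature")
```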

Pretty easy. But using colors with imagep() is pretty easy anyway, since the colormap is defined based on the input data and automatically scaled to match the palette.

## Using named GMT-style palettes

In creating colormap(), Dan and I were impressed with the color palettes available in the Generic Mapping Tools (GMT) software, and decided to implement a similar approach to defining custom colormaps. In addition, colormap() includes a number of “named” GMT palettes (see ?colormap), several of which are quite handy for plotting topography.
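As an illustration of a named palette, this sketch draws the `topoWorld` dataset bundled with oce using the `"gmt_globe"` colormap:

```r
library(oce)
data(topoWorld)

# Named GMT-style palette; the breaks are elevations in metres
cm <- colormap(name = "gmt_globe")

# Hand the breaks and colors to imagep() explicitly
imagep(topoWorld, breaks = cm$breaks, col = cm$col)
```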

## Conclusion

The colormap() function is pretty powerful, and as a result somewhat complex to use. I hope the above examples have helped shed some light on how to use oce to map colors to values consistently and reliably in plots.

# Calculating buoyancy frequency for argo/section objects using the apply() family

The most recent CRAN release of oce includes some nice new functionality for reading and converting argo objects (see http://www.argo.ucsd.edu/ for more information about the fantastic Argo float program). One question that arose out of this increased functionality was how to calculate $N^2$ (the square of the buoyancy, or Brunt-Väisälä, frequency) for such objects.

## Buoyancy frequency

The definition of $N^2$ is:

$$N^2 = -\frac{g}{\rho}\frac{\partial \rho}{\partial z}$$

where $g$ is the acceleration due to gravity, $\rho = \rho(z)$ is the fluid density, and $z$ is the vertical coordinate. Essentially $N^2$ describes the vertical variation of fluid density (also known as “stratification”).

Calculating $N^2$ for regular ctd objects is easily accomplished with the function oce::swN2(). A caution: readers are encouraged to read the documentation carefully, as the details of the actual calculation can have important consequences when applied to real ocean data.

## $N^2$ for section objects

For the case of a section object (which is essentially a collection of ctd stations), the most straightforward way to calculate $N^2$ is to use the lapply() function to “apply” the swN2() function to each of the stations in the object. An example:
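A minimal sketch of this approach, assuming the bundled `section` dataset; the stations are written back through the `@data` slot directly, and `sec` is an illustrative name:

```r
library(oce)
data(section)
sec <- section

# For each ctd station, compute N^2 with swN2() and store it in the
# station's data slot under the name "N2"
sec@data$station <- lapply(sec[["station"]], function(stn) {
    oceSetData(stn, "N2", swN2(stn))
})
```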

The line with the lapply() command takes the list of stations from the section object, and evaluates each of the resulting ctd objects using the oceSetData() function to add the result of swN2() back into the station @data slot.

If we wanted to make a nice plot of the result, we could do:
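One way this could look, as a sketch: it assumes that the section plot method accepts the name of a station data column (here `"N2"`) for `which`, and the color range of 0 to 1e-4 s^-2 is a guess chosen for illustration:

```r
library(oce)
data(section)
sec <- section

# Add N^2 to every station
sec@data$station <- lapply(sec[["station"]], function(stn) {
    oceSetData(stn, "N2", swN2(stn))
})

# Custom colormap for N^2
cm <- colormap(zlim = c(0, 1e-4), col = oceColorsJet)

# Image-style section plot using the colormap's breaks and colors
plot(sec, which = "N2", ztype = "image",
     zbreaks = cm$breaks, zcol = cm$col)
```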

where I’ve defined a custom colormap just for the fun of it.

## $N^2$ for argo objects

In an argo object, the default storage for the profiles is a matrix, rather than a list of ctd objects. To calculate $N^2$ and make a plot, the simplest approach would be to use as.section() to convert the argo object to a section class object and then do as above. However, having the field as a matrix allows for greater flexibility in plotting, e.g. using the imagep() function, so one might want to calculate $N^2$ in a manner consistent with the default argo storage format.

Let’s load some example data from the argo dataset included in oce:
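A sketch of the loading and gridding step. The explicit pressure levels in `pLevels` are an assumption of this example (calling `argoGrid()` without `p` lets oce choose the levels); `ga` is an illustrative name:

```r
library(oce)
data(argo)

# Interpolate every profile onto the same pressure levels
pLevels <- seq(0, 2000, by = 20)
ga <- argoGrid(argo, p = pLevels)

# Fields are now matrices with one row per level and one column per profile
dim(ga[["temperature"]])
```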

Note that I’ve gridded the argo fields so the matrices are at consistent pressure levels. Now we create a function that can be applied to each of the matrix columns, to calculate $N^2$ from a single column of the density matrix:
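Such a function could be as simple as the sketch below, which uses a first difference and pads with `NA` so the output has the same length as the input. The name `N2`, its arguments, and the choice of the column mean as the reference density are all assumptions of this example; `swN2()` is more careful about smoothing:

```r
# rho: density profile [kg/m^3]; z: vertical coordinate [m], positive up
N2 <- function(rho, z, g = 9.81) {
    rho0 <- mean(rho, na.rm = TRUE)        # reference density for this profile
    drhodz <- c(NA, diff(rho) / diff(z))   # first difference, padded to full length
    -g / rho0 * drhodz
}

# Quick check on a synthetic, linearly stratified profile
z <- -seq(0, 100, by = 10)
rho <- 1025 - 0.01 * z                     # density increases with depth
n2 <- N2(rho, z)
```

In practice the input should be potential density, since in-situ density also increases with depth through compression.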

Now we use the above function N2 to calculate buoyancy frequency and add it back to the original object, like:
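Putting the pieces together might look like this sketch. Assumptions: explicit pressure levels, potential density from `swSigmaTheta()` with the UNESCO equation of state, depth approximated as 1 m per dbar, and a simple first-difference `N2()` helper defined inline:

```r
library(oce)
data(argo)
pLevels <- seq(0, 2000, by = 20)
ga <- argoGrid(argo, p = pLevels)

# Simple N^2 for one profile (hypothetical helper)
N2 <- function(rho, z, g = 9.81) {
    -g / mean(rho, na.rm = TRUE) * c(NA, diff(rho) / diff(z))
}

# Potential density matrix with the same shape as the salinity matrix
S <- ga[["salinity"]]
temp <- ga[["temperature"]]
pmat <- matrix(pLevels, nrow = nrow(S), ncol = ncol(S))
sigTheta <- swSigmaTheta(S, temp, pmat, eos = "unesco")
dim(sigTheta) <- dim(S)                 # make sure the result is a matrix
rho <- 1000 + sigTheta

# Apply N2() down each column (MARGIN = 2), then store the whole matrix
z <- -pLevels                           # rough approximation: 1 dbar ~ 1 m
n2 <- apply(rho, 2, N2, z = z)
ga <- oceSetData(ga, "N2", n2)
```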

Note that because of the difference between the “list” and “matrix” approach, the oceSetData() occurs outside of the apply(). Also note the second argument in the apply() call, which specifies to apply the N2() function along the 2nd dimension of the density matrix, i.e. along columns.

Now, let’s make a sweet plot of the N2 field using imagep()!
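A self-contained sketch of the whole chain ending in the plot. The pressure levels, the first-difference `N2()` helper, the 1 dbar ~ 1 m approximation, and the quantile-based color limits are all assumptions of this example:

```r
library(oce)
data(argo)
pLevels <- seq(0, 2000, by = 20)
ga <- argoGrid(argo, p = pLevels)

# Simple N^2 for one profile (hypothetical helper)
N2 <- function(rho, z, g = 9.81) {
    -g / mean(rho, na.rm = TRUE) * c(NA, diff(rho) / diff(z))
}

# Potential density on the gridded levels
S <- ga[["salinity"]]
pmat <- matrix(pLevels, nrow = nrow(S), ncol = ncol(S))
sigTheta <- swSigmaTheta(S, ga[["temperature"]], pmat, eos = "unesco")
dim(sigTheta) <- dim(S)

# N^2 for every profile (columns), stored back in the object
n2 <- apply(1000 + sigTheta, 2, N2, z = -pLevels)
ga <- oceSetData(ga, "N2", n2)

# Limit the color scale to the bulk of the data to tame noisy differences
zlim <- as.numeric(quantile(n2, c(0.02, 0.98), na.rm = TRUE))
imagep(ga[["time"]], pLevels, t(ga@data$N2), flipy = TRUE,
       col = oceColorsJet, zlim = zlim,
       ylab = "Pressure [dbar]", zlab = expression(N^2))
```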

A thing of stratified beauty, if I do say so myself.

# R tutorial at the 2016 AGU Ocean Sciences Meeting

Today I had the pleasure of presenting a talk about R during one of the tutorial sessions at the AGU Ocean Sciences Meeting in New Orleans. I made a deliberate point of saying that my main message was more like: “R is really cool, and here’s why”, rather than: “You should all stop using Matlab”. Being divisive doesn’t help anybody work better.

I was quite surprised at the number of people who raised their hand when I asked if they use R as their main analysis tool – perhaps about a third of the (surprisingly full) room!

Anyway, see below for the pdf of my talk and the Rnw source file. Note that the Rnw has links to some external images so you won’t be able to knit() it for yourself, but feel free to use the slide content.

Richards-OSM-R-tutorial.pdf

Richards-OSM-R-tutorial.Rnw