Welcome!

I am a physical oceanographer interested in how ocean water is mixed and transformed. I am currently working as Research Scientist at the Bedford Institute of Oceanography in Halifax, Nova Scotia.

Recent Posts:

oce package version 0.9-18 is released!

Today we released a new version of oce, and it has been uploaded to CRAN. Only the source version is available as of the time of writing, but binary versions for all platforms should become available in the next few days. As always, the best way to install the package is to do:

install.packages("oce")

at an R prompt.

Then you can do stuff like:

library(oce)
## Loading required package: methods
## Loading required package: gsw
library(ocedata)
## Loading required package: testthat
data(levitus)
data(coastlineWorld)
cm <- colormap(levitus$SSS, col=oceColorsSalinity, breaks=seq(30, 37, 0.5))
drawPalette(colormap=cm, drawTriangles = TRUE)
mapPlot(coastlineWorld, projection='+proj=stere +lat_0=90 +lon_0=-60',
        longitudelim=c(-180, 180), latitudelim=c(50, 90), col='grey')
with(levitus, mapImage(longitude, latitude, SSS, colormap=cm, filledContour = TRUE))
mapPolygon(coastlineWorld, col='grey')
mapGrid()
title('Sea Surface Salinity')

plot of chunk unnamed-chunk-2

New features

The previous version of oce was uploaded to CRAN about 9 months ago. In the meantime, we’ve fixed lots of bugs and added even more improvements. A quick look at the NEWS file gives a summary of the enhancements:

0.9-18
- improve plot.coastline() and mapPlot()
- add support for G1SST satellite
- all objects now have metadata items for units and flags
- ctdTrim() method renamed: old A and B are new A; old C is new B
- support more channels and features of rsk files
- as.adp() added
- convert argo objects to sections
- makeSection() deprecated; use as.section() instead
- read.adp.rdi() handles Teledyne/RDI vsn 23.19 bottom-track data
- geodXyInverse() added; geod functions now spell out longitude etc
- read.odf() speeded up by a factor of about 30
- add colour palettes from Kristen Thyng's cmocean Python package
- as.oce() added
- rename 'drifter' class as 'argo' to recognize what it actually handles
- add oceColorsViridis()
- interpBarnes() has new argument 'pregrid'
- binMean2D() has new argument 'flatten'
- data(topoWorld) now has longitude from -179.5 to 180
- ODF2oce() added
- read.odf() handles more data types
- read.adp.rdi() reads more VmDas (navigational) data
- ITS-90 is now the default temperature unit
- ctd objects can have vector longitude and latitude
- logger class renamed to rsk
- bremen class added
- coastlineCut() added
- rgdal package used instead of local PROJ.4 source code
- mapproj-style map projections eliminated

Some of the best additions (in my opinion) are:

  • addition of a units metadata field for objects
  • more tools for working with argo objects (which have been renamed from drifter)
  • new color palettes, particularly the cmocean palettes from python
  • renaming of the logger class to rsk for data from RBR instruments
  • lots of new features for rsk objects, including ability to convert to ctd objects with as.ctd()
  • more robust projection handling using the rgdal package
  • coastlineCut() for helping with UHL (ugly horizontal lines) in map projections

Vignette

The vignette has also been updated somewhat, using an Rmarkdown source and with some new examples.

Help us out!

You can help! If you find any bugs or have any requests, please open an Issue on the Github development page:

https://github.com/dankelley/oce/issues

oce has benefited immensely from some great requests recently, and we’d like to keep the momentum going!

R/oce at AGU Ocean Sciences 2016

If you’re going to be at the AGU Ocean Sciences meeting in New Orleans in a couple weeks, make sure you come by to check out the R/oce tutorial I’ll be doing. It’s on Wednesday February 24th at 3:00pm in room R03. See you there!

An R function to shift vectors by a specified lag

Quite of bit of my work involves looking at “shifts” between two time series. There are lots of reasons why shifts are interesting, including such things as:

  • phase differences in the tides at two different locations,

  • physical separation between sensors on a profiling instrument, and

  • clock drifts between two logging sensors.

To accomplish the actual shifting of the vectors (I’m not going to discuss here how to determine the amount by which the series should be shifted, since that depends on the parameters of the problem), I created the following function:

shift <- function(x, lag) {
    n <- length(x)
    xnew <- rep(NA, n)
    if (lag < 0) {
        xnew[1:(n-abs(lag))] <- x[(abs(lag)+1):n]
    } else if (lag > 0) {
        xnew[(lag+1):n] <- x[1:(n-lag)]
    } else {
        xnew <- x
    }
    return(xnew)
}

The function takes as input the vector x and a specified integer lag, and shifts the input series by that amount. NA’s are added to either the beginning or the end (depending on the sign of lag) to pad the shifted vector to be the same length as the input. Note that lag is defined so that a positive lag shifts x “to the right”, i.e. moves values to a higher index.

Back to blogging

I’ve always intended to keep a better blog, partly for the online presence, and partly for the record of things I learn as I muddle my way through computer work and scientific research. My first real attempt, at Coded Ocean, was a bit of a failure – partly for the normal reasons (lazy, etc), but I believe also because ultimately the WordPress format didn’t suit my style very well. Too much moving a mouse and clicking, manually uploading figures and images, and a frustratingly buggy Markdown renderer that often required me to write raw html in my posts.

Since learning about Jekyll, particularly since my friend and colleague (and former supervisor) Dan Kelley set up his own Jekyll blog, it seems like it’s much more along the lines of the kind of blog that would work for me. Specifically, Jekyll:

Blog details

The guts of this website are based off of the knitr-hyde sample blog, which is a great starting point for a R/knitr website and blog, which is in turn built off of the great poole/hyde template. The source code is hosted on GitHub (of course).

Using the scan() function in R to read weirdly formatted data

I am writing some code to parse a weird data format, and using scan() to suck in everything first. Basically, it’s csv-style lines, but some lines have a different number of fields and are for different things — imaging CTD data interspersed with system messages, where the line is identified by the very first field. Something like:

GPS,20150727T120000,-10.1,12.2
MESSAGE, 20150727T120005,Begin descent
CTD,20150727120100,1,25,35
CTD,20150727120200,10,20,34
CTD,20150727120400,100,10,33
MESSAGE,20150727T121000,Begin ascent
CTD,20150727121500,100,10,33
CTD,20150727121600,90,12,33.5
etc ...

Anyway, when I was just reading in the CTD fields, everything was fine, but when I started trying to parse the MESSAGE fields, I found that scan() was doing something unexpected with the spaces in the message field, and producing a char vector like:

"GPS,20150727T120000,-10.1,12.2"
"MESSAGE, 20150727T120005,Begin" 
"descent"
"CTD,20150727120100,1,25,35"
...

Basically, scan() was treating the space between “Begin” and “descent” as a delimiter (as well as the carriage returns).

Anyway, after much attempting to interpret the man page, and trying different things, I discovered that

scan(con, character(), sep='\n')

would suck in the entire line as a character vector, which is what I wanted.

Using the R apply() family with oce objects

Introduction

In the oce package, the various different data formats are stored in consistently structured objects. In this post, I’ll explore a way to access elements of multiple oce objects using the R lapply(), from the apply family of functions.

Example with a ctd object

The objects always contain three fields (or “slots”): metadata, data, and processingLog. The layout of the object can be visualized using the str() command, like:

library(oce)
## Loading required package: methods
## Loading required package: gsw
data(ctd)
str(ctd)

which produces something like:

Formal class 'ctd' [package "oce"] with 3 slots
  ..@ metadata     :List of 26
  .. ..$ header                  : chr [1:42] "* Sea-Bird SBE 25 Data File:"
  .. ..$ type                    : chr "SBE"
  .. ..$ conductivityUnit        : chr "ratio"
  .. ..$ temperatureUnit         : chr "IPTS-68"
  .. ..$ systemUploadTime        : POSIXct[1:1], format: "2003-10-15 11:38:38"
  .. ..$ station                 : chr "Stn 2"
  .. ..$ date                    : POSIXct[1:1], format: "2003-10-15 11:38:38"
  .. ..$ startTime               : POSIXct[1:1], format: "2003-10-15 11:38:38"
  .. ..$ latitude                : num 44.7
  .. ..$ longitude               : num -63.6
  ..@ data         :List of 9
  .. ..$ scan         : int [1:181] 130 131 132 133 134 135 136 137 138 139 ...
  .. ..$ time         : num [1:181] 129 130 131 132 133 134 135 136 137 138 ...
  .. ..$ pressure     : num [1:181] 1.48 1.67 2.05 2.24 2.62 ...
  .. ..$ depth        : num [1:181] 1.47 1.66 2.04 2.23 2.6 ...
  .. ..$ temperature  : num [1:181] 14.2 14.2 14.2 14.2 14.2 ...
  .. ..$ salinity     : num [1:181] 29.9 29.9 29.9 29.9 29.9 ...
  .. ..$ temperature68: num [1:181] 14.2 14.2 14.2 14.2 14.2 ...
  ..@ processingLog:List of 2
  .. ..$ time : POSIXct[1:5], format: "2015-08-18 19:22:36" "2015-08-18 19:22:36" ...
  .. ..$ value: chr [1:5] "create 'ctd' object" "ctdAddColumn(x = res, column = swSigmaTheta(res@data$salinity,     res@data$temperature, res@data$pressure), name = "sigmaThet"| __truncated__ "read.ctd.sbe(file = file, processingLog = processingLog)" "converted temperature from IPTS-69 to ITS-90" ...

(where I’ve trimmed a few lines out just to make it shorter).

For a single object, there are several ways to access the information contained in the object. The first (and generally recommended) way is to use the [[ accessor — for example if you wanted the temperature values from a ctd object you would do

T <- ctd[['temperature']]

Another way is to access the element directly, by using the slot and list syntax, like:

T <- ctd@data$temperature

The disadvantage to the latter is that it requires knowledge of exactly where the desired field is in the object structure, and is brittle to downstream changes in the oce source.

Working with multiple objects

Frequently, especially with CTD data, it is common to have to work with a number of individual ctd objects — usually representing different casts. One way of organizing such objects, particularly if they share a common instrument, or ship, or experiment etc, is to collect them into a list.

For example, we could loop through a directory of individual cast files (or extract multiple casts from one file using ctdFindProfiles()), and append each one to a list like:

files <- dir(pattern='*.cnv')
casts <- list()
for (ifile in 1:length(files)) {
    casts[[ifile]] <- read.oce(files[ifile])
}

If we summarize the new casts list, we can see that it’s filled with ctd objects:

str(casts, 1) # the "1" means just go one level deep

Extracting fields from multiple objects at once

Say we want to extract all the temperature measurements from each object in our new list? How could we do it?

The brute force approach would be to loop through the list elements, and append the temperature field to a vector, maybe something like:

T_all <- NULL
for (i in 1:length(casts)) {
    T_all <- c(T_all, casts[[i]][['temperature']])
}

But in R, there’s a more elegant way — lapply()!

T_all <- unlist(lapply(casts, function(x) x[['temperature']]))