Katell Hamon edited this page Jun 6, 2023 · 8 revisions

Practical 4: How to link VMS and logbook data to 'spatial' grids

Introduction

In general, logbook data are recorded at the ICES rectangle level. VMS data, however, are not tied to any predefined spatial scale (apart from the positional uncertainty inherent in the data themselves, which can be a limiting factor). In many instances, though, the output of an analysis must be delivered at some spatial scale. If that scale is the ICES rectangle and you are working with logbook data, you're lucky, but most often it requires a bit more work.

Here we will provide some tools to help you out with such jobs.

Converting between spatial notations

What if you are required to report VMS data analyses at the ICES rectangle scale? Will you go through some complicated procedure to find the closest ICES rectangle to each of your data points, or simply use the functionality built into VMStools? Let's see how that would work.

#Start afresh!
library(vmstools)
rm(list=ls()) #This removes all objects from R

#load the data
data(tacsat); data(eflalo)

#Add ICES rectangles to your tacsat data, based on longitude and latitude.
tacsat$LE_RECT <- ICESrectangle(tacsat)

#That was easy, can we also convert back?
pos <- ICESrectangle2LonLat(tacsat$LE_RECT,midpoint=T)

tacsatCompare <- cbind(tacsat,pos)

head(tacsatCompare)
>        VE_COU VE_REF  SI_LATI  SI_LONG    SI_DATE  SI_TIME SI_SP SI_HE LE_RECT SI_LATI SI_LONG
658237 Atlantis     10 51.44772 3.583731 04/05/1800 13:32:00     0    25    31F3   51.25     3.5
658238 Atlantis     10 51.44067 3.583847 04/05/1800 15:28:00     0    25    31F3   51.25     3.5
662888 Atlantis     10 51.44074 3.595529 04/05/1800 17:22:00     0    25    31F3   51.25     3.5
662889 Atlantis     10 51.44315 3.586862 04/05/1800 19:18:00     0    25    31F3   51.25     3.5
662890 Atlantis     10 51.44351 3.586948 04/05/1800 21:12:00     0    25    31F3   51.25     3.5
662891 Atlantis     10 51.44811 3.593958 04/05/1800 23:08:00     0    25    31F3   51.25     3.5
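Under the hood this conversion is simple arithmetic: latitude is split into 0.5-degree bands numbered upwards from 36N, and longitude into 1-degree columns within 10-degree letter blocks. Below is a minimal base-R sketch of that arithmetic; the name `icesRect` is made up here for illustration, it is only valid roughly between 40W and 30E, and in practice you should use `ICESrectangle()` from vmstools.

```r
# Sketch of the arithmetic behind an ICES rectangle code.
# NOTE: illustrative only, valid roughly between 40W and 30E;
# use ICESrectangle() from vmstools in practice.
icesRect <- function(lon, lat){
  latBand <- as.integer(floor((lat - 36) * 2) + 1)    # 0.5-degree bands: 01 = 36-36.5N
  lonCol  <- LETTERS[floor((lon + 40) / 10) + 2]      # 10-degree letter blocks: F = 0-10E
  lonDig  <- as.integer(floor(lon) %% 10)             # 1-degree digit within the block
  sprintf("%02i%s%i", latBand, lonCol, lonDig)
}
icesRect(3.583731, 51.44772)   # "31F3", matching the first tacsat record above
```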

Exercise 1

  1. When executing the function 'ICESrectangle2LonLat' a warning was given. What does this warning mean?
  2. Can you find out the locations of the points not converted back to longitude,latitude? Does this make sense?
Answer:

  1. It means that some of the rectangles could not be converted into longitude - latitude positions.

  2. The trick here is that converting a GPS position to an ICES rectangle is a simple calculation. Even for GPS positions outside the ICES region the calculation can still be applied (but it results in odd rectangle names). The conversion back, however, relies on more than a simple calculation and therefore throws a warning for those odd rectangle names. Below you'll find where these points are located:
require(ggplot2); require(dplyr)
idx <- which(is.na(pos$SI_LONG))
map <- map_data("world") %>% filter(long > -23 & long < 30 & lat > -50 & lat < 60)
ggplot(map, aes(long, lat)) +
  geom_polygon(aes(group = group), fill = "darkgrey") +
  coord_quickmap(xlim=range(tacsat$SI_LONG[idx]),ylim=range(tacsat$SI_LATI[idx])) +
  geom_point(data=tacsat[idx,],aes(x=SI_LONG,y=SI_LATI),colour=2)

We can, of course, do similar things with the logbook data:

pos <- ICESrectangle2LonLat(eflalo$LE_RECT,midpoint=T)
eflalop <- cbind(eflalo,pos)

#Let's find the ICES area associated with the ICES rectangle
#We need the longitude, latitude position though from the ICES rectangle midpoint from the step before
data(ICESareas)
eflalop$LE_AREA <- ICESareas$Area_Full[ICESarea(eflalop,ICESareas)]

Obviously, we can do the same for the VMS data, so try it yourself! Another type of conversion available is the C-square spatial notation. This notation was developed in Australia and has many handy features, e.g. it can be used for the entire globe. It is now implemented as the default FishFrame spatial scale. The C-square notation is flexible: although it is a standard naming convention, it can take many different resolutions. Let's convert some VMS data into C-square notation.

#For this example, take the first six positions only
somePos <- head(tacsat[,c("SI_LONG","SI_LATI")])

CSquare(somePos$SI_LONG,somePos$SI_LATI,degrees=10)
CSquare(somePos$SI_LONG,somePos$SI_LATI,degrees=5)
CSquare(somePos$SI_LONG,somePos$SI_LATI,degrees=0.5)
CSquare(somePos$SI_LONG,somePos$SI_LATI,degrees=0.1)
CSquare(somePos$SI_LONG,somePos$SI_LATI,degrees=0.05)
CSquare(somePos$SI_LONG,somePos$SI_LATI,degrees=0.01)

The C-square code becomes longer as the resolution increases.
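The first four characters of any C-square code identify the 10-degree global square: a quadrant digit (1 = NE, 3 = SE, 5 = SW, 7 = NW), one digit for the tens of latitude and two for the tens of longitude. The sketch below shows just this first level; the name `csq10` is made up for illustration, and `CSquare()` from vmstools implements the full multi-resolution code.

```r
# Sketch of the first four characters (the 10-degree square) of a C-square.
# Illustrative only; use CSquare() from vmstools for the full code.
csq10 <- function(lon, lat){
  quadrant <- ifelse(lat >= 0,
                     ifelse(lon >= 0, 1, 7),   # northern hemisphere: NE=1, NW=7
                     ifelse(lon >= 0, 3, 5))   # southern hemisphere: SE=3, SW=5
  sprintf("%i%i%02i", as.integer(quadrant),
          as.integer(floor(abs(lat)/10)),
          as.integer(floor(abs(lon)/10)))
}
csq10(3.58, 51.44)   # "1500": NE quadrant, 50-60N, 0-10E
```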

Analyse on a defined spatial scale

So far, we've only converted VMS and logbook data into different spatial notations. A next step would be to analyse the data based on the converted spatial scale.

#Let's calculate the catches of plaice in the total eflalo dataset at the ICES rectangle level
aggregate(eflalop$LE_KG_PLE,by=list(eflalop$LE_RECT),FUN=sum,na.rm=T)

#This would be the picture associated with that catch distribution
plotTools(eflalop,level="ICESrectangle",xlim=c(-5,10),ylim=c(48,62),zlim=NULL,log=FALSE,color=NULL,control.tacsat=list(clm=NULL),control.eflalo=list(clm=c("LE_KG_PLE")))

#Now at a slightly finer scale, the VMS data
tacsatp <- mergeEflalo2Tacsat(eflalo,tacsat)
tacsatp$IDX       <- 1:nrow(tacsatp)

#A quick and dirty rule which says that all speeds between 1 and 6 are fishing
tacsatFilter      <- filterTacsat(tacsatp,st=c(1,6),hd=NULL,remDup=T)
tacsatp$SI_STATE  <- 0
tacsatp$SI_STATE[tacsatFilter$IDX] <- 1

#I needed the activity distinction because I only want to distribute catches or values over fishing pings!
tacsatEflalo      <- splitAmongPings(tacsat=tacsatp,eflalo=eflalo,
variable="all",level=c("day","ICESrectangle","trip"),conserve=T)

#Let's count the number of rows in tacsatEflalo, and compare with the original tacsatp file
nrow(tacsatp)
nrow(tacsatEflalo)
table(tacsatp$SI_STATE) #does this make sense?
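The idea behind splitAmongPings can be illustrated with a toy example: each trip's logbook catch is spread over that trip's fishing pings (the real function additionally matches on day and ICES rectangle, and can conserve catch that cannot be allocated). The data below are entirely made up:

```r
# Toy illustration of the splitting idea behind splitAmongPings().
pings <- data.frame(trip    = c("A","A","A","B","B"),
                    fishing = c(1, 1, 0, 1, 1))     # 1 = fishing ping
catch <- c(A = 300, B = 100)                        # kg per trip from the logbook
nFish <- tapply(pings$fishing, pings$trip, sum)     # fishing pings per trip
pings$kg <- ifelse(pings$fishing == 1,
                   catch[pings$trip] / nFish[pings$trip], 0)
pings   # trip A: 150 kg on each fishing ping; trip B: 50 kg on each
```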

#Add a grid notation
tacsatEflalo$LE_SPAT <- CSquare(tacsatEflalo$SI_LONG,tacsatEflalo$SI_LATI,degrees=0.1)
aggregate(tacsatEflalo$LE_KG_PLE,by=list(tacsatEflalo$LE_SPAT),FUN=sum,na.rm=T)

#In a picture, it looks like this
x11()
plotTools(tacsatEflalo,level="gridcell",xlim=c(-5,10),ylim=c(48,62),zlim=NULL,log=F,gridcell=c(0.1,0.1),color=NULL,control.tacsat=list(clm="LE_KG_PLE"))

Observe the difference! This is a much more detailed picture of the plaice catch distribution. Note the outline of the Plaice Box, which was gradually closed to large beam trawlers between 1989 and 1994.

Exercise 2

  1. Aggregate COD catches on a CSquare notation of 0.5 degrees
  2. If you are handy with R, can you make your own plot? (else, use plotTools to do it for you)
Answer:

  1. The code to do this is rather similar to the aggregation of plaice in the example above. This is how it's done:

#Add 0.5 degree notation to tacsatEflalo
tacsatEflalo$LE_SPAT <- CSquare(tacsatEflalo$SI_LONG,tacsatEflalo$SI_LATI,degrees=0.5)
#Aggregate the COD catches
aggregate(tacsatEflalo$LE_KG_COD,by=list(tacsatEflalo$LE_SPAT),FUN=sum,na.rm=T)

  2. Have a look at Practical 9, where a template for a plotting routine is given.

Define your own grid

Although the spatial grids just discussed may be suitable on many occasions, there are still instances where you need to define your own grid. In VMStools, almost all calculations assume positions are given as longitude and latitude on the WGS84 datum, and we would no longer be able to use these tools if we moved to a km or nm notation. There are, however, plenty of functions available to convert between projections.

VMStools relies on a secondary package for spatial data storage and analysis: the sf package. If you plan to do a lot of spatial analyses in R in combination with VMS data (basically the type of analyses you'll see from here onwards), I recommend reading up on how to use sf.

Let's define our first custom grid

#Define your ranges in longitude (x) and latitude (y)
x <- seq(-5,5,length.out=10)
y <- seq(50,55,length.out=10)
xrange <- range(x,na.rm=T)
yrange <- range(y,na.rm=T)

#returns a grid with 13 cells in the x-direction and 13 in the y-direction
#(the grid extends one cell beyond the given ranges on each side)
Grid <- createGrid(xrange,yrange,1,0.5,type="Grid") #just like an ICES rectangle grid

#Let's see what is stored in the 'Grid' object
Grid
> Geometry set for 169 features                                                                         
Geometry type: POLYGON                                                                                
Dimension:     XY                                                                                     
Bounding box:  xmin: -6 ymin: 49.5 xmax: 7 ymax: 56                                                   
CRS:           NA                                                                                     
First 5 geometries:                                                                                   
POLYGON ((-6 49.5, -5 49.5, -5 50, -6 50, -6 49...                                                    
POLYGON ((-5 49.5, -4 49.5, -4 50, -5 50, -5 49...                                                    
POLYGON ((-4 49.5, -3 49.5, -3 50, -4 50, -4 49...                                                    
POLYGON ((-3 49.5, -2 49.5, -2 50, -3 50, -3 49...                                                    
POLYGON ((-2 49.5, -1 49.5, -1 50, -2 50, -2 49...

st_coordinates(Grid) 
#- or get the centroids
st_coordinates(st_centroid(Grid))

plot(Grid); axis(1); axis(2)
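If you prefer to build such a grid with sf directly, st_make_grid does something very similar; note that, unlike createGrid above, it covers exactly the bounding box you give it rather than extending one cell beyond the ranges:

```r
library(sf)

# Build a 1 x 0.5 degree grid over the same ranges with plain sf.
bb  <- st_bbox(c(xmin = -5, ymin = 50, xmax = 5, ymax = 55), crs = st_crs(4326))
grd <- st_make_grid(st_as_sfc(bb), cellsize = c(1, 0.5))
length(grd)   # 100 cells: 10 in the x-direction, 10 in the y-direction
plot(grd); axis(1); axis(2)
```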

Now you've defined your own grid, but obviously you want to link, for example, your VMS data points to this grid. In other words, you want to know into which grid cell each of your VMS positions falls. Why is this handy? Primarily because you can then calculate indicators on a user-defined grid (e.g. kg of cod in a certain area), and it allows you to combine different data sources within the same grid! But more on that later... First, let's see into which grid cells the VMS positions fall.

#Get some tacsat positions, and link them to the Grid
# We first need to convert our position data to an 'sf' object as well

somePos_sf <- st_as_sf(somePos,coords=c("SI_LONG","SI_LATI"))
st_over(somePos_sf,Grid)
> 49 49 49 49 49 49
#This output really means that all 6 positions in 'somePos' are located in grid cell 49 of st_coordinates(st_centroid(Grid))
st_coordinates(st_centroid(Grid))[49,]
st_coordinates(Grid)[which(st_coordinates(Grid)[,4]==49),]

#Now for the full tacsat dataset, and summarize the result immediately
table(st_over(st_as_sf(tacsat,coords=c("SI_LONG","SI_LATI")),Grid))

Exercise 3

  1. Which grid cell (in coordinate notation) is associated with the highest number of VMS records?
  2. Plot it onto the grid as in the example above
Answer:

  1. We first make a summary of all gridcells present in the tacsat dataset, take the highest value, and then find the corresponding lon-lat position.

#Get a summary of all gridcells present in the tacsat dataset
sort(table(st_over(st_as_sf(tacsat,coords=c("SI_LONG","SI_LATI")),Grid))) 

#Apparently, gridcell 76 has 7524 records associated
st_coordinates(st_centroid(Grid))[76,] # 4 longitude by 50 latitude

  2. ggplot() + geom_sf(data=Grid) + geom_sf(data=Grid[76], aes(fill="76"))

All you actually need to aggregate over a certain grid is the 'st_over' function and the definition of your own grid. Once you get the overlay result back, you can use it to summarize your data. An example:

#Convert tacsat to an 'sf' points object
coordTacsat <- st_as_sf(tacsat,coords=c("SI_LONG","SI_LATI"))

#Which grid cells do match with the tacsat points?
index <- st_over(coordTacsat,Grid)
tacsat$INDEX <- index

#Do something wacky for a change: take the mean heading in each grid cell
aggregate(tacsat$SI_HE,by=list(tacsat$INDEX),FUN=mean,na.rm=T)
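To map such an aggregation, you can attach the per-cell values back to the cell centroids. Below is a self-contained sketch with made-up stand-ins for the grid and overlay index; in practice you would use 'Grid' and 'index' from the steps above.

```r
library(sf)

# Made-up stand-ins for 'Grid' and the overlay 'index' from above.
grd   <- st_make_grid(st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 2, ymax = 2))),
                      cellsize = c(1, 1))            # 4 toy grid cells
index <- c(1, 1, 2, 4)                               # cell id per (toy) VMS point
value <- c(10, 20, 30, 40)                           # e.g. catch per point

cellMeans <- tapply(value, index, mean)              # mean per occupied cell
centres   <- st_coordinates(st_centroid(grd))[as.numeric(names(cellMeans)), ]
cbind(centres, mean = cellMeans)                     # lon, lat, value per cell
```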
