Open Sustainable Technology

A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.

Browse accepted projects | Review proposed projects | Propose new project | Open Issues


This R package is comprised of functions that facilitate the identification of areas of importance for biodiversity, such as Key Biodiversity Areas (KBAs), based on individual tracking data.

Last synced: about 11 hours ago
JSON representation

Repository metadata

An R package that facilitates the identification of animal aggregations of potential conservation significance based on individual tracking data. Key functions include utilities to identify and summarise individual foraging trips, estimate utilisation distributions, and overlay distributions to identify important aggregation areas.



output: github_document

# track2KBA

```{r, include = FALSE}
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/",
out.width = "100%"


[![Coverage Status](](

This package is comprised of functions that facilitate the identification of areas of importance for biodiversity, such as Key Biodiversity Areas (KBAs), based on individual tracking data. For further detail concerning the method itself, please refer to this [paper]( by Beal et al. (2021).

Key functions include utilities to estimate individual core use areas, the level of representativeness of the tracked sample, and overlay individual distributions to identify important sites at the population level. Other functions assist in plotting the results, formatting your data set, and splitting and summarizing individual foraging trips.

## Installation
You can download the stable version from CRAN with:

``` {r, eval = FALSE}

Or you can download the development version from [GitHub]( with:

``` {r, eval = FALSE}
install.packages("devtools", dependencies = TRUE)
devtools::install_github("BirdLifeInternational/track2kba", dependencies=TRUE) # development version - add argument 'build_vignettes = FALSE' to speed it up

### Example
Now we will use tracking data collected at a seabird breeding colony to illustrate a `track2KBA` workflow for identifying important sites. It is important to note that the specific workflow you use (i.e., which functions and in what order) will depend on the species of interest and the associated data at hand.

First, in order for the data to work in `track2KBA` functions, we can use the `formatFields` function to format the important data columns needed for analysis. These are: a DateTime field, Latitude and Longitude fields, and an ID field (i.e. individual animal, track, or trip).

```{r example, eval= FALSE, message=FALSE, warning = FALSE, include=T}
library(track2KBA) # load package

# ?boobies # for some background info on the example data set

dataGroup <- formatFields(
dataGroup = boobies,
fieldID = "track_id",
fieldDate = "date_gmt",
fieldTime = "time_gmt",
fieldLon = "longitude",
fieldLat = "latitude"



If your data come from a central-place foraging species (i.e. one which makes trips out from a centrally-located place, such as a nest in the case of a bird), you can use `tripSplit` to split up the data into discrete trips.

In order to do this, you must identify the location(s) of the central place(s) (e.g. colony-center, or nest sites).

```{r message=FALSE, eval= FALSE,}

# here we know that the first points in the data set are from the colony center
colony <- dataGroup %>%
Longitude = first(Longitude),
Latitude = first(Latitude)


Our *colony* dataframe tells us where trips originate from. Then we can set some parameters to decide what constitutes a trip. To do that we should use our understanding of the movement ecology of the study species. In this case we know our seabird travels out to sea on the scale of tens of kilometers, so we set *innerBuff* (the minimum distance from the colony) to 3 km, and *duration* (minimum trip duration) to 1 hour. *returnBuff* can be set further out in order to catch incomplete trips, where the animal began returning, but perhaps due to device failure the full trip wasn't captured.

Optionally, we can set *rmNonTrip* to TRUE which will remove the periods when the animals were not on trips. The results of `tripSplit` can be plotted using `mapTrips` to see some examples of trips.

```{r tripSplit, eval= FALSE, warning = FALSE, message=FALSE, include=T, fig.height=7, fig.width=7}


trips <- tripSplit(
dataGroup = dataGroup,
colony = colony,
innerBuff = 3, # kilometers
returnBuff = 10,
duration = 1, # hours
rmNonTrip = TRUE

mapTrips(trips = trips, colony = colony)

```{r tripSplit graphic, echo=FALSE, out.height='80%', out.width='80%', fig.align="center"}
knitr::include_graphics("man/figures/tripSplit-chunk-3-1.png", dpi=50)

Then we can summarize the trip movements, using `tripSummary`. First, we can filter out data from trips that did not return to the vicinity of the colony (i.e. within *returnBuff*), so they don't skew the estimates.

```{r tripSummary, eval= FALSE, message=FALSE, warning=FALSE}
trips <- subset(trips, trips$Returns == "Yes" )

sumTrips <- tripSummary(trips = trips, colony = colony)


Now that we have an idea how the animals are moving, we can start with the process of estimating their space use areas, and identifying potentially important sites for the population!

`track2KBA` uses Kernel Density Estimation (KDE) to produce space use estimates for each individual track. In order for these to be accurate, we need to transform the tracking data to an equal-area projection. We can use the convenience function `projectTracks` to perform this projection. We can select between an azimuthal or cylindrical projection, and decide whether to center the projection on the data itself. Custom-centering is generally a good idea for quick analyses as this will minimize distortion, however it is important to remember that the resulting projection will be data specific. So if you remove even one track and re-analyze, the projection will differ between datasets. For formal analysis, the best solution is to find a standard projection that is appropriate for your study region.

```{r projectTracks, eval= FALSE, warning = FALSE, message=F}
tracks <- projectTracks( dataGroup = trips, projType = 'azim', custom=TRUE )

`findScale` provides options for setting the all-important smoothing parameter in the KDE. This parameter decisions is of the utmost importance, as it determines the scale at which the tracking locations will be 'smoothed' into an estimate of the probability of use of space by the animal. `findScale` calculates candidate smoothing parameter values using several different methods.

If we know our animal uses an area-restricted search (ARS) strategy to locate prey, then we can set the `scaleARS=TRUE`. This uses First Passage Time analysis to identify the spatial scale at which area-restricted search is occuring, which may then be used as the smoothing parameter value.

```{r findScale, eval= FALSE, warning = FALSE, message=F}
hVals <- findScale(
tracks = tracks,
scaleARS = TRUE,
sumTrips = sumTrips)


The other values provided by `findScale` are more simplistic methods of calculating the smoothing parameter. `href` is the canonical reference method, and relates to the number of points in the data and their spatial variance. `mag` is the log of the average foraging range (`med_max_dist` in the *sumTrips* output); this methods only works for central-place foragers.

Next, we must select a smoothing parameter value. To inform our decision, we ought to use our understanding of the species' movement ecology to guide our decision about what scale make sense. That is, from the `findScale` output, we want to avoid using values which may under- or over-represent the area used by the animals while foraging.

Once we have chosen a smoothing value, we can produce KDEs for each individual, using `estSpaceUse`. By default this function isolates the core range of each track (i.e. the 50% utilization distribution, or where the animal spends about half of its time) which is a commonly used standard (Lascelles et al. 2016). However, another quantile can be chose using the `levelUD` argument, or the full utilization distritbution can be returned using `polyOut=FALSE`.

The resulting KDEs can be plotted using mapKDE, which if `polyOut=TRUE` shows each tracks's core range in a different color.

*Note*: here we might want to remove the trip start and end points that fall within the *innerBuff* (i.e. 3 km) we set in `tripSplit`, so that they don't skew the at-sea distribution towards to colony.

```{r estSpaceUse, eval= FALSE, warning = FALSE, message = FALSE, include = TRUE, fig.width=5, fig.height=4, dpi=300}
tracks <- tracks[tracks$ColDist > 3, ] # remove trip start and end points near colony

KDE <- estSpaceUse(
tracks = tracks,
scale = hVals$mag,
levelUD = 50,
polyOut = TRUE

mapKDE(KDE = KDE$UDPolygons, colony = colony)

```{r estSpaceUse graphic, echo=FALSE, out.height='80%', out.width='80%', fig.align="center"}
knitr::include_graphics("man/figures/estSpaceUse-1.png", dpi=50)
At this step we should verify that the smoothing parameter value we selected is producing reasonable space use estimates, given what we know about our study animals. Are the core areas much larger than expected? Much smaller? If so, consider using a different value for the `scale` parameter.

The next step is to estimate how representative this sample of animals is of the population. That is, how well does the variation in space use of this sample of tracks encapsulate variation in the wider population? To do this we can use the `repAssess` function. This function repeatedly samples a subset of track core ranges, averages them together, and quantifies how many points from the unselected tracks fall within this combined core range area. This process is run across the range of the sample size, and iterated a chosen number of times.

To do this, we need to supply Utilization Distributions to `repAssess` (e.g., the output of `estSpaceUse`) and the tracking data. We can choose the number of times we want to re-sample at each sample size by setting the `iteration` argument. The higher the number the more confident we can be in the results, but the longer it will take to compute.

```{r repAssess, eval= FALSE,'hide'}
repr <- repAssess(
tracks = tracks,
KDE = KDE$KDE.Surface,
levelUD = 50,
iteration = 1,
bootTable = FALSE)

The output is a dataframe, with the estimated percentage of representativeness given in the `out` column.

The relationship between sample size and the percent coverage of un-tested animals' space use areas (i.e. *Inclusion*) is visualized in the output plot seen below.

By quantifying this relationship, we can estimate how close we are to an information asymptote. Put another way, we have estimated how much new space use information would be added by tracking more animals. In the case of this seabird dataset, we estimate that ~98% of the core areas used by this population are captured by the sample of 39 individuals. Highly representative!

```{r repAssess graphic, echo=FALSE, out.height='80%', out.width='80%', fig.align="center"}
knitr::include_graphics("man/figures/repAssess-1.png", dpi=100)

Now, using `findSite` we can identify areas where animals are overlapping in space and delineate sites that meet some criteria of importance. Using the core area estimates of each individual track we can calculate where they overlap. Then, we estimate the proportion of the larger population in a given area by adjusting our overlap estimate based on the degree of representativeness.

Here, if we have population size estimates, we can include this value (using the `popSize` argument) to estimate the number of individuals using a space, which can then use to compare against population-level importance criteria (e.g. KBA criteria). If we don't have population size estimates to provide, `findSite` this will output a proportion of the population instead.

If you desire polygon output of the overlap areas, instead of a gridded surface, you can indicate this using the `polyOut` argument.

```{r echo=F}
repr <- data.frame(out = 98)

```{r findSite, eval= FALSE}
Site <- findSite(
KDE = KDE$KDE.Surface,
represent = repr$out,
levelUD = 50,
popSize = 500, # 500 individual seabirds breed one the island
polyOut = TRUE


If we specified `polyOut=TRUE`, then the output will be of Simple Features class, which allows us to easily take advantage of the `ggplot2` plotting syntax to make an attractive map using `mapSite`!

```{r plot Site_sf, eval= FALSE}
Sitemap <- mapSite(Site, colony = colony)
## in case you want to save the plot
# ggplot2::ggsave("KBAmap", device="png")
```{r boobies graphic1, echo=FALSE, out.height='100%', out.width='100%', fig.width=5, fig.height=4, fig.align="center", dpi=300}
knitr::include_graphics("man/figures/plot_KBA_sf.png", dpi=50)

This map shows the number or proportion of individual animals in the population overlapping in space. The red lines indicate the 'potential site'; that is, the areas used by a significant proportion of the local population, given the representativeness of the sample of tracked individuals. In this case, since representativeness is >90%, any area used by 10% or more of the population is considered important (see Lascelles et al. 2016 for details). The orange dot is the colony location and the black line is the coastline.

Note: it is possible to set the threshold of importance at the population level yourself, using the `thresh` argument. Just be aware that this is a crucial threshold parameter that will need justifying if you are to eventually recommend a site for formal acknowledgement as an important site for conservation.

Then, we can combine all the polygons within the 'potentialSite' area, and use, for example, the maximum number of individuals present in that area to assess whether it may merits identification as a Key Biodiversity Area according to the KBA standard.

```{r site2assess, eval= FALSE}
potSite <- Site %>% dplyr::filter(.data$potentialSite==TRUE) %>%
max_animals = max(na.omit(N_animals)), # maximum number of animals aggregating in the site
min_animals = min(na.omit(N_animals)) # minimum number using the site


If in `findSite` we instead specify `polyOut=FALSE`, our output will be a spatial grid of animal densities, with each cell representing the estimated number, or percentage of animals using that area. So this output is independent of the representativness-based importance threshold.

```{r plot Site SPixDF, eval= FALSE}

mapSite(Site, colony = colony)

```{r boobies graphic2, echo=FALSE, out.height='70%', out.width='70%', fig.width=5, fig.height=4, fig.align="center", dpi=300}
knitr::include_graphics("man/figures/KBA_sp_plot.png", dpi=50)

This plot shows the minimum estimated number of birds using the space around the breeding island.


### Package reference
If you use any functions in this package for your work, please use the following citation:

Beal, M., Oppel, S., Handley, J., Pearmain, E. J., Morera-Pujol, V., Carneiro, A. P. B., Davies, T. E., Phillips, R. A., Taylor, P. R., Miller, M. G. R., Franco, A. M. A., Catry, I., Patrício, A. R., Regalla, A., Staniland, I., Boyd, C., Catry, P., & Dias, M. P. (2021). track2KBA: An R package for identifying important sites for biodiversity from tracking data. Methods in Ecology and Evolution, 12(12), 2372-2378.

### Example data reference
Oppel, S., Beard, A., Fox, D., Mackley, E., Leat, E., Henry, L., Clingham, E., Fowler, N., Sim, J., Sommerfeld, J., Weber, N., Weber, S., Bolton, M., 2015. *Foraging distribution of a tropical seabird supports Ashmole’s hypothesis of population regulation. Behav Ecol Sociobiol 69, 915–926.*

### Other references
Lascelles, B. G., Taylor, P. R., Miller, M. G. R., Dias, M. P., Oppel, S., Torres, L., Hedd, A., Corre, M. L., Phillips, R. A., Shaffer, S. A., Weimerskirch, H., & Small, C. (2016). Applying global criteria to tracking data to define important areas for marine conservation. Diversity and Distributions, 22(4), 422–431.

### Acknowledgements
Thanks to Annalea Beard for kindly sharing these example data for use in the package.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 766417.

Owner metadata

GitHub Events

Last Year

Committers metadata

Last synced: 27 days ago

Total Commits: 399
Total Committers: 10
Avg Commits per committer: 39.9
Development Distribution Score (DDS): 0.253

Commits in past year: 26
Committers in past year: 5
Avg Commits per committer in past year: 5.2
Development Distribution Score (DDS) in past year: 0.462

Name Email Commits
Bealhammar m****l@i****t 298
Steffen s****l@g****m 75
Bealhammar m****8@g****m 14
Virginia Morera-Pujol m****a@g****m 4
Jonathan Handley j****y@g****m 2
Lizzie Pearmain 4****r 2
Lizzie Pearmain l****n@g****m 1
Lizzie Pearmain l****n@b****g 1
DavisVaughan d****s@r****m 1
Jonathan Handley 4****H 1

Committer domains:

Issue and Pull Request metadata

Last synced: 1 day ago

Total issues: 39
Total pull requests: 11
Average time to close issues: about 1 year
Average time to close pull requests: about 1 month
Total issue authors: 11
Total pull request authors: 5
Average comments per issue: 3.18
Average comments per pull request: 1.64
Merged pull request: 8
Bot issues: 0
Bot pull requests: 0

Past year issues: 3
Past year pull requests: 1
Past year average time to close issues: N/A
Past year average time to close pull requests: 7 days
Past year issue authors: 2
Past year pull request authors: 1
Past year average comments per issue: 3.0
Past year average comments per pull request: 1.0
Past year merged pull request: 1
Past year bot issues: 0
Past year bot pull requests: 0

More stats:

Top Issue Authors

  • steffenoppel (18)
  • MartinBeal (11)
  • VirginiaMorera (2)
  • allisonglider (1)
  • clairemas0n (1)
  • halpinlr (1)
  • kanead (1)
  • lizziepear (1)
  • pecard (1)
  • rsbivand (1)
  • tommyclay (1)

Top Pull Request Authors

  • steffenoppel (7)
  • DavisVaughan (1)
  • MartinBeal (1)
  • VirginiaMorera (1)
  • Jono-H (1)

Top Issue Labels

  • enhancement (1)

Top Pull Request Labels

Package metadata track2KBA

Identifying Important Areas from Animal Tracking Data

  • Homepage:
  • Documentation:
  • Licenses: LGPL-3
  • Latest release: 1.1.1 (published 5 months ago)
  • Last Synced: 2024-02-27T18:00:40.573Z (1 day ago)
  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 269 Last month
  • Docker Downloads: 21,157
  • Rankings:
    • Docker downloads count: 0.582%
    • Forks count: 12.169%
    • Stargazers count: 17.391%
    • Average: 19.743%
    • Dependent repos count: 23.865%
    • Dependent packages count: 28.657%
    • Downloads: 35.791%
  • Maintainers (1)


  • R >= 2.10 depends
  • Matching * imports
  • adehabitatHR * imports
  • dplyr * imports
  • foreach * imports
  • geosphere * imports
  • ggplot2 * imports
  • lubridate * imports
  • magrittr * imports
  • maps * imports
  • maptools * imports
  • methods * imports
  • move * imports
  • purrr * imports
  • raster * imports
  • rgdal >= 1.5 imports
  • rgeos * imports
  • rlang * imports
  • sf >= 0.7 imports
  • sp >= 1.4 imports
  • tidyr * imports
  • adehabitatLT * suggests
  • doParallel * suggests
  • knitr * suggests
  • parallel * suggests
  • rmarkdown * suggests
  • tinytest * suggests
.github/workflows/R-CMD-check.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact main composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite
.github/workflows/test-coverage.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite

Score: 14.914142846432386