A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.

spocc

An R package to query and collect species occurrence data from many sources.
https://github.com/ropensci/spocc

Category: Biosphere
Sub Category: Biodiversity Data Access and Management

Keywords

antweb api bison data ebird ecoengine gbif idigbio inaturalist obis occurrence r r-package rstats species species-occurrence spocc vertnet

Keywords from Contributors

routes cycle genome geocoding ecology antweb-api specimen-records air-pollutants climate noaa

Last synced: about 14 hours ago

Repository metadata

Species occurrence data toolkit for R

README.md

spocc (SPecies OCCurrence)

Docs: http://docs.ropensci.org/spocc/

At rOpenSci, we have been writing R packages to interact with many sources of species occurrence data, including GBIF, VertNet, iNaturalist, and eBird. Other databases exist as well and can be pulled in. spocc is an R package to query and collect species occurrence data from many sources. The goal is to create a seamless search experience across data sources, as well as unified outputs across data sources.

spocc currently interfaces with seven major biodiversity repositories:

  1. Global Biodiversity Information Facility (GBIF) (via rgbif)
    GBIF is a government-funded open data repository with several partner organizations and the express goal of providing access to data on Earth's biodiversity. The data are made available by a network of member nodes, coordinating information from various participant organizations and government agencies.

  2. iNaturalist
    iNaturalist provides access to crowdsourced citizen science data on species observations.

  3. VertNet (via rvertnet)
    Similar to rgbif (see above), VertNet provides access to more than 80 million vertebrate records spanning a large number of institutions and museums, primarily covering four major disciplines (mammalogy, herpetology, ornithology, and ichthyology).

  4. eBird (via rebird)
    eBird is a database developed and maintained by the Cornell Lab of Ornithology and the National Audubon Society. It provides real-time access to checklist data, data on bird abundance and distribution, and community reports from birders.

  5. iDigBio (via ridigbio)
    iDigBio facilitates the digitization of biological and paleobiological specimens and their associated data. It houses specimen data and provides access to it via RESTful web services.

  6. OBIS
    OBIS (Ocean Biogeographic Information System) allows users to search marine species datasets from all of the world's oceans.

  7. Atlas of Living Australia
    ALA (Atlas of Living Australia) contains information on all the known species in Australia aggregated from a wide range of data providers: museums, herbaria, community groups, government departments, individuals and universities; it contains more than 50 million occurrence records.

The inspiration for this comes from users requesting a more seamless experience across data sources, and from our work on a similar package for taxonomy data (taxize).
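
As a minimal sketch of the cross-source search described above, the snippet below wraps spocc's `occ()` and `occ2df()` functions in a small helper. The helper name `fetch_occurrences`, its defaults, and the example species are illustrative assumptions, not part of the package. Calling it requires spocc to be installed and network access, and some sources (eBird, for example) may also need an API key.

```r
# Hypothetical helper (not part of spocc): query several sources at once
# and return one combined data frame.
fetch_occurrences <- function(species, sources = c("gbif", "inat"), limit = 25) {
  # occ() sends the same query to every source named in `from`
  res <- spocc::occ(query = species, from = sources, limit = limit)
  # occ2df() combines the per-source results into a single data frame
  spocc::occ2df(res)
}

# Example call, left commented out because it contacts remote APIs:
# df <- fetch_occurrences("Accipiter striatus")
# head(df)
```

Keeping the query in one function makes it easy to swap the list of sources without changing downstream code.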

BEWARE: In cases where you request data from multiple providers, especially when including GBIF, there could be duplicate records since many providers' data eventually ends up with GBIF. See ?spocc_duplicates, after installation, for more.
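
One simple way to screen for such duplicates, sketched here in base R, is to treat records as identical when the species name, coordinates, and date all match regardless of provider. The data frame below is made up for illustration; its column names are assumed to resemble a combined occurrence table, and exact matching is a simplification, since real duplicates can differ in coordinate precision or name formatting.

```r
# Hypothetical records shaped like a combined occurrence table;
# the column names and values are illustrative, not real query output.
records <- data.frame(
  name      = c("Accipiter striatus", "Accipiter striatus", "Accipiter striatus"),
  longitude = c(-97.25, -97.25, -122.40),
  latitude  = c(32.90, 32.90, 37.80),
  date      = as.Date(c("2020-05-01", "2020-05-01", "2020-06-15")),
  prov      = c("gbif", "inat", "gbif"),
  stringsAsFactors = FALSE
)

# Treat rows as duplicates when name, coordinates, and date all match,
# regardless of which provider returned them.
key_cols <- c("name", "longitude", "latitude", "date")
is_dup <- duplicated(records[, key_cols])

# Keep the first occurrence of each record and drop later repeats.
deduped <- records[!is_dup, ]
```

Here the second row repeats the first under a different provider, so `deduped` keeps two of the three rows.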

Learn more

spocc documentation: <docs.ropensci.org/spocc/>

Contributing

See CONTRIBUTING.md

Installation

Stable version from CRAN

install.packages("spocc", dependencies = TRUE)

Or the development version from GitHub

install.packages("remotes")
remotes::install_github("ropensci/spocc")
library("spocc")

Make maps

All mapping functionality is now in a separate package, mapr (formerly known as spoccutils), to make spocc easier to maintain. mapr is available on CRAN.

Committers metadata

Last synced: 5 days ago

Total Commits: 844
Total Committers: 8
Avg Commits per committer: 105.5
Development Distribution Score (DDS): 0.129

Commits in past year: 4
Committers in past year: 2
Avg Commits per committer in past year: 2.0
Development Distribution Score (DDS) in past year: 0.25

Name | Email | Commits
Scott Chamberlain | m****s@g****m | 735
Karthik Ram | k****m@g****m | 64
hannahlowens | h****s@g****m | 23
Maëlle Salmon | m****n@y****e | 12
Edmund Hart | e****t@g****m | 5
Jeroen Ooms | j****s@g****m | 2
David LeBauer | d****r@a****u | 2
Stefan Widgren | s****n@g****m | 1
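
The summary figures can be reproduced from the committer table above. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it is consistent with the reported values.

```r
# Commit counts copied from the committer table above
commits <- c(735, 64, 23, 12, 5, 2, 2, 1)

# Total commits and the average per committer
total_commits <- sum(commits)
avg_per_committer <- total_commits / length(commits)

# Assumed definition: DDS = 1 - (top committer's commits / total commits)
dds <- 1 - max(commits) / total_commits

round(avg_per_committer, 1)
round(dds, 3)
```

Under that assumption, a score near 0 means a single committer accounts for almost all commits, which matches the table: one contributor made 735 of the 844 commits.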


Issue and Pull Request metadata

Last synced: 1 day ago

Total issues: 242
Total pull requests: 23
Average time to close issues: 2 months
Average time to close pull requests: about 6 hours
Total issue authors: 41
Total pull request authors: 7
Average comments per issue: 2.5
Average comments per pull request: 0.83
Merged pull requests: 22
Bot issues: 0
Bot pull requests: 0

Past year issues: 2
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 2
Past year pull request authors: 0
Past year average comments per issue: 0.0
Past year average comments per pull request: 0
Past year merged pull requests: 0
Past year bot issues: 0
Past year bot pull requests: 0

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/ropensci/spocc

Top Issue Authors

  • sckott (159)
  • emhart (13)
  • karthik (12)
  • maelle (10)
  • timcdlucas (4)
  • AMBarbosa (4)
  • jamiemkass (3)
  • Martin-Jung (2)
  • cboettig (2)
  • mgaynor1 (2)
  • tphilippi (1)
  • remsamp (1)
  • Pakillo (1)
  • jenmark (1)
  • laeserman (1)

Top Pull Request Authors

  • sckott (10)
  • karthik (4)
  • maelle (3)
  • hannahlowens (3)
  • jessjaco (1)
  • dlebauer (1)
  • stewid (1)

Top Issue Labels

  • bug (48)
  • feature (15)
  • question (11)
  • maps (8)
  • docs (7)
  • datasource (7)
  • idigbio (7)
  • cleaning (5)
  • metadata (2)
  • dedup (2)
  • tests (1)

Top Pull Request Labels

  • docs (1)
  • feature (1)

Package metadata

conda-forge.org: r-spocc

  • Homepage: https://github.com/ropensci/spocc (devel), https://ropensci.github.io/spocc/ (user manual)
  • Licenses: MIT
  • Latest release: 1.2.0 (published over 4 years ago)
  • Last Synced: 2025-04-25T12:08:20.060Z (1 day ago)
  • Versions: 5
  • Dependent Packages: 1
  • Dependent Repositories: 1
  • Rankings:
    • Dependent repos count: 24.103%
    • Dependent packages count: 28.954%
    • Average: 29.626%
    • Forks count: 32.318%
    • Stargazers count: 33.127%

Dependencies

DESCRIPTION cran
  • crul * imports
  • data.table * imports
  • jsonlite * imports
  • lubridate * imports
  • rbison * imports
  • rebird * imports
  • rgbif * imports
  • ridigbio * imports
  • rvertnet * imports
  • tibble * imports
  • utils * imports
  • wellknown * imports
  • whisker * imports
  • taxize * suggests
  • testthat * suggests
  • vcr * suggests
.github/workflows/test-sp-sf.yaml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc master composite
  • r-lib/actions/setup-r master composite
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite

Score: 7.67786350067821