visGWDB
A framework for groundwater-level informatics.
https://code.usgs.gov/map/gw/visGWDBmrva
Category: Hydrosphere
Sub Category: Freshwater and Hydrology
Keywords: generalized additive model, groundwater, hydrograph, statistics, support vector machine, water level
Repository metadata
The code collection of visGWDB is a framework for groundwater-level informatics. This visGWDBmrva repository is made to be a static demonstration for the Mississippi River Valley alluvial aquifer (MRVA) and support of other derivative products.
- Host: code.usgs.gov
- URL: https://code.usgs.gov/map/gw/visGWDBmrva
- Owner: map
- License: cc0-1.0
- Created: 2019-08-01T21:19:53.885Z (over 5 years ago)
- Default Branch: master
- Last Synced: 2024-10-29T21:00:08.020Z (6 months ago)
- Topics: generalized additive model, groundwater, hydrograph, statistics, support vector machine, water level
- Stars: 1
- Forks: 0
- Open Issues:
- Releases: 0
https://code.usgs.gov/map/gw/visGWDBmrva/blob/master/
# Source code in R to quality assure, plot, summarize, interpolate, and extend groundwater-level information, visGWDB—Groundwater-level informatics with demonstration for the Mississippi River Valley alluvial aquifer

#### Author: William H. Asquith, Ronald C. Seanor
#### Contributor: Virginia L. McGuire, Wade H. Kress
#### Point of contact: William H. Asquith (wasquith@usgs.gov)
#### Version: [1.0.6](https://code.usgs.gov/map/gw/visGWDBmrva/-/tags/v1.0.6)
#### Publication Year: 2019
#### Version Year: 2024
#### Digital Object Identifier (DOI): https://doi.org/10.5066/P9W004O6
#### USGS Information Product Data System (IPDS) no.: IP-108240 (internal agency tracking)

_Suggested Repository Citation:_ Asquith, W.H., Seanor, R.C., McGuire, V.L., and Kress, W.H., 2019, Source code in R to quality assure, plot, summarize, interpolate, and extend groundwater-level information, visGWDB—Groundwater-level informatics with demonstration for the Mississippi River Valley alluvial aquifer: U.S. Geological Survey software release, https://doi.org/10.5066/P9W004O6. [https://code.usgs.gov/map/gw/visGWDBmrva]

_Authors' [ORCID](https://orcid.org/) nos.:_ William H. Asquith, [0000-0002-7400-1861](https://orcid.org/0000-0002-7400-1861); Ronald C. Seanor, [0000-0001-5735-5580](https://orcid.org/0000-0001-5735-5580); Virginia L. McGuire, [0000-0002-3962-4158](https://orcid.org/0000-0002-3962-4158); Wade H. Kress, [0000-0002-6833-028X](https://orcid.org/0000-0002-6833-028X).

***

# INTRODUCTION

This README describes a quasi-static and independent repository of the **visGWDB** groundwater-level informatics software framework written in the _R_ language (R Core Team, 2022), which is a computer language freely available for many computer platforms. In particular, this README describes an instance of the **visGWDB** framework that is in this **visGWDBmrva** repository, which is approved for release (see `./inst/USGSapproval20190726.pdf`).
This software repository exists within a much larger repository on groundwater-level information processing and delivery for the Mississippi Alluvial Plain ([MAP](https://www.usgs.gov/tools/mississippi-alluvial-plain-map-regional-water-availability-study)) Regional Water Availability Study (accessed April 15, 2024 at https://www2.usgs.gov/water/lowermississippigulf/map/). The MAP project extent is shown below. The "mrva" in the repository name reflects its setup to demonstrate a workflow for the Mississippi River Valley alluvial aquifer (MRVA).
For the documentation herein, it is assumed that the reader has substantial familiarity with the _R_ language. Several add-on packages to the language are required. An authoritative list is provided at the end of this README; users can also test their system by sourcing the `./include/visGWDB_pkgs.R` script. If no errors are encountered, then the steps outlined below for the **visGWDB** framework should work. Again, this **visGWDBmrva** repository is relatively static (as of about December 27, 2019, and through at least April 15, 2024) and, more importantly, is a completely independent (stand-alone), fully functioning, _out-of-the-box_ instance of the code for demonstrating groundwater-level informatics for the MRVA.

This code is intended to support or accompany the following paper:

Asquith, W.H., Seanor, R.C., McGuire, V.L., and Kress, W.H., 2020, Methods to quality assure, plot, summarize, interpolate, and extend groundwater-level information—Examples for the Mississippi River Valley alluvial aquifer: Environmental Modelling and Software, v. 134, (2020), 104758, 19 p., https://doi.org/10.1016/j.envsoft.2020.104758.

The **visGWDB** framework represented by **visGWDBmrva** also has been used for the groundwater-level computations described in Killian and others (2019). Additional documentation of this software is identified by the [README](https://code.usgs.gov/map/gw/visGWDBmrva/-/blob/master/inst/doc/README.md) file. A potentially useful data preprocessor is the USGS approved **infoGW2visGWDB** framework located at https://code.usgs.gov/map/gw/infoGW2visGWDB.

# WHAT DOES THIS REPOSITORY PROVIDE?

The **visGWDBmrva** instance of the **visGWDB** framework supports groundwater-level informatics. By that statement, we mean that this repository provides extensive algorithms useful for understanding, documenting, and communicating the information content of arbitrarily large groundwater-level databases.
Through its demonstration for the MRVA, the repository provides visualization of provided data, provides well-to-well statistical summaries, provides trend statistics, constructs tables of outliers, tables of monthly means ([MONTHLY_ROLLUP](https://code.usgs.gov/map/gw/visGWDBmrva/-/blob/master/include/MONTHLY_ROLLUP.md)), and provides time-series modeling of the data. The aforementioned paper (Asquith and others, 2020) provides extensive discussion based on data for seven example wells out of nearly 19,000 wells. The framework is designed to generate hydrographs of well-by-well data in conjunction with depictions of water levels from nearby wells. Generalized additive models (GAMs) (a type of regression) and support vector machines (SVMs) (a type of machine learning) are constructed from the well-specific data. Also GAMs and SVMs are fit simultaneously to the neighboring data. These models collectively can be used for detection of outlier or anomalous data. Also these models can be used to estimate the water level statistically on the day of a real measurement. Such estimates are referred to as pseudo-observations. Another feature of the GAMs and SVMs is that estimates of water levels can be made on days without measurements, and this type of estimation is branded as "pozo" in the framework. (Pozo is Spanish for water well.) For example, continuous 28th of the month water levels can be estimated with uncertainty for the benefit of parameter estimation in numerical groundwater flow models. The hydrograph visualization also extends to depiction of metadata such as land-surface altitude, well bottom, and top and bottom of screened openings. These concepts are demonstrated through the production of output by the instructions given here and embodied by the `output20191018.zip` reference archive provided as well. 
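The GAM-based ideas in the paragraph above (pseudo-observations, outlier flagging, and estimates on days without measurements) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the water levels are synthetic, the model formula and basis size are arbitrary, and the code is not taken from the **visGWDB** scripts. Only the `mgcv` package and the 10-foot threshold (which mirrors the `min.outlier.abs.residual` setting listed later in this README) come from the repository documentation.

```r
library(mgcv)  # GAM fitting; mgcv appears in the session information of this README

set.seed(42)

# Synthetic monthly water levels (feet) for one hypothetical well:
# a slow decline, a seasonal cycle, and measurement noise.
dates <- seq(as.Date("2000-01-15"), as.Date("2018-12-15"), by = "month")
well <- data.frame(date = dates, t = as.numeric(dates))
well$level <- 120 - 0.002 * (well$t - well$t[1]) +
  2 * sin(2 * pi * as.numeric(format(dates, "%j")) / 365.25) +
  rnorm(nrow(well), sd = 0.5)
well$level[50] <- well$level[50] + 25  # inject one anomalous measurement

# Smooth of water level on time; the basis size k = 20 is an arbitrary choice
# and is not the framework's actual model specification.
fit <- gam(level ~ s(t, k = 20), data = well, method = "REML")

# Pseudo-observations: model estimates on the days of real measurements.
well$pseudo <- as.numeric(predict(fit, newdata = well))

# Flag measurements whose absolute residual exceeds a threshold in feet.
min.outlier.abs.residual <- 10
well$flagged <- abs(well$level - well$pseudo) > min.outlier.abs.residual

# Estimates with standard errors on the 28th of each month,
# that is, on days without measurements.
pozo_dates <- seq(as.Date("2000-01-28"), as.Date("2018-11-28"), by = "month")
pozo <- predict(fit, newdata = data.frame(t = as.numeric(pozo_dates)),
                se.fit = TRUE)
pozo_tab <- data.frame(date = pozo_dates,
                       estimate = as.numeric(pozo$fit),
                       se = as.numeric(pozo$se.fit))

print(well[well$flagged, c("date", "level", "pseudo")])
print(head(pozo_tab))
```

The sketch fits a single well only; the framework, as described above, also fits models to neighboring wells and uses SVMs alongside GAMs, neither of which is attempted here.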
In practice, the authors developed and used various instances of this framework between 2016 and 2020 for detection and listing of potentially anomalous data and other curiosities for the purposes of database reconciliation and enhancement. The authors use statistical output tables in the workflow for construction of potentiometric surface maps in other software. The framework, through adjustment of settings, can be used for many permutations to support inquiries by stakeholders about water levels.

# ACTUALLY USING THE REPOSITORY

## Basic Operational Steps

The script `visGWDB.R` is the portal to the **visGWDB** framework, although many subordinate scripts are executed when `visGWDB.R` is called. Seven steps to use this repository and produce base figures and output tables for a working manuscript by Asquith and others (2020) are described as follows. Postscripted notes also are given.

1. Start _RStudio_ (accessed May 6, 2019 at https://www.rstudio.com). This is the author-favored interface (wrapper) to the _R_ language; _R_ itself can be acquired from the R Project (accessed May 6, 2019 at https://www.r-project.org).
2. Open the `./visGWDB.R` file in _RStudio_. Through the menus of _RStudio_, the steps should look like: "File --> Open File --> visGWDB.R."
3. Change the working directory to the directory containing `visGWDB.R`. Through the menus of the application, the steps should look like: "Session --> Set Working Directory --> To Source File Location." The user can check the path with the command `getwd()` (get working directory). The path looks like `/Volumes/HARDDISK/Users/userid/visGWDB` (for a MacOS system).
4. Source the `./visGWDB.R` script. There should be a button labeled "Source" somewhere on the _RStudio_ interface, or "Source" is available in the "Code" menu. This script is not dependent on _RStudio_; other interactive interfaces to the _R_ language or even the command terminal could be used but are outside the scope herein.
(The "Source" button of _RStudio_ is seen in the top-central region of the file ./inst/www/ExampleRStudioSession.png screenshot.)
5. Look at the results in the `./output/` subdirectory. The `./READMEoutput.txt` file (located in the parent directory) describes the various output components that are written into the `./output/` subdirectory. This subdirectory can be deleted at any time; it will regenerate as needed, which is the reason the output README is positioned **outside** of the `./output/` subdirectory. The archive is initially delivered without this subdirectory. It is common for the authors to delete the `./output/` subdirectory whenever different control settings are used. Users could see slightly different results in the support vector machine outputs in user-generated PDFs versus those shown within the provided reference zip because of the nature of support vector machine algorithms. More details are provided in the `./READMEoutput.txt`.
6. The subdirectory `./include/` contains additional code absorbed by `./visGWDB.R` at runtime, and the script `./input/visGWDB_auxcode.R` contains code oriented around setting the time period for the monthly estimation of groundwater levels as described in the paper by Asquith and others (2020). This time period is for prediction and is unrelated to the time period of the input data itself, which is controllable from within the script `./include/visGWDB_filter.R`.
7. The `sites` variable in `./visGWDB.R` has been preset, as part of packaging this repository, to run only on the seven sites described in Asquith and others (2020). If the user comments out that line of code, the code will process all wells in the database (see `./input/GWmaster20190422.RData`) and the run time could be extremely long. (The script `./include/visGWDB_cmdline.R` provides support for splitting sequences of wells across additional processors, but that is beyond the scope herein.)
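For readers working from a plain _R_ console instead of the _RStudio_ menus, steps 3 through 5 might be sketched as follows, together with the package check mentioned in the introduction. The two helper functions (`check_visgwdb_pkgs` and `run_visgwdb`) are hypothetical names introduced here for illustration and are not part of the repository; only the file names `visGWDB.R`, `include/visGWDB_pkgs.R`, and the `output` subdirectory come from this README.

```r
# Hypothetical helper: source the package-loading script and report
# whether it completed without error.
check_visgwdb_pkgs <- function(repo_dir) {
  pkgs_script <- file.path(repo_dir, "include", "visGWDB_pkgs.R")
  if (!file.exists(pkgs_script)) stop("not found: ", pkgs_script)
  tryCatch({
    source(pkgs_script)
    TRUE
  }, error = function(e) {
    message("Package check failed: ", conditionMessage(e))
    FALSE
  })
}

# Hypothetical helper: steps 3 to 5 without the RStudio menus.
run_visgwdb <- function(repo_dir) {
  if (!file.exists(file.path(repo_dir, "visGWDB.R"))) {
    stop("visGWDB.R not found in ", repo_dir)
  }
  old_wd <- setwd(repo_dir)               # step 3: set the working directory
  on.exit(setwd(old_wd), add = TRUE)      # restore the previous directory on exit
  source("visGWDB.R")                     # step 4: source the portal script
  list.files("output", recursive = TRUE)  # step 5: list what was written
}

# Example call (the path is a placeholder for the local clone):
# if (check_visgwdb_pkgs("/path/to/visGWDBmrva")) {
#   outputs <- run_visgwdb("/path/to/visGWDBmrva")
#   print(outputs)
# }
```

Restoring the working directory on exit keeps the session usable for other work after a long run; whether the framework scripts tolerate being sourced from inside a function in this way has not been verified against the repository.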
**Postscript:** The zipped archive `./output20191018.zip` contains a subdirectory named `./output20191018/` and that directory contains a copy of the `./output/` subdirectory created on September 1, 2019 using the steps as outlined above. The `./output/` subdirectory itself is not provided in the distribution of this archive because the above steps will (should) create it.

**Postscript:** One status file is created when `./visGWDB.R` completes. This is file `./zzDONE_GWdone.txt` and is created in the same directory. The file could be useful in long computation times by checking for its presence or its contents. It is otherwise noninformative and can be deleted or just ignored. To clarify, the author, when using a remote login to a host machine, checks the presence of this file to detect whether a computation run has completed.

**Postscript:** An example session screenshot is shown in a portable network graphics (png) file named `./inst/www/ExampleRStudioSession.png`, which represents a single change to a logical variable (`manyPDFs` from `TRUE` to `FALSE`) that triggered graphical rendering inside the "Plots" tab inside the _RStudio_ interface in lieu of outside by the default `TRUE` (as in the archive) setting that produces portable document format (pdf) files in the `./output/pdfs/` subdirectory.

Lastly, the file `./READMEoutput.txt` describes the contents of the `./output/` subdirectory.

## Further Background

In order to operate the `./visGWDB.R` script, the user is responsible for installing the requisite _R_ packages as listed in the code `./include/visGWDB_pkgs.R`. Several important details now require addressing.

1. The code framework can fulfill many complex ideas related to information processing of water levels in groundwater databases. It is not restricted to just a single database or study area or even questions being asked of it. The authors above (esp.
Asquith, Seanor, and McGuire) are working (2016–2019+) with many other investigators of groundwater-level information in various aquifer systems.
2. As mentioned before, the framework is intended to be a quasi-static code base to provide an archival reference. This is a critically important piece of background information. The authors will, however, accommodate questions regarding **visGWDBmrva** in this immediate archive so as to support select publications such as that by Killian and others (2019) or Asquith and others (2020). **Algorithms herein, meaning restricted to this repository alone, are to be permanently static.**
4. A snapshot of the U.S. Geological Survey National Water Information System (NWIS) database for the Mississippi River Valley alluvial aquifer is provided in the file `./input/GWmaster20190422.RData`, within which exists a variable called `GWmaster` that is an _R_ data.frame (table). All processing is based on this data table, which is a snapshot on April 22, 2019. The columns therein are possibly mentioned in the aforementioned paper, are based verbatim on column naming conventions of NWIS with capitalization enforced, or are otherwise helpful for automated processing. The columns are described within the subdirectory `./input`.
5. The code herein is extensive. Further, reliance on many external packages to the _R_ language is needed. What follows is information about the author's computer at the time of testing and of writing this README. The author (W.H. Asquith) is also running _RStudio_ (Version 2022.03.3 Build 492; Mozilla/5.0 (Macintosh; Intel Mac OS X 13_6_6) AppleWebKit/537.36 (KHTML, like Gecko) QtWebEngine/5.12.10 Chrome/69.0.3497.128 Safari/537.36).
6. It is possible that particularly interested users might want to filter or subset their own data in the case that they have created a viable `GWmaster.RData` file.
Such advanced users might be interested in the logic, along with switches and settings, contained in the file `./include/visGWDB_filterInput.R`. For example, filtering can restrict input data to the year 2018; as a note therein, the `GWtrim` data frame is created from the `GWmaster` contained in the file `GWmaster.RData`.
7. Advanced users might be especially interested in the following settings in the file `./include/visGWDB_control.R` and the expanded comments (not shown below).

```{r}
min.outlier.abs.residual <- 10 # in feet
open.overlap.buffer <- 10 # in feet
n.nearest.wells <- 300
min.distance.inkm.for.nearest.wells <- 0
max.distance.inkm.for.nearest.wells <- 8
```

# R SESSION INFORMATION (Mon Apr 15 17:57:35 2024)

The software framework has been tested by the author using at least the following infrastructure:

```
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 13.6.6

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] crayon_1.5.2       sp_1.6-1           sf_1.0-13          zip_2.3.0
 [5] survival_3.5-5     RColorBrewer_1.1-3 mgcv_1.8-42        nlme_3.1-162
 [9] lmomco_2.5.0       kernlab_0.9-32     feather_0.3.5      diptest_0.76-0
[13] batch_1.1-5

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10          pillar_1.9.0         compiler_4.2.0       Lmoments_1.3-1
 [5] class_7.3-22         tools_4.2.0          goftest_1.2-3        lubridate_1.9.2
 [9] lifecycle_1.0.3      tibble_3.2.1         lattice_0.21-8       timechange_0.2.0
[13] pkgconfig_2.0.3      rlang_1.1.1          Matrix_1.5-4.1       DBI_1.1.3
[17] cli_3.6.1            rstudioapi_0.14      rgdal_1.6-7          dataRetrieval_2.7.13
[21] e1071_1.7-13         dplyr_1.1.2          generics_0.1.3       vctrs_0.6.2
[25] hms_1.1.3            classInt_0.4-9       grid_4.2.0           tidyselect_1.2.0
[29] glue_1.6.2           R6_2.5.1             fansi_1.0.4          magrittr_2.0.3
[33] splines_4.2.0        MASS_7.3-60          units_0.8-2          utf8_1.2.3
[37] KernSmooth_2.23-21   proxy_0.4-27
```

# REFERENCES

Asquith, W.H., Seanor, R.C., McGuire, V.L., and Kress, W.H., 2020, Methods to quality assure, plot, summarize, interpolate, and extend groundwater-level information—Examples for the Mississippi River Valley alluvial aquifer: Environmental Modelling and Software, v. 134, (2020), 104758, 19 p., https://doi.org/10.1016/j.envsoft.2020.104758.

Killian, C.D., Asquith, W.H., Barlow, J.R.B., Bent, G.C., Kress, W.H., Barlow, P.M., and Schmitz, D.W., 2019, Characterizing groundwater and surface-water interaction using hydrograph-separation techniques and groundwater-level data throughout the Mississippi Delta, USA: Hydrogeology Journal (2019), 13 p., (online), accessed August 2, 2019, at https://doi.org/10.1007/s10040-019-01981-6.

R Core Team, 2022, R—A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, version 4.2.0, accessed on April 22, 2022, at https://www.R-project.org.

U.S. Geological Survey, 2019, National Water Information System, Web interface, accessed April 22, 2019, at https://doi.org/10.5066/F7P55KJN.
Owner metadata
- Name: Mississippi Alluvial Plain
- Login: map
- Email:
- Kind: organization
- Description: Mississippi Alluvial Plain groundwater and surface water modeling efforts for decision support.
- Website:
- Location:
- Twitter:
- Company:
- Icon url: https://code.usgs.gov/uploads/-/system/group/avatar/296/MAP_MERAS.gif
- Repositories: 1
- Last synced at: 2024-09-27T15:15:30.321Z
- Profile URL: https://code.usgs.gov/map
Committers metadata
Last synced: 2 days ago
Total Commits: 209
Total Committers: 1
Avg Commits per committer: 209.0
Development Distribution Score (DDS): 0.0
Commits in past year: 0
Committers in past year: 0
Avg Commits per committer in past year: 0.0
Development Distribution Score (DDS) in past year: 0.0
Name | Email | Commits |
---|---|---|
William Asquith | w****h@u****v | 209 |
Committer domains:
- usgs.gov: 1
Issue and Pull Request metadata
Last synced: 1 day ago
Score: 0.0