Section 7 Summary of Testing
7.1 Data Integrity
We screened the data for outliers by summarizing and visualizing the raw data, and assessed whether those outliers needed to be removed. We consulted other researchers familiar with the RIDB datasets to confirm outliers and other anomalies in the data.
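The screening step described above can be sketched in base R. This is a minimal illustration rather than the project's actual script: the `reservations` data frame, its column names, the helper `flag_outliers()`, and the choice of the 1.5 × IQR rule are all assumptions made for the example.

```r
# Hypothetical reservation records; column names are illustrative only.
reservations <- data.frame(
  order_id   = 1:8,
  party_size = c(2, 4, 3, 5, 2, 60, 4, 3)
)

# Numeric summary of the raw values.
summary(reservations$party_size)

# Flag values more than k * IQR beyond the quartiles (Tukey's rule).
flag_outliers <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  spread <- q[[2]] - q[[1]]
  x < q[[1]] - k * spread | x > q[[2]] + k * spread
}

reservations$is_outlier <- flag_outliers(reservations$party_size)

# Inspect flagged rows before deciding whether to keep or remove them.
reservations[reservations$is_outlier, ]
```

Flagging rather than deleting keeps the decision to remove a record separate from its detection, which leaves room for the kind of expert confirmation described above.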
Additionally, we documented the percent loss from data wrangling to ensure that our cleaning and wrangling of the data were reasonable.
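As a rough sketch of how such a percent loss might be computed, the example below compares row counts before and after a cleaning step. It assumes that loss is measured in rows; the `percent_loss()` helper, the `raw` data frame, and the filtering rule are hypothetical.

```r
# Percent of rows dropped between two stages of a cleaning pipeline.
percent_loss <- function(n_before, n_after) {
  stopifnot(n_before > 0, n_after >= 0, n_after <= n_before)
  100 * (n_before - n_after) / n_before
}

# Hypothetical raw data with missing and invalid values.
raw <- data.frame(
  id     = 1:10,
  nights = c(2, NA, 3, 1, -1, 4, 2, NA, 5, 3)
)

# Example cleaning step: drop missing or non-positive stays.
cleaned <- raw[!is.na(raw$nights) & raw$nights > 0, ]

loss <- percent_loss(nrow(raw), nrow(cleaned))
message(sprintf("Rows removed: %d of %d (%.1f%%)",
                nrow(raw) - nrow(cleaned), nrow(raw), loss))
```

Recording this figure after each wrangling step makes it easier to spot a filter that removes far more data than intended.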
7.2 Code Review
We conducted code reviews within the team, and with faculty or external advisers. We reviewed specific code chunks and scripts related to the Outdoor Equity App.
We separated our workflows so that one person wrote scripts and the other reviewed them. We did this to maintain some objectivity when evaluating whether our datasets were aggregating correctly. We also had a separate workflow for metadata, in which one person wrote the metadata and the other reviewed it. This review confirmed that the data match their description in the metadata, which is important because we want our client to be able to scale our product and workflows for future use.
7.3 Product Testing
We used four packages to test our R Shiny app. We used shinytest (Chang, Csárdi, and Wickham 2021) and its snapshot-based testing strategy to ensure our app renders the way we expect. We used shinyloadtest (Schloerke, Dipert, and Borges 2021) to test the server hosting the R Shiny app and ensure that it responds in a reasonable amount of time to the inputs a user provides. Similarly, we used the tictoc (Izrailev 2021) package during our data wrangling and cleaning, and when we initially created our plots, graphs, and maps, to estimate how long the app may take to run our scripts. Lastly, we used the diagnostic tool in the reactlog (Schloerke 2020) package, which creates a reactive visualizer for the app, to make sure that reactive elements were working the way we expected. This diagnostic tool became less useful as the app’s functionality grew, because the reactive visualizer became impossible to read. There may be other ways to use the reactive visualizer within reactlog, but we did not have enough time to research them.
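The tictoc timing mentioned above can be sketched as follows. The `summarize_visits()` function is a hypothetical stand-in for a wrangling or plotting step, not code from the app; only `tic()` and `toc()` come from the tictoc package.

```r
library(tictoc)

# Hypothetical stand-in for a wrangling or plotting step.
summarize_visits <- function(n) {
  visits <- data.frame(
    site   = sample(letters, n, replace = TRUE),
    nights = rpois(n, lambda = 3)
  )
  aggregate(nights ~ site, data = visits, FUN = mean)
}

tic("summarize_visits")          # start a named timer
result <- summarize_visits(1e5)
timing <- toc(quiet = TRUE)      # stop the timer and return its timestamps

# Elapsed time is the difference between the stop and start timestamps.
elapsed_seconds <- timing$toc - timing$tic
message(sprintf("%s took %.2f seconds", timing$msg, elapsed_seconds))
```

Timings taken this way on a development machine are only an estimate; the hosted app may run faster or slower depending on the server.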
We added temporary print statements to all functions in the app to ensure that the functions were working correctly and outputting what we expected. Once a function ran without errors, we removed its print statements, because print statements can take a long time to run and should not be left in functions or in the app permanently.
Additionally, we held multiple meetings with the Recreation One Stop team to obtain real-time and focused feedback to improve user design and experience.
7.3.1 Next steps for testing
Due to time constraints, we were not able to implement all the testing methods we wanted. We recommend the following testing strategies to make the app more robust and its functionality smoother.
- Use the R package testthat (Wickham 2022) to conduct unit tests on the scripts used to create tidy datasets and subsetted datasets for visualization. This type of testing may be important to avoid silent failures and to ensure that the datasets are aggregating correctly.
- Use gremlins.js, a JavaScript library for “monkey testing”, to test the behavior of the R Shiny app. This library is compatible with shiny (Chang et al. 2021) and does not require any external installation. See Chapter 11 in Engineering Production-Grade Shiny Apps for more guidance. “Monkey testing” is a type of testing in which automated tests supply random inputs and then check the behavior of the app (e.g., whether the system or application crashes). We were able to find some finicky bugs through our own testing of random inputs, but “monkey testing” is the formal process for testing app behavior this way.
- Employ user testing with federal public land managers, researchers, and those who are not familiar with RIDB data to further enhance the user experience and design.
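To illustrate the testthat recommendation, the sketch below unit-tests a small aggregation function. The function `count_visits_by_site()` and its column names are hypothetical and stand in for the project's own wrangling scripts; only `test_that()`, `expect_equal()`, and `expect_error()` come from testthat.

```r
library(testthat)

# Hypothetical aggregation step: total visits per site.
# The function and column names are illustrative, not the app's real code.
count_visits_by_site <- function(df) {
  stopifnot(all(c("site", "visits") %in% names(df)))
  aggregate(visits ~ site, data = df, FUN = sum)
}

test_that("aggregation preserves the overall total", {
  raw <- data.frame(site = c("A", "A", "B"), visits = c(1, 2, 5))
  out <- count_visits_by_site(raw)

  expect_equal(nrow(out), 2)
  expect_equal(sum(out$visits), sum(raw$visits))
  expect_equal(out$visits[out$site == "A"], 3)
})

test_that("a missing column fails loudly instead of silently", {
  expect_error(count_visits_by_site(data.frame(site = "A")))
})
```

Checking that totals are preserved across an aggregation is one simple guard against the silent failures mentioned above, since a dropped group or a mistaken join would change the sum.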