Authors / CoAuthors
Foster, S.D. | Vanhatalo, J. | Trenkel, V.M. | Schulz, T. | Lawrence, E. | Przeslawski, R. | Hosack, G.R.
Abstract
Data is currently being used, and reused, in ecological research at unprecedented rates. To ensure appropriate reuse, however, we need to ask the question: “Are aggregated databases currently providing the right information to enable effective and unbiased reuse?” We investigate this question, with a focus on designs that purposefully bias the selection of sampling locations (upweighting the probability of selection of some locations). These designs are common; examples include designs with unequal inclusion probabilities and stratified designs. We perform a simulation experiment by creating datasets with progressively more bias, and examine the resulting statistical estimates. The effect of ignoring the survey design can be profound, with biases of up to 250% when naive analytical methods are used. The bias is not reduced by adding more data. Fortunately, the bias can be mitigated by using an appropriate estimator or an appropriate model. These are applicable, however, only when essential information about the survey design is available: the randomisation structure (e.g. inclusion probabilities or stratification), and/or covariates used in the randomisation process. The results suggest that such information must be stored and served with the data to support inference and reuse.
Citation: S.D. Foster, J. Vanhatalo, V.M. Trenkel, T. Schulz, E. Lawrence, R. Przeslawski, and G.R. Hosack. 2021. Effects of ignoring survey design information for data reuse. Ecological Applications 31(6): e02360. doi:10.1002/eap.2360
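The effect described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation: the population, the covariate `x`, the selection weights, and the function name `simulate` are all assumptions made for illustration. Sites are drawn by Poisson sampling with inclusion probabilities that grow with the covariate, and a naive sample mean is compared with a Horvitz-Thompson mean that uses the stored inclusion probabilities.

```python
import random


def simulate(n_sites=2000, expected_n=200, n_reps=200, seed=42):
    """Compare a naive mean with a Horvitz-Thompson mean under a biased design."""
    rng = random.Random(seed)

    # Hypothetical population: the response y rises with a covariate x.
    x = [rng.random() for _ in range(n_sites)]
    y = [10.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    true_mean = sum(y) / n_sites

    # Biased design (assumed): inclusion probability grows with the covariate.
    weights = [0.1 + xi for xi in x]
    total_w = sum(weights)
    pi = [expected_n * w / total_w for w in weights]  # all well below 1 here

    naive_sum = 0.0
    ht_sum = 0.0
    for _ in range(n_reps):
        # Poisson sampling: each site enters independently with probability pi[i].
        sample = [i for i in range(n_sites) if rng.random() < pi[i]]
        # Naive estimator: ignores the design and averages the sampled values.
        naive_sum += sum(y[i] for i in sample) / len(sample)
        # Horvitz-Thompson estimator: weight each sampled value by 1 / pi[i].
        ht_sum += sum(y[i] / pi[i] for i in sample) / n_sites

    # Average each estimator over the replicate surveys.
    return true_mean, naive_sum / n_reps, ht_sum / n_reps


true_mean, naive_mean, ht_mean = simulate()
print(f"true mean             : {true_mean:.2f}")
print(f"naive mean (average)  : {naive_mean:.2f}")
print(f"Horvitz-Thompson mean : {ht_mean:.2f}")
```

In this sketch the naive mean stays well above the true mean however many sites are sampled, because the over-sampled sites have larger responses. The Horvitz-Thompson mean can be computed only because the inclusion probabilities `pi` are kept alongside the data, which is the information the abstract argues should be stored and served.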
Product Type
document
eCat Id
140054
Contact for the resource
Point of contact
Cnr Jerrabomberra Ave and Hindmarsh Dr GPO Box 378
Canberra
ACT
2601
Australia
Resource provider
Point of contact
- Contact instructions
- Place and Communities
Keywords
- theme.ANZRC Fields of Research.rdf
- EARTH SCIENCES
- Marine and Estuarine Ecology (incl. Marine Ichthyology)
- Applied Statistics
- bias
- data
- database
- findable
- accessible
- interoperable
- reusable data
- Horvitz-Thompson estimator
- inclusion probability
- model
- population density estimate
- reuse
- survey design
- Published_External
Publication Date
2024-05-14T23:09:47
Creation Date
Security Constraints
Status
completed
Purpose
Manuscript to be submitted to Ecological Applications Journal
Maintenance Information
asNeeded
Topic Category
geoscientificInformation
Series Information
Ecological Applications Volume 31, Issue 6, September 2021, e02360
Lineage
Manuscript to be submitted to Ecological Applications Journal
Parent Information
Extents
[-44.00, -9.00, 112.00, 154.00]
Reference System
Spatial Resolution
Service Information
Associations
Source Information