
Uncover-ML: a machine learning pipeline for geoscience data analysis.

The geosciences are a data-rich domain where Earth materials and processes are analysed from local to global scales. However, often we only have discrete measurements at specific locations, and a limited understanding of how these features vary across the landscape. Earth system processes are inherently complex, and trans-disciplinary science will likely become increasingly important in finding solutions to future challenges associated with the environment, mineral/petroleum resources and food security. Machine learning is an important approach to synthesise the increasing complexity and sheer volume of Earth science data, and is now widely used in prediction across many scientific disciplines. In this context, we have built a machine learning pipeline, called Uncover-ML, for both supervised and unsupervised learning, prediction and classification. The Uncover-ML pipeline was developed from a partnership between CSIRO and Geoscience Australia, and is largely built around the Python scikit-learn machine learning libraries. In this paper, we briefly describe the architecture and components of Uncover-ML for feature extraction, data scaling, sample selection, predictive mapping, estimating model performance, model optimisation and estimating model uncertainties. Links to download the source code and information on how to implement the algorithms are also provided.
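The stages listed above (data scaling, estimating model performance, model optimisation, prediction and uncertainty estimation) can be illustrated with plain scikit-learn. The following is a minimal sketch only, not the Uncover-ML code or its API: the synthetic data, the choice of a random forest, the parameter grid and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for covariate values extracted at point observations
# (assumption: 200 sites, 4 covariates, one continuous target).
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

# Data scaling followed by a supervised learner.
model = Pipeline([
    ("scale", StandardScaler()),
    ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
])

# Estimating model performance: 5-fold cross-validated R^2.
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2")

# Model optimisation: a small grid search over one hyperparameter.
search = GridSearchCV(model, {"forest__max_depth": [3, None]}, cv=3, scoring="r2")
search.fit(X, y)
best = search.best_estimator_

# Predictive mapping: apply the fitted model to unsampled locations
# (here random rows stand in for grid cells).
X_grid = rng.normal(size=(10, 4))
y_pred = best.predict(X_grid)

# Estimating uncertainty: spread of the individual tree predictions.
X_grid_scaled = best.named_steps["scale"].transform(X_grid)
trees = best.named_steps["forest"].estimators_
per_tree = np.stack([tree.predict(X_grid_scaled) for tree in trees])
y_std = per_tree.std(axis=0)

print("cross-validated R^2:", cv_r2.round(3))
print("prediction and spread at first cell:", y_pred[0], y_std[0])
```

The spread across trees is only one rough proxy for predictive uncertainty; the extended abstract itself should be consulted for the methods Uncover-ML actually implements.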


Citation: Wilford, J., Basak, S., Hassan, R., Moushall, B., McCalman, L., Steinberg, D. and Zhang, F., 2020. Uncover-ML: a machine learning pipeline for geoscience data analysis. In: Czarnota, K., Roach, I., Abbott, S., Haynes, M., Kositcin, N., Ray, A. and Slatter, E. (eds.) Exploring for the Future: Extended Abstracts, Geoscience Australia, Canberra, 1–4.

Identification info

Date (Creation)
2020-06-22
Date (Publication)
2020-06-22T08:08:04
Citation identifier
Geoscience Australia Persistent Identifier/https://pid.geoscience.gov.au/dataset/ga/134466

Citation identifier
Digital Object Identifier/http://dx.doi.org/10.11636/134466

Cited responsible party
Author: Wilford, J.
Author: Basak, S.
Author: Hassan, R.
Author: Moushall, B.
Author: McCalman, L.
Author: Steinberg, D.
Author: Zhang, F.

Purpose

EFTF Extended Abstract

Status
Completed
Point of contact
Resource provider: Minerals, Energy and Groundwater Division
Point of contact: Commonwealth of Australia (Geoscience Australia)
Point of contact: Du, Z. (MEG Internal Contact)
Spatial representation type
Topic category
  • Geoscientific information

Maintenance and update frequency
As needed

Resource format

Title

Product data repository: Various Formats

Website

Data Store directory containing the digital product files

Data Store directory containing one or more files, possibly in a variety of formats, accessible to Geoscience Australia staff only for internal purposes

theme.ANZRC Fields of Research.rdf
  • EARTH SCIENCES

  • INFORMATION AND COMPUTING SCIENCES

Theme
  • machine learning

Project
  • EFTF

Keywords
  • information and computer sciences

Keywords
  • data analytics

Keywords
  • Exploring for the Future

Keywords
  • Toolbox

Keywords
  • Published_External

Resource constraints

Title

Creative Commons Attribution 4.0 International Licence

Alternate title

CC-BY

Edition

4.0

Website

http://creativecommons.org/licenses/

Access constraints
License
Use constraints
License

Resource constraints

Title

Australian Government Security Classification System

Edition date
2018-11-01T00:00:00
Website

https://www.protectivesecurity.gov.au/Pages/default.aspx

Classification
Unclassified
Language
English
Character encoding
UTF8

Distribution Information

Distributor contact
Distributor: Commonwealth of Australia (Geoscience Australia)
OnLine resource

Extended Abstract for download (pdf) [2.5MB]

Distribution format
  • pdf

Resource lineage

Statement

The Uncover-ML code was developed through a partnership between Data61 (CSIRO) and Geoscience Australia. A large proportion of the code draws on scikit-learn, the machine learning library for Python (https://scikit-learn.org/stable/).

Metadata constraints

Title

Australian Government Security Classification System

Edition date
2018-11-01T00:00:00
Website

https://www.protectivesecurity.gov.au/Pages/default.aspx

Classification
Unclassified

Metadata

Metadata identifier
urn:uuid/d184c3e8-8bc5-4889-94c6-cc598ff3f951

Title

GeoNetwork UUID

Language
English
Character encoding
UTF8
Contact
Point of contact: Commonwealth of Australia (Geoscience Australia)
Point of contact: Du, Z. (MEG Internal Contact)

Type of resource

Resource scope
Document
Name

GA publication: Extended Abstract

Alternative metadata reference

Title

Geoscience Australia - short identifier for metadata record with uuid

Citation identifier
eCatId/134466

Metadata linkage

https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/5167d912-6784-4371-9938-3931e84ca6f1

Metadata linkage

https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/d184c3e8-8bc5-4889-94c6-cc598ff3f951

Date info (Creation)
2019-04-08T01:55:29
Date info (Revision)
2019-04-08T01:55:29

Metadata standard

Title

AU/NZS ISO 19115-1:2014

Metadata standard

Title

ISO 19115-1:2014

Metadata standard

Title

ISO 19115-3

Title

Geoscience Australia Community Metadata Profile of ISO 19115-1:2014

Edition

Version 2.0, September 2018

Citation identifier
https://pid.geoscience.gov.au/dataset/ga/122551

