
Applications of natural language processing to geoscience text data and prospectivity modelling

Geological maps are powerful models for visualizing the complex distribution of rock types through space and time. However, the descriptive information that forms the basis for a preferred map interpretation is typically stored in geological map databases as unstructured text data that are difficult to use in practice. Herein we apply natural language processing (NLP) to geoscientific text data from Canada, the U.S., and Australia to address that knowledge gap. First, rock descriptions, geological ages, lithostratigraphic and lithodemic information, and other long-form text data are translated to numerical vectors, i.e., a word embedding, using a geoscience language model. Network analysis of word associations, nearest neighbors, and principal component analysis are then used to extract meaningful semantic relationships between rock types. We further demonstrate using simple Naive Bayes classifiers and the area under receiver operating characteristics plots (AUC) how word vectors can be used to: (1) predict the locations of “pegmatitic” (AUC = 0.962) and “alkalic” (AUC = 0.938) rocks; (2) predict mineral potential for Mississippi-Valley-type (AUC = 0.868) and clastic-dominated (AUC = 0.809) Zn-Pb deposits; and (3) search geoscientific text data for analogues of the giant Mount Isa clastic-dominated Zn-Pb deposit using the cosine similarities between word vectors. This form of semantic search is a promising NLP approach for assessing mineral potential with limited training data. Overall, the results highlight how geoscience language models and NLP can be used to extract new knowledge from unstructured text data and reduce the mineral exploration search space for critical raw materials.

Citation: Lawley, C. J. M., Gadd, M. G., Parsa, M., Lederer, G. W., Graham, G. E., and Ford, A., 2023, Applications of Natural Language Processing to Geoscience Text Data and Prospectivity Modeling: Natural Resources Research. https://doi.org/10.1007/s11053-023-10216-1
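The semantic search described in the abstract ranks text by the cosine similarity between word vectors. The sketch below is only an illustration of that idea and is not taken from the paper: the three-dimensional vectors, the unit names, and the helper functions (`embed`, `rank_by_similarity`) are all hypothetical, and averaging word vectors to represent a whole description is an assumption made here for simplicity. A real geoscience language model would supply vectors with hundreds of dimensions.

```python
import math

# Hypothetical toy embeddings; a real language model would supply these.
WORD_VECTORS = {
    "shale": [0.9, 0.1, 0.0],
    "siltstone": [0.8, 0.2, 0.1],
    "dolostone": [0.3, 0.8, 0.1],
    "granite": [0.0, 0.1, 0.9],
    "pegmatite": [0.1, 0.0, 0.95],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors
    return dot / (norm_u * norm_v)


def embed(description):
    """Average the vectors of known words (an assumed, simplified scheme)."""
    vectors = [WORD_VECTORS[w] for w in description.lower().split()
               if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("no known words in description")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def rank_by_similarity(query_vector, unit_vectors):
    """Return (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vector, vec))
              for name, vec in unit_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical map-unit descriptions searched for analogues of a query.
query = embed("shale siltstone")
map_units = {
    "unit_A": embed("siltstone shale"),
    "unit_B": embed("granite pegmatite"),
    "unit_C": embed("dolostone shale"),
}
ranking = rank_by_similarity(query, map_units)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because cosine similarity depends only on the direction of the vectors, a query built from a single well-described deposit can be compared against every map unit without any labelled training examples, which is why the abstract presents this approach as suited to limited training data.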

Identification info

Date (Creation)
2023-01-12T16:00:00
Date (Publication)
2023-06-13T01:20:35
Citation identifier
Geoscience Australia Persistent Identifier/https://pid.geoscience.gov.au/dataset/ga/147637

Cited responsible party
Role Organisation / Individual Name Details
Author

Lawley, C.J.M.

External Contact
Author

Gadd, M.G.

External Contact
Author

Parsa, M.

External Contact
Author

Lederer, G.W.

External Contact
Author

Graham, G.E.

External Contact
Author

Ford, A.

Internal Contact
Publisher

Springer Nature

External Contact
Name

Natural Resources Research

Purpose

Manuscript examining the use of natural language processing for improving understanding of mineral systems and mineral prospectivity. Case studies are presented for evaluating prospectivity for critical minerals in Canada, United States, and Australia.

Status
Completed
Point of contact
Role Organisation / Individual Name Details
Resource provider

Minerals, Energy and Groundwater Division

External Contact
Point of contact

Commonwealth of Australia (Geoscience Australia)

Voice
Point of contact

Ford, A.

Internal Contact
Spatial representation type
Topic category
  • Geoscientific information

Extent

Maintenance and update frequency
Not planned

Resource format

Title

Product data repository: Various Formats

Website

Data Store directory containing the digital product files

Data Store directory containing one or more files, possibly in a variety of formats, accessible to Geoscience Australia staff only for internal purposes

Project
  • critical minerals mapping initiative

Keywords
  • critical minerals

Keywords
  • mineral systems

Keywords
  • natural language processing

theme.ANZRC Fields of Research.rdf
  • Geology

  • Data mining and knowledge discovery

Keywords
  • Published_External

Resource constraints

Title

Creative Commons Attribution 4.0 International Licence

Alternate title

CC-BY

Edition

4.0

Website

https://creativecommons.org/licenses/by/4.0/

Addressee
Role Organisation / Individual Name Details
User

Any

Use constraints
License
Use constraints
Other restrictions
Other constraints

© 2023, Crown

Resource constraints

Title

Australian Government Security Classification System

Edition date
2018-11-01T00:00:00
Website

https://www.protectivesecurity.gov.au/Pages/default.aspx

Classification
Unclassified
Classification system

Australian Government Security Classification System

Language
English
Character encoding
UTF8

Distribution Information

Distributor contact
Role Organisation / Individual Name Details
Distributor

Commonwealth of Australia (Geoscience Australia)

Voice facsimile
OnLine resource

Link to Journal

Distribution format

Resource lineage

Statement

Multiple geological databases were used as the basis for generating natural language processing workflows to evaluate mineral prospectivity for critical minerals in Canada, the United States, and Australia.

Metadata constraints

Title

Australian Government Security Classification System

Edition date
2018-11-01T00:00:00
Website

https://www.protectivesecurity.gov.au/Pages/default.aspx

Classification
Unclassified

Metadata

Metadata identifier
urn:uuid/db66820f-d78c-469e-a5fd-598b498dbb1e

Title

GeoNetwork UUID

Language
English
Character encoding
UTF8
Contact
Role Organisation / Individual Name Details
Point of contact

Commonwealth of Australia (Geoscience Australia)

Voice
Point of contact

Ford, A.

Internal Contact

Type of resource

Resource scope
Document
Name

Journal Article / Conference Paper

Alternative metadata reference

Title

Geoscience Australia - short identifier for metadata record with uuid

Citation identifier
eCatId/147637

Metadata linkage

https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/db66820f-d78c-469e-a5fd-598b498dbb1e

Metadata linkage

https://ecat.ga.gov.au/geonetwork/accessDenied.jsp/eng/catalog.search#/metadata/db66820f-d78c-469e-a5fd-598b498dbb1e

Date info (Creation)
2023-06-13T01:08:34
Date info (Revision)
2023-06-13T01:08:34

Metadata standard

Title

AU/NZS ISO 19115-1:2014

Metadata standard

Title

ISO 19115-1:2014

Metadata standard

Title

ISO 19115-3

Title

Geoscience Australia Community Metadata Profile of ISO 19115-1:2014

Edition

Version 2.0, September 2018

Citation identifier
http://pid.geoscience.gov.au/dataset/ga/122551

 
 



Associated resources

Not available

