Tuesday, January 15 • 11:00am - 12:30pm
Linking Geoscience Resource Discovery and Exploration with Jupyter Notebooks

Session Abstract:
This session will discuss cross-disciplinary data discovery and introduce the EarthCube Data Discovery Studio (DDS) project, which currently indexes over 1.6 million datasets and other geoscience resources from 40+ repositories. The DDS system relies on a scalable metadata augmentation pipeline designed to improve and re-index metadata content using text analytics and an integrated geoscience ontology. The addition of automatically generated, ontology-anchored keywords enables faceted browsing and lets users navigate to datasets related by additional characteristics, such as measured variables, equipment, science domains, or geospatial features. The system also publishes the metadata using schema.org markup and lets users validate or invalidate automatic metadata enhancements using a custom metadata editor. In addition, we will demonstrate how DDS portal users can invoke Jupyter notebooks residing on one of several JupyterHubs and pass discovered document metadata to the notebooks for additional visualization, analysis, or modeling, thus bridging cross-domain resource discovery with more in-depth data exploration. We will also show how users can contribute their own notebooks to process additional types of data indexed in DDS.
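
As a rough illustration of the schema.org publishing step described above, the following Python sketch converts a metadata record into schema.org Dataset JSON-LD, merging the original keywords with automatically generated ones. The record layout (the `title`, `abstract`, `keywords`, `augmented_keywords`, and `url` fields), the function name `to_schema_org`, and the example values are assumptions made for illustration only; they are not taken from the DDS implementation.

```python
import json


def to_schema_org(record):
    """Build a schema.org Dataset JSON-LD document from a metadata record.

    The input field names are hypothetical, not the actual DDS schema.
    """
    # Merge original and auto-generated keywords, keeping first-seen order
    # and dropping duplicates.
    merged = record.get("keywords", []) + record.get("augmented_keywords", [])
    keywords = list(dict.fromkeys(merged))

    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": record["title"],
        "description": record.get("abstract", ""),
        "keywords": keywords,
    }
    # Only include a landing-page URL when the record provides one.
    if "url" in record:
        doc["url"] = record["url"]
    return doc


# Made-up example record for demonstration purposes.
record = {
    "title": "Example stream gauge dataset",
    "abstract": "Illustrative record; not a real DDS entry.",
    "keywords": ["hydrology"],
    "augmented_keywords": ["stream discharge", "hydrology"],
    "url": "https://example.org/dataset/123",
}

jsonld = to_schema_org(record)
print(json.dumps(jsonld, indent=2))
```

The resulting JSON-LD could be embedded in a landing page inside a `<script type="application/ld+json">` element, which is the usual way schema.org markup is exposed to web crawlers.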

Session Takeaways (post-meeting):
1) Data Discovery Studio does not host any data and does not intend to replace repositories. Its aim is to ‘enhance’ metadata quality.
2) This is a valuable resource for multi-disciplinary projects because it acts as a central place to locate available datasets for a given topic, even when those datasets are highly diverse.
3) Changes made in Data Discovery Studio do not go back to the original repository; they exist only in the studio. Additional work would be needed to make those changes at the repository level.

Steve Richard

Adjunct Research Scientist, U. S. Geoscience Information Network
Stephen Richard is an Adjunct Research Scientist at Lamont-Doherty Earth Observatory. He is currently involved in projects to implement interoperable network services for geoscience information, using XML markup and OGC web services. He has been deeply involved in development of XML...

Tuesday January 15, 2019 11:00am - 12:30pm
Glen Echo

