The 2019 ESIP Winter Meeting has passed. See session descriptions to access meeting content, including presentations, recordings, and key takeaways. See here for info on upcoming meetings.
Tuesday, January 15
 

11:00am EST

Linking Geoscience Resource Discovery and Exploration with Jupyter Notebooks
Session Abstract:
This session will discuss cross-disciplinary data discovery and introduce the EarthCube Data Discovery Studio (DDS) project, which currently indexes over 1.6 million datasets and other geoscience resources from 40+ repositories. The DDS system relies on a scalable metadata augmentation pipeline designed to improve and re-index metadata content using text analytics and an integrated geoscience ontology. The addition of automatically generated, ontology-anchored keywords enables faceted browsing and lets users navigate to datasets related by additional characteristics, such as measured variables, equipment, science domains, or geospatial features. The system also publishes the metadata using schema.org markup, and lets users validate or invalidate automatic metadata enhancements using a custom metadata editor. In addition, we will demonstrate how DDS portal users can invoke Jupyter notebooks residing on one of several JupyterHubs, and pass discovered document metadata to the notebooks for additional visualization, analysis, or modeling, thus bridging cross-domain resource discovery with more in-depth data exploration. We will also show how users can contribute their own notebooks to process additional types of data indexed in DDS.
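As a rough illustration of the two ideas in the abstract (publishing metadata as schema.org markup, and handing a discovered record to a notebook), here is a minimal Python sketch. It is not the DDS implementation: the record field names (`title`, `abstract`, `landing_page`, `auto_keywords`), the hub address, the notebook path, and the `doc_id`/`doc_url` query parameters are all assumptions made up for this example. Only the schema.org `Dataset` type and its `name`, `description`, `url`, and `keywords` properties come from the schema.org vocabulary.

```python
import json
from urllib.parse import urlencode


def to_schema_org(record):
    """Map a simple metadata record (hypothetical field names) to a schema.org Dataset."""
    # Merge original and automatically generated keywords, dropping duplicates
    # while preserving first-seen order.
    keywords = list(dict.fromkeys(record["keywords"] + record["auto_keywords"]))
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": record["title"],
        "description": record["abstract"],
        "url": record["landing_page"],
        "keywords": keywords,
    }


def notebook_launch_url(hub_base, notebook_path, record):
    """Build a URL that passes a record's id and landing page as query parameters.

    The parameter names and URL layout are hypothetical, not a documented
    JupyterHub or DDS convention.
    """
    query = urlencode({"doc_id": record["id"], "doc_url": record["landing_page"]})
    return f"{hub_base.rstrip('/')}/{notebook_path.lstrip('/')}?{query}"


# Example record with made-up values.
record = {
    "id": "example-123",
    "title": "Example stream gauge observations",
    "abstract": "Placeholder description of a dataset.",
    "landing_page": "https://example.org/dataset/123",
    "keywords": ["hydrology"],
    "auto_keywords": ["stream discharge", "hydrology"],
}

print(json.dumps(to_schema_org(record), indent=2))
print(notebook_launch_url("https://hub.example.org/", "notebooks/map_view.ipynb", record))
```

A notebook opened this way would still need its own code to read the parameters and fetch the data; how DDS actually transfers metadata to a notebook is not described in the abstract.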

Session Takeaways (post-meeting):
1) Data Discovery Studio does not host any data and is not intended to replace repositories. It is intended to ‘enhance’ metadata quality.
2) This is a valuable resource for multi-disciplinary projects because it acts as a central place to locate the datasets available for a given topic, even when the topic spans a diverse range of datasets.
3) Changes made in Data Discovery Studio do not go back to the original repository; they exist only in the studio. Additional work would be needed to make those changes at the repository level.



Speakers

Stephen Richard

Geoinformatics consultant, Independent
Stephen Richard is an independent contractor working from Tucson, Arizona. He is currently involved in projects developing a cross-domain metadata scheme for describing physical samples (SESAR, iSamples), metadata for the Deep-time Digital Earth (DDE) program, the CODATA Cross-Domain...


Tuesday January 15, 2019 11:00am - 12:30pm EST
Glen Echo
  Glen Echo, Breakout Session
 
Thursday, January 17
 

11:00am EST

Filling the Earth Science Cookbook: Discovery and registry of Earth Science workflows from public repositories
Session Abstract:
The majority of scientific programming workflows are developed in isolation by graduate students and postdoctoral researchers. While packages and libraries in R and Python help support the advancement of scientific discovery, researchers are often challenged with combining and analysing data in new ways. Code use and re-use in the Earth Sciences is further complicated by the fact that few well-developed workflows exist as templates. Most code examples in R packages, for example, use well-worn datasets that are not well suited to extrapolation for Earth Science applications. For this reason, the discovery and analysis of existing code resources, such as those undertaken by the FUNding Friday grant, become critical to providing resources to scientific programmers in the Earth Sciences.
This session will introduce early-career researchers to the principal workflows for sharing code publicly, including discussion of some of the pros and cons of sharing code before it is “good enough”. The session will then provide an overview of work that has been undertaken to analyse a large number of Jupyter notebooks on GitHub, and then give session members an opportunity to help build the web of examples for coding resources, discussing what makes code useful as a “cookbook recipe” for Earth Sciences, what particular libraries or data resources are of interest, and how further automation might be undertaken.
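To give a sense of what automated analysis of notebooks can look like, the sketch below lists the libraries imported by a notebook's code cells. It is an illustration only, not the method used in the work presented: it assumes notebooks in the nbformat 4 JSON layout (a top-level `cells` list whose entries carry `cell_type` and `source`), and the function names `imported_modules` and `library_counts` are made up for this example.

```python
import ast
import json
from collections import Counter


def imported_modules(notebook_json):
    """Return the top-level module names imported in a notebook's code cells."""
    notebook = json.loads(notebook_json)
    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat allows a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells that still do not parse
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return modules


def library_counts(notebooks):
    """Count in how many notebooks each library is imported."""
    counts = Counter()
    for notebook_json in notebooks:
        counts.update(imported_modules(notebook_json))
    return counts


# A tiny made-up notebook for demonstration.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Load a grid\n"]},
        {
            "cell_type": "code",
            "metadata": {},
            "outputs": [],
            "execution_count": None,
            "source": [
                "%matplotlib inline\n",
                "import numpy as np\n",
                "from xarray import open_dataset\n",
            ],
        },
    ],
})

print(sorted(imported_modules(example)))
print(library_counts([example, example]).most_common())
```

Counting libraries across many notebooks is one simple way to see which packages and data resources recur, which relates to the session's question of what libraries are of interest for a cookbook.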

Session Notes:
https://docs.google.com/document/d/1S5p4v77B3kCdWSKbxKw3gkPFTotMeS9_pm5ic9rYhak/edit?usp=sharing

Session Takeaways (post-meeting):
1) Cultural knowledge around data use and storage can impact usage and keep data and use in ‘silos.’
2) Many people are now putting notebooks on GitHub that are associated with a specific publication. Those doing this well link the notebooks to the DOI of the original publication, and those doing it really well set things up so that the repository also has its own DOI.
3) The Earth Science data cookbook has an easy form for entering information about Earth science datasets and resources. It is intended to make these resources more accessible and clearly labeled. The cookbook is not live yet, but users can already enter keywords and other information at https://bitly.com/esip2019-cookbook





Speakers

Ben Galewsky

Research Software Engineer, University of Illinois @ Urbana-Champaign


Thursday January 17, 2019 11:00am - 12:30pm EST
White Flint
 

