ICESat-2 Hackweek Tackles the Big Data of Earth’s Glaciers

Seattle, Washington. Scientists converge on the multimedia classroom. Floor-to-ceiling glass doors line one wall, and flat-screen monitors ring the room. Over there is the glaciologist you saw give an AGU talk back in 2016. There, the professor from Alaska who lectured on continuum mechanics in McCarthy. Across the room, the student you know from Twitter but have never seen in person. Walking in the door, the people who like to ski deep powder in Canada. And more. All here in Seattle for the ICESat-2 hackweek.

Anthony Arendt, principal research scientist with the University of Washington eScience Institute and Applied Physics Laboratory, welcomes everyone at 8:30 a.m. sharp. Data scientists, students, professors, researchers, all 50-plus people scoot chairs into one giant circle to review the code of conduct for the week. People talk, ponder, and reflect.

The guidelines for fostering thriving learning communities are simple in concept. Listen for understanding. Recognize contributions. Stay engaged. No fixing. And so on. Supported by research, these principles are key to effective collaboration and innovative scientific progress. But they are not always so easy or straightforward to enact. Participant pairs intently contemplate the 8.5-by-11-inch printed document and its seven code-of-conduct concepts.

Big data. Tackling big data—and making sense of what it tells us about the icy machinery of Earth’s changing glaciers and ice sheets—requires innovation and interdisciplinary collaboration. And ICESat-2 data is, for sure, big.

ICESat-2 measures the surface elevation of Earth photon by photon, promising unprecedented coverage and precision of surface elevation and glacier height changes at the poles and select mid-latitude glaciers. Tom Neumann, NASA project scientist for ICESat-2, describes this and other mission details. More than 240 billion laser pulses have been collected, and it has not even been a full year since launch. How to deal? How to access, organize, manipulate, and analyze this immense dataset?

Commence hacking. Fernando Perez, a scientist in UC Berkeley’s statistics department, pulls up the first interactive tutorial of the week. The dozens of people scattered at 15 tables around the room follow along, running code cell by cell using Jupyter Notebook and individual, cloud-hosted virtual machines provided by Amazon Web Services. The topic is GitHub, a powerful code development platform firmly rooted in open-source ethos and practice.

Fernando explains that “…your brain has to have the proper scaffolding … you have to understand how the Russian dolls are organized,” before you can handle these structures. He proceeds to describe an “ecosystem of interoperable software” where researchers co-develop code.

A graphic presented by Fernando Perez shows the “ecosystem of interoperable software” that empowers researchers to co-develop scientific code. (Source: David Shean)

Other instructors share their tools and tricks for wrangling ICESat-2 data. Presenters are humorous, humble, and honest. Efficacy is often favored over elegance. Sophisticated, high-level concepts are interspersed with metaphors that anchor all this abstract data and computer talk to something more concrete and tangible.

Projects. Day one, brave participants pitch their ideas to the room.

“I’ve been pondering if any information about time-varying snow densities on the Greenland ice sheet can be gleaned from the noise distributions in ATL03 data … How do the returns over regions where supraglacial lakes and streams … differ from surrounding areas, if at all?” posts #waternoice team lead Michalea King, an Ohio State University doctoral student.

“Crevasses matter for a whole host of glaciological processes. So, I want to test whether ICESat-2 data can constrain crevasse development,” says #crackup project team lead Ellyn Enderlin, assistant professor at Boise State University.

Can we use ICESat-2 data to generate time series of snow depth on the Alaska North Slope? Do we have clear shots of the glaciers in High Mountain Asia? How can we use ICESat-2 to improve our understanding of the seasonal evolution of water content across icy environments in Alaska?

ICESat-2 hashtags convey various scientific objectives—and punny group identifiers—for the week. (Source: Caitlyn Florentine)

Milling around the room, people start to organize. Teams of seven or eight assemble and team hashtags are decided: #waternoice, #crackup, #floz, #snoblower, #topohack, #drag, #x-trac, #glaciersat2, #seatrac, #wigglysat, #groundhack, #swell. For nine out of the 40 hours of hackweek, we will work with this project team to access and apply ICESat-2 data.

Transparent, reproducible, and testable. Gatherings like the ICESat-2 hackweek allow us to capitalize on the opportunity that ICESat-2 data grants (or maybe forces), with a renewed commitment to producing science that is transparent, reproducible, and testable. And meaningful. And impactful. And fun.
