Government agencies collect a wide variety of environmental data, through their own monitoring and tracking programs as well as through compliance data collected and submitted by regulated entities. Historically, environmental monitoring data and compliance data have been segregated, often kept in different data systems and managed and shared according to different rules. This asymmetric information infrastructure has kept non-government researchers from exploring critical environmental questions, and impeded discoveries within and among government agencies themselves.
In our current era of climate crisis, it is time to revisit the data policies and architecture of the past. We need to enable greater data sharing across public and private boundaries, including compliance data, in order to unlock new insights and innovations. With open source data analysis tools and data literacy increasing, we have the opportunity to engage a wide range of people and institutions in environmental stewardship and oversight. The ongoing implementation of the Federal Data Strategy also provides opportunities to effect change in both IT infrastructure and data governance.
Ultimately, our goal is to bring more U.S. environmental data up to FAIR standards - Findable, Accessible, Interoperable, and Reusable - which are becoming a best practice in academia, industry, and multi-sector partnerships such as international climate assessments. Each federal agency manages environmental and compliance data in its own way, with a corresponding network of state, tribal, and local partners. We believe there will be similarities across these approaches, but in order to identify them, we need to dig into the specifics by developing case studies with individual data programs. Our proposed scope of work has three phases:
Phase 1: Scoping Investigations
We will choose 2-3 government environmental data programs to explore in greater detail. For each program, we will convene a series of meetings with experts to:
● Map the current data ecosystem/data flows both within and beyond the agency systems. This map can create a shared understanding of barriers and bottlenecks to data sharing across participants and highlight grey areas for further exploration. The map can also be shared beyond the meeting participants to gather additional feedback.
● Identify use cases and topics for which there is data demand. What data are (academic) researchers looking for now, and why? How does this differ from what agencies themselves are expected to report on?
● Identify exemplars of where data sharing is working well and potential resources for improvements.
Our initial candidate topics are environmental health and climate, and we are currently in scoping discussions with potential partners around other topics. Each topical investigation will involve approximately 15-20 hours of time for participants, spread across virtual meetings and document review.
Phase 2: Case Studies and Common Issues
At the end of the scoping phase, we will write up each topical investigation separately and prepare a summary document looking at common issues across all the systems we examined. This report will be shared with participants and released publicly.
Phase 3: System Redesign
With our background report complete, we will recruit and facilitate an expert panel to help us re-envision these environmental data systems. This group will include external data users, data architects and managers, and data policy experts, as well as representatives from the initial scoping investigation groups. The panel and our team will synthesize the group’s discussions and work into recommendations for improving the technical infrastructure for environmental data sharing, as well as the policy and social factors that affect it.