Guest panelists, from left: Bala Chaudhary, Adam Pollack, and Justin Mankin; not pictured: Lora Leligdon

Discussing the challenges and hopes for open data

Whether you’re aware of what the Nelson Memo entails or not, in 2025 anyone conducting research with federal funds will need to consider “open data” from the moment of application to publication. Some of you may wonder, “what does open data have to do with my research?” For others, you’re already grappling with the impending expectations. Wherever you are on your open data journey, the Libraries are here to help. Understanding what’s to come, and the expectations required of you as faculty, student, staff, or researcher doesn’t have to be complicated, and you don’t have to do it alone.

It inspires and motivates me to ensure research is reproducible. Doing so means we’re saying, “we care about this, and we are doing things to make things accessible or improve best practice.” - Adam Pollack

Knowing the impacts of the upcoming changes, library staff, whose expertise intersects teaching, research, and scholarly publishing, have been:

  • hosting workshops
  • actively partnering with faculty and staff across campus
  • consulting on data management best practices and repository selection to meet funder and publisher mandates
  • expanding institutional services like Dartmouth Dataverse 
  • and fostering meaningful discussions to drive transformation.

One such discussion was facilitated by the Libraries this Summer Term. Librarians Jentry Campbell, Lilly Linden, Matt Benzing, and Tricia Martone invited four Dartmouth researchers and data experts leading change in their fields. In a panel titled “Perspectives on Open Data,” the panelists delved into what open data is, its value, its impact on students, and their hopes for the future.

Teaching is made so much more interesting and successful when students can look at datasets that they care about. It’s a more impactful and transformative educational experience. - Bala Chaudhary

Guest panelists included Adam Pollack, Research Scientist in the Keller Lab Group, Thayer School of Engineering; Bala Chaudhary, Associate Professor of Environmental Studies; Justin Mankin, Associate Professor in the Department of Geography and the Ecology, Evolution, Environment, and Society Program; and Lora Leligdon, Head of Research Data Services at Dartmouth Libraries. 

They are not only systematizing and teaching how to create, organize, store, and share good data for future use; they’re also doing it with a public audience and reproducible science in mind. They’re doing this laborious work because it makes science better.

By using these open data products, students are learning how to create great data for future use. They also get to see best practices along the way and discover how important it is to have well-described data. The benefit is open data helps generate more original knowledge and research. Taking the open ethos to heart and desiring to do the work will be part of future success in this space. - Lora Leligdon

Below are some highlights from the 90-minute panel session, featuring the panelists' perspectives on the challenges and opportunities open data presents.

Question:
What are the challenges and obstacles related to open data?

Data Curatorship and Reproducibility

  • Bala: In just the last 24 hours, I’ve come across challenges around curating data. There’s no financial support for data curators. Even when researchers think they’re good at data curatorship, the result doesn’t always uphold the needed standards for future access. Data curatorship and cleanup don’t count enough during academic tenure and promotion processes. And now peer review requires us to do that process by testing and checking the data, which again doesn’t count toward tenure or promotion.

    Another challenge is the “bait and switch” around open data. Authors say they’re publishing open data, but it isn’t open, because the form is inaccessible. Unfortunately, adjacent to this is academic bullying around open data. For instance, someone might say, “My data is open, and I’ll share it with you, person one, but I won’t share it with you, person two.” So the interpersonal can get in the way of the bigger mandates.

  • Adam: That’s a great point: why don't we value curated data for reusability? In my mind, the reusable data set has more value than just a single study. Are we data curators, providers, provisioners, or just researchers? It’s entangled. 

  • Justin: In my department, we work to publish our code alongside our research, so people can reproduce the science. It is time-consuming. But it ensures robust results. Reproducibility is a difficult standard to meet, but a necessary one. It’s new data-generating processes that strike me as essential and an increasing imperative for openness.

  • Adam: If journals really want the credibility that these studies bring, they need to earn it. Pay someone to do those checks instead of leaning back into peer review or researchers to do it. Meeting the demands of open data during research needs an iterative approach. We can do that by including researchers, students, and faculty to help identify the gaps, such as checklists around pre-reproducibility. We can refine the process over time, so it becomes shared knowledge to help everyone do better. Maybe that’s a role that librarians and ITC could take on together?

Responsible Data Use and Storage

  • Justin: A massive challenge is ensuring responsible use of data while maintaining accessibility and fidelity of data. It’s why companies like Amazon or Google increasingly provide platforms to allow researchers to slice and dice and interrogate data into more manageable chunks to use themselves. However, these are private companies with private interests. That contradicts the notion of public access and openness. 

  • Lora: The data archiving landscape is still the wild west, and storing data to the end of time isn’t possible. The true cost of storage and the services related to it create barriers. Repositories need to be thinking about curating information and how it’s held, including determining the migration of format, how long to hold data, and who is meant to use it (and who can), so people can find good data.

  • Justin: Like Lora said, we’re still in the wild west of this area. The pace is moving faster than the regulations and processes currently in place.

Question:
What’s the future of open data?

Exciting, Optimistic, and Keeping Pace with Scientific Discovery

  • Lora: It’s exciting. We’re trending towards open, and that’s a culture shift. Undergraduates now hold this expectation that data is there and available. So the future is about having better tools to make it easier to reproduce, store, and share the data. And, hopefully, it continues to snowball. Taking the open ethos to heart and desiring to do the work will be part of future success in this space.

  • Bala: Teaching is made so much more interesting and successful when students can look at datasets that they care about. It’s a more impactful and transformative educational experience. I’m a data producer, and I want my research program to focus on that. I prioritize the knowledge creation. We’ll always use open data, and we’ll contribute to sets like NEON. I couldn't do what I do as a researcher or professor without open data.

  • Adam: I’m optimistic! Every undergraduate in my lab thinks it’s wild that so much energy is spent on continually improving and refining data. It inspires and motivates me to ensure research is reproducible. Doing so means we’re saying, “we care about this, and we are doing things to make things accessible or improve best practice.” Fundamentally, we need the financial support to do these things well. What that looks like, I don’t know. But that’s the bottleneck for real change we need in our research processes.

  • Justin: Irrespective of the costs to the researcher or the computing infrastructure needed to archive data and furnish it in a usable form, open data is essential to accelerating the pace of scientific discovery. At the moment, we’re instilling principles of openness standards. These norms are emerging as important among early career researchers, which wasn’t the case before. That’s distinct from training students to be flexible in the face of evolving standards, and helping them succeed in the face of ambiguity.

    A future includes Dartmouth’s role to codify standards and furnish the resources — computer architecture, funds, etc. — not once, but again and again as tech evolves and old/existing tech degrades.

 

***

Thanks again to our brilliant panelists for sharing their insights! Along with Lora, our people across Dartmouth Libraries are part of regular conversations discussing the challenges encountered when generating, curating, storing, and accessing open data. This expert panel was a way for all of us in the Dartmouth community to discover more about open data and its future.

Whether you're starting fresh or want to advance your skills, there are three ways you can work with us.

  1. Become better informed and better equipped: build your reproducible research skills in expert-led workshops.

  2. Book an appointment for a one-on-one consultation tailored to your needs.

  3. Prefer to send an email query? Use ResearchDataHelp@groups.dartmouth.edu to contact the team.
