Research Ethics: The Power and Perils of Research Data

Advances in technology have transformed the research world in recent years, enabling researchers to collect enormous amounts of data and make that data broadly accessible to other researchers.

Earlier this month, faculty, staff, students, and community members gathered at the University of Minnesota to discuss the pressing ethical issues bubbling up around all of that data. The conference, “The Power and Perils of Research Data: Generating, Storing & Sharing Data Responsibly,” explored issues in data collection, analysis, replicability, storage, and sharing, with conversations led by national experts on the legal, bioethical, and policy issues related to research data.

The conference was the latest installment of the annual Research Ethics Day, presented by the University’s Office of the Vice President for Research; Consortium on Law and Values in Health, Environment, and the Life Sciences; and Masonic Cancer Center.

The Dual Nature of Data

Constantin Aliferis, MD, PhD, director of the UMN Institute for Health Informatics and chief research informatics officer for the Clinical and Translational Science Institute, said data does neither harm nor good when stored or transmitted; on its own, data is neutral. Only analyzing the data can lead to benefits or harm.

For example, Aliferis said, analyzing a data set might help physicians tending to a group of people with a given illness differentiate between the high-risk patients who need hospitalization and the low-risk ones who should be sent home to recover. If the analysis overgeneralizes, it could mean many patients who don’t need expensive treatment are kept in the hospital. On the other hand, an analysis that “underfits” the risk factors could lead to seriously ill patients being sent home, depriving them of potentially life-saving treatment.

“Effectiveness and efficiency problems occurring even inside the scope of intended use, and with full consent, can have profound ethical, individual, and societal implications,” Aliferis said.

Some of those implications relate to privacy, noted Sameer Badlani, MD, chief information officer with M Health Fairview. Different people react in different ways to seeing their personal information shared, even when they were the ones who initially made it available.

“There is rich, deep, often identifiable data available about us in large data sets, most of which we have contributed into the ecosystem,” Badlani said, adding that much of this data stems from everyday behaviors, like using a smartphone for driving directions, registering on a dating site, or providing a birthdate to an ice cream shop to get a free cone.

Researchers have to question the inference, he said, that someone who shares information online in a publicly visible place must be consenting to that information being harvested for research.

The Importance of Sharing Data

Speakers at the conference stressed the importance of not just gathering and using data ethically, but also sharing it with fellow researchers. Beyond the fact that some journals and funders require it, sharing data is appealing because it can help demonstrate the effective use of taxpayer funds, bring transparency and accountability to research, and preserve the scientific record for future reference.

Michelle Meyer, PhD, JD, assistant professor at Geisinger Health System and Geisinger Commonwealth School of Medicine in Pennsylvania, said sharing data also plays an important role in ensuring new findings are reproducible.

“It is a research ethics scandal … that scientists make claims about the world that are not able to be reproduced, much less replicated,” Meyer said. “Science and research is not just about getting publications or saying interesting theoretical things. At its best, it’s about better understanding ourselves, each other, the world around us, and hopefully informing practices and policies. It’s really important to be able to reproduce analyses to make sure that they’re accurate.”

Dina Paltoo, PhD, assistant director for policy development at the National Institutes of Health (NIH) National Library of Medicine, noted that, while important, sharing and managing large amounts of data bring many challenges. The data has to be “cleaned” of incomplete or errant data points, contextualized, and preserved in repositories to become useful for later research.

NIH is actively trying to address issues around how data should be managed and shared, Paltoo said. The federal agency put out a draft policy in November 2019 and plans to release a final policy later this year.

“There is considerable value from enhanced access to biomedical data; it accelerates and improves science and it builds trust in the research enterprise,” Paltoo said. “But we also need to consider that technology is advancing, data is getting bigger and more complex, and new approaches and models are always needed.”

Research Ethics Week

The Research Ethics Day conference was part of the larger Research Ethics Week, an annual series of educational opportunities organized by colleges and departments across the University system. The events focus on professional development and best practices to promote, maintain, and model high standards of ethics and integrity in research.

More than 25 events took place during this year’s Research Ethics Week, spanning subjects ranging from ethical considerations in the grant writing process to conducting collaborative research with tribal partners that remains respectful of indigenous values.