Diving into Data Science to Protect Midwest Waters

Clean water is essential to the Midwest, buoying many of our most valued industries and activities, from agriculture to recreation. So it’s no surprise that threats to water quality can trickle down and cause a number of wide-reaching problems for our environment and economy.

Now, a team of researchers at the University of Minnesota aims to help protect our water by tapping a different resource, one that we have in droves: data.

The U of M team is leading an effort to form an innovation network that will help develop standards, data sets, and informatics tools to accelerate water resources research and development. The researchers will bring together expertise in data science, cyberinfrastructure development, and water resources from the Water Resources Center (WRC) in the College of Food, Agriculture, and Natural Resource Sciences and U of M Extension; the Institute on the Environment; the U of M Informatics Institute; the Minnesota Supercomputing Institute; the College of Science and Engineering (CSE); and the U’s GEMS Platform.

Jeffrey Peterson, PhD, director of the WRC and principal investigator, said the University is a natural fit to lead the project because of its existing research capabilities and stakeholder connections. The project is part of a larger effort, supported by the National Science Foundation (NSF), known as the Big Data Regional Innovation Hubs (BD Hubs) program.

“The University of Minnesota is particularly well suited to address water resources challenges through data-enabled research and education,” Peterson said. “Our role in the Midwest Big Data Hubs effort will build on our expertise in water quality research, complemented by our capabilities in informatics and our numerous connections to stakeholders in the private, public, and nonprofit sectors.”

Shashi Shekhar, PhD, a McKnight Distinguished University Professor in computer science and engineering in CSE and a member of the Midwest BD Hub board of directors, said the BD Hubs program is a crucial part of the Harnessing the Data Revolution initiative, one of NSF’s 10 Big Ideas for Future Investment.

“NSF has established four BD Hubs (Midwest, Northeast, South, and West), one in each of the four US census regions, to bring together stakeholders around regional issues,” Shekhar said. “The BD Hubs, now in their second phase, nurture [a] regional community of academic, industry and community stakeholders to accelerate the big data innovation ecosystem.”

The U of M’s role is nested within the program’s $4 million Midwest Hub, led by the University of Illinois at Urbana-Champaign. Four other collaborating institutions—Indiana University, Iowa State University, the University of Michigan, and the University of North Dakota—are also involved, with each one focusing on a different research area.

Connecting Streams of Data

Data on water quality has never been more plentiful. A rapid increase in satellites, instrumentation, and citizen science programs is providing a deluge of new information, but the trouble is that this data comes from different sources across the nonprofit, academic, and private sectors. Each source has different standards and measurements, and right now it’s hard to tie all that data together to get a more comprehensive view of our waterways.

Recent advances in data science have set the stage for the Midwest Hub project to develop a solution. The U of M research team hopes to build new models for integrating data from different sources in near real time and pair those models with tools that can visually map out the results, providing new insights into how to manage water quality most effectively.

“The data revolution presents a new and exciting opportunity to take on challenges in water quality,” said Jim Wilgenbusch, PhD, director of Research Computing at the University of Minnesota and one of the project’s leaders. “We can design better ways to harness the vast amounts of information being collected and use that data to inform how we manage this all-important resource.”

Working toward this goal means reaching beyond the U to partner with the community and industry. A new Water Innovation Network will bring together University researchers and outside collaborators for an annual conference on the challenges and opportunities in advancing water resources research. During these discussions, network participants will work to spot the missing links and bottlenecks that prevent communities from making use of the standards, data sets, and tools developed through this project. The team will also promote awareness of water resources events, of the challenges to advancing water resources research, and of opportunities to address those challenges.

One Spoke in the Hub

It isn’t just the data itself that’s plentiful—there are also myriad ways to use it. While the U of M focuses on building data science capacity around water quality, its collaborating universities in the Midwest Hub will lead research into four other focus areas: advanced materials and manufacturing; big data in health; digital agriculture; and smart, connected, and resilient communities.

What all of these research areas have in common is one core idea: improving access to and use of data fuels academic research and education; contributes to economic development, government services, and community planning; and serves the social good.

“Developing innovative, effective solutions to grand challenges requires linking scientists and engineers with local communities,” said Jim Kurose, assistant director for NSF’s Directorate of Computer and Information Science and Engineering, in a news release. “The Big Data Hubs provide the glue to achieve those links, bringing together teams of data science researchers with cities, municipalities and anchor institutions.”