What Is This Project About?


This is a project about data - the bits and pieces of information about people and things that have been collected for thousands of years on paper, papyrus, clay tablets, and now digitally. Data about crop yields, weather patterns, money flows, and industrial output. Records of who owns what, who is related to whom, and how many people live in your town.

Data collection is an old idea; what’s new is the internet. The internet runs on data. There are now hundreds of thousands, perhaps millions, of public and private databases holding everything from climate history to census data, financial and public health records, and academic research. Combining this wealth of information in new ways creates new insights and provides a means to manage highly complex systems. Data is solving previously intractable problems.

But while most of this data is unrelated to any particular individual, there is growing concern that data is being used, without user participation, to make decisions that impact people’s lives. Algorithms, the mathematical models that seek to automate how human beings make decisions with data, are in the press. Artificial Intelligence, or AI, has burst onto the scene as a tool to solve problems by spotting patterns too subtle for humans to identify.

Above all, there is a growing concern that unknown data brokers are using the data that we generate for obscure and potentially nefarious purposes - ones we wouldn’t approve of if we only knew the details.

Alongside these concerns, it is also undeniable that we are living in a golden age, largely because of what data can enable. We are curing disease, understanding the universe, reducing poverty, and putting order back into a chaotic world. Every day we are gaining new insights that enable us to improve lives and reduce the costs that come with modern society.

Our goals with this project are twofold: to educate those outside the data economy about its inner workings, and to extract the collective wisdom of developers and data scientists on how to be good stewards of the data we’re entrusted with. Along the way we hope to bring greater nuance to the data conversation so that all sides can contribute to the ongoing dialogue.