A Guide to Open Data
Open data has garnered a lot of attention in recent years. In fact, you may have come across other terms with the prefix open, such as open source, open access, open knowledge, open education, and open science, to name a few. Are they all connected?
The Open Knowledge Foundation defines "open knowledge" as any content, information or data that people are free to use, re-use, and redistribute, without any legal, technological or social restriction. This is a good working definition that applies, more or less, to all of the open movements.
But where did it all begin? This is difficult to pinpoint, because each open movement emerged from its own specific context. In a sense, all of these movements are founded on values of intellectual freedom, democracy and universal rights. However, if we were to pick a first, it would be the open source movement.
Open source emerged from the ambiguity over whether computer software should be protected by copyright law. In the early stages of computing, most computer languages and programs were built in academic institutions like universities. When computing became more mainstream, companies turned to copyright to protect their profits. At the same time, the values of collaboration fostered in academic institutions persisted, and those who believed in them sought to promote a more open culture. This led to the creation of the free software movement in the 1980s, which ran parallel to the for-profit software boom that accompanied the rise of personal computing.
Today, the movement has spawned many similar movements, such as:
- open access, which promotes access to research and academic publications free of cost and access barriers;
- open data and education, which promote transparency of data and free access to education;
- and others.
The open movement is a growing and thriving network of organizations that promote sharing, collaboration and free access to resources.
Why does this matter? Well, these open movements provide us with tools to unleash our creativity, expand our knowledge, and participate fully in the culture that surrounds us.
Below you’ll find links to a variety of open initiatives, from open data in our governmental institutions, to open science, open genomes and open software.
Open data is, in a way, the catch-all phrase for open movements. It rests on the idea that some data should be available to everyone to access, use and modify for different purposes. The rise of open data is a byproduct of both the World Wide Web and the growing importance of intellectual property rights. Open data is prevalent within and accessible from public institutions. Other sources include non-profits and organizations that promote open science.
- Open Government Portal - Government of Canada
- Open Government Analytics - Government of Canada
- City of Toronto Open Data Catalog
- Toronto Public Library Open Data Site
Open access refers to a set of principles and research practices through which research is distributed online free of charge and other access barriers. Open access research and scholarly work are often distributed under a free license. A free license is a license agreement that confers rights on individuals to reuse, redistribute and improve the work.
- Directory of Open Access Journals (DOAJ): A wide index of online open access, peer-reviewed journals with full-text access open to all. You can access the directory through the Toronto Public Library website.
- T-space: University of Toronto's repository of free and open-access scholarly publications.
- Frontiers: Frontiers Media is a publisher of peer-reviewed open-access journals in science, technology and medicine. It also supports the Initiative for Open Citations.
The open source movement is a growing network of organizations and initiatives that promote freely accessible, usable and modifiable code and software.
Here are two great resources to get started with open source projects:
- Creative Commons Open Source Projects: Find a list of open source projects from the Creative Commons organization. A non-profit international network, Creative Commons issues licenses that make content freely available for use and modification.
- GitHub Collections: GitHub is one of the largest open source hubs. Here you can find free collections on how to code, as well as more advanced projects to both learn from and participate in.
Open Education & Content
Open education refers to educational resources and programs that do away with academic admission requirements. As such, they broaden access to learning and training beyond the barriers of formal education systems. Open content, on the other hand, refers to any creative work, work of art, or functional work that is free to access, use, study and modify. Open education and content resources are often found online through various initiatives and networks like the ones below:
- Open Educational Resources Commons: A digital public library of open educational resources from kindergarten to post-secondary.
- MIT OpenCourseWare: A web repository of completely free and open access MIT courses focusing on STEM (Science, Technology, Engineering & Math).
- OpenLearn: A free repository of open courses by the UK-based Open University. You can find free courses here on a diverse range of subjects.
- Nina Paley's Sita Sings the Blues & Other Artwork: A champion of free cultural works, Nina Paley has made her animation art available to all by placing it in the public domain.
Like the other open movements, open science seeks to make scientific research accessible to all, for both professional and amateur consumption and use. The open science movement is developed and shared through a variety of collaborative networks.
Some open science initiatives include:
- The International Genome Sample Resource - home of the 1000 Genomes Project, which ran from 2008 to 2015 and created the largest public catalogue of human variation and genotype data.
- The Encyclopedia of Life - an open online encyclopedia that documents life-forms on Earth, with the aim of documenting all 1.9 million known species.
- The Monarch Initiative - an initiative that connects phenotypes to genotypes in an effort to combat genetic diseases.
The sequencing of the human genome, a feat accomplished by the Human Genome Project in 2003, was a landmark case of collaborative science. It mapped and sequenced roughly 3 billion base pairs of the human genome from a selected sample. But the promise of the Human Genome Project cannot be fulfilled without greater sharing of larger samples of genomic, phenotypic and medical data. The projects below make participation and access to genomic data possible:
- The Harvard Personal Genome Project: Allows participants to donate their genomic data and provides access to a variety of sequenced genomes and genomic data.
- UCSC Genome Browser: Allows you to interactively browse and visualize genomic data across the spectrum of biodiversity.
The Human Connectome Project, launched in 2009, aims to build a network map of functional and anatomical activity of the human brain through brain neuroimaging. The body of data is open to all researchers and seeks, among other things, to facilitate research into brain disorders and diseases.
- The Human Connectome Project: Like the Human Genome Project, this collective project provides access to studies and processed data related to structural brain imaging.
- Processed fMRI Data: Averaged structural and tfMRI data from 1200 subjects, free to download. Free registration and login are required.
O'Reilly Learning has immediately accessible ebooks and video tutorials from major tech and business publishers. Find below some resources on open data:
- The Open Data Science Platform
- Open Data in the Web
- Leveraging Machine Learning & Open Data
- Private and Open Data in Asia: A Regional Guide
- Institutional Repositories
Lynda.com is a video tutorial platform with a great number of learning paths and standalone videos to get you started.
If you're interested in open data, be sure to check out our Innovator in Residence's Open Data program on April 8. You may also be interested in a whole suite of programs on data privacy by our Innovator in Residence between March and April.
Do you know of any open projects that you'd like to share? Please comment below.