
Big Data: Datasets

 

Below is a list of cross-disciplinary and single-discipline data repositories, data collections and data search engines. Bear in mind that the Internet is not permanent, so websites and pages may be here today and gone tomorrow. If you're a DBS student or staff member who notices a link that doesn't work, or who wishes to suggest a dataset source, please contact the library. Unsolicited suggestions from anyone other than DBS staff and students will not be considered. Last updated 30/07/2018

 

Download and share scientific data from a variety of disciplines

Open dataset search engine

Index to publicly available web services and XML data sources that are provided by the US government

A centralized repository of public datasets stored on the Amazon cloud

Detailed information on more than 2,000 research data repositories

Datasets for Data Mining, Analytics and Knowledge Discovery

A place to discover and share high-quality datasets

A list of data repositories

"The world’s broadest collection of public data"

2600+ Open Data Portals Around the World

Over 10,000 datasets ready to use

Data and code behind the articles and graphics at FiveThirtyEight.com

List of (lists of) free datasets

Access to digital Humanities and Social Sciences data.

A crowd-sourced community effort to extract structured content from the information created in various Wikimedia projects.

A collection of about 800 time series. Data can be searched, exported and read directly into R

2600+ Open Data portals around the world

 

Historical market data for stocks, bonds, commodities and currencies across global markets

440 datasets relevant to the machine learning community

Community Resource for Archiving Wireless Data At Dartmouth: wireless trace data from many contributing locations

Direct access in open, accessible, and machine-readable formats to the official data from the Centers for Medicare & Medicaid Services (CMS)

Free and open access to global development data

Provides access to data, from several censuses and surveys, about the United States, Puerto Rico and the Island Areas.

A repository of human brain imaging data collected using MRI and EEG techniques. It has been accepting data since 2010

Direct links to 1-gram through 5-gram data for all Google Books language corpora

A collection of more than 50 large network datasets from social networks, web graphs, road networks, internet networks, citation networks, collaboration networks, and communication networks

A repository of ecological datasets

Surface temperature time series datasets

Public data and forecasts from a range of international organizations and institutions who use Google to host their datasets

Data and metadata for OECD countries and selected non-member economies

The IMDB Movies Dataset contains information about 14,762 movies. The data was preprocessed and cleaned to be ready for machine learning applications.

The Clean Energy Information Portal from REEEP is an information database for web developers, offering a series of datasets for re-use

Central repository for data on the interactions between students and educational software with a suite of tools to analyze that data

Data about the prevalence of mental disorders, impairments associated with these disorders, and their treatment patterns from representative samples of majority and minority adult populations in the United States

Environmental data and information gathered by the UN's environmental information centres

Key energy information sources from Saudi Arabia, GCC, India, China and East Africa as well as selected global energy agencies and institutes.

 

Access to open data and information about Austin city government.

Access to open data and information about Chicago city government.

Find data published by UK central government, local authorities and public bodies

Datasets, documents, services, tools and applications collected by the Indian government for public use

Publicly available data generated by the city of Seattle

Publication of Irish Public Sector data in open, free and reusable formats

Hundreds of datasets from the City and County of San Francisco

Time series datasets compiled by the US Bureau of Labor Statistics

Data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more

Data and statistics generated by research performed by the FAA

Access to open data published by EU institutions and bodies

508,000 US and international economic time series datasets

The information that is produced and used by New York City government

Open datasets from the Portuguese government

Open datasets from the city of Berlin

Open datasets from the provincial government of British Columbia

Open data portal of the Italian Government

Open data portal of the Italian province of Trentino

Open platform for French public data

 
