
Big Data: Web Resources

originally posted at (sharing policy)

Before you look at a web page, remember that on the Internet nobody knows you're a dog.  Anyone can write anything on the Internet and it doesn't have to be true.  There are a couple of tools for assessing the quality of websites, one of which is the CRAAP test: a list of criteria that you can use to assess the credibility of a source.  If you're doing research on the Internet, consider the following:

Currency:  when was the information posted, has it been revised or updated, does it reflect current knowledge, do the links work?

Relevance: who is the page aimed at, is the information at an appropriate level, how does it compare to other sources on the same topic, would you be happy to use this as a source if you were writing an assignment/dissertation?

Authority: who wrote the page, do they have any qualifications and are they relevant to the topic, is there contact information, what is the top level domain, e.g. .com (companies), .gov (government), .org (non-profit organisation), .edu/.ac (educational)?

Accuracy: what is the source of the information, is it evidence-based, what kind of language is used, what is the tone of the page, is it free from grammatical or spelling errors, can you verify any of the information independently?

Purpose: why does the page exist, is the author trying to inform, persuade, entertain or sell you something, are the intentions clear, is it objective, are there political, cultural or other biases?
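If you are checking a long list of links, the top level domain hint from the Authority criterion can be roughly automated. The short Python sketch below is only an illustration: the function name tld_hint and the table TLD_HINTS are made up for this example, the categories are the ones listed above, and a domain ending is never proof of quality on its own.

```python
from urllib.parse import urlparse

# Categories as listed under the Authority criterion; anything else is "unknown".
TLD_HINTS = {
    "com": "company",
    "gov": "government",
    "org": "non-profit organisation",
    "edu": "educational",
    "ac": "educational",
}


def tld_hint(url: str) -> str:
    """Return a rough category for a site based on the end of its host name.

    The URL must include a scheme (e.g. https://), otherwise the host
    cannot be extracted and "unknown" is returned.
    """
    host = urlparse(url).hostname or ""
    labels = host.lower().split(".")
    # Look only at the last two labels, and never at the first one, so that
    # "example.ac.uk" matches "ac" but "gov.example.net" does not match "gov".
    candidates = labels[1:][-2:]
    for label in reversed(candidates):
        if label in TLD_HINTS:
            return TLD_HINTS[label]
    return "unknown"


if __name__ == "__main__":
    for link in ["https://www.example.ac.uk/page", "https://example.org"]:
        print(link, "->", tld_hint(link))
```

Treat the result as one small clue to weigh alongside the other four criteria, not as a verdict on the source.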

Some good Data Science related websites. If you find any broken links, or are a DBS staff member or student who wants to suggest a site for inclusion, please contact the library.  If you are neither staff nor student, your unsolicited suggestions will be ignored. Last updated 04/09/2018


Kanopy - Online streaming service

Silicon Republic: Big Data News

News on Big Data topics from Silicon Republic

Women in Big Data

Aiming to strengthen diversity in the big data field by encouraging and attracting more women into the area

Reddit: Big Data

Everything big data, from storage to predictive analytics


Datafloq offers information, insights and opportunities to drive innovation with big data, blockchain and artificial intelligence

Online videos relating to Big Data that are available through the Library's subscription to Kanopy.

Six Provocations for Big Data

A thought-provoking conference paper which poses some interesting questions

Cambridge Big Data

Brings together researchers from across the University to address challenges presented by our access to unprecedented volumes of data. In parallel, researching important issues around law, ethics and economics

Data Science Central

The industry's online resource for big data practitioners


An online trade journal that provides centralised education & resources in data management

Association for the Advancement of Artificial Intelligence (AAAI)

A nonprofit scientific society devoted to advancing the scientific understanding of the mechanisms underlying thought and intelligent behaviour and their embodiment in machines

An introduction to Apache Hadoop for big data

Hadoop is an open source software framework for the storage and large-scale processing of data sets

Planet Data

An aggregator of blogs about big data, Hadoop, and related topics

Publicly Available Big Data Sets

A list of publicly available big data sets including pointers as to where to find more


The homepage for R. R is a free software environment for statistical computing and graphics

R Cheatsheets

Cheat sheets that make learning and using R simpler

12 Websites Every Data Analyst Should Follow

A list of "must-follow forums, data analytics blogs, and resource centers"

Centre for Applied Data Analytics (CeADAR)

Industry prototypes and demonstrators, along with state-of-the-art reviews of data analytics technology, tools, best practice methodologies and processes.

International Institute for Analytics

An independent research firm that works with organizations to build strong and competitive analytics programs.

The 9 Best Languages For Crunching Data

Data analysts talk about their favorite languages and tool kits for hardcore data analysis.

25+ websites to find datasets for data science projects

A list of websites & resources that provide datasets for data science projects


Website that focuses on data analysis in politics, economics, and sports

ACM Special Interest Group on Knowledge Discovery and Data Mining (ACM SIG KDD)

SIGKDD aims to provide the premier forum for advancement and adoption of the "science" of knowledge discovery and data mining

Content Mine

Text/data mining tool

50 selected papers in Data Mining and Machine Learning

50 full-text articles on data mining and machine learning

Data Mining

Paper exploring many aspects of data mining including privacy

33 Good Data Mining tools

A list of some decent data mining tools

The National Centre for Text Mining (NaCTeM)

The first publicly-funded text mining centre in the world


