Statistics and Probability
Courses / Curricula / References, Learning Guides, Etc. / Self-paced online course
A list of Khan Academy courses covering the following topics: Introduction to statistics; Analyzing categorical data; Displaying and comparing quantitative data; Summarizing quantitative data; Modeling data distributions; Exploring bivariate numerical data; Study design; Probability; Counting, permutations, and combinations; Random variables; Sampling distributions; One-sample confidence intervals; One-sample z and t significance tests; Two-sample inference for the difference between groups; Inference for categorical data (chi-square tests); Advanced regression (inference and …
R Reference Card for Data Mining
Cheat Sheets / References, Learning Guides, Etc.
This cheat sheet covers the following aspects of data mining in R: Association Rules & Frequent Itemsets; Sequential Patterns; Classification & Prediction; Regression; Clustering; Outlier Detection; Time Series Analysis; Text Mining; Social Network Analysis and Graph Mining; Spatial Data Analysis; Statistics; Graphics; Data Manipulation; Data Access; Big Data; Parallel Computing; Generating Reports; Interface to Weka; Editors/GUIs; Other R Reference Cards; RDataMining Website, Package, Twitter & Groups.
Intro to Inferential Statistics
Courses / Self-paced online course
From Udacity: “Inferential statistics allows us to draw conclusions from data that might not be immediately obvious. This course focuses on enhancing your ability to develop hypotheses and use common tests such as t-tests, ANOVA tests, and regression to validate your claims.” The course consists of seven lessons: Estimation; Hypothesis Testing; t-tests; ANOVA; Correlation; Regression; Chi-squared Tests. This course is a preliminary step towards the Udacity Data Analyst Nanodegree Program, designed to …
DataCamp – Correlation and Regression
Courses / Interactive tutorial style course
From DataCamp: “Ultimately, data analysis is about understanding relationships among variables. Exploring data with multiple variables requires new, more complex tools, but enables a richer set of comparisons. In this course, you will learn how to describe relationships between two numerical quantities. You will characterize these relationships graphically, in the form of summary statistics, and through simple linear regression models.” This course consists of five chapters: Correlation and regression Correlation Simple …
DataCamp – Introduction to Machine Learning
Courses / Interactive tutorial style course
From DataCamp: “This online machine learning course is perfect for those who have a solid basis in R and statistics, but are complete beginners with machine learning. After a broad overview of the discipline’s most common techniques and applications, you’ll gain more insight into the assessment and training of different machine learning models. The rest of the course is dedicated to a first reconnaissance with three of the most basic machine …
IEEE International Conference on Data Mining
Conferences / Gatherings and Organizations
Next event: November 18-21, 2017 – New Orleans, LA. From ICDM: “The IEEE International Conference on Data Mining series (ICDM) has established itself as the world’s premier research conference in data mining. It provides an international forum for presentation of original research results, as well as exchange and dissemination of innovative, practical development experiences. The conference covers all aspects of data mining, including algorithms, software and systems, and applications.”
AnacondaCON
Conferences / Gatherings and Organizations
Next event: April 8-11, 2018 – Austin, TX. From AnacondaCON: “AnacondaCON is the place to fast track your current knowledge of Open Data Science, through engagement with visionaries who have established modern Open Data Science technology and are pioneering its evolution. You’ll learn best practices and how other thought leaders are leveraging Anaconda to accelerate the value from their data.” Highlights from AnacondaCON 2017 …
Microsoft Azure Machine Learning Cheat Sheet
Cheat Sheets / References, Learning Guides, Etc.
The Microsoft Azure Machine Learning Algorithm Cheat Sheet helps you choose the right machine learning algorithm for your predictive analytics solutions from the Microsoft Azure Machine Learning library of algorithms.
The Elements of Statistical Learning
Books / physical books or multiple formats
This book describes the important ideas in statistics, data mining, machine learning, and bioinformatics in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics.
Data Science for Business: What you need to know about data mining and data-analytic thinking
Books / physical books or multiple formats
Learn how to improve communication between business stakeholders and data scientists, and how to participate intelligently in your company’s data science projects. Discover how to think data-analytically and how to support business decision-making with data science methods.