Content: References, Learning Guides, Etc. / Intro Guide

Bayesian machine learning

From FastML: “So you know the Bayes rule. How does it relate to machine learning? It can be quite difficult to grasp how the puzzle pieces fit together – we know it took us a while. This article is an introduction we wish we had back then.” This article covers the following topics: Bayesians and Frequentists; Priors, updates, and posteriors; Inferring model parameters from data; Model vs inference; Statistical modelling …
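To make "priors, updates, and posteriors" concrete, here is a minimal sketch of a Bayesian update. It is not taken from the FastML article: it assumes a coin-flip style problem with a Beta prior over the success probability, and the function names and the observed counts are made up for illustration.

```python
from fractions import Fraction


def update_beta(alpha, beta, successes, failures):
    """Conjugate update: a Beta(alpha, beta) prior combined with
    binomial observations gives a Beta(alpha + s, beta + f) posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    return Fraction(alpha, alpha + beta)


# Assumption: a uniform Beta(1, 1) prior and 7 successes in 10 trials.
prior = (1, 1)
posterior = update_beta(*prior, successes=7, failures=3)

print("prior mean:", beta_mean(*prior))          # 1/2
print("posterior:", posterior)                   # (8, 4)
print("posterior mean:", beta_mean(*posterior))  # 2/3
```

The posterior mean (2/3) sits between the prior mean (1/2) and the observed frequency (7/10), which is the usual picture of data pulling a prior belief toward the evidence.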

Getting Started in Open Source

From Rebecca Bilbro: “The phrase ‘open source’ evokes an egalitarian, welcoming niche where programmers can work together towards a common purpose–creating software to be freely available to the public in a community that sees contribution as its own reward. But for data scientists who are just entering into the open source milieu, it can sometimes feel like an intimidating place…. And yet, open source development does have a lot going …

R for Excel Users

This post covers why and how to switch from Excel to R for managing data and undertaking analysis. It covers the following topics: Four Fundamental Differences Between R and Excel; Example: Joining two tables together; Iteration; Generalizing through functions. “Excel users have a strong mental model of how data analysis works, and this makes learning to program more difficult. However, learning to program will allow you to do things …

Getting Started with tidyverse in R

From Storybench: “The tidyverse is a collection of R packages developed by RStudio’s chief scientist Hadley Wickham. These packages work well together as part of a larger data analysis pipeline. To learn more about these tools and how they work together, read R for data science…. The following tutorial will introduce some basic functions in tidyverse for structuring and analyzing datasets.”

Quick-R

From Quick-R: “R is an elegant and comprehensive statistical and graphical programming language. Unfortunately, it can also have a steep learning curve. I created this website for both current R users, and experienced users of other statistical packages…who would like to transition into R. My goal is to help you quickly access this language in your work. I assume that you are already familiar with the statistical methods covered and …

R Tutorial

This series of text-based tutorials covers a variety of introductory topics in R: Input; Basic Data Types; Basic Operations and Numerical Descriptions; Basic Probability Distributions; Basic Plots; Intermediate Plotting; Indexing Into Vectors; Linear Least Squares Regression; Calculating Confidence Intervals; Calculating p Values; Calculating The Power Of A Test; Two Way Tables; Data Management; Time Data Types; Introduction to Programming; Object Oriented Programming; Case Study: Working Through a HW Problem; Case Study II: …

What Every Data Scientist Needs to Know about SQL

In this series of posts, I will provide a broad overview of the key SQL topics required to successfully work with databases to do effective data science.

Introduction to SQL for Data Scientists

A PDF document explaining basic SQL features such as joins, aggregate functions, and subqueries.
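For readers who have not yet seen joins, aggregate functions, and subqueries, the sketch below shows one of each. It is not drawn from the PDF: the `customers` and `orders` tables, their columns, and the sample rows are hypothetical, and the queries run against an in-memory SQLite database through Python's standard `sqlite3` module.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 30.0), (3, 2, 100.0);
""")

# Join + aggregate: total order amount per customer who has orders.
totals = conn.execute("""
SELECT c.name, SUM(o.amount) AS total
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
""").fetchall()

# Subquery: customers whose total exceeds the average order amount.
above_avg = conn.execute("""
SELECT c.name
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name
HAVING SUM(o.amount) > (SELECT AVG(amount) FROM orders)
ORDER BY c.name
""").fetchall()

print(totals)     # [('Ada', 50.0), ('Grace', 100.0)]
print(above_avg)  # [('Grace',)]
conn.close()
```

Linus has no orders, so the inner join drops him from both results; a `LEFT JOIN` would keep him with a `NULL` total.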