I’m considering adding a “How To” section to the DataSciGuide Data Science Learning Directory. Below is an example of the simplest type of post that would be included. Would a section of the site for posts like this be useful to you? Let me know in the comments or on Twitter!

---

## How to: Import a CSV file for Analysis

### With Python

Use the CSV module to import the CSV, then loop to display the data in each row:

```python
import csv

# Open the file and print each row as a list of strings
with open('attendees1.csv', newline='') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        print(row)
```

Source: NewCircle blog

Descriptive statistics of an array can be calculated using SciPy: SciPy.org

Find more information about file input/output in Python in the Python Cookbook from O’Reilly, including a free chapter online here: Python Cookbook Chapter 6
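
To connect the import step to a few basic descriptive statistics, here is a minimal sketch that uses only the standard library (`csv.DictReader` and the `statistics` module) in place of SciPy. The sample data, the `age` column, and the `column_stats` helper are hypothetical and exist only for illustration.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE_CSV = """name,age
Ann,34
Bob,29
Cy,41
"""


def column_stats(csv_file, column):
    """Return count, mean, min, and max for one numeric column."""
    reader = csv.DictReader(csv_file)  # the first row supplies the field names
    values = [float(row[column]) for row in reader]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


stats = column_stats(io.StringIO(SAMPLE_CSV), "age")
print(stats)
```

With a real file, the same helper should work if you pass it an open file object, for example `open('attendees1.csv', newline='')`, along with the name of a numeric column in that file.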

### With Python and Pandas

Import the CSV into a Pandas DataFrame, then show summary statistics:

```python
import pandas

# Load the CSV into a DataFrame and print summary statistics
data_df = pandas.read_csv('in_data.csv')
print(data_df.describe())
```

Sources: Spectraldifferences, StackOverflow

Additional descriptive statistics using Pandas: Chris Albon

More about manipulating data with Pandas can be found in the Python for Data Analysis book.
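
As a rough illustration of what `describe()` returns, the sketch below reads a small in-memory CSV and then pulls individual statistics out of the result. The sample data and its column names are made up for this example; `io.StringIO` stands in for a file path.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for 'in_data.csv'.
SAMPLE_CSV = """name,age,score
Ann,34,88.5
Bob,29,92.0
Cy,41,79.5
"""

data_df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# describe() summarizes numeric columns only by default,
# so the text column 'name' is left out of the result.
summary = data_df.describe()
print(summary)

# Individual statistics are also available as methods on a column.
median_age = data_df["age"].median()
print(median_age)
```

The summary is itself a DataFrame, so a single value can be looked up by row and column label, for example `summary.loc["mean", "score"]`.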

### With R

Import CSV and display summary statistics:

```r
# Read the CSV into a data frame, treating the first row as column names
heisenberg <- read.csv(file="simple.csv", header=TRUE, sep=",")
summary(heisenberg)
```

Source: Cyclismo

More about importing data and computing statistics in R can be found in the Statistics (The Easier Way) with R book.

Please comment if you find any errors in this “how-to” or if you know another way to import CSVs for analysis!
