Tag Archives: statistics

Getting iPython Notebook to Run “Correctly” in Mac OS X 10.8

I’m going to keep this post brief so that the steps stay clear. I wanted to get iPython Notebook, a powerful tool for data analysis, running with plotting and pandas in Mac OS X 10.8. When I first tried, I kept hitting errors caused by conflicts between 32-bit and 64-bit installations of different packages. After a good deal of trial and error, I found that the following steps produced a full iPython Notebook environment with pandas and Matplotlib working correctly.

Continue reading →

Starting Quirks with Pandas from an R Junkie


Okay, okay, the title might be a little sensationalised. I have been using the R statistics package to process the results of evolutionary runs since beginning my PhD two years ago. In that time, I have become familiar with the basic process of importing data, computing basic population statistics (means, confidence intervals, etc.), and plotting with ggplot. I’ve always felt that I could streamline the process, though, as I perform a great deal of preprocessing in Python. This typically involves combining multiple replicate runs into one data file and possibly even computing some basic statistics using Python’s built-in functionality.
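As a rough illustration of that kind of preprocessing, here is a minimal sketch using only the Python standard library. The column names (`generation`, `fitness`), the replicate contents, and the helper names are all hypothetical, made up for the example rather than taken from my actual runs, and the 95% interval uses a simple normal approximation.

```python
import csv
import io
import statistics

# Hypothetical replicate outputs: one CSV per evolutionary run (made-up values).
replicate_files = {
    1: "generation,fitness\n1,10.0\n2,12.0\n",
    2: "generation,fitness\n1,14.0\n2,16.0\n",
    3: "generation,fitness\n1,12.0\n2,14.0\n",
}


def combine_replicates(files):
    """Merge per-replicate CSV text into one list of rows tagged with the replicate id."""
    rows = []
    for replicate, text in files.items():
        for record in csv.DictReader(io.StringIO(text)):
            rows.append({
                "replicate": replicate,
                "generation": int(record["generation"]),
                "fitness": float(record["fitness"]),
            })
    return rows


def summarize(rows):
    """Return {generation: (mean, ci_half_width)} with a normal-approximation 95% interval."""
    by_generation = {}
    for row in rows:
        by_generation.setdefault(row["generation"], []).append(row["fitness"])
    summary = {}
    for generation, values in sorted(by_generation.items()):
        mean = statistics.mean(values)
        # Standard error of the mean; the 1.96 multiplier assumes approximate normality.
        sem = statistics.stdev(values) / len(values) ** 0.5
        summary[generation] = (mean, 1.96 * sem)
    return summary


combined = combine_replicates(replicate_files)
summary = summarize(combined)
for generation, (mean, half_width) in summary.items():
    print(f"generation {generation}: mean={mean:.2f} +/- {half_width:.2f}")
```

With only a handful of replicates, a t-based interval would be more appropriate than the 1.96 multiplier; the normal approximation is used here just to keep the sketch dependency-free.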

Continue reading →

Escaping the Spreadsheet Mentality: Start with the Right Data Format


Growing up on a healthy diet of Microsoft Office products, I am well versed in Word, Excel and PowerPoint. As I have transitioned into the research world, these products still have their place; however, I sometimes find that the habits I developed for organizing data don’t necessarily transfer to statistical analysis. Recently, I ran into a situation where I was evaluating the performance of solutions in multiple different environments. Organizing this data appeared straightforward to me at first: I would simply put the different environments into one row, grouped by the id of the individual. My data then looked something like this:

Generation	Environment 1	Environment 2	Environment 3
1	10.3	12.1	8.2
2	14.1	10.2	7.4
3	8.6	13.4	10.2
4	9.8	11.2	9.3
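For comparison, one common alternative to this wide layout is a "long" layout with one observation per row. The sketch below reshapes the table above using only the standard library; the output column names `Environment` and `Value` and the helper `wide_to_long` are assumptions chosen for illustration, not necessarily the format the full post settles on.

```python
import csv
import io

# The wide table from above, tab-separated.
wide_text = (
    "Generation\tEnvironment 1\tEnvironment 2\tEnvironment 3\n"
    "1\t10.3\t12.1\t8.2\n"
    "2\t14.1\t10.2\t7.4\n"
    "3\t8.6\t13.4\t10.2\n"
    "4\t9.8\t11.2\t9.3\n"
)


def wide_to_long(text, id_column="Generation"):
    """Turn each (row, environment column) pair into its own record."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    long_rows = []
    for record in reader:
        for column, value in record.items():
            if column == id_column:
                continue  # the id is repeated on every output row instead
            long_rows.append({
                id_column: int(record[id_column]),
                "Environment": column,
                "Value": float(value),
            })
    return long_rows


long_rows = wide_to_long(wide_text)
for row in long_rows[:3]:
    print(row)
```

Each of the 4 wide rows becomes 3 long rows, so the environment turns into an ordinary column that can be grouped or filtered on.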

Continue reading →

Jared M Moore

Associate Professor and Associate Dean of the College of Computing @ Grand Valley State University
