Getting IPython Notebook to Run “Correctly” in Mac OS X 10.8

I’m going to keep this post brief so that the steps are clear.  I wanted to get IPython Notebook, a powerful tool for data analysis, to run with plotting and pandas in Mac OS X 10.8.  When I first tried, I ran into errors caused by conflicts between 32-bit and 64-bit installations of different packages.  After a good deal of trial and error, I found that the following steps produced a full IPython Notebook environment with pandas and Matplotlib working correctly.

Quick and Simple Python Web Server

 

If you ever need to get a quick web server up and running, this one-line Python command will do wonders.  Launch it from the directory that your files are in, then open your favorite browser and go to http://localhost:8000/ (8000 is the server's default port) and voila!  A simple web server.

One-liner (Python 2): python -m SimpleHTTPServer
On Python 3 the module was renamed, so the equivalent is: python3 -m http.server
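
If you would rather start the same kind of server from inside a script, here is a minimal sketch using Python 3's standard http.server module (Python 3.7 or later).  The helper names make_server and serve_in_background are my own, chosen for illustration, and the sketch binds to 127.0.0.1 only, which is an assumption about how you want to expose it.

```python
import functools
import http.server
import threading


def make_server(directory=".", port=8000):
    """Build an HTTP server that serves the files under `directory`.

    Passing port=0 asks the operating system for any free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to the loopback interface only (an assumption of this sketch).
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve_in_background(server):
    """Run the server in a daemon thread so the caller is not blocked."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread


# Typical use (commented out so importing this file does not start a server):
# server = make_server(".", 8000)
# server.serve_forever()   # stop with Ctrl-C
```

Like the one-liner, this is meant for quick local sharing and testing rather than for production use.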

R Quick Tip: Use %in% to filter a data frame.

Working with R, I was looking for a way to easily subset my data based on a sequence of numbers.  After initially writing a for loop and using rbind (terrible to do in R!), I finally found a way to do this efficiently.  The %in% operator tests whether each value on its left appears in the vector on its right, so you can use it as a filter in the subset command to keep only the rows that match your sequence.  Enjoy!

# Generate sample data to test with: 100 IDs, each with a random score from 0 to 100.
sample_data <- data.frame(ID=seq(1,100,1),
                          Score=sample(0:100,100,replace=TRUE))
summary(sample_data)

# Plot the scores, see that there is a score for each id.
plot(sample_data$Score~sample_data$ID)

# Create a filter to apply.
look_at <- seq(1,100,10)

# Filter the sample data by look_at using the %in% operator.
subset_data <- subset(sample_data, ID %in% look_at)

# Plot the scores, note the filtered data.
plot(subset_data$Score~subset_data$ID)

Bash Scripts, Python, SSH and Screen: Keeping Your Jobs Alive!

I recently ran into an interesting situation that required me to run a Python script repeatedly with different inputs on a remote server.  Of course, with any SSH session, there is always the possibility of a timeout which would kill any running jobs.  Normally, I would simply deploy a program and use an & at the end of the command, allowing the job to run in the background even after I logged out of my SSH session.  Seeing that I had multiple scripts to run, and could simply adjust my inputs with a for loop, I created a bash script that repeatedly called my Python code.  This was pretty straightforward and I deployed the script with an & before logging out of my SSH session to let the job complete.
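
The loop described above can also be sketched in Python rather than bash.  This is only an illustration under stated assumptions: the script name experiment.py and the idea that each input is passed as a single command-line argument are hypothetical, not the actual setup from my runs.

```python
import subprocess
import sys


def build_commands(script, inputs):
    """Return one command (as an argument list) per input value."""
    return [[sys.executable, script, str(value)] for value in inputs]


def run_all(commands):
    """Run the commands one after another and collect their exit codes."""
    return [subprocess.run(cmd).returncode for cmd in commands]


# Hypothetical usage: run experiment.py once for each input from 1 to 5.
# exit_codes = run_all(build_commands("experiment.py", range(1, 6)))
```

Note that this driver only replaces the for loop.  It does nothing by itself to protect the jobs from an SSH timeout.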

Starting Quirks with Pandas from an R Junkie

Okay, okay, the title might be a little sensationalised.  I have been using the R statistics package to process the results of evolutionary runs since beginning my PhD 2 years ago.  In that time, I have become familiar with the basic process of importing data, computing basic population statistics (mean, confidence intervals, etc.), and plotting using ggplot.  I’ve always felt that I could streamline the process, though, as I perform a great deal of preprocessing using Python.  This typically involves combining multiple replicate runs into one data file and possibly even doing some basic statistics using the built-in functionality of Python.

Escaping the Spreadsheet Mentality: Start with the Right Data Format

Growing up on a healthy diet of Microsoft Office products, I am well versed in Word, Excel and PowerPoint.  As I have transitioned into the research world, these products still have their place.  However, I sometimes find that the habits I developed for organizing data don’t necessarily transfer to statistical analysis.  Recently, I ran into a situation where I was evaluating the performance of solutions in multiple different environments.  Organizing this data appeared straightforward to me at first: I would simply put the results for the different environments into one row, grouped by the id of the individual.  My data then looked something like this:

Generation  Environment 1  Environment 2  Environment 3
1           10.3           12.1           8.2
2           14.1           10.2           7.4
3           8.6            13.4           10.2
4           9.8            11.2           9.3

Artificial Neural Network bowl of spaghetti representation.

Adventures in Visualization: Understanding Artificial Neural Networks Pt. 1

In the field of evolutionary robotics, artificial neural networks (ANNs) are an intriguing control strategy that attempts to replicate the functionality of natural brains.  These networks are essentially directed graphs, possibly containing cycles, composed of nodes that each contain a mathematical function and are connected by weighted edges.  Inputs correspond to information that may be useful to a robot, such as orientation, speed, goal conditions, etc.  That information is propagated through the weighted edges to arrive at a set of outputs that direct motor movements or sensor readings.  Unfortunately, the size and complexity of these networks can grow rapidly when anything but the simplest tasks are attempted, which makes it very challenging to work out what processes and information the ANN is using to control the robot.  I’ll save the long description of ANNs, but for an idea of what they can do, the following video features an ANN controlling a swimming robot in a simulated flow.
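
To make the "nodes connected by weighted edges" idea concrete, here is a minimal sketch of how values might be propagated through a tiny network.  Everything in it is an assumption made for illustration: the node names, the weights, the tanh node function, and the restriction to a network without cycles are mine, and none of it is taken from the controllers used in my experiments.

```python
import math


def activate(order, edges, inputs, activation=math.tanh):
    """Propagate input values through a feed-forward network.

    order      -- non-input node names, listed so that every node comes
                  after all of the nodes that feed into it
    edges      -- dict mapping (source, destination) to an edge weight
    inputs     -- dict mapping input node names to their values
    activation -- the mathematical function applied at each node
    """
    values = dict(inputs)
    for node in order:
        # Weighted sum of everything arriving at this node.
        total = sum(
            weight * values[src]
            for (src, dst), weight in edges.items()
            if dst == node
        )
        values[node] = activation(total)
    return values


# Hypothetical three-edge network: two sensor inputs, one hidden node,
# and one motor output.
edges = {
    ("speed", "hidden"): 1.0,
    ("orientation", "hidden"): 0.5,
    ("hidden", "motor"): 2.0,
}
result = activate(["hidden", "motor"], edges, {"speed": 0.5, "orientation": -1.0})
print(result["motor"])
```

Even in this toy case, reading the behaviour off the edge list takes some effort, which hints at why larger networks are so hard to interpret.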

WebGL for Scientific Visualization

I plan to flesh this out into a full-fledged blog post in the future.  For now, this page contains links that complement my presentation at the Visualization Workshop during BEACON Congress 2013.

First and foremost, what is WebGL and why should I use it?  In short, WebGL is a JavaScript API for creating 2D and 3D graphics that runs in a modern web browser.  Instead of building a separate executable for each system (Windows, OS X, Linux, mobile), you can create a single implementation in JavaScript and place it on a webpage.  Users can then access that site from any operating system and see the simulation without the headaches associated with trying to install packages on a system.

Conway's Game of Life

Coupled with supplementary code, WebGL can be used to create interactive demos or even present simulations directly in a web browser.  Conway’s Game of Life has been implemented many times in WebGL, but I have found this one to be particularly interesting.  Additional demos for games, simulations and scientific visualizations can be found at http://www.chromeexperiments.com/webgl/.  Disclaimer: Try not to spend too much time on the site!
