Stephanie J. Spielman, Ph.D.


About Me


I am a Research Assistant Professor in the Institute for Genomics and Evolutionary Medicine at Temple University, in Sergei L. Kosakovsky Pond’s lab. I received my Ph.D. from the Ecology, Evolution, and Behavior graduate program at UT Austin, through Claus Wilke’s lab. My research interests broadly encompass computational molecular evolution, phylogenetics, and comparative genomics. In addition to research, I have a strong interest in teaching and undergraduate education.

stephanie.spielman@gmail.com

Publications

2017

EL Jackson, SJ Spielman, and CO Wilke. 2017. Computational prediction of the tolerance to amino-acid deletion in green-fluorescent protein. PLOS ONE 12(4): e0164905.

Z Kadlecova, SJ Spielman, D Loerke, A Mohanakrishnan, DK Reed, and SL Schmid. 2017. Regulation of clathrin-mediated endocytosis by hierarchical allosteric activation of AP2. J Cell Biol 216(1): 167–179.

2016

SJ Spielman, S Wan, and CO Wilke. 2016. A comparison of one-rate and two-rate inference frameworks for site-specific dN/dS estimation. Genetics 204(2): 499–511.

SJ Spielman and CO Wilke. 2016. Extensively parameterized mutation–selection models reliably capture site-specific selective constraint. Mol Biol Evol 33(11): 2990–3002.

EL Jackson, A Shahmoradi, SJ Spielman, BR Jack, and CO Wilke. 2016. Intermediate divergence levels maximize the strength of structure–sequence correlations in enzymes and viral proteins. Protein Sci 25(7): 1341-1353.

J Echave, SJ Spielman, and CO Wilke. 2016. Causes of evolutionary rate variation among protein sites. Nature Rev Genet 17: 109-121.

2015

SJ Spielman and CO Wilke. 2015. Pyvolve: A flexible Python module for simulating sequences along phylogenies. PLOS ONE 10(9): e0139047.

AG Meyer, SJ Spielman, T Bedford, and CO Wilke. 2015. Time dependence of evolutionary metrics during the 2009 pandemic influenza virus outbreak. Virus Evolution. 1(1):vev006-10.

SJ Spielman and CO Wilke. 2015. The relationship between dN/dS and scaled selection coefficients. Mol Biol Evol 32(4): 1097-1108.

SJ Spielman, K Kumar, and CO Wilke. 2015. Comprehensive, structurally-curated alignment and phylogeny of vertebrate biogenic amine receptors. PeerJ 3: e773.

2014

SJ Spielman*, AG Meyer*, and CO Wilke. 2014. Increased evolutionary rate in the 2014 West African Ebola outbreak is due to transient polymorphism and not positive selection. bioRxiv. *Authors contributed equally to this work.

SJ Spielman, ET Dawson, and CO Wilke. 2014. Limited utility of residue masking for positive-selection inference. Mol Biol Evol 31(9): 2496-2500.

A Shahmoradi, DK Sydykova, SJ Spielman, EL Jackson, ET Dawson, AG Meyer, and CO Wilke. 2014. Predicting evolutionary site variability from structure in viral proteins: buriedness, packing, flexibility, and design. J Mol Evol 79: 130–142.

2013

MZ Tien, AG Meyer, DK Sydykova, SJ Spielman, and CO Wilke. 2013. Maximum allowed solvent accessibilites of residues in proteins. PLOS ONE 8(11): e80635.

SJ Spielman and CO Wilke. 2013. Membrane environment imposes unique selection pressures in transmembrane domains of G-protein coupled receptors. J Mol Evol 76(3): 172-182.

Courses

I have been involved in teaching and designing a wide variety of courses. As a TA, I have taught Evolutionary Biology (both at UT Austin and Brown University), Biostatistics, and Computational Biology. I played an integral role in designing the curricula and course materials, including computer lab exercises (mostly in R and Python) and video tutorials, for the latter two courses.

Through UT Austin’s Center for Computational Biology and Bioinformatics (CCBB), I have been involved in multiple initiatives to teach biological computing and introductory programming to students of all levels (undergraduate, graduate, and postdoctoral). Below are links to the websites for courses I have designed and taught.

CCBB Big Data in Biology Summer School: Introduction to Python (May 2016)

Peer-Led Introduction to Biocomputing (Spring 2016)

This is the official website for the Spring 2016 Peer-led Biocomputing Group at UT Austin, offered through the Center for Computational Biology and Bioinformatics. The goal of this class is to take away any fear of programming that you may have so that computing can become an advantage rather than a barrier to your research.

The class will meet on Wednesdays from 4-5 pm in FNT 1.104. In addition, feel free to join us for Open Coding Hour on Tuesdays from 5-6 pm in the CCBB conference room (GDC 7.514) for any extra help or feedback.

Please use the links to the left for contact information and some useful biocomputing resources. Please also feel free to participate in the Biocomputing Google Groups Forum with any questions that you have as you begin your biocomputing journey!

The introductory survey can be found here.

Installation and Setup Instructions
If you are using a Mac, you will need to install a text editor. Some excellent options are TextWrangler, Sublime Text, and Atom. Alternatively, you can download Xcode, which includes Apple’s text editor, from the App Store.

1/25/16: Updated Windows instructions!
If you are using a Windows PC, you will need to install some software on your computer in order to use Python. Follow these instructions (for Windows) to set up your computer. You should set up your system no later than the second week of class (1/27/16). Do not leave installation to the last minute - it may take several hours, and you may encounter difficulties! If you need help, please feel free to come to Open Coding Hour in GDC 7.514 on Tuesday 1/26/16.
Schedule and Materials
Date | Topic | Materials (Download Link) | Instructor
January 27, 2016 | Introduction to Computing and Unix | Cheatsheet and Files | Becca
February 03, 2016 | Python I: Operators and Variable types | Cheatsheet and Files | Stephanie
February 10, 2016 | Python II: Control Flow | Cheatsheet and Files | Becca
February 17, 2016 | Python III: Functions and Debugging Strategies | Cheatsheet and Files | Stephanie
February 24, 2016 | Python IV: File Input/Output and Parsing | Cheatsheet and Files | Becca
March 02, 2016 | Python V: Sequence Analysis with Biopython | Cheatsheet and Files | Stephanie
March 09, 2016 | Python VI: Best Practices and Testing | Cheatsheet and Files | Becca
March 23, 2016 | Building Analysis Pipelines | Cheatsheet and Files | Sean
March 30, 2016 | Introduction to R | Cheatsheet and Files | Sean Leonard
April 06, 2016 | Navigating TACC | Cheatsheet and Files | Benni Goetz
April 13, 2016 | Regular Expressions | Cheatsheet and Files | Stephanie
April 20, 2016 | Version Control with Git | Cheatsheet and Files | Cheng Lee
April 27, 2016 | Manipulating Protein Structures with Biopython and Pymol | Cheatsheet and Files | Ben Jack
May 04, 2016 | R and the Hadleyverse: Easy data manipulation and visualization | Cheatsheet and Files | Sean Leonard

CCBB Big Data in Biology Summer School: Introduction to Python (May 2015)

Big Data in Biology Summer School, 2015
UT Austin Center for Computational Biology
Welcome to Introduction to Python!
Welcome to the Introduction to Python course, offered by UT Austin’s CCBB as part of the 2nd annual Big Data in Biology Summer School!

Stephanie Spielman will be the lead instructor for this course. She is a 4th year Ph.D. student in the Ecology, Evolution, and Behavior program in Claus Wilke’s lab. Check out her website here. Eleisha Jackson will be the TA for this course. Eleisha is a 3rd year Ph.D. student in the EEB program, also in Claus Wilke’s lab.

Installation Instructions
Before class starts, it is absolutely essential that you get your computer up-and-running for programming! The necessary steps to take depend on what kind of computer you’re using (Mac or PC; if you’re using Linux, you should already be good to go). Follow the instructions, linked below, that match your computer.

PC/Windows Instructions (Note: PC setup can take 1-3 hours, so set aside enough time!). Here is a link for getting set up with a popular text editor, gedit: https://help.ubuntu.com/community/gedit.
Mac Instructions
Useful Resources
Online Resources:
The Unix Tutorial for Beginners (http://www.ee.surrey.ac.uk/Teaching/Unix/) is a great resource and starting point for getting comfortable with the command-line environment

Google is easily the most valuable resource for figuring things out. If you encounter an issue, chances are somebody else has also encountered it and has asked about it. Googling your error messages is one of the best debugging strategies there is. In particular, try to find links to the website http://www.stackoverflow.com. This forum-based website has all the answers, possibly literally (but they might be super snarky).

The popular websites Codecademy and Rosalind are excellent for learning and practicing Python and bioinformatics skills. UT has an account with Rosalind, so if you’re a UT student, simply log in with your standard UT credentials.
Offline Resources:
The book Practical Computing for Biologists (website: http://practicalcomputing.org) by Haddock and Dunn provides a really thorough, entry-level overview of introductory computing concepts, including (but not limited to!) Unix and Python. The book’s accompanying website is also regularly updated with important tips, examples, and errata.
UT Biocomputing Google Group:
Post any questions and join in the conversation in the UT Biocomputing google group: https://groups.google.com/forum/#!forum/utbiocomputing

Day One: UNIX and Python Variables
Day One Slides
Day One Exercises
Day One Cheatsheet
Day One UNIX Exercise Solutions and Day One Python Exercise Solutions
Day Two: Control Flow in Python
Day Two Slides
Day Two Exercises
Day Two Cheatsheet
Day Two Exercise Solutions
Day Three: Functions
Day Three Slides
Day Three Exercises
Day Three Exercise Solutions
Day Four: File Input/Output and Python Modules
Day Four Slides
Day Four Exercises
Day Four Cheatsheet
Reference: Interpreting Error Messages
Reference: File Input/Output
Day Four Exercise Solutions
Supplement: BioPython
BioPython Slides
BioPython Exercises
BioPython Cheatsheet

Peer-Led Introduction to Biocomputing (Spring 2015)

UT Biocomputing 2015
Peer-led working group in Biological Computing at UT Austin, Spring 2015

This project is maintained by sjspielman

Welcome!
This is the official website for the Spring 2015 Peer-led Biocomputing Group at UT Austin, offered through the Center for Computational Biology and Bioinformatics. The goal of this class is to take away any fear of programming that you may have so that computing can become an advantage rather than a barrier to your research. On Wednesdays, we will provide lessons suitable for beginners in programming. Each week you should post at least one question and answer on the UTbiocomputing Google group and complete the homework. On Tuesdays we will help you with homework; if you had no problems, come anyway to meet other people who are computing at UT and to get some handy tips from GSAF’s Scott Hunicke-Smith.

Please install the required software no later than the second day of class. Installation instructions are available here.
Instructors
Rebecca (Becca) Tarvin (PAT 123) rdtarvin@utexas.edu
Stephanie Spielman (MBB 3.232) stephanie.spielman@gmail.com
Meeting Times
Open Coding Hour: Tuesdays 5-6pm in CCBB conference room (GDC 7.514)
Class: 4-5pm on Wednesdays in FNT 1.104
Helpful Resources
We strongly recommend the book Practical Computing for Biologists by Haddock and Dunn. The book provides a really thorough overview of most of what we’ll learn in this class (including Unix, Python, R, and more!). The book’s accompanying website (linked above) is also regularly maintained with important tips, examples, and errata.

The UNIX Tutorial for Beginners is a great resource and starting point for getting comfortable with the command-line environment.

Don’t forget your most valuable resource: Google! If you encounter an issue, chances are somebody else has also encountered it and has asked about it. Googling your error messages is one of the best debugging strategies there is. In particular, try to find links to the website stackoverflow.com. This forum-based website has all the answers, possibly literally.

This wiki and this github repository contain powerpoints, exercises, and cheatsheets from the Spring 2014 course.

Here’s a great paper on Best Practices in Scientific Computing.
Schedule and Materials
Each item in the “Topic” column links to materials for that course day. Materials include a cheatsheet, lesson plan, and sometimes a powerpoint and/or example scripts. All course materials can also be accessed directly from this github repository.

Week | Date | Topic | Instructor
Week 1 | Jan 21 | Welcome and Introduction | Becca
Week 2 | Jan 28 | UNIX and Bash: Navigating the Command Line | Becca
Week 3 | Feb 04 | Python I: Basic data types and structure | Stephanie
Week 4 | Feb 11 | Python II: Control flow and loops | Becca
Week 5 | Feb 18 | Python III: Functions | Stephanie
Week 6 | Feb 25 | Python IV: File Input/Output | Stephanie
Week 7 | Mar 04 | Python V: Testing and Code Hygiene | Becca
Week 8 | Mar 11 | Python VI: Biopython | Stephanie
Week 9 | Mar 25 | Merging Python and Bash | Stephanie and Becca
Week 10 | Apr 01 | Version Control with git | Cheng Lee
Week 11 | Apr 08 | Statistical Computing in R | Nate Pope
Week 12 | Apr 15 | Data analysis in Python | Ben Liebeskind
Week 13 | Apr 22 | High-performance computing with TACC | Benni Goetz
Week 14 | Apr 29 | RNAseq: Computational platforms and pipelines | Dariya Sydykova
Week 15 | May 06 | pyRAD: Pipelines for analyzing NextGen SNP data | April Wright

Blog

More specifically, more than a few #smbe15 tweets (enough to catch my eye) have been really critical, and not necessarily constructively, about talks that the “tweeter” didn’t like. In some cases, senior and/or well-established scientists seemed to be using twitter as an open forum to point out specific talks or presentation styles they didn’t feel were worthy or done correctly. As a graduate student trying to network and pave a career for myself, I was pretty worried when I saw these tweets. Would these scientists also tweet negative remarks about my talk? Will my research be de-valued because some tenured professor made a snarky comment? Thankfully, I didn’t see any mocking tweets after my talk, but there were a few tweets where I thought to myself, “Wait…was that about me?” Not a good feeling.

Let me be clear - these tweets are in the minority, but at least 10-15 tweets have fallen into this category, which in my book is way too many. Even so, seeing several leading scientists publicly share and effectively sign their names to negative remarks about others’ talks or posters was incredibly discouraging. I haven’t seen any tweets outright saying “Hey, Person X! Your talk was dumb!”, but I have seen a lot of tweets with an excessively mocking and/or ostracizing tone.

I’m not suggesting that the scientific community should avoid looking at others’ work with a critical eye, but I am suggesting (nay, requesting) that public tweets be respectful and supportive. If you have something that doesn’t fit into those categories but that, nevertheless, you feel must be said, it would be a much more productive strategy to actually go find and talk with the person! We’re at a conference, after all. Having real conversations with real people seems like an overall better approach than cryptically tweeting your thoughts about things you don’t like. Plus, 140 characters can’t do those sorts of conversations justice and leave you vulnerable to serious misunderstandings. Indeed, maybe I’ve miscalculated the whole thing and I am incredibly wrong about all the seemingly inappropriate tweets I saw! …Which is exactly my point.

Twitter should be a place to collaborate, share, and engage respectfully - not a convenient place for accusing or pointing fingers at people. After all, what young scientist wants to join a community with a thriving culture of cyber snark?

Pyvolve is written in pure Python, with dependencies of NumPy, SciPy, and Biopython. The Pyvolve framework is extremely flexible, allowing you to simulate sequences according to virtually all standard models of nucleotide, amino acid, and codon data, and you can customize all model parameters to your heart’s content. Further, Pyvolve allows you to provide a custom rate matrix, if the available models are not quite what you’re looking for (however, please feel free to get in touch with me if you would like to request that a new model be included!).

Pyvolve incorporates both site and temporal heterogeneity, and, as you’ll see in the preprint linked above, contains several novel simulation features. Below, I show some simple examples of Pyvolve simulations. In general, sequence simulations require you to do a few things:

Specify a phylogeny (with branch lengths!)
Define any evolutionary model(s) to use. In Pyvolve, these are Model objects.
Assign model(s) to partition(s). In Pyvolve, these are Partition objects.
Evolve, using the callable Evolver class.
Partitions are essentially a convenient way of defining “domains” – each partition can evolve according to a distinct evolutionary model (provided that all partitions evolve the same state, e.g. nucleotides, amino acids, or codons), and each partition can have differing degrees of heterogeneity.

Examples shown below are minimal and do not capture the full power of Pyvolve – to really see what Pyvolve can do, have a look at the user manual!

Simulating nucleotide sequences
This simple example demonstrates how to evolve nucleotide sequences.

import pyvolve

# Define a phylogeny, from a file containing a newick tree
my_tree = pyvolve.read_tree(file = "file_with_tree.tre")

# Define a nucleotide model, as a Model object.
my_model = pyvolve.Model("nucleotide")

# Assign the model to a Partition. The size argument indicates to evolve 250 positions
my_partition = pyvolve.Partition(models = my_model, size = 250)

# Evolve!
my_evolver = pyvolve.Evolver(partitions = my_partition, tree = my_tree)
my_evolver()

The code shown above will simulate a nucleotide alignment of 250 positions along the phylogeny provided in file_with_tree.tre. This code simulates nucleotides according to default parameters: mutation rates among nucleotides are equal, and nucleotide equilibrium frequencies are equal at 0.25 each. We can customize these parameters by adding a second argument to Model: a dictionary of parameters to customize.

To customize mutation rates, we can use the key “mu”. This key should have an associated value of a dictionary of mutation rates. Mutation rates are symmetric, denoted by keys “AT”, “AC”, etc. (where “AT” is the rate from A to T, and conversely T to A). To customize frequencies, we can use the key “state_freqs”, whose associated value should be a list/numpy array of frequencies ordered ACGT.

This code chunk simulates nucleotide sequences with customized parameters:

import pyvolve

# Define a phylogeny, from a file containing a newick tree
my_tree = pyvolve.read_tree(file = "file_with_tree.tre")

# Define a nucleotide model with custom parameters!
mutation = {"AC": 1.5, "AG": 2.5, "AT": 0.5, "CG": 0.8, "CT": 0.99, "GT": 1.56}
frequencies = [0.25, 0.3, 0.1, 0.35] # f(A) = 0.25, f(C) = 0.3, etc.
my_model = pyvolve.Model("nucleotide", {"mu": mutation, "state_freqs": frequencies})

# Assign the model to a Partition.
my_partition = pyvolve.Partition(models = my_model, size = 250)

# Evolve!
my_evolver = pyvolve.Evolver(partitions = my_partition, tree = my_tree)
my_evolver()
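To make the roles of “mu” and “state_freqs” concrete, here is a minimal sketch of how symmetric mutation rates and equilibrium frequencies are conventionally combined into a nucleotide rate matrix (the general time-reversible form, q_ij = mu_ij * pi_j). This is an illustration only: the function name build_rate_matrix is hypothetical, it is not part of Pyvolve, and it ignores any scaling Pyvolve itself may apply.

```python
# Illustrative sketch only: not Pyvolve's internal code.
NUCLEOTIDES = "ACGT"


def build_rate_matrix(mu, state_freqs):
    """Return a 4x4 rate matrix (list of lists) with q_ij = mu_ij * pi_j."""
    if abs(sum(state_freqs) - 1.0) > 1e-8:
        raise ValueError("state frequencies must sum to 1")
    n = len(NUCLEOTIDES)
    matrix = [[0.0] * n for _ in range(n)]
    for i, source in enumerate(NUCLEOTIDES):
        for j, target in enumerate(NUCLEOTIDES):
            if i == j:
                continue
            # Rates are symmetric, so the key "AT" also covers T -> A.
            key = "".join(sorted(source + target))
            matrix[i][j] = mu[key] * state_freqs[j]
        # The diagonal is set so that each row sums to zero.
        matrix[i][i] = -sum(matrix[i])
    return matrix


mutation = {"AC": 1.5, "AG": 2.5, "AT": 0.5, "CG": 0.8, "CT": 0.99, "GT": 1.56}
frequencies = [0.25, 0.3, 0.1, 0.35]  # ordered A, C, G, T
Q = build_rate_matrix(mutation, frequencies)
for nucleotide, row in zip(NUCLEOTIDES, Q):
    print(nucleotide, [round(rate, 4) for rate in row])
```

Each row describes the rates of leaving one nucleotide, which is why the off-diagonal entries are weighted by the frequency of the target nucleotide.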
For those of you who, like myself, tend towards some minor, completely socially-acceptable laziness, you can alternatively specify mutation rates with simply the key “kappa”, which represents the transition-to-transversion bias. Here’s how to define such a model:

# Define a nucleotide model with kappa
frequencies = [0.25, 0.3, 0.1, 0.35] # f(A) = 0.25, f(C) = 0.3, etc.
my_model = pyvolve.Model("nucleotide", {"kappa": 3.25, "state_freqs": frequencies})
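For intuition, a transition-to-transversion bias is conventionally equivalent to a “mu” dictionary in which the two transitions (A<->G and C<->T) get rate kappa and the four transversions get rate 1. The sketch below spells that out; kappa_to_mu is a hypothetical helper, and whether Pyvolve expands kappa in exactly this way internally is an assumption.

```python
# Illustrative sketch only: kappa_to_mu is a hypothetical helper, not a Pyvolve function.
def kappa_to_mu(kappa):
    """Expand a transition/transversion bias into six symmetric mutation rates."""
    transitions = {"AG", "CT"}  # purine<->purine and pyrimidine<->pyrimidine
    pairs = ["AC", "AG", "AT", "CG", "CT", "GT"]
    return {pair: (kappa if pair in transitions else 1.0) for pair in pairs}


mu = kappa_to_mu(3.25)
print(mu)
```

Under that assumption, passing this dictionary as “mu” should describe the same relative rates as passing “kappa” directly.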

Now we’re cookin’! Let’s add some more bells and whistles, like rate heterogeneity. In the simulations shown above, all positions evolve according to exactly the same model and the same rate. We can incorporate rate heterogeneity by adding a few keyword arguments when defining our Model object. In this example, we will specify rate heterogeneity with a custom distribution, although as you’ll see in the user manual, you can also specify that rates be distributed according to a gamma distribution.

To implement rate heterogeneity (this holds for nucleotide and amino-acid models!), you need to specify the scalar factors which govern the heterogeneity, and a list of probabilities associated with each factor. This list will determine the probability that a given site evolves according to the associated factor. (Note that if you don’t specify these probabilities, each category will be equally likely).

Let’s go ahead and add in four rate categories, some slow and some fast, with associated probabilities. Specify a list of rate factors with the argument rate_factors, and specify a list of probabilities with the argument rate_probs (should sum to 1!). These lists are associated 1:1, as in the first item in rate_factors will have a probability equal to the first item in rate_probs.

import pyvolve

# Define a phylogeny, from a file containing a newick tree
my_tree = pyvolve.read_tree(file = "file_with_tree.tre")

# Define our parameters
mutation = {"AC": 1.5, "AG": 2.5, "AT": 0.5, "CG": 0.8, "CT": 0.99, "GT": 1.56}
frequencies = [0.25, 0.3, 0.1, 0.35] # f(A) = 0.25, f(C) = 0.3, etc.

factors = [3.5, 2.5, 0.08, 0.005] # Two fast categories, and two slow categories
probs = [0.05, 0.1, 0.5, 0.35] # The fast rates will occur with relatively low probabilities

# Define our model with all parameters
my_model = pyvolve.Model("nucleotide", {"mu": mutation, "state_freqs": frequencies}, rate_factors = factors, rate_probs = probs)

# Assign the model to a Partition.
my_partition = pyvolve.Partition(models = my_model, size = 250)

# Evolve!
my_evolver = pyvolve.Evolver(partitions = my_partition, tree = my_tree)
my_evolver()
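To see what the 1:1 pairing of rate factors and probabilities means for individual sites, the following sketch draws a rate category for each of 250 positions using only the Python standard library. The function assign_site_rates is hypothetical and only mimics the idea; it does not reproduce Pyvolve’s implementation or any rescaling of factors that Pyvolve may perform.

```python
import random


# Illustrative sketch only: assign_site_rates is hypothetical, not a Pyvolve function.
def assign_site_rates(rate_factors, rate_probs=None, size=250, seed=None):
    """Draw one rate factor per site according to the category probabilities."""
    if rate_probs is None:
        # With no probabilities given, every category is equally likely.
        rate_probs = [1.0 / len(rate_factors)] * len(rate_factors)
    if len(rate_factors) != len(rate_probs):
        raise ValueError("rate_factors and rate_probs must have the same length")
    if abs(sum(rate_probs) - 1.0) > 1e-8:
        raise ValueError("rate_probs must sum to 1")
    rng = random.Random(seed)
    return rng.choices(rate_factors, weights=rate_probs, k=size)


factors = [3.5, 2.5, 0.08, 0.005]
probs = [0.05, 0.1, 0.5, 0.35]
site_rates = assign_site_rates(factors, probs, size=250, seed=1)
for factor in factors:
    print(factor, site_rates.count(factor))
```

With these probabilities, most sites should land in the two slow categories, and only a handful in the fast ones.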
By default, Pyvolve outputs several files:

simulated_alignment.fasta
site_rates.txt
site_rates_info.txt
The first file contains the simulated alignment, and the latter two files contain information about site-specific rates and/or parameters. Using these two files, you can determine at which rate each site evolved. Note that you can suppress the creation of or change the name of these files with certain arguments when calling your Evolver object – see the user manual!
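Once a simulation has finished, a quick sanity check is to read the alignment back in and confirm that every sequence has the expected length. Here is a minimal sketch using only the standard library; parse_fasta is a hypothetical helper, and the two short sequences are made-up stand-ins for real output.

```python
# Illustrative sketch only: parse_fasta is a hypothetical helper.
def parse_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            # Sequences may be wrapped across several lines.
            records[name].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


# Made-up stand-in for the contents of simulated_alignment.fasta
example = ">t1\nACGT\nAC\n>t2\nACGA\nAC\n"
alignment = parse_fasta(example)
lengths = {len(sequence) for sequence in alignment.values()}
print(sorted(alignment), lengths)
```

To check real output, read the file first, e.g. open("simulated_alignment.fasta").read(), and pass the text to the function. Since Biopython is already a Pyvolve dependency, Bio.AlignIO.read("simulated_alignment.fasta", "fasta") is an alternative.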

And finally, one more example - what if we wanted to use multiple models in our simulation? For this task, we’ll need to define multiple Partition objects. In the example below, one Partition object will be assigned default parameters, and one Partition will be assigned custom parameters.

import pyvolve

# Define a phylogeny, from a file containing a newick tree
my_tree = pyvolve.read_tree(file = "file_with_tree.tre")

# Define default model
model1 = pyvolve.Model("nucleotide")

# Define customized model (notice, no site heterogeneity this time!)
mutation = {"AC": 1.5, "AG": 2.5, "AT": 0.5, "CG": 0.8, "CT": 0.99, "GT": 1.56}
frequencies = [0.25, 0.3, 0.1, 0.35] # f(A) = 0.25, f(C) = 0.3, etc.
model2 = pyvolve.Model("nucleotide", {"mu": mutation, "state_freqs": frequencies})

# Assign each model to a Partition.
partition1 = pyvolve.Partition(models = model1, size = 100)
partition2 = pyvolve.Partition(models = model2, size = 200)

# Evolve by providing both partitions in a list to Evolver
my_evolver = pyvolve.Evolver(partitions = [partition1, partition2], tree = my_tree)
my_evolver()
In the resulting sequence file, the first 100 positions will have evolved according to model1, and the next 200 positions (there will be a total of 300 positions!) will have evolved according to model2.
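If you simulate many partitions, it can help to work out which alignment columns belong to which partition. This small sketch does the bookkeeping for partitions concatenated in list order; partition_ranges is a hypothetical helper, not part of Pyvolve.

```python
# Illustrative sketch only: partition_ranges is a hypothetical helper.
def partition_ranges(sizes):
    """Return 1-based, inclusive (start, end) column ranges for each partition."""
    ranges = []
    start = 1
    for size in sizes:
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


ranges = partition_ranges([100, 200])
print(ranges)  # [(1, 100), (101, 300)]
```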

For more, yes more!, ways to use Pyvolve, check out the (drumroll…) user manual! Please feel free to post any questions and/or file bug reports on Pyvolve’s github repository Issues section. Enjoy!

I recently got myself the new 2015 MacBook Pro, and so far it’s been great - especially considering that my old laptop’s battery only held a charge for a whopping 2.5 hours. Now, 4 hours since the last charge, I still have 8 hours of battery life remaining. The world is my oyster!

In any case, the first thing I did with the new computer was set up my biocomputing environment, and in my at-long-last-achieved-wisdom, I saved all the steps I took to get my computer up-and-running, and here they are! Bear in mind that these are the steps that I took, and they may or may not work for you.

Step-by-step guide to configuring a Biocomputing environment

  1. Download Xcode from the App Store. This will give you the basics you need to proceed, like clang/clang++ compilers, git, and other goodies. However, note that Xcode won’t give you a Fortran compiler. The quickest option for getting one is to download and install from this link: http://r.research.att.com/libs/gfortran-4.8.2-darwin13.tar.bz2.

  2. Xcode comes with its own text editor, although I am partial to TextWrangler (available here: http://www.barebones.com/products/textwrangler/download.html), which I installed next.

  3. Set up the global configurations for git by typing the following lines into terminal:

git config --global user.name "first last"
git config --global user.email "email"
where “first last” are replaced with your first and last name (“Stephanie Spielman” for me), and the “email” is replaced with your email.

  4. Install Homebrew, a convenient and comprehensive Mac package manager. Personally, I do prefer homebrew over MacPorts or other package managers, although I have no real basis for this preference. If MacPorts or other is your thing, then the next few steps might not be so helpful.
    To get homebrew, enter this code into the command line:

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Note that you might need to “sudo” that command. Also, if you ever want to update your homebrew, use this command:

brew update

  5. While Mac does come with its own python distribution, this distribution tends to get wonky when dealing with python modules. In the past, I’ve had some real annoyances trying to get numpy and scipy running properly, so I abandoned Mac’s python in favor of homebrew’s. So, once homebrew is installed (it will be in “/usr/local/Cellar/”), use it to download its python version. To make sure that you can freely press the up/down/left/right arrows in the python interpreter without annoying characters appearing, install readline first:

brew install readline --universal
brew install python
By default, this will give you python-2.7. If you want python3, use this command instead:

brew install python3

  6. Next, just to make sure that Mac’s python doesn’t interfere with homebrew’s, enter these commands:

cd /System/Library/Frameworks/Python.framework/Versions
sudo rm Current

Note that the “2.7.9” below will need to be replaced if it’s not the version on your system!

sudo ln -s /usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/Current

  7. Homebrew’s python comes with pip, which you can use to download various libraries. Here’s what I did:

pip install numpy
pip install scipy
pip install "ipython[notebook]" # install iPython and notebook
pip install ipython --upgrade # the above command does not seem to install most recent iPython, so this line fixes that issue
pip install pandas

  8. Unfortunately, you can’t get Biopython with pip, so you have to download from source. You can get Biopython from this website: http://biopython.org/wiki/Download. Download either biopython-1.65.zip or biopython-1.65.tar.gz, uncompress, and navigate to the directory. Once in the biopython directory, enter the following in terminal to install Biopython:

python setup.py build
sudo python setup.py install

  9. Pandoc, a package for converting file formats (often from markdown or ipython notebook to html or pdf, etc.), is an especially useful thing to have around. Download from the website: https://github.com/jgm/pandoc/releases.

  10. If you want to convert to pdf, though, you’ll need LaTeX on your system (surprisingly enough, if you want to use LaTeX for anything else, you’ll still need LaTeX!). For this, you have to download MacTeX (from this website: https://tug.org/mactex/downloading.html). This will give you a convenient installer which guides you through the process. The package is super big though (2.4 GB), so if your internet is not lightning-speed, this might be a good time for a coffee break!

  11. Next, we’ll get R up and running. This is pretty straightforward - just download the R package installer (the one for Mavericks!) from this website: http://www.r-project.org. I recommend against building from source, since this requires a Fortran compiler, and as previously mentioned, you’ll have to get this running on your own - this link should do it: http://r.research.att.com/libs/gfortran-4.8.2-darwin13.tar.bz2.

  12. Download, if you want, RStudio from this website: http://www.rstudio.com. You’ll also need XQuartz, which you can download from this website: http://xquartz.macosforge.org/landing/

  13. Once R is running, you can install any packages you’ll need. I ran the following commands in R for installation:

install.packages("dplyr", dep=T)
install.packages("tidyr", dep=T)
install.packages("ggplot2", dep=T)
install.packages("cowplot", dep=T) # for sexier ggplot-ing
install.packages("ape", dep=T) # for phylogeny manipulation
install.packages("lme4", dep=T) # for more linear modeling

And now, for the most part, you have a functioning biocomputing environment for git, Python, and R!
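To confirm that everything landed where you expect, a short script can report which Python is actually being run and whether the modules and command-line tools from the steps above can be found. This is a minimal sketch that assumes Python 3 (it uses f-strings); the function check_environment is hypothetical, and the lists of names to check are my own reading of the steps above, so adjust them to your setup.

```python
import importlib.util
import shutil
import sys

# Names taken from the steps above; "Bio" is the import name for Biopython.
MODULES = ["numpy", "scipy", "IPython", "pandas", "Bio"]
TOOLS = ["git", "brew", "pandoc", "R", "pdflatex"]


def check_environment(modules=MODULES, tools=TOOLS):
    """Return a {name: bool} report of importable modules and tools on the PATH."""
    report = {}
    for module in modules:
        # find_spec locates a top-level module without importing it.
        report[module] = importlib.util.find_spec(module) is not None
    for tool in tools:
        report[tool] = shutil.which(tool) is not None
    return report


print("Python", sys.version.split()[0], "at", sys.executable)
for name, found in check_environment().items():
    print(f"{name:10s} {'found' if found else 'MISSING'}")
```

If the reported interpreter path is not the homebrew one, the modules installed with pip above may show up as missing even though they are on disk.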

Software

Pyvolve, a python platform to simulate sequences along a phylogeny using continuous-time Markov models.

alignfigR, an R package to create multiple sequence alignment figures.