analysis Module

Execute a standard HyPhy analysis.

class analysis.Analysis(**kwargs)

Bases: object

Parent class for all analysis methods, which include the following children:

  • ABSREL
  • BUSTED
  • FEL
  • FUBAR
  • LEISR
  • MEME
  • RELAX
  • SLAC

Do not use this parent class. Instead, see child classes for analysis-specific arguments and examples.

run_analysis()

Execute an Analysis and save output.

Examples:

>>> ### Execute a default FEL analysis
>>> myfel = FEL(data = "/path/to/data_with_tree.dat")
>>> myfel.run_analysis()         
class analysis.FEL(**kwargs)

Bases: analysis.Analysis

Initialize and execute a FEL analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. srv, Employ synonymous rate variation in inference (i.e. allow dS to vary across sites). Values “Yes”/“No” or True/False accepted. Default: True.
  3. branches, Branches to consider in site-level selection inference. Values “All”, “Internal”, “Leaves”, “Unlabeled branches”, or a specific label in your tree are accepted.
  4. output, Name of (and path to) the final output JSON file. Default: Goes to same directory as provided data.
  5. alpha, The p-value threshold for calling sites as positively or negatively selected. Note that this argument has no bearing on JSON output. Default: 0.1.
  6. genetic_code, The genetic code to use in codon analysis. Default: Universal. Consult NIH for details.

Examples:

>>> ### Define a default FEL analysis, where data is contained in a single file
>>> myfel = FEL(data = "/path/to/data_with_tree.dat")
>>> ### Define a default FEL analysis, where alignment and tree are in separate files 
>>> myfel = FEL(alignment = "/path/to/alignment.fasta", tree = "/path/to/tree.tre")
>>> ### Define a FEL analysis, with a specified path to output JSON
>>> myfel = FEL(data = "/path/to/data_with_tree.dat", output="/path/to/json/output.json")
>>> ### Define a FEL analysis, with a one-rate approach (i.e. synonymous rate variation turned off) 
>>> myfel = FEL(data = "/path/to/data_with_tree.dat", srv=False)
>>> ### Define FEL analysis, specifying only to use internal branches to test for selection
>>> myfel = FEL(data = "/path/to/data_with_tree.dat", branches="Internal")
>>> ### Define a FEL analysis with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> myfel = FEL(data = "/path/to/data_with_tree.dat", hyphy=myhyphy)
>>> ### Execute a defined FEL instance
>>> myfel.run_analysis()
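
The input rule ("alignment and tree OR data") and the accepted srv values can be sketched as two small checks. This is a minimal illustration only: resolve_inputs and normalize_srv are hypothetical helper names, not part of this module, and rejecting a mix of data with alignment/tree is an assumption rather than documented behavior.

```python
def resolve_inputs(alignment=None, tree=None, data=None):
    """Classify the inputs as one combined file or two separate files.

    Hypothetical helper mirroring the documented rule only.
    """
    if data is not None:
        if alignment is not None or tree is not None:
            # Assumption: mixing the two forms is treated as an error here.
            raise ValueError("Provide either data, or alignment and tree, not both.")
        return ("combined", data)
    if alignment is not None and tree is not None:
        return ("separate", (alignment, tree))
    raise ValueError("Provide data, or both alignment and tree.")


def normalize_srv(value=True):
    """Map the documented srv values ("Yes"/"No" or True/False) to a bool."""
    if isinstance(value, bool):
        return value
    if value in ("Yes", "No"):
        return value == "Yes"
    raise ValueError('srv must be "Yes", "No", True, or False.')


print(resolve_inputs(data="/path/to/data_with_tree.dat"))
print(normalize_srv("No"))
```

Running such checks before constructing an analysis surfaces a missing tree file or a misspelled srv value before any HyPhy process is started.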
class analysis.FUBAR(**kwargs)

Bases: analysis.Analysis

Initialize and execute a FUBAR analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. output, Name of (and path to) the final output JSON file. Default: Goes to same directory as provided data.
  3. genetic_code, The genetic code to use in codon analysis. Default: Universal. Consult NIH for details.
  4. grid_size, Number of grid points per rate grid dimension (Default: 20, allowed [5,50])
  5. nchains, Number of MCMC chains to run (Default: 5, allowed [2,20])
  6. chain_length, The length of each chain (Default: 2e6, allowed [5e5,5e7])
  7. burnin, Number of samples to consider as burn-in (Default: 1e6, allowed [ceil(chain_length/20),ceil(95*chain_length/100)])
  8. samples_per_chain, Number of samples to draw per chain (Default: 100, allowed [50,chain_length-burnin])
  9. alpha, The concentration parameter of the Dirichlet prior (Default: 0.5, allowed [0.001,1])
  10. cache, Name of (and path to) the output FUBAR cache. Default: Goes to same directory as provided data. Provide the argument False to not save the cache (this simply sends it to /dev/null).

Examples:

>>> ### Define a default FUBAR analysis, where data is contained in a single file
>>> myfubar = FUBAR(data = "/path/to/data_with_tree.dat")
>>> ### Define a default FUBAR analysis, where alignment and tree are in separate files 
>>> myfubar = FUBAR(alignment = "/path/to/alignment.fasta", tree = "/path/to/tree.tre")
>>> ### Define a FUBAR analysis using a 10x10 grid and alpha parameter of 0.75
>>> myfubar = FUBAR(data = "/path/to/data_with_tree.dat", grid_size = 10, alpha = 0.75 )
>>> ### Define a FUBAR analysis with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> myfubar = FUBAR(data = "/path/to/data_with_tree.dat", hyphy=myhyphy)
>>> ### Execute a defined FUBAR instance
>>> myfubar.run_analysis()
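
Because several FUBAR ranges depend on one another (burnin on chain_length, samples_per_chain on both), it can help to check a combination of settings up front. The following is a minimal sketch, assuming the bracketed ranges listed above are inclusive; check_fubar_settings is a hypothetical helper, not part of this module.

```python
import math


def check_fubar_settings(grid_size=20, nchains=5, chain_length=2e6,
                         burnin=1e6, samples_per_chain=100, alpha=0.5):
    """Raise ValueError if any setting falls outside its documented range."""

    def in_range(name, value, low, high):
        # Assumption: both endpoints of each documented range are allowed.
        if not (low <= value <= high):
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")

    in_range("grid_size", grid_size, 5, 50)
    in_range("nchains", nchains, 2, 20)
    in_range("chain_length", chain_length, 5e5, 5e7)
    # The burn-in bounds depend on the chain length.
    in_range("burnin", burnin,
             math.ceil(chain_length / 20),
             math.ceil(95 * chain_length / 100))
    # The number of samples is limited by what remains after burn-in.
    in_range("samples_per_chain", samples_per_chain, 50, chain_length - burnin)
    in_range("alpha", alpha, 0.001, 1)
    return True


# The defaults, and the 10x10 grid example from above, pass the check.
check_fubar_settings()
check_fubar_settings(grid_size=10, alpha=0.75)
```

With the default chain_length of 2e6, the allowed burn-in range works out to [100000, 1900000], so the default burnin of 1e6 sits inside it.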
class analysis.MEME(**kwargs)

Bases: analysis.Analysis

Initialize and execute a MEME analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. branches, Branches to consider in site-level selection inference. Values “All”, “Internal”, “Leaves”, “Unlabeled branches”, or a specific label in your tree are accepted.
  3. output, Name of (and path to) the final output JSON file. Default: Goes to same directory as provided data.
  4. alpha, The p-value threshold for calling sites as positively selected (MEME tests for positive selection only). Note that this argument has no bearing on JSON output. Default: 0.1.
  5. genetic_code, The genetic code to use in codon analysis. Default: Universal. Consult NIH for details.

Examples:

>>> ### Define a default MEME analysis, where data is contained in a single file
>>> mymeme = MEME(data = "/path/to/data_with_tree.dat")
>>> ### Define a default MEME analysis, where alignment and tree are in separate files 
>>> mymeme = MEME(alignment = "/path/to/alignment.fasta", tree = "/path/to/tree.tre")
>>> ### Define a MEME analysis, specifying that selection be tested only on leaves
>>> mymeme = MEME(data = "/path/to/data_with_tree.dat", branches = "Leaves" )
>>> ### Define a MEME analysis, specifying a custom JSON output file
>>> mymeme = MEME(data = "/path/to/data_with_tree.dat", output = "meme.json" )
>>> ### Define a MEME analysis with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> mymeme = MEME(data = "/path/to/data_with_tree.dat", hyphy=myhyphy)
>>> ### Execute a defined MEME instance
>>> mymeme.run_analysis()
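
Since alpha has no bearing on the JSON output, the threshold can be thought of as a filter applied to per-site p-values after the fact. The sketch below illustrates that idea only: call_sites is a hypothetical helper, the p-values are made up for illustration, and treating a p-value equal to alpha as significant is an assumption.

```python
def call_sites(pvalues, alpha=0.1):
    """Return the 1-based indices of sites whose p-value is at or below alpha."""
    return [site for site, p in enumerate(pvalues, start=1) if p <= alpha]


# Made-up p-values for five sites, for illustration only.
example_pvalues = [0.50, 0.04, 0.10, 0.92, 0.003]

print(call_sites(example_pvalues))              # default threshold of 0.1
print(call_sites(example_pvalues, alpha=0.05))  # stricter threshold
```

The same list of p-values yields different site calls under different thresholds, which is why changing alpha does not require rerunning the analysis itself.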
class analysis.SLAC(**kwargs)

Bases: analysis.Analysis

Initialize and execute a SLAC analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. branches, Branches to consider in site-level selection inference. Values “All”, “Internal”, “Leaves”, “Unlabeled branches”, or a specific label in your tree are accepted.
  3. output, Name of (and path to) the final output JSON file. Default: Goes to same directory as provided data.
  4. alpha, The p-value threshold for calling sites as positively or negatively selected. Note that this argument has no bearing on JSON output. Default: 0.1.
  5. genetic_code, The genetic code to use in codon analysis. Default: Universal. Consult NIH for details.
  6. bootstrap, The number of samples used to assess ancestral reconstruction uncertainty, in [0,100000]. Default: 100.

Examples:

>>> ### Define a default SLAC analysis, where data is contained in a single file
>>> myslac = SLAC(data = "/path/to/data_with_tree.dat")
>>> ### Define a default SLAC analysis, where alignment and tree are in separate files 
>>> myslac = SLAC(alignment = "/path/to/alignment.fasta", tree = "/path/to/tree.tre")
>>> ### Define a SLAC analysis, specifying that selection be tested only on leaves
>>> myslac = SLAC(data = "/path/to/data_with_tree.dat", branches = "Leaves" )
>>> ### Define a SLAC analysis, specifying 150 bootstrap replicates be used for ASR uncertainty
>>> myslac = SLAC(data = "/path/to/data_with_tree.dat", bootstrap = 150 )
>>> ### Define a SLAC analysis, specifying a custom JSON output file
>>> myslac = SLAC(data = "/path/to/data_with_tree.dat", output = "slac.json" )
>>> ### Define a SLAC analysis with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> myslac = SLAC(data = "/path/to/data_with_tree.dat", hyphy=myhyphy)
>>> ### Execute a defined SLAC instance
>>> myslac.run_analysis()
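
The bootstrap range can be checked before a SLAC analysis is defined. This is a minimal sketch, assuming the range [0,100000] is inclusive and that the value should be a whole number; check_bootstrap is a hypothetical helper, not part of this module.

```python
def check_bootstrap(bootstrap=100):
    """Return bootstrap unchanged if it is an integer in [0, 100000]."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(bootstrap, bool) or not isinstance(bootstrap, int):
        raise TypeError("bootstrap must be an integer.")
    if not 0 <= bootstrap <= 100000:
        raise ValueError("bootstrap must be in [0, 100000].")
    return bootstrap


print(check_bootstrap(150))
```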
class analysis.ABSREL(**kwargs)

Bases: analysis.Analysis

Initialize and execute an ABSREL analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. branches, Branches to consider in selection inference. Values “All”, “Internal”, “Leaves”, “Unlabeled branches”, or a specific label in your tree are accepted.
  3. output, Name of (and path to) the final output JSON file. Default: Goes to same directory as provided data.
  4. genetic_code, The genetic code to use in codon analysis. Default: Universal. Consult NIH for details.

Examples:

>>> ### Define a default ABSREL analysis, where data is contained in a single file
>>> myabsrel = ABSREL(data = "/path/to/data_with_tree.dat")
>>> ### Define a default ABSREL analysis, where alignment and tree are in separate files 
>>> myabsrel = ABSREL(alignment = "/path/to/alignment.fasta", tree = "/path/to/tree.tre")
>>> ### Define an ABSREL analysis, specifying that selection be tested only on leaves
>>> myabsrel = ABSREL(data = "/path/to/data_with_tree.dat", branches = "Leaves" )
>>> ### Define an ABSREL analysis with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> myabsrel = ABSREL(data = "/path/to/data_with_tree.dat", hyphy=myhyphy)
>>> ### Execute a defined ABSREL instance
>>> myabsrel.run_analysis()
class analysis.BUSTED(**kwargs)

Bases: analysis.Analysis

Initialize and execute a BUSTED analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. branches, Branches to consider in selection inference. Values “All”, “Internal”, “Leaves”, “Unlabeled branches”, or a specific label in your tree are accepted.
  3. output, Name of (and path to) the final output JSON file. Default: Goes to same directory as provided data.
  4. genetic_code, The genetic code to use in codon analysis. Default: Universal. Consult NIH for details.

Examples:

>>> ### Define a default BUSTED analysis, where data is contained in a single file
>>> mybusted = BUSTED(data = "/path/to/data_with_tree.dat")
>>> ### Define a default BUSTED analysis, where alignment and tree are in separate files 
>>> mybusted = BUSTED(alignment = "/path/to/alignment.fasta", tree = "/path/to/tree.tre")
>>> ### Define a BUSTED analysis, specifying that selection be tested only on leaves
>>> mybusted = BUSTED(data = "/path/to/data_with_tree.dat", branches = "Leaves" )
>>> ### Define a BUSTED analysis with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> mybusted = BUSTED(data = "/path/to/data_with_tree.dat", hyphy=myhyphy)
>>> ### Execute a defined BUSTED instance
>>> mybusted.run_analysis()
class analysis.RELAX(**kwargs)

Bases: analysis.Analysis

Initialize and execute a RELAX analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. test_label, The label (must be found in your tree) corresponding to the test branch set.
  3. reference_label, The label (must be found in your tree) corresponding to the reference branch set. Only provide this argument if your tree has multiple labels in it.
  4. output, Name of (and path to) the final output JSON file. Default: Goes to same directory as provided data.
  5. analysis_type, “All” (run hypothesis test and fit descriptive models) or “Minimal” (only run hypothesis test). Default: “All”.
  6. genetic_code, The genetic code to use in codon analysis. Default: Universal. Consult NIH for details.

Examples:

>>> ### Define a default RELAX analysis, where data is contained in a single file and test branches are labeled "test"
>>> myrelax = RELAX(data = "/path/to/data_with_tree.dat", test_label = "test")
>>> ### Define a default RELAX analysis, where alignment and tree are in separate files and test branches are labeled "test"
>>> myrelax = RELAX(alignment = "/path/to/alignment.fasta", tree = "/path/to/tree.tre", test_label = "test")
>>> ### Define a default RELAX analysis, where data is contained in a single file, test branches are labeled "test", and reference branches are labeled "ref"
>>> myrelax = RELAX(data = "/path/to/data_with_tree.dat", test_label = "test", reference_label = "ref")
>>> ### Define a default RELAX analysis, where data is contained in a single file, test branches are labeled "test", and the Minimal analysis is run
>>> myrelax = RELAX(data = "/path/to/data_with_tree.dat", test_label = "test", analysis_type = "Minimal")
>>> ### Define a RELAX analysis with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> myrelax = RELAX(data = "/path/to/data_with_tree.dat", hyphy=myhyphy)
>>> ### Execute a defined RELAX instance
>>> myrelax.run_analysis()
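
Both test_label and reference_label must be found in the tree, so it can be useful to list the labels a tree contains before defining the analysis. The sketch below assumes branch labels are written in curly braces after a node name in the Newick string (for example "A{test}"); that syntax, the example tree, and the helper names tree_labels and check_relax_labels are assumptions for illustration, not part of this module.

```python
import re


def tree_labels(newick):
    """Return the set of {label} annotations found in a Newick string."""
    # Assumption: labels appear as {label} directly in the tree string.
    return set(re.findall(r"\{([^{}]+)\}", newick))


def check_relax_labels(newick, test_label, reference_label=None):
    """Raise ValueError if a requested label does not occur in the tree."""
    labels = tree_labels(newick)
    if test_label not in labels:
        raise ValueError(f"test_label {test_label!r} not found in tree.")
    if reference_label is not None and reference_label not in labels:
        raise ValueError(f"reference_label {reference_label!r} not found in tree.")
    return labels


# A made-up tree with two labels, for illustration only.
example_tree = "((A{test},B{test}),(C{ref},D));"
print(sorted(check_relax_labels(example_tree, "test", reference_label="ref")))
```

If the returned set holds more than one label, that is the situation in which the reference_label argument applies.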
class analysis.LEISR(**kwargs)

Bases: analysis.Analysis

Initialize and execute a LEISR analysis.

Required arguments:
  1. alignment and tree OR data, either separate files for the alignment and the tree, OR a single file containing both (combined FASTA/Newick or NEXUS)
  2. type, either “nucleotide” or “protein”, indicating the type of data being analyzed
Optional keyword arguments:
  1. hyphy, a HyPhy() instance. Default: Assumes canonical HyPhy install.
  2. model, The model to use to fit relative rates, e.g. GTR for nucleotides or LG for amino acids. For full options, please see HyPhy. Default: JC69.
  3. rate_variation, Whether to apply rate variation to branch length optimization. Options include “No”, “Gamma”, and “GDD”. Note that Gamma and GDD will use four categories each. Default: “No”.

Examples:

>>> ### Define a LEISR Protein analysis, where data is contained in a single file and the WAG model with no rate variation is used
>>> myleisr = LEISR(data = "/path/to/data_with_tree.dat", type = "protein", model = "WAG")
>>> ### Define a LEISR Protein analysis, where data is contained in a single file and the WAG model with Gamma rate variation is used
>>> myleisr = LEISR(data = "/path/to/data_with_tree.dat", type = "protein", model = "WAG", rate_variation = "Gamma")
>>> ### Define a LEISR Nucleotide analysis, where data is contained in a single file and the HKY85 model with GDD rate variation is used
>>> myleisr = LEISR(data = "/path/to/data_with_tree.dat", type = "nucleotide", model = "HKY85", rate_variation = "GDD")
>>> ### Define a LEISR nucleotide analysis (default model) with a custom HyPhy, which is also defined here:
>>> myhyphy = HyPhy(suppress_log = True, quiet = True) ## HyPhy will use default canonical install but run in full quiet mode
>>> myleisr = LEISR(data = "/path/to/data_with_tree.dat", type = "nucleotide", hyphy=myhyphy)
>>> ### Execute a defined LEISR instance
>>> myleisr.run_analysis()
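
The LEISR options with a fixed set of values (type and rate_variation) can be validated with a small lookup. This is a minimal sketch: check_leisr_settings is a hypothetical helper, not part of this module, and it uses the parameter name data_type to avoid shadowing Python's built-in type. The model name is passed through unchecked because the full list of models is defined by HyPhy and is not given here.

```python
ALLOWED_TYPES = ("nucleotide", "protein")
ALLOWED_RATE_VARIATION = ("No", "Gamma", "GDD")


def check_leisr_settings(data_type, model="JC69", rate_variation="No"):
    """Return the settings as a dict after checking the enumerated options."""
    if data_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {ALLOWED_TYPES}.")
    if rate_variation not in ALLOWED_RATE_VARIATION:
        raise ValueError(f"rate_variation must be one of {ALLOWED_RATE_VARIATION}.")
    # The model is not validated: the available models are defined by HyPhy.
    return {"type": data_type, "model": model, "rate_variation": rate_variation}


print(check_leisr_settings("protein", model="WAG", rate_variation="Gamma"))
```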