The Lanfear Lab @ANU Molecular Evolution and Phylogenetics

FAQ

  1. How do I speed up my analysis?
  2. My analysis quits without giving me a useful error message. What can I do?
  3. Can I use PartitionFinder2 to do model selection?
  4. Can I see the PartitionFinder2 source code?
  5. What models of molecular evolution are included in PartitionFinder2?
  6. How do I use PartitionFinder2 output to set up a BEAST analysis?
  7. Can I still get/use PartitionFinder 1.1.1?

How do I speed up my analysis?

PartitionFinder2 has to do a huge number of calculations to find the best partitioning scheme. On very large datasets, some types of analysis are just impractical. There are a number of things you can do to make sure your analysis runs as quickly as possible. Remember that the goal here is not necessarily to find the optimal partitioning scheme (though that would be nice) but to find the best partitioning scheme you can in a practical amount of time.

  1. Try a more efficient search. In order of increasing efficiency, the search options are: "all", "greedy", "rcluster". See the 'quickstart' page of the manual for more details.
  2. Use the --rcluster-max and --rcluster-percent settings. These can make the rcluster algorithm faster, at some cost in accuracy. See the manual for more details.
  3. Use the '--raxml' commandline option. See the manual for more details.
  4. Use a computer with multiple processors. PartitionFinder2 automatically detects how many processors you have available, and uses all of them. The '-p' option can be used to control how many processors PartitionFinder2 uses.
  5. Reduce the number of models you're considering. Most people start by selecting "models = all;". This is a good start, but in some cases it's just not practical to analyse all possible models. PartitionFinder2 will still work well if you use just one or two models, for instance with DNA sequences you can use "models = GTR+G;".
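
As a rough illustration of how the command-line options above fit together, here is a minimal Python sketch that assembles a command line. The flags '--raxml', '-p' and '--rcluster-max' are the ones mentioned in the list. The interpreter name, the script name "PartitionFinder.py", the folder name and the numeric values are assumptions made for the example, so check them against the manual and your own installation.

```python
def build_command(folder, use_raxml=True, processors=None, rcluster_max=None,
                  script="PartitionFinder.py"):
    """Assemble a PartitionFinder2 command line from the speed-related options.

    `script` and `folder` are placeholders: point them at your own copy of
    PartitionFinder2 and at the folder that holds your .cfg file.
    """
    cmd = ["python", script, folder]
    if use_raxml:
        cmd.append("--raxml")                        # option 3 in the list
    if processors is not None:
        cmd += ["-p", str(processors)]               # option 4 in the list
    if rcluster_max is not None:
        cmd += ["--rcluster-max", str(rcluster_max)]  # option 2 in the list
    return cmd


# Illustrative values only; sensible settings depend on your dataset.
cmd = build_command("my_analysis", processors=4, rcluster_max=100)
print(" ".join(cmd))
```

The sketch only builds the command as a list of arguments. It does not run anything, so you can inspect the result before pasting it into a terminal.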


My analysis quits without giving me a useful error message. What can I do?

PartitionFinder2 will usually give you a helpful error message when there's a problem, but in some cases we won't have anticipated a particular issue, so it will just quit without any useful error message. There are five things to try here, in this order.

  1. Double check that your partition_finder.cfg file follows all the conventions described in the manual. This is by far the most common cause of problems.
  2. Double check that your alignment file follows all the conventions described in the manual. This is the second most common cause of problems.
  3. Re-run your analysis with exactly the same commandline. PartitionFinder2 will use all the results it saved from the previous run, and in many cases it will just pick up where it left off and not give another error.
  4. Try re-starting your analysis from scratch. To do this, add "--force-restart" to the end of your command line. Be careful though, this command will delete all previous analyses.
  5. If none of the above work, post a question on the PartitionFinder2 google group. The more detail you can provide, the more likely we are to be able to help figure out the problem, so please attach the log.txt file to your email (it's in the folder with your .cfg file), and also state your operating system, and as much detail about your analysis as you can. If you email me in person, I'll just ask you to post your question on the google group, so please go straight there.
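
If you do end up posting to the google group, it helps to gather the details in one place first. The following is a minimal Python sketch of that, not part of PartitionFinder2. It assumes only what is stated above: that log.txt sits in the same folder as your .cfg file. The function name and the example paths are made up for illustration.

```python
import platform
from pathlib import Path


def support_report(cfg_path, command_line):
    """Collect the details worth including in a post to the google group."""
    cfg = Path(cfg_path)
    log = cfg.parent / "log.txt"  # log.txt lives next to the .cfg file
    if log.exists():
        log_note = str(log)
    else:
        log_note = "log.txt not found next to the .cfg file"
    lines = [
        "Operating system: " + platform.platform(),
        "Command line: " + command_line,
        "Config file: " + str(cfg),
        "Log file to attach: " + log_note,
    ]
    return "\n".join(lines)


# Hypothetical paths and command line, for illustration only.
print(support_report("my_analysis/partition_finder.cfg",
                     "python PartitionFinder.py my_analysis --raxml"))
```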

Can I use PartitionFinder2 to do model selection?

Yes. PartitionFinder2 can easily be used to do standard model selection, and it works in a very similar way to programs like ModelTest, ProtTest, ModelGenerator, etc. PartitionFinder2 should be as quick as, or quicker than, these programs. The big advantage of PartitionFinder2 is that it can perform model selection on partitioned datasets, doing model selection on each partition without having to run separate analyses. In fact, the algorithms we use in PartitionFinder2 are in many ways more appropriate for performing model selection on partitioned datasets than those in other programs, because we use information from the whole alignment to build a guide tree for the model selection. So, if you have a dataset and want to perform model selection, just follow these steps:

  1. In the .cfg file, specify the models you want to compare and the metric you want to compare them with (AIC, AICc, BIC)
  2. In the .cfg file, set "search=user;"
  3. In the .cfg file, specify the partitioning scheme you want to use (see the manual for how to do this).
  4. Run PartitionFinder2 following the instructions in the manual
  5. Your results will be printed out in the 'best_schemes.txt' file, which is in the /analysis folder
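
To make steps 1 to 3 concrete, here is a small Python sketch that writes out the text of a .cfg file with "search = user;". Only the "models" and "search" settings appear in this FAQ. The other setting names, the section headers, the layout of the user scheme line, and all of the file and data block names are assumptions based on the example .cfg described in the manual, so please compare the output against the manual before using it.

```python
def render_cfg(alignment, models, metric, data_blocks, scheme_name="user_scheme"):
    """Return the text of a .cfg file for plain model selection (search = user).

    `data_blocks` maps a data block name to its site range, e.g. {"gene1": "1-500"}.
    All setting names except `models` and `search` are assumptions to be
    checked against the manual.
    """
    blocks = "\n".join(f"{name} = {sites};" for name, sites in data_blocks.items())
    # One user-defined scheme that keeps every data block as its own subset.
    scheme = " ".join(f"({name})" for name in data_blocks)
    return "\n".join([
        f"alignment = {alignment};",
        "branchlengths = linked;",
        f"models = {models};",             # step 1: models to compare
        f"model_selection = {metric};",    # step 1: AIC, AICc or BIC
        "",
        "[data_blocks]",
        blocks,
        "",
        "[schemes]",
        "search = user;",                  # step 2
        f"{scheme_name} = {scheme};",      # step 3: the scheme to evaluate
        "",
    ])


# Hypothetical alignment file and data blocks, for illustration only.
print(render_cfg("alignment.phy", "all", "AICc",
                 {"gene1": "1-500", "gene2": "501-1200"}))
```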

The best_schemes.txt file tells you the best model for each subset of sites (sometimes called a partition) in your alignment. If you use the --save-phylofiles option (see the manual), PartitionFinder2 will also store all of the model selection results for each subset, very similar to the output of programs like ModelTest, ProtTest, etc. This information is stored in a .txt file inside the /analysis/subsets folder. To find it, copy the subset identifier from the best_schemes.txt file (in the "Alignment" column). This is a long name, something like "50bf1643d2a386419c9264eccd173b6b". Now go and find the .txt file in /analysis/subsets that has that name, e.g. 50bf1643d2a386419c9264eccd173b6b.txt. That file contains neatly formatted model selection results for the subset.
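
If you have a lot of subsets, looking up these files by hand gets tedious. Here is a minimal Python sketch of the lookup described above. It assumes only the layout given in this answer, with one <identifier>.txt file per subset inside the subsets folder of the analysis folder. The function name and the example path are made up for illustration.

```python
from pathlib import Path


def subset_results_file(analysis_dir, subset_id):
    """Return the path to a subset's model selection results, or None if absent."""
    path = Path(analysis_dir) / "subsets" / (subset_id + ".txt")
    return path if path.is_file() else None


# Hypothetical analysis folder; the identifier is the example from the text.
found = subset_results_file("my_analysis/analysis",
                            "50bf1643d2a386419c9264eccd173b6b")
print(found if found else "No results file for that subset identifier")
```

If the function returns None, check the identifier you copied and check that the analysis was run with the --save-phylofiles option.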

Can I see the PartitionFinder2 source code?

Yes. It's here: https://github.com/brettc/partitionfinder. It's released under a GNU General Public License, which means you can do more or less whatever you want with it.

What models of molecular evolution are included in PartitionFinder2?

This is described in the manual, but in short it's all the named models available in PhyML and RAxML.

How do I use PartitionFinder2 output to set up a BEAST analysis?

First things first. PartitionFinder2 is built around a likelihood framework in which the free parameters are the parameters of the model of molecular evolution, the tree topology, and the branch lengths. That's it. This is exactly what's used in RAxML, PhyML, GARLi, and most other maximum likelihood phylogenetics programs. Even MrBayes can be set up to run in a very similar way. BEAST is a Bayesian MCMC program in which analyses tend to contain a lot more free parameters than the ones already mentioned. Specifically, most BEAST analyses contain at least a few free parameters to do with molecular rates and dates, and many contain free parameters about multiple topologies (e.g. gene trees), population sizes, migration rates, etc.

It's important to be aware that because PartitionFinder2 does not take these parameters into account (it's not designed to), the partitioning schemes you estimate in PartitionFinder2 are not always appropriate for BEAST analyses. So, before using your PartitionFinder2 output to set up a BEAST analysis, please consider carefully whether this is sensible. If you're not sure, post on the PartitionFinder2 or BEAST google groups as appropriate. And before you use PartitionFinder2, you should first try the native Bayesian solutions to the partitioning problem implemented in BEAST. These are more elegant, and much more appropriate, than the solutions in PartitionFinder2 for a BEAST run. Details are in Wu et al (2012): http://www.ncbi.nlm.nih.gov/pubmed/23233462.

But, if you do need to use PartitionFinder2 output in BEAST, you'll notice that the names of the models don't quite match up. Here's a translation table to help.

K80 in PartitionFinder: in BEAUti this is “HKY” with “base frequencies” set to “All Equal”
TrNef in PartitionFinder: in BEAUti this is “TN93” with “base frequencies” set to “All Equal”
SYM in PartitionFinder: in BEAUti this is “GTR” with “base frequencies” set to “All Equal”
HKY in PartitionFinder: in BEAUti this is “HKY” with “base frequencies” set to “estimated”
TrN in PartitionFinder: in BEAUti this is “TN93” with “base frequencies” set to “estimated”
GTR in PartitionFinder: in BEAUti this is “GTR” with “base frequencies” set to “estimated”
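
If you have many subsets to set up, the translation table above can be written as a small lookup. This is a minimal Python sketch that contains only the six models listed in the table. Dropping anything after a "+" (for example the "+G" in "GTR+G") is an assumption of the sketch, and whatever that suffix specifies still needs to be set separately in BEAUti.

```python
# PartitionFinder model name -> (BEAUti substitution model, "base frequencies" setting)
BEAUTI_SETTINGS = {
    "K80":   ("HKY",  "All Equal"),
    "TrNef": ("TN93", "All Equal"),
    "SYM":   ("GTR",  "All Equal"),
    "HKY":   ("HKY",  "estimated"),
    "TrN":   ("TN93", "estimated"),
    "GTR":   ("GTR",  "estimated"),
}


def to_beauti(pf_model):
    """Translate a PartitionFinder model name into the matching BEAUti settings.

    Anything after the first "+" is ignored (an assumption of this sketch).
    Raises KeyError for models that are not in the translation table.
    """
    base = pf_model.split("+", 1)[0]
    return BEAUTI_SETTINGS[base]


model, frequencies = to_beauti("SYM+G")
print(f"BEAUti model: {model}, base frequencies: {frequencies}")
```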

Can I still get/use PartitionFinder 1.1.1?

Yes. All old versions of PartitionFinder are here, and version 1.1.1 specifically is here. Please note that as of December 2016, PF1.1.1 is no longer supported, so we will not be fixing bugs or making any updates.