ProteinHistorian: Protein Age Estimation and Enrichment Analysis

ProteinHistorian identifies enrichment for proteins of different phylogenetic ages in protein sets of interest. ProteinHistorian is to evolutionary history as Gene Ontology term enrichment analysis is to function. Over thirty eukaryotic species are currently supported. Other protein attributes can be uploaded to combine with the protein age analysis. ProteinHistorian is described in the following paper:

Capra JA, Williams AG, and Pollard KS. ProteinHistorian: Tools for the Comparative Analysis of Eukaryote Protein Origin. PLoS Computational Biology, In Press, 2012.

Please cite the paper if you use any of the resources below.

Age data for proteins in all species and open source command line versions of the protein age enrichment analysis scripts are available on the downloads page. Details of the creation and contents of the databases are available on the methods page. Frequently asked questions are answered in the FAQ.

Run ProteinHistorian with your own data:

(Or use the example data that is already in place.)

1. Select species:

Species:

Age estimation options

For a given species, multiple pre-computed protein age estimates, based on different databases of evolutionary relationships and ancestral reconstruction algorithms, may be available. The options for the selected species are listed in the Family Database and Reconstruction Algorithm menus. For more details on particular databases and algorithms, please see the methods page.

Family Database:
Reconstruction Algorithm:


Given the complex evolutionary histories of proteins, different age estimation strategies may produce different ages for the same proteins.

2. Input your genes/proteins of interest:

What if my protein IDs are in a different format?

Paste list of protein IDs of interest:

(One protein ID per line.)

Upload a file with list of proteins:

Accepted formats:

Plain text file with one protein ID per line. For example: human_high_expr.txt

GO Annotation File (GAF) v2.0. Available from QuickGO and AmiGO.
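For users preparing input offline, the two accepted formats can be reduced to a single list of IDs. The following is a minimal sketch, not the code ProteinHistorian itself runs. The function name `parse_protein_ids` is hypothetical. The sketch treats any line with at least 15 tab-separated fields as a GAF 2.0 annotation row and takes the ID from the second column (the DB Object ID).

```python
def parse_protein_ids(lines):
    """Return unique protein IDs, in first-seen order, from plain-text or GAF 2.0 lines.

    Hypothetical helper: not part of the ProteinHistorian scripts.
    """
    ids = []
    seen = set()
    for raw in lines:
        line = raw.rstrip("\r\n")
        # GAF header and comment lines start with "!"; skip those and blank lines.
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.split("\t")
        if len(fields) >= 15:
            # GAF 2.0 rows have 15 required tab-separated columns;
            # column 2 holds the DB Object ID.
            protein_id = fields[1].strip()
        else:
            # Plain-text input: one protein ID per line.
            protein_id = fields[0].strip()
        if protein_id and protein_id not in seen:
            seen.add(protein_id)
            ids.append(protein_id)
    return ids
```

Because a file object iterates over its lines, the same function can be called as `parse_protein_ids(open(path))` for either format.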

Input background or comparison protein set

Input a background or comparison set:

By default, ProteinHistorian compares your input proteins to the background of all protein ages in the species. This section allows you to input a second set of proteins to serve as a comparison or more specific background set.
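To illustrate what comparing an input set against the all-proteins background can look like, here is a rough sketch that uses a one-sided hypergeometric test for each age. This is an assumption made for illustration. The statistics ProteinHistorian actually applies are documented on the methods page. The names `age_enrichment` and `hypergeom_upper_tail` and the `ages` dictionary are hypothetical.

```python
from collections import Counter
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n proteins are drawn without replacement from N,
    of which K have the age in question."""
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return favorable / comb(N, n)


def age_enrichment(input_ids, ages):
    """Map each age label to an enrichment p-value for the input set.

    `ages` is a hypothetical dict {protein_id: age_label} covering every
    protein in the species; input IDs without age data are ignored.
    """
    sample = {p for p in input_ids if p in ages}
    background_counts = Counter(ages.values())
    sample_counts = Counter(ages[p] for p in sample)
    N, n = len(ages), len(sample)
    return {
        age: hypergeom_upper_tail(sample_counts[age], n, K, N)
        for age, K in background_counts.items()
    }
```

A small p-value for an age suggests that the input set contains more proteins of that age than expected by chance from the background.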

All proteins (default): Use the background age distribution of all proteins in the database.

OR upload a file with list of background proteins for comparison:

Accepted formats:

Plain text file with one protein ID per line. For example: human_low_expr.txt

GO Annotation File (GAF) v2.0. Available from QuickGO and AmiGO.

OR paste list of proteins for background comparison here:


More analysis options

If no age data are present for a protein of interest:

Multiple Test Correction: Use Bonferroni correction when determining significance.
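As a rough illustration of what the Bonferroni option does, the sketch below multiplies each p-value by the number of tests and caps the result at 1. It assumes one test per age category and a significance level of 0.05. Both are assumptions for this example, and the function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Apply Bonferroni correction to a dict {test_name: p_value}.

    Returns {test_name: (adjusted_p, is_significant)}. The default alpha
    of 0.05 is an assumption, not a documented ProteinHistorian setting.
    """
    m = len(p_values)  # number of tests performed
    corrected = {}
    for name, p in p_values.items():
        adjusted = min(1.0, p * m)  # adjusted p-values never exceed 1
        corrected[name] = (adjusted, adjusted <= alpha)
    return corrected
```

The correction is conservative: with many age categories, only strong enrichments remain significant.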


3. (Optional) Input protein feature data:

ProteinHistorian can also correlate protein ages with other quantifiable protein features (e.g., length, essentiality, evolutionary rate).

Paste list of proteins with attributes:

Upload a file with list of proteins with attributes:

The input file should be a space-delimited plain text file with data for a single protein per row, for example: human_lens.txt.
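A file in this format can be checked before upload with a short script such as the sketch below. It assumes that each row holds a protein ID followed by one numeric value. The exact column layout is an assumption, so compare against the example file. The function name `parse_protein_attributes` is hypothetical.

```python
def parse_protein_attributes(lines):
    """Read whitespace-delimited rows of '<protein_id> <numeric_value>'.

    Hypothetical helper; assumes the first column is the protein ID and
    the second is the attribute value. Rows that do not fit are skipped.
    """
    values = {}
    for raw in lines:
        fields = raw.split()
        if len(fields) < 2:
            continue  # blank or incomplete row
        try:
            values[fields[0]] = float(fields[1])
        except ValueError:
            continue  # header or non-numeric value
    return values
```

Comparing the number of parsed entries with the number of lines in the file is a quick way to spot malformed rows.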


Supplementary Material

Downloads
Age data for proteins in all species and open source command line versions of the protein age enrichment analysis scripts are available.
Methods
This page provides details on the creation and contents of the databases available for each species.

This page uses Google Analytics to analyze usage patterns. Google Analytics is used by many of the web's most popular sites, but if you would like to avoid having your traffic logged, you can download the opt-out browser plug-in or edit your /etc/hosts file.