This folder contains ELASPIC database tables and external files, which are required to run the ELASPIC pipeline on a local computer.
The files are named using the convention
table_name is the name of the originating table in the ELASPIC database.
The root folder contains the
domain_contact.tsv.gz files, which contain Profs domain definitions for all proteins in the PDB, and information about the interactions between those domains, respectively. Both of those files are required to set up the ELASPIC pipeline to work with any organism.
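As a minimal sketch of how one of these gzip-compressed, tab-separated files might be inspected before loading it into a database, the following uses only the Python standard library. The helper name `read_tsv_gz` and the `has_header` switch are hypothetical, and whether the files carry a header row is an assumption that should be checked against the actual data.

```python
import csv
import gzip


def read_tsv_gz(path, has_header=False):
    """Read a gzip-compressed, tab-separated file into a list of rows.

    Assumption: the file is UTF-8 text. Whether a header row is present
    is not stated in this README, so it is left to the caller.
    With has_header=True each row is returned as a dict keyed by the
    first row; otherwise each row is a list of strings.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quote characters as literal data.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = list(reader)
    if has_header and rows:
        header, body = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in body]
    return rows
```

For example, `read_tsv_gz("domain_contact.tsv.gz")` would return every line of that file as a list of column values. For large tables it may be preferable to iterate over the reader instead of building the full list in memory.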
All other tables and files are separated into different folders based on their organism of origin, in order to make the size of the download more manageable. We provide separate downloads for the following organisms:
Data for all other organisms can be found in the All_other_organisms folder.
Each organism folder contains the following files:
Information about the database tables that correspond to those files, including an outline of the table columns, is available in the online documentation.
Each organism folder also contains the following subfolders:
The provean subfolder contains Provean supporting sets, which are referenced in the provean table of the ELASPIC database.
The uniprot_domain subfolder contains alignments of protein domains with their structural templates, and homology models of protein domains made using those structural templates. This data is referenced in the uniprot_domain_model table of the ELASPIC database.
The uniprot_domain_pair subfolder contains alignments of protein domain pairs with their structural templates, and homology models of protein domain pairs made using those structural templates. This data is referenced in the uniprot_domain_pair_model table of the ELASPIC database.
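After downloading and unpacking an organism folder, it may be useful to confirm that the three subfolders described above are present. The sketch below is one way to do that. The function name `missing_subfolders` is hypothetical, and the check assumes the subfolders sit directly inside the organism folder under exactly the names used in this README.

```python
from pathlib import Path

# Subfolder names as given in this README.
EXPECTED_SUBFOLDERS = ("provean", "uniprot_domain", "uniprot_domain_pair")


def missing_subfolders(organism_dir):
    """Return the expected subfolder names not found in organism_dir."""
    root = Path(organism_dir)
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]
```

An empty list means all three subfolders were found. Any names returned point to an incomplete download or an archive that was unpacked to a different location.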