Pocket Similarity: Are Cα’s Enough?

A novel method for measuring protein pocket similarity was devised, using only the alpha carbon (Cα) positions of the pocket residues. Pockets were compared pairwise using an exhaustive 3D Cα common-subset search, with residues grouped by physicochemical properties. At least five Cα matches were required for each hit, and distances between corresponding points were fit to an Extreme Value Distribution (EVD), yielding a probabilistic score, or likelihood, for any given superposition. Using this score, a set of 85 structures from 13 diverse protein families was clustered based on binding sites alone. The score was also used successfully to cluster 25 kinases into a number of subfamilies. When a test kinase query was used to retrieve other kinase pockets, a specificity of 99.2% and a sensitivity of 97.5% could be achieved with an appropriate cutoff score. A search of the entire PDB (133,800 pockets) took from 2 to 15 minutes on a single 3 GHz CPU, depending on the number of hits returned. A short demonstration of the protein structure repository PSILO will follow the scientific presentation.
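
To make the retrieval figures above concrete, the sketch below shows how sensitivity and specificity are computed once a cutoff score has been chosen. It is an illustration only, not the method's actual implementation: the function name `sensitivity_specificity`, the toy scores, the labels, and the assumption that a higher score means a more similar pocket are all hypothetical.

```python
def sensitivity_specificity(scores, is_kinase, cutoff):
    """Return (sensitivity, specificity) when pockets scoring >= cutoff count as hits.

    Assumption: a higher score means a more similar pocket.
    """
    tp = fp = tn = fn = 0
    for score, positive in zip(scores, is_kinase):
        predicted_hit = score >= cutoff
        if predicted_hit and positive:
            tp += 1          # kinase pocket correctly retrieved
        elif predicted_hit and not positive:
            fp += 1          # non-kinase pocket wrongly retrieved
        elif positive:
            fn += 1          # kinase pocket missed
        else:
            tn += 1          # non-kinase pocket correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up scores for eight pockets; the first four are labeled as kinases.
scores = [0.95, 0.90, 0.85, 0.40, 0.70, 0.30, 0.20, 0.10]
is_kinase = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(scores, is_kinase, cutoff=0.5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Raising the cutoff in this toy data trades sensitivity for specificity, which is why an appropriate cutoff has to be selected for a given query.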

Airdate: September 17, 2021