SitEx
D-glyceraldehyde-3-phosphate dehydrogenase (GAPDH, PDB: 1U8F, EC 1.2.1.12) binds nicotinamide adenine dinucleotide (NAD) in its active site (AC2), which is formed at the boundary of two NAD-binding domains and is encoded by all 8 exons of the sequence. Using 3DExonScan integrated with SitEx, we found structural similarity (Z-score 3.9, RMSD 3.4) with the polypeptide encoded by the sixth exon of alcohol dehydrogenase (PDB: 1D1T, EC 1.1.1.1), although no sequence similarity was detected. This polypeptide is partly involved in NAD binding (EC2 site). A comparison of the positions of the NAD-binding amino acids showed that they are located in similar regions of the spatial structure of the two polypeptides and are similar in amino acid composition. The binding sites of the acetate and zinc ions are located in these same regions. Note that the region of the 1U8F structure that aligns with the 1D1T polypeptide is encoded by 4 exons. The alcohol dehydrogenase site is likewise formed by two dehydrogenase domains.
Prokaryotes as a specific type of analysis of exons
The search for conserved functional units of proteins serves as an example application of SitEx. Human uroporphyrinogen decarboxylase (URO-D, PDB: 1R3S, site AC1) is a single-domain protein encoded by 10 exons. Its binding site is composed of 18 amino acids encoded by 8 exons. Bacillus subtilis possesses a protein with the same function (PDB: 2INF) that is similar in sequence (E-value ≈ 10^-64) and structure (Z-score = 6.1). A search using 3DExonScan revealed high structural similarity between the two proteins only for the polypeptide encoded by the sixth exon of the human sequence (Z-score = 4.9). High similarity between the two proteins was found only for exons 2, 5, 6, and 10 of the human sequence (E-value ≈ 10^-8).
According to Fan et al., of the 17 amino acids in the human functional site, 7 differ from the corresponding amino acids in the active center of Bacillus subtilis uroporphyrinogen decarboxylase. Substitutions were observed only for hydrophobic amino acids, whereas the amino acids that directly bind coproporphyrin remained conserved; these are located in polypeptides encoded by exons 2, 6 and 10 of human uroporphyrinogen decarboxylase.
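The exon-level observations above can be cross-checked with plain set operations. The following is a minimal sketch that uses only the exon numbers quoted in the text; the variable names are hypothetical and are not part of SitEx or 3DExonScan.

```python
# Exon sets reported in the text for human uroporphyrinogen decarboxylase (PDB: 1R3S).
# All names below are illustrative; only the exon numbers come from the text.
sequence_similar = {2, 5, 6, 10}   # exons with E-value of about 1e-8 against 2INF
structure_similar = {6}            # exon reported by 3DExonScan (Z-score 4.9)
conserved_binding = {2, 6, 10}     # exons carrying the conserved coproporphyrin-binding residues

# Exons supported by both sequence and structural similarity.
both = sequence_similar & structure_similar

# Exons with conserved binding residues that lack a sequence hit.
binding_without_sequence_hit = conserved_binding - sequence_similar

# Exons with a sequence hit that carry no conserved binding residue.
sequence_hit_without_binding = sequence_similar - conserved_binding

print(sorted(both))
print(sorted(binding_without_sequence_hit))
print(sorted(sequence_hit_without_binding))
```

Under these numbers, every exon that carries a conserved coproporphyrin-binding residue also shows sequence similarity to the Bacillus subtilis protein, and exon 6 is the only one supported by both the sequence and the structure comparison.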
Investigation of the exon structure of encoded protein promiscuous domain
The Carboxylesterase type B domain (Pfam: PF00135) occurs in proteins with different functions and can be found in combination with various other protein domains. Acetylcholinesterase (PDB: 2X8B) was taken as an example. Using the BLAST program, we established sequence similarities between the Acetylcholinesterase sequence and the exon sequences coding for the following proteins: Butyrylcholinesterase (PDB: 1P0I), Bile salt-dependent lipase (PDB: 1AQL), Carboxylesterase 1 (PDB: 2H7C), Neuroligin-2 (PDB: 3BL8), and Neuroligin-4 (PDB: 3BE8). All these proteins contain the Carboxylesterase type B domain, which is encoded by a different number of exons in each protein. Notably, exon 2 in the coding structure of Neuroligin-2 shares no sequence similarity with the exon sequences coding for this domain in the other proteins. We examined the distribution of the functional sites over the exon structure of the coding sequences. These proteins possess a common ligand in their binding sites, N-acetyl-D-glucosamine (NAG). CoefE = 0 for its binding sites in all the structures with the exception of 3BL8 (Table 4). In 3BL8 this ligand is bound to a protein region encoded by exons that are not neighbors (CoefE > 0), owing to the presence of the additional exon 2. The other ligands bound by the proteins listed above differ from protein to protein. CoefE values deviating from 0 indicate that the amino acids of the binding sites for taurocholic acid (TCH) and coenzyme A (COA), the bulky ligands, are not encoded by neighboring exons, whereas the binding sites for the smaller ligands (FUC, fucose; BUA, butanoic acid; BMA, MAN, mannose; FLC, citrate anion) are encoded by a single exon or by neighboring exons (CoefE = 0). CoefA values close to 1 indicate that the amino acids of the sites that bind them are quite distant from each other.
The CoefA coefficient equals 0 for binding sites in which a single amino acid of the site occurs in one chain (Table 1).
Table 1. Binding sites for ligands in the Carboxylesterase type B domain
| PDB Id | Site | Ligand | CoefE | CoefA | N(exons) | N(domains) | N(chains) |
|---|---|---|---|---|---|---|---|
| | AC1 | NAG | 0 | 0.6 | 1 | 1 | 1 |
| | AC4 | NAG | 0 | 0.4 | 1 | 1 | 1 |
| | AC5 | NAG | 0 | 0 | 1 | 1 | 1 |
| | AC7 | NAG | 0 | 0.953 | 1 | 1 | 1 |
| | AC8 | NAG | 0 | 0.955 | 1 | 1 | 1 |
| | AC9 | NAG | 0 | 0.857 | 2 | 1 | 1 |
| | AC3 | FUC | 0 | 0.912 | 1 | 1 | 1 |
| | BC5 | BUA | 0 | 0.985 | 1 | 1 | 1 |
| | AC1 | NAG | 0 | 0 | 1 | 1 | 1 |
| | AC3 | TCH | 0/0 | 0.5/0 | 1/1 | 1 | 2 |
| | AC4 | TCH | 0.25 | 0.932 | 3 | 1 | 1 |
| | AC5 | TCH | 0.714/0 | 0.982/0 | 2/1 | 1 | 2 |
| | AC1 | NAG | 0 | 0.25 | 1 | 1 | 1 |
| | CC5 | COA | 0.33/0.25 | 0.953/0.91 | 4/3 | 1 | 2 |
| | DC1 | COA | 0.5/0.375 | 0.972/0.959 | 4/6 | 1 | 2 |
| | AC1 | NAG | 0.714 | 0.991 | 2 | 1 | 1 |
| | AC2 | NAG | 0.714 | 0.996 | 2 | 1 | 1 |
| | AC4 | NAG | 0 | 0.4 | 1 | 1 | 1 |
| | AC5 | NAG | 0.33/0 | 0.968/0 | 2/1 | 1 | 2 |
| | AC7 | NAG | 0.33 | 0.977 | 2 | 1 | 1 |
| | AC8 | BMA | 0 | 0 | 1 | 1 | 1 |
| | AC9 | MAN | 0 | 0.939 | 2 | 1 | 1 |
| | AD1 | MAN | 0 | 0 | 1 | 1 | 1 |
| | BC2 | MAN | 0 | 0.968 | 2 | 1 | 1 |
| | AC1 | NAG | 0 | 0 | 1 | 1 | 1 |
| | AC2 | NAG | 0 | 0.886 | 1 | 1 | 1 |
| | AC5 | FLC | 0 | 0.933 | 2 | 1 | 1 |
| | AC9 | NA | 0 | 0.5 | 1 | 1 | 1 |
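
The ligand-level reading of Table 1 can be reproduced with a short script. The sketch below copies the Ligand and CoefE columns from the table and reports, for each ligand, the largest CoefE observed. It assumes that a cell such as `0.714/0` holds one value per chain (such cells appear only in rows with N(chains) = 2); the helper names `parse_cell` and `max_coef_e_by_ligand` are hypothetical and not part of SitEx.

```python
from collections import defaultdict

# (ligand, CoefE) pairs copied from Table 1, in row order.
# Cells like "0.714/0" are assumed to list one value per chain.
ROWS = [
    ("NAG", "0"), ("NAG", "0"), ("NAG", "0"), ("NAG", "0"),
    ("NAG", "0"), ("NAG", "0"), ("FUC", "0"), ("BUA", "0"),
    ("NAG", "0"), ("TCH", "0/0"), ("TCH", "0.25"), ("TCH", "0.714/0"),
    ("NAG", "0"), ("COA", "0.33/0.25"), ("COA", "0.5/0.375"),
    ("NAG", "0.714"), ("NAG", "0.714"), ("NAG", "0"), ("NAG", "0.33/0"),
    ("NAG", "0.33"), ("BMA", "0"), ("MAN", "0"), ("MAN", "0"), ("MAN", "0"),
    ("NAG", "0"), ("NAG", "0"), ("FLC", "0"), ("NA", "0"),
]


def parse_cell(cell):
    """Split a table cell such as '0.714/0' into a list of floats."""
    return [float(part) for part in cell.split("/")]


def max_coef_e_by_ligand(rows):
    """Return the largest CoefE seen for each ligand across all sites and chains."""
    best = defaultdict(float)  # missing ligands start at 0.0
    for ligand, coef_e in rows:
        best[ligand] = max(best[ligand], *parse_cell(coef_e))
    return dict(best)


result = max_coef_e_by_ligand(ROWS)
# Ligands with at least one site encoded by non-neighboring exons (CoefE > 0).
nonzero = sorted(ligand for ligand, value in result.items() if value > 0)
print(nonzero)
```

Because the PDB Id column is empty in the table as reproduced here, this sketch cannot separate the 3BL8 NAG sites from the NAG sites of the other structures; it only shows which ligands have any site with CoefE > 0.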