Prof. Dr.-Ing. Werner Henkel
In the previous project, we used classical multidimensional scaling (CMD) to reduce the 64 x 64 empirical codon mutation (ECM) matrix and the 20 x 20 amino acid chemical distance matrix to two dimensions (2-D). The 2-D representations showed similar clusterings, which suggests that highly probable mutations occur between codons with similar chemical properties, at least in terms of polarity, chemical composition, and molecular volume. However, we also observed some inconsistencies and suspected that they might stem from the dimension reduction method we employed. Hence, in this project, we implemented sparse subspace clustering (SSC). The result was not what we expected: almost half of the codons are located extremely close to each other, which makes it difficult to observe any clear relation. The CMD result was much better at showing the similarities and differences in codon distances.
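For readers unfamiliar with the CMD baseline referred to above, the following is a minimal sketch of textbook classical multidimensional scaling, not the project's actual code. It assumes the input is already a symmetric distance matrix; how the ECM matrix was converted into distances is not described here, so the function name `classical_mds` and the toy data (four corners of a unit square) are purely illustrative.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed n points in k dimensions from an n x n distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-centre the squared distances: B = -1/2 * J D^2 J
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # eigh returns eigenvalues in ascending order; keep the k largest
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:k]
    # Clip small negative eigenvalues, which can appear for non-Euclidean distances
    lam = np.clip(eigvals[idx], 0.0, None)
    return eigvecs[:, idx] * np.sqrt(lam)

# Toy data (hypothetical): four corners of a unit square
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

X = classical_mds(D, k=2)
# Pairwise distances of the embedding, for comparison with D
D_rec = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
```

For distances that are exactly Euclidean in two dimensions, as in this toy case, the embedding reproduces the input distances up to rotation and reflection. For a 64 x 64 codon distance matrix the 2-D embedding would in general only approximate them.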
As separate, additional work within the semester project, we revised a paper submitted to a BioMed Central (BMC) journal according to the reviewers' comments and suggestions. The title of the paper is “The Empirical Codon Mutation Matrix as a Communication Channel”.
[Report] Sparse Subspace Clustering for Dimension Reduction of Mutation Matrix