As I attempted to wrap up the final details of my project, I discovered that many problems had not been addressed yet. During the initial stages of my research, I had convinced myself that it would be relatively easy to complete the project. Unfortunately, I couldn't have been more wrong. While analyzing my data, I began to question my understanding of multivariate data analysis and statistics. Did any of my results make sense? Or even worse, did I use the correct analysis technique? As I learnt the hard way, any mistake cost me a lot of time, since analyzing thousands of variables took a lot of computational resources. For these reasons, I spent some of my time reading papers and thinking about potential solutions to my problems. Below, I describe some of my findings.
Global Minimum in MSE vs Smoothing Parameter Graph
As I mentioned in my older blog posts, the only parameter for GRNN is the smoothing parameter. In using the holdout method suggested by its author, it was important to choose the global minimum in the graph of MSE vs smoothing parameter (also known as spread). Since it was computationally expensive to compute the MSE for many values of the smoothing parameter, I could not be certain that I had chosen a global minimum. For this reason, I found a way to estimate the MSE as the spread approaches infinity. It exploits the fact that an exponential function equals 1 when its exponent is 0: as the spread grows, the exponent of every kernel term goes to 0, so every training point receives the same weight of 1.
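Written out as a sketch, assuming the usual Gaussian-kernel form of GRNN, the limit looks like this. The symbols are my own notation for illustration: \sigma is the spread, X_i and Y_i are the n training inputs and targets, and D_i^2 is the squared distance from the query point x to X_i.

```latex
\hat{Y}(x) = \frac{\sum_{i=1}^{n} Y_i \exp\!\left(-\frac{D_i^2}{2\sigma^2}\right)}
                  {\sum_{i=1}^{n} \exp\!\left(-\frac{D_i^2}{2\sigma^2}\right)},
\qquad D_i^2 = (x - X_i)^{\top}(x - X_i)

\lim_{\sigma \to \infty} \hat{Y}(x)
  = \frac{\sum_{i=1}^{n} Y_i \cdot 1}{\sum_{i=1}^{n} 1}
  = \frac{1}{n}\sum_{i=1}^{n} Y_i
  = \bar{Y}
```

A toolbox may scale the exponent differently from the 1/(2\sigma^2) assumed here, but the limit is the same, because any such exponent still goes to 0 as the spread grows.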
The formula for GRNN then becomes the formula for the mean of the training data Y. Therefore, I could compute the value of the MSE at extreme values of SPREAD. With this information, I could be more certain that I had picked the global minimum when I chose a SPREAD that lies before the curve flattens out toward this asymptote.
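To make the idea concrete, here is a minimal sketch in plain Python rather than MATLAB. It assumes a Gaussian kernel of the form exp(-d^2 / (2 * spread^2)); the function names and the toy data are made up for illustration and are not from my project code.

```python
import math

def grnn_predict(x, train_x, train_y, spread):
    """Predict at x as a Gaussian-kernel weighted average of the training targets."""
    weights = []
    for xi in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))  # squared Euclidean distance
        weights.append(math.exp(-d2 / (2.0 * spread ** 2)))
    # Note: for a very small spread all weights can underflow to 0,
    # which this sketch does not guard against.
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

def holdout_mse(train_x, train_y, test_x, test_y, spread):
    """MSE on the held-out points for one value of the spread."""
    preds = [grnn_predict(x, train_x, train_y, spread) for x in test_x]
    return sum((p - t) ** 2 for p, t in zip(preds, test_y)) / len(test_y)

def asymptotic_mse(train_y, test_y):
    """Limit of the holdout MSE as spread -> infinity: every prediction is mean(train_y)."""
    mean_y = sum(train_y) / len(train_y)
    return sum((mean_y - t) ** 2 for t in test_y) / len(test_y)

# Hypothetical toy data (y = x^2), only to exercise the functions.
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [0.0, 1.0, 4.0, 9.0]
test_x = [[0.5], [2.5]]
test_y = [0.25, 6.25]

limit = asymptotic_mse(train_y, test_y)
for spread in (0.5, 1.0, 5.0, 1e6):
    print(spread, holdout_mse(train_x, train_y, test_x, test_y, spread), limit)
```

The asymptotic value costs one pass over the held-out targets, so it can be drawn as a horizontal reference line on the MSE vs spread graph without evaluating the network at huge spreads.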
Principal Component Analysis (PCA) Implementation
PCA is a well-known technique for reducing the dimensionality of data. Even though I have used this technique since my undergraduate career, I never understood why or how it worked. Since I needed to use PCA for this project, I decided to gain a deeper understanding of it by reading about its mathematics and implementing it from scratch in MATLAB. I was very happy when I succeeded. My implementation was slower than MATLAB's, but writing it helped me understand the fundamentals of PCA.
My algorithm uses the standard eigendecomposition of the covariance matrix. Since the covariance matrix is symmetric, it has very special properties given by the Spectral Theorem: its eigenvalues are real, and its eigenvectors can be chosen to be mutually orthogonal. The trace of the covariance matrix represents the total variance in the data, which is also equal to the sum of the eigenvalues. Each eigenvalue is the variance of the data along its eigenvector, so the eigenvector with the largest eigenvalue points in the direction of greatest variance. The original data can then be transformed into a new coordinate system spanned by the orthonormal eigenvectors.
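The steps above can be sketched in plain Python. This is not my MATLAB implementation: to keep the example free of dependencies, it finds the eigenvectors with cyclic Jacobi rotations (a technique I am swapping in here, which only works for symmetric matrices), and all function names and the toy data are hypothetical.

```python
import math

def covariance_matrix(data):
    """Return (sample covariance, centered data); rows are observations."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centered

def jacobi_eigh(sym, max_sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # a <- a * J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # a <- J^T * a (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: v <- v * J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def pca(data):
    """Return eigenvalues (descending), components (rows), scores, covariance."""
    cov, centered = covariance_matrix(data)
    eigvals, eigvecs = jacobi_eigh(cov)
    order = sorted(range(len(eigvals)), key=lambda i: eigvals[i], reverse=True)
    values = [eigvals[i] for i in order]
    components = [[eigvecs[k][i] for k in range(len(eigvals))] for i in order]
    # Project the centered data onto each eigenvector.
    scores = [[sum(r[k] * comp[k] for k in range(len(comp))) for comp in components]
              for r in centered]
    return values, components, scores, cov

# Hypothetical toy data: 6 observations of 3 variables.
data = [[2.5, 2.4, 0.5], [0.5, 0.7, 1.9], [2.2, 2.9, 0.8],
        [1.9, 2.2, 1.1], [3.1, 3.0, 0.3], [2.3, 2.7, 0.9]]
values, components, scores, cov = pca(data)
print("eigenvalues:", values)
print("trace of covariance:", sum(cov[i][i] for i in range(len(cov))))
```

Comparing the two printed quantities is a quick sanity check on the trace property: the eigenvalues should add up to the total variance. Keeping only the first few columns of the scores is what reduces the dimensionality.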