Friday, July 15, 2016

Week 6: Preliminary Results

This Friday marks the end of the sixth week of the summer immersion program. I am always amazed by how fast the time passes. As my time in the city draws to an end, I feel some relief at having implemented a big part of the machine learning algorithm for my project. In this post, I will briefly describe its implementation in MATLAB and the results of applying it to a public dataset.

The traumatic brain injury patient data are not available to the public. In addition, the computer on which the data are stored is missing a toolbox function I need to begin processing, but it is simple enough that I can code my own version by the end of this week. In the meantime, I was able to apply the GRNN (generalized regression neural network) to some simple models and datasets included with MATLAB. Some of the results are shown below:

Experiment 1: Proof of Concept

In this experiment, I trained the network on two points, (1,1) and (2,2), and then estimated y over a range of x values while varying the smoothing parameter from 0.1 to 5. With a small smoothing parameter, only the points immediately near the training values contribute to the final estimate, as depicted below:
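As a rough sketch of how this setup can be reproduced, the snippet below builds the GRNN with MATLAB's newgrnn function and sweeps a few smoothing (spread) values; the specific spread values and query range here are illustrative assumptions, not the exact ones used above.

% Proof-of-concept sketch: GRNN trained on two points, varying the spread.
% Assumes the Neural Network Toolbox function newgrnn is available.
P = [1 2];                 % training inputs
T = [1 2];                 % training targets
x = 0:0.05:3;              % query points (illustrative range)

figure; hold on;
for spread = [0.1 0.5 1 5]             % smoothing parameter values
    net = newgrnn(P, T, spread);       % build the GRNN
    y = sim(net, x);                   % estimate y at the query points
    plot(x, y, 'DisplayName', sprintf('spread = %.1f', spread));
end
plot(P, T, 'ko', 'MarkerFaceColor', 'k', 'DisplayName', 'training points');
xlabel('x'); ylabel('estimated y'); legend show;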


Experiment 2: Nonlinear Curve Fitting

In this test, the function y = sin(x)/x was sampled at 63 data points from 0 to 2π. Then, 30 randomly chosen points were removed, and I used the GRNN to estimate those 30 missing values. From a plot of MSE versus the smoothing parameter, σ = 0.25 was chosen. As expected, estimating y outside the training range yields a value close to the last training point.
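A minimal sketch of this experiment is below, assuming the same 63-point grid and a spread of 0.25; the random hold-out indices are chosen here purely for illustration.

% Curve-fitting sketch: fit y = sin(x)/x with a GRNN after removing 30 points.
x = linspace(0, 2*pi, 63);
y = sin(x) ./ x;
y(1) = 1;                              % limit of sin(x)/x as x -> 0 (avoids NaN at x = 0)

holdOut  = randperm(63, 30);           % 30 randomly removed points
trainIdx = setdiff(1:63, holdOut);

net  = newgrnn(x(trainIdx), y(trainIdx), 0.25);   % spread = 0.25
yhat = sim(net, x(holdOut));                      % estimate the missing values
mse  = mean((yhat - y(holdOut)).^2);
fprintf('MSE on the 30 held-out points: %.4f\n', mse);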
 

Experiment 3: Predicting the Outcome from High-Dimensional Inputs

A MATLAB dataset (house_dataset.mat) was used to train a neural network to estimate the median house price in a neighborhood from neighborhood statistics. The dataset contains two variables: a 13x506 matrix defining thirteen characteristics of 506 different neighborhoods, and a 1x506 vector of median house values (in thousands of dollars) for each neighborhood. Only 100 randomly chosen neighborhoods were used to train the GRNN, and a separate set of 10 randomly chosen neighborhoods was used to evaluate the algorithm. The smoothing parameter was determined to be approximately 0.9.
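A sketch of this experiment is below, using the variable names houseInputs and houseTargets that MATLAB's house_dataset provides; the 100/10 split and the spread of 0.9 follow the description above, while the particular random split is illustrative.

% Housing sketch: GRNN on 13-dimensional neighborhood statistics.
load house_dataset                     % houseInputs (13x506), houseTargets (1x506)

idx      = randperm(506);
trainIdx = idx(1:100);                 % 100 neighborhoods for training
testIdx  = idx(101:110);               % 10 separate neighborhoods for evaluation

net  = newgrnn(houseInputs(:, trainIdx), houseTargets(trainIdx), 0.9);
pred = sim(net, houseInputs(:, testIdx));
mse  = mean((pred - houseTargets(testIdx)).^2);
fprintf('MSE on the 10 test neighborhoods: %.2f\n', mse);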
 
