Some of the spline curves don't look like the best fit; I need to do more reading about splines to check that I'm using the right kind of fit (there are quite a few types of spline fitting, like bivariate splines etc., that I still need to explore).
I can also get the coefficients of the spline fit, but I'm not sure what they represent. Can ask Stubbs / Eske more about spline fitting.

30 Aug 2022 - 05 Sep 2022 -

Spline Fitting

Working in Continuum_Reduction_Ind.ipynb
Met with Chris, and he suggested that I take a few points (around 10) for each graph to get better spline fits that pass through the data points.
I am trying to automate the process of choosing points for the spline fit.
Looking into scipy.interpolate.UnivariateSpline, especially the w parameter, which lets me weight the points differently via a separate array (so points with higher weights are fitted more closely). Trying to devise a weighting function such that points in dense clusters are weighted less than isolated points (like inflection points).
- Since the data is already organized into buckets (grouped by a range of 5 to reduce variability), I thought that if more data points fall in a bucket, then the resulting averaged data point should have less weight.
- I tried weight = 1 / bucket_length, which gives higher weight to smaller buckets, but the spline didn't look good:
- Tried weight = 1 / pow(x, bucket_length) for different values of x with 1 <= x <= 2, and the higher values did not look very good either (e.g. x = 1.5 (left) and x = 2 (right))
- [Two images: x = 1.5 (left), x = 2 (right)]
- Maybe a better math function would work
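A minimal sketch of the bucket-weighting idea above, run on synthetic data rather than the notebook's data. The helper names bucket_average and bucket_weights are hypothetical, and "bucket length" is assumed to mean the number of raw points averaged into a bucket. Rescaling the weights to mean 1 is my own assumption, not something tried above.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def bucket_average(x, y, width=5.0):
    """Average points into x-buckets of the given width.

    Returns bucket mean x, bucket mean y, and the number of points per bucket.
    Empty buckets are skipped.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.floor((x - x.min()) / width).astype(int)
    labels = np.unique(idx)
    xb = np.array([x[idx == i].mean() for i in labels])
    yb = np.array([y[idx == i].mean() for i in labels])
    counts = np.array([(idx == i).sum() for i in labels])
    return xb, yb, counts


def bucket_weights(counts, base=None):
    """weight = 1/count, or 1/base**count if base is given, rescaled to mean 1."""
    counts = np.asarray(counts, dtype=float)
    raw = 1.0 / counts if base is None else 1.0 / np.power(float(base), counts)
    return raw / raw.mean()


# Synthetic stand-in for the real data (assumption: unevenly sampled noisy curve).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 100.0, 200))
y = np.sin(x / 10.0) + rng.normal(scale=0.1, size=x.size)

xb, yb, counts = bucket_average(x, y, width=5.0)
w = bucket_weights(counts)             # 1 / bucket_length
w_pow = bucket_weights(counts, 1.5)    # 1 / pow(1.5, bucket_length)

spl = UnivariateSpline(xb, yb, w=w, k=3)
spl_pow = UnivariateSpline(xb, yb, w=w_pow, k=3)
```

One thing worth checking: UnivariateSpline adds knots until sum((w[i] * (y[i] - spl(x[i])))**2) <= s, and the default s is len(w). If every weight is much smaller than 1 (as with raw 1/bucket_length), that condition is met almost immediately and the fit comes out very smooth. This may be part of why the weighted fits looked poor, though I have not confirmed that on the real data.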
To do next time: choose points by hand and see how the fit goes; once the fit is good, try to design an algorithm around the points already in buckets.

I reprocessed the data so that the spacing between neighboring data points is more consistent, and then ran the spline through the data points.
Image Added
I went back to using the univariate spline. Played around a bit with the parameters of UnivariateSpline, especially the smoothing factor (s) and k, the degree of the smoothing spline.
- Other settings (like changing the number of knots) didn't have as large an effect as the smoothing factor, for some reason.
For the graph above, I only got two spline knot locations.
Found this useful page that explains splines and knots a bit more clearly:
https://stats.stackexchange.com/questions/517375/splines-relationship-of-knots-degree-and-degrees-of-freedom

When s = 0, the spline passes through every data point, which gives a nice fit, but it looks pretty choppy.
Image Added
Taking the automatic (default) spline smoothing factor instead gives the first graph (where the spline is not a strong fit).
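To compare the two settings side by side, here is a small sketch on synthetic data (not the notebook's data). It fits the same points with s = 0 and with the default s, then reports how many knots each fit ended up with and the weighted sum of squared residuals.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic noisy curve standing in for one of the real data sets (assumption).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 40)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

# s=0 forces the spline through every point (interpolation).
interp = UnivariateSpline(x, y, k=3, s=0)

# s=None (the default) lets FITPACK pick knots until the smoothing
# condition sum((w * (y - spl(x)))**2) <= s is met, with s = len(x) here.
default = UnivariateSpline(x, y, k=3)

for name, spl in [("s=0", interp), ("default s", default)]:
    print(name, "knots:", len(spl.get_knots()), "residual:", spl.get_residual())
```

With unit weights the default s equals the number of points, which is a loose target when the scatter in y is much smaller than 1. In that case the fit can stop with only the two boundary knots, which would match the two knot locations seen earlier. Trying intermediate values of s between 0 and the default is one way to trade smoothness against closeness.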

Image Added
Setting k = 5 gave a much better fit, which is expected because a polynomial of high enough degree can usually follow the data points more closely.
Printed the coefficients of the spline function (not 100% sure how to read them) and also the locations of the knots.
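On reading the coefficients: as far as I understand, get_coeffs() returns B-spline coefficients (one per basis function), not the coefficients of a polynomial in x, and get_knots() returns the interior knots plus the two boundary points. The sketch below uses synthetic data to check that reading by rebuilding the same curve with scipy.interpolate.BSpline; the variable names are only for illustration.

```python
import numpy as np
from scipy.interpolate import BSpline, UnivariateSpline

# Synthetic data standing in for the real measurements (assumption).
x = np.linspace(0.0, 10.0, 30)
y = np.cos(x)
k = 3

spl = UnivariateSpline(x, y, k=k, s=0.01)
knots = spl.get_knots()     # interior knots plus both boundary points
coeffs = spl.get_coeffs()   # B-spline coefficients

# The full knot vector repeats each boundary knot k extra times.
t = np.concatenate(([knots[0]] * k, knots, [knots[-1]] * k))

# Rebuilding the curve from (t, coeffs, k) should reproduce the fitted spline.
rebuilt = BSpline(t, coeffs, k)

print("number of knots:", len(knots))
print("number of coefficients:", len(coeffs))
print("max difference:", np.max(np.abs(rebuilt(x) - spl(x))))
```

The number of coefficients is len(knots) + k - 1, so a fit with only the two boundary knots and k = 3 has four coefficients, i.e. a single cubic piece.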

Applied the function to other data sets; the results are a bit too choppy.
Image Added
I set the smoothing factor to 10 and then kept k at 10. I spaced out the points by keeping only points that are at least 1/3 of the greatest interval width in the data set away from the previously kept point; I later changed this threshold to 1/2.
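One possible reading of the spacing rule above, as a sketch: walk through the points in x order and keep a point only if it lies at least fraction * (largest gap between neighbors) beyond the last kept point. The function name thin_points and this exact rule are assumptions for illustration, not the notebook's code.

```python
import numpy as np


def thin_points(x, y, fraction=0.5):
    """Keep points spaced at least `fraction` of the largest neighbor gap apart."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]

    # Minimum spacing is a fraction of the widest interval in the data set.
    min_gap = fraction * np.diff(x).max()

    keep = [0]  # always keep the first point
    for i in range(1, len(x)):
        if x[i] - x[keep[-1]] >= min_gap:
            keep.append(i)
    keep = np.array(keep)
    return x[keep], y[keep]


# Small hand-made example: two clusters and one isolated point.
x_demo = np.array([0.0, 0.1, 0.2, 1.0, 1.1, 3.0])
y_demo = np.array([5.0, 5.1, 5.2, 6.0, 6.1, 8.0])
x_thin, y_thin = thin_points(x_demo, y_demo, fraction=0.5)
```

A larger fraction keeps fewer points, so moving from 1/3 to 1/2 thins the clusters more aggressively before the spline is fitted.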