Lab assignment L2
Non-Linear function estimation
Objective
To practice the use and implementation of overdetermined equation systems/model fitting (Heath Ch3, Ch7) through a practical application.
Procedure
3xx students do 4 of 5 exercises
5xx students do exercises all excercises
Exercise 1: Fit poly and exp to experimental data
Cooling coffee:
The temperature T(ti) of a cup of coffee is measured over 60 minutes in a room with a constant ambient temperature, Tambient . The equation underlying the process is Newton’s law of cooling:
T(t) =Tambient + (T0 - Tambient) * e-kt. The measurement readings contain some error.
The ambient temperature during the measurement was 22 degrees.
- Fit a polynomial to the data. What degree is suitable? Should you use an interpolation or approximation?
- What basis function is better suited for this type of data?
How must the data be transformed to solve for the cooling constant (k) using linear least-squares? Plot your fitted equation.
t = [0.00 2.50 5.00 7.50 10.00 12.50 15.00 17.50 20.00 22.50 25.00 27.50 30.00 32.50 35.00 37.50 40.00 42.50 45.00 47.50 50.00 52.50 55.00 57.50 60.00]'
T =[94.46 87.29 79.83 72.51 65.28 60.55 55.19 53.35 49.03 46.45 42.32 42.30 40.11 36.25 33.57 32.51 31.89 30.66 29.03 29.10 27.58 26.01 26.38 26.05 26.63]'
Exercise 2: Polynomial interpolation/approximation of periodic functions
First consider how to fit a polynomial to sin(x). (Note the similarity
to the coefficient functions below). If you haven't done the readings, and
tried the on-line polynomial calculator from the lectures page, consider doing
this first. Then generate an example data vector for the sin function:
Consider m = 0.5, 1, 2, 3. Then
- Write a function Y_new=polynomial_interp(X,Y,X_new,n) which fits a
polynomial of degree n to Y=f(X), and returns the values of that polynomial
for each element in the vector X_new. Use a polynomial basis of your choice (Monomial, Newton etc), set up the equation system and solve using matlab A\b. Note: if length(X) > n+1 then
fit an approximate polynomial to the points in the least squares
sense. (You can use the same A\b notation to solve the equation system
for both cases) Now the polynomial doesn't necessarily go
through the data points (does not interpolate in the strict sense),
but is the best approximation in the least squares
sense to the given data points.
Hint
1) form a monomial / vandermonde matrix from a set of sample points. You're allowed to use the built-in function here, although it's easy to write it yourself: A = [ones(..), t, t.^2 .. ]; etc where t is a column vector
2) Solve the Ax = b for the given data points both nxn and mxn over-determined.
- Plot the polynomial for X_new = (0:0.1:2^(1/n)*m*pi)'
- What degrees polynomial "fits" reasonably each of the
cases m=0.5, 1, 2, 3 above?
Try some large degree and small degree (and vary the number of observations, i.e., m), compute the error between the interpolated y value and the ground-truth on X, also compute the error between the interpolated y and the ground-truth on X_new.
- How does the polynomial compare with the true sin function?
- Inbetween sample points?
- Outside the samples interval? (note the "2^(1/n)" in X_new =(0:0.1:2^(1/n)*m*pi)' )
Plot the polynomial and sin in the same plot using e.g. solid (poly) and
dotted (sin) lines. See "help plot" for how to structure the plot argument.
Exercise 3: Atmospheric carbon
Link: “datasetIII.mat”
The monthly carbon dioxide concentration in the atmosphere in parts per million (ppm) from 2010 to 2022. Notice that, in addition to a global trend, there is a clear periodicity caused by seasonal plant growth cycles. What model can you propose that captures both the upwards trend as well as the periodicity? How would the choice of model affect predictions made for future years? The dataset for this problem is contained in the "datasetIII.mat" file.
Link: “datasetIII.mat”
This data is published by the USA's National Oceanic and Atmospheric Administration (NOAA) and you can find the full dataset here starting from 1979. The values reported are the monthly mean carbon dioxide levels averaged over marine surface sites located in Barrow (Alaska), Mauna Loa (Hawaii), American Samoa, and South Pole (Antarctica). These are used to estimate the global average CO2 levels.
Note about the units from the source linked above: "Data are reported as a dry air mole fraction defined as the number of molecules of carbon dioxide divided by the number of all molecules in air, including CO2 itself, after water vapor has been removed. The mole fraction is expressed as parts per million (ppm). Example: 0.000400 is expressed as 400 ppm."
Exercise 4: Spline of an airplane wing
An engineer provides 4 precise coordinates that define the top surface of an airplane aerofoil (in profile view). The manufacturing process needs a function that passes through these points exactly, with smoothness and continuity between the points. See Heath example 7.6.
- Explain how each continuity condition contributes to the physical design requirements.
- Consider endpoint conditions. Why would you use them? Explain from a physical and linear algebra perspective.
- Set the slope of the two end points to values you decide. Explain your design choice.
- There are 4 data points. How many unknown parameters do you need to solve for? How many equations are required? Hint: sum up the equations arising from:
- Interpolation defines the start and end values of each segment
- Slope continuity
- Curvature continuity
- Boundary conditions
- Use cubic spline interpolation to generate the wing profile. Implement this in matlab without using spline(). (Not necessary to generalize to N points, but you can if you wish to)
X = [0.00 2.50 7.00 10.00]'
Y = [0.00 1.70 0.50 -0.50]'
Hint: the input should be X, Y (both vectors of size 4), the endpoint derivatives. The output should be a 3x4 matrix where each row is the coefficients of the fitted polynomial for the three segments.
(Optional)
Two more points are defined at the bottom of the wing, at (2.5, -0.5) and (7, -0.5). The spline now forms a closed contour. Let matrix A1 represent the top 3 segments, A2 represent the bottom three segments. Build a new linear system that has the additional constraint of slope continuity on the leading (left) edge. Let the right tailing edge of the wing end at the natural boundary condition of zero slope, from both the top and bottom curve.
Hint:
You can build the bottom piece just like you did the top piece, then combine the matrices:
Atot =
[Atop, zeroes(..)
zeroes(..), Abottom]
Then adjust Atot for the spline constraint joining the top and bottom at the front of the wing.
If you used the build in "spline" routine, how would you have to structure the data input?
Exercise 5: Fit a contour along a brain tumor
In medical imaging it is common that a person (the radiologist) manually draws a contour around an organ or disease region. This is done with an interface similar to that of an image editor. The person clicks on points on the boundary, and the computer interpolates the boundary segment between the clicks.
The images are located in ~c340/web_docs/labs/labAssign2/brain_tumor.There are
three images:tumor.jpg(real brain MRI image), tumorContour.jpg (tumor actual border, which can be used to measure the error of the polynomial) and polynomialContour.jpg (Polynomial contour of the
tumor. The red dots are given by radiologist and the blue border is fitted to the dots using polynomial interpolation. Note that it serves as an example of what your output would look like).
To display the image
I = imread('tumor.jpg');
imshow(I);
You can collect points on the contour using octave/matlab command ginput. You may have to try several different point placements to get an accurate curve, ie that the fitted polynomial follows approximately the tumor border and both numerical residual and visual inspection will provide satisfactory results. In fact, some medical devices use the same method where the radiologist indicates points around an object and the device fits a curve to those points.
(Alternatively, you can collect the points by subsampling of border pixels. You would have to figure out how to sort them - ginput is easier)
You can find the set of points on tumor contour using the following matlab code:
I = imread('tumorContour.jpg');
[x,y]= find(I > 250);
This is useful for computing the error. Think about how error might be defined here? There are many ways using both distances and pixels. For this exercise how can you define a Euclidean distance error from a pixel on the polynomial or spline contour to the ground truth radiologist contour?
Fit a piecewise cubic spline curve (Heath Ch7.4.2) along the contour of the brain tumor on the images of exercise. (You can use Octave/Matlab's built in spline or implement the routine yourself).
2.Find least square error also between your fitted contour and radiologist drawn contour.
Hints
You might go about it as follows:
Figure out how to use spline in a regular 1D function examples as in Exc 1
Figure out curve length parameterisation s: For each pair of [x_i,y_i] coordinates from ginput you want to compute the cumulative distance to the first point [x_0, y_0]. You can now think of these as a triplet [x_i,y_i, s_i]. Check that the distance increases monotonically with the points.
Use two separate splines, one to interpolate the x-coordinates, one for the y coordinates. Resamples these using a denser s_new, ie one per pixel.
plot(x,y) on top of image.
(Optional)
Fit a polynomial of order n along the contour of the brain tumor located in a brain MRI.
Start to fit a piecewise (imagine the contour is divided into a number of pieces) polynomial of order n along the contour so that it more or less approximates (in least square sense) the contour drawn by radiologist . The value of n may or may not be same for all pieces.Find least square error between your fitted contour and radiologist drawn contour.