MATH 2270
# MAPLE PROJECT 4
# November, 1998
#
# This project is available at
# http://www.math.utah.edu/~korevaar/2270proj4.txt   Save it to your home
# directory and then open it from Maple as a Maple text document.
#
# 1)  In this first part of our project we will use the method of least
# squares to find a power law which (approximately) relates human
# weights to heights.  Within the last year you might have noticed a
# series of newspaper and magazine articles about the so-called body
# mass index, and its use in determining risk factors for overweight
# people.  According to the most recent article which I read on this
# subject, a person's body mass index is computed by dividing their
# weight by the square of their height, and then multiplying by a
# universal constant.  Thus, the proponents of this index seem to be
# assuming, or claiming, that for adults at equal risk levels, weight
# should be proportional to the square of height.  As we discussed in
# class, if people were to scale equally in all directions when they
# grew, weight would scale as the cube of height.  That particular
# power seems a little high, since we don't look like uniformly
# expanded versions of babies; we seem to get relatively stretched out
# when we grow taller.  One would expect the best predictive power to be
# somewhere between 2 and 3.  If the power is much larger than 2, then
# one could argue that the body mass index might need to be modified to
# reflect this fact.  Of course, our sample heights and weights may or
# may not be representative of a healthy population, but let's proceed
# anyway.  (When we did this project last year our power came out
# around 2.5.)
#
# Here's how to use a least squares method to find a power law:
# Suppose we have some data points {[xi,yi]}, and we expect a power law
# yi = C*(xi)^p to relate the yi's to the xi's.  Here p is the power,
# and let's call C the proportionality constant.  Then we expect the
# logarithms to satisfy ln(yi) = ln(C) + p*ln(xi).  In other words, the
# ln-ln data should satisfy the equation of a line having slope equal to
# p and vertical axis intercept equal to ln(C).  So we take the least
# squares fit for the ln-ln data, and then use the slope and intercept
# to deduce C and p in the power law: take p equal to the slope, and C
# equal to the exponential of the intercept.
#
# Here's a worked example.  By the way, it was constructed so
# that each yi is close to 2*(xi)^3.  So when we go through the process
# above we should come out with something close to a cubic power law,
# with proportionality constant close to 2.
> with(linalg):with(plots):
> S:=[[1,2.2],[2,15.8],[3,55],[5,255],[6,430]];
> #five data points

      S := [[1, 2.2], [2, 15.8], [3, 55], [5, 255], [6, 430]]
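# Before fitting, here's a quick sanity check (an added sketch, not
# part of the original handout): since the data was constructed so that
# each yi is close to 2*(xi)^3, the ratios yi/(2*(xi)^3) should all
# come out near 1.
> map(pt->evalf(pt[2]/(2*pt[1]^3)),S); #each ratio should be near 1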
> A1:=convert(S,matrix); #make the points the rows
> #in a matrix.  We will manipulate this matrix

            [1    2.2 ]
            [2    15.8]
      A1 := [3    55  ]
            [5    255 ]
            [6    430 ]

> A2:=map(ln,A1); #take ln of each matrix
> #entry

            [0        .7884573604]
            [ln(2)    2.760009940]
      A2 := [ln(3)    ln(55)     ]
            [ln(5)    ln(255)    ]
            [ln(6)    ln(430)    ]

> A3:=map(evalf,A2); #get each decimal value

            [0              .7884573604]
            [.6931471806    2.760009940]
      A3 := [1.098612289    4.007333185]
            [1.609437912    5.541263545]
            [1.791759469    6.063785209]

> rowdim(A3); #this command tells me the number of rows in
> #a matrix.  Of course, for this one, I knew it was five

      5

> col2:=vector(rowdim(A3),1); #this will be the second
> #column for our least-squares line fit, see e.g.
> #page 454 of the text.  The command makes a vector
> #with number of entries equal to the first argument
> #and makes each entry equal to the second argument

      col2 := [1, 1, 1, 1, 1]

> A4:=delcols(A3,2..2); #remove the last column of A3;
> #what remains is the first column, which is also the
> #first column of our A matrix

            [0          ]
            [.6931471806]
      A4 := [1.098612289]
            [1.609437912]
            [1.791759469]

> A:=augment(A4,col2); #This is our matrix for least squares

           [0              1]
           [.6931471806    1]
      A := [1.098612289    1]
           [1.609437912    1]
           [1.791759469    1]

> b:=delcols(A3,1..1); #This is the right-hand side for least squares

           [.7884573604]
           [2.760009940]
      b := [4.007333185]
           [5.541263545]
           [6.063785209]

# The next three commands find the least-squares solution, as in section
# 8.4.
#
> ATA:=evalm(transpose(A)&*A);

      ATA := [7.488094364    5.192956851]
             [5.192956851    5          ]

> ATb:=evalm(transpose(A)&*b);

      ATb := [26.09876279]
             [19.16084924]

> linsolve(ATA,ATb);

      [2.959072395]
      [.7589027952]

# which is the same as
> evalm(inverse(ATA)&*ATb);

      [2.959072395]
      [.75890280  ]

# Actually, least squares for linear systems is a standard tool, so
# Maple has the command built right in:
> b1:=col(b,1); #I had to do this because Maple thought
> #b was a 5 by 1 matrix, not a vector.  This command
> #makes b1 a vector, so I can use it in the command
> #below.  Don't ask me to explain, I can't!

      b1 := [.7884573604, 2.760009940, 4.007333185, 5.541263545, 6.063785209]

> leastsqrs(A,b1); #returns the least-squares solution to Ax=b1.

      [2.959072398, .7589027908]

# The first entry above is the least-squares line's slope, the second
# entry its intercept.  See visually how the line fit went:
> lnlnplot:=pointplot({seq([A3[i,1],A3[i,2]],i=1..5)}):
> #those are the points from the ln-ln data in
> #the A3 matrix
> line:=plot(2.959072398*x + .7589027908, x=0..2):
> #I used the mouse to paste in the coefficients
> display({lnlnplot,line});

# Pretty good fit.  Now work backwards to get the power law.
> C:=exp(.7589027908); #proportionality constant
> p:=2.959072398; #power.  I used the mouse to paste these

      C := 2.135931371

      p := 2.959072398

# Notice, the power came out close to 3 and the proportionality constant
# came out close to 2, as expected from the original choice of points.
# Now see visually how we did:
> realplot:=pointplot(S):
> powerplot:=plot(C*x^p,x=0..6):
> display({realplot,powerplot});

# 1a)  Your job: carry out the same sort of analysis for the
# height-weight data which we have accumulated.  (See the set S just
# below.)
# 1b)  What does your power law predict for the weight of typical 5
# foot, 5.5 foot, 6 foot, and 6.5 foot people?  (See the sketch after
# the data list.)
#
# Here's the data.  In the following list, the first number is height
# (inches), the second number is weight (pounds).
> S:=[[19,8.375],[21,9.438],[57,68],
> [69,130],[70,135],[62,115],
> [54,90],[65,135],[15,2.238],
> [50,61],[67,123],[45,64],
> [72,180],[20.5,8.81],[62,102],
> [20.5,8.44],[58,95],[21,10.13],
> [50,52],[20,7.75],[71,145],
> [69,215],[57,85],[47,54],
> [54,61.3],[70,88],[66,120.3],
> [65,101],[49,48.8],[61,98],
> [71,133],[71,128],[69,180],
> [53,61.5],[75,225],[65,111.5],
> [72,125],[71,160],[77,215],
> [77,170],[76,150],[73,195],
> [69,120],[69,122],[52,69],
> [42,40],[34,25],[70,135],
> [66,140],[69,170]];
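#
# Here's a sketch for answering 1b (an addition to the original
# handout; the name `predict' is mine).  It assumes C and p have been
# reassigned from YOUR height-weight fit; the values left over from the
# worked example above would give nonsense here.
> predict:=h->C*h^p: #height h in inches, result in pounds
> map(h->evalf(predict(h)),[60,66,72,78]); #5, 5.5, 6, 6.5 feet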
#
#
# 2)  We will explore inner product function spaces.  This part of the
# project is like examples 4-7 and 9-10 in Appendix B.1, except we will
# use the x-interval [-1,1] instead of [0,1].  It is more standard to
# do so.
> restart;
# Here's the inner product we will use.  We'll call it ``dot'' in honor
# of the dot product of R^n, to which this inner product is very similar
# indeed.  With the dot product we can therefore define the magnitude of
# vectors, as well as distance and angle between vectors.
> dot:=(f,g)->int(f(x)*g(x),x=-1..1); #dot takes a pair of
> #functions and returns a real number by computing
> #a particular integral

      dot := (f, g) -> int(f(x) g(x), x = -1 .. 1)

> mag:=f->sqrt(dot(f,f)); #computes the magnitude of
> #a vector

      mag := f -> sqrt(dot(f, f))

> dist:=(f,g)->mag(f-g); #computes the ``distance'' between
> #two vectors.

      dist := (f, g) -> mag(f - g)

> angle:=(f,g)->arccos(dot(f,g)/(mag(f)*mag(g)));
> #computes the ``angle'' between two vectors

      angle := (f, g) -> arccos(dot(f, g)/(mag(f) mag(g)))

# Let's experiment by computing some of these quantities for some
# polynomials.  Here's our usual basis for the space of polynomials of
# degree at most three.  The restriction of these polynomials to the
# interval [-1,1] gives a subspace of our inner product space.
> P0:=x->1; P1:=x->x; P2:=x->x^2; P3:=x->x^3;

      P0 := 1

      P1 := x -> x

      P2 := x -> x^2

      P3 := x -> x^3

# 2a)  Compute the magnitudes of P0, P1, P2, P3.
# 2b)  Which pairs of these four polynomials are perpendicular to each
# other?  Is it true that if f is an even function and g is an odd
# function, then f and g are perpendicular in this space?  Why?
# 2c)  Show that f(x)=1 and g(x)=x^2 - 1/3 are orthogonal.  Hint:
# compute the inner product of P0 with P2 - (1/3)*P0.
# 2d)  Find the distance between f(x)=1+2*x+x^3 and g(x)=x+3*x^2.
#
# Let's denote the subspace spanned by P0, P1, P2 by W2.
# Here's how to use Gram-Schmidt to find an orthonormal basis.  It
# should look familiar!  We will first find an orthogonal basis, then we
# will normalize it.  We'll use Q0, Q1, Q2 for the orthogonal basis, and
# u0, u1, u2 for the normalized one.
> Q0:=P0; #first orthogonal basis element
> proj0:=f->(dot(f,Q0)/mag(Q0)^2)*Q0;
> #projection onto the span of Q0
> Q1:=P1 - proj0(P1); #second orthogonal basis element
> proj1:=f->proj0(f) + (dot(f,Q1)/mag(Q1)^2)*Q1;
> #projection onto the span of Q0 and Q1
> Q1(x); #check our work

      x

> Q2:=P2-proj1(P2); #third orthogonal basis element
> proj2:=f->proj1(f) + (dot(f,Q2)/mag(Q2)^2)*Q2;
> Q2(x); #check our work

      x^2 - 1/3

> dot(Q0,Q1); dot(Q0,Q2); dot(Q1,Q2); #these should all
> #be zero!
> u0:=x->Q0(x)/mag(Q0); u1:=x->Q1(x)/mag(Q1); u2:=x->Q2(x)/mag(Q2);
> #normalize our orthogonal basis
> u0(x); u1(x); u2(x); #unit vectors

      1/2 sqrt(2)

      1/2 x sqrt(6)

      3/4 (x^2 - 1/3) sqrt(10)

# Now let's use proj2 to find the projection of the exponential function
# exp onto W2.  It should be the nearest polynomial of degree 2 to the
# exp function.  In particular it should be closer than the degree two
# Taylor approximation 1 + x + x^2/2:
> proj2(exp)(x);
> evalf(%);

      .9962940199 + 1.103638324 x + .5367215194 x^2

> dist(proj2(exp),exp);
> evalf(%);

      .03795490746

> dist(P0+P1+P2/2,exp);
> evalf(%);

      .09479756619

# So both approximations were close, but the projection was about 2.5
# times closer.
# Let's get a feel for these distances geometrically by plotting the
# three functions under consideration:
> with(plots):
> actual:=plot(exp(x),x=-1..1,color=`black`):
> near:=plot(1+x+x^2/2,x=-1..1,color=`blue`):
> nearest:=plot(.9962940199+1.103638324*x+.5367215194*x^2,x=-1..1,
> color=`red`):
> #I used the mouse to paste this in from up above
> display({actual,near,nearest});
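# As an extra check (this snippet is an addition to the original
# handout), the normalized basis really should be orthonormal: each
# self inner product below should equal 1, and each mixed one 0.
> dot(u0,u0); dot(u1,u1); dot(u2,u2); #each should be 1
> dot(u0,u1); dot(u0,u2); dot(u1,u2); #each should be 0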
# 2e)  Continue as above to find an orthogonal basis for W3, i.e. the
# span of {P0,P1,P2,P3}.
# 2f)  Find the projection map onto W3, and use it to compare the
# projection of exp with the degree three Taylor approximation
# P(x) = 1 + x + x^2/2 + x^3/6, in terms of how close each is to the
# exponential function.
#
# The orthogonal polynomials which you have been constructing are called
# the Legendre polynomials.  Maple knows about them.  To see the first
# few you can load the `orthopoly' package.  By the way, if you define
# different weighted inner products you get different (famous to
# experts) orthogonal polynomial families; you can read about some of
# them in the help windows, starting at `orthopoly'.  Orthogonal
# polynomials are used in approximation problems, as you might expect.
#
> with(orthopoly):
> P(0,x); P(1,x); P(2,x); P(3,x); P(4,x); P(5,x);

# That's the end of #2
#
#
> restart; with(linalg): with(plots):
> #we'll use linear algebra and plots below
# 3)  It turns out that on the interval [-Pi,Pi] the functions
# {1, cos(x), cos(2x), ..., sin(x), sin(2x), sin(3x), ...} are mutually
# orthogonal with respect to the inner product
> Dot:=(f,g)->(1/Pi)*int(f(x)*g(x),x=-Pi..Pi);

      Dot := (f, g) -> int(f(x) g(x), x = -Pi .. Pi)/Pi

# Because we multiplied the integral by 1/Pi, the functions
# {1/sqrt(2), cos(x), cos(2x), ..., sin(x), sin(2x), ...} are actually
# also of unit length, so we have an orthonormal family of functions.
# Using Dot (we capitalized it to make it different from the inner
# product in #2), we again define magnitude, distance, angle, and
# projections.  If Wn is the span of
# {1, cos(x), ..., cos(nx), sin(x), sin(2x), ..., sin(nx)}, then the
# projection of a function f onto Wn is called the nth order Fourier
# series for f.  It is very easy to compute because we have an
# orthonormal basis.  The limit as n approaches infinity is called the
# Fourier series of f.  Fourier series are used widely in science and
# engineering.  You will encounter them seriously in the latter part of
# Math 2280.  Let's go through an example to see them in action:
> Mag:=f->sqrt(Dot(f,f)); #computes the magnitude of
> #a vector
> Dist:=(f,g)->Mag(f-g); #computes the ``distance'' between
> #two vectors.
> angle:=(f,g)->arccos(Dot(f,g)/(Mag(f)*Mag(g)));
> #computes the ``angle'' between two vectors
# 3a)  Verify that sin(3*x) and sin(4*x) are orthogonal, and that each
# is a unit vector for our dot product.
# 3b)  Verify that the constant function f(x)=1/sqrt(2) is a unit
# vector.
#
# When we project a function f onto Wn, the coefficient of each
# orthonormal basis vector is just the inner product of that basis
# vector with f.  Let's illustrate by projecting the absolute value
# function onto W10:
> f:=x->abs(x); #we'll do Fourier for the absolute value function
> n:=10; #order 10 expansion
> a:=vector(n); #sin coefficients
> b:=vector(n); #cos coefficients
> c:=Dot(f,1/sqrt(2)); #order zero Fourier coefficient

      f := abs

      n := 10

      a := array(1 .. 10, [])

      b := array(1 .. 10, [])

      c := 1/2 Pi sqrt(2)

> for i from 1 to n do #compute the projection coefficients
> a[i]:=Dot(f,x->sin(i*x));
> b[i]:=Dot(f,x->cos(i*x));
> od:
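# Here's the same computation wrapped as a procedure, so that problem
# 3d only needs a different f.  (This is an added sketch, not part of
# the original handout; the name fourierApprox is mine.)  It returns
# the order n Fourier expansion as an expression in x.
> fourierApprox:=proc(f,n)
> local k,s;
> s:=Dot(f,1/sqrt(2))/sqrt(2); #constant term of the projection
> for k from 1 to n do #add each sine and cosine term
> s:=s+Dot(f,unapply(sin(k*x),x))*sin(k*x)
> +Dot(f,unapply(cos(k*x),x))*cos(k*x);
> od;
> RETURN(s);
> end:
> #for example, fourierApprox(x->x^3,10) would give the expansion
> #needed in 3d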
>
> evalm(a); #why will an even function have Fourier sine
> #coefficients all equal to zero?

      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

> evalm(b); #Could you predict higher order coefficients
> #from the pattern you see here?

      [-4/Pi, 0, -4/(9 Pi), 0, -4/(25 Pi), 0, -4/(49 Pi), 0, -4/(81 Pi), 0]

> approx:=x->sum(a[k]*sin(k*x),k=1..n)
> +sum(b[m]*cos(m*x),m=1..n)
> +c/sqrt(2);

      approx := x -> sum(a[k] sin(k x), k = 1 .. n)
                     + sum(b[m] cos(m x), m = 1 .. n) + c/sqrt(2)

> approx(x); #order 10 Fourier expansion for abs(x):

      -4 cos(x)/Pi - 4/9 cos(3 x)/Pi - 4/25 cos(5 x)/Pi
         - 4/49 cos(7 x)/Pi - 4/81 cos(9 x)/Pi + 1/2 Pi

> almost:=plot(approx(x),x=-Pi..Pi,color=`red`):
> exact:=plot(abs(x),x=-Pi..Pi,color=`black`):
> display({almost,exact});

# 3c)  Plot the Fourier series approximation above on the larger
# interval from -3*Pi to 3*Pi.  Can you explain what is happening?
# 3d)  Find the order 10 Fourier expansion for f(x)=x^3, and draw the
# plot analogous to the one above.  The approximation should look really
# good except near the interval endpoints.  Can you explain why
# something strange has to happen at Pi and -Pi?  Hint: plot the Fourier
# series expansion on the interval from -3*Pi to 3*Pi.
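#
# Finally, here's one way to see the projection getting closer as the
# order grows.  (An added sketch, not part of the assignment; the name
# g2 is mine.)  The order 2 expansion of abs(x) keeps only the first
# two cosine terms, and its distance from f should come out larger than
# the distance for the order 10 expansion above.
> g2:=x->b[1]*cos(x)+b[2]*cos(2*x)+c/sqrt(2): #order 2 expansion
> evalf(Dist(g2,f)); evalf(Dist(approx,f));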