# Math 2250
# Maple Project 1
# January 1999
# 
#      This project consists of two parts.  In Part A you will study the
# logisitic equation from section 2.1 of the text and test it against
# actual populations, following the investigations of the section 2.1
# computing projects.  In part B of your Maple project you will study
# numerical methods for the solution of first order differential
# equations, using ideas from sections 2.4-2.6.  You will need the
# Edwards-Penney text book as well as the Computing Projects companion
# to do this project. 
#      You are expected to read and work through the project.  The only
# work which you have to turn in consists of those questions which are
# clearly labeled as PROBLEMS.  There are many other times when you are
# asked to verify certain computations or facts, but these verifications
# are for your own benefit in understanding the material.  Although they
# are not to be handed in you skip them at your own peril.
#      You should get an on-line copy of this project from
# http://www.math.utah.edu/~korevaar/2250proj1s.txt   In the tutorial it
# was explained how to do this:  Save a copy of the file using your web
# browser, and then open it from Maple as a ``Maple text'' document. 
# You can then work through the project as indicated.  You can insert
# the answers to the problems using the menu options to insert text and
# Maple commands.  If you italicize your text answers they will be easy
# for the grader to see.  After you've completed the project in this
# manner you might want to save a copy of it under another name (use the
# ``save as'' option in the file menu.  Then, for the copy you will hand
# in, you can delete the parts of the project which are irrelevant to
# the answers to the PROBLEMS.  This will save a lot of paper. 
# 
# 
# PART A
# 
#     Recall the logistic equation for population change which you have
# just been studying:

                           dP
                          ---- = kP (M - P)
                           dt

# or

                            dP           2
                           ---- = aP + bP
                            dt

# with a=kM and b=-k.  In our model the parameters k,M,a,b are related
# to assumptions about birth and death rates.  Suppose you have a real
# population and want to pick parameters a and b to make a good model. 
# One way would be to try to estimate fertility and morbidity rates
# based on birth and death data, but that could get quite complicated. 
# For example, if you want to develop an accurate model of world
# population growth based on this sort of analysis you would probably
# need to collect data from different regions of our planet and develop
# different parameters for different societies, solve the problem in
# each part of the globe and then add your results together.   
#      A more simple-minded approach  to finding a and b is to try
# assuming the logistic model and to  use actual  population data to
# deduce good choices for a and b. The book explains a good way to do
# this on pages 77-80.  Along the way the method checks whether the
# model seems to be justified by the data.  You should read the text's
# explanation.  Let's sketch their idea in any case.  If you divide the
# logistic DE by P, you get

                             dP
                            ---- = a + bP
                            P dt

# If you have multi-year population data you can get good estimates for
# the left side of this equation by using difference quotients to
# estimate the time derivative of the population.  Dividing by P gives
# an estimate for (dP/dt)/P.  
#      For example, looking at the USA data on the bottom of page 72, we
# can estimate dP/dt  in 1920 in a "centered" way by taking the
# difference quotient DP/Dt = (P(1930)-P(1910))/20 (in units of
# people/year).  Divide this by P(1920) to get an estimate for (dP/dt)/P
# in 1920, for the USA population. Verify from the page 72 chart that
# these are the numbers you get:
> ((123.2-92.2)/20)/106.0;

                             .01462264151

# We can get estimates for 1910, 1900,1890 as well, we'll use them in a
# bit:  We are still using the table on page 72. Understand where the
# following numbers came from: 
> ((106-76.2)/20)/92.2; ((92.2-63.0)/20)/76.2; ((76.2-50.2)/20)/63;

                             .01616052061


                             .01916010499


                             .02063492063


#      Now if you plot points (P, (DP/Dt)/P) in the plane you can see if
# they seem to obey a linear law, as they should if the logistic model
# is applicable.  If they do then that justifies (after the fact) the
# idea of using the logistic model, and furthermore you can read off
# good values of the intercept a and the slope b from the graph.  The
# book has created such a plot on the top left of page 79.  ( They
# mislabled the horizontal axis "t" instead of "P" in my edition of the
# book, however.).  The first point we just computed is (106.0, 0.0146),
# the one the farthest to the right in the page 79 plot. In that same
# plot they have sketched a line which fits the points ``best, '' from
# which it appears that
> a:=0.032;   #vertical axis intercept from 
>             #figure 2.1.7

                              a := .032

> b:=(0.0125-0.032)/120;  #slope estimate (rise/run)
>             #from line in figure 2.1.7

                         b := -.0001625000000

# are good choices for intercept and slope, respectively.  Using these
# values for a and b you can solve the logistic equation and try to make
# quantitative estimates about future populations.  You can see that
# these choices for a and b are close to the one Verhulst made in 1845
# (see page 72), which is perhaps why his guesses for future populations
# (until 1930) were so accurate.  
# 
# 
# 
# 
# 
#      We can check our best line fit by plotting the points and the
# line.  Rather than reproduce the entire figure 2.1.7 on page 79, let's
# just do the four right-hand points and the line, since we calculated
# the second coordinates of those points already:
> with(plots): #load the plotting package
> points:={[106,.0146],[92.2,.01616],[76.2,.01916],[63.0,.020635]};
>     #this set contains the four points calculated above

  points := {

        [106, .0146], [92.2, .01616], [76.2, .01916], [63.0, .020635]

        }

> pict1:=pointplot(points): #a plot of the points,
>     #with output suppressed.  Make sure
>     #to end this command with a colon! If you use a
>     #semicolon you get a huge mess. 
> line:=plot(a+b*x,x=0..120):  #same warning here
> 
> display({pict1,line});  #now use a semicolon, and
>     #get a picture containing the line and the points. 


# Verify for yourself that the plot above accurately represents the line
# and the four right-hand points of figure 2.1.7 on page 79.
# 
#      The moral of Investigation 1 on page 79 is that although the
# choices for a and b above worked well for a while, they don't work
# very well to estimate populations later in the 1900's.  Let's go
# through this investigation. 
#      Let's solve the initial value problem for the logistic equation
# with the values of a and b from above, taking 1920 as t=0, and P0
# therefore equal to 106 million people (table page 72).  Then we'll
# compare the predicted population to actual populations from 1820 to
# 1990.  We'll do this numerically with a table, and also graphically.  
> with(DEtools):  #load the DE package
> deqtn1:=diff(x(t),t)=a*x(t) + b*x(t)^2;
>      #logistic eqtn with our parameters

                   d                                        2
         deqtn1 := -- x(t) = .032 x(t) - .0001625000000 x(t)
                   dt


> P:=dsolve({deqtn1,x(0)=106.0},x(t));
>      #take 1920 as t=0 and solve the initial value
>      #problem  

                                           1
       P := x(t) = 2560. --------------------------------------
                         13. + 11.15094340 exp(-.03200000000 t)


> f:=s->evalf(subs(t=s,rhs(P))); #extract the right-hand side
>       #from the above expression to make your solution function

                 f := s -> evalf(subs(t = s, rhs(P)))

> f(s);    #check that those weird subs and rhs commands really work
>     

                                     1
             2560. --------------------------------------
                   13. + 11.15094340 exp(-.03200000000 s)


> 
> f(-100),9.6;  #use shift-return to enter
> f(-90),12.9;  #multiline commands without
> f(-80),17.1;  #Maple trying to execute
> f(-70),23.2;  #the lines one at a time.
> f(-60),31.4;
> f(-50),38.6;
> f(-40),50.2; 
> f(-30),63.0;
> f(-20), 76.2;  
> f(-10), 92.2;  
> f(0), 106.0;   
> f(10), 123.2;
> f(20), 132.2;  #You should get a list of
> f(30), 151.3;  #predicted populations vs.
> f(40), 179.3;  #actual ones
> f(50), 203.3;
> f(60), 225.6;
> f(70), 248.7;
> 
> pops:={[1820,9.6],[1830,12.9],[1840,17.1],
>        [1850,23.2],[1860,31.4],[1870,38.6],
>        [1880,50.2],[1890,63.0],[1900,76.2],
>        [1910,92.2],[1920,106.0],[1930,123.2],
>        [1940,132.2],[1950,151.3],[1960,179.3],
>        [1970,203.3],[1980,225.6],[1990,248.7]};
>                #points for year vs population graph   
>               

> g:=t->f(t-1920);  #convert from t back to
>     #calander year

> graph1:=pointplot(pops): #remember, colon!
> graph2:=plot(g(t),t=1820..1990,`color`=`black`):
> display({graph1,graph2}); #now semicolon!

# Looking at the numbers you generated and the graph you made above, you
# should observe that these choices for a and b give a solution which 
# greatly underestimates the U.S. population, starting a little after
# 1950  thru the present day.
#   
# PROBLEM A.1. Can you explain the deviation between predictions and
# actual population data which begins around 1950?   Can you think of
# changes in society over the last 100 years which may have affected a
# and b (or k and M), therefore affecting the values of a and b
# (equivalently k and M), or any other changes affecting the validity of
# using the logistic model from 1820 to 1990?   
# 
# Now you will do Investigation 2, on page 79.  This is the bulk of your
# part A work:  
# 
# PROBLEM A.2.  Use the centered difference approach as described above,
# to estimate (dP/dt)/P, as above, for the U.S. populations from
# 1900-1990.  Thus you will be computing estimates for (dP/dt)/P for the
# eight years 1910, 1920,  1930, ..., 1980. 
# 
# PROBLEM A.3.  Use MAPLE to plot the eight points (P, (DP/Dt)/P), which
# result from your computations in A.2.
# 
# PROBLEM A.4.  Rather than trying to fit a line to the points of
# Problem A.3, use the method of least squares, described on pages 34-35
# of your Computing Projects companion to the textbook, to have the
# computer find the best line fit to your collection of eight points
# representing (P,(dP/dt)/P).  
# 
# PROBLEM A.5.  Plot the best-line fit together with the eight points
# (using MAPLE) to verify that your line from A.4 gives a good fit.
# 
# PROBLEM A.6.  Use the best-line fit to get values for a and b in the
# logistic equation.  Then use MAPLE to solve the logistic equation,
# using 1900 as t=0.
# 
# PROBLEM A.7  Use MAPLE to plot actual populations from 1900 to 1990
# together with the graph of your solution to A.6.  Does this give a
# better fit for the post 1950 populations than the earlier logistic
# equation solution?
# 
# PROBLEM A.8  What does your logistic equation solution predict for the
# USA population in 2020?  Do you think this is a good approximation? 
# Explain your reasoning.
# 
# 
# 
# 
# 
# 
# 
# 
# 
# 
# 
# 
# 
# 
# PART B
# 
#      In this part of the project we will study numerical methods for
# approximating solutions to first order differential equations.  We
# will see soon how higher order differential equations can be converted
# into first order systems of differential equations (and we may have
# even mentioned this once or twice in passing already).  It turns out
# that there is a natural way to generalize what we will do today in the
# context of a single first order differential equation  to systems of
# first order differential equations.  So understanding today's material
# will be an important step in understanding numerical solutions to
# higher order differential equations and to systems of differential
# equations.
#      We will be working through material from sections 2.4-2.6 of the
# text.  You may want to read the text and consult the Computing
# Projects manual, projects 8-10, if you find  explanations here to be
# deficient. 
#      The most basic method of approximating solutions to differential
# equations is called Euler's method, after the 1700's mathematician who
# first formulated it.  If you want to approximate the solution to the
# initial value problem dy/dx = f(x,y), y(x0)=y0, first pick a step size
# ``h''.  Then for x between x0 and x0+h, use the constant slope
# f(x0,y0).  At x-value x1:=x0+h your y-value will therefore be y1:=y0 +
# f(x0,y0)h.  Then for x between x1 and x1+h you use the constant slope
# f(x1,y1), so that at x2:=x1+h your y-value is y2:=y1+f(x1,y1)h.  You
# continue in this manner.  It is easy to visualize if you understand
# the slope field concept we've been talking about; you just use the
# slope field where you are at the end of each step to get a slope for
# the next step.  It is straightforward to have the computer do this
# sort of tedious computation for you.  In Euler's time such
# computations would have been done by hand!
# 
#      A good first example to illustrate Euler's method is our favorite
# DE from the time of Calculus, namely dy/dx =y, say with initial value
# y(0)=1, so that y=exp(x) is the solution.  Let's take h=0.2 and try to
# approximate the solution of the x-interval [0,1].  Since the
# approximate solution will be piecewise affine, we only need to know
# the approximations at the discrete x values x=0,0.2,0.4,0.6,0.8,1. 
# Here's a simple ``do loop'' to make these computations.  
# 
> restart;  #clear any memory from earlier work 
> x0:=0.0; xn:=1.0; y0:=1.0; n:=5; h:=(xn-x0)/n;  # specify initial
>                         # values, number of steps, step size.         
>                               

                               x0 := 0


                              xn := 1.0


                              y0 := 1.0


                                n := 5


                           h := .2000000000

> f:=(x,y)->y;    #this is the function f(x,y) in dy/dx = f(x,y)

                           f := (x, y) -> y

> x:=x0; y:=y0;   #initialize x,y for the do loop

                                x := 0


                               y := 1.0

> for i from 1 to n do
>           k:= f(x,y):  #current slope, use : to suppress output
>           y:= y + h*k: #new y value via Euler
>           x:= x + h:   #updated x-value:
>           print(x,y,exp(x));  #display current values, 
>                        #and compare to exact solution.
>     od:                #``od'' ends a do loop  

                .2000000000, 1.200000000, 1.221402758


                .4000000000, 1.440000000, 1.491824698


                .6000000000, 1.728000000, 1.822118800


                .8000000000, 2.073600000, 2.225540928


                1.000000000, 2.488320000, 2.718281828

# Notice your approximations are all a little too small, in particular
# your final approximation 2.488... is short of the exact value of
# exp(1)=e=2.71828..  The reason for this is that because of the form of
# our f(x,y) our approximate slope is always less than the actual slope.
#  We can see this graphically using plots: 
> with(plots):with(linalg):
> xval:=vector(n+1);yval:=vector(n+1);  #to collect all our points

                      xval := array(1 .. 6, [])


                      yval := array(1 .. 6, [])


> xval[1]:=x0; yval[1]:=y0;        #initial values

                             xval[1] := 0


                            yval[1] := 1.0

>            #paste in the previous work, and modify for plotting:
> for i from 1 to n do
>           x:=xval[i]:  #current x
>           y:=yval[i]:  #current y
>           k:= f(x,y):  #current slope
>           yval[i+1]:= y + h*k:   #new y value via Euler
>           xval[i+1]:= x + h:     #updated x-value:
>               od:                #``od'' ends a do loop  
> approxsol:=pointplot({seq([xval[i],yval[i]], i=1..n+1)},connect=true):
> exactsol:=plot(exp(t),t=0..1,`color`=`black`):   
>      #used t because x was already used above
> display({approxsol,exactsol});


# The picture you just created is like the one in figure 2.4.3, on page
# 100 of the text.  In your picture the upper graph is of the
# exponential function, the lower graph is of your Euler approximation.
# 
#     It should be that as your step size ``h'' gets smaller, your
# approximations get better to the actual solution.  This is true if
# your computer can do exact math (which it can't), but in practice you
# don't want to make the computer do too many computations because of
# problems with round-off error and computation time, so for example,
# choosing h=0.0000001 would not be practical. But, trying h=0.01 should
# be instructive.
# 
# PROBLEM B.1  Change the n-value to 100.  Keep the other data the same.
# Only print out your approximations when h is a multiple of 0.1.  The
# commands you need are below.  
> x0:=0.0; xn:=1.0; y0:=1.0; n:=100; h:=(xn-x0)/n;
> x:=x0; y:=y0;

> for i from 1 to n do
>           k:= f(x,y):  #current slope
>           y:= y + h*k: #new y value via Euler
>           x:= x + h:   #updated x-value:
>           if frac(i/10)=0
>                then print(x,y,exp(x));
>           fi;  #use the ``if'' test to decide when to print;
>                #the command ``frac'' computes the remainder 
>                #of a quotient, it will be zero for us if i 
>                #is a multiple of 10. 
>     od:                

                .1000000000, 1.104622127, 1.105170918


                .2000000000, 1.220190041, 1.221402758


                .3000000000, 1.347848916, 1.349858808


                .4000000000, 1.488863735, 1.491824698


                .5000000000, 1.644631823, 1.648721271


                .6000000000, 1.816696699, 1.822118800


                .7000000000, 2.006763370, 2.013752707


                .8000000000, 2.216715220, 2.225540928


                .9000000000, 2.448632678, 2.459603111


                1.000000000, 2.704813834, 2.718281828

#  PROBLEM B.2.  How close is your estimate to e now, based on n=100? 
# 
#      Actually, considering how many computations you did in problem
# B.1, you are not so close to the exact solution. In more complicated
# problems it is a very serious issue to find relatively efficient ways
# of approximating solutions.  An entire field of mathematics,
# ``numerical analysis'' deals with such issues for a variety of
# mathematical problems.   The book talks about some improvements to
# Euler  in sections 2.5 and 2.6.    If you are interested in this
# important field of mathematics you should read 2.5 and 2.6 carefully.
# Let's summarize some highlights below.
#       For any time step h the fundamental theorem of calculus asserts
# that, since dy/dx = f(x,y(x)),
> y(x+h)=y(x) + Int(f(t,y(t)),t=`x`..`x+h`);

                                     x+h
                                    /
                                   |
                y(x + h) = y(x) +  |     f(t, y(t)) dt
                                   |
                                  /
                                    x

# The problem with Euler is that we always approximated this integral by
# h*f(x,y(x)), i.e. we used the left-hand endpoint as our approximation
# of the ``average height''.  The improvements to Euler depend on better
# approximations to that integral.  ``Improved Euler'' uses an
# approximation to the Trapezoid Rule to approximate the integral.  The
# trapezoid rule for the integral approximation would be
# (1/2)*h*(f(x,y(x))+f((x+h),y(x+h)).  Since we don't know y(x+h) we
# approximate it using unimproved Euler, and then feed that into the
# trapezoid rule.  This leads to the improved Euler do loop below.  Of
# course before you use it you must make sure you initialize everything
# correctly.
#  
> x:=x0; y:=y0; n:=5; h:=(xn-x0)/n;

                                x := 0


                               y := 1.0


                                n := 5


                           h := .2000000000

> for i from 1 to n do
>     k1:=f(x,y):          #left-hand slope
>     k2:=f(x+h,y+h*k1):   #approximation to right-hand slope
>     k:= (k1+k2)/2:       #approximation to average slope
>     y:= y+h*k:            #improved Euler update
>     x:= x+h:             #update x
>     print(x,y,exp(x));
> od:  
> 

                .2000000000, 1.220000000, 1.221402758


                .4000000000, 1.488400000, 1.491824698


                .6000000000, 1.815848000, 1.822118800


                .8000000000, 2.215334560, 2.225540928


                1.000000000, 2.702708163, 2.718281828

# Notice you almost did as well with n=5 as you did with n=100 in
# unimproved Euler. 
# 
# PROBLEM B.3   Use improved Euler with n= 100 to see how well you do
# for the same initial value problem.  Use the template above, modified
# to print out data only at x-multiples of 0.1
# 
#      One can also use Taylor approximation methods to improve upon
# Euler; by differentiating the equation dy/dx = f(x,y) one can solve
# for higher order derivatives of y in terms of the lower order ones,
# and then use the Taylor approximation for y(x+h) in terms of y(x). 
# See the book for more details of this method, we won't do it here.
#      In the same vein as ``improved Euler'' we can use the Simpson
# approximation for the integral instead of the Trapezoid one, and this
# leads to the Runge-Kutta method which is widely used in real software.
#  (You may or may not have talked about Simpson's Rule in Calculus, it
# is based on a quadratic approximation to the function f, whereas the
# Trapezoid rule is based on a first order approximation.)  Here is the
# code for the Runge-Kutta method.  The text explains it in section 2.6.
#  Simpson's rule approximates an integral in terms of the integrand
# values at each endpoint and at the interval midpoint.  Runge-Kutta
# uses two different approximations for the midpoint value.
#      Before you use the loop below you must initialize your values,
# like you did above.  
# 
# 
> 
> x:=x0; y:=y0; n:=5; h:=(xn-x0)/n;

                                x := 0


                               y := 1.0


                                n := 5


                           h := .2000000000


> for i from 1 to n do
>     k1:=f(x,y):              #left-hand slope
>     k2:=f(x+h/2,y+h*k1/2):   #1st guess at midpoint slope
>     k3:=f(x+h/2,y+h*k2/2):   #second guess at midpoint slope
>     k4:=f(x+h,y+h*k3):       #guess at right-hand slope
>     k:=(k1+2*k2+2*k3+k4)/6:  #Simpson's approximation for the integral
>     x:=x+h:                  #x update
>     y:=y+h*k:                #y update
>     print(x,y,exp(x));              #display current values
>   od:  
> 

                .2000000000, 1.221400000, 1.221402758


                .4000000000, 1.491817960, 1.491824698


                .6000000000, 1.822106456, 1.822118800


                .8000000000, 2.225520825, 2.225540928


                1.000000000, 2.718251136, 2.718281828

# Notice how close Runge-Kutta gets you to the correct value of e, with
# n=5.
# 
# PROBLEM B.4.  Do Runge-Kutta for the same initial value problem, with
# n=20.  Compare to your results using Euler and improved Euler, using
# n=100. There should be a huge improvement.
# 
#    As we know, solutions to non-linear DE's can blow up, and there are
# other interesting pathologies as well, so if one is doing numerical
# solutions there is a real need for care.  The text has a number of
# examples of this.  You are to do the following  two problems:  Don't
# forget to insert textual explanations to go with your MAPLE
# computations.
# 
# PROBLEM B.5.  Section 2.4 page 103, #25.  (illustrates the danger of
# missing blow-up)
# 
# PROBLEM B.6.  Section 2.6 page 123:  Work through and understand
# example #4.  Recreate the data in the last two columns of the table on
# the top of page 123, using your Runge-Kutta routine (i.e. the
# Runge-Kutta approximation with h=0.05 and the actual y value).  
# 
#      Understand (for yourself) how mathematical instability caused by
# the solution to the homogeneous problem can lead to numerical
# instability like that illustrated in this important example of Problem
# B.6, even when you're using a ``good'' numerical technique like Runge
# Kutta.  This example illustrates why people who create numerical
# solutions to problems must also be well-grounded in the mathematical
# theory behind the problems.
# 
#