Data merging

Two equivalent MC simulations gave the following values of the order parameter:
θ1 = 2.17(33), θ2 = 2.32(31).
The trial displacement was then changed and a third simulation yielded:
θ3 = 2.25(22). Calculate the best estimate of θ. [2.25(16)]

restart;
theta1:=2.17: theta2:=2.32: theta3:=2.25:
s1:=0.33: s2:=0.31: s3:=0.22:

Data 1 and 2 have (nearly) equal errors, so the merged 1+2 data are the plain average, with the standard error of a mean of two values:
theta12:=(theta1+theta2)/2; s12:=sqrt(s1^2+s2^2)/2;

Data 1+2 and 3 are combined with weights ∝ 1/s², i.e., the inverse-variance weighted mean θ = (θ12/s12² + θ3/s3²)/(1/s12² + 1/s3²), whose variance is 1/(1/s12² + 1/s3²):
theta123:=(theta12/s12^2+theta3/s3^2)/(1/s12^2+1/s3^2); s123:=1/sqrt(1/s12^2+1/s3^2);

Linear and nonlinear fit – simple, without error analysis

Linear fit

restart;
with(Statistics): with(RandomTools[MersenneTwister]): with(plots):
SetState(state=313);
xi:=i*0.1;
GenerateFloat();

Function 3+x²: data with errors added:
f:=x->3+x^2+(GenerateFloat()-0.5)*0.2;
X:=Vector([seq(xi,i=-10..10)]);
Y:=Vector([seq(f(xi),i=-10..10)]);
p1:=plot(X,Y,style=point):

To be fitted by a quadratic polynomial – linear in parameters, so Maple will use LinearFit:
fn:=Fit(a+b*x+c*x^2,X,Y,x);
p2:=plot(fn,x=-1..1,color=blue):
display({p1,p2});

Nonlinear fit

restart;
with(Statistics): with(RandomTools[MersenneTwister]): with(plots):
SetState(state=314);
xi:=i*0.5;
GenerateFloat();

Function 3/(2+x²): data with errors added:
f:=x->3/(2+x^2)+(GenerateFloat()-0.5)*0.1;
X:=Vector([seq(xi,i=-10..10)]);
Y:=Vector([seq(f(xi),i=-10..10)]);
p1:=plot(X,Y,style=point):

To be fitted by a/(b+x²) – nonlinear in parameters, so Maple will use NonlinearFit; initialvalues are needed:
fn:=Fit(a/(b+x^2),X,Y,x,initialvalues=[a=1,b=1]);
p2:=plot(fn,x=-5..5,y=0..1.5,color=blue):
display({p1,p2});
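NB: For a nonlinear model, Fit returns no standarderrors or variancecovariancematrix (see Fitting 1 below). A minimal workaround sketch, assuming the usual Gauss–Newton linearization cov ≈ s²·(JᵀJ)⁻¹; the values in est are placeholders only, to be replaced by the coefficients read off from fn:

with(LinearAlgebra):
model:=a/(b+x^2):
pars:=[a,b]:
est:=[a=1.5,b=1.0]:   # illustrative values only – substitute the actual fit result
n:=numelems(X): p:=numelems(pars):
J:=Matrix(n,p,(i,j)->evalf(eval(diff(model,pars[j]),[op(est),x=X[i]]))):  # Jacobian
r:=Vector(n,i->Y[i]-evalf(eval(model,[op(est),x=X[i]]))):                 # residuals
s2:=(r^+ . r)/(n-p):                  # residual variance
covGN:=s2*MatrixInverse(J^+ . J):     # approximate covariance of [a,b]
seq(sqrt(covGN[k,k]),k=1..p);         # approximate standard errors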
3× linear fit with error analysis

Maple Fitting 1: The errors of the data are not known, but they are the same

We analyze a first-order decomposition reaction. Concentrations [mmol/L] as a function of time [s] are given below. It is assumed that the relative errors of the concentration data are the same (including t = 0). Calculate the concentration at time t = 300 s with the error estimate. [19(4) μmol/L]

NB: The case of equal absolute errors in c cannot be (easily) calculated in Maple because the model is not linear and Maple does not support variancecovariancematrix for NonlinearFit.

restart;
with(Statistics): with(plots):

Part 1 – fitting

Output fields: 1=leastsquaresfunction, 2=parametervalues, 3=standarderrors, 4=residualstandarddeviation, 5=variancecovariancematrix
OUT:=[leastsquaresfunction,parametervalues,standarderrors,residualstandarddeviation,variancecovariancematrix]:
ttab:=[0,10,20,30,40,50,60,70,80,90,100,110,120]:
ctab:=[3.371,3.291,2.211,1.903,1.846,1.335,1.308,1.019,0.716,0.569,0.656,0.572,0.455]:
n:=numelems(ctab);

Equal relative errors in c = equal absolute errors in ln(c): c(t) = c0·exp(−kt) → ln c(t) = ln(c0) − kt. Calculate ln(c) (changing ctab in place):
for i from 1 to n do ctab[i]:=ln(ctab[i]) end do:
p1:=plot(ttab,ctab,style=point):

The model:
f:=lnc0-k*t;

This fit is linear (Maple calls LinearFit internally):
out:=Fit(f,ttab,ctab,t,output=OUT);
p2:=plot(out[1],t=0..300,color=blue):
display(p1,p2);

Part 2 – extrapolation

Function to be calculated: g = ln(c(300)); the parameters k, lnc0 must stay unassigned because we need derivatives:
g:=subs(t=300,f);

The vector of "sensitivities"; watch the order of variables generated by Maple: [k,lnc0]
sens:=<diff(g,k)|diff(g,lnc0)>;

The absolute error of the logarithm of the concentration = the relative error of the concentration; by error propagation, σ_g² = (∂g/∂a)·⟨Δa Δaᵀ⟩·(∂g/∂a)ᵀ:
dg:=sqrt(sens.out[5].sens^+);

Assign the fitted parameters to k, lnc0:
assign(out[2]);

Calculate the required value (extrapolation to t=300 s):
lnc300:=g; c300:=exp(lnc300);

Estimate the standard error of c(300) = value × relative error:
dc300:=c300*dg;
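Spelled out for this two-parameter model (C = out[5], parameter order [k, lnc0]), the propagation used in Part 2 is
σ_g² = sens·C·sensᵀ, sens = (∂g/∂k, ∂g/∂lnc0) = (−300, 1),
i.e., σ²(ln c(300)) = 300²·C[k,k] − 2·300·C[k,lnc0] + C[lnc0,lnc0].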
Maple Fitting 2: The errors of the data are not known and the weights are not the same

We measure the heat capacity of a sample as a function of temperature. We have 10 data pairs (T, Cp) from DSC-1 (for t < 55 °C) and 2 data pairs, for t = 55 and 60 °C, from a more accurate machine, DSC-2. Based on the literature, DSC-2 is twice as accurate (its expected error is one half of that of DSC-1). Assume that the inaccuracies are stochastic, not systematic. 1) Fit the data to a linear function and plot the fit, the data, and the standard deviations of the data. 2) Calculate the best estimate of Cp(25 °C).

Fabrication of the data: tab 5 60 5 | tabproc x '2.03+.13*x+rnd(1)*.4/(1+(x>51))' '(1+(x>51))^2' | transtab | repl ' ' ' ' | repl ' ' ,

restart;
with(Statistics): with(plots):
ttab:=[5,10,15,20,25,30,35,40,45,50,55,60]:  # t in °C
Ctab:=[3.615,3.408,4.107,4.404,5.937,5.565,7.023,6.577,7.632,8.31,9.133,9.833]:  # Cp in J/K
w:=[1,1,1,1,1,1,1,1,1,1,4,4]:  # weights (DSC-2: σ halved ⇒ w = 1/σ² quadrupled)
p1:=plot(ttab,Ctab,style=point,labels=["t [°C]","C [J/K]"],symbolsize=16):

1)
out:=Fit(a1+a2*t,ttab,Ctab,t,weights=w,output=[leastsquaresfunction,residualstandarddeviation]);
f:=unapply(out[1],t);
p2:=plot(f(t),t=0..65,color=blue):

out[2] = s ⇒ we calculate sig[i] = "reconstructed" estimated standard deviations of the input data:
ex:=Ctab: sig:=Ctab:
for i from 1 to numelems(ex) do ex[i]:=f(ttab[i]): sig[i]:=out[2]/sqrt(w[i]): end do:

And we plot the fit with the "reconstructed" standard deviations. Ideally, about 2/3 of the data should lie within the error bars.
p3:=ErrorPlot(ex,xcoords=ttab,yerrors=sig,style=line,color=red):
display(p1,p2,p3);

2)
The value for t = 25 first:
f(25);

Variance-covariance matrix:
res:=Fit(a1+a2*t,ttab,Ctab,t,weights=w,output=[parametervalues,variancecovariancematrix]);
assign(res[1]);

The sensitivities (∂g/∂a1, ∂g/∂a2) for g = f = a1+a2·t at t = 25 (so simple that Maple is not necessary...):
g:=<1|25>:

Standard error of f(25):
sqrt(g.res[2].g^+);

Result: Cp(25 °C) = 5.43(14) J/K (original datum: 5.937(43) J/K – see below). The accuracy of the treated data is higher! Note, however, that we assumed uncorrelated stochastic errors, which likely does not hold for a DSC.
ttab[5]; Ctab[5]; sig[5];
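A quick count to go with the 2/3 rule of thumb above (an extra check, assuming independent Gaussian errors):

cnt:=0:
for i from 1 to numelems(Ctab) do
  if abs(Ctab[i]-f(ttab[i])) <= sig[i] then cnt:=cnt+1 end if
end do:
cnt, numelems(Ctab);   # roughly 2/3 of the 12 points should fall within one sigma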
Maple Fitting 3: The errors of the data are known and reliable

The statistical weights W = exp(S), where S is the residual entropy per lattice site, were determined for ice Ih periodic crystals with different numbers of molecules. The standard errors are the results of long MC simulations and are thus reliable. Theory predicts the following scaling law:
W(N) = W(∞) + b/N + c·ln(N)/N
Fit the data and determine W(∞). (See also plot/mat-num2.sh.)

restart;
with(Statistics): with(plots):

Number of molecules in a lattice:
Ntab:=[360,2880,9720,23040,45000,77760,184320]:

Statistical weights W = exp(S):
Wtab:=[1.51898342,1.50906444,1.50796203,1.50767991,1.50757977,1.50753663,1.50749468]:

Standard errors:
sigmatab:=[0.00001264,0.00000887,0.00000650,0.00000566,0.00000466,0.00000405,0.00000348]:

Weights to be used in Maple:
w:=sigmatab: for i from 1 to numelems(sigmatab) do w[i]:=1/w[i]^2 end do:
p1:=plot(Ntab,Wtab,style=point,labels=["N","W"]):

The basis for the fit (equivalent to writing base:=Winf + b/N + c*ln(N)/N):
base:=[1,1/N,ln(N)/N];

The fitted function (to plot):
f:=LinearFit(base,Ntab,Wtab,N,weights=w);
p2:=plot(f,N=0..200000,color=blue,labels=["N","W"]):
display(p1,p2);

This does not look good, so let us plot against 1/N instead of N:
iNtab:=Ntab: for i from 1 to numelems(Ntab) do iNtab[i]:=1./Ntab[i] end do:
iNtab;
p1:=plot(iNtab,Wtab,style=point,labels=["1/N","W"]):
ff:=subs(N=1/x,f);
p2:=plot(ff,x=1e-9..1./350,color=blue,labels=["1/N","W"]):
display(p1,p2);

Errors of the parameters (particularly of W(∞)):
out:=LinearFit(base,Ntab,Wtab,N,weights=w,output=[parametervalues,standarderrors,residualstandarddeviation]);

The residualstandarddeviation should be around 1. The result is:
printf("%8.7f ± %8.7f (error from data scattering)\n",out[1][1],out[2][1]);

However, ν = n − p = 4 (the number of degrees of freedom) is small – we cannot get a good error estimate from so few residuals! Hence we divide by s = residualstandarddeviation, relying on the known data inaccuracies instead:
printf("%9.7f ± %9.7f (error based on known data inaccuracies)\n",out[1][1],out[2][1]/out[3]);

All parameters, to compare with MC:
for i from 1 to 3 do printf("%9.8f ± %9.8f (error based on known data inaccuracies)\n",out[1][i],out[2][i]/out[3]) end do;

Polynomial fit – numerical stability

restart;
with(Statistics): with(plots): Digits:=16;
x:=i*0.01;

Let us fit exp(x) on [2,4] by a polynomial. (We use exp(x) directly; the name f must stay unassigned because it serves as a fitting parameter below.)
X:=Vector([seq(x,i=200..400)]):
Y:=Vector([seq(exp(x),i=200..400)]):

Output fields: 1=leastsquaresfunction, 2=residuals, 3=residualstandarddeviation, 4=variancecovariancematrix
OUT:=[leastsquaresfunction,residuals,residualstandarddeviation,variancecovariancematrix]:
res9:=Fit(a+b*t+c*t^2+d*t^3+e*t^4+f*t^5+g*t^6+h*t^7+i*t^8+j*t^9,X,Y,t,output=OUT):

[3] = the residual standard deviation:
res9[3];

Just shifting the variable gives a better-conditioned matrix:
res9c:=Fit(a+b*(t-3)+c*(t-3)^2+d*(t-3)^3+e*(t-3)^4+f*(t-3)^5+g*(t-3)^6+h*(t-3)^7+i*(t-3)^8+j*(t-3)^9,X,Y,t,output=OUT):
res9c[3];

Best solution – use orthogonal functions:
res9P:=Fit(a+b*(t-3)+c*LegendreP(2,t-3)+d*LegendreP(3,t-3)+e*LegendreP(4,t-3)+f*LegendreP(5,t-3)+g*LegendreP(6,t-3)+h*LegendreP(7,t-3)+i*LegendreP(8,t-3)+j*LegendreP(9,t-3),X,Y,t,output=OUT):
res9P[3];

Plotting [2] = the residuals:
p9:=plot(X,res9[2],color=red,legend="poly (low rank)"):
p9c:=plot(X,res9c[2],color=green,legend="poly shifted"):
p9ch:=plot(X,res9P[2],color=black,style=point,legend="poly Legendre"):
display(p9,p9c,p9ch);

[4]: selected elements of the covariance matrix for comparison.
A diagonal element:
i:=3:j:=3: res9[4][i][j]; res9c[4][i][j]; res9P[4][i][j];
An off-diagonal element:
i:=2:j:=4: res9[4][i][j]; res9c[4][i][j]; res9P[4][i][j];
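To quantify the conditioning problem, a rough check (not part of the exercise): compare the condition numbers of the normal matrices AᵀA built from plain and from shifted powers; cond(AᵀA) = cond(A)², and values approaching 10^Digits signal the instability seen above. At Digits=16 the numbers themselves are only indicative.

with(LinearAlgebra):
A:=Matrix(201,10): Ac:=Matrix(201,10):
for ii from 1 to 201 do
  xx:=2.0+0.01*(ii-1):
  A[ii,1]:=1.0: Ac[ii,1]:=1.0:
  for k from 2 to 10 do
    A[ii,k]:=A[ii,k-1]*xx:           # basis 1, t, ..., t^9
    Ac[ii,k]:=Ac[ii,k-1]*(xx-3.0):   # basis 1, (t-3), ..., (t-3)^9
  end do:
end do:
ConditionNumber(A^+ . A), ConditionNumber(Ac^+ . Ac);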
Complex fitting example

Fit the data to a suitable function f(x) and provide the solution x0 of the equation f(x0) = 1, including the standard error estimate. [12.5(2)]

restart:
with(Statistics): with(plots):
X:=[2,3,4,5,6,7,8,9,10]:
Y:=[4.001,3.424,3.039,2.710,2.482,2.208,1.985,1.749,1.528]:
sig:=[0.014,0.013,0.011,0.010,0.009,0.008,0.008,0.007,0.007]:
p1:=ErrorPlot(Y,xcoords=X,yerrors=sig,style=line,color=red);

Maple likes weights (not standard errors) as a parameter:
W:=sig: for i from 1 to numelems(sig) do W[i]:=1/sig[i]^2 end do:

The model (alternatives commented out):
p:=3: base:=a[1]+a[2]*x+a[3]/x:
#p:=4: base:=a[1]+a[2]*x+a[3]*x^2+a[4]*x^3:
#p:=5: base:=a[1]+a[2]*x+a[3]*x^2+a[4]*x^3+a[5]*x^4:
#p:=4: base:=a[1]+a[2]*x+a[3]/x+a[4]*x^2:
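An optional model-screening aid (not required by the exercise): with weights 1/σᵢ², the residualstandarddeviation of an adequate model should come out near 1, so the candidate bases can be compared quickly:

for base0 in [a[1]+a[2]*x+a[3]/x, a[1]+a[2]*x+a[3]*x^2+a[4]*x^3, a[1]+a[2]*x+a[3]/x+a[4]*x^2] do
  print(base0, LinearFit(base0,X,Y,x,weights=W,output=residualstandarddeviation))
end do;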
The sensitivities of the root of the equation f(x) = y with respect to the parameters follow from the derivative of an implicit function:
f(aᵢ+daᵢ, x+dx) = y  ⇒  (∂f/∂aᵢ)·daᵢ + (∂f/∂x)·dx = 0  ⇒  ∂x/∂aᵢ = −(∂f/∂aᵢ)/(∂f/∂x)

g:=Vector(p): for i from 1 to p do g[i]:=-diff(base,a[i])/diff(base,x) end do:
g;
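Written out for the active basis a[1]+a[2]·x+a[3]/x, where ∂f/∂x = a[2] − a[3]/x²:
∂x0/∂a[1] = −1/(a[2] − a[3]/x²),
∂x0/∂a[2] = −x/(a[2] − a[3]/x²),
∂x0/∂a[3] = −(1/x)/(a[2] − a[3]/x²),
all evaluated at x = x0.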
Fitting

NB: standarderrors and variancecovariancematrix are calculated from the sum of squares, not from the given sig[i]!
If we are sure that our sig[i] are (more) accurate, we should rescale the resulting errors.

Output fields: 1=parametervalues, 2=standarderrors, 3=residualstandarddeviation, 4=variancecovariancematrix, 5=leastsquaresfunction, 6=residuals
res:=LinearFit(base,X,Y,x,weights=W,output=[parametervalues,standarderrors,residualstandarddeviation,variancecovariancematrix,leastsquaresfunction,residuals]):
f:=unapply(res[5],x);

The data and the fit:
p2:=plot(f(x),x=1.5..13.5,symbol=soliddiamond,symbolsize=15,color=gray):
display(p1,p2);

The differences:
dif:=Y: for i from 1 to numelems(Y) do dif[i]:=Y[i]-f(X[i]) end do: dif;
p1:=ErrorPlot(dif,xcoords=X,yerrors=sig,style=line,symbol=soliddiamond,symbolsize=15,color=red):
p2:=plot(0,x=1.5..10.5,color=blue):
display(p1,p2);

The residuals res[6] are scaled to units of sig[i] (i.e., multiplied by √wᵢ), so about 2/3 of them should lie within ±1:
plot(X,res[6],style=point,symbol=soliddiamond,symbolsize=15,color=red);

The residualstandarddeviation should ideally be around 1:
s:=res[3];
a:=res[1];

The raw standard errors of the parameters:
res[2];

Rescaling the standard errors of the parameters (if we trust our sig[i] data):
res[2]/s;

The raw covariance matrix:
cov:=res[4];
x0:=fsolve(f(x)=1, x=10);
var:=subs(x=x0,g^+.cov.g);
sqrt(var);

The covariance matrix for weight[i]=1/sig[i]^2 should be rescaled, too (if we trust our sig[i] data):
cov:=res[4]/s^2;
x0:=fsolve(f(x)=1, x=10);
var:=subs(x=x0,g^+.cov.g);
sigmax0:=sqrt(var);

Solution: 12.49(5); if we are not sure about our sig[i], we had better quote 12.49(7). Poly(3) is less stable. Poly(4) cannot extrapolate! Other functions f give scattered results; better to increase σ.

Linear fit example, to be compared to student+fit.ods / student+fit.xlsx

Linear fit of X,Y data to a quadratic function, f(x) = a + b·x + c·x². The data Y are known with estimated standard errors σ.

restart;
with(Statistics):
X:=[ -2.000,-1.800,-1.600,-1.400,-1.200,-1.000,-0.800,-0.600,-0.400,-0.200,-0.000,0.200,0.400,0.600,0.800,1.000,1.200,1.400,1.600,1.800,2.000 ];
Y:=[ 11.876,10.918,9.746,8.761,7.791,7.003,6.408,5.452,5.010,4.325,3.622,3.466,4.087,3.257,3.517,3.546,2.575,2.525,3.807,3.162,4.141 ];
sig:=[ 0.700,0.660,0.620,0.580,0.540,0.500,0.460,0.420,0.380,0.340,0.300,0.340,0.380,0.420,0.460,0.500,0.540,0.580,0.620,0.660,0.700 ];
n:=numelems(sig);

Maple likes weights instead of σ's: w = 1/σ²
w:=sig: for i from 1 to n do w[i]:=1/sig[i]^2 end do:
res:=LinearFit(a+b*x+c*x^2,X,Y,x,weights=w,output=[parametervalues,residualstandarddeviation,standarderrors,variancecovariancematrix]);

Estimated parameter values:
res[1];

Estimated σ's of the parameters based on the σ's of the data (if the data σ are more reliable):
stderrs:=res[3]/res[2];
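A cross-check of the rescaled errors (assuming the standard weighted-least-squares formula: for known σᵢ and W = diag(1/σᵢ²), the parameter covariance is (AᵀWA)⁻¹):

with(LinearAlgebra):
A:=Matrix(n,3): for i from 1 to n do A[i,1]:=1: A[i,2]:=X[i]: A[i,3]:=X[i]^2 end do:
Wm:=DiagonalMatrix(w):
C:=MatrixInverse(A^+ . Wm . A):
seq(sqrt(C[k,k]),k=1..3);   # should reproduce stderrs above up to roundoff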