[section:st_eg Student's t Distribution Examples]

[section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]

Let's say you have a sample mean: you may wish to know what confidence intervals
you can place on that mean. Colloquially: "I want an interval that I can be
P% sure contains the true mean". (On a technical point, note that
the interval either contains the true mean or it does not: the
meaning of the confidence level is subtly
different from this colloquialism. More background information can be found on the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm NIST site]).

The formula for the interval can be expressed as:

[equation dist_tutorial4]

Where, ['Y[sub s]] is the sample mean, /s/ is the sample standard deviation,
/N/ is the sample size, /[alpha]/ is the desired significance level and
['t[sub ([alpha]/2,N-1)]] is the upper critical value of the Students-t
distribution with /N-1/ degrees of freedom.

[note
The quantity [alpha][space] is the maximum acceptable risk of falsely rejecting
the null-hypothesis. The smaller the value of [alpha] the greater the
strength of the test.

The confidence level of the test is defined as 1 - [alpha], and is often expressed
as a percentage. So for example a significance level of 0.05 is equivalent
to a 95% confidence level. Refer to
[@http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm
"What are confidence intervals?"] in __handbook for more information.
] [/ Note]

[note
The usual assumptions of
[@http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables independent and identically distributed (i.i.d.)]
variables and [@http://en.wikipedia.org/wiki/Normal_distribution normal distribution]
of course apply here, as they do in other examples.
]

From the formula, it should be clear that:

* The width of the confidence interval decreases as the sample size increases.
* The width increases as the standard deviation increases.
* The width increases as the ['confidence level increases] (0.5 towards 0.99999 - stronger).
* The width increases as the ['significance level decreases] (0.5 towards 0.00000...01 - stronger).

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to calculate intervals for
various confidence levels; the procedure will print these out
as a table:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void confidence_limits_on_mean(
      double Sm,    // Sm = Sample Mean.
      double Sd,    // Sd = Sample Standard Deviation.
      unsigned Sn)  // Sn = Sample Size.
   {
      using namespace std;
      using namespace boost::math;

      // Print out general info:
      cout <<
         "__________________________________\n"
         "2-Sided Confidence Limits For Mean\n"
         "__________________________________\n\n";
      cout << setprecision(7);
      cout << setw(40) << left << "Number of Observations" << "= " << Sn << "\n";
      cout << setw(40) << left << "Mean" << "= " << Sm << "\n";
      cout << setw(40) << left << "Standard Deviation" << "= " << Sd << "\n";

We'll define a table of significance/risk levels for which we'll compute intervals:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Note that these are the complements of the confidence/probability levels
(0.5, 0.75, 0.9 ... 0.99999).

Next we'll declare the distribution object we'll need; note that
the /degrees of freedom/ parameter is the sample size less one:

      students_t dist(Sn - 1);

Most of what follows in the program is pretty printing, so let's focus
on the calculation of the interval. First we need the t-statistic,
computed using the /quantile/ function and our significance level. Note
that since the significance levels are the complement of the probability,
we have to wrap the arguments in a call to /complement(...)/:

      double T = quantile(complement(dist, alpha[i] / 2));

Note that alpha was divided by two, since we'll be calculating
both the upper and lower bounds: had we been interested in a single-sided
interval then we would have omitted this step.

Now to complete the picture, we'll get the (one-sided) width of the
interval from the t-statistic
by multiplying by the standard deviation, and dividing by the square
root of the sample size:

      double w = T * Sd / sqrt(double(Sn));

The two-sided interval is then the sample mean plus and minus this width.

And apart from some more pretty-printing that completes the procedure.

Let's take a look at some sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site. The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis.
The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.

[pre'''
__________________________________
2-Sided Confidence Limits For Mean
__________________________________

Number of Observations        =  195
Mean                          =  9.26146
Standard Deviation            =  0.02278881


_______________________________________________________________________
Confidence        T            Interval         Lower           Upper
 Value (%)      Value           Width           Limit           Limit
_______________________________________________________________________
    50.000      0.676        1.103e-003       9.26036         9.26256
    75.000      1.154        1.883e-003       9.25958         9.26334
    90.000      1.653        2.697e-003       9.25876         9.26416
    95.000      1.972        3.219e-003       9.25824         9.26468
    99.000      2.601        4.245e-003       9.25721         9.26571
    99.900      3.341        5.453e-003       9.25601         9.26691
    99.990      3.973        6.484e-003       9.25498         9.26794
    99.999      4.537        7.404e-003       9.25406         9.26886
''']

As you can see, the large sample size (195) and small standard deviation (0.023)
have combined to give very small intervals; indeed, we can be
very confident that the true mean is 9.2.

For comparison, the next example data output is taken from
['P.K. Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

[pre'''
__________________________________
2-Sided Confidence Limits For Mean
__________________________________

Number of Observations        =  3
Mean                          =  37.8000000
Standard Deviation            =  0.9643650


_______________________________________________________________________
Confidence        T            Interval         Lower           Upper
 Value (%)      Value           Width           Limit           Limit
_______________________________________________________________________
    50.000      0.816           0.455         37.34539        38.25461
    75.000      1.604           0.893         36.90717        38.69283
    90.000      2.920           1.626         36.17422        39.42578
    95.000      4.303           2.396         35.40438        40.19562
    99.000      9.925           5.526         32.27408        43.32592
    99.900     31.599          17.594         20.20639        55.39361
    99.990     99.992          55.673        -17.87346        93.47346
    99.999    316.225         176.067       -138.26683       213.86683
''']

This time the fact that there are only three measurements leads to
much wider intervals; indeed, the intervals are so wide that it's hard
to be very confident in the location of the mean.

[endsect]

[section:tut_mean_test Testing a sample mean for difference from a "true" mean]

When calibrating or comparing a scientific instrument or measurement method of some kind,
we want to be able to answer the question "Does an observed sample mean differ from the
"true" mean in any significant way?". If it does, then we have evidence of
a systematic difference. This question can be answered with a Students-t test:
more information can be found
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
on the NIST site].

Of course, the assignment of "true" to one mean may be quite arbitrary;
often this is simply a "traditional" method of measurement.

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to determine which of the
possible hypotheses are rejected or not-rejected
at a given significance level:

[note
Non-statisticians might say 'not-rejected' means 'accepted' (often of the
null-hypothesis), implying, wrongly, that there really *IS* no difference,
but statisticians eschew this to avoid implying that there is positive evidence of 'no difference'.
'Not-rejected' here means there is *no evidence* of difference, but there still might well be a difference.
For example, see [@http://en.wikipedia.org/wiki/Argument_from_ignorance argument from ignorance] and
[@http://www.bmj.com/cgi/content/full/311/7003/485 Absence of evidence does not constitute evidence of absence.]
] [/ note]


   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_t_test(double M, double Sm, double Sd, unsigned Sn, double alpha)
   {
      //
      // M = true mean.
      // Sm = Sample Mean.
      // Sd = Sample Standard Deviation.
      // Sn = Sample Size.
      // alpha = Significance Level.

Most of the procedure is pretty-printing, so let's just focus on the
calculation; we begin by calculating the t-statistic:

      // Difference in means:
      double diff = Sm - M;
      // Degrees of freedom:
      unsigned v = Sn - 1;
      // t-statistic:
      double t_stat = diff * sqrt(double(Sn)) / Sd;

Finally calculate the probability from the t-statistic. If we're interested
in simply whether there is a difference (either less or greater) or not,
we don't care about the sign of the t-statistic,
and we take the complement of the probability for comparison
to the significance level:

      students_t dist(v);
      double q = cdf(complement(dist, fabs(t_stat)));

The procedure then prints out the results of the various tests
that can be done; these
can be summarised in the following table:

[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:

`cdf(complement(dist, fabs(t))) < alpha / 2`]]

[[The Alternative-hypothesis: there
*is difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:

`cdf(complement(dist, fabs(t))) > alpha / 2`]]

[[The Alternative-hypothesis: the sample mean *is less* than
the true mean.]
[Reject if CDF of t > 1 - significance level:

`cdf(complement(dist, t)) < alpha`]]

[[The Alternative-hypothesis: the sample mean *is greater* than
the true mean.]
[Reject if CDF of t < significance level:

`cdf(dist, t) < alpha`]]
]

[note
Notice that the comparisons are against `alpha / 2` for a two-sided test
and against `alpha` for a one-sided test.]

Now that we have all the parts in place, let's take a look at some
sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site. The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis. The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                        =  195
Sample Mean                                   =  9.26146
Sample Standard Deviation                     =  0.02279
Expected True Mean                            =  5.00000

Sample Mean - Expected Test Mean              =  4.26146
Degrees of Freedom                            =  194
T Statistic                                   =  2611.28380
Probability that difference is due to chance  =  0.000e+000

Results for Alternative Hypothesis and alpha  =  0.0500

Alternative Hypothesis        Conclusion
Mean != 5.000                 NOT REJECTED
Mean  < 5.000                 REJECTED
Mean  > 5.000                 NOT REJECTED
]

You will note the line that says the probability that the difference is
due to chance is zero. From a philosophical point of view, of course,
the probability can never reach zero. However, in this case the calculated
probability is smaller than the smallest representable double precision number,
hence the appearance of a zero here. Whatever its "true" value is, we know it
must be extraordinarily small, so the alternative hypothesis - that there is
a difference in means - is not rejected.

For comparison, the next example data output is taken from
['P.K. Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                        =  3
Sample Mean                                   =  37.80000
Sample Standard Deviation                     =  0.96437
Expected True Mean                            =  38.90000

Sample Mean - Expected Test Mean              =  -1.10000
Degrees of Freedom                            =  2
T Statistic                                   =  -1.97566
Probability that difference is due to chance  =  1.869e-001

Results for Alternative Hypothesis and alpha  =  0.0500

Alternative Hypothesis        Conclusion
Mean != 38.900                REJECTED
Mean  < 38.900                NOT REJECTED
Mean  > 38.900                NOT REJECTED
]

As you can see, the small number of measurements (3) has led to a large uncertainty
in the location of the true mean. So even though there appears to be a difference
between the sample mean and the expected true mean, we conclude that there
is no significant difference, and are unable to reject the null hypothesis.
However, if we were to lower the bar for acceptance down to alpha = 0.1
(a 90% confidence level) we see a different output:

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                        =  3
Sample Mean                                   =  37.80000
Sample Standard Deviation                     =  0.96437
Expected True Mean                            =  38.90000

Sample Mean - Expected Test Mean              =  -1.10000
Degrees of Freedom                            =  2
T Statistic                                   =  -1.97566
Probability that difference is due to chance  =  1.869e-001

Results for Alternative Hypothesis and alpha  =  0.1000

Alternative Hypothesis        Conclusion
Mean != 38.900                REJECTED
Mean  < 38.900                NOT REJECTED
Mean  > 38.900                REJECTED
]

In this case, we really have a borderline result,
and more data (and/or more accurate data)
is needed for a more convincing conclusion.

[endsect]

[section:tut_mean_size Estimating how large a sample size would have to become
in order to give a significant Students-t test result with a single sample test]

Imagine you have conducted a Students-t test on a single sample in order
to check for systematic errors in your measurements. Imagine that the
result is borderline. At this point one might go off and collect more data,
but it might be prudent to first ask the question "How much more?".
The parameter estimators of the students_t_distribution class
can provide this information.

This section is based on the example code in
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp]
and we begin by defining a procedure that will print out a table of
estimated sample sizes for various confidence levels:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_find_df(
      double M,   // M = true mean.
      double Sm,  // Sm = Sample Mean.
      double Sd)  // Sd = Sample Standard Deviation.
   {

Next we define a table of significance levels:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Printing out the table of sample sizes required for various confidence levels
begins with the table header:

      cout << "\n\n"
              "_______________________________________________________________\n"
              "Confidence       Estimated          Estimated\n"
              " Value (%)      Sample Size        Sample Size\n"
              "              (one sided test)   (two sided test)\n"
              "_______________________________________________________________\n";


And now the important part: the sample sizes required. Class
`students_t_distribution` has a static member function
`find_degrees_of_freedom` that will calculate how large
a sample size needs to be in order to give a definitive result.

The first argument is the difference between the means that you
wish to be able to detect; here it's the absolute value of the
difference between the sample mean and the true mean.

Then come two probability values: alpha and beta. Alpha is the
maximum acceptable risk of rejecting the null-hypothesis when it is
in fact true. Beta is the maximum acceptable risk of failing to reject
the null-hypothesis when in fact it is false.
Also note that for a two-sided test, alpha must be divided by 2.

The final parameter of the function is the standard deviation of the sample.

In this example, we assume that alpha and beta are the same, and call
`find_degrees_of_freedom` twice: once with alpha for a one-sided test,
and once with alpha/2 for a two-sided test.

      for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
      {
         // Confidence value:
         cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
         // calculate df for single sided test:
         double df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i], alpha[i], Sd);
         // convert to sample size:
         double size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size;
         // calculate df for two sided test:
         df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i]/2, alpha[i], Sd);
         // convert to sample size:
         size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size << endl;
      }
      cout << endl;
   }

Let's now look at some sample output using data taken from
['P.K. Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

Only three measurements were made, and the Students-t test above
gave a borderline result, so this example
will show us how many samples would need to be collected:

[pre'''
_____________________________________________________________
Estimated sample sizes required for various confidence levels
_____________________________________________________________

True Mean                     =  38.90000
Sample Mean                   =  37.80000
Sample Standard Deviation     =  0.96437


_______________________________________________________________
Confidence       Estimated          Estimated
 Value (%)      Sample Size        Sample Size
              (one sided test)   (two sided test)
_______________________________________________________________
    75.000               3                  4
    90.000               7                  9
    95.000              11                 13
    99.000              20                 22
    99.900              35                 37
    99.990              50                 53
    99.999              66                 68
''']

So in this case, many more measurements would have had to be made;
for example at the 95% level, 13 measurements in total for a two-sided test.

[endsect]
[section:two_sample_students_t Comparing the means of two samples with the Students-t test]

Imagine that we have two samples, and we wish to determine whether
their means are different or not. This situation often arises when
determining whether a new process or treatment is better than an old one.

In this example, we'll be using the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3531.htm
Car Mileage sample data] from the
[@http://www.itl.nist.gov NIST website]. The data compares
miles per gallon of US cars with miles per gallon of Japanese cars.

The sample code is in
[@../../example/students_t_two_samples.cpp students_t_two_samples.cpp].

There are two ways in which this test can be conducted: we can assume
that the true standard deviations of the two samples are equal or not.
If the standard deviations are assumed to be equal, then the calculation
of the t-statistic is greatly simplified, so we'll examine that case first.
In real life we should verify whether this assumption is valid with an
F-test for equality of variances.

We begin by defining a procedure that will conduct our test assuming equal
variances:

   // Needed headers:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Simplify usage:
   using namespace boost::math;
   using namespace std;

   void two_samples_t_test_equal_sd(
         double Sm1,    // Sm1 = Sample 1 Mean.
         double Sd1,    // Sd1 = Sample 1 Standard Deviation.
         unsigned Sn1,  // Sn1 = Sample 1 Size.
         double Sm2,    // Sm2 = Sample 2 Mean.
         double Sd2,    // Sd2 = Sample 2 Standard Deviation.
         unsigned Sn2,  // Sn2 = Sample 2 Size.
         double alpha)  // alpha = Significance Level.
   {


Our procedure will begin by calculating the t-statistic; assuming
equal variances, the needed formulae are:

[equation dist_tutorial1]

where Sp is the "pooled" standard deviation of the two samples,
and /v/ is the number of degrees of freedom of the two combined
samples. We can now write the code to calculate the t-statistic:

      // Degrees of freedom:
      double v = Sn1 + Sn2 - 2;
      cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n";
      // Pooled variance:
      double sp = sqrt(((Sn1-1) * Sd1 * Sd1 + (Sn2-1) * Sd2 * Sd2) / v);
      cout << setw(55) << left << "Pooled Standard Deviation" << "= " << sp << "\n";
      // t-statistic:
      double t_stat = (Sm1 - Sm2) / (sp * sqrt(1.0 / Sn1 + 1.0 / Sn2));
      cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n";

The next step is to define our distribution object, and calculate the
complement of the probability:

      students_t dist(v);
      double q = cdf(complement(dist, fabs(t_stat)));
      cout << setw(55) << left << "Probability that difference is due to chance" << "= "
         << setprecision(3) << scientific << 2 * q << "\n\n";

Here we've used the absolute value of the t-statistic, because we initially
want to know simply whether there is a difference or not (a two-sided test).
However, we can also test whether the mean of the second sample is greater
or is less (one-sided test) than that of the first:
all the possible tests are summed up in the following table:

[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:

`cdf(complement(dist, fabs(t))) < alpha / 2`]]

[[The Alternative-hypothesis: there is a
*difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:

`cdf(complement(dist, fabs(t))) > alpha / 2`]]

[[The Alternative-hypothesis: Sample 1 Mean is *less* than
Sample 2 Mean.]
[Reject if CDF of t > significance level:

`cdf(dist, t) > alpha`]]

[[The Alternative-hypothesis: Sample 1 Mean is *greater* than
Sample 2 Mean.]

[Reject if complement of CDF of t > significance level:

`cdf(complement(dist, t)) > alpha`]]
]

[note
For a two-sided test we must compare against alpha / 2 and not alpha.]

Most of the rest of the sample program is pretty-printing, so we'll
skip over that, and take a look at the sample output for alpha=0.05
(a 95% probability level). For comparison the dataplot output
for the same data is in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
section 1.3.5.3] of the __handbook.

[pre'''
________________________________________________
Student t test for two samples (equal variances)
________________________________________________

Number of Observations (Sample 1)             =  249
Sample 1 Mean                                 =  20.145
Sample 1 Standard Deviation                   =  6.4147
Number of Observations (Sample 2)             =  79
Sample 2 Mean                                 =  30.481
Sample 2 Standard Deviation                   =  6.1077
Degrees of Freedom                            =  326
Pooled Standard Deviation                     =  6.3426
T Statistic                                   =  -12.621
Probability that difference is due to chance  =  5.273e-030

Results for Alternative Hypothesis and alpha  =  0.0500'''

Alternative Hypothesis              Conclusion
Sample 1 Mean != Sample 2 Mean      NOT REJECTED
Sample 1 Mean <  Sample 2 Mean      NOT REJECTED
Sample 1 Mean >  Sample 2 Mean      REJECTED
]

So with a probability that the difference is due to chance of just
5.273e-030, we can safely conclude that there is indeed a difference.

The tests on the alternative hypothesis show that we must
also reject the hypothesis that Sample 1 Mean is
greater than that for Sample 2: in this case Sample 1 represents the
miles per gallon for US cars, and Sample 2 the miles per gallon for
Japanese cars, so we conclude that Japanese cars are on average more
fuel efficient.

Now that we have the simple case out of the way, let's look for a moment
at the more complex one: that the standard deviations of the two samples
are not equal. In this case the formula for the t-statistic becomes:

[equation dist_tutorial2]

And for the combined degrees of freedom we use the
[@http://en.wikipedia.org/wiki/Welch-Satterthwaite_equation Welch-Satterthwaite]
approximation:

[equation dist_tutorial3]

Note that this is one of the rare situations where the degrees-of-freedom
parameter to the Student's t distribution is a real number, and not an
integer value.

[note
Some statistical packages truncate the effective degrees of freedom to
an integer value: this may be necessary if you are relying on lookup tables,
but since our code fully supports non-integer degrees of freedom there is no
need to truncate in this case. Also note that when the degrees of freedom
is small then the Welch-Satterthwaite approximation may be a significant
source of error.]
| 697 | |
| 698 | Putting these formulae into code we get: |
| 699 | |
   // Degrees of freedom, from the Welch-Satterthwaite approximation:
| 701 | double v = Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2; |
| 702 | v *= v; |
| 703 | double t1 = Sd1 * Sd1 / Sn1; |
| 704 | t1 *= t1; |
| 705 | t1 /= (Sn1 - 1); |
| 706 | double t2 = Sd2 * Sd2 / Sn2; |
| 707 | t2 *= t2; |
| 708 | t2 /= (Sn2 - 1); |
| 709 | v /= (t1 + t2); |
| 710 | cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n"; |
| 711 | // t-statistic: |
| 712 | double t_stat = (Sm1 - Sm2) / sqrt(Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2); |
| 713 | cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n"; |
| 714 | |
Thereafter the code and the tests proceed exactly as before. Using
our car mileage data again, here's what the output looks like:
| 717 | |
| 718 | [pre''' |
| 719 | __________________________________________________ |
| 720 | Student t test for two samples (unequal variances) |
| 721 | __________________________________________________ |
| 722 | |
| 723 | Number of Observations (Sample 1) = 249 |
| 724 | Sample 1 Mean = 20.145 |
| 725 | Sample 1 Standard Deviation = 6.4147 |
| 726 | Number of Observations (Sample 2) = 79 |
| 727 | Sample 2 Mean = 30.481 |
| 728 | Sample 2 Standard Deviation = 6.1077 |
| 729 | Degrees of Freedom = 136.87 |
| 730 | T Statistic = -12.946 |
| 731 | Probability that difference is due to chance = 1.571e-025 |
| 732 | |
| 733 | Results for Alternative Hypothesis and alpha = 0.0500''' |
| 734 | |
| 735 | Alternative Hypothesis Conclusion |
| 736 | Sample 1 Mean != Sample 2 Mean NOT REJECTED |
| 737 | Sample 1 Mean < Sample 2 Mean NOT REJECTED |
| 738 | Sample 1 Mean > Sample 2 Mean REJECTED |
| 739 | ] |
| 740 | |
| 741 | This time allowing the variances in the two samples to differ has yielded |
| 742 | a higher likelihood that the observed difference is down to chance alone |
| 743 | (1.571e-025 compared to 5.273e-030 when equal variances were assumed). |
| 744 | However, the conclusion remains the same: US cars are less fuel efficient |
| 745 | than Japanese models. |
| 746 | |
| 747 | [endsect] |
| 748 | [section:paired_st Comparing two paired samples with the Student's t distribution] |
| 749 | |
| 750 | Imagine that we have a before and after reading for each item in the sample: |
| 751 | for example we might have measured blood pressure before and after administration |
| 752 | of a new drug. We can't pool the results and compare the means before and after |
| 753 | the change, because each patient will have a different baseline reading. |
| 754 | Instead we calculate the difference between before and after measurements |
| 755 | in each patient, and calculate the mean and standard deviation of the differences. |
| 756 | To test whether a significant change has taken place, we can then test |
| 757 | the null-hypothesis that the true mean is zero using the same procedure |
| 758 | we used in the single sample cases previously discussed. |
| 759 | |
| 760 | That means we can: |
| 761 | |
| 762 | * [link math_toolkit.stat_tut.weg.st_eg.tut_mean_intervals Calculate confidence intervals of the mean]. |
| 763 | If the endpoints of the interval differ in sign then we are unable to reject |
| 764 | the null-hypothesis that there is no change. |
| 765 | * [link math_toolkit.stat_tut.weg.st_eg.tut_mean_test Test whether the true mean is zero]. If the |
| 766 | result is consistent with a true mean of zero, then we are unable to reject the |
| 767 | null-hypothesis that there is no change. |
| 768 | * [link math_toolkit.stat_tut.weg.st_eg.tut_mean_size Calculate how many pairs of readings we would need |
| 769 | in order to obtain a significant result]. |
| 770 | |
| 771 | [endsect] |
| 772 | |
| 773 | [endsect][/section:st_eg Student's t] |
| 774 | |
| 775 | [/ |
| 776 | Copyright 2006, 2012 John Maddock and Paul A. Bristow. |
| 777 | Distributed under the Boost Software License, Version 1.0. |
| 778 | (See accompanying file LICENSE_1_0.txt or copy at |
| 779 | http://www.boost.org/LICENSE_1_0.txt). |
| 780 | ] |
| 781 | |