Blame - doc/distributions/arcsine.qbk - RealtimeRoboticsGroup/test

blob: 8dc731e7d3140779ccbaa40a5262501b3070cae9 [file] [log] [blame]

Brian Silverman	32ed54e	2018-08-04 23:37:28 -0700	[diff] [blame^]	1	[section:arcine_dist Arcsine Distribution]
				2
				3	[import ../../example/arcsine_example.cpp] [/ for arcsine snips below]
				4
				5
				6	``#include <boost/math/distributions/arcsine.hpp>``
				7
				8	namespace boost{ namespace math{
				9
				10	template <class RealType = double,
				11	class ``__Policy`` = ``__policy_class`` >
				12	class arcsine_distribution;
				13
				14	typedef arcsine_distribution<double> arcsine; // double precision standard arcsine distribution [0,1].
				15
				16	template <class RealType, class ``__Policy``>
				17	class arcsine_distribution
				18	{
				19	public:
				20	typedef RealType value_type;
				21	typedef Policy policy_type;
				22
				23	// Constructor from two range parameters, x_min and x_max:
				24	arcsine_distribution(RealType x_min, RealType x_max);
				25
				26	// Range Parameter accessors:
				27	RealType x_min() const;
				28	RealType x_max() const;
				29	};
				30	}} // namespaces
				31
				32	The class type `arcsine_distribution` represents an
				33	[@http://en.wikipedia.org/wiki/arcsine_distribution arcsine]
				34	[@http://en.wikipedia.org/wiki/Probability_distribution probability distribution function].
				35	The arcsine distribution is named because its CDF uses the inverse sin[super -1] or arcsine.
				36
				37	This is implemented as a generalized version with support from ['x_min] to ['x_max]
				38	providing the 'standard arcsine distribution' as default with ['x_min = 0] and ['x_max = 1].
				39	(A few make other choices for 'standard').
				40
				41	The arcsine distribution is generalized to include any bounded support ['a <= x <= b] by
				42	[@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram] and
				43	[@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia],
				44	but also using ['location] and ['scale] parameters by
				45	[@http://www.math.uah.edu/stat/index.html Virtual Laboratories in Probability and Statistics]
				46	[@http://www.math.uah.edu/stat/special/Arcsine.html Arcsine distribution].
				47	The end-point version is simpler and more obvious, so we implement that.
				48	If desired, [@http://en.wikipedia.org/wiki/arcsine_distribution this]
				49	outlines how the __beta_distrib can be used to add a shape factor.
				50
				51	The [@http://en.wikipedia.org/wiki/Probability_density_function probability density function PDF]
				52	for the [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine distribution]
				53	defined on the interval \[['x_min, x_max]\] is given by:
				54
				55	[figspace] [figspace] f(x; x_min, x_max) = 1 /([pi][sdot][sqrt]((x - x_min)[sdot](x_max - x_min))
				56
				57	For example, __WolframAlpha arcsine distribution, from input of
				58
				59	N[PDF[arcsinedistribution[0, 1], 0.5], 50]
				60
				61	computes the PDF value
				62
				63	0.63661977236758134307553505349005744813783858296183
				64
				65	The Probability Density Functions (PDF) of generalized arcsine distributions are symmetric U-shaped curves,
				66	centered on ['(x_max - x_min)/2],
				67	highest (infinite) near the two extrema, and quite flat over the central region.
				68
				69	If random variate ['x] is ['x_min] or ['x_max], then the PDF is infinity.
				70	If random variate ['x] is ['x_min] then the CDF is zero.
				71	If random variate ['x] is ['x_max] then the CDF is unity.
				72
				73	The 'Standard' (0, 1) arcsine distribution is shown in blue
				74	and some generalized examples with other ['x] ranges.
				75
				76	[graph arcsine_pdf]
				77
				78	The Cumulative Distribution Function CDF is defined as
				79
				80	[figspace] [figspace] F(x) = 2[sdot]arcsin([sqrt]((x-x_min)/(x_max - x))) / [pi]
				81
				82	[graph arcsine_cdf]
				83
				84	[h5 Constructor]
				85
				86	arcsine_distribution(RealType x_min, RealType x_max);
				87
				88	constructs an arcsine distribution with range parameters ['x_min] and ['x_max].
				89
				90	Requires ['x_min < x_max], otherwise __domain_error is called.
				91
				92	For example:
				93
				94	arcsine_distribution<> myarcsine(-2, 4);
				95
				96	constructs an arcsine distribution with ['x_min = -2] and ['x_max = 4].
				97
				98	Default values of ['x_min = 0] and ['x_max = 1] and a ` typedef arcsine_distribution<double> arcsine;` mean that
				99
				100	arcsine as;
				101
				102	constructs a 'Standard 01' arcsine distribution.
				103
				104	[h5 Parameter Accessors]
				105
				106	RealType x_min() const;
				107	RealType x_max() const;
				108
				109	Return the parameter ['x_min] or ['x_max] from which this distribution was constructed.
				110
				111	So, for example:
				112
				113	[arcsine_snip_8]
				114
				115	[h4 Non-member Accessor Functions]
				116
				117	All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions]
				118	that are generic to all distributions are supported: __usual_accessors.
				119
				120	The formulae for calculating these are shown in the table below, and at
				121	[@http://mathworld.wolfram.com/arcsineDistribution.html Wolfram Mathworld].
				122
				123	[note There are always [two] values for the [mode], at ['x_min] and at ['x_max], default 0 and 1,
				124	so instead we raise the exception __domain_error.
				125	At these extrema, the PDFs are infinite, and the CDFs zero or unity.]
				126
				127	[h4 Applications]
				128
				129	The arcsine distribution is useful to describe
				130	[@http://en.wikipedia.org/wiki/Random_walk Random walks], (including drunken walks)
				131	[@http://en.wikipedia.org/wiki/Brownian_motion Brownian motion],
				132	[@http://en.wikipedia.org/wiki/Wiener_process Weiner processes],
				133	[@http://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trials],
				134	and their appplication to solve stock market and other
				135	[@http://en.wikipedia.org/wiki/Gambler%27s_ruin ruinous gambling games].
				136
				137	The random variate ['x] is constrained to ['x_min] and ['x_max], (for our 'standard' distribution, 0 and 1),
				138	and is usually some fraction. For any other ['x_min] and ['x_max] a fraction can be obtained from ['x] using
				139
				140	[sixemspace] fraction = (x - x_min) / (x_max - x_min)
				141
				142	The simplest example is tossing heads and tails with a fair coin and modelling the risk of losing, or winning.
				143	Walkers (molecules, drunks...) moving left or right of a centre line are another common example.
				144
				145	The random variate ['x] is the fraction of time spent on the 'winning' side.
				146	If half the time is spent on the 'winning' side (and so the other half on the 'losing' side) then ['x = 1/2].
				147
				148	For large numbers of tosses, this is modelled by the (standard \[0,1\]) arcsine distribution,
				149	and the PDF can be calculated thus:
				150
				151	[arcsine_snip_2]
				152
				153	From the plot of PDF, it is clear that ['x] = [frac12] is the [*minimum] of the curve,
				154	so this is the [*least likely] scenario.
				155	(This is highly counter-intuitive, considering that fair tosses must [*eventually] become equal.
				156	It turns out that ['eventually] is not just very long, but [*infinite]!).
				157
				158	The [*most likely] scenarios are towards the extrema where ['x] = 0 or ['x] = 1.
				159
				160	If fraction of time on the left is a [frac14],
				161	it is only slightly more likely because the curve is quite flat bottomed.
				162
				163	[arcsine_snip_3]
				164
				165	If we consider fair coin-tossing games being played for 100 days
				166	(hypothetically continuously to be 'at-limit')
				167	the person winning after day 5 will not change in fraction 0.144 of the cases.
				168
				169	We can easily compute this setting ['x] = 5./100 = 0.05
				170
				171	[arcsine_snip_4]
				172
				173	Similarly, we can compute from a fraction of 0.05 /2 = 0.025
				174	(halved because we are considering both winners and losers)
				175	corresponding to 1 - 0.025 or 97.5% of the gamblers, (walkers, particles...) on the [*same side] of the origin
				176
				177	[arcsine_snip_5]
				178
				179	(use of the complement gives a bit more clarity,
				180	and avoids potential loss of accuracy when ['x] is close to unity, see __why_complements).
				181
				182	[arcsine_snip_6]
				183
				184	or we can reverse the calculation by assuming a fraction of time on one side, say fraction 0.2,
				185
				186	[arcsine_snip_7]
				187
				188	[*Summary]: Every time we toss, the odds are equal,
				189	so on average we have the same change of winning and losing.
				190
				191	But this is [not true] for an an individual game where one will be [mostly in a bad or good patch].
				192
				193	This is quite counter-intuitive to most people, but the mathematics is clear,
				194	and gamblers continue to provide proof.
				195
				196	[*Moral]: if you in a losing patch, leave the game.
				197	(Because the odds to recover to a good patch are poor).
				198
				199	[*Corollary]: Quit while you are ahead?
				200
				201	A working example is at [@../../example/arcsine_example.cpp arcsine_example.cpp]
				202	including sample output .
				203
				204	[h4 Related distributions]
				205
				206	The arcsine distribution with ['x_min = 0] and ['x_max = 1] is special case of the
				207	__beta_distrib with [alpha] = 1/2 and [beta] = 1/2.
				208
				209	[h4 Accuracy]
				210
				211	This distribution is implemented using sqrt, sine, cos and arc sine and cos trigonometric functions
				212	which are normally accurate to a few __epsilon.
				213	But all values suffer from [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance or cancellation error]
				214	for values of ['x] close to ['x_max].
				215	For example, for a standard [0, 1] arcsine distribution ['as], the pdf is symmetric about random variate ['x = 0.5]
				216	so that one would expect `pdf(as, 0.01) == pdf(as, 0.99)`. But as ['x] nears unity, there is increasing
				217	[@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance].
				218	To counteract this, the complement versions of CDF and quantile
				219	are implemented with alternative expressions using ['cos[super -1]] instead of ['sin[super -1]].
				220	Users should see __why_complements for guidance on when to avoid loss of accuracy by using complements.
				221
				222	[h4 Testing]
				223	The results were tested against a few accurate spot values computed by __WolframAlpha, for example:
				224
				225	N[PDF[arcsinedistribution[0, 1], 0.5], 50]
				226	0.63661977236758134307553505349005744813783858296183
				227
				228	[h4 Implementation]
				229
				230	In the following table ['a] and ['b] are the parameters ['x_min][space] and ['x_max],
				231	['x] is the random variable, ['p] is the probability and its complement ['q = 1-p].
				232
				233	[table
				234	[[Function][Implementation Notes]]
				235	[[support] [x [isin] \[a, b\], default x [isin] \[0, 1\] ]]
				236	[[pdf] [f(x; a, b) = 1/([pi][sdot][sqrt](x - a)[sdot](b - x))]]
				237	[[cdf] [F(x) = 2/[pi][sdot]sin[super-1]([sqrt](x - a) / (b - a) ) ]]
				238	[[cdf of complement] [2/([pi][sdot]cos[super-1]([sqrt](x - a) / (b - a)))]]
				239	[[quantile] [-a[sdot]sin[super 2]([frac12][pi][sdot]p) + a + b[sdot]sin[super 2]([frac12][pi][sdot]p)]]
				240	[[quantile from the complement] [-a[sdot]cos[super 2]([frac12][pi][sdot]p) + a + b[sdot]cos[super 2]([frac12][pi][sdot]q)]]
				241	[[mean] [[frac12](a+b)]]
				242	[[median] [[frac12](a+b)]]
				243	[[mode] [ x [isin] \[a, b\], so raises domain_error (returning NaN).]]
				244	[[variance] [(b - a)[super 2] / 8]]
				245	[[skewness] [0]]
				246	[[kurtosis excess] [ -3/2 ]]
				247	[[kurtosis] [kurtosis_excess + 3]]
				248	]
				249
				250	The quantile was calculated using an expression obtained by using __WolframAlpha
				251	to invert the formula for the CDF thus
				252
				253	solve [p - 2/pi sin^-1(sqrt((x-a)/(b-a))) = 0, x]
				254
				255	which was interpreted as
				256
				257	Solve[p - (2 ArcSin[Sqrt[(-a + x)/(-a + b)]])/Pi == 0, x, MaxExtraConditions -> Automatic]
				258
				259	and produced the resulting expression
				260
				261	x = -a sin^2((pi p)/2)+a+b sin^2((pi p)/2)
				262
				263	Thanks to Wolfram for providing this facility.
				264
				265	[h4 References]
				266
				267	* [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia arcsine distribution]
				268	* [@http://en.wikipedia.org/wiki/Beta_distribution Wikipedia Beta distribution]
				269	* [@http://mathworld.wolfram.com/BetaDistribution.html Wolfram MathWorld]
				270	* [@http://www.wolframalpha.com/ Wolfram Alpha]
				271
				272	[h4 Sources]
				273
				274	*[@http://estebanmoro.org/2009/04/the-probability-of-going-through-a-bad-patch The probability of going through a bad patch] Esteban Moro's Blog.
				275	*[@http://www.gotohaggstrom.com/What%20do%20schmucks%20and%20the%20arc%20sine%20law%20have%20in%20common.pdf What soschumcks and the arc sine have in common] Peter Haggstrom.
				276	*[@http://www.math.uah.edu/stat/special/Arcsine.html arcsine distribution].
				277	*[@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram reference arcsine examples].
				278	*[@http://www.math.harvard.edu/library/sternberg/slides/1180908.pdf Shlomo Sternberg slides].
				279
				280
				281	[endsect] [/section:arcsine_dist arcsine]
				282
				283	[/ arcsine.qbk
				284	Copyright 2014 John Maddock and Paul A. Bristow.
				285	Distributed under the Boost Software License, Version 1.0.
				286	(See accompanying file LICENSE_1_0.txt or copy at
				287	http://www.boost.org/LICENSE_1_0.txt).
				288	]