Brian Silverman | 32ed54e | 2018-08-04 23:37:28 -0700 | [diff] [blame^] | 1 | [section:arcine_dist Arcsine Distribution] |
| 2 | |
| 3 | [import ../../example/arcsine_example.cpp] [/ for arcsine snips below] |
| 4 | |
| 5 | |
| 6 | ``#include <boost/math/distributions/arcsine.hpp>`` |
| 7 | |
| 8 | namespace boost{ namespace math{ |
| 9 | |
| 10 | template <class RealType = double, |
| 11 | class ``__Policy`` = ``__policy_class`` > |
| 12 | class arcsine_distribution; |
| 13 | |
| 14 | typedef arcsine_distribution<double> arcsine; // double precision standard arcsine distribution [0,1]. |
| 15 | |
| 16 | template <class RealType, class ``__Policy``> |
| 17 | class arcsine_distribution |
| 18 | { |
| 19 | public: |
| 20 | typedef RealType value_type; |
| 21 | typedef Policy policy_type; |
| 22 | |
| 23 | // Constructor from two range parameters, x_min and x_max: |
| 24 | arcsine_distribution(RealType x_min, RealType x_max); |
| 25 | |
| 26 | // Range Parameter accessors: |
| 27 | RealType x_min() const; |
| 28 | RealType x_max() const; |
| 29 | }; |
| 30 | }} // namespaces |
| 31 | |
| 32 | The class type `arcsine_distribution` represents an |
| 33 | [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine] |
| 34 | [@http://en.wikipedia.org/wiki/Probability_distribution probability distribution function]. |
| 35 | The arcsine distribution is named because its CDF uses the inverse sin[super -1] or arcsine. |
| 36 | |
| 37 | This is implemented as a generalized version with support from ['x_min] to ['x_max] |
| 38 | providing the 'standard arcsine distribution' as default with ['x_min = 0] and ['x_max = 1]. |
| 39 | (A few make other choices for 'standard'). |
| 40 | |
| 41 | The arcsine distribution is generalized to include any bounded support ['a <= x <= b] by |
| 42 | [@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram] and |
| 43 | [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia], |
| 44 | but also using ['location] and ['scale] parameters by |
| 45 | [@http://www.math.uah.edu/stat/index.html Virtual Laboratories in Probability and Statistics] |
| 46 | [@http://www.math.uah.edu/stat/special/Arcsine.html Arcsine distribution]. |
| 47 | The end-point version is simpler and more obvious, so we implement that. |
| 48 | If desired, [@http://en.wikipedia.org/wiki/arcsine_distribution this] |
| 49 | outlines how the __beta_distrib can be used to add a shape factor. |
| 50 | |
| 51 | The [@http://en.wikipedia.org/wiki/Probability_density_function probability density function PDF] |
| 52 | for the [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine distribution] |
| 53 | defined on the interval \[['x_min, x_max]\] is given by: |
| 54 | |
| 55 | [figspace] [figspace] f(x; x_min, x_max) = 1 /([pi][sdot][sqrt]((x - x_min)[sdot](x_max - x_min)) |
| 56 | |
| 57 | For example, __WolframAlpha arcsine distribution, from input of |
| 58 | |
| 59 | N[PDF[arcsinedistribution[0, 1], 0.5], 50] |
| 60 | |
| 61 | computes the PDF value |
| 62 | |
| 63 | 0.63661977236758134307553505349005744813783858296183 |
| 64 | |
| 65 | The Probability Density Functions (PDF) of generalized arcsine distributions are symmetric U-shaped curves, |
| 66 | centered on ['(x_max - x_min)/2], |
| 67 | highest (infinite) near the two extrema, and quite flat over the central region. |
| 68 | |
| 69 | If random variate ['x] is ['x_min] or ['x_max], then the PDF is infinity. |
| 70 | If random variate ['x] is ['x_min] then the CDF is zero. |
| 71 | If random variate ['x] is ['x_max] then the CDF is unity. |
| 72 | |
| 73 | The 'Standard' (0, 1) arcsine distribution is shown in blue |
| 74 | and some generalized examples with other ['x] ranges. |
| 75 | |
| 76 | [graph arcsine_pdf] |
| 77 | |
| 78 | The Cumulative Distribution Function CDF is defined as |
| 79 | |
| 80 | [figspace] [figspace] F(x) = 2[sdot]arcsin([sqrt]((x-x_min)/(x_max - x))) / [pi] |
| 81 | |
| 82 | [graph arcsine_cdf] |
| 83 | |
| 84 | [h5 Constructor] |
| 85 | |
| 86 | arcsine_distribution(RealType x_min, RealType x_max); |
| 87 | |
| 88 | constructs an arcsine distribution with range parameters ['x_min] and ['x_max]. |
| 89 | |
| 90 | Requires ['x_min < x_max], otherwise __domain_error is called. |
| 91 | |
| 92 | For example: |
| 93 | |
| 94 | arcsine_distribution<> myarcsine(-2, 4); |
| 95 | |
| 96 | constructs an arcsine distribution with ['x_min = -2] and ['x_max = 4]. |
| 97 | |
| 98 | Default values of ['x_min = 0] and ['x_max = 1] and a ` typedef arcsine_distribution<double> arcsine;` mean that |
| 99 | |
| 100 | arcsine as; |
| 101 | |
| 102 | constructs a 'Standard 01' arcsine distribution. |
| 103 | |
| 104 | [h5 Parameter Accessors] |
| 105 | |
| 106 | RealType x_min() const; |
| 107 | RealType x_max() const; |
| 108 | |
| 109 | Return the parameter ['x_min] or ['x_max] from which this distribution was constructed. |
| 110 | |
| 111 | So, for example: |
| 112 | |
| 113 | [arcsine_snip_8] |
| 114 | |
| 115 | [h4 Non-member Accessor Functions] |
| 116 | |
| 117 | All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions] |
| 118 | that are generic to all distributions are supported: __usual_accessors. |
| 119 | |
| 120 | The formulae for calculating these are shown in the table below, and at |
| 121 | [@http://mathworld.wolfram.com/arcsineDistribution.html Wolfram Mathworld]. |
| 122 | |
| 123 | [note There are always [*two] values for the [*mode], at ['x_min] and at ['x_max], default 0 and 1, |
| 124 | so instead we raise the exception __domain_error. |
| 125 | At these extrema, the PDFs are infinite, and the CDFs zero or unity.] |
| 126 | |
| 127 | [h4 Applications] |
| 128 | |
| 129 | The arcsine distribution is useful to describe |
| 130 | [@http://en.wikipedia.org/wiki/Random_walk Random walks], (including drunken walks) |
| 131 | [@http://en.wikipedia.org/wiki/Brownian_motion Brownian motion], |
| 132 | [@http://en.wikipedia.org/wiki/Wiener_process Weiner processes], |
| 133 | [@http://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trials], |
| 134 | and their appplication to solve stock market and other |
| 135 | [@http://en.wikipedia.org/wiki/Gambler%27s_ruin ruinous gambling games]. |
| 136 | |
| 137 | The random variate ['x] is constrained to ['x_min] and ['x_max], (for our 'standard' distribution, 0 and 1), |
| 138 | and is usually some fraction. For any other ['x_min] and ['x_max] a fraction can be obtained from ['x] using |
| 139 | |
| 140 | [sixemspace] fraction = (x - x_min) / (x_max - x_min) |
| 141 | |
| 142 | The simplest example is tossing heads and tails with a fair coin and modelling the risk of losing, or winning. |
| 143 | Walkers (molecules, drunks...) moving left or right of a centre line are another common example. |
| 144 | |
| 145 | The random variate ['x] is the fraction of time spent on the 'winning' side. |
| 146 | If half the time is spent on the 'winning' side (and so the other half on the 'losing' side) then ['x = 1/2]. |
| 147 | |
| 148 | For large numbers of tosses, this is modelled by the (standard \[0,1\]) arcsine distribution, |
| 149 | and the PDF can be calculated thus: |
| 150 | |
| 151 | [arcsine_snip_2] |
| 152 | |
| 153 | From the plot of PDF, it is clear that ['x] = [frac12] is the [*minimum] of the curve, |
| 154 | so this is the [*least likely] scenario. |
| 155 | (This is highly counter-intuitive, considering that fair tosses must [*eventually] become equal. |
| 156 | It turns out that ['eventually] is not just very long, but [*infinite]!). |
| 157 | |
| 158 | The [*most likely] scenarios are towards the extrema where ['x] = 0 or ['x] = 1. |
| 159 | |
| 160 | If fraction of time on the left is a [frac14], |
| 161 | it is only slightly more likely because the curve is quite flat bottomed. |
| 162 | |
| 163 | [arcsine_snip_3] |
| 164 | |
| 165 | If we consider fair coin-tossing games being played for 100 days |
| 166 | (hypothetically continuously to be 'at-limit') |
| 167 | the person winning after day 5 will not change in fraction 0.144 of the cases. |
| 168 | |
| 169 | We can easily compute this setting ['x] = 5./100 = 0.05 |
| 170 | |
| 171 | [arcsine_snip_4] |
| 172 | |
| 173 | Similarly, we can compute from a fraction of 0.05 /2 = 0.025 |
| 174 | (halved because we are considering both winners and losers) |
| 175 | corresponding to 1 - 0.025 or 97.5% of the gamblers, (walkers, particles...) on the [*same side] of the origin |
| 176 | |
| 177 | [arcsine_snip_5] |
| 178 | |
| 179 | (use of the complement gives a bit more clarity, |
| 180 | and avoids potential loss of accuracy when ['x] is close to unity, see __why_complements). |
| 181 | |
| 182 | [arcsine_snip_6] |
| 183 | |
| 184 | or we can reverse the calculation by assuming a fraction of time on one side, say fraction 0.2, |
| 185 | |
| 186 | [arcsine_snip_7] |
| 187 | |
| 188 | [*Summary]: Every time we toss, the odds are equal, |
| 189 | so on average we have the same change of winning and losing. |
| 190 | |
| 191 | But this is [*not true] for an an individual game where one will be [*mostly in a bad or good patch]. |
| 192 | |
| 193 | This is quite counter-intuitive to most people, but the mathematics is clear, |
| 194 | and gamblers continue to provide proof. |
| 195 | |
| 196 | [*Moral]: if you in a losing patch, leave the game. |
| 197 | (Because the odds to recover to a good patch are poor). |
| 198 | |
| 199 | [*Corollary]: Quit while you are ahead? |
| 200 | |
| 201 | A working example is at [@../../example/arcsine_example.cpp arcsine_example.cpp] |
| 202 | including sample output . |
| 203 | |
| 204 | [h4 Related distributions] |
| 205 | |
| 206 | The arcsine distribution with ['x_min = 0] and ['x_max = 1] is special case of the |
| 207 | __beta_distrib with [alpha] = 1/2 and [beta] = 1/2. |
| 208 | |
| 209 | [h4 Accuracy] |
| 210 | |
| 211 | This distribution is implemented using sqrt, sine, cos and arc sine and cos trigonometric functions |
| 212 | which are normally accurate to a few __epsilon. |
| 213 | But all values suffer from [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance or cancellation error] |
| 214 | for values of ['x] close to ['x_max]. |
| 215 | For example, for a standard [0, 1] arcsine distribution ['as], the pdf is symmetric about random variate ['x = 0.5] |
| 216 | so that one would expect `pdf(as, 0.01) == pdf(as, 0.99)`. But as ['x] nears unity, there is increasing |
| 217 | [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance]. |
| 218 | To counteract this, the complement versions of CDF and quantile |
| 219 | are implemented with alternative expressions using ['cos[super -1]] instead of ['sin[super -1]]. |
| 220 | Users should see __why_complements for guidance on when to avoid loss of accuracy by using complements. |
| 221 | |
| 222 | [h4 Testing] |
| 223 | The results were tested against a few accurate spot values computed by __WolframAlpha, for example: |
| 224 | |
| 225 | N[PDF[arcsinedistribution[0, 1], 0.5], 50] |
| 226 | 0.63661977236758134307553505349005744813783858296183 |
| 227 | |
| 228 | [h4 Implementation] |
| 229 | |
| 230 | In the following table ['a] and ['b] are the parameters ['x_min][space] and ['x_max], |
| 231 | ['x] is the random variable, ['p] is the probability and its complement ['q = 1-p]. |
| 232 | |
| 233 | [table |
| 234 | [[Function][Implementation Notes]] |
| 235 | [[support] [x [isin] \[a, b\], default x [isin] \[0, 1\] ]] |
| 236 | [[pdf] [f(x; a, b) = 1/([pi][sdot][sqrt](x - a)[sdot](b - x))]] |
| 237 | [[cdf] [F(x) = 2/[pi][sdot]sin[super-1]([sqrt](x - a) / (b - a) ) ]] |
| 238 | [[cdf of complement] [2/([pi][sdot]cos[super-1]([sqrt](x - a) / (b - a)))]] |
| 239 | [[quantile] [-a[sdot]sin[super 2]([frac12][pi][sdot]p) + a + b[sdot]sin[super 2]([frac12][pi][sdot]p)]] |
| 240 | [[quantile from the complement] [-a[sdot]cos[super 2]([frac12][pi][sdot]p) + a + b[sdot]cos[super 2]([frac12][pi][sdot]q)]] |
| 241 | [[mean] [[frac12](a+b)]] |
| 242 | [[median] [[frac12](a+b)]] |
| 243 | [[mode] [ x [isin] \[a, b\], so raises domain_error (returning NaN).]] |
| 244 | [[variance] [(b - a)[super 2] / 8]] |
| 245 | [[skewness] [0]] |
| 246 | [[kurtosis excess] [ -3/2 ]] |
| 247 | [[kurtosis] [kurtosis_excess + 3]] |
| 248 | ] |
| 249 | |
| 250 | The quantile was calculated using an expression obtained by using __WolframAlpha |
| 251 | to invert the formula for the CDF thus |
| 252 | |
| 253 | solve [p - 2/pi sin^-1(sqrt((x-a)/(b-a))) = 0, x] |
| 254 | |
| 255 | which was interpreted as |
| 256 | |
| 257 | Solve[p - (2 ArcSin[Sqrt[(-a + x)/(-a + b)]])/Pi == 0, x, MaxExtraConditions -> Automatic] |
| 258 | |
| 259 | and produced the resulting expression |
| 260 | |
| 261 | x = -a sin^2((pi p)/2)+a+b sin^2((pi p)/2) |
| 262 | |
| 263 | Thanks to Wolfram for providing this facility. |
| 264 | |
| 265 | [h4 References] |
| 266 | |
| 267 | * [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia arcsine distribution] |
| 268 | * [@http://en.wikipedia.org/wiki/Beta_distribution Wikipedia Beta distribution] |
| 269 | * [@http://mathworld.wolfram.com/BetaDistribution.html Wolfram MathWorld] |
| 270 | * [@http://www.wolframalpha.com/ Wolfram Alpha] |
| 271 | |
| 272 | [h4 Sources] |
| 273 | |
| 274 | *[@http://estebanmoro.org/2009/04/the-probability-of-going-through-a-bad-patch The probability of going through a bad patch] Esteban Moro's Blog. |
| 275 | *[@http://www.gotohaggstrom.com/What%20do%20schmucks%20and%20the%20arc%20sine%20law%20have%20in%20common.pdf What soschumcks and the arc sine have in common] Peter Haggstrom. |
| 276 | *[@http://www.math.uah.edu/stat/special/Arcsine.html arcsine distribution]. |
| 277 | *[@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram reference arcsine examples]. |
| 278 | *[@http://www.math.harvard.edu/library/sternberg/slides/1180908.pdf Shlomo Sternberg slides]. |
| 279 | |
| 280 | |
| 281 | [endsect] [/section:arcsine_dist arcsine] |
| 282 | |
| 283 | [/ arcsine.qbk |
| 284 | Copyright 2014 John Maddock and Paul A. Bristow. |
| 285 | Distributed under the Boost Software License, Version 1.0. |
| 286 | (See accompanying file LICENSE_1_0.txt or copy at |
| 287 | http://www.boost.org/LICENSE_1_0.txt). |
| 288 | ] |