Brian Silverman | 32ed54e | 2018-08-04 23:37:28 -0700 | [diff] [blame^] | 1 | [section:nc_chi_squared_dist Noncentral Chi-Squared Distribution] |
| 2 | |
| 3 | ``#include <boost/math/distributions/non_central_chi_squared.hpp>`` |
| 4 | |
| 5 | namespace boost{ namespace math{ |
| 6 | |
| 7 | template <class RealType = double, |
| 8 | class ``__Policy`` = ``__policy_class`` > |
| 9 | class non_central_chi_squared_distribution; |
| 10 | |
| 11 | typedef non_central_chi_squared_distribution<> non_central_chi_squared; |
| 12 | |
| 13 | template <class RealType, class ``__Policy``> |
| 14 | class non_central_chi_squared_distribution |
| 15 | { |
| 16 | public: |
| 17 | typedef RealType value_type; |
| 18 | typedef Policy policy_type; |
| 19 | |
| 20 | // Constructor: |
| 21 | non_central_chi_squared_distribution(RealType v, RealType lambda); |
| 22 | |
| 23 | // Accessor to degrees of freedom parameter v: |
| 24 | RealType degrees_of_freedom()const; |
| 25 | |
| 26 | // Accessor to non centrality parameter lambda: |
| 27 | RealType non_centrality()const; |
| 28 | |
| 29 | // Parameter finders: |
| 30 | static RealType find_degrees_of_freedom(RealType lambda, RealType x, RealType p); |
| 31 | template <class A, class B, class C> |
| 32 | static RealType find_degrees_of_freedom(const complemented3_type<A,B,C>& c); |
| 33 | |
| 34 | static RealType find_non_centrality(RealType v, RealType x, RealType p); |
| 35 | template <class A, class B, class C> |
| 36 | static RealType find_non_centrality(const complemented3_type<A,B,C>& c); |
| 37 | }; |
| 38 | |
| 39 | }} // namespaces |
| 40 | |
| 41 | The noncentral chi-squared distribution is a generalization of the |
| 42 | __chi_squared_distrib. If X[sub i] are [nu] independent, normally |
| 43 | distributed random variables with means [mu][sub i] and variances |
| 44 | [sigma][sub i][super 2], then the random variable |
| 45 | |
| 46 | [equation nc_chi_squ_ref1] |
| 47 | |
| 48 | is distributed according to the noncentral chi-squared distribution. |
| 49 | |
| 50 | The noncentral chi-squared distribution has two parameters: |
| 51 | [nu] which specifies the number of degrees of freedom |
| 52 | (i.e. the number of X[sub i]), and [lambda] which is related to the |
| 53 | mean of the random variables X[sub i] by: |
| 54 | |
| 55 | [equation nc_chi_squ_ref2] |
| 56 | |
| 57 | (Note that some references define [lambda] as one half of the above sum). |
| 58 | |
| 59 | This leads to a PDF of: |
| 60 | |
| 61 | [equation nc_chi_squ_ref3] |
| 62 | |
| 63 | where ['f(x;k)] is the central chi-squared distribution PDF, and |
| 64 | ['I[sub v](x)] is a modified Bessel function of the first kind. |
| 65 | |
| 66 | The following graph illustrates how the distribution changes |
| 67 | for different values of [lambda]: |
| 68 | |
| 69 | [graph nccs_pdf] |
| 70 | |
| 71 | [h4 Member Functions] |
| 72 | |
| 73 | non_central_chi_squared_distribution(RealType v, RealType lambda); |
| 74 | |
| 75 | Constructs a Chi-Squared distribution with /v/ degrees of freedom |
| 76 | and non-centrality parameter /lambda/. |
| 77 | |
| 78 | Requires v > 0 and lambda >= 0, otherwise calls __domain_error. |
| 79 | |
| 80 | RealType degrees_of_freedom()const; |
| 81 | |
| 82 | Returns the parameter /v/ from which this object was constructed. |
| 83 | |
| 84 | RealType non_centrality()const; |
| 85 | |
| 86 | Returns the parameter /lambda/ from which this object was constructed. |
| 87 | |
| 88 | |
| 89 | static RealType find_degrees_of_freedom(RealType lambda, RealType x, RealType p); |
| 90 | |
| 91 | This function returns the number of degrees of freedom /v/ such that: |
| 92 | `cdf(non_central_chi_squared<RealType, Policy>(v, lambda), x) == p` |
| 93 | |
| 94 | template <class A, class B, class C> |
| 95 | static RealType find_degrees_of_freedom(const complemented3_type<A,B,C>& c); |
| 96 | |
| 97 | When called with argument `boost::math::complement(lambda, x, q)` |
| 98 | this function returns the number of degrees of freedom /v/ such that: |
| 99 | |
| 100 | `cdf(complement(non_central_chi_squared<RealType, Policy>(v, lambda), x)) == q`. |
| 101 | |
| 102 | static RealType find_non_centrality(RealType v, RealType x, RealType p); |
| 103 | |
| 104 | This function returns the non centrality parameter /lambda/ such that: |
| 105 | |
| 106 | `cdf(non_central_chi_squared<RealType, Policy>(v, lambda), x) == p` |
| 107 | |
| 108 | template <class A, class B, class C> |
| 109 | static RealType find_non_centrality(const complemented3_type<A,B,C>& c); |
| 110 | |
| 111 | When called with argument `boost::math::complement(v, x, q)` |
| 112 | this function returns the non centrality parameter /lambda/ such that: |
| 113 | |
| 114 | `cdf(complement(non_central_chi_squared<RealType, Policy>(v, lambda), x)) == q`. |
| 115 | |
| 116 | [h4 Non-member Accessors] |
| 117 | |
| 118 | All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions] |
| 119 | that are generic to all distributions are supported: __usual_accessors. |
| 120 | |
| 121 | The domain of the random variable is \[0, +[infin]\]. |
| 122 | |
| 123 | [h4 Examples] |
| 124 | |
| 125 | There is a |
| 126 | [link math_toolkit.stat_tut.weg.nccs_eg worked example] |
| 127 | for the noncentral chi-squared distribution. |
| 128 | |
| 129 | [h4 Accuracy] |
| 130 | |
| 131 | The following table shows the peak errors |
| 132 | (in units of [@http://en.wikipedia.org/wiki/Machine_epsilon epsilon]) |
| 133 | found on various platforms with various floating point types. |
| 134 | The failures in the comparison to the [@http://www.r-project.org/ R Math library], |
| 135 | seem to be mostly in the corner cases when the probablity would be very small. |
| 136 | Unless otherwise specified any floating-point type that is narrower |
| 137 | than the one shown will have __zero_error. |
| 138 | |
| 139 | [table_non_central_chi_squared_CDF] |
| 140 | |
| 141 | [table_non_central_chi_squared_CDF_complement] |
| 142 | |
| 143 | Error rates for the quantile |
| 144 | functions are broadly similar. Special mention should go to |
| 145 | the `mode` function: there is no closed form for this function, |
| 146 | so it is evaluated numerically by finding the maxima of the PDF: |
| 147 | in principal this can not produce an accuracy greater than the |
| 148 | square root of the machine epsilon. |
| 149 | |
| 150 | [h4 Tests] |
| 151 | |
| 152 | There are two sets of test data used to verify this implementation: |
| 153 | firstly we can compare with published data, for example with |
| 154 | Table 6 of "Self-Validating Computations of Probabilities for |
| 155 | Selected Central and Noncentral Univariate Probability Functions", |
| 156 | Morgan C. Wang and William J. Kennedy, |
| 157 | Journal of the American Statistical Association, |
| 158 | Vol. 89, No. 427. (Sep., 1994), pp. 878-887. |
| 159 | Secondly, we have tables of test data, computed with this |
| 160 | implementation and using interval arithmetic - this data should |
| 161 | be accurate to at least 50 decimal digits - and is the used for |
| 162 | our accuracy tests. |
| 163 | |
| 164 | [h4 Implementation] |
| 165 | |
| 166 | The CDF and its complement are evaluated as follows: |
| 167 | |
| 168 | First we determine which of the two values (the CDF or its |
| 169 | complement) is likely to be the smaller: for this we can use the |
| 170 | relation due to Temme (see "Asymptotic and Numerical Aspects of the |
| 171 | Noncentral Chi-Square Distribution", N. M. Temme, Computers Math. Applic. |
| 172 | Vol 25, No. 5, 55-63, 1993) that: |
| 173 | |
| 174 | F([nu],[lambda];[nu]+[lambda]) [asymp] 0.5 |
| 175 | |
| 176 | and so compute the CDF when the random variable is less than |
| 177 | [nu]+[lambda], and its complement when the random variable is |
| 178 | greater than [nu]+[lambda]. If necessary the computed result |
| 179 | is then subtracted from 1 to give the desired result (the CDF or its |
| 180 | complement). |
| 181 | |
| 182 | For small values of the non centrality parameter, the CDF is computed |
| 183 | using the method of Ding (see "Algorithm AS 275: Computing the Non-Central |
| 184 | #2 Distribution Function", Cherng G. Ding, Applied Statistics, Vol. 41, |
| 185 | No. 2. (1992), pp. 478-482). This uses the following series representation: |
| 186 | |
| 187 | [equation nc_chi_squ_ref4] |
| 188 | |
| 189 | which requires just one call to __gamma_p_derivative with the subsequent |
| 190 | terms being computed by recursion as shown above. |
| 191 | |
| 192 | For larger values of the non-centrality parameter, Ding's method can take |
| 193 | an unreasonable number of terms before convergence is achieved. Furthermore, |
| 194 | the largest term is not the first term, so in extreme cases the first term may |
| 195 | be zero, leading to a zero result, even though the true value may be non-zero. |
| 196 | |
| 197 | Therefore, when the non-centrality parameter is greater than 200, the method due |
| 198 | to Krishnamoorthy (see "Computing discrete mixtures of continuous distributions: |
| 199 | noncentral chisquare, noncentral t and the distribution of the |
| 200 | square of the sample multiple correlation coefficient", |
| 201 | Denise Benton and K. Krishnamoorthy, Computational Statistics & |
| 202 | Data Analysis, 43, (2003), 249-267) is used. |
| 203 | |
| 204 | This method uses the well known sum: |
| 205 | |
| 206 | [equation nc_chi_squ_ref5] |
| 207 | |
| 208 | Where P[sub a](x) is the incomplete gamma function. |
| 209 | |
| 210 | The method starts at the [lambda]th term, which is where the Poisson weighting |
| 211 | function achieves its maximum value, although this is not necessarily |
| 212 | the largest overall term. Subsequent terms are calculated via the normal |
| 213 | recurrence relations for the incomplete gamma function, and iteration proceeds |
| 214 | both forwards and backwards until sufficient precision has been achieved. It |
| 215 | should be noted that recurrence in the forwards direction of P[sub a](x) is |
| 216 | numerically unstable. However, since we always start /after/ the largest |
| 217 | term in the series, numeric instability is introduced more slowly than the |
| 218 | series converges. |
| 219 | |
| 220 | Computation of the complement of the CDF uses an extension of Krishnamoorthy's |
| 221 | method, given that: |
| 222 | |
| 223 | [equation nc_chi_squ_ref6] |
| 224 | |
| 225 | we can again start at the [lambda]'th term and proceed in both directions from |
| 226 | there until the required precision is achieved. This time it is backwards |
| 227 | recursion on the incomplete gamma function Q[sub a](x) which is unstable. |
| 228 | However, as long as we start well /before/ the largest term, this is not an |
| 229 | issue in practice. |
| 230 | |
| 231 | The PDF is computed directly using the relation: |
| 232 | |
| 233 | [equation nc_chi_squ_ref3] |
| 234 | |
| 235 | Where ['f(x; v)] is the PDF of the central __chi_squared_distrib and |
| 236 | ['I[sub v](x)] is a modified Bessel function, see __cyl_bessel_i. |
| 237 | For small values of the |
| 238 | non-centrality parameter the relation in terms of __cyl_bessel_i |
| 239 | is used. However, this method fails for large values of the |
| 240 | non-centrality parameter, so in that case the infinite sum is |
| 241 | evaluated using the method of Benton and Krishnamoorthy, and |
| 242 | the usual recurrence relations for successive terms. |
| 243 | |
| 244 | The quantile functions are computed by numeric inversion of the CDF. |
| 245 | An improve starting quess is from |
| 246 | Thomas Luu, |
| 247 | [@http://discovery.ucl.ac.uk/1482128/, Fast and accurate parallel computation of quantile functions for random number generation, Doctorial Thesis, 2016]. |
| 248 | |
| 249 | There is no [@http://en.wikipedia.org/wiki/Closed_form closed form] |
| 250 | for the mode of the noncentral chi-squared |
| 251 | distribution: it is computed numerically by finding the maximum |
| 252 | of the PDF. Likewise, the median is computed numerically via |
| 253 | the quantile. |
| 254 | |
| 255 | The remaining non-member functions use the following formulas: |
| 256 | |
| 257 | [equation nc_chi_squ_ref7] |
| 258 | |
| 259 | Some analytic properties of noncentral distributions |
| 260 | (particularly unimodality, and monotonicity of their modes) |
| 261 | are surveyed and summarized by: |
| 262 | |
| 263 | Andrea van Aubel & Wolfgang Gawronski, Applied Mathematics and Computation, 141 (2003) 3-12. |
| 264 | |
| 265 | [endsect] [/section:nc_chi_squared_dist] |
| 266 | |
| 267 | [/ nc_chi_squared.qbk |
| 268 | Copyright 2008 John Maddock and Paul A. Bristow. |
| 269 | Distributed under the Boost Software License, Version 1.0. |
| 270 | (See accompanying file LICENSE_1_0.txt or copy at |
| 271 | http://www.boost.org/LICENSE_1_0.txt). |
| 272 | ] |
| 273 | |