| 1 | [section:sf_implementation Additional Implementation Notes] |
| 2 | |
| 3 | The majority of the implementation notes are included with the documentation |
| 4 | of each function or distribution. The notes here are of a more general nature |
| 5 | and reflect the overall implementation philosophy. |
| 6 | |
| 7 | [h4 Implementation philosophy] |
| 8 | |
| 9 | "First be right, then be fast." |
| 10 | |
| 11 | There will always be potential compromises |
| 12 | to be made between speed and accuracy. |
| 13 | It may be possible to find faster methods, |
| 14 | particularly for certain limited ranges of arguments, |
| 15 | but for most applications of math functions and distributions, |
| 16 | we judge that speed is rarely as important as accuracy. |
| 17 | |
| 18 | So our priority is accuracy. |
| 19 | |
| 20 | To permit evaluation of the accuracy of the special functions, |
| 21 | considerable effort has gone into producing |
| 22 | extremely accurate tables of test values. |
| 23 | |
| 24 | (It also required much CPU effort - |
| 25 | there was some danger of molten plastic dripping from the bottom of JM's laptop, |
| 26 | so instead, PAB's Dual-core desktop was kept 50% busy for [*days] |
| 27 | calculating some tables of test values!) |
| 28 | |
| 29 | For a specific RealType, say `float` or `double`, |
| 30 | it may be possible to find approximations for some functions |
| 31 | that are simpler and thus faster, but less accurate |
| 32 | (perhaps because there are no refining iterations, |
| 33 | for example, when calculating inverse functions). |
| 34 | |
| 35 | If these prove accurate enough to be "fit for purpose", |
| 36 | then users may substitute their own custom specialization. |
| 37 | |
| 38 | For example, there are approximations dating from times |
| 39 | when computation was a [*lot] more expensive: |
| 40 | |
| 41 | H Goldberg and H Levine, Approximate formulas for |
| 42 | percentage points and normalisation of t and chi squared, |
| 43 | Ann. Math. Stat., 17(4), 216 - 225 (Dec 1946). |
| 44 | |
| 45 | A H Carter, Approximations to percentage points of the z-distribution, |
| 46 | Biometrika 34(2), 352 - 358 (Dec 1947). |
| 47 | |
| 48 | These could still provide sufficient accuracy for some speed-critical applications. |
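The trade-off described above can be sketched for the standard normal quantile. This is an illustration only, not the method used by this library: the crude version uses Abramowitz and Stegun formula 26.2.23 (absolute error below about 4.5e-4) with no refining iterations, and the refined version adds a few Newton steps on top of it. All function names here are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Standard normal CDF, written in terms of the C++11 std::erfc.
inline double normal_cdf(double x)
{
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Standard normal PDF.
inline double normal_pdf(double x)
{
    const double two_pi = 6.283185307179586476925286766559;
    return std::exp(-0.5 * x * x) / std::sqrt(two_pi);
}

// Fast but crude quantile: Abramowitz & Stegun 26.2.23,
// absolute error below about 4.5e-4, no refining iterations.
inline double crude_normal_quantile(double p)
{
    assert(p > 0.0 && p < 1.0);
    const double tail = (p < 0.5) ? p : 1.0 - p;   // work in the smaller tail
    const double t = std::sqrt(-2.0 * std::log(tail));
    const double x = t - (2.515517 + t * (0.802853 + t * 0.010328))
                       / (1.0 + t * (1.432788 + t * (0.189269 + t * 0.001308)));
    return (p < 0.5) ? -x : x;
}

// Slower but accurate: start from the crude value and apply
// a fixed number of Newton steps to solve cdf(x) - p = 0.
inline double refined_normal_quantile(double p)
{
    double x = crude_normal_quantile(p);
    for (int i = 0; i < 3; ++i)
    {
        x -= (normal_cdf(x) - p) / normal_pdf(x);
    }
    return x;
}
```

For `p = 0.975` the crude value is within about 5e-4 of 1.95996..., while the refined value agrees to roughly double precision. Whether the crude version is "fit for purpose" depends entirely on the application.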
| 49 | |
| 50 | [h4 Accuracy and Representation of Test Values] |
| 51 | |
| 52 | In order to be accurate enough for as many real types as possible, |
| 53 | constant values are given to 50 decimal digits if available |
| 54 | (though many sources proved accurate only to about 64-bit double precision). |
| 55 | Values are specified as `long double` by appending `L`, |
| 56 | unless they are exactly representable, for example integers, or binary fractions like 0.125. |
| 57 | This avoids the risk of losing accuracy when converting from `double`, the default type of a literal. |
| 58 | Values are used after `static_cast<RealType>(1.2345L)` |
| 59 | to provide the appropriate RealType for spot tests. |
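A minimal sketch of why the `L` suffix matters (the function names are hypothetical). A literal without the suffix is rounded to `double` before any cast is applied, so the extra digits of a wider `long double` are already lost:

```cpp
#include <cassert>
#include <limits>

// True when long double carries more significand bits than double.
inline bool long_double_is_wider()
{
    return std::numeric_limits<long double>::digits
         > std::numeric_limits<double>::digits;
}

// The literal 0.1 is rounded to double first, then widened:
// the cast cannot restore digits that were never stored.
inline long double tenth_via_double()
{
    return static_cast<long double>(0.1);
}

// The literal 0.1L is rounded directly to long double.
inline long double tenth_with_suffix()
{
    return 0.1L;
}

// Exactly representable values need no suffix: both routes agree.
inline long double eighth_via_double()
{
    const long double value = static_cast<long double>(0.125);
    assert(value == 0.125L);
    return value;
}
```

On platforms where `long double` is wider than `double`, `tenth_via_double()` and `tenth_with_suffix()` differ. Where the two types are the same size they are equal, which is why a missing suffix can go unnoticed there.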
| 60 | |
| 61 | Functions that return constant values, such as kurtosis, are written as |
| 62 | |
| 63 | `static_cast<RealType>(-3) / 5;` |
| 64 | |
| 65 | to provide the most accurate value |
| 66 | that the compiler can compute for the real type. |
| 67 | (The denominator is an integer and so will be promoted exactly). |
| 68 | |
| 69 | So tests for one third, which is *not* exactly representable in radix-two floating-point, |
| 70 | (should) use, for example: |
| 71 | |
| 72 | `static_cast<RealType>(1) / 3;` |
| 73 | |
| 74 | If a function is very sensitive to changes in input, |
| 75 | specifying an inexact value as input (such as 0.1) can throw |
| 76 | the result off by a noticeable amount: 0.1f is "wrong" |
| 77 | by about 1.5e-9, a relative error of about 1.5e-8 (because 0.1 has no exact binary representation). |
| 78 | That is why exact binary values (halves, quarters, eighths and so on) |
| 79 | are used in test code, along with the occasional fraction `a/b` with `b` |
| 80 | a power of two (in order to ensure that the result is an exactly |
| 81 | representable binary value). |
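The points above can be checked with a few lines of standard C++. This is a sketch with hypothetical helper names. It computes one third in the target type, measures the absolute error of `0.1f` (about 1.5e-9, roughly 1.5 parts in 10^8 relative), and builds fractions `a/b` with `b` a power of two, which are exact:

```cpp
#include <cassert>
#include <cmath>

// One third computed in the target type: the integer 3 converts exactly,
// so the only rounding is the single division carried out in RealType.
template <class RealType>
RealType one_third()
{
    return static_cast<RealType>(1) / 3;
}

// Absolute error introduced by storing 0.1 in a float, measured in double.
inline double error_of_float_tenth()
{
    return std::fabs(static_cast<double>(0.1f) - 0.1);
}

// a / 2^k computed in RealType: exact whenever a fits in the significand.
template <class RealType>
RealType binary_fraction(int a, int k)
{
    assert(k >= 0 && k < 31);
    return static_cast<RealType>(a) / static_cast<RealType>(1L << k);
}
```

For example `binary_fraction<float>(3, 3)` is exactly 0.375 in every binary floating-point type, so it can be fed to a sensitive function without adding input error of its own.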
| 82 | |
| 83 | [h4 Tolerance of Tests] |
| 84 | |
| 85 | The tolerances need to be set to the maximum of: |
| 86 | |
| 87 | * Some epsilon value. |
| 88 | * The accuracy of the data (often only near 64-bit double). |
| 89 | |
| 90 | Otherwise, when long double has more digits than the test data, no |
| 91 | amount of tweaking an epsilon-based tolerance will work. |
| 92 | |
| 93 | A common problem arises when tolerances are tuned on implementations |
| 94 | like Microsoft VS.NET, where double and long double are the same size: |
| 95 | the tests then fail on other systems where long double is more accurate than double. |
| 96 | Check first that the suffix L is present, and then that the tolerance is big enough. |
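One way to express the rule above in code is sketched here. The helper names and the choice of a relative-difference check are assumptions made for illustration, not the library's test harness:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Tolerance = max(a multiple of machine epsilon, accuracy of the reference data).
template <class RealType>
RealType test_tolerance(int epsilon_multiplier, double data_accuracy)
{
    assert(epsilon_multiplier > 0 && data_accuracy >= 0);
    const RealType from_epsilon =
        std::numeric_limits<RealType>::epsilon() * epsilon_multiplier;
    const RealType from_data = static_cast<RealType>(data_accuracy);
    return (std::max)(from_epsilon, from_data);
}

// Relative-difference check against a reference value.
template <class RealType>
bool close_enough(RealType computed, RealType expected, RealType tolerance)
{
    if (expected == 0)
        return std::fabs(computed) <= tolerance;
    return std::fabs((computed - expected) / expected) <= tolerance;
}
```

With reference data good to about 1e-15, `test_tolerance<float>(10, 1e-15)` is governed by the epsilon term, whereas `test_tolerance<long double>(2, 1e-15)` is governed by the data, however many digits `long double` has.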
| 97 | |
| 98 | [h4 Handling Unsuitable Arguments] |
| 99 | |
| 100 | In |
| 101 | [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1665.pdf Errors in Mathematical Special Functions], J. Marraffino & M. Paterno |
| 102 | it is proposed that signalling a domain error is mandatory |
| 103 | when the argument would give a mathematically undefined result. |
| 104 | |
| 105 | *Guideline 1 |
| 106 | |
| 107 | [:A mathematical function is said to be defined at a point a = (a1, a2, . . .) |
| 108 | if the limits as x = (x1, x2, . . .) 'approaches a from all directions agree'. |
| 109 | The defined value may be any number, or +infinity, or -infinity.] |
| 110 | |
| 111 | Put crudely, if the function goes to + infinity |
| 112 | and then emerges 'round-the-back' with - infinity, |
| 113 | it is NOT defined. |
| 114 | |
| 115 | [:The library function which approximates a mathematical function shall signal a domain error |
| 116 | whenever evaluated with argument values for which the mathematical function is undefined.] |
| 117 | |
| 118 | *Guideline 2 |
| 119 | |
| 120 | [:The library function which approximates a mathematical function |
| 121 | shall signal a domain error whenever evaluated with argument values |
| 122 | for which the mathematical function obtains a non-real value.] |
| 123 | |
| 124 | This implementation is believed to follow these proposals and to assist compatibility with |
| 125 | ['ISO/IEC 9899:1999 Programming languages - C] |
| 126 | and with the |
| 127 | [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf Draft Technical Report on C++ Library Extensions, 2005-06-24, section 5.2.1, paragraph 5]. |
| 128 | [link math_toolkit.error_handling See also domain_error]. |
| 129 | |
| 130 | See __policy_ref for details of the error handling policies that should allow |
| 131 | a user to comply with any of these recommendations, as well as other behaviour. |
| 132 | |
| 133 | See [link math_toolkit.error_handling error handling] |
| 134 | for a detailed explanation of the mechanism, and |
| 135 | [link math_toolkit.stat_tut.weg.error_eg error_handling example] |
| 136 | and |
| 137 | [@../../example/error_handling_example.cpp error_handling_example.cpp] |
| 138 | |
| 139 | [caution If you enable throwing but do NOT have a try & catch block, |
| 140 | then the program will terminate with an uncaught exception and probably abort. |
| 141 | Therefore, to get the benefit of helpful error messages, enabling *all* exceptions |
| 142 | *and* using try & catch is recommended for all applications. |
| 143 | However, for simplicity, this is not done for most examples.] |
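The two guidelines and the caution can be illustrated with standard C++ alone. The functions below are hypothetical stand-ins, not Boost.Math functions, and returning infinity from `reciprocal_square` is this sketch's own choice for a point where the limits from both sides agree:

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>
#include <string>

// Guideline 1: 1/x tends to +infinity from the right of zero and to
// -infinity from the left, so the limits disagree and 1/0 is undefined.
inline double checked_reciprocal(double x)
{
    if (x == 0)
        throw std::domain_error("checked_reciprocal: argument is zero, result is undefined");
    return 1 / x;
}

// By contrast 1/x^2 tends to +infinity from both sides, so it is defined at zero.
inline double reciprocal_square(double x)
{
    if (x == 0)
        return std::numeric_limits<double>::infinity();
    return 1 / (x * x);
}

// Guideline 2: the square root of a negative number is not real.
inline double checked_sqrt(double x)
{
    if (x < 0)
        throw std::domain_error("checked_sqrt: argument is negative, result is not real");
    return std::sqrt(x);
}

// A caller that follows the caution: catch the exception and keep its message.
inline bool try_sqrt(double x, double& result, std::string& message)
{
    try
    {
        result = checked_sqrt(x);
        assert(result >= 0);   // a real square root is never negative
        return true;
    }
    catch (const std::domain_error& e)
    {
        message = e.what();
        return false;
    }
}
```

Without the `try` block in `try_sqrt`, a call such as `checked_sqrt(-1.0)` would end the program with an uncaught exception and the helpful message would be lost.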
| 144 | |
| 145 | [h4 Handling of Functions that are Not Mathematically defined] |
| 146 | |
| 147 | Functions that are not mathematically defined, |
| 148 | like the Cauchy mean, fail to compile by default. |
| 149 | A [link math_toolkit.pol_ref.assert_undefined policy] |
| 150 | allows control of this. |
| 151 | |
| 152 | If the policy is to permit undefined functions, then calling them |
| 153 | throws a domain error by default. But the error policy can be set |
| 154 | to not throw, and to return NaN instead. For example, if |
| 155 | |
| 156 | `#define BOOST_MATH_DOMAIN_ERROR_POLICY ignore_error` |
| 157 | |
| 158 | appears before the first Boost include, |
| 159 | then if the un-implemented function is called, |
| 160 | `mean(cauchy<>())` will return `std::numeric_limits<T>::quiet_NaN()`. |
| 161 | |
| 162 | [warning If `std::numeric_limits<T>::has_quiet_NaN` is false |
| 163 | (for example, if T is a User-defined type without NaN support), |
| 164 | then an exception will always be thrown when a domain error occurs. |
| 165 | Catching exceptions is therefore strongly recommended.] |
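The behaviour described in the warning can be sketched as follows. This is a hypothetical helper written with the standard library only, not the library's own error-handling code:

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Value produced for a domain error when the policy asks for the error
// to be ignored: a quiet NaN if T has one, otherwise an exception.
template <class T>
T ignored_domain_error(const char* message)
{
    assert(message != 0);
    if (std::numeric_limits<T>::has_quiet_NaN)
        return std::numeric_limits<T>::quiet_NaN();
    // No NaN is available to return, so throwing is the only way to report the error.
    throw std::domain_error(message);
}
```

Calling `ignored_domain_error<double>("mean of Cauchy")` returns a NaN, while the same call with a type that has no quiet NaN (for illustration, `int`) throws `std::domain_error` regardless of the requested policy.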
| 166 | |
| 167 | [h4 Median of distributions] |
| 168 | |
| 169 | There are many distributions for which we have been unable to find an analytic formula |
| 170 | for the median, and this has deterred us from implementing |
| 171 | [@http://en.wikipedia.org/wiki/Median median functions], the mid-point in a list of values. |
| 172 | |
| 173 | However, a useful numerical approximation for a distribution `dist` |
| 174 | is available, as usual, through the non-member accessor function `median(dist)`, |
| 175 | which may be evaluated (in the absence of an analytic formula) by calling |
| 176 | |
| 177 | `quantile(dist, 0.5)` (this is the /mathematical/ definition of course). |
| 178 | |
| 179 | [@http://www.amstat.org/publications/jse/v13n2/vonhippel.html Mean, Median, and Skew, Paul T von Hippel] |
| 180 | |
| 181 | [@http://documents.wolfram.co.jp/teachersedition/MathematicaBook/24.5.html Descriptive Statistics,] |
| 182 | |
| 183 | [@http://documents.wolfram.co.jp/v5/Add-onsLinks/StandardPackages/Statistics/DescriptiveStatistics.html and ] |
| 184 | |
| 185 | [@http://documents.wolfram.com/v5/TheMathematicaBook/AdvancedMathematicsInMathematica/NumericalOperationsOnData/3.8.1.html |
| 186 | Mathematica Basic Statistics.] give more detail, in particular for discrete distributions. |
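A minimal sketch of a generic median defined through the quantile, using a hypothetical stand-in distribution type rather than the library's own classes:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a distribution type: exponential with rate lambda.
struct exponential_sketch
{
    double lambda;
};

// Quantile (inverse CDF): solves 1 - exp(-lambda * x) = p for x.
inline double quantile(const exponential_sketch& dist, double p)
{
    assert(dist.lambda > 0 && p >= 0 && p < 1);
    return -std::log1p(-p) / dist.lambda;
}

// Generic median: the point at which the CDF reaches one half.
template <class Distribution>
double median(const Distribution& dist)
{
    return quantile(dist, 0.5);
}
```

For the exponential distribution this reproduces the known closed form `log(2) / lambda`. For a discrete distribution the result depends on how the quantile treats non-integer results, which is where the references above become relevant.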
| 187 | |
| 188 | |
| 189 | [h4 Handling of Floating-Point Infinity] |
| 190 | |
| 191 | Some functions and distributions are well defined with + or - infinity as |
| 192 | argument(s), but after some experiments with handling infinite arguments |
| 193 | as special cases, we concluded that it was generally more useful to forbid this, |
| 194 | and instead to return the result of __domain_error. |
| 195 | |
| 196 | Handling infinity as special cases is additionally complicated |
| 197 | because, unlike built-in types on most - but not all - platforms, |
| 198 | not all User-Defined Types are |
| 199 | specialized to provide `std::numeric_limits<RealType>::infinity()` |
| 200 | and would return zero rather than any representation of infinity. |
| 201 | |
| 202 | The rationale is that non-finiteness may happen because of error |
| 203 | or overflow in the user's code, and it will be more helpful for this |
| 204 | to be diagnosed promptly rather than just continuing. |
| 205 | The code also became much more complicated, more error-prone, |
| 206 | much more work to test, and much less readable. |
| 207 | |
| 208 | However in a few cases, for example normal, where we felt it obvious, |
| 209 | we have permitted argument(s) to be infinity, |
| 210 | provided infinity is implemented for the `RealType` on that implementation, |
| 211 | and it is supported and tested by the distribution. |
| 212 | |
| 213 | The range for these distributions is set to infinity if supported by the platform, |
| 214 | (by testing `std::numeric_limits<RealType>::has_infinity`) |
| 215 | else the maximum value provided for the `RealType` by Boost.Math. |
| 216 | |
| 217 | Testing for has_infinity is obviously important for arbitrary precision types |
| 218 | where infinity makes much less sense than for IEEE754 floating-point. |
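The selection of the range limit can be sketched like this. `std::numeric_limits<RealType>::max()` is used here as a stand-in for the maximum value that Boost.Math provides, and both function names are hypothetical:

```cpp
#include <cassert>
#include <limits>
#include <utility>

// Upper limit of the range: infinity if the type has one, otherwise the largest value.
template <class RealType>
RealType upper_range_limit()
{
    return std::numeric_limits<RealType>::has_infinity
        ? std::numeric_limits<RealType>::infinity()
        : (std::numeric_limits<RealType>::max)();
}

// Range of a distribution defined on the whole real line.
template <class RealType>
std::pair<RealType, RealType> whole_line_range()
{
    const RealType upper = upper_range_limit<RealType>();
    assert(upper > 0);
    return std::pair<RealType, RealType>(-upper, upper);
}
```

The test on `has_infinity` is needed because `std::numeric_limits<T>::infinity()` still compiles for a type without infinity but returns a meaningless value, typically zero.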
| 219 | |
| 220 | So far we have not set the `support()` function (only range) |
| 221 | on the grounds that the PDF is uninteresting/zero for infinities. |
| 222 | |
| 223 | Users who require special handling of infinity (or other specific value) can, |
| 224 | of course, always intercept this before calling a distribution or function |
| 225 | and return their own choice of value, or other behavior. |
| 226 | This will often be simpler than trying to handle the aftermath of the error policy. |
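Intercepting special values before the call might look like the following sketch. `strict_logistic_cdf` is a hypothetical stand-in for a library function that rejects non-finite arguments, and the values returned for infinity and NaN are this sketch's own choices:

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Stand-in for a library CDF that rejects non-finite arguments
// (the logistic distribution with location 0 and scale 1).
inline double strict_logistic_cdf(double x)
{
    if (!std::isfinite(x))
        throw std::domain_error("strict_logistic_cdf: argument must be finite");
    return 1 / (1 + std::exp(-x));
}

// User-side wrapper: decide what infinity (or NaN) means before the library sees it.
inline double lenient_logistic_cdf(double x)
{
    if (std::isnan(x))
        return x;                    // propagate NaN unchanged
    if (std::isinf(x))
        return x > 0 ? 1.0 : 0.0;    // limiting values of the CDF
    const double result = strict_logistic_cdf(x);
    assert(result >= 0 && result <= 1);
    return result;
}
```

The wrapper keeps the special-case logic in one visible place in user code, and the strict function still diagnoses any non-finite value that arrives by accident elsewhere.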
| 227 | |
| 228 | Overflow, underflow and denormalised results can be handled using __error_policy. |
| 229 | |
| 230 | We have also tried to catch boundary cases where the mathematical specification |
| 231 | would result in divide by zero or overflow, and to signal these similarly. |
| 232 | What happens at (and near) poles can be controlled through __error_policy. |
| 233 | |
| 234 | [h4 Scale, Shape and Location] |
| 235 | |
| 236 | We considered adding location and scale to the list of functions, for example: |
| 237 | |
| 238 | template <class RealType> |
| 239 | inline RealType scale(const triangular_distribution<RealType>& dist) |
| 240 | { |
| 241 | RealType lower = dist.lower(); |
| 242 | RealType mode = dist.mode(); |
| 243 | RealType upper = dist.upper(); |
| 244 | RealType result; // of checks. |
| 245 | if(false == detail::check_triangular(BOOST_CURRENT_FUNCTION, lower, mode, upper, &result)) |
| 246 | { |
| 247 | return result; |
| 248 | } |
| 249 | return (upper - lower); |
| 250 | } |
| 251 | |
| 252 | but found that these concepts are not defined (or their definition too contentious) |
| 253 | for too many distributions to be generally applicable. |
| 254 | Because they are non-member functions, they can be added if required. |
| 255 | |
| 256 | [h4 Notes on Implementation of Specific Functions & Distributions] |
| 257 | |
| 258 | * Default parameters for the Triangular Distribution. |
| 259 | We are uncertain about the best default parameters. |
| 260 | Some sources suggest that the Standard Triangular Distribution has |
| 261 | lower = 0, mode = half and upper = 1. |
| 262 | However, as an approximation for the normal distribution, |
| 263 | the most common usage, lower = -1, mode = 0 and upper = 1 would be more suitable. |
| 264 | |
| 265 | [h4 Rational Approximations Used] |
| 266 | |
| 267 | Some of the special functions in this library are implemented via |
| 268 | rational approximations. These are either taken from the literature, |
| 269 | or devised by John Maddock using |
| 270 | [link math_toolkit.internals.minimax our Remez code]. |
| 271 | |
| 272 | Rational rather than Polynomial approximations are used to ensure |
| 273 | accuracy: polynomial approximations are often wonderful up to |
| 274 | a certain level of accuracy, but then quite often fail to provide much greater |
| 275 | accuracy no matter how many more terms are added. |
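As a toy illustration of the difference, the sketch below compares a degree-4 Taylor polynomial of `exp` with the [2/2] Pade approximant. Both agree with `exp` through the x^4 term. Note that these are Pade and Taylor forms chosen for simplicity, not the minimax approximations produced by the Remez code, and the function names are hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Horner evaluation of c[0] + c[1] x + ... + c[N-1] x^(N-1).
template <std::size_t N>
double evaluate_polynomial(const double (&c)[N], double x)
{
    double sum = c[N - 1];
    for (std::size_t i = N - 1; i > 0; --i)
        sum = sum * x + c[i - 1];
    return sum;
}

// Degree-4 Taylor polynomial of exp(x) about zero.
inline double exp_taylor4(double x)
{
    static const double c[] = { 1.0, 1.0, 1.0 / 2, 1.0 / 6, 1.0 / 24 };
    return evaluate_polynomial(c, x);
}

// [2/2] Pade approximant of exp(x): (12 + 6x + x^2) / (12 - 6x + x^2).
inline double exp_pade22(double x)
{
    static const double p[] = { 12.0, 6.0, 1.0 };
    static const double q[] = { 12.0, -6.0, 1.0 };
    const double denominator = evaluate_polynomial(q, x);
    assert(denominator != 0);   // 12 - 6x + x^2 has no real roots
    return evaluate_polynomial(p, x) / denominator;
}
```

At `x = 1` the rational form is in error by about 4e-3 and the polynomial by about 1e-2, for a comparable number of coefficients. Production approximations are of course devised over a stated interval and to a stated precision.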
| 276 | |
| 277 | Our own approximations were devised either for added accuracy |
| 278 | (to support 128-bit long doubles for example), or because |
| 279 | literature methods were unavailable or were under a non-BSL-compatible |
| 280 | license. Our Remez code is known to produce good |
| 281 | agreement with literature results in fairly simple "toy" cases. |
| 282 | All approximations were checked |
| 283 | for convergence and to ensure that |
| 284 | they were not ill-conditioned (the coefficients can give a |
| 285 | theoretically good solution, but the resulting rational function |
| 286 | may be un-computable at fixed precision). |
| 287 | |
| 288 | Recomputing using different |
| 289 | Remez implementations may well produce differing coefficients: the |
| 290 | problem is well known to be ill-conditioned in general, and our Remez implementation |
| 291 | often found a broad and ill-defined minimum for many of these approximations |
| 292 | (of course for simple "toy" examples like approximating `exp` the minimum |
| 293 | is well defined, and the coefficients should agree no matter whose Remez |
| 294 | implementation is used). This should not in general affect the validity |
| 295 | of the approximations: there's good literature supporting the idea that |
| 296 | coefficients can be "in error" without necessarily adversely affecting |
| 297 | the result. Note that "in error" has a special meaning in this context, |
| 298 | see [@http://front.math.ucdavis.edu/0101.5042 |
| 299 | "Approximate construction of rational approximations and the effect |
| 300 | of error autocorrection.", Grigori Litvinov, eprint arXiv:math/0101042]. |
| 301 | Therefore the coefficients still need to be accurately calculated, even if they can |
| 302 | be in error compared to the "true" minimax solution. |
| 303 | |
| 304 | [h4 Representation of Mathematical Constants] |
| 305 | |
| 306 | A macro BOOST_DEFINE_MATH_CONSTANT in constants.hpp is used |
| 307 | to provide high accuracy constants to mathematical functions and distributions, |
| 308 | since it is important to provide values uniformly for the built-in |
| 309 | float, double and long double types, |
| 310 | for User-Defined types in __multiprecision like __cpp_dec_float, |
| 311 | and for others like NTL::quad_float and NTL::RR. |
| 312 | |
| 313 | To permit calculations in this Math ToolKit and its tests, (and elsewhere) |
| 314 | at about 100 decimal digits with NTL::RR type, |
| 315 | it is obviously necessary to define constants to this accuracy. |
| 316 | |
| 317 | However, some compilers do not accept decimal digit strings as long as this. |
| 318 | So the constant is split into two parts, with the 1st containing at least |
| 319 | long double precision, and the 2nd zero if not needed or known. |
| 320 | The 3rd part permits an exponent to be provided if necessary (use zero if none) - |
| 321 | the other two parameters may only contain decimal digits (and sign and decimal point), |
| 322 | and may NOT include an exponent like 1.234E99 (nor a trailing F or L). |
| 323 | The second digit string is only used if T is a User-Defined Type, |
| 324 | when the constant is converted to a long string literal and lexical_casted to type T. |
| 325 | (This is necessary because you can't use a numeric constant |
| 326 | since even a long double might not have enough digits). |
| 327 | |
| 328 | For example, pi is defined: |
| 329 | |
| 330 | BOOST_DEFINE_MATH_CONSTANT(pi, |
| 331 | 3.141592653589793238462643383279502884197169399375105820974944, |
| 332 | 5923078164062862089986280348253421170679821480865132823066470938446095505, |
| 333 | 0) |
| 334 | |
| 335 | And used thus: |
| 336 | |
| 337 | using namespace boost::math::constants; |
| 338 | |
| 339 | double diameter = 1.; |
| 340 | double circumference = diameter * pi<double>(); |
| 341 | |
| 342 | or boost::math::constants::pi<NTL::RR>() |
| 343 | |
| 344 | Note that it is necessary (if inconvenient) to specify the type explicitly. |
| 345 | |
| 346 | So you cannot write |
| 347 | |
| 348 | double p = boost::math::constants::pi<>(); // could not deduce template argument for 'T' |
| 349 | |
| 350 | Neither can you write: |
| 351 | |
| 352 | double p = boost::math::constants::pi; // Context does not allow for disambiguation of overloaded function |
| 353 | double p = boost::math::constants::pi(); // Context does not allow for disambiguation of overloaded function |
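The idea of joining two digit strings and converting the result to the requested type can be sketched with the standard library alone. The helper names are hypothetical and a string stream stands in for `lexical_cast`. Unlike the real macro, this sketch parses the string for every type, including the built-in ones:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Join the two digit strings and parse the result as T.
template <class T>
T constant_from_digits(const char* first, const char* second)
{
    assert(first != 0 && second != 0);
    std::istringstream in(std::string(first) + second);
    T value = T();
    in >> value;
    return value;
}

// The type must be named explicitly: there is no function argument to deduce it from.
template <class T>
T sketch_pi()
{
    return constant_from_digits<T>(
        "3.141592653589793238462643383279502884197169399375105820974944",
        "5923078164062862089986280348253421170679821480865132823066470938446095505");
}
```

Because `sketch_pi` takes no arguments, `sketch_pi<double>()` compiles but `sketch_pi()` does not, which is the same restriction noted above for `boost::math::constants::pi`.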
| 354 | |
| 355 | [h4 Thread safety] |
| 356 | |
| 357 | Reporting of error by setting `errno` should be thread-safe already |
| 358 | (otherwise none of the std lib math functions would be thread safe?). |
| 359 | If you turn on reporting of errors via exceptions, `errno` gets left unused anyway. |
| 360 | |
| 361 | For normal C++ usage, the Boost.Math `static const` constants are now thread-safe, so |
| 362 | use with the built-in real-number types `float`, `double` and `long double` is thread-safe. |
| 363 | |
| 364 | For User-Defined types, for example __cpp_dec_float, |
| 365 | Boost.Math should also be thread-safe |
| 366 | (though we are unsure how to rigorously prove this). |
| 367 | |
| 368 | (Thread safety has received attention in the C++11 Standard revision, |
| 369 | so hopefully all compilers will do the right thing here at some point.) |
| 370 | |
| 371 | [h4 Sources of Test Data] |
| 372 | |
| 373 | We found a large number of sources of test data. |
| 374 | We have assumed that these are /"known good"/ |
| 375 | if they agree with the results from our test |
| 376 | and only consulted other sources for their /'vote'/ |
| 377 | in the case of serious disagreement. |
| 378 | The accuracy, actual and claimed, varies very widely. |
| 379 | Only [@http://functions.wolfram.com/ Wolfram Mathematica functions] |
| 380 | provided a higher accuracy than |
| 381 | C++ double (64-bit floating-point) and was regarded as |
| 382 | the most-trusted source by far. |
| 383 | The __R provided the widest range of distributions, |
| 384 | but the usual Intel X86 distribution uses 64-bit doubles, |
| 385 | so our use was limited to the 15 to 17 decimal digit accuracy. |
| 386 | |
| 387 | A useful index of sources is: |
| 388 | [@http://www.sal.hut.fi/Teaching/Resources/ProbStat/table.html |
| 389 | Web-oriented Teaching Resources in Probability and Statistics] |
| 390 | |
| 391 | [@http://espse.ed.psu.edu/edpsych/faculty/rhale/hale/507Mat/statlets/free/pdist.htm Statlet] |
| 392 | is a JavaScript application that calculates and plots probability distributions, |
| 393 | and provides the most complete range of distributions: |
| 394 | |
| 395 | [:Bernoulli, Binomial, discrete uniform, geometric, hypergeometric, |
| 396 | negative binomial, Poisson, beta, Cauchy-Lorentz, chi-squared, Erlang, |
| 397 | exponential, extreme value, Fisher, gamma, Laplace, logistic, |
| 398 | lognormal, normal, Pareto, Student's t, triangular, uniform, and Weibull.] |
| 399 | |
| 400 | It calculates pdf, cdf, survivor, log survivor, hazard, tail areas, |
| 401 | & critical values for 5 tail values. |
| 402 | |
| 403 | It is also the only independent source found for the Weibull distribution; |
| 404 | unfortunately it appears to suffer from very poor accuracy in areas where |
| 405 | the underlying special function is known to be difficult to implement. |
| 406 | |
| 407 | [h4 Testing for Invalid Parameters to Functions and Constructors] |
| 408 | |
| 409 | After finding that some 'bad' parameters (like NaN) were not throwing |
| 410 | a `domain_error` exception as they should, a function |
| 411 | |
| 412 | `check_out_of_range` (in `test_out_of_range.hpp`) |
| 413 | was devised by JM to check |
| 414 | (using Boost.Test's BOOST_CHECK_THROW macro) |
| 415 | that bad parameters passed to constructors and functions throw `domain_error` exceptions. |
| 416 | |
| 417 | Usage is `check_out_of_range< DistributionType >(list-of-params);` |
| 418 | where list-of-params is a list of *valid* parameters from which the distribution can be constructed |
| 419 | (that is, the same number of arguments is passed to the function |
| 420 | as is passed to the distribution constructor). |
| 421 | |
| 422 | The values of the parameters are not important, but must be *valid* to pass the constructor checks; |
| 423 | the default values are suitable, but must be explicitly provided, for example: |
| 424 | |
| 425 | check_out_of_range<extreme_value_distribution<RealType> >(1, 2); |
| 426 | |
| 427 | Checks made are: |
| 428 | |
| 429 | * Infinity or NaN (if available) passed in place of each of the valid params. |
| 430 | * Infinity or NaN (if available) as a random variable. |
| 431 | * Out-of-range random variable passed to pdf and cdf |
| 432 | (ie outside of "range(DistributionType)"). |
| 433 | * Out-of-range probability passed to quantile function and complement. |
| 434 | |
| 435 | but does *not* check finite but out-of-range parameters to the constructor |
| 436 | because these are specific to each distribution, for example: |
| 437 | |
| 438 | BOOST_CHECK_THROW(pdf(pareto_distribution<RealType>(0, 1), 0), std::domain_error); |
| 439 | BOOST_CHECK_THROW(pdf(pareto_distribution<RealType>(1, 0), 0), std::domain_error); |
| 440 | |
| 441 | checks that the `scale` and `shape` parameters are both > 0 |
| 442 | by checking that a `domain_error` exception is thrown if either is == 0. |
| 443 | |
| 444 | (Use of `check_out_of_range` function may mean that some previous tests are now redundant). |
| 445 | |
| 446 | It was also noted that if more than one parameter is bad, |
| 447 | then only the first detected will be reported by the error message. |
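The idea behind `check_out_of_range` can be sketched without Boost.Test. The code below is a simplified illustration with hypothetical names, not the actual helper in `test_out_of_range.hpp`. It covers only the first kind of check, substituting infinity and NaN for each constructor parameter in turn:

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical two-parameter distribution whose constructor validates its arguments.
struct location_scale_sketch
{
    location_scale_sketch(double location, double scale)
        : location_(location), scale_(scale)
    {
        if (!std::isfinite(location))
            throw std::domain_error("location must be finite");
        if (!std::isfinite(scale) || scale <= 0)
            throw std::domain_error("scale must be finite and > 0");
    }
    double location_, scale_;
};

// True if constructing Distribution(a, b) throws std::domain_error.
template <class Distribution>
bool construction_throws(double a, double b)
{
    try { Distribution d(a, b); (void)d; }
    catch (const std::domain_error&) { return true; }
    return false;
}

// Start from valid parameters, then substitute infinity and NaN for each one in turn.
template <class Distribution>
bool rejects_non_finite_parameters(double a, double b)
{
    assert(!construction_throws<Distribution>(a, b));   // inputs themselves must be valid
    const double bad[] = { std::numeric_limits<double>::infinity(),
                           -std::numeric_limits<double>::infinity(),
                           std::numeric_limits<double>::quiet_NaN() };
    for (int i = 0; i < 3; ++i)
    {
        if (!construction_throws<Distribution>(bad[i], b)) return false;
        if (!construction_throws<Distribution>(a, bad[i])) return false;
    }
    return true;
}
```

As with the real helper, finite but out-of-range parameters (such as a zero scale) are specific to each distribution and still need their own explicit tests, for example `construction_throws<location_scale_sketch>(1, 0)`.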
| 448 | |
| 449 | [h4 Creating and Managing the Equations] |
| 450 | |
| 451 | Equations that fit on a single line can most easily be produced by inline Quickbook code |
| 452 | using templates for Unicode Greek and Unicode Math symbols. |
| 453 | All Greek letters and a small set of Math symbols are available at |
| 454 | /boost-path/libs/math/doc/sf_and_dist/html4_symbols.qbk |
| 455 | |
| 456 | Where equations need to use more than one line, real Math editors were used. |
| 457 | |
| 458 | The primary source for the equations is now |
| 459 | [@http://www.w3.org/Math/ MathML]: see the |
| 460 | *.mml files in libs\/math\/doc\/sf_and_dist\/equations\/. |
| 461 | |
| 462 | These are most easily edited by a GUI editor such as |
| 463 | [@http://mathcast.sourceforge.net/home.html Mathcast]. |
| 464 | Please note that the equation editor supplied with Open Office |
| 465 | currently mangles these files and should not be used. |
| 466 | |
| 467 | Conversion to SVG was achieved using |
| 468 | [@https://sourceforge.net/projects/svgmath/ SVGMath] and a command line |
| 469 | such as: |
| 470 | |
| 471 | [pre |
| 472 | $for file in *.mml; do |
| 473 | >/cygdrive/c/Python25/python.exe 'C:\download\open\SVGMath-0.3.1\math2svg.py' \\ |
| 474 | >>$file > $(basename $file .mml).svg |
| 475 | >done |
| 476 | ] |
| 477 | |
| 478 | See also the section on "Using Python to run Inkscape" and |
| 479 | "Using inkscape to convert scalable vector SVG files to Portable Network graphic PNG". |
| 480 | |
| 481 | Note that SVGMath requires that the mml files are *not* wrapped in an XHTML |
| 482 | XML wrapper - this is added by Mathcast by default - one workaround is to |
| 483 | copy an existing mml file and then edit it with Mathcast: the existing |
| 484 | format should then be preserved. This is a bug in the XML parser used by |
| 485 | SVGMath which the author is aware of. |
| 486 | |
| 487 | If necessary the XHTML wrapper can be removed with: |
| 488 | |
| 489 | [pre cat filename | tr -d "\\r\\n" \| sed -e 's\/.*\\(<math\[^>\]\*>.\*<\/math>\\).\*\/\\1\/' > newfile] |
| 490 | |
| 491 | Setting up fonts for SVGMath is currently rather tricky, on a Windows XP system |
| 492 | JM's font setup is the same as the sample config file provided with SVGMath |
| 493 | but with: |
| 494 | |
| 495 | [pre |
| 496 | <!\-\- Double\-struck \-\-> |
| 497 | <mathvariant name\="double\-struck" family\="Mathematica7, Lucida Sans Unicode"\/> |
| 498 | ] |
| 499 | |
| 500 | changed to: |
| 501 | |
| 502 | [pre |
| 503 | <!\-\- Double\-struck \-\-> |
| 504 | <mathvariant name\="double\-struck" family\="Lucida Sans Unicode"\/> |
| 505 | ] |
| 506 | |
| 507 | Note that unlike the sample config file supplied with SVGMath, this does not |
| 508 | make use of the [@http://support.wolfram.com/technotes/fonts/windows/latestfonts.html Mathematica 7 font] |
| 509 | as this lacks sufficient Unicode information |
| 510 | for it to be used with either SVGMath or XEP "as is". |
| 511 | |
| 512 | Also note that the SVG files in the repository are almost certainly |
| 513 | Windows-specific since they reference various Windows Fonts. |
| 514 | |
| 515 | PNG files can be created from the SVGs using |
| 516 | [@http://xmlgraphics.apache.org/batik/tools/rasterizer.html Batik] |
| 517 | and a command such as: |
| 518 | |
| 519 | [pre java -jar 'C:\download\open\batik-1.7\batik-rasterizer.jar' -dpi 120 *.svg] |
| 520 | |
| 521 | Or using Inkscape (File, Export bitmap, Drawing tab, bitmap size (default size, 100 dpi), Filename (default).png) |
| 522 | |
| 523 | or, using Cygwin, a command such as: |
| 524 | |
| 525 | [pre for file in *.svg; do |
| 526 | /cygdrive/c/progra~1/Inkscape/inkscape -d 120 -e $(cygpath -a -w $(basename $file .svg).png) $(cygpath -a -w $file); |
| 527 | done] |
| 528 | |
| 529 | Using BASH |
| 530 | |
| 531 | [pre # Convert single SVG to PNG file. |
| 532 | # /c/progra~1/Inkscape/inkscape -d 120 -e a.png a.svg |
| 533 | ] |
| 534 | |
| 535 | or to convert all SVG files in a folder to PNG: |
| 536 | |
| 537 | [pre |
| 538 | for file in *.svg; do |
| 539 | /c/progra~1/Inkscape/inkscape -d 120 -e $(basename $file .svg).png $file |
| 540 | done |
| 541 | ] |
| 542 | |
| 543 | Currently Inkscape seems to generate the better looking PNGs. |
| 544 | |
| 545 | The PDF is generated into \pdf\math.pdf |
| 546 | using a command from a shell or command window with current directory |
| 547 | \math_toolkit\libs\math\doc\sf_and_dist, typically: |
| 548 | |
| 549 | [pre bjam -a pdf >math_pdf.log] |
| 550 | |
| 551 | Note that XEP will have to be configured to *use and embed* |
| 552 | whatever fonts are used by the SVG equations |
| 553 | (almost certainly editing the sample xep.xml provided by the XEP installation). |
| 554 | If you fail to do this you will get XEP warnings in the log file like |
| 555 | |
| 556 | [pre \[warning\]could not find any font family matching "Times New Roman"; replaced by Helvetica] |
| 557 | |
| 558 | (html is the default so it is generated at libs\math\doc\html\index.html |
| 559 | using command line >bjam -a > math_toolkit.docs.log). |
| 560 | |
| 561 | <!-- Sample configuration for Windows TrueType fonts. --> |
| 562 | is provided in the xep.xml downloaded, but the Windows TrueType fonts are commented out. |
| 563 | |
| 564 | JM's XEP config file \xep\xep.xml has the following font configuration section added: |
| 565 | |
| 566 | [pre |
| 567 | <font\-group xml:base\="file:\/C:\/Windows\/Fonts\/" label\="Windows TrueType" embed\="true" subset\="true"> |
| 568 | <font\-family name\="Arial"> |
| 569 | <font><font\-data ttf\="arial.ttf"\/><\/font> |
| 570 | <font style\="oblique"><font\-data ttf\="ariali.ttf"\/><\/font> |
| 571 | <font weight\="bold"><font\-data ttf\="arialbd.ttf"\/><\/font> |
| 572 | <font weight\="bold" style\="oblique"><font\-data ttf\="arialbi.ttf"\/><\/font> |
| 573 | <\/font\-family> |
| 574 | |
| 575 | <font\-family name\="Times New Roman" ligatures\="fi fl"> |
| 576 | <font><font\-data ttf\="times.ttf"\/><\/font> |
| 577 | <font style\="italic"><font\-data ttf\="timesi.ttf"\/><\/font> |
| 578 | <font weight\="bold"><font\-data ttf\="timesbd.ttf"\/><\/font> |
| 579 | <font weight\="bold" style\="italic"><font\-data ttf\="timesbi.ttf"\/><\/font> |
| 580 | <\/font\-family> |
| 581 | |
| 582 | <font\-family name\="Courier New"> |
| 583 | <font><font\-data ttf\="cour.ttf"\/><\/font> |
| 584 | <font style\="oblique"><font\-data ttf\="couri.ttf"\/><\/font> |
| 585 | <font weight\="bold"><font\-data ttf\="courbd.ttf"\/><\/font> |
| 586 | <font weight\="bold" style\="oblique"><font\-data ttf\="courbi.ttf"\/><\/font> |
| 587 | <\/font\-family> |
| 588 | |
| 589 | <font\-family name\="Tahoma" embed\="true"> |
| 590 | <font><font\-data ttf\="tahoma.ttf"\/><\/font> |
| 591 | <font weight\="bold"><font\-data ttf\="tahomabd.ttf"\/><\/font> |
| 592 | <\/font\-family> |
| 593 | |
| 594 | <font\-family name\="Verdana" embed\="true"> |
| 595 | <font><font\-data ttf\="verdana.ttf"\/><\/font> |
| 596 | <font style\="oblique"><font\-data ttf\="verdanai.ttf"\/><\/font> |
| 597 | <font weight\="bold"><font\-data ttf\="verdanab.ttf"\/><\/font> |
| 598 | <font weight\="bold" style\="oblique"><font\-data ttf\="verdanaz.ttf"\/><\/font> |
| 599 | <\/font\-family> |
| 600 | |
| 601 | <font\-family name\="Palatino" embed\="true" ligatures\="ff fi fl ffi ffl"> |
| 602 | <font><font\-data ttf\="pala.ttf"\/><\/font> |
| 603 | <font style\="italic"><font\-data ttf\="palai.ttf"\/><\/font> |
| 604 | <font weight\="bold"><font\-data ttf\="palab.ttf"\/><\/font> |
| 605 | <font weight\="bold" style\="italic"><font\-data ttf\="palabi.ttf"\/><\/font> |
| 606 | <\/font\-family> |
| 607 | |
| 608 | <font-family name="Lucida Sans Unicode"> |
| 609 | <!-- <font><font-data ttf="lsansuni.ttf"></font> --> |
| 610 | <!-- actually called l_10646.ttf on Windows 2000 and Vista Sp1 --> |
| 611 | <font><font-data ttf="l_10646.ttf"/></font> |
| 612 | </font-family> |
| 613 | ] |
| 614 | |
| 615 | PAB had to alter his because the Lucida Sans Unicode font had a different name. |
| 616 | Other changes are very likely to be required if you are not using Windows. |
| 617 | |
| 618 | XZ authored his equations using the venerable LaTeX; JM converted these to |
| 619 | MathML using [@http://gentoo-wiki.com/HOWTO_Convert_LaTeX_to_HTML_with_MathML mxlatex]. |
| 620 | This process is currently unreliable and required some manual intervention: |
| 621 | consequently LaTeX source is not considered a viable route for the automatic |
| 622 | production of SVG versions of equations. |
| 623 | |
| 624 | Equations are embedded in the quickbook source using the /equation/ |
| 625 | template defined in math.qbk. This outputs Docbook XML that looks like: |
| 626 | |
| 627 | [pre |
| 628 | <inlinemediaobject> |
| 629 | <imageobject role="html"> |
| 630 | <imagedata fileref="../equations/myfile.png"></imagedata> |
| 631 | </imageobject> |
| 632 | <imageobject role="print"> |
| 633 | <imagedata fileref="../equations/myfile.svg"></imagedata> |
| 634 | </imageobject> |
| 635 | </inlinemediaobject> |
| 636 | ] |
| 637 | |
| 638 | MathML is not currently present in the Docbook output, or in the |
| 639 | generated HTML: this needs further investigation. |
| 640 | |
| 641 | [h4 Producing Graphs] |
| 642 | |
| 643 | Graphs were produced in SVG format and then converted to PNGs using the same |
| 644 | process as the equations. |
| 645 | |
| 646 | The programs |
| 647 | `/libs/math/doc/sf_and_dist/graphs/dist_graphs.cpp` |
| 648 | and `/libs/math/doc/sf_and_dist/graphs/sf_graphs.cpp` |
| 649 | generate the SVGs directly using the |
| 650 | [@http://code.google.com/soc/2007/boost/about.html Google Summer of Code 2007] |
| 651 | project of Jacob Voytko (whose work, since |
| 652 | considerably enhanced by Paul A. Bristow and now reasonably mature and usable, |
| 653 | is at .\boost-sandbox\SOC\2007\visualization). |
| 654 | |
| 655 | [endsect] [/section:sf_implementation Implementation Notes] |
| 656 | |
| 657 | [/ |
| 658 | Copyright 2006, 2007, 2010 John Maddock and Paul A. Bristow. |
| 659 | Distributed under the Boost Software License, Version 1.0. |
| 660 | (See accompanying file LICENSE_1_0.txt or copy at |
| 661 | http://www.boost.org/LICENSE_1_0.txt). |
| 662 | ] |
| 663 | |
| 664 | |