Blame - third_party/gmp/doc/gmp.texi - RealtimeRoboticsGroup/test

Austin Schuh	dace2a6	2020-08-18 10:56:48 -0700	[diff] [blame]	1	\input texinfo @c --texinfo--
				2	@c %**start of header
				3	@setfilename gmp.info
				4	@documentencoding ISO-8859-1
				5	@include version.texi
				6	@settitle GNU MP @value{VERSION}
				7	@synindex tp fn
				8	@iftex
				9	@afourpaper
				10	@end iftex
				11	@comment %**end of header
				12
				13	@copying
				14	This manual describes how to install and use the GNU multiple precision
				15	arithmetic library, version @value{VERSION}.
				16
				17	Copyright 1991, 1993-2016, 2018 Free Software Foundation, Inc.
				18
				19	Permission is granted to copy, distribute and/or modify this document under
				20	the terms of the GNU Free Documentation License, Version 1.3 or any later
				21	version published by the Free Software Foundation; with no Invariant Sections,
				22	with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
				23	Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
				24	software''. A copy of the license is included in
				25	@ref{GNU Free Documentation License}.
				26	@end copying
				27	@c Note the @ref above must be on one line, a line break in an @ref within
				28	@c @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
				29	@c with texinfo 4.7), with messages about missing @endcsname.
				30
				31
				32	@c Texinfo version 4.2 or up will be needed to process this file.
				33	@c
				34	@c The version number and edition number are taken from version.texi provided
				35	@c by automake (note that it's regenerated only if you configure with
				36	@c --enable-maintainer-mode).
				37	@c
				38	@c Notes discussing the present version number of GMP in relation to previous
				39	@c ones (for instance in the "Compatibility" section) must be updated
				40	@c manually though.
				41	@c
				42	@c @cindex entries have been made for function categories and programming
				43	@c topics. The "mpn" section is not included in this, because a beginner
				44	@c looking for "GCD" or something is only going to be confused by pointers to
				45	@c low level routines.
				46	@c
				47	@c @cindex entries are present for processors and systems when there's
				48	@c particular notes concerning them, but not just for everything GMP
				49	@c supports.
				50	@c
				51	@c Index entries for files use @code rather than @file, @samp or @option,
				52	@c since the latter come out with quotes in TeX, which are nice in the text
				53	@c but don't look so good in index columns.
				54	@c
				55	@c Tex:
				56	@c
				57	@c A suitable texinfo.tex is supplied, a newer one should work equally well.
				58	@c
				59	@c HTML:
				60	@c
				61	@c Nothing special is done for links to external manuals, they just come out
				62	@c in the usual makeinfo style, eg. "../libc/Locales.html". If you have
				63	@c local copies of such manuals then this is a good thing, if not then you
				64	@c may want to search-and-replace to some online source.
				65	@c
				66
				67	@dircategory GNU libraries
				68	@direntry
				69	* gmp: (gmp). GNU Multiple Precision Arithmetic Library.
				70	@end direntry
				71
				72	@c html <meta name="description" content="...">
				73	@documentdescription
				74	How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
				75	@end documentdescription
				76
				77	@c smallbook
				78	@finalout
				79	@setchapternewpage on
				80
				81	@ifnottex
				82	@node Top, Copying, (dir), (dir)
				83	@top GNU MP
				84	@end ifnottex
				85
				86	@iftex
				87	@titlepage
				88	@title GNU MP
				89	@subtitle The GNU Multiple Precision Arithmetic Library
				90	@subtitle Edition @value{EDITION}
				91	@subtitle @value{UPDATED}
				92
				93	@author by Torbj@"orn Granlund and the GMP development team
				94	@c @email{tg@@gmplib.org}
				95
				96	@c Include the Distribution inside the titlepage so
				97	@c that headings are turned off.
				98
				99	@tex
				100	\global\parindent=0pt
				101	\global\parskip=8pt
				102	\global\baselineskip=13pt
				103	@end tex
				104
				105	@page
				106	@vskip 0pt plus 1filll
				107	@end iftex
				108
				109	@insertcopying
				110	@ifnottex
				111	@sp 1
				112	@end ifnottex
				113
				114	@iftex
				115	@end titlepage
				116	@headings double
				117	@end iftex
				118
				119	@c Don't bother with contents for html, the menus seem adequate.
				120	@ifnothtml
				121	@contents
				122	@end ifnothtml
				123
				124	@menu
				125	* Copying:: GMP Copying Conditions (LGPL).
				126	* Introduction to GMP:: Brief introduction to GNU MP.
				127	* Installing GMP:: How to configure and compile the GMP library.
				128	* GMP Basics:: What every GMP user should know.
				129	* Reporting Bugs:: How to usefully report bugs.
				130	* Integer Functions:: Functions for arithmetic on signed integers.
				131	* Rational Number Functions:: Functions for arithmetic on rational numbers.
				132	* Floating-point Functions:: Functions for arithmetic on floats.
				133	* Low-level Functions:: Fast functions for natural numbers.
				134	* Random Number Functions:: Functions for generating random numbers.
				135	* Formatted Output:: @code{printf} style output.
				136	* Formatted Input:: @code{scanf} style input.
				137	* C++ Class Interface:: Class wrappers around GMP types.
				138	* Custom Allocation:: How to customize the internal allocation.
				139	* Language Bindings:: Using GMP from other languages.
				140	* Algorithms:: What happens behind the scenes.
				141	* Internals:: How values are represented behind the scenes.
				142
				143	* Contributors:: Who brings you this library?
				144	* References:: Some useful papers and books to read.
				145	* GNU Free Documentation License::
				146	* Concept Index::
				147	* Function Index::
				148	@end menu
				149
				150
				151	@c @m{T,N} is $T$ in tex or @math{N} otherwise. Commas in N or T don't work,
				152	@c but @C{} can be used instead.
				153	@iftex
				154	@macro m {T,N}
				155	@tex$\T\$@end tex
				156	@end macro
				157	@end iftex
				158	@ifnottex
				159	@macro m {T,N}
				160	@math{\N\}
				161	@end macro
				162	@end ifnottex
				163
				164	@c @mm{T,N} is $T$ in tex and html and @math{N} in info. Commas in N or T don't
				165	@c work, but @C{} can be used instead.
				166	@iftex
				167	@macro mm {T,N}
				168	@tex$\T\$@end tex
				169	@end macro
				170	@end iftex
				171
				172	@ifhtml
				173	@macro mm {T,N}
				174	@math{\T\}
				175	@end macro
				176	@end ifhtml
				177
				178	@ifinfo
				179	@macro mm {T,N}
				180	@math{\N\}
				181	@end macro
				182	@end ifinfo
				183
				184
				185	@macro C {}
				186	,
				187	@end macro
				188
				189	@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple
				190	@c subscripts like @ms{x,0}.
				191	@iftex
				192	@macro ms {V,N}
				193	@tex$\V\_{\N\}$@end tex
				194	@end macro
				195	@end iftex
				196	@ifnottex
				197	@macro ms {V,N}
				198	\V\\N\
				199	@end macro
				200	@end ifnottex
				201
				202	@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used
				203	@c when the quotes that @code{} gives in info aren't wanted, but the
				204	@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'}
				205	@c though (gives two backslashes in tex).
				206	@ifinfo
				207	@macro nicode {S}
				208	\S\
				209	@end macro
				210	@end ifinfo
				211	@ifnotinfo
				212	@macro nicode {S}
				213	@code{\S\}
				214	@end macro
				215	@end ifnotinfo
				216
				217	@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used
				218	@c when the quotes that @samp{} gives in info aren't wanted, but the
				219	@c fontification in tex or html is wanted.
				220	@ifinfo
				221	@macro nisamp {S}
				222	\S\
				223	@end macro
				224	@end ifinfo
				225	@ifnotinfo
				226	@macro nisamp {S}
				227	@samp{\S\}
				228	@end macro
				229	@end ifnotinfo
				230
				231	@c Usage: @GMPtimes{}
				232	@c Give either \times or the word "times".
				233	@tex
				234	\gdef\GMPtimes{\times}
				235	@end tex
				236	@ifnottex
				237	@macro GMPtimes
				238	times
				239	@end macro
				240	@end ifnottex
				241
				242	@c Usage: @GMPmultiply{}
				243	@c Give * in info, or nothing in tex.
				244	@tex
				245	\gdef\GMPmultiply{}
				246	@end tex
				247	@ifnottex
				248	@macro GMPmultiply
				249	*
				250	@end macro
				251	@end ifnottex
				252
				253	@c Usage: @GMPabs{x}
				254	@c Give either \|x\| in tex, or abs(x) in info or html.
				255	@tex
				256	\gdef\GMPabs#1{\|#1\|}
				257	@end tex
				258	@ifnottex
				259	@macro GMPabs {X}
				260	@abs{}(\X\)
				261	@end macro
				262	@end ifnottex
				263
				264	@c Usage: @GMPfloor{x}
				265	@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
				266	@tex
				267	\gdef\GMPfloor#1{\lfloor #1\rfloor}
				268	@end tex
				269	@ifnottex
				270	@macro GMPfloor {X}
				271	floor(\X\)
				272	@end macro
				273	@end ifnottex
				274
				275	@c Usage: @GMPceil{x}
				276	@c Give either \lceil x\rceil in tex, or ceil(x) in info or html.
				277	@tex
				278	\gdef\GMPceil#1{\lceil #1 \rceil}
				279	@end tex
				280	@ifnottex
				281	@macro GMPceil {X}
				282	ceil(\X\)
				283	@end macro
				284	@end ifnottex
				285
				286	@c Math operators already available in tex, made available in info too.
				287	@c For example @bmod{} can be used in both tex and info.
				288	@ifnottex
				289	@macro bmod
				290	mod
				291	@end macro
				292	@macro gcd
				293	gcd
				294	@end macro
				295	@macro ge
				296	>=
				297	@end macro
				298	@macro le
				299	<=
				300	@end macro
				301	@macro log
				302	log
				303	@end macro
				304	@macro min
				305	min
				306	@end macro
				307	@macro leftarrow
				308	<-
				309	@end macro
				310	@macro rightarrow
				311	->
				312	@end macro
				313	@end ifnottex
				314
				315	@c New math operators.
				316	@c @abs{} can be used in both tex and info, or just \abs in tex.
				317	@tex
				318	\gdef\abs{\mathop{\rm abs}}
				319	@end tex
				320	@ifnottex
				321	@macro abs
				322	abs
				323	@end macro
				324	@end ifnottex
				325
				326	@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works
				327	@c inside or outside $ $.
				328	@tex
				329	\gdef\cross{\ifmmode\times\else$\times$\fi}
				330	@end tex
				331	@ifnottex
				332	@macro cross
				333	x
				334	@end macro
				335	@end ifnottex
				336
				337	@c @times{} made available as a "*" in info and html (already works in tex).
				338	@ifnottex
				339	@macro times
				340	*
				341	@end macro
				342	@end ifnottex
				343
				344	@c Usage: @W{text}
				345	@c Like @w{} but working in math mode too.
				346	@tex
				347	\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
				348	@end tex
				349	@ifnottex
				350	@macro W {S}
				351	@w{\S\}
				352	@end macro
				353	@end ifnottex
				354
				355	@c Usage: \GMPdisplay{text}
				356	@c Put the given text in an @display style indent, but without turning off
				357	@c paragraph reflow etc.
				358	@tex
				359	\gdef\GMPdisplay#1{%
				360	\noindent
				361	\advance\leftskip by \lispnarrowing
				362	#1\par}
				363	@end tex
				364
				365	@c Usage: \GMPhat
				366	@c A new \hat that will work in math mode, unlike the texinfo redefined
				367	@c version.
				368	@tex
				369	\gdef\GMPhat{\mathaccent"705E}
				370	@end tex
				371
				372	@c Usage: \GMPraise{text}
				373	@c For use in a $ $ math expression as an alternative to "^". This is good
				374	@c for @code{} in an exponent, since there seems to be no superscript font
				375	@c for that.
				376	@tex
				377	\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
				378	@end tex
				379
				380	@c Usage: @texlinebreak{}
				381	@c A line break as per @*, but only in tex.
				382	@iftex
				383	@macro texlinebreak
				384	@*
				385	@end macro
				386	@end iftex
				387	@ifnottex
				388	@macro texlinebreak
				389	@end macro
				390	@end ifnottex
				391
				392	@c Usage: @maybepagebreak
				393	@c Allow tex to insert a page break, if it feels the urge.
				394	@c Normally blocks of @deftypefun/funx are kept together, which can lead to
				395	@c some poor page break positioning if it's a big block, like the sets of
				396	@c division functions etc.
				397	@tex
				398	\gdef\maybepagebreak{\penalty0}
				399	@end tex
				400	@ifnottex
				401	@macro maybepagebreak
				402	@end macro
				403	@end ifnottex
				404
				405	@c Usage: @GMPreftop{info,title}
				406	@c Usage: @GMPpxreftop{info,title}
				407	@c
				408	@c Like @ref{} and @pxref{}, but designed for a reference to the top of a
				409	@c document, not a particular section. The TeX output for plain @ref insists
				410	@c on printing a particular section, GMPreftop gives just the title.
				411	@c
				412	@c The texinfo manual recommends putting a likely section name in references
				413	@c like this, eg. "Introduction", but it seems better to just give the title.
				414	@c
				415	@iftex
				416	@macro GMPreftop{info,title}
				417	@i{\title\}
				418	@end macro
				419	@macro GMPpxreftop{info,title}
				420	see @i{\title\}
				421	@end macro
				422	@end iftex
				423	@c
				424	@ifnottex
				425	@macro GMPreftop{info,title}
				426	@ref{Top,\title\,\title\,\info\,\title\}
				427	@end macro
				428	@macro GMPpxreftop{info,title}
				429	@pxref{Top,\title\,\title\,\info\,\title\}
				430	@end macro
				431	@end ifnottex
				432
				433
				434	@node Copying, Introduction to GMP, Top, Top
				435	@comment node-name, next, previous, up
				436	@unnumbered GNU MP Copying Conditions
				437	@cindex Copying conditions
				438	@cindex Conditions for copying GNU MP
				439	@cindex License conditions
				440
				441	This library is @dfn{free}; this means that everyone is free to use it and
				442	free to redistribute it on a free basis. The library is not in the public
				443	domain; it is copyrighted and there are restrictions on its distribution, but
				444	these restrictions are designed to permit everything that a good cooperating
				445	citizen would want to do. What is not allowed is to try to prevent others
				446	from further sharing any version of this library that they might get from
				447	you.@refill
				448
				449	Specifically, we want to make sure that you have the right to give away copies
				450	of the library, that you receive source code or else can get it if you want
				451	it, that you can change this library or use pieces of it in new free programs,
				452	and that you know you can do these things.@refill
				453
				454	To make sure that everyone has such rights, we have to forbid you to deprive
				455	anyone else of these rights. For example, if you distribute copies of the GNU
				456	MP library, you must give the recipients all the rights that you have. You
				457	must make sure that they, too, receive or can get the source code. And you
				458	must tell them their rights.@refill
				459
				460	Also, for our own protection, we must make certain that everyone finds out
				461	that there is no warranty for the GNU MP library. If it is modified by
				462	someone else and passed on, we want their recipients to know that what they
				463	have is not what we distributed, so that any problems introduced by others
				464	will not reflect on our reputation.@refill
				465
				466	More precisely, the GNU MP library is dual licensed, under the conditions of
				467	the GNU Lesser General Public License version 3 (see
				468	@file{COPYING.LESSERv3}), or the GNU General Public License version 2 (see
				469	@file{COPYINGv2}). This is the recipient's choice, and the recipient also has
				470	the additional option of applying later versions of these licenses. (The
				471	reason for this dual licensing is to make it possible to use the library with
				472	programs which are licensed under GPL version 2, but which for historical or
				473	other reasons do not allow use under later versions of the GPL.)
				474
				475	Programs which are not part of the library itself, such as demonstration
				476	programs and the GMP testsuite, are licensed under the terms of the GNU
				477	General Public License version 3 (see @file{COPYINGv3}), or any later
				478	version.
				479
				480
				481	@node Introduction to GMP, Installing GMP, Copying, Top
				482	@comment node-name, next, previous, up
				483	@chapter Introduction to GNU MP
				484	@cindex Introduction
				485
				486	GNU MP is a portable library written in C for arbitrary precision arithmetic
				487	on integers, rational numbers, and floating-point numbers. It aims to provide
				488	the fastest possible arithmetic for all applications that need higher
				489	precision than is directly supported by the basic C types.
				490
				491	Many applications use just a few hundred bits of precision; but some
				492	applications may need thousands or even millions of bits. GMP is designed to
				493	give good performance for both, by choosing algorithms based on the sizes of
				494	the operands, and by carefully keeping the overhead at a minimum.
				495
				496	The speed of GMP is achieved by using fullwords as the basic arithmetic type,
				497	by using sophisticated algorithms, by including carefully optimized assembly
				498	code for the most common inner loops for many different CPUs, and by a general
				499	emphasis on speed (as opposed to simplicity or elegance).
				500
				501	There is assembly code for these CPUs:
				502	@cindex CPU types
				503	ARM Cortex-A9, Cortex-A15, and generic ARM,
				504	DEC Alpha 21064, 21164, and 21264,
				505	AMD K8 and K10 (sold under many brands, e.g. Athlon64, Phenom, Opteron),
				506	Bulldozer, and Bobcat,
				507	Intel Pentium, Pentium Pro/II/III, Pentium 4, Core2, Nehalem, Sandy Bridge, Haswell, generic x86,
				508	Intel IA-64,
				509	Motorola/IBM PowerPC 32 and 64 such as POWER970, POWER5, POWER6, and POWER7,
				510	MIPS 32-bit and 64-bit,
				511	SPARC 32-bit and 64-bit with special support for all UltraSPARC models.
				512	There is also assembly code for many obsolete CPUs.
				513
				514
				515	@cindex Home page
				516	@cindex Web page
				517	@noindent
				518	For up-to-date information on GMP, please see the GMP web pages at
				519
				520	@display
				521	@uref{https://gmplib.org/}
				522	@end display
				523
				524	@cindex Latest version of GMP
				525	@cindex Anonymous FTP of latest version
				526	@cindex FTP of latest version
				527	@noindent
				528	The latest version of the library is available at
				529
				530	@display
				531	@uref{https://ftp.gnu.org/gnu/gmp/}
				532	@end display
				533
				534	Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
				535	near you. See @uref{https://www.gnu.org/order/ftp.html} for a full list.
				536
				537	@cindex Mailing lists
				538	There are three public mailing lists of interest: one for release
				539	announcements, one for general questions and discussions about usage of the GMP
				540	library, and one for bug reports. For more information, see
				541
				542	@display
				543	@uref{https://gmplib.org/mailman/listinfo/}
				544	@end display
				545
				546	The proper place for bug reports is @email{gmp-bugs@@gmplib.org}. See
				547	@ref{Reporting Bugs} for information about reporting bugs.
				548
				549	@sp 1
				550	@section How to use this Manual
				551	@cindex About this manual
				552
				553	Everyone should read @ref{GMP Basics}. If you need to install the library
				554	yourself, then read @ref{Installing GMP}. If you have a system with multiple
				555	ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
				556	on applications.
				557
				558	The rest of the manual can be used for later reference, although it is
				559	probably a good idea to glance through it.
				560
				561
				562	@node Installing GMP, GMP Basics, Introduction to GMP, Top
				563	@comment node-name, next, previous, up
				564	@chapter Installing GMP
				565	@cindex Installing GMP
				566	@cindex Configuring GMP
				567	@cindex Building GMP
				568
				569	GMP has an autoconf/automake/libtool based configuration system. On a
				570	Unix-like system a basic build can be done with
				571
				572	@example
				573	./configure
				574	make
				575	@end example
				576
				577	@noindent
				578	Some self-tests can be run with
				579
				580	@example
				581	make check
				582	@end example
				583
				584	@noindent
				585	And you can install (under @file{/usr/local} by default) with
				586
				587	@example
				588	make install
				589	@end example
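
The build, test and install steps above can be collected into one script. The following is a minimal dry-run sketch that only prints the commands so they can be reviewed first. The helper name gmp_build_steps and the path /opt/gmp are assumptions made for illustration and are not part of GMP.

```shell
#!/bin/sh
# Dry-run sketch: print the basic GMP build steps rather than running them.
# gmp_build_steps is a hypothetical helper, not something GMP provides.
gmp_build_steps() {
    prefix=${1:-/usr/local}   # install tree; /usr/local is the documented default
    echo "./configure --prefix=$prefix"
    echo "make"
    echo "make check"
    echo "make install"
}

# Print the steps for an install under /opt/gmp (an assumed example path).
gmp_build_steps /opt/gmp
```

Piping the output to a shell from the top of the GMP source tree would run the steps in order. Reviewing the printed commands first is the safer habit, since the last step writes to the install tree.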
				590
				591	If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}.
				592	See @ref{Reporting Bugs}, for information on what to include in useful bug
				593	reports.
				594
				595	@menu
				596	* Build Options::
				597	* ABI and ISA::
				598	* Notes for Package Builds::
				599	* Notes for Particular Systems::
				600	* Known Build Problems::
				601	* Performance optimization::
				602	@end menu
				603
				604
				605	@node Build Options, ABI and ISA, Installing GMP, Installing GMP
				606	@section Build Options
				607	@cindex Build options
				608
				609	All the usual autoconf configure options are available; run @samp{./configure
				610	--help} for a summary. The file @file{INSTALL.autoconf} has some generic
				611	installation information too.
				612
				613	@table @asis
				614	@item Tools
				615	@cindex Non-Unix systems
				616	@samp{configure} requires various Unix-like tools. See @ref{Notes for
				617	Particular Systems}, for some options on non-Unix systems.
				618
				619	It might be possible to build without the help of @samp{configure}, certainly
				620	all the code is there, but unfortunately you'll be on your own.
				621
				622	@item Build Directory
				623	@cindex Build directory
				624	To compile in a separate build directory, @command{cd} to that directory, and
				625	prefix the configure command with the path to the GMP source directory. For
				626	example
				627
				628	@example
				629	cd /my/build/dir
				630	/my/sources/gmp-@value{VERSION}/configure
				631	@end example
				632
				633	Not all @samp{make} programs have the necessary features (@code{VPATH}) to
				634	support this. In particular, SunOS and Solaris @command{make} have bugs that
				635	make them unable to build in a separate directory. Use GNU @command{make}
				636	instead.
				637
				638	@item @option{--prefix} and @option{--exec-prefix}
				639	@cindex Prefix
				640	@cindex Exec prefix
				641	@cindex Install prefix
				642	@cindex @code{--prefix}
				643	@cindex @code{--exec-prefix}
				644	The @option{--prefix} option can be used in the normal way to direct GMP to
				645	install under a particular tree. The default is @samp{/usr/local}.
				646
				647	@option{--exec-prefix} can be used to direct architecture-dependent files like
				648	@file{libgmp.a} to a different location. This can be used to share
				649	architecture-independent parts like the documentation, but separate the
				650	dependent parts. Note however that @file{gmp.h} is
				651	architecture-dependent since it encodes certain aspects of @file{libgmp}, so
				652	it will be necessary to ensure both @file{$prefix/include} and
				653	@file{$exec_prefix/include} are available to the compiler.
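
As a rough illustration of the include path point above, the following sketch prints the compiler flags an application might need when the two prefixes differ. The helper name include_flags and the example paths are hypothetical. Only the rule that both include directories must be visible to the compiler comes from this manual.

```shell
#!/bin/sh
# Sketch: derive -I flags covering both $prefix/include and $exec_prefix/include.
# include_flags is a hypothetical helper, not something GMP installs.
include_flags() {
    prefix=$1
    exec_prefix=${2:-$prefix}   # exec_prefix defaults to prefix when not given
    if [ "$prefix" = "$exec_prefix" ]; then
        # A single tree needs only one include directory.
        echo "-I$prefix/include"
    else
        # Split trees: the compiler must be able to see both directories.
        echo "-I$prefix/include -I$exec_prefix/include"
    fi
}

# Assumed example layout with a separate architecture-dependent tree.
include_flags /usr/local /usr/local/x86_64
```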
				654
				655	@item @option{--disable-shared}, @option{--disable-static}
				656	@cindex @code{--disable-shared}
				657	@cindex @code{--disable-static}
				658	By default both shared and static libraries are built (where possible), but
				659	one or other can be disabled. Shared libraries result in smaller executables
				660	and permit code sharing between separate running processes, but on some CPUs
				661	are slightly slower, having a small cost on each function call.
				662
				663	@item Native Compilation, @option{--build=CPU-VENDOR-OS}
				664	@cindex Native compilation
				665	@cindex Build system
				666	@cindex @code{--build}
				667	For normal native compilation, the system can be specified with
				668	@samp{--build}. By default @samp{./configure} uses the output from running
				669	@samp{./config.guess}. On some systems @samp{./config.guess} can determine
				670	the exact CPU type, on others it will be necessary to give it explicitly. For
				671	example,
				672
				673	@example
				674	./configure --build=ultrasparc-sun-solaris2.7
				675	@end example
				676
				677	In all cases the @samp{OS} part is important, since it controls how libtool
				678	generates shared libraries. Running @samp{./config.guess} is the simplest way
				679	to see what it should be, if you don't know already.
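
To make the CPU-VENDOR-OS form concrete, the sketch below splits a full system type into its three fields using plain POSIX parameter expansion. The helper name split_triplet is an assumption for illustration. It expects all three fields to be present and does not handle shortened aliases.

```shell
#!/bin/sh
# Sketch: split a CPU-VENDOR-OS system type into its three parts.
# split_triplet is a hypothetical helper for illustration only.
split_triplet() {
    triplet=$1
    cpu=${triplet%%-*}      # text before the first "-"
    rest=${triplet#*-}      # text after the first "-"
    vendor=${rest%%-*}      # text before the next "-"
    os=${rest#*-}           # everything else, which may itself contain "-"
    echo "cpu=$cpu vendor=$vendor os=$os"
}

# System types taken from the examples in this manual.
split_triplet ultrasparc-sun-solaris2.7
split_triplet m68k-mac-linux-gnu
```

Note that the OS field keeps any embedded dashes, so a type such as m68k-mac-linux-gnu yields linux-gnu as its OS part.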
				680
				681	@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
				682	@cindex Cross compiling
				683	@cindex Host system
				684	@cindex @code{--host}
				685	When cross-compiling, the system used for compiling is given by @samp{--build}
				686	and the system where the library will run is given by @samp{--host}. For
				687	example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,
				688
				689	@example
				690	./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
				691	@end example
				692
				693	Compiler tools are sought first with the host system type as a prefix. For
				694	example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
				695	@command{ranlib}. This makes it possible for a set of cross-compiling tools
				696	to co-exist with native tools. The prefix is the argument to @samp{--host},
				697	and this can be an alias, such as @samp{m68k-linux}. But note that tools
				698	don't have to be set up this way, it's enough to just have a @env{PATH} with a
				699	suitable cross-compiling @command{cc} etc.
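
The lookup order described above can be sketched as follows: the host-prefixed tool name is tried first, then the plain name. This is only an approximation of the idea. The helper name find_tool is hypothetical, and the real checks performed by configure are more involved.

```shell
#!/bin/sh
# Sketch of prefixed tool lookup: prefer HOST-TOOL, fall back to plain TOOL.
# find_tool is a hypothetical helper; it is not how configure is implemented.
find_tool() {
    host=$1
    tool=$2
    if command -v "$host-$tool" >/dev/null 2>&1; then
        echo "$host-$tool"          # a cross tool with the host prefix exists
    elif command -v "$tool" >/dev/null 2>&1; then
        echo "$tool"                # fall back to the unprefixed tool
    else
        echo "no $tool found for $host" >&2
        return 1
    fi
}

# Report which ranlib would be picked for the host used in the example above.
find_tool m68k-mac-linux-gnu ranlib || echo "ranlib is not available on this system"
```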
				700
				701	Compiling for a different CPU in the same family as the build system is a form
				702	of cross-compilation, though very possibly this would merely be special
				703	options on a native compiler. In any case @samp{./configure} avoids depending
				704	on being able to run code on the build system, which is important when
				705	creating binaries for a newer CPU since they very possibly won't run on the
				706	build system.
				707
				708	In all cases the compiler must be able to produce an executable (of whatever
				709	format) from a standard C @code{main}. Although only object files will go to
				710	make up @file{libgmp}, @samp{./configure} uses linking tests for various
				711	purposes, such as determining what functions are available on the host system.
				712
				713	Currently a warning is given unless an explicit @samp{--build} is used when
				714	cross-compiling, because it may not be possible to correctly guess the build
				715	system type if the @env{PATH} has only a cross-compiling @command{cc}.
				716
				717	Note that the @samp{--target} option is not appropriate for GMP@. It's for use
				718	when building compiler tools, with @samp{--host} being where they will run,
				719	and @samp{--target} what they'll produce code for. Ordinary programs or
				720	libraries like GMP are only interested in the @samp{--host} part, being where
				721	they'll run. (Some past versions of GMP used @samp{--target} incorrectly.)
				722
				723	@item CPU types
				724	@cindex CPU types
				725	In general, if you want a library that runs as fast as possible, you should
				726	configure GMP for the exact CPU type your system uses. However, this may mean
				727	the binaries won't run on older members of the family, and might run slower on
				728	other members, older or newer. The best idea is always to build GMP for the
				729	exact machine type you intend to run it on.
				730
				731	The following CPUs have specific support. See @file{configure.ac} for details
				732	of what code and compiler options they select.
				733
				734	@itemize @bullet
				735
				736	@c Keep this formatting, it's easy to read and it can be grepped to
				737	@c automatically test that CPUs listed get through ./config.sub
				738
				739	@item
				740	Alpha:
				741	@nisamp{alpha},
				742	@nisamp{alphaev5},
				743	@nisamp{alphaev56},
				744	@nisamp{alphapca56},
				745	@nisamp{alphapca57},
				746	@nisamp{alphaev6},
				747	@nisamp{alphaev67},
				748	@nisamp{alphaev68},
				749	@nisamp{alphaev7}
				750
				751	@item
				752	Cray:
				753	@nisamp{c90},
				754	@nisamp{j90},
				755	@nisamp{t90},
				756	@nisamp{sv1}
				757
				758	@item
				759	HPPA:
				760	@nisamp{hppa1.0},
				761	@nisamp{hppa1.1},
				762	@nisamp{hppa2.0},
				763	@nisamp{hppa2.0n},
				764	@nisamp{hppa2.0w},
				765	@nisamp{hppa64}
				766
				767	@item
				768	IA-64:
				769	@nisamp{ia64},
				770	@nisamp{itanium},
				771	@nisamp{itanium2}
				772
				773	@item
				774	MIPS:
				775	@nisamp{mips},
				776	@nisamp{mips3},
				777	@nisamp{mips64}
				778
				779	@item
				780	Motorola:
				781	@nisamp{m68k},
				782	@nisamp{m68000},
				783	@nisamp{m68010},
				784	@nisamp{m68020},
				785	@nisamp{m68030},
				786	@nisamp{m68040},
				787	@nisamp{m68060},
				788	@nisamp{m68302},
				789	@nisamp{m68360},
				790	@nisamp{m88k},
				791	@nisamp{m88110}
				792
				793	@item
				794	POWER:
				795	@nisamp{power},
				796	@nisamp{power1},
				797	@nisamp{power2},
				798	@nisamp{power2sc}
				799
				800	@item
				801	PowerPC:
				802	@nisamp{powerpc},
				803	@nisamp{powerpc64},
				804	@nisamp{powerpc401},
				805	@nisamp{powerpc403},
				806	@nisamp{powerpc405},
				807	@nisamp{powerpc505},
				808	@nisamp{powerpc601},
				809	@nisamp{powerpc602},
				810	@nisamp{powerpc603},
				811	@nisamp{powerpc603e},
				812	@nisamp{powerpc604},
				813	@nisamp{powerpc604e},
				814	@nisamp{powerpc620},
				815	@nisamp{powerpc630},
				816	@nisamp{powerpc740},
				817	@nisamp{powerpc7400},
				818	@nisamp{powerpc7450},
				819	@nisamp{powerpc750},
				820	@nisamp{powerpc801},
				821	@nisamp{powerpc821},
				822	@nisamp{powerpc823},
				823	@nisamp{powerpc860},
				824	@nisamp{powerpc970}
				825
				826	@item
				827	SPARC:
				828	@nisamp{sparc},
				829	@nisamp{sparcv8},
				830	@nisamp{microsparc},
				831	@nisamp{supersparc},
				832	@nisamp{sparcv9},
				833	@nisamp{ultrasparc},
				834	@nisamp{ultrasparc2},
				835	@nisamp{ultrasparc2i},
				836	@nisamp{ultrasparc3},
				837	@nisamp{sparc64}
				838
				839	@item
				840	x86 family:
				841	@nisamp{i386},
				842	@nisamp{i486},
				843	@nisamp{i586},
				844	@nisamp{pentium},
				845	@nisamp{pentiummmx},
				846	@nisamp{pentiumpro},
				847	@nisamp{pentium2},
				848	@nisamp{pentium3},
				849	@nisamp{pentium4},
				850	@nisamp{k6},
				851	@nisamp{k62},
				852	@nisamp{k63},
				853	@nisamp{athlon},
				854	@nisamp{amd64},
				855	@nisamp{viac3},
				856	@nisamp{viac32}
				857
				858	@item
				859	Other:
				860	@nisamp{arm},
				861	@nisamp{sh},
				862	@nisamp{sh2},
				863	@nisamp{vax}
				864	@end itemize
				865
				866	CPUs not listed will use generic C code.
				867
				868	@item Generic C Build
				869	@cindex Generic C
				870	If some of the assembly code causes problems, or if otherwise desired, the
				871	generic C code can be selected with the configure option @option{--disable-assembly}.
				872
				873	Note that this will run quite slowly, but it should be portable and should at
				874	least make it possible to get something running if all else fails.
				875
				876	@item Fat binary, @option{--enable-fat}
				877	@cindex Fat binary
				878	@cindex @code{--enable-fat}
				879	Using @option{--enable-fat} selects a ``fat binary'' build on x86, where
				880	optimized low level subroutines are chosen at runtime according to the CPU
				881	detected. This means more code, but gives good performance on all x86 chips.
				882	(This option might become available for more architectures in the future.)
				883
				884	@item @option{ABI}
				885	@cindex ABI
				886	On some systems GMP supports multiple ABIs (application binary interfaces),
				887	meaning data type sizes and calling conventions. By default GMP chooses the
				888	best ABI available, but a particular ABI can be selected. For example
				889
				890	@example
				891	./configure --host=mips64-sgi-irix6 ABI=n32
				892	@end example
				893
				894	See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
				895	applications need to do.
				896
				897	@item @option{CC}, @option{CFLAGS}
				898	@cindex C compiler
				899	@cindex @code{CC}
				900	@cindex @code{CFLAGS}
				901	By default the C compiler used is chosen from among some likely candidates,
				902	with @command{gcc} normally preferred if it's present. The usual
				903	@samp{CC=whatever} can be passed to @samp{./configure} to choose something
				904	different.
				905
				906	For various systems, default compiler flags are set based on the CPU and
				907	compiler. The usual @samp{CFLAGS="-whatever"} can be passed to
				908	@samp{./configure} to use something different or to set good flags for systems
				909	GMP doesn't otherwise know.
				910
				911	The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
				912	and can be found in each generated @file{Makefile}. This is the easiest way
				913	to check the defaults when considering changing or adding something.
				914
				915	Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
				916	supporting multiple ABIs it's important to give an explicit
				917	@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
				918	won't be able to select the correct assembly code.
				919
				920	If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
				921	compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can
				922	be used to force the use of GCC, with default flags (and default ABI).
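
As an illustrative sketch only, on an AMD64 system where a 32-bit build is
wanted with hand-picked flags, all three variables might be given together.
The particular flags here are an assumption chosen for illustration, not
defaults taken from GMP.

@example
./configure CC=gcc CFLAGS="-O2 -m32" ABI=32
@end example

Giving @samp{ABI=32} explicitly lets GMP select the matching assembly code,
which it cannot infer from @samp{-m32} alone.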
				923
				924	@item @option{CPPFLAGS}
				925	@cindex @code{CPPFLAGS}
				926	Any flags like @samp{-D} defines or @samp{-I} includes required by the
				927	preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
				928	Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
				929	preprocessing uses just @samp{CPPFLAGS}. This distinction is because most
				930	preprocessors won't accept all the flags the compiler does. Preprocessing is
				931	done separately in some configure tests.
				932
				933	@item @option{CC_FOR_BUILD}
				934	@cindex @code{CC_FOR_BUILD}
				935	Some build-time programs are compiled and run to generate host-specific data
				936	tables. @samp{CC_FOR_BUILD} is the compiler used for this. It doesn't need
				937	to be in any particular ABI or mode, it merely needs to generate executables
				938	that can run. The default is to try the selected @samp{CC} and some likely
				939	candidates such as @samp{cc} and @samp{gcc}, looking for something that works.
				940
				941	No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like
				942	@samp{cc foo.c} should be enough. If some particular options are required
				943	they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}.
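
For instance, in a cross-compile the selected @samp{CC} cannot normally
produce executables that run on the build machine, so a native compiler might
be named explicitly. The system triplets below are hypothetical and serve only
to illustrate the shape of the command.

@example
./configure --build=x86_64-pc-linux-gnu --host=arm-linux-gnueabi \
    CC_FOR_BUILD=gcc
@end example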
				944
				945	@item C++ Support, @option{--enable-cxx}
				946	@cindex C++ support
				947	@cindex @code{--enable-cxx}
				948	C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
				949	C++ compiler will be required. As a convenience @samp{--enable-cxx=detect}
				950	can be used to enable C++ support only if a compiler can be found. The C++
				951	support consists of a library @file{libgmpxx.la} and header file
				952	@file{gmpxx.h} (@pxref{Headers and Libraries}).
				953
				954	A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
				955	within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
				956	bloated by a dependency on the C++ standard library, and to avoid any chance
				957	that the C++ compiler could be required when linking plain C programs.
				958
				959	@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
				960	only be expected to work with @file{libgmp.la} from the same GMP version.
				961	Future changes to the relevant internals will be accompanied by renaming, so a
				962	mismatch will cause unresolved symbols rather than perhaps mysterious
				963	misbehaviour.
				964
				965	In general @file{libgmpxx.la} will be usable only with the C++ compiler that
				966	built it, since name mangling and runtime support are usually incompatible
				967	between different compilers.
				968
				969	@item @option{CXX}, @option{CXXFLAGS}
				970	@cindex C++ compiler
				971	@cindex @code{CXX}
				972	@cindex @code{CXXFLAGS}
				973	When C++ support is enabled, the C++ compiler and its flags can be set with
				974	variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for
				975	@samp{CXX} is the first compiler that works from a list of likely candidates,
				976	with @command{g++} normally preferred when available. The default for
				977	@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
				978	for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
				979	@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using
				980	@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
				981	usually suit @samp{g++}.
				982
				983	It's important that the C and C++ compilers match, meaning their startup and
				984	runtime support routines are compatible and that they generate code in the
				985	same ABI (if there's a choice of ABIs on the system). @samp{./configure}
				986	isn't currently able to check these things very well itself, so for that
				987	reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
				988	compiler mismatch. Perhaps this will change in the future.
				989
				990	Incidentally, it's normally not good enough to set @samp{CXX} to the same as
				991	@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as
				992	C++ code, only @command{g++} will invoke the linker the right way when
				993	building an executable or shared library from C++ object files.
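
A minimal sketch of a matched pair, assuming both @command{gcc} and
@command{g++} come from the same GCC installation, would be

@example
./configure --enable-cxx CC=gcc CXX=g++
@end example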
				994
				995	@item Temporary Memory, @option{--enable-alloca=<choice>}
				996	@cindex Temporary memory
				997	@cindex Stack overflow
				998	@cindex @code{alloca}
				999	@cindex @code{--enable-alloca}
				1000	GMP allocates temporary workspace using one of the following three methods,
				1001	which can be selected with for instance
				1002	@samp{--enable-alloca=malloc-reentrant}.
				1003
				1004	@itemize @bullet
				1005	@item
				1006	@samp{alloca} - C library or compiler builtin.
				1007	@item
				1008	@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
				1009	@item
				1010	@samp{malloc-notreentrant} - the heap, with global variables.
				1011	@end itemize
				1012
				1013	For convenience, the following choices are also available.
				1014	@samp{--disable-alloca} is the same as @samp{no}.
				1015
				1016	@itemize @bullet
				1017	@item
				1018	@samp{yes} - a synonym for @samp{alloca}.
				1019	@item
				1020	@samp{no} - a synonym for @samp{malloc-reentrant}.
				1021	@item
				1022	@samp{reentrant} - @code{alloca} if available, otherwise
				1023	@samp{malloc-reentrant}. This is the default.
				1024	@item
				1025	@samp{notreentrant} - @code{alloca} if available, otherwise
				1026	@samp{malloc-notreentrant}.
				1027	@end itemize
				1028
				1029	@code{alloca} is reentrant and fast, and is recommended. It actually allocates
				1030	just small blocks on the stack; larger ones use malloc-reentrant.
				1031
				1032	@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
				1033	but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
				1034	not required.
				1035
				1036	The two malloc methods in fact use the memory allocation functions selected by
				1037	@code{mp_set_memory_functions}, these being @code{malloc} and friends by
				1038	default. @xref{Custom Allocation}.
				1039
				1040	An additional choice @samp{--enable-alloca=debug} is available, to help when
				1041	debugging memory related problems (@pxref{Debugging}).
				1042
				1043	@item FFT Multiplication, @option{--disable-fft}
				1044	@cindex FFT multiplication
				1045	@cindex @code{--disable-fft}
				1046	By default multiplications are done using Karatsuba, 3-way Toom, higher degree
				1047	Toom, and Fermat FFT@. The FFT is only used on large to very large operands
				1048	and can be disabled to save code size if desired.
				1049
				1050	@item Assertion Checking, @option{--enable-assert}
				1051	@cindex Assertion checking
				1052	@cindex @code{--enable-assert}
				1053	This option enables some consistency checking within the library. This can be
				1054	of use while debugging, @pxref{Debugging}.
				1055
				1056	@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument}
				1057	@cindex Execution profiling
				1058	@cindex @code{--enable-profiling}
				1059	Enable profiling support, in one of various styles, @pxref{Profiling}.
				1060
				1061	@item @option{MPN_PATH}
				1062	@cindex @code{MPN_PATH}
				1063	Various assembly versions of each mpn subroutine are provided. For a given
				1064	CPU, a search is made through a path to choose a version of each. For example
				1065	@samp{sparcv8} has
				1066
				1067	@example
				1068	MPN_PATH="sparc32/v8 sparc32 generic"
				1069	@end example
				1070
				1071	which means look first for v8 code, then plain sparc32 (which is v7), and
				1072	finally fall back on generic C@. Knowledgeable users with special requirements
				1073	can specify a different path. Normally this is completely unnecessary.
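
As a hedged illustration based on the @samp{sparcv8} path shown above, a user
wanting to skip the v8-specific code could perhaps configure with a shortened
path. Whether this is useful depends entirely on the special requirement in
question.

@example
./configure MPN_PATH="sparc32 generic"
@end example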
				1074
				1075	@item Documentation
				1076	@cindex Documentation formats
				1077	@cindex Texinfo
				1078	The source for the document you're now reading is @file{doc/gmp.texi}, in
				1079	Texinfo format, see @GMPreftop{texinfo, Texinfo}.
				1080
				1081	@cindex Postscript
				1082	@cindex DVI
				1083	@cindex PDF
				1084	Info format @samp{doc/gmp.info} is included in the distribution. The usual
				1085	automake targets are available to make PostScript, DVI, PDF and HTML (these
				1086	will require various @TeX{} and Texinfo tools).
				1087
				1088	@cindex DocBook
				1089	@cindex XML
				1090	DocBook and XML can be generated by the Texinfo @command{makeinfo} program
				1091	too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
				1092	Texinfo}.
				1093
				1094	Some supplementary notes can also be found in the @file{doc} subdirectory.
				1095
				1096	@end table
				1097
				1098
				1099	@need 2000
				1100	@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
				1101	@section ABI and ISA
				1102	@cindex ABI
				1103	@cindex Application Binary Interface
				1104	@cindex ISA
				1105	@cindex Instruction Set Architecture
				1106
				1107	ABI (Application Binary Interface) refers to the calling conventions between
				1108	functions, meaning what registers are used and what sizes the various C data
				1109	types are. ISA (Instruction Set Architecture) refers to the instructions and
				1110	registers a CPU has available.
				1111
				1112	Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
				1113	latter for compatibility with older CPUs in the family. GMP supports some
				1114	CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a
				1115	combination of chip ABI, plus how GMP chooses to use it. For example in some
				1116	32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
				1117	@code{long long}.
				1118
				1119	By default GMP chooses the best ABI available for a given system, and this
				1120	generally gives significantly greater speed. But an ABI can be chosen
				1121	explicitly to make GMP compatible with other libraries, or particular
				1122	application requirements. For example,
				1123
				1124	@example
				1125	./configure ABI=32
				1126	@end example
				1127
				1128	In all cases it's vital that all object code used in a given program is
				1129	compiled for the same ABI.
				1130
				1131	Usually a limb is implemented as a @code{long}. When a @code{long long} limb
				1132	is used this is encoded in the generated @file{gmp.h}. This is convenient for
				1133	applications, but it does mean that @file{gmp.h} will vary, and can't be just
				1134	copied around. @file{gmp.h} remains compiler independent though, since all
				1135	compilers for a particular ABI will be expected to use the same limb type.
				1136
				1137	Currently no attempt is made to follow whatever conventions a system has for
				1138	installing library or header files built for a particular ABI@. This will
				1139	probably only matter when installing multiple builds of GMP, and it might be
				1140	as simple as configuring with a special @samp{libdir}, or it might require
				1141	more than that. Note that builds for different ABIs need to be done separately,
				1142	with a fresh @command{./configure} and @command{make} each.
				1143
				1144	@sp 1
				1145	@table @asis
				1146	@need 1000
				1147	@item AMD64 (@samp{x86_64})
				1148	@cindex AMD64
				1149	On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
				1150	following ABI choices are available.
				1151
				1152	@table @asis
				1153	@item @samp{ABI=64}
				1154	The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip
				1155	architecture. This is the default. Applications will usually not need
				1156	special compiler flags, but for reference the option is
				1157
				1158	@example
				1159	gcc -m64
				1160	@end example
				1161
				1162	@item @samp{ABI=32}
				1163	The 32-bit ABI is the usual i386 conventions. This will be slower, and is not
				1164	recommended except for inter-operating with other code not yet 64-bit capable.
				1165	Applications must be compiled with
				1166
				1167	@example
				1168	gcc -m32
				1169	@end example
				1170
				1171	(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.)
				1172
				1173	@item @samp{ABI=x32}
				1174	The x32 ABI uses 64-bit limbs but 32-bit pointers. Like the 64-bit ABI, it
				1175	makes full use of the chip's arithmetic capabilities. This ABI is not
				1176	supported by all operating systems.
				1177
				1178	@example
				1179	gcc -mx32
				1180	@end example
				1181
				1182	@end table
				1183
				1184	@sp 1
				1185	@need 1000
				1186	@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64})
				1187	@cindex HPPA
				1188	@cindex HP-UX
				1189	@table @asis
				1190	@item @samp{ABI=2.0w}
				1191	The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or
				1192	up. Applications must be compiled with
				1193
				1194	@example
				1195	gcc [built for 2.0w]
				1196	cc +DD64
				1197	@end example
				1198
				1199	@item @samp{ABI=2.0n}
				1200	The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling
				1201	conventions, but with 64-bit instructions permitted within functions. GMP
				1202	uses a 64-bit @code{long long} for a limb. This ABI is available on hppa64
				1203	GNU/Linux and on HP-UX 10 or higher. Applications must be compiled with
				1204
				1205	@example
				1206	gcc [built for 2.0n]
				1207	cc +DA2.0 +e
				1208	@end example
				1209
				1210	Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit
				1211	instructions for @code{long long} operations and so may be slower than for
				1212	2.0w. (The GMP assembly code is the same though.)
				1213
				1214	@item @samp{ABI=1.0}
				1215	HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@.
				1216	No special compiler options are needed for applications.
				1217	@end table
				1218
				1219	All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and
				1220	@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are
				1221	considered.
				1222
				1223	Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes,
				1224	unlike HP @command{cc}. Instead it must be built for one or the other ABI@.
				1225	GMP will detect how it was built, and skip to the corresponding @samp{ABI}.
				1226
				1227	@sp 1
				1228	@need 1500
				1229	@item IA-64 under HP-UX (@samp{ia64--hpux}, @samp{itanium--hpux})
				1230	@cindex IA-64
				1231	@cindex HP-UX
				1232	HP-UX supports two ABIs for IA-64. GMP performance is the same in both.
				1233
				1234	@table @asis
				1235	@item @samp{ABI=32}
				1236	In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP
				1237	uses a 64 bit @code{long long} for a limb. Applications can be compiled
				1238	without any special flags since this ABI is the default in both HP C and GCC,
				1239	but for reference the flags are
				1240
				1241	@example
				1242	gcc -milp32
				1243	cc +DD32
				1244	@end example
				1245
				1246	@item @samp{ABI=64}
				1247	In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a
				1248	@code{long} for a limb. Applications must be compiled with
				1249
				1250	@example
				1251	gcc -mlp64
				1252	cc +DD64
				1253	@end example
				1254	@end table
				1255
				1256	On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only
				1257	choice.
				1258
				1259	@sp 1
				1260	@need 1000
				1261	@item MIPS under IRIX 6 (@samp{mips--irix[6789]})
				1262	@cindex MIPS
				1263	@cindex IRIX
				1264	IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32,
				1265	and 64. n32 or 64 are recommended, and GMP performance will be the same in
				1266	each. The default is n32.
				1267
				1268	@table @asis
				1269	@item @samp{ABI=o32}
				1270	The o32 ABI is 32-bit pointers and integers, and no 64-bit operations. GMP
				1271	will be slower than in n32 or 64; this option exists only to support old
				1272	compilers, eg.@: GCC 2.7.2. Applications can be compiled with no special
				1273	flags on an old compiler, or on a newer compiler with
				1274
				1275	@example
				1276	gcc -mabi=32
				1277	cc -32
				1278	@end example
				1279
				1280	@item @samp{ABI=n32}
				1281	The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
				1282	@code{long long}. Applications must be compiled with
				1283
				1284	@example
				1285	gcc -mabi=n32
				1286	cc -n32
				1287	@end example
				1288
				1289	@item @samp{ABI=64}
				1290	The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled
				1291	with
				1292
				1293	@example
				1294	gcc -mabi=64
				1295	cc -64
				1296	@end example
				1297	@end table
				1298
				1299	Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
				1300	support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.
				1301
				1302	@sp 1
				1303	@need 1000
				1304	@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5})
				1305	@cindex PowerPC
				1306	@table @asis
				1307	@item @samp{ABI=mode64}
				1308	@cindex AIX
				1309	The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64
				1310	@samp{--aix*} systems. Applications must be compiled with
				1311
				1312	@example
				1313	gcc -maix64
				1314	xlc -q64
				1315	@end example
				1316
				1317	On 64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems, the applications must
				1318	be compiled with
				1319
				1320	@example
				1321	gcc -m64
				1322	@end example
				1323
				1324	@item @samp{ABI=mode32}
				1325	The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip
				1326	still in 32-bit mode and using 32-bit calling conventions. This is the default
				1327	for systems where the true 64-bit ABI is unavailable. No special compiler
				1328	options are typically needed for applications. This ABI is not available under
				1329	AIX.
				1330
				1331	@item @samp{ABI=32}
				1332	This is the basic 32-bit PowerPC ABI, with a 32-bit limb. No special compiler
				1333	options are needed for applications.
				1334	@end table
				1335
				1336	GMP's speed is greatest for the @samp{mode64} ABI; the @samp{mode32} ABI is second
				1337	best. In @samp{ABI=32} only the 32-bit ISA is used and this doesn't make full
				1338	use of a 64-bit chip.
				1339
				1340	@sp 1
				1341	@need 1000
				1342	@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*})
				1343	@cindex Sparc V9
				1344	@cindex Solaris
				1345	@cindex Sun
				1346	@table @asis
				1347	@item @samp{ABI=64}
				1348	The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent
				1349	versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in
				1350	64-bit mode). GCC 3.2 or higher, or Sun @command{cc} is required. On
				1351	GNU/Linux, depending on the default @command{gcc} mode, applications must be
				1352	compiled with
				1353
				1354	@example
				1355	gcc -m64
				1356	@end example
				1357
				1358	On Solaris applications must be compiled with
				1359
				1360	@example
				1361	gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
				1362	cc -xarch=v9
				1363	@end example
				1364
				1365	On the BSD sparc64 systems no special options are required, since 64-bits is
				1366	the only ABI available.
				1367
				1368	@item @samp{ABI=32}
				1369	For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can. In
				1370	the Sun documentation this combination is known as ``v8plus''. On GNU/Linux,
				1371	depending on the default @command{gcc} mode, applications may need to be
				1372	compiled with
				1373
				1374	@example
				1375	gcc -m32
				1376	@end example
				1377
				1378	On Solaris, no special compiler options are required for applications, though
				1379	using something like the following is recommended. (@command{gcc} 2.8 and
				1380	earlier only support @samp{-mv8} though.)
				1381
				1382	@example
				1383	gcc -mv8plus
				1384	cc -xarch=v8plus
				1385	@end example
				1386	@end table
				1387
				1388	GMP speed is greatest in @samp{ABI=64}, so it's the default where available.
				1389	The speed is partly because there are extra registers available and partly
				1390	because 64-bits is considered the more important case and has therefore had
				1391	better code written for it.
				1392
				1393	Don't be confused by the names of the @samp{-m} and @samp{-x} compiler
				1394	options, they're called @samp{arch} but effectively control both ABI and ISA@.
				1395
				1396	On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel
				1397	doesn't save all registers.
				1398
				1399	On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
				1400	reject @samp{ABI=64} because the resulting executables won't run.
				1401	@samp{ABI=64} can still be built if desired by making it look like a
				1402	cross-compile, for example
				1403
				1404	@example
				1405	./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
				1406	@end example
				1407	@end table
				1408
				1409
				1410	@need 2000
				1411	@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
				1412	@section Notes for Package Builds
				1413	@cindex Build notes for binary packaging
				1414	@cindex Packaged builds
				1415
				1416	GMP should present no great difficulties for packaging in a binary
				1417	distribution.
				1418
				1419	@cindex Libtool versioning
				1420	@cindex Shared library versioning
				1421	Libtool is used to build the library and @samp{-version-info} is set
				1422	appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning,
				1423	Library interface versions, Library interface versions, libtool, GNU
				1424	Libtool}).
				1425
				1426	The GMP 4 series will be upwardly binary compatible in each release and will
				1427	be upwardly binary compatible with all of the GMP 3 series. Additional
				1428	function interfaces may be added in each release, so on systems where libtool
				1429	versioning is not fully checked by the loader an auxiliary mechanism may be
				1430	needed to express that a dynamic linked application depends on a new enough
				1431	GMP.
				1432
				1433	An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
				1434	(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
				1435	from the same GMP version, since this is not done by the libtool versioning,
				1436	nor otherwise. A mismatch will result in unresolved symbols from the linker,
				1437	or perhaps the loader.
				1438
				1439	When building a package for a CPU family, care should be taken to use
				1440	@samp{--host} (or @samp{--build}) to choose the least common denominator among
				1441	the CPUs which might use the package. For example this might mean plain
				1442	@samp{sparc} (meaning V7) for SPARCs.
				1443
				1444	For x86s, @option{--enable-fat} sets things up for a fat binary build, making a
				1445	runtime selection of optimized low level routines. This is a good choice for
				1446	packaging to run on a range of x86 chips.
				1447
				1448	Users who care about speed will want GMP built for their exact CPU type, to
				1449	make best use of the available optimizations. Providing a way to suitably
				1450	rebuild a package may be useful. This could be as simple as making it
				1451	possible for a user to omit @samp{--build} (and @samp{--host}) so
				1452	@samp{./config.guess} will detect the CPU@. But a way to manually specify a
				1453	@samp{--build} will be wanted for systems where @samp{./config.guess} is
				1454	inexact.
				1455
				1456	On systems with multiple ABIs, a packaged build will need to decide which
				1457	among the choices is to be provided, see @ref{ABI and ISA}. A given run of
				1458	@samp{./configure} etc will only build one ABI@. If a second ABI is also
				1459	required then a second run of @samp{./configure} etc must be made, starting
				1460	from a clean directory tree (@samp{make distclean}).
				1461
				1462	As noted under ``ABI and ISA'', currently no attempt is made to follow system
				1463	conventions for install locations that vary with ABI, such as
				1464	@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for
				1465	@samp{ABI=32}. A package build can override @samp{libdir} and other standard
				1466	variables as necessary.
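
A sketch of two successive builds on a Sparc V9 Solaris system might look as
follows. The @samp{includedir} locations are hypothetical, chosen here only so
that the second @samp{make install} does not overwrite the @file{gmp.h} from
the first. A real package would follow its own system's conventions.

@example
./configure ABI=32 --libdir=/usr/lib --includedir=/usr/include/gmp-32
make && make check && make install
make distclean
./configure ABI=64 --libdir=/usr/lib/sparcv9 --includedir=/usr/include/gmp-64
make && make check && make install
@end example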
				1467
				1468	Note that @file{gmp.h} is a generated file, and will be architecture and ABI
				1469	dependent. When attempting to install two ABIs simultaneously it will be
				1470	important that an application compile gets the correct @file{gmp.h} for its
				1471	desired ABI@. If compiler include paths don't vary with ABI options then it
				1472	might be necessary to create a @file{/usr/include/gmp.h} which tests
				1473	preprocessor symbols and chooses the correct actual @file{gmp.h}.
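
One possible shape for such a wrapper is sketched below. It assumes the two
generated headers have been installed under the hypothetical names
@file{gmp-32/gmp.h} and @file{gmp-64/gmp.h}, and that the compilers in use
define @code{_LP64} or @code{__LP64__} when @code{long} and pointers are 64
bits. Both assumptions need checking on the system concerned.

@example
/* Hypothetical /usr/include/gmp.h wrapper, choosing the real
   gmp.h according to the ABI the application is compiled for.  */
#if defined (_LP64) || defined (__LP64__)
#include <gmp-64/gmp.h>
#else
#include <gmp-32/gmp.h>
#endif
@end example

A test on pointer size like this cannot distinguish ABIs which differ only in
limb type, so other preprocessor symbols would be needed in those cases.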
				1474
				1475
				1476	@need 2000
				1477	@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
				1478	@section Notes for Particular Systems
				1479	@cindex Build notes for particular systems
				1480	@cindex Particular systems
				1481	@cindex Systems
				1482	@table @asis
				1483
				1484	@c This section is more or less meant for notes about performance or about
				1485	@c build problems that have been worked around but might leave a user
				1486	@c scratching their head. Fun with different ABIs on a system belongs in the
				1487	@c above section.
				1488
				1489	@item AIX 3 and 4
				1490	@cindex AIX
				1491	On systems @samp{--aix[34]*} shared libraries are disabled by default, since
				1492	some versions of the native @command{ar} fail on the convenience libraries
				1493	used. A shared build can be attempted with
				1494
				1495	@example
				1496	./configure --enable-shared --disable-static
				1497	@end example
				1498
				1499	Note that the @samp{--disable-static} is necessary because in a shared build
				1500	libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
				1501	the benefit of old versions of @command{ld} which only recognise @file{.a},
				1502	but unfortunately this is done even if a fully functional @command{ld} is
				1503	available.
				1504
				1505	@item ARM
				1506	@cindex ARM
				1507	On systems @samp{arm--*}, versions of GCC up to and including 2.95.3 have a
				1508	bug in unsigned division, giving wrong results for some operands. GMP
				1509	@samp{./configure} will demand GCC 2.95.4 or later.
				1510
				1511	@item Compaq C++
				1512	@cindex Compaq C++
				1513	Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
				1514	an old pre-standard one (see @samp{man iostream_intro}). GMP can only use the
				1515	standard one, which unfortunately is not the default but must be selected by
				1516	defining @code{__USE_STD_IOSTREAM}. Configure with for instance
				1517
				1518	@example
				1519	./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
				1520	@end example
				1521
				1522	@item Floating Point Mode
				1523	@cindex Floating point mode
				1524	@cindex Hardware floating point mode
				1525	@cindex Precision of hardware floating point
				1526	@cindex x87
				1527	On some systems, the hardware floating point has a control mode which can set
				1528	all operations to be done in a particular precision, for instance single,
				1529	double or extended on x86 systems (x87 floating point). The GMP functions
				1530	involving a @code{double} cannot be expected to operate to their full
				1531	precision when the hardware is in single precision mode. Of course this
				1532	affects all code, including application code, not just GMP.
				1533
				1534	@item FreeBSD 7.x, 8.x, 9.0, 9.1, 9.2
				1535	@cindex FreeBSD
				1536	@command{m4} in these releases of FreeBSD has an eval function which ignores
				1537	its 2nd and 3rd arguments, which makes it unsuitable for @file{.asm} file
				1538	processing. @samp{./configure} will detect the problem and either abort or
				1539	choose another m4 in the @env{PATH}. The bug is fixed in FreeBSD 9.3 and 10.0,
				1540	so either upgrade or use GNU m4. Note that the FreeBSD package system installs
				1541	GNU m4 under the name @samp{gm4}, which GMP cannot guess.
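
Assuming GNU m4 has been installed as @samp{gm4} and that
@samp{./configure} honours an @samp{M4} variable in the usual autoconf way,
it could be selected explicitly with something like

@example
./configure M4=gm4
@end example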
				1542
				1543	@item FreeBSD 7.x, 8.x, 9.x
				1544	@cindex FreeBSD
				1545	GMP releases starting with 6.0 do not support @samp{ABI=32} on FreeBSD/amd64
				1546	prior to release 10.0 of the system. The cause is a broken @code{limits.h},
				1547	which GMP no longer works around.
				1548
				1549	@item MS-DOS and MS Windows
				1550	@cindex MS-DOS
				1551	@cindex MS Windows
				1552	@cindex Windows
				1553	@cindex Cygwin
				1554	@cindex DJGPP
				1555	@cindex MINGW
				1556	On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows
				1557	system Cygwin, DJGPP and MINGW can be used. All three are excellent ports of
				1558	GCC and the various GNU tools.
				1559
				1560	@display
				1561	@uref{https://www.cygwin.com/}
				1562	@uref{http://www.delorie.com/djgpp/}
				1563	@uref{http://www.mingw.org/}
				1564	@end display
				1565
				1566	@cindex Interix
				1567	@cindex Services for Unix
				1568	Microsoft also publishes an Interix ``Services for Unix'' which can be used to
				1569	build GMP on Windows (with a normal @samp{./configure}), but it's not free
				1570	software.
				1571
				1572	@item MS Windows DLLs
				1573	@cindex DLLs
				1574	@cindex MS Windows
				1575	@cindex Windows
				1576	On systems @samp{--cygwin}, @samp{--mingw} and @samp{--pw32*} by
				1577	default GMP builds only a static library, but a DLL can be built instead using
				1578
				1579	@example
				1580	./configure --disable-static --enable-shared
				1581	@end example
				1582
				1583	Static and DLL libraries can't both be built, since certain export directives
				1584	in @file{gmp.h} must be different.
				1585
				1586	A MINGW DLL build of GMP can be used with Microsoft C@. Libtool doesn't
				1587	install a @file{.lib} format import library, but it can be created with MS
				1588	@command{lib} as follows, and copied to the install directory. Similarly for
				1589	@file{libmp} and @file{libgmpxx}.
				1590
				1591	@example
				1592	cd .libs
				1593	lib /def:libgmp-3.dll.def /out:libgmp-3.lib
				1594	@end example
				1595
				1596	MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
				1597	wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
				1598	the same. If one of the other C runtime library choices provided by MS C is
				1599	desired then the suggestion is to use the GMP string functions and confine I/O
				1600	to the application.
				1601
				1602	@item Motorola 68k CPU Types
				1603	@cindex 68000
				1604	@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a
				1605	performance boost on applicable CPUs. @samp{m68360} can be used for CPU32
				1606	series chips. @samp{m68302} can be used for ``Dragonball'' series chips,
				1607	though this is merely a synonym for @samp{m68000}.
				1608
				1609	@item NetBSD 5.x
				1610	@cindex NetBSD
				1611	@command{m4} in these releases of NetBSD has an eval function which ignores its
				1612	2nd and 3rd arguments, which makes it unsuitable for @file{.asm} file
				1613	processing. @samp{./configure} will detect the problem and either abort or
				1614	choose another m4 in the @env{PATH}. The bug is fixed in NetBSD 6, so either
				1615	upgrade or use GNU m4. Note that the NetBSD package system installs GNU m4
				1616	under the name @samp{gm4}, which GMP cannot guess.
				1617
				1618	@item OpenBSD 2.6
				1619	@cindex OpenBSD
				1620	@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
				1621	unsuitable for @file{.asm} file processing. @samp{./configure} will detect
				1622	the problem and either abort or choose another m4 in the @env{PATH}. The bug
				1623	is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.
				1624
				1625	@item Power CPU Types
				1626	@cindex Power/PowerPC
				1627	In GMP, CPU types @samp{power} and @samp{powerpc} will each use instructions
				1628	not available on the other, so it's important to choose the right one for the
				1629	CPU that will be used. Currently GMP has no assembly code support for using
				1630	just the common instruction subset. To get executables that run on both, the
				1631	current suggestion is to use the generic C code (@option{--disable-assembly}),
				1632	possibly with appropriate compiler options (like @samp{-mcpu=common} for
				1633	@command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of
				1634	workstations) is accepted by @file{config.sub}, but is currently equivalent to
				1635	@option{--disable-assembly}.
				1636
				1637	@item Sparc CPU Types
				1638	@cindex Sparc
				1639	@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
				1640	significant performance increase over the V7 code selected by plain
				1641	@samp{sparc}.
				1642
				1643	@item Sparc App Regs
				1644	@cindex Sparc
				1645	The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the
				1646	``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
				1647	that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC
				1648	Options, gcc, Using the GNU Compiler Collection (GCC)}).
				1649
				1650	This makes that code unsuitable for use with the special V9
				1651	@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and
				1652	for applications wanting to use those registers for special purposes. In these
				1653	cases the only suggestion currently is to build GMP with
				1654	@option{--disable-assembly} to avoid the assembly code.
				1655
				1656	@item SunOS 4
				1657	@cindex SunOS
				1658	@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
				1659	files, and instead @samp{./configure} will automatically use
				1660	@command{/usr/5bin/m4}, which we believe is always available (if not then use
				1661	GNU m4).
				1662
				1663	@item x86 CPU Types
				1664	@cindex x86
				1665	@cindex 80x86
				1666	@cindex i386
				1667	@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended
				1668	P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
				1669	P-III)@. @samp{i386} is a better choice when making binaries that must run on
				1670	both.
				1671
				1672	@item x86 MMX and SSE2 Code
				1673	@cindex MMX
				1674	@cindex SSE2
				1675	If the CPU selected has MMX code but the assembler doesn't support it, a
				1676	warning is given and non-MMX code is used instead. This will be an inferior
				1677	build, since the MMX code that's present is there because it's faster than the
				1678	corresponding plain integer code. The same applies to SSE2.
				1679
				1680	Old versions of @samp{gas} don't support MMX instructions; in particular,
				1681	version 1.92.3, which comes with FreeBSD 2.2.8 and with the more recent
				1682	OpenBSD 3.1, doesn't.
				1683
				1684	Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
				1685	to register @code{movq} instructions, and so can't be used for MMX code.
				1686	Install a recent @command{gas} if MMX code is wanted on these systems.
				1687	@end table
				1688
				1689
				1690	@need 2000
				1691	@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP
				1692	@section Known Build Problems
				1693	@cindex Build problems known
				1694
				1695	@c This section is more or less meant for known build problems that are not
				1696	@c otherwise worked around and require some sort of manual intervention.
				1697
				1698	You might find more up-to-date information at @uref{https://gmplib.org/}.
				1699
				1700	@table @asis
				1701	@item Compiler link options
				1702	The version of libtool currently in use rather aggressively strips compiler
				1703	options when linking a shared library. This will hopefully be relaxed in the
				1704	future, but for now if this is a problem the suggestion is to create a little
				1705	script to hide them, and for instance configure with
				1706
				1707	@example
				1708	./configure CC=gcc-with-my-options
				1709	@end example
				1710
				1711	@item DJGPP (@samp{*-*-msdosdjgpp*})
				1712	@cindex DJGPP
				1713	The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
				1714	script; it exits silently, having died writing a preamble to
				1715	@file{config.log}. Use @command{bash} 2.04 or higher.
				1716
				1717	@samp{make all} was found to run out of memory during the final
				1718	@file{libgmp.la} link on one system tested, despite having 64MB available.
				1719	Running @samp{make libgmp.la} directly helped; perhaps recursing into the
				1720	various subdirectories uses up memory.
				1721
				1722	@item GNU binutils @command{strip} prior to 2.12
				1723	@cindex Stripped libraries
				1724	@cindex Binutils @command{strip}
				1725	@cindex GNU @command{strip}
				1726	@command{strip} from GNU binutils 2.11 and earlier should not be used on the
				1727	static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
				1728	but the last of multiple archive members with the same name, like the three
				1729	versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be
				1730	used successfully.
				1731
				1732	The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
				1733	this and any version of @command{strip} can be used on them.
				1734
				1735	@item @command{make} syntax error
				1736	@cindex SCO
				1737	@cindex IRIX
				1738	On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
				1739	is unable to handle the long dependencies list for @file{libgmp.la}. The
				1740	symptom is a ``syntax error'' on the following line of the top-level
				1741	@file{Makefile}.
				1742
				1743	@example
				1744	libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
				1745	@end example
				1746
				1747	Either use GNU Make, or as a workaround remove
				1748	@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
				1749	build work, but if any recompiling is done @file{libgmp.la} might not be
				1750	rebuilt).
				1751
				1752	@item MacOS X (@samp{*-*-darwin*})
				1753	@cindex MacOS X
				1754	@cindex Darwin
				1755	Libtool currently only knows how to create shared libraries on MacOS X using
				1756	the native @command{cc} (which is a modified GCC), not a plain GCC@. A
				1757	static-only build should work though (@samp{--disable-shared}).
				1758
				1759	@item NeXT prior to 3.3
				1760	@cindex NeXT
				1761	The system compiler on old versions of NeXT was a massacred and old GCC, even
				1762	if it called itself @file{cc}. This compiler cannot be used to build GMP; you
				1763	need to get a real GCC and install that. (NeXT may have fixed this in
				1764	release 3.3 of their system.)
				1765
				1766	@item POWER and PowerPC
				1767	@cindex Power/PowerPC
				1768	Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
				1769	PowerPC@. If you want to use GCC for these machines, get GCC 2.7.2.1 (or
				1770	later).
				1771
				1772	@item Sequent Symmetry
				1773	@cindex Sequent Symmetry
				1774	Use the GNU assembler instead of the system assembler, since the latter has
				1775	serious bugs.
				1776
				1777	@item Solaris 2.6
				1778	@cindex Solaris
				1779	The system @command{sed} prints an error ``Output line too long'' when libtool
				1780	builds @file{libgmp.la}. This doesn't seem to cause any obvious ill effects,
				1781	but GNU @command{sed} is recommended, to avoid any doubt.
				1782
				1783	@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
				1784	@cindex Solaris
				1785	A shared library build of GMP seems to fail in this combination: it builds but
				1786	then fails the tests, apparently due to some incorrect data relocations within
				1787	@code{gmp_randinit_lc_2exp_size}. The exact cause is unknown;
				1788	@samp{--disable-shared} is recommended.
				1789	@end table
				1790
				1791
				1792	@need 2000
				1793	@node Performance optimization, , Known Build Problems, Installing GMP
				1794	@section Performance optimization
				1795	@cindex Optimizing performance
				1796
				1797	@c At some point, this should perhaps move to a separate chapter on optimizing
				1798	@c performance.
				1799
				1800	For optimal performance, build GMP for the exact CPU type of the target
				1801	computer, see @ref{Build Options}.
				1802
				1803	Unlike for most other programs, the choice of compiler typically
				1804	doesn't matter much, since GMP uses assembly language for the most critical
				1805	operations.
				1806
				1807	In particular for long-running GMP applications, and applications demanding
				1808	extremely large numbers, building and running the @code{tuneup} program in the
				1809	@file{tune} subdirectory can be important. For example,
				1810
				1811	@example
				1812	cd tune
				1813	make tuneup
				1814	./tuneup
				1815	@end example
				1816
				1817	will generate better contents for the @file{gmp-mparam.h} parameter file.
				1818
				1819	To use the results, put the output in the file indicated in the
				1820	@samp{Parameters for ...} header. Then recompile from scratch.
				1821
				1822	The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which
				1823	instructs the program how long to check FFT multiply parameters. If you're
				1824	going to use GMP for extremely large numbers, you may want to run @code{tuneup}
				1825	with a large NNN value.
				1826
				1827
				1828	@node GMP Basics, Reporting Bugs, Installing GMP, Top
				1829	@comment node-name, next, previous, up
				1830	@chapter GMP Basics
				1831	@cindex Basics
				1832
				1833	@strong{Using functions, macros, data types, etc.@: not documented in this
				1834	manual is strongly discouraged. If you do so your application is guaranteed
				1835	to be incompatible with future versions of GMP.}
				1836
				1837	@menu
				1838	* Headers and Libraries::
				1839	* Nomenclature and Types::
				1840	* Function Classes::
				1841	* Variable Conventions::
				1842	* Parameter Conventions::
				1843	* Memory Management::
				1844	* Reentrancy::
				1845	* Useful Macros and Constants::
				1846	* Compatibility with older versions::
				1847	* Demonstration Programs::
				1848	* Efficiency::
				1849	* Debugging::
				1850	* Profiling::
				1851	* Autoconf::
				1852	* Emacs::
				1853	@end menu
				1854
				1855	@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
				1856	@section Headers and Libraries
				1857	@cindex Headers
				1858
				1859	@cindex @file{gmp.h}
				1860	@cindex Include files
				1861	@cindex @code{#include}
				1862	All declarations needed to use GMP are collected in the include file
				1863	@file{gmp.h}. It is designed to work with both C and C++ compilers.
				1864
				1865	@example
				1866	#include <gmp.h>
				1867	@end example
				1868
				1869	@cindex @code{stdio.h}
				1870	Note however that prototypes for GMP functions with @code{FILE *} parameters
				1871	are only provided if @code{<stdio.h>} is included too.
				1872
				1873	@example
				1874	#include <stdio.h>
				1875	#include <gmp.h>
				1876	@end example
				1877
				1878	@cindex @code{stdarg.h}
				1879	Likewise @code{<stdarg.h>} is required for prototypes with @code{va_list}
				1880	parameters, such as @code{gmp_vprintf}. And @code{<obstack.h>} for prototypes
				1881	with @code{struct obstack} parameters, such as @code{gmp_obstack_printf}, when
				1882	available.
				1883
				1884	@cindex Libraries
				1885	@cindex Linking
				1886	@cindex @code{libgmp}
				1887	All programs using GMP must link against the @file{libgmp} library. On a
				1888	typical Unix-like system this can be done with @samp{-lgmp}, for example
				1889
				1890	@example
				1891	gcc myprogram.c -lgmp
				1892	@end example
				1893
				1894	@cindex @code{libgmpxx}
				1895	GMP C++ functions are in a separate @file{libgmpxx} library. This is built
				1896	and installed if C++ support has been enabled (@pxref{Build Options}). For
				1897	example,
				1898
				1899	@example
				1900	g++ mycxxprog.cc -lgmpxx -lgmp
				1901	@end example
				1902
				1903	@cindex Libtool
				1904	GMP is built using Libtool and an application can use that to link if desired,
				1905	@GMPpxreftop{libtool, GNU Libtool}.
				1906
				1907	If GMP has been installed to a non-standard location then it may be necessary
				1908	to use @samp{-I} and @samp{-L} compiler options to point to the right
				1909	directories, and some sort of run-time path for a shared library.
				1910
				1911
				1912	@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
				1913	@section Nomenclature and Types
				1914	@cindex Nomenclature
				1915	@cindex Types
				1916
				1917	@cindex Integer
				1918	@tindex @code{mpz_t}
				1919	In this manual, @dfn{integer} usually means a multiple precision integer, as
				1920	defined by the GMP library. The C data type for such integers is @code{mpz_t}.
				1921	Here are some examples of how to declare such integers:
				1922
				1923	@example
				1924	mpz_t sum;
				1925
				1926	struct foo @{ mpz_t x, y; @};
				1927
				1928	mpz_t vec[20];
				1929	@end example
				1930
				1931	@cindex Rational number
				1932	@tindex @code{mpq_t}
				1933	@dfn{Rational number} means a multiple precision fraction. The C data type
				1934	for these fractions is @code{mpq_t}. For example:
				1935
				1936	@example
				1937	mpq_t quotient;
				1938	@end example
				1939
				1940	@cindex Floating-point number
				1941	@tindex @code{mpf_t}
				1942	@dfn{Floating point number}, or @dfn{Float} for short, is an arbitrary precision
				1943	mantissa with a limited precision exponent. The C data type for such objects
				1944	is @code{mpf_t}. For example:
				1945
				1946	@example
				1947	mpf_t fp;
				1948	@end example
				1949
				1950	@tindex @code{mp_exp_t}
				1951	The floating point functions accept and return exponents in the C type
				1952	@code{mp_exp_t}. Currently this is usually a @code{long}, but on some systems
				1953	it's an @code{int} for efficiency.
				1954
				1955	@cindex Limb
				1956	@tindex @code{mp_limb_t}
				1957	A @dfn{limb} means the part of a multi-precision number that fits in a single
				1958	machine word. (We chose this word because a limb of the human body is
				1959	analogous to a digit, only larger, and containing several digits.) Normally a
				1960	limb is 32 or 64 bits. The C data type for a limb is @code{mp_limb_t}.
				1961
				1962	@tindex @code{mp_size_t}
				1963	Counts of limbs of a multi-precision number are represented in the C type
				1964	@code{mp_size_t}. Currently this is normally a @code{long}, but on some
				1965	systems it's an @code{int} for efficiency, and on some systems it will be
				1966	@code{long long} in the future.
				1967
				1968	@tindex @code{mp_bitcnt_t}
				1969	Counts of bits of a multi-precision number are represented in the C type
				1970	@code{mp_bitcnt_t}. Currently this is always an @code{unsigned long}, but on
				1971	some systems it will be an @code{unsigned long long} in the future.
				1972
				1973	@cindex Random state
				1974	@tindex @code{gmp_randstate_t}
				1975	@dfn{Random state} means an algorithm selection and current state data. The C
				1976	data type for such objects is @code{gmp_randstate_t}. For example:
				1977
				1978	@example
				1979	gmp_randstate_t rstate;
				1980	@end example
				1981
				1982	Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
				1983	@code{size_t} is used for byte or character counts.
				1984
				1985
				1986	@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
				1987	@section Function Classes
				1988	@cindex Function classes
				1989
				1990	There are five classes of functions in the GMP library:
				1991
				1992	@enumerate
				1993	@item
				1994	Functions for signed integer arithmetic, with names beginning with
				1995	@code{mpz_}. The associated type is @code{mpz_t}. There are about 150
				1996	functions in this class. (@pxref{Integer Functions})
				1997
				1998	@item
				1999	Functions for rational number arithmetic, with names beginning with
				2000	@code{mpq_}. The associated type is @code{mpq_t}. There are about 35
				2001	functions in this class, but the integer functions can be used for arithmetic
				2002	on the numerator and denominator separately. (@pxref{Rational Number
				2003	Functions})
				2004
				2005	@item
				2006	Functions for floating-point arithmetic, with names beginning with
				2007	@code{mpf_}. The associated type is @code{mpf_t}. There are about 70
				2008	functions in this class. (@pxref{Floating-point Functions})
				2009
				2010	@item
				2011	Fast low-level functions that operate on natural numbers. These are used by
				2012	the functions in the preceding groups, and you can also call them directly
				2013	from very time-critical user programs. These functions' names begin with
				2014	@code{mpn_}. The associated type is an array of @code{mp_limb_t}. There are
				2015	about 60 (hard-to-use) functions in this class. (@pxref{Low-level Functions})
				2016
				2017	@item
				2018	Miscellaneous functions. Functions for setting up custom allocation and
				2019	functions for generating random numbers. (@pxref{Custom Allocation}, and
				2020	@pxref{Random Number Functions})
				2021	@end enumerate
				2022
				2023
				2024	@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
				2025	@section Variable Conventions
				2026	@cindex Variable conventions
				2027	@cindex Conventions for variables
				2028
				2029	GMP functions generally have output arguments before input arguments. This
				2030	notation is by analogy with the assignment operator.
				2031
				2032	GMP lets you use the same variable for both input and output in one call. For
				2033	example, the main function for integer multiplication, @code{mpz_mul}, can be
				2034	used to square @code{x} and put the result back in @code{x} with
				2035
				2036	@example
				2037	mpz_mul (x, x, x);
				2038	@end example
				2039
				2040	Before you can assign to a GMP variable, you need to initialize it by calling
				2041	one of the special initialization functions. When you're done with a
				2042	variable, you need to clear it out, using one of the functions for that
				2043	purpose. Which function to use depends on the type of variable. See the
				2044	chapters on integer functions, rational number functions, and floating-point
				2045	functions for details.
				2046
				2047	A variable should only be initialized once, or at least cleared between each
				2048	initialization. After a variable has been initialized, it may be assigned to
				2049	any number of times.
				2050
				2051	For efficiency reasons, avoid excessive initializing and clearing. In
				2052	general, initialize near the start of a function and clear near the end. For
				2053	example,
				2054
				2055	@example
				2056	void
				2057	foo (void)
				2058	@{
				2059	mpz_t n;
				2060	int i;
				2061	mpz_init (n);
				2062	for (i = 1; i < 100; i++)
				2063	@{
				2064	mpz_mul (n, @dots{});
				2065	mpz_fdiv_q (n, @dots{});
				2066	@dots{}
				2067	@}
				2068	mpz_clear (n);
				2069	@}
				2070	@end example
				2071
				2072	GMP types like @code{mpz_t} are implemented as one-element arrays of certain
				2073	structures. Declaring a variable creates an object with the fields GMP needs,
				2074	but variables are normally manipulated by using the pointer to the object. For
				2075	both behavior and efficiency reasons, making copies of the
				2076	GMP object itself (either directly or via aggregate objects containing such GMP
				2077	objects) is discouraged. If copies are made, all of them must be used read-only; using a copy
				2078	as the output of some function will invalidate all the other copies. Note that
				2079	the actual fields in each @code{mpz_t} etc are for internal use only and should
				2080	not be accessed directly by code that expects to be compatible with future GMP
				2081	releases.
				2082
				2083	@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
				2084	@section Parameter Conventions
				2085	@cindex Parameter conventions
				2086	@cindex Conventions for parameters
				2087
				2088	When a GMP variable is used as a function parameter, it's effectively a
				2089	call-by-reference, meaning that when the function stores a value there it will
				2090	change the original in the caller. Parameters which are input-only can be
				2091	designated @code{const} to provoke a compiler error or warning on attempting to
				2092	modify them.
				2093
				2094	When a function is going to return a GMP result, it should designate a
				2095	parameter that it sets, like the library functions do. More than one value
				2096	can be returned by having more than one output parameter, again like the
				2097	library functions. A @code{return} of an @code{mpz_t} etc doesn't return the
				2098	object, only a pointer, and this is almost certainly not what's wanted.
				2099
				2100	Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
				2101	and storing the result to the indicated parameter.
				2102
				2103	@example
				2104	void
				2105	foo (mpz_t result, const mpz_t param, unsigned long n)
				2106	@{
				2107	unsigned long i;
				2108	mpz_mul_ui (result, param, n);
				2109	for (i = 1; i < n; i++)
				2110	mpz_add_ui (result, result, i*7);
				2111	@}
				2112
				2113	int
				2114	main (void)
				2115	@{
				2116	mpz_t r, n;
				2117	mpz_init (r);
				2118	mpz_init_set_str (n, "123456", 0);
				2119	foo (r, n, 20L);
				2120	gmp_printf ("%Zd\n", r);
				2121	return 0;
				2122	@}
				2123	@end example
				2124
				2125	Our function @code{foo} works even if its caller passes the same variable for
				2126	@code{param} and @code{result}, just like the library functions. But
				2127	sometimes it's tricky to make that work, and an application might not want to
				2128	bother supporting that sort of thing.
				2129
				2130	Since GMP types are implemented as one-element arrays, using a GMP variable as
				2131	a parameter passes a pointer to the object. Hence the call-by-reference.
				2132
				2133
				2134	@need 1000
				2135	@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
				2136	@section Memory Management
				2137	@cindex Memory management
				2138
				2139	The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
				2140	and pointers to allocated data. Once a variable is initialized, GMP takes
				2141	care of all space allocation. Additional space is allocated whenever a
				2142	variable doesn't have enough.
				2143
				2144	@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
				2145	Normally this is the best policy, since it avoids frequent reallocation.
				2146	Applications that need to return memory to the heap at some particular point
				2147	can use @code{mpz_realloc2}, or clear variables no longer needed.
				2148
				2149	@code{mpf_t} variables, in the current implementation, use a fixed amount of
				2150	space, determined by the chosen precision and allocated at initialization, so
				2151	their size doesn't change.
				2152
				2153	All memory is allocated using @code{malloc} and friends by default, but this
				2154	can be changed, see @ref{Custom Allocation}. Temporary memory on the stack is
				2155	also used (via @code{alloca}), but this can be changed at build-time if
				2156	desired, see @ref{Build Options}.
				2157
				2158
				2159	@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
				2160	@section Reentrancy
				2161	@cindex Reentrancy
				2162	@cindex Thread safety
				2163	@cindex Multi-threading
				2164
				2165	@noindent
				2166	GMP is reentrant and thread-safe, with some exceptions:
				2167
				2168	@itemize @bullet
				2169	@item
				2170	If configured with @option{--enable-alloca=malloc-notreentrant} (or with
				2171	@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
				2172	then naturally GMP is not reentrant.
				2173
				2174	@item
				2175	@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
				2176	selected precision. @code{mpf_init2} can be used instead, and in the C++
				2177	interface an explicit precision to the @code{mpf_class} constructor.
				2178
				2179	@item
				2180	@code{mpz_random} and the other old random number functions use a global
				2181	random state and are hence not reentrant. The newer random number functions
				2182	that accept a @code{gmp_randstate_t} parameter can be used instead.
				2183
				2184	@item
				2185	@code{gmp_randinit} (obsolete) returns an error indication through a global
				2186	variable, which is not thread safe. Applications are advised to use
				2187	@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead.
				2188
				2189	@item
				2190	@code{mp_set_memory_functions} uses global variables to store the selected
				2191	memory allocation functions.
				2192
				2193	@item
				2194	If the memory allocation functions set by a call to
				2195	@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
				2196	not reentrant, then GMP will not be reentrant either.
				2197
				2198	@item
				2199	If the standard I/O functions such as @code{fwrite} are not reentrant then the
				2200	GMP I/O functions using them will not be reentrant either.
				2201
				2202	@item
				2203	It's safe for two threads to read from the same GMP variable simultaneously,
				2204	but it's not safe for one to read while another might be writing, nor for
				2205	two threads to write simultaneously. It's not safe for two threads to
				2206	generate a random number from the same @code{gmp_randstate_t} simultaneously,
				2207	since this involves an update of that variable.
				2208	@end itemize
				2209
				2210
				2211	@need 2000
				2212	@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
				2213	@section Useful Macros and Constants
				2214	@cindex Useful macros and constants
				2215	@cindex Constants
				2216
				2217	@deftypevr {Global Constant} {const int} mp_bits_per_limb
				2218	@findex mp_bits_per_limb
				2219	@cindex Bits per limb
				2220	@cindex Limb size
				2221	The number of bits per limb.
				2222	@end deftypevr
				2223
				2224	@defmac __GNU_MP_VERSION
				2225	@defmacx __GNU_MP_VERSION_MINOR
				2226	@defmacx __GNU_MP_VERSION_PATCHLEVEL
				2227	@cindex Version number
				2228	@cindex GMP version number
				2229	The major and minor GMP version, and patch level, respectively, as integers.
				2230	For GMP i.j, these numbers will be i, j, and 0, respectively.
				2231	For GMP i.j.k, these numbers will be i, j, and k, respectively.
				2232	@end defmac
				2233
				2234	@deftypevr {Global Constant} {const char * const} gmp_version
				2235	@findex gmp_version
				2236	The GMP version number, as a null-terminated string, in the form ``i.j.k''.
				2237	This release is @nicode{"@value{VERSION}"}. Note that the format ``i.j'' was
				2238	used, before version 4.3.0, when k was zero.
				2239	@end deftypevr
				2240
				2241	@defmac __GMP_CC
				2242	@defmacx __GMP_CFLAGS
				2243	The compiler and compiler flags, respectively, used when compiling GMP, as
				2244	strings.
				2245	@end defmac
				2246
				2247
				2248	@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
				2249	@section Compatibility with older versions
				2250	@cindex Compatibility with older versions
				2251	@cindex Past GMP versions
				2252	@cindex Upward compatibility
				2253
				2254	This version of GMP is upwardly binary compatible with all 5.x, 4.x, and 3.x
				2255	versions, and upwardly compatible at the source level with all 2.x versions,
				2256	with the following exceptions.
				2257
				2258	@itemize @bullet
				2259	@item
				2260	@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
				2261	with other @code{mpn} functions.
				2262
				2263	@item
				2264	@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
				2265	3.0.1, but in 3.1 reverted to the 2.x style.
				2266
				2267	@item
				2268	@code{mpn_bdivmod}, documented as preliminary in GMP 4, has been removed.
				2269	@end itemize
				2270
				2271	There are a number of compatibility issues between GMP 1 and GMP 2 that of
				2272	course also apply when porting applications from GMP 1 to this version. Please
				2273	see the GMP 2 manual for details.
				2274
				2275	@c @item Integer division functions round the result differently. The obsolete
				2276	@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
				2277	@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
				2278	@c quotient towards
				2279	@c @ifinfo
				2280	@c @minus{}infinity).
				2281	@c @end ifinfo
				2282	@c @iftex
				2283	@c @tex
				2284	@c $-\infty$).
				2285	@c @end tex
				2286	@c @end iftex
				2287	@c There are a lot of functions for integer division, giving the user better
				2288	@c control over the rounding.
				2289
				2290	@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.
				2291
				2292	@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
				2293	@c @strong{mod} for reduction.
				2294
				2295	@c @item The assignment functions for rational numbers do no longer canonicalize
				2296	@c their results. In the case a non-canonical result could arise from an
				2297	@c assignment, the user need to insert an explicit call to
				2298	@c @code{mpq_canonicalize}. This change was made for efficiency.
				2299
				2300	@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
				2301	@c by @code{mpz_inp_raw} in previous releases. This change was made for making
				2302	@c the file format truly portable between machines with different word sizes.
				2303
				2304	@c @item Several @code{mpn} functions have changed. But they were intentionally
				2305	@c undocumented in previous releases.
				2306
				2307	@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
				2308	@c are now implemented as macros, and thereby sometimes evaluate their
				2309	@c arguments multiple times.
				2310
				2311	@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
				2312	@c for 0^0. (In version 1, they yielded 0.)
				2313
				2314	@c In version 1 of the library, @code{mpq_set_den} handled negative
				2315	@c denominators by copying the sign to the numerator. That is no longer done.
				2316
				2317	@c Pure assignment functions do not canonicalize the assigned variable. It is
				2318	@c the responsibility of the user to canonicalize the assigned variable before
				2319	@c any arithmetic operations are performed on that variable.
				2320	@c Note that this is an incompatible change from version 1 of the library.
				2321
				2322	@c @end enumerate
				2323
				2324
				2325	@need 1000
				2326	@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
				2327	@section Demonstration programs
				2328	@cindex Demonstration programs
				2329	@cindex Example programs
				2330	@cindex Sample programs
				2331	The @file{demos} subdirectory has some sample programs using GMP@. These
				2332	aren't built or installed, but there's a @file{Makefile} with rules for them.
				2333	For instance,
				2334
				2335	@example
				2336	make pexpr
				2337	./pexpr 68^975+10
				2338	@end example
				2339
				2340	@noindent
				2341	The following programs are provided.
				2342
				2343	@itemize @bullet
				2344	@item
				2345	@cindex Expression parsing demo
				2346	@cindex Parsing expressions demo
				2347	@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
				2348	@item
				2349	@cindex Expression parsing demo
				2350	@cindex Parsing expressions demo
				2351	The @samp{calc} subdirectory has a similar but simpler evaluator using
				2352	@command{lex} and @command{yacc}.
				2353	@item
				2354	@cindex Expression parsing demo
				2355	@cindex Parsing expressions demo
				2356	The @samp{expr} subdirectory is yet another expression evaluator, a library
				2357	designed for ease of use within a C program. See @file{demos/expr/README} for
				2358	more information.
				2359	@item
				2360	@cindex Factorization demo
				2361	@samp{factorize} is a Pollard-Rho factorization program.
				2362	@item
				2363	@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
				2364	function.
				2365	@item
				2366	@samp{primes} counts or lists primes in an interval, using a sieve.
				2367	@item
				2368	@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
				2369	class numbers.
				2370	@item
				2371	@cindex @code{perl}
				2372	@cindex GMP Perl module
				2373	@cindex Perl module
				2374	The @samp{perl} subdirectory is a comprehensive perl interface to GMP@. See
				2375	@file{demos/perl/INSTALL} for more information. Documentation is in POD
				2376	format in @file{demos/perl/GMP.pm}.
				2377	@end itemize
				2378
				2379	As an aside, consideration has been given at various times to some sort of
				2380	expression evaluation within the main GMP library. Going beyond something
				2381	minimal quickly leads to matters like user-defined functions, looping, fixnums
				2382	for control variables, etc, which are considered outside the scope of GMP
				2383	(much closer to language interpreters or compilers, @pxref{Language Bindings}).
				2384	Something simple for program input convenience may yet be a possibility, a
				2385	combination of the @file{expr} demo and the @file{pexpr} tree back-end
				2386	perhaps. But for now the above evaluators are offered as illustrations.
				2387
				2388
				2389	@need 1000
				2390	@node Efficiency, Debugging, Demonstration Programs, GMP Basics
				2391	@section Efficiency
				2392	@cindex Efficiency
				2393
				2394	@table @asis
				2395	@item Small Operands
				2396	@cindex Small operands
				2397	On small operands, the time for function call overheads and memory allocation
				2398	can be significant in comparison to actual calculation. This is unavoidable
				2399	in a general purpose variable precision library, although GMP attempts to be
				2400	as efficient as it can on both large and small operands.
				2401
				2402	@item Static Linking
				2403	@cindex Static linking
				2404	On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
				2405	used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
				2406	have a small overhead on each function call and global data address. For many
				2407	programs this will be insignificant, but for long calculations there's a gain
				2408	to be had.
				2409
				2410	@item Initializing and Clearing
				2411	@cindex Initializing and clearing
				2412	Avoid excessive initializing and clearing of variables, since this can be
				2413	quite time consuming, especially in comparison to otherwise fast operations
				2414	like addition.
				2415
				2416	A language interpreter might want to keep a free list or stack of
				2417	initialized variables ready for use. It should be possible to integrate
				2418	something like that with a garbage collector too.
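
As a minimal sketch of the first point, the function below initializes its
temporary once, outside the loop, rather than once per iteration. The name
@code{sum_of_squares} is hypothetical and used only for illustration.

@example
#include <gmp.h>

/* Set sum to 1^2 + 2^2 + ... + n^2.  The temporary t is initialized
   and cleared once, not on every pass through the loop.  */
void
sum_of_squares (mpz_t sum, unsigned long n)
@{
  mpz_t t;
  unsigned long i;

  mpz_init (t);
  mpz_set_ui (sum, 0);
  for (i = 1; i <= n; i++)
    @{
      mpz_ui_pow_ui (t, i, 2);   /* t = i^2 */
      mpz_add (sum, sum, t);
    @}
  mpz_clear (t);
@}
@end example

The caller is assumed to have initialized @code{sum} already.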
				2419
				2420	@item Reallocations
				2421	@cindex Reallocations
				2422	An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
				2423	values will have its memory repeatedly @code{realloc}ed, which could be quite
				2424	slow or could fragment memory, depending on the C library. If an application
				2425	can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
				2426	be called to allocate the necessary space from the beginning
				2427	(@pxref{Initializing Integers}).
				2428
				2429	It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
				2430	is too small, since all functions will do a further reallocation if necessary.
				2431	Badly overestimating memory required will waste space though.
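
The following sketch shows one way an application might reserve space up
front when a bound on the final size is known. The function name
@code{pow3_preallocated} and the bound of @math{2k+1} bits are illustrative
assumptions, not anything GMP prescribes.

@example
#include <gmp.h>

/* Set r to 3^k by repeated multiplication.  Since 3 < 2^2, the result
   is below 2^(2k), so 2k+1 bits is a safe estimate of the final size.  */
void
pow3_preallocated (mpz_t r, unsigned long k)
@{
  unsigned long i;

  mpz_realloc2 (r, 2 * k + 1);   /* reserve the space once */
  mpz_set_ui (r, 1);
  for (i = 0; i < k; i++)
    mpz_mul_ui (r, r, 3);
@}
@end example

If the estimate turned out to be too small the code would still be correct,
since GMP reallocates as needed.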
				2432
				2433	@item @code{2exp} Functions
				2434	@cindex @code{2exp} functions
				2435	It's up to an application to call functions like @code{mpz_mul_2exp} when
				2436	appropriate. General purpose functions like @code{mpz_mul} make no attempt to
				2437	identify powers of two or other special forms, because such inputs will
				2438	usually be very rare and testing every time would be wasteful.
				2439
				2440	@item @code{ui} and @code{si} Functions
				2441	@cindex @code{ui} and @code{si} functions
				2442	The @code{ui} functions and the small number of @code{si} functions exist for
				2443	convenience and should be used where applicable. But if for example an
				2444	@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
				2445	need to extract it and call a @code{ui} function, just use the regular @code{mpz}
				2446	function.
				2447
				2448	@item In-Place Operations
				2449	@cindex In-place operations
				2450	@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
				2451	and @code{mpf_neg} are fast when used for in-place operations like
				2452	@code{mpz_abs(x,x)}, since in the current implementation only a single field
				2453	of @code{x} needs changing. On suitable compilers (GCC for instance) this is
				2454	inlined too.
				2455
				2456	@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
				2457	benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
				2458	usually only one or two limbs of @code{x} will need to be changed. The same
				2459	applies to the full precision @code{mpz_add} etc if @code{y} is small. If
				2460	@code{y} is big then cache locality may be helped, but that's all.
				2461
				2462	@code{mpz_mul} is currently the opposite, a separate destination is slightly
				2463	better. A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
				2464	limb, make a temporary copy of @code{x} before forming the result. Normally
				2465	that copying will only be a tiny fraction of the time for the multiply, so
				2466	this is not a particularly important consideration.
				2467
				2468	@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
				2469	no attempt to recognise a copy of something to itself, so a call like
				2470	@code{mpz_set(x,x)} will be wasteful. Naturally that would never be written
				2471	deliberately, but if it might arise from two pointers to the same object then
				2472	a test to avoid it might be desirable.
				2473
				2474	@example
				2475	if (x != y)
				2476	  mpz_set (x, y);
				2477	@end example
				2478
				2479	Note that it's never worth introducing extra @code{mpz_set} calls just to get
				2480	in-place operations. If a result should go to a particular variable then just
				2481	direct it there and let GMP take care of data movement.
				2482
				2483	@item Divisibility Testing (Small Integers)
				2484	@cindex Divisibility testing
				2485	@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
				2486	for testing whether an @code{mpz_t} is divisible by an individual small
				2487	integer. They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
				2488	which gives no useful information about the actual remainder, only whether
				2489	it's zero (or a particular value).
				2490
				2491	However when testing divisibility by several small integers, it's best to take
				2492	a remainder modulo their product, to save multi-precision operations. For
				2493	instance to test whether a number is divisible by any of 23, 29 or 31 take a
				2494	remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.
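
The 23, 29, 31 case above could be sketched as follows. The function name is
hypothetical. Only one multi-precision operation is performed and the
remaining tests are on a single @code{unsigned long}.

@example
#include <gmp.h>

/* Return non-zero if n is divisible by any of 23, 29 or 31.  */
int
divisible_by_23_29_or_31 (const mpz_t n)
@{
  /* One multi-precision remainder modulo 23*29*31 = 20677.  */
  unsigned long r = mpz_tdiv_ui (n, 20677UL);

  /* Cheap single-word tests on that remainder.  */
  return r % 23 == 0 || r % 29 == 0 || r % 31 == 0;
@}
@end example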
				2495
				2496	The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
				2497	as a remainder are generally a little slower than the remainder-only functions
				2498	like @code{mpz_tdiv_ui}. If the quotient is only rarely wanted then it's
				2499	probably best to just take a remainder and then go back and calculate the
				2500	quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
				2501	remainder is zero).
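
A rough sketch of that pattern is to test divisibility first and form the
quotient with @code{mpz_divexact_ui} only when the test succeeds. The helper
name @code{remove_factor_ui} is an assumption made for this example.

@example
#include <gmp.h>

/* Divide n by d as many times as it divides exactly, and return the
   number of divisions done.  Zero n or d < 2 are left alone.  */
unsigned long
remove_factor_ui (mpz_t n, unsigned long d)
@{
  unsigned long count = 0;

  if (mpz_sgn (n) == 0 || d < 2)
    return 0;

  while (mpz_divisible_ui_p (n, d))
    @{
      /* The remainder is known to be zero, so an exact division
         is enough to get the quotient.  */
      mpz_divexact_ui (n, n, d);
      count++;
    @}
  return count;
@}
@end example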
				2502
				2503	@item Rational Arithmetic
				2504	@cindex Rational arithmetic
				2505	The @code{mpq} functions operate on @code{mpq_t} values with no common factors
				2506	in the numerator and denominator. Common factors are checked-for and cast out
				2507	as necessary. In general, cancelling factors every time is the best approach
				2508	since it minimizes the sizes for subsequent operations.
				2509
				2510	However, applications that know something about the factorization of the
				2511	values they're working with might be able to avoid some of the GCDs used for
				2512	canonicalization, or swap them for divisions. For example when multiplying by
				2513	a prime it's enough to check for factors of it in the denominator instead of
				2514	doing a full GCD@. Or when forming a big product it might be known that very
				2515	little cancellation will be possible, and so canonicalization can be left to
				2516	the end.
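
As an illustration of the prime case, the sketch below multiplies a canonical
@code{mpq_t} by a prime without any GCD. The name @code{mul_prime_ui} is
hypothetical, and the code assumes that @code{q} is in canonical form and
that @code{p} really is prime. Neither is checked.

@example
#include <gmp.h>

/* q *= p, for canonical q and prime p, keeping q canonical.  */
void
mul_prime_ui (mpq_t q, unsigned long p)
@{
  if (mpz_divisible_ui_p (mpq_denref (q), p))
    /* p cancels against the denominator.  */
    mpz_divexact_ui (mpq_denref (q), mpq_denref (q), p);
  else
    /* No factor p in the denominator, so nothing can cancel.  */
    mpz_mul_ui (mpq_numref (q), mpq_numref (q), p);
@}
@end example

The result stays canonical because a prime either divides the denominator, in
which case it cannot also divide the numerator, or shares no factor with it.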
				2517
				2518	The @code{mpq_numref} and @code{mpq_denref} macros give access to the
				2519	numerator and denominator to do things outside the scope of the supplied
				2520	@code{mpq} functions. @xref{Applying Integer Functions}.
				2521
				2522	The canonical form for rationals allows mixed-type @code{mpq_t} and integer
				2523	additions or subtractions to be done directly with multiples of the
				2524	denominator. This will be somewhat faster than @code{mpq_add}. For example,
				2525
				2526	@example
				2527	/* mpq increment */
				2528	mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));
				2529
				2530	/* mpq += unsigned long */
				2531	mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);
				2532
				2533	/* mpq -= mpz */
				2534	mpz_submul (mpq_numref(q), mpq_denref(q), z);
				2535	@end example
				2536
				2537	@item Number Sequences
				2538	@cindex Number sequences
				2539	Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
				2540	are designed for calculating isolated values. If a range of values is wanted
				2541	it's probably best to call one of them once to get a starting point and iterate from there.
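
For factorials this might look like the sketch below, where
@code{factorial_range} is a hypothetical helper. @code{mpz_fac_ui} is called
once and each later value is obtained from the previous one with a single
@code{mpz_mul_ui}.

@example
#include <stddef.h>
#include <gmp.h>

/* Store start!, (start+1)!, ... in out[0] .. out[count-1].  Each
   element of out must already have been initialized.  */
void
factorial_range (mpz_t *out, unsigned long start, size_t count)
@{
  size_t i;

  if (count == 0)
    return;

  mpz_fac_ui (out[0], start);          /* starting point */
  for (i = 1; i < count; i++)
    mpz_mul_ui (out[i], out[i - 1], start + i);
@}
@end example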
				2542
				2543	@item Text Input/Output
				2544	@cindex Text input/output
				2545	Hexadecimal or octal are suggested for input or output in text form.
				2546	Power-of-2 bases like these can be converted much more efficiently than other
				2547	bases, like decimal. For big numbers there's usually nothing of particular
				2548	interest to be seen in the digits, so the base doesn't matter much.
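
A small sketch of hexadecimal text conversion is given below. The helper
names @code{to_hex} and @code{from_hex} are assumptions for this example. The
buffer is sized with @code{mpz_sizeinbase} plus room for a minus sign and the
terminating null.

@example
#include <stdlib.h>
#include <gmp.h>

/* Return a malloc'ed hexadecimal string for n, or NULL if out of
   memory.  The caller frees the string with free().  */
char *
to_hex (const mpz_t n)
@{
  char *buf = malloc (mpz_sizeinbase (n, 16) + 2);

  if (buf != NULL)
    mpz_get_str (buf, 16, n);
  return buf;
@}

/* Set n from a hexadecimal string.  Returns 0 on success and -1 if
   the string is not a valid base 16 number.  */
int
from_hex (mpz_t n, const char *s)
@{
  return mpz_set_str (n, s, 16);
@}
@end example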
				2549
				2550	Maybe we can hope octal will one day become the normal base for everyday use,
				2551	as proposed by King Charles XII of Sweden and later reformers.
				2552	@c Reference: Knuth volume 2 section 4.1, page 184 of second edition. :-)
				2553	@end table
				2554
				2555
				2556	@node Debugging, Profiling, Efficiency, GMP Basics
				2557	@section Debugging
				2558	@cindex Debugging
				2559
				2560	@table @asis
				2561	@item Stack Overflow
				2562	@cindex Stack overflow
				2563	@cindex Segmentation violation
				2564	@cindex Bus error
				2565	Depending on the system, a segmentation violation or bus error might be the
				2566	only indication of stack overflow. See @samp{--enable-alloca} choices in
				2567	@ref{Build Options}, for how to address this.
				2568
				2569	In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
				2570	overflow is recognised by the system before too much damage is done, or
				2571	@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
				2572	add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
				2573	Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
				2574	These options must be added to the @samp{CFLAGS} used in the GMP build
				2575	(@pxref{Build Options}), adding them just to an application will have no
				2576	effect. Note also they're a slowdown, adding overhead to each function call
				2577	and each stack allocation.
				2578
				2579	@item Heap Problems
				2580	@cindex Heap problems
				2581	@cindex Malloc problems
				2582	The most likely cause of application problems with GMP is heap corruption.
				2583	Failing to @code{init} GMP variables will have unpredictable effects, and
				2584	corruption arising elsewhere in a program may well affect GMP@. Initializing
				2585	GMP variables more than once or failing to clear them will cause memory leaks.
				2586
				2587	@cindex Malloc debugger
				2588	In all such cases a @code{malloc} debugger is recommended. On a GNU or BSD
				2589	system the standard C library @code{malloc} has some diagnostic facilities,
				2590	see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
				2591	Reference Manual}, or @samp{man 3 malloc}. Other possibilities, in no
				2592	particular order, include
				2593
				2594	@display
				2595	@uref{http://cs.ecs.baylor.edu/~donahoo/tools/ccmalloc/}
				2596	@uref{http://dmalloc.com/}
				2597	@uref{https://wiki.gnome.org/Apps/MemProf}
				2598	@end display
				2599
				2600	The GMP default allocation routines in @file{memory.c} also have a simple
				2601	sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
				2602	This is mainly designed for detecting buffer overruns during GMP development,
				2603	but might find other uses.
				2604
				2605	@item Stack Backtraces
				2606	@cindex Stack backtrace
				2607	On some systems the compiler options GMP uses by default can interfere with
				2608	debugging. In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
				2609	is used and this generally inhibits stack backtracing. Recompiling without
				2610	such options may help while debugging, though the usual caveats about it
				2611	potentially moving a memory problem or hiding a compiler bug will apply.
				2612
				2613	@item GDB, the GNU Debugger
				2614	@cindex GDB
				2615	@cindex GNU Debugger
				2616	A sample @file{.gdbinit} is included in the distribution, showing how to call
				2617	some undocumented dump functions to print GMP variables from within GDB@. Note
				2618	that these functions shouldn't be used in final application code since they're
				2619	undocumented and may be subject to incompatible changes in future versions of
				2620	GMP.
				2621
				2622	@item Source File Paths
				2623	GMP has multiple source files with the same name, in different directories.
				2624	For example @file{mpz}, @file{mpq} and @file{mpf} each have an
				2625	@file{init.c}. If the debugger can't already determine the right one it may
				2626	help to build with absolute paths on each C file. One way to do that is to
				2627	use a separate object directory with an absolute path to the source directory.
				2628
				2629	@example
				2630	cd /my/build/dir
				2631	/my/source/dir/gmp-@value{VERSION}/configure
				2632	@end example
				2633
				2634	This works via @code{VPATH}, and might require GNU @command{make}.
				2635	Alternatively it might be possible to change the @code{.c.lo} rules
				2636	appropriately.
				2637
				2638	@item Assertion Checking
				2639	@cindex Assertion checking
				2640	The build option @option{--enable-assert} is available to add some consistency
				2641	checks to the library (see @ref{Build Options}). These are likely to be of
				2642	limited value to most applications. Assertion failures are just as likely to
				2643	indicate memory corruption as a library or compiler bug.
				2644
				2645	Applications using the low-level @code{mpn} functions, however, will benefit
				2646	from @option{--enable-assert} since it adds checks on the parameters of most
				2647	such functions, many of which have subtle restrictions on their usage. Note
				2648	however that only the generic C code has checks, not the assembly code, so
				2649	@option{--disable-assembly} should be used for maximum checking.
				2650
				2651	@item Temporary Memory Checking
				2652	The build option @option{--enable-alloca=debug} arranges that each block of
				2653	temporary memory in GMP is allocated with a separate call to @code{malloc} (or
				2654	the allocation function set with @code{mp_set_memory_functions}).
				2655
				2656	This can help a malloc debugger detect accesses outside the intended bounds,
				2657	or detect memory not released. In a normal build, on the other hand,
				2658	temporary memory is allocated in blocks which GMP divides up for its own use,
				2659	or may be allocated with a compiler builtin @code{alloca} which will go
				2660	nowhere near any malloc debugger hooks.
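
When no malloc debugger is at hand, a crude check for memory not released
can be sketched with @code{mp_set_memory_functions}. The wrapper names below
are hypothetical and the counter is not thread safe. The functions must be
installed before GMP allocates anything. Memory that GMP takes from
@code{alloca} never reaches these functions.

@example
#include <stdlib.h>
#include <gmp.h>

static long live_blocks = 0;   /* blocks allocated and not yet freed */

static void *
counting_alloc (size_t n)
@{
  void *p = malloc (n);
  if (p == NULL)
    abort ();
  live_blocks++;
  return p;
@}

static void *
counting_realloc (void *p, size_t old_size, size_t new_size)
@{
  (void) old_size;
  p = realloc (p, new_size);
  if (p == NULL)
    abort ();
  return p;
@}

static void
counting_free (void *p, size_t size)
@{
  (void) size;
  free (p);
  live_blocks--;
@}

/* Call this before any other GMP function.  */
void
install_counting_allocator (void)
@{
  mp_set_memory_functions (counting_alloc, counting_realloc,
                           counting_free);
@}

long
gmp_live_blocks (void)
@{
  return live_blocks;
@}
@end example

A non-zero @code{gmp_live_blocks} result at a point where every variable
should have been cleared suggests a missing @code{clear} call.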
				2661
				2662	@item Maximum Debuggability
				2663	To summarize the above, a GMP build for maximum debuggability would be
				2664
				2665	@example
				2666	./configure --disable-shared --enable-assert \
				2667	  --enable-alloca=debug --disable-assembly CFLAGS=-g
				2668	@end example
				2669
				2670	For C++, add @samp{--enable-cxx CXXFLAGS=-g}.
				2671
				2672	@item Checker
				2673	@cindex Checker
				2674	@cindex GCC Checker
				2675	The GCC checker (@uref{https://savannah.nongnu.org/projects/checker/}) can be
				2676	used with GMP@. It contains a stub library which means GMP applications
				2677	compiled with checker can use a normal GMP build.
				2678
				2679	A build of GMP with checking within GMP itself can be made. This will run
				2680	very very slowly. On GNU/Linux for example,
				2681
				2682	@cindex @command{checkergcc}
				2683	@example
				2684	./configure --disable-assembly CC=checkergcc
				2685	@end example
				2686
				2687	@option{--disable-assembly} must be used, since the GMP assembly code doesn't
				2688	support the checking scheme. The GMP C++ features cannot be used, since
				2689	current versions of checker (0.9.9.1) don't yet support the standard C++
				2690	library.
				2691
				2692	@item Valgrind
				2693	@cindex Valgrind
				2694	Valgrind (@uref{http://valgrind.org/}) is a memory checker for x86, ARM, MIPS,
				2695	PowerPC, and S/390. It translates and emulates machine instructions to do
				2696	strong checks for uninitialized data (at the level of individual bits), memory
				2697	accesses through bad pointers, and memory leaks.
				2698
				2699	Valgrind does not always support every possible instruction, in particular
				2700	ones recently added to an ISA. Valgrind might therefore be incompatible with
				2701	a recent GMP or even a less recent GMP which is compiled using a recent GCC.
				2702
				2703	GMP's assembly code sometimes promotes a read of the limbs to some larger size,
				2704	for efficiency. GMP will do this even at the start and end of a multilimb
				2705	operand, using naturally aligned operations on the larger type. This may lead
				2706	to benign reads outside of allocated areas, triggering complaints from
				2707	Valgrind. Valgrind's option @samp{--partial-loads-ok=yes} should help.
				2708
				2709	@item Other Problems
				2710	Any suspected bug in GMP itself should be isolated to make sure it's not an
				2711	application problem, see @ref{Reporting Bugs}.
				2712	@end table
				2713
				2714
				2715	@node Profiling, Autoconf, Debugging, GMP Basics
				2716	@section Profiling
				2717	@cindex Profiling
				2718	@cindex Execution profiling
				2719	@cindex @code{--enable-profiling}
				2720
				2721	Running a program under a profiler is a good way to find where it's spending
				2722	most time and where improvements can be best sought. The profiling choices
				2723	for a GMP build are as follows.
				2724
				2725	@table @asis
				2726	@item @samp{--disable-profiling}
				2727	The default is to add nothing special for profiling.
				2728
				2729	It should be possible to just compile the mainline of a program with @code{-p}
				2730	and use @command{prof} to get a profile consisting of timer-based sampling of
				2731	the program counter. Most of the GMP assembly code has the necessary symbol
				2732	information.
				2733
				2734	This approach has the advantage of minimizing interference with normal program
				2735	operation, but on most systems the resolution of the sampling is quite low (10
				2736	milliseconds for instance), requiring long runs to get accurate information.
				2737
				2738	@item @samp{--enable-profiling=prof}
				2739	@cindex @code{prof}
				2740	Build with support for the system @command{prof}, which means @samp{-p} added
				2741	to the @samp{CFLAGS}.
				2742
				2743	This provides call counting in addition to program counter sampling, which
				2744	allows the most frequently called routines to be identified, and an average
				2745	time spent in each routine to be determined.
				2746
				2747	The x86 assembly code has support for this option, but on other processors
				2748	the assembly routines will be as if compiled without @samp{-p} and therefore
				2749	won't appear in the call counts.
				2750
				2751	On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
				2752	this case @samp{--enable-profiling=gprof} described below should be used
				2753	instead.
				2754
				2755	@item @samp{--enable-profiling=gprof}
				2756	@cindex @code{gprof}
				2757	Build with support for @command{gprof}, which means @samp{-pg} added to the
				2758	@samp{CFLAGS}.
				2759
				2760	This provides call graph construction in addition to call counting and program
				2761	counter sampling, which makes it possible to count calls coming from different
				2762	locations. For example the number of calls to @code{mpn_mul} from
				2763	@code{mpz_mul} versus the number from @code{mpf_mul}. The program counter
				2764	sampling is still flat though, so only a total time in @code{mpn_mul} would be
				2765	accumulated, not a separate amount for each call site.
				2766
				2767	The x86 assembly code has support for this option, but on other processors
				2768	the assembly routines will be as if compiled without @samp{-pg} and therefore
				2769	not be included in the call counts.
				2770
				2771	On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
				2772	incompatible, so the latter is omitted from the default flags in that case,
				2773	which might result in poorer code generation.
				2774
				2775	Incidentally, it should be possible to use the @command{gprof} program with a
				2776	plain @samp{--enable-profiling=prof} build. But in that case only the
				2777	@samp{gprof -p} flat profile and call counts can be expected to be valid, not
				2778	the @samp{gprof -q} call graph.
				2779
				2780	@item @samp{--enable-profiling=instrument}
				2781	@cindex @code{-finstrument-functions}
				2782	@cindex @code{instrument-functions}
				2783	Build with the GCC option @samp{-finstrument-functions} added to the
				2784	@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
				2785	Using the GNU Compiler Collection (GCC)}).
				2786
				2787	This inserts special instrumenting calls at the start and end of each
				2788	function, allowing exact timing and full call graph construction.
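
With GCC the calls inserted are to @code{__cyg_profile_func_enter} and
@code{__cyg_profile_func_exit}. As a rough sketch, an application could
supply its own versions which merely count calls. The counters and the
accessor names below are hypothetical, and a real profiler would record
@code{this_fn} and a timestamp as well.

@example
static unsigned long enter_count = 0;
static unsigned long exit_count = 0;

/* The hooks themselves must not be instrumented, or they would
   call themselves recursively.  */
void __cyg_profile_func_enter (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));
void __cyg_profile_func_exit (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));
unsigned long instrumented_entries (void)
  __attribute__ ((no_instrument_function));
unsigned long instrumented_exits (void)
  __attribute__ ((no_instrument_function));

void
__cyg_profile_func_enter (void *this_fn, void *call_site)
@{
  (void) this_fn;
  (void) call_site;
  enter_count++;
@}

void
__cyg_profile_func_exit (void *this_fn, void *call_site)
@{
  (void) this_fn;
  (void) call_site;
  exit_count++;
@}

unsigned long
instrumented_entries (void)
@{
  return enter_count;
@}

unsigned long
instrumented_exits (void)
@{
  return exit_count;
@}
@end example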
				2789
				2790	This instrumenting is not normally a standard system feature and will require
				2791	support from an external library, such as
				2792
				2793	@cindex FunctionCheck
				2794	@cindex fnccheck
				2795	@display
				2796	@uref{https://sourceforge.net/projects/fnccheck/}
				2797	@end display
				2798
				2799	This should be included in @samp{LIBS} during the GMP configure so that test
				2800	programs will link. For example,
				2801
				2802	@example
				2803	./configure --enable-profiling=instrument LIBS=-lfc
				2804	@end example
				2805
				2806	On a GNU system the C library provides dummy instrumenting functions, so
				2807	programs compiled with this option will link. In this case it's only
				2808	necessary to ensure the correct library is added when linking an application.
				2809
				2810	The x86 assembly code supports this option, but on other processors the
				2811	assembly routines will be as if compiled without
				2812	@samp{-finstrument-functions} meaning time spent in them will effectively be
				2813	attributed to their caller.
				2814	@end table
				2815
				2816
				2817	@node Autoconf, Emacs, Profiling, GMP Basics
				2818	@section Autoconf
				2819	@cindex Autoconf
				2820
				2821	Autoconf based applications can easily check whether GMP is installed. The
				2822	only thing to be noted is that GMP library symbols from version 3 onwards have
				2823	prefixes like @code{__gmpz}. The following therefore would be a simple test,
				2824
				2825	@cindex @code{AC_CHECK_LIB}
				2826	@example
				2827	AC_CHECK_LIB(gmp, __gmpz_init)
				2828	@end example
				2829
				2830	This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
				2831	but an application that must have GMP would want to generate an error if not
				2832	found. For example,
				2833
				2834	@example
				2835	AC_CHECK_LIB(gmp, __gmpz_init, ,
				2836	  [AC_MSG_ERROR([GNU MP not found, see https://gmplib.org/])])
				2837	@end example
				2838
				2839	If functions added in some particular version of GMP are required, then one of
				2840	those can be used when checking. For example @code{mpz_mul_si} was added in
				2841	GMP 3.1,
				2842
				2843	@example
				2844	AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
				2845	  [AC_MSG_ERROR(
				2846	  [GNU MP not found, or not 3.1 or up, see https://gmplib.org/])])
				2847	@end example
				2848
				2849	An alternative would be to test the version number in @file{gmp.h} using say
				2850	@code{AC_EGREP_CPP}. That would make it possible to test the exact version,
				2851	if some particular sub-minor release is known to be necessary.
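
The version macros in @file{gmp.h} which such a test would examine can also
be compared directly in C. The following is only a sketch and the function
name @code{gmp_header_at_least} is hypothetical. Note that it reflects the
@file{gmp.h} seen at compile time, not the library found at run time.

@example
#include <gmp.h>

/* Return non-zero if the gmp.h used for compilation is version
   major.minor.patch or newer.  */
int
gmp_header_at_least (int major, int minor, int patch)
@{
  if (__GNU_MP_VERSION != major)
    return __GNU_MP_VERSION > major;
  if (__GNU_MP_VERSION_MINOR != minor)
    return __GNU_MP_VERSION_MINOR > minor;
  return __GNU_MP_VERSION_PATCHLEVEL >= patch;
@}
@end example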
				2852
				2853	In general it's recommended that applications should simply demand a new
				2854	enough GMP rather than trying to provide supplements for features not
				2855	available in past versions.
				2856
				2857	Occasionally an application will need or want to know the size of a type at
				2858	configuration or preprocessing time, not just with @code{sizeof} in the code.
				2859	This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
				2860	up is best for this, since prior versions needed certain @samp{-D} defines on
				2861	systems using a @code{long long} limb. The following would suit Autoconf 2.50
				2862	or up,
				2863
				2864	@example
				2865	AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
				2866	@end example
				2867
				2868
				2869	@node Emacs, , Autoconf, GMP Basics
				2870	@section Emacs
				2871	@cindex Emacs
				2872	@cindex @code{info-lookup-symbol}
				2873
				2874	@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
				2875	on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
				2876	emacs, The Emacs Editor}).
				2877
				2878	The GMP manual can be included in such lookups by putting the following in
				2879	your @file{.emacs},
				2880
				2881	@c This isn't pretty, but there doesn't seem to be a better way (in emacs
				2882	@c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s,
				2883	@c but that function isn't documented, whereas info-lookup-alist is.
				2884	@c
				2885	@example
				2886	(eval-after-load "info-look"
				2887	  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
				2888	     (setcar (nthcdr 3 mode-value)
				2889	             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
				2890	                   (nth 3 mode-value)))))
				2891	@end example
				2892
				2893
				2894	@node Reporting Bugs, Integer Functions, GMP Basics, Top
				2895	@comment node-name, next, previous, up
				2896	@chapter Reporting Bugs
				2897	@cindex Reporting bugs
				2898	@cindex Bug reporting
				2899
				2900	If you think you have found a bug in the GMP library, please investigate it
				2901	and report it. We have made this library available to you, and it is not too
				2902	much to ask you to report the bugs you find.
				2903
				2904	Before you report a bug, check it's not already addressed in @ref{Known Build
				2905	Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
				2906	to check @uref{https://gmplib.org/} for patches for this release.
				2907
				2908	Please include the following in any report,
				2909
				2910	@itemize @bullet
				2911	@item
				2912	The GMP version number, and if pre-packaged or patched then say so.
				2913
				2914	@item
				2915	A test program that makes it possible for us to reproduce the bug. Include
				2916	instructions on how to run the program.
				2917
				2918	@item
				2919	A description of what is wrong. If the results are incorrect, say in what way.
				2920	If you get a crash, say so.
				2921
				2922	@item
				2923	If you get a crash, include a stack backtrace from the debugger if it's
				2924	informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).
				2925
				2926	@item
				2927	Please do not send core dumps, executables or @command{strace}s.
				2928
				2929	@item
				2930	The @samp{configure} options you used when building GMP, if any.
				2931
				2932	@item
				2933	The output from @samp{configure}, as printed to stdout, with any options used.
				2934
				2935	@item
				2936	The name of the compiler and its version. For @command{gcc}, get the version
				2937	with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.
				2938
				2939	@item
				2940	The output from running @samp{uname -a}.
				2941
				2942	@item
				2943	The output from running @samp{./config.guess}, and from running
				2944	@samp{./configfsf.guess} (might be the same).
				2945
				2946	@item
				2947	If the bug is related to @samp{configure}, then the compressed contents of
				2948	@file{config.log}.
				2949
				2950	@item
				2951	If the bug is related to an @file{asm} file not assembling, then the contents
				2952	of @file{config.m4} and the offending line or lines from the temporary
				2953	@file{mpn/tmp-<file>.s}.
				2954	@end itemize
				2955
				2956	Please make an effort to produce a self-contained report, with something
				2957	definite that can be tested or debugged. Vague queries or piecemeal messages
				2958	are difficult to act on and don't help the development effort.
				2959
				2960	It is not uncommon that an observed problem is actually due to a bug in the
				2961	compiler; the GMP code tends to explore interesting corners in compilers.
				2962
				2963	If your bug report is good, we will do our best to help you get a corrected
				2964	version of the library; if the bug report is poor, we won't do anything about
				2965	it (except maybe ask you to send a better report).
				2966
				2967	Send your report to: @email{gmp-bugs@@gmplib.org}.
				2968
				2969	If you think something in this manual is unclear, or downright incorrect, or if
				2970	the language needs to be improved, please send a note to the same address.
				2971
				2972
				2973	@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
				2974	@comment node-name, next, previous, up
				2975	@chapter Integer Functions
				2976	@cindex Integer functions
				2977
				2978	This chapter describes the GMP functions for performing integer arithmetic.
				2979	These functions start with the prefix @code{mpz_}.
				2980
				2981	GMP integers are stored in objects of type @code{mpz_t}.
				2982
				2983	@menu
				2984	* Initializing Integers::
				2985	* Assigning Integers::
				2986	* Simultaneous Integer Init & Assign::
				2987	* Converting Integers::
				2988	* Integer Arithmetic::
				2989	* Integer Division::
				2990	* Integer Exponentiation::
				2991	* Integer Roots::
				2992	* Number Theoretic Functions::
				2993	* Integer Comparisons::
				2994	* Integer Logic and Bit Fiddling::
				2995	* I/O of Integers::
				2996	* Integer Random Numbers::
				2997	* Integer Import and Export::
				2998	* Miscellaneous Integer Functions::
				2999	* Integer Special Functions::
				3000	@end menu
				3001
				3002	@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
				3003	@comment node-name, next, previous, up
				3004	@section Initialization Functions
				3005	@cindex Integer initialization functions
				3006	@cindex Initialization functions
				3007
				3008	The functions for integer arithmetic assume that all integer objects are
				3009	initialized. You do that by calling the function @code{mpz_init}. For
				3010	example,
				3011
				3012	@example
				3013	@{
				3014	  mpz_t integ;
				3015	  mpz_init (integ);
				3016	  @dots{}
				3017	  mpz_add (integ, @dots{});
				3018	  @dots{}
				3019	  mpz_sub (integ, @dots{});
				3020
				3021	  /* Unless the program is about to exit, do ... */
				3022	  mpz_clear (integ);
				3023	@}
				3024	@end example
				3025
				3026	As you can see, you can store new values any number of times, once an
				3027	object is initialized.
				3028
				3029	@deftypefun void mpz_init (mpz_t @var{x})
				3030	Initialize @var{x}, and set its value to 0.
				3031	@end deftypefun
				3032
				3033	@deftypefun void mpz_inits (mpz_t @var{x}, ...)
				3034	Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
				3035	values to 0.
				3036	@end deftypefun
				3037
				3038	@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
				3039	Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
				3040	Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
				3041	necessary; reallocation is handled automatically by GMP when needed.
				3042
				3043	While @var{n} defines the initial space, @var{x} will grow automatically in the
				3044	normal way, if necessary, for subsequent values stored. @code{mpz_init2} makes
				3045	it possible to avoid such reallocations if a maximum size is known in advance.
				3046
				3047	In preparation for an operation, GMP often allocates one limb more than
				3048	ultimately needed. To make sure GMP will not perform reallocation for
				3049	@var{x}, you need to add the number of bits in @code{mp_limb_t} to @var{n}.
				3050	@end deftypefun
				3051
				3052	@deftypefun void mpz_clear (mpz_t @var{x})
				3053	Free the space occupied by @var{x}. Call this function for all @code{mpz_t}
				3054	variables when you are done with them.
				3055	@end deftypefun
				3056
				3057	@deftypefun void mpz_clears (mpz_t @var{x}, ...)
				3058	Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
				3059	@end deftypefun
				3060
				3061	@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
				3062	Change the space allocated for @var{x} to @var{n} bits. The value in @var{x}
				3063	is preserved if it fits, or is set to 0 if not.
				3064
				3065	Calling this function is never necessary; reallocation is handled automatically
				3066	by GMP when needed. But this function can be used to increase the space for a
				3067	variable in order to avoid repeated automatic reallocations, or to decrease it
				3068	to give memory back to the heap.
				3069	@end deftypefun
				3070
				3071
				3072	@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
				3073	@comment node-name, next, previous, up
				3074	@section Assignment Functions
				3075	@cindex Integer assignment functions
				3076	@cindex Assignment functions
				3077
				3078	These functions assign new values to already initialized integers
				3079	(@pxref{Initializing Integers}).
				3080
				3081	@deftypefun void mpz_set (mpz_t @var{rop}, const mpz_t @var{op})
				3082	@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
				3083	@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
				3084	@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
				3085	@deftypefunx void mpz_set_q (mpz_t @var{rop}, const mpq_t @var{op})
				3086	@deftypefunx void mpz_set_f (mpz_t @var{rop}, const mpf_t @var{op})
				3087	Set the value of @var{rop} from @var{op}.
				3088
				3089	@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
				3090	make it an integer.
				3091	@end deftypefun
				3092
				3093	@deftypefun int mpz_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
				3094	Set the value of @var{rop} from @var{str}, a null-terminated C string in base
				3095	@var{base}. White space is allowed in the string, and is simply ignored.
				3096
				3097	The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
				3098	characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
				3099	@code{0B} for binary, @code{0} for octal, or decimal otherwise.
				3100
				3101	For bases up to 36, case is ignored; upper-case and lower-case letters have
				3102	the same value. For bases 37 to 62, upper-case letters represent the usual
				3103	10..35 while lower-case letters represent 36..61.
				3104
				3105	This function returns 0 if the entire string is a valid number in base
				3106	@var{base}. Otherwise it returns @minus{}1.
				3107	@c
				3108	@c It turns out that it is not entirely true that this function ignores
				3109	@c white-space. It does ignore it between digits, but not after a minus sign
				3110	@c or within or after ``0x''. Some thought was given to disallowing all
				3111	@c whitespace, but that would be an incompatible change, whitespace has been
				3112	@c documented as ignored ever since GMP 1.
				3113	@c
				3114	@end deftypefun
				3115
				3116	@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
				3117	Swap the values @var{rop1} and @var{rop2} efficiently.
				3118	@end deftypefun
				3119
				3120
				3121	@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
				3122	@comment node-name, next, previous, up
				3123	@section Combined Initialization and Assignment Functions
				3124	@cindex Integer assignment functions
				3125	@cindex Assignment functions
				3126	@cindex Integer initialization functions
				3127	@cindex Initialization functions
				3128
				3129	For convenience, GMP provides a parallel series of initialize-and-set functions
				3130	which initialize the output and then store the value there. These functions'
				3131	names have the form @code{mpz_init_set@dots{}}.
				3132
				3133	Here is an example of using one:
				3134
				3135	@example
				3136	@{
				3137	  mpz_t pie;
				3138	  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
				3139	  @dots{}
				3140	  mpz_sub (pie, @dots{});
				3141	  @dots{}
				3142	  mpz_clear (pie);
				3143	@}
				3144	@end example
				3145
				3146	@noindent
				3147	Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
				3148	functions, it can be used as the source or destination operand for the ordinary
				3149	integer functions. Don't use an initialize-and-set function on a variable
				3150	already initialized!
				3151
				3152	@deftypefun void mpz_init_set (mpz_t @var{rop}, const mpz_t @var{op})
				3153	@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
				3154	@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
				3155	@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
				3156	Initialize @var{rop} with limb space and set the initial numeric value from
				3157	@var{op}.
				3158	@end deftypefun
				3159
				3160	@deftypefun int mpz_init_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
				3161	Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
				3162	documentation above for details).
				3163
				3164	If the string is a correct base @var{base} number, the function returns 0;
				3165	if an error occurs it returns @minus{}1. @var{rop} is initialized even if
				3166	an error occurs. (I.e., you have to call @code{mpz_clear} for it.)
				3167	@end deftypefun
				3168
				3169
				3170	@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
				3171	@comment node-name, next, previous, up
				3172	@section Conversion Functions
				3173	@cindex Integer conversion functions
				3174	@cindex Conversion functions
				3175
				3176	This section describes functions for converting GMP integers to standard C
				3177	types. Functions for converting @emph{to} GMP integers are described in
				3178	@ref{Assigning Integers} and @ref{I/O of Integers}.
				3179
				3180	@deftypefun {unsigned long int} mpz_get_ui (const mpz_t @var{op})
				3181	Return the value of @var{op} as an @code{unsigned long}.
				3182
				3183	If @var{op} is too big to fit an @code{unsigned long} then just the least
				3184	significant bits that do fit are returned. The sign of @var{op} is ignored,
				3185	only the absolute value is used.
				3186	@end deftypefun
				3187
				3188	@deftypefun {signed long int} mpz_get_si (const mpz_t @var{op})
				3189	If @var{op} fits into a @code{signed long int} return the value of @var{op}.
				3190	Otherwise return the least significant part of @var{op}, with the same sign
				3191	as @var{op}.
				3192
				3193	If @var{op} is too big to fit in a @code{signed long int}, the returned
				3194	result is probably not very useful. To find out if the value will fit, use
				3195	the function @code{mpz_fits_slong_p}.
				3196	@end deftypefun
				3197
				3198	@deftypefun double mpz_get_d (const mpz_t @var{op})
				3199	Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
				3200	towards zero).
				3201
				3202	If the exponent from the conversion is too big, the result is system
				3203	dependent. An infinity is returned where available. A hardware overflow trap
				3204	may or may not occur.
				3205	@end deftypefun
				3206
				3207	@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, const mpz_t @var{op})
				3208	Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
				3209	towards zero), and returning the exponent separately.
				3210
				3211	The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
				3212	exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} *
				3213	2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the
				3214	return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
				3215
				3216	@cindex @code{frexp}
				3217	This is similar to the standard C @code{frexp} function (@pxref{Normalization
				3218	Functions,,, libc, The GNU C Library Reference Manual}).
				3219	@end deftypefun
				3220
				3221	@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, const mpz_t @var{op})
				3222	Convert @var{op} to a string of digits in base @var{base}. The base argument
				3223	may vary from 2 to 62 or from @minus{}2 to @minus{}36.
				3224
				3225	For @var{base} in the range 2..36, digits and lower-case letters are used; for
				3226	@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
				3227	digits, upper-case letters, and lower-case letters (in that significance order)
				3228	are used.
				3229
				3230	If @var{str} is @code{NULL}, the result string is allocated using the current
				3231	allocation function (@pxref{Custom Allocation}). The block will be
				3232	@code{strlen(str)+1} bytes, where @code{str} here is the returned string, that
				3233	being exactly enough for the string and null-terminator.
				3234
				3235	If @var{str} is not @code{NULL}, it should point to a block of storage large
				3236	enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
				3237	+ 2}. The two extra bytes are for a possible minus sign, and the
				3238	null-terminator.
				3239
				3240	A pointer to the result string is returned, being either the allocated block,
				3241	or the given @var{str}.
				3242	@end deftypefun
				3243
				3244
				3245	@need 2000
				3246	@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
				3247	@comment node-name, next, previous, up
				3248	@section Arithmetic Functions
				3249	@cindex Integer arithmetic functions
				3250	@cindex Arithmetic functions
				3251
				3252	@deftypefun void mpz_add (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3253	@deftypefunx void mpz_add_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
				3254	Set @var{rop} to @math{@var{op1} + @var{op2}}.
				3255	@end deftypefun
				3256
				3257	@deftypefun void mpz_sub (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3258	@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
				3259	@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, const mpz_t @var{op2})
				3260	Set @var{rop} to @var{op1} @minus{} @var{op2}.
				3261	@end deftypefun
				3262
				3263	@deftypefun void mpz_mul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3264	@deftypefunx void mpz_mul_si (mpz_t @var{rop}, const mpz_t @var{op1}, long int @var{op2})
				3265	@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
				3266	Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
				3267	@end deftypefun
				3268
				3269	@deftypefun void mpz_addmul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3270	@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
				3271	Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
				3272	@end deftypefun
				3273
				3274	@deftypefun void mpz_submul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3275	@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
				3276	Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
				3277	@end deftypefun
				3278
				3279	@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, const mpz_t @var{op1}, mp_bitcnt_t @var{op2})
				3280	@cindex Bit shift left
				3281	Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
				3282	@var{op2}}. This operation can also be defined as a left shift by @var{op2}
				3283	bits.
				3284	@end deftypefun
				3285
				3286	@deftypefun void mpz_neg (mpz_t @var{rop}, const mpz_t @var{op})
				3287	Set @var{rop} to @minus{}@var{op}.
				3288	@end deftypefun
				3289
				3290	@deftypefun void mpz_abs (mpz_t @var{rop}, const mpz_t @var{op})
				3291	Set @var{rop} to the absolute value of @var{op}.
				3292	@end deftypefun
				3293
				3294
				3295	@need 2000
				3296	@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
				3297	@section Division Functions
				3298	@cindex Integer division functions
				3299	@cindex Division functions
				3300
				3301	Division is undefined if the divisor is zero. Passing a zero divisor to the
				3302	division or modulo functions (including the modular powering functions
				3303	@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by
				3304	zero. This lets a program handle arithmetic exceptions in these functions the
				3305	same way as for normal C @code{int} arithmetic.
				3306
				3307	@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
				3308	@c between each, and seem to let tex do a better job of page breaks than an
				3309	@c @sp 1 in the middle of one big set.
				3310
				3311	@deftypefun void mpz_cdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
				3312	@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
				3313	@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
				3314	@maybepagebreak
				3315	@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3316	@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3317	@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
				3318	@deftypefunx {unsigned long int} mpz_cdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3319	@maybepagebreak
				3320	@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
				3321	@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
				3322	@end deftypefun
				3323
				3324	@deftypefun void mpz_fdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
				3325	@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
				3326	@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
				3327	@maybepagebreak
				3328	@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3329	@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3330	@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
				3331	@deftypefunx {unsigned long int} mpz_fdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3332	@maybepagebreak
				3333	@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
				3334	@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
				3335	@end deftypefun
				3336
				3337	@deftypefun void mpz_tdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
				3338	@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
				3339	@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
				3340	@maybepagebreak
				3341	@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3342	@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3343	@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
				3344	@deftypefunx {unsigned long int} mpz_tdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3345	@maybepagebreak
				3346	@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
				3347	@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
				3348	@cindex Bit shift right
				3349
				3350	@sp 1
				3351	Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
				3352	@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
				3353	The rounding is in three styles, each suiting different applications.
				3354
				3355	@itemize @bullet
				3356	@item
				3357	@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
				3358	have the opposite sign to @var{d}. The @code{c} stands for ``ceil''.
				3359
				3360	@item
				3361	@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
				3362	@var{r} will have the same sign as @var{d}. The @code{f} stands for
				3363	``floor''.
				3364
				3365	@item
				3366	@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
				3367	as @var{n}. The @code{t} stands for ``truncate''.
				3368	@end itemize
				3369
				3370	In all cases @var{q} and @var{r} will satisfy
				3371	@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
				3372	@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.
				3373
				3374	The @code{q} functions calculate only the quotient, the @code{r} functions
				3375	only the remainder, and the @code{qr} functions calculate both. Note that for
				3376	@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
				3377	results will be unpredictable.
				3378
				3379	For the @code{ui} variants the return value is the remainder, and in fact
				3380	returning the remainder is all the @code{div_ui} functions do. For
				3381	@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
				3382	return value is the absolute value of the remainder.
				3383
				3384	For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}. These
				3385	functions are implemented as right shifts and bit masks, but of course they
				3386	round the same as the other functions.
				3387
				3388	For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp}
				3389	are simple bitwise right shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp}
				3390	is effectively an arithmetic right shift treating @var{n} as two's complement
				3391	the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp}
				3392	effectively treats @var{n} as sign and magnitude.
				3393	@end deftypefun
				3394
				3395	@deftypefun void mpz_mod (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
				3396	@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
				3397	Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is
				3398	ignored; the result is always non-negative.
				3399
				3400	@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
				3401	remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only
				3402	the return value is wanted.
				3403	@end deftypefun
				3404
				3405	@deftypefun void mpz_divexact (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
				3406	@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, const mpz_t @var{n}, unsigned long @var{d})
				3407	@cindex Exact division functions
				3408	Set @var{q} to @var{n}/@var{d}. These functions produce correct results only
				3409	when it is known in advance that @var{d} divides @var{n}.
				3410
				3411	These routines are much faster than the other division functions, and are the
				3412	best choice when exact division is known to occur, for example reducing a
				3413	rational to lowest terms.
				3414	@end deftypefun
				3415
				3416	@deftypefun int mpz_divisible_p (const mpz_t @var{n}, const mpz_t @var{d})
				3417	@deftypefunx int mpz_divisible_ui_p (const mpz_t @var{n}, unsigned long int @var{d})
				3418	@deftypefunx int mpz_divisible_2exp_p (const mpz_t @var{n}, mp_bitcnt_t @var{b})
				3419	@cindex Divisibility functions
				3420	Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
				3421	@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.
				3422
				3423	@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying
				3424	@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}. Unlike the other division
				3425	functions, @math{@var{d}=0} is accepted and following the rule it can be seen
				3426	that only 0 is considered divisible by 0.
				3427	@end deftypefun
				3428
				3429	@deftypefun int mpz_congruent_p (const mpz_t @var{n}, const mpz_t @var{c}, const mpz_t @var{d})
				3430	@deftypefunx int mpz_congruent_ui_p (const mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
				3431	@deftypefunx int mpz_congruent_2exp_p (const mpz_t @var{n}, const mpz_t @var{c}, mp_bitcnt_t @var{b})
				3432	@cindex Divisibility functions
				3433	@cindex Congruence functions
				3434	Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
				3435	case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.
				3436
				3437	@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q}
				3438	satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}. Unlike
				3439	the other division functions, @math{@var{d}=0} is accepted and following the
				3440	rule it can be seen that @var{n} and @var{c} are considered congruent mod 0
				3441	only when exactly equal.
				3442	@end deftypefun
				3443
				3444
				3445	@need 2000
				3446	@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
				3447	@section Exponentiation Functions
				3448	@cindex Integer exponentiation functions
				3449	@cindex Exponentiation functions
				3450	@cindex Powering functions
				3451
				3452	@deftypefun void mpz_powm (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
				3453	@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}, const mpz_t @var{mod})
				3454	Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
				3455	modulo @var{mod}}.
				3456
				3457	Negative @var{exp} is supported if the inverse @mm{@var{base}@sup{-1} @bmod
				3458	@var{mod}, @var{base}^(-1) @bmod @var{mod}} exists (see @code{mpz_invert} in
				3459	@ref{Number Theoretic Functions}). If an inverse doesn't exist then a divide
				3460	by zero is raised.
				3461	@end deftypefun
				3462
				3463	@deftypefun void mpz_powm_sec (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
				3464	Set @var{rop} to @m{base^{exp} \bmod @var{mod}, (@var{base} raised to @var{exp})
				3465	modulo @var{mod}}.
				3466
				3467	It is required that @math{@var{exp} > 0} and that @var{mod} is odd.
				3468
				3469	This function is designed to take the same time and have the same cache access
				3470	patterns for any two same-size arguments, assuming that function arguments are
				3471	placed at the same position and that the machine state is identical upon
				3472	function entry. This function is intended for cryptographic purposes, where
				3473	resilience to side-channel attacks is desired.
				3474	@end deftypefun
				3475
				3476	@deftypefun void mpz_pow_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp})
				3477	@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
				3478	Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case
				3479	@math{0^0} yields 1.
				3480	@end deftypefun
				3481
				3482
				3483	@need 2000
				3484	@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
				3485	@section Root Extraction Functions
				3486	@cindex Integer root functions
				3487	@cindex Root extraction functions
				3488
				3489	@deftypefun int mpz_root (mpz_t @var{rop}, const mpz_t @var{op}, unsigned long int @var{n})
				3490	Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
				3491	part of the @var{n}th root of @var{op}. Return non-zero if the computation
				3492	was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
				3493	@end deftypefun
				3494
				3495	@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, const mpz_t @var{u}, unsigned long int @var{n})
				3496	Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated
				3497	integer part of the @var{n}th root of @var{u}. Set @var{rem} to the
				3498	remainder, @m{(@var{u} - @var{root}^n),
				3499	@var{u}@minus{}@var{root}**@var{n}}.
				3500	@end deftypefun
				3501
				3502	@deftypefun void mpz_sqrt (mpz_t @var{rop}, const mpz_t @var{op})
				3503	Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
				3504	integer part of the square root of @var{op}.
				3505	@end deftypefun
				3506
				3507	@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, const mpz_t @var{op})
				3508	Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
				3509	of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the
				3510	remainder @m{(@var{op} - @var{rop1}^2),
				3511	@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
				3512	perfect square.
				3513
				3514	If @var{rop1} and @var{rop2} are the same variable, the results are
				3515	undefined.
				3516	@end deftypefun
				3517
				3518	@deftypefun int mpz_perfect_power_p (const mpz_t @var{op})
				3519	@cindex Perfect power functions
				3520	@cindex Root testing functions
				3521	Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
				3522	@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
				3523	@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.
				3524
				3525	Under this definition both 0 and 1 are considered to be perfect powers.
				3526	Negative values of @var{op} are accepted, but of course can only be odd
				3527	perfect powers.
				3528	@end deftypefun
				3529
				3530	@deftypefun int mpz_perfect_square_p (const mpz_t @var{op})
				3531	@cindex Perfect square functions
				3532	@cindex Root testing functions
				3533	Return non-zero if @var{op} is a perfect square, i.e., if the square root of
				3534	@var{op} is an integer. Under this definition both 0 and 1 are considered to
				3535	be perfect squares.
				3536	@end deftypefun
				3537
				3538
				3539	@need 2000
				3540	@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
				3541	@section Number Theoretic Functions
				3542	@cindex Number theoretic functions
				3543
				3544	@deftypefun int mpz_probab_prime_p (const mpz_t @var{n}, int @var{reps})
				3545	@cindex Prime testing functions
				3546	@cindex Probable prime testing functions
				3547	Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime,
				3548	return 1 if @var{n} is probably prime (without being certain), or return 0 if
				3549	@var{n} is definitely non-prime.
				3550
				3551	This function performs some trial divisions, a Baillie-PSW probable prime
				3552	test, then @math{@var{reps}-24} Miller-Rabin probabilistic primality tests. A
				3553	higher @var{reps} value will reduce the chances of a non-prime being
				3554	identified as ``probably prime''. A composite number will be identified as a
				3555	prime with an asymptotic probability of less than @m{4^{-reps},4^(-@var{reps})}.
				3556	Reasonable values of @var{reps} are between 15 and 50.
				3557
				3558	GMP versions up to and including 6.1.2 did not use the Baillie-PSW
				3559	primality test. In those older versions of GMP, this function performed
				3560	@var{reps} Miller-Rabin tests.
				3561	@end deftypefun
				3562
				3563	@deftypefun void mpz_nextprime (mpz_t @var{rop}, const mpz_t @var{op})
				3564	@cindex Next prime function
				3565	Set @var{rop} to the next prime greater than @var{op}.
				3566
				3567	This function uses a probabilistic algorithm to identify primes. For
				3568	practical purposes it's adequate; the chance of a composite passing is
				3569	extremely small.
				3570	@end deftypefun
				3571
				3572	@c mpz_prime_p not implemented as of gmp 3.0.
				3573
				3574	@c @deftypefun int mpz_prime_p (const mpz_t @var{n})
				3575	@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
				3576	@c This function is far slower than @code{mpz_probab_prime_p}, but then it
				3577	@c never returns non-zero for composite numbers.
				3578
				3579	@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
				3580	@c The likelihood of a programming error or hardware malfunction is orders
				3581	@c of magnitudes greater than the likelihood for a composite to pass as a
				3582	@c prime, if the @var{reps} argument is in the suggested range.)
				3583	@c @end deftypefun
				3584
				3585	@deftypefun void mpz_gcd (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3586	@cindex Greatest common divisor functions
				3587	@cindex GCD functions
				3588	Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}. The
				3589	result is always positive even if one or both input operands are negative,
				3590	except when both inputs are zero, in which case this function defines @math{gcd(0,0) = 0}.
				3591	@end deftypefun
				3592
				3593	@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
				3594	Compute the greatest common divisor of @var{op1} and @var{op2}. If
				3595	@var{rop} is not @code{NULL}, store the result there.
				3596
				3597	If the result is small enough to fit in an @code{unsigned long int}, it is
				3598	returned. If the result does not fit, 0 is returned, and the result is equal
				3599	to the argument @var{op1}. Note that the result will always fit if @var{op2}
				3600	is non-zero.
				3601	@end deftypefun
				3602
				3603	@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, const mpz_t @var{a}, const mpz_t @var{b})
				3604	@cindex Extended GCD
				3605	@cindex GCD extended
				3606	Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in
				3607	addition set @var{s} and @var{t} to coefficients satisfying
				3608	@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}.
				3609	The value in @var{g} is always positive, even if one or both of @var{a} and
				3610	@var{b} are negative (or zero if both inputs are zero). The values in @var{s}
				3611	and @var{t} are chosen such that normally, @math{@GMPabs{@var{s}} <
				3612	@GMPabs{@var{b}} / (2 @var{g})} and @math{@GMPabs{@var{t}} < @GMPabs{@var{a}}
				3613	/ (2 @var{g})}, and these relations define @var{s} and @var{t} uniquely. There
				3614	are a few exceptional cases:
				3615
				3616	If @math{@GMPabs{@var{a}} = @GMPabs{@var{b}}}, then @math{@var{s} = 0},
				3617	@math{@var{t} = sgn(@var{b})}.
				3618
				3619	Otherwise, @math{@var{s} = sgn(@var{a})} if @math{@var{b} = 0} or
				3620	@math{@GMPabs{@var{b}} = 2 @var{g}}, and @math{@var{t} = sgn(@var{b})} if
				3621	@math{@var{a} = 0} or @math{@GMPabs{@var{a}} = 2 @var{g}}.
				3622
				3623	In all cases, @math{@var{s} = 0} if and only if @math{@var{g} =
				3624	@GMPabs{@var{b}}}, i.e., if @var{b} divides @var{a} or @math{@var{a} = @var{b}
				3625	= 0}.
				3626
				3627	If @var{t} or @var{g} is @code{NULL} then that value is not computed.
				3628	@end deftypefun
				3629
				3630	@deftypefun void mpz_lcm (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3631	@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long @var{op2})
				3632	@cindex Least common multiple functions
				3633	@cindex LCM functions
				3634	Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
				3635	@var{rop} is never negative, irrespective of the signs of @var{op1} and
				3636	@var{op2}, and is zero if either @var{op1} or @var{op2} is zero.
				3637	@end deftypefun
				3638
				3639	@deftypefun int mpz_invert (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3640	@cindex Modular inverse functions
				3641	@cindex Inverse modulo functions
				3642	Compute the inverse of @var{op1} modulo @var{op2} and put the result in
				3643	@var{rop}. If the inverse exists, the return value is non-zero and @var{rop}
				3644	will satisfy @math{0 @le{} @var{rop} < @GMPabs{@var{op2}}} (with @math{@var{rop}
				3645	= 0} possible only when @math{@GMPabs{@var{op2}} = 1}, i.e., in the
				3646	somewhat degenerate zero ring). If an inverse doesn't
				3647	exist the return value is zero and @var{rop} is undefined. The behaviour of
				3648	this function is undefined when @var{op2} is zero.
				3649	@end deftypefun
				3650
				3651	@deftypefun int mpz_jacobi (const mpz_t @var{a}, const mpz_t @var{b})
				3652	@cindex Jacobi symbol functions
				3653	Calculate the Jacobi symbol @m{\left(a \over b\right),
				3654	(@var{a}/@var{b})}. This is defined only for @var{b} odd.
				3655	@end deftypefun
				3656
				3657	@deftypefun int mpz_legendre (const mpz_t @var{a}, const mpz_t @var{p})
				3658	@cindex Legendre symbol functions
				3659	Calculate the Legendre symbol @m{\left(a \over p\right),
				3660	(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive
				3661	prime, and for such @var{p} it's identical to the Jacobi symbol.
				3662	@end deftypefun
				3663
				3664	@deftypefun int mpz_kronecker (const mpz_t @var{a}, const mpz_t @var{b})
				3665	@deftypefunx int mpz_kronecker_si (const mpz_t @var{a}, long @var{b})
				3666	@deftypefunx int mpz_kronecker_ui (const mpz_t @var{a}, unsigned long @var{b})
				3667	@deftypefunx int mpz_si_kronecker (long @var{a}, const mpz_t @var{b})
				3668	@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, const mpz_t @var{b})
				3669	@cindex Kronecker symbol functions
				3670	Calculate the Jacobi symbol @m{\left(a \over b\right),
				3671	(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
				3672	2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} is odd, or
				3673	@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} is even.
				3674
				3675	When @var{b} is odd the Jacobi symbol and Kronecker symbol are
				3676	identical, so @code{mpz_kronecker_ui} etc.@: can be used for mixed
				3677	precision Jacobi symbols too.
				3678
				3679	For more information see Henri Cohen section 1.4.2 (@pxref{References}),
				3680	or any number theory textbook. See also the example program
				3681	@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
				3682	@end deftypefun
				3683
				3684	@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, const mpz_t @var{op}, const mpz_t @var{f})
				3685	@cindex Remove factor functions
				3686	@cindex Factor removal functions
				3687	Remove all occurrences of the factor @var{f} from @var{op} and store the
				3688	result in @var{rop}. The return value is how many such occurrences were
				3689	removed.
				3690	@end deftypefun
				3691
				3692	@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
				3693	@deftypefunx void mpz_2fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
				3694	@deftypefunx void mpz_mfac_uiui (mpz_t @var{rop}, unsigned long int @var{n}, unsigned long int @var{m})
				3695	@cindex Factorial functions
				3696	Set @var{rop} to the factorial of @var{n}: @code{mpz_fac_ui} computes the plain factorial @var{n}!,
				3697	@code{mpz_2fac_ui} computes the double-factorial @var{n}!!, and @code{mpz_mfac_uiui} the
				3698	@var{m}-multi-factorial @m{n!^{(m)}, @var{n}!^(@var{m})}.
				3699	@end deftypefun
				3700
				3701	@deftypefun void mpz_primorial_ui (mpz_t @var{rop}, unsigned long int @var{n})
				3702	@cindex Primorial functions
				3703	Set @var{rop} to the primorial of @var{n}, i.e., the product of all positive
				3704	prime numbers @math{@le{}@var{n}}.
				3705	@end deftypefun
				3706
				3707	@deftypefun void mpz_bin_ui (mpz_t @var{rop}, const mpz_t @var{n}, unsigned long int @var{k})
				3708	@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
				3709	@cindex Binomial coefficient functions
				3710	Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
				3711	@var{k}} and store the result in @var{rop}. Negative values of @var{n} are
				3712	supported by @code{mpz_bin_ui}, using the identity
				3713	@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
				3714	bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6
				3715	part G.
				3716	@end deftypefun
				3717
				3718	@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
				3719	@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
				3720	@cindex Fibonacci sequence functions
				3721	@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
				3722	number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
				3723	@m{F_{n-1},F[n-1]}.
				3724
				3725	These functions are designed for calculating isolated Fibonacci numbers. When
				3726	a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
				3727	iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
				3728	similar.
				3729	@end deftypefun
				3730
				3731	@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
				3732	@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
				3733	@cindex Lucas number functions
				3734	@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
				3735	number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
				3736	to @m{L_{n-1},L[n-1]}.
				3737
				3738	These functions are designed for calculating isolated Lucas numbers. When a
				3739	sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
				3740	iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
				3741	similar.
				3742
				3743	The Fibonacci numbers and Lucas numbers are related sequences, so it's never
				3744	necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The
				3745	formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
				3746	Algorithm}; the reverse is straightforward too.
				3747	@end deftypefun
				3748
				3749
				3750	@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
				3751	@comment node-name, next, previous, up
				3752	@section Comparison Functions
				3753	@cindex Integer comparison functions
				3754	@cindex Comparison functions
				3755
				3756	@deftypefn Function int mpz_cmp (const mpz_t @var{op1}, const mpz_t @var{op2})
				3757	@deftypefnx Function int mpz_cmp_d (const mpz_t @var{op1}, double @var{op2})
				3758	@deftypefnx Macro int mpz_cmp_si (const mpz_t @var{op1}, signed long int @var{op2})
				3759	@deftypefnx Macro int mpz_cmp_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
				3760	Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
				3761	@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if
				3762	@math{@var{op1} < @var{op2}}.
				3763
				3764	@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their
				3765	arguments more than once. @code{mpz_cmp_d} can be called with an infinity,
				3766	but results are undefined for a NaN.
				3767	@end deftypefn
				3768
				3769	@deftypefn Function int mpz_cmpabs (const mpz_t @var{op1}, const mpz_t @var{op2})
				3770	@deftypefnx Function int mpz_cmpabs_d (const mpz_t @var{op1}, double @var{op2})
				3771	@deftypefnx Function int mpz_cmpabs_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
				3772	Compare the absolute values of @var{op1} and @var{op2}. Return a positive
				3773	value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
				3774	@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
				3775	@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.
				3776
				3777	@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined
				3778	for a NaN.
				3779	@end deftypefn
				3780
				3781	@deftypefn Macro int mpz_sgn (const mpz_t @var{op})
				3782	@cindex Sign tests
				3783	@cindex Integer sign tests
				3784	Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
				3785	@math{-1} if @math{@var{op} < 0}.
				3786
				3787	This function is actually implemented as a macro. It evaluates its argument
				3788	multiple times.
				3789	@end deftypefn
				3790
				3791
				3792	@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
				3793	@comment node-name, next, previous, up
				3794	@section Logical and Bit Manipulation Functions
				3795	@cindex Logical functions
				3796	@cindex Bit manipulation functions
				3797	@cindex Integer logical functions
				3798	@cindex Integer bit manipulation functions
				3799
				3800	These functions behave as if two's complement arithmetic were used (although
				3801	sign-magnitude is the actual implementation). The least significant bit is
				3802	number 0.
				3803
				3804	@deftypefun void mpz_and (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3805	Set @var{rop} to @var{op1} bitwise-and @var{op2}.
				3806	@end deftypefun
				3807
				3808	@deftypefun void mpz_ior (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3809	Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
				3810	@end deftypefun
				3811
				3812	@deftypefun void mpz_xor (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
				3813	Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
				3814	@end deftypefun
				3815
				3816	@deftypefun void mpz_com (mpz_t @var{rop}, const mpz_t @var{op})
				3817	Set @var{rop} to the one's complement of @var{op}.
				3818	@end deftypefun
				3819
				3820	@deftypefun {mp_bitcnt_t} mpz_popcount (const mpz_t @var{op})
				3821	If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the
				3822	number of 1 bits in the binary representation. If @math{@var{op}<0}, the
				3823	number of 1s is infinite, and the return value is the largest possible
				3824	@code{mp_bitcnt_t}.
				3825	@end deftypefun
				3826
				3827	@deftypefun {mp_bitcnt_t} mpz_hamdist (const mpz_t @var{op1}, const mpz_t @var{op2})
				3828	If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the
				3829	hamming distance between the two operands, which is the number of bit positions
				3830	where @var{op1} and @var{op2} have different bit values. If one operand is
				3831	@math{@ge{}0} and the other @math{<0} then the number of bits different is
				3832	infinite, and the return value is the largest possible @code{mp_bitcnt_t}.
				3833	@end deftypefun
				3834
				3835	@deftypefun {mp_bitcnt_t} mpz_scan0 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
				3836	@deftypefunx {mp_bitcnt_t} mpz_scan1 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
				3837	@cindex Bit scanning functions
				3838	@cindex Scan bit functions
				3839	Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
				3840	bits, until the first 0 or 1 bit (respectively) is found. Return the index of
				3841	the found bit.
				3842
				3843	If the bit at @var{starting_bit} is already what's sought, then
				3844	@var{starting_bit} is returned.
				3845
				3846	If there's no bit found, then the largest possible @code{mp_bitcnt_t} is
				3847	returned. This will happen in @code{mpz_scan0} past the end of a negative
				3848	number, or @code{mpz_scan1} past the end of a nonnegative number.
				3849	@end deftypefun
				3850
				3851	@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
				3852	Set bit @var{bit_index} in @var{rop}.
				3853	@end deftypefun
				3854
				3855	@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
				3856	Clear bit @var{bit_index} in @var{rop}.
				3857	@end deftypefun
				3858
				3859	@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
				3860	Complement bit @var{bit_index} in @var{rop}.
				3861	@end deftypefun
				3862
				3863	@deftypefun int mpz_tstbit (const mpz_t @var{op}, mp_bitcnt_t @var{bit_index})
				3864	Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
				3865	@end deftypefun
				3866
				3867	@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
				3868	@comment node-name, next, previous, up
				3869	@section Input and Output Functions
				3870	@cindex Integer input and output functions
				3871	@cindex Input functions
				3872	@cindex Output functions
				3873	@cindex I/O functions
				3874
				3875	Functions that perform input from a stdio stream, and functions that output to
				3876	a stdio stream, of @code{mpz} numbers. Passing a @code{NULL} pointer for a
				3877	@var{stream} argument to any of these functions will make them read from
				3878	@code{stdin} and write to @code{stdout}, respectively.
				3879
				3880	When using any of these functions, it is a good idea to include @file{stdio.h}
				3881	before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
				3882	for these functions.
				3883
				3884	See also @ref{Formatted Output} and @ref{Formatted Input}.
				3885
				3886	@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, const mpz_t @var{op})
				3887	Output @var{op} on stdio stream @var{stream}, as a string of digits in base
				3888	@var{base}. The base argument may vary from 2 to 62 or from @minus{}2 to
				3889	@minus{}36.
				3890
				3891	For @var{base} in the range 2..36, digits and lower-case letters are used; for
				3892	@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
				3893	digits, upper-case letters, and lower-case letters (in that significance order)
				3894	are used.
				3895
				3896	Return the number of bytes written, or if an error occurred, return 0.
				3897	@end deftypefun
				3898
				3899	@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
				3900	Input a possibly white-space preceded string in base @var{base} from stdio
				3901	stream @var{stream}, and put the read integer in @var{rop}.
				3902
				3903	The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
				3904	characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
				3905	@code{0B} for binary, @code{0} for octal, or decimal otherwise.
				3906
				3907	For bases up to 36, case is ignored; upper-case and lower-case letters have
				3908	the same value. For bases 37 to 62, upper-case letters represent the usual
				3909	10..35 while lower-case letters represent 36..61.
				3910
				3911	Return the number of bytes read, or if an error occurred, return 0.
				3912	@end deftypefun
				3913
				3914	@deftypefun size_t mpz_out_raw (FILE *@var{stream}, const mpz_t @var{op})
				3915	Output @var{op} on stdio stream @var{stream}, in raw binary format. The
				3916	integer is written in a portable format, with 4 bytes of size information, and
				3917	that many bytes of limbs. Both the size and the limbs are written in
				3918	decreasing significance order (i.e., in big-endian).
				3919
				3920	The output can be read with @code{mpz_inp_raw}.
				3921
				3922	Return the number of bytes written, or if an error occurred, return 0.
				3923
				3924	The output of this cannot be read by @code{mpz_inp_raw} from GMP 1, because
				3925	of changes necessary for compatibility between 32-bit and 64-bit machines.
				3926	@end deftypefun
				3927
				3928	@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
				3929	Input from stdio stream @var{stream} in the format written by
				3930	@code{mpz_out_raw}, and put the result in @var{rop}. Return the number of
				3931	bytes read, or if an error occurred, return 0.
				3932
				3933	This routine can also read the output of @code{mpz_out_raw} from GMP 1, in
				3934	spite of changes necessary for compatibility between 32-bit and 64-bit
				3935	machines.
				3936	@end deftypefun
				3937
				3938
				3939	@need 2000
				3940	@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
				3941	@comment node-name, next, previous, up
				3942	@section Random Number Functions
				3943	@cindex Integer random number functions
				3944	@cindex Random number functions
				3945
				3946	The random number functions of GMP come in two groups: older functions
				3947	that rely on a global state, and newer functions that accept a state
				3948	parameter that is read and modified. Please see @ref{Random Number
				3949	Functions} for more information on how to use and not to use random
				3950	number functions.
				3951
				3952	@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
				3953	Generate a uniformly distributed random integer in the range 0 to
				3954	@mm{2@sup{n}-1, 2^@var{n}@minus{}1}, inclusive.
				3955
				3956	The variable @var{state} must be initialized by calling one of the
				3957	@code{gmp_randinit} functions (@ref{Random State Initialization}) before
				3958	invoking this function.
				3959	@end deftypefun
				3960
				3961	@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, const mpz_t @var{n})
				3962	Generate a uniform random integer in the range 0 to @math{@var{n}-1},
				3963	inclusive.
				3964
				3965	The variable @var{state} must be initialized by calling one of the
				3966	@code{gmp_randinit} functions (@ref{Random State Initialization})
				3967	before invoking this function.
				3968	@end deftypefun
				3969
				3970	@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
				3971	Generate a random integer with long strings of zeros and ones in the
				3972	binary representation. Useful for testing functions and algorithms,
				3973	since random numbers of this kind have proven to be more likely to
				3974	trigger corner-case bugs. The random number will be in the range
				3975	@mm{2@sup{n-1}, 2^(@var{n}@minus{}1)} to @mm{2@sup{n}-1,
				3976	2^@var{n}@minus{}1}, inclusive.
				3977
				3978	The variable @var{state} must be initialized by calling one of the
				3979	@code{gmp_randinit} functions (@ref{Random State Initialization})
				3980	before invoking this function.
				3981	@end deftypefun
				3982
				3983	@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
				3984	Generate a random integer of at most @var{max_size} limbs. The generated
				3985	random number doesn't satisfy any particular requirements of randomness.
				3986	Negative random numbers are generated when @var{max_size} is negative.
				3987
				3988	This function is obsolete. Use @code{mpz_urandomb} or
				3989	@code{mpz_urandomm} instead.
				3990	@end deftypefun
				3991
				3992	@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
				3993	Generate a random integer of at most @var{max_size} limbs, with long strings
				3994	of zeros and ones in the binary representation. Useful for testing functions
				3995	and algorithms, since random numbers of this kind have proven to be more
				3996	likely to trigger corner-case bugs. Negative random numbers are generated
				3997	when @var{max_size} is negative.
				3998
				3999	This function is obsolete. Use @code{mpz_rrandomb} instead.
				4000	@end deftypefun
				4001
				4002
				4003	@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
				4004	@section Integer Import and Export
				4005
				4006	@code{mpz_t} variables can be converted to and from arbitrary words of binary
				4007	data with the following functions.
				4008
				4009	@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
				4010	@cindex Integer import
				4011	@cindex Import
				4012	Set @var{rop} from an array of word data at @var{op}.
				4013
				4014	The parameters specify the format of the data. @var{count} many words are
				4015	read, each @var{size} bytes. @var{order} can be 1 for most significant word
				4016	first or -1 for least significant first. Within each word @var{endian} can be
				4017	1 for most significant byte first, -1 for least significant first, or 0 for
				4018	the native endianness of the host CPU@. The most significant @var{nails} bits
				4019	of each word are skipped; this can be 0 to use the full words.
				4020
				4021	No sign is taken from the data; @var{rop} will simply be a non-negative
				4022	integer. An application can handle any sign itself, and apply it for instance
				4023	with @code{mpz_neg}.
				4024
				4025	There are no data alignment restrictions on @var{op}, any address is allowed.
				4026
				4027	Here's an example converting an array of @code{unsigned long} data, most
				4028	significant element first, and host byte order within each value.
				4029
				4030	@example
				4031	unsigned long a[20];
				4032	/* Initialize @var{z} and @var{a} */
				4033	mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
				4034	@end example
				4035
				4036	This example assumes the full @code{sizeof} bytes are used for data in the
				4037	given type, which is usually true, and certainly true for @code{unsigned long}
				4038	everywhere we know of. However on Cray vector systems it may be noted that
				4039	@code{short} and @code{int} are always stored in 8 bytes (and with
				4040	@code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails}
				4041	feature can account for this, by passing for instance
				4042	@code{8*sizeof(int)-INT_BIT}.
				4043	@end deftypefun
				4044
				4045	@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const mpz_t @var{op})
				4046	@cindex Integer export
				4047	@cindex Export
				4048	Fill @var{rop} with word data from @var{op}.
				4049
				4050	The parameters specify the format of the data produced. Each word will be
				4051	@var{size} bytes and @var{order} can be 1 for most significant word first or
				4052	-1 for least significant first. Within each word @var{endian} can be 1 for
				4053	most significant byte first, -1 for least significant first, or 0 for the
				4054	native endianness of the host CPU@. The most significant @var{nails} bits of
				4055	each word are unused and set to zero; this can be 0 to produce full words.
				4056
				4057	The number of words produced is written to @code{*@var{countp}}, or
				4058	@var{countp} can be @code{NULL} to discard the count. @var{rop} must have
				4059	enough space for the data, or if @var{rop} is @code{NULL} then a result array
				4060	of the necessary size is allocated using the current GMP allocation function
				4061	(@pxref{Custom Allocation}). In either case the return value is the
				4062	destination used, either @var{rop} or the allocated block.
				4063
				4064	If @var{op} is non-zero then the most significant word produced will be
				4065	non-zero. If @var{op} is zero then the count returned will be zero and
				4066	nothing written to @var{rop}. If @var{rop} is @code{NULL} in this case, no
				4067	block is allocated, just @code{NULL} is returned.
				4068
				4069	The sign of @var{op} is ignored, just the absolute value is exported. An
				4070	application can use @code{mpz_sgn} to get the sign and handle it as desired.
				4071	(@pxref{Integer Comparisons})
				4072
				4073	There are no data alignment restrictions on @var{rop}, any address is allowed.
				4074
				4075	When an application is allocating space itself the required size can be
				4076	determined with a calculation like the following. Since @code{mpz_sizeinbase}
				4077	always returns at least 1, @code{count} here will be at least one, which
				4078	avoids any portability problems with @code{malloc(0)}, though if @code{z} is
				4079	zero no space at all is actually needed (or written).
				4080
				4081	@example
				4082	numb = 8*size - nail;
				4083	count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
				4084	p = malloc (count * size);
				4085	@end example
				4086	@end deftypefun
				4087
				4088
				4089	@need 2000
				4090	@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions
				4091	@comment node-name, next, previous, up
				4092	@section Miscellaneous Functions
				4093	@cindex Miscellaneous integer functions
				4094	@cindex Integer miscellaneous functions
				4095
				4096	@deftypefun int mpz_fits_ulong_p (const mpz_t @var{op})
				4097	@deftypefunx int mpz_fits_slong_p (const mpz_t @var{op})
				4098	@deftypefunx int mpz_fits_uint_p (const mpz_t @var{op})
				4099	@deftypefunx int mpz_fits_sint_p (const mpz_t @var{op})
				4100	@deftypefunx int mpz_fits_ushort_p (const mpz_t @var{op})
				4101	@deftypefunx int mpz_fits_sshort_p (const mpz_t @var{op})
				4102	Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
				4103	@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
				4104	short int}, or @code{signed short int}, respectively. Otherwise, return zero.
				4105	@end deftypefun
				4106
				4107	@deftypefn Macro int mpz_odd_p (const mpz_t @var{op})
				4108	@deftypefnx Macro int mpz_even_p (const mpz_t @var{op})
				4109	Determine whether @var{op} is odd or even, respectively. Return non-zero if
				4110	yes, zero if no. These macros evaluate their argument more than once.
				4111	@end deftypefn
				4112
				4113	@deftypefun size_t mpz_sizeinbase (const mpz_t @var{op}, int @var{base})
				4114	@cindex Size in digits
				4115	@cindex Digits in an integer
				4116	Return the size of @var{op} measured in number of digits in the given
				4117	@var{base}. @var{base} can vary from 2 to 62. The sign of @var{op} is
				4118	ignored, just the absolute value is used. The result will be either exact or
				4119	1 too big. If @var{base} is a power of 2, the result is always exact. If
				4120	@var{op} is zero the return value is always 1.
				4121
				4122	This function can be used to determine the space required when converting
				4123	@var{op} to a string. The right amount of allocation is normally two more
				4124	than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
				4125	and one for the null-terminator.
				4126
				4127	@cindex Most significant bit
				4128	It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
				4129	the most significant 1 bit in @var{op}, counting from 1 (unlike the bitwise
				4130	functions, which start from 0; @pxref{Integer Logic and Bit Fiddling,, Logical
				4131	and Bit Manipulation Functions}).
				4132	@end deftypefun
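
The allocation rule can be illustrated on an ordinary machine integer. This is
only a sketch: @code{digits_in_base} and @code{string_alloc_size} are
hypothetical names operating on an @code{unsigned long} rather than an
@code{mpz_t}. Unlike @code{mpz_sizeinbase}, this stand-in always returns the
exact digit count, so in a real program the two extra bytes should be added to
whatever @code{mpz_sizeinbase} reports, even when that is 1 too big.

```c
#include <stddef.h>

/* Hypothetical stand-in for mpz_sizeinbase on a plain unsigned long.
   Returns the exact number of digits of v in the given base (1 for
   zero).  base must be at least 2.  */
size_t
digits_in_base (unsigned long v, unsigned base)
{
  size_t n = 1;
  while (v >= base)
    {
      v /= base;
      n++;
    }
  return n;
}

/* Bytes to allocate for the string form: the digits, plus one for a
   possible minus sign, plus one for the null-terminator.  */
size_t
string_alloc_size (unsigned long v, unsigned base)
{
  return digits_in_base (v, base) + 2;
}
```

With @var{base} 2 the digit count is also the position of the most
significant 1 bit counting from 1, so 255 gives 8.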
				4133
				4134
				4135	@node Integer Special Functions, , Miscellaneous Integer Functions, Integer Functions
				4136	@section Special Functions
				4137	@cindex Special integer functions
				4138	@cindex Integer special functions
				4139
				4140	The functions in this section are for various special purposes. Most
				4141	applications will not need them.
				4142
				4143	@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
				4144	@strong{This is an obsolete function. Do not use it.}
				4145	@end deftypefun
				4146
				4147	@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
				4148	Change the space for @var{integer} to @var{new_alloc} limbs. The value in
				4149	@var{integer} is preserved if it fits, or is set to 0 if not. The return
				4150	value is not useful to applications and should be ignored.
				4151
				4152	@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
				4153	this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
				4154	@code{_mpz_realloc} takes its size in limbs.
				4155	@end deftypefun
				4156
				4157	@deftypefun mp_limb_t mpz_getlimbn (const mpz_t @var{op}, mp_size_t @var{n})
				4158	Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
				4159	just the absolute value is used. The least significant limb is number 0.
				4160
				4161	@code{mpz_size} can be used to find how many limbs make up @var{op}.
				4162	@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
				4163	@code{mpz_size(@var{op})-1}.
				4164	@end deftypefun
				4165
				4166	@deftypefun size_t mpz_size (const mpz_t @var{op})
				4167	Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
				4168	the returned value will be zero.
				4169	@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
				4170	@end deftypefun
				4171
				4172	@deftypefun {const mp_limb_t *} mpz_limbs_read (const mpz_t @var{x})
				4173	Return a pointer to the limb array representing the absolute value of @var{x}.
				4174	The size of the array is @code{mpz_size(@var{x})}. Intended for read access
				4175	only.
				4176	@end deftypefun
				4177
				4178	@deftypefun {mp_limb_t *} mpz_limbs_write (mpz_t @var{x}, mp_size_t @var{n})
				4179	@deftypefunx {mp_limb_t *} mpz_limbs_modify (mpz_t @var{x}, mp_size_t @var{n})
				4180	Return a pointer to the limb array, intended for write access. The array is
				4181	reallocated as needed, to make room for @var{n} limbs. Requires @math{@var{n}
				4182	> 0}. The @code{mpz_limbs_modify} function returns an array that holds the old
				4183	absolute value of @var{x}, while @code{mpz_limbs_write} may destroy the old
				4184	value and return an array with unspecified contents.
				4185	@end deftypefun
				4186
				4187	@deftypefun void mpz_limbs_finish (mpz_t @var{x}, mp_size_t @var{s})
				4188	Updates the internal size field of @var{x}. Used after writing to the limb
				4189	array pointer returned by @code{mpz_limbs_write} or @code{mpz_limbs_modify} is
				4190	completed. The array should contain @math{@GMPabs{@var{s}}} valid limbs,
				4191	representing the new absolute value for @var{x}, and the sign of @var{x} is
				4192	taken from the sign of @var{s}. This function never reallocates @var{x}, so
				4193	the limb pointer remains valid.
				4194	@end deftypefun
				4195
				4196	@c FIXME: Some more useful and less silly example?
				4197	@example
				4198	void foo (mpz_t x)
				4199	@{
				4200	  mp_size_t n, i;
				4201	  mp_limb_t *xp;
				4202
				4203	  n = mpz_size (x);
				4204	  xp = mpz_limbs_modify (x, 2*n);
				4205	  for (i = 0; i < n; i++)
				4206	    xp[n+i] = xp[n-1-i];
				4207	  mpz_limbs_finish (x, mpz_sgn (x) < 0 ? - 2*n : 2*n);
				4208	@}
				4209	@end example
				4210
				4211	@deftypefun mpz_srcptr mpz_roinit_n (mpz_t @var{x}, const mp_limb_t *@var{xp}, mp_size_t @var{xs})
				4212	Special initialization of @var{x}, using the given limb array and size.
				4213	@var{x} should be treated as read-only: it can be passed safely as input to
				4214	any mpz function, but not as an output. The array @var{xp} must point to at
				4215	least one readable limb; its size is
				4216	@math{@GMPabs{@var{xs}}}, and the sign of @var{x} is the sign of @var{xs}. For
				4217	convenience, the function returns @var{x}, but cast to a const pointer type.
				4218	@end deftypefun
				4219
				4220	@example
				4221	void foo (mpz_t x)
				4222	@{
				4223	  static const mp_limb_t y[3] = @{ 0x1, 0x2, 0x3 @};
				4224	  mpz_t tmp;
				4225	  mpz_add (x, x, mpz_roinit_n (tmp, y, 3));
				4226	@}
				4227	@end example
				4228
				4229	@deftypefn Macro mpz_t MPZ_ROINIT_N (mp_limb_t *@var{xp}, mp_size_t @var{xs})
				4230	This macro expands to an initializer which can be assigned to an @code{mpz_t}
				4231	variable. The limb array @var{xp} must point to at least one readable limb.
				4232	Moreover, unlike the @code{mpz_roinit_n} function, the array must be
				4233	normalized: if @var{xs} is non-zero, then
				4234	@code{@var{xp}[@math{@GMPabs{@var{xs}}-1}]} must be non-zero. Intended
				4235	primarily for constant values. Using it for non-constant values requires a C
				4236	compiler supporting C99.
				4237	@end deftypefn
				4238
				4239	@example
				4240	void foo (mpz_t x)
				4241	@{
				4242	  static const mp_limb_t ya[3] = @{ 0x1, 0x2, 0x3 @};
				4243	  static const mpz_t y = MPZ_ROINIT_N ((mp_limb_t *) ya, 3);
				4244
				4245	  mpz_add (x, x, y);
				4246	@}
				4247	@end example
				4248
				4249
				4250	@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
				4251	@comment node-name, next, previous, up
				4252	@chapter Rational Number Functions
				4253	@cindex Rational number functions
				4254
				4255	This chapter describes the GMP functions for performing arithmetic on rational
				4256	numbers. These functions start with the prefix @code{mpq_}.
				4257
				4258	Rational numbers are stored in objects of type @code{mpq_t}.
				4259
				4260	All rational arithmetic functions assume operands have a canonical form, and
				4261	canonicalize their result. The canonical form means that the denominator and
				4262	the numerator have no common factors, and that the denominator is positive.
				4263	Zero has the unique representation 0/1.
				4264
				4265	Pure assignment functions do not canonicalize the assigned variable. It is
				4266	the responsibility of the user to canonicalize the assigned variable before
				4267	any arithmetic operations are performed on that variable.
				4268
				4269	@deftypefun void mpq_canonicalize (mpq_t @var{op})
				4270	Remove any factors that are common to the numerator and denominator of
				4271	@var{op}, and make the denominator positive.
				4272	@end deftypefun
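
To make the canonical form concrete, here is a minimal sketch of the same
normalization on machine integers. The @code{struct frac} type and the
function names are hypothetical and have nothing to do with how GMP stores an
@code{mpq_t}. The sketch assumes a non-zero denominator and ignores overflow
at the extremes of @code{long}.

```c
#include <stdlib.h>

struct frac { long num; long den; };  /* hypothetical type, not mpq_t */

static long
gcd_long (long a, long b)
{
  a = labs (a);
  b = labs (b);
  while (b != 0)
    {
      long t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Bring f into canonical form: no common factors, positive
   denominator, and zero stored as 0/1.  f->den must be non-zero.  */
void
frac_canonicalize (struct frac *f)
{
  long g = gcd_long (f->num, f->den);  /* g >= 1 because den != 0 */
  f->num /= g;
  f->den /= g;
  if (f->den < 0)
    {
      f->num = -f->num;
      f->den = -f->den;
    }
}
```

Under these assumptions 6/@minus{}4 becomes @minus{}3/2, and 0/@minus{}5
becomes 0/1.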
				4273
				4274	@menu
				4275	* Initializing Rationals::
				4276	* Rational Conversions::
				4277	* Rational Arithmetic::
				4278	* Comparing Rationals::
				4279	* Applying Integer Functions::
				4280	* I/O of Rationals::
				4281	@end menu
				4282
				4283	@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
				4284	@comment node-name, next, previous, up
				4285	@section Initialization and Assignment Functions
				4286	@cindex Rational assignment functions
				4287	@cindex Assignment functions
				4288	@cindex Rational initialization functions
				4289	@cindex Initialization functions
				4290
				4291	@deftypefun void mpq_init (mpq_t @var{x})
				4292	Initialize @var{x} and set it to 0/1. Each variable should normally only be
				4293	initialized once, or at least cleared out (using the function @code{mpq_clear})
				4294	between each initialization.
				4295	@end deftypefun
				4296
				4297	@deftypefun void mpq_inits (mpq_t @var{x}, ...)
				4298	Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
				4299	values to 0/1.
				4300	@end deftypefun
				4301
				4302	@deftypefun void mpq_clear (mpq_t @var{x})
				4303	Free the space occupied by @var{x}. Make sure to call this function for all
				4304	@code{mpq_t} variables when you are done with them.
				4305	@end deftypefun
				4306
				4307	@deftypefun void mpq_clears (mpq_t @var{x}, ...)
				4308	Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
				4309	@end deftypefun
				4310
				4311	@deftypefun void mpq_set (mpq_t @var{rop}, const mpq_t @var{op})
				4312	@deftypefunx void mpq_set_z (mpq_t @var{rop}, const mpz_t @var{op})
				4313	Assign @var{rop} from @var{op}.
				4314	@end deftypefun
				4315
				4316	@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
				4317	@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
				4318	Set the value of @var{rop} to @var{op1}/@var{op2}. Note that if @var{op1} and
				4319	@var{op2} have common factors, @var{rop} has to be passed to
				4320	@code{mpq_canonicalize} before any operations are performed on @var{rop}.
				4321	@end deftypefun
				4322
				4323	@deftypefun int mpq_set_str (mpq_t @var{rop}, const char *@var{str}, int @var{base})
				4324	Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.
				4325
				4326	The string can be an integer like ``41'' or a fraction like ``41/152''. The
				4327	fraction must be in canonical form (@pxref{Rational Number Functions}), or if
				4328	not then @code{mpq_canonicalize} must be called.
				4329
				4330	The numerator and optional denominator are parsed the same as in
				4331	@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
				4332	the string, and is simply ignored. The @var{base} can vary from 2 to 62, or
				4333	if @var{base} is 0 then the leading characters are used: @code{0x} or @code{0X} for hex,
				4334	@code{0b} or @code{0B} for binary,
				4335	@code{0} for octal, or decimal otherwise. Note that this is done separately
				4336	for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
				4337	whereas @code{0xEF/0x100} is 239/256.
				4338
				4339	The return value is 0 if the entire string is a valid number, or @minus{}1 if
				4340	not.
				4341	@end deftypefun
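
The prefix rule for @var{base} 0 can be sketched as follows. This is an
illustration only, not GMP's parser: @code{detect_base} and
@code{detect_fraction_bases} are hypothetical names, and the sketch assumes
the string contains no white space.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the base implied by the leading characters of
   one number (a numerator or a denominator).  */
int
detect_base (const char *s)
{
  if (*s == '-')
    s++;                        /* skip an optional minus sign */
  if (s[0] == '0')
    {
      if (s[1] == 'x' || s[1] == 'X')
        return 16;
      if (s[1] == 'b' || s[1] == 'B')
        return 2;
      return 8;
    }
  return 10;
}

/* Detect the bases of "num" or "num/den" independently.  *den_base
   is set to 0 when there is no denominator.  */
void
detect_fraction_bases (const char *str, int *num_base, int *den_base)
{
  const char *slash = strchr (str, '/');
  *num_base = detect_base (str);
  *den_base = slash != NULL ? detect_base (slash + 1) : 0;
}
```

For @code{0xEF/100} this reports bases 16 and 10, and for @code{0xEF/0x100} it
reports 16 and 16, matching the two readings described above.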
				4342
				4343	@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
				4344	Swap the values @var{rop1} and @var{rop2} efficiently.
				4345	@end deftypefun
				4346
				4347
				4348	@need 2000
				4349	@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
				4350	@comment node-name, next, previous, up
				4351	@section Conversion Functions
				4352	@cindex Rational conversion functions
				4353	@cindex Conversion functions
				4354
				4355	@deftypefun double mpq_get_d (const mpq_t @var{op})
				4356	Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
				4357	towards zero).
				4358
				4359	If the exponent from the conversion is too big or too small to fit a
				4360	@code{double} then the result is system dependent. If it is too big, an
				4361	infinity is returned when available. If it is too small, @math{0.0} is normally returned.
				4362	Hardware overflow, underflow and denorm traps may or may not occur.
				4363	@end deftypefun
				4364
				4365	@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
				4366	@deftypefunx void mpq_set_f (mpq_t @var{rop}, const mpf_t @var{op})
				4367	Set @var{rop} to the value of @var{op}. There is no rounding; this conversion
				4368	is exact.
				4369	@end deftypefun
				4370
				4371	@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, const mpq_t @var{op})
				4372	Convert @var{op} to a string of digits in base @var{base}. The base argument
				4373	may vary from 2 to 62 or from @minus{}2 to @minus{}36. The string will be of
				4374	the form @samp{num/den}, or if the denominator is 1 then just @samp{num}.
				4375
				4376	For @var{base} in the range 2..36, digits and lower-case letters are used; for
				4377	@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
				4378	digits, upper-case letters, and lower-case letters (in that significance order)
				4379	are used.
				4380
				4381	If @var{str} is @code{NULL}, the result string is allocated using the current
				4382	allocation function (@pxref{Custom Allocation}). The block will be exactly
				4383	large enough for the returned string and its null-terminator, that is, one
				4384	byte more than the @code{strlen} of the result.
				4385
				4386	If @var{str} is not @code{NULL}, it should point to a block of storage large
				4387	enough for the result, that being
				4388
				4389	@example
				4390	mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
				4391	+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
				4392	@end example
				4393
				4394	The three extra bytes are for a possible minus sign, possible slash, and the
				4395	null-terminator.
				4396
				4397	A pointer to the result string is returned, being either the allocated block,
				4398	or the given @var{str}.
				4399	@end deftypefun
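
The digit alphabets described for @code{mpq_get_str} can be written down as a
lookup. The sketch below uses a hypothetical helper @code{digit_char}, which
is not a GMP function. It only shows which character stands for a given digit
value under each range of @var{base}.

```c
/* Hypothetical helper: the character used for digit `value' under
   the conventions described above.  Returns 0 if base or value is
   out of range.  */
char
digit_char (int value, int base)
{
  static const char lower[] = "0123456789abcdefghijklmnopqrstuvwxyz";
  static const char upper[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  static const char mixed[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
  int radix = base < 0 ? -base : base;

  if (radix < 2 || radix > 62 || base < -36
      || value < 0 || value >= radix)
    return 0;
  if (base < 0)
    return upper[value];        /* -2..-36: digits and upper case */
  if (base <= 36)
    return lower[value];        /* 2..36: digits and lower case */
  return mixed[value];          /* 37..62: digits, upper, then lower */
}
```

So the digit value 10 is written @samp{a} in base 16, @samp{A} in base
@minus{}16, and @samp{A} again in base 62, where @samp{a} stands for 36.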
				4400
				4401
				4402	@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
				4403	@comment node-name, next, previous, up
				4404	@section Arithmetic Functions
				4405	@cindex Rational arithmetic functions
				4406	@cindex Arithmetic functions
				4407
				4408	@deftypefun void mpq_add (mpq_t @var{sum}, const mpq_t @var{addend1}, const mpq_t @var{addend2})
				4409	Set @var{sum} to @var{addend1} + @var{addend2}.
				4410	@end deftypefun
				4411
				4412	@deftypefun void mpq_sub (mpq_t @var{difference}, const mpq_t @var{minuend}, const mpq_t @var{subtrahend})
				4413	Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
				4414	@end deftypefun
				4415
				4416	@deftypefun void mpq_mul (mpq_t @var{product}, const mpq_t @var{multiplier}, const mpq_t @var{multiplicand})
				4417	Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
				4418	@end deftypefun
				4419
				4420	@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
				4421	Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
				4422	@var{op2}}.
				4423	@end deftypefun
				4424
				4425	@deftypefun void mpq_div (mpq_t @var{quotient}, const mpq_t @var{dividend}, const mpq_t @var{divisor})
				4426	@cindex Division functions
				4427	Set @var{quotient} to @var{dividend}/@var{divisor}.
				4428	@end deftypefun
				4429
				4430	@deftypefun void mpq_div_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
				4431	Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
				4432	@var{op2}}.
				4433	@end deftypefun
				4434
				4435	@deftypefun void mpq_neg (mpq_t @var{negated_operand}, const mpq_t @var{operand})
				4436	Set @var{negated_operand} to @minus{}@var{operand}.
				4437	@end deftypefun
				4438
				4439	@deftypefun void mpq_abs (mpq_t @var{rop}, const mpq_t @var{op})
				4440	Set @var{rop} to the absolute value of @var{op}.
				4441	@end deftypefun
				4442
				4443	@deftypefun void mpq_inv (mpq_t @var{inverted_number}, const mpq_t @var{number})
				4444	Set @var{inverted_number} to 1/@var{number}. If the new denominator is
				4445	zero, this routine will divide by zero.
				4446	@end deftypefun
				4447
				4448	@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
				4449	@comment node-name, next, previous, up
				4450	@section Comparison Functions
				4451	@cindex Rational comparison functions
				4452	@cindex Comparison functions
				4453
				4454	@deftypefun int mpq_cmp (const mpq_t @var{op1}, const mpq_t @var{op2})
				4455	@deftypefunx int mpq_cmp_z (const mpq_t @var{op1}, const mpz_t @var{op2})
				4456	Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
				4457	@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
				4458	@math{@var{op1} < @var{op2}}.
				4459
				4460	To determine if two rationals are equal, @code{mpq_equal} is faster than
				4461	@code{mpq_cmp}.
				4462	@end deftypefun
				4463
				4464	@deftypefn Macro int mpq_cmp_ui (const mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
				4465	@deftypefnx Macro int mpq_cmp_si (const mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
				4466	Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
				4467	@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
				4468	@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
				4469	@var{num2}/@var{den2}}.
				4470
				4471	@var{num2} and @var{den2} are allowed to have common factors.
				4472
				4473	These functions are implemented as macros and evaluate their arguments
				4474	multiple times.
				4475	@end deftypefn
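
One way to see why common factors in @var{num2} and @var{den2} are harmless is
to compare two fractions by cross-multiplication, which never needs reduced
operands. The following sketch works on machine integers with a hypothetical
function @code{frac_cmp}. It is not a description of how GMP implements these
macros.

```c
#include <stdint.h>

/* Hypothetical sketch: compare n1/d1 with n2/d2 by cross-multiplying.
   Both denominators must be positive; 32-bit inputs keep the 64-bit
   products from overflowing.  Returns +1, 0 or -1.  */
int
frac_cmp (int32_t n1, uint32_t d1, int32_t n2, uint32_t d2)
{
  int64_t lhs = (int64_t) n1 * (int64_t) d2;
  int64_t rhs = (int64_t) n2 * (int64_t) d1;
  return (lhs > rhs) - (lhs < rhs);
}
```

Here 1/2 compares equal to the unreduced 2/4, because both cross products are
4.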
				4476
				4477	@deftypefn Macro int mpq_sgn (const mpq_t @var{op})
				4478	@cindex Sign tests
				4479	@cindex Rational sign tests
				4480	Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
				4481	@math{-1} if @math{@var{op} < 0}.
				4482
				4483	This function is actually implemented as a macro. It evaluates its
				4484	argument multiple times.
				4485	@end deftypefn
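
Multiple evaluation matters when an argument has side effects. The sketch
below uses a hypothetical macro @code{SIGN_OF}, which is not GMP's definition
of @code{mpq_sgn}. Like the real macros it mentions its argument more than
once, which is enough to show the hazard and the usual remedy of evaluating
into a variable first.

```c
/* Hypothetical sign macro, NOT GMP's definition of mpq_sgn.  */
#define SIGN_OF(x) ((x) < 0 ? -1 : (x) > 0)

static int calls = 0;           /* counts evaluations of next_value */

static int
next_value (void)
{
  calls++;
  return 5;
}

/* Returns how many times next_value ran for one use of the macro.  */
int
evaluations_in_macro (void)
{
  calls = 0;
  (void) SIGN_OF (next_value ());
  return calls;
}

/* Safe pattern: evaluate once into a variable, then apply the macro.  */
int
evaluations_with_temp (void)
{
  int v;
  calls = 0;
  v = next_value ();
  (void) SIGN_OF (v);
  return calls;
}
```

With this particular macro the argument expression runs twice in the first
function and once in the second.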
				4486
				4487	@deftypefun int mpq_equal (const mpq_t @var{op1}, const mpq_t @var{op2})
				4488	Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
				4489	non-equal. Although @code{mpq_cmp} can be used for the same purpose, this
				4490	function is much faster.
				4491	@end deftypefun
				4492
				4493	@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
				4494	@comment node-name, next, previous, up
				4495	@section Applying Integer Functions to Rationals
				4496	@cindex Rational numerator and denominator
				4497	@cindex Numerator and denominator
				4498
				4499	The set of @code{mpq} functions is quite small. In particular, there are few
				4500	functions for either input or output. The following functions give direct
				4501	access to the numerator and denominator of an @code{mpq_t}.
				4502
				4503	Note that if an assignment to the numerator and/or denominator could take an
				4504	@code{mpq_t} out of the canonical form described at the start of this chapter
				4505	(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
				4506	called before any other @code{mpq} functions are applied to that @code{mpq_t}.
				4507
				4508	@deftypefn Macro mpz_t mpq_numref (const mpq_t @var{op})
				4509	@deftypefnx Macro mpz_t mpq_denref (const mpq_t @var{op})
				4510	Return a reference to the numerator and denominator of @var{op}, respectively.
				4511	The @code{mpz} functions can be used on the result of these macros.
				4512	@end deftypefn
				4513
				4514	@deftypefun void mpq_get_num (mpz_t @var{numerator}, const mpq_t @var{rational})
				4515	@deftypefunx void mpq_get_den (mpz_t @var{denominator}, const mpq_t @var{rational})
				4516	@deftypefunx void mpq_set_num (mpq_t @var{rational}, const mpz_t @var{numerator})
				4517	@deftypefunx void mpq_set_den (mpq_t @var{rational}, const mpz_t @var{denominator})
				4518	Get or set the numerator or denominator of a rational. These functions are
				4519	equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
				4520	@code{mpq_denref}. Direct use of @code{mpq_numref} or @code{mpq_denref} is
				4521	recommended instead of these functions.
				4522	@end deftypefun
				4523
				4524
				4525	@need 2000
				4526	@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
				4527	@comment node-name, next, previous, up
				4528	@section Input and Output Functions
				4529	@cindex Rational input and output functions
				4530	@cindex Input functions
				4531	@cindex Output functions
				4532	@cindex I/O functions
				4533
				4534	Functions that perform input from a stdio stream, and functions that output to
				4535	a stdio stream, of @code{mpq} numbers. Passing a @code{NULL} pointer for a
				4536	@var{stream} argument to any of these functions will make them read from
				4537	@code{stdin} and write to @code{stdout}, respectively.
				4538
				4539	When using any of these functions, it is a good idea to include @file{stdio.h}
				4540	before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
				4541	for these functions.
				4542
				4543	See also @ref{Formatted Output} and @ref{Formatted Input}.
				4544
				4545	@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, const mpq_t @var{op})
				4546	Output @var{op} on stdio stream @var{stream}, as a string of digits in base
				4547	@var{base}. The base argument may vary from 2 to 62 or from @minus{}2 to
				4548	@minus{}36. Output is in the form
				4549	@samp{num/den} or if the denominator is 1 then just @samp{num}.
				4550
				4551	For @var{base} in the range 2..36, digits and lower-case letters are used; for
				4552	@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
				4553	digits, upper-case letters, and lower-case letters (in that significance order)
				4554	are used.
				4555
				4556	Return the number of bytes written, or if an error occurred, return 0.
				4557	@end deftypefun
				4558
				4559	@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
				4560	Read a string of digits from @var{stream} and convert them to a rational in
				4561	@var{rop}. Any initial white-space characters are read and discarded. Return
				4562	the number of characters read (including white space), or 0 if a rational
				4563	could not be read.
				4564
				4565	The input can be a fraction like @samp{17/63} or just an integer like
				4566	@samp{123}. Reading stops at the first character not in this form, and white
				4567	space is not permitted within the string. If the input might not be in
				4568	canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
				4569	Number Functions}).
				4570
				4571	The @var{base} can be between 2 and 62, or can be 0 in which case the leading
				4572	characters of the string determine the base, @samp{0x} or @samp{0X} for
				4573	hexadecimal, @samp{0b} or @samp{0B} for binary, @samp{0} for octal, or
				4574	decimal otherwise. The leading characters
				4575	are examined separately for the numerator and denominator of a fraction, so
				4576	for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
				4577	@math{16/17}.
				4578	@end deftypefun
				4579
				4580
				4581	@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
				4582	@comment node-name, next, previous, up
				4583	@chapter Floating-point Functions
				4584	@cindex Floating-point functions
				4585	@cindex Float functions
				4586	@cindex User-defined precision
				4587	@cindex Precision of floats
				4588
				4589	GMP floating point numbers are stored in objects of type @code{mpf_t} and
				4590	functions operating on them have an @code{mpf_} prefix.
				4591
				4592	The mantissa of each float has a user-selectable precision, in practice only
				4593	limited by available memory. Each variable has its own precision, and that can
				4594	be increased or decreased at any time. This selectable precision is a minimum
				4595	value; GMP rounds it up to a whole number of limbs.
				4596
				4597	The accuracy of a calculation is determined by the previously set precision of the
				4598	destination variable and the numeric values of the input variables. Input
				4599	variables' set precisions do not affect calculations (except indirectly as
				4600	their values might have been affected when they were assigned).
				4601
				4602	The exponent of each float has fixed precision, one machine word on most
				4603	systems. In the current implementation the exponent is a count of limbs, so
				4604	for example on a 32-bit system this means a range of roughly
				4605	@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
				4606	this will be much greater. Note however that @code{mpf_get_str} can only
				4607	return an exponent which fits an @code{mp_exp_t} and currently
				4608	@code{mpf_set_str} doesn't accept exponents bigger than a @code{long}.
				4609
				4610	Each variable keeps track of the mantissa data actually in use. This means
				4611	that if a float is exactly represented in only a few bits then only those bits
				4612	will be used in a calculation, even if the variable's selected precision is
				4613	high. This is a performance optimization; it does not affect the numeric
				4614	results.
				4615
				4616	Internally, GMP sometimes calculates with higher precision than that of the
				4617	destination variable in order to limit errors. Final results are always
				4618	truncated to the destination variable's precision.
				4619
				4620	The mantissa is stored in binary. One consequence of this is that decimal
				4621	fractions like @math{0.1} cannot be represented exactly. The same is true of
				4622	plain IEEE @code{double} floats. This makes both highly unsuitable for
				4623	calculations involving money or other values that should be exact decimal
				4624	fractions. (Suitably scaled integers, or perhaps rationals, are better
				4625	choices.)
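
The point about decimal fractions can be checked with plain @code{double}
arithmetic. This is a small sketch assuming IEEE 754 binary doubles. The
function names are hypothetical, and the scaled-integer variant stores
amounts as whole cents.

```c
/* Returns non-zero if 0.1 + 0.2 compares equal to 0.3 in double
   arithmetic.  volatile keeps the compiler from folding the sum.  */
int
double_tenths_sum_is_exact (void)
{
  volatile double a = 0.1, b = 0.2;
  volatile double sum = a + b;
  return sum == 0.3;
}

/* The same amounts held as scaled integers (cents): exact, barring
   overflow.  */
long
add_cents (long a_cents, long b_cents)
{
  return a_cents + b_cents;
}
```

On such a platform the first function returns 0, while
@code{add_cents (10, 20)} is exactly 30.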
				4626
				4627	The @code{mpf} functions and variables have no special notion of infinity or
				4628	not-a-number, and applications must take care not to overflow the exponent or
				4629	results will be unpredictable.
				4630
				4631	Note that the @code{mpf} functions are @emph{not} intended as a smooth
				4632	extension to IEEE P754 arithmetic. In particular results obtained on one
				4633	computer often differ from the results on a computer with a different word
				4634	size.
				4635
				4636	New projects should consider using the GMP extension library MPFR
				4637	(@url{http://mpfr.org}) instead. MPFR provides well-defined precision and
				4638	accurate rounding, and thereby naturally extends IEEE P754.
				4639
				4640	@menu
				4641	* Initializing Floats::
				4642	* Assigning Floats::
				4643	* Simultaneous Float Init & Assign::
				4644	* Converting Floats::
				4645	* Float Arithmetic::
				4646	* Float Comparison::
				4647	* I/O of Floats::
				4648	* Miscellaneous Float Functions::
				4649	@end menu
				4650
				4651	@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
				4652	@comment node-name, next, previous, up
				4653	@section Initialization Functions
				4654	@cindex Float initialization functions
				4655	@cindex Initialization functions
				4656
				4657	@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
				4658	Set the default precision to be @strong{at least} @var{prec} bits. All
				4659	subsequent calls to @code{mpf_init} will use this precision, but previously
				4660	initialized variables are unaffected.
				4661	@end deftypefun
				4662
				4663	@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
				4664	Return the default precision actually used.
				4665	@end deftypefun
				4666
				4667	An @code{mpf_t} object must be initialized before storing the first value in
				4668	it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
				4669	purpose.
				4670
				4671	@deftypefun void mpf_init (mpf_t @var{x})
				4672	Initialize @var{x} to 0. Normally, a variable should be initialized once only
				4673	or at least be cleared, using @code{mpf_clear}, between initializations. The
				4674	precision of @var{x} is undefined unless a default precision has already been
				4675	established by a call to @code{mpf_set_default_prec}.
				4676	@end deftypefun
				4677
				4678	@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
				4679	Initialize @var{x} to 0 and set its precision to be @strong{at least}
				4680	@var{prec} bits. Normally, a variable should be initialized once only or at
				4681	least be cleared, using @code{mpf_clear}, between initializations.
				4682	@end deftypefun
				4683
				4684	@deftypefun void mpf_inits (mpf_t @var{x}, ...)
				4685	Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
				4686	values to 0. The precision of the initialized variables is undefined unless a
				4687	default precision has already been established by a call to
				4688	@code{mpf_set_default_prec}.
				4689	@end deftypefun
				4690
				4691	@deftypefun void mpf_clear (mpf_t @var{x})
				4692	Free the space occupied by @var{x}. Make sure to call this function for all
				4693	@code{mpf_t} variables when you are done with them.
				4694	@end deftypefun
				4695
				4696	@deftypefun void mpf_clears (mpf_t @var{x}, ...)
				4697	Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
				4698	@end deftypefun
				4699
				4700	@need 2000
				4701	Here is an example of how to initialize floating-point variables:
				4702	@example
				4703	@{
				4704	  mpf_t x, y;
				4705	  mpf_init (x);           /* use default precision */
				4706	  mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
				4707	  @dots{}
				4708	  /* Unless the program is about to exit, do ... */
				4709	  mpf_clear (x);
				4710	  mpf_clear (y);
				4711	@}
				4712	@end example
				4713
				4714	The following three functions are useful for changing the precision during a
				4715	calculation. A typical use would be for adjusting the precision gradually in
				4716	iterative algorithms like Newton-Raphson, making the computation precision
				4717	closely match the actual accurate part of the numbers.
				4718
				4719	@deftypefun {mp_bitcnt_t} mpf_get_prec (const mpf_t @var{op})
				4720	Return the current precision of @var{op}, in bits.
				4721	@end deftypefun
				4722
				4723	@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
				4724	Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
				4725	value in @var{rop} will be truncated to the new precision.
				4726
				4727	This function requires a call to @code{realloc}, and so should not be used in
				4728	a tight loop.
				4729	@end deftypefun
				4730
				4731	@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
				4732	Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
				4733	without changing the memory allocated.
				4734
				4735	@var{prec} must be no more than the allocated precision for @var{rop}, that
				4736	being the precision when @var{rop} was initialized, or in the most recent
				4737	@code{mpf_set_prec}.
				4738
				4739	The value in @var{rop} is unchanged, and in particular if it had a higher
				4740	precision than @var{prec} it will retain that higher precision. New values
				4741	written to @var{rop} will use the new @var{prec}.
				4742
				4743	Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
				4744	@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
				4745	allocated precision. Failing to do so will have unpredictable results.
				4746
				4747	@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
				4748	original allocated precision. After @code{mpf_set_prec_raw} it reflects the
				4749	@var{prec} value set.
				4750
				4751	@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
				4752	different precisions during a calculation, perhaps to gradually increase
				4753	precision in an iteration, or just to use various different precisions for
				4754	different purposes during a calculation.
				4755	@end deftypefun
				4756
				4757
				4758	@need 2000
				4759	@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
				4760	@comment node-name, next, previous, up
				4761	@section Assignment Functions
				4762	@cindex Float assignment functions
				4763	@cindex Assignment functions
				4764
				4765	These functions assign new values to already initialized floats
				4766	(@pxref{Initializing Floats}).
				4767
				4768	@deftypefun void mpf_set (mpf_t @var{rop}, const mpf_t @var{op})
				4769	@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
				4770	@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
				4771	@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
				4772	@deftypefunx void mpf_set_z (mpf_t @var{rop}, const mpz_t @var{op})
				4773	@deftypefunx void mpf_set_q (mpf_t @var{rop}, const mpq_t @var{op})
				4774	Set the value of @var{rop} from @var{op}.
				4775	@end deftypefun
				4776
				4777	@deftypefun int mpf_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
				4778	Set the value of @var{rop} from the string in @var{str}. The string is of the
				4779	form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
				4780	@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
				4781	in the specified base. The exponent is either in the specified base or, if
				4782	@var{base} is negative, in decimal. The decimal point expected is taken from
				4783	the current locale, on systems providing @code{localeconv}.
				4784
				4785	The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
				4786	@minus{}2. Negative values are used to specify that the exponent is in
				4787	decimal.
				4788
				4789	For bases up to 36, case is ignored; upper-case and lower-case letters have
				4790	the same value. For bases 37 to 62, upper-case letters represent the usual
				4791	10..35 while lower-case letters represent 36..61.
				4792
				4793	Unlike the corresponding @code{mpz} function, the base will not be determined
				4794	from the leading characters of the string if @var{base} is 0. This is so that
				4795	numbers like @samp{0.23} are not interpreted as octal.
				4796
				4797	White space is allowed in the string, and is simply ignored. [This is not
				4798	really true; white-space is ignored in the beginning of the string and within
				4799	the mantissa, but not in other places, such as after a minus sign or in the
				4800	exponent. We are considering changing the definition of this function, making
				4801	it fail when there is any white-space in the input, since that makes a lot of
				4802	sense. Please tell us your opinion about this change. Do you really want it
				4803	to accept @nicode{"3 14"} as meaning 314 as it does now?]
				4804
				4805	This function returns 0 if the entire string is a valid number in base
				4806	@var{base}. Otherwise it returns @minus{}1.
				4807	@end deftypefun
				4808
				4809	@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
				4810	Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
				4811	precisions of the two variables are swapped.
				4812	@end deftypefun
				4813
				4814
				4815	@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
				4816	@comment node-name, next, previous, up
				4817	@section Combined Initialization and Assignment Functions
				4818	@cindex Float assignment functions
				4819	@cindex Assignment functions
				4820	@cindex Float initialization functions
				4821	@cindex Initialization functions
				4822
				4823	For convenience, GMP provides a parallel series of initialize-and-set functions
				4824	which initialize the output and then store the value there. These functions'
				4825	names have the form @code{mpf_init_set@dots{}}
				4826
				4827	Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
				4828	functions, it can be used as the source or destination operand for the ordinary
				4829	float functions. Don't use an initialize-and-set function on a variable
				4830	already initialized!
				4831
				4832	@deftypefun void mpf_init_set (mpf_t @var{rop}, const mpf_t @var{op})
				4833	@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
				4834	@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
				4835	@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
				4836	Initialize @var{rop} and set its value from @var{op}.
				4837
				4838	The precision of @var{rop} will be taken from the active default precision, as
				4839	set by @code{mpf_set_default_prec}.
				4840	@end deftypefun
				4841
				4842	@deftypefun int mpf_init_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
				4843	Initialize @var{rop} and set its value from the string in @var{str}. See
				4844	@code{mpf_set_str} above for details on the assignment operation.
				4845
				4846	Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
				4847	call @code{mpf_clear} for it.)
				4848
				4849	The precision of @var{rop} will be taken from the active default precision, as
				4850	set by @code{mpf_set_default_prec}.
				4851	@end deftypefun
				4852
				4853
				4854	@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
				4855	@comment node-name, next, previous, up
				4856	@section Conversion Functions
				4857	@cindex Float conversion functions
				4858	@cindex Conversion functions
				4859
				4860	@deftypefun double mpf_get_d (const mpf_t @var{op})
				4861	Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
				4862	towards zero).
				4863
				4864	If the exponent in @var{op} is too big or too small to fit a @code{double}
				4865	then the result is system dependent. If it is too big, an infinity is returned
				4866	when available. If it is too small, @math{0.0} is normally returned. Hardware
				4867	overflow, underflow and denorm traps may or may not occur.
				4868	@end deftypefun
				4869
				4870	@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, const mpf_t @var{op})
				4871	Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
				4872	towards zero), and with an exponent returned separately.
				4873
				4874	The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
				4875	exponent is stored to @code{*@var{exp}}. @m{@var{d} \times 2^{exp},
				4876	@var{d} * 2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero,
				4877	the return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
				4878
				4879	@cindex @code{frexp}
				4880	This is similar to the standard C @code{frexp} function (@pxref{Normalization
				4881	Functions,,, libc, The GNU C Library Reference Manual}).
				4882	@end deftypefun
				4883
				4884	@deftypefun long mpf_get_si (const mpf_t @var{op})
				4885	@deftypefunx {unsigned long} mpf_get_ui (const mpf_t @var{op})
				4886	Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
				4887	fraction part. If @var{op} is too big for the return type, the result is
				4888	undefined.
				4889
				4890	See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
				4891	(@pxref{Miscellaneous Float Functions}).
				4892	@end deftypefun
				4893
				4894	@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
				4895	Convert @var{op} to a string of digits in base @var{base}. The base argument
				4896	may vary from 2 to 62 or from @minus{}2 to @minus{}36. Up to @var{n_digits}
				4897	digits will be generated. Trailing zeros are not returned. No more digits
				4898	than can be accurately represented by @var{op} are ever generated. If
				4899	@var{n_digits} is 0 then that accurate maximum number of digits are generated.
				4900
				4901	For @var{base} in the range 2..36, digits and lower-case letters are used; for
				4902	@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
				4903	digits, upper-case letters, and lower-case letters (in that significance order)
				4904	are used.
				4905
				4906	If @var{str} is @code{NULL}, the result string is allocated using the current
				4907	allocation function (@pxref{Custom Allocation}). The block will be
				4908	@code{strlen(str)+1} bytes, that being exactly enough for the string and
				4909	null-terminator.
				4910
				4911	If @var{str} is not @code{NULL}, it should point to a block of
				4912	@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
				4913	possible minus sign, and a null-terminator. When @var{n_digits} is 0 to get
				4914	all significant digits, an application won't be able to know the space
				4915	required, and @var{str} should be @code{NULL} in that case.
				4916
				4917	The generated string is a fraction, with an implicit radix point immediately
				4918	to the left of the first digit. The applicable exponent is written through
				4919	the @var{expptr} pointer. For example, the number 3.1416 would be returned as
				4920	string @nicode{"31416"} and exponent 1.
				4921
				4922	When @var{op} is zero, an empty string is produced and the exponent returned
				4923	is 0.
				4924
				4925	A pointer to the result string is returned, being either the allocated block
				4926	or the given @var{str}.
				4927	@end deftypefun
				4928
				4929
				4930	@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
				4931	@comment node-name, next, previous, up
				4932	@section Arithmetic Functions
				4933	@cindex Float arithmetic functions
				4934	@cindex Arithmetic functions
				4935
				4936	@deftypefun void mpf_add (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
				4937	@deftypefunx void mpf_add_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
				4938	Set @var{rop} to @math{@var{op1} + @var{op2}}.
				4939	@end deftypefun
				4940
				4941	@deftypefun void mpf_sub (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
				4942	@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
				4943	@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
				4944	Set @var{rop} to @var{op1} @minus{} @var{op2}.
				4945	@end deftypefun
				4946
				4947	@deftypefun void mpf_mul (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
				4948	@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
				4949	Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
				4950	@end deftypefun
				4951
				4952	Division is undefined if the divisor is zero, and passing a zero divisor to the
				4953	divide functions will make these functions intentionally divide by zero. This
				4954	lets the user handle arithmetic exceptions in these functions in the same
				4955	manner as other arithmetic exceptions.
				4956
				4957	@deftypefun void mpf_div (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
				4958	@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
				4959	@deftypefunx void mpf_div_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
				4960	@cindex Division functions
				4961	Set @var{rop} to @var{op1}/@var{op2}.
				4962	@end deftypefun
				4963
				4964	@deftypefun void mpf_sqrt (mpf_t @var{rop}, const mpf_t @var{op})
				4965	@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
				4966	@cindex Root extraction functions
				4967	Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
				4968	@end deftypefun
				4969
				4970	@deftypefun void mpf_pow_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
				4971	@cindex Exponentiation functions
				4972	@cindex Powering functions
				4973	Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
				4974	@end deftypefun
				4975
				4976	@deftypefun void mpf_neg (mpf_t @var{rop}, const mpf_t @var{op})
				4977	Set @var{rop} to @minus{}@var{op}.
				4978	@end deftypefun
				4979
				4980	@deftypefun void mpf_abs (mpf_t @var{rop}, const mpf_t @var{op})
				4981	Set @var{rop} to the absolute value of @var{op}.
				4982	@end deftypefun
				4983
				4984	@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
				4985	Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
				4986	@var{op2}}.
				4987	@end deftypefun
				4988
				4989	@deftypefun void mpf_div_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
				4990	Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
				4991	@var{op2}}.
				4992	@end deftypefun
				4993
				4994	@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
				4995	@comment node-name, next, previous, up
				4996	@section Comparison Functions
				4997	@cindex Float comparison functions
				4998	@cindex Comparison functions
				4999
				5000	@deftypefun int mpf_cmp (const mpf_t @var{op1}, const mpf_t @var{op2})
				5001	@deftypefunx int mpf_cmp_z (const mpf_t @var{op1}, const mpz_t @var{op2})
				5002	@deftypefunx int mpf_cmp_d (const mpf_t @var{op1}, double @var{op2})
				5003	@deftypefunx int mpf_cmp_ui (const mpf_t @var{op1}, unsigned long int @var{op2})
				5004	@deftypefunx int mpf_cmp_si (const mpf_t @var{op1}, signed long int @var{op2})
				5005	Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
				5006	@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
				5007	@math{@var{op1} < @var{op2}}.
				5008
				5009	@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
				5010	a NaN.
				5011	@end deftypefun
				5012
				5013	@deftypefun int mpf_eq (const mpf_t @var{op1}, const mpf_t @var{op2}, mp_bitcnt_t @var{op3})
				5014	@strong{This function is mathematically ill-defined and should not be used.}
				5015
				5016	Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
				5017	equal, zero otherwise. Note that numbers like e.g., 256 (binary 100000000) and
				5018	255 (binary 11111111) will never be equal by this function's measure, and
				5019	furthermore that 0 will only be equal to itself.
				5020	@end deftypefun
				5021
				5022	@deftypefun void mpf_reldiff (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
				5023	Compute the relative difference between @var{op1} and @var{op2} and store the
				5024	result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
				5025	@end deftypefun
				5026
				5027	@deftypefn Macro int mpf_sgn (const mpf_t @var{op})
				5028	@cindex Sign tests
				5029	@cindex Float sign tests
				5030	Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
				5031	@math{-1} if @math{@var{op} < 0}.
				5032
				5033	This function is actually implemented as a macro. It evaluates its argument
				5034	multiple times.
				5035	@end deftypefn
				5036
				5037	@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
				5038	@comment node-name, next, previous, up
				5039	@section Input and Output Functions
				5040	@cindex Float input and output functions
				5041	@cindex Input functions
				5042	@cindex Output functions
				5043	@cindex I/O functions
				5044
				5045	Functions that perform input from a stdio stream, and functions that output to
				5046	a stdio stream, of @code{mpf} numbers. Passing a @code{NULL} pointer for a
				5047	@var{stream} argument to any of these functions will make them read from
				5048	@code{stdin} and write to @code{stdout}, respectively.
				5049
				5050	When using any of these functions, it is a good idea to include @file{stdio.h}
				5051	before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
				5052	for these functions.
				5053
				5054	See also @ref{Formatted Output} and @ref{Formatted Input}.
				5055
				5056	@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
				5057	Print @var{op} to @var{stream}, as a string of digits. Return the number of
				5058	bytes written, or if an error occurred, return 0.
				5059
				5060	The mantissa is prefixed with a @samp{0.} and is in the given @var{base},
				5061	which may vary from 2 to 62 or from @minus{}2 to @minus{}36. An exponent is
				5062	then printed, separated by an @samp{e}, or if the base is greater than 10 then
				5063	by an @samp{@@}. The exponent is always in decimal. The decimal point follows
				5064	the current locale, on systems providing @code{localeconv}.
				5065
				5066	For @var{base} in the range 2..36, digits and lower-case letters are used; for
				5067	@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
				5068	digits, upper-case letters, and lower-case letters (in that significance order)
				5069	are used.
				5070
				5071	Up to @var{n_digits} digits will be printed from the mantissa, except that no more
				5072	digits than are accurately representable by @var{op} will be printed.
				5073	@var{n_digits} can be 0 to select that accurate maximum.
				5074	@end deftypefun
				5075
				5076	@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
				5077	Read a string in base @var{base} from @var{stream}, and put the read float in
				5078	@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
				5079	less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
				5080	exponent. The mantissa is always in the specified base. The exponent is
				5081	either in the specified base or, if @var{base} is negative, in decimal. The
				5082	decimal point expected is taken from the current locale, on systems providing
				5083	@code{localeconv}.
				5084
				5085	The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
				5086	@minus{}2. Negative values are used to specify that the exponent is in
				5087	decimal.
				5088
				5089	Unlike the corresponding @code{mpz} function, the base will not be determined
				5090	from the leading characters of the string if @var{base} is 0. This is so that
				5091	numbers like @samp{0.23} are not interpreted as octal.
				5092
				5093	Return the number of bytes read, or if an error occurred, return 0.
				5094	@end deftypefun
				5095
				5096	@c @deftypefun void mpf_out_raw (FILE *@var{stream}, const mpf_t @var{float})
				5097	@c Output @var{float} on stdio stream @var{stream}, in raw binary
				5098	@c format. The float is written in a portable format, with 4 bytes of
				5099	@c size information, and that many bytes of limbs. Both the size and the
				5100	@c limbs are written in decreasing significance order.
				5101	@c @end deftypefun
				5102
				5103	@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
				5104	@c Input from stdio stream @var{stream} in the format written by
				5105	@c @code{mpf_out_raw}, and put the result in @var{float}.
				5106	@c @end deftypefun
				5107
				5108
				5109	@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
				5110	@comment node-name, next, previous, up
				5111	@section Miscellaneous Functions
				5112	@cindex Miscellaneous float functions
				5113	@cindex Float miscellaneous functions
				5114
				5115	@deftypefun void mpf_ceil (mpf_t @var{rop}, const mpf_t @var{op})
				5116	@deftypefunx void mpf_floor (mpf_t @var{rop}, const mpf_t @var{op})
				5117	@deftypefunx void mpf_trunc (mpf_t @var{rop}, const mpf_t @var{op})
				5118	@cindex Rounding functions
				5119	@cindex Float rounding functions
				5120	Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the
				5121	next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
				5122	to the integer towards zero.
				5123	@end deftypefun
				5124
				5125	@deftypefun int mpf_integer_p (const mpf_t @var{op})
				5126	Return non-zero if @var{op} is an integer.
				5127	@end deftypefun
				5128
				5129	@deftypefun int mpf_fits_ulong_p (const mpf_t @var{op})
				5130	@deftypefunx int mpf_fits_slong_p (const mpf_t @var{op})
				5131	@deftypefunx int mpf_fits_uint_p (const mpf_t @var{op})
				5132	@deftypefunx int mpf_fits_sint_p (const mpf_t @var{op})
				5133	@deftypefunx int mpf_fits_ushort_p (const mpf_t @var{op})
				5134	@deftypefunx int mpf_fits_sshort_p (const mpf_t @var{op})
				5135	Return non-zero if @var{op} would fit in the respective C data type, when
				5136	truncated to an integer.
				5137	@end deftypefun
				5138
				5139	@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
				5140	@cindex Random number functions
				5141	@cindex Float random number functions
				5142	Generate a uniformly distributed random float in @var{rop}, such that @math{0
				5143	@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa or
				5144	less if the precision of @var{rop} is smaller.
				5145
				5146	The variable @var{state} must be initialized by calling one of the
				5147	@code{gmp_randinit} functions (@ref{Random State Initialization}) before
				5148	invoking this function.
				5149	@end deftypefun
				5150
				5151	@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
				5152	Generate a random float of at most @var{max_size} limbs, with long strings of
				5153	zeros and ones in the binary representation. The exponent of the number is in
				5154	the interval @minus{}@var{exp} to @var{exp} (in limbs). This function is
				5155	useful for testing functions and algorithms, since this kind of random
				5156	number has proven to be more likely to trigger corner-case bugs. Negative
				5157	random numbers are generated when @var{max_size} is negative.
				5158	@end deftypefun
				5159
				5160	@c @deftypefun size_t mpf_size (const mpf_t @var{op})
				5161	@c Return the size of @var{op} measured in number of limbs. If @var{op} is
				5162	@c zero, the returned value will be zero. (@xref{Nomenclature}, for an
				5163	@c explanation of the concept @dfn{limb}.)
				5164	@c
				5165	@c @strong{This function is obsolete. It will disappear from future GMP
				5166	@c releases.}
				5167	@c @end deftypefun
				5168
				5169
				5170	@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
				5171	@comment node-name, next, previous, up
				5172	@chapter Low-level Functions
				5173	@cindex Low-level functions
				5174
				5175	This chapter describes low-level GMP functions, used to implement the
				5176	high-level GMP functions, but also intended for time-critical user code.
				5177
				5178	These functions start with the prefix @code{mpn_}.
				5179
				5180	@c 1. Some of these function clobber input operands.
				5181	@c
				5182
				5183	The @code{mpn} functions are designed to be as fast as possible, @strong{not}
				5184	to provide a coherent calling interface. The different functions have somewhat
				5185	similar interfaces, but there are variations that make them hard to use. These
				5186	functions do as little as possible apart from the real multiple precision
				5187	computation, so that no time is spent on things that not all callers need.
				5188
				5189	A source operand is specified by a pointer to the least significant limb and a
				5190	limb count. A destination operand is specified by just a pointer. It is the
				5191	responsibility of the caller to ensure that the destination has enough space
				5192	for storing the result.
				5193
				5194	With this way of specifying operands, it is possible to perform computations on
				5195	subranges of an argument, and store the result into a subrange of a
				5196	destination.
				5197
				5198	A common requirement for all functions is that each source area needs at least
				5199	one limb. No size argument may be zero. Unless otherwise stated, in-place
				5200	operations are allowed where source and destination are the same, but not where
				5201	they only partly overlap.
				5202
				5203	The @code{mpn} functions are the base for the implementation of the
				5204	@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.
				5205
				5206	This example adds the number beginning at @var{s1p} and the number beginning at
				5207	@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.
				5208
				5209	@example
				5210	cy = mpn_add_n (destp, s1p, s2p, n)
				5211	@end example
				5212
				5213	It should be noted that the @code{mpn} functions make no attempt to identify
				5214	high or low zero limbs on their operands, or other special forms. On random
				5215	data such cases will be unlikely and it'd be wasteful for every function to
				5216	check every time. An application knowing something about its data can take
				5217	steps to trim or perhaps split its calculations.
				5218	@c
				5219	@c For reference, within gmp mpz_t operands never have high zero limbs, and
				5220	@c we rate low zero limbs as unlikely too (or something an application should
				5221	@c handle). This is a prime motivation for not stripping zero limbs in say
				5222	@c mpn_mul_n etc.
				5223	@c
				5224	@c Other applications doing variable-length calculations will quite likely do
				5225	@c something similar to mpz. And even if not then it's highly likely zero
				5226	@c limb stripping can be done at just a few judicious points, which will be
				5227	@c more efficient than having lots of mpn functions checking every time.
				5228
				5229	@sp 1
				5230	@noindent
				5231	In the notation used below, a source operand is identified by the pointer to
				5232	the least significant limb, and the limb count in braces. For example,
				5233	@{@var{s1p}, @var{s1n}@}.
				5234
				5235	@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5236	Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
				5237	least significant limbs of the result to @var{rp}. Return carry, either 0 or
				5238	1.
				5239
				5240	This is the lowest-level function for addition. It is the preferred function
				5241	for addition, since it is written in assembly for most CPUs. For addition of
				5242	a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
				5243	with a count of 1 for optimal speed.
				5244	@end deftypefun
				5245
				5246	@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
				5247	Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
				5248	significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
				5249	@end deftypefun
				5250
				5251	@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
				5252	Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
				5253	@var{s1n} least significant limbs of the result to @var{rp}. Return carry,
				5254	either 0 or 1.
				5255
				5256	This function requires that @var{s1n} is greater than or equal to @var{s2n}.
				5257	@end deftypefun
				5258
				5259	@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5260	Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
				5261	@var{n} least significant limbs of the result to @var{rp}. Return borrow,
				5262	either 0 or 1.
				5263
				5264	This is the lowest-level function for subtraction. It is the preferred
				5265	function for subtraction, since it is written in assembly for most CPUs.
				5266	@end deftypefun
				5267
				5268	@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
				5269	Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
				5270	significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
				5271	@end deftypefun
				5272
				5273	@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
				5274	Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
				5275	@var{s1n} least significant limbs of the result to @var{rp}. Return borrow,
				5276	either 0 or 1.
				5277
				5278	This function requires that @var{s1n} is greater than or equal to
				5279	@var{s2n}.
				5280	@end deftypefun
				5281
				5282	@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
				5283	Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
				5284	@{@var{rp}, @var{n}@}. This is equivalent to calling @code{mpn_sub_n} with a
				5285	@var{n}-limb zero minuend and passing @{@var{sp}, @var{n}@} as subtrahend.
				5286	Return borrow, either 0 or 1.
				5287	@end deftypefun
				5288
				5289	@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5290	Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
				5291	2*@var{n}-limb result to @var{rp}.
				5292
				5293	The destination has to have space for 2*@var{n} limbs, even if the product's
				5294	most significant limb is zero. No overlap is permitted between the
				5295	destination and either source.
				5296
				5297	If the two input operands are the same, use @code{mpn_sqr}.
				5298	@end deftypefun
				5299
				5300	@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
				5301	Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
				5302	(@var{s1n}+@var{s2n})-limb result to @var{rp}. Return the most significant
				5303	limb of the result.
				5304
				5305	The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
				5306	product's most significant limb is zero. No overlap is permitted between the
				5307	destination and either source.
				5308
				5309	This function requires that @var{s1n} is greater than or equal to @var{s2n}.
				5310	@end deftypefun
				5311
				5312	@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
				5313	Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
				5314	result to @var{rp}.
				5315
				5316	The destination has to have space for 2*@var{n} limbs, even if the result's
				5317	most significant limb is zero. No overlap is permitted between the
				5318	destination and the source.
				5319	@end deftypefun
				5320
				5321	@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
				5322	Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
				5323	significant limbs of the product to @var{rp}. Return the most significant
				5324	limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
				5325	allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
				5326
				5327	This is a low-level function that is a building block for general
				5328	multiplication as well as other operations in GMP@. It is written in assembly
				5329	for most CPUs.
				5330
				5331	Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
				5332	with a count equal to the base-2 logarithm of @var{s2limb} instead, for optimal speed.
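The carry chain described for @code{mpn_mul_1} can be sketched in portable C. This is a model under the assumption of 32-bit limbs with a 64-bit intermediate; @code{model_mul_1} is a hypothetical name, and GMP's real routine is normally assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_mul_1 with 32-bit limbs.
   Writes the n low limbs of s1 * s2limb and returns the high limb. */
static uint32_t model_mul_1(uint32_t *rp, const uint32_t *s1p, size_t n,
                            uint32_t s2limb)
{
    uint32_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        /* 32x32 -> 64 bit product plus carry cannot overflow 64 bits. */
        uint64_t t = (uint64_t)s1p[i] * s2limb + carry;
        rp[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);
    }
    return carry;
}
```

Because each source limb is read before the matching destination limb is written, the model also works in place, which is the simplest case of the permitted overlap.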
				5333	@end deftypefun
				5334
				5335	@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
				5336	Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
				5337	significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
				5338	to @var{rp}. Return the most significant limb of the product, plus carry-out
				5339	from the addition. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
				5340	allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
				5341
				5342	This is a low-level function that is a building block for general
				5343	multiplication as well as other operations in GMP@. It is written in assembly
				5344	for most CPUs.
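A minimal model of the multiply-and-accumulate step, assuming 32-bit limbs, may help in reading the return value. @code{model_addmul_1} is a hypothetical name used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_addmul_1 with 32-bit limbs:
   rp += s1 * s2limb over n limbs; the return value is the limb that
   would have to be added at position n. */
static uint32_t model_addmul_1(uint32_t *rp, const uint32_t *s1p, size_t n,
                               uint32_t s2limb)
{
    uint32_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        /* Maximum is (b-1)^2 + 2(b-1) = b^2 - 1, so 64 bits suffice. */
        uint64_t t = (uint64_t)s1p[i] * s2limb + rp[i] + carry;
        rp[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);
    }
    return carry;
}
```

The returned limb combines the high limb of the product with the carry out of the addition, and in this model it always fits in a single limb.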
				5345	@end deftypefun
				5346
				5347	@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
				5348	Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
				5349	least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
				5350	result to @var{rp}. Return the most significant limb of the product, plus
				5351	borrow-out from the subtraction. @{@var{s1p}, @var{n}@} and @{@var{rp},
				5352	@var{n}@} are allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
				5353
				5354	This is a low-level function that is a building block for general
				5355	multiplication and division as well as other operations in GMP@. It is written
				5356	in assembly for most CPUs.
				5357	@end deftypefun
				5358
				5359	@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
				5360	Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
				5361	at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
				5362	@var{dn}@}. The quotient is rounded towards 0.
				5363
				5364	No overlap is permitted between arguments, except that @var{np} may equal
				5365	@var{rp}. The dividend size @var{nn} must be greater than or equal to divisor
				5366	size @var{dn}. The most significant limb of the divisor must be non-zero. The
				5367	@var{qxn} operand must be zero.
				5368	@end deftypefun
				5369
				5370	@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
				5371	[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
				5372	performance.]
				5373
				5374	Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
				5375	quotient at @var{r1p}, with the exception of the most significant limb, which
				5376	is returned. The remainder replaces the dividend at @var{rs2p}; it will be
				5377	@var{s3n} limbs long (i.e., as many limbs as the divisor).
				5378
				5379	In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
				5380	stored after the integral limbs. For most usages, @var{qxn} will be zero.
				5381
				5382	It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
				5383	required that the most significant bit of the divisor is set.
				5384
				5385	If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
				5386	from that special case, no overlap between arguments is permitted.
				5387
				5388	Return the most significant limb of the quotient, either 0 or 1.
				5389
				5390	The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
				5391	limbs large.
				5392	@end deftypefun
				5393
				5394	@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
				5395	@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
				5396	Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
				5397	@var{r1p}. Return the remainder.
				5398
				5399	The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
				5400	addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
				5401	@var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most
				5402	usages, @var{qxn} will be zero.
				5403
				5404	@code{mpn_divmod_1} exists for upward source compatibility and is simply a
				5405	macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.
				5406
				5407	The areas at @var{r1p} and @var{s2p} have to be identical or completely
				5408	separate, not partially overlapping.
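The layout of integer and fraction limbs can be illustrated with a schoolbook model. The sketch assumes 32-bit limbs and separate source and destination areas; @code{model_divrem_1} is a hypothetical name. It also assumes the returned remainder is the one left after the fraction limbs have been developed, which the text above does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_divrem_1 with 32-bit limbs.
   Integer quotient goes to r1p[qxn .. qxn+s2n-1], fraction limbs to
   r1p[0 .. qxn-1]; the remainder is returned. */
static uint32_t model_divrem_1(uint32_t *r1p, size_t qxn,
                               const uint32_t *s2p, size_t s2n, uint32_t d)
{
    uint64_t rem = 0;
    assert(d != 0);  /* assumption of this sketch: divisor is non-zero */
    /* Integer part, most significant limb first. */
    for (size_t i = s2n; i-- > 0; ) {
        uint64_t cur = (rem << 32) | s2p[i];
        r1p[qxn + i] = (uint32_t)(cur / d);
        rem = cur % d;
    }
    /* Fraction part: continue as if zero limbs followed the dividend. */
    for (size_t i = qxn; i-- > 0; ) {
        uint64_t cur = rem << 32;
        r1p[i] = (uint32_t)(cur / d);
        rem = cur % d;
    }
    return (uint32_t)rem;
}
```

With @code{qxn} equal to zero this reduces to plain division by a limb, which is what the @code{mpn_divmod_1} macro corresponds to.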
				5409	@end deftypefn
				5410
				5411	@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
				5412	[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
				5413	performance.]
				5414	@end deftypefun
				5415
				5416	@deftypefun void mpn_divexact_1 (mp_limb_t * @var{rp}, const mp_limb_t * @var{sp}, mp_size_t @var{n}, mp_limb_t @var{d})
				5417	Divide @{@var{sp}, @var{n}@} by @var{d}, expecting it to divide exactly, and
				5418	writing the result to @{@var{rp}, @var{n}@}. If @var{d} doesn't divide
				5419	exactly, the value written to @{@var{rp}, @var{n}@} is undefined. The areas at
				5420	@var{rp} and @var{sp} have to be identical or completely separate, not
				5421	partially overlapping.
				5422	@end deftypefun
				5423
				5424	@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
				5425	@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
				5426	Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
				5427	the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
				5428	zero and the result is the quotient. If not, the return value is non-zero and
				5429	the result won't be anything useful.
				5430
				5431	@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
				5432	return value from a previous call, so a large calculation can be done piece by
				5433	piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
				5434	@code{mpn_divexact_by3c} with a 0 carry parameter.
				5435
				5436	These routines use a multiply-by-inverse and will be faster than
				5437	@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.
				5438
				5439	The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
				5440	and return value @math{c} satisfy @m{cb^n+a-i=3q, cb^n + a-i = 3q}, where
				5441	@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}. The
				5442	return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
				5443	be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly
				5444	@math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
				5445	3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
				5446	@code{mp_bits_per_limb} is even, which is always so currently).
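The relation between source, quotient and carry can be checked with a small multiply-by-inverse model. It assumes 32-bit limbs, for which the inverse of 3 modulo 2^32 is 0xAAAAAAAB; @code{model_divexact_by3c} and @code{INVERSE_3} are hypothetical names for this sketch and need not match GMP's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVERSE_3 0xAAAAAAABu  /* 3 * 0xAAAAAAAB == 1 (mod 2^32) */

/* Hypothetical model of mpn_divexact_by3c with 32-bit limbs.
   On return, c*b^n + a - carry == 3*q with c in {0, 1, 2}. */
static uint32_t model_divexact_by3c(uint32_t *rp, const uint32_t *sp,
                                    size_t n, uint32_t carry)
{
    uint32_t c = carry;
    assert(carry <= 2);
    for (size_t i = 0; i < n; i++) {
        uint32_t s = sp[i];
        uint32_t l = s - c;        /* subtract the incoming borrow */
        c = (uint32_t)(l > s);     /* 1 if that subtraction wrapped */
        l *= INVERSE_3;            /* quotient limb, modulo 2^32 */
        rp[i] = l;
        c += (uint32_t)(l > 0x55555555u);  /* 3*l is at least 2^32 */
        c += (uint32_t)(l > 0xAAAAAAAAu);  /* 3*l is at least 2^33 */
    }
    return c;
}
```

In this model the single-limb input 10 returns 2, so the remainder is 3 - 2 = 1. The two-limb value 2*2^32 + 1 can be processed one limb at a time: the low limb returns 2, and feeding that in as the carry for the high limb returns 0.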
				5447	@end deftypefn
				5448
				5449	@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
				5450	Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
				5451	@var{s1n} can be zero.
				5452	@end deftypefun
				5453
				5454	@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
				5455	Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
				5456	@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
				5457	least significant @var{count} bits of the return value (the rest of the return
				5458	value is zero).
				5459
				5460	@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
				5461	regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
				5462	@math{@var{rp} @ge{} @var{sp}}.
				5463
				5464	This function is written in assembly for most CPUs.
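The return value and the overlap rule can be seen in a portable model. The sketch assumes 32-bit limbs, so @var{count} ranges from 1 to 31; @code{model_lshift} is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_lshift with 32-bit limbs.
   Works from the most significant limb down, so rp >= sp overlap is safe. */
static uint32_t model_lshift(uint32_t *rp, const uint32_t *sp, size_t n,
                             unsigned count)
{
    assert(n > 0 && count >= 1 && count <= 31);
    /* Bits leaving the top limb, returned in the low `count` bits. */
    uint32_t out = sp[n - 1] >> (32 - count);
    for (size_t i = n - 1; i > 0; i--)
        rp[i] = (uint32_t)((sp[i] << count) | (sp[i - 1] >> (32 - count)));
    rp[0] = (uint32_t)(sp[0] << count);
    return out;
}
```

Shifting the single limb 0xC0000000 left by 2 in this model leaves a zero limb and returns 3, the two bits that were pushed out.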
				5465	@end deftypefun
				5466
				5467	@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
				5468	Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
				5469	@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the
				5470	most significant @var{count} bits of the return value (the rest of the return
				5471	value is zero).
				5472
				5473	@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
				5474	regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
				5475	@math{@var{rp} @le{} @var{sp}}.
				5476
				5477	This function is written in assembly for most CPUs.
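Unlike the left shift, the bits shifted out here are returned at the top of the limb. A model assuming 32-bit limbs shows this; @code{model_rshift} is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_rshift with 32-bit limbs.
   Works from the least significant limb up, so rp <= sp overlap is safe. */
static uint32_t model_rshift(uint32_t *rp, const uint32_t *sp, size_t n,
                             unsigned count)
{
    assert(n > 0 && count >= 1 && count <= 31);
    /* Bits leaving the bottom limb, returned in the high `count` bits. */
    uint32_t out = (uint32_t)(sp[0] << (32 - count));
    for (size_t i = 0; i + 1 < n; i++)
        rp[i] = (uint32_t)((sp[i] >> count) | (sp[i + 1] << (32 - count)));
    rp[n - 1] = sp[n - 1] >> count;
    return out;
}
```

Shifting the two-limb value with limbs 3 (low) and 1 (high) right by one bit returns 0x80000000, the single lost bit placed in the most significant position.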
				5478	@end deftypefun
				5479
				5480	@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5481	Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
				5482	positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
				5483	negative value if @math{@var{s1} < @var{s2}}.
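Since limbs are stored least significant first, a comparison has to start from the highest index. A minimal model, assuming 32-bit limbs and using the hypothetical name @code{model_cmp}:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_cmp with 32-bit limbs.
   The first differing limb, scanning from the top, decides the result. */
static int model_cmp(const uint32_t *s1p, const uint32_t *s2p, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        if (s1p[i] != s2p[i])
            return s1p[i] > s2p[i] ? 1 : -1;
    }
    return 0;
}
```

The model returns exactly 1 or -1 for unequal operands; the text above only promises a positive or negative value, so callers should test the sign rather than a specific number.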
				5484	@end deftypefun
				5485
				5486	@deftypefun int mpn_zero_p (const mp_limb_t *@var{sp}, mp_size_t @var{n})
				5487	Test @{@var{sp}, @var{n}@} and return 1 if the operand is zero, 0 otherwise.
				5488	@end deftypefun
				5489
				5490	@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
				5491	Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp},
				5492	@var{xn}@} and @{@var{yp}, @var{yn}@}. The result can be up to @var{yn} limbs;
				5493	the return value is the actual number produced. Both source operands are
				5494	destroyed.
				5495
				5496	It is required that @math{@var{xn} @ge @var{yn} > 0}, the most significant
				5497	limb of @{@var{yp}, @var{yn}@} must be non-zero, and at least one of
				5498	the two operands must be odd. No overlap is permitted
				5499	between @{@var{xp}, @var{xn}@} and @{@var{yp}, @var{yn}@}.
				5500	@end deftypefun
				5501
				5502	@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb})
				5503	Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}.
				5504	Both operands must be non-zero.
				5505	@end deftypefun
				5506
				5507	@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{up}, mp_size_t @var{un}, mp_limb_t *@var{vp}, mp_size_t @var{vn})
				5508	Let @m{U,@var{U}} be defined by @{@var{up}, @var{un}@} and let @m{V,@var{V}} be
				5509	defined by @{@var{vp}, @var{vn}@}.
				5510
				5511	Compute the greatest common divisor @math{G} of @math{U} and @math{V}. Compute
				5512	a cofactor @math{S} such that @math{G = US + VT}. The second cofactor @var{T}
				5513	is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} -
				5514	@var{U}*@var{S}) / @var{V}} (the division will be exact). It is required that
				5515	@math{@var{un} @ge @var{vn} > 0}, and the most significant
				5516	limb of @{@var{vp}, @var{vn}@} must be non-zero.
				5517
				5518	@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}. @math{S =
				5519	0} if and only if @math{V} divides @math{U} (i.e., @math{G = V}).
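For small operands the recovery of the second cofactor can be sketched with ordinary integers. The helper @code{recover_t} is hypothetical and works on @code{int64_t} values rather than limb arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: recover T from G = U*S + V*T for operands small
   enough that U*S fits in int64_t. */
static int64_t recover_t(int64_t g, int64_t u, int64_t s, int64_t v)
{
    int64_t num = g - u * s;
    assert(v != 0 && num % v == 0);  /* the division is exact */
    return num / v;
}
```

With U = 240 and V = 46 the gcd is G = 2, and S = -9 satisfies the bound since 9 is less than 46 / (2 * 2). The helper then gives T = 47, and indeed 240 * (-9) + 46 * 47 = 2.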
				5520
				5521	Store @math{G} at @var{gp} and let the return value define its limb count.
				5522	Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count. @math{S}
				5523	can be negative; when this happens *@var{sn} will be negative. The area at
				5524	@var{gp} should have room for @var{vn} limbs and the area at @var{sp} should
				5525	have room for @math{@var{vn}+1} limbs.
				5526
				5527	Both source operands are destroyed.
				5528
				5529	Compatibility notes: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly.
				5530	Earlier as well as later GMP releases define @math{S} as described here.
				5531	GMP releases before GMP 4.3.0 required additional space for both input and output
				5532	areas. More precisely, the areas @{@var{up}, @math{@var{un}+1}@} and
				5533	@{@var{vp}, @math{@var{vn}+1}@} were destroyed (i.e.@: the operands plus an
				5534	extra limb past the end of each), and the areas pointed to by @var{gp} and
				5535	@var{sp} should each have room for @math{@var{un}+1} limbs.
				5536	@end deftypefun
				5537
				5538	@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
				5539	Compute the square root of @{@var{sp}, @var{n}@} and put the result at
				5540	@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
				5541	@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value
				5542	indicates how many are produced.
				5543
				5544	The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The
				5545	areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
				5546	be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
				5547	@var{n}@} must be either identical or completely separate.
				5548
				5549	If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
				5550	case the return value is zero or non-zero according to whether the remainder
				5551	would have been zero or non-zero.
				5552
				5553	A return value of zero indicates a perfect square. See also
				5554	@code{mpn_perfect_square_p}.
				5555	@end deftypefun
				5556
				5557	@deftypefun size_t mpn_sizeinbase (const mp_limb_t *@var{xp}, mp_size_t @var{n}, int @var{base})
				5558	Return the size of @{@var{xp},@var{n}@} measured in number of digits in the
				5559	given @var{base}. @var{base} can vary from 2 to 62. Requires @math{@var{n} > 0}
				5560	and @math{@var{xp}[@var{n}-1] > 0}. The result will be either exact or
				5561	1 too big. If @var{base} is a power of 2, the result is always exact.
				5562	@end deftypefun
				5563
				5564	@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
				5565	Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
				5566	base @var{base}, and return the number of characters produced. There may be
				5567	leading zeros in the string. The string is not in ASCII; to convert it to
				5568	printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
				5569	the base and range. @var{base} can vary from 2 to 256.
				5570
				5571	The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
				5572	non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
				5573	@var{base} is a power of 2, in which case it's unchanged.
				5574
				5575	The area at @var{str} has to have space for the largest possible number
				5576	represented by a @var{s1n} long limb array, plus one extra character.
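Turning the raw digit values into printable text is left to the caller. One possible sketch follows; it assumes a base of at most 36 and upper-case letters for digits 10 and above, and @code{digits_to_ascii} is a hypothetical helper, not part of GMP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert raw digit values, as stored at str, into a
   NUL-terminated printable string. out needs room for len + 1 characters. */
static void digits_to_ascii(char *out, const unsigned char *str, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned d = str[i];
        assert(d < 36);  /* assumption of this sketch: base <= 36 */
        out[i] = (char)(d < 10 ? '0' + d : 'A' + (d - 10));
    }
    out[len] = '\0';
}
```

The raw bytes 0, 15, 10, 9 become the text 0FA9. The leading zero is kept, since the conversion may produce leading zeros.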
				5577	@end deftypefun
				5578
				5579	@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
				5580	Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
				5581	@var{rp}.
				5582
				5583	@math{@var{str}[0]} is the most significant input byte and
				5584	@math{@var{str}[@var{strsize}-1]} is the least significant input byte. Each
				5585	byte should be a value in the range 0 to @math{@var{base}-1}, not an ASCII
				5586	character. @var{base} can vary from 2 to 256.
				5587
				5588	The converted value is @{@var{rp},@var{rn}@} where @var{rn} is the return
				5589	value. If the most significant input byte @math{@var{str}[0]} is non-zero,
				5590	then @math{@var{rp}[@var{rn}-1]} will be non-zero, else
				5591	@math{@var{rp}[@var{rn}-1]} and some number of subsequent limbs may be zero.
				5592
				5593	The area at @var{rp} has to have space for the largest possible number with
				5594	@var{strsize} digits in the chosen base, plus one extra limb.
				5595
				5596	The input must have at least one byte, and no overlap is permitted between
				5597	@{@var{str},@var{strsize}@} and the result at @var{rp}.
				5598	@end deftypefun
				5599
				5600	@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
				5601	Scan @var{s1p} from bit position @var{bit} for the next clear bit.
				5602
				5603	It is required that there be a clear bit within the area at @var{s1p} at or
				5604	beyond bit position @var{bit}, so that the function has something to return.
				5605	@end deftypefun
				5606
				5607	@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
				5608	Scan @var{s1p} from bit position @var{bit} for the next set bit.
				5609
				5610	It is required that there be a set bit within the area at @var{s1p} at or
				5611	beyond bit position @var{bit}, so that the function has something to return.
				5612	@end deftypefun
				5613
				5614	@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
				5615	@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
				5616	Generate a random number of length @var{r1n} and store it at @var{r1p}. The
				5617	most significant limb is always non-zero. @code{mpn_random} generates
				5618	uniformly distributed limb data; @code{mpn_random2} generates long strings of
				5619	zeros and ones in the binary representation.
				5620
				5621	@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
				5622	routines.
				5623	@end deftypefun
				5624
				5625	@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
				5626	Count the number of set bits in @{@var{s1p}, @var{n}@}.
				5627	@end deftypefun
				5628
				5629	@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5630	Compute the Hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
				5631	@var{n}@}, which is the number of bit positions where the two operands have
				5632	different bit values.
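The Hamming distance is the population count of the exclusive or of the operands. A portable model, assuming 32-bit limbs and using the hypothetical names @code{model_popcount} and @code{model_hamdist}:

```c
#include <stddef.h>
#include <stdint.h>

/* Count set bits in one limb by clearing the lowest set bit repeatedly. */
static unsigned limb_popcount(uint32_t x)
{
    unsigned c = 0;
    while (x != 0) {
        x &= x - 1;
        c++;
    }
    return c;
}

/* Hypothetical model of mpn_popcount with 32-bit limbs. */
static uint64_t model_popcount(const uint32_t *s1p, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += limb_popcount(s1p[i]);
    return total;
}

/* Hypothetical model of mpn_hamdist: popcount of the limb-wise xor. */
static uint64_t model_hamdist(const uint32_t *s1p, const uint32_t *s2p,
                              size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += limb_popcount(s1p[i] ^ s2p[i]);
    return total;
}
```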
				5633	@end deftypefun
				5634
				5635	@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
				5636	Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
				5637	The most significant limb of the input @{@var{s1p}, @var{n}@} must be
				5638	non-zero.
				5639	@end deftypefun
				5640
				5641	@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5642	Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
				5643	@var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
				5644	@end deftypefun
				5645
				5646	@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5647	Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
				5648	@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
				5649	@end deftypefun
				5650
				5651	@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5652	Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
				5653	@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
				5654	@end deftypefun
				5655
				5656	@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5657	Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise
				5658	complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
				5659	@end deftypefun
				5660
				5661	@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5662	Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise
				5663	complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
				5664	@end deftypefun
				5665
				5666	@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5667	Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
				5668	@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}.
				5669	@end deftypefun
				5670
				5671	@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5672	Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
				5673	@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
				5674	@{@var{rp}, @var{n}@}.
				5675	@end deftypefun
				5676
				5677	@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5678	Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
				5679	@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
				5680	@{@var{rp}, @var{n}@}.
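The five complemented variants, @code{mpn_andn_n} through @code{mpn_xnor_n}, differ only in where the complement is applied. Per limb, and assuming 32-bit limbs, they can be summarised as follows; the @code{limb_} helpers are hypothetical names for this sketch.

```c
#include <stdint.h>

/* Hypothetical per-limb models; the mpn functions apply the same
   expression to each of the n limbs. */
static uint32_t limb_andn(uint32_t a, uint32_t b) { return (uint32_t)(a & ~b); }
static uint32_t limb_iorn(uint32_t a, uint32_t b) { return (uint32_t)(a | ~b); }
static uint32_t limb_nand(uint32_t a, uint32_t b) { return (uint32_t)~(a & b); }
static uint32_t limb_nior(uint32_t a, uint32_t b) { return (uint32_t)~(a | b); }
static uint32_t limb_xnor(uint32_t a, uint32_t b) { return (uint32_t)~(a ^ b); }
```

In @code{andn} and @code{iorn} only the second operand is complemented, whereas @code{nand}, @code{nior} and @code{xnor} complement the combined result.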
				5681	@end deftypefun
				5682
				5683	@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
				5684	Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result
				5685	to @{@var{rp}, @var{n}@}.
				5686	@end deftypefun
				5687
				5688	@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
				5689	Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, increasingly (from the least significant limb upward).
				5690	@end deftypefun
				5691
				5692	@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
				5693	Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, decreasingly (from the most significant limb downward).
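The copy direction matters when the source and destination overlap. The following models, assuming 32-bit limbs and using the hypothetical names @code{model_copyi} and @code{model_copyd}, show which direction suits which overlap. The overlap guidance is an inference from the copy order, not a statement from the text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_copyi: lowest limb first.
   Suits overlapping areas where rp is at or below s1p. */
static void model_copyi(uint32_t *rp, const uint32_t *s1p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = s1p[i];
}

/* Hypothetical model of mpn_copyd: highest limb first.
   Suits overlapping areas where rp is at or above s1p. */
static void model_copyd(uint32_t *rp, const uint32_t *s1p, size_t n)
{
    for (size_t i = n; i-- > 0; )
        rp[i] = s1p[i];
}
```

Moving four limbs up by one position within the same buffer works with the decreasing copy, and moving them down by one position works with the increasing copy.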
				5694	@end deftypefun
				5695
				5696	@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n})
				5697	Zero @{@var{rp}, @var{n}@}.
				5698	@end deftypefun
				5699
				5700	@sp 1
				5701	@section Low-level functions for cryptography
				5702	@cindex Low-level functions for cryptography
				5703	@cindex Cryptography functions, low-level
				5704
				5705	The functions prefixed with @code{mpn_sec_} and @code{mpn_cnd_} are designed to
				5706	perform the exact same low-level operations and have the same cache access
				5707	patterns for any two same-size arguments, assuming that function arguments are
				5708	placed at the same position and that the machine state is identical upon
				5709	function entry. These functions are intended for cryptographic purposes, where
				5710	resilience to side-channel attacks is desired.
				5711
				5712	These functions are less efficient than their ``leaky'' counterparts; their
				5713	performance for operands of the sizes typically used for cryptographic
				5714	applications is between 15% and 100% worse. For larger operands, these
				5715	functions might be inadequate, since they rely on asymptotically elementary
				5716	algorithms.
				5717
				5718	These functions do not make any explicit allocations. Those
				5719	that need scratch space accept a scratch space operand. This convention allows
				5720	callers to keep sensitive data in designated memory areas. Note however that
				5721	compilers may choose to spill scalar values used within these functions to
				5722	their stack frame and that such scalars may contain sensitive data.
				5723
				5724	In addition to these specially crafted functions, the following @code{mpn}
				5725	functions are naturally side-channel resistant: @code{mpn_add_n},
				5726	@code{mpn_sub_n}, @code{mpn_lshift}, @code{mpn_rshift}, @code{mpn_zero},
				5727	@code{mpn_copyi}, @code{mpn_copyd}, @code{mpn_com}, and the logical functions
				5728	(@code{mpn_and_n}, etc.).
				5729
				5730	There are some exceptions to the side-channel resilience: (1) Some assembly
				5731	implementations of @code{mpn_lshift} identify shift-by-one as a special case.
				5732	This is a problem iff the shift count is a function of sensitive data. (2)
				5733	Alpha ev6 and Pentium4 using 64-bit limbs have leaky @code{mpn_add_n} and
				5734	@code{mpn_sub_n}. (3) Alpha ev6 has a leaky @code{mpn_mul_1} which also makes
				5735	@code{mpn_sec_mul} on those systems unsafe.
				5736
				5737	@deftypefun mp_limb_t mpn_cnd_add_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5738	@deftypefunx mp_limb_t mpn_cnd_sub_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
				5739	These functions do conditional addition and subtraction. If @var{cnd} is
				5740	non-zero, they produce the same result as a regular @code{mpn_add_n} or
				5741	@code{mpn_sub_n}, and if @var{cnd} is zero, they copy @{@var{s1p},@var{n}@} to
				5742	the result area and return zero. The functions are designed to have timing and
				5743	memory access patterns depending only on size and location of the data areas,
				5744	but independent of the condition @var{cnd}. As with @code{mpn_add_n} and
				5745	@code{mpn_sub_n}, on most machines, the timing will also be independent of the
				5746	actual limb values.
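One common way to make the work independent of the condition is to turn @var{cnd} into an all-ones or all-zeros mask and always run the full addition. The sketch below assumes 32-bit limbs; @code{model_cnd_add_n} is a hypothetical name, and a C compiler gives no timing guarantees, so this only illustrates the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_cnd_add_n with 32-bit limbs.
   The second operand is masked, so the same loop runs either way. */
static uint32_t model_cnd_add_n(uint32_t cnd, uint32_t *rp,
                                const uint32_t *s1p, const uint32_t *s2p,
                                size_t n)
{
    /* All ones when cnd is non-zero, all zeros otherwise. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cnd != 0);
    uint32_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)s1p[i] + (s2p[i] & mask) + carry;
        rp[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);
    }
    return carry;
}
```

With a zero condition the masked operand is zero, so the loop copies the first operand to the result and returns zero, matching the behaviour described above.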
				5747	@end deftypefun
				5748
				5749	@deftypefun mp_limb_t mpn_sec_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
				5750	@deftypefunx mp_limb_t mpn_sec_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
				5751	Set @var{R} to @var{A} + @var{b} or @var{A} - @var{b}, respectively, where
				5752	@var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@}, and @var{b} is
				5753	a single limb. Return the carry, or the borrow for the subtraction.
				5754
				5755	These functions take @math{O(N)} time, unlike the leaky functions
				5756	@code{mpn_add_1} and @code{mpn_sub_1}, which are @math{O(1)} on average. They
				5757	require scratch space of @code{mpn_sec_add_1_itch(@var{n})} and
				5758	@code{mpn_sec_sub_1_itch(@var{n})} limbs, respectively, to be passed in the
				5759	@var{tp} parameter. The scratch space requirements are guaranteed to be at most
				5760	@var{n} limbs, and increase monotonically in the operand size.
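The linear cost comes from visiting every limb even after the carry has died out, whereas a leaky routine can stop at the first limb that absorbs it. A model of the non-leaky shape, assuming 32-bit limbs and omitting the scratch parameter, is sketched here; @code{model_sec_add_1} is a hypothetical name and does not reflect how GMP uses @var{tp}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_sec_add_1 with 32-bit limbs.
   Every limb is processed regardless of the carry value. */
static uint32_t model_sec_add_1(uint32_t *rp, const uint32_t *ap, size_t n,
                                uint32_t b)
{
    uint32_t carry = b;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)ap[i] + carry;
        rp[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);
    }
    return carry;
}
```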
				5761	@end deftypefun
				5762
				5763	@deftypefun void mpn_cnd_swap (mp_limb_t @var{cnd}, volatile mp_limb_t *@var{ap}, volatile mp_limb_t *@var{bp}, mp_size_t @var{n})
				5764	If @var{cnd} is non-zero, swaps the contents of the areas @{@var{ap},@var{n}@}
				5765	and @{@var{bp},@var{n}@}. Otherwise, the areas are left unmodified.
				5766	Implemented using logical operations on the limbs, with the same memory
				5767	accesses independent of the value of @var{cnd}.
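A swap built from logical operations can be sketched as a masked exclusive-or exchange. The model assumes 32-bit limbs; @code{model_cnd_swap} is a hypothetical name, and the exact operations GMP uses may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_cnd_swap with 32-bit limbs.
   Both areas are read and written on every call, whatever cnd is. */
static void model_cnd_swap(uint32_t cnd, volatile uint32_t *ap,
                           volatile uint32_t *bp, size_t n)
{
    /* All ones when cnd is non-zero, all zeros otherwise. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cnd != 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t a = ap[i];
        uint32_t b = bp[i];
        uint32_t t = (a ^ b) & mask;  /* difference, or zero when not swapping */
        ap[i] = a ^ t;
        bp[i] = b ^ t;
    }
}
```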
				5768	@end deftypefun
				5769
				5770	@deftypefun void mpn_sec_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, mp_limb_t *@var{tp})
				5771	@deftypefunx mp_size_t mpn_sec_mul_itch (mp_size_t @var{an}, mp_size_t @var{bn})
				5772	Set @var{R} to @math{A @times{} B}, where @var{A} = @{@var{ap},@var{an}@},
				5773	@var{B} = @{@var{bp},@var{bn}@}, and @var{R} =
				5774	@{@var{rp},@math{@var{an}+@var{bn}}@}.
				5775
				5776	It is required that @math{@var{an} @ge @var{bn} > 0}.
				5777
				5778	No overlapping between @var{R} and the input operands is allowed. For
				5779	@math{@var{A} = @var{B}}, use @code{mpn_sec_sqr} for optimal performance.
				5780
				5781	This function requires scratch space of @code{mpn_sec_mul_itch(@var{an},
				5782	@var{bn})} limbs to be passed in the @var{tp} parameter. The scratch space
				5783	requirements are guaranteed to increase monotonically in the operand sizes.
				5784	@end deftypefun
				5785
				5786
				5787	@deftypefun void mpn_sec_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, mp_limb_t *@var{tp})
				5788	@deftypefunx mp_size_t mpn_sec_sqr_itch (mp_size_t @var{an})
				5789	Set @var{R} to @math{A^2}, where @var{A} = @{@var{ap},@var{an}@}, and @var{R} =
				5790	@{@var{rp},@math{2@var{an}}@}.
				5791
				5792	It is required that @math{@var{an} > 0}.
				5793
				5794	No overlapping between @var{R} and the input operands is allowed.
				5795
				5796	This function requires scratch space of @code{mpn_sec_sqr_itch(@var{an})} limbs
				5797	to be passed in the @var{tp} parameter. The scratch space requirements are
				5798	guaranteed to increase monotonically in the operand size.
				5799	@end deftypefun
				5800
				5801
				5802	@deftypefun void mpn_sec_powm (mp_limb_t *@var{rp}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, const mp_limb_t *@var{ep}, mp_bitcnt_t @var{enb}, const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_limb_t *@var{tp})
				5803	@deftypefunx mp_size_t mpn_sec_powm_itch (mp_size_t @var{bn}, mp_bitcnt_t @var{enb}, size_t @var{n})
				5804	Set @var{R} to @m{B^E \bmod @var{M}, (@var{B} raised to @var{E}) modulo
				5805	@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{M} = @{@var{mp},@var{n}@},
				5806	and @var{E} = @{@var{ep},@math{@GMPceil{@var{enb} /
				5807	@code{GMP\_NUMB\_BITS}}}@}.
				5808
				5809	It is required that @math{@var{B} > 0}, that @math{@var{M} > 0} is odd, and
				5810	that @m{@var{E} < 2@GMPraise{@var{enb}}, @var{E} < 2^@var{enb}}, with @math{@var{enb} > 0}.
				5811
				5812	No overlapping between @var{R} and the input operands is allowed.
				5813
				5814	This function requires scratch space of @code{mpn_sec_powm_itch(@var{bn},
				5815	@var{enb}, @var{n})} limbs to be passed in the @var{tp} parameter. The scratch
				5816	space requirements are guaranteed to increase monotonically in the operand
				5817	sizes.
				5818	@end deftypefun
				5819
				5820	@deftypefun void mpn_sec_tabselect (mp_limb_t *@var{rp}, const mp_limb_t *@var{tab}, mp_size_t @var{n}, mp_size_t @var{nents}, mp_size_t @var{which})
				5821	Select entry @var{which} from table @var{tab}, which has @var{nents} entries, each @var{n}
				5822	limbs. Store the selected entry at @var{rp}.
				5823
				5824	This function reads the entire table to avoid side-channel information leaks.
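Reading the whole table can be sketched as a masked accumulation over all entries. The model assumes 32-bit limbs and that the entries are stored back to back, entry @var{k} starting at limb offset @var{k} times @var{n}; @code{model_tabselect} is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of mpn_sec_tabselect with 32-bit limbs.
   Every entry is read; only the selected one passes the mask. */
static void model_tabselect(uint32_t *rp, const uint32_t *tab, size_t n,
                            size_t nents, size_t which)
{
    assert(which < nents);
    for (size_t j = 0; j < n; j++)
        rp[j] = 0;
    for (size_t k = 0; k < nents; k++) {
        /* All ones for the wanted entry, all zeros for the others. */
        uint32_t mask = (uint32_t)0 - (uint32_t)(k == which);
        for (size_t j = 0; j < n; j++)
            rp[j] |= tab[k * n + j] & mask;
    }
}
```

The memory accesses in this model depend only on @var{n} and @var{nents}, not on @var{which}, which is the property the function is meant to provide.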
				5825	@end deftypefun
				5826
				5827	@deftypefun mp_limb_t mpn_sec_div_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
				5828	@deftypefunx mp_size_t mpn_sec_div_qr_itch (mp_size_t @var{nn}, mp_size_t @var{dn})
				5829
				5830	Set @var{Q} to @m{\lfloor @var{N} / @var{D}\rfloor, the truncated quotient
				5831	@var{N} / @var{D}} and @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo
				5832	@var{D}}, where @var{N} = @{@var{np},@var{nn}@}, @var{D} =
				5833	@{@var{dp},@var{dn}@}, @var{Q}'s most significant limb is the function return
				5834	value and the remaining limbs are @{@var{qp},@var{nn-dn}@}, and @var{R} =
				5835	@{@var{np},@var{dn}@}.
				5836
				5837	It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
				5838	@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}. This does not
				5839	imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.
				5840
				5841	Note the overlapping between @var{N} and @var{R}. No other operand overlapping
				5842	is allowed. The entire space occupied by @var{N} is overwritten.
				5843
				5844	This function requires scratch space of @code{mpn_sec_div_qr_itch(@var{nn},
				5845	@var{dn})} limbs to be passed in the @var{tp} parameter.
				5846	@end deftypefun
				5847
@deftypefun void mpn_sec_div_r (mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
				5849	@deftypefunx mp_size_t mpn_sec_div_r_itch (mp_size_t @var{nn}, mp_size_t @var{dn})
				5850
				5851	Set @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo @var{D}}, where @var{N}
				5852	= @{@var{np},@var{nn}@}, @var{D} = @{@var{dp},@var{dn}@}, and @var{R} =
				5853	@{@var{np},@var{dn}@}.
				5854
				5855	It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
				5856	@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}. This does not
				5857	imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.
				5858
				5859	Note the overlapping between @var{N} and @var{R}. No other operand overlapping
				5860	is allowed. The entire space occupied by @var{N} is overwritten.
				5861
				5862	This function requires scratch space of @code{mpn_sec_div_r_itch(@var{nn},
				5863	@var{dn})} limbs to be passed in the @var{tp} parameter.
				5864	@end deftypefun
				5865
@deftypefun int mpn_sec_invert (mp_limb_t *@var{rp}, mp_limb_t *@var{ap}, const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_bitcnt_t @var{nbcnt}, mp_limb_t *@var{tp})
				5867	@deftypefunx mp_size_t mpn_sec_invert_itch (mp_size_t @var{n})
				5868	Set @var{R} to @m{@var{A}^{-1} \bmod @var{M}, the inverse of @var{A} modulo
				5869	@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@},
				5870	and @var{M} = @{@var{mp},@var{n}@}. @strong{This function's interface is
				5871	preliminary.}
				5872
				5873	If an inverse exists, return 1, otherwise return 0 and leave @var{R}
				5874	undefined. In either case, the input @var{A} is destroyed.
				5875
				5876	It is required that @var{M} is odd, and that @math{@var{nbcnt} @ge
				5877	@GMPceil{\log(@var{A}+1)} + @GMPceil{\log(@var{M}+1)}}. A safe choice is
				5878	@m{@var{nbcnt} = 2@var{n} @times{} @code{GMP\_NUMB\_BITS}, @var{nbcnt} = 2
				5879	@times{} @var{n} @times{} GMP_NUMB_BITS}, but a smaller value might improve
				5880	performance if @var{M} or @var{A} are known to have leading zero bits.
				5881
				5882	This function requires scratch space of @code{mpn_sec_invert_itch(@var{n})}
				5883	limbs to be passed in the @var{tp} parameter.
				5884	@end deftypefun
				5885
				5886
				5887	@sp 1
				5888	@section Nails
				5889	@cindex Nails
				5890
				5891	@strong{Everything in this section is highly experimental and may disappear or
				5892	be subject to incompatible changes in a future version of GMP.}
				5893
				5894	Nails are an experimental feature whereby a few bits are left unused at the
				5895	top of each @code{mp_limb_t}. This can significantly improve carry handling
				5896	on some processors.
				5897
				5898	All the @code{mpn} functions accepting limb data will expect the nail bits to
				5899	be zero on entry, and will return data with the nails similarly all zero.
				5900	This applies both to limb vectors and to single limb arguments.
				5901
				5902	Nails can be enabled by configuring with @samp{--enable-nails}. By default
				5903	the number of bits will be chosen according to what suits the host processor,
				5904	but a particular number can be selected with @samp{--enable-nails=N}.
				5905
				5906	At the mpn level, a nail build is neither source nor binary compatible with a
				5907	non-nail build, strictly speaking. But programs acting on limbs only through
				5908	the mpn functions are likely to work equally well with either build, and
				5909	judicious use of the definitions below should make any program compatible with
				5910	either build, at the source level.
				5911
				5912	For the higher level routines, meaning @code{mpz} etc, a nail build should be
				5913	fully source and binary compatible with a non-nail build.
				5914
				5915	@defmac GMP_NAIL_BITS
				5916	@defmacx GMP_NUMB_BITS
				5917	@defmacx GMP_LIMB_BITS
				5918	@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
				5919	use. @code{GMP_NUMB_BITS} is the number of data bits in a limb.
				5920	@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In
				5921	all cases
				5922
				5923	@example
				5924	GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
				5925	@end example
				5926	@end defmac
				5927
				5928	@defmac GMP_NAIL_MASK
				5929	@defmacx GMP_NUMB_MASK
				5930	Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0
				5931	when nails are not in use.
				5932
				5933	@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
				5934	with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
				5935	can help various RISC chips.
				5936	@end defmac
				5937
				5938	@defmac GMP_NUMB_MAX
				5939	The maximum value that can be stored in the number part of a limb. This is
				5940	the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
				5941	comparisons rather than bit-wise operations.
				5942	@end defmac
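
To make the relationship between the nail and number parts concrete, here is
a small sketch using a made-up layout of a 64-bit limb with 2 nail bits. The
@code{EX_} macros and @code{ex_add} are hypothetical stand-ins chosen for the
example; a real program would use the GMP macros above, whose values depend
on the build.

```c
#include <stdint.h>

/* Hypothetical layout for illustration only: 64-bit limb, 2 nail bits. */
#define EX_LIMB_BITS 64
#define EX_NAIL_BITS 2
#define EX_NUMB_BITS (EX_LIMB_BITS - EX_NAIL_BITS)
#define EX_NUMB_MASK ((~(uint64_t) 0) >> EX_NAIL_BITS)

/* Add two limbs whose nails are zero.  Each operand is below 2^62, so the
   sum fits in 64 bits and any carry lands in the nail bits, where a plain
   shift recovers it.  The returned limb again has a zero nail. */
uint64_t ex_add(uint64_t a, uint64_t b, uint64_t *carry)
{
    uint64_t sum = a + b;

    *carry = sum >> EX_NUMB_BITS;   /* nail part, no large mask needed */
    return sum & EX_NUMB_MASK;      /* number part */
}
```

This shows why the nail part is usually taken with a shift by the number of
data bits rather than with a separate nail mask.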
				5943
				5944	The term ``nails'' comes from finger or toe nails, which are at the ends of a
				5945	limb (arm or leg). ``numb'' is short for number, but is also how the
				5946	developers felt after trying for a long time to come up with sensible names
				5947	for these things.
				5948
				5949	In the future (the distant future most likely) a non-zero nail might be
				5950	permitted, giving non-unique representations for numbers in a limb vector.
				5951	This would help vector processors since carries would only ever need to
				5952	propagate one or two limbs.
				5953
				5954
				5955	@node Random Number Functions, Formatted Output, Low-level Functions, Top
				5956	@chapter Random Number Functions
				5957	@cindex Random number functions
				5958
				5959	Sequences of pseudo-random numbers in GMP are generated using a variable of
				5960	type @code{gmp_randstate_t}, which holds an algorithm selection and a current
				5961	state. Such a variable must be initialized by a call to one of the
				5962	@code{gmp_randinit} functions, and can be seeded with one of the
				5963	@code{gmp_randseed} functions.
				5964
				5965	The functions actually generating random numbers are described in @ref{Integer
				5966	Random Numbers}, and @ref{Miscellaneous Float Functions}.
				5967
				5968	The older style random number functions don't accept a @code{gmp_randstate_t}
				5969	parameter but instead share a global variable of that type. They use a
				5970	default algorithm and are currently not seeded (though perhaps that will
				5971	change in the future). The new functions accepting a @code{gmp_randstate_t}
				5972	are recommended for applications that care about randomness.
				5973
				5974	@menu
				5975	* Random State Initialization::
				5976	* Random State Seeding::
				5977	* Random State Miscellaneous::
				5978	@end menu
				5979
				5980	@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
				5981	@section Random State Initialization
				5982	@cindex Random number state
				5983	@cindex Initialization functions
				5984
				5985	@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
				5986	Initialize @var{state} with a default algorithm. This will be a compromise
				5987	between speed and randomness, and is recommended for applications with no
				5988	special requirements. Currently this is @code{gmp_randinit_mt}.
				5989	@end deftypefun
				5990
				5991	@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state})
				5992	@cindex Mersenne twister random numbers
				5993	Initialize @var{state} for a Mersenne Twister algorithm. This algorithm is
				5994	fast and has good randomness properties.
				5995	@end deftypefun
				5996
				5997	@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, const mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}})
				5998	@cindex Linear congruential random numbers
				5999	Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
				6000	@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.
				6001
				6002	The low bits of @math{X} in this algorithm are not very random. The least
				6003	significant bit will have a period no more than 2, and the second bit no more
				6004	than 4, etc. For this reason only the high half of each @math{X} is actually
				6005	used.
				6006
				6007	When a random number of more than @math{@var{m2exp}/2} bits is to be
				6008	generated, multiple iterations of the recurrence are used and the results
				6009	concatenated.
				6010	@end deftypefun
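
The weakness of the low bits can be observed with a small stand-alone
generator. The sketch below is not GMP code: it fixes the modulus at
@math{2^64} by relying on unsigned wraparound, and the multiplier and
increment are the well-known constants from Knuth's MMIX generator, chosen
here only as an assumption for the demonstration.

```c
#include <stdint.h>

/* Illustrative constants (Knuth's MMIX generator), not GMP's table data. */
#define EX_LCG_A UINT64_C(6364136223846793005)
#define EX_LCG_C UINT64_C(1442695040888963407)

/* One step of X = (a*X + c) mod 2^64; unsigned overflow supplies the
   reduction modulo 2^64. */
uint64_t ex_lcg_next(uint64_t x)
{
    return EX_LCG_A * x + EX_LCG_C;
}

/* Keep only the high half of the state, discarding the weak low bits. */
uint32_t ex_lcg_high_half(uint64_t x)
{
    return (uint32_t) (x >> 32);
}
```

With an odd multiplier and an odd increment, the lowest bit of successive
states simply alternates, and the lowest two bits repeat after four steps,
which is why only the high half is handed out.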
				6011
				6012	@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size})
				6013	@cindex Linear congruential random numbers
				6014	Initialize @var{state} for a linear congruential algorithm as per
				6015	@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected
				6016	from a table, chosen so that @var{size} bits (or more) of each @math{X} will
				6017	be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}.
				6018
				6019	If successful the return value is non-zero. If @var{size} is bigger than the
				6020	table data provides then the return value is zero. The maximum @var{size}
				6021	currently supported is 128.
				6022	@end deftypefun
				6023
				6024	@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op})
				6025	Initialize @var{rop} with a copy of the algorithm and state from @var{op}.
				6026	@end deftypefun
				6027
				6028	@c Although gmp_randinit, gmp_errno and related constants are obsolete, we
				6029	@c still put @findex entries for them, since they're still documented and
				6030	@c someone might be looking them up when perusing old application code.
				6031
				6032	@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{})
				6033	@strong{This function is obsolete.}
				6034
				6035	@findex GMP_RAND_ALG_LC
				6036	@findex GMP_RAND_ALG_DEFAULT
Initialize @var{state} with an algorithm selected by @var{alg}. The only
choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
described above. A third parameter of type @code{unsigned long} is required;
this is the @var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0
are the same as @code{GMP_RAND_ALG_LC}.
				6042
@c For reference, this is the only place gmp_errno has been documented, and
@c due to being non thread safe we won't be adding to its uses.
				6045	@findex gmp_errno
				6046	@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
				6047	@findex GMP_ERROR_INVALID_ARGUMENT
				6048	@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
				6049	indicate an error. @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
				6050	unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
				6051	is too big. It may be noted this error reporting is not thread safe (a good
				6052	reason to use @code{gmp_randinit_lc_2exp_size} instead).
				6053	@end deftypefun
				6054
				6055	@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
				6056	Free all memory occupied by @var{state}.
				6057	@end deftypefun
				6058
				6059
				6060	@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
				6061	@section Random State Seeding
				6062	@cindex Random number seeding
				6063	@cindex Seeding random numbers
				6064
				6065	@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, const mpz_t @var{seed})
				6066	@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
				6067	Set an initial seed value into @var{state}.
				6068
The size of a seed determines how many different sequences of random numbers
it's possible to generate. The ``quality'' of the seed is the randomness
of a given seed compared to the previous seed used, and this affects the
randomness of separate number sequences. The method for choosing a seed is
critical if the generated numbers are to be used for important applications,
such as generating cryptographic keys.
				6075
				6076	Traditionally the system time has been used to seed, but care needs to be
				6077	taken with this. If an application seeds often and the resolution of the
				6078	system clock is low, then the same sequence of numbers might be repeated.
				6079	Also, the system time is quite easy to guess, so if unpredictability is
				6080	required then it should definitely not be the only source for the seed value.
				6081	On some systems there's a special device @file{/dev/random} which provides
				6082	random data better suited for use as a seed.
				6083	@end deftypefun
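
One possible way to gather seed material from such a device is sketched
below in plain C. The helper @code{ex_read_seed_bytes} is hypothetical, and
the device path is left as a parameter since its name and availability vary
between systems.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: read exactly `len` bytes of seed material from the
   file or device at `path`.  Returns 0 on success, -1 if the source cannot
   be opened or yields fewer bytes than requested. */
int ex_read_seed_bytes(const char *path, unsigned char *buf, size_t len)
{
    FILE *fp = fopen(path, "rb");
    size_t got;

    if (fp == NULL)
        return -1;

    got = fread(buf, 1, len, fp);
    fclose(fp);

    return got == len ? 0 : -1;
}
```

The bytes obtained this way could then be converted to an @code{mpz_t} with
@code{mpz_import} and passed to @code{gmp_randseed}. Checking the return
value matters, since silently falling back to a fixed seed would defeat the
purpose.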
				6084
				6085
				6086	@node Random State Miscellaneous, , Random State Seeding, Random Number Functions
				6087	@section Random State Miscellaneous
				6088
				6089	@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
				6090	Return a uniformly distributed random number of @var{n} bits, i.e.@: in the
				6091	range 0 to @m{2^n-1,2^@var{n}-1} inclusive. @var{n} must be less than or
				6092	equal to the number of bits in an @code{unsigned long}.
				6093	@end deftypefun
				6094
				6095	@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
				6096	Return a uniformly distributed random number in the range 0 to
				6097	@math{@var{n}-1}, inclusive.
				6098	@end deftypefun
				6099
				6100
				6101	@node Formatted Output, Formatted Input, Random Number Functions, Top
				6102	@chapter Formatted Output
				6103	@cindex Formatted output
				6104	@cindex @code{printf} formatted output
				6105
				6106	@menu
				6107	* Formatted Output Strings::
				6108	* Formatted Output Functions::
				6109	* C++ Formatted Output::
				6110	@end menu
				6111
				6112	@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
				6113	@section Format Strings
				6114
				6115	@code{gmp_printf} and friends accept format strings similar to the standard C
				6116	@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C
				6117	Library Reference Manual}). A format specification is of the form
				6118
				6119	@example
				6120	% [flags] [width] [.[precision]] [type] conv
				6121	@end example
				6122
				6123	GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
				6124	and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for
				6125	an @code{mp_limb_t} array. @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave
				6126	like integers. @samp{Q} will print a @samp{/} and a denominator, if needed.
				6127	@samp{F} behaves like a float. For example,
				6128
				6129	@example
				6130	mpz_t z;
				6131	gmp_printf ("%s is an mpz %Zd\n", "here", z);
				6132
				6133	mpq_t q;
				6134	gmp_printf ("a hex rational: %#40Qx\n", q);
				6135
				6136	mpf_t f;
				6137	int n;
				6138	gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);
				6139
				6140	mp_limb_t l;
				6141	gmp_printf ("limb %Mu\n", l);
				6142
				6143	const mp_limb_t *ptr;
				6144	mp_size_t size;
				6145	gmp_printf ("limb array %Nx\n", ptr, size);
				6146	@end example
				6147
				6148	For @samp{N} the limbs are expected least significant first, as per the
				6149	@code{mpn} functions (@pxref{Low-level Functions}). A negative size can be
				6150	given to print the value as a negative.
				6151
				6152	All the standard C @code{printf} types behave the same as the C library
				6153	@code{printf}, and can be freely intermixed with the GMP extensions. In the
				6154	current implementation the standard parts of the format string are simply
				6155	handed to @code{printf} and only the GMP extensions handled directly.
				6156
				6157	The flags accepted are as follows. GLIBC style @nisamp{'} is only for the
				6158	standard C types (not the GMP types), and only if the C library supports it.
				6159
				6160	@quotation
				6161	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6162	@item @nicode{0} @tab pad with zeros (rather than spaces)
				6163	@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
				6164	@item @nicode{+} @tab always show a sign
				6165	@item (space) @tab show a space or a @samp{-} sign
				6166	@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
				6167	@end multitable
				6168	@end quotation
				6169
				6170	The optional width and precision can be given as a number within the format
				6171	string, or as a @samp{*} to take an extra parameter of type @code{int}, the
				6172	same as the standard @code{printf}.
				6173
				6174	The standard types accepted are as follows. @samp{h} and @samp{l} are
				6175	portable, the rest will depend on the compiler (or include files) for the type
				6176	and the C library for the output.
				6177
				6178	@quotation
				6179	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6180	@item @nicode{h} @tab @nicode{short}
				6181	@item @nicode{hh} @tab @nicode{char}
				6182	@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
				6183	@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
				6184	@item @nicode{ll} @tab @nicode{long long}
				6185	@item @nicode{L} @tab @nicode{long double}
				6186	@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
				6187	@item @nicode{t} @tab @nicode{ptrdiff_t}
				6188	@item @nicode{z} @tab @nicode{size_t}
				6189	@end multitable
				6190	@end quotation
				6191
				6192	@noindent
				6193	The GMP types are
				6194
				6195	@quotation
				6196	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6197	@item @nicode{F} @tab @nicode{mpf_t}, float conversions
				6198	@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
				6199	@item @nicode{M} @tab @nicode{mp_limb_t}, integer conversions
				6200	@item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions
				6201	@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
				6202	@end multitable
				6203	@end quotation
				6204
				6205	The conversions accepted are as follows. @samp{a} and @samp{A} are always
				6206	supported for @code{mpf_t} but depend on the C library for standard C float
				6207	types. @samp{m} and @samp{p} depend on the C library.
				6208
				6209	@quotation
				6210	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6211	@item @nicode{a} @nicode{A} @tab hex floats, C99 style
				6212	@item @nicode{c} @tab character
				6213	@item @nicode{d} @tab decimal integer
				6214	@item @nicode{e} @nicode{E} @tab scientific format float
				6215	@item @nicode{f} @tab fixed point float
				6216	@item @nicode{i} @tab same as @nicode{d}
				6217	@item @nicode{g} @nicode{G} @tab fixed or scientific float
				6218	@item @nicode{m} @tab @code{strerror} string, GLIBC style
				6219	@item @nicode{n} @tab store characters written so far
				6220	@item @nicode{o} @tab octal integer
				6221	@item @nicode{p} @tab pointer
				6222	@item @nicode{s} @tab string
				6223	@item @nicode{u} @tab unsigned integer
				6224	@item @nicode{x} @nicode{X} @tab hex integer
				6225	@end multitable
				6226	@end quotation
				6227
				6228	@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
				6229	types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not
				6230	meaningful for @samp{Z}, @samp{Q} and @samp{N}.
				6231
				6232	@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the
				6233	size of @code{mp_limb_t}. Unsigned conversions will be usual, but a signed
				6234	conversion can be used and will interpret the value as a twos complement
				6235	negative.
				6236
				6237	@samp{n} can be used with any type, even the GMP types.
				6238
				6239	Other types or conversions that might be accepted by the C library
@code{printf} cannot be used through @code{gmp_printf}; this includes, for
instance, extensions registered with GLIBC @code{register_printf_function}.
Also, there is currently no support for POSIX @samp{$} style numbered
arguments (perhaps this will be added in the future).
				6244
				6245	The precision field has its usual meaning for integer @samp{Z} and float
				6246	@samp{F} types, but is currently undefined for @samp{Q} and should not be used
				6247	with that.
				6248
				6249	@code{mpf_t} conversions only ever generate as many digits as can be
				6250	accurately represented by the operand, the same as @code{mpf_get_str} does.
				6251	Zeros will be used if necessary to pad to the requested precision. This
				6252	happens even for an @samp{f} conversion of an @code{mpf_t} which is an
				6253	integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
				6254	precision will only produce about 40 digits, then pad with zeros to the
				6255	decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
				6256	be used to specifically request just the significant digits. Without any dot
				6257	and thus no precision field, a precision value of 6 will be used. Note that
				6258	these rules mean that @samp{%Ff}, @samp{%.Ff}, and @samp{%.0Ff} will all be
				6259	different.
				6260
				6261	The decimal point character (or string) is taken from the current locale
				6262	settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales
				6263	and Internationalization, libc, The GNU C Library Reference Manual}). The C
				6264	library will normally do the same for standard float output.
				6265
The format string is only interpreted as plain @code{char}s; multibyte
characters are not recognised. Perhaps this will change in the future.
				6268
				6269
				6270	@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
				6271	@section Functions
				6272	@cindex Output functions
				6273
				6274	Each of the following functions is similar to the corresponding C library
				6275	function. The basic @code{printf} forms take a variable argument list. The
				6276	@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,,
				6277	Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
				6278	va_start}.
				6279
				6280	It should be emphasised that if a format string is invalid, or the arguments
				6281	don't match what the format specifies, then the behaviour of any of these
				6282	functions will be unpredictable. GCC format string checking is not available,
				6283	since it doesn't recognise the GMP extensions.
				6284
				6285	The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
				6286	@math{-1} to indicate a write error. Output is not ``atomic'', so partial
				6287	output may be produced if a write error occurs. All the functions can return
				6288	@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
				6289	this shouldn't normally occur.
				6290
				6291	@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
				6292	@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
				6293	Print to the standard output @code{stdout}. Return the number of characters
				6294	written, or @math{-1} if an error occurred.
				6295	@end deftypefun
				6296
@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
				6299	Print to the stream @var{fp}. Return the number of characters written, or
				6300	@math{-1} if an error occurred.
				6301	@end deftypefun
				6302
@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
				6305	Form a null-terminated string in @var{buf}. Return the number of characters
				6306	written, excluding the terminating null.
				6307
				6308	No overlap is permitted between the space at @var{buf} and the string
				6309	@var{fmt}.
				6310
				6311	These functions are not recommended, since there's no protection against
				6312	exceeding the space available at @var{buf}.
				6313	@end deftypefun
				6314
@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
				6317	Form a null-terminated string in @var{buf}. No more than @var{size} bytes
				6318	will be written. To get the full output, @var{size} must be enough for the
				6319	string and null-terminator.
				6320
				6321	The return value is the total number of characters which ought to have been
				6322	produced, excluding the terminating null. If @math{@var{retval} @ge{}
				6323	@var{size}} then the actual output has been truncated to the first
				6324	@math{@var{size}-1} characters, and a null appended.
				6325
				6326	No overlap is permitted between the region @{@var{buf},@var{size}@} and the
				6327	@var{fmt} string.
				6328
				6329	Notice the return value is in ISO C99 @code{snprintf} style. This is so even
				6330	if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
				6331	@end deftypefun
				6332
@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}). The block will be the
size of the string and null-terminator. The address of the block is stored to
*@var{pp}. The return value is the number of characters produced, excluding
the null-terminator.
				6340
				6341	Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
				6342	@math{-1} if there's no more memory available, it lets the current allocation
				6343	function handle that.
				6344	@end deftypefun
				6345
@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
				6348	@cindex @code{obstack} output
				6349	Append to the current object in @var{ob}. The return value is the number of
				6350	characters written. A null-terminator is not written.
				6351
				6352	@var{fmt} cannot be within the current object in @var{ob}, since that object
				6353	might move as it grows.
				6354
				6355	These functions are available only when the C library provides the obstack
				6356	feature, which probably means only on GNU systems, see @ref{Obstacks,,
				6357	Obstacks, libc, The GNU C Library Reference Manual}.
				6358	@end deftypefun
				6359
				6360
				6361	@node C++ Formatted Output, , Formatted Output Functions, Formatted Output
				6362	@section C++ Formatted Output
				6363	@cindex C++ @code{ostream} output
				6364	@cindex @code{ostream} output
				6365
				6366	The following functions are provided in @file{libgmpxx} (@pxref{Headers and
				6367	Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
				6368	Prototypes are available from @code{<gmp.h>}.
				6369
				6370	@deftypefun ostream& operator<< (ostream& @var{stream}, const mpz_t @var{op})
				6371	Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
				6372	@code{ios::width} is reset to 0 after output, the same as the standard
				6373	@code{ostream operator<<} routines do.
				6374
				6375	In hex or octal, @var{op} is printed as a signed number, the same as for
				6376	decimal. This is unlike the standard @code{operator<<} routines on @code{int}
				6377	etc, which instead give twos complement.
				6378	@end deftypefun
				6379
				6380	@deftypefun ostream& operator<< (ostream& @var{stream}, const mpq_t @var{op})
				6381	Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
				6382	@code{ios::width} is reset to 0 after output, the same as the standard
				6383	@code{ostream operator<<} routines do.
				6384
				6385	Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
				6386	just a plain integer like @samp{123}.
				6387
				6388	In hex or octal, @var{op} is printed as a signed value, the same as for
				6389	decimal. If @code{ios::showbase} is set then a base indicator is shown on
				6390	both the numerator and denominator (if the denominator is required).
				6391	@end deftypefun
				6392
				6393	@deftypefun ostream& operator<< (ostream& @var{stream}, const mpf_t @var{op})
				6394	Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
				6395	@code{ios::width} is reset to 0 after output, the same as the standard
				6396	@code{ostream operator<<} routines do.
				6397
				6398	The decimal point follows the standard library float @code{operator<<}, which
				6399	on recent systems means the @code{std::locale} imbued on @var{stream}.
				6400
				6401	Hex and octal are supported, unlike the standard @code{operator<<} on
				6402	@code{double}. The mantissa will be in hex or octal, the exponent will be in
				6403	decimal. For hex the exponent delimiter is an @samp{@@}. This is as per
				6404	@code{mpf_out_str}.
				6405
				6406	@code{ios::showbase} is supported, and will put a base on the mantissa, for
				6407	example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
				6408	This last form is slightly strange, but at least differentiates itself from
				6409	decimal.
				6410	@end deftypefun
				6411
				6412	These operators mean that GMP types can be printed in the usual C++ way, for
				6413	example,
				6414
				6415	@example
				6416	mpz_t z;
				6417	int n;
				6418	...
				6419	cout << "iteration " << n << " value " << z << "\n";
				6420	@end example
				6421
				6422	But note that @code{ostream} output (and @code{istream} input, @pxref{C++
				6423	Formatted Input}) is the only overloading available for the GMP types and that
				6424	for instance using @code{+} with an @code{mpz_t} will have unpredictable
				6425	results. For classes with overloading, see @ref{C++ Class Interface}.
				6426
				6427
				6428	@node Formatted Input, C++ Class Interface, Formatted Output, Top
				6429	@chapter Formatted Input
				6430	@cindex Formatted input
				6431	@cindex @code{scanf} formatted input
				6432
				6433	@menu
				6434	* Formatted Input Strings::
				6435	* Formatted Input Functions::
				6436	* C++ Formatted Input::
				6437	@end menu
				6438
				6439
				6440	@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
				6441	@section Formatted Input Strings
				6442
				6443	@code{gmp_scanf} and friends accept format strings similar to the standard C
				6444	@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
				6445	Library Reference Manual}). A format specification is of the form
				6446
				6447	@example
				6448	% [flags] [width] [type] conv
				6449	@end example
				6450
				6451	GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
				6452	and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
				6453	@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves
				6454	like a float.
				6455
				6456	GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
				6457	they're already ``call-by-reference''. For example,
				6458
				6459	@example
				6460	/* to read say "a(5) = 1234" */
				6461	int n;
				6462	mpz_t z;
				6463	gmp_scanf ("a(%d) = %Zd\n", &n, z);
				6464
				6465	mpq_t q1, q2;
				6466	gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);
				6467
				6468	/* to read say "topleft (1.55,-2.66)" */
				6469	mpf_t x, y;
				6470	char buf[32];
				6471	gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
				6472	@end example
				6473
				6474	All the standard C @code{scanf} types behave the same as in the C library
				6475	@code{scanf}, and can be freely intermixed with the GMP extensions. In the
				6476	current implementation the standard parts of the format string are simply
				6477	handed to @code{scanf} and only the GMP extensions handled directly.
				6478
				6479	The flags accepted are as follows. @samp{a} and @samp{'} will depend on
				6480	support from the C library, and @samp{'} cannot be used with GMP types.
				6481
				6482	@quotation
				6483	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6484	@item @nicode{*} @tab read but don't store
				6485	@item @nicode{a} @tab allocate a buffer (string conversions)
				6486	@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
				6487	@end multitable
				6488	@end quotation
				6489
				6490	The standard types accepted are as follows. @samp{h} and @samp{l} are
				6491	portable, the rest will depend on the compiler (or include files) for the type
				6492	and the C library for the input.
				6493
				6494	@quotation
				6495	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6496	@item @nicode{h} @tab @nicode{short}
				6497	@item @nicode{hh} @tab @nicode{char}
				6498	@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
				6499	@item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
				6500	@item @nicode{ll} @tab @nicode{long long}
				6501	@item @nicode{L} @tab @nicode{long double}
				6502	@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
				6503	@item @nicode{t} @tab @nicode{ptrdiff_t}
				6504	@item @nicode{z} @tab @nicode{size_t}
				6505	@end multitable
				6506	@end quotation
				6507
				6508	@noindent
				6509	The GMP types are
				6510
				6511	@quotation
				6512	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6513	@item @nicode{F} @tab @nicode{mpf_t}, float conversions
				6514	@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
				6515	@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
				6516	@end multitable
				6517	@end quotation
				6518
				6519	The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
				6520	support from the C library, the rest are standard.
				6521
				6522	@quotation
				6523	@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				6524	@item @nicode{c} @tab character or characters
				6525	@item @nicode{d} @tab decimal integer
				6526	@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
				6527	@tab float
				6528	@item @nicode{i} @tab integer with base indicator
				6529	@item @nicode{n} @tab characters read so far
				6530	@item @nicode{o} @tab octal integer
				6531	@item @nicode{p} @tab pointer
				6532	@item @nicode{s} @tab string of non-whitespace characters
				6533	@item @nicode{u} @tab decimal integer
				6534	@item @nicode{x} @nicode{X} @tab hex integer
				6535	@item @nicode{[} @tab string of characters in a set
				6536	@end multitable
				6537	@end quotation
				6538
				6539	@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical; they all
				6540	read either fixed point or scientific format, and either upper or lower case
				6541	@samp{e} for the exponent in scientific format.
				6542
				6543	C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
				6544	Strings}) is always accepted for @code{mpf_t}, but for the standard float
				6545	types it will depend on the C library.
				6546
				6547	@samp{x} and @samp{X} are identical, both accept both upper and lower case
				6548	hexadecimal.
				6549
				6550	@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
				6551	values. For the standard C types these are described as ``unsigned''
				6552	conversions, but that merely affects certain overflow handling, negatives are
				6553	still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
				6554	Integers, libc, The GNU C Library Reference Manual}). For GMP types there are
				6555	no overflows, so @samp{d} and @samp{u} are identical.
				6556
				6557	@samp{Q} type reads the numerator and (optional) denominator as given. If the
				6558	value might not be in canonical form then @code{mpq_canonicalize} must be
				6559	called before using it in any calculations (@pxref{Rational Number
				6560	Functions}).
				6561
				6562	@samp{Qi} will read a base specification separately for the numerator and
				6563	denominator. For example @samp{0x10/11} would be 16/11, whereas
				6564	@samp{0x10/0x11} would be 16/17.
				6565
				6566	@samp{n} can be used with any of the types above, even the GMP types.
				6567	@samp{*} to suppress assignment is allowed, though in that case it would do
				6568	nothing at all.
				6569
				6570	Other conversions or types that might be accepted by the C library
				6571	@code{scanf} cannot be used through @code{gmp_scanf}.
				6572
				6573	Whitespace is read and discarded before a field, except for @samp{c} and
				6574	@samp{[} conversions.
				6575
				6576	For float conversions, the decimal point character (or string) expected is
				6577	taken from the current locale settings on systems which provide
				6578	@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
				6579	The GNU C Library Reference Manual}). The C library will normally do the same
				6580	for standard float input.
				6581
				6582	The format string is only interpreted as plain @code{char}s; multibyte
				6583	characters are not recognised. Perhaps this will change in the future.
				6584
				6585
				6586	@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
				6587	@section Formatted Input Functions
				6588	@cindex Input functions
				6589
				6590	Each of the following functions is similar to the corresponding C library
				6591	function. The plain @code{scanf} forms take a variable argument list. The
				6592	@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
				6593	Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
				6594	va_start}.
				6595
				6596	It should be emphasised that if a format string is invalid, or the arguments
				6597	don't match what the format specifies, then the behaviour of any of these
				6598	functions will be unpredictable. GCC format string checking is not available,
				6599	since it doesn't recognise the GMP extensions.
				6600
				6601	No overlap is permitted between the @var{fmt} string and any of the results
				6602	produced.
				6603
				6604	@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
				6605	@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
				6606	Read from the standard input @code{stdin}.
				6607	@end deftypefun
				6608
				6609	@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
				6610	@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
				6611	Read from the stream @var{fp}.
				6612	@end deftypefun
				6613
				6614	@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
				6615	@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
				6616	Read from a null-terminated string @var{s}.
				6617	@end deftypefun
				6618
				6619	The return value from each of these functions is the same as the standard C99
				6620	@code{scanf}, namely the number of fields successfully parsed and stored.
				6621	@samp{%n} fields and fields read but suppressed by @samp{*} don't count
				6622	towards the return value.
				6623
				6624	If end of input (or a file error) is reached before a character for a field or
				6625	a literal, and if no previous non-suppressed fields have matched, then the
				6626	return value is @code{EOF} instead of 0. A whitespace character in the format
				6627	string is only an optional match and doesn't induce an @code{EOF} in this
				6628	fashion. Leading whitespace read and discarded for a field doesn't count as
				6629	characters for that field.
				6630
				6631	For the GMP types, input parsing follows C99 rules, namely one character of
				6632	lookahead is used and characters are read while they continue to meet the
				6633	format requirements. If this doesn't provide a complete number then the
				6634	function terminates, with that field not stored nor counted towards the return
				6635	value. For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
				6636	up to the @samp{X} and that character pushed back since it's not a digit. The
				6637	string @samp{1.23e-} would then be considered invalid since an @samp{e} must
				6638	be followed by at least one digit.
				6639
				6640	For the standard C types, in the current implementation GMP calls the C
				6641	library @code{scanf} functions, which might have looser rules about what
				6642	constitutes a valid input.
				6643
				6644	Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
				6645	character of lookahead when parsing. Although clearly it could look at its
				6646	entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
				6647	way C99 @code{sscanf} is the same as @code{fscanf}.
				6648
				6649
				6650	@node C++ Formatted Input, , Formatted Input Functions, Formatted Input
				6651	@section C++ Formatted Input
				6652	@cindex C++ @code{istream} input
				6653	@cindex @code{istream} input
				6654
				6655	The following functions are provided in @file{libgmpxx} (@pxref{Headers and
				6656	Libraries}), which is built only if C++ support is enabled (@pxref{Build
				6657	Options}). Prototypes are available from @code{<gmp.h>}.
				6658
				6659	@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
				6660	Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
				6661	@end deftypefun
				6662
				6663	@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
				6664	An integer like @samp{123} will be read, or a fraction like @samp{5/9}. No
				6665	whitespace is allowed around the @samp{/}. If the fraction is not in
				6666	canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
				6667	Number Functions}) before operating on it.
				6668
				6669	As per integer input, a @samp{0} or @samp{0x} base indicator is read when
				6670	none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set. This is
				6671	done separately for numerator and denominator, so that for instance
				6672	@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
				6673	@end deftypefun
				6674
				6675	@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
				6676	Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
				6677
				6678	Hex or octal floats are not supported, but might be in the future, or perhaps
				6679	it's best to accept only what the standard float @code{operator>>} does.
				6680	@end deftypefun
				6681
				6682	Note that digit grouping specified by the @code{istream} locale is currently
				6683	not accepted. Perhaps this will change in the future.
				6684
				6685	@sp 1
				6686	These operators mean that GMP types can be read in the usual C++ way, for
				6687	example,
				6688
				6689	@example
				6690	mpz_t z;
				6691	...
				6692	cin >> z;
				6693	@end example
				6694
				6695	But note that @code{istream} input (and @code{ostream} output, @pxref{C++
				6696	Formatted Output}) is the only overloading available for the GMP types and
				6697	that for instance using @code{+} with an @code{mpz_t} will have unpredictable
				6698	results. For classes with overloading, see @ref{C++ Class Interface}.
				6699
				6700
				6701
				6702	@node C++ Class Interface, Custom Allocation, Formatted Input, Top
				6703	@chapter C++ Class Interface
				6704	@cindex C++ interface
				6705
				6706	This chapter describes the C++ class based interface to GMP.
				6707
				6708	All GMP C language types and functions can be used in C++ programs, since
				6709	@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
				6710	overloaded functions and operators which may be more convenient.
				6711
				6712	Due to the implementation of this interface, a reasonably recent C++ compiler
				6713	is required, one supporting namespaces, partial specialization of templates
				6714	and member templates.
				6715
				6716	@strong{Everything described in this chapter is to be considered preliminary
				6717	and might be subject to incompatible changes if some unforeseen difficulty
				6718	reveals itself.}
				6719
				6720	@menu
				6721	* C++ Interface General::
				6722	* C++ Interface Integers::
				6723	* C++ Interface Rationals::
				6724	* C++ Interface Floats::
				6725	* C++ Interface Random Numbers::
				6726	* C++ Interface Limitations::
				6727	@end menu
				6728
				6729
				6730	@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
				6731	@section C++ Interface General
				6732
				6733	@noindent
				6734	All the C++ classes and functions are available with
				6735
				6736	@cindex @code{gmpxx.h}
				6737	@example
				6738	#include <gmpxx.h>
				6739	@end example
				6740
				6741	Programs should be linked with the @file{libgmpxx} and @file{libgmp}
				6742	libraries. For example,
				6743
				6744	@example
				6745	g++ mycxxprog.cc -lgmpxx -lgmp
				6746	@end example
				6747
				6748	@noindent
				6749	The classes defined are
				6750
				6751	@deftp Class mpz_class
				6752	@deftpx Class mpq_class
				6753	@deftpx Class mpf_class
				6754	@end deftp
				6755
				6756	The standard operators and various standard functions are overloaded to allow
				6757	arithmetic with these classes. For example,
				6758
				6759	@example
				6760	int
				6761	main (void)
				6762	@{
				6763	  mpz_class a, b, c;
				6764
				6765	  a = 1234;
				6766	  b = "-5678";
				6767	  c = a+b;
				6768	  cout << "sum is " << c << "\n";
				6769	  cout << "absolute value is " << abs(c) << "\n";
				6770
				6771	  return 0;
				6772	@}
				6773	@end example
				6774
				6775	An important feature of the implementation is that an expression like
				6776	@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
				6777	without using a temporary for the @code{b+c} part. Expressions which by their
				6778	nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
				6779	though.
				6780
				6781	The classes can be freely intermixed in expressions, as can the classes and
				6782	the standard types @code{long}, @code{unsigned long} and @code{double}.
				6783	Smaller types like @code{int} or @code{float} can also be intermixed, since
				6784	C++ will promote them.
				6785
				6786	Note that @code{bool} is not accepted directly, but must be explicitly cast to
				6787	an @code{int} first. This is because C++ will automatically convert any
				6788	pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
				6789	sorts of invalid class and pointer combinations compile but almost certainly
				6790	not do anything sensible.
				6791
				6792	Conversions back from the classes to standard C++ types aren't done
				6793	automatically, instead member functions like @code{get_si} are provided (see
				6794	the following sections for details).
				6795
				6796	Also there are no automatic conversions from the classes to the corresponding
				6797	GMP C types, instead a reference to the underlying C object can be obtained
				6798	with the following functions,
				6799
				6800	@deftypefun mpz_t mpz_class::get_mpz_t ()
				6801	@deftypefunx mpq_t mpq_class::get_mpq_t ()
				6802	@deftypefunx mpf_t mpf_class::get_mpf_t ()
				6803	@end deftypefun
				6804
				6805	These can be used to call a C function which doesn't have a C++ class
				6806	interface. For example to set @code{a} to the GCD of @code{b} and @code{c},
				6807
				6808	@example
				6809	mpz_class a, b, c;
				6810	...
				6811	mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
				6812	@end example
				6813
				6814	In the other direction, a class can be initialized from the corresponding GMP
				6815	C type, or assigned to if an explicit constructor is used. In both cases this
				6816	makes a copy of the value, it doesn't create any sort of association. For
				6817	example,
				6818
				6819	@example
				6820	mpz_t z;
				6821	// ... init and calculate z ...
				6822	mpz_class x(z);
				6823	mpz_class y;
				6824	y = mpz_class (z);
				6825	@end example
				6826
				6827	There are no namespace setups in @file{gmpxx.h}, all types and functions are
				6828	simply put into the global namespace. This is what @file{gmp.h} has done in
				6829	the past, and continues to do for compatibility. The extras provided by
				6830	@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
				6831	anything.
				6832
				6833
				6834	@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
				6835	@section C++ Interface Integers
				6836
				6837	@deftypefun {} mpz_class::mpz_class (type @var{n})
				6838	Construct an @code{mpz_class}. All the standard C++ types may be used, except
				6839	@code{long long} and @code{long double}, and all the GMP C++ classes can be
				6840	used, although conversions from @code{mpq_class} and @code{mpf_class} are
				6841	@code{explicit}. Any necessary conversion follows the corresponding C
				6842	function, for example @code{double} follows @code{mpz_set_d}
				6843	(@pxref{Assigning Integers}).
				6844	@end deftypefun
				6845
				6846	@deftypefun explicit mpz_class::mpz_class (const mpz_t @var{z})
				6847	Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is
				6848	copied into the new @code{mpz_class}, there won't be any permanent association
				6849	between it and @var{z}.
				6850	@end deftypefun
				6851
				6852	@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
				6853	@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
				6854	Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
				6855	(@pxref{Assigning Integers}).
				6856
				6857	If the string is not a valid integer, an @code{std::invalid_argument}
				6858	exception is thrown. The same applies to @code{operator=}.
				6859	@end deftypefun
				6860
				6861	@deftypefun mpz_class operator"" _mpz (const char *@var{str})
				6862	With C++11 compilers, integers can be constructed with the syntax
				6863	@code{123_mpz} which is equivalent to @code{mpz_class("123")}.
				6864	@end deftypefun
				6865
				6866	@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
				6867	@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
				6868	Divisions involving @code{mpz_class} round towards zero, as per the
				6869	@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
				6870	This is the same as the C99 @code{/} and @code{%} operators.
				6871
				6872	The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
				6873	directly if desired. For example,
				6874
				6875	@example
				6876	mpz_class q, a, d;
				6877	...
				6878	mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
				6879	@end example
				6880	@end deftypefun
				6881
				6882	@deftypefun mpz_class abs (mpz_class @var{op})
				6883	@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
				6884	@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
				6885	@maybepagebreak
				6886	@deftypefunx bool mpz_class::fits_sint_p (void)
				6887	@deftypefunx bool mpz_class::fits_slong_p (void)
				6888	@deftypefunx bool mpz_class::fits_sshort_p (void)
				6889	@maybepagebreak
				6890	@deftypefunx bool mpz_class::fits_uint_p (void)
				6891	@deftypefunx bool mpz_class::fits_ulong_p (void)
				6892	@deftypefunx bool mpz_class::fits_ushort_p (void)
				6893	@maybepagebreak
				6894	@deftypefunx double mpz_class::get_d (void)
				6895	@deftypefunx long mpz_class::get_si (void)
				6896	@deftypefunx string mpz_class::get_str (int @var{base} = 10)
				6897	@deftypefunx {unsigned long} mpz_class::get_ui (void)
				6898	@maybepagebreak
				6899	@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
				6900	@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
				6901	@deftypefunx int sgn (mpz_class @var{op})
				6902	@deftypefunx mpz_class sqrt (mpz_class @var{op})
				6903	@maybepagebreak
				6904	@deftypefunx mpz_class gcd (mpz_class @var{op1}, mpz_class @var{op2})
				6905	@deftypefunx mpz_class lcm (mpz_class @var{op1}, mpz_class @var{op2})
				6906	@deftypefunx mpz_class mpz_class::factorial (type @var{op})
				6907	@deftypefunx mpz_class factorial (mpz_class @var{op})
				6908	@deftypefunx mpz_class mpz_class::primorial (type @var{op})
				6909	@deftypefunx mpz_class primorial (mpz_class @var{op})
				6910	@deftypefunx mpz_class mpz_class::fibonacci (type @var{op})
				6911	@deftypefunx mpz_class fibonacci (mpz_class @var{op})
				6912	@maybepagebreak
				6913	@deftypefunx void mpz_class::swap (mpz_class& @var{op})
				6914	@deftypefunx void swap (mpz_class& @var{op1}, mpz_class& @var{op2})
				6915	These functions provide a C++ class interface to the corresponding GMP C
				6916	routines. Calling @code{factorial} or @code{primorial} on a negative number
				6917	is undefined.
				6918
				6919	@code{cmp} can be used with any of the classes or the standard C++ types,
				6920	except @code{long long} and @code{long double}.
				6921	@end deftypefun
				6922
				6923	@sp 1
				6924	Overloaded operators for combinations of @code{mpz_class} and @code{double}
				6925	are provided for completeness, but it should be noted that if the given
				6926	@code{double} is not an integer then the way any rounding is done is currently
				6927	unspecified. The rounding might take place at the start, in the middle, or at
				6928	the end of the operation, and it might change in the future.
				6929
				6930	Conversions between @code{mpz_class} and @code{double}, however, are defined
				6931	to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
				6932	And comparisons are always made exactly, as per @code{mpz_cmp_d}.
				6933
				6934
				6935	@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
				6936	@section C++ Interface Rationals
				6937
				6938	In all the following constructors, if a fraction is given then it should be in
				6939	canonical form, or if not then @code{mpq_class::canonicalize} called.
				6940
				6941	@deftypefun {} mpq_class::mpq_class (type @var{op})
				6942	@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
				6943	Construct an @code{mpq_class}. The initial value can be a single value of any
				6944	type (conversion from @code{mpf_class} is @code{explicit}), or a pair of
				6945	integers (@code{mpz_class} or standard C++ integer types) representing a
				6946	fraction, except that @code{long long} and @code{long double} are not
				6947	supported. For example,
				6948
				6949	@example
				6950	mpq_class q (99);
				6951	mpq_class q (1.75);
				6952	mpq_class q (1, 3);
				6953	@end example
				6954	@end deftypefun
				6955
				6956	@deftypefun explicit mpq_class::mpq_class (const mpq_t @var{q})
				6957	Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is
				6958	copied into the new @code{mpq_class}, there won't be any permanent association
				6959	between it and @var{q}.
				6960	@end deftypefun
				6961
				6962	@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
				6963	@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
				6964	Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
				6965	(@pxref{Initializing Rationals}).
				6966
				6967	If the string is not a valid rational, an @code{std::invalid_argument}
				6968	exception is thrown. The same applies to @code{operator=}.
				6969	@end deftypefun
				6970
				6971	@deftypefun mpq_class operator"" _mpq (const char *@var{str})
				6972	With C++11 compilers, integral rationals can be constructed with the syntax
				6973	@code{123_mpq} which is equivalent to @code{mpq_class(123_mpz)}. Other
				6974	rationals can be built as @code{-1_mpq/2} or @code{0xb_mpq/123456_mpz}.
				6975	@end deftypefun
				6976
				6977	@deftypefun void mpq_class::canonicalize ()
				6978	Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
				6979	Functions}. All arithmetic operators require their operands in canonical
				6980	form, and will return results in canonical form.
				6981	@end deftypefun
				6982
				6983	@deftypefun mpq_class abs (mpq_class @var{op})
				6984	@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
				6985	@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
				6986	@maybepagebreak
				6987	@deftypefunx double mpq_class::get_d (void)
				6988	@deftypefunx string mpq_class::get_str (int @var{base} = 10)
				6989	@maybepagebreak
				6990	@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
				6991	@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
				6992	@deftypefunx int sgn (mpq_class @var{op})
				6993	@maybepagebreak
				6994	@deftypefunx void mpq_class::swap (mpq_class& @var{op})
				6995	@deftypefunx void swap (mpq_class& @var{op1}, mpq_class& @var{op2})
				6996	These functions provide a C++ class interface to the corresponding GMP C
				6997	routines.
				6998
				6999	@code{cmp} can be used with any of the classes or the standard C++ types,
				7000	except @code{long long} and @code{long double}.
				7001	@end deftypefun
				7002
				7003	@deftypefun {mpz_class&} mpq_class::get_num ()
				7004	@deftypefunx {mpz_class&} mpq_class::get_den ()
				7005	Get a reference to an @code{mpz_class} which is the numerator or denominator
				7006	of an @code{mpq_class}. This can be used both for read and write access. If
				7007	the object returned is modified, it modifies the original @code{mpq_class}.
				7008
				7009	If direct manipulation might produce a non-canonical value, then
				7010	@code{mpq_class::canonicalize} must be called before further operations.
				7011	@end deftypefun
				7012
				7013	@deftypefun mpz_t mpq_class::get_num_mpz_t ()
				7014	@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
				7015	Get a reference to the underlying @code{mpz_t} numerator or denominator of an
				7016	@code{mpq_class}. This can be passed to C functions expecting an
				7017	@code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the
				7018	original @code{mpq_class}.
				7019
				7020	If direct manipulation might produce a non-canonical value, then
				7021	@code{mpq_class::canonicalize} must be called before further operations.
				7022	@end deftypefun
				7023
				7024	@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
				7025	Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
				7026	the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).
				7027
				7028	If the @var{rop} read might not be in canonical form then
				7029	@code{mpq_class::canonicalize} must be called.
				7030	@end deftypefun
				7031
				7032
				7033	@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
				7034	@section C++ Interface Floats
				7035
				7036	When an expression requires the use of temporary intermediate @code{mpf_class}
				7037	values, like @code{f=g*h+x*y}, those temporaries will have the same precision
				7038	as the destination @code{f}. Explicit constructors can be used if this
				7039	doesn't suit.
				7040
				7041	@deftypefun {} mpf_class::mpf_class (type @var{op})
				7042	@deftypefunx {} mpf_class::mpf_class (type @var{op}, mp_bitcnt_t @var{prec})
				7043	Construct an @code{mpf_class}. Any standard C++ type can be used, except
				7044	@code{long long} and @code{long double}, and any of the GMP C++ classes can be
				7045	used.
				7046
				7047	If @var{prec} is given, the initial precision is that value, in bits. If
				7048	@var{prec} is not given, then the initial precision is determined by the type
				7049	of @var{op} given. An @code{mpz_class}, @code{mpq_class}, or C++
				7050	builtin type will give the default @code{mpf} precision (@pxref{Initializing
				7051	Floats}). An @code{mpf_class} or expression will give the precision of that
				7052	value. The precision of a binary expression is the higher of the two
				7053	operands.
				7054
				7055	@example
				7056	mpf_class f(1.5); // default precision
				7057	mpf_class f(1.5, 500); // 500 bits (at least)
				7058	mpf_class f(x); // precision of x
				7059	mpf_class f(abs(x)); // precision of x
				7060	mpf_class f(-g, 1000); // 1000 bits (at least)
				7061	mpf_class f(x+y); // greater of precisions of x and y
				7062	@end example
				7063	@end deftypefun
				7064
				7065	@deftypefun explicit mpf_class::mpf_class (const mpf_t @var{f})
				7066	@deftypefunx {} mpf_class::mpf_class (const mpf_t @var{f}, mp_bitcnt_t @var{prec})
				7067	Construct an @code{mpf_class} from an @code{mpf_t}. The value in @var{f} is
				7068	copied into the new @code{mpf_class}, there won't be any permanent association
				7069	between it and @var{f}.
				7070
				7071	If @var{prec} is given, the initial precision is that value, in bits. If
				7072	@var{prec} is not given, then the initial precision is that of @var{f}.
				7073	@end deftypefun
				7074
				7075	@deftypefun explicit mpf_class::mpf_class (const char *@var{s})
				7076	@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
				7077	@deftypefunx explicit mpf_class::mpf_class (const string& @var{s})
				7078	@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
				7079	Construct an @code{mpf_class} converted from a string using @code{mpf_set_str}
				7080	(@pxref{Assigning Floats}). If @var{prec} is given, the initial precision is
				7081	that value, in bits. If not, the default @code{mpf} precision
				7082	(@pxref{Initializing Floats}) is used.
				7083
				7084	If the string is not a valid float, an @code{std::invalid_argument} exception
				7085	is thrown. The same applies to @code{operator=}.
				7086	@end deftypefun
				7087
				7088	@deftypefun mpf_class operator"" _mpf (const char *@var{str})
				7089	With C++11 compilers, floats can be constructed with the syntax
				7090	@code{1.23e-1_mpf} which is equivalent to @code{mpf_class("1.23e-1")}.
				7091	@end deftypefun
				7092
				7093	@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
				7094	Convert and store the given @var{op} value to an @code{mpf_class} object. The
				7095	same types are accepted as for the constructors above.
				7096
				7097	Note that @code{operator=} only stores a new value; it doesn't copy or change
				7098	the precision of the destination. Instead, the value is truncated if necessary.
				7099	This is the same as @code{mpf_set} etc. In particular, this means that for
				7100	@code{mpf_class} a copy constructor is not the same as a default constructor
				7101	plus assignment.
				7102
				7103	@example
				7104	mpf_class x (y); // x created with precision of y
				7105
				7106	mpf_class x; // x created with default precision
				7107	x = y; // value truncated to that precision
				7108	@end example
				7109
				7110	Applications using templated code may need to be careful about the assumptions
				7111	the code makes in this area, when working with @code{mpf_class} values of
				7112	various different or non-default precisions. For instance implementations of
				7113	the standard @code{complex} template have been seen in both styles above,
				7114	though of course @code{complex} is normally only actually specified for use
				7115	with the builtin float types.
				7116	@end deftypefun
				7117
				7118	@deftypefun mpf_class abs (mpf_class @var{op})
				7119	@deftypefunx mpf_class ceil (mpf_class @var{op})
				7120	@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
				7121	@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
				7122	@maybepagebreak
				7123	@deftypefunx bool mpf_class::fits_sint_p (void)
				7124	@deftypefunx bool mpf_class::fits_slong_p (void)
				7125	@deftypefunx bool mpf_class::fits_sshort_p (void)
				7126	@maybepagebreak
				7127	@deftypefunx bool mpf_class::fits_uint_p (void)
				7128	@deftypefunx bool mpf_class::fits_ulong_p (void)
				7129	@deftypefunx bool mpf_class::fits_ushort_p (void)
				7130	@maybepagebreak
				7131	@deftypefunx mpf_class floor (mpf_class @var{op})
				7132	@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
				7133	@maybepagebreak
				7134	@deftypefunx double mpf_class::get_d (void)
				7135	@deftypefunx long mpf_class::get_si (void)
				7136	@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0)
				7137	@deftypefunx {unsigned long} mpf_class::get_ui (void)
				7138	@maybepagebreak
				7139	@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base})
				7140	@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base})
				7141	@deftypefunx int sgn (mpf_class @var{op})
				7142	@deftypefunx mpf_class sqrt (mpf_class @var{op})
				7143	@maybepagebreak
				7144	@deftypefunx void mpf_class::swap (mpf_class& @var{op})
				7145	@deftypefunx void swap (mpf_class& @var{op1}, mpf_class& @var{op2})
				7146	@deftypefunx mpf_class trunc (mpf_class @var{op})
				7147	These functions provide a C++ class interface to the corresponding GMP C
				7148	routines.
				7149
				7150	@code{cmp} can be used with any of the classes or the standard C++ types,
				7151	except @code{long long} and @code{long double}.
				7152
				7153	The accuracy provided by @code{hypot} is not currently guaranteed.
				7154	@end deftypefun
				7155
				7156	@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
				7157	@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
				7158	@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
				7159	Get or set the current precision of an @code{mpf_class}.
				7160
				7161	The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
				7162	Floats}) apply to @code{mpf_class::set_prec_raw}. Note in particular that the
				7163	@code{mpf_class} must be restored to its allocated precision before being
				7164	destroyed. This must be done by application code; there's no automatic
				7165	mechanism for it.
				7166	@end deftypefun
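
The restore requirement for @code{set_prec_raw} can be met with a small scope
guard. The following is a minimal sketch, not part of @code{gmpxx.h}:
@code{RawPrecisionGuard} and @code{FakeFloat} are hypothetical names, and the
guard is a template so that the sketch compiles without GMP. It assumes only
the @code{get_prec} and @code{set_prec_raw} members documented here, and that
the object is at its allocated precision when the guard is created.

```cpp
#include <utility>

// Hypothetical scope guard: lowers the precision with set_prec_raw and
// puts back the previously reported precision when the scope ends.
template <class Float>
class RawPrecisionGuard {
public:
    using prec_type = decltype(std::declval<const Float&>().get_prec());

    RawPrecisionGuard(Float& f, prec_type reduced)
        : f_(f), allocated_(f.get_prec())  // remember the allocated precision
    {
        f_.set_prec_raw(reduced);
    }
    ~RawPrecisionGuard() { f_.set_prec_raw(allocated_); }

    // Copying a guard would restore the precision twice, so forbid it.
    RawPrecisionGuard(const RawPrecisionGuard&) = delete;
    RawPrecisionGuard& operator=(const RawPrecisionGuard&) = delete;

private:
    Float& f_;
    prec_type allocated_;
};

// Hypothetical stand-in with the same two members, present only so that
// this sketch is self-contained; with GMP the guard would wrap an mpf_class.
struct FakeFloat {
    unsigned long prec;
    unsigned long get_prec() const { return prec; }
    void set_prec_raw(unsigned long p) { prec = p; }
};
```

With GMP, @code{RawPrecisionGuard<mpf_class> guard (f, 64);} would keep
@code{f} at 64 bits until the end of the enclosing block, provided the guard
is destroyed before @code{f} is.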
				7167
				7168
				7169	@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
				7170	@section C++ Interface Random Numbers
				7171
				7172	@deftp Class gmp_randclass
				7173	The C++ class interface to the GMP random number functions uses
				7174	@code{gmp_randclass} to hold an algorithm selection and current state, as per
				7175	@code{gmp_randstate_t}.
				7176	@end deftp
				7177
				7178	@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
				7179	Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
				7180	function (@pxref{Random State Initialization}). The arguments expected are
				7181	the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
				7182	For example,
				7183
				7184	@example
				7185	gmp_randclass r1 (gmp_randinit_default);
				7186	gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
				7187	gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
				7188	gmp_randclass r4 (gmp_randinit_mt);
				7189	@end example
				7190
				7191	@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big;
				7192	an @code{std::length_error} exception is thrown in that case.
				7193	@end deftypefun
				7194
				7195	@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
				7196	Construct a @code{gmp_randclass} using the same parameters as
				7197	@code{gmp_randinit} (@pxref{Random State Initialization}). This function is
				7198	obsolete and the above @var{randinit} style should be preferred.
				7199	@end deftypefun
				7200
				7201	@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
				7202	@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
				7203	Seed a random number generator. @xref{Random Number Functions}, for how
				7204	to choose a good seed.
				7205	@end deftypefun
				7206
				7207	@deftypefun mpz_class gmp_randclass::get_z_bits (mp_bitcnt_t @var{bits})
				7208	@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
				7209	Generate a random integer with a specified number of bits.
				7210	@end deftypefun
				7211
				7212	@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
				7213	Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
				7214	@end deftypefun
				7215
				7216	@deftypefun mpf_class gmp_randclass::get_f ()
				7217	@deftypefunx mpf_class gmp_randclass::get_f (mp_bitcnt_t @var{prec})
				7218	Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}. @var{f}
				7219	has @var{prec} bits of precision or, if @var{prec} is not given, the
				7220	precision of the destination. For example,
				7221
				7222	@example
				7223	gmp_randclass r (gmp_randinit_default);
				7224	...
				7225	mpf_class f (0, 512); // 512 bits precision
				7226	f = r.get_f(); // random number, 512 bits
				7227	@end example
				7228	@end deftypefun
				7229
				7230
				7231
				7232	@node C++ Interface Limitations, , C++ Interface Random Numbers, C++ Class Interface
				7233	@section C++ Interface Limitations
				7234
				7235	@table @asis
				7236	@item @code{mpq_class} and Templated Reading
				7237	A generic piece of template code probably won't know that @code{mpq_class}
				7238	requires a @code{canonicalize} call if inputs read with @code{operator>>}
				7239	might be non-canonical. This can lead to incorrect results.
				7240
				7241	@code{operator>>} behaves as it does for reasons of efficiency. A
				7242	canonicalize can be quite time consuming on large operands, and is best
				7243	avoided if it's not necessary.
				7244
				7245	But this potential difficulty reduces the usefulness of @code{mpq_class}.
				7246	Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
				7247	the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
				7248	pressed into service. Or maybe, at the risk of inconsistency, the
				7249	@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
				7250	@code{operator>>} not doing so, for use on those occasions when that's
				7251	acceptable. Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.
				7252
				7253	@item Subclassing
				7254	Subclassing the GMP C++ classes works, but is not currently recommended.
				7255
				7256	Expressions involving subclasses resolve correctly (or seem to), but in normal
				7257	C++ fashion the subclass doesn't inherit constructors and assignments.
				7258	There are many of those in the GMP classes, and a good way to reestablish them
				7259	in a subclass is not yet provided.
				7260
				7261	@item Templated Expressions
				7262	A subtle difficulty exists when using expressions together with
				7263	application-defined template functions. Consider the following, with @code{T}
				7264	intended to be some numeric type,
				7265
				7266	@example
				7267	template <class T>
				7268	T fun (const T &, const T &);
				7269	@end example
				7270
				7271	@noindent
				7272	When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
				7273	is resolved as @code{mpz_class}.
				7274
				7275	@example
				7276	mpz_class f(1), g(2);
				7277	fun (f, g); // Good
				7278	@end example
				7279
				7280	@noindent
				7281	But when one of the arguments is an expression, it doesn't work.
				7282
				7283	@example
				7284	mpz_class f(1), g(2), h(3);
				7285	fun (f, g+h); // Bad
				7286	@end example
				7287
				7288	This is because @code{g+h} ends up being a certain expression template type
				7289	internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
				7290	to automatically convert to @code{mpz_class}. The workaround is simply to add
				7291	an explicit cast.
				7292
				7293	@example
				7294	mpz_class f(1), g(2), h(3);
				7295	fun (f, mpz_class(g+h)); // Good
				7296	@end example
				7297
				7298	Similarly, within @code{fun} it may be necessary to cast an expression to type
				7299	@code{T} when calling a templated @code{fun2}.
				7300
				7301	@example
				7302	template <class T>
				7303	void fun (T f, T g)
				7304	@{
				7305	fun2 (f, f+g); // Bad
				7306	@}
				7307
				7308	template <class T>
				7309	void fun (T f, T g)
				7310	@{
				7311	fun2 (f, T(f+g)); // Good
				7312	@}
				7313	@end example
				7314
				7315	@item C++11
				7316	C++11 provides several new ways in which types can be inferred: @code{auto},
				7317	@code{decltype}, etc. While they can be very convenient, they don't mix well
				7318	with expression templates. In this example, the addition is performed twice,
				7319	as if we had defined @code{sum} as a macro.
				7320
				7321	@example
				7322	mpz_class z = 33;
				7323	auto sum = z + z;
				7324	mpz_class prod = sum * sum;
				7325	@end example
				7326
				7327	This other example may crash, though some compilers might make it look like
				7328	it is working, because the expression @code{z+z} goes out of scope before it
				7329	is evaluated.
				7330
				7331	@example
				7332	mpz_class z = 33;
				7333	auto sum = z + z + z;
				7334	mpz_class prod = sum * 2;
				7335	@end example
				7336
				7337	It is thus strongly recommended to avoid @code{auto} anywhere a GMP C++
				7338	expression may appear.
				7339	@end table
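
The repeated evaluation described under C++11 can be reproduced with a toy
expression template. This sketch is not the @code{gmpxx.h} implementation:
@code{Num}, @code{SumExpr} and the counter @code{g_additions} are hypothetical
stand-ins that only mimic the way an unevaluated sum is converted to a value
each time it is used.

```cpp
struct Num;

static int g_additions = 0;   // how many times a sum was actually computed

// Expression node: remembers its operands and adds only when asked.
struct SumExpr {
    const Num& a;
    const Num& b;
    long eval() const;
};

// Toy number class; converting a SumExpr to Num performs the addition.
struct Num {
    long v;
    Num(long x = 0) : v(x) {}
    Num(const SumExpr& e) : v(e.eval()) {}
};

inline long SumExpr::eval() const { ++g_additions; return a.v + b.v; }
inline SumExpr operator+(const Num& a, const Num& b) { return SumExpr{a, b}; }
inline Num operator*(const Num& a, const Num& b) { return Num(a.v * b.v); }

// With auto, sum keeps the expression type, so each use re-evaluates it.
inline int additions_with_auto() {
    g_additions = 0;
    Num z = 33;
    auto sum = z + z;        // SumExpr, not Num
    Num prod = sum * sum;    // both operands are converted: two additions
    (void) prod;
    return g_additions;
}

// Naming the type forces a single evaluation at the point of declaration.
inline int additions_with_explicit_type() {
    g_additions = 0;
    Num z = 33;
    Num sum = z + z;         // evaluated once, here
    Num prod = sum * sum;
    (void) prod;
    return g_additions;
}
```

In this toy model the first function counts two additions and the second
counts one, which is the macro-like behaviour the text warns about.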
				7340
				7341
				7342	@node Custom Allocation, Language Bindings, C++ Class Interface, Top
				7343	@comment node-name, next, previous, up
				7344	@chapter Custom Allocation
				7345	@cindex Custom allocation
				7346	@cindex Memory allocation
				7347	@cindex Allocation of memory
				7348
				7349	By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
				7350	allocation, and if they fail GMP prints a message to the standard error output
				7351	and terminates the program.
				7352
				7353	Alternate functions can be specified, to allocate memory in a different way or
				7354	to have a different error action on running out of memory.
				7355
				7356	@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
				7357	Replace the current allocation functions from the arguments. If an argument
				7358	is @code{NULL}, the corresponding default function is used.
				7359
				7360	These functions will be used for all memory allocation done by GMP, apart from
				7361	temporary space from @code{alloca} if that function is available and GMP is
				7362	configured to use it (@pxref{Build Options}).
				7363
				7364	@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
				7365	active GMP objects allocated using the previous memory functions! Usually
				7366	that means calling it before any other GMP function.}
				7367	@end deftypefun
				7368
				7369	The functions supplied should fit the following declarations:
				7370
				7371	@deftypevr Function {void *} allocate_function (size_t @var{alloc_size})
				7372	Return a pointer to newly allocated space with at least @var{alloc_size}
				7373	bytes.
				7374	@end deftypevr
				7375
				7376	@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
				7377	Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
				7378	@var{new_size} bytes.
				7379
				7380	The block may be moved if necessary or if desired, and in that case the
				7381	smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
				7382	location. The return value is a pointer to the resized block, that being the
				7383	new location if moved or just @var{ptr} if not.
				7384
				7385	@var{ptr} is never @code{NULL}, it's always a previously allocated block.
				7386	@var{new_size} may be bigger or smaller than @var{old_size}.
				7387	@end deftypevr
				7388
				7389	@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size})
				7390	De-allocate the space pointed to by @var{ptr}.
				7391
				7392	@var{ptr} is never @code{NULL}, it's always a previously allocated block of
				7393	@var{size} bytes.
				7394	@end deftypevr
				7395
				7396	A @dfn{byte} here means the unit used by the @code{sizeof} operator.
				7397
				7398	The @var{reallocate_function} parameter @var{old_size} and the
				7399	@var{free_function} parameter @var{size} are passed for convenience, but of
				7400	course they can be ignored if not needed by an implementation. The default
				7401	functions using @code{malloc} and friends for instance don't use them.
				7402
				7403	No error return is allowed from any of these functions; if they return then
				7404	they must have performed the specified operation. In particular note that
				7405	@var{allocate_function} or @var{reallocate_function} mustn't return
				7406	@code{NULL}.
				7407
				7408	Getting a different fatal error action is a good use for custom allocation
				7409	functions, for example giving a graphical dialog rather than the default print
				7410	to @code{stderr}. How much is possible when genuinely out of memory is
				7411	another question though.
				7412
				7413	There's currently no defined way for the allocation functions to recover from
				7414	an error such as out of memory; they must terminate program execution. A
				7415	@code{longjmp} or throwing a C++ exception will have undefined results. This
				7416	may change in the future.
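
A set of replacement functions following the declarations above might look
like this. It is a sketch under stated assumptions: the names
@code{counting_alloc}, @code{counting_realloc}, @code{counting_free} and
@code{g_live_bytes} are hypothetical, and the byte counter is only there to
show a use for the @var{old_size} and @var{size} parameters. On failure each
function terminates the program, since no error return is allowed.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Bytes currently held through the hooks below (illustrative bookkeeping).
static std::size_t g_live_bytes = 0;

// Report the failure and terminate; returning NULL is not permitted.
static void out_of_memory(const char* where, std::size_t bytes) {
    std::fprintf(stderr, "%s: cannot allocate %lu bytes\n",
                 where, static_cast<unsigned long>(bytes));
    std::abort();
}

extern "C" void* counting_alloc(std::size_t alloc_size) {
    // malloc(0) may return NULL on some systems, so request at least 1 byte.
    void* p = std::malloc(alloc_size ? alloc_size : 1);
    if (p == nullptr) out_of_memory("counting_alloc", alloc_size);
    g_live_bytes += alloc_size;
    return p;
}

extern "C" void* counting_realloc(void* ptr, std::size_t old_size,
                                  std::size_t new_size) {
    void* p = std::realloc(ptr, new_size ? new_size : 1);
    if (p == nullptr) out_of_memory("counting_realloc", new_size);
    g_live_bytes = g_live_bytes - old_size + new_size;
    return p;
}

extern "C" void counting_free(void* ptr, std::size_t size) {
    std::free(ptr);
    g_live_bytes -= size;
}
```

These would be installed with
@code{mp_set_memory_functions (counting_alloc, counting_realloc, counting_free)}
before any other GMP function is called.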
				7417
				7418	GMP may use allocated blocks to hold pointers to other allocated blocks. This
				7419	will limit the assumptions a conservative garbage collection scheme can make.
				7420
				7421	Since the default GMP allocation uses @code{malloc} and friends, those
				7422	functions will be linked in even if the first thing a program does is an
				7423	@code{mp_set_memory_functions}. It's necessary to change the GMP sources if
				7424	this is a problem.
				7425
				7426	@sp 1
				7427	@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t))
				7428	Get the current allocation functions, storing function pointers to the
				7429	locations given by the arguments. If an argument is @code{NULL}, that
				7430	function pointer is not stored.
				7431
				7432	@need 1000
				7433	For example, to get just the current free function,
				7434
				7435	@example
				7436	void (*freefunc) (void *, size_t);
				7437
				7438	mp_get_memory_functions (NULL, NULL, &freefunc);
				7439	@end example
				7440	@end deftypefun
				7441
				7442	@node Language Bindings, Algorithms, Custom Allocation, Top
				7443	@chapter Language Bindings
				7444	@cindex Language bindings
				7445	@cindex Other languages
				7446
				7447	The following packages and projects offer access to GMP from languages other
				7448	than C, though perhaps with varying levels of functionality and efficiency.
				7449
				7450	@c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces
				7451	@c in tex, just to separate the URL from the preceding text a bit.
				7452	@iftex
				7453	@macro spaceuref {U}
				7454	@ @ @uref{\U\}
				7455	@end macro
				7456	@end iftex
				7457	@ifnottex
				7458	@macro spaceuref {U}
				7459	@uref{\U\}
				7460	@end macro
				7461	@end ifnottex
				7462
				7463	@sp 1
				7464	@table @asis
				7465	@item C++
				7466	@itemize @bullet
				7467	@item
				7468	GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward
				7469	interface, expression templates to eliminate temporaries.
				7470	@item
				7471	ALP @spaceuref{https://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and
				7472	polynomials using templates.
				7473	@item
				7474	CLN @spaceuref{https://www.ginac.de/CLN/} @* High level classes for arithmetic.
				7475	@item
				7476	Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices.
				7477	@item
				7478	NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library.
				7479	@end itemize
				7480
				7481	@c @item D
				7482	@c @itemize @bullet
				7483	@c @item
				7484	@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/}
				7485	@c @end itemize
				7486
				7487	@item Eiffel
				7488	@itemize @bullet
				7489	@item
				7490	Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442}
				7491	@end itemize
				7492
				7493	@c @item Fortran
				7494	@c @itemize @bullet
				7495	@c @item
				7496	@c Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary
				7497	@c precision floats.
				7498	@c @end itemize
				7499
				7500	@item Haskell
				7501	@itemize @bullet
				7502	@item
				7503	Glasgow Haskell Compiler @spaceuref{https://www.haskell.org/ghc/}
				7504	@end itemize
				7505
				7506	@item Java
				7507	@itemize @bullet
				7508	@item
				7509	Kaffe @spaceuref{https://github.com/kaffe/kaffe}
				7510	@end itemize
				7511
				7512	@item Lisp
				7513	@itemize @bullet
				7514	@item
				7515	GNU Common Lisp @spaceuref{https://www.gnu.org/software/gcl/gcl.html}
				7516	@item
				7517	Librep @spaceuref{http://librep.sourceforge.net/}
				7518	@item
				7519	@c FIXME: When there's a stable release with gmp support, just refer to it
				7520	@c rather than bothering to talk about betas.
				7521	XEmacs (21.5.18 beta and up) @spaceuref{https://www.xemacs.org} @* Optional
				7522	big integers, rationals and floats using GMP.
				7523	@end itemize
				7524
				7525	@item ML
				7526	@itemize @bullet
				7527	@item
				7528	MLton compiler @spaceuref{http://mlton.org/}
				7529	@end itemize
				7530
				7531	@item Objective Caml
				7532	@itemize @bullet
				7533	@item
				7534	MLGMP @spaceuref{https://opam.ocaml.org/packages/mlgmp/}
				7535	@item
				7536	Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using
				7537	GMP.
				7538	@end itemize
				7539
				7540	@item Oz
				7541	@itemize @bullet
				7542	@item
				7543	Mozart @spaceuref{https://mozart.github.io/}
				7544	@end itemize
				7545
				7546	@item Pascal
				7547	@itemize @bullet
				7548	@item
				7549	GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit.
				7550	@item
				7551	Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal,
				7552	optionally using GMP.
				7553	@end itemize
				7554
				7555	@item Perl
				7556	@itemize @bullet
				7557	@item
				7558	GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration
				7559	Programs}).
				7560	@item
				7561	Math::GMP @spaceuref{https://www.cpan.org/} @* Compatible with Math::BigInt, but
				7562	not as many functions as the GMP module above.
				7563	@item
				7564	Math::BigInt::GMP @spaceuref{https://www.cpan.org/} @* Plug Math::GMP into
				7565	normal Math::BigInt operations.
				7566	@end itemize
				7567
				7568	@need 1000
				7569	@item Pike
				7570	@itemize @bullet
				7571	@item
				7572	pikempz module in the standard distribution, @uref{https://pike.lysator.liu.se/}
				7573	@end itemize
				7574
				7575	@need 500
				7576	@item Prolog
				7577	@itemize @bullet
				7578	@item
				7579	SWI Prolog @spaceuref{http://www.swi-prolog.org/} @*
				7580	Arbitrary precision floats.
				7581	@end itemize
				7582
				7583	@item Python
				7584	@itemize @bullet
				7585	@item
				7586	GMPY @uref{https://code.google.com/p/gmpy/}
				7587	@end itemize
				7588
				7589	@item Ruby
				7590	@itemize @bullet
				7591	@item
				7592	@uref{https://rubygems.org/gems/gmp}
				7593	@end itemize
				7594
				7595	@item Scheme
				7596	@itemize @bullet
				7597	@item
				7598	GNU Guile @spaceuref{https://www.gnu.org/software/guile/guile.html}
				7599	@item
				7600	RScheme @spaceuref{https://www.rscheme.org/}
				7601	@item
				7602	STklos @spaceuref{http://www.stklos.net/}
				7603	@c
				7604	@c For reference, MzScheme uses some of gmp, but (as of version 205) it only
				7605	@c has copies of some of the generic C code, and we don't consider that a
				7606	@c language binding to gmp.
				7607	@c
				7608	@end itemize
				7609
				7610	@item Smalltalk
				7611	@itemize @bullet
				7612	@item
				7613	GNU Smalltalk @spaceuref{http://smalltalk.gnu.org/}
				7614	@end itemize
				7615
				7616	@item Other
				7617	@itemize @bullet
				7618	@item
				7619	Axiom @uref{https://savannah.nongnu.org/projects/axiom} @* Computer algebra
				7620	using GCL.
				7621	@item
				7622	DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
				7623	mathematical programming language.
				7624	@item
				7625	GiNaC @spaceuref{https://www.ginac.de/} @* C++ computer algebra using CLN.
				7626	@item
				7627	GOO @spaceuref{https://www.eecs.berkeley.edu/~jrb/goo/} @* Dynamic object oriented
				7628	language.
				7629	@item
				7630	Maxima @uref{https://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
				7631	computer algebra using GCL.
				7632	@c @item
				7633	@c Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system.
				7634	@item
				7635	Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator.
				7636	@item
				7637	Yacas @spaceuref{http://yacas.sourceforge.net} @* Yet another computer algebra system.
				7638	@end itemize
				7639
				7640	@end table
				7641
				7642
				7643	@node Algorithms, Internals, Language Bindings, Top
				7644	@chapter Algorithms
				7645	@cindex Algorithms
				7646
				7647	This chapter is an introduction to some of the algorithms used for various GMP
				7648	operations. The code is likely to be hard to understand without knowing
				7649	something about the algorithms.
				7650
				7651	Some GMP internals are mentioned, but applications that expect to be
				7652	compatible with future GMP releases should take care to use only the
				7653	documented functions.
				7654
				7655	@menu
				7656	* Multiplication Algorithms::
				7657	* Division Algorithms::
				7658	* Greatest Common Divisor Algorithms::
				7659	* Powering Algorithms::
				7660	* Root Extraction Algorithms::
				7661	* Radix Conversion Algorithms::
				7662	* Other Algorithms::
				7663	* Assembly Coding::
				7664	@end menu
				7665
				7666
				7667	@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms
				7668	@section Multiplication
				7669	@cindex Multiplication algorithms
				7670
				7671	N@cross{}N limb multiplications and squares are done using one of seven
				7672	algorithms, as the size N increases.
				7673
				7674	@quotation
				7675	@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				7676	@item Algorithm @tab Threshold
				7677	@item Basecase @tab (none)
				7678	@item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD}
				7679	@item Toom-3 @tab @code{MUL_TOOM33_THRESHOLD}
				7680	@item Toom-4 @tab @code{MUL_TOOM44_THRESHOLD}
				7681	@item Toom-6.5 @tab @code{MUL_TOOM6H_THRESHOLD}
				7682	@item Toom-8.5 @tab @code{MUL_TOOM8H_THRESHOLD}
				7683	@item FFT @tab @code{MUL_FFT_THRESHOLD}
				7684	@end multitable
				7685	@end quotation
				7686
				7687	Similarly for squaring, with the @code{SQR} thresholds.
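
Read as a decision procedure, the table amounts to a chain of comparisons.
The sketch below assumes that each algorithm takes over once N reaches its
threshold. The struct @code{MulThresholds}, the function
@code{choose_mul_algorithm} and any values passed to it are hypothetical; the
real threshold values depend on the CPU, and the selection code in GMP itself
is more involved than this.

```cpp
#include <cstddef>
#include <string>

// Hypothetical container for the six thresholds named in the table.
struct MulThresholds {
    std::size_t toom22, toom33, toom44, toom6h, toom8h, fft;
};

// Pick the algorithm for an N x N limb product, smallest sizes first.
inline std::string choose_mul_algorithm(std::size_t n, const MulThresholds& t) {
    if (n < t.toom22) return "basecase";
    if (n < t.toom33) return "karatsuba";
    if (n < t.toom44) return "toom-3";
    if (n < t.toom6h) return "toom-4";
    if (n < t.toom8h) return "toom-6.5";
    if (n < t.fft)    return "toom-8.5";
    return "fft";
}
```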
				7688
				7689	N@cross{}M multiplications of operands with different sizes above
				7690	@code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired
				7691	algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced
				7692	Multiplication}).
				7693
				7694	@menu
				7695	* Basecase Multiplication::
				7696	* Karatsuba Multiplication::
				7697	* Toom 3-Way Multiplication::
				7698	* Toom 4-Way Multiplication::
				7699	* Higher degree Toom'n'half::
				7700	* FFT Multiplication::
				7701	* Other Multiplication::
				7702	* Unbalanced Multiplication::
				7703	@end menu
				7704
				7705
				7706	@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms
				7707	@subsection Basecase Multiplication
				7708
				7709	Basecase N@cross{}M multiplication is a straightforward rectangular set of
				7710	cross-products, the same as long multiplication done by hand and for that
				7711	reason sometimes known as the schoolbook or grammar school method. This is an
				7712	@m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M
				7713	(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code.
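
The rectangular set of cross-products can be sketched as follows. This is an
illustration, not the code in @file{mpn/generic/mul_basecase.c}: it assumes
32-bit limbs stored least significant first in a @code{std::vector}, so that a
64-bit integer can hold one product plus carries, and the names @code{Limb}
and @code{mul_basecase} are chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Limb = std::uint32_t;   // assumption: 32-bit limbs, least significant first

// Schoolbook product of an N-limb and an M-limb number, giving N+M limbs.
std::vector<Limb> mul_basecase(const std::vector<Limb>& u,
                               const std::vector<Limb>& v) {
    std::vector<Limb> r(u.size() + v.size(), 0);
    for (std::size_t j = 0; j < v.size(); ++j) {
        std::uint64_t carry = 0;
        for (std::size_t i = 0; i < u.size(); ++i) {
            // One cross-product plus what is already in place plus the carry;
            // the sum is at most 2^64 - 1, so it cannot overflow.
            std::uint64_t t = static_cast<std::uint64_t>(u[i]) * v[j]
                              + r[i + j] + carry;
            r[i + j] = static_cast<Limb>(t);
            carry = t >> 32;
        }
        r[j + u.size()] = static_cast<Limb>(carry);
    }
    return r;
}
```

The two nested loops visit every pair of limbs once, which is where the
@m{O(NM),O(N*M)} cost comes from.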
				7714
				7715	Assembly implementations of @code{mpn_mul_basecase} are essentially the same
				7716	as the generic C code, but have all the usual assembly tricks and
				7717	obscurities introduced for speed.
				7718
				7719	A square can be done in roughly half the time of a multiply, by using the fact
				7720	that the cross products above and below the diagonal are the same. A triangle
				7721	of products below the diagonal is formed, doubled (left shift by one bit), and
				7722	then the products on the diagonal added. This can be seen in
				7723	@file{mpn/generic/sqr_basecase.c}. Again the assembly implementations take
				7724	essentially the same approach.
				7725
				7726	@tex
				7727	\def\GMPline#1#2#3#4#5#6{%
				7728	\hbox {%
				7729	\vrule height 2.5ex depth 1ex
				7730	\hbox to 2em {\hfil{#2}\hfil}%
				7731	\vrule \hbox to 2em {\hfil{#3}\hfil}%
				7732	\vrule \hbox to 2em {\hfil{#4}\hfil}%
				7733	\vrule \hbox to 2em {\hfil{#5}\hfil}%
				7734	\vrule \hbox to 2em {\hfil{#6}\hfil}%
				7735	\vrule}}
				7736	\GMPdisplay{
				7737	\hbox{%
				7738	\vbox{%
				7739	\hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}%
				7740	\hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}%
				7741	\hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}%
				7742	\hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}%
				7743	\hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}%
				7744	\hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}%
				7745	\vfill}%
				7746	\vbox{%
				7747	\hbox{%
				7748	\hbox to 2em {\hfil u0\hfil}%
				7749	\hbox to 2em {\hfil u1\hfil}%
				7750	\hbox to 2em {\hfil u2\hfil}%
				7751	\hbox to 2em {\hfil u3\hfil}%
				7752	\hbox to 2em {\hfil u4\hfil}}%
				7753	\vskip 0.7ex
				7754	\hrule
				7755	\GMPline{u0}{d}{}{}{}{}%
				7756	\hrule
				7757	\GMPline{u1}{}{d}{}{}{}%
				7758	\hrule
				7759	\GMPline{u2}{}{}{d}{}{}%
				7760	\hrule
				7761	\GMPline{u3}{}{}{}{d}{}%
				7762	\hrule
				7763	\GMPline{u4}{}{}{}{}{d}%
				7764	\hrule}}}
				7765	@end tex
				7766	@ifnottex
				7767	@example
				7768	@group
				7769	     u0  u1  u2  u3  u4
				7770	   +---+---+---+---+---+
				7771	u0 | d |   |   |   |   |
				7772	   +---+---+---+---+---+
				7773	u1 |   | d |   |   |   |
				7774	   +---+---+---+---+---+
				7775	u2 |   |   | d |   |   |
				7776	   +---+---+---+---+---+
				7777	u3 |   |   |   | d |   |
				7778	   +---+---+---+---+---+
				7779	u4 |   |   |   |   | d |
				7780	   +---+---+---+---+---+
				7781	@end group
				7782	@end example
				7783	@end ifnottex
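
The triangle, doubling and diagonal steps can be written out as below. This is
a sketch, not the code in @file{mpn/generic/sqr_basecase.c}: it assumes 32-bit
limbs stored least significant first, and the names @code{Limb} and
@code{sqr_basecase} are chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Limb = std::uint32_t;   // assumption: 32-bit limbs, least significant first

// Square an N-limb number into 2N limbs using the off-diagonal triangle.
std::vector<Limb> sqr_basecase(const std::vector<Limb>& u) {
    const std::size_t n = u.size();
    std::vector<Limb> r(2 * n, 0);

    // 1. Triangle: sum the cross-products u[i]*u[j] with i < j.
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = i + 1; j < n; ++j) {
            std::uint64_t t = static_cast<std::uint64_t>(u[i]) * u[j]
                              + r[i + j] + carry;
            r[i + j] = static_cast<Limb>(t);
            carry = t >> 32;
        }
        r[i + n] = static_cast<Limb>(carry);   // not written by earlier rows
    }

    // 2. Double the triangle: shift the whole result left by one bit.
    Limb shifted_in = 0;
    for (std::size_t k = 0; k < 2 * n; ++k) {
        Limb shifted_out = r[k] >> 31;
        r[k] = (r[k] << 1) | shifted_in;
        shifted_in = shifted_out;
    }

    // 3. Add the diagonal products u[i]^2, each two limbs wide at position 2i.
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t d = static_cast<std::uint64_t>(u[i]) * u[i];
        std::uint64_t lo = static_cast<std::uint64_t>(r[2 * i])
                           + static_cast<Limb>(d) + carry;
        r[2 * i] = static_cast<Limb>(lo);
        std::uint64_t hi = static_cast<std::uint64_t>(r[2 * i + 1])
                           + (d >> 32) + (lo >> 32);
        r[2 * i + 1] = static_cast<Limb>(hi);
        carry = hi >> 32;
    }
    return r;
}
```

Step 1 performs N(N-1)/2 limb products and step 3 another N, against
@math{N^2} for a general multiply.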
				7784
				7785	In practice squaring isn't a full 2@cross{} faster than multiplying; it's
				7786	usually around 1.5@cross{}. Less than 1.5@cross{} probably indicates
				7787	@code{mpn_sqr_basecase} wants improving on that CPU.
				7788
				7789	On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
				7790	@code{mpn_sqr_basecase} on some small sizes. @code{SQR_BASECASE_THRESHOLD} is
				7791	the size at which to use @code{mpn_sqr_basecase}; this will be zero if that
				7792	routine should be used always.
				7793
				7794
				7795	@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
				7796	@subsection Karatsuba Multiplication
				7797	@cindex Karatsuba multiplication
				7798
				7799	The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
				7800	part A, and various other textbooks. A brief description is given here.
				7801
				7802	The inputs @math{x} and @math{y} are treated as each split into two parts of
				7803	equal length (or the most significant part one limb shorter if N is odd).
				7804
				7805	@tex
				7806	% GMPboxwidth used for all the multiplication pictures
				7807	\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em
				7808	% GMPboxdepth and GMPboxheight are also used for the float pictures
				7809	\global\newdimen\GMPboxdepth \global\GMPboxdepth=1ex
				7810	\global\newdimen\GMPboxheight \global\GMPboxheight=2ex
				7811	\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth}
				7812	\def\GMPbox#1#2{%
				7813	\vbox {%
				7814	\hrule
				7815	\hbox to 2\GMPboxwidth{%
				7816	\GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}%
				7817	\hrule}}
				7818	\GMPdisplay{%
				7819	\vbox{%
				7820	\hbox to 2\GMPboxwidth {high \hfil low}
				7821	\vskip 0.7ex
				7822	\GMPbox{x_1}{x_0}
				7823	\vskip 0.5ex
				7824	\GMPbox{y_1}{y_0}
				7825	}}
				7826	@end tex
				7827	@ifnottex
				7828	@example
				7829	@group
				7830	 high              low
				7831	+----------+----------+
				7832	|    x1    |    x0    |
				7833	+----------+----------+
				7834
				7835	+----------+----------+
				7836	|    y1    |    y0    |
				7837	+----------+----------+
				7838	@end group
				7839	@end example
				7840	@end ifnottex
				7841
				7842	Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is
				7843	@math{k} limbs (@ms{y,0} the same) then
				7844	@m{b=2\GMPraise{$k$@code{mp\_bits\_per\_limb}}, b=2^(kmp_bits_per_limb)}.
				7845	With that @m{x=x_1b+x_0,x=x1b+x0} and @m{y=y_1b+y_0,y=y1b+y0}, and the
				7846	following holds,
				7847
				7848	@display
				7849	@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0,
				7850	xy = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0}
				7851	@end display
				7852
				7853	This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs,
				7854	whereas a basecase multiply of N@cross{}N limbs is equivalent to four
				7855	multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent
				7856	the positions where the three products must be added.
				7857
				7858	@tex
				7859	\def\GMPboxA#1#2{%
				7860	\vbox{%
				7861	\hrule
				7862	\hbox{%
				7863	\GMPvrule
				7864	\hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}%
				7865	\vrule
				7866	\hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
				7867	\vrule}
				7868	\hrule}}
				7869	\def\GMPboxB#1#2{%
				7870	\hbox{%
				7871	\raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}%
				7872	\vbox{%
				7873	\hrule
				7874	\hbox{%
				7875	\GMPvrule
				7876	\hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
				7877	\vrule}%
				7878	\hrule}}}
				7879	\GMPdisplay{%
				7880	\vbox{%
				7881	\hbox to 4\GMPboxwidth {high \hfil low}
				7882	\vskip 0.7ex
				7883	\GMPboxA{x_1y_1}{x_0y_0}
				7884	\vskip 0.5ex
				7885	\GMPboxB{$+$}{x_1y_1}
				7886	\vskip 0.5ex
				7887	\GMPboxB{$+$}{x_0y_0}
				7888	\vskip 0.5ex
				7889	\GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)}
				7890	}}
				7891	@end tex
				7892	@ifnottex
				7893	@example
				7894	@group
				7895	 high                              low
				7896	+--------+--------+ +--------+--------+
				7897	|      x1*y1      | |      x0*y0      |
				7898	+--------+--------+ +--------+--------+
				7899	          +--------+--------+
				7900	      add |      x1*y1      |
				7901	          +--------+--------+
				7902	          +--------+--------+
				7903	      add |      x0*y0      |
				7904	          +--------+--------+
				7905	          +--------+--------+
				7906	      sub | (x1-x0)*(y1-y0) |
				7907	          +--------+--------+
				7908	@end group
				7909	@end example
				7910	@end ifnottex
				7911
				7912	The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an
				7913	absolute value, and the sign used to choose to add or subtract. Notice the
				7914	sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1),
				7915	high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb
				7916	additions, rather than @m{6k,6*k}, but in GMP extra function call overheads
				7917	outweigh the saving.
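The formula and the sign handling described above can be sketched in Python,
with Python integers standing in for limb vectors. This is only an
illustration under stated assumptions, not GMP's implementation: the name
@code{karatsuba}, the parameters @code{limb_bits} and @code{threshold_limbs},
and the use of the built-in @code{*} as the basecase multiply are all invented
for the example.

```python
def karatsuba(x, y, limb_bits=64, threshold_limbs=2):
    """Multiply non-negative integers with the subtractive Karatsuba form."""
    n_bits = max(x.bit_length(), y.bit_length())
    n_limbs = -(-n_bits // limb_bits)             # operand size N in limbs
    if n_limbs < max(threshold_limbs, 2):
        return x * y                              # stand-in for the basecase

    k = (n_limbs + 1) // 2                        # high part is shorter if N is odd
    shift = k * limb_bits                         # b = 2**shift
    mask = (1 << shift) - 1
    x1, x0 = x >> shift, x & mask
    y1, y0 = y >> shift, y & mask

    hi = karatsuba(x1, y1, limb_bits, threshold_limbs)     # x1*y1
    lo = karatsuba(x0, y0, limb_bits, threshold_limbs)     # x0*y0

    # Middle term computed on absolute values, sign tracked separately.
    dx, dy = x1 - x0, y1 - y0
    negative = (dx < 0) != (dy < 0)
    mid = karatsuba(abs(dx), abs(dy), limb_bits, threshold_limbs)
    if negative:
        mid = -mid

    # xy = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0
    return (hi << (2 * shift)) + ((hi + lo - mid) << shift) + lo
```

Each level performs three recursive multiplies on operands of about half the
size. Both differences fit in @math{k} limbs, which is why the recursion never
needs an extra carry limb for the middle product.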
				7918
				7919	Squaring is similar to multiplying, but with @math{x=y} the formula reduces to
				7920	an equivalent with three squares,
				7921
				7922	@display
				7923	@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2,
				7924	x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2}
				7925	@end display
				7926
				7927	The final result is accumulated from those three squares the same way as for
				7928	the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now
				7929	always positive.
				7930
				7931	A similar formula for both multiplying and squaring can be constructed with a
				7932	middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed
				7933	@math{k} limbs, leading to more carry handling and additions than the form
				7934	above.
				7935
				7936	Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm,
				7937	the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies
				7938	each @math{1/2} the size of the inputs. This is a big improvement over the
				7939	basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra
				7940	additions Karatsuba performs. @code{MUL_TOOM22_THRESHOLD} can be as little
				7941	as 10 limbs. The @code{SQR} threshold is usually about twice the @code{MUL}.
				7942
				7943	The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c,
				7944	M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN +
				7945	e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 +
				7946	{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}. The
				7947	factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the
				7948	basecase code will increase the threshold since they benefit @math{M(N)} more
				7949	than @math{K(N)}. And conversely the @m{3\over2, 3/2} for @math{b} means
				7950	linear style speedups of @math{b} will decrease the threshold since they
				7951	benefit @math{K(N)} more than @math{M(N)}. The latter can be seen for
				7952	instance when adding an optimized @code{mpn_sqr_diagonal} to
				7953	@code{mpn_sqr_basecase}. Of course all speedups reduce total time, and in
				7954	that sense the algorithm thresholds are merely of academic interest.
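The cost model above is easy to explore numerically. The following Python
sketch evaluates @math{M(N)} and @math{K(N)} and searches for the first size
at which the modelled Karatsuba time drops below the basecase time. The
function names and every constant passed to them are made up for the
illustration; real values come only from measurement with the tune program.

```python
def basecase_time(n, a, b, c):
    """M(N) = a*N^2 + b*N + c."""
    return a * n * n + b * n + c


def karatsuba_time(n, a, b, c, d, e):
    """K(N) = 3*M(N/2) + d*N + e, one Karatsuba level over the basecase."""
    return 3 * basecase_time(n / 2, a, b, c) + d * n + e


def crossover(a, b, c, d, e, limit=10_000):
    """Smallest N in [2, limit) with K(N) < M(N), or None if there is none."""
    for n in range(2, limit):
        if karatsuba_time(n, a, b, c, d, e) < basecase_time(n, a, b, c):
            return n
    return None


# Hypothetical constants, chosen only to give a visible crossover.
print(crossover(a=1.0, b=4.0, c=10.0, d=8.0, e=20.0))
print(crossover(a=0.8, b=4.0, c=10.0, d=8.0, e=20.0))   # faster cross products
print(crossover(a=1.0, b=2.0, c=10.0, d=8.0, e=20.0))   # faster linear part
```

Comparing the three printed sizes shows in which direction each kind of
basecase speedup moves the modelled threshold.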
				7955
				7956
				7957	@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms
				7958	@subsection Toom 3-Way Multiplication
				7959	@cindex Toom multiplication
				7960
				7961	The Karatsuba formula is the simplest case of a general approach to splitting
				7962	inputs that leads to both Toom and FFT algorithms. A description of
				7963	Toom can be found in Knuth section 4.3.3, with an example 3-way
				7964	calculation after Theorem A@. The 3-way form used in GMP is described here.
				7965
				7966	The operands are each considered split into 3 pieces of equal length (or the
				7967	most significant part 1 or 2 limbs shorter than the other two).
				7968
				7969	@tex
				7970	\def\GMPbox#1#2#3{%
				7971	\vbox{%
				7972	\hrule \vfil
				7973	\hbox to 3\GMPboxwidth {%
				7974	\GMPvrule
				7975	\hfil$#1$\hfil
				7976	\vrule
				7977	\hfil$#2$\hfil
				7978	\vrule
				7979	\hfil$#3$\hfil
				7980	\vrule}%
				7981	\vfil \hrule
				7982	}}
				7983	\GMPdisplay{%
				7984	\vbox{%
				7985	\hbox to 3\GMPboxwidth {high \hfil low}
				7986	\vskip 0.7ex
				7987	\GMPbox{x_2}{x_1}{x_0}
				7988	\vskip 0.5ex
				7989	\GMPbox{y_2}{y_1}{y_0}
				7990	\vskip 0.5ex
				7991	}}
				7992	@end tex
				7993	@ifnottex
				7994	@example
				7995	@group
				7996	 high                         low
				7997	+----------+----------+----------+
				7998	|    x2    |    x1    |    x0    |
				7999	+----------+----------+----------+
				8000	
				8001	+----------+----------+----------+
				8002	|    y2    |    y1    |    y0    |
				8003	+----------+----------+----------+
				8004	@end group
				8005	@end example
				8006	@end ifnottex
				8007
				8008	@noindent
				8009	These parts are treated as the coefficients of two polynomials
				8010
				8011	@display
				8012	@group
				8013	@m{X(t) = x_2t^2 + x_1t + x_0,
				8014	X(t) = x2*t^2 + x1*t + x0}
				8015	@m{Y(t) = y_2t^2 + y_1t + y_0,
				8016	Y(t) = y2*t^2 + y1*t + y0}
				8017	@end group
				8018	@end display
				8019
				8020	Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1},
				8021	@ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @math{k} limbs each then
				8022	@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
				8023	With this @math{x=X(b)} and @math{y=Y(b)}.
				8024
				8025	Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients
				8026	are
				8027
				8028	@display
				8029	@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0,
				8030	W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0}
				8031	@end display
				8032
				8033	The @m{w_i,w[i]} are going to be determined, and when they are they'll give
				8034	the final result using @math{w=W(b)}, since
				8035	@m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}. The coefficients will be roughly
				8036	@math{b^2} each, and the final @math{W(b)} will be an addition like,
				8037
				8038	@tex
				8039	\def\GMPbox#1#2{%
				8040	\moveright #1\GMPboxwidth
				8041	\vbox{%
				8042	\hrule
				8043	\hbox{%
				8044	\GMPvrule
				8045	\hbox to 2\GMPboxwidth {\hfil$#2$\hfil}%
				8046	\vrule}%
				8047	\hrule
				8048	}}
				8049	\GMPdisplay{%
				8050	\vbox{%
				8051	\hbox to 6\GMPboxwidth {high \hfil low}%
				8052	\vskip 0.7ex
				8053	\GMPbox{0}{w_4}
				8054	\vskip 0.5ex
				8055	\GMPbox{1}{w_3}
				8056	\vskip 0.5ex
				8057	\GMPbox{2}{w_2}
				8058	\vskip 0.5ex
				8059	\GMPbox{3}{w_1}
				8060	\vskip 0.5ex
				8061	\GMPbox{4}{w_0}
				8062	}}
				8063	@end tex
				8064	@ifnottex
				8065	@example
				8066	@group
				8067	 high                                        low
				8068	+-------+-------+
				8069	|       w4      |
				8070	+-------+-------+
				8071	       +--------+-------+
				8072	       |        w3      |
				8073	       +--------+-------+
				8074	               +--------+-------+
				8075	               |        w2      |
				8076	               +--------+-------+
				8077	                       +--------+-------+
				8078	                       |        w1      |
				8079	                       +--------+-------+
				8080	                                +-------+-------+
				8081	                                |       w0      |
				8082	                                +-------+-------+
				8083	@end group
				8084	@end example
				8085	@end ifnottex
				8086
				8087	The @m{w_i,w[i]} coefficients could be formed by a simple set of cross
				8088	products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2},
				8089	@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all
				8090	nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely
				8091	to a basecase multiply. Instead the following approach is used.
				8092
				8093	@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving
				8094	values of @math{W(t)} at those points. In GMP the following points are used,
				8095
				8096	@quotation
				8097	@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				8098	@item Point @tab Value
				8099	@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
				8100	@item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)}
				8101	@item @math{t=-1} @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)}
				8102	@item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)}
				8103	@item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately
				8104	@end multitable
				8105	@end quotation
				8106
				8107	At @math{t=-1} the values can be negative and that's handled using the
				8108	absolute values and tracking the sign separately. At @m{t=\infty,t=inf} the
				8109	value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in
				8110	the limit as t approaches infinity}, but it's much easier to think of as
				8111	simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like
				8112	@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately).
				8113
				8114	Each of the points substituted into
				8115	@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination
				8116	of the @m{w_i,w[i]} coefficients, and the value of those combinations has just
				8117	been calculated.
				8118
				8119	@tex
				8120	\GMPdisplay{%
				8121	$\matrix{%
				8122	W(0) & = & & & & & & & & & w_0 \cr
				8123	W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr
				8124	W(-1) & = & w_4 & - & w_3 & + & w_2 & - & w_1 & + & w_0 \cr
				8125	W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr
				8126	W(\infty) & = & w_4 \cr
				8127	}$}
				8128	@end tex
				8129	@ifnottex
				8130	@example
				8131	@group
				8132	W(0)   =                              w0
				8133	W(1)   =    w4 +   w3 +   w2 +   w1 + w0
				8134	W(-1)  =    w4 -   w3 +   w2 -   w1 + w0
				8135	W(2)   = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0
				8136	W(inf) =    w4
				8137	@end group
				8138	@end example
				8139	@end ifnottex
				8140
				8141	This is a set of five equations in five unknowns, and some elementary linear
				8142	algebra quickly isolates each @m{w_i,w[i]}. This involves adding or
				8143	subtracting one @math{W(t)} value from another, and a couple of divisions by
				8144	powers of 2 and one division by 3, the latter using the special
				8145	@code{mpn_divexact_by3} (@pxref{Exact Division}).
				8146
				8147	The conversion of @math{W(t)} values to the coefficients is interpolation. A
				8148	polynomial of degree 4 like @math{W(t)} is uniquely determined by values known
				8149	at 5 different points. The points are arbitrary and can be chosen to make the
				8150	linear equations come out with a convenient set of steps for quickly isolating
				8151	the @m{w_i,w[i]}.
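The evaluate, multiply and interpolate steps can be followed end to end in a
short Python sketch. It is an illustration only: the name @code{toom3}, the
@code{limb_bits} parameter and the particular order of interpolation steps are
assumptions made for the example, and need not match the sequence used by
@code{toom_interpolate_5pts}. Python's signed integers absorb the negative
value possible at @math{t=-1}, and the exact @code{//} divisions stand in for
the shifts and the division by 3.

```python
def toom3(x, y, limb_bits=64):
    """Multiply non-negative integers using one level of Toom-3."""
    n_bits = max(x.bit_length(), y.bit_length(), 1)
    n_limbs = -(-n_bits // limb_bits)
    k = -(-n_limbs // 3)                  # limbs in each of the two low pieces
    shift = k * limb_bits                 # b = 2**shift
    mask = (1 << shift) - 1
    x0, x1, x2 = x & mask, (x >> shift) & mask, x >> (2 * shift)
    y0, y1, y2 = y & mask, (y >> shift) & mask, y >> (2 * shift)

    # Pointwise products W(t) = X(t)*Y(t); "*" stands in for the recursion.
    at_0 = x0 * y0
    at_1 = (x2 + x1 + x0) * (y2 + y1 + y0)
    at_m1 = (x2 - x1 + x0) * (y2 - y1 + y0)
    at_2 = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0)
    at_inf = x2 * y2

    # Interpolation: solve the five linear equations for w0..w4.
    w0, w4 = at_0, at_inf
    w2 = (at_1 + at_m1) // 2 - w4 - w0            # exact division by 2
    s = (at_1 - at_m1) // 2                       # s = w3 + w1
    t = (at_2 - w0 - 16 * w4 - 4 * w2) // 2       # t = 4*w3 + w1
    w3 = (t - s) // 3                             # exact division by 3
    w1 = s - w3

    # Final result W(b): overlapping additions at multiples of k limbs.
    return ((w4 << (4 * shift)) + (w3 << (3 * shift)) + (w2 << (2 * shift))
            + (w1 << shift) + w0)
```

Only five multiplications of pieces appear, against the nine cross products a
direct expansion would need.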
				8152
				8153	Squaring follows the same procedure as multiplication, but there's only one
				8154	@math{X(t)} and it's evaluated at the 5 points, and those values squared to
				8155	give values of @math{W(t)}. The interpolation is then identical, and in fact
				8156	the same @code{toom_interpolate_5pts} subroutine is used for both squaring and
				8157	multiplying.
				8158
				8159	Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being
				8160	@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the
				8161	original size each. This is an improvement over Karatsuba at
				8162	@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and
				8163	interpolation and so it only realizes its advantage above a certain size.
				8164
				8165	Near the crossover between Toom-3 and Karatsuba there's generally a range of
				8166	sizes where the difference between the two is small.
				8167	@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and
				8168	successive runs of the tune program can give different values due to small
				8169	variations in measuring. A graph of time versus size for the two shows the
				8170	effect, see @file{tune/README}.
				8171
				8172	At the fairly small sizes where the Toom-3 thresholds occur it's worth
				8173	remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be
				8174	expected to make accurate predictions, due of course to the big influence of
				8175	all sorts of overheads, and the fact that only a few recursions of each are
				8176	being performed. Even at large sizes there's a good chance machine dependent
				8177	effects like cache architecture will mean actual performance deviates from
				8178	what might be predicted.
				8179
				8180	The formula given for the Karatsuba algorithm (@pxref{Karatsuba
				8181	Multiplication}) has an equivalent for Toom-3 involving only five multiplies,
				8182	but this would be complicated and unenlightening.
				8183
				8184	An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using
				8185	a vector to represent the @math{x} and @math{y} splits and a matrix
				8186	multiplication for the evaluation and interpolation stages. The matrix
				8187	inverses are not meant to be actually used, and they have elements with values
				8188	much greater than in fact arise in the interpolation steps. The diagram shown
				8189	for the 3-way is attractive, but again doesn't have to be implemented that way
				8190	and for example with a bit of rearrangement just one division by 6 can be
				8191	done.
				8192
				8193
				8194	@node Toom 4-Way Multiplication, Higher degree Toom'n'half, Toom 3-Way Multiplication, Multiplication Algorithms
				8195	@subsection Toom 4-Way Multiplication
				8196	@cindex Toom multiplication
				8197
				8198	Karatsuba and Toom-3 split the operands into 2 and 3 coefficients,
				8199	respectively. Toom-4 analogously splits the operands into 4 coefficients.
				8200	Using the notation from the section on Toom-3 multiplication, we form two
				8201	polynomials:
				8202
				8203	@display
				8204	@group
				8205	@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0,
				8206	X(t) = x3*t^3 + x2*t^2 + x1*t + x0}
				8207	@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0,
				8208	Y(t) = y3*t^3 + y2*t^2 + y1*t + y0}
				8209	@end group
				8210	@end display
				8211
				8212	@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving
				8213	values of @math{W(t)} at those points. In GMP the following points are used,
				8214
				8215	@quotation
				8216	@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
				8217	@item Point @tab Value
				8218	@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
				8219	@item @math{t=1/2} @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)}
				8220	@item @math{t=-1/2} @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)}
				8221	@item @math{t=1} @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)}
				8222	@item @math{t=-1} @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)}
				8223	@item @math{t=2} @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)}
				8224	@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately
				8225	@end multitable
				8226	@end quotation
				8227
				8228	The number of additions and subtractions for Toom-4 is much larger than for
				8229	Toom-3, but several subexpressions occur multiple times. For example
				8230	@m{x_2+x_0,x2+x0} occurs for both @math{t=1} and @math{t=-1}.
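The sharing of subexpressions can be seen in a small Python sketch of the
evaluation stage. The function names and the tuple layout are assumptions for
the example. The values at @math{t=1/2} and @math{t=-1/2} are scaled by 8, as
in the table of points, so that everything stays an integer.

```python
def toom4_eval(c3, c2, c1, c0):
    """Evaluate c3*t^3 + c2*t^2 + c1*t + c0 at 0, 1/2, -1/2, 1, -1, 2, inf.

    The two half-integer points are returned multiplied by 8.
    """
    even_1, odd_1 = c2 + c0, c3 + c1                   # shared by t=1 and t=-1
    even_h, odd_h = 2 * c2 + 8 * c0, c3 + 4 * c1       # shared by t=1/2, t=-1/2
    return (
        c0,                                  # t = 0
        even_h + odd_h,                      # t = 1/2   (times 8)
        even_h - odd_h,                      # t = -1/2  (times 8)
        even_1 + odd_1,                      # t = 1
        even_1 - odd_1,                      # t = -1
        8 * c3 + 4 * c2 + 2 * c1 + c0,       # t = 2
        c3,                                  # t = inf
    )


def toom4_point_products(xs, ys):
    """Seven pointwise products; xs and ys are (c3, c2, c1, c0) tuples."""
    return tuple(a * b for a, b in zip(toom4_eval(*xs), toom4_eval(*ys)))
```

Splitting each polynomial into its even and odd parts gives a pair of points
@math{t} and @math{-t} for the price of one extra addition and one
subtraction.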
				8231
				8232	Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being
				8233	@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the
				8234	original size each.
				8235
				8236
				8237	@node Higher degree Toom'n'half, FFT Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms
				8238	@subsection Higher degree Toom'n'half
				8239	@cindex Toom multiplication
				8240
				8241	The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
				8242	@pxref{Toom 4-Way Multiplication}) generalize to a split into an arbitrary
				8243	number of pieces. In general a split of two equally long operands into
				8244	@math{r} pieces leads to evaluations and pointwise multiplications done at
				8245	@m{2r-1,2*r-1} points. To fully exploit symmetries it is better to have a
				8246	multiple of 4 points, which is why Toom'n'half is used for higher degrees.
				8247
				8248	Toom'n'half means that the existence of one more piece is considered for a
				8249	single operand. It can be virtual, i.e.@: zero, or real, when the two operands
				8250	are not exactly balanced. By choosing an even @math{r},
				8251	Toom-@m{r{1\over2},r+1/2} requires @math{2r} points, a multiple of four.
				8252
				8253	The quadruplets of points include 0, @m{\infty,inf}, +1, -1 and
				8254	@m{\pm2^i,+-2^i}, @m{\pm2^{-i},+-2^-i}. Each of them gives shortcuts for the
				8255	evaluation phase and for some steps in the interpolation phase. Further tricks
				8256	are used to reduce the memory footprint of the whole multiplication algorithm
				8257	to a memory buffer equal in size to the result of the product.
				8258
				8259	Current GMP uses both Toom-6'n'half and Toom-8'n'half.
				8260
				8261
				8262	@node FFT Multiplication, Other Multiplication, Higher degree Toom'n'half, Multiplication Algorithms
				8263	@subsection FFT Multiplication
				8264	@cindex FFT multiplication
				8265	@cindex Fast Fourier Transform
				8266
				8267	At large to very large sizes a Fermat style FFT multiplication is used,
				8268	following Sch@"onhage and Strassen (@pxref{References}). Descriptions of FFTs
				8269	in various forms can be found in many textbooks, for instance Knuth section
				8270	4.3.3 part C or Lipson chapter IX@. A brief description of the form used in
				8271	GMP is given here.
				8272
				8273	The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
				8274	@math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge
				8275	\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
				8276	@math{x} and @math{y} with high zero limbs. The modular product is the native
				8277	form for the algorithm, so padding to get a full product is unavoidable.
				8278
				8279	The algorithm follows a split, evaluate, pointwise multiply, interpolate and
				8280	combine similar to that described above for Karatsuba and Toom-3. A @math{k}
				8281	parameter controls the split, with an FFT-@math{k} splitting into @math{2^k}
				8282	pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of
				8283	@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
				8284	the split falls on limb boundaries, avoiding bit shifts in the split and
				8285	combine stages.
				8286
				8287	The evaluations, pointwise multiplications, and interpolation, are all done
				8288	modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a
				8289	multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of
				8290	interpolation will be the following negacyclic convolution of the input
				8291	pieces, and the choice of @math{N'} ensures these sums aren't truncated.
				8292	@tex
				8293	$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
				8294	@end tex
				8295	@ifnottex
				8296
				8297	@example
				8298	           ---
				8299	           \         b
				8300	w[n] =     /     (-1) * x[i] * y[j]
				8301	           ---
				8302	       i+j==b*2^k+n
				8303	          b=0,1
				8304	@end example
				8305
				8306	@end ifnottex
				8307	The points used for the evaluation are @math{g^i} for @math{i=0} to
				8308	@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. @math{g} is a
				8309	@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary
				8310	cancellations at the interpolation stage, and it's also a power of 2 so the
				8311	fast Fourier transforms used for the evaluation and interpolation do only
				8312	shifts, adds and negations.
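The negacyclic sums and the root of unity property can be checked with a
small Python sketch. It uses a plain quadratic convolution instead of the fast
Fourier transform, so it shows what the transforms compute rather than how GMP
computes it. All function names are invented for the example, and both inputs
are assumed to be below @math{2^N}.

```python
def split(x, pieces, piece_bits):
    """Cut x into `pieces` parts of `piece_bits` bits, low part first."""
    mask = (1 << piece_bits) - 1
    return [(x >> (i * piece_bits)) & mask for i in range(pieces)]


def negacyclic(xs, ys):
    """w[n] = sum of (-1)^b * x[i]*y[j] over i+j == b*len(xs) + n, b = 0,1."""
    n = len(xs)
    w = [0] * n
    for i, xi in enumerate(xs):
        for j, yj in enumerate(ys):
            if i + j < n:
                w[i + j] += xi * yj          # b = 0
            else:
                w[i + j - n] -= xi * yj      # b = 1, wrapped with sign -1
    return w


def mul_mod_fermat(x, y, k, piece_bits):
    """x*y mod 2^N+1, with N = 2^k * piece_bits, from the negacyclic sums."""
    pieces = 1 << k
    modulus = (1 << (pieces * piece_bits)) + 1
    w = negacyclic(split(x, pieces, piece_bits), split(y, pieces, piece_bits))
    return sum(wn << (n * piece_bits) for n, wn in enumerate(w)) % modulus


def root_of_unity(k, n_prime):
    """g = 2^(2N'/2^k), assuming 2^k divides N'."""
    return 1 << ((2 * n_prime) >> k)


# g is a 2^k'th root of unity mod 2^N'+1, and g^(2^(k-1)) is -1.
g = root_of_unity(3, 64)
assert pow(g, 8, 2 ** 64 + 1) == 1
assert pow(g, 4, 2 ** 64 + 1) == 2 ** 64
```

The wrapped terms pick up a minus sign because @math{2^N} is congruent to
@math{-1} modulo @math{2^N+1}, which is exactly what combining the sums at
their offsets relies on.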
				8313
				8314	The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
				8315	recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
				8316	basecase), whichever is optimal at the size @math{N'}. The interpolation is
				8317	an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j,
				8318	x[i]*y[j]} are added at appropriate offsets to give the final result.
				8319
				8320	Squaring is the same, but @math{x} is the only input so it's one transform at
				8321	the evaluate stage and the pointwise multiplies are squares. The
				8322	interpolation is the same.
				8323
				8324	For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}),
				8325	O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed
				8326	modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original.
				8327	Each successive @math{k} is an asymptotic improvement, but overheads mean each
				8328	is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE}
				8329	and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each
				8330	new @math{k} effectively swaps some multiplying for some shifts, adds and
				8331	overheads.
				8332
				8333	A mod @math{2^N+1} product can be formed with a normal
				8334	@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT
				8335	and Toom-3 etc can be compared directly. A @math{k=4} FFT at
				8336	@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at
				8337	@math{O(N^@W{1.465})}. In practice this is what's found, with
				8338	@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between
				8339	300 and 1000 limbs, depending on the CPU@. So far it's been found that only
				8340	very large FFTs recurse into pointwise multiplies above these sizes.
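The reduction mentioned above, a full product followed by one subtraction, can
be sketched as follows. The function name is an assumption for the example and
Python's @code{*} stands in for whichever plain multiplication is used.

```python
def mulmod_fermat_plain(x, y, n_bits):
    """x*y mod 2^N+1 from an N x N -> 2N bit product and one subtraction.

    Assumes 0 <= x, y <= 2^N.
    """
    product = x * y
    low = product & ((1 << n_bits) - 1)
    high = product >> n_bits
    r = low - high                       # because 2^N == -1 (mod 2^N+1)
    if r < 0:
        r += (1 << n_bits) + 1           # bring the result back into range
    return r
```

At most one correction is needed, since @code{low} and @code{high} are each
at most @math{2^N}.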
				8341
				8342	When an FFT is to give a full product, the change of @math{N} to @math{2N}
				8343	doesn't alter the theoretical complexity for a given @math{k}, but for the
				8344	purposes of considering where an FFT might be first used it can be assumed
				8345	that the FFT is recursing into a normal multiply and that on that basis it's
				8346	doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of
				8347	the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean
				8348	@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3.
				8349	In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been
				8350	found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs.
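The two values of @math{k} quoted above follow directly from the exponents,
as this short Python check shows. The helper name is invented for the
illustration, and the comparison is purely asymptotic, so it says nothing
about the measured thresholds.

```python
import math

TOOM3_EXPONENT = math.log(5) / math.log(3)     # about 1.465


def first_k_below(exponent_of_k, target, start):
    """Smallest k >= start whose exponent is strictly below target."""
    k = start
    while exponent_of_k(k) >= target:
        k += 1
    return k


# Modular product, exponent k/(k-1): first faster than Toom-3 at k=4.
modular_k = first_k_below(lambda k: k / (k - 1), TOOM3_EXPONENT, 2)
# Full product, exponent k/(k-2): first faster than Toom-3 at k=7.
full_k = first_k_below(lambda k: k / (k - 2), TOOM3_EXPONENT, 3)
print(modular_k, full_k)
```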
				8351
				8352	The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is
				8353	rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that
				8354	when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a
				8355	multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of
				8356	@math{N} just under such a multiple will be rounded to the next. The
				8357	complexity calculations above assume that a favourable size is used, meaning
				8358	one which isn't padded through rounding, and it's also assumed that the extra
				8359	@math{+k+3} bits are negligible at typical FFT sizes.
				8360
				8361	The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
				8362	step-effect into measured speeds. For example @math{k=8} will round @math{N}
				8363	up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb
				8364	groups of sizes for which @code{mpn_mul_n} runs at the same speed. Or for
				8365	@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc. In
				8366	practice it's been found each @math{k} is used at quite small multiples of its
				8367	size constraint and so the step effect is quite noticeable in a time versus
				8368	size graph.
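The group sizes quoted above can be reproduced with a few lines of Python.
The function name is made up, and the final halving is an assumption of the
sketch: it takes @math{N} to cover the full product of two equal-sized
operands, so each operand is half of @math{N}.

```python
def step_limbs(k, limb_bits=32):
    """Operand sizes, in limbs, that share one FFT size for a given k."""
    step_bits = 1 << (2 * k - 1)             # N is a multiple of 2^(2k-1) bits
    return step_bits // limb_bits // 2       # each operand is half of N


for k in (8, 9, 10):
    print(k, step_limbs(k))
```

With a 32-bit limb this gives 512, 2048 and 8192 limbs for @math{k=8},
@math{k=9} and @math{k=10}, matching the figures in the text.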
				8369
				8370	The threshold determinations currently measure at the mid-points of size
				8371	steps, but this is sub-optimal since at the start of a new step it can happen
				8372	that it's better to go back to the previous @math{k} for a while. Something
				8373	more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
				8374	needed.
				8375
				8376
				8377	@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms
				8378	@subsection Other Multiplication
				8379	@cindex Toom multiplication
				8380
				8381	The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
				8382	@pxref{Toom 4-Way Multiplication}) generalize to a split into an arbitrary
				8383	number of pieces, as per Knuth section 4.3.3 algorithm C@. This is not
				8384	currently used. The notes here are merely for interest.
				8385
				8386	In general a split into @math{r+1} pieces is made, and evaluations and
				8387	pointwise multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7
				8388	pointwise multiplies, 5-way does 9, etc. Asymptotically an @math{(r+1)}-way
				8389	algorithm is @m{O(N^{log(2r+1)/log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}. Only
				8390	the pointwise multiplications count towards big-@math{O} complexity, but the
				8391	time spent in the evaluate and interpolate stages grows with @math{r} and has
				8392	a significant practical impact, with the asymptotic advantage of each @math{r}
				8393	realized only at bigger and bigger sizes. The overheads grow as
				8394	@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
				8395	r), O(N*log(r))}.
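The exponents for different splits are quick to tabulate. The sketch below
uses an invented helper name and takes the number of pieces, @math{r+1}, as
its argument.

```python
import math


def toom_exponent(pieces):
    """Exponent log(2r+1)/log(r+1) of an (r+1)-way split, pieces = r+1."""
    r = pieces - 1
    return math.log(2 * r + 1) / math.log(r + 1)


for pieces in (2, 3, 4, 5, 8):
    points = 2 * (pieces - 1) + 1
    print(pieces, points, round(toom_exponent(pieces), 3))
```

The first three rows reproduce the 1.585, 1.465 and 1.404 quoted for
Karatsuba, Toom-3 and Toom-4. Later rows improve more and more slowly, while
the evaluation and interpolation overheads keep growing with @math{r}.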
				8396
				8397	Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
				8398	uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
				8399	multiplies in the evaluate stage (or rather trades them for additions), and
				8400	has a further saving of nearly half the interpolate steps. The idea is to
				8401	separate odd and even final coefficients and then perform algorithm C steps C7
				8402	and C8 on them separately. The divisors at step C7 become @math{j^2} and the
				8403	multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.
				8404
				8405	Splitting odd and even parts through positive and negative points can be
				8406	thought of as using @math{-1} as a square root of unity. If a 4th root of
				8407	unity was available then a further split and speedup would be possible, but no
				8408	such root exists for plain integers. Going to complex integers with
				8409	@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
				8410	form it takes three real multiplies to do a complex multiply. The existence
				8411	of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
				8412	Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.
				8413
				8414	Floating point FFTs use complex numbers approximating Nth roots of unity.
				8415	Some processors have special support for such FFTs. But these are not used in
				8416	GMP since it's very difficult to guarantee an exact result (to some number of
				8417	bits). An occasional difference of 1 in the last bit might not matter to a
				8418	typical signal processing algorithm, but is of course of vital importance to
				8419	GMP.
				8420
				8421
				8422	@node Unbalanced Multiplication, , Other Multiplication, Multiplication Algorithms
				8423	@subsection Unbalanced Multiplication
				8424	@cindex Unbalanced multiplication
				8425
				8426	Multiplication of operands with different sizes, both below
				8427	@code{MUL_TOOM22_THRESHOLD}, is done with plain schoolbook multiplication
				8428	(@pxref{Basecase Multiplication}).
				8429
				8430	For really large operands, we invoke FFT directly.
				8431
				8432	For operands between these sizes, we use Toom inspired algorithms suggested by
				8433	Alberto Zanoni and Marco Bodrato. The idea is to split the operands into
				8434	polynomials of different degree. GMP currently splits the smaller operand
				8435	into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand
				8436	can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to
				8437	3.
				8438
				8439	@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that
				8440	@c screws up layout here and there in the rest of the manual.
				8441	@c @tex
				8442	@c \goodbreak
				8443	@c @end tex
				8444	@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
				8445	@section Division Algorithms
				8446	@cindex Division algorithms
				8447
				8448	@menu
				8449	* Single Limb Division::
				8450	* Basecase Division::
				8451	* Divide and Conquer Division::
				8452	* Block-Wise Barrett Division::
				8453	* Exact Division::
				8454	* Exact Remainder::
				8455	* Small Quotient Division::
				8456	@end menu
				8457
				8458
				8459	@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
				8460	@subsection Single Limb Division
				8461
				8462	N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
				8463	high to low, either with a hardware divide instruction or a multiplication by
				8464	inverse, whichever is best on a given CPU.
				8465
				8466	The multiply by inverse follows ``Improved division by invariant integers'' by
				8467	M@"oller and Granlund (@pxref{References}) and is implemented as
				8468	@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to have a
				8469	fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then
				8470	multiply by the high limb (plus one bit) of the dividend to get a quotient
				8471	@math{q}. With @math{d} normalized (high bit set), @math{q} is no more than 1
				8472	too small. Subtracting @m{qd,q*d} from the dividend gives a remainder, and
				8473	reveals whether @math{q} or @math{q-1} is correct.
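A 2@cross{}1 division by multiplying with a precomputed inverse can be
sketched in Python as below. The steps follow the 2-by-1 algorithm as
published in the cited paper, to the best of this sketch's understanding, and
are not a transcription of the @code{udiv_qrnnd_preinv} macro. The Python
function names, and the use of unbounded integers masked to one limb, are
assumptions of the example.

```python
def invert_limb(d, limb_bits=64):
    """Inverse v = floor((B^2 - 1) / d) - B for normalized d, B = 2^limb_bits."""
    base = 1 << limb_bits
    assert base // 2 <= d < base, "divisor must have its high bit set"
    return (base * base - 1) // d - base


def div_2by1_preinv(u1, u0, d, v, limb_bits=64):
    """Divide the two-limb value (u1, u0) by d, given u1 < d and v as above."""
    mask = (1 << limb_bits) - 1
    assert u1 < d
    q = v * u1 + (u1 << limb_bits) + u0      # two-limb candidate (q1, q0)
    q1, q0 = q >> limb_bits, q & mask
    q1 = (q1 + 1) & mask
    r = (u0 - q1 * d) & mask                 # remainder, computed mod B
    if r > q0:                               # candidate was one too large
        q1 = (q1 - 1) & mask
        r = (r + d) & mask
    if r >= d:                               # rare final adjustment
        q1 += 1
        r -= d
    return q1, r


def divrem_1(limbs_high_to_low, d, limb_bits=64):
    """N x 1 division as repeated 2 x 1 divisions from high to low."""
    v = invert_limb(d, limb_bits)
    r, quotient = 0, []
    for limb in limbs_high_to_low:
        q, r = div_2by1_preinv(r, limb, d, v, limb_bits)
        quotient.append(q)
    return quotient, r
```

The inverse is computed once per divisor and reused for every limb of the
dividend, which is where its setup cost is recovered.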
				8474
				8475	The result is a division done with two multiplications and four or five
				8476	arithmetic operations. On CPUs with low latency multipliers this can be much
				8477	faster than a hardware divide, though the cost of calculating the inverse at
				8478	the start may mean it's only better on inputs bigger than say 4 or 5 limbs.
				8479
				8480	When a divisor must be normalized, either for the generic C
				8481	@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
				8482	actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and
				8483	@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set.
				8484	The bit shifts for the dividend are usually accomplished ``on the fly''
				8485	meaning by extracting the appropriate bits at each step. Done this way the
				8486	quotient limbs come out aligned ready to store. When only the remainder is
				8487	wanted, an alternative is to take the dividend limbs unshifted and calculate
				8488	@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k
				8489	\bmod d2^k, r2^k mod d2^k}. This can help on CPUs with poor bit shifts or
				8490	few registers.
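
The two normalization strategies can be checked with a small Python sketch. Big integers replace limb arrays here, and the function names and the 64-bit default are assumptions made for illustration only.

```python
def normalized_divmod(a, d, limb_bits=64):
    """Divide a by a single-limb d by way of the shifted operands a*2^k, d*2^k."""
    k = limb_bits - d.bit_length()        # shift that sets the high bit of d
    q, r_shifted = divmod(a << k, d << k)
    return q, r_shifted >> k              # same quotient; the remainder is r*2^k


def remainder_unshifted(a, d, limb_bits=64):
    """Remainder only: reduce the unshifted dividend, then one extra step."""
    k = limb_bits - d.bit_length()
    r = a % (d << k)                      # r = a mod d*2^k
    r = (r << k) % (d << k)               # r*2^k mod d*2^k equals (a mod d)*2^k
    return r >> k
```

Both functions expect @math{0 < d < 2^64} with the default limb size. The second one never shifts the dividend itself, only the partial remainder at the end.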
				8491
				8492	The multiply by inverse can be done two limbs at a time. The calculation is
				8493	basically the same, but the inverse is two limbs and the divisor treated as if
				8494	padded with a low zero limb. This means more work, since the inverse will
				8495	need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
				8496	independent and can therefore be done partly or wholly in parallel. Likewise
				8497	for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two
				8498	limbs with roughly the same two multiplies worth of latency that one limb at a
				8499	time gives. This extends to 3 or 4 limbs at a time, though the extra work to
				8500	apply the inverse will almost certainly soon reach the limits of multiplier
				8501	throughput.
				8502
				8503	A similar approach in reverse can be taken to process just half a limb at a
				8504	time if the divisor is only a half limb. In this case the 1@cross{}1 multiply
				8505	for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each
				8506	limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
				8507	if the only multiply is a half limb, and especially if it's not pipelined.
				8508
				8509
				8510	@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
				8511	@subsection Basecase Division
				8512
				8513	Basecase N@cross{}M division is like long division done by hand, but in base
				8514	@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth
				8515	section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.
				8516
				8517	Briefly stated, while the dividend remains larger than the divisor, a high
				8518	quotient limb is formed and the N@cross{}1 product @m{qd,q*d} subtracted at
				8519	the top end of the dividend. With a normalized divisor (most significant bit
				8520	set), each quotient limb can be formed with a 2@cross{}1 division and a
				8521	1@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is
				8522	by the high limb of the divisor and is done either with a hardware divide or a
				8523	multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
				8524	faster. Such a quotient is sometimes one too big, requiring an addback of the
				8525	divisor, but that happens rarely.
				8526
				8527	With Q=N@minus{}M being the number of quotient limbs, this is an
				8528	@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
				8529	Q@cross{}M multiplication, differing in fact only in the extra multiply and
				8530	divide for each of the Q quotient limbs.
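
The loop described above can be sketched in Python as follows. This is a simplified illustration, not the code in @file{mpn/generic/sb_divrem_mn.c}: big integers hold the operands, the quotient estimate is the plain 2@cross{}1 division capped at the largest limb value, and the name @code{basecase_divrem} and its addback counter are assumptions made for the example.

```python
def basecase_divrem(a, d, limb_bits=32):
    """Schoolbook division of a by d, producing one quotient limb per step.

    d must be normalized: its size in bits is a multiple of limb_bits.
    Returns (quotient, remainder, number_of_addbacks).
    """
    base = 1 << limb_bits
    assert d > 0 and d.bit_length() % limb_bits == 0, "divisor not normalized"
    m = d.bit_length() // limb_bits               # divisor size in limbs
    d_high = d >> (limb_bits * (m - 1))           # high limb of the divisor
    n = max(1, -(-a.bit_length() // limb_bits))   # dividend size in limbs

    q = r = addbacks = 0
    for i in reversed(range(n)):                  # dividend limbs, high to low
        limb = (a >> (limb_bits * i)) & (base - 1)
        r = r * base + limb                       # r < d*base, at most m+1 limbs
        # 2x1 division of the top two limbs of r by the high divisor limb,
        # capped at the largest limb value; never smaller than the true limb.
        q_limb = min((r >> (limb_bits * (m - 1))) // d_high, base - 1)
        r -= q_limb * d                           # multiply and subtract
        while r < 0:                              # estimate too big: add back
            r += d
            q_limb -= 1
            addbacks += 1
        q = q * base + q_limb
    return q, r, addbacks
```

Running it on random inputs and inspecting the third return value gives a feel for how seldom the addback path is taken with this simple estimate.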
				8531
				8532
				8533	@node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms
				8534	@subsection Divide and Conquer Division
				8535
				8536	For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing.
				8537	Or to be precise by a recursive divide and conquer algorithm based on work by
				8538	Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).
				8539
				8540	The algorithm consists essentially of recognising that a 2N@cross{}N division
				8541	can be done with the basecase division algorithm (@pxref{Basecase Division}),
				8542	but using N/2 limbs as a base, not just a single limb. This way the
				8543	multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
				8544	Karatsuba and higher multiplication algorithms (@pxref{Multiplication
				8545	Algorithms}). The two ``digits'' of the quotient are formed by recursive
				8546	N@cross{}(N/2) divisions.
				8547
				8548	If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
				8549	then the work is about the same as a basecase division, but with more function
				8550	call overheads and with some subtractions separated from the multiplies.
				8551	These overheads mean that it's only when N/2 is above
				8552	@code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use.
				8553
				8554	@code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere
				8555	above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the
				8556	CPU@. An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a
				8557	little by offering a ready-made advantage over repeated @code{mpn_submul_1}
				8558	calls.
				8559
				8560	Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
				8561	@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The
				8562	actual time is a sum over multiplications of the recursed sizes, as can be
				8563	seen near the end of section 2.2 of Burnikel and Ziegler. For example, within
				8564	the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher
				8565	algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log
				8566	N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division
				8567	is about 2 to 4 times slower than an N@cross{}N multiplication.
				8568
				8569
				8570	@node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms
				8571	@subsection Block-Wise Barrett Division
				8572
				8573	For the largest divisions, a block-wise Barrett division algorithm is used.
				8574	Here, the divisor is inverted to a precision determined by the relative size of
				8575	the dividend and divisor. Blocks of quotient limbs are then generated by
				8576	multiplying blocks from the dividend by the inverse.
				8577
				8578	Our block-wise algorithm computes a smaller inverse than in the plain Barrett
				8579	algorithm. For a @math{2n/n} division, the inverse will be just @m{\lceil n/2
				8580	\rceil, ceil(n/2)} limbs.
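
For orientation, the plain Barrett idea (one full-size inverse, not the block-wise variant used by GMP) can be sketched in Python. The function names and the bit-size parameter @code{k} are assumptions for this example.

```python
def barrett_setup(d, k):
    """Approximate inverse of d: floor(2^(2k) / d), for 0 < d < 2^k."""
    return (1 << (2 * k)) // d


def barrett_divmod(a, d, inverse, k):
    """Divide a < 2^(2k) by d using one multiply by the stored inverse."""
    q = (a * inverse) >> (2 * k)      # quotient estimate, never too large
    r = a - q * d
    while r >= d:                     # small upward correction
        q += 1
        r -= d
    return q, r
```

The block-wise algorithm applies the same multiply-by-inverse step to blocks of the dividend, which is what allows it to work with a shorter inverse.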
				8581
				8582
				8583	@node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms
				8584	@subsection Exact Division
				8585
				8586
				8587	A so-called exact division is when the dividend is known to be an exact
				8588	multiple of the divisor. Jebelean's exact division algorithm uses this
				8589	knowledge to make some significant optimizations (@pxref{References}).
				8590
				8591	The idea can be illustrated in decimal for example with 368154 divided by
				8592	543. Because the low digit of the dividend is 4, the low digit of the
				8593	quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
				8594	4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
				8595	the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
				8596	@equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
				8597	subtracted from the dividend leaving 363810. Notice the low digit has become
				8598	zero.
				8599
				8600	The procedure is repeated at the second digit, with the next quotient digit 7
				8601	(@m{1 \mathord{\times} 7 \bmod 10, 1*7 mod 10}), subtracting
				8602	@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at
				8603	the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
				8604	mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
				8605	So the quotient is 678.
				8606
				8607	Notice however that the multiplies and subtractions don't need to extend past
				8608	the low three digits of the dividend, since that's enough to determine the
				8609	three quotient digits. For the last quotient digit no subtraction is needed
				8610	at all. On a 2N@cross{}N division like this one, only about half the work of
				8611	a normal basecase division is necessary.
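
The low-to-high procedure can be sketched in Python for an arbitrary base, so that the decimal example above and a limb-sized base run through the same code. The function name and its parameters are assumptions for illustration, and @code{pow} with a negative exponent needs Python 3.8 or later.

```python
def divexact(a, d, base, n):
    """Return the n low-order quotient digits of a / d, as an integer.

    Assumes d divides a exactly, the quotient has at most n digits, and the
    low digit of d is invertible modulo base.
    """
    inverse = pow(d % base, -1, base)     # for example 7 when d ends in 3, base 10
    a %= base**n                          # only the low n digits are needed
    d %= base**n
    q = 0
    for i in range(n):
        digit = (a % base) * inverse % base
        q += digit * base**i
        a = (a - digit * d) // base       # low digit is now zero; drop it
    return q
```

With the numbers from the text, @code{divexact(368154, 543, 10, 3)} produces the digits 8, 7 and 6 in that order and returns 678. The truncated intermediate values may go negative, which Python's modulo handles; a limb-based version would simply discard the borrow.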
				8612
				8613	For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
				8614	saving over a normal basecase division is in two parts. Firstly, each of the
				8615	Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
				8616	multiply. Secondly, the crossproducts are reduced when @math{Q>M} to
				8617	@m{QM-M(M+1)/2,QM-M(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
				8618	Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many
				8619	divisions are saved, or if Q is small then the crossproducts reduce to a small
				8620	number.
				8621
				8622	The modular inverse used is calculated efficiently by @code{binvert_limb} in
				8623	@file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a
				8624	64-bit limb. @file{tune/modlinv.c} has some alternate implementations that
				8625	might suit processors better at bit twiddling than multiplying.
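
A Newton iteration of the following shape reproduces the multiply counts quoted above. It is a sketch, not the @file{gmp-impl.h} macro: the 8-bit starting value is obtained here with Python's @code{pow} (version 3.8 or later), standing in for whatever cheap method supplies the first bits.

```python
def binvert_limb(d, limb_bits=64):
    """Inverse of an odd d modulo 2^limb_bits by Newton iteration."""
    assert d & 1, "only odd numbers are invertible modulo a power of 2"
    mask = (1 << limb_bits) - 1
    inv = pow(d & 0xFF, -1, 256)      # 8 correct bits to start from
    bits = 8
    while bits < limb_bits:
        # inv <- inv * (2 - d*inv): two multiplies, doubles the correct bits
        inv = (2 * inv - inv * inv * d) & mask
        bits *= 2
    return inv
```

Starting from 8 bits, a 32-bit limb needs two iterations (four multiplies) and a 64-bit limb needs three (six multiplies).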
				8626
				8627	The sub-quadratic exact division described by Jebelean in ``Exact Division
				8628	with Karatsuba Complexity'' is not currently implemented. It uses a
				8629	rearrangement similar to the divide and conquer for normal division
				8630	(@pxref{Divide and Conquer Division}), but operating from low to high. A
				8631	further possibility not currently implemented is ``Bidirectional Exact Integer
				8632	Division'' by Krandick and Jebelean which forms quotient limbs from both the
				8633	high and low ends of the dividend, and can halve once more the number of
				8634	crossproducts needed in a 2N@cross{}N division.
				8635
				8636	A special case exact division by 3 exists in @code{mpn_divexact_by3},
				8637	supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms
				8638	quotient digits with a multiply by the modular inverse of 3 (which is
				8639	@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
				8640	limb. The multiplications don't need to be on the dependent chain, as long as
				8641	the effect of the borrows is applied, which can help chips with pipelined
				8642	multipliers.
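
The two-comparison borrow can be illustrated with the Python sketch below, assuming a 64-bit limb. The constant and function names mirror the description above but are otherwise assumptions; this is not the GMP source.

```python
LIMB_BITS = 64
B = 1 << LIMB_BITS
INVERSE_3 = (2 * B + 1) // 3          # 0xAAAA...AAAB, since 3 * INVERSE_3 = 2*B + 1
THIRD = (B - 1) // 3                  # 0x5555...5555


def divexact_by3(limbs):
    """Divide by 3; limbs are least significant first.

    Returns (quotient_limbs, carry). The carry is 0 exactly when the input
    is a multiple of 3.
    """
    quotient = []
    carry = 0
    for s in limbs:
        borrow = 1 if s < carry else 0
        low = (s - carry) % B
        q = (low * INVERSE_3) % B     # quotient limb from a single multiply
        quotient.append(q)
        # 3*q = low + k*B with k in 0..2; two comparisons recover k
        carry = borrow + (q > THIRD) + (q > 2 * THIRD)
    return quotient, carry
```

Only the small carry links one limb to the next, so the multiplies themselves do not have to wait for each other.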
				8643
				8644
				8645	@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
				8646	@subsection Exact Remainder
				8647	@cindex Exact remainder
				8648
				8649	If the exact division algorithm is done with a full subtraction at each stage
				8650	and the dividend isn't a multiple of the divisor, then low zero limbs are
				8651	produced but with a remainder in the high limbs. For dividend @math{a},
				8652	divisor @math{d}, quotient @math{q}, and @m{b = 2
				8653	\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder
				8654	@math{r} is of the form
				8655	@tex
				8656	$$ a = qd + r b^n $$
				8657	@end tex
				8658	@ifnottex
				8659
				8660	@example
				8661	a = qd + rb^n
				8662	@end example
				8663
				8664	@end ifnottex
				8665	@math{n} represents the number of zero limbs produced by the subtractions,
				8666	that being the number of limbs produced for @math{q}. @math{r} will be in the
				8667	range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
				8668	a factor of @math{b^n}.
				8669
				8670	Carrying out full subtractions at each stage means the same number of cross
				8671	products must be done as a normal division, but there's still some single limb
				8672	divisions saved. When @math{d} is a single limb some simplifications arise,
				8673	providing good speedups on a number of processors.
				8674
				8675	The functions @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the
				8676	internal @code{mpn_redc_X} functions differ subtly in how they return @math{r},
				8677	leading to some negations in the above formula, but all are essentially the
				8678	same.
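
A single-limb version of this computation can be sketched in Python. The function below borrows the name @code{mpn_modexact_1_odd} only loosely: its interface and its sign convention are assumptions for the example. It returns @math{c} with @math{a + c b^n = q d}, so @math{c} plays the role of the negated @math{r} in the formula above.

```python
def modexact_1_odd(limbs, d, limb_bits=64):
    """Exact-remainder style reduction of a by an odd single-limb d.

    limbs holds a, least significant limb first. Returns (q_limbs, c) with
    a + c * b^n == q * d and 0 <= c < d, where b = 2^limb_bits, n = len(limbs).
    """
    b = 1 << limb_bits
    inverse = pow(d, -1, b)                 # inverse of d modulo b (Python 3.8+)
    q_limbs = []
    c = 0
    for s in limbs:
        borrow = 1 if s < c else 0
        low = (s - c) % b
        q = (low * inverse) % b             # one multiply gives the quotient limb
        q_limbs.append(q)
        c = borrow + ((q * d) >> limb_bits) # high limb of q*d carries upward
    return q_limbs, c
```

No 2@cross{}1 division appears anywhere in the loop, and @math{c} is zero exactly when @math{d} divides @math{a}.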
				8679
				8680	@cindex Divisibility algorithm
				8681	@cindex Congruence algorithm
				8682	Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
				8683	leads to divisibility or congruence tests which are potentially more efficient
				8684	than a normal division.
				8685
				8686	The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
				8687	odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and
				8688	@code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}).
				8689
				8690	Montgomery's REDC method for modular multiplications uses operands of the form
				8691	of @m{xb^{-n}, xb^-n} and @m{yb^{-n}, yb^-n} and on calculating @m{(xb^{-n})
				8692	(yb^{-n}), (xb^-n)(y*b^-n)} uses the factor of @math{b^n} in the exact
				8693	remainder to reach a product in the same form @m{(xy)b^{-n}, (xy)b^-n}
				8694	(@pxref{Modular Powering Algorithm}).
				8695
				8696	Notice that @math{r} generally gives no useful information about the ordinary
				8697	remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If
				8698	however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
				8699	ordinary remainder. This occurs whenever @math{d} is a factor of
				8700	@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. For a 32 or
				8701	64 bit limb other such factors include 5, 17 and 257, but no particular use
				8702	has been found for this.
				8703
				8704
				8705	@node Small Quotient Division, , Exact Remainder, Division Algorithms
				8706	@subsection Small Quotient Division
				8707
				8708	An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
				8709	small can be optimized somewhat.
				8710
				8711	An ordinary basecase division normalizes the divisor by shifting it to make
				8712	the high bit set, shifting the dividend accordingly, and shifting the
				8713	remainder back down at the end of the calculation. This is wasteful if only a
				8714	few quotient limbs are to be formed. Instead a division of just the top
				8715	@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
				8716	used to form a trial quotient. This requires only those limbs normalized, not
				8717	the whole of the divisor and dividend.
				8718
				8719	A multiply and subtract then applies the trial quotient to the M@minus{}Q
				8720	unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
				8721	limbs remaining from the trial quotient division). The starting trial
				8722	quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
				8723	too big are detected by first comparing the most significant limbs that will
				8724	arise from the subtraction. An addback is done if the quotient still turns
				8725	out to be 1 too big.
				8726
				8727	This whole procedure is essentially the same as one step of the basecase
				8728	algorithm done in a Q limb base, though with the trial quotient test done only
				8729	with the high limbs, not an entire Q limb ``digit'' product. The correctness
				8730	of this weaker test can be established by following the argument of Knuth
				8731	section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
				8732	+ u_2, v2q>br+u2} condition appropriately relaxed.
				8733
				8734
				8735	@need 1000
				8736	@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
				8737	@section Greatest Common Divisor
				8738	@cindex Greatest common divisor algorithms
				8739	@cindex GCD algorithms
				8740
				8741	@menu
				8742	* Binary GCD::
				8743	* Lehmer's Algorithm::
				8744	* Subquadratic GCD::
				8745	* Extended GCD::
				8746	* Jacobi Symbol::
				8747	@end menu
				8748
				8749
				8750	@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
				8751	@subsection Binary GCD
				8752
				8753	At small sizes GMP uses an @math{O(N^2)} binary style GCD@. This is described
				8754	in many textbooks, for example Knuth section 4.5.2 algorithm B@. It simply
				8755	consists of successively reducing odd operands @math{a} and @math{b} using
				8756
				8757	@quotation
				8758	@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
				8759	strip factors of 2 from @math{a}
				8760	@end quotation
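
As a minimal Python sketch of this iteration (the function names are assumptions for the example, and big integers replace limbs):

```python
def strip_twos(x):
    """Remove all factors of 2 from a nonzero x."""
    # x & -x isolates the lowest set bit; it is the same for x and -x
    low_zeros = (x & -x).bit_length() - 1
    return x >> low_zeros


def binary_gcd_odd(a, b):
    """GCD of two odd positive integers by the subtract-and-shift iteration."""
    assert a > 0 and b > 0 and a & 1 and b & 1
    while a != b:
        a, b = abs(a - b), min(a, b)      # a becomes even, b stays odd
        a = strip_twos(a)
    return a
```

For example, starting from 9 and 15 the pairs visited are (3, 9) and (3, 3), giving a GCD of 3.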
				8761
				8762	The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
				8763	computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
				8764	@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to
				8765	be faster than the Euclidean algorithm everywhere. One reason the binary
				8766	method does well is that the implied quotient at each step is usually small,
				8767	so often only one or two subtractions are needed to get the same effect as a
				8768	division. Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
				8769	section 4.5.3 Theorem E.
				8770
				8771	When the implied quotient is large, meaning @math{b} is much smaller than
				8772	@math{a}, then a division is worthwhile. This is the basis for the initial
				8773	@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
				8774	for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
				8775	big quotients occur too rarely to make it worth checking for them.
				8776
				8777	@sp 1
				8778	The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C
				8779	code as described above. For two N-bit operands, the algorithm takes about
				8780	0.68 iterations per bit. For optimum performance some attention needs to be
				8781	paid to the way the factors of 2 are stripped from @math{a}.
				8782
				8783	Firstly it may be noted that in twos complement the number of low zero bits on
				8784	@math{a-b} is the same as @math{b-a}, so counting or testing can begin on
				8785	@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined.
				8786
				8787	A loop stripping low zero bits tends not to branch predict well, since the
				8788	condition is data dependent. But on average there's only a few low zeros, so
				8789	an option is to strip one or two bits arithmetically then loop for more (as
				8790	done for AMD K6). Or use a lookup table to get a count for several bits then
				8791	loop for more (as done for AMD K7). An alternative approach is to keep just
				8792	one of @math{a} or @math{b} odd and iterate
				8793
				8794	@quotation
				8795	@math{a,b = @abs{}(a-b), @min{}(a,b)} @*
				8796	@math{a = a/2} if even @*
				8797	@math{b = b/2} if even
				8798	@end quotation
				8799
				8800	This requires about 1.25 iterations per bit, but stripping of a single bit at
				8801	each step avoids any branching. Repeating the bit strip reduces to about 0.9
				8802	iterations per bit, which may be a worthwhile tradeoff.
				8803
				8804	Generally with the above approaches a speed of perhaps 6 cycles per bit can be
				8805	achieved, which is still not terribly fast with for instance a 64-bit GCD
				8806	taking nearly 400 cycles. It's this sort of time which means it's not usually
				8807	advantageous to combine a set of divisibility tests into a GCD.
				8808
				8809	Currently, the binary algorithm is used for GCD only when @math{N < 3}.
				8810
				8811	@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
				8812	@comment node-name, next, previous, up
				8813	@subsection Lehmer's Algorithm
				8814
				8815	Lehmer's improvement of the Euclidean algorithm is based on the observation
				8816	that the initial part of the quotient sequence depends only on the most
				8817	significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
				8818	splits off the most significant two limbs, as suggested, e.g., in ``A
				8819	Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
				8820	quotients of two double-limb inputs are collected as a 2 by 2 matrix with
				8821	single-limb elements. This is done by the function @code{mpn_hgcd2}. The
				8822	resulting matrix is applied to the inputs using @code{mpn_mul_1} and
				8823	@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
				8824	limb. In the rare case of a large quotient, no progress can be made by
				8825	examining just the most significant two limbs, and the quotient is computed
				8826	using plain division.
				8827
				8828	The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean
				8829	algorithm and the binary algorithm. The quadratic part of the work is
				8830	the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
				8831	linear work is also significant. There are roughly @math{N} calls to the
				8832	@code{mpn_hgcd2} function. This function uses a couple of important
				8833	optimizations:
				8834
				8835	@itemize
				8836	@item
				8837	It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
				8838	section). This means that when called with the most significant two limbs of
				8839	two large numbers, the returned matrix does not always correspond exactly to
				8840	the initial quotient sequence for the two large numbers; the final quotient
				8841	may sometimes be one off.
				8842
				8843	@item
				8844	It takes advantage of the fact the quotients are usually small. The division
				8845	operator is not used, since the corresponding assembler instruction is very
				8846	slow on most architectures. (This code could probably be improved further; it
				8847	uses many branches that are unfriendly to prediction.)
				8848
				8849	@item
				8850	It switches from double-limb calculations to single-limb calculations half-way
				8851	through, when the input numbers have been reduced in size from two limbs to
				8852	one and a half.
				8853
				8854	@end itemize
				8855
				8856	@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
				8857	@subsection Subquadratic GCD
				8858
				8859	For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
				8860	(Half GCD) function, as a generalization of Lehmer's algorithm.
				8861
				8862	Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
				8863	\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
				8864	matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
				8865	T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
				8866	limbs, while their difference @math{abs(c-d)} must fit in @math{S} limbs. The
				8867	matrix elements will also be of size roughly @math{N/2}.
				8868
				8869	The HGCD base case uses Lehmer's algorithm, but with the above stop condition
				8870	that returns reduced numbers and the corresponding transformation matrix
				8871	half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
				8872	computed recursively, using the divide and conquer algorithm in ``On
				8873	Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
				8874	(@pxref{References}). The recursive algorithm consists of these main
				8875	steps.
				8876
				8877	@itemize
				8878
				8879	@item
				8880	Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
				8881	resulting matrix @math{T_1} to the full numbers, reducing them to a size just
				8882	above @math{3N/4}.
				8883
				8884	@item
				8885	Perform a small number of division or subtraction steps to reduce the numbers
				8886	to size below @math{3N/4}. This is essential mainly for the unlikely case of
				8887	large quotients.
				8888
				8889	@item
				8890	Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
				8891	numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
				8892	them to a size just above @math{N/2}.
				8893
				8894	@item
				8895	Compute @math{T = T_1 T_2}.
				8896
				8897	@item
				8898	Perform a small number of division and subtraction steps to satisfy the
				8899	requirements, and return.
				8900	@end itemize
				8901
				8902	GCD is then implemented as a loop around HGCD, similarly to Lehmer's
				8903	algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
				8904	@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
				8905	sub-quadratic GCD chops off the most significant third of the limbs (the
				8906	proportion is a tuning parameter, and @math{1/3} seems to be more efficient
				8907	than, e.g., @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
				8908	matrix. Once the input numbers are reduced to size below
				8909	@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.
				8910
				8911	The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
				8912	where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.
				8913
				8914	@comment node-name, next, previous, up
				8915
				8916	@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
				8917	@subsection Extended GCD
				8918
				8919	The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
				8920	cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
				8921	ax+by=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
				8922	handle this case. The binary algorithm is used only for single-limb GCDEXT.
				8923	Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}. Above
				8924	this threshold, GCDEXT is implemented as a loop around HGCD, but with more
				8925	book-keeping to keep track of the cofactors. This gives the same asymptotic
				8926	running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}.
				8927
				8928	One difference from plain GCD is that while the inputs @math{a} and @math{b} are
				8929	reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
				8930	size. This makes the tuning of the chopping-point more difficult. The current
				8931	code chops off the most significant half of the inputs for the call to HGCD in
				8932	the first iteration, and the most significant two thirds for the remaining
				8933	calls. This strategy could surely be improved. Also the stop condition for the
				8934	loop, where Lehmer's algorithm is invoked once the inputs are reduced below
				8935	@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
				8936	current size of the cofactors.
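
The opposite growth of inputs and cofactors is visible even in the plain Euclidean form of the algorithm, sketched here in Python. This is not GMP's Lehmer or HGCD based code, and the function name is an assumption for the example.

```python
def gcdext(a, b):
    """Euclidean extended GCD: returns (g, x, y) with a*x + b*y == g."""
    x0, y0, x1, y1 = 1, 0, 0, 1           # cofactors of the two running values
    while b:
        q = a // b
        a, b = b, a - q * b               # the inputs shrink ...
        x0, x1 = x1, x0 - q * x1          # ... while the cofactors grow
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0
```

Printing the bit lengths of @code{a} and @code{x1} inside the loop shows the two sizes moving in opposite directions, which is what complicates the choice of chopping point.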
				8937
				8938	@node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms
				8939	@subsection Jacobi Symbol
				8940	@cindex Jacobi symbol algorithm
				8941
				8942	@c Editor Note: I don't see other people defining the inputs, it would be nice
				8943	@c here because the code uses (a/b) where other references use (n/k)
				8944
				8945	This section covers the Jacobi symbol @m{\left(a \over b\right), (@var{a}/@var{b})}.
				8946
				8947	Initially if either operand fits in a single limb, a reduction is done with
				8948	either @code{mpn_mod_1} or @code{mpn_modexact_1_odd}, followed by the binary
				8949	algorithm on a single limb. The binary algorithm is well suited to a single limb,
				8950	and the whole calculation in this case is quite efficient.
				8951
				8952	For inputs larger than @code{GCD_DC_THRESHOLD}, @code{mpz_jacobi},
				8953	@code{mpz_legendre} and @code{mpz_kronecker} are computed via the HGCD (Half
				8954	GCD) function, as a generalization of Lehmer's algorithm.
				8955
				8956	Most GCD algorithms reduce @math{a} and @math{b} by repeatedly computing the
				8957	quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and iteratively replacing
				8958
				8959	@c Couldn't figure out macros with commas.
				8960	@tex
				8961	$$ a, b = b, a - q * b$$
				8962	@end tex
				8963	@ifnottex
				8964	@math{a, b = b, a - q * b}
				8965	@end ifnottex
				8966
				8967	Different algorithms use different methods for calculating @math{q}, but the core
				8968	algorithm is the same whether we use @ref{Lehmer's Algorithm} or
				8969	@ref{Subquadratic GCD, HGCD}.
				8970
				8971	At each step it is possible to compute if the reduction inverts the Jacobi
				8972	symbol based on the two least significant bits of @var{a} and @var{b}. For
				8973	more details see ``Efficient computation of the Jacobi symbol'' by
				8974	M@"oller (@pxref{References}).
				8975
				8976	A small set of bits is thus used to track state:
				8977	@itemize
				8978	@item
				8979	current sign of result (1 bit)
				8980
				8981	@item
				8982	two least significant bits of @var{a} and @var{b} (4 bits)
				8983
				8984	@item
				8985	a pointer to which input is currently the denominator (1 bit)
				8986	@end itemize
				8987
				8988	In all the routines sign changes for the result are accumulated using fast bit
				8989	twiddling which avoids conditional jumps.
				8990
				8991	After verifying the inputs are coprime (GCD = 1), the final result is
				8992	@m{(-1)^e,(-1)^e}, where @math{e} is the accumulated sign bit.
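
The way low bits drive the sign can be seen in the textbook single-precision method, sketched below in Python. It is not the HGCD-based code described here: it looks at three low bits of the denominator rather than two bits of each operand, and the function name is an assumption for the example.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, by the textbook binary method."""
    assert n > 0 and n & 1
    a %= n
    sign = 1
    while a:
        while (a & 1) == 0:               # (2/n) = -1 when n = 3 or 5 mod 8
            a >>= 1
            if (n & 7) in (3, 5):
                sign = -sign
        a, n = n, a                       # reciprocity: flip if both are 3 mod 4
        if (a & 3) == 3 and (n & 3) == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0          # 0 when the inputs are not coprime
```

The sign updates depend only on a few low bits of the current operands, which is why a small bit field suffices to carry the state through larger reduction steps.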
				8993
				8994	Much of the Jacobi code is shared directly with the HGCD implementations, such
				8995	as the 2x2 matrix calculation, the @ref{Lehmer's Algorithm} basecase and
				8996	@code{GCD_DC_THRESHOLD}.
				8997
				8998	The asymptotic running time is @m{O(M(N)\log N),O(M(N)*log(N))}, where
				8999	@math{M(N)} is the time for multiplying two @math{N}-limb numbers.
				9000
				9001	@need 1000
				9002	@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
				9003	@section Powering Algorithms
				9004	@cindex Powering algorithms
				9005
				9006	@menu
				9007	* Normal Powering Algorithm::
				9008	* Modular Powering Algorithm::
				9009	@end menu
				9010
				9011
				9012	@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
				9013	@subsection Normal Powering
				9014
				9015	Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
				9016	successively squaring and then multiplying by the base when a 1 bit is seen in
				9017	the exponent, as per Knuth section 4.6.3. The ``left to right''
				9018	variant described there is used rather than algorithm A, since it's just as
				9019	easy and can be done with somewhat less temporary memory.
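
A minimal Python sketch of the left to right scheme, with an illustrative function name:

```python
def power_left_to_right(base, exponent):
    """Compute base**exponent by scanning the exponent bits from the top."""
    assert exponent >= 0
    result = 1
    for i in reversed(range(exponent.bit_length())):
        result *= result                  # square once for every bit
        if (exponent >> i) & 1:
            result *= base                # multiply when a 1 bit is seen
    return result
```

Only the running result is kept between steps, and every multiply is by the original base.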
				9020
				9021
				9022	@node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms
				9023	@subsection Modular Powering
				9024
				9025	Modular powering is implemented using a @math{2^k}-ary sliding window
				9026	algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
				9027	(@pxref{References}). @math{k} is chosen according to the size of the
				9028	exponent. Larger exponents use larger values of @math{k}, the choice being
				9029	made to minimize the average number of multiplications that must supplement
				9030	the squaring.
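
A sliding window exponentiation along these lines can be sketched in Python. The window width @code{k} is passed in directly here rather than derived from the exponent size, the reductions use the ordinary @code{%} operator, and the function name is an assumption for the example.

```python
def powm_sliding(g, e, m, k=4):
    """Compute g**e mod m with a 2^k-ary sliding window, for e >= 0, m >= 1."""
    if m == 1:
        return 0
    g %= m
    g2 = g * g % m
    odd = [g]                             # odd[i] holds g^(2i+1)
    for _ in range((1 << (k - 1)) - 1):
        odd.append(odd[-1] * g2 % m)

    result = 1
    i = e.bit_length() - 1
    while i >= 0:
        if not (e >> i) & 1:              # a 0 bit costs just one squaring
            result = result * result % m
            i -= 1
        else:
            j = max(i - k + 1, 0)         # longest window of at most k bits ...
            while not (e >> j) & 1:       # ... that ends in a 1 bit
                j += 1
            window = (e >> j) & ((1 << (i - j + 1)) - 1)
            for _ in range(i - j + 1):
                result = result * result % m
            result = result * odd[window >> 1] % m
            i = j - 1
    return result
```

A larger @code{k} means fewer multiplies by table entries during the scan but a longer precomputed table, which is the tradeoff behind choosing @math{k} from the exponent size.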
				9031
				9032	The modular multiplies and squarings use either a simple division or the REDC
				9033	method by Montgomery (@pxref{References}). REDC is a little faster,
				9034	essentially saving N single limb divisions in a fashion similar to an exact
				9035	remainder (@pxref{Exact Remainder}).
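
The REDC step can be sketched in Python as follows. The sketch uses the common convention of storing @math{x} as @math{xR} mod @math{m} with @math{R = b^n}; the function names, the default of four 64-bit limbs, and the whole-number (rather than limb by limb) formulation are assumptions for the example. @code{pow} with a negative exponent needs Python 3.8 or later.

```python
def redc_setup(m, n, limb_bits=64):
    """Precompute R = b^n and -1/m mod R for an odd modulus m < R."""
    assert m & 1
    r = 1 << (limb_bits * n)
    return r, (-pow(m, -1, r)) % r


def redc(t, m, r, m_neg_inv):
    """Return t / R mod m for 0 <= t < m*R, without dividing by m."""
    u = (t % r) * m_neg_inv % r           # makes t + u*m a multiple of R
    t = (t + u * m) // r                  # exact: the low n limbs are zero
    return t - m if t >= m else t


def montgomery_mul(x, y, m, n=4):
    """x*y mod m, computed through the Montgomery form x*R mod m."""
    r, m_neg_inv = redc_setup(m, n)
    xm, ym = x * r % m, y * r % m          # convert in (ordinary division)
    zm = redc(xm * ym, m, r, m_neg_inv)    # (x*y)*R mod m
    return redc(zm, m, r, m_neg_inv)       # convert out
```

Since @math{R} is a power of 2, the @code{% r} and @code{// r} operations are only masks and shifts. In a powering loop the conversions in and out happen once, and every multiply or squaring in between costs one @code{redc}.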
				9036
				9037
				9038	@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
				9039	@section Root Extraction Algorithms
				9040	@cindex Root extraction algorithms
				9041
				9042	@menu
				9043	* Square Root Algorithm::
				9044	* Nth Root Algorithm::
				9045	* Perfect Square Algorithm::
				9046	* Perfect Power Algorithm::
				9047	@end menu
				9048
				9049
				9050	@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
				9051	@subsection Square Root
				9052	@cindex Square root algorithm
				9053	@cindex Karatsuba square root algorithm
				9054
				9055	Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
				9056	Zimmermann (@pxref{References}).
				9057
				9058	An input @math{n} is split into four parts of @math{k} bits each, so with
				9059	@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2
				9060	+ a1*b + a0}. Part @ms{a,3} must be ``normalized'' so that either the high or
				9061	second highest bit is set. In GMP, @math{k} is kept on a limb boundary and
				9062	the input is left shifted (by an even number of bits) to normalize.
				9063
				9064	The square root of the high two parts is taken, by recursive application of
				9065	the algorithm (bottoming out in a one-limb Newton's method),
				9066	@tex
				9067	$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$
				9068	@end tex
				9069	@ifnottex
				9070
				9071	@example
				9072	s1,r1 = sqrtrem (a3*b + a2)
				9073	@end example
				9074
				9075	@end ifnottex
				9076	This is an approximation to the desired root and is extended by a division to
				9077	give @math{s},@math{r},
				9078	@tex
				9079	$$\eqalign{
				9080	q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr
				9081	s &= s'b + q \cr
				9082	r &= ub + a_0 - q^2
				9083	}$$
				9084	@end tex
				9085	@ifnottex
				9086
				9087	@example
				9088	q,u = divrem (r1*b + a1, 2*s1)
				9089	s = s1*b + q
				9090	r = u*b + a0 - q^2
				9091	@end example
				9092
				9093	@end ifnottex
				9094	The normalization requirement on @ms{a,3} means at this point @math{s} is
				9095	either correct or 1 too big. @math{r} is negative in the latter case, so
				9096	@tex
				9097	$$\eqalign{
				9098	\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr
				9099	r &\leftarrow r + 2s - 1 \cr
				9100	s &\leftarrow s - 1
				9101	}$$
				9102	@end tex
				9103	@ifnottex
				9104
				9105	@example
				9106	if r < 0 then
				9107	  r = r + 2*s - 1
				9108	  s = s - 1
				9109	@end example
				9110
				9111	@end ifnottex
				9112	The algorithm is expressed in a divide and conquer form, but as noted in the
				9113	paper it can also be viewed as a discrete variant of Newton's method, or as a
				9114	variation on the schoolboy method (no longer taught) for square roots two
				9115	digits at a time.
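
The steps above can be sketched in Python as follows. This is an
illustration under assumptions, not the @code{mpn_sqrtrem} code: parts are
split on bit boundaries rather than limb boundaries, the 64-bit base case
threshold is arbitrary, and the names @code{sqrtrem} and
@code{sqrt_small} are made up.

```python
def sqrt_small(n):
    """floor(sqrt(n)) by integer Newton iteration, used as the base case."""
    if n < 2:
        return n
    x = 1 << ((n.bit_length() + 1) // 2)     # starting value >= sqrt(n)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y

def sqrtrem(n):
    """Return (s, r) with s = floor(sqrt(n)) and r = n - s*s."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < (1 << 64):
        s = sqrt_small(n)
        return s, n - s * s
    bits = n.bit_length()
    k = (bits + 3) // 4                      # size of each part in bits
    shift = 4 * k - bits
    shift -= shift % 2                       # normalize by an even shift
    m = n << shift                           # now a3 has one of its top 2 bits set
    b = 1 << k
    a0 = m & (b - 1)
    a1 = (m >> k) & (b - 1)
    high = m >> (2 * k)                      # a3*b + a2
    s1, r1 = sqrtrem(high)
    q, u = divmod(r1 * b + a1, 2 * s1)
    s = s1 * b + q
    r = u * b + a0 - q * q
    if r < 0:                                # s was 1 too big
        r += 2 * s - 1
        s -= 1
    s >>= shift // 2                         # undo the normalization
    return s, n - s * s
```

The remainder is recomputed at the end only because of the normalizing shift;
when no shift is needed the value of @code{r} from the adjustment step is
already the final remainder.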
				9116
				9117	If the remainder @math{r} is not required then usually only a few high limbs
				9118	of @math{r} and @math{u} need to be calculated to determine whether an
				9119	adjustment to @math{s} is required. This optimization is not currently
				9120	implemented.
				9121
				9122	In the Karatsuba multiplication range this algorithm is @m{O({3\over2}
				9123	M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers
				9124	of @math{n} limbs. In the FFT multiplication range this grows to a bound of
				9125	@m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to 1.8 is
				9126	found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.
				9127
				9128	The algorithm does all its calculations in integers and the resulting
				9129	@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
				9130	The extended precision given by @code{mpf_sqrt_ui} is obtained by
				9131	padding with zero limbs.
				9132
				9133
				9134	@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
				9135	@subsection Nth Root
				9136	@cindex Root extraction algorithm
				9137	@cindex Nth root algorithm
				9138
				9139	Integer Nth roots are taken using Newton's method with the following
				9140	iteration, where @math{A} is the input and @math{n} is the root to be taken.
				9141	@tex
				9142	$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
				9143	@end tex
				9144	@ifnottex
				9145
				9146	@example
				9147	         1         A
				9148	a[i+1] = - * ( --------- + (n-1)*a[i] )
				9149	         n     a[i]^(n-1)
				9150	@end example
				9151
				9152	@end ifnottex
				9153	The initial approximation @m{a_1,a[1]} is generated bitwise by successively
				9154	powering a trial root with or without new 1 bits, aiming to be just above the
				9155	true root. The iteration converges quadratically when started from a good
				9156	approximation. When @math{n} is large more initial bits are needed to get
				9157	good convergence. The current implementation is not particularly well
				9158	optimized.
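
A small Python sketch of the iteration is given below. For simplicity it
starts from a power of 2 that is known to be at least the true root, which is
an assumption of the sketch and cruder than the bitwise initial approximation
described above. The name @code{iroot} is made up.

```python
def iroot(A, n):
    """Return floor(A^(1/n)) for A >= 0 and n >= 1 using Newton's iteration."""
    if A < 0 or n < 1:
        raise ValueError("need A >= 0 and n >= 1")
    if A < 2 or n == 1:
        return A
    # Start at 2^ceil(bits/n), which is not below the true root.
    a = 1 << -(-A.bit_length() // n)
    while True:
        # a[i+1] = (A / a[i]^(n-1) + (n-1)*a[i]) / n, with floor divisions
        nxt = (A // a ** (n - 1) + (n - 1) * a) // n
        if nxt >= a:
            return a          # the iterates stopped decreasing
        a = nxt
```

Starting above the root makes the iterates decrease monotonically, so the
first iterate that fails to decrease is the integer root.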
				9159
				9160
				9161	@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
				9162	@subsection Perfect Square
				9163	@cindex Perfect square algorithm
				9164
				9165	A significant fraction of non-squares can be quickly identified by checking
				9166	whether the input is a quadratic residue modulo small integers.
				9167
				9168	@code{mpz_perfect_square_p} first tests the input mod 256, which means just
				9169	examining the low byte. Only 44 different values occur for squares mod 256,
				9170	so 82.8% of inputs can be immediately identified as non-squares.
				9171
				9172	On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total
				9173	99.25% of inputs identified as non-squares. On a 64-bit system 97 is tested
				9174	too, for a total 99.62%.
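
The figures quoted above can be reproduced with a short Python calculation.
The helper names are made up for this check; it relies on the moduli being
pairwise coprime, so the fractions of surviving inputs multiply.

```python
from fractions import Fraction

def square_residues(modulus):
    """Return the set of values x*x mod modulus over all residues x."""
    return set((x * x) % modulus for x in range(modulus))

def rejected_percent(moduli):
    """Percentage of uniformly random inputs rejected by at least one modulus."""
    surviving = Fraction(1)
    for m in moduli:
        surviving *= Fraction(len(square_residues(m)), m)
    return float(100 * (1 - surviving))
```

For example @code{len(square_residues(256))} gives the 44 residues mentioned
above, and @code{rejected_percent([256, 9, 5, 7, 13, 17])} gives the 32-bit
total.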
				9175
				9176	These moduli are chosen because they're factors of @math{2^@W{24}-1} (or
				9177	@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just
				9178	using additions (see @code{mpn_mod_34lsub1}).
				9179
				9180	When nails are in use moduli are instead selected by the @file{gen-psqr.c}
				9181	program and applied with an @code{mpn_mod_1}. The same @math{2^@W{24}-1} or
				9182	@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but
				9183	this is not currently implemented.
				9184
				9185	In any case each modulus is applied to the @code{mpn_mod_34lsub1} or
				9186	@code{mpn_mod_1} remainder and a table lookup identifies non-squares. By
				9187	using a ``modexact'' style calculation, and suitably permuted tables, just one
				9188	multiply each is required, see the code for details. Moduli are also combined
				9189	to save operations, so long as the lookup tables don't become too big.
				9190	@file{gen-psqr.c} does all the pre-calculations.
				9191
				9192	A square root must still be taken for any value that passes these tests, to
				9193	verify it's really a square and not one of the small fraction of non-squares
				9194	that get through (i.e.@: a pseudo-square to all the tested bases).
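
The overall shape, a cheap residue filter followed by a real square root, can
be sketched in Python. Only the mod 256 filter is shown; the other moduli are
left out for brevity. The function name is made up, and @code{math.isqrt}
(Python 3.8 or later) stands in for the square root routine.

```python
import math

# All values a square can take mod 256.
SQUARES_MOD_256 = frozenset((x * x) % 256 for x in range(256))

def is_perfect_square(n):
    """Return True if n is a perfect square."""
    if n < 0:
        return False
    # Quick filter: look only at the low byte.
    if (n & 255) not in SQUARES_MOD_256:
        return False
    # Survivors still need a real square root to confirm.
    root = math.isqrt(n)
    return root * root == n
```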
				9195
				9196	Clearly more residue tests could be done; @code{mpz_perfect_square_p} only
				9197	uses a compact and efficient set. Big inputs would probably benefit from more
				9198	residue testing, while small inputs might be better off with less. The assumed
				9199	distribution of squares versus non-squares in the input would affect such
				9200	considerations.
				9201
				9202
				9203	@node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms
				9204	@subsection Perfect Power
				9205	@cindex Perfect power algorithm
				9206
				9207	Detecting perfect powers is required by some factorization algorithms.
				9208	Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
				9209	extractions, though naturally only prime roots need to be considered.
				9210	(@xref{Nth Root Algorithm}.)
				9211
				9212	If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
				9213	roots which are divisors of @math{e} need to be considered, much reducing the
				9214	work necessary. To this end divisibility by a set of small primes is checked.
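
A simplified Python sketch of the repeated root extraction follows. It tries
every prime root up to the bit length of the input and omits the small prime
divisibility shortcut. The function names are made up, negative inputs are
not handled, and 0 and 1 are treated as perfect powers by assumption.

```python
def iroot(A, n):
    """floor(A^(1/n)) for A >= 2 and n >= 2, by Newton's iteration."""
    a = 1 << -(-A.bit_length() // n)      # starting value >= true root
    while True:
        nxt = (A // a ** (n - 1) + (n - 1) * a) // n
        if nxt >= a:
            return a
        a = nxt

def is_small_prime(p):
    """Trial division, adequate for the small exponents tried here."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True

def is_perfect_power(n):
    """Return True if n = a^b for some integers a >= 0 and b >= 2."""
    if n < 0:
        raise ValueError("negative inputs are not handled in this sketch")
    if n < 2:
        return True
    # If n = a^b with a >= 2 then b is below the bit length of n,
    # and only prime b need to be tried.
    for p in range(2, n.bit_length() + 1):
        if is_small_prime(p) and iroot(n, p) ** p == n:
            return True
    return False
```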
				9215
				9216
				9217	@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
				9218	@section Radix Conversion
				9219	@cindex Radix conversion algorithms
				9220
				9221	Radix conversions are less important than other algorithms. A program
				9222	dominated by conversions should probably use a different data representation.
				9223
				9224	@menu
				9225	* Binary to Radix::
				9226	* Radix to Binary::
				9227	@end menu
				9228
				9229
				9230	@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
				9231	@subsection Binary to Radix
				9232
				9233	Conversions from binary to a power-of-2 radix use a simple and fast
				9234	@math{O(N)} bit extraction algorithm.
				9235
				9236	Conversions from binary to other radices use one of two algorithms. Sizes
				9237	below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
				9238	Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
				9239	@math{n} is the biggest power that fits in a limb. But instead of simply
				9240	using the remainder @math{r} from such divisions, an extra divide step is done
				9241	to give a fractional limb representing @math{r/b^n}. The digits of @math{r}
				9242	can then be extracted using multiplications by @math{b} rather than divisions.
				9243	Special case code is provided for decimal, allowing multiplications by 10 to
				9244	optimize to shifts and adds.
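
The fractional limb idea can be illustrated in Python for a single chunk.
The sketch assumes a 64-bit limb and rounds the fraction up, which keeps every
extracted digit exact as long as @math{b^n} is below @math{2^64}. The name
@code{chunk_digits} is made up, and the real code differs in its details.

```python
LIMB_BITS = 64

def chunk_digits(r, base, ndigits):
    """Digits of r, most significant first, for 0 <= r < base**ndigits < 2**64."""
    big = base ** ndigits
    if not (0 <= r < big < (1 << LIMB_BITS)):
        raise ValueError("chunk does not fit the sketch's assumptions")
    # The one extra division: a limb-sized fraction approximating r / b^n,
    # rounded up so that truncation never loses a digit.
    frac = ((r << LIMB_BITS) + big - 1) // big
    mask = (1 << LIMB_BITS) - 1
    digits = []
    for _ in range(ndigits):
        frac *= base                       # multiply instead of divide
        digits.append(frac >> LIMB_BITS)   # integer part is the next digit
        frac &= mask                       # keep the fractional part
    return digits
```

For decimal on a 64-bit limb the chunk size would be 19 digits, so
@code{chunk_digits(r, 10, 19)} returns 19 digits including leading zeros.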
				9245
				9246	Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
				9247	For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
				9248	calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
				9249	reached. @math{t} is then divided by that largest power, giving a quotient
				9250	which is the digits above that power, and a remainder which is those below.
				9251	These two parts are in turn divided by the second highest power, and so on
				9252	recursively. When a piece has been divided down to less than
				9253	@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
				9254	used.
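
The recursive splitting can be sketched in Python as follows. The names and
the @code{chunk} parameter (digits handled by the base case) are assumptions
of the sketch; the base case here uses plain @code{divmod} by the radix
rather than the fractional limb method.

```python
DIGIT_CHARS = "0123456789abcdefghijklmnopqrstuvwxyz"

def basecase(t, base, width):
    """Exactly width digits of t, zero padded, by repeated division."""
    out = []
    for _ in range(width):
        t, d = divmod(t, base)
        out.append(DIGIT_CHARS[d])
    return "".join(reversed(out))

def convert(t, base, powers, level, chunk):
    """Exactly chunk * 2**level digits of t, where t < powers[level]."""
    if level == 0:
        return basecase(t, base, chunk)
    # Split with the next lower power: quotient is the high half of the digits.
    q, r = divmod(t, powers[level - 1])
    return (convert(q, base, powers, level - 1, chunk)
            + convert(r, base, powers, level - 1, chunk))

def to_radix_string(t, base=10, chunk=8):
    """Digit string of t >= 0 in the given base (2 to 36)."""
    if t < 0 or not 2 <= base <= 36:
        raise ValueError("need t >= 0 and 2 <= base <= 36")
    # powers[i] = base^(chunk * 2^i), squaring until the square exceeds t.
    powers = [base ** chunk]
    while True:
        nxt = powers[-1] * powers[-1]
        if nxt > t:
            break
        powers.append(nxt)
    top = len(powers) - 1
    q, r = divmod(t, powers[top])
    text = (convert(q, base, powers, top, chunk)
            + convert(r, base, powers, top, chunk))
    return text.lstrip("0") or "0"
```

Both halves of every split reuse the same table of powers, which is why the
threshold that applies while recursing need not include the cost of
computing them.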
				9255
				9256	The advantage of this algorithm is that big divisions can make use of the
				9257	sub-quadratic divide and conquer division (@pxref{Divide and Conquer
				9258	Division}), and big divisions tend to have less overhead than lots of
				9259	separate single limb divisions anyway. But in any case the cost of
				9260	calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.
				9261
				9262	@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
				9263	the same basic thing, the point where it becomes worth doing a big division to
				9264	cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
				9265	of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
				9266	assumes that's already available, which is the case when recursing.
				9267
				9268	Since the base case produces digits from least to most significant but they
				9269	want to be stored from most to least, it's necessary to calculate in advance
				9270	how many digits there will be, or at least be sure not to underestimate that.
				9271	For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
				9272	from @code{mp_bases}, rounding up. The result is either correct or one too
				9273	big.
				9274
				9275	Examining some of the high bits of the input could increase the chance of
				9276	getting the exact number of digits, but an exact result every time would not
				9277	be practical, since in general the difference between numbers 100@dots{} and
				9278	99@dots{} is only in the last few bits and the work to identify 99@dots{}
				9279	might well be almost as much as a full conversion.
				9280
				9281	The @math{r/b^n} scheme described above for using multiplications to bring out
				9282	digits might be useful for more than a single limb. Some brief experiments
				9283	with it on the base case when recursing didn't give a noticeable improvement,
				9284	but perhaps that was only due to the implementation. Something similar would
				9285	work for the sub-quadratic divisions too, though there would be the cost of
				9286	calculating a bigger radix power.
				9287
				9288	Another possible improvement for the sub-quadratic part would be to arrange
				9289	for radix powers that balanced the sizes of quotient and remainder produced,
				9290	i.e.@: the highest power would be an @m{b^{nk},b^(n*k)} approximately equal to
				9291	@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to
				9292	smooth out a graph of times against sizes, but may or may not be a net
				9293	speedup.
				9294
				9295
				9296	@node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms
				9297	@subsection Radix to Binary
				9298
				9299	@strong{This section needs to be rewritten, it currently describes the
				9300	algorithms used before GMP 4.3.}
				9301
				9302	Conversions from a power-of-2 radix into binary use a simple and fast
				9303	@math{O(N)} bitwise concatenation algorithm.
				9304
				9305	Conversions from other radices use one of two algorithms. Sizes below
				9306	@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. Groups
				9307	of @math{n} digits are converted to limbs, where @math{n} is the biggest
				9308	power of the base @math{b} which will fit in a limb, then those groups are
				9309	accumulated into the result by multiplying by @math{b^n} and adding. This
				9310	saves multi-precision operations, as per Knuth section 4.4 part E
				9311	(@pxref{References}). Some special case code is provided for decimal, giving
				9312	the compiler a chance to optimize multiplications by 10.
				9313
				9314	Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
				9315	First groups of @math{n} digits are converted into limbs. Then adjacent
				9316	limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x}
				9317	and @math{y} are the limbs. Adjacent limb pairs are combined into quads
				9318	similarly with @m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block
				9319	remains, that being the result.
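
The pairwise combination can be sketched in Python as below. The name
@code{from_radix_string} and the @code{chunk} parameter are made up, and
Python's @code{int} is used to convert each small group of digits, standing
in for the conversion of @math{n} digits into a limb.

```python
def from_radix_string(text, base=10, chunk=8):
    """Convert a non-empty digit string to an integer by pairwise combination."""
    if not text:
        raise ValueError("empty digit string")
    # Pad on the left so the digits split evenly into groups of chunk digits.
    text = "0" * ((-len(text)) % chunk) + text
    blocks = [int(text[i:i + chunk], base) for i in range(0, len(text), chunk)]
    power = base ** chunk
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.insert(0, 0)            # pad at the most significant end
        # Combine adjacent blocks x, y into x * b^(n * 2^i) + y.
        blocks = [blocks[i] * power + blocks[i + 1]
                  for i in range(0, len(blocks), 2)]
        power *= power                     # next level uses the squared power
    return blocks[0]
```

At the later levels each multiplication has two large operands of similar
size, which is where the faster multiplication algorithms pay off.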
				9320
				9321	The advantage of this method is that the multiplications for each @math{x} are
				9322	big blocks, allowing Karatsuba and higher algorithms to be used. But the cost
				9323	of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
				9324	@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on
				9325	some processors much bigger still.
				9326
				9327	@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned
				9328	for decimal), though it might be better based on a limb count, so as to be
				9329	independent of the base. But that sort of count isn't used by the base case
				9330	and so would need some sort of initial calculation or estimate.
				9331
				9332	The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the
				9333	corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
				9334	much faster than @code{mpn_divrem_1} (often by a factor of 5, or more).
				9335
				9336
				9337	@need 1000
				9338	@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
				9339	@section Other Algorithms
				9340
				9341	@menu
				9342	* Prime Testing Algorithm::
				9343	* Factorial Algorithm::
				9344	* Binomial Coefficients Algorithm::
				9345	* Fibonacci Numbers Algorithm::
				9346	* Lucas Numbers Algorithm::
				9347	* Random Number Algorithms::
				9348	@end menu
				9349
				9350
				9351	@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
				9352	@subsection Prime Testing
				9353	@cindex Prime testing algorithms
				9354
				9355	The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic
				9356	Functions}) first does some trial division by small factors and then uses the
				9357	Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
				9358	section 4.5.4 algorithm P (@pxref{References}).
				9359
				9360	For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
				9361	@math{q} is odd, this algorithm selects a random base @math{x} and tests
				9362	whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n,
				9363	x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}. If so then @math{n}
				9364	is probably prime; if not then @math{n} is definitely composite.
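
One round of the test can be sketched in Python as follows, using the usual
strong pseudoprime condition in which the repeated squarings of
@math{x^q} are compared against @math{-1}. The function name is made up,
and the base @math{x} is passed in rather than chosen at random.

```python
def strong_probable_prime(n, x):
    """One Miller-Rabin round for odd n > 2 with base x."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and greater than 2")
    # Write n - 1 = q * 2^k with q odd.
    q, k = n - 1, 0
    while q % 2 == 0:
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    # Square up to k-1 more times, looking for -1.
    for _ in range(k - 1):
        y = y * y % n
        if y == n - 1:
            return True
    return False                # n is definitely composite
```

For instance 2047 = 23*89 passes with base 2 but fails with base 3, showing
why several independent bases are normally tried.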
				9365
				9366	Any prime @math{n} will pass the test, but some composites do too. Such
				9367	composites are known as strong pseudoprimes to base @math{x}. No @math{n} is
				9368	a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise
				9369	22), hence with @math{x} chosen at random there's no more than a @math{1/4}
				9370	chance a ``probable prime'' will in fact be composite.
				9371
				9372	In fact strong pseudoprimes are quite rare, making the test much more
				9373	powerful than this analysis would suggest, but @math{1/4} is all that's proven
				9374	for an arbitrary @math{n}.
				9375
				9376
				9377	@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
				9378	@subsection Factorial
				9379	@cindex Factorial algorithm
				9380
				9381	Factorials are calculated by a combination of two algorithms. Both share one
				9382	idea: compute the odd part of the factorial, then take account of the power of
				9383	@math{2} term with a final shift.
				9384
				9385	For small @math{n}, the odd factor of @math{n!} is computed with the simple
				9386	observation that it is equal to the product of all positive odd numbers
				9387	not exceeding @math{n} times the odd factor of @m{\lfloor n/2\rfloor!, [n/2]!},
				9388	where @m{\lfloor x\rfloor, [x]} is the integer part of @math{x}, and so on
				9389	recursively. The procedure can be best illustrated with an example,
				9390
				9391	@quotation
				9392	@math{23! = (23.21.19.17.15.13.11.9.7.5.3)(11.9.7.5.3)(5.3)2^{19}}
				9393	@end quotation
				9394
				9395	Current code collects all the factors in a single list, with a loop and no
				9396	recursion, and computes the product, with no special care for repeated chunks.
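
The small-@math{n} method can be sketched in Python with a plain loop. The
function names are made up, and the product is accumulated directly rather
than through a list of limb-sized factors. The final shift uses the fact that
the exponent of 2 in @math{n!} is @math{n} minus the number of 1 bits in
@math{n}.

```python
def odd_part_of_factorial(n):
    """Odd part of n!, as products of the odd numbers up to n, n/2, n/4, ..."""
    result = 1
    while n > 1:
        for odd in range(3, n + 1, 2):
            result *= odd
        n //= 2
    return result

def factorial(n):
    """n! as the odd part shifted left by the exponent of 2 in n!."""
    if n < 0:
        raise ValueError("n must be non-negative")
    power_of_two = n - bin(n).count("1")
    return odd_part_of_factorial(n) << power_of_two
```

For @math{n=23} the loop visits 23, 11, 5 and 2, matching the three
parenthesized products in the example, and the shift is 19.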
				9397
				9398	When @math{n} is larger, the computation passes through prime sieving. A helper
				9399	function is used, as suggested by Peter Luschny:
				9400	@tex
				9401	$$\mathop{\rm msf}(n) = {n!\over\lfloor n/2\rfloor!^2\cdot2^k} = \prod_{p=3}^{n}
				9402	p^{\mathop{\rm L}(p,n)} $$
				9403	@end tex
				9404	@ifnottex
				9405
				9406	@example
				9407	                            n
				9408	                          -----
				9409	               n!          | |   L(p,n)
				9410	msf(n) = -------------- =  | |  p
				9411	          [n/2]!^2.2^k     p=3
				9412	@end example
				9413	@end ifnottex
				9414
				9415	Here @math{p} ranges over the odd prime numbers. The exponent @math{k} is chosen to
				9416	obtain an odd integer number: @math{k} is the number of 1 bits in the binary
				9417	representation of @m{\lfloor n/2\rfloor, [n/2]}. The function L@math{(p,n)}
				9418	can be defined as zero when @math{p} is composite, and, for any prime
				9419	@math{p}, it is computed with:
				9420	@tex
				9421	$$\mathop{\rm L}(p,n) = \sum_{i>0}\left\lfloor{n\over p^i}\right\rfloor\bmod2
				9422	\leq\log_p(n)$$
				9423	@end tex
				9424	@ifnottex
				9425
				9426	@example
				9427	          ---
				9428	           \    n
				9429	L(p,n) =   /  [---] mod 2 <= log (n) .
				9430	          ---  p^i              p
				9431	          i>0
				9432	@end example
				9433	@end ifnottex
				9434
				9435	With this helper function, we are able to compute the odd part of @math{n!}
				9436	using the recursion implied by @m{n!=\lfloor n/2\rfloor!^2\cdot\mathop{\rm
				9437	msf}(n)\cdot2^k , n!=[n/2]!^2msf(n)2^k}. The recursion stops using the
				9438	small-@math{n} algorithm on some @m{\lfloor n/2^i\rfloor, [n/2^i]}.
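
A Python sketch of the helper function and the recursion is given below. The
names are made up, the sieve is a simple one over odd numbers, and for
brevity the recursion runs all the way down instead of switching to the
small-@math{n} algorithm.

```python
def odd_primes_up_to(limit):
    """List of odd primes <= limit, by a sieve over odd numbers."""
    is_candidate = bytearray([1]) * (limit + 1)
    primes = []
    for p in range(3, limit + 1, 2):
        if is_candidate[p]:
            primes.append(p)
            for multiple in range(p * p, limit + 1, 2 * p):
                is_candidate[multiple] = 0
    return primes

def msf(n):
    """Odd part of n! / [n/2]!^2, as a product over odd primes."""
    result = 1
    for p in odd_primes_up_to(n):
        exponent = 0
        q = n // p
        while q > 0:
            exponent += q % 2      # [n / p^i] mod 2
            q //= p
        result *= p ** exponent
    return result

def odd_part_of_factorial(n):
    """Odd part of n!, from the recursion n! = [n/2]!^2 * msf(n) * 2^k."""
    if n < 2:
        return 1
    half = odd_part_of_factorial(n // 2)
    return half * half * msf(n)

def factorial(n):
    return odd_part_of_factorial(n) << (n - bin(n).count("1"))
```

As a check, @code{msf(10)} is 63, which is 10!/(5!^2) = 252 with the factor
@math{2^2} removed, and 2 is the number of 1 bits in 5.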
				9439
				9440	Both the above algorithms use binary splitting to compute the product of many
				9441	small factors. At first as many products as possible are accumulated in a
				9442	single register, generating a list of factors that fit in a machine word. This
				9443	list is then split into halves, and the product is computed recursively.
				9444
				9445	Such splitting is more efficient than repeated N@cross{}1 multiplies since it
				9446	forms big multiplies, allowing Karatsuba and higher algorithms to be used.
				9447	And even below the Karatsuba threshold a big block of work can be more
				9448	efficient for the basecase algorithm.
				9449
				9450
				9451	@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
				9452	@subsection Binomial Coefficients
				9453	@cindex Binomial coefficient algorithm
				9454
				9455	Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
				9456	by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
				9457	\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
				9458	evaluating the following product simply from @math{i=2} to @math{i=k}.
				9459	@tex
				9460	$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
				9461	@end tex
				9462	@ifnottex
				9463
				9464	@example
				9465	                      k   (n-k+i)
				9466	C(n,k) =  (n-k+1) * prod  -------
				9467	                     i=2     i
				9468	@end example
				9469
				9470	@end ifnottex
				9471	It's easy to show that each denominator @math{i} will divide the product so
				9472	far, so the exact division algorithm is used (@pxref{Exact Division}).
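
In Python the product can be sketched as below, with floor division standing
in for the exact division. The name @code{binomial} is made up, and no
accumulation of factors into limbs is attempted.

```python
def binomial(n, k):
    """C(n,k) for integers n >= 0, evaluated as a running product."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)                 # arrange k <= n/2
    if k == 0:
        return 1
    result = n - k + 1
    for i in range(2, k + 1):
        # After the multiply, result is i * C(n-k+i, i), so the division is exact.
        result = result * (n - k + i) // i
    return result
```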
				9473
				9474	The numerators @math{n-k+i} and denominators @math{i} are first accumulated
				9475	into as many as fit in a limb, to save multi-precision operations, though for
				9476	@code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an
				9477	@code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.
				9478
				9479
				9480	@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
				9481	@subsection Fibonacci Numbers
				9482	@cindex Fibonacci number algorithm
				9483
				9484	The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
				9485	for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
				9486	values efficiently.
				9487
				9488	For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
				9489	used. On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
				9490	up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}.
				9491
				9492	Beyond the table, values are generated with a binary powering algorithm,
				9493	calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
				9494	low across the bits of @math{n}. The formulas used are
				9495	@tex
				9496	$$\eqalign{
				9497	F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
				9498	F_{2k-1} &= F_k^2 + F_{k-1}^2 \cr
				9499	F_{2k} &= F_{2k+1} - F_{2k-1}
				9500	}$$
				9501	@end tex
				9502	@ifnottex
				9503
				9504	@example
				9505	F[2k+1] = 4F[k]^2 - F[k-1]^2 + 2(-1)^k
				9506	F[2k-1] = F[k]^2 + F[k-1]^2
				9507
				9508	F[2k] = F[2k+1] - F[2k-1]
				9509	@end example
				9510
				9511	@end ifnottex
				9512	At each step, @math{k} is the high @math{b} bits of @math{n}. If the next bit
				9513	of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
				9514	it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
				9515	repeated until all bits of @math{n} are incorporated. Notice these formulas
				9516	require just two squares per bit of @math{n}.
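
The doubling process can be sketched in Python as follows. The name
@code{fib2} is made up for the sketch, and it starts from @math{k=1} rather
than from a table entry.

```python
def fib2(n):
    """Return the pair (F[n], F[n-1]) for n >= 0."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0, 1                       # F[0], F[-1]
    f, f1 = 1, 0                          # F[1], F[0], for the top bit of n
    k = 1
    for bit in bin(n)[3:]:                # remaining bits, high to low
        sq_f, sq_f1 = f * f, f1 * f1      # the two squares for this bit
        f2k_plus = 4 * sq_f - sq_f1 + (2 if k % 2 == 0 else -2)   # F[2k+1]
        f2k_minus = sq_f + sq_f1                                  # F[2k-1]
        f2k = f2k_plus - f2k_minus                                # F[2k]
        if bit == "1":
            f, f1, k = f2k_plus, f2k, 2 * k + 1
        else:
            f, f1, k = f2k, f2k_minus, 2 * k
    return f, f1
```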
				9517
				9518	It'd be possible to handle the first few @math{n} above the single limb table
				9519	with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
				9520	F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
				9521	turns out to be faster for only about 10 or 20 values of @math{n}, and
				9522	including a block of code for just those doesn't seem worthwhile. If they
				9523	really mattered it'd be better to extend the data table.
				9524
				9525	Using a table avoids lots of calculations on small numbers, and makes small
				9526	@math{n} go fast. A bigger table would make more small @math{n} go fast, it's
				9527	just a question of balancing size against desired speed. For GMP the code is
				9528	kept compact, with the emphasis primarily on a good powering algorithm.
				9529
				9530	@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
				9531	@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last
				9532	step of the algorithm can become one multiply instead of two squares. One of
				9533	the following two formulas is used, according as @math{n} is odd or even.
				9534	@tex
				9535	$$\eqalign{
				9536	F_{2k} &= F_k (F_k + 2F_{k-1}) \cr
				9537	F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
				9538	}$$
				9539	@end tex
				9540	@ifnottex
				9541
				9542	@example
				9543	F[2k] = F[k]*(F[k]+2F[k-1])
				9544
				9545	F[2k+1] = (2F[k]+F[k-1])(2F[k]-F[k-1]) + 2(-1)^k
				9546	@end example
				9547
				9548	@end ifnottex
				9549	@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
				9550	multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
				9551	can be applied just to the low limb of the calculation, without a carry or
				9552	borrow into further limbs, which saves some code size. See comments with
				9553	@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.
				9554
				9555
				9556	@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms
				9557	@subsection Lucas Numbers
				9558	@cindex Lucas number algorithm
				9559
				9560	@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
				9561	numbers with the following simple formulas.
				9562	@tex
				9563	$$\eqalign{
				9564	L_k &= F_k + 2F_{k-1} \cr
				9565	L_{k-1} &= 2F_k - F_{k-1}
				9566	}$$
				9567	@end tex
				9568	@ifnottex
				9569
				9570	@example
				9571	L[k] = F[k] + 2*F[k-1]
				9572	L[k-1] = 2*F[k] - F[k-1]
				9573	@end example
				9574
				9575	@end ifnottex
				9576	@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
				9577	saved. Trailing zero bits on @math{n} can be handled with a single square
				9578	each.
				9579	@tex
				9580	$$ L_{2k} = L_k^2 - 2(-1)^k $$
				9581	@end tex
				9582	@ifnottex
				9583
				9584	@example
				9585	L[2k] = L[k]^2 - 2*(-1)^k
				9586	@end example
				9587
				9588	@end ifnottex
				9589	And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
				9590	numbers, similar to what @code{mpz_fib_ui} does.
				9591	@tex
				9592	$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
				9593	@end tex
				9594	@ifnottex
				9595
				9596	@example
				9597	L[2k+1] = 5F[k-1](2F[k]+F[k-1]) - 4(-1)^k
				9598	@end example
				9599
				9600	@end ifnottex
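
Putting the two formulas together, a single Lucas number can be sketched in
Python as below. The names are made up, and the Fibonacci pair is produced by
a plain additive loop just to keep the sketch short, where the powering
algorithm described earlier would be used in practice.

```python
def fib_pair(k):
    """Return (F[k], F[k-1]) for k >= 0 by simple additions."""
    f, f1 = 0, 1                          # F[0], F[-1]
    for _ in range(k):
        f, f1 = f + f1, f
    return f, f1

def lucnum(n):
    """Return L[n] for n >= 0."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 2
    zeros = 0
    while n % 2 == 0:                     # strip trailing zero bits
        n //= 2
        zeros += 1
    k = n // 2                            # now n = 2k+1
    fk, fk1 = fib_pair(k)
    # L[2k+1] = 5 F[k-1] (2 F[k] + F[k-1]) - 4 (-1)^k
    value = 5 * fk1 * (2 * fk + fk1) - (4 if k % 2 == 0 else -4)
    index = n
    for _ in range(zeros):
        # L[2j] = L[j]^2 - 2 (-1)^j, one square per trailing zero bit
        value = value * value - (2 if index % 2 == 0 else -2)
        index *= 2
    return value
```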
				9601
				9602
				9603	@node Random Number Algorithms, , Lucas Numbers Algorithm, Other Algorithms
				9604	@subsection Random Numbers
				9605	@cindex Random number algorithms
				9606
				9607	For the @code{urandomb} functions, random numbers are generated simply by
				9608	concatenating bits produced by the generator. As long as the generator has
				9609	good randomness properties this will produce well-distributed @math{N} bit
				9610	numbers.
				9611
				9612	For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N}
				9613	are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil,
				9614	ceil(log2(N))} bits each until one satisfies @math{R<N}. This will normally
				9615	require only one or two attempts, but the attempts are limited in case the
				9616	generator is somehow degenerate and produces only 1 bits or similar.
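
The rejection loop can be sketched in Python as follows. The function name,
the attempt limit of 80 and the fallback of subtracting @math{N} once the
limit is reached are assumptions of this sketch. Any object with a
@code{getrandbits} method can serve as the generator.

```python
import random

def urandomm(n, rng=None, max_attempts=80):
    """Return R with 0 <= R < n, drawing ceil(log2(n)) bits per attempt."""
    if n < 1:
        raise ValueError("n must be positive")
    if rng is None:
        rng = random.Random()
    bits = (n - 1).bit_length()           # ceil(log2(n)) for n >= 1
    if bits == 0:
        return 0                          # n == 1, only one possible value
    r = 0
    for _ in range(max_attempts):
        r = rng.getrandbits(bits)
        if r < n:
            return r
    # Degenerate generator: r < 2^bits < 2n, so r - n is still in range.
    return r - n
```

Since @math{N} is more than half of @math{2^bits}, each attempt succeeds with
probability above 1/2, which is why one or two attempts normally suffice.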
				9617
				9618	@cindex Mersenne twister algorithm
				9619	The Mersenne Twister generator is by Matsumoto and Nishimura
				9620	(@pxref{References}). It has a non-repeating period of @math{2^@W{19937}-1},
				9621	which is a Mersenne prime, hence the name of the generator. The state is 624
				9622	words of 32-bits each, which is iterated with one XOR and shift for each
				9623	32-bit word generated, making the algorithm very fast. Randomness properties
				9624	are also very good and this is the default algorithm used by GMP.
				9625
				9626	@cindex Linear congruential algorithm
				9627	Linear congruential generators are described in many text books, for instance
				9628	Knuth volume 2 (@pxref{References}). With a modulus @math{M} and parameters
				9629	@math{A} and @math{C}, an integer state @math{S} is iterated by the formula
				9630	@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}. At each step the new
				9631	state is a linear function of the previous, mod @math{M}, hence the name of
				9632	the generator.
				9633
				9634	In GMP only moduli of the form @math{2^N} are supported, and the current
				9635	implementation is not as well optimized as it could be. Overheads are
				9636	significant when @math{N} is small, and when @math{N} is large clearly the
				9637	multiply at each step will become slow. This is not a big concern, since the
				9638	Mersenne Twister generator is better in every respect and is therefore
				9639	recommended for all normal applications.
				9640
				9641	For both generators the current state can be deduced by observing enough
				9642	output and applying some linear algebra (over GF(2) in the case of the
				9643	Mersenne Twister). This generally means raw output is unsuitable for
				9644	cryptographic applications without further hashing or the like.
				9645
				9646
				9647	@node Assembly Coding, , Other Algorithms, Algorithms
				9648	@section Assembly Coding
				9649	@cindex Assembly coding
				9650
				9651	The assembly subroutines in GMP are the most significant source of speed at
				9652	small to moderate sizes. At larger sizes algorithm selection becomes more
				9653	important, but of course speedups in low level routines will still speed up
				9654	everything proportionally.
				9655
				9656	Carry handling and widening multiplies that are important for GMP can't be
				9657	easily expressed in C@. GCC @code{asm} blocks help a lot and are provided in
				9658	@file{longlong.h}, but hand coding low level routines invariably offers a
				9659	speedup over generic C by a factor of anything from 2 to 10.
				9660
				9661	@menu
				9662	* Assembly Code Organisation::
				9663	* Assembly Basics::
				9664	* Assembly Carry Propagation::
				9665	* Assembly Cache Handling::
				9666	* Assembly Functional Units::
				9667	* Assembly Floating Point::
				9668	* Assembly SIMD Instructions::
				9669	* Assembly Software Pipelining::
				9670	* Assembly Loop Unrolling::
				9671	* Assembly Writing Guide::
				9672	@end menu
				9673
				9674
				9675	@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding
				9676	@subsection Code Organisation
				9677	@cindex Assembly code organisation
				9678	@cindex Code organisation
				9679
				9680	The various @file{mpn} subdirectories contain machine-dependent code, written
				9681	in C or assembly. The @file{mpn/generic} subdirectory contains default code,
				9682	used when there's no machine-specific version of a particular file.
				9683
				9684	Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and
				9685	64-bit variants in a family cannot share code and have separate directories.
				9686	Within a family further subdirectories may exist for CPU variants.
				9687
				9688	In each directory a @file{nails} subdirectory may exist, holding code with
				9689	nails support for that CPU variant. A @code{NAILS_SUPPORT} directive in each
				9690	file indicates the nails values the code handles. Nails code only exists
				9691	where it's faster, or promises to be faster, than plain code. There's no
				9692	effort put into nails if they're not going to enhance a given CPU.
				9693
				9694
				9695	@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding
				9696	@subsection Assembly Basics
				9697
				9698	@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
				9699	for overall GMP performance. All multiplications and divisions come down to
				9700	repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n},
				9701	@code{mpn_lshift} and @code{mpn_rshift} are next most important.
				9702
				9703	On some CPUs assembly versions of the internal functions
				9704	@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
				9705	mainly through avoiding function call overheads. They can also potentially
				9706	make better use of a wide superscalar processor, as can bigger primitives like
				9707	@code{mpn_addmul_2} or @code{mpn_addmul_4}.
				9708
				9709	The restrictions on overlaps between sources and destinations
				9710	(@pxref{Low-level Functions}) are designed to facilitate a variety of
				9711	implementations. For example, knowing @code{mpn_add_n} won't have partly
				9712	overlapping sources and destination means reading can be done far ahead of
				9713	writing on superscalar processors, and loops can be vectorized on a vector
				9714	processor, depending on the carry handling.
				9715
				9716
				9717	@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding
				9718	@subsection Carry Propagation
				9719	@cindex Assembly carry propagation
				9720
				9721	The problem that presents most challenges in GMP is propagating carries from
				9722	one limb to the next. In functions like @code{mpn_addmul_1} and
				9723	@code{mpn_add_n}, carries are the only dependencies between limb operations.
				9724
				9725	On processors with carry flags, a straightforward CISC style @code{adc} is
				9726	generally best. AMD K6 @code{mpn_addmul_1} however is an example of an
				9727	unusual set of circumstances where a branch works out better.
				9728
				9729	On RISC processors generally an add and compare for overflow is used. This
				9730	sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry
				9731	propagation schemes require 4 instructions, meaning at least 4 cycles per
				9732	limb, but other schemes may use just 1 or 2. On wide superscalar processors
				9733	performance may be completely determined by the number of dependent
				9734	instructions between carry-in and carry-out for each limb.
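
The add and compare idiom can be sketched in C roughly as follows. This is an illustration only, not a copy of GMP's code: the name @code{sketch_add_n} and the use of @code{uint64_t} limbs are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mpn_add_n: rp[] = ap[] + bp[] over n limbs,
   returning the carry out (0 or 1).  Limbs are assumed to be uint64_t.  */
uint64_t
sketch_add_n (uint64_t *rp, const uint64_t *ap, const uint64_t *bp, size_t n)
{
  uint64_t cy = 0;
  assert (n >= 1);
  for (size_t i = 0; i < n; i++)
    {
      uint64_t a = ap[i];
      uint64_t s = a + bp[i];   /* wraps modulo 2^64 */
      uint64_t c1 = s < a;      /* carry out of a + b */
      uint64_t r = s + cy;      /* add the carry in */
      uint64_t c2 = r < s;      /* carry out of adding the carry in */
      rp[i] = r;
      cy = c1 | c2;             /* at most one of the two can be set */
    }
  return cy;
}
```

In this form the chain from carry-in to carry-out is an add, a compare and an OR, which is the kind of dependent sequence that limits speed on a wide superscalar processor.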
				9735
				9736	On vector processors good use can be made of the fact that a carry bit only
				9737	very rarely propagates more than one limb. When adding a single bit to a
				9738	limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on
				9739	random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
				9740	2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this: it adds
				9741	all limbs in parallel, adds one set of carry bits in parallel and then only
				9742	rarely needs to fall through to a loop propagating further carries.
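
The three phases might look roughly like this in plain C. It is a sketch of the idea only and is not taken from @file{mpn/cray/add_n.c}; the name @code{sketch_vec_add_n}, the @code{uint64_t} limbs and the caller-supplied scratch array @code{cy} are all assumptions. The first two loops have no dependencies between iterations, which is what a vector unit would exploit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch only: rp[] = ap[] + bp[] over n limbs, returning the carry out.
   cy[] is caller-supplied scratch space of n limbs (an assumption of this
   sketch) and must not overlap rp, ap or bp.  */
uint64_t
sketch_vec_add_n (uint64_t *rp, const uint64_t *ap, const uint64_t *bp,
                  uint64_t *cy, size_t n)
{
  uint64_t pending = 0;
  assert (n >= 1);

  /* Pass 1: independent limb sums, recording the carry out of each.  */
  for (size_t i = 0; i < n; i++)
    {
      uint64_t s = ap[i] + bp[i];
      cy[i] = s < ap[i];
      rp[i] = s;
    }
  uint64_t top = cy[n - 1];     /* carry out of the top limb from pass 1 */

  /* Pass 2: add each limb's carry-in.  Iterating downwards lets cy[i] be
     reused for the (rare) new carry while cy[i-1] still holds pass 1's.  */
  for (size_t i = n - 1; i >= 1; i--)
    {
      uint64_t t = rp[i] + cy[i - 1];
      cy[i] = t < rp[i];
      rp[i] = t;
      pending |= cy[i];
    }
  cy[0] = 0;                    /* limb 0 has no carry-in */

  /* Rare path: ripple any carries that pass 2 generated.  */
  if (pending)
    for (size_t i = 1; i < n; i++)
      if (cy[i - 1])
        {
          rp[i] += 1;
          cy[i] |= (rp[i] == 0);
        }

  return top | cy[n - 1];
}
```

A limb can produce a carry in pass 1 or when its carry-in is added, but never both, so combining the two with an OR is sufficient.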
				9743
				9744	On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
				9745	for the RISC style idioms that are necessary to handle carry bits in
				9746	C@. Often conditional jumps are generated where @code{adc} or @code{sbb} forms
				9747	would be better. And so unfortunately almost any loop involving carry bits
				9748	needs to be coded in assembly for best results.
				9749
				9750
				9751	@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding
				9752	@subsection Cache Handling
				9753	@cindex Assembly cache handling
				9754
				9755	GMP aims to perform well both on operands that fit entirely in L1 cache and
				9756	those which don't.
				9757
				9758	Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
				9759	large operands, so L2 and main memory performance is important for them.
				9760	@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
				9761	square basecases, so L1 performance matters most for them, unless assembly
				9762	versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
				9763	which case the remaining uses are mostly for larger operands.
				9764
				9765	For L2 or main memory operands, memory access times will almost certainly be
				9766	more than the calculation time. The aim therefore is to maximize memory
				9767	throughput, by starting a load of the next cache line while processing the
				9768	contents of the previous one. Clearly this is only possible if the chip has a
				9769	lock-up free cache or some sort of prefetch instruction. Most current chips
				9770	have both these features.
				9771
				9772	Prefetching sources combines well with loop unrolling, since a prefetch can be
				9773	initiated once per unrolled loop (or more than once if the loop covers more
				9774	than one cache line).
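
As a rough C illustration, a loop processing one cache line of limbs per iteration can issue a single prefetch each time round. The sketch below assumes 64-byte cache lines, @code{uint64_t} limbs and GCC's @code{__builtin_prefetch}; the routine name @code{sketch_lshift1} and the prefetch distance are made up for the example and would need measuring on a real CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: 64-byte cache lines, so 8 limbs of 8 bytes.  */
#define LIMBS_PER_LINE 8
/* Assumed prefetch distance in limbs; a real value needs measuring.  */
#define PREFETCH_AHEAD 16

#if defined (__GNUC__)
#define SKETCH_PREFETCH(p) __builtin_prefetch (p)
#else
#define SKETCH_PREFETCH(p) ((void) 0)
#endif

/* Sketch: rp[] = up[] << 1 over n limbs, returning the bit shifted out.
   Unlike mpn_lshift this works from the low limb upwards.  */
uint64_t
sketch_lshift1 (uint64_t *rp, const uint64_t *up, size_t n)
{
  uint64_t cy = 0;
  size_t i = 0;

  for (; i + LIMBS_PER_LINE <= n; i += LIMBS_PER_LINE)
    {
      /* One prefetch per block, kept within the operand.  */
      if (i + PREFETCH_AHEAD < n)
        SKETCH_PREFETCH (&up[i + PREFETCH_AHEAD]);
      for (size_t j = 0; j < LIMBS_PER_LINE; j++)  /* stands for unrolled code */
        {
          uint64_t u = up[i + j];
          rp[i + j] = (u << 1) | cy;
          cy = u >> 63;
        }
    }
  for (; i < n; i++)                               /* leftover limbs */
    {
      uint64_t u = up[i];
      rp[i] = (u << 1) | cy;
      cy = u >> 63;
    }
  return cy;
}
```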
				9775
				9776	On CPUs without write-allocate caches, prefetching destinations ensures that
				9777	individual stores don't go further down the cache hierarchy, which would limit
				9778	bandwidth. Of course for calculations which are slow anyway, like
				9779	@code{mpn_divrem_1}, write-throughs might be fine.
				9780
				9781	The distance ahead to prefetch will be determined by memory latency versus
				9782	throughput. The aim of course is to have data arriving continuously, at peak
				9783	throughput. Some CPUs have limits on the number of fetches or prefetches in
				9784	progress.
				9785
				9786	If a special prefetch instruction doesn't exist then a plain load can be used,
				9787	but in that case care must be taken not to attempt to read past the end of an
				9788	operand, since that might produce a segmentation violation.
				9789
				9790	Some CPUs or systems have hardware that detects sequential memory accesses and
				9791	initiates suitable cache movements automatically, making life easy.
				9792
				9793
				9794	@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding
				9795	@subsection Functional Units
				9796
				9797	When choosing an approach for an assembly loop, consideration is given to
				9798	what operations can execute simultaneously and what throughput can thereby be
				9799	achieved. In some cases an algorithm can be tweaked to accommodate available
				9800	resources.
				9801
				9802	Loop control will generally require a counter and pointer updates, costing as
				9803	much as 5 instructions, plus any delays a branch introduces. CPU addressing
				9804	modes might reduce pointer updates, perhaps by allowing just one updating
				9805	pointer and others expressed as offsets from it, or on CISC chips with all
				9806	addressing done with the loop counter as a scaled index.
				9807
				9808	The final loop control cost can be amortised by processing several limbs in
				9809	each iteration (@pxref{Assembly Loop Unrolling}). This at least ensures loop
				9810	control isn't a big fraction of the work done.
				9811
				9812	Memory throughput is always a limit. If only one load or one store can be
				9813	done per cycle then 3 cycles/limb (two loads, one store) is the top speed for
				9814	``binary'' operations like @code{mpn_add_n}, and code achieving that is optimal.
				9815
				9816	Integer resources can be freed up by having the loop counter in a float
				9817	register, or by pressing the float units into use for some multiplying,
				9818	perhaps doing every second limb on the float side (@pxref{Assembly Floating
				9819	Point}).
				9820
				9821	Float resources can be freed up by doing carry propagation on the integer
				9822	side, or even by doing integer to float conversions in integers using bit
				9823	twiddling.
				9824
				9825
				9826	@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding
				9827	@subsection Floating Point
				9828	@cindex Assembly floating point
				9829
				9830	Floating point arithmetic is used in GMP for multiplications on CPUs with poor
				9831	integer multipliers. It's mostly useful for @code{mpn_mul_1},
				9832	@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
				9833	@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.
				9834
				9835	With IEEE 53-bit double precision floats, integer multiplications producing up
				9836	to 53 bits will give exact results. Breaking a 64@cross{}64 multiplication
				9837	into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient. With
				9838	some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
				9839	used, if one of the lower two 21-bit pieces also uses the sign bit.
				9840
				9841	For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
				9842	invariant single limb is split at the start, into 3 or 4 pieces. Inside the
				9843	loop, the bignum operand is split into 32-bit pieces. Fast conversion of
				9844	these unsigned 32-bit pieces to floating point is highly machine-dependent.
				9845	In some cases, reading the data into the integer unit, zero-extending to
				9846	64 bits, then transferring back via memory to the floating point unit is the
				9847	only option.
				9848
				9849	Converting partial products back to 64-bit limbs is usually best done as a
				9850	signed conversion. Since all values are smaller than @m{2^{53},2^53}, signed
				9851	and unsigned are the same, but most processors lack unsigned conversions.
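
The exactness claim and the signed conversion can be illustrated with a small C function. This is a sketch under the assumption that @code{double} is an IEEE 53-bit format; the name @code{sketch_fmul_32x16} is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: multiply a 32-bit piece by a 16-bit piece on the floating point
   side.  The product is below 2^48, so it fits the 53-bit mantissa of an
   IEEE double exactly.  Doubles are assumed to be IEEE binary64.  */
uint64_t
sketch_fmul_32x16 (uint32_t u, uint16_t v)
{
  double p = (double) u * (double) v;   /* exact, no rounding occurs */
  return (uint64_t) (int64_t) p;        /* signed conversion, value < 2^53 */
}
```

Because the value is non-negative and below 2^53, the signed conversion gives the same integer an unsigned one would.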
				9852
				9853	@sp 2
				9854
				9855	Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
				9856	@code{mpn_addmul_1} with a 64-bit limb. The single limb operand V is split
				9857	into four 16-bit parts. The multi-limb operand U is split in the loop into
				9858	two 32-bit parts.
				9859
				9860	@tex
				9861	\global\newdimen\GMPbits \global\GMPbits=0.18em
				9862	\def\GMPbox#1#2#3{%
				9863	\hbox{%
				9864	\hbox to 128\GMPbits{\hfil
				9865	\vbox{%
				9866	\hrule
				9867	\hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
				9868	\hrule}%
				9869	\hskip #1\GMPbits}%
				9870	\raise \GMPboxdepth \hbox{\hskip 2em #3}}}
				9871	%
				9872	\GMPdisplay{%
				9873	\vbox{%
				9874	\hbox{%
				9875	\hbox to 128\GMPbits {\hfil
				9876	\vbox{%
				9877	\hrule
				9878	\hbox to 64\GMPbits{%
				9879	\GMPvrule \hfil$v48$\hfil
				9880	\vrule \hfil$v32$\hfil
				9881	\vrule \hfil$v16$\hfil
				9882	\vrule \hfil$v00$\hfil
				9883	\vrule}
				9884	\hrule}}%
				9885	\raise \GMPboxdepth \hbox{\hskip 2em V Operand}}
				9886	\vskip 0.5ex
				9887	\hbox{%
				9888	\hbox to 128\GMPbits {\hfil
				9889	\raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}%
				9890	\vbox{%
				9891	\hrule
				9892	\hbox to 64\GMPbits {%
				9893	\GMPvrule \hfil$u32$\hfil
				9894	\vrule \hfil$u00$\hfil
				9895	\vrule}%
				9896	\hrule}}%
				9897	\raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}%
				9898	\vskip 0.5ex
				9899	\hbox{\vbox to 2ex{\hrule width 128\GMPbits}}%
				9900	\GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}%
				9901	\vskip 0.5ex
				9902	\GMPbox{16}{u00 \times v16}{$p16$}
				9903	\vskip 0.5ex
				9904	\GMPbox{32}{u00 \times v32}{$p32$}
				9905	\vskip 0.5ex
				9906	\GMPbox{48}{u00 \times v48}{$p48$}
				9907	\vskip 0.5ex
				9908	\GMPbox{32}{u32 \times v00}{$r32$}
				9909	\vskip 0.5ex
				9910	\GMPbox{48}{u32 \times v16}{$r48$}
				9911	\vskip 0.5ex
				9912	\GMPbox{64}{u32 \times v32}{$r64$}
				9913	\vskip 0.5ex
				9914	\GMPbox{80}{u32 \times v48}{$r80$}
				9915	}}
				9916	@end tex
				9917	@ifnottex
				9918	@example
				9919	@group
				9920	                +---+---+---+---+
				9921	                |v48|v32|v16|v00|    V operand
				9922	                +---+---+---+---+
				9923
				9924	                +-------+---+---+
				9925	            x   |  u32  |  u00  |    U operand (one limb)
				9926	                +---------------+
				9927
				9928	---------------------------------
				9929
				9930	                    +-----------+
				9931	                    | u00 x v00 |    p00    48-bit products
				9932	                    +-----------+
				9933	                +-----------+
				9934	                | u00 x v16 |        p16
				9935	                +-----------+
				9936	            +-----------+
				9937	            | u00 x v32 |            p32
				9938	            +-----------+
				9939	        +-----------+
				9940	        | u00 x v48 |                p48
				9941	        +-----------+
				9942	            +-----------+
				9943	            | u32 x v00 |            r32
				9944	            +-----------+
				9945	        +-----------+
				9946	        | u32 x v16 |                r48
				9947	        +-----------+
				9948	    +-----------+
				9949	    | u32 x v32 |                    r64
				9950	    +-----------+
				9951	+-----------+
				9952	| u32 x v48 |                        r80
				9953	+-----------+
				9954	@end group
				9955	@end example
				9956	@end ifnottex
				9957
				9958	@math{p32} and @math{r32} can be summed using floating-point addition, and
				9959	likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed
				9960	with @math{r64} and @math{r80} from the previous iteration.
				9961
				9962	For each loop then, four 49-bit quantities are transferred to the integer unit,
				9963	aligned as follows,
				9964
				9965	@tex
				9966	% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80'
				9967	% crossing into the upper 64 bits.
				9968	\def\GMPbox#1#2#3{%
				9969	\hbox{%
				9970	\hbox to 128\GMPbits {%
				9971	\hfil
				9972	\vbox{%
				9973	\hrule
				9974	\hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
				9975	\hrule}%
				9976	\hskip #1\GMPbits}%
				9977	\raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}%
				9978	}}
				9979	\newbox\b \setbox\b\hbox{64 bits}%
				9980	\newdimen\bw \bw=\wd\b \advance\bw by 2em
				9981	\newdimen\x \x=128\GMPbits
				9982	\advance\x by -2\bw
				9983	\divide\x by4
				9984	\GMPdisplay{%
				9985	\vbox{%
				9986	\hbox to 128\GMPbits {%
				9987	\GMPvrule
				9988	\raise 0.5ex \vbox{\hrule \hbox to \x {}}%
				9989	\hfil 64 bits\hfil
				9990	\raise 0.5ex \vbox{\hrule \hbox to \x {}}%
				9991	\vrule
				9992	\raise 0.5ex \vbox{\hrule \hbox to \x {}}%
				9993	\hfil 64 bits\hfil
				9994	\raise 0.5ex \vbox{\hrule \hbox to \x {}}%
				9995	\vrule}%
				9996	\vskip 0.7ex
				9997	\GMPbox{0}{p00+r64'}{i00}
				9998	\vskip 0.5ex
				9999	\GMPbox{16}{p16+r80'}{i16}
				10000	\vskip 0.5ex
				10001	\GMPbox{32}{p32+r32}{i32}
				10002	\vskip 0.5ex
				10003	\GMPbox{48}{p48+r48}{i48}
				10004	}}
				10005	@end tex
				10006	@ifnottex
				10007	@example
				10008	@group
				10009	|-----64bits----|-----64bits----|
				10010	                   +------------+
				10011	                   | p00 + r64' |    i00
				10012	                   +------------+
				10013	               +------------+
				10014	               | p16 + r80' |        i16
				10015	               +------------+
				10016	           +------------+
				10017	           | p32 + r32  |            i32
				10018	           +------------+
				10019	       +------------+
				10020	       | p48 + r48  |                i48
				10021	       +------------+
				10022	@end group
				10023	@end example
				10024	@end ifnottex
				10025
				10026	The challenge then is to sum these efficiently and add in a carry limb,
				10027	generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
				10028	extends 33 bits into the high half).
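
The whole scheme can be put together in C as a rough model of the loop. This is a sketch for illustration and not GMP's code: the names @code{sketch_acc} and @code{sketch_fp_mul_1} are hypothetical, limbs are assumed to be @code{uint64_t}, and @code{double} is assumed to be IEEE 53-bit. The four sums are formed in floating point and then combined with integer shifts and adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add x << shift into the two-limb accumulator hi:lo.  */
static void
sketch_acc (uint64_t *hi, uint64_t *lo, uint64_t x, unsigned shift)
{
  uint64_t xl = x << shift;
  uint64_t xh = shift ? x >> (64 - shift) : 0;
  *lo += xl;
  *hi += xh + (*lo < xl);
}

/* Sketch of an mpn_mul_1 style loop: rp[] = up[] * v over n limbs,
   returning the high limb.  Products are formed in IEEE doubles.  */
uint64_t
sketch_fp_mul_1 (uint64_t *rp, const uint64_t *up, size_t n, uint64_t v)
{
  /* The invariant limb is split once, outside the loop.  */
  double v00 = (double) (v & 0xFFFF), v16 = (double) ((v >> 16) & 0xFFFF);
  double v32 = (double) ((v >> 32) & 0xFFFF), v48 = (double) (v >> 48);
  double r64 = 0.0, r80 = 0.0;  /* high products of the previous limb */
  uint64_t carry = 0;

  for (size_t i = 0; i < n; i++)
    {
      double u00 = (double) (up[i] & 0xFFFFFFFF);
      double u32 = (double) (up[i] >> 32);

      /* Four sums below 2^49, exact in floating point.  */
      uint64_t i00 = (uint64_t) (int64_t) (u00 * v00 + r64);
      uint64_t i16 = (uint64_t) (int64_t) (u00 * v16 + r80);
      uint64_t i32 = (uint64_t) (int64_t) (u00 * v32 + u32 * v00);
      uint64_t i48 = (uint64_t) (int64_t) (u00 * v48 + u32 * v16);
      r64 = u32 * v32;
      r80 = u32 * v48;

      /* Integer side: align, sum, and add the carry limb.  */
      uint64_t hi = 0, lo = carry;
      sketch_acc (&hi, &lo, i00, 0);
      sketch_acc (&hi, &lo, i16, 16);
      sketch_acc (&hi, &lo, i32, 32);
      sketch_acc (&hi, &lo, i48, 48);
      rp[i] = lo;
      carry = hi;
    }
  /* The last limb's high products belong to the returned limb.  */
  return carry + (uint64_t) (int64_t) r64 + ((uint64_t) (int64_t) r80 << 16);
}
```

The helper uses compare-based carries for clarity; an assembly version would aim for a much shorter integer sequence.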
				10029
				10030
				10031	@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding
				10032	@subsection SIMD Instructions
				10033	@cindex Assembly SIMD
				10034
				10035	The single-instruction multiple-data support in current microprocessors is
				10036	aimed at signal processing algorithms where each data point can be treated
				10037	more or less independently. There's generally not much support for
				10038	propagating the sort of carries that arise in GMP.
				10039
				10040	SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
				10041	work as one 32@cross{}32 from GMP's point of view, and need some shifts and
				10042	adds besides. But of course if say the SIMD form is fully pipelined and uses
				10043	less instruction decoding then it may still be worthwhile.
				10044
				10045	On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and
				10046	@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the
				10047	P55 @code{mpn_mul_1}. SSE2 is used for Pentium 4 @code{mpn_mul_1},
				10048	@code{mpn_addmul_1}, and @code{mpn_submul_1}.
				10049
				10050
				10051	@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding
				10052	@subsection Software Pipelining
				10053	@cindex Assembly software pipelining
				10054
				10055	Software pipelining consists of scheduling instructions around the branch
				10056	point in a loop. For example a loop might issue a load not for use in the
				10057	present iteration but the next, thereby allowing extra cycles for the data to
				10058	arrive from memory.
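
The shape of such a loop can be shown in C, though a compiler is free to reorder it and real scheduling is done in assembly. The sketch below is a hypothetical increment routine, @code{sketch_pipelined_incr}, with @code{uint64_t} limbs assumed; the load for the next iteration is issued before the current limb is processed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: rp[] = up[] + 1 over n limbs, returning the carry out, with the
   load for the next iteration issued during the current one.  */
uint64_t
sketch_pipelined_incr (uint64_t *rp, const uint64_t *up, size_t n)
{
  uint64_t cy = 1;
  assert (n >= 1);

  uint64_t u = up[0];                 /* feed-in: first load */
  for (size_t i = 0; i + 1 < n; i++)
    {
      uint64_t next = up[i + 1];      /* load for the next iteration */
      uint64_t r = u + cy;            /* work on the value loaded earlier */
      cy = r < cy;
      rp[i] = r;
      u = next;
    }
  uint64_t r = u + cy;                /* wind-down: no further load */
  rp[n - 1] = r;
  return r < cy;
}
```

The separate first load and last computation are what keep the loop from reading past the end of the operand.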
				10059
				10060	Naturally this is wanted only when doing things like loads or multiplies that
				10061	take several cycles to complete, and only where a CPU has multiple functional
				10062	units so that other work can be done in the meantime.
				10063
				10064	A pipeline with several stages will have a data value in progress at each
				10065	stage and each loop iteration moves them along one stage. This is like
				10066	juggling.
				10067
				10068	If the latency of some instruction is greater than the loop time then it will
				10069	be necessary to unroll, so one register has a result ready to use while
				10070	another (or multiple others) are still in progress (@pxref{Assembly Loop
				10071	Unrolling}).
				10072
				10073
				10074	@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding
				10075	@subsection Loop Unrolling
				10076	@cindex Assembly loop unrolling
				10077
				10078	Loop unrolling consists of replicating code so that several limbs are
				10079	processed in each loop. At a minimum this reduces loop overheads by a
				10080	corresponding factor, but it can also allow better register usage, for example
				10081	alternately using one register combination and then another. Judicious use of
				10082	@command{m4} macros can help avoid lots of duplication in the source code.
				10083
				10084	Any amount of unrolling can be handled with a loop counter that's decremented
				10085	by @math{N} each time, stopping when the remaining count is less than the
				10086	further @math{N} the loop will process. Or by subtracting @math{N} at the
				10087	start, the termination condition becomes when the counter @math{C} is less
				10088	than 0 (and the count of remaining limbs is @math{C+N}).
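
The biased counter scheme might look like this in C. It is a sketch only: @code{sketch_com} is a hypothetical one's complement routine in the spirit of @code{mpn_com}, limbs are assumed to be @code{uint64_t}, and an unrolling of 4 is an arbitrary choice for @math{N}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNROLL 4   /* N in the text; 4 is an arbitrary choice here */

/* Sketch: rp[] = ~up[] over n limbs, unrolled UNROLL times with the
   counter biased by -UNROLL at the start.  */
void
sketch_com (uint64_t *rp, const uint64_t *up, ptrdiff_t n)
{
  ptrdiff_t c = n - UNROLL;          /* bias the counter once */

  for (; c >= 0; c -= UNROLL)        /* stop when the counter goes negative */
    {
      rp[0] = ~up[0];
      rp[1] = ~up[1];
      rp[2] = ~up[2];
      rp[3] = ~up[3];
      rp += UNROLL;
      up += UNROLL;
    }

  for (c += UNROLL; c > 0; c--)      /* c + UNROLL limbs remain */
    *rp++ = ~*up++;
}
```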
				10089
				10090	Alternatively, for a power of 2 unroll, the loop count and remainder can be
				10091	established with a shift and mask. This is convenient if also making a
				10092	computed jump into the middle of a large loop.
				10093
				10094	Limbs left over when the size isn't a multiple of the unrolling can be handled
				10095	in various ways, for example
				10096
				10097	@itemize @bullet
				10098	@item
				10099	A simple loop at the end (or the start) to process the excess. Care will be
				10100	wanted that it isn't too much slower than the unrolled part.
				10101
				10102	@item
				10103	A set of binary tests, for example after an 8-limb unrolling, test for 4 more
				10104	limbs to process, then a further 2 more or not, and finally 1 more or not.
				10105	This will probably take more code space than a simple loop.
				10106
				10107	@item
				10108	A @code{switch} statement, providing separate code for each possible excess,
				10109	for example an 8-limb unrolling would have separate code for 0 remaining, 1
				10110	remaining, etc, up to 7 remaining. This might take a lot of code, but may be
				10111	the best way to optimize all cases in combination with a deep pipelined loop.
				10112
				10113	@item
				10114	A computed jump into the middle of the loop, thus making the first iteration
				10115	handle the excess. This should make times smoothly increase with size, which
				10116	is attractive, but setups for the jump and adjustments for pointers can be
				10117	tricky and could become quite difficult in combination with deep pipelining.
				10118	@end itemize
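
As one illustration of these choices, the following C sketch combines a power of 2 unrolling, set up with a shift and mask, with a compact @code{switch} that disposes of the excess limbs before the main loop. The routine name @code{sketch_copy} and the @code{uint64_t} limbs are assumptions; it shows the fall-through form of the @code{switch} idea, not a computed jump into the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: rp[] = up[] over n limbs with a 4-limb unroll.  The excess
   n mod 4 is handled first by a switch with fall-through.  */
void
sketch_copy (uint64_t *rp, const uint64_t *up, size_t n)
{
  size_t blocks = n >> 2;            /* loop count by shift */

  switch (n & 3)                     /* remainder by mask */
    {
    case 3: *rp++ = *up++;           /* fall through */
    case 2: *rp++ = *up++;           /* fall through */
    case 1: *rp++ = *up++;           /* fall through */
    case 0: break;
    }

  for (; blocks != 0; blocks--)
    {
      rp[0] = up[0];
      rp[1] = up[1];
      rp[2] = up[2];
      rp[3] = up[3];
      rp += 4;
      up += 4;
    }
}
```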
				10119
				10120
				10121	@node Assembly Writing Guide, , Assembly Loop Unrolling, Assembly Coding
				10122	@subsection Writing Guide
				10123	@cindex Assembly writing guide
				10124
				10125	This is a guide to writing software pipelined loops for processing limb
				10126	vectors in assembly.
				10127
				10128	First determine the algorithm and which instructions are needed. Code it
				10129	without unrolling or scheduling, to make sure it works. On a 3-operand CPU
				10130	try to write each new value to a new register; this will greatly simplify later
				10131	steps.
				10132
				10133	Then note for each instruction the functional unit and/or issue port
				10134	requirements. If an instruction can use either of two units, like U0 or U1
				10135	then make a category ``U0/U1''. Count the total using each unit (or combined
				10136	unit), and count all instructions.
				10137
				10138	Figure out from those counts the best possible loop time. The goal will be to
				10139	find a perfect schedule where instruction latencies are completely hidden.
				10140	The total instruction count might be the limiting factor, or perhaps a
				10141	particular functional unit. It might be possible to tweak the instructions to
				10142	help the limiting factor.
				10143
				10144	Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the
				10145	final loop branch at the end of the last. Now fill the buckets with dummy
				10146	instructions using the functional units desired. Run this to make sure the
				10147	intended speed is reached.
				10148
				10149	Now replace the dummy instructions with the real instructions from the slow
				10150	but correct loop you started with. The first will typically be a load
				10151	instruction. Then the instruction using that value is placed in a bucket an
				10152	appropriate distance down. Run the loop again, to check it still runs at
				10153	target speed.
				10154
				10155	Keep placing instructions, frequently measuring the loop. After a few you
				10156	will need to wrap around from the last bucket back to the top of the loop. If
				10157	you used the new-register for new-value strategy above then there will be no
				10158	register conflicts. If not then take care not to clobber something already in
				10159	use. Changing registers at this time is very error prone.
				10160
				10161	The loop will overlap two or more of the original loop iterations, and the
				10162	computation of one vector element result will be started in one iteration of
				10163	the new loop, and completed one or several iterations later.
				10164
				10165	The final step is to create feed-in and wind-down code for the loop. A good
				10166	way to do this is to make a copy (or copies) of the loop at the start and
				10167	delete those instructions which don't have valid antecedents, and at the end
				10168	replicate and delete those whose results are unwanted (including any further
				10169	loads).
				10170
				10171	The loop will have a minimum number of limbs loaded and processed, so the
				10172	feed-in code must test if the request size is smaller and skip either to a
				10173	suitable part of the wind-down or to special code for small sizes.
				10174
				10175
				10176	@node Internals, Contributors, Algorithms, Top
				10177	@chapter Internals
				10178	@cindex Internals
				10179
				10180	@strong{This chapter is provided only for informational purposes and the
				10181	various internals described here may change in future GMP releases.
				10182	Applications expecting to be compatible with future releases should use only
				10183	the documented interfaces described in previous chapters.}
				10184
				10185	@menu
				10186	* Integer Internals::
				10187	* Rational Internals::
				10188	* Float Internals::
				10189	* Raw Output Internals::
				10190	* C++ Interface Internals::
				10191	@end menu
				10192
				10193	@node Integer Internals, Rational Internals, Internals, Internals
				10194	@section Integer Internals
				10195	@cindex Integer internals
				10196
				10197	@code{mpz_t} variables represent integers using sign and magnitude, in space
				10198	dynamically allocated and reallocated. The fields are as follows.
				10199
				10200	@table @asis
				10201	@item @code{_mp_size}
				10202	The number of limbs, or the negative of that when representing a negative
				10203	integer. Zero is represented by @code{_mp_size} set to zero, in which case
				10204	the @code{_mp_d} data is undefined.
				10205
				10206	@item @code{_mp_d}
				10207	A pointer to an array of limbs which is the magnitude. These are stored
				10208	``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
				10209	least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
				10210	significant. Whenever @code{_mp_size} is non-zero, the most significant limb
				10211	is non-zero.
				10212
				10213	Currently there's always at least one readable limb, so for instance
				10214	@code{mpz_get_ui} can fetch @code{_mp_d[0]} unconditionally (though its value
				10215	is undefined if @code{_mp_size} is zero).
				10216
				10217	@item @code{_mp_alloc}
				10218	@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
				10219	and normally @code{_mp_alloc >= ABS(_mp_size)}. When an @code{mpz} routine
				10220	is about to (or might be about to) increase @code{_mp_size}, it checks
				10221	@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
				10222	@code{MPZ_REALLOC} is generally used for this.
				10223
				10224	@code{mpz_t} variables initialised with the @code{mpz_roinit_n} function or
				10225	the @code{MPZ_ROINIT_N} macro have @code{_mp_alloc = 0} but can have a
				10226	non-zero @code{_mp_size}. They can only be used as read-only constants. See
				10227	@ref{Integer Special Functions} for details.
				10228	@end table
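
How the three fields combine can be seen in a small C model. The structure below is a hypothetical mirror written for this example, not GMP's own declaration, and @code{uint64_t} limbs are assumed; the helper names are likewise made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three fields described above; this is not
   GMP's own declaration.  */
struct toy_mpz
{
  int alloc;        /* limbs allocated at d */
  int size;         /* limb count, negated for a negative value */
  uint64_t *d;      /* magnitude, least significant limb first */
};

/* Sign of the value: -1, 0 or 1.  */
int
toy_sgn (const struct toy_mpz *z)
{
  return (z->size > 0) - (z->size < 0);
}

/* Number of limbs in the magnitude.  */
size_t
toy_limbs (const struct toy_mpz *z)
{
  return (size_t) (z->size < 0 ? -z->size : z->size);
}

/* Most significant limb, or 0 for the value zero (where d is not read).  */
uint64_t
toy_top_limb (const struct toy_mpz *z)
{
  size_t n = toy_limbs (z);
  return n == 0 ? 0 : z->d[n - 1];
}
```

Applications should of course use the documented functions such as @code{mpz_sgn} and @code{mpz_size} rather than reading the fields.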
				10229
				10230	The various bitwise logical functions like @code{mpz_and} behave as if
				10231	negative values were twos complement. But sign and magnitude is always used
				10232	internally, and necessary adjustments are made during the calculations.
				10233	Sometimes this isn't pretty, but sign and magnitude are best for other
				10234	routines.
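
The kind of adjustment involved can be shown on a toy single-limb value. This sketch is not how @code{mpz_and} is implemented; the type @code{toy_sm} and its functions are hypothetical, and input magnitudes are assumed to be below 2^63 so the twos complement form fits in one 64-bit limb.

```c
#include <assert.h>
#include <stdint.h>

/* Toy single-limb sign and magnitude value; magnitudes are assumed to be
   below 2^63 on input so the twos complement form fits one limb.  */
struct toy_sm
{
  int neg;          /* non-zero for a negative value */
  uint64_t mag;     /* absolute value */
};

static uint64_t
toy_to_twos (struct toy_sm x)
{
  return x.neg ? ~x.mag + 1 : x.mag;
}

/* AND as if both operands were twos complement, with the result converted
   back to sign and magnitude.  */
struct toy_sm
toy_and (struct toy_sm a, struct toy_sm b)
{
  uint64_t r = toy_to_twos (a) & toy_to_twos (b);
  struct toy_sm res;
  res.neg = (int) (r >> 63);
  res.mag = res.neg ? ~r + 1 : r;
  return res;
}
```

For instance @math{-6} AND @math{-3} gives @math{-8}, stored as a negative sign with magnitude 8.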
				10235
				10236	Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
				10237	have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
				10238	allocation functions. Care is taken to ensure that these are big enough that
				10239	no reallocation is necessary (since it would have unpredictable consequences).
				10240
				10241	@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
				10242	is usually a @code{long}. This is done to make the fields just 32 bits on
				10243	some 64-bit systems, thereby saving a few bytes of data space but still
				10244	providing plenty of range.
				10245
				10246
				10247	@node Rational Internals, Float Internals, Integer Internals, Internals
				10248	@section Rational Internals
				10249	@cindex Rational internals
				10250
				10251	@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
				10252	denominator (@pxref{Integer Internals}).
				10253
				10254	The canonical form adopted is denominator positive (and non-zero), no common
				10255	factors between numerator and denominator, and zero uniquely represented as
				10256	0/1.
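
The canonical form rules can be illustrated with machine-word integers standing in for @code{mpz_t}. The type @code{toy_q} and the function names are hypothetical, and overflow near @code{INT64_MIN} is ignored in this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Toy rational with machine-word fields, standing in for mpq_t.  */
struct toy_q
{
  int64_t num;
  int64_t den;
};

static int64_t
toy_gcd (int64_t a, int64_t b)
{
  if (a < 0) a = -a;
  if (b < 0) b = -b;
  while (b != 0)
    {
      int64_t t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Put q into canonical form.  The denominator must be non-zero; values
   near INT64_MIN are not handled by this sketch.  */
void
toy_canonicalize (struct toy_q *q)
{
  assert (q->den != 0);
  int64_t g = toy_gcd (q->num, q->den);   /* g >= 1 since den != 0 */
  q->num /= g;
  q->den /= g;
  if (q->den < 0)
    {
      q->num = -q->num;
      q->den = -q->den;
    }
}
```

A zero numerator makes the GCD equal to the absolute value of the denominator, so zero always ends up as 0/1.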
				10257
				10258	It's believed that casting out common factors at each stage of a calculation
				10259	is best in general. A GCD is an @math{O(N^2)} operation so it's better to do
				10260	a few small ones immediately than to delay and have to do a big one later.
				10261	Knowing the numerator and denominator have no common factors can be used for
				10262	example in @code{mpq_mul} to make only two cross GCDs necessary, not four.
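
The two cross GCDs can be sketched with machine-word integers. This is an illustration of the idea rather than the code of @code{mpq_mul}; the type @code{toy_frac} and its functions are hypothetical, both inputs are assumed to be in canonical form already, and overflow is ignored.

```c
#include <assert.h>
#include <stdint.h>

struct toy_frac
{
  int64_t num;
  int64_t den;      /* positive, no factor in common with num */
};

static int64_t
toy_frac_gcd (int64_t a, int64_t b)
{
  if (a < 0) a = -a;
  if (b < 0) b = -b;
  while (b != 0)
    {
      int64_t t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Product of two canonical fractions.  Only the cross pairs (a.num, b.den)
   and (b.num, a.den) can share factors.  Overflow is ignored.  */
struct toy_frac
toy_frac_mul (struct toy_frac a, struct toy_frac b)
{
  int64_t g1 = toy_frac_gcd (a.num, b.den);
  int64_t g2 = toy_frac_gcd (b.num, a.den);
  struct toy_frac r;
  r.num = (a.num / g1) * (b.num / g2);
  r.den = (a.den / g2) * (b.den / g1);
  return r;
}
```

No GCD is needed on the product itself, since each numerator is already coprime to its own denominator.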
				10263
				10264	This general approach to common factors is badly sub-optimal in the presence
				10265	of simple factorizations or little prospect for cancellation, but GMP has no
				10266	way to know when this will occur. As per @ref{Efficiency}, that's left to
				10267	applications. The @code{mpq_t} framework might still suit, with
				10268	@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
				10269	denominator, or of course @code{mpz_t} variables can be used directly.
				10270
				10271
				10272	@node Float Internals, Raw Output Internals, Rational Internals, Internals
				10273	@section Float Internals
				10274	@cindex Float internals
				10275
				10276	Efficient calculation is the primary aim of GMP floats and the use of whole
				10277	limbs and simple rounding facilitates this.
				10278
				10279	@code{mpf_t} floats have a variable precision mantissa and a single machine
				10280	word signed exponent. The mantissa is represented using sign and magnitude.
				10281
				10282	@c FIXME: The arrow heads don't join to the lines exactly.
				10283	@tex
				10284	\global\newdimen\GMPboxwidth \GMPboxwidth=5em
				10285	\global\newdimen\GMPboxheight \GMPboxheight=3ex
				10286	\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
				10287	\GMPdisplay{%
				10288	\vbox{%
				10289	\hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb}
				10290	\vskip 0.7ex
				10291	\def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
				10292	\hbox {
				10293	\hbox to 3\GMPboxwidth {%
				10294	\setbox 0 = \hbox{@code{\_mp\_exp}}%
				10295	\dimen0=3\GMPboxwidth
				10296	\advance\dimen0 by -\wd0
				10297	\divide\dimen0 by 2
				10298	\advance\dimen0 by -1em
				10299	\setbox1 = \hbox{$\rightarrow$}%
				10300	\dimen1=\dimen0
				10301	\advance\dimen1 by -\wd1
				10302	\GMPcentreline{\dimen0}%
				10303	\hfil
				10304	\box0%
				10305	\hfil
				10306	\GMPcentreline{\dimen1{}}%
				10307	\box1}
				10308	\hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}}
				10309	\vskip 0.5ex
				10310	\vbox {%
				10311	\hrule
				10312	\hbox{%
				10313	\vrule height 2ex depth 1ex
				10314	\hbox to \GMPboxwidth {}%
				10315	\vrule
				10316	\hbox to \GMPboxwidth {}%
				10317	\vrule
				10318	\hbox to \GMPboxwidth {}%
				10319	\vrule
				10320	\hbox to \GMPboxwidth {}%
				10321	\vrule
				10322	\hbox to \GMPboxwidth {}%
				10323	\vrule}
				10324	\hrule
				10325	}
				10326	\hbox {%
				10327	\hbox to 0.8 pt {}
				10328	\hbox to 3\GMPboxwidth {%
				10329	\hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}}
				10330	\hbox to 5\GMPboxwidth{%
				10331	\setbox 0 = \hbox{@code{\_mp\_size}}%
				10332	\dimen0 = 5\GMPboxwidth
				10333	\advance\dimen0 by -\wd0
				10334	\divide\dimen0 by 2
				10335	\advance\dimen0 by -1em
				10336	\dimen1 = \dimen0
				10337	\setbox1 = \hbox{$\leftarrow$}%
				10338	\setbox2 = \hbox{$\rightarrow$}%
				10339	\advance\dimen0 by -\wd1
				10340	\advance\dimen1 by -\wd2
				10341	\hbox to 0.3 em {}%
				10342	\box1
				10343	\GMPcentreline{\dimen0}%
				10344	\hfil
				10345	\box0
				10346	\hfil
				10347	\GMPcentreline{\dimen1}%
				10348	\box2}
				10349	}}
				10350	@end tex
				10351	@ifnottex
				10352	@example
				10353	   most                   least
				10354	significant            significant
				10355	   limb                   limb
				10356
				10357	                            _mp_d
				10358	 |---- _mp_exp --->           |
				10359	  _____ _____ _____ _____ _____
				10360	 |_____|_____|_____|_____|_____|
				10361	                   . <------------ radix point
				10362
				10363	  <-------- _mp_size --------->
				10364	@sp 1
				10365	@end example
				10366	@end ifnottex
				10367
				10368	@noindent
				10369	The fields are as follows.
				10370
				10371	@table @asis
				10372	@item @code{_mp_size}
				10373	The number of limbs currently in use, or the negative of that when
				10374	representing a negative value. Zero is represented by @code{_mp_size} and
				10375	@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
				10376	unused. (In the future @code{_mp_exp} might be undefined when representing
				10377	zero.)
				10378
				10379	@item @code{_mp_prec}
				10380	The precision of the mantissa, in limbs. In any calculation the aim is to
				10381	produce @code{_mp_prec} limbs of result (the most significant being non-zero).
				10382
				10383	@item @code{_mp_d}
				10384	A pointer to the array of limbs which is the absolute value of the mantissa.
				10385	These are stored ``little endian'' as per the @code{mpn} functions, so
				10386	@code{_mp_d[0]} is the least significant limb and
				10387	@code{_mp_d[ABS(_mp_size)-1]} the most significant.
				10388
				10389	The most significant limb is always non-zero, but there are no other
				10390	restrictions on its value; in particular the highest 1 bit can be anywhere
				10391	within the limb.
				10392
				10393	@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
				10394	for convenience (see below). There are no reallocations during a calculation,
				10395	only in a change of precision with @code{mpf_set_prec}.
				10396
				10397	@item @code{_mp_exp}
				10398	The exponent, in limbs, determining the location of the implied radix point.
				10399	Zero means the radix point is just above the most significant limb. Positive
				10400	values mean a radix point offset towards the lower limbs and hence a value
				10401	@math{@ge{} 1}, as for example in the diagram above. Negative exponents mean
				10402	a radix point further above the highest limb.
				10403
				10404	Naturally the exponent can be any value; it doesn't have to fall within the
				10405	limbs as the diagram shows, and can be a long way above or a long way below.
				10406	Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
				10407	are treated as zero.
				10408	@end table
				10409
				10410	The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
				10411	@code{mp_size_t} type is usually a @code{long}. The @code{_mp_exp} field is
				10412	usually @code{long}. This is done to make some fields just 32 bits on some
				10413	64-bit systems, thereby saving a few bytes of data space but still providing
				10414	plenty of precision and a very large range.
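
As an illustration of how the fields combine, the following sketch computes
an approximate @code{double} for a value held in this layout. It is a model
only: @code{FloatModel} and @code{model_value} are hypothetical names, 32-bit
limbs are assumed, and the real @code{mpf_t} is not used. With
@math{n} limbs in use and limb base @math{B}, limb @math{i} is given the
weight @math{B^(exp-n+i)}, which matches the rule that an exponent of zero
puts the radix point just above the most significant limb.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the mpf_t fields, assuming 32-bit limbs.
struct FloatModel {
    int size;                      // limbs in use, negated for a negative value
    long exp;                      // exponent, in limbs
    std::vector<std::uint32_t> d;  // limbs, least significant first
};

// Approximate the represented value as a double.
double model_value(const FloatModel &f) {
    if (f.size == 0) return 0.0;   // zero: the limb data is unused
    const int n = std::abs(f.size);
    assert(static_cast<std::size_t>(n) <= f.d.size());
    double v = 0.0;
    for (int i = 0; i < n; ++i) {
        // Limb i carries weight B^(exp - n + i), with B = 2^32.
        v += std::ldexp(static_cast<double>(f.d[i]),
                        32 * static_cast<int>(f.exp - n + i));
    }
    return f.size < 0 ? -v : v;
}
```

For instance, two limbs @code{@{0x80000000, 3@}} with an exponent of 1 place
the radix point between the limbs, so the high limb is the integer part 3 and
the low limb the fraction one half.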
				10415
				10416
				10417	@sp 1
				10418	@noindent
				10419	The following various points should be noted.
				10420
				10421	@table @asis
				10422	@item Low Zeros
				10423	The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
				10424	zeros can always be ignored. Routines likely to produce low zeros check and
				10425	avoid them to save time in subsequent calculations, but for most routines
				10426	they're quite unlikely and aren't checked.
				10427
				10428	@item Mantissa Size Range
				10429	The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
				10430	the value can be represented in fewer. This means low precision values or
				10431	small integers stored in a high precision @code{mpf_t} can still be operated
				10432	on efficiently.
				10433
				10434	@code{_mp_size} can also be greater than @code{_mp_prec}. Firstly a value is
				10435	allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
				10436	and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
				10437	@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
				10438	@code{_mp_prec}.
				10439
				10440	@item Rounding
				10441	All rounding is done on limb boundaries. Calculating @code{_mp_prec} limbs
				10442	with the high non-zero will ensure the application requested minimum precision
				10443	is obtained.
				10444
				10445	The use of simple ``trunc'' rounding towards zero is efficient, since there's
				10446	no need to examine extra limbs and increment or decrement.
				10447
				10448	@item Bit Shifts
				10449	Since the exponent is in limbs, there are no bit shifts in basic operations
				10450	like @code{mpf_add} and @code{mpf_mul}. When differing exponents are
				10451	encountered all that's needed is to adjust pointers to line up the relevant
				10452	limbs.
				10453
				10454	Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
				10455	but the choice is between an exponent in limbs which requires shifts there, or
				10456	one in bits which requires them almost everywhere else.
				10457
				10458	@item Use of @code{_mp_prec+1} Limbs
				10459	The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
				10460	@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
				10461	operation. @code{mpf_add} for instance will do an @code{mpn_add} of
				10462	@code{_mp_prec} limbs. If there's no carry then that's the result, but if
				10463	there is a carry then it's stored in the extra limb of space and
				10464	@code{_mp_size} becomes @code{_mp_prec+1}.
				10465
				10466	Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
				10467	needed for the intended precision, only the @code{_mp_prec} high limbs. But
				10468	zeroing it out or moving the rest down is unnecessary. Subsequent routines
				10469	reading the value will simply take the high limbs they need, and this will be
				10470	@code{_mp_prec} if their target has that same precision. This is no more than
				10471	a pointer adjustment, and must be checked anyway since the destination
				10472	precision can be different from the sources.
				10473
				10474	Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
				10475	if available. This ensures that a variable which has @code{_mp_size} equal to
				10476	@code{_mp_prec+1} will get its full exact value copied. Strictly speaking
				10477	this is unnecessary since only @code{_mp_prec} limbs are needed for the
				10478	application's requested precision, but it's considered that an @code{mpf_set}
				10479	from one variable into another of the same precision ought to produce an exact
				10480	copy.
				10481
				10482	@item Application Precisions
				10483	@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
				10484	@code{_mp_prec}. The value in bits is rounded up to a whole limb then an
				10485	extra limb is added since the most significant limb of @code{_mp_d} is only
				10486	non-zero and therefore might contain only one bit.
				10487
				10488	@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
				10489	limb from @code{_mp_prec} before converting to bits. The net effect of
				10490	reading back with @code{mpf_get_prec} is simply the precision rounded up to a
				10491	multiple of @code{mp_bits_per_limb}.
				10492
				10493	Note that the extra limb added here for the high only being non-zero is in
				10494	addition to the extra limb allocated to @code{_mp_d}. For example with a
				10495	32-bit limb, an application request for 250 bits will be rounded up to 8
				10496	limbs, then an extra added for the high being only non-zero, giving an
				10497	@code{_mp_prec} of 9. @code{_mp_d} then gets 10 limbs allocated. Reading
				10498	back with @code{mpf_get_prec} will take @code{_mp_prec}, subtract 1 limb and
				10499	multiply by 32, giving 256 bits.
				10500
				10501	Strictly speaking, the fact the high limb has at least one bit means that a
				10502	float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but
				10503	for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
				10504	multiple of the limb size.
				10505	@end table
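
The precision arithmetic described under ``Application Precisions'' can be
sketched as follows. The helper names @code{bits_to_prec} and
@code{prec_to_bits} are hypothetical and simply follow the description above;
the library's own macros live in its internal headers and may differ in
detail.

```cpp
#include <cassert>

// Hypothetical helper: application precision in bits -> precision in limbs.
long bits_to_prec(long bits, long limb_bits) {
    assert(bits > 0 && limb_bits > 0);
    long whole_limbs = (bits + limb_bits - 1) / limb_bits;  // round up to a limb
    return whole_limbs + 1;  // extra limb: the high limb may hold only one bit
}

// Hypothetical helper: precision in limbs -> bits reported to the application.
long prec_to_bits(long prec, long limb_bits) {
    assert(prec >= 1 && limb_bits > 0);
    return (prec - 1) * limb_bits;  // remove the extra limb again
}
```

With a 32-bit limb, @code{bits_to_prec(250, 32)} gives 9 and
@code{prec_to_bits(9, 32)} gives 256, matching the worked example; the
allocation at @code{_mp_d} would then be one limb more, namely 10.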
				10506
				10507
				10508	@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
				10509	@section Raw Output Internals
				10510	@cindex Raw output internals
				10511
				10512	@noindent
				10513	@code{mpz_out_raw} uses the following format.
				10514
				10515	@tex
				10516	\global\newdimen\GMPboxwidth \GMPboxwidth=5em
				10517	\global\newdimen\GMPboxheight \GMPboxheight=3ex
				10518	\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
				10519	\GMPdisplay{%
				10520	\vbox{%
				10521	\def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
				10522	\vbox {%
				10523	\hrule
				10524	\hbox{%
				10525	\vrule height 2.5ex depth 1.5ex
				10526	\hbox to \GMPboxwidth {\hfil size\hfil}%
				10527	\vrule
				10528	\hbox to 3\GMPboxwidth {\hfil data bytes\hfil}%
				10529	\vrule}
				10530	\hrule}
				10531	}}
				10532	@end tex
				10533	@ifnottex
				10534	@example
				10535	+------+------------------------+
				10536	| size |       data bytes       |
				10537	+------+------------------------+
				10538	@end example
				10539	@end ifnottex
				10540
				10541	The size is 4 bytes written most significant byte first, being the number of
				10542	subsequent data bytes, or the twos complement negative of that when a negative
				10543	integer is represented. The data bytes are the absolute value of the integer,
				10544	written most significant byte first.
				10545
				10546	The most significant data byte is always non-zero, so the output is the same
				10547	on all systems, irrespective of limb size.
				10548
				10549	In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
				10550	of the limb size. @code{mpz_inp_raw} will still accept this, for
				10551	compatibility.
				10552
				10553	The use of ``big endian'' for both the size and data fields is deliberate; it
				10554	makes the data easy to read in a hex dump of a file. Unfortunately it also
				10555	means that the limb data must be reversed when reading or writing, so neither
				10556	a big endian nor little endian system can just read and write @code{_mp_d}.
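
A minimal sketch of the layout, for an integer small enough to fit in 64
bits, might look like this. @code{encode_raw} is a hypothetical helper written
for illustration; it does not call @code{mpz_out_raw} and only mirrors the
format described above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode a 64-bit integer in the raw layout above.
std::vector<std::uint8_t> encode_raw(std::int64_t value) {
    // Absolute value, taken in unsigned arithmetic so INT64_MIN is safe.
    std::uint64_t mag = value < 0 ? 0 - static_cast<std::uint64_t>(value)
                                  : static_cast<std::uint64_t>(value);

    // Data bytes, most significant first, with leading zero bytes dropped.
    std::vector<std::uint8_t> data;
    for (int shift = 56; shift >= 0; shift -= 8) {
        std::uint8_t byte = static_cast<std::uint8_t>(mag >> shift);
        if (!data.empty() || byte != 0) data.push_back(byte);
    }

    // Size field: byte count, negated (twos complement) for a negative value.
    std::int32_t size = static_cast<std::int32_t>(data.size());
    if (value < 0) size = -size;
    std::uint32_t usize = static_cast<std::uint32_t>(size);

    std::vector<std::uint8_t> out;
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(usize >> shift));
    out.insert(out.end(), data.begin(), data.end());
    assert(out.size() == 4 + data.size());
    return out;
}
```

Under these assumptions @math{-258} becomes the six bytes
@code{FF FF FF FE 01 02}: a size of @math{-2} followed by the two data bytes
of 258.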
				10557
				10558
				10559	@node C++ Interface Internals, , Raw Output Internals, Internals
				10560	@section C++ Interface Internals
				10561	@cindex C++ interface internals
				10562
				10563	A system of expression templates is used to ensure something like @code{a=b+c}
				10564	turns into a simple call to @code{mpz_add} etc. For @code{mpf_class}
				10565	the scheme also ensures the precision of the final
				10566	destination is used for any temporaries within a statement like
				10567	@code{f=w*x+y*z}. These are important features which a naive implementation
				10568	cannot provide.
				10569
				10570	A simplified description of the scheme follows. The true scheme is
				10571	complicated by the fact that expressions have different return types. For
				10572	detailed information, refer to the source code.
				10573
				10574	To perform an operation, say, addition, we first define a ``function object''
				10575	evaluating it,
				10576
				10577	@example
				10578	struct __gmp_binary_plus
				10579	@{
				10580	  static void eval(mpf_t f, const mpf_t g, const mpf_t h)
				10581	  @{
				10582	    mpf_add(f, g, h);
				10583	  @}
				10584	@};
				10585	@end example
				10586
				10587	@noindent
				10588	And an ``additive expression'' object,
				10589
				10590	@example
				10591	__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
				10592	operator+(const mpf_class &f, const mpf_class &g)
				10593	@{
				10594	  return __gmp_expr
				10595	    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
				10596	@}
				10597	@end example
				10598
				10599	The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to
				10600	encapsulate any possible kind of expression into a single template type. In
				10601	fact even @code{mpf_class} etc are @code{typedef} specializations of
				10602	@code{__gmp_expr}.
				10603
				10604	Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.
				10605
				10606	@example
				10607	template <class T>
				10608	mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
				10609	@{
				10610	  expr.eval(this->get_mpf_t(), this->precision());
				10611	  return *this;
				10612	@}
				10613
				10614	template <class Op>
				10615	void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
				10616	(mpf_t f, mp_bitcnt_t precision)
				10617	@{
				10618	  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
				10619	@}
				10620	@end example
				10621
				10622	where @code{expr.val1} and @code{expr.val2} are references to the expression's
				10623	operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
				10624	@code{__gmp_expr}).
				10625
				10626	This way, the expression is actually evaluated only at the time of assignment,
				10627	when the required precision (that of @code{f}) is known. Furthermore the
				10628	target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
				10629	with @code{f} as the output argument.
				10630
				10631	Compound expressions are handled by defining operators taking subexpressions
				10632	as their arguments, like this:
				10633
				10634	@example
				10635	template <class T, class U>
				10636	__gmp_expr
				10637	<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
				10638	operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
				10639	@{
				10640	  return __gmp_expr
				10641	    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
				10642	    (expr1, expr2);
				10643	@}
				10644	@end example
				10645
				10646	And the corresponding specializations of @code{__gmp_expr::eval}:
				10647
				10648	@example
				10649	template <class T, class U, class Op>
				10650	void __gmp_expr
				10651	<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
				10652	(mpf_t f, mp_bitcnt_t precision)
				10653	@{
				10654	  // declare two temporaries
				10655	  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
				10656	  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
				10657	@}
				10658	@end example
				10659
				10660	The expression is thus recursively evaluated to any level of complexity and
				10661	all subexpressions are evaluated to the precision of @code{f}.
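
The same idea can be tried out without GMP in a self-contained toy. The
following sketch is not the @file{gmpxx.h} implementation: @code{Num},
@code{Sum} and the two counters are hypothetical names, plain @code{double}
stands in for @code{mpf_t}, and only addition is modelled. It shows the two
properties discussed above, namely that nothing is computed until assignment
and that the destination's precision is handed down to every subexpression.

```cpp
#include <cassert>

static int evaluations = 0;     // additions actually performed so far
static int last_precision = 0;  // precision handed to the latest addition

template <class L, class R> struct Sum;

// Toy value type; `precision` stands in for the mpf_class precision.
struct Num {
    double value;
    int precision;

    // A leaf is already evaluated, whatever precision is requested.
    double eval(int) const { return value; }

    // Assignment is where the expression tree finally gets evaluated,
    // using the precision of the destination.
    template <class L, class R>
    Num &operator=(const Sum<L, R> &expr) {
        value = expr.eval(precision);
        return *this;
    }
};

// Unevaluated "lhs + rhs"; it only remembers references to its operands.
template <class L, class R>
struct Sum {
    const L &lhs;
    const R &rhs;
    double eval(int precision) const {
        ++evaluations;
        last_precision = precision;
        return lhs.eval(precision) + rhs.eval(precision);
    }
};

inline Sum<Num, Num> operator+(const Num &a, const Num &b) { return {a, b}; }

template <class L, class R>
Sum<Sum<L, R>, Num> operator+(const Sum<L, R> &a, const Num &b) {
    return {a, b};
}
```

Because @code{Sum} stores references, an expression object should be consumed
within the statement that builds it, as in @code{x = a + b + c}. Keeping a
compound expression in a named variable would leave it referring to
temporaries that no longer exist.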
				10662
				10663
				10664	@node Contributors, References, Internals, Top
				10665	@comment node-name, next, previous, up
				10666	@appendix Contributors
				10667	@cindex Contributors
				10668
				10669	Torbj@"orn Granlund wrote the original GMP library and is still the main
				10670	developer. Code not explicitly attributed to others was contributed by
				10671	Torbj@"orn. Several other individuals and organizations have contributed to
				10672	GMP. Here is a list in chronological order of first contribution:
				10673
				10674	Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
				10675	versions of the library.
				10676
				10677	Richard Stallman helped with the interface design and revised the first
				10678	version of this manual.
				10679
				10680	Brian Beuning and Doug Lea helped with testing of early versions of the
				10681	library and made creative suggestions.
				10682
				10683	John Amanatides of York University in Canada contributed the function
				10684	@code{mpz_probab_prime_p}.
				10685
				10686	Paul Zimmermann wrote the REDC-based mpz_powm code, the Sch@"onhage-Strassen
				10687	FFT multiply code, and the Karatsuba square root code. He also improved the
				10688	Toom3 code for GMP 4.2. Paul sparked the development of GMP 2, with his
				10689	comparisons between bignum packages. The ECMNET project Paul is organizing
				10690	was a driving force behind many of the optimizations in GMP 3. Paul also
				10691	wrote the new GMP 4.3 nth root code (with Torbj@"orn).
				10692
				10693	Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
				10694	contributed now defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
				10695	@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
				10696	grant 301314194-2.
				10697
				10698	Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
				10699	He has also made valuable suggestions and tested numerous intermediary
				10700	releases.
				10701
				10702	Joachim Hollman was involved in the design of the @code{mpf} interface, and in
				10703	the @code{mpz} design revisions for version 2.
				10704
				10705	Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
				10706	@code{mpz_legendre}.
				10707
				10708	Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
				10709	@file{mpn/m68k/rshift.S} (now in @file{.asm} form).
				10710
				10711	Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
				10712	improvements for population count. Robert also wrote highly optimized
				10713	Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed
				10714	the ARM assembly code.
				10715
				10716	Torsten Ekedahl of the Mathematics department of Stockholm University provided
				10717	significant inspiration during several phases of the GMP development. His
				10718	mathematical expertise helped improve several algorithms.
				10719
				10720	Linus Nordberg wrote the new configure system based on autoconf and
				10721	implemented the new random functions.
				10722
				10723	Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm
				10724	macros, parameter tuning, speed measuring, the configure system, function
				10725	inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas
				10726	number functions, printf and scanf functions, perl interface, demo expression
				10727	parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and
				10728	various miscellaneous improvements elsewhere.
				10729
				10730	Kent Boortz made the Mac OS 9 port.
				10731
				10732	Steve Root helped write the optimized alpha 21264 assembly code.
				10733
				10734	Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
				10735	@code{istream} input routines.
				10736
				10737	Jason Moxham rewrote @code{mpz_fac_ui}.
				10738
				10739	Pedro Gimeno implemented the Mersenne Twister and made other random number
				10740	improvements.
				10741
				10742	Niels M@"oller wrote the sub-quadratic GCD, extended GCD and Jacobi code, the
				10743	quadratic Hensel division code, and (with Torbj@"orn) the new divide and
				10744	conquer division code for GMP 4.3. Niels also helped implement the new Toom
				10745	multiply code for GMP 4.3 and implemented helper functions to simplify Toom
				10746	evaluations for GMP 5.0. He wrote the original version of mpn_mulmod_bnm1, and
				10747	he is the main author of the mini-gmp package used for GMP bootstrapping.
				10748
				10749	Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy,
				10750	and found the optimal strategies for evaluation and interpolation in Toom
				10751	multiplication.
				10752
				10753	Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and
				10754	implemented most of the new Toom multiply and squaring code for 5.0.
				10755	He is the main author of the current mpn_mulmod_bnm1, mpn_mullo_n, and
				10756	mpn_sqrlo. Marco also wrote the functions mpn_invert and mpn_invertappr,
				10757	and improved the speed of integer root extraction. He is the author of
				10758	mini-mpq, an additional layer to mini-gmp; of most of the combinatorial
				10759	functions and the BPSW primality testing implementation, for both the
				10760	main library and the mini-gmp package.
				10761
				10762	David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing
				10763	division relevant to Toom multiplication. He also worked on fast assembly
				10764	sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}. He wrote
				10765	the internal middle product functions @code{mpn_mulmid_basecase},
				10766	@code{mpn_toom42_mulmid}, @code{mpn_mulmid_n} and related helper routines.
				10767
				10768	Martin Boij wrote @code{mpn_perfect_power_p}.
				10769
				10770	Marc Glisse improved @file{gmpxx.h}: use fewer temporaries (faster),
				10771	specializations of @code{numeric_limits} and @code{common_type}, C++11
				10772	features (move constructors, explicit bool conversion, UDL), make the
				10773	conversion from @code{mpq_class} to @code{mpz_class} explicit, optimize
				10774	operations where one argument is a small compile-time constant, replace
				10775	some heap allocations by stack allocations. He also fixed the eofbit
				10776	handling of C++ streams, and removed one division from @file{mpq/aors.c}.
				10777
				10778	David S Miller wrote assembly code for SPARC T3 and T4.
				10779
				10780	Mark Sofroniou cleaned up the types of mul_fft.c, letting it work for huge
				10781	operands.
				10782
				10783	Ulrich Weigand ported GMP to the powerpc64le ABI.
				10784
				10785	(This list is chronological, not ordered by significance. If you have
				10786	contributed to GMP but are not listed above, please tell
				10787	@email{gmp-devel@@gmplib.org} about the omission!)
				10788
				10789	The development of the floating point functions of GNU MP 2 was supported in part
				10790	by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO (POlynomial
				10791	System SOlving).
				10792
				10793	The development of GMP 2, 3, and 4.0 was supported in part by the IDA Center
				10794	for Computing Sciences.
				10795
				10796	The development of GMP 4.3, 5.0, and 5.1 was supported in part by the Swedish
				10797	Foundation for Strategic Research.
				10798
				10799	Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
				10800	environment.
				10801
				10802	@node References, GNU Free Documentation License, Contributors, Top
				10803	@comment node-name, next, previous, up
				10804	@appendix References
				10805	@cindex References
				10806
				10807	@c FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
				10808	@c but being long words they upset paragraph formatting (the preceding line
				10809	@c can get badly stretched). Would like a conditional @* style line break
				10810	@c if the uref is too long to fit on the last line of the paragraph, but it's
				10811	@c not clear how to do that. For now explicit @texlinebreak{}s are used on
				10812	@c paragraphs that come out bad.
				10813
				10814	@section Books
				10815
				10816	@itemize @bullet
				10817	@item
				10818	Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
				10819	Analytic Number Theory and Computational Complexity'', Wiley, 1998.
				10820
				10821	@item
				10822	Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational
				10823	Perspective'', 2nd edition, Springer-Verlag, 2005.
				10824	@texlinebreak{} @uref{https://www.math.dartmouth.edu/~carlp/}
				10825
				10826	@item
				10827	Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
				10828	Texts in Mathematics number 138, Springer-Verlag, 1993.
				10829	@texlinebreak{} @uref{https://www.math.u-bordeaux.fr/~cohen/}
				10830
				10831	@item
				10832	Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
				10833	``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
				10834	@texlinebreak{} @uref{https://www-cs-faculty.stanford.edu/~knuth/taocp.html}
				10835
				10836	@item
				10837	John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
				10838	The Benjamin Cummings Publishing Company Inc, 1981.
				10839
				10840	@item
				10841	Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
				10842	Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}
				10843
				10844	@item
				10845	Richard M. Stallman and the GCC Developer Community, ``Using the GNU Compiler
				10846	Collection'', Free Software Foundation, 2008, available online
				10847	@uref{https://gcc.gnu.org/onlinedocs/}, and in the GCC package
				10848	@uref{https://ftp.gnu.org/gnu/gcc/}
				10849	@end itemize
				10850
				10851	@section Papers
				10852
				10853	@itemize @bullet
				10854	@item
				10855	Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square
				10856	Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252. Also
				10857	available online as INRIA Research Report 4475, June 2002,
				10858	@uref{https://hal.inria.fr/docs/00/07/21/13/PDF/RR-4475.pdf}
				10859
				10860	@item
				10861	Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
				10862	Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022,
				10863	@texlinebreak{} @uref{https://www.mpi-inf.mpg.de/~ziegler/TechRep.ps.gz}
				10864
				10865	@item
				10866	Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
				10867	using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
				10868	1994. Also available @uref{https://gmplib.org/~tege/divcnst-pldi94.pdf}.
				10869
				10870	@item
				10871	Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant
				10872	integers'', IEEE Transactions on Computers, 11 June 2010.
				10873	@uref{https://gmplib.org/~tege/division-paper.pdf}
				10874
				10875	@item
				10876	Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and
				10877	small'', to appear.
				10878
				10879	@item
				10880	Tudor Jebelean,
				10881	``An algorithm for exact division'',
				10882	Journal of Symbolic Computation,
				10883	volume 15, 1993, pp.@: 169-180.
				10884	Research report version available @texlinebreak{}
				10885	@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}
				10886
				10887	@item
				10888	Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
				10889	Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
				10890	@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}
				10891
				10892	@item
				10893	Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
				10894	ISSAC 97, pp.@: 339-341. Technical report available @texlinebreak{}
				10895	@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}
				10896
				10897	@item
				10898	Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
				10899	pp.@: 111-116. Technical report version available @texlinebreak{}
				10900	@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}
				10901
				10902	@item
				10903	Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
				10904	of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
				10905	pp.@: 145-157. Technical report version also available @texlinebreak{}
				10906	@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}
				10907
				10908	@item
				10909	Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
				10910	Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455. Early
				10911	technical report version also available
				10912	@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}
				10913
				10914	@item
				10915	Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally
				10916	equidistributed uniform pseudorandom number generator'', ACM Transactions on
				10917	Modelling and Computer Simulation, volume 8, January 1998, pp.@: 3-30.
				10918	Available online @texlinebreak{}
				10919	@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.pdf}
				10920
				10921	@item
				10922	R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
				10923	Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
				10924	Theory, October 1972, pp.@: 90-96. Reprinted as ``Fast Modular Transforms'',
				10925	Journal of Computer and System Sciences, volume 8, number 3, June 1974,
				10926	pp.@: 366-386.
				10927
				10928	@item
				10929	Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD
				10930	computation'', in Mathematics of Computation, volume 77, January 2008, pp.@:
				10931	589-607, @uref{https://www.ams.org/journals/mcom/2008-77-261/S0025-5718-07-02017-0/home.html}
				10932
				10933	@item
				10934	Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
				10935	Mathematics of Computation, volume 44, number 170, April 1985.
				10936
				10937	@item
				10938	Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
				10939	Zahlen'', Computing 7, 1971, pp.@: 281-292.
				10940
				10941	@item
				10942	Kenneth Weber, ``The accelerated integer GCD algorithm'',
				10943	ACM Transactions on Mathematical Software,
				10944	volume 21, number 1, March 1995, pp.@: 111-122.
				10945
				10946	@item
				10947	Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
				10948	November 1999, @uref{https://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf}
				10949
				10950	@item
				10951	Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
				10952	Implementations'', @texlinebreak{}
				10953	@uref{https://homepages.loria.fr/PZimmermann/papers/proof-div-sqrt.ps.gz}
				10954
				10955	@item
				10956	Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
				10957	Symposium on Computer Arithmetic, 1993, pp.@: 260-271. Reprinted as ``More
				10958	on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
				10959	volume 43, number 8, August 1994, pp.@: 899-908.
				10960
				10961	@item
				10962	Niels M@"oller, ``Efficient computation of the Jacobi symbol'', @texlinebreak{}
				10963	@uref{https://arxiv.org/abs/1907.07795}
				10964	@end itemize
				10965
				10966	@node GNU Free Documentation License, Concept Index, References, Top
				10967	@appendix GNU Free Documentation License
				10968	@cindex GNU Free Documentation License
				10969	@cindex Free Documentation License
				10970	@cindex Documentation license
				10971	@include fdl-1.3.texi
				10972
				10973
				10974	@node Concept Index, Function Index, GNU Free Documentation License, Top
				10975	@comment node-name, next, previous, up
				10976	@unnumbered Concept Index
				10977	@printindex cp
				10978
				10979	@node Function Index, , Concept Index, Top
				10980	@comment node-name, next, previous, up
				10981	@unnumbered Function and Type Index
				10982	@printindex fn
				10983
				10984	@bye
				10985
				10986	@c Local variables:
				10987	@c fill-column: 78
				10988	@c compile-command: "make gmp.info"
				10989	@c End: