mirror of
				https://github.com/saitohirga/WSJT-X.git
				synced 2025-11-03 21:40:52 -05:00 
			
		
		
		
	
		
			
	
	
		
			247 lines
		
	
	
		
			11 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
		
		
			
		
	
	
			247 lines
		
	
	
		
			11 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| 
								 | 
							
								[section:lanczos The Lanczos Approximation]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[h4 Motivation]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								['Why base gamma and gamma-like functions on the Lanczos approximation?]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								First of all I should make clear that for the gamma function
							 | 
						||
| 
								 | 
							
								over real numbers (as opposed to complex ones)
							 | 
						||
| 
								 | 
							
								the Lanczos approximation (See [@http://en.wikipedia.org/wiki/Lanczos_approximation Wikipedia or ]
							 | 
						||
| 
								 | 
							
								[@http://mathworld.wolfram.com/LanczosApproximation.html Mathworld])
							 | 
						||
| 
								 | 
							
								appears to offer no clear advantage over more traditional methods such as
							 | 
						||
| 
								 | 
							
								[@http://en.wikipedia.org/wiki/Stirling_approximation Stirling's approximation].
							 | 
						||
| 
								 | 
							
								__pugh carried out an extensive comparison of the various methods available
							 | 
						||
| 
								 | 
							
								and discovered that they were all very similar in terms of complexity
							 | 
						||
| 
								 | 
							
								and relative error.  However, the Lanczos approximation does have a couple of
							 | 
						||
| 
								 | 
							
								properties that make it worthy of further consideration:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								* The approximation has an easy to compute truncation error that holds for 
							 | 
						||
| 
								 | 
							
								all /z > 0/.  In practice that means we can use the same approximation for all
							 | 
						||
| 
								 | 
							
								/z > 0/, and be certain that no matter how large or small /z/ is, the truncation 
							 | 
						||
| 
								 | 
							
								error will /at worst/ be bounded by some finite value.
							 | 
						||
| 
								 | 
							
								* The approximation has a form that is particularly amenable to analytic
							 | 
						||
| 
								 | 
							
								manipulation, in particular ratios of gamma or gamma-like functions
							 | 
						||
| 
								 | 
							
								are particularly easy to compute without resorting to logarithms.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								It is the combination of these two properties that make the approximation
							 | 
						||
| 
								 | 
							
								attractive: Stirling's approximation is highly accurate for large z, and
							 | 
						||
| 
								 | 
							
								has some of the same analytic properties as the Lanczos approximation, but
							 | 
						||
| 
								 | 
							
								can't easily be used across the whole range of z.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								As the simplest example, consider the ratio of two gamma functions: one could 
							 | 
						||
| 
								 | 
							
								compute the result via lgamma:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								   exp(lgamma(a) - lgamma(b));
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								However, even if lgamma is uniformly accurate to 0.5ulp, the worst case 
							 | 
						||
| 
								 | 
							
								relative error in the above can easily be shown to be:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								   Erel > a * log(a)/2 + b * log(b)/2
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For small /a/ and /b/ that's not a problem, but to put the relationship another
							 | 
						||
| 
								 | 
							
								way: ['each time a and b increase in magnitude by a factor of 10, at least one
							 | 
						||
| 
								 | 
							
								decimal digit of precision will be lost.]  
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								In contrast, by analytically combining like power
							 | 
						||
| 
								 | 
							
								terms in a ratio of Lanczos approximation's, these errors can be virtually eliminated
							 | 
						||
| 
								 | 
							
								for small /a/ and /b/, and kept under control for very large (or very small
							 | 
						||
| 
								 | 
							
								for that matter) /a/ and /b/.  Of course, computing large powers is itself a
							 | 
						||
| 
								 | 
							
								notoriously hard problem, but even so, analytic combinations of Lanczos 
							 | 
						||
| 
								 | 
							
								approximations can make the difference between obtaining a valid result, or
							 | 
						||
| 
								 | 
							
								simply garbage.  Refer to the implementation notes for the __beta function for
							 | 
						||
| 
								 | 
							
								an example of this method in practice.  The incomplete 
							 | 
						||
| 
								 | 
							
								[link math_toolkit.sf_gamma.igamma gamma_p gamma] and 
							 | 
						||
| 
								 | 
							
								[link math_toolkit.sf_beta.ibeta_function beta] functions
							 | 
						||
| 
								 | 
							
								use similar analytic combinations of power terms, to combine gamma and beta
							 | 
						||
| 
								 | 
							
								functions divided by large powers into single (simpler) expressions.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[h4 The Approximation]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The Lanczos Approximation to the Gamma Function is given by:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos0]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Where S[sub g](z) is an infinite sum, that is convergent for all z > 0, 
							 | 
						||
| 
								 | 
							
								and /g/ is an arbitrary parameter that controls the "shape" of the
							 | 
						||
| 
								 | 
							
								terms in the sum which is given by:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos0a]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								With individual coefficients defined in closed form by:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos0b]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								However, evaluation of the sum in that form can lead to numerical instability
							 | 
						||
| 
								 | 
							
								in the computation of the ratios of rising and falling factorials (effectively
							 | 
						||
| 
								 | 
							
								we're multiplying by a series of numbers very close to 1, so roundoff errors
							 | 
						||
| 
								 | 
							
								can accumulate quite rapidly).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The Lanczos approximation is therefore often written in partial fraction form
							 | 
						||
| 
								 | 
							
								with the leading constants absorbed by the coefficients in the sum:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos1]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								where:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos2]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Again parameter /g/ is an arbitrarily chosen constant, and /N/ is an arbitrarily chosen 
							 | 
						||
| 
								 | 
							
								number of terms to evaluate in the "Lanczos sum" part.  
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[note 
							 | 
						||
| 
								 | 
							
								Some authors
							 | 
						||
| 
								 | 
							
								choose to define the sum from k=1 to N, and hence end up with N+1 coefficients.
							 | 
						||
| 
								 | 
							
								This happens to confuse both the following discussion and the code (since C++
							 | 
						||
| 
								 | 
							
								deals with half open array ranges, rather than the closed range of the sum).
							 | 
						||
| 
								 | 
							
								This convention is consistent with __godfrey, but not __pugh, so take care
							 | 
						||
| 
								 | 
							
								when referring to the literature in this field.]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[h4 Computing the Coefficients]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The coefficients C0..CN-1 need to be computed from /N/ and /g/ 
							 | 
						||
| 
								 | 
							
								at high precision, and then stored as part of the program.
							 | 
						||
| 
								 | 
							
								Calculation of the coefficients is performed via the method of __godfrey;
							 | 
						||
| 
								 | 
							
								let the constants be contained in a column vector P, then:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								P = D B C F
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								where B is an NxN matrix:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos4]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								D is an NxN matrix:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos3]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								C is an NxN matrix:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos5]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								and F is an N element column vector:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos6]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Note than the matrices B, D and C contain all integer terms and depend
							 | 
						||
| 
								 | 
							
								only on /N/, this product should be computed first, and then multiplied
							 | 
						||
| 
								 | 
							
								by /F/ as the last step.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[h4 Choosing the Right Parameters]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The trick is to choose
							 | 
						||
| 
								 | 
							
								/N/ and /g/ to give the desired level of accuracy: choosing a small value for
							 | 
						||
| 
								 | 
							
								/g/ leads to a strictly convergent series, but one which converges only slowly.
							 | 
						||
| 
								 | 
							
								Choosing a larger value of /g/ causes the terms in the series to be large
							 | 
						||
| 
								 | 
							
								and\/or divergent for about the first /g-1/ terms, and to then suddenly converge 
							 | 
						||
| 
								 | 
							
								with a "crunch".
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								__pugh has determined the optimal
							 | 
						||
| 
								 | 
							
								value of /g/ for /N/ in the range /1 <= N <= 60/: unfortunately in practice choosing
							 | 
						||
| 
								 | 
							
								these values leads to cancellation errors in the Lanczos sum as the largest
							 | 
						||
| 
								 | 
							
								term in the (alternating) series is approximately 1000 times larger than the result.  
							 | 
						||
| 
								 | 
							
								These optimal values appear not to be useful in practice unless the evaluation
							 | 
						||
| 
								 | 
							
								can be done with a number of guard digits /and/ the coefficients are stored
							 | 
						||
| 
								 | 
							
								at higher precision than that desired in the result.  These values are best
							 | 
						||
| 
								 | 
							
								reserved for say, computing to float precision with double precision arithmetic. 
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[table Optimal choices for N and g when computing with guard digits (source: Pugh)
							 | 
						||
| 
								 | 
							
								[[Significand Size] [N] [g][Max Error]]
							 | 
						||
| 
								 | 
							
								[[24] [6] [5.581][9.51e-12]]
							 | 
						||
| 
								 | 
							
								[[53][13][13.144565][9.2213e-23]]
							 | 
						||
| 
								 | 
							
								]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The alternative described by __godfrey is to perform an exhaustive
							 | 
						||
| 
								 | 
							
								search of the /N/ and /g/ parameter space to determine the optimal combination for
							 | 
						||
| 
								 | 
							
								a given /p/ digit floating-point type.  Repeating this work found a good 
							 | 
						||
| 
								 | 
							
								approximation for double precision arithmetic (close to the one __godfrey found), 
							 | 
						||
| 
								 | 
							
								but failed to find really
							 | 
						||
| 
								 | 
							
								good approximations for 80 or 128-bit long doubles.  Further it was observed 
							 | 
						||
| 
								 | 
							
								that the approximations obtained tended to optimised for the small values
							 | 
						||
| 
								 | 
							
								of z (1 < z < 200) used to test the implementation against the factorials.
							 | 
						||
| 
								 | 
							
								Computing ratios of gamma functions with large arguments were observed to
							 | 
						||
| 
								 | 
							
								suffer from error resulting from the truncation of the Lancozos series.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								__pugh identified all the locations where the theoretical error of the 
							 | 
						||
| 
								 | 
							
								approximation were at a minimum, but unfortunately has published only the largest
							 | 
						||
| 
								 | 
							
								of these minima.  However, he makes the observation that the minima
							 | 
						||
| 
								 | 
							
								coincide closely with the location where the first neglected term (a[sub N]) in the
							 | 
						||
| 
								 | 
							
								Lanczos series S[sub g](z) changes sign.  These locations are quite easy to
							 | 
						||
| 
								 | 
							
								locate, albeit with considerable computer time.  These "sweet spots" need
							 | 
						||
| 
								 | 
							
								only be computed once, tabulated, and then searched when required for an
							 | 
						||
| 
								 | 
							
								approximation that delivers the required precision for some fixed precision
							 | 
						||
| 
								 | 
							
								type.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Unfortunately, following this path failed to find a really good approximation
							 | 
						||
| 
								 | 
							
								for 128-bit long doubles, and those found for 64 and 80-bit reals required an
							 | 
						||
| 
								 | 
							
								excessive number of terms.  There are two competing issues here: high precision
							 | 
						||
| 
								 | 
							
								requires a large value of /g/, but avoiding cancellation errors in the evaluation
							 | 
						||
| 
								 | 
							
								requires a small /g/.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								At this point note that the Lanczos sum can be converted into rational form
							 | 
						||
| 
								 | 
							
								(a ratio of two polynomials, obtained from the partial-fraction form using
							 | 
						||
| 
								 | 
							
								polynomial arithmetic),
							 | 
						||
| 
								 | 
							
								and doing so changes the coefficients so that /they are all positive/.  That 
							 | 
						||
| 
								 | 
							
								means that the sum in rational form can be evaluated without cancellation 
							 | 
						||
| 
								 | 
							
								error, albeit with double the number of coefficients for a given N.  Repeating 
							 | 
						||
| 
								 | 
							
								the search of the "sweet spots", this time evaluating the Lanczos sum in 
							 | 
						||
| 
								 | 
							
								rational form, and testing only those "sweet spots" whose theoretical error
							 | 
						||
| 
								 | 
							
								is less than the machine epsilon for the type being tested, yielded good
							 | 
						||
| 
								 | 
							
								approximations for all the types tested.  The optimal values found were quite
							 | 
						||
| 
								 | 
							
								close to the best cases reported by __pugh (just slightly larger /N/ and slightly
							 | 
						||
| 
								 | 
							
								smaller /g/ for a given precision than __pugh reports), and even though converting
							 | 
						||
| 
								 | 
							
								to rational form doubles the number of stored coefficients, it should be
							 | 
						||
| 
								 | 
							
								noted that half of them are integers (and therefore require less storage space)
							 | 
						||
| 
								 | 
							
								and the approximations require a smaller /N/ than would otherwise be required,
							 | 
						||
| 
								 | 
							
								so fewer floating point operations may be required overall.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The following table shows the optimal values for /N/ and /g/ when computing
							 | 
						||
| 
								 | 
							
								at fixed precision.  These should be taken as work in progress: there are no
							 | 
						||
| 
								 | 
							
								values for 106-bit significand machines (Darwin long doubles & NTL quad_float),
							 | 
						||
| 
								 | 
							
								and further optimisation of the values of /g/ may be possible.
							 | 
						||
| 
								 | 
							
								Errors given in the table
							 | 
						||
| 
								 | 
							
								are estimates of the error due to truncation of the Lanczos infinite series
							 | 
						||
| 
								 | 
							
								to /N/ terms.  They are calculated from the sum of the first five neglected
							 | 
						||
| 
								 | 
							
								terms - and are known to be rather pessimistic estimates - although it is noticeable
							 | 
						||
| 
								 | 
							
								that the best combinations of /N/ and /g/ occurred when the estimated truncation error
							 | 
						||
| 
								 | 
							
								almost exactly matches the machine epsilon for the type in question.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[table Optimum value for N and g when computing at fixed precision
							 | 
						||
| 
								 | 
							
								[[Significand Size][Platform/Compiler Used][N][g][Max Truncation Error]]
							 | 
						||
| 
								 | 
							
								[[24][Win32, VC++ 7.1] [6] [1.428456135094165802001953125][9.41e-007]]
							 | 
						||
| 
								 | 
							
								[[53][Win32, VC++ 7.1] [13] [6.024680040776729583740234375][3.23e-016]]
							 | 
						||
| 
								 | 
							
								[[64][Suse Linux 9 IA64, gcc-3.3.3] [17] [12.2252227365970611572265625][2.34e-024]]
							 | 
						||
| 
								 | 
							
								[[116][HP Tru64 Unix 5.1B \/ Alpha, Compaq C++ V7.1-006] [24] [20.3209821879863739013671875][4.75e-035]]
							 | 
						||
| 
								 | 
							
								]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Finally note that the Lanczos approximation can be written as follows
							 | 
						||
| 
								 | 
							
								by removing a factor of exp(g) from the denominator, and then dividing 
							 | 
						||
| 
								 | 
							
								all the coefficients by exp(g):
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[equation lanczos7]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								This form is more convenient for calculating lgamma, but for the gamma
							 | 
						||
| 
								 | 
							
								function the division by /e/ turns a possibly exact quality into an
							 | 
						||
| 
								 | 
							
								inexact value: this reduces accuracy in the common case that 
							 | 
						||
| 
								 | 
							
								the input is exact, and so isn't used for the gamma function.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[h4 References]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								# [#godfrey]Paul Godfrey, [@http://my.fit.edu/~gabdo/gamma.txt "A note on the computation of the convergent
							 | 
						||
| 
								 | 
							
								Lanczos complex Gamma approximation"].
							 | 
						||
| 
								 | 
							
								# [#pugh]Glendon Ralph Pugh, 
							 | 
						||
| 
								 | 
							
								[@http://bh0.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf 
							 | 
						||
| 
								 | 
							
								"An Analysis of the Lanczos Gamma Approximation"], 
							 | 
						||
| 
								 | 
							
								PhD Thesis November 2004.
							 | 
						||
| 
								 | 
							
								# Viktor T. Toth, 
							 | 
						||
| 
								 | 
							
								[@http://www.rskey.org/gamma.htm "Calculators and the Gamma Function"].
							 | 
						||
| 
								 | 
							
								# Mathworld, [@http://mathworld.wolfram.com/LanczosApproximation.html 
							 | 
						||
| 
								 | 
							
								The Lanczos Approximation].
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[endsect][/section:lanczos The Lanczos Approximation]
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								[/ 
							 | 
						||
| 
								 | 
							
								  Copyright 2006 John Maddock and Paul A. Bristow.
							 | 
						||
| 
								 | 
							
								  Distributed under the Boost Software License, Version 1.0.
							 | 
						||
| 
								 | 
							
								  (See accompanying file LICENSE_1_0.txt or copy at
							 | 
						||
| 
								 | 
							
								  http://www.boost.org/LICENSE_1_0.txt).
							 | 
						||
| 
								 | 
							
								]
							 |