added libtommath-0.14
This commit is contained in:
		
parent b66471f74f
commit 82f4858291
199 bn.tex
							| @ -1,15 +1,15 @@ | ||||
| \documentclass{article} | ||||
| \begin{document} | ||||
| 
 | ||||
| \title{LibTomMath v0.13 \\ A Free Multiple Precision Integer Library} | ||||
| \title{LibTomMath v0.14 \\ A Free Multiple Precision Integer Library \\ http://math.libtomcrypt.org } | ||||
| \author{Tom St Denis \\ tomstdenis@iahu.ca} | ||||
| \maketitle | ||||
| \newpage | ||||
| 
 | ||||
| \section{Introduction} | ||||
| ``LibTomMath'' is a free and open source library that provides multiple-precision integer functions required to form a basis | ||||
| of a public key cryptosystem.  LibTomMath is written entire in portable ISO C source code and designed to have an application | ||||
| interface much like that of MPI from Michael Fromberger.   | ||||
| ``LibTomMath'' is a free and open source library that provides multiple-precision integer functions required to form a  | ||||
| basis of a public key cryptosystem.  LibTomMath is written entirely in portable ISO C source code and designed to have an  | ||||
| application interface much like that of MPI from Michael Fromberger.   | ||||
| 
 | ||||
| LibTomMath was written from scratch by Tom St Denis but designed to be a drop-in replacement for the MPI package.  The  | ||||
| algorithms within the library are derived from descriptions as provided in the Handbook of Applied Cryptography and Knuth's | ||||
| @ -23,8 +23,7 @@ LibTomMath was designed with the following goals in mind: | ||||
| \item Be written entirely in portable C. | ||||
| \end{enumerate} | ||||
| 
 | ||||
| All three goals have been achieved.  Particularly the speed increase goal.  For example, a 512-bit modular exponentiation  | ||||
| is eight times faster\footnote{On an Athlon XP with GCC 3.2} with LibTomMath compared to MPI. | ||||
| All three goals have been achieved to one extent or another (actual figures depend on what platform you are using). | ||||
| 
 | ||||
| Being compatible with MPI means that applications that already use it can be ported fairly quickly.  Currently there are  | ||||
| a few differences but there are many similarities.  In fact the average MPI based application can be ported in under 15 | ||||
| @ -54,16 +53,26 @@ make install | ||||
| 
 | ||||
| Now within your application include ``tommath.h'' and link against libtommath.a to get MPI-like functionality. | ||||
| 
 | ||||
| \subsection{Microsoft Visual C++} | ||||
| A makefile is also provided for MSVC (\textit{tested against MSVC 6.00 with SP5}) which allows the library to be used | ||||
| with that compiler as well.  To build the library type | ||||
| 
 | ||||
| \begin{verbatim} | ||||
| nmake -f makefile.msvc | ||||
| \end{verbatim} | ||||
| 
 | ||||
| This will build ``tommath.lib''.   | ||||
| 
 | ||||
| \section{Programming with LibTomMath} | ||||
| 
 | ||||
| \subsection{The mp\_int Structure} | ||||
| All multiple precision integers are stored in a structure called \textbf{mp\_int}.  A multiple precision integer is | ||||
| essentially an array of \textbf{mp\_digit}.  mp\_digit is defined at the top of bn.h.  Its type can be changed to suit | ||||
| a particular platform.   | ||||
| essentially an array of \textbf{mp\_digit}.  mp\_digit is defined at the top of ``tommath.h''.  The type can be changed  | ||||
| to suit a particular platform.   | ||||
| 
 | ||||
| For example, when \textbf{MP\_8BIT} is defined\footnote{When building bn.c.} a mp\_digit is a unsigned char and holds  | ||||
| seven bits.  Similarly when \textbf{MP\_16BIT} is defined a mp\_digit is a unsigned short and holds 15 bits.   | ||||
| By default a mp\_digit is a unsigned long and holds 28 bits.   | ||||
| For example, when \textbf{MP\_8BIT} is defined an mp\_digit is an unsigned char and holds seven bits.  Similarly  | ||||
| when \textbf{MP\_16BIT} is defined an mp\_digit is an unsigned short and holds 15 bits.   By default an mp\_digit is an  | ||||
| unsigned long and holds 28 bits, which is optimal for most 32 and 64 bit processors. | ||||
| 
 | ||||
| The choice of digit is particular to the platform at hand and what available multipliers are provided.  For  | ||||
| MP\_8BIT either a $8 \times 8 \Rightarrow 16$ or $16 \times 16 \Rightarrow 16$ multiplier is optimal.  When  | ||||
| @ -83,20 +92,19 @@ $W$ is the number of bits in a digit (default is 28). | ||||
| 
 | ||||
| \subsection{Calling Functions} | ||||
| Most functions expect pointers to mp\_int's as parameters.   To save on memory usage it is possible to have source | ||||
| variables as destinations.  For example: | ||||
| variables as destinations.  The arguments are read left to right, so to compute $z = x + y$ you would pass the arguments | ||||
| in the order $x, y, z$.  For example: | ||||
| \begin{verbatim} | ||||
|    mp_add(&x, &y, &x);           /* x = x + y */ | ||||
|    mp_mul(&x, &z, &x);           /* x = x * z */ | ||||
|    mp_div_2(&x, &x);             /* x = x / 2 */ | ||||
|    mp_mul(&y, &x, &z);           /* z = y * x */ | ||||
|    mp_div_2(&x, &y);             /* y = x / 2 */ | ||||
| \end{verbatim} | ||||
| 
 | ||||
| \section{Quick Overview} | ||||
| \subsection{Return Values} | ||||
| All functions that return errors will return \textbf{MP\_OKAY} if the function was succesful.  It will return  | ||||
| \textbf{MP\_MEM} if it ran out of heap memory or \textbf{MP\_VAL} if one of the arguements is out of range.   | ||||
| 
 | ||||
| \subsection{Basic Functionality} | ||||
| Essentially all LibTomMath functions return one of three values to indicate if the function worked as desired.  A  | ||||
| function will return \textbf{MP\_OKAY} if the function was successful.  A function will return \textbf{MP\_MEM} if | ||||
| it ran out of memory and \textbf{MP\_VAL} if the input was invalid.   | ||||
| 
 | ||||
| Before an mp\_int can be used it must be initialized with  | ||||
| 
 | ||||
| \begin{verbatim} | ||||
| @ -106,7 +114,7 @@ int mp_init(mp_int *a); | ||||
| For example, consider the following. | ||||
| 
 | ||||
| \begin{verbatim} | ||||
| #include "bn.h" | ||||
| #include "tommath.h" | ||||
| int main(void) | ||||
| { | ||||
|    mp_int num; | ||||
| @ -383,6 +391,18 @@ in $c$ and returns success. | ||||
| 
 | ||||
| This function requires $O(N)$ additional digits of memory and $O(2 \cdot N)$ time. | ||||
| 
 | ||||
| \subsubsection{mp\_mul\_2(mp\_int *a, mp\_int *b)} | ||||
| Multiplies $a$ by two and stores in $b$.  This function is hard-coded to do a shift by one place so it is faster | ||||
| than calling mp\_mul\_2d with a count of one.   | ||||
| 
 | ||||
| This function requires $O(N)$ additional digits of memory and $O(N)$ time. | ||||
| 
 | ||||
| \subsubsection{mp\_div\_2(mp\_int *a, mp\_int *b)} | ||||
| Divides $a$ by two and stores in $b$.  This function is hard-coded to do a shift by one place so it is faster | ||||
| than calling mp\_div\_2d with a count of one. | ||||
| 
 | ||||
| This function requires $O(N)$ additional digits of memory and $O(N)$ time. | ||||
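The single-place shift described above can be sketched as follows.  This is only an illustration, not the library's mp\_div\_2: the name div2\_sketch, the use of uint32\_t for a 28-bit digit, and the least-significant-digit-first layout are all assumptions made for the example, and sign handling and clamping are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DIGIT_BIT 28   /* assumed default digit size */

/* Hypothetical helper: b = a / 2 for an n-digit magnitude.
 * Digits are 28-bit values stored least significant first.
 * Works in place (a == b) because each digit is read before it is written. */
void div2_sketch(const uint32_t *a, size_t n, uint32_t *b)
{
    uint32_t carry = 0;   /* bit shifted out of the digit above */
    size_t i;

    assert(a != NULL && b != NULL);

    /* walk from the most significant digit down to the least */
    for (i = n; i-- > 0; ) {
        uint32_t next = a[i] & 1u;   /* low bit falls into the digit below */
        b[i] = (a[i] >> 1) | (carry << (SKETCH_DIGIT_BIT - 1));
        carry = next;
    }
}
```

Because the shift count is fixed at one, the inner loop needs no division or modulo by the digit size, which is the saving the text attributes to the dedicated routine.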
| 
 | ||||
| \subsubsection{mp\_mod\_2d(mp\_int *a, int b, mp\_int *c)} | ||||
| Performs the action of reducing $a$ modulo $2^b$ and stores the result in $c$.  If the shift count $b$ is less than  | ||||
| or equal to zero the function places $a$ in $c$ and returns success.   | ||||
| @ -412,7 +432,7 @@ of $c$ is the maximum length of the two inputs. | ||||
| \subsection{Basic Arithmetic} | ||||
| 
 | ||||
| \subsubsection{mp\_cmp(mp\_int *a, mp\_int *b)} | ||||
| Performs a \textbf{signed} comparison between $a$ and $b$ returning \textbf{MP\_GT} is $a$ is larger than $b$. | ||||
| Performs a \textbf{signed} comparison between $a$ and $b$ returning \textbf{MP\_GT} if $a$ is larger than $b$. | ||||
| 
 | ||||
| This function requires no additional memory and $O(N)$ time. | ||||
| 
 | ||||
| @ -559,57 +579,6 @@ A very useful observation is that multiplying by $R = \beta^n$ amounts to perfor | ||||
| requires no single precision multiplications.   | ||||
| 
 | ||||
| \section{Timing Analysis} | ||||
| \subsection{Observed Timings} | ||||
| A simple test program ``demo.c'' was developed which builds with either MPI or LibTomMath (without modification).  The | ||||
| test was conducted on an AMD Athlon XP processor with 266Mhz DDR memory and the GCC 3.2 compiler\footnote{With build | ||||
| options ``-O3 -fomit-frame-pointer -funroll-loops''}.    The multiplications and squarings were repeated 100,000 times  | ||||
| each while the modular exponentiation (exptmod) were performed 50 times each.  The ``inversions'' refers to multiplicative | ||||
| inversions modulo an odd number of a given size.  The RDTSC (Read Time Stamp Counter) instruction was used to measure the  | ||||
| time the entire iterations took and was divided by the number of iterations to get an average.  The following results  | ||||
| were observed. | ||||
| 
 | ||||
| \begin{small} | ||||
| \begin{center} | ||||
| \begin{tabular}{c|c|c|c} | ||||
| \hline \textbf{Operation} & \textbf{Size (bits)} & \textbf{Time with MPI (cycles)} & \textbf{Time with LibTomMath (cycles)} \\ | ||||
| \hline | ||||
| Inversion & 128 & 264,083  & 59,782   \\ | ||||
| Inversion & 256 & 549,370  & 146,915   \\ | ||||
| Inversion & 512 & 1,675,975  & 367,172   \\ | ||||
| Inversion & 1024 & 5,237,957  & 1,054,158   \\ | ||||
| Inversion & 2048 & 17,871,944  & 3,459,683   \\ | ||||
| Inversion & 4096 & 66,610,468  & 11,834,556   \\ | ||||
| \hline | ||||
| Multiply & 128 & 1,426   & 451     \\ | ||||
| Multiply & 256 & 2,551   & 958     \\ | ||||
| Multiply & 512 & 7,913   & 2,476     \\ | ||||
| Multiply & 1024 & 28,496   & 7,927   \\ | ||||
| Multiply & 2048 & 109,897   & 28,224     \\ | ||||
| Multiply & 4096 & 469,970   & 101,171     \\ | ||||
| \hline  | ||||
| Square & 128 & 1,319   & 511     \\ | ||||
| Square & 256 & 1,776   & 947     \\ | ||||
| Square & 512 & 5,399  & 2,153    \\ | ||||
| Square & 1024 & 18,991  & 5,733     \\ | ||||
| Square & 2048 & 72,126  & 17,621    \\ | ||||
| Square & 4096 & 306,269  & 67,576   \\ | ||||
| \hline  | ||||
| Exptmod & 512 & 32,021,586  & 3,118,435 \\ | ||||
| Exptmod & 768 & 97,595,492  & 8,493,633 \\ | ||||
| Exptmod & 1024 & 223,302,532  & 17,715,899     \\ | ||||
| Exptmod & 2048 & 1,682,223,369   & 114,936,361      \\ | ||||
| Exptmod & 2560 & 3,268,615,571   & 229,402,426       \\ | ||||
| Exptmod & 3072 & 5,597,240,141   & 367,403,840      \\ | ||||
| Exptmod & 4096 & 13,347,270,891   & 779,058,433       | ||||
| 
 | ||||
| \end{tabular} | ||||
| \end{center} | ||||
| \end{small} | ||||
| 
 | ||||
| Note that the figures do fluctuate but their magnitudes are relatively intact.  The purpose of the chart is not to | ||||
| get an exact timing but to compare the two libraries.  For example, in all of the tests the exact time for a 512-bit | ||||
| squaring operation was not the same.  The observed times were all approximately 2,500 cycles, more importantly they | ||||
| were always faster than the timings observed with MPI by about the same magnitude.   | ||||
| 
 | ||||
| \subsection{Digit Size} | ||||
| The first major contribution to the time savings is the fact that 28 bits are stored per digit instead of the MPI  | ||||
| @ -619,29 +588,59 @@ A savings of $64^2 - 37^2 = 2727$ single precision multiplications. | ||||
| 
 | ||||
| \subsection{Multiplication Algorithms} | ||||
| For most inputs a typical baseline $O(n^2)$ multiplier is used which is similar to that of MPI.  There are two variants  | ||||
| of the baseline multiplier.  The normal and the fast variants.  The normal baseline multiplier is the exact same as the | ||||
| algorithm from MPI.  The fast baseline multiplier is optimized for cases where the number of input digits $N$ is less | ||||
| than or equal to $2^{w}/\beta^2$.  Where $w$ is the number of bits in a \textbf{mp\_word}.  By default a mp\_word is | ||||
| 64-bits which means $N \le 256$ is allowed which represents numbers upto $7168$ bits. | ||||
| of the baseline multiplier: the normal variant and the fast comba variant.  The normal baseline multiplier is exactly the same as  | ||||
| the algorithm from MPI.  The fast comba baseline multiplier is optimized for cases where the number of input digits $N$  | ||||
| is less than or equal to $2^{w}/\beta^2$, where $w$ is the number of bits in a \textbf{mp\_word} and $\beta$ is the digit radix. | ||||
| By default a mp\_word is 64 bits and $\beta = 2^{28}$, which means $N \le 256$ is allowed, representing numbers up to $7,168$ bits.  However, | ||||
| since the Karatsuba multiplier (discussed below) will kick in before that size the slower baseline algorithm (that MPI | ||||
| uses) should never really be used in a default configuration.   | ||||
| 
 | ||||
| The fast baseline multiplier is optimized by removing the carry operations from the inner loop.  This is often referred | ||||
| to as the ``comba'' method since it computes the products a columns first then figures out the carries.  This has the | ||||
| effect of making a very simple and paralizable inner loop. | ||||
| The fast comba baseline multiplier is optimized by removing the carry operations from the inner loop.  This is often  | ||||
| referred to as the ``comba'' method since it computes the products a column at a time and only then figures out the carries.  To | ||||
| accommodate this the result of the inner multiplications must be stored in words large enough not to lose the carry bits.   | ||||
| This is why there is a limit of $2^{w}/\beta^2$ digits in the input.  This optimization has the effect of making a  | ||||
| very simple and efficient inner loop. | ||||
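The column-first idea can be sketched as below.  This is a minimal illustration under stated assumptions, not the library's routine: the name comba\_mul\_sketch, the SKETCH\_ constants, the use of uint32\_t for a 28-bit digit and uint64\_t for the wide word are all hypothetical choices for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DIGIT_BIT  28                                  /* assumed digit size */
#define SKETCH_MASK       ((uint32_t)((1UL << SKETCH_DIGIT_BIT) - 1))
#define SKETCH_MAX_DIGITS 256                                 /* 2^64 / (2^28)^2 */

/* Hypothetical helper: c = a * b for two n-digit magnitudes.
 * Digits are stored least significant first; c must hold 2*n digits. */
void comba_mul_sketch(const uint32_t *a, const uint32_t *b, size_t n, uint32_t *c)
{
    uint64_t W[2 * SKETCH_MAX_DIGITS] = {0};  /* one wide accumulator per column */
    uint64_t carry = 0;
    size_t i, j;

    /* with more than 256 digits a column sum could overflow 64 bits */
    assert(n >= 1 && n <= SKETCH_MAX_DIGITS);

    /* inner loop: multiply and accumulate only, no carry handling */
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            W[i + j] += (uint64_t)a[i] * b[j];
        }
    }

    /* single pass afterwards to propagate the carries */
    for (i = 0; i < 2 * n; i++) {
        uint64_t t = W[i] + carry;
        c[i] = (uint32_t)(t & SKETCH_MASK);
        carry = t >> SKETCH_DIGIT_BIT;
    }
}
```

Each product is below $2^{56}$, so up to 256 of them fit in one 64-bit column, which is where the digit limit in the text comes from.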
| 
 | ||||
| For large inputs, typically 80 digits\footnote{By default that is 2240-bits or more.} or more the Karatsuba method is  | ||||
| used.  This method has significant overhead but an asymptotic running time of $O(n^{1.584})$ which means for fairly large | ||||
| inputs this method is faster.  The Karatsuba implementation is recursive which means for extremely large inputs they | ||||
| will benefit from the algorithm. | ||||
| \subsubsection{Karatsuba Multiplier} | ||||
| For large inputs, typically 80 digits\footnote{By default that is 2240 bits or more.} or more, the Karatsuba multiplication | ||||
| method is used.  This method has significant overhead but an asymptotic running time of $O(n^{1.584})$ which means for  | ||||
| fairly large inputs this method is faster than the baseline (or comba) algorithm.  The Karatsuba implementation is  | ||||
| recursive, which means extremely large inputs benefit from the algorithm at every level of the recursion. | ||||
| 
 | ||||
| The algorithm is based on the observation that if  | ||||
| 
 | ||||
| \begin{eqnarray} | ||||
| x = x_0 + x_1\beta \nonumber \\ | ||||
| y = y_0 + y_1\beta | ||||
| \end{eqnarray} | ||||
| 
 | ||||
| Where $x_0, x_1$ and $y_0, y_1$ are half the size of $x$ and $y$ respectively, then  | ||||
| 
 | ||||
| \begin{equation} | ||||
| x \cdot y = x_1y_1\beta^2 + (x_0y_0 + x_1y_1 - (x_1 - x_0)(y_1 - y_0))\beta + x_0y_0 | ||||
| \end{equation} | ||||
| 
 | ||||
| It follows that only three products have to be computed: $x_0y_0, x_1y_1, (x_1-x_0)(y_1-y_0)$ which | ||||
| are all products of half size numbers.  A multiplication of two half size numbers requires only $1 \over 4$ of the | ||||
| original work which means with no recursion the Karatsuba algorithm achieves a running time of ${3n^2}\over 4$.   | ||||
| The routine provided does recursion which is where the $O(n^{1.584})$ work factor comes from. | ||||
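A one-level version of the three-product trick can be checked on machine words.  The sketch below is only an illustration and not the library's code: the name karatsuba32\_sketch and the choice of $\beta = 2^{16}$ are assumptions for the example.  It relies on the standard identity $x_1y_0 + x_0y_1 = x_0y_0 + x_1y_1 - (x_1 - x_0)(y_1 - y_0)$ to obtain the middle term.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32x32 -> 64 bit product using three 16-bit-half
 * multiplications instead of four, with beta = 2^16. */
uint64_t karatsuba32_sketch(uint32_t x, uint32_t y)
{
    uint32_t x0 = x & 0xFFFFu, x1 = x >> 16;   /* x = x0 + x1 * beta */
    uint32_t y0 = y & 0xFFFFu, y1 = y >> 16;   /* y = y0 + y1 * beta */

    uint64_t lo = (uint64_t)x0 * y0;           /* x0 * y0 */
    uint64_t hi = (uint64_t)x1 * y1;           /* x1 * y1 */

    /* third product; the differences are signed so this may be negative */
    int64_t cross = ((int64_t)x1 - (int64_t)x0) * ((int64_t)y1 - (int64_t)y0);

    /* middle coefficient x1*y0 + x0*y1, which is never negative */
    int64_t mid = (int64_t)lo + (int64_t)hi - cross;
    assert(mid >= 0);

    /* multiplying by beta and beta^2 is just a shift */
    return (hi << 32) + ((uint64_t)mid << 16) + lo;
}
```

For example, karatsuba32\_sketch(x, y) should agree with the direct 64-bit product of x and y for any pair of 32-bit inputs.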
| 
 | ||||
| The multiplication by $\beta$ and $\beta^2$ amount to digit shift operations.   | ||||
| The extra overhead in the Karatsuba method comes from extracting the half size numbers $x_0, x_1, y_0, y_1$ and | ||||
| performing the various smaller calculations.   | ||||
| 
 | ||||
| The library has been fairly optimized to extract the digits using hard-coded routines instead of the higher | ||||
| level functions; however, there is still significant overhead to optimize away. | ||||
| 
 | ||||
| MPI only implements the slower baseline multiplier where carries are dealt with in the inner loop.  As a result even at | ||||
| smaller numbers (below the Karatsuba cutoff) the LibTomMath multipliers are faster. | ||||
| 
 | ||||
| \subsection{Squaring Algorithms} | ||||
| 
 | ||||
| Similar to the multiplication algorithms there are two baseline squaring algorithms.  Both have an asymptotic running | ||||
| time of $O((t^2 + t)/2)$.  The normal baseline squaring is the same from MPI and the fast is a ``comba'' squaring | ||||
| algorithm.  The comba method is used if the number of digits $N$ is less than $2^{w-1}/\beta^2$ which by default  | ||||
| covers numbers upto $3584$ bits.   | ||||
| Similar to the multiplication algorithms there are two baseline squaring algorithms.  Both have an asymptotic  | ||||
| running time of $O((t^2 + t)/2)$.  The normal baseline squaring is the same as in MPI and the fast method is  | ||||
| a ``comba'' squaring algorithm.  The comba method is used if the number of digits $N$ is less than  | ||||
| $2^{w-1}/\beta^2$ which by default covers numbers up to $3,584$ bits.   | ||||
| 
 | ||||
| There is also a Karatsuba squaring method which achieves a running time of $O(n^{1.584})$ after considerably large | ||||
| inputs. | ||||
| @ -653,25 +652,31 @@ than MPI is. | ||||
| 
 | ||||
| LibTomMath implements a sliding window $k$-ary left to right exponentiation algorithm.  For a given exponent size $L$ an | ||||
| appropriate window size $k$ is chosen.  There are always at most $L$ modular squarings and $\lfloor L/k \rfloor$ modular | ||||
| multiplications.   The $k$-ary method works by precomputing values $g(x) = b^x$ for $0 \le x < 2^k$ and a given base  | ||||
| multiplications.   The $k$-ary method works by precomputing values $g(x) = b^x$ for $2^{k-1} \le x < 2^k$ and a given base  | ||||
| $b$.  Then the multiplications are grouped in windows of $k$ bits.  The sliding window technique has the benefit  | ||||
| that it can skip multiplications if there are zero bits following or preceding a window.  Consider the exponent  | ||||
| $e = 11110001_2$.  If $k = 2$ then there will be two squarings, a multiplication of $g(3)$, two squarings, a multiplication | ||||
| of $g(3)$, four squarings and and a multiplication by $g(1)$.  In total there are 8 squarings and 3 multiplications.   | ||||
| of $g(3)$, four squarings and a multiplication by $g(1)$.  In total there are 8 squarings and 3 multiplications. | ||||
| 
 | ||||
| MPI uses a binary square-multiply method.  For the same exponent $e$ it would have had 8 squarings and 5 multiplications.   | ||||
| There is a precomputation phase for the method LibTomMath uses but it generally cuts down considerably on the number | ||||
| of multiplications.  Consider a 512-bit exponent.  The worst case for the LibTomMath method results in 512 squarings and  | ||||
| 124 multiplications.  The MPI method would have 512 squarings and 512 multiplications.  Randomly every $2k$ bits another  | ||||
| multiplication is saved via the sliding-window technique on top of the savings the $k$-ary method provides. | ||||
| MPI uses a binary square-multiply method for exponentiation.  For the same exponent $e = 11110001_2$ it would have had to | ||||
| perform 8 squarings and 5 multiplications.  There is a precomputation phase for the method LibTomMath uses but it  | ||||
| generally cuts down considerably on the number of multiplications.  Consider a 512-bit exponent.  The worst case for the  | ||||
| LibTomMath method results in 512 squarings and 124 multiplications.  The MPI method would have 512 squarings  | ||||
| and 512 multiplications.  Randomly every $2k$ bits another multiplication is saved via the sliding-window  | ||||
| technique on top of the savings the $k$-ary method provides. | ||||
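The operation counts quoted for $e = 11110001_2$ can be reproduced with a small counting sketch.  This is an illustration only and performs no actual exponentiation: the names op\_count, binary\_count and sliding\_window\_count are hypothetical, squarings are counted for every exponent bit as in the text, and it is assumed that fewer than $k$ leftover bits are handled one bit at a time.

```c
#include <assert.h>

/* Operation counts for one exponentiation (hypothetical type). */
typedef struct {
    int squarings;
    int multiplications;
} op_count;

/* Binary square-multiply: one squaring per bit, one multiplication per set bit. */
op_count binary_count(unsigned long e, int nbits)
{
    op_count c = {0, 0};
    int i;

    assert(nbits >= 1 && nbits <= 32);
    for (i = nbits - 1; i >= 0; i--) {
        c.squarings++;
        if ((e >> i) & 1UL) {
            c.multiplications++;
        }
    }
    return c;
}

/* Sliding window of k bits, scanning the exponent left to right. */
op_count sliding_window_count(unsigned long e, int nbits, int k)
{
    op_count c = {0, 0};
    int i = nbits - 1;

    assert(nbits >= 1 && nbits <= 32 && k >= 1);
    while (i >= 0) {
        if (((e >> i) & 1UL) == 0) {
            /* zero bit outside a window: square only */
            c.squarings++;
            i--;
        } else if (i + 1 >= k) {
            /* full window starting with a 1 bit: k squarings, one multiply */
            c.squarings += k;
            c.multiplications++;
            i -= k;
        } else {
            /* fewer than k bits left: handle this set bit the binary way */
            c.squarings++;
            c.multiplications++;
            i--;
        }
    }
    return c;
}
```

Calling sliding\_window\_count(0xF1, 8, 2) and binary\_count(0xF1, 8) gives the two sets of counts compared in the text.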
| 
 | ||||
| Both LibTomMath and MPI use Barrett reduction instead of division to reduce the numbers modulo the modulus given. | ||||
| However, LibTomMath can take advantage of the fact that the multiplications required within the Barrett reduction | ||||
| do not have to give full precision.  As a result the reduction step is much faster and just as accurate.  The LibTomMath code | ||||
| will automatically determine at run-time (e.g. when its called) whether the faster multiplier can be used.  The | ||||
| do not have to give full precision.  As a result the reduction step is much faster and just as accurate.  The LibTomMath  | ||||
| code will automatically determine at run-time (i.e. when it is called) whether the faster multiplier can be used.  The | ||||
| faster multipliers have also been optimized into the two variants (baseline and comba baseline). | ||||
| 
 | ||||
| LibTomMath also has a variant of the exptmod function that uses Montgomery reductions instead of Barrett reductions | ||||
| which is faser.  As a result of all these changes exponentiation in LibTomMath is much faster than compared to MPI.   | ||||
| which is faster.  The code will automatically detect when the Montgomery version can be used (\textit{Requires the | ||||
| modulus to be odd and below the MONTGOMERY\_EXPT\_CUTOFF size}).  The Montgomery routine is essentially a copy of the  | ||||
| Barrett exponentiation routine except it uses Montgomery reduction. | ||||
| 
 | ||||
| As a result of all these changes exponentiation in LibTomMath is much faster than in MPI.  On most ALU-strong | ||||
| processors (AMD Athlon for instance) exponentiation in LibTomMath is often more than ten times faster than MPI.    | ||||
| 
 | ||||
| \end{document} | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -100,14 +100,18 @@ fast_mp_montgomery_reduce (mp_int * a, mp_int * m, mp_digit mp) | ||||
|     W[ix + 1] += W[ix] >> ((mp_word) DIGIT_BIT); | ||||
|   } | ||||
| 
 | ||||
|   /* nox fix rest of carries */ | ||||
|   for (++ix; ix <= m->used * 2 + 1; ix++) { | ||||
|     W[ix] += (W[ix - 1] >> ((mp_word) DIGIT_BIT)); | ||||
|   } | ||||
| 
 | ||||
|   { | ||||
|     register mp_digit *tmpa; | ||||
|     register mp_word *_W; | ||||
|     register mp_word *_W, *_W1; | ||||
| 
 | ||||
|     /* now fix rest of carries */ | ||||
|     _W1 = W + ix; | ||||
|     _W = W + ++ix; | ||||
| 
 | ||||
|     for (; ix <= m->used * 2 + 1; ix++) { | ||||
|       *_W++ += *_W1++ >> ((mp_word) DIGIT_BIT); | ||||
|     } | ||||
| 
 | ||||
|     /* copy out, A = A/b^n
 | ||||
|      * | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -46,6 +46,7 @@ mp_div_2 (mp_int * a, mp_int * b) | ||||
|       *tmpb++ = 0; | ||||
|     } | ||||
|   } | ||||
|   b->sign = a->sign; | ||||
|   mp_clamp (b); | ||||
|   return MP_OKAY; | ||||
| } | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -51,7 +51,9 @@ mp_div_2d (mp_int * a, int b, mp_int * c, mp_int * d) | ||||
|   } | ||||
| 
 | ||||
|   /* shift by as many digits in the bit count */ | ||||
|   mp_rshd (c, b / DIGIT_BIT); | ||||
|   if (b >= DIGIT_BIT) { | ||||
|      mp_rshd (c, b / DIGIT_BIT); | ||||
|   }      | ||||
| 
 | ||||
|   /* shift any bit count < DIGIT_BIT */ | ||||
|   D = (mp_digit) (b % DIGIT_BIT); | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -37,8 +37,7 @@ int | ||||
| mp_karatsuba_mul (mp_int * a, mp_int * b, mp_int * c) | ||||
| { | ||||
|   mp_int  x0, x1, y0, y1, t1, t2, x0y0, x1y1; | ||||
|   int     B, err, x; | ||||
| 
 | ||||
|   int     B, err; | ||||
| 
 | ||||
|   err = MP_MEM; | ||||
| 
 | ||||
| @ -59,13 +58,13 @@ mp_karatsuba_mul (mp_int * a, mp_int * b, mp_int * c) | ||||
|     goto Y0; | ||||
| 
 | ||||
|   /* init temps */ | ||||
|   if (mp_init (&t1) != MP_OKAY) | ||||
|   if (mp_init_size (&t1, B * 2) != MP_OKAY) | ||||
|     goto Y1; | ||||
|   if (mp_init (&t2) != MP_OKAY) | ||||
|   if (mp_init_size (&t2, B * 2) != MP_OKAY) | ||||
|     goto T1; | ||||
|   if (mp_init (&x0y0) != MP_OKAY) | ||||
|   if (mp_init_size (&x0y0, B * 2) != MP_OKAY) | ||||
|     goto T2; | ||||
|   if (mp_init (&x1y1) != MP_OKAY) | ||||
|   if (mp_init_size (&x1y1, B * 2) != MP_OKAY) | ||||
|     goto X0Y0; | ||||
| 
 | ||||
|   /* now shift the digits */ | ||||
| @ -76,18 +75,32 @@ mp_karatsuba_mul (mp_int * a, mp_int * b, mp_int * c) | ||||
|   x1.used = a->used - B; | ||||
|   y1.used = b->used - B; | ||||
| 
 | ||||
|   /* we copy the digits directly instead of using higher level functions
 | ||||
|    * since we also need to shift the digits | ||||
|    */ | ||||
|   for (x = 0; x < B; x++) { | ||||
|     x0.dp[x] = a->dp[x]; | ||||
|     y0.dp[x] = b->dp[x]; | ||||
|   } | ||||
|   for (x = B; x < a->used; x++) { | ||||
|     x1.dp[x - B] = a->dp[x]; | ||||
|   } | ||||
|   for (x = B; x < b->used; x++) { | ||||
|     y1.dp[x - B] = b->dp[x]; | ||||
|   { | ||||
|     register int x; | ||||
|     register mp_digit *tmpa, *tmpb, *tmpx, *tmpy; | ||||
| 
 | ||||
|     /* we copy the digits directly instead of using higher level functions
 | ||||
|      * since we also need to shift the digits | ||||
|      */ | ||||
|     tmpa = a->dp; | ||||
|     tmpb = b->dp; | ||||
| 
 | ||||
|     tmpx = x0.dp; | ||||
|     tmpy = y0.dp; | ||||
|     for (x = 0; x < B; x++) { | ||||
|       *tmpx++ = *tmpa++; | ||||
|       *tmpy++ = *tmpb++; | ||||
|     } | ||||
| 
 | ||||
|     tmpx = x1.dp; | ||||
|     for (x = B; x < a->used; x++) { | ||||
|       *tmpx++ = *tmpa++; | ||||
|     } | ||||
| 
 | ||||
|     tmpy = y1.dp; | ||||
|     for (x = B; x < b->used; x++) { | ||||
|       *tmpy++ = *tmpb++; | ||||
|     } | ||||
|   } | ||||
| 
 | ||||
|   /* only need to clamp the lower words since by definition the upper words x1/y1 must
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -23,8 +23,7 @@ int | ||||
| mp_karatsuba_sqr (mp_int * a, mp_int * b) | ||||
| { | ||||
|   mp_int  x0, x1, t1, t2, x0x0, x1x1; | ||||
|   int     B, err, x; | ||||
| 
 | ||||
|   int     B, err; | ||||
| 
 | ||||
|   err = MP_MEM; | ||||
| 
 | ||||
| @ -41,22 +40,31 @@ mp_karatsuba_sqr (mp_int * a, mp_int * b) | ||||
|     goto X0; | ||||
| 
 | ||||
|   /* init temps */ | ||||
|   if (mp_init (&t1) != MP_OKAY) | ||||
|   if (mp_init_size (&t1, a->used * 2) != MP_OKAY) | ||||
|     goto X1; | ||||
|   if (mp_init (&t2) != MP_OKAY) | ||||
|   if (mp_init_size (&t2, a->used * 2) != MP_OKAY) | ||||
|     goto T1; | ||||
|   if (mp_init (&x0x0) != MP_OKAY) | ||||
|   if (mp_init_size (&x0x0, B * 2) != MP_OKAY) | ||||
|     goto T2; | ||||
|   if (mp_init (&x1x1) != MP_OKAY) | ||||
|   if (mp_init_size (&x1x1, (a->used - B) * 2) != MP_OKAY) | ||||
|     goto X0X0; | ||||
| 
 | ||||
|   /* now shift the digits */ | ||||
|   for (x = 0; x < B; x++) { | ||||
|     x0.dp[x] = a->dp[x]; | ||||
|   } | ||||
|   { | ||||
|     register int x; | ||||
|     register mp_digit *dst, *src; | ||||
| 
 | ||||
|   for (x = B; x < a->used; x++) { | ||||
|     x1.dp[x - B] = a->dp[x]; | ||||
|     src = a->dp; | ||||
| 
 | ||||
|     /* now shift the digits */ | ||||
|     dst = x0.dp; | ||||
|     for (x = 0; x < B; x++) { | ||||
|       *dst++ = *src++; | ||||
|     } | ||||
| 
 | ||||
|     dst = x1.dp; | ||||
|     for (x = B; x < a->used; x++) { | ||||
|       *dst++ = *src++; | ||||
|     } | ||||
|   } | ||||
| 
 | ||||
|   x0.used = B; | ||||
| @ -77,7 +85,7 @@ mp_karatsuba_sqr (mp_int * a, mp_int * b) | ||||
|     goto X1X1;			/* t1 = (x1 - x0) * (y1 - y0) */ | ||||
| 
 | ||||
|   /* add x0y0 */ | ||||
|   if (mp_add (&x0x0, &x1x1, &t2) != MP_OKAY) | ||||
|   if (s_mp_add (&x0x0, &x1x1, &t2) != MP_OKAY) | ||||
|     goto X1X1;			/* t2 = x0y0 + x1y1 */ | ||||
|   if (mp_sub (&t2, &t1, &t1) != MP_OKAY) | ||||
|     goto X1X1;			/* t1 = x0y0 + x1y1 - (x1-x0)*(y1-y0) */ | ||||
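The comments above rely on the identity x0y1 + x1y0 = x0y0 + x1y1 - (x1 - x0)(y1 - y0), which lets the middle term be built from three products instead of four. The sketch below checks that identity on 32-bit words split at 16 bits. The function name `karatsuba32` and the fixed split are assumptions for illustration; the library works on `mp_int` digit arrays instead.

```c
#include <stdint.h>

/* Hypothetical single-level Karatsuba on 32-bit inputs, split at 16 bits. */
static uint64_t karatsuba32(uint32_t x, uint32_t y)
{
    int64_t x0 = x & 0xFFFFu, x1 = x >> 16;   /* x = x1*2^16 + x0 */
    int64_t y0 = y & 0xFFFFu, y1 = y >> 16;   /* y = y1*2^16 + y0 */

    int64_t x0y0 = x0 * y0;
    int64_t x1y1 = x1 * y1;
    int64_t t1   = (x1 - x0) * (y1 - y0);     /* may be negative */

    /* middle term: x0y0 + x1y1 - (x1-x0)*(y1-y0) == x0*y1 + x1*y0 >= 0 */
    int64_t mid = x0y0 + x1y1 - t1;

    return ((uint64_t)x1y1 << 32) + ((uint64_t)mid << 16) + (uint64_t)x0y0;
}
```

Note that the sum x0y0 + x1y1 is always non-negative, while the later subtraction of t1 has to cope with a signed operand.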
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
							
								
								
									
										35
									
								
								bn_mp_lshd.c
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -31,16 +31,31 @@ mp_lshd (mp_int * a, int b) | ||||
|     return res; | ||||
|   } | ||||
| 
 | ||||
|   /* increment the used by the shift amount than copy upwards */ | ||||
|   a->used += b; | ||||
|   for (x = a->used - 1; x >= b; x--) { | ||||
|     a->dp[x] = a->dp[x - b]; | ||||
|   } | ||||
|   { | ||||
|     register mp_digit *tmpa, *tmpaa; | ||||
| 
 | ||||
|   /* zero the lower digits */ | ||||
|   for (x = 0; x < b; x++) { | ||||
|     a->dp[x] = 0; | ||||
|     /* increment the used by the shift amount then copy upwards */ | ||||
|     a->used += b; | ||||
|      | ||||
|     /* top */ | ||||
|     tmpa = a->dp + a->used - 1; | ||||
|      | ||||
|     /* base */ | ||||
|     tmpaa = a->dp + a->used - 1 - b; | ||||
| 
 | ||||
|     /* much like mp_rshd this is implemented using a sliding window
 | ||||
|      * except the window goes the other way around.  Copying from | ||||
|      * the bottom to the top.  see bn_mp_rshd.c for more info. | ||||
|      */ | ||||
|     for (x = a->used - 1; x >= b; x--) { | ||||
|       *tmpa-- = *tmpaa--; | ||||
|     } | ||||
| 
 | ||||
|     /* zero the lower digits */ | ||||
|     tmpa = a->dp; | ||||
|     for (x = 0; x < b; x++) { | ||||
|       *tmpa++ = 0; | ||||
|     } | ||||
|   } | ||||
|   mp_clamp (a); | ||||
|   return MP_OKAY; | ||||
| } | ||||
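The left shift has to copy from the top digit downward, because source and destination overlap and a bottom-up copy would overwrite digits before they are read. A minimal sketch on a plain array, assuming a hypothetical helper `digits_lshd` and a caller that has already reserved room for `used + b` digits:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shift `used` digits up by b positions.
 * dp must have room for used + b digits. Returns the new digit count. */
static size_t digits_lshd(uint32_t *dp, size_t used, size_t b)
{
    uint32_t *dst, *src;
    size_t    i;

    if (b == 0 || used == 0) {
        return used;
    }

    dst = dp + used + b;   /* one past the new top digit */
    src = dp + used;       /* one past the old top digit */

    /* walk both aliases downward so overlapping digits are read first */
    while (src != dp) {
        *--dst = *--src;
    }

    /* zero the lower digits */
    for (i = 0; i < b; i++) {
        dp[i] = 0;
    }
    return used + b;
}
```

The pre-decrement form keeps both pointers inside the array at all times, which sidesteps forming a pointer one element before `dp`.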
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -18,36 +18,29 @@ | ||||
| int | ||||
| mp_montgomery_setup (mp_int * a, mp_digit * mp) | ||||
| { | ||||
|   mp_int  t, tt; | ||||
|   int     res; | ||||
|   unsigned long x, b; | ||||
| 
 | ||||
|   if ((res = mp_init (&t)) != MP_OKAY) { | ||||
|     return res; | ||||
| /* fast inversion mod 2^32 
 | ||||
|  * | ||||
|  * Based on the fact that  | ||||
|  * | ||||
|  * XA = 1 (mod 2^n)  =>  (X(2-XA)) A = 1 (mod 2^2n) | ||||
|  *                   =>  2*X*A - X*X*A*A = 1 | ||||
|  *                   =>  2*(1) - (1)     = 1 | ||||
|  */ | ||||
|   b = a->dp[0]; | ||||
| 
 | ||||
|   if ((b & 1) == 0) { | ||||
|     return MP_VAL; | ||||
|   } | ||||
| 
 | ||||
|   if ((res = mp_init (&tt)) != MP_OKAY) { | ||||
|     goto __T; | ||||
|   } | ||||
| 
 | ||||
|   /* tt = b */ | ||||
|   tt.dp[0] = 0; | ||||
|   tt.dp[1] = 1; | ||||
|   tt.used = 2; | ||||
| 
 | ||||
|   /* t = m mod b */ | ||||
|   t.dp[0] = a->dp[0]; | ||||
|   t.used = 1; | ||||
| 
 | ||||
|   /* t = 1/m mod b */ | ||||
|   if ((res = mp_invmod (&t, &tt, &t)) != MP_OKAY) { | ||||
|     goto __TT; | ||||
|   } | ||||
|   x = (((b + 2) & 4) << 1) + b;	/* here x*a==1 mod 2^4 */ | ||||
|   x *= 2 - b * x;		/* here x*a==1 mod 2^8 */ | ||||
|   x *= 2 - b * x;		/* here x*a==1 mod 2^16; each step doubles the nb of bits */ | ||||
|   x *= 2 - b * x;		/* here x*a==1 mod 2^32 */ | ||||
| 
 | ||||
|   /* t = -1/m mod b */ | ||||
|   *mp = ((mp_digit) 1 << ((mp_digit) DIGIT_BIT)) - t.dp[0]; | ||||
|   *mp = ((mp_digit) 1 << ((mp_digit) DIGIT_BIT)) - (x & MP_MASK); | ||||
| 
 | ||||
|   res = MP_OKAY; | ||||
| __TT:mp_clear (&tt); | ||||
| __T:mp_clear (&t); | ||||
|   return res; | ||||
|   return MP_OKAY; | ||||
| } | ||||
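The new setup code replaces the `mp_invmod` call with a Newton-style iteration: if x*b = 1 + e*2^n, then x*(2 - b*x)*b = 1 - e^2*2^(2n), so each step doubles the number of correct low bits. The sketch below isolates that iteration on fixed-width integers. The helper names `inv_mod_2_32` and `neg_inv_mod_2_k` are assumptions for illustration and are not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: inverse of an odd b modulo 2^32. */
static uint32_t inv_mod_2_32(uint32_t b)
{
    uint32_t x;

    assert((b & 1u) == 1u);            /* even values have no inverse */

    x = (((b + 2u) & 4u) << 1) + b;    /* x*b == 1 mod 2^4  */
    x *= 2u - b * x;                   /* x*b == 1 mod 2^8  */
    x *= 2u - b * x;                   /* x*b == 1 mod 2^16 */
    x *= 2u - b * x;                   /* x*b == 1 mod 2^32 */
    return x;
}

/* Hypothetical helper: -1/b mod 2^k for a k-bit digit, 0 < k < 32. */
static uint32_t neg_inv_mod_2_k(uint32_t b, unsigned k)
{
    uint32_t mask = (((uint32_t)1) << k) - 1u;
    return ((((uint32_t)1) << k) - (inv_mod_2_32(b) & mask)) & mask;
}
```

Because the inverse modulo 2^32 is also an inverse modulo any smaller power of two, masking down to the digit width is enough to get the per-digit constant.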
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -50,6 +50,11 @@ mp_mul_2 (mp_int * a, mp_int * b) | ||||
| 	if ((res = mp_grow (b, b->used + 1)) != MP_OKAY) { | ||||
| 	  return res; | ||||
| 	} | ||||
| 
 | ||||
| 	/* after the grow *tmpb is no longer valid so we have to reset it! 
 | ||||
| 	 * (this bug took me about 17 minutes to find...!) | ||||
| 	 */ | ||||
| 	tmpb = b->dp + b->used; | ||||
|       } | ||||
|       /* add a MSB of 1 */ | ||||
|       *tmpb = 1; | ||||
| @ -61,5 +66,6 @@ mp_mul_2 (mp_int * a, mp_int * b) | ||||
|       *tmpb++ = 0; | ||||
|     } | ||||
|   } | ||||
|   b->sign = a->sign; | ||||
|   return MP_OKAY; | ||||
| } | ||||
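The fix above re-derives `tmpb` after `mp_grow`, since growing may move the digit buffer and leave any cached alias dangling. The sketch below reproduces the pattern with `realloc` on a small hypothetical container; `digits_t`, `digits_grow` and `digits_push_msb` are illustration-only names, and the buffer is assumed to have been allocated with `malloc` beforehand.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical digit container, standing in for mp_int. */
typedef struct {
    unsigned *dp;      /* heap buffer, assumed non-NULL */
    size_t    used;
    size_t    alloc;
} digits_t;

/* Grow the buffer to at least n digits; returns 0 on success. */
static int digits_grow(digits_t *v, size_t n)
{
    unsigned *p;

    if (v->alloc >= n) {
        return 0;
    }
    p = realloc(v->dp, n * sizeof *p);
    if (p == NULL) {
        return -1;
    }
    v->dp    = p;      /* the buffer may have moved */
    v->alloc = n;
    return 0;
}

/* Append one digit above the current top digit. */
static int digits_push_msb(digits_t *v, unsigned d)
{
    unsigned *top = v->dp + v->used;   /* cached alias */

    if (v->used == v->alloc) {
        if (digits_grow(v, v->used + 1) != 0) {
            return -1;
        }
        /* the old alias is no longer valid after the grow: reset it */
        top = v->dp + v->used;
    }
    *top = d;
    v->used++;
    return 0;
}
```

Dropping the reset line may appear to work whenever `realloc` happens to extend the block in place, which is likely why this kind of bug can be slow to find.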
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -32,9 +32,11 @@ mp_mul_2d (mp_int * a, int b, mp_int * c) | ||||
|   } | ||||
| 
 | ||||
|   /* shift by as many digits in the bit count */ | ||||
|   if ((res = mp_lshd (c, b / DIGIT_BIT)) != MP_OKAY) { | ||||
|     return res; | ||||
|   } | ||||
|   if (b >= DIGIT_BIT) { | ||||
|      if ((res = mp_lshd (c, b / DIGIT_BIT)) != MP_OKAY) { | ||||
|        return res; | ||||
|      } | ||||
|   }      | ||||
|   c->used = c->alloc; | ||||
| 
 | ||||
|   /* shift any bit count < DIGIT_BIT */ | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
							
								
								
									
										37
									
								
								bn_mp_rshd.c
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -20,7 +20,6 @@ mp_rshd (mp_int * a, int b) | ||||
| { | ||||
|   int     x; | ||||
| 
 | ||||
| 
 | ||||
|   /* if b <= 0 then ignore it */ | ||||
|   if (b <= 0) { | ||||
|     return; | ||||
| @ -32,14 +31,34 @@ mp_rshd (mp_int * a, int b) | ||||
|     return; | ||||
|   } | ||||
| 
 | ||||
|   /* shift the digits down */ | ||||
|   for (x = 0; x < (a->used - b); x++) { | ||||
|     a->dp[x] = a->dp[x + b]; | ||||
|   } | ||||
|   { | ||||
|     register mp_digit *tmpa, *tmpaa; | ||||
| 
 | ||||
|   /* zero the top digits */ | ||||
|   for (; x < a->used; x++) { | ||||
|     a->dp[x] = 0; | ||||
|     /* shift the digits down */ | ||||
| 
 | ||||
|     /* base */ | ||||
|     tmpa = a->dp; | ||||
|      | ||||
|     /* offset into digits */ | ||||
|     tmpaa = a->dp + b; | ||||
|      | ||||
|     /* this is implemented as a sliding window where the window is b-digits long
 | ||||
|      * and digits from the top of the window are copied to the bottom | ||||
|      * | ||||
|      * e.g. | ||||
|       | ||||
|      b-2 | b-1 | b0 | b1 | b2 | ... | bb |   ----> | ||||
|                  /\                   |      ----> | ||||
|                   \-------------------/      ----> | ||||
|     */          | ||||
|     for (x = 0; x < (a->used - b); x++) { | ||||
|       *tmpa++ = *tmpaa++; | ||||
|     } | ||||
| 
 | ||||
|     /* zero the top digits */ | ||||
|     for (; x < a->used; x++) { | ||||
|       *tmpa++ = 0; | ||||
|     } | ||||
|   } | ||||
|   mp_clamp (a); | ||||
| } | ||||
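The sliding window described in the comment can be sketched on a plain array: one alias starts at the base, the other `b` digits above it, and both advance together before the vacated top digits are zeroed. The helper name `digits_rshd` and the `uint32_t` digit type are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shift `used` digits down by b positions.
 * Returns the number of digits that remain significant. */
static size_t digits_rshd(uint32_t *dp, size_t used, size_t b)
{
    uint32_t *dst, *src;
    size_t    i;

    if (b == 0) {
        return used;
    }
    if (b >= used) {               /* everything is shifted out */
        for (i = 0; i < used; i++) {
            dp[i] = 0;
        }
        return 0;
    }

    dst = dp;                      /* base               */
    src = dp + b;                  /* offset into digits */

    /* copy from the top of the window to the bottom */
    for (i = 0; i < used - b; i++) {
        *dst++ = *src++;
    }
    /* zero the top digits */
    for (; i < used; i++) {
        *dst++ = 0;
    }
    return used - b;
}
```

Here the copy runs bottom-up, which is safe because each source digit sits above the destination it is written to.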
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| @ -55,8 +55,14 @@ s_mp_add (mp_int * a, mp_int * b, mp_int * c) | ||||
|     register int i; | ||||
| 
 | ||||
|     /* alias for digit pointers */ | ||||
|      | ||||
|     /* first input */ | ||||
|     tmpa = a->dp; | ||||
|      | ||||
|     /* second input */ | ||||
|     tmpb = b->dp; | ||||
|      | ||||
|     /* destination */ | ||||
|     tmpc = c->dp; | ||||
| 
 | ||||
|     u = 0; | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
| @ -10,7 +10,7 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
|  | ||||
							
								
								
									
										11
									
								
								bncore.c
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							| @ -10,10 +10,13 @@ | ||||
|  * The library is free for all purposes without any express | ||||
|  * guarantee it works. | ||||
|  * | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
 | ||||
|  * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
 | ||||
|  */ | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| int     KARATSUBA_MUL_CUTOFF = 80,	/* Min. number of digits before Karatsuba multiplication is used. */ | ||||
|         KARATSUBA_SQR_CUTOFF = 80,	/* Min. number of digits before Karatsuba squaring is used. */ | ||||
|         MONTGOMERY_EXPT_CUTOFF = 74;	/* max. number of digits that montgomery reductions will help for */ | ||||
| /* configured for an AMD Duron Morgan core with etc/tune.c */ | ||||
| int     KARATSUBA_MUL_CUTOFF = 73,	/* Min. number of digits before Karatsuba multiplication is used. */ | ||||
|         KARATSUBA_SQR_CUTOFF = 121,	/* Min. number of digits before Karatsuba squaring is used. */ | ||||
|         MONTGOMERY_EXPT_CUTOFF = 128;	/* max. number of digits that montgomery reductions will help for */ | ||||
| 
 | ||||
| 
 | ||||
|  | ||||
							
								
								
									
										13
									
								
								changes.txt
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							| @ -1,3 +1,16 @@ | ||||
| Mar 15th, 2003 | ||||
| v0.14  -- Tons of manual updates | ||||
|        -- cleaned up the directory | ||||
|        -- added MSVC makefiles | ||||
|        -- source changes [that I don't recall] | ||||
|        -- Fixed up the lshd/rshd code to use pointer aliasing | ||||
|        -- Fixed up the mul_2d and div_2d to not call rshd/lshd unless needed | ||||
|        -- Fixed up etc/tune.c a tad | ||||
|        -- fixed up demo/demo.c to output comma-delimited results of timing | ||||
|           also fixed up timing demo to use a finer granularity for various functions | ||||
|        -- fixed up demo/demo.c testing to pause during testing so my Duron won't catch on fire | ||||
|           [stays around 31-35C during testing :-)] | ||||
|         | ||||
| Feb 13th, 2003 | ||||
| v0.13  -- tons of minor speed-ups in low level add, sub, mul_2 and div_2 which propagate  | ||||
|           to other functions like mp_invmod, mp_div, etc... | ||||
|  | ||||
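The v0.14 changelog entries about lshd/rshd using pointer aliasing, and mul_2d/div_2d skipping the digit shift unless needed, can be illustrated with a rough sketch. This is not the library's code: it works on a plain digit array rather than an mp_int, and the names sketch_lshd, sketch_mul_2d, SKETCH_DIGIT_BIT and SKETCH_MASK are hypothetical, as is the 28-bit digit width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed digit width for this sketch only. */
#define SKETCH_DIGIT_BIT 28
#define SKETCH_MASK ((((uint32_t)1) << SKETCH_DIGIT_BIT) - 1)

/* Shift `used` digits left by `b` whole digits, walking two aliased
 * pointers down the array instead of indexing dp[x] on every step.
 * The caller guarantees room for used + b digits.
 * Returns the new digit count. */
size_t sketch_lshd(uint32_t *dp, size_t used, size_t b)
{
    uint32_t *top, *bottom;
    size_t x;

    if (b == 0 || used == 0) {
        return used;
    }
    top = dp + used + b;      /* one past the new most significant digit */
    bottom = dp + used;       /* one past the old most significant digit */
    for (x = 0; x < used; x++) {
        *--top = *--bottom;   /* copy from the high end down, so no digit is overwritten early */
    }
    while (top != dp) {
        *--top = 0;           /* zero the b vacated low digits */
    }
    return used + b;
}

/* Multiply by 2^b: the whole-digit shift runs only when b spans at least
 * one digit, and the bit shift only when a remainder is left over. */
size_t sketch_mul_2d(uint32_t *dp, size_t used, size_t cap, unsigned b)
{
    size_t digits = b / SKETCH_DIGIT_BIT;
    unsigned bits = b % SKETCH_DIGIT_BIT;

    assert(used + digits + 1 <= cap);
    if (digits != 0) {
        used = sketch_lshd(dp, used, digits);
    }
    if (bits != 0) {
        uint32_t carry = 0;
        size_t x;
        for (x = 0; x < used; x++) {
            uint32_t next = dp[x] >> (SKETCH_DIGIT_BIT - bits); /* bits leaving this digit */
            dp[x] = ((dp[x] << bits) | carry) & SKETCH_MASK;
            carry = next;
        }
        if (carry != 0) {
            dp[used++] = carry;
        }
    }
    return used;
}
```

Shifting by fewer than SKETCH_DIGIT_BIT bits never touches sketch_lshd, and shifting by an exact multiple of the digit width never enters the carry loop, which is the saving the changelog entry describes.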
							
								
								
									
										116
									
								
								demo/demo.c
									
									
									
									
									
								
							
							
						
						
									
							| @ -69,18 +69,32 @@ int mp_reduce_setup(mp_int *a, mp_int *b) | ||||
|    } | ||||
|    return mp_div(a, b, a, NULL); | ||||
| } | ||||
| 
 | ||||
| int mp_rand(mp_int *a, int c) | ||||
| { | ||||
|    long z = abs(rand()) & 65535; | ||||
|    mp_set(a, z?z:1); | ||||
|    while (c--) { | ||||
|       s_mp_lshd(a, 1); | ||||
|       mp_add_d(a, abs(rand()), a); | ||||
|    } | ||||
|    return MP_OKAY; | ||||
| } | ||||
| #endif | ||||
| 
 | ||||
|    char cmd[4096], buf[4096]; | ||||
| int main(void) | ||||
| { | ||||
|    mp_int a, b, c, d, e, f; | ||||
|    unsigned long expt_n, add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, inv_n; | ||||
|    unsigned long expt_n, add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, inv_n, | ||||
|                  div2_n, mul2_n; | ||||
|    unsigned rr; | ||||
|    int cnt; | ||||
| 
 | ||||
| #ifdef TIMER | ||||
|    int n; | ||||
|    ulong64 tt; | ||||
|    FILE *log; | ||||
| #endif | ||||
| 
 | ||||
|    mp_init(&a); | ||||
| @ -90,60 +104,66 @@ int main(void) | ||||
|    mp_init(&e); | ||||
|    mp_init(&f); | ||||
| 
 | ||||
| 
 | ||||
| #ifdef TIMER | ||||
| goto multtime; | ||||
| 
 | ||||
|       printf("CLOCKS_PER_SEC == %lu\n", CLOCKS_PER_SEC); | ||||
|       mp_read_radix(&a, "340282366920938463463374607431768211455", 10); | ||||
|       mp_read_radix(&b, "340282366920938463463574607431768211455", 10); | ||||
|       while (a.used * DIGIT_BIT < 8192) { | ||||
| goto expttime;       | ||||
| 
 | ||||
|       log = fopen("add.log", "w"); | ||||
|       for (cnt = 4; cnt <= 128; cnt += 4) { | ||||
|          mp_rand(&a, cnt); | ||||
|          mp_rand(&b, cnt); | ||||
|          reset(); | ||||
|          for (rr = 0; rr < 10000000; rr++) { | ||||
|              mp_add(&a, &b, &c); | ||||
|          } | ||||
|          tt = rdtsc(); | ||||
|          printf("Adding\t\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt); | ||||
|          mp_sqr(&a, &a); | ||||
|          mp_sqr(&b, &b); | ||||
|          fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt); | ||||
|       } | ||||
|       fclose(log); | ||||
|   | ||||
|       mp_read_radix(&a, "340282366920938463463374607431768211455", 10); | ||||
|       mp_read_radix(&b, "340282366920938463463574607431768211455", 10); | ||||
|       while (a.used * DIGIT_BIT < 8192) { | ||||
|       log = fopen("sub.log", "w"); | ||||
|       for (cnt = 4; cnt <= 128; cnt += 4) { | ||||
|          mp_rand(&a, cnt); | ||||
|          mp_rand(&b, cnt); | ||||
|          reset(); | ||||
|          for (rr = 0; rr < 10000000; rr++) { | ||||
|              mp_sub(&a, &b, &c); | ||||
|          } | ||||
|          tt = rdtsc(); | ||||
|          printf("Subtracting\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt); | ||||
|          mp_sqr(&a, &a); | ||||
|          mp_sqr(&b, &b); | ||||
|          printf("Subtracting\t\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt); | ||||
|          fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt); | ||||
|       } | ||||
|       fclose(log); | ||||
|        | ||||
| multtime:       | ||||
| 
 | ||||
|    mp_read_radix(&a, "340282366920938463463374607431768211455", 10); | ||||
|    while (a.used * DIGIT_BIT < 8192) { | ||||
|    log = fopen("sqr.log", "w"); | ||||
|    for (cnt = 4; cnt <= 128; cnt += 4) { | ||||
|       mp_rand(&a, cnt); | ||||
|       reset(); | ||||
|       for (rr = 0; rr < 250000; rr++) { | ||||
|           mp_sqr(&a, &b); | ||||
|       } | ||||
|       tt = rdtsc(); | ||||
|       printf("Squaring\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt); | ||||
|       mp_copy(&b, &a); | ||||
|       fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt); | ||||
|    } | ||||
|    fclose(log); | ||||
|     | ||||
|    mp_read_radix(&a, "340282366920938463463374607431768211455", 10); | ||||
|    while (a.used * DIGIT_BIT < 8192) { | ||||
|    log = fopen("mult.log", "w"); | ||||
|    for (cnt = 4; cnt <= 128; cnt += 4) { | ||||
|       mp_rand(&a, cnt); | ||||
|       mp_rand(&b, cnt); | ||||
|       reset(); | ||||
|       for (rr = 0; rr < 250000; rr++) { | ||||
|           mp_mul(&a, &a, &b); | ||||
|           mp_mul(&a, &b, &c); | ||||
|       } | ||||
|       tt = rdtsc(); | ||||
|       printf("Multiplying\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt); | ||||
|       mp_copy(&b, &a); | ||||
|       fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt); | ||||
|    } | ||||
|    fclose(log); | ||||
| 
 | ||||
| expttime:   | ||||
|    { | ||||
| @ -157,6 +177,7 @@ expttime: | ||||
|          "1214855636816562637502584060163403830270705000634713483015101384881871978446801224798536155406895823305035467591632531067547890948695117172076954220727075688048751022421198712032848890056357845974246560748347918630050853933697792254955890439720297560693579400297062396904306270145886830719309296352765295712183040773146419022875165382778007040109957609739589875590885701126197906063620133954893216612678838507540777138437797705602453719559017633986486649523611975865005712371194067612263330335590526176087004421363598470302731349138773205901447704682181517904064735636518462452242791676541725292378925568296858010151852326316777511935037531017413910506921922450666933202278489024521263798482237150056835746454842662048692127173834433089016107854491097456725016327709663199738238442164843147132789153725513257167915555162094970853584447993125488607696008169807374736711297007473812256272245489405898470297178738029484459690836250560495461579533254473316340608217876781986188705928270735695752830825527963838355419762516246028680280988020401914551825487349990306976304093109384451438813251211051597392127491464898797406789175453067960072008590614886532333015881171367104445044718144312416815712216611576221546455968770801413440778423979", | ||||
|          NULL          | ||||
|       }; | ||||
|    log = fopen("expt.log", "w"); | ||||
|    for (n = 0; primes[n]; n++) { | ||||
|       mp_read_radix(&a, primes[n], 10); | ||||
|       mp_zero(&b); | ||||
| @ -183,12 +204,21 @@ expttime: | ||||
|          exit(0); | ||||
|       } | ||||
|       printf("Exponentiating\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt); | ||||
|       fprintf(log, "%d,%9llu\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt); | ||||
|    } | ||||
|    }    | ||||
| 
 | ||||
|    mp_read_radix(&a, "340282366920938463463374607431768211455", 10); | ||||
|    mp_read_radix(&b, "234892374891378913789237289378973232333", 10); | ||||
|    while (a.used * DIGIT_BIT < 8192) { | ||||
|    fclose(log); | ||||
| invtime: | ||||
|    log = fopen("invmod.log", "w"); | ||||
|    for (cnt = 4; cnt <= 128; cnt += 4) { | ||||
|       mp_rand(&a, cnt); | ||||
|       mp_rand(&b, cnt); | ||||
|        | ||||
|       do { | ||||
|          mp_add_d(&b, 1, &b); | ||||
|          mp_gcd(&a, &b, &c); | ||||
|       } while (mp_cmp_d(&c, 1) != MP_EQ); | ||||
|        | ||||
|       reset(); | ||||
|       for (rr = 0; rr < 10000; rr++) { | ||||
|           mp_invmod(&b, &a, &c); | ||||
| @ -200,16 +230,18 @@ expttime: | ||||
|          return 0; | ||||
|       } | ||||
|       printf("Inverting mod\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt); | ||||
|       mp_sqr(&a, &a); | ||||
|       mp_sqr(&b, &b); | ||||
|       fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt); | ||||
|    } | ||||
|    fclose(log); | ||||
|     | ||||
|    return 0; | ||||
|    | ||||
| #endif | ||||
| 
 | ||||
|    inv_n = expt_n = lcm_n = gcd_n = add_n = sub_n = mul_n = div_n = sqr_n = mul2d_n = div2d_n = 0;    | ||||
|    div2_n = mul2_n = inv_n = expt_n = lcm_n = gcd_n = add_n =  | ||||
|    sub_n = mul_n = div_n = sqr_n = mul2d_n = div2d_n = cnt = 0; | ||||
|    for (;;) { | ||||
|        if (!(++cnt & 15)) sleep(3); | ||||
|     | ||||
|        /* randomly clear and re-init one variable, this has the effect of trimming the alloc space */ | ||||
|        switch (abs(rand()) % 7) { | ||||
| @ -223,7 +255,7 @@ expttime: | ||||
|        } | ||||
|     | ||||
|     | ||||
|        printf("%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%5d\r", add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, expt_n, inv_n, _ifuncs); | ||||
|        printf("%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu ", add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, expt_n, inv_n, div2_n, mul2_n); | ||||
|        fgets(cmd, 4095, stdin); | ||||
|        cmd[strlen(cmd)-1] = 0; | ||||
|        printf("%s  ]\r",cmd); fflush(stdout); | ||||
| @ -386,7 +418,29 @@ draw(&a);draw(&b);draw(&c);draw(&d); | ||||
|                 return 0; | ||||
|              } | ||||
|                  | ||||
|        } | ||||
|        } else if (!strcmp(cmd, "div2")) { ++div2_n; | ||||
|              fgets(buf, 4095, stdin);  mp_read_radix(&a, buf, 10); | ||||
|              fgets(buf, 4095, stdin);  mp_read_radix(&b, buf, 10); | ||||
|              mp_div_2(&a, &c); | ||||
|              if (mp_cmp(&c, &b) != MP_EQ) { | ||||
|                  printf("div_2 %lu failure\n", div2_n); | ||||
|                  draw(&a); | ||||
|                  draw(&b); | ||||
|                  draw(&c); | ||||
|                  return 0; | ||||
|              } | ||||
|        } else if (!strcmp(cmd, "mul2")) { ++mul2_n; | ||||
|              fgets(buf, 4095, stdin);  mp_read_radix(&a, buf, 10); | ||||
|              fgets(buf, 4095, stdin);  mp_read_radix(&b, buf, 10); | ||||
|              mp_mul_2(&a, &c); | ||||
|              if (mp_cmp(&c, &b) != MP_EQ) { | ||||
|                  printf("mul_2 %lu failure\n", mul2_n); | ||||
|                  draw(&a); | ||||
|                  draw(&b); | ||||
|                  draw(&c); | ||||
|                  return 0; | ||||
|              } | ||||
|        }              | ||||
|         | ||||
|    } | ||||
|    return 0;    | ||||
|  | ||||
| @ -17,4 +17,4 @@ mersenne: mersenne.o | ||||
| 	$(CC) mersenne.o $(LIBNAME) -o mersenne | ||||
|          | ||||
| clean: | ||||
| 	rm -f *.o *.exe pprime tune mersenne  | ||||
| 	rm -f *.log *.o *.obj *.exe pprime tune mersenne  | ||||
							
								
								
									
										14
									
								
								etc/makefile.msvc
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							| @ -0,0 +1,14 @@ | ||||
| #MSVC Makefile | ||||
| # | ||||
| #Tom St Denis | ||||
| 
 | ||||
| CFLAGS = /I../ /Ogityb2 /Gs /DWIN32 /W3 | ||||
| 
 | ||||
| pprime: pprime.obj | ||||
| 	cl pprime.obj ../tommath.lib  | ||||
| 
 | ||||
| mersenne: mersenne.obj | ||||
| 	cl mersenne.obj ../tommath.lib | ||||
| 	 | ||||
| tune: tune.obj | ||||
| 	cl tune.obj ../tommath.lib	 | ||||
| @ -3,14 +3,14 @@ | ||||
|  * Tom St Denis, tomstdenis@iahu.ca | ||||
|  */ | ||||
| #include <time.h> | ||||
| #include <bn.h> | ||||
| #include <tommath.h> | ||||
| 
 | ||||
| int | ||||
| is_mersenne (long s, int *pp) | ||||
| { | ||||
|   mp_int    n, u, mu; | ||||
|   int       res, k; | ||||
|   long      ss; | ||||
|   mp_int  n, u, mu; | ||||
|   int     res, k; | ||||
|   long    ss; | ||||
| 
 | ||||
|   *pp = 0; | ||||
| 
 | ||||
| @ -85,7 +85,7 @@ __N:mp_clear (&n); | ||||
| long | ||||
| i_sqrt (long x) | ||||
| { | ||||
|   long      x1, x2; | ||||
|   long    x1, x2; | ||||
| 
 | ||||
|   x2 = 16; | ||||
|   do { | ||||
| @ -104,7 +104,7 @@ i_sqrt (long x) | ||||
| int | ||||
| isprime (long k) | ||||
| { | ||||
|   long      y, z; | ||||
|   long    y, z; | ||||
| 
 | ||||
|   y = i_sqrt (k); | ||||
|   for (z = 2; z <= y; z++) { | ||||
| @ -118,9 +118,9 @@ isprime (long k) | ||||
| int | ||||
| main (void) | ||||
| { | ||||
|   int       pp; | ||||
|   long      k; | ||||
|   clock_t   tt; | ||||
|   int     pp; | ||||
|   long    k; | ||||
|   clock_t tt; | ||||
| 
 | ||||
|   k = 3; | ||||
| 
 | ||||
|  | ||||
							
								
								
									
										20
									
								
								etc/pprime.c
									
									
									
									
									
								
							
							
						
						
									
							| @ -8,10 +8,10 @@ | ||||
| #include "tommath.h" | ||||
| 
 | ||||
| /* fast square root */ | ||||
| static    mp_digit | ||||
| static  mp_digit | ||||
| i_sqrt (mp_word x) | ||||
| { | ||||
|   mp_word   x1, x2; | ||||
|   mp_word x1, x2; | ||||
| 
 | ||||
|   x2 = x; | ||||
|   do { | ||||
| @ -28,10 +28,10 @@ i_sqrt (mp_word x) | ||||
| 
 | ||||
| 
 | ||||
| /* generates a prime digit */ | ||||
| static    mp_digit | ||||
| static  mp_digit | ||||
| prime_digit () | ||||
| { | ||||
|   mp_digit  r, x, y, next; | ||||
|   mp_digit r, x, y, next; | ||||
| 
 | ||||
|   /* make a DIGIT_BIT-bit random number */ | ||||
|   for (r = x = 0; x < DIGIT_BIT; x++) { | ||||
| @ -141,8 +141,8 @@ prime_digit () | ||||
| int | ||||
| pprime (int k, int li, mp_int * p, mp_int * q) | ||||
| { | ||||
|   mp_int    a, b, c, n, x, y, z, v; | ||||
|   int       res, ii; | ||||
|   mp_int  a, b, c, n, x, y, z, v; | ||||
|   int     res, ii; | ||||
|   static const mp_digit bases[] = { 2, 3, 5, 7, 11, 13, 17, 19 }; | ||||
| 
 | ||||
|   /* single digit ? */ | ||||
| @ -329,10 +329,10 @@ __C:mp_clear (&c); | ||||
| int | ||||
| main (void) | ||||
| { | ||||
|   mp_int    p, q; | ||||
|   char      buf[4096]; | ||||
|   int       k, li; | ||||
|   clock_t   t1; | ||||
|   mp_int  p, q; | ||||
|   char    buf[4096]; | ||||
|   int     k, li; | ||||
|   clock_t t1; | ||||
| 
 | ||||
|   srand (time (NULL)); | ||||
| 
 | ||||
|  | ||||
							
								
								
									
										100
									
								
								etc/tune.c
									
									
									
									
									
								
							
							
						
						
									
							| @ -8,19 +8,19 @@ | ||||
| clock_t | ||||
| time_mult (void) | ||||
| { | ||||
|   clock_t   t1; | ||||
|   int       x, y; | ||||
|   mp_int    a, b, c; | ||||
|   clock_t t1; | ||||
|   int     x, y; | ||||
|   mp_int  a, b, c; | ||||
| 
 | ||||
|   mp_init (&a); | ||||
|   mp_init (&b); | ||||
|   mp_init (&c); | ||||
| 
 | ||||
|   t1 = clock (); | ||||
|   for (x = 8; x <= 128; x += 8) { | ||||
|     for (y = 0; y < 1000; y++) { | ||||
|       mp_rand (&a, x); | ||||
|       mp_rand (&b, x); | ||||
|   for (x = 4; x <= 128; x += 4) { | ||||
|     mp_rand (&a, x); | ||||
|     mp_rand (&b, x); | ||||
|     for (y = 0; y < 10000; y++) { | ||||
|       mp_mul (&a, &b, &c); | ||||
|     } | ||||
|   } | ||||
| @ -33,17 +33,17 @@ time_mult (void) | ||||
| clock_t | ||||
| time_sqr (void) | ||||
| { | ||||
|   clock_t   t1; | ||||
|   int       x, y; | ||||
|   mp_int    a, b; | ||||
|   clock_t t1; | ||||
|   int     x, y; | ||||
|   mp_int  a, b; | ||||
| 
 | ||||
|   mp_init (&a); | ||||
|   mp_init (&b); | ||||
| 
 | ||||
|   t1 = clock (); | ||||
|   for (x = 8; x <= 128; x += 8) { | ||||
|     for (y = 0; y < 1000; y++) { | ||||
|       mp_rand (&a, x); | ||||
|   for (x = 4; x <= 128; x += 4) { | ||||
|     mp_rand (&a, x); | ||||
|     for (y = 0; y < 10000; y++) { | ||||
|       mp_sqr (&a, &b); | ||||
|     } | ||||
|   } | ||||
| @ -52,20 +52,54 @@ time_sqr (void) | ||||
|   return clock () - t1; | ||||
| } | ||||
| 
 | ||||
| clock_t | ||||
| time_expt (void) | ||||
| { | ||||
|   clock_t t1; | ||||
|   int     x, y; | ||||
|   mp_int  a, b, c, d; | ||||
| 
 | ||||
|   mp_init (&a); | ||||
|   mp_init (&b); | ||||
|   mp_init (&c); | ||||
|   mp_init (&d); | ||||
| 
 | ||||
|   t1 = clock (); | ||||
|   for (x = 4; x <= 128; x += 4) { | ||||
|     mp_rand (&a, x); | ||||
|     mp_rand (&b, x); | ||||
|     mp_rand (&c, x); | ||||
|     if (mp_iseven (&c) != 0) { | ||||
|       mp_add_d (&c, 1, &c); | ||||
|     } | ||||
|     for (y = 0; y < 10; y++) { | ||||
|       mp_exptmod (&a, &b, &c, &d); | ||||
|     } | ||||
|   } | ||||
|   mp_clear (&d); | ||||
|   mp_clear (&c); | ||||
|   mp_clear (&b); | ||||
|   mp_clear (&a); | ||||
| 
 | ||||
|   return clock () - t1; | ||||
| } | ||||
| 
 | ||||
| int | ||||
| main (void) | ||||
| { | ||||
|   int       best_mult, best_square; | ||||
|   clock_t   best, ti; | ||||
|   int     best_mult, best_square, best_exptmod; | ||||
|   clock_t best, ti; | ||||
|   FILE   *log; | ||||
| 
 | ||||
|   best_mult = best_square = 0; | ||||
|   best_mult = best_square = best_exptmod = 0; | ||||
| 
 | ||||
|   /* tune multiplication first */ | ||||
|   log = fopen ("mult.log", "w"); | ||||
|   best = CLOCKS_PER_SEC * 1000; | ||||
|   for (KARATSUBA_MUL_CUTOFF = 8; KARATSUBA_MUL_CUTOFF <= 128; | ||||
|        KARATSUBA_MUL_CUTOFF++) { | ||||
|   for (KARATSUBA_MUL_CUTOFF = 8; KARATSUBA_MUL_CUTOFF <= 128; KARATSUBA_MUL_CUTOFF++) { | ||||
|     ti = time_mult (); | ||||
|     printf ("%4d : %9lu\r", KARATSUBA_MUL_CUTOFF, ti); | ||||
|     fprintf (log, "%d, %lu\n", KARATSUBA_MUL_CUTOFF, ti); | ||||
|     fflush (stdout); | ||||
|     if (ti < best) { | ||||
|       printf ("New best: %lu, %d         \n", ti, KARATSUBA_MUL_CUTOFF); | ||||
| @ -73,13 +107,15 @@ main (void) | ||||
|       best_mult = KARATSUBA_MUL_CUTOFF; | ||||
|     } | ||||
|   } | ||||
|   fclose (log); | ||||
| 
 | ||||
|   /* tune squaring */ | ||||
|   log = fopen ("sqr.log", "w"); | ||||
|   best = CLOCKS_PER_SEC * 1000; | ||||
|   for (KARATSUBA_SQR_CUTOFF = 8; KARATSUBA_SQR_CUTOFF <= 128; | ||||
|        KARATSUBA_SQR_CUTOFF++) { | ||||
|   for (KARATSUBA_SQR_CUTOFF = 8; KARATSUBA_SQR_CUTOFF <= 128; KARATSUBA_SQR_CUTOFF++) { | ||||
|     ti = time_sqr (); | ||||
|     printf ("%4d : %9lu\r", KARATSUBA_SQR_CUTOFF, ti); | ||||
|     fprintf (log, "%d, %lu\n", KARATSUBA_SQR_CUTOFF, ti); | ||||
|     fflush (stdout); | ||||
|     if (ti < best) { | ||||
|       printf ("New best: %lu, %d         \n", ti, KARATSUBA_SQR_CUTOFF); | ||||
| @ -87,10 +123,30 @@ main (void) | ||||
|       best_square = KARATSUBA_SQR_CUTOFF; | ||||
|     } | ||||
|   } | ||||
|   fclose (log); | ||||
| 
 | ||||
|   /* tune exptmod */ | ||||
|   KARATSUBA_MUL_CUTOFF = best_mult; | ||||
|   KARATSUBA_SQR_CUTOFF = best_square; | ||||
| 
 | ||||
|   log = fopen ("expt.log", "w"); | ||||
|   best = CLOCKS_PER_SEC * 1000; | ||||
|   for (MONTGOMERY_EXPT_CUTOFF = 8; MONTGOMERY_EXPT_CUTOFF <= 192; MONTGOMERY_EXPT_CUTOFF++) { | ||||
|     ti = time_expt (); | ||||
|     printf ("%4d : %9lu\r", MONTGOMERY_EXPT_CUTOFF, ti); | ||||
|     fflush (stdout); | ||||
|     fprintf (log, "%d, %lu\n", MONTGOMERY_EXPT_CUTOFF, ti); | ||||
|     if (ti < best) { | ||||
|       printf ("New best: %lu, %d\n", ti, MONTGOMERY_EXPT_CUTOFF); | ||||
|       best = ti; | ||||
|       best_exptmod = MONTGOMERY_EXPT_CUTOFF; | ||||
|     } | ||||
|   } | ||||
|   fclose (log); | ||||
| 
 | ||||
|   printf | ||||
|     ("\n\n\nKaratsuba Multiplier Cutoff: %d\nKaratsuba Squaring Cutoff: %d\n", | ||||
|      best_mult, best_square); | ||||
|     ("\n\n\nKaratsuba Multiplier Cutoff: %d\nKaratsuba Squaring Cutoff: %d\nMontgomery exptmod Cutoff: %d\n", | ||||
|      best_mult, best_square, best_exptmod); | ||||
| 
 | ||||
|   return 0; | ||||
| } | ||||
|  | ||||
							
								
								
									
										4
									
								
								makefile
									
									
									
									
									
								
							
							
						
						
									
							| @ -1,6 +1,6 @@ | ||||
| CFLAGS  +=  -I./ -Wall -W -Wshadow -O3 -fomit-frame-pointer -funroll-loops | ||||
| 
 | ||||
| VERSION=0.13 | ||||
| VERSION=0.14 | ||||
| 
 | ||||
| default: libtommath.a | ||||
| 
 | ||||
| @ -60,7 +60,7 @@ docs:	docdvi | ||||
| 	rm -f bn.log bn.aux bn.dvi | ||||
| 	 | ||||
| clean: | ||||
| 	rm -f *.pdf *.o *.a *.exe etclib/*.o demo/demo.o test ltmtest mpitest mtest/mtest mtest/mtest.exe \
 | ||||
| 	rm -f *.pdf *.o *.a *.obj *.lib *.exe etclib/*.o demo/demo.o test ltmtest mpitest mtest/mtest mtest/mtest.exe \
 | ||||
|         bn.log bn.aux bn.dvi *.log *.s mpi.c  | ||||
| 	cd etc ; make clean | ||||
| 
 | ||||
|  | ||||
							
								
								
									
										26
									
								
								makefile.msvc
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							| @ -0,0 +1,26 @@ | ||||
| #MSVC Makefile | ||||
| # | ||||
| #Tom St Denis | ||||
| 
 | ||||
| CFLAGS = /I. /Ogityb2 /Gs /DWIN32 /W3 | ||||
| 
 | ||||
| default: library | ||||
| 
 | ||||
| OBJECTS=bncore.obj bn_mp_init.obj bn_mp_clear.obj bn_mp_exch.obj bn_mp_grow.obj bn_mp_shrink.obj \ | ||||
| bn_mp_clamp.obj bn_mp_zero.obj  bn_mp_set.obj bn_mp_set_int.obj bn_mp_init_size.obj bn_mp_copy.obj \ | ||||
| bn_mp_init_copy.obj bn_mp_abs.obj bn_mp_neg.obj bn_mp_cmp_mag.obj bn_mp_cmp.obj bn_mp_cmp_d.obj \ | ||||
| bn_mp_rshd.obj bn_mp_lshd.obj bn_mp_mod_2d.obj bn_mp_div_2d.obj bn_mp_mul_2d.obj bn_mp_div_2.obj \ | ||||
| bn_mp_mul_2.obj bn_s_mp_add.obj bn_s_mp_sub.obj bn_fast_s_mp_mul_digs.obj bn_s_mp_mul_digs.obj \ | ||||
| bn_fast_s_mp_mul_high_digs.obj bn_s_mp_mul_high_digs.obj bn_fast_s_mp_sqr.obj bn_s_mp_sqr.obj \ | ||||
| bn_mp_add.obj bn_mp_sub.obj bn_mp_karatsuba_mul.obj bn_mp_mul.obj bn_mp_karatsuba_sqr.obj \ | ||||
| bn_mp_sqr.obj bn_mp_div.obj bn_mp_mod.obj bn_mp_add_d.obj bn_mp_sub_d.obj bn_mp_mul_d.obj \ | ||||
| bn_mp_div_d.obj bn_mp_mod_d.obj bn_mp_expt_d.obj bn_mp_addmod.obj bn_mp_submod.obj \ | ||||
| bn_mp_mulmod.obj bn_mp_sqrmod.obj bn_mp_gcd.obj bn_mp_lcm.obj bn_fast_mp_invmod.obj bn_mp_invmod.obj \ | ||||
| bn_mp_reduce.obj bn_mp_montgomery_setup.obj bn_fast_mp_montgomery_reduce.obj bn_mp_montgomery_reduce.obj \ | ||||
| bn_mp_exptmod_fast.obj bn_mp_exptmod.obj bn_mp_2expt.obj bn_mp_n_root.obj bn_mp_jacobi.obj bn_reverse.obj \ | ||||
| bn_mp_count_bits.obj bn_mp_read_unsigned_bin.obj bn_mp_read_signed_bin.obj bn_mp_to_unsigned_bin.obj \ | ||||
| bn_mp_to_signed_bin.obj bn_mp_unsigned_bin_size.obj bn_mp_signed_bin_size.obj bn_radix.obj \ | ||||
| bn_mp_xor.obj bn_mp_and.obj bn_mp_or.obj bn_mp_rand.obj bn_mp_montgomery_calc_normalization.obj | ||||
| 
 | ||||
| library: $(OBJECTS) | ||||
| 	lib /out:tommath.lib $(OBJECTS) | ||||
| @ -41,7 +41,7 @@ void rand_num(mp_int *a) | ||||
|    unsigned char buf[512]; | ||||
| 
 | ||||
| top: | ||||
|    size = 1 + ((fgetc(rng)*fgetc(rng)) % 96); | ||||
|    size = 1 + ((fgetc(rng)*fgetc(rng)) % 512); | ||||
|    buf[0] = (fgetc(rng)&1)?1:0; | ||||
|    fread(buf+1, 1, size, rng); | ||||
|    for (n = 0; n < size; n++) { | ||||
| @ -57,7 +57,7 @@ void rand_num2(mp_int *a) | ||||
|    unsigned char buf[512]; | ||||
| 
 | ||||
| top: | ||||
|    size = 1 + ((fgetc(rng)*fgetc(rng)) % 96); | ||||
|    size = 1 + ((fgetc(rng)*fgetc(rng)) % 512); | ||||
|    buf[0] = (fgetc(rng)&1)?1:0; | ||||
|    fread(buf+1, 1, size, rng); | ||||
|    for (n = 0; n < size; n++) { | ||||
| @ -72,6 +72,8 @@ int main(void) | ||||
|    int n; | ||||
|    mp_int a, b, c, d, e; | ||||
|    char buf[4096]; | ||||
|     | ||||
|    static int tests[] = { 11, 12 }; | ||||
| 
 | ||||
|    mp_init(&a); | ||||
|    mp_init(&b); | ||||
| @ -89,7 +91,7 @@ int main(void) | ||||
|    } | ||||
| 
 | ||||
|    for (;;) { | ||||
|        n = 4; // fgetc(rng) % 11;
 | ||||
|        n =  fgetc(rng) % 13; | ||||
| 
 | ||||
|    if (n == 0) { | ||||
|        /* add tests */ | ||||
| @ -235,7 +237,24 @@ int main(void) | ||||
|       printf("%s\n", buf);       | ||||
|       mp_todecimal(&c, buf); | ||||
|       printf("%s\n", buf);       | ||||
|    }  | ||||
|    } else if (n == 11) { | ||||
|       rand_num(&a); | ||||
|       mp_mul_2(&a, &a); | ||||
|       mp_div_2(&a, &b); | ||||
|       printf("div2\n"); | ||||
|       mp_todecimal(&a, buf); | ||||
|       printf("%s\n", buf);       | ||||
|       mp_todecimal(&b, buf); | ||||
|       printf("%s\n", buf); | ||||
|    } else if (n == 12) { | ||||
|       rand_num2(&a); | ||||
|       mp_mul_2(&a, &b); | ||||
|       printf("mul2\n"); | ||||
|       mp_todecimal(&a, buf); | ||||
|       printf("%s\n", buf);       | ||||
|       mp_todecimal(&b, buf); | ||||
|       printf("%s\n", buf); | ||||
|    } | ||||
|    } | ||||
|    fclose(rng); | ||||
|    return 0; | ||||
|  | ||||
							
								
								
									
										36
									
								
								timings.txt
									
									
									
									
									
								
							
							
						
						
									
							| @ -1,36 +0,0 @@ | ||||
| CLOCKS_PER_SEC == 1000 | ||||
| Adding           128-bit =>  14534883/sec,       688 ticks | ||||
| Adding           256-bit =>  11037527/sec,       906 ticks | ||||
| Adding           512-bit =>   8650519/sec,      1156 ticks | ||||
| Adding          1024-bit =>   5871990/sec,      1703 ticks | ||||
| Adding          2048-bit =>   3575259/sec,      2797 ticks | ||||
| Adding          4096-bit =>   2018978/sec,      4953 ticks | ||||
| Subtracting      128-bit =>  11025358/sec,       907 ticks | ||||
| Subtracting      256-bit =>   9149130/sec,      1093 ticks | ||||
| Subtracting      512-bit =>   7440476/sec,      1344 ticks | ||||
| Subtracting     1024-bit =>   5078720/sec,      1969 ticks | ||||
| Subtracting     2048-bit =>   3168567/sec,      3156 ticks | ||||
| Subtracting     4096-bit =>   1833852/sec,      5453 ticks | ||||
| Squaring         128-bit =>   3205128/sec,        78 ticks | ||||
| Squaring         256-bit =>   1592356/sec,       157 ticks | ||||
| Squaring         512-bit =>    696378/sec,       359 ticks | ||||
| Squaring        1024-bit =>    266808/sec,       937 ticks | ||||
| Squaring        2048-bit =>     85999/sec,      2907 ticks | ||||
| Squaring        4096-bit =>     21949/sec,     11390 ticks | ||||
| Multiplying      128-bit =>   3205128/sec,        78 ticks | ||||
| Multiplying      256-bit =>   1592356/sec,       157 ticks | ||||
| Multiplying      512-bit =>    615763/sec,       406 ticks | ||||
| Multiplying     1024-bit =>    192752/sec,      1297 ticks | ||||
| Multiplying     2048-bit =>     53510/sec,      4672 ticks | ||||
| Multiplying     4096-bit =>     14801/sec,     16890 ticks | ||||
| Exponentiating   513-bit =>       531/sec,        47 ticks | ||||
| Exponentiating   769-bit =>       177/sec,       141 ticks | ||||
| Exponentiating  1025-bit =>        88/sec,       282 ticks | ||||
| Exponentiating  2049-bit =>        13/sec,      1890 ticks | ||||
| Exponentiating  2561-bit =>         6/sec,      3812 ticks | ||||
| Exponentiating  3073-bit =>         4/sec,      6031 ticks | ||||
| Exponentiating  4097-bit =>         1/sec,     12843 ticks | ||||
| Inverting mod    128-bit =>     19160/sec,      5219 ticks | ||||
| Inverting mod    256-bit =>      8290/sec,     12062 ticks | ||||
| Inverting mod    512-bit =>      3565/sec,     28047 ticks | ||||
| Inverting mod   1024-bit =>      1305/sec,     76594 ticks | ||||
							
								
								
									
										36
									
								
								timings2.txt
									
									
									
									
									
								
							
							
						
						
									
							| @ -1,36 +0,0 @@ | ||||
| CLOCKS_PER_SEC == 1000 | ||||
| Adding           128-bit =>  15600624/sec,       641 ticks | ||||
| Adding           256-bit =>  12804097/sec,       781 ticks | ||||
| Adding           512-bit =>  10000000/sec,      1000 ticks | ||||
| Adding          1024-bit =>   7032348/sec,      1422 ticks | ||||
| Adding          2048-bit =>   4076640/sec,      2453 ticks | ||||
| Adding          4096-bit =>   2424242/sec,      4125 ticks | ||||
| Subtracting      128-bit =>  10845986/sec,       922 ticks | ||||
| Subtracting      256-bit =>   9416195/sec,      1062 ticks | ||||
| Subtracting      512-bit =>   7710100/sec,      1297 ticks | ||||
| Subtracting     1024-bit =>   5159958/sec,      1938 ticks | ||||
| Subtracting     2048-bit =>   3299241/sec,      3031 ticks | ||||
| Subtracting     4096-bit =>   1987676/sec,      5031 ticks | ||||
| Squaring         128-bit =>   3205128/sec,        78 ticks | ||||
| Squaring         256-bit =>   1592356/sec,       157 ticks | ||||
| Squaring         512-bit =>    696378/sec,       359 ticks | ||||
| Squaring        1024-bit =>    266524/sec,       938 ticks | ||||
| Squaring        2048-bit =>     86505/sec,      2890 ticks | ||||
| Squaring        4096-bit =>     22471/sec,     11125 ticks | ||||
| Multiplying      128-bit =>   3205128/sec,        78 ticks | ||||
| Multiplying      256-bit =>   1592356/sec,       157 ticks | ||||
| Multiplying      512-bit =>    615763/sec,       406 ticks | ||||
| Multiplying     1024-bit =>    190548/sec,      1312 ticks | ||||
| Multiplying     2048-bit =>     54418/sec,      4594 ticks | ||||
| Multiplying     4096-bit =>     14897/sec,     16781 ticks | ||||
| Exponentiating   513-bit =>       531/sec,        47 ticks | ||||
| Exponentiating   769-bit =>       177/sec,       141 ticks | ||||
| Exponentiating  1025-bit =>        84/sec,       297 ticks | ||||
| Exponentiating  2049-bit =>        13/sec,      1875 ticks | ||||
| Exponentiating  2561-bit =>         6/sec,      3766 ticks | ||||
| Exponentiating  3073-bit =>         4/sec,      6000 ticks | ||||
| Exponentiating  4097-bit =>         1/sec,     12750 ticks | ||||
| Inverting mod    128-bit =>     17301/sec,       578 ticks | ||||
| Inverting mod    256-bit =>      8103/sec,      1234 ticks | ||||
| Inverting mod    512-bit =>      3422/sec,      2922 ticks | ||||
| Inverting mod   1024-bit =>      1330/sec,      7516 ticks | ||||
| @ -1,5 +0,0 @@ | ||||
| Exponentiating   513-bit =>       531/sec,        94 ticks | ||||
| Exponentiating   769-bit =>       187/sec,       266 ticks | ||||
| Exponentiating  1025-bit =>        88/sec,       562 ticks | ||||
| Exponentiating  2049-bit =>        13/sec,      3719 ticks | ||||
| 
 | ||||