Monday, October 18, 2010

Pollard Strassen Polynomial Evaluation Method of Factoring, and a Variation

I admit I was reading Prime Numbers by Pomerance again and I have taken a fancy to one of the factorization methods described in the book. The book only gives a single paragraph on the method and I thought it would be fun to talk about it here. The idea is essentially this, to factor a number n pick chop the numbers less than n up into equal sized blocks and multiply all the numbers in these blocks together mod n and then take the greatest common divisor of the product of these blocks of numbers with n. Once we hit a block of numbers which has a gcd > 1 then if the gcd < n then we have a proper factor of n or if not then we can just search through the numbers in that block one at a time, checking to find the first number with a gcd with n greater than 1, and wallah!

Since we are guaranteed at least one factor in the numbers less than n^1/2 we only need to worry about breaking up the numbers less than that up into blocks. Rather intuitively the best division of work would be to make the blocks all sized equally and so you naturally come to the decision to make the blocks of size B = n^1/4. In order to make the calculation of the products of these numbers as quick as possible we take the time to construct the polynomial p(x) = x * (x-1) * (x-2) * (x-3) * ... * (x - B + 1) and reduce its terms mod n. Once we have that polynomial we just have to check gcd(p(B*i), n) to see if there is a factor of n in the block of numbers [B*i, B(i-1)] until we find an i that it works for.

I rather like the algorithm it isn't quite as cool as the rho method of factorization but it is surprisingly elegant and effective and has the additional advantage that it is clearly guaranteed to work. Just for fun I want to propose an idea that is almost certain to be worthless as an effective means of factorization but nonetheless appeals to my sense of aesthetics.

These polynomials which we are using to quickly evaluate the product of chunks of numbers are using the fact that factorials are nice smooth numbers with a lot of factors. In Pomerance's book he motivates the polynomial evaluation method by talking about how if one could evaluate factorials easily then factorization could be achieved by a simple binary search since gcd(k!, n) > 1 if k is greater than the smallest prime factor of n and equal to 1 otherwise.

Being a physicist I rarely have to care about the exact integer value of a factorial and instead am perfectly alright with using Stirlings approximation to the factorial as though it were exact since for any numbers for which it would be arduous to calculate the factorial exactly the error in its approximation is the most minuscule fraction, far below the error that one could hope to measure in a physical experiment.

But of course if we are interested in the factorization of n! then if we are off by even 1 the factorization will be completely different. Clearly trying to directly apply the binary search idea utilizing Stirling series is useless. The fractional error may drop dramatically with size but the absolute error still grows exponentially. So trying to keep enough terms in the approximation to get within 1 of the value of the actual exponential is a losing game. We could simply try all the numbers "close" to the estimated number but besides the fact that we might have to try an uncomfortably large number of candidate numbers in order to be certain of having tried the actual factorial there is a more serious problem. If we were trying to factor a 10 digit number the size of the factorials involved would be millions of digits long which is just a tad bit of overkill for factoring a 10 digit number. So if we want to keep the operations manageable we should really properly evaluate the factorial mod n. But because the factorials are so much larger than n even a tiny difference percentage wise in the value of the factorial would correspond to many times n and so give answers which are well mixed on n.

From all of these considerations it would appear that we would be much much better off simply doing trial division or for that matter actually using the Pollard Strassen polynomial evaluation method to factor our number. But still one might potentially be able to use Stirlings approximation or something like it (perhaps an appropriate adaptation to keep coherence mod n) might work out in the end.... perhaps even make for an actually useful factorization algorithm... I doubt it though. Still it is fun to think about and I would consider it something of a triumph to design a factorization algorithm that used the stirling approximation that just wasn't worse than the sieve of Eratosthenes.


Anonymous said...

I be enduring infer from a few of the articles on your website in the present circumstances, and I really like your line of blogging. I added it to my favorites net page roster and last will and testament be checking back soon. Divert repress out of order my orientation as highly and let me know what you think. Thanks.

ma~ said...

That comment has got to be the most amusing spam I have ever gotten.