Sunday, January 24, 2010

Pollard rho method of factorization: iterated functions and factorization

The Pollard rho method of factorization is a heuristic method based on the idea of iterated functions. Say you take some particular function f(x) which maps the integers in the interval [0, n-1] to themselves. Then for any particular starting number x0, the sequence x0, f(x0), f(f(x0)), ... must start repeating itself within n+1 iterations of the function, since there are only n possible values.

Of course not all numbers in the interval need belong to a cycle: if the function maps several numbers to a single number, then there must be numbers that do not belong to any cycle. Starting the iteration at one of these numbers produces a string of values, the first few of which are never repeated, followed by an infinite loop of values that form a cycle. This "shape" of the graph of such an iterated function is what gives the method its name, since the Greek letter rho is likewise made up of a short tail leading into a loop.

It is certainly not obvious how one could hope to use such an iterated function to factor a number. The motivation lies with the birthday paradox: if we pick a random collection of about sqrt(n) numbers, then the probability that two of them are congruent mod n is roughly 50%. So if we think of our iterated function as a pseudo-random function, then after about sqrt(n) iterations we should have around a 50% chance of having produced a value that we have produced before.
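As a quick sanity check on that birthday-paradox reasoning, here is a small Python experiment of my own (not part of the algorithm itself) that builds a random function on [0, n) lazily and counts how many iterations pass before a value repeats; the average comes out on the order of sqrt(n).

    import random

    def steps_to_repeat(n, seed):
        """Iterate a randomly generated function on [0, n) and count the
        number of steps taken before some value is seen a second time."""
        rng = random.Random(seed)
        table = {}                      # the random function, built lazily
        seen = set()
        x = rng.randrange(n)
        steps = 0
        while x not in seen:
            seen.add(x)
            if x not in table:
                table[x] = rng.randrange(n)
            x = table[x]
            steps += 1
        return steps

    n = 10**6
    trials = [steps_to_repeat(n, seed) for seed in range(200)]
    print(sum(trials) / len(trials))    # on the order of sqrt(n) = 1000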

Finally the idea of how we can use iterated functions to factor numbers begins to form. The key point is that if we consider the sequence mod p, then after about sqrt(p) iterations we can expect to have found two numbers which are congruent to each other mod p; call this a collision mod p. Of course if n is the number we are attempting to factor, then n = p*q with p a prime factor of n, and while we do not have access to p we do have access to n. So instead of analyzing the sequence mod p we compute it mod n. Since n is a multiple of p, reducing mod n does not change the values of the sequence mod p. So although we can expect to find a cycle mod n only after around sqrt(n) steps, more importantly we can expect a collision (and therefore a cycle) mod p after about sqrt(p) steps.

How to detect such a collision mod p takes a bit of thought, since we do not have the value of p and therefore cannot simply check for such collisions directly. Say there has been a collision mod p in our sequence, and let the two terms which are equal mod p be a and b. Since a = b (mod p), a - b = 0 (mod p) and therefore a - b = kp for some integer k. Let g(i) be the i'th iterate of the function f acting on x0. If we take the greatest common divisor of a difference g(i) - g(j) and n, then in the case of a collision mod p we recover a factor of n which, if we are lucky, will not be equal to n itself.

There is one last piece to the puzzle which needs to be put into place before this is of any use however. If we were simply to check the value of gcd(g(i) - g(j), n) for all pairs i < j, the number of gcd computations would grow quadratically and we would throw away the advantage of finding a collision in only about sqrt(p) steps. The usual fix is Floyd's cycle-finding trick: only compare g(i) against g(2i). Call the length of the tail of the rho T and the length of the cycle L, and let i be a multiple of the cycle length, i = kL, with kL > T. When this happens g(i) will be the (kL - T)'th element of the cycle and g(2i) will be the (2kL - T)'th element of the cycle, meaning they differ by kL elements, a multiple of L, and therefore must correspond to the same element in the cycle.

At this point we have sufficiently analyzed the algorithm that, if we readily had access to "random" functions from the interval [0, n-1] to itself, we could expect to produce a factor of n = p*q, with p the least prime factor of n, in O(sqrt(p)) operations. Since p <= sqrt(n), this is at most about O(n^(1/4)) operations. Of course this is only an expected running time: the probability that a particular run has still not produced a factor after say 100*n^(1/4) steps is greater than 0, but performing hundreds or thousands of such restarts can push this probability as low as you like.

That said, it should be obvious that we do not have access to such suitably random functions. Generating such a function in its entirety would take O(n) time, which is much greater than the O(sqrt(n)) time required for trial division. Because of this, in practice the arbitrary random function is replaced with some fixed function which is assumed to behave sufficiently "randomly". An often-made and at least somewhat good choice is f(x) = x^2 + 1 (mod n). This function is cheap to compute, gives good mixing, and in practice produces good results. Unfortunately there is no proof that x^2 + 1 actually does operate as advertised, so with a particular choice of function the running-time claims I just made should be viewed with some scrutiny. For instance if the function is f(x) = x + a (mod n) then the iteration is very clearly not "sufficiently random", and in that case one expects a collision only after O(n) steps instead of O(sqrt(n)). With f(x) = x^2 + 1 one expects much better behavior, though still probably not quite as good as a truly random function, and the actual performance is not known rigorously. Thus this factorization method remains purely heuristic, but ultimately that is of little importance in practice: a factorization is just as valuable if it is found by a heuristic method as if it is found by more rigorous means.
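Putting the pieces together, here is a minimal sketch of the whole method in Python (my own illustration, using f(x) = x^2 + c mod n and Floyd's comparison of g(i) with g(2i) as described above):

    from math import gcd

    def pollard_rho(n, x0=2, c=1):
        """Pollard's rho with Floyd cycle detection and f(x) = x^2 + c mod n.
        Returns a nontrivial factor of n, or None if this particular run fails
        (in which case one would retry with a different x0 or c)."""
        if n % 2 == 0:
            return 2
        x, y = x0, x0               # x advances one step, y two steps per loop
        while True:
            x = (x * x + c) % n
            y = (y * y + c) % n
            y = (y * y + c) % n
            d = gcd(abs(x - y), n)
            if d == n:              # collision mod n itself: an unlucky run
                return None
            if d > 1:               # collision mod p only: d is a proper factor
                return d

    print(pollard_rho(8051))        # prints a nontrivial factor of 8051 = 83 * 97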

The real value of the Pollard rho method, however, is its simplicity, intrinsic interest, and elegance. There are other factorization algorithms which have been analyzed far more thoroughly and are known to be faster (for instance the number field sieve, which I am working up to). In practice one wonders how important it actually is that the iterated function really be "random". One could for instance use a higher order polynomial, but there is no reason to believe this would significantly improve the randomness, and since higher order polynomials take longer to compute, in practice the simple x^2 + a is very likely best.

One could conceivably generate an actually random function dynamically, by choosing the value at a particular argument the first time the function is called with that argument and then saving that value. Since this would require storing something like O(sqrt(n)*ln(n)) bits, the memory requirements could easily become unrealistic. For a number of say 20 digits we would expect to require about 20*10^10 bits of memory on average before a factor was found. For perspective, 1 gigabyte of memory is about 10^10 bits. A number with just 30 digits (a factorization task easy enough for my graphing calculator to handle) would require an average of 30*10^15 bits of memory, which is roughly 3,000,000 gigabytes. Clearly it is not realistic to randomly generate and store a function in this manner.
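For concreteness, a dynamically generated random function of this kind might look like the following in Python (my own sketch; the class name and details are made up for illustration):

    import random

    class LazyRandomFunction:
        """A random function on [0, n) generated on demand: the value at each
        argument is chosen uniformly at random the first time that argument is
        seen, then remembered so later calls are consistent.  The table grows by
        roughly log2(n) bits per distinct argument queried, which is where the
        memory cost comes from."""
        def __init__(self, n, seed=None):
            self.n = n
            self.rng = random.Random(seed)
            self.table = {}

        def __call__(self, x):
            if x not in self.table:
                self.table[x] = self.rng.randrange(self.n)
            return self.table[x]

    f = LazyRandomFunction(10**20)
    x = 7
    for _ in range(1000):
        x = f(x)
    print(len(f.table))             # number of values stored so far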

More interestingly, one might consider picking a random function by picking random coefficients in some set of basis functions. Obviously any complete basis of an n dimensional vector space must have n vectors in it, so if we want to be able to specify any possible function taking values at n discrete points we are not going to do better than a straight table of values. On the other hand, if we are willing to settle for sparsely populating the function space with functions sprinkled uniformly over it, then we are much more in luck.

For instance we could pick random Fourier coefficients in a Fourier basis and then interpret the function mod n. Based on how large a difference between functions we are willing to tolerate, we could choose an appropriate number of coefficients to approximate the functions sufficiently well. With careful analysis a Fourier basis could possibly be helpful, but since the basis functions are inherently periodic there is a very real danger of creating very structured behavior in the iteration of the function, which would interfere with the motivation for having picked a "random" function in the first place.

It would be intriguing to attempt something like this with one or another class of orthogonal polynomials. For instance one could use the Legendre polynomials and pick weights for the first few polynomials to generate a polynomial which could be used as the randomly chosen function. If the probability distribution of the size of these coefficients had an appropriately chosen decay rate then one could very effectively sprinkle functions uniformly over the function space. Of course uniformity of the distribution of the functions over the function space is no guarantee that the iteration of the function is in any sense "random". Also, ultimately this suffers from the same problem: the increased randomness is probably not worth the added time it takes to calculate the function.

So if we are going to spend a significant amount of time generating and/or calculating the function we use, then we had better expect some speed-up in the generation of collisions for doing it. For sufficiently large n the added randomness of higher order uniformly distributed polynomials would probably be worth the extra calculation, but the best we can hope for from truly randomly distributed functions is O(n^(1/4)), which, while not too shabby, perhaps we could improve upon.

For instance in the case of Fermat numbers it is known that the prime factors of the k'th Fermat number are of the form m*2^(k+2) + 1. With that in mind it seems reasonable to take advantage of that structure, which is easily accomplished by making your function f(x) = x^(2^(k+2)) + 1. Of course now your function is much more computationally intensive, but the speed-up is well worth it.
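A minimal sketch of what that modified iteration might look like (my own illustration, tried here on the small example F5, where it is expected to turn up the factor 641):

    from math import gcd

    def pollard_rho_fermat(k, x0=3, max_steps=10**6):
        """Pollard rho specialized to the k'th Fermat number F_k = 2^(2^k) + 1,
        iterating f(x) = x^(2^(k+2)) + 1 mod F_k (a sketch, not optimized)."""
        n = 2**(2**k) + 1
        e = 2**(k + 2)
        x = y = x0
        for _ in range(max_steps):
            x = (pow(x, e, n) + 1) % n
            y = (pow(y, e, n) + 1) % n
            y = (pow(y, e, n) + 1) % n
            d = gcd(abs(x - y), n)
            if 1 < d < n:
                return d
        return None

    print(pollard_rho_fermat(5))    # expected to print 641 (F5 = 4294967297 = 641 * 6700417)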

In general, what we want is that when we view the iterated sequence mod p we don't have to wait very long before there is a collision. This comes down to the desire that the average cycle length mod p be low for most primes. Of course such functions exist: for instance the function f(x) = x+1 if 2 | x and x-1 otherwise has a cycle length of 2 for every starting number, but it obviously fails the "sufficiently random" property. But while it is fairly easy to show that every iterated function over a finite set produces paths consisting of tails leading into cycles, what is not at all easy is to figure out, say, how long the average cycle is going to be.

I started writing this blog post around 11:00 this morning and it is now 11:00 at night, so I don't think I shall be figuring out any interesting new twists to this 35 year old algorithm before bed. Of course here you can cue the usual rampant speculation: perhaps certain properties of the Fourier spectrum of the function would reveal properties of the iterated function, or better yet perhaps there is some function basis for which functions with small iteration cycle lengths are sparse, etc. etc. etc.

Wednesday, January 20, 2010

Verdict is in, high school students are noisy

Although today was better than yesterday, the high school students that I am having to share my office with are extremely noisy. Besides their intrinsic loudness there is also the fact that they have power tools to play with. Fortunately the power tool fun is mostly in the lab adjacent to my office, but the walls are thin. Although I do feel slightly interested in what they are doing, really that only serves to make them all the more annoying.

Sad to say, but I am afraid to do my real work for fear that they would see it and comprehend it. For some reason I feel the need to project a sense that whatever it is that I am doing must be complex and arcane. I don't want to actually do my physics homework since they might look over my shoulder and discover that my physics is mostly just algebra, integrals, curls, divergences and the like (all definitely accessible at a high school level). For similar reasons I don't want to do anything else that might be sufficiently simple to be comprehended by those around me. But reading and understanding the complex stuff takes more concentration and work than I can easily muster in the midst of such mayhem. I only hope that the students won't stay so late that they will still be hanging around my office late at night. I walked to the library so that I could read my number theory books in peace, but hopefully these next six weeks just translate to the loss of the use of my office between the hours of 3 and 6 or so, not from 3:00 on.

Friday, January 15, 2010

My office is getting taken over by robots

As I was sitting in my office today I was informed that for the next 6 weeks I would be sharing the office with a bunch of high school students. The high school students are apparently working on making robots for some competition. They put a keybox next to the office door so that anyone who knows the combination to the box can get out the key and unlock the door. I suppose that on the bright side this means that if I lock myself out of my office during the first 6 weeks of the semester I won't have to go get a master key in order to get back in.

I am a little apprehensive about the presence of these students: it might be interesting to have them around, or it could be monumentally annoying. Since I have become something of an artificial intelligence fanatic over the past couple of years, though, I probably won't be able to keep myself from asking them about how they are coding up the AI. Perhaps they should be getting a little primer in Q-learning or kernel learning...

Wednesday, January 13, 2010

My Web Identity Equity

I just put "Tim Anderton" (my name) into google and hit search. A quick perusal of the results is interesting but not terribly heartening. Although results that actually have to do with me at least happen to be on the first page. Results 2, 7, and 8 on the first page were related to me. Not surprisingly the most prominent result was the course web-page for the class that I was a TA for last semester. The next results were similar 7 was a page showing past REU participants and 8 was my facebook page.

Sifting through the first hundred results drops the percentage of relevant pages from 30% down to 17%, which is still surprisingly high. But simply taking the percentage of items which are related to me is rather naive since it doesn't take into account the order of the results. Clearly if the 17 results actually related to me were all at the end of the 100 results then that would be a much weaker net presence than if they were the first 17. How exactly I should deal with the strength of the order dependence is unclear.

After some thought I have settled on a system for measuring this quality of prominence. For lack of a better name I shall call it Web Identity Equity. Of course this is not restricted to persons: if you enter the term "Thermal Physics" and then look through the results to see where the textbook of that name by Kittel and Kroemer falls, you could measure the identity equity of that book on the web, thus the name. Instead of having the Web Identity Equity (WIE) represented as just a single number I have decided it is better to treat it as a vector.

The first element of the vector should definitely be the percentage. Even though it doesn't take into account any sort of ordering it is easy to interpret and gives good information about the density of results.

After a little deliberation I decided the second element of the vector should be a weighted sum with weights of 1 over the search rank. So the first search result gets weight 1, the second result gets weight 1/2, the third weight 1/3, and so on. This has the advantage that early results are greatly favored over later results but there is not much difference between the importance of the 14th and 15th results. In order to make the number nicer to process and compare, the sum should be normalized. The sum of 1/n from 1 to N approaches ln(N) + gamma, where gamma is the Euler constant, so not caring to carry out the normalization exactly I will just divide the sum of the reciprocals of the relevant rankings by ln(N) + gamma and call it close enough. The error is very small for any appreciable number of results: the effect is about 2% for 10 results and already 0.1% for 100 results, so it wouldn't be noticeable anyway.

The 1/r weighting however falls off extremely slowly in the tail, as the fact that the series diverges attests. So the natural next step is to use the weighting 1/r^2 (r being the ranking of the result of course). This has the nice property that the sum of the ranking weights converges, for an infinite number of results, to pi^2/6. Since the tail end of the distribution is now given in some sense zero weight, this number represents a heavy favoring of early search results.

Finally, for the fourth number I feel that I need to take into account the fact that people will often not go to the next page of search results if they can help it, and results on the same page will often get similar perusal. So for the last number we take a weighted average of the density of relevant results on each page. Since moving from page to page is much less likely than just looking at the next result, the weights should have a strong bias toward the early result pages. But unlike for the earlier measures I feel that here it is appropriate to treat the number of pages visited as a Poisson distribution. Somewhat obviously the mean of the distribution should be greater than one, since you can't visit fewer than 1 search page and the probability of visiting more than 1 should be nonzero, but the mean number of search pages is definitely less than 2, and if we just let the mean be 1 the appropriate weighting is rather nicely just 1/r! (that is, 1 over r factorial).
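Here is a short Python sketch of how the whole vector might be computed (my own illustration; the example ranks at the bottom are made up, not my actual search results):

    from math import log, factorial, pi

    EULER_GAMMA = 0.5772156649015329

    def wie_vector(relevant_ranks, total_results, page_size=10):
        """Compute the four-component Web Identity Equity vector described above.
        relevant_ranks are the 1-based ranks of results actually about you."""
        n = total_results
        # 1. plain density of relevant results
        density = len(relevant_ranks) / n
        # 2. 1/r weighting, normalized by H_n ~ ln(n) + gamma
        harmonic = sum(1.0 / r for r in relevant_ranks) / (log(n) + EULER_GAMMA)
        # 3. 1/r^2 weighting, normalized by the full sum pi^2/6
        square = sum(1.0 / r**2 for r in relevant_ranks) / (pi**2 / 6)
        # 4. per-page densities averaged with weights proportional to 1/p!
        pages = range(1, n // page_size + 1)
        weights = {p: 1.0 / factorial(p) for p in pages}
        dens = {p: sum(1 for r in relevant_ranks if (r - 1) // page_size + 1 == p) / page_size
                for p in pages}
        paged = sum(weights[p] * dens[p] for p in pages) / sum(weights.values())
        return density, harmonic, square, paged

    # hypothetical example: 17 relevant hits scattered through 100 results (made-up ranks)
    print(wie_vector([2, 7, 8, 13, 21, 25, 34, 40, 47, 55, 61, 68, 72, 80, 86, 91, 99], 100))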

It is tempting to include an extremely rank dependent ranking system that models the search depth as a Poisson distributed parameter with some larger mean, but frankly it is too much of a pain to calculate and normalize for large numbers of results unless I was willing to write a script. I might do that in the future but for the moment this will have to do.

Now, getting back to the actual search results for my name: my Web Identity Equity vector using the first 100 Google results is

0.170, 0.212, 0.181, 0.252

I can't help but be somewhat amazed by the consistency in the different numbers. Apparently my web identity equity is about 20% no matter how you measure it.

Of course while fun this is only really interesting when you have other things to compare it with. This sort of thing would be perfect to write a little web script for so that you could do it for any random thing you felt like but alas I have no such skill.

Tuesday, January 12, 2010

Utility Access Only: or My New Office

Over the Christmas break I was moved out of my old comfy office, which was a converted storage room in the basement, into a room that is old lab space. I suppose this room does have the advantage (disadvantage?) that I actually get reception for my cell phone here. But rather ironically this room is probably the least accessible spot in the whole damn building. On top of the office's positional awkwardness, which I assure you is severe, my office doesn't appear at first glance to have a working door.

My office has two doors; one is clearly marked JFB 205 and the other is rather humorously marked "Utility Access". Judging from the exposed strut running down the middle of the room between the two doors I think it is plausible that once upon a time the utility access door actually was what it purports to be. At any rate, the door actually marked as though it led to an office instead of a janitorial closet is now completely inoperable. When the work crew came in to install the desks, for some reason they figured it would be best to keep the desks as close as possible to the door and as far away as possible from the far wall... I don't think they had quite worked out that the desks poke out into the room farther than the door is away from the wall. Thus if you try to access the room through the obvious entrance you find the door jams after opening about an inch and a half, giving the impression that perhaps the occupants have barricaded the door in order to defend against the imminent zombie apocalypse.

Fortunately for me I had tried to access the room before the desks were actually installed and so was aware that the apparent janitorial closet right next to the office was in fact a secret entryway. I hope the desk placement stupidity isn't corrected too quickly, since this will inevitably lead to other people becoming aware of how to enter the room.

Fermat Number Factorizations

Fermat once conjectured that all numbers of the form 2^(2^n) + 1 were prime. More or less this was based on the fact that the first few such numbers 3, 5, 17, 257, 65537 were all prime. This was as far as the primality of such numbers could be tested in Fermat's day, since the very next number in the sequence is 4294967297, which rather understandably Fermat did not recognize as being 641*6700417.

So far as we know the only prime Fermat numbers are in fact the first few numbers known to Fermat to be prime. Primality testing of the Fermat numbers, and more challengingly the factorization of Fermat numbers, has become something of a pastime in number theory, especially since computers got into the fray.

I should take a moment to talk about just how large the Fermat numbers really are. It is convenient to think of Fermat numbers as binary strings. In binary the Fermat numbers are 11, 101, 10001, 100000001, 10000000000000001, ... and so on: the n'th Fermat number is a 1, followed by 2^n - 1 zeroes, followed by another 1. Voila, a binary expansion of the n'th Fermat number. Rather obviously these numbers grow very very large very very quickly. Because of the difficulty of factoring large numbers and the phenomenal growth of these numbers, we have managed to completely factor only the Fermat numbers up through F11. This may not seem like much of an accomplishment, but consider that "unbreakable" RSA cryptography uses numbers of a few hundred digits. F11 has 2049 binary digits, which translates to over 600 decimal digits. One should keep in mind of course that RSA uses numbers which are worst case for factorization algorithms, so factoring a 600 digit RSA number would be much harder than factoring a 600 digit Fermat number.

F12 has 4097 binary digits (1234 decimal digits), and so even after dividing out all the known factors 114689, 26017793, 63766529, 190274191361, and 1256132134125569 we are still left with the incredible task of factoring a number of nearly 1200 decimal digits. Although finding these gargantuan factors of F12 is certainly a laudable feat, we have barely scratched the surface of the problem.
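A quick sanity check of those claims (my own few lines of Python, using the factors listed above):

    # Check that the factors listed above really do divide F12, and see
    # how large the remaining unfactored part is.
    F12 = 2**(2**12) + 1
    known = [114689, 26017793, 63766529, 190274191361, 1256132134125569]

    cofactor = F12
    for p in known:
        assert cofactor % p == 0
        cofactor //= p

    print(len(str(F12)))       # 1234 decimal digits
    print(len(str(cofactor)))  # the remaining composite part, a bit under 1200 digits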

Friday, January 8, 2010

Binary Budokan

This ad for Chrome cracks me up.

Thursday, January 7, 2010

The Time Machine

I just watched the most recent version of "The Time Machine". I have watched both the old black and white version and the one produced in the sixties. I thought those movies were all right, and of course it wouldn't seem right for me not to have seen them, being the time travel fanatic that I am. This most recent movie is much, much better than the other two though, with the reason being that it doesn't quite follow the story line set out by H.G. Wells. Now I think the original story line of The Time Machine is decent, but overall the plot is not what the story is about. The story is about time travel, not about Weena. I think I am going to have to buy this Time Machine movie. The time travel sequences are very pretty and the story is well done and even a bit original, although the Eloi and the Morlocks are still at the center of the whole thing.

I have always been something of a time travel fanatic. I have fond memories of Kip Thorne's "Black Holes and Time Warps". I remember checking the book out of the library in 2nd or 3rd grade and reading the first 40 pages or so. I had to check it out multiple times even to get that far. Mostly I would find some out of the way corner and sit musing and dreaming while holding the book instead of actually reading it. I would read a few lines or a paragraph or two until something was said that I didn't understand or that made me think of something else, and I would close the book with my finger in the pages at the point where I was reading. I remember learning scientific notation from the book, which kindly explained for non-technical readers that 6x10^100 meant a 6 with a hundred zeroes after it. I kept going back to that little passage and rereading it over and over again, because here was some tiny glimmer of the workings of real science, something of the mechanism that made physics and math really work. I admit that scientific notation is about as simple as you get, but to a second grade version of me it was something I could fully understand with a little thought, and that was exciting. If I had read it in an algebra book at school it would perhaps have been interesting, but no reason to daydream about unlocking the secrets of the universe.

Wednesday, January 6, 2010

Combine Ball

A couple of years ago a team I was on made a Half-Life 2 mod that was a kind of dodgeball game. You threw Combine balls around with the gravity gun. It was actually pretty fun. I figured it was time to post the video.

Course Evaluations are In

So I got my first course evaluations for the discussion sections I taught in the fall. Overall I took away the message that I really do need to be more careful about being clear, be more carefully prepared, and do more discussion and less working of problems. Of course I actually got more complaints to the effect that I wasn't doing enough homework problems than complaints that I wasn't doing the discussion well. But the complaints that I was not working enough homework problems were completely groundless, since I was in fact doing almost nothing but working homework problems. It requires much less preparation to show how a few homework problems are to be worked than to prepare a real discussion.

I think my first semester of teaching went less well than it might have, but I think it has made me a better teacher. I was a much better discussion leader towards the end of the semester than towards the beginning. Partly because my discussion sections were held at 7:30 and 8:35 in the morning, I lost a great deal of the people in my classes. Since there were more evaluations than there ultimately were students attending the discussions, I am pretty sure that some of the more negative reviews of my teaching were from people who came two or three times and then never again. I'm eager to see how the discussion sections I will be teaching in the spring evolve. Hopefully the classes won't go from not very full to barely there. Of course there will be some natural attrition, since a large number of people will decide that the discussion section is not worth their time no matter how good it is. Unfortunately I won't really be able to productively compare the sections I will be teaching in the spring to the ones I taught in the fall, since I will be teaching 2020 discussion sections instead of 2220 sections. Very probably the attrition rate in the algebra based physics (2000 series) is going to be higher than in the calculus based versions (2200 series).

Tuesday, January 5, 2010

Playing Poker for a Living: Who are you playing against?

Because poker is played against other players instead of against the house, you don't need to be phenomenally good at poker in order to have a positive expectation; you only need to be better than the people you are playing against. As the existence of casinos demonstrates, many people are willing to lose money on average in order to have a chance to win some money. The fact is that a lot of people find it fun to gamble, and even more people find it fun to fantasize about winning at gambling. So very good poker players simply end up taking money from players who are less good.

Online poker is the best way to try and make a living at poker. The cost of playing online poker is smaller than the cost of playing at a physical casino, and many players who do not have access to a casino are willing to play online. Online activity also isn't bound to just one physical place, so peak hours fall at different times for different players, giving more uniform behavior over time.

There are a large number of people who play poker occasionally for fun and a few people who play poker constantly, some for a living. I would be willing to bet that the distribution of the number of players who play with a certain frequency roughly follows a power law: N(f) = a*f^k for some choice of constants a and k, where N(f) is the number of players who play with frequency f. I expect this to hold true mostly for relatively small frequencies. Obviously there are many more low frequency players than high frequency players, which means that we expect k to be negative.

Let's take a moment here to consider what this sort of distribution would mean for who you are likely to be playing when you log in to a poker site. Higher frequency players play more often, but there are a lot more lower frequency players. If the power law distribution holds (and there is some reason to think that it would) then a player who plays with frequency f would be picked with probability density proportional to f*N(f), since the probability of a particular player being picked combines the probability of that player being on, which is f, with the number of players with that frequency. Normalizing this density we get

P(f) = (k+2)*f^(k+1).

Taking this a step further, we calculate the average frequency of play as

integral from 0 to 1 of f*P(f) df = (k+2)/(k+3) = f_average

This of course isn't very helpful unless we have some idea of the value of k. As we already said, k must be negative to make lower frequencies more common than higher frequencies. But this means that there is a singularity at f=0, and examination of the density we just came up with shows that if we want to keep the total number of players finite we need a k which is greater than -1.

For the limiting values of 0 and -1 we get an average play frequency of 2/3 and 1/2 respectively, with the middle values falling in between; for the middle of the road k = -1/2 we get 3/5. Rather obviously no player can be playing with frequency 1, and it is doubtful that the average player is playing 2/3 of their day, since that would mean they only take out 8 hours to sleep and that's it. Even the low estimate of 1/2 is too high. Qualitatively, though, the point is that high frequency players are on more often than low frequency players. If we instead take these frequencies to be fractions of the highest normal play frequency then the numbers make more sense. The highest realistic value for the top player frequency is probably about 1/2, which would correspond to playing 12 hours of poker every day. A regular 40 hour work week would correspond to a frequency of 5/21. Using this as a baseline I would say the highest tier of poker players probably have a playing frequency of around 1/3. This gives much more realistic figures for the average frequency of around 2/9, 1/5, and 1/6 for k values of 0, -1/2, and -1 respectively.
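As a quick numerical check on those figures, here is a small Monte Carlo sketch of my own (assuming the power-law density rescaled to a maximum frequency f_max; the function name is made up for illustration):

    import random

    def average_frequency(k, f_max=1.0, samples=200_000, seed=1):
        """Monte Carlo check of the claim that the average play frequency of the
        players you meet is f_max*(k+2)/(k+3) when N(f) ~ f^k on [0, f_max].
        Sampling uses the inverse CDF: P(f) ~ f^(k+1) has CDF (f/f_max)^(k+2)."""
        rng = random.Random(seed)
        draws = [f_max * rng.random() ** (1.0 / (k + 2)) for _ in range(samples)]
        return sum(draws) / samples

    for k in (0.0, -0.5, -1.0):
        print(k, average_frequency(k, f_max=1/3), (1/3) * (k + 2) / (k + 3))
    # the two columns should agree: roughly 2/9, 1/5, and 1/6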

At any rate this suggests that most people you play will be people who play a very significant fraction of their time. This however doesn't take into account the fact that people do not play at all times of the day evenly. Taking this into account will skew the distribution back towards low frequency players during peak times and towards high frequency players at off times.

Ludum Dare #16 Voting Results

Much to my surprise, after the voting happened for Ludum Dare 16 it turns out my game entry came in 12th overall!

Top 20 Entries

Considering that there were 121 total entries, that this was my first game making competition, and that quite a number of the people who enter these competitions code or even make games for a living, I'd say that is reason to be proud. Also, my game was the only game in the top 20 which didn't come with some sort of executable. Of course I would like to say that this was a major disadvantage, because it means many fewer people ran my game than might otherwise have. But I suppose that it may also be an advantage, since only people that really wanted to play my game would have run it.

I'm going to try and participate in the next LD too, and this time I will not have to relearn Python or learn pygame, and assuming I can find a few hours to detach it from my LD 16 entry I will have a rough game engine. This time I will spend a few hours beforehand making sure that I can make an executable.

Sunday, January 3, 2010

Stochastic Factorization Algorithms

There are a number of probabilistic algorithms for primality testing. Probabilistic primality tests are based on Fermat's little theorem and its relatives.

Fermat's little theorem

Basically, there are properties of prime numbers that can be tested for in a computationally efficient manner. Thanks to this sort of test we can probabilistically "recognize" prime numbers.
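For example, a bare-bones Fermat test might look like this (a quick Python sketch; not the stronger Miller-Rabin variant that is usually used in practice):

    import random

    def fermat_test(n, rounds=20):
        """Probabilistic primality test based on Fermat's little theorem:
        if n is prime then a^(n-1) = 1 (mod n) for every a not divisible by n.
        A composite n will usually fail this for a random a (Carmichael numbers
        being the notorious exception)."""
        if n < 4:
            return n in (2, 3)
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            if pow(a, n - 1, n) != 1:
                return False        # definitely composite
        return True                 # probably prime

    print(fermat_test(2**127 - 1))  # True: 2^127 - 1 is a Mersenne prime
    print(fermat_test(2**128 + 1))  # almost certainly False: F7 is composite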

Factorization on the other hand is a completely different story. Primality testing and factorization are both difficult problems, but their difficulty comes in completely different packages. In primality testing we are trying to distinguish between just two different possibilities, whereas in factorization there are a vast number of potential answers. Relatedly, it is extremely easy to check a factorization solution: we simply multiply our proposed prime factors together, and if we have found the correct factorization of our number then the product of the prime factors will be that number. On the other hand, if we are handed a number and told that it is prime, checking that solution to the primality testing problem is no easier than our original problem.

Since knowing the factorization of a prime number is equivalent to knowing that the number is prime, the factorization problem cannot be any easier than primality testing. But since we know of no way to make factorization run in polynomial time, while probabilistic primality testing is fast enough to be essentially instantaneous in practice... this is not much of a restriction. Obviously it would be nice if we could make good probabilistic factorization work. Since factorizations are easy to check, a correct factorization suggested by probabilistic methods would quickly become a known factorization.

What we want then is a way to look at a number and simply "recognize" its factors. For numbers written in our regular decimal system there are a few nice ways to recognize some of the factors of a number. For example if the last digit of a number is 0, 2, 4, 6, or 8 then one of the factors of the number is 2. Somewhat less trivially, if the sum of the digits of a number is divisible by three then the original number is divisible by three. A test that you might not be familiar with is one for 11: if you alternately add and subtract the digits of a number and the result is divisible by 11, then the number itself is divisible by 11.
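In Python these recognition tricks are one-liners (a quick sketch of my own):

    def divisible_by_3(n):
        """Digit-sum test: n is divisible by 3 iff the sum of its digits is."""
        return sum(int(d) for d in str(n)) % 3 == 0

    def divisible_by_11(n):
        """Alternating-sum test: n is divisible by 11 iff the alternating
        sum of its digits is."""
        digits = [int(d) for d in str(n)]
        alternating = sum(d if i % 2 == 0 else -d for i, d in enumerate(digits))
        return alternating % 11 == 0

    print(divisible_by_3(123456))   # True  (1+2+3+4+5+6 = 21)
    print(divisible_by_11(918082))  # True  (918082 = 11 * 83462)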

Although we could come up with tricks like these for every prime number, that wouldn't end up giving us a factorization procedure much more effective than just checking for factors by dividing by all the numbers less than the square root of the number we want to factor, since we would have to perform a separate recognition procedure for each divisor individually.

But this brings us to the question: what if we could probabilistically test for divisibility by whole classes of divisors at a time? There must be some way in which numbers containing different classes of primes "look" different. I don't have a nice rigorous mathematical way to look at a number and test whether or not it contains, say, a twin prime, or more usefully a 4n+1 prime, nor for that matter any general category of prime. But it is reasonable to believe that there are statistically recognizable properties of numbers containing primes of different types. I just spent a few hours coding and trying to find such properties but didn't find anything by hand.

Even so, the question remains unanswered as to whether or not it might be possible to write a program which could learn good heuristics for suggesting factorizations of numbers. Rather obviously this sort of tactic wouldn't be much good for factoring numbers which are used in cryptography, since those numbers are composed of two enormous primes and it is just not likely that any heuristic program, no matter how good, could realistically "recognize" the factors of such a number. Realistically, in fact, this sort of strategy probably only has a real chance of learning to quickly factor relatively small numbers, say on the order of 20 digits, which means that more normal sieve factorization methods will be more than capable of easily handling these same numbers. Despite the fact that nothing useful is likely to come out of it, this conjecture has taken my fancy. I've already put some effort into working it out and I am going to put some real mental muscle into this.

Friday, January 1, 2010

New Years Resolutions

New Year's resolutions are things that you come up with in order to inspire yourself to achieve them. Of course they also tend to be things that you ultimately decide not to go through with. With both of those characteristics in mind, here are my New Year's resolutions.


  1. Learn general relativity

  2. Get up earlier than I absolutely have to

  3. Come up with a working theory of fractional dimensional spaces

  4. Prove the Riemann hypothesis (preferably as a side effect of the theory of fractional dimensional spaces)

  5. Figure out how to collapse Jupiter into a black hole