An Experiment in Integer FFTs

As a hobby I'm putting together an integer Fast Fourier Transform, something that can be used to multiply huge numbers together. Progress is slow and painful, but it's actually kind of fun optimizing more and more.

How fast can it get? Well, anybody who says that code they've written is the fastest in history usually gets embarassed about ten minutes later when somebody one-ups them. Not much has been written yet on my part, mostly just some Pentium-optimized assembly and a simple driver routine. However, it seems that the forward radix-2 transform absolutely flies right now!

Here's the latest version. I'm a big fan of gcc, and regrettably that means the assembly language is in AT&T syntax. Go ahead and steal it, I don't care.

You also get, at no extra charge, floating point code that can perform modular multiplication four times faster than using integer instructions, and vectorized for use with the FFT.

UPDATES:

4/14/98:
Added decimation-in-time FFT. Cleaned up assembly routines, got rid of useless ones, renamed the rest. Also added high-performance transpose routines (I'm quite proud of them, they seem to be extremely fast). Together with the FFTs this is almost enough to make a fast integer multiply.

12/14/98:
The old stuff has led to production code! Carey Bloodworth's monster pi calculator needed a faster NTT for use with the Pentium, so I wrote one. This one includes normalization code that the previous ones did not, and hence is good for integers all the way up to 31 bits in size. Unfortunately, this slows things down by a third; however, as I say rather often, if the answer doesn't have to be right, arbitrary speed is possible. The older NTTs could only be used for primes 26 bits or less in size, and even then were limited to run lengths of 1k or so. This code has no such restriction.

PS: Look in the pi page I have for links to real experts on high-speed FFTs, Joerg Arndt and Mikko Tommila. Finally, these people have gone hopelessly insane writing high-performance FFTs, and they're the best at it.