Date: Mon, 31 Jul 95 23:42:34 CDT From: rusin (Dave Rusin) To: vecsey@course1.harvard.edu Subject: Re: Multiplying HUGE Numbers Newsgroups: sci.math In article <3vj7ui$gqa@decaxp.harvard.edu> you write: >What's a simple algorithm to multiply numbers, i.e. with hundreds of >digits in them. > What's wrong with the one you learned in grade school? Two 100-digit numbers can be multiplied with 10000 single multiplies and then of course 10000 additions. Not at all a problem for today's chips -- especially since a 100-digit number is only about 40 8-bit words, so it's more like 1600 multiplies and adds (with some carrying of course). I wouldn't sweat it until the number of operations to track exceeded 2^16 (I dislike using long integers as my _counter_!) No, I understand the problem: you start worrying about numbers 1000 digits long, say for some security code or something; or you realize you have to carry out thousands of multiplies of 100-digit numbers just for some subroutine. You _can_ improve the amount of time and space needed for multiplies, but there is a lot of overhead (and a lot of programming headache). I tried to carry it all out once; I decided it wasn't worth the trouble to reinvent the wheel. You can, for example, pick up the BASIC dialect called UBASIC, which allows numbers many thousands of digits long. If it's critical I can probably locate some pointers I have to free code for this kind of thing. The best methods can multiply n digit numbers in something like O(n log(n)) time. This is better than the O(n^2) time of the grade-school method, and of course is more than the O(n) time it takes to read in and write out the answer. Before you leap into it let me say that the big-O constants get worse when you try to get a better asymptotic function, that is, for any method of a given theoretical speed there is a cutoff length of the numbers involved -- for any smaller numbers than that cutoff, there is no savings in speed with the whiz-bang method because of the overhead in conversion and so on. Here are a couple of methods I know of. In the first method (which is not optimal but is an improvement) you attempt to multiply (a N + b)(c N + d) (here N is a large power of your base -- 2 or 10 usually) faster than by providing the four-term expansion (a*c) N^2 + (b*c + a*d) N + b*d, which clearly takes four multiplications of the "small" numbers a b c d (each less than N say). You can do this because the middle term is (a+b)*(c+d) - (a*c) - (b*d), the last two terms of which are computed anyway. Thus you can get the coefficient of N with only one additional multiply -- as well as with 3 extra adds. So for example if N = 10^k, then we can multiply two (2k)-digit numbers in an amount of time close to 3*T, where T is the amount of time needed to multiply two k-digit numbers. Then we can show by induction that the amount of time needed to multiply 2^r-digit numbers is at most O(3^r) . Note that (3^r) = (2^r)^c where c = log3/log2 = 1.6 is less than 2. Thus we improved the traditional O(n^2) method to an O(n^1.6) one. Here's another method. Take any two large integers x, y. Reduce mod p for lots of different primes p. Clearly (x*y) mod p = (x mod p) * (y mod p), which for small p can be computed with just a few machine cycles. Then x*y is the integer whose reduction mod p is what was just computed. I say "the" integer because any two such must differ by something which is divisible by all the primes p. Well, the product of the first N primes is something like N^N, so you can certainly chart n-digit numbers with about n multiplies (each of numbers only about log(n) in length). You can continue to manipulate (x*y) (add, multiply,...) just by doing so mod p for all these p. At some point you probably want to convert this collection of residue classes back into an ordinary integer; I think that takes about log(n) steps every time you incorporate a new prime, i.e. n log(n) steps here as well. The use of the big-O notation here is pretty suspect, since it's not clear what a "step" is, I haven't included additions at all, and I've ignored space limitations. Most people forget that you technically need to include a lot of log log n terms to accout for the fact that, for example, you would need more time just to update your counters (say, the one keeping track of which prime you're working with, above) when the counters need to count up to a larger number. Thus I think it's clear you need to pick an intended size of integer for your application, then select the algorithm which -- in practice, not in theory! -- optimizes your speed for such integers. dave