A Divide and Conquer Approach to Fisher's Exact Test
Keywords: Fischer's exact test, linear programming, partition function
Abstract: Here, a form f(T) of an r x c contingency table T is the list of subtotals for rows and columns. The probability of T occurring conditional on f(T) = f, f fixed, is p(T) = v. The fisher exact statistic F(T) of T is 1-{sum pr(T')}, pr(T') > v and f(T') = f. Computer procedures usually enumerate possible entries for a fixed column ( or row ), and sum weighted values of F for the remaining r x (c-1) forms. This requires many recursive calls to the procedure. Since the space and time requirements explode they are posted to avoid repetition.
The approach here is to efficiently enumerate splits of forms f into pairs (f1,f2) having fixed shapes r x c1 and r x c2, c1+c2 = c, via linear programming. If r >> c interchange rows and columns first. The number of calls is much less but cannot be maintained by returning weighted sums of F(T1), F(T2) because, for a fixed splitting, the possible T1, T2 are almost independent. Therefor, except for the first call, a histogram of probability values is returned. These values depend only on the partition of a sum that the entries of a given Ti define. Since the mathematical partition function grows more slowly than functions of binomial coefficients, the space and time requirements of the histograms are less. Prototyping will be discussed.