2012-11-18 11:56:03Morris

[UVA][dp][最優二叉樹] 12057 - Prefix Codes

Given an alphabet S, and a probability Prob(a) for each , a binary prefix code represents each a in S as a bit string B(a), such that B(a1) is not a prefix of B(a2) for any a1≠a2 in S.

Huffman’s algorithm constructs a binary prefix code by pairing the two least probable elements of S, a0 and a1. a0 and a1 are given codes with a common (as yet to be determined) prefix p and differ only in their last bit: B(a0) = p0 while B(a1) = p1. a0 and a1 are removed from S and replaced by a new element b with Prob(b) = Prob(a0) + Prob(a1). b is an imaginary element standing for both a0 and a1. The Huffman code is computed for this reduced S, and p is set equal to B(b). This reduction of the problem continues until S contains one element a represented by the empty string; that is, when S = {a}, B(a) = ε.

Huffman’s code is optimal in that there is no other prefix code with a shorter average length defined as:

One problem with Huffman codes is that they don’t necessarily preserve any ordering that the elements may have. For example, suppose S = {A, B, C} and Prob(A) = 0.7, Prob(B) = 0.1, Prob(C) = 0.2. A Huffman code for S is B(A) = 1, B(B) = 00, B(C) = 01. The lexicographic ordering of these strings is B(B), B(C), B(A) [i.e. 00,01,1], so the coding does not preserve the original order A, B, C. Therefore, algorithms like binary search might not work as expected on Huffman-coded data.



Given an ordered set S and Prob, you are to compute an ordered prefix code - one whose lexicographic order preserves the order of S.

Input 

Input consists of several data sets. Each set begins with 0<n<100, the number of elements in S. n lines follow; the i-th line gives the probability of ai, the i-th element of S. Each probability is given as 0.dddd (that is, with exactly four decimal digits). The probabilities sum to 1.0000 exactly. A line containing 0 follows the last data set.

Output 

For each data set, compute an optimal ordered binary prefix code for S. The output should consist of one line giving the average code length, followed by n lines, with the i-th line giving the code for the i-th element of S. If you have solved the problem, these n lines will be in lexicographic order. If there are many optimal solutions, choose any one.

Output an empty line between cases.

Sample Input 

3
0.7000
0.1000
0.2000
3
0.7000
0.2000
0.1000
0

Sample Output 

1.3000
0
10
11
1.3000
0
10
11

被題目描述給騙了,寫了霍夫曼提交沒過,後來才發現他要求的是最優二叉樹,
把課本上的代碼敲一敲 O(n^3) 提交就過了。

#include <stdio.h>
int root[105][105];
char bits[105];
void dfs(int l, int r, int dep) {
    if(l > r) {
        bits[dep] = '\0';
        puts(bits);
        return;
    }
    bits[dep] = '0';
    dfs(l, root[l][r]-1, dep+1);
    bits[dep] = '1';
    dfs(root[l][r]+1, r, dep+1);
}
int main() {
    int n;
    while(scanf("%d", &n) == 1 && n) {
        double q[105] = {}, tmp;
        double w[105][105] = {}, e[105][105];
        int l, i, j, r;
        for(i = 0; i < n; i++)
            scanf("%lf", &q[i]);
        n--;
        for(i = 1; i <= n+1; i++)
            e[i][i-1] = w[i][i-1] = q[i-1];
        for(l = 1; l <= n; l++) {
            for(i = 1; i <= n-l+1; i++) {
                j = i+l-1;
                e[i][j] = 32767;
                w[i][j] = w[i][j-1]+q[j];
                for(r = i; r <= j; r++) {
                    tmp = e[i][r-1]+e[r+1][j]+w[i][j];
                    if(tmp < e[i][j]) {
                        e[i][j] = tmp;
                        root[i][j] = r;
                    }
                }
            }
        }
        printf("%.4lf\n", e[1][n]-w[1][n]);
        dfs(1, n, 0);
    }
    return 0;
}