2013-12-08 21:04:27Morris

[高等演算法] 雜

高等演算法 第 4 次作業
1. For two points p and q in the plane, we say that p dominates q if both x and y coordinates of p are greater than that of q respectively. Given a set S of n points, the rank of a point p in S is the number of points in S dominated by p. We want to find the rank of every point in S. A naïve algorithm needs O(n^2) time. Design a faster algorithm for this problem.

google search '2D rank finding problem'
solved by
1) D&C strategy (implements by mergesort.)
2) Sweep line algorithm (implements by RMQ problem[binary indexed tree or segment tree])
=> O(n logn)

2. In the fractional knapsack problem, the thief can take fractions of items, rather than having to make a binary 0/1 choice for each item. Solve this problem in O(n) time.

greedy strategy, knapsack capacity m.
-----
the problem lack the value of item.
item attribute (weight, value). how to get maximum value with limited capacity.
solved by selection algorithm.
step 1. find the median number by selection algorithm - O(n)
step 1.1 if n == 1 using greedy.
step 2. calcuate the sum of the numbers(weight) which less than median number. - O(n)
step 3-1. if sum > m
            get the numbers which less than median number. back step 1.
          else
            select the numbers which less than median number.// get those item.value
            get the number which larger then median nubmer.
            m -= n/2, back step 1.
T(n) = T(n/2) + O(n) => T(n) = O(n).

3. Assume there are n supposedly identical VLSI chips that are capable of testing each other.
There is a test jig that can accommodate two chips at a time. When the jig is loaded, each chip
tests the other and reports whether it is good or bad. A good chip always reports accurately
whether the other chip is good or bad, but the answer of a bad chip cannot be trusted. Show
that the good chips can be identified with O(n) pairwise tests, assuming that more than n/2 of
the chips are good.

tag : pigeonhole principle, math, D&C
-----
| A says | B says | result
------------------------------------------
|   good |   good | both good or both bad.
------------------------------------------
|      X |      X | least one is bad.
------------------------------------------
#a are good, #b are bad, a+b = n and a > b
1) n is even,
we make pair with random way.
by pigeonhole principle, at least one pair which are both good.
only choose (good, good) pair, random leave one in this pair.
if there are k pairs which are both bad, there are >= k pairs which are both good.
Then, we will get smaller scale problem.
    T(n) = T(n/2) + O(n)
2) n is odd.
when making pair, let 多餘挑出
證明為何可以直接捨棄
假使用 m 對呈現 (good, good) 其中會隨機挑出 p 個好的和 q 個壞的,
p + q = m
2.1) if m is odd, 保證 p > q, 捨棄多餘的, 繼續遞迴
2.2) if m is even, 保證加入多餘的 p > q, 繼續遞迴

4. Design an algorithm to find a longest (simple) path of a given digraph.

Dynamic programming O(V+E)
dynamic programming order by topological sort.

5. Given a sequence S of n nonnegative numbers x1, x2, … ,xn, and an integer k, partition S into k or fewer consecutive subsequences such that the largest sum of these subsequences is
minimized over all possible partitions.

greedy with binary search - O(k log(maximum_guess))
binary search result in [0, maximum_guess],
use O(k) check yes or no.
-----
dynamic programming O(K^2 * S)
dp[k][s] = min(max(dp[i][s-1], sum[i+1][k])) // i < k

6. Solve Problem 10069 “Distinct Subsequences” in the ACM web site.

dynamic programming - O(nm)
if a[i] == b[j]
    dp[i][j] = dp[i-1][j] + dp[i-1][j-1]
else a[i] != b[j]
    dp[i][j] = dp[i-1][j]

7. Solve Problem 10271 “Chopsticks” in the ACM web site.

sort length.

dp[i][j] 表示前 i 隻筷子分配給 j 個人
遞迴時,要保證剩餘筷子有足夠的當做第三隻筷子可以分配給已經分配的 j 個人
而保證 i-th 一定和 (i-1)-th 配。

8. Solve problem “Printing neatly” in chapter 15 of the book [CLRS].

有 n 個單詞,一行最多可放入 M 個字元的稿紙內,
將單字印到稿紙後,稿紙的每一行結尾都會留下空白,
word_length[i], sum[i][j] = sigma(word_length[k]), i <= k <= j

dp[i] 表示前 i 個單詞的最少留白個數
dp[i] = min(dp[j] + M - sum[j+1][i]), j < i and sum[j+1][i] <= M

9. Consider the activity-selection problem appears in p.41 of unit-5. Now, assume that each given activity is associated with a positive weight. Design an algorithm to find an independent subset of activities (activities that can be allowed to use the same resource) whose sum of weights is maximized.

activities each marked by a start time (si), finish time (fi), weight (wi)
The problem is to select the maximum number of activities that can be performed by a single person or machine.

if without weight, sort by finish time (fi)
    use greedy algorithm get the maximum activities can be allowed.

with weight, use sweep line algorithm
sort by start time (si)
prev_max[fi] = max(prev_max[fi], prev_max[si] + wi, prev_max[fi-1])


10. Given a list of n positive integers d1, d2, …, dn, we want to efficiently determine whether there exists a simple undirected graph whose nodes have degrees precisely d1, d2, …, dn. Design an algorithm to solve this problem.

tag : UVa 10720 - Graph Construction, greedy algorithm, Erdős–Gallai theorem
-----
1) greedy algorithm
step 1 : choose the maximum degrees node x
step 2 : let x links 前幾個大的點
step 3 : n--, back step 2
- O(n^2 logn)

2) Erdős–Gallai theorem
d1 >= ... >= dn
sigma(di) /*i = 1 to k*/ <= k*(k-1) + sigma(min(di, k))/* i = k+1 to n*/
holds for 1 <= k <= n
use binary search check
- O(n logn)

11. Let G be a connected undirected graph. Given two spanning trees T and R of G, find the
shortest sequence of trees T0, T1, …, Tk, such that T0 = T, Tk = R, and each tree differs from the previous one by an addition of one edge and a deletion of one edge.

maybe greedy algorithm use ?

12. Given a weighted graph, find a spanning tree such that the maximal edge weight of the tree is minimized over all spanning trees.

kruskal algorithm guarantee that it will minimize the maximal edge weight of the tree.

13. Let G be an undirected graph. Each edge e in G is assigned a probability pe, 0  pe  1, and a positive weight we. The length of a path is the sum of the weights of edges of the path, and the reliability of a path is the product of the probabilities of edges of the path. Given a value t, a path is said to be reliable if its reliability is not less than t. Design an efficient algorithm to find a reliable path from a given vertex u to another given vertex v with shortest length.

find the shortest path => transform the graph with DAG. (shortest path used.)
we will using dynamic program in DAG.

dp[i] = max(dp[j] * reliability[j][i]) // j -> i
dp[u] = 1;
return dp[v];


高等演算法 第 5 次作業
1. Suppose we wish not only to increment a counter but also to reset it to zero (i.e.,
make all bits in it 0). Counting the time to examine or modify a bit as O(1), show
how to implement a counter as an array of bits so that any sequence of n
INCREMENT and RESET operations takes time O(n) on an initially zero counter.
(Hint: Keep a pointer to the high-order 1.)


2. Show how to implement a queue with two ordinary stacks so that the amortized cost
of each Enqueue and Dequeue operation is O(1).

instack, outstack
push(val)
    instack.push(val)
pop() {
    if(outstack.empty)
        while(!instack.empty)
            outstack.push(instack.pop())
    return outstack.pop();
}

3. Consider an ordinary binary min-heap data structure with n elements supporting the
instructions INSERT and EXTRACT-MIN in O(lg n) worst-case time. Give a
potential function such that the amortized cost of INSERT is O(lgn) and the
amortized cost of EXTRACT-MIN is O(1) and show that it works. Recall that
heapsort consists of two phases: an O(n) time heap-construction phase and a
sequence of n Extract-Min operations. According to the above amortized analysis,
does it imply that the time complexity of heapsort is O(n)?

No, 攤銷是預先支付 INSERT cost 2lgn, 先幫 EXTRACT-MIN 附了 lgn
但是 O(n) time heap-construction 並沒有預先支付的動作,因此不能說明
EXTRACT-MIN 的操作是 O(1)。

4. Design a data structure to support the following two operations for a dynamic multiset
    S of integers, which allows duplicate values:

    INSERT(S, x) inserts x into S.
    DELETE-LARGER-HALF(S) deletes the largest |S|/2 elements from S.

Explain how to implement this data structure so that any sequence of m INSERT and
DELETE-LARGER-HALF operations runs in O(m) time. Your implementation
should also include a way to output the elements of S in O(|S|) time.

Dynamic Tables & Selection algorithm

Dynamic Tables : INSERT & DELETE for all elements - O(n)
1) INSERT(S, x) inserts x into S.
    using the same step with Dynamic Table - O(1)
2) DELETE-LARGER-HALF(S) deletes the largest |S|/2 elements from S.
    same as DELETE |S|/2 elements - O(|S|/2)
    but it will use selection algorithm move the largest |S|/2 elements to the tail.
    selection algorithm by O(|S|)

5. Suppose that in the dynamic table operations, instead of contracting a table by
halving its size when its load factor drops below 1/4, we contract it by multiplying
its size by 2/3 when its load factor drops below 1/3. What is the amortized cost of a
TABLE-DELETE that uses this strategy?

TABLE-DELETE cost 2
The total credit = size[T] * 2 / 3 - num[T]

6. Binary search of a sorted array takes logarithmic search time, but the time to insert a
new element is linear in the size of the array. We can improve the time for insertion
by keeping several sorted arrays. Specifically, suppose that we wish to support
SEARCH and INSERT on a set of n elements. Let k = lg(n+1), and let the binary
representation of n be nk1, nk2, … , n0. We have k sorted arrays A0, A1, … , Ak1,
where for i = 0, 1, …, k1, the length of array Ai is 2^i. Each array is either full or
empty, depending on whether ni = 1 or ni = 0, respectively. The total number of
elements held in all k arrays is therefore exactly n. Although each individual array is
sorted, elements in different arrays bear no particular relationship to each other.

a) Describe how to perform the SEARCH operation for this data structure. Analyze
its worst-case running time.
b) Describe how to perform the INSERT operation. Analyze its worst-case and
amortized running times.
c) Discuss how to implement DELETE.

共有 lg(n+1) 堆,每堆 n/ lg(n+1) 個數字,這 lg(n+1) 個陣列分別做排序
維護每一堆的個數在 [n/lg(n+1)/2, n/(lg(n+1))] 之間 ?
a) SEARCH O(lgn * lgn)
    對每一堆都進行搜尋 O(lg n)
b) 找未滿堆中最少元素的堆 O(lg n)
        1) 如果找得到未滿堆,直接對其進行插入排序 O(n/lgn)
        2) 反之,將其中一堆滿的,對分兩堆,對其中一半進行插入排序 O(n/lgn)
   INSERT O(n/lgn)
c) 找最大堆進行刪除 O(n/lgn)
    當有一堆維護條件不足時,在找一堆進行合併排序 O(lgn)

7. Study the famous KMP algorithm for the string matching problem and give an
amortized analysis of the algorithm.

j:=0;
for i:=1 to n do
begin
  while (j>0) and (B[j+1]<>A[i]) do j:=P[j];
  if B[j+1]=A[i] then j:=j+1;
  if j=m then
    begin
      writeln('Pattern occurs with shift ',i-m);
    j:=P[j];
end;

为什么这个程序是 O(n) 的?其实,主要的争议在于,while 循环使得执行次数出现了不
确定因素。我们将用到时间复杂度的摊还分析中的主要策略,简单地说就是通过观察某一个
变量或函数值的变化来对零散的、杂乱的、不规则的执行次数进行累计。
KMP 的时间复杂度分析可谓摊还分析的典型。我们从上述程序的 j 值入手。每一次执行

while 循环都会使 j 减小(但不能减成负的),而另外的改变 j 值的地方只有第五行。每次执行了这一行, j 都只能加1;因此,整个过程中j最多加了
n个1。于是,j最多只有n次减小的机会(j值减小的次数当然不能超过n,因为j永远是非负整数)。
这告诉我们,while循环总共最多执行了n次。按照摊还分析的说法,平摊到每次for循环中后,
一次 for循环的复杂度为O(1)。整个过程显然是O(n)的。
这样的分析对于后面P数组预处理的过程同样有效,同样可以得到预处理过程的复杂度为 O(m)。