编辑距离问题-算法导论

编辑距离问题-算法导论
编辑距离问题-算法导论

第三次上机报告

班级:031013 学号:03101283 姓名:黄辉煌

Title:动态规划求最短编辑距离

问题描述:已知字符串X和Y,要求使用最少的字符串操作将字符串X转换成字符串Y,字符串操作包括以下四种:

(1)删除一个字符(Delete),Cost(Delete)=2;

(2)插入一个字符(Insert),Cost(Insert)=2;

(3)替换一个字符(Replace),Cost(Replace)=1;

(4)复制一个字符(Copy),Cost(Copy)= -1.

Input::字符串X和Y;

Output:字符串X和Y的编辑距离EditDistance。

Testing Data:

1. X = algorithm Y = altruistic

2. X = GATCGGCAT Y = CAATGTGAATC

一、算法设计与描述

对于序列X[i](0≤i≤m)和序列Y[j](0≤j≤n)来说,定义s[i,j]为X[i]转换成Y[j]的操作序列的最小代价。假定我们已经知道了最后一次执行的操作,此时分情况讨论问题的最优解结构。

1. 最后一次操作为Copy,根据书中定义可知,x[i]=y[j], 问题就转化成了求将X[i-1]转

换为Y[j-1]的最小开销。因此,当最后一次操作为copy时,可以定义s[i,j]=s[i-1,j-1]+cost(copy)。

2. 最后一次操作为replace。此时,根据题目的定义可知x[i]≠y[j]。仿照上述分析,可以

得到相同的最优子结构。此时s[i,j]=s[i-1,j-1]+cost(replace)。

3. 最后一次操作为delete。根据题意,这时只是将x[i]从序列x[i]中除去,对序列Y[j]

没有任何影响,此时问题的最优子结构的形式为将X[i-1]转换成Y[j],于是可以得到s[i,j]=s[i-1,j]+cost(delete)。

4. 最后一次操作为insert。根据题意,在进行插入操作时,序列X[i]无任何变化,序列

Y[j]添加一个字符,因此,这时候问题的最优子结构的形式为将X[i]转换成为Y[j-1],此时s[i,j]=s[i,j-1]+cost(insert).

5. 最终s[i,j] = Min {Case1, Case2, Case3, Case4}

二、主程序代码

#include

#include

#include

using namespace std;

typedef enum{Copy,Replace,Delete,Insert};

int cost[4] = {-1,1,2,2};

int s[15][15] = {0};

int Edit_Distance(char *x, char *z, int x_Length, int z_Length);

void Print_Cost(int x_Length, int z_Length);

void main()

{

char x[] = "algorithm";

char z[] = "altruistic";

int x_Length = strlen(x), z_Length = strlen(z);

cout<<"代价:"<

cout<<"从i到j的编辑距离:"<

Print_Cost(x_Length, z_Length);

}

int Edit_Distance(char *x, char *z, int x_Length, int z_Length)

{

int i = 0, j = 0,temp;

//初始化s[0,0]=0

//s[i,0] = i * cost(delete)

for(i = 1; i <= x_Length; i++)

s[i][0] = i * cost[Delete];

//s[0,j] = j * cost[insert]

for(j = 1; j <= z_Length; j++)

s[0][j] = i * cost[Insert];

for(i = 0; i < x_Length; i++)

{

for(j = 0; j < z_Length; j++)

{

s[i+1][j+1] = INT_MAX;

//copy

if(x[i] == z[j])

{

temp = s[i][j] + cost[Copy];

if(temp < s[i+1][j+1])

s[i+1][j+1] = temp;

}

//Replace

if(x[i] != z[j])

{

temp = s[i][j] + cost[Replace];

if(temp < s[i+1][j+1])

s[i+1][j+1] = temp;

}

//delete

temp = s[i][j+1] + cost[Delete];

if(temp < s[i+1][j+1])

s[i+1][j+1] = temp;

//insert

temp = s[i+1][j] + cost[Insert];

if(temp < s[i+1][j+1])

s[i+1][j+1] = temp;

}

}

return s[x_Length][z_Length];

}

void Print_Cost(int x_Length, int z_Length) {

int i, j;

for(i = 1; i <= x_Length; i++)

{

for(j = 1; j <= z_Length; j++)

cout<

cout<

}

}

三、设计与调试分析

1、算法复杂度分析

本算法时间复杂度为O(n^2)

2、运行结果

四、总结

程序利用动态规划的思想,将问题剖析细化得到最优解。这与之前所接触的程序设计思想有一定差异性,所以刚开始时并无头绪。这是理解编程的又一个进步。

算法导论 第三版 第24章 答案 英

Chapter24 Michelle Bodnar,Andrew Lohr April12,2016 Exercise24.1-1 If we change our source to z and use the same ordering of edges to decide what to relax,the d values after successive iterations of relaxation are: s t x y z ∞∞∞∞0 2∞7∞0 25790 25690 24690 Theπvalues are: s t x y z NIL NIL NIL NIL NIL z NIL z NIL NIL z x z s NIL z x y s NIL z x y s NIL Now,if we change the weight of edge(z,x)to4and rerun with s as the source,we have that the d values after successive iterations of relaxation are: s t x y z 0∞∞∞∞ 06∞7∞ 06472 02472 0247?2 Theπvalues are: s t x y z NIL NIL NIL NIL NIL NIL s NIL s NIL NIL s y s t NIL x y s t NIL x y s t 1

Note that these values are exactly the same as in the worked example.The di?erence that changing this edge will cause is that there is now a negative weight cycle,which will be detected when it considers the edge(z,x)in the for loop on line5.Since x.d=4>?2+4=z.d+w(z,x),it will return false on line7. Exercise24.1-2 Suppose there is a path from s to v.Then there must be a shortest such path of lengthδ(s,v).It must have?nite length since it contains at most|V|?1 edges and each edge has?nite length.By Lemma24.2,v.d=δ(s,v)<∞upon termination.On the other hand,suppose v.d<∞when BELLMAN-FORD ter-minates.Recall that v.d is monotonically decreasing throughout the algorithm, and RELAX will update v.d only if u.d+w(u,v)

算法导论第二章答案

第二章算法入门 由于时间问题有些问题没有写的很仔细,而且估计这里会存在不少不恰当之处。另,思考题2-3 关于霍纳规则,有些部分没有完成,故没把解答写上去,我对其 c 问题有疑问,请有解答方法者提供个意见。 给出的代码目前也仅仅为解决问题,没有做优化,请见谅,等有时间了我再好好修改。 插入排序算法伪代码 INSERTION-SORT(A) 1 for j ← 2 to length[A] 2 do key ←A[j] 3 Insert A[j] into the sorted sequence A[1..j-1] 4 i ←j-1 5 while i > 0 and A[i] > key 6 do A[i+1]←A[i] 7 i ←i ? 1 8 A[i+1]←key C#对揑入排序算法的实现: public static void InsertionSort(T[] Input) where T:IComparable { T key; int i; for (int j = 1; j < Input.Length; j++) { key = Input[j]; i = j - 1; for (; i >= 0 && Input[i].CompareTo(key)>0;i-- ) Input[i + 1] = Input[i]; Input[i+1]=key; } } 揑入算法的设计使用的是增量(incremental)方法:在排好子数组A[1..j-1]后,将元素A[ j]揑入,形成排好序的子数组A[1..j] 这里需要注意的是由于大部分编程语言的数组都是从0开始算起,这个不伪代码认为的数组的数是第1个有所丌同,一般要注意有几个关键值要比伪代码的小1. 如果按照大部分计算机编程语言的思路,修改为: INSERTION-SORT(A) 1 for j ← 1 to length[A] 2 do key ←A[j] 3 i ←j-1

算法导论复习资料

算法导论复习资料 一、选择题:第一章的概念、术语。 二、考点分析: 1、复杂度的渐进表示,复杂度分析。 2、正确性证明。 考点:1)正确性分析(冒泡,归并,选择);2)复杂度分析(渐进表示O,Q,?,替换法证明,先猜想,然后给出递归方程)。 循环不变性的三个性质: 1)初始化:它在循环的第一轮迭代开始之前,应该是正确的; 2)保持:如果在循环的某一次迭代开始之前它是正确的,那么,在下一次迭代开始之前,它也应该保持正确; 3)当循环结束时,不变式给了我们一个有用的性质,它有助于表明算法是正确的。 插入排序算法: INSERTION-SORT(A) 1 for j ← 2 to length[A] 2 do key ←A[j] 3 ?Insert A[j] into the sorted sequence A[1,j - 1]. 4 i ←j - 1 5 while i > 0 and A[i] > key 6 do A[i + 1] ←A[i] 7 i ←i - 1 8 A[i + 1] ←key 插入排序的正确性证明:课本11页。 归并排序算法:课本17页及19页。 归并排序的正确性分析:课本20页。 3、分治法(基本步骤,复杂度分析)。——许多问题都可以递归求解 考点:快速排序,归并排序,渐进排序,例如:12球里面有一个坏球,怎样用最少的次数找出来。(解:共有24种状态,至少称重3次可以找出不同的球) 不是考点:线性时间选择,最接近点对,斯特拉算法求解。 解:基本步骤: 一、分解:将原问题分解成一系列的子问题; 二、解决:递归地解各子问题。若子问题足够小,则直接求解; 三、合并:将子问题的结果合并成原问题的解。 复杂度分析:分分治算法中的递归式是基于基本模式中的三个步骤的,T(n)为一个规模为n的运行时间,得到递归式 T(n)=Q(1) n<=c T(n)=aT(n/b)+D(n)+C(n) n>c 附加习题:请给出一个运行时间为Q(nlgn)的算法,使之能在给定的一个由n个整数构成的集合S 和另一个整数x时,判断出S中是否存在有两个其和等于x的元素。

算法导论习题答案

Chapter2 Getting Start 2.1 Insertion sort 2.1.2 将Insertion-Sort 重写为按非递减顺序排序 2.1.3 计算两个n 位的二进制数组之和 2.2 Analyzing algorithms 当前n-1个元素排好序后,第n 个元素已经是最大的元素了. 最好时间和最坏时间均为2()n Θ 2.3 Designing algorithms 2.3.3 计算递归方程的解 22()2(/2)2,1k if n T n T n n if n for k =?=?+ = >? (1) 当1k =时,2n =,显然有()lg T n n n = (2) 假设当k i =时公式成立,即()lg 2lg 22i i i T n n n i ===?, 则当1k i =+,即12i n +=时, 2.3.4 给出insertion sort 的递归版本的递归式 2.3-6 使用二分查找来替代insertion-sort 中while 循环内的线性扫描,是否可以将算法的时间提高到(lg )n n Θ? 虽然用二分查找法可以将查找正确位置的时间复杂度降下来,但

是移位操作的复杂度并没有减少,所以最坏情况下该算法的时间复杂度依然是2()n Θ 2.3-7 给出一个算法,使得其能在(lg )n n Θ的时间内找出在一个n 元素的整数数组内,是否存在两个元素之和为x 首先利用快速排序将数组排序,时间(lg )n n Θ,然后再进行查找: Search(A,n,x) QuickSort(A,n); i←1; j←n; while A[i]+A[j]≠x and i,()()b b n a n +=Θ 0a >时,()()2b b b b n a n n n +<+= 对于121,2b c c ==,12()b b b c n n a c n <+< 0a <时,()b b n a n +<

《算法导论2版》课后答案_加强版2

1 算法导论第三次习题课 2008.12.17

2 19.1-1 如果x 是根节点:degree[sibling[x]] > sibling[x] 如果x 不是根节点:degree[sibling[x]] = sibling[x] –119.1-3 略

3 19.2-2 过程略( union 操作) 19.2-3 (1)decrease-key (2)extract-min 过程略 19.2-6 算法思想:找到堆中最小值min ,用min-1代替-∞. [遍历堆中树的根节点]

4 15.1-1 15.1-2 略P195 或P329 15.1-4 f i [j]值只依赖f i [j-1]的值,从而可以从2n 压缩为2个。再加上f*、l*、l i [j]。 Print-station(l, I, j ) //i 为线路,j 为station if j>1 then Print-station(l, l i [j], j-1 ) print “line”I, “station”j;

5 15.2-1 略(见课本) 15.2-2 15.2-4 略 MATRIX-CHAIN-MULTIPLY(A, s, i, j) if j>i x= MATRIX-CHAIN-MULTIPLY(A, s, s(i,j), j) y= MATRIX-CHAIN-MULTIPLY(A, s, s(i,j)+1,j) return MATRIX-MULTIPLY(x, y) else return A i

6 15.3-1 (归纳法)递归调用 枚举15.3-2 没有效果,没有重叠子问题 15.3-4 略 (3)n Ο3/2(4/) n n Θ

算法导论 第八章答案

8.2-4 :在O(1)的时间内,回答出输入的整数中有多少个落在区间[a...b]内。给出的算法的预处理时间为O(n+k) 算法思想:利用计数排序,由于在计数排序中有一个存储数值个数的临时存储区C[0...k],利用这个数组即可。 #include using namespace std; //通过改编计数排序而来,因为有些部分需要注释掉 void counting_sort(int*&a, int length, int k, int*&b, int*&c); int main() { const int LEN =100; int*a =newint[LEN]; for(int i =0; i < LEN; i++) a[i] = (i -50)*(i -50) +4; int* b =new int[LEN]; const int k =2504; int* c =new int[k +1]; counting_sort(a, LEN, k, b, c); //这里需要注释掉 //for(int i = 0; i < LEN; i++) //cout<>m>>n) { if(m >n) cout<<"区间输入不对"< k && m >0) cout<<"个数为"< k && m <=0) cout<<"个数为"<= 0; i--) { b[c[a[i]] - 1] = a[i]; c[a[i]]--; }*/ } PS:计数排序的总时间为O(k+n),在实践中,如果当k = O(n)时,我们常常采用计数排序,

算法导论 第三版 第35章 答案 英

Chapter35 Michelle Bodnar,Andrew Lohr April12,2016 Exercise35.1-1 We could select the graph that consists of only two vertices and a single edge between them.Then,the approximation algorithm will always select both of the vertices,whereas the minimum vertex cover is only one vertex.more generally,we could pick our graph to be k edges on a graph with2k vertices so that each connected component only has a single edge.In these examples,we will have that the approximate solution is o?by a factor of two from the exact one. Exercise35.1-2 It is clear that the edges picked in line4form a matching,since we can only pick edges from E ,and the edges in E are precisely those which don’t share an endpoint with any vertex already in C,and hence with any already-picked edge. Moreover,this matching is maximal because the only edges we don’t include are the ones we removed from E .We did this because they shared an endpoint with an edge we already picked,so if we added it to the matching it would no longer be a matching. Exercise35.1-3 We will construct a bipartite graph with V=R∪L.We will try to construct it so that R is uniform,not that R is a vertex cover.However,we will make it so that the heuristic that the professor(professor who?)suggests will cause us to select all the vertices in L,and show that|L|>2|R|. Initially start o?with|R|=n?xed,and L empty.Then,for each i from 2up to n,we do the following.Let k= n i .Select S a subset of the vertices of R of size ki,and so that all the vertices in R?S have a greater or equal degree.Then,we will add k vertices to L,each of degree i,so that the union of their neighborhoods is S.Note that throughout this process,the furthest apart the degrees of the vertices in R can be is1,because each time we are picking the smallest degree vertices and increasing their degrees by1.So,once this has been done for i=n,we can pick a subset of R whose degree is one less than the rest of R(or all of R if the degrees are all equal),and for each vertex in 1

算法导论 课后题答案

Partial Solutions for Introduction to algorithms second edition Professor: Song You TA: Shao Wen

ACKNOWLEDGEMENT CLASS ONE: JINZI CLASS TWO: LIUHAO, SONGDINMIN, SUNBOSHAN, SUNYANG CLASS FOUR:DONGYANHAO, FANSHENGBO, LULU, XIAODONG, CLASS FIVE:GAOCHEN, WANGXIAOCHUAN, LIUZHENHUA, WANGJIAN, YINGYING CLASS SIX: ZHANGZHAOYU, XUXIAOPENG, PENGYUN, HOULAN CLASS: LIKANG,JIANGZHOU, ANONYMITY The collator of this Answer Set, SHAOWen, takes absolutely no responsibility for the contents. This is merely a vague suggestion to a solution to some of the exercises posed in the book Introduction to algorithms by Cormen, Leiserson and Rivest. It is very likely that there are many errors and that the solutions are wrong. If you have found an error, have a better solution or wish to contribute in some constructive way please send an Email to shao_wen_buaa@https://www.360docs.net/doc/484455549.html, It is important that you try hard to solve the exercises on your own. Use this document only as a last resort or to check if your instructor got it all wrong. Have fun with your algorithms and get a satisfactory result in this course. Best regards, SHAOWen

算法导论第三版答案

Solution to Exercise2.2-2 S ELECTION-S ORT.A/ n D A:length for j D1to n 1 smallest D j for i D j C1to n if A?i

2-2Selected Solutions for Chapter2:Getting Started A?low::high contains the value .The initial call to either version should have the parameters A; ;1;n. I TERATIVE-B INARY-S EARCH.A; ;low;high/ while low high mid D b.low C high/=2c if ==A?mid return mid elseif >A?mid low D mid C1 else high D mid 1 return NIL R ECURSIVE-B INARY-S EARCH.A; ;low;high/ if low>high return NIL mid D b.low C high/=2c if ==A?mid return mid elseif >A?mid return R ECURSIVE-B INARY-S EARCH.A; ;mid C1;high/ else return R ECURSIVE-B INARY-S EARCH.A; ;low;mid 1/ Both procedures terminate the search unsuccessfully when the range is empty(i.e., low>high)and terminate it successfully if the value has been found.Based on the comparison of to the middle element in the searched range,the search continues with the range halved.The recurrence for these procedures is therefore T.n/D T.n=2/C?.1/,whose solution is T.n/D?.lg n/.

算法导论 第三版 第一章 答案 英

Chapter1 Michelle Bodnar,Andrew Lohr April12,2016 Exercise1.1-1 An example of a real world situation that would require sorting would be if you wanted to keep track of a bunch of people’s?le folders and be able to look up a given name quickly.A convex hull might be needed if you needed to secure a wildlife sanctuary with fencing and had to contain a bunch of speci?c nesting locations. Exercise1.1-2 One might measure memory usage of an algorithm,or number of people required to carry out a single task. Exercise1.1-3 An array.It has the limitation of requiring a lot of copying when re-sizing, inserting,and removing elements. Exercise1.1-4 They are similar since both problems can be modeled by a graph with weighted edges and involve minimizing distance,or weight,of a walk on the graph.They are di?erent because the shortest path problem considers only two vertices,whereas the traveling salesman problem considers minimizing the weight of a path that must include many vertices and end where it began. Exercise1.1-5 If you were for example keeping track of terror watch suspects,it would be unacceptable to have it occasionally bringing up a wrong decision as to whether a person is on the list or not.It would be?ne to only have an approximate solution to the shortest route on which to drive,an extra little bit of driving is not that bad. 1

相关文档
最新文档