Lecture 15 Dynamic Programming, Longest Common Subsequence

合集下载

Programming Language Pragmatics PL3

Static and dynamic refer to things bound before run time and at run time
IT IS DIFFICULT TO OVERSTATE THE IMPORTANCE OF BINDING TIMES IN PROGRAMMING LANGUAGES
Binding Time

compile time Choose the mapping of high-level constructs to machine code plan for data layout link time layout of whole program in memory (virtual memory) Resolves inter-module references when separate compilation load time choice of physical addresses for objects within the program Physical addresses may change at run-time
7
Lifetime and Storage Managements

When discuss the names and bindings, distinguishing between names and the objects to which they refer and identify several key events Key events

When a variable is passed to a subroutine by reference (var parameters in Pascal or ―&‖ parameters in C++), the binding between the parameter name and the variable that was passed has a lifetime shorter than that of the variable itself If object outlives binding, it‘s garbage If binding outlives object, it‘s a dangling reference

Lecture0.Introdnction

Camera:
• Light & photograph
Robot arm:
• Movement & motor speed
Introduction: Smart Phone with Camera, GPS etc. iPhone
From /tw/
Feng-LiLian Lian© ©2013 2011 Feng-Li NTUEE-SS0-Intro-13
Introduction: Modulation & Demodulation in Communication
Feng-LiLian Lian© ©2012 2011 Feng-Li NTUEE-SS0-Intro-8
Signal Frequency Characteristics:
f (Hz) 10 20K 100M 300M 40G 2.4G 300G
Introduction: Digital Signal Processing
Feng-LiLian Lian© ©2013 2011 Feng-Li NTUEE-SS0-Intro-10
Signals and Systems
Feng-LiLian Lian© ©2013 2011 Feng-Li NTUEE-SS0-Intro-11
LTI & Convolution
(Chap 2)
Course Flowchart
Signals & Systems
(Chap 1)
Feng-LiLian Lian© ©2013 2011 Feng-Li NTUEE-SS0-Intro-18
LTI & Convolution

编程语言英语知识点总结

编程语言英语知识点总结IntroductionIn the world of software development, programming languages are the building blocks of all digital applications. They enable developers to write code, create algorithms, and design user interfaces. With the evolution of technology, new programming languages emerge, offering different features and functionalities. In this comprehensive knowledge summary, we will explore the fundamental concepts of programming languages, their types, and the popular languages used in the software industry.Basic Concepts of Programming Languages1. Syntax and Semantics:Syntax and semantics are the core elements of any programming language. Syntax refers to the rules and structure of the language, while semantics relates to the meaning and interpretation of the code. In simpler terms, syntax determines how the code should be written, while semantics defines what the code does.2. Variables and Data Types:Variables are used to store data values in a program. They act as containers that hold different types of data, such as numbers, strings, or boolean values. Data types define the characteristics of the data and specify how it should be processed by the computer.3. Control Structures:Control structures govern the flow of a program by allowing developers to define conditional statements (if-else) and looping constructs (while, for). They enable the program to make decisions and perform repetitive tasks.4. Functions and Methods:Functions and methods are reusable blocks of code that perform specific tasks. They help in organizing and modularizing the code, making it easier to maintain and understand. Types of Programming Languages1. Procedural Languages:Procedural languages focus on defining a sequence of instructions for the computer to execute. They are based on procedures or subroutines that contain a series of steps to perform a specific task. Popular examples include C, Pascal, and Fortran.2. Object-Oriented Languages:Object-oriented languages are designed around the concept of objects, which encapsulate data and behavior. They support features such as inheritance, polymorphism, and encapsulation. Some well-known object-oriented languages are Java, C++, and Python.3. Functional Languages:Functional languages emphasize the application of mathematical functions to solve problems. They treat computation as the evaluation of functions and promote concepts like immutability and recursion. Haskell, Lisp, and Erlang are notable functional languages.4. Scripting Languages:Scripting languages are used for automating tasks, such as system administration, web development, and game scripting. They are often interpreted rather than compiled, making them suitable for rapid prototyping and development. Examples include JavaScript, PHP, and Ruby.5. Domain-Specific Languages (DSLs):DSLs are tailored for specific domains or industries and focus on solving specialized problems within those domains. They are used in areas like finance, healthcare, and telecommunications. SQL for database querying and HTML for web markup are prime examples of DSLs.Popular Programming Languages1. Python:Python is a high-level, general-purpose language known for its simplicity and readability. It features a rich standard library and supports multiple programming paradigms. It is widely used in web development, data science, and artificial intelligence.2. JavaScript:JavaScript is a versatile language primarily used for client-side web development. It enables interactive and dynamic web pages by allowing developers to manipulate the Document Object Model (DOM) and handle user events.3. Java:Java is a robust, platform-independent language that powers enterprise-grade applications, mobile apps, and embedded systems. It emphasizes portability, security, and performance, making it a popular choice for large-scale projects.4. C++:C++ is an extension of the C programming language, with an added focus on object-oriented programming and generic programming. It is widely used in system software, game development, and performance-critical applications.5. C#:C# (pronounced as C sharp) is a modern language developed by Microsoft and used inthe .NET framework. It combines the power of C++ with the simplicity of Java and is favored for building Windows desktop applications and web services.6. Ruby:Ruby is a dynamic, object-oriented language known for its elegant syntax and developer-friendly environment. It is commonly used in web development, thanks to the Ruby on Rails framework, which facilitates rapid application development.7. Swift:Swift is a relatively new language introduced by Apple for iOS, macOS, watchOS, and tvOS development. It offers modern features, including safety, concurrency, and syntax clarity, making it a preferred choice for Apple ecosystem apps.8. PHP:PHP is a server-side scripting language designed for web development. It powers a significant portion of the web, particularly in the context of content management systems like WordPress and e-commerce platforms like Magento.ConclusionProgramming languages play a crucial role in shaping the digital landscape and enabling developers to build innovative solutions. Understanding the basic concepts and types of programming languages provides a strong foundation for mastering any language. With the right knowledge and skills, developers can leverage the capabilities of different languages to create robust and efficient software applications. Stay updated with the latest trends and advancements in the world of programming languages to stay ahead in the ever-evolving tech industry.。

计算机专业常用英语

17
10. local a. 局部；本地[机]
localhost 本(主)机
比较 remote a. 远程 remote access 远程访问 remote communications 远程通信 remote terminal 远程终端
11. ring network 环形网络 ring topology 环形拓扑
4
比较 model 模型 module 模块 modulo 模(运算)
11. opcode 操作码( operation code ) 12. decode v. 译码
decoder 译码器，解码器，翻译程序
5
2.2 Microprocessor And Microcomputer
1. integrated circuit 集成电路
8. library (程序)库，库
16
9. single threading 单线程处理方式
Within a program, the running of a single process at a time. 在程序中，一次运行一个进程的方式。
single-precision 单精度 single-user computer 单用户计算机 thread 线程；线索 threaded tree 线索树 threading 线程技术
11
16. idle a. 空闲，等待
Operational but not in use. 用来说明处在可操作状态但不在使用。 idle state 空闲状态
12
18. launch v. 启动，激活 19. prototyping 原型法 20. project n. 投影运算 21. workstation 工作站

MIT基础数学讲义(计算机系)lecture15

every item of B, then A has k times as many items as B.
For example, suppose A is a set of students, B is a set of recitations, and f de nes the assignment
f f;1( x1x2 : : : xn]) = (x1 x2 x3 : : : xn)
(xn x1 x2 : : : xn;1)
(xn;1 xn x1 : : : xn;2)
: : :x1 appears (x2 x3 x4 : : :
in n di
x1)g
erent
places : : :
By the Division Rule, jAj = njBj. This gives:
de in
nition, however, f A to some element
;b12(bB) c,atnhebne
a set,
f ;1(b)
not just a single value. For exampe, if f
is actually the empty set. In the special
4
Lecture 15: Lecture Notes
1 The Division Rule
We will state the Division Rule twice, once informally and then again with more precise notation.
Theorem 1.1 (Division Rule) If B is a nite set and f : A 7! B maps precisely k items of A to

dynamic programming 动态规划

Naturally, the TSP can be solved in time On!, by enumerating all tours |but this is very impractical. Since the TSP is one of the NP-complete problems, we have little hope of developing a polynomialamic programming gives an algorithm of complexity On22n |exponential, but much faster than n!.
A B C D with total cost 20 300 10 + 40 20 10 + 40 10 100 = 108; 000. Among
the ve possible orders the ve possible binary trees with four leaves this latter method is
2. Travelling Salesman Problem:
The traveling salesman problem. Suppose that you are given n cities and the distances dij between any two cities; you wish to nd the shortest tour that takes you from your home city to all cities and back.
for all j 2 S; j 6= 1 do
opt:=
mfCinjS6=;1jC :=f1m; 2i;n:i6=: :j;;in2Sg;

2024年Python语言程序设计课件

Python语言程序设计课件语言程序设计课件一、引言是一种面向对象的解释型计算机程序设计语言，由GuidovanRossum于1989年底发明，第一个公开发行版发行于1991年。

具有丰富和强大的库，它常被昵称为胶水语言，能够把用其他语言制作的各种模块（尤其是C/C++）很轻松地联结在一起。

在设计上坚持了清晰划一的风格，这使得成为一门易读、易维护，并且被大量用户所欢迎的编程语言。

二、语言特点1.易于学习：有相对较少的关键字，结构简单，和一个明确定义的语法，学习起来更加简单。

2.易于阅读：代码定义的更清晰。

3.易于维护：的成功在于它的是相当容易维护的。

4.一个广泛的标准库：的最大的优势之一是丰富的库，跨平台的，在UNIX，Windows和Macintosh兼容很好。

5.互动模式：互动模式的支持，您可以从终端输入执行代码并获得结果的语言，互动模式很方便调试。

6.可移植：基于其开放的特性，已经被移植（也就是使其工作）到许多平台。

7.可扩展：如果你需要一段运行很快的关键代码，或者是想要编写一些不愿开放的算法，你可以使用C或C++完成那部分程序，然后从你的程序中调用。

8.数据库：提供所有主要的商业数据库的接口。

9.GUI编程：支持GUI可以创建和移植到许多系统调用。

10.可嵌入：你可以将嵌入到C/C++程序，让你的程序的用户获得"脚本化"的能力。

三、语言程序设计基础1.变量与数据类型变量是计算机语言中能存储计算结果或能表示值抽象概念。

变量可以通过变量名访问。

在中，变量就是代表一个对象的名字和地质。

数据类型是解释器根据变量的值来决定如何解释和存储变量的值的。

2.运算符与表达式算术运算符：用于基本的算术运算，如加法、减法、乘法、除法等。

比较（关系）运算符：用于比较两个变量的值，如等于、不等于、大于、小于等。

赋值运算符：用于将一个值赋给变量。

逻辑运算符：用于根据表达式的值返回True或False。

洋娃娃带你过中级笔记

洋娃娃带你过中级笔记1. 引言在计算机科学和工程领域，算法是解决问题的步骤序列。

学习和掌握算法对于程序员来说非常重要，特别是针对不同难度级别的算法。

本篇笔记将向您介绍中级算法，并使用洋娃娃作为您的学习伙伴，带您逐步深入了解这一主题。

2. 算法的基础知识在开始学习中级算法之前，让我们先复习一些算法的基础知识。

2.1 什么是算法？算法是一组解决问题的明确步骤。

它们可以用于执行计算、处理数据等多种任务。

2.2 算法的特性•明确性：算法必须明确无歧义地描述如何解决问题。

•有限性：算法必须在有限步骤内终止。

•有效性：算法应该使用有效的方法来解决问题。

•输入和输出：算法必须有输入和输出。

2.3 常见的算法复杂度•时间复杂度：描述算法执行所需的时间。

•空间复杂度：描述算法在执行期间所需的内存空间。

3. 中级算法示例中级算法需要更多的思考和技巧，以下是几个示例，以帮助您理解这一级别的算法。

3.1 排序算法排序算法是一种将元素按照一定顺序进行排列的算法。

常见的排序算法有冒泡排序、选择排序和快速排序等。

冒泡排序冒泡排序的思想是将最大的元素逐步移到数组的末尾。

它的时间复杂度为O(n^2)。

def bubbleSort(arr):n = len(arr)for i in range(n):for j in range(0, n-i-1):if arr[j] > arr[j+1]:arr[j], arr[j+1] = arr[j+1], arr[j]选择排序选择排序的思想是每次选择数组中的最小元素，并将其放置在已排序部分的末尾。

它的时间复杂度为O(n^2)。

def selectionSort(arr):n = len(arr)for i in range(n):min_index = ifor j in range(i+1, n):if arr[j] < arr[min_index]:min_index = jarr[i], arr[min_index] = arr[min_index], arr[i]3.2 动态规划动态规划是一种解决复杂问题的优化技术。

the c programming language 英文版

C语言圣经：C语言基础知识与高级应用《The C Programming Language》英文版是由Kernighan和Ritchie 合著的一本介绍C语言的经典教材。

该书于1978年首次出版，一经推出便成为学习C语言的必读教材之一。

该书共分为十章，涵盖了C语言的基本语法、数据类型、控制结构、函数、指针、文件和标准库等内容。

作者在书中详细解释了C语言的语法细节，并通过大量的例程和案例分析帮助读者更好地理解和掌握C语言编程。

此外，《The C Programming Language》还具有以下特点：
言简意赅：作者在书中使用了简洁明了的文字和代码示例来解释C语言的概念和语法，使得读者能够快速理解和掌握C语言编程。

案例丰富：书中提供了大量的例程和案例分析，这些案例涵盖了C语言的各种应用场景，包括数值计算、字符串处理、文件操作等，使得读者能够更好地理解和应用C语言编程。

强调实践：该书强调实践编程的重要性，鼓励读者自己动手编写程序并调试运行。

书中还提供了大量的编程练习和思考题，帮助读者巩固所学知识并提高编程能力。

适合初学者：该书从C语言的基本语法和语义开始讲解，逐步深入到高级主题，使得初学者能够逐步掌握C语言编程。

同时，书中还提供了许多实用的编程技巧和注意事项，帮助读者避免在编程过程中出现错误。

总之，《The C Programming Language》是一本非常优秀的C语言
教材，适合初学者和有一定经验的C语言开发者阅读。

通过阅读该书，读者可以快速掌握C语言编程的核心概念和技术，提高自己的编程能力和水平。

c++ 信奥赛常用英语

c++ 信奥赛常用英语在C++ 信奥赛中（计算机奥林匹克竞赛），常用英语词汇主要包括以下几方面：1. 基本概念：- Algorithm（算法）- Data structure（数据结构）- Programming language（编程语言）- C++（C++ 编程语言）- Object-oriented（面向对象）- Function（函数）- Variable（变量）- Constants（常量）- Loops（循环）- Conditional statements（条件语句）- Operators（运算符）- Control structures（控制结构）- Memory management（内存管理）2. 常用算法与数据结构：- Sorting algorithms（排序算法）- Searching algorithms（搜索算法）- Graph algorithms（图算法）- Tree algorithms（树算法）- Dynamic programming（动态规划）- Backtracking（回溯）- Brute force（暴力破解）- Divide and conquer（分治）- Greedy algorithms（贪心算法）- Integer array（整数数组）- Linked list（链表）- Stack（栈）- Queue（队列）- Tree（树）- Graph（图）3. 编程实践：- Code optimization（代码优化）- Debugging（调试）- Testing（测试）- Time complexity（时间复杂度）- Space complexity（空间复杂度）- Input/output（输入/输出）- File handling（文件处理）- Console output（控制台输出）4. 竞赛相关：- IOI（国际信息学奥林匹克竞赛）- NOI（全国信息学奥林匹克竞赛）- ACM-ICPC（ACM 国际大学生程序设计竞赛）- Codeforces（代码力）- LeetCode（力扣）- HackerRank（黑客排名）这些英语词汇在信奥赛领域具有广泛的应用，掌握这些词汇有助于提高选手之间的交流效率，同时对提升编程能力和竞赛成绩也有很大帮助。

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

6.046J Introduction to Algorithms, Fall 2005Please use the following citation format:Erik Demaine and Charles Leiserson, 6.046J Introduction toAlgorithms, Fall 2005. (Massachusetts Institute of Technology: MITOpenCourseWare). (accessed MM DD, YYYY).License: Creative Commons Attribution-Noncommercial-Share Alike. Note: Please use the actual date you accessed this material in your citation.For more information about citing these materials or our Terms of Use, visit: /terms6.046J Introduction to Algorithms, Fall 2005Transcript – Lecture 15So, the topic today is dynamic programming. The term programming in the name of this term doesn't refer to computer programming. OK, programming is an old word that means any tabular method for accomplishing something. So, you'll hear about linear programming and dynamic programming. Either of those, even though we now incorporate those algorithms in computer programs, originally computer programming, you were given a datasheet and you put one line per line of code as a tabular method for giving the machine instructions as to what to do.OK, so the term programming is older. Of course, and now conventionally when you see programming, you mean software, computer programming. But that wasn't always the case. And these terms continue in the literature. So, dynamic programming is a design technique like other design techniques we've seen such as divided and conquer. OK, so it's a way of solving a class of problems rather than a particular algorithm or something. So, we're going to work through this for the example of so-called longest common subsequence problem, sometimes called LCS, OK, which is a problem that comes up in a variety of contexts.And it's particularly important in computational biology, where you have long DNA strains, and you're trying to find commonalities between two strings, OK, one which may be a genome, and one may be various, when people do, what is that thing called when they do the evolutionary comparisons? The evolutionary trees, yeah, right, yeah, exactly, phylogenetic trees, there you go, OK, phylogenetic trees. Good, so here's the problem. So, you're given two sequences, x going from one to m, and y running from one to n. You want to find a longest sequence common to both. OK, and here I say a, not the, although it's common to talk about the longest common subsequence. Usually the longest comment subsequence isn't unique. There could be several different subsequences that tie for that. However, people tend to,it's one of the sloppinesses that people will say. I will try to say a, unless it's unique. But I may slip as well because it's just such a common thing to just talk about the, even though there might be multiple. So, here's an example. Suppose x is this sequence, and y is this sequence.So, what is a longest common subsequence of those two sequences? See if you can just eyeball it. AB: length two? Anybody have one longer? Excuse me? BDB, BDB. BDAB, BDAB, BDAB, anything longer? So, BDAB: that's the longest one. Is there another one that's the same length? Is there another one that ties? BCAB, BCAB, another one? BCBA, yeah, there are a bunch of them all of length four. There isn't one of length five. OK, we are actually going to come up with an algorithm that, ifit's correct, we're going to show it's correct, guarantees that there isn't one of length five. So all those are, we can say, any one of these is the longest comment subsequence of x and y. We tend to use it this way using functional notation, but it's not a function that's really a relation.So, we'll say something is an LCS when really we only mean it's an element, if you will, of the set of longest common subsequences. Once again, it's classic abusive notation. As long as we know what we mean, it's OK to abuse notation. What wecan't do is misuse it. But abuse, yeah! Make it so it's easy to deal with. But you have to know what's going on underneath. OK, so let's see, so there's a fairly simple brute force algorithm for solving this problem.And that is, let's just check every, maybe some of you did this in your heads, subsequence of x from one to m to see if it's also a subsequence of y of one to n. So, just take every subsequence that you can get here, check it to see if it's in there. So let's analyze that. So, to check, so if I give you a subsequence of x, how long does it take you to check whether it is, in fact, a subsequence of y? So, I give you something like BCAB. How long does it take me to check to see if it's a subsequence of y? Length of y, which is order n. And how do you do it?Yeah, you just scan. So as you hit the first character that matches, great. Now, if you will, recursively see whether the suffix of your string matches the suffix of x. OK, and so, you are just simply walking down the tree to see if it matches. You're walking down the string to see if it matches. OK, then the second thing is, then how many subsequences of x are there? Two to the n? x just goes from one to m, two to the m subsequences of x, OK, two to the m. Two to the m subsequences of x, OK, one way to see that, you say, well, how many subsequences are there of something there?If I consider a bit vector of length m, OK, that's one or zero, just every position where there's a one, I take out, that identifies an element that I'm going to take out. OK, then that gives me a mapping from each subsequence of x, from each bit vector to a different subsequence of x. Now, of course, you could have matching characters there, that in the worst case, all of the characters are different.OK, and so every one of those will be a unique subsequence. So, each bit vector of length m corresponds to a subsequence. That's a generally good trick to know. So, the worst-case running time of this method is order n times two to the m, which is, since m is in the exponent, is exponential time. And there's a technical term that we use when something is exponential time. Slow: good. OK, very good. OK, slow, OK, so this is really bad. This is taking a long time to crank out how long the longest common subsequence is because there's so many subsequences. OK, so we're going to now go through a process of developing a far more efficient algorithm for this problem. OK, and we're actually going to go through several stages.The first one is to go through simplification stage. OK, and what we're going to do is look at simply the length of the longest common sequence of x and y. And then what we'll do is extend the algorithm to find the longest common subsequence itself. OK, so we're going to look at the length. So, simplify the problem, if you will, to just try to compute the length. What's nice is the length is unique. OK, there's only going to be one length that's going to be the longest. OK, and what we'll do is just focus on the problem of computing the length. And then we'll do is we can back up from that and figure out what actually is the subsequence that realizes that length.OK, and that will be a big simplification because we don't have to keep track of a lot of different possibilities at every stage. We just have to keep track of the one number, which is the length. So, it's sort of reduces it to a numerical problem. We'll adopt the following notation. It's pretty standard notation, but I just want, if I putabsolute values around the string or a sequence, it denotes the length of the sequence, S. OK, so that's the first thing. The second thing we're going to do is, actually, we're going to, which takes a lot more insight when you come up with a problem like this,and in some sense, ends up being the hardest part of designing a good dynamic programming algorithm from any problem, which is we're going to actually look not at all subsequences of x and y, but just prefixes. OK, we're just going to look at prefixes and we're going to show how we can express the length of the longest common subsequence of prefixes in terms of each other. In particular, we're going to define c of ij to be the length, the longest common subsequence of the prefix of x going from one to i, and y of going to one to j.And what we are going to do is we're going to calculate c[i,j] for all ij. And if we do that, how then do we solve the problem of the longest common of sequence of x and y? How do we solve the longest common subsequence? Suppose we've solved this for all I and j. How then do we compute the length of the longest common subsequence of x and y? Yeah, c[m,n], that's all, OK? So then, c of m, n is just equal to the longest common subsequence of x and y, because if I go from one to n, I'm done, OK? And so, it's going to turn out that what we want to do is figure out how to express to c[m,n], in general, c[i,j], in terms of other c[i,j].So, let's see how we do that. OK, so our theorem is going to say that c[i,j] is just --OK, it says that if the i'th character matches the j'th character, then i'th character of x matches the j'th character of y, then c of ij is just c of I minus one, j minus one plus one. And if they don't match, then it's either going to be the longer of c[i, j-1], and c[i-1, j], OK? So that's what we're going to prove. And that's going to give us a way of relating the calculation of a given c[i,j] to values that are strictly smaller, OK, that is at least one of the arguments is smaller of the two arguments. OK, and that's going to give us a way of being able, then, to understand how to calculate c[i,j]. So, let's prove this theorem. So, we'll start with a case x[i] equals y of j. And so,let's draw a picture here. So, we have x here. And here is y. OK, so here's my sequence, x, which I'm sort of drawing as this elongated box, sequence y, and I'm saying that x[i] and y[j], those are equal. OK, so let's see what that means. OK, so let's let z of one to k be, in fact, the longest common subsequence of x of one to i, y of one to j, where c of ij is equal to k.OK, so the longest common subsequence of x and y of one to I and y of one to j has some value. Let's call it k. And so, let's say that we have some sequence which realizes that. OK, we'll call it z. OK, so then, can somebody tell me what z of k is? What is z of k here? Yeah, it's actually equal to x of I, which is also equal to y of j? Why is that? Why couldn't it be some other value?Yeah, so you got the right idea. So, the idea is, suppose that the sequence didn't include this element here at the last element, the longest common subsequence. OK, so then it includes a bunch of values in here, and a bunch of values in here, same values. It doesn't include this or this. Well, then I could just tack on this extra character and make it be longer, make it k plus one because these two match. OK, so if the sequence ended before ----just extend it by tacking on x[i]. OK, it would be fairly simple to just tack on x[i]. OK, so if that's the case, then if I look at z going one up to k minus one, that'scertainly a common sequence of x of 1 up to, excuse me, of up to i minus one. And, y of one up to j minus one, OK, because this is a longest common sequence. z is a longest common sequence is, from x of one to i, y of one to j.And, we know what the last character is. It's just x[i], or equivalently, y[j]. So therefore, everything except the last character must at least be a common sequence of x of one to i minus one, y of one to j minus one. Everybody with me? It must be a comment sequence. OK, now, what you also suspect? What do you also suspect about z of one to k? It's a common sequence of these two. Yeah? Yeah, it's a longest common sequence.So that's what we claim, z of one up to k minus one is in fact a longest common subsequence of x of one to i minus one, and y of one to j minus one, OK? So, let's prove that claim. So, we'll just have a little diversion to prove the claim. OK, so suppose that w is a longer comment sequence, that is, that the length, the w, is bigger than k minus one. OK, so suppose we have a longer comment sequence than z of one to k minus one. So, it's got to have length that's bigger than k minus one if it's longer. OK, and now what we do is we use a classic argument you're going to see multiple times, not just this week, which it will be important for this week, but through several lectures. Hence, it's called a cut and paste argument.So, the idea is let's take a look at w, concatenate it with that last character, z of k. so, this is string, OK, so that's just my terminology for string concatenation. OK, so I take whatever I claimed was a longer comment subsequence, and I concatenate z of k to it. OK, so that is certainly a common sequence of x of one to I minus one, and y of one to j. And it has length bigger than k because it's basically, what is its length? The length of w is bigger than k minus one. I add one character. So, this combination here, now, has length bigger that k.OK, and that's a contradiction, thereby proving the claim. So, I'm simply saying, I claim this. Suppose you have a longer one. Well, let me show, if I had a longer common sequence for the prefixes where we dropped the character from both strings if it was longer there, but we would have made the whole thing longer. So that can't be. So, therefore, this must be a longest common subsequence, OK? Questions? Because you are going to need to be able to do this kind of proof ad nauseam, almost.So, if there any questions, let them at me, people. OK, so now what we have established is that z one through k is a longest common subsequence of the two prefixes when we drop the last character. So, thus, we have c of i minus one, j minus one is equal to what? What's c of i minus one, j minus one? k minus one; thank you. Let's move on with the class, right, OK, which implies that c of ij is just equal to c of I minus one, j minus one plus one. So, it's fairly straightforward if you think about what's going on there. It's not always as straightforward in some problems as it is for longest common subsequence.The idea is, so I'm not going to go through the other cases. They are similar. But, in fact, we've hit on one of the two hallmarks of dynamic programming. So, by hallmarks, I mean when you see this kind of structure in a problem, there's a good chance that dynamic programming is going to work as a strategy. The dynamic programming hallmark is the following. This is number one. And that is the property of optimal substructure. OK, what that says is an optimal solution to a problem, andby this, we really mean problem instance. But it's tedious to keep saying problem instance.A problem is generally, in computer science, viewed as having an infinite number of instances typically, OK, so sorting is a problem. A sorting instance is a particular input. OK, so we're really talking about problem instances, but I'm just going to say problem, OK? So, when you have an optimal solution to a problem, contains optimal solutions to subproblems. OK, and that's worth drawing a box around because it's so important. OK, so here, for example, if z is a longest common subsequence of x and y,OK, then any prefix of z is a longest common subsequence of a prefix of x, and a prefix of y, OK? So, this is basically what it says. I look at the problem, and I can see that there is optimal substructure going on. OK, in this case, and the idea is that almost always, it means that there's a cut and paste argument you could do to demonstrate that, OK, that if the substructure were not optimal, then you'd be able to find a better solution to the overall problem using cut and paste.OK, so this theorem, now, gives us a strategy for being able to compute longest comment subsequence. Here's the code; oh wait. OK, so going to ignore base cases in this, if --And we will return the value of the longest common subsequence. It's basically just implementing this theorem. OK, so it's either the longest comment subsequence if they match. It's the longest comment subsequence of one of the prefixes where you drop that character for both strengths and add one becausethat's the matching one. Or, you drop a character from x, and it's the longest comment subsequence of that. Or you drop a character from y, whichever one of those is longer. That ends up being the longest comment subsequence.OK, so what's the worst case for this program? What's going to happen in the worst case? Which of these two clauses is going to cause us more headache? The second clause: why the second clause? Yeah, you're doing two LCS sub-calculations here. Here, you're only doing one. Not only that, but you get to decrement both indices, whereas here you've basically got to, you only get to decrement one index, andyou've got to calculate two of them. So that's going to generate the tree.So, and the worst case, x of i is not equal to x of j for all i and j. So, let's draw a recursion tree for this program to sort of get an understanding as to what is going on to help us. And, I'm going to do it with m equals seven, and n equals six. OK, so we start up the top with my two indices being seven and six. And then, in the worst case, we had to execute these. So, this is going to end up being six, six, and seven, five for indices after the first call.And then, this guy is going to split. And he's going to produce five, six here, decrement the first index, I. And then, if I keep going down here, we're going to get four, six and five, five. And these guys keep extending here. I get six five, five five, six four, OK? Over here, I'm going to get decrement the first index, six five, and I get five five, six four, and these guys keep going down. And over here, I get seven four. And then we get six four, seven three, and those keep going down. So, we keep just building this tree out.OK, so what's the height of this tree? Not of this one for the particular value of m and n, but in terms of m and n. What's the height of this tree? It's the max of m and n. You've got the right, it's theta of the max. It's not the max. Max would be, in thiscase, you're saying it has height seven. But, I think you can sort of see, for example, along a path like this that, in fact, I've only, after going three levels, reduced m plus n, good, very good, m plus n.So, height here is m plus n. OK, and its binary. So, the height: that implies the work is exponential in m and n. All that work, and are we any better off than the brute force algorithm? Not really. And, our technical term for this is slow. OK, and we like speed. OK, we like fast. OK, but I'm sure that some of you have observed something interesting about this tree. Yeah, there's a lot of repeated work here. Right, there's a lot of repeated work.In particular, this whole subtree, and this whole subtree, OK, they are the same. That's the same subtree, the same subproblem that you are solving. OK, you can even see over here, there is even similarity between this whole subtree and this whole subtree. OK, so there's lots of repeated work. OK, and one thing is, if you want to do things fast, don't keep doing the same thing. OK, don't keep doing the same thing. When you find you are repeating something, figure out a way of not doing it. So, that brings up our second hallmark for dynamic programming.And that's a property called overlapping subproblems, OK? OK, recursive solution contains many, excuse me, contains a small number of distinct subproblems repeated many times. And once again, this is important enough to put a box around.I don't put boxes around too many things. Maybe I should put our boxes around things. This is definitely one to put a box around, OK? So, for example, so here we have a recursive solution. This tree is exponential in size. It's two to the m plus n in height, in size, in the total number of problems if I actually implemented like that. But how many distinct subproblems are there?m times n, OK? So, the longest comment subsequence, the subproblem space contains m times n, distinct subproblems. OK, and then this is a small number compared with two to the m plus n, or two to the n, or two to the m, or whatever. OK, this is small, OK, because for each subproblem, it's characterized by an I and a j. An I goes from one to m, and j goes from one to n, OK? There aren't that many different subproblems.It's just the product of the two. So, here's an improved algorithm, which is often a good way to solve it. It's an algorithm called a memo-ization algorithm. And, this is memo-ization, not memorization because what you're going to do is make a little memo whenever you solve a subproblem. Make a little memo that says I solved this already. And if ever you are asked for it rather than recalculating it, say, oh, I see that. I did that before. Here's the answer, OK? So, here's the code. It's very similar to that code.So, it basically keeps a table around of c[i,j]. It says, what we do is we check. If the entry for c[i,j] is nil, we haven't computed it, then we compute it. And, how do we compute it? Just the same way we did before. OK, so this whole part here, OK, is exactly what we have had before. It's the same as before. And then, we just return c[i,j]. If we don't bother to keep recalculating, OK, so if it's nil, we calculate it. Otherwise, we just return it. It's not calculated, calculate and return it. Otherwise, just return it: OK, pretty straightforward code. OK.OK, now the tricky thing is how much time does it take to execute this? This takes a little bit of thinking. Yeah? Yeah, it takes order MN. OK, why is that? Yeah, but Ihave to look up c[i,j]. I might call c[i,j] a bunch of times. When I'm doing this, I'm still calling it recursively. Yeah, so you have to, so each recursive call is going to look at, and the worst-case, say, is going to look at the max of these two things. Well, this is going to involve a recursive call, and a lookup.So, this might take a fair amount of effort to calculate. I mean, you're right, and your intuition is right. Let's see if we can get a more precise argument, why this is taking order m times n. What's going on here? Because not every time I call this is it going to just take me a constant amount of work to do this. Sometimes it's going to take me a lot of work. Sometimes I get lucky, and I return it.So, your intuition is dead on. It's dead on. We just need a little bit more articulate explanation, so that everybody is on board. Try again? Good, at most three times, yeah. OK, so that's one way to look at it. Yeah. There is another way to look at it that's kind of what you are expressing there is an amortized, a bookkeeping, way of looking at this. What's the amortized cost? You could say what the amortized cost of calculating one of these, where basically whenever I call it, I'm going to charge a constant amount for looking up. And so, I could get to look up whatever is in here to call the things.But if it, in fact, so in some sense, this charge here, of calling it and returning it, etc., I charged that to my caller. OK, so I charged these lines and this line to the caller. And I charge the rest of these lines to the c[i,j] element. And then, the point is that every caller basically only ends up being charged for a constant amount of stuff. OK, to calculate one c[i,j], it's only an amortized constant amount of stuff that I'm charging to that calculation of i and j, that calculation of i and j. OK, so you can view it in terms of amortized analysis doing a bookkeeping argument that just says, let me charge enough to calculate my own, do all my own local things plus enough to look up the value in the next level and get it returned.OK, and then if it has to go off and calculate, well, that's OK because that's all been charged to a different ij at that point. So, every cell only costs me a constant amount of time that order MN cells total of order MN. OK: constant work per entry. OK, and you can sort of use an amortized analysis to argue that. How much space does it take? We haven't usually looked at space, but here we are going to start looking at space. That turns out, for some of these algorithms, to be really important. How much space do I need, storage space? Yeah, also m times n, OK, to store the c[i,j] table. OK, the rest, storing x and y, OK, that's just m plus n.So, that's negligible, but mostly I need the space m times n. So, this memo-ization type algorithm is a really good strategy in programming for many things where, when you have the same parameters, you're going to get the same results. Itdoesn't work in programs where you have a side effect, necessarily, that is, when the calculation for a given set of parameters might be different on each call.But for something which is essentially like a functional programming type of environment, then if you've calculated it once, you can look it up. And, so this is very helpful. But, it takes a fair amount of space, and it also doesn't proceed in a very orderly way. So, there is another strategy for doing exactly the same calculation in a bottom-up way. And that's what we call dynamic programming.OK, the idea is to compute the table bottom-up. I think I'm going to get rid of, I think what we'll do is we'll just use, actually I think what I'm going to do is use thisboard. OK, so here's the idea. What we're going to do is look at the c[i,j] table and realize that there's actually an orderly way of filling in the table. This is sort of a top-down with memo-ization. OK, but there's actually a way we can do it bottom up. So, here's the idea. So, let's make our table.OK, so there's x. And then, there's y. And, I'm going to initialize the empty string. I didn't cover the base cases for c[i,j], but c of zero meaning a prefix with no elements in it. The prefix of that with anything else, the length is zero. OK, so that's basically how I'm going to bound the borders here. And now, what I can do is just use my formula, which I've conveniently erased up there, OK, to compute what is the longest common subsequence, length of the longest comment subsequence from this character in y, and this character in x up to this character.So here, for example, they don't match. So, it's the maximum of these two values. Here, they do match. OK, so it says it's one plus the value here. And, I'm going to draw a line. Whenever I'm going to get a match, I'm going to draw a line like that, indicating that I had that first case, the case where they had a good match. And so, all I'm doing is applying that recursive formula from the theorem that we proved. So here, it's basically they don't match. So, it's the maximum of those two. Here, they match. So, it's one plus that guy. Here, they don't match.So, it's basically the maximum of these two. Here, they don't match. So it's the maximum. So, it's one plus that guy. So, everybody understand how I filled out that first row? OK, well that you guys can help. OK, so this one is what? Just call it out. Zero, good. One, because it's the maximum, one, two, right. This one, now, gets from there, two, two. OK, here, zero, one, because it's the maximum of those two. Two, two, two, good. One, one, two, two, two, three, three. One, two, three, get that line, three, four, OK. One there, three, three, four, good, four. OK, and our answer: four. So this is blindingly fast code if you code this up, OK, because it gets to use the fact that modern machines in particular do very well on regular strides through memory. So, if you're just plowing through memory across like this, OK, and your two-dimensional array is stored in that order, which it is, otherwise you go this way, stored in that order. This can really fly in terms of the speed of the calculation. So, how much time did it take us to do this? Yeah, order MN, theta MN. Yeah? We'll talk about space in just a minute. OK, so hold that question. Good question, good question, already, wow, good, OK, how do I now figure out, remember, we had the simplification. We were going to just calculate the length. OK, it turns out I can now figure out a particular sequence that matches it. And basically, I do that.I can reconstruct the longest common subsequence by tracing backwards. So essentially I start here. Here I have a choice because this one was dependent on, since it doesn't have a bar here, it was dependent on one of these two. So, let me go this way. OK, and now I have a diagonal element here. So what I'll do is simply mark the character that appeared in those positions as I go this way. I have three here. And now, let me keep going, three here, and now I have another one. So that means this character gets selected.And then I go up to here, OK, and then up to here. And now I go diagonally again, which means that this character is selected. And I go to here, and then I go here. And then, I go up here and this character is selected. So here is my longest common subsequence. And this was just one path back. I could have gone a path like this and。