Game Theory (Original Harvard University Course Materials)

Game Theory: Drew Fudenberg

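Drew Fudenberg is a professor of economics at Harvard University whose main research fields are game theory and dynamic economics.

He co-authored the book Game Theory with Jean Tirole. It is a widely used textbook for graduate students and advanced undergraduates in economics who are studying game theory, and an essential reference for other readers interested in the subject.

Fudenberg's research in game theory covers many types of games, including cooperative games, non-cooperative games, repeated games, and mechanism design.

His results are widely applied in economics, computer science, political science, and other fields.

Besides game theory, Fudenberg has also done in-depth research in dynamic economics, on questions such as market failure, economic growth, and industrial organization.

His results provide an important theoretical basis for understanding modern economic phenomena.

Overall, Fudenberg's research in game theory and dynamic economics offers theoretical guidance and decision support to both academics and practitioners.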

Print-ready edition for English study: Game Theory open course (Yale University, ECON 159), subtitle transcript of Lectures 1 to 5

ECON-159: GAME THEORY

Chapter 1. What Is Strategy? [00:00:00]

Professor Ben Polak: So this is Game Theory Economics 159. If you're here for art history, you're either in the wrong room or stay anyway, maybe this is the right room; but this is Game Theory, okay. You should have four handouts; everyone should have four handouts. There is a legal release form--we'll talk about it in a minute--about the videoing. There is a syllabus, which is a preliminary syllabus: it's also online. And there are two games labeled Game 1 and Game 2. Can I get you all to look at Game 1 and start thinking about it. And while you're thinking about it, I am hoping you can multitask a bit. I'll describe a bit about the class and we'll get a bit of admin under our belts. But please try and look at--somebody's not looking at it, because they're using it as a fan here--so look at Game 1 and fill out that form for me, okay?

So while you're filling that out, let me tell you a little bit about what we're going to be doing here. So what is Game Theory? Game Theory is a method of studying strategic situations. So what's a strategic situation? Well let's start off with what's not a strategic situation. In your Economics - in your Intro Economics class in 115 or 110, you saw some pretty good examples of situations that were not strategic. You saw firms working in perfect competition. Firms in perfect competition are price takers: they don't particularly have to worry about the actions of their competitors. You also saw firms that were monopolists and monopolists don't have any competitors to worry about, so that's not a particularly strategic situation. They're not price takers but they take the demand curve. Is this looking familiar for some of you who can remember doing 115 last year or maybe two years ago for some of you? Everything in between is strategic. So everything that constitutes imperfect competition is a strategic setting. Think about the motor industry, the motor car industry.
Ford has to worry about what GM is doing and what Toyota is doing, and for the moment at least what Chrysler is doing but perhaps not for long. So there's a small number of firms and their actions affect each other.

So for a literal definition of what strategic means: it's a setting where the outcomes that affect you depend on actions, not just on your own actions, but on actions of others. All right, that's as much as I'm going to say for preview right now, we're going to come back and see plenty of this over the course of the next semester.

Chapter 2. Strategy: Where Does It Apply? [00:02:16]

So what I want to do is get on to where this applies. It obviously applies in Economics, but it also applies in politics, and in fact, this class will count as a Political Science class if you're a Political Science major. You should go check with the DUS in Political Science. It count - Game Theory is very important in law these days. So for those of you--for the half of you--that are going to end up in law school, this is pretty good training. Game Theory is also used in biology and towards the middle of the semester we're actually going to see some examples of Game Theory as applied to evolution. And not surprisingly, Game Theory applies to sport.

Lecture 1 - Introduction: Five First Lessons [September 5, 2007]

Chapter 3. (Administrative Issues) [00:02:54]

So let's talk about a bit of admin. How are you doing on filling out those games? Everyone managing to multitask: filling in Game 1? Keep writing. I want to get some admin out of the way and I want to start by getting out of the way what is obviously the elephant in the room. Some of you will have noticed that there's a camera crew here, okay. So as some of you probably know, Yale is undergoing an open education project and they're videoing several classes, and the idea of this, is to make educational materials available beyond the walls of Yale.
In fact, on the web, internationally, so people in places, maybe places in the U.S. or places miles away, maybe in Timbuktu or whatever, who find it difficult to get educational materials from the local university or whatever, can watch certain lectures from Yale on the web.

Some of you would have been in classes that do that before. What's going to be different about this class is that you're going to be participating in it. The way we teach this class is we're going to play games, we're going to have discussions, we're going to talk among the class, and you're going to be learning from each other, and I want you to help people watching at home to be able to learn too. And that means you're going to be on film, at the very least on mike.

So how's that going to work? Around the room are three T.A.s holding mikes. Let me show you where they are: one here, one here, and one here. When I ask for classroom discussions, I'm going to have one of the T.A.s go to you with a microphone much like in "Donahue" or something, okay. At certain times, you're going to be seen on film, so the camera is actually going to come around and point in your direction.

Now I really want this to happen. I had to argue for this to happen, cause I really feel that this class isn't about me. I'm part of the class obviously, but it's about you teaching each other and participating. But there's a catch, the catch is, that that means you have to sign that legal release form.

So you'll see that you have in front of you a legal release form, you have to be able to sign it, and what that says is that we can use you being shown in class. Think of this as a bad hair day release form. All right, you can't sue Yale later if you had a bad hair day.
For those of you who are on the run from the FBI, your Visa has run out, or you're sitting next to your ex-girlfriend, now would be a good time to put a paper bag over your head.

All right, now just to get you used to the idea, in every class we're going to have I think the same two people, so Jude is the cameraman; why don't you all wave to Jude: this is Jude okay. And Wes is our audio guy: this is Wes. And I will try and remember not to include Jude and Wes in the classroom discussions, but you should be aware that they're there. Now, if this is making you nervous, if it's any consolation, it's making me very nervous.

So, all right, we'll try and make this class work as smoothly as we can, allowing for this extra thing. Let me just say, no one's making any money off this--at least I'm hoping these guys are being paid--but me and the T.A.s are not being paid. The aim of this, that I think is a good aim, it's an educational project, and I'm hoping you'll help us with it. The one difference it is going to mean, is that at times I might hold some of the discussions for the class, coming down into this part of the room, here, to make it a little easier for Jude.

All right, how are we doing now on filling out those forms? Has everyone filled in their strategy for the first game? Not yet. Okay, let's go on doing a bit more admin. The thing you mostly care about I'm guessing, is the grades. All right, so how is the grade going to work for this class? 30% of the class will be on problem sets, 30% of the grade; 30% on the mid-term, and 40% on the final; so 30/30/40.

The mid-term will be held in class on October 17th; that is also in your syllabus. Please don't anybody tell me late - any time after today you didn't know when the mid-term was and therefore it clashes with 17 different things. The mid-term is on October 17th, which is a Wednesday, in class.
All right, the problem sets: there will be roughly ten problem sets and I'll talk about them more later on when I hand them out. The first one will go out on Monday but it will be due ten days later. Roughly speaking they'll be every week.

The grade distribution: all right, so this is the rough grade distribution. Roughly speaking, a sixth of the class are going to end up with A's, a sixth are going to end up with A-, a sixth are going to end up with B+, a sixth are going to end up with B, a sixth are going to end up with B-, and the remaining sixth, if I added that up right, are going to end up with what I guess we're now calling the presidential grade, is that right?

That's not literally true. I'm going to squeeze it a bit, I'm going to curve it a bit, so actually slightly fewer than a sixth will get straight A's, and fewer than a sixth will get C's and below. We'll squeeze the middle to make them be more B's. One thing I can guarantee from past experience in this class, is that the median grade will be a B+. The median will fall somewhere in the B+'s. Just as forewarning for people who have forgotten what a median is, that means half of you--not approximately half, it means exactly half of you--will be getting something like B+ and below and half will get something like B+ and above.

Now, how are you doing in filling in the forms? Everyone filled them in yet? Surely must be pretty close to getting everyone filled in. All right, so last things to talk about before I actually collect them in - textbooks. There are textbooks for this class. The main textbook is this one, Dutta's book Strategy and Games. If you want a slightly tougher book, more rigorous book, try Joel Watson's book, Strategies. Both of those books are available at the bookstore.

But I want to warn everybody ahead of time, I will not be following the textbook. I regard these books as safety nets.
If you don't understand something that happened in class, you want to reinforce an idea that came up in class, then you should read the relevant chapters in the book and the syllabus will tell you which chapters to read for each class, or for each week of class, all right. But I will not be following these books religiously at all. In fact, they're just there as back up.

In addition, I strongly recommend people read, Thinking Strategically. This is good bedtime reading. Do any of you suffer from insomnia? It's very good bedtime reading if you suffer from insomnia. It's a good book and what's more there's going to be a new edition of this book this year and Norton have allowed us to get advance copies of it. So if you don't buy this book this week, I may be able to make the advance copy of the new edition available for some of you next week. I'm not taking a cut on that either, all right, there's no money changing hands.

All right, sections are on the syllabus sign up - sorry on the website, sign up as usual. Put yourself down on the wait list if you don't get into the section you want. You probably will get into the section you want once we're done.

Chapter 4. Elements of a Game: Strategies, Actions, Outcomes and Payoffs [00:09:40]

All right, now we must be done with the forms. Are we done with the forms? All right, so why don't we send the T.A.s, with or without mikes, up and down the aisles and collect in your Game #1; not Game #2, just Game #1.

Just while we're doing that, I think the reputation of this class--I think--if you look at the course evaluations online or whatever, is that this class is reasonably hard but reasonably fun. So I'm hoping that's what the reputation of the class is. If you think this class is going to be easy, I think it isn't actually an easy class. It's actually quite a hard class, but I think I can guarantee it's going to be a fun class.
Now one reason it's a fun class, is the nice thing about teaching Game Theory - quieten down folks--one thing about teaching Game Theory is, you get to play games, and that's exactly what we've just been doing now. This is our first game and we're going to play games throughout the course, sometimes several times a week, sometimes just once a week.

We got all these things in? Everyone handed them in? So I need to get those counted. Has anyone taken the Yale Accounting class? No one wants to - has aspirations to be - one person has. I'll have a T.A. do it, it's all right, we'll have a T.A. do it. So Kaj, can you count those for me? Is that right? Let me read out the game you've just played.

"Game 1, a simple grade scheme for the class. Read the following carefully. Without showing your neighbor what you are doing, put it in the box below either the letter Alpha or the letter Beta. Think of this as a grade bid. I will randomly pair your form with another form and neither you nor your pair will ever know with whom you were paired. Here's how the grades may be assigned for the class. [Well they won't be, but we can pretend.] If you put Alpha and you're paired with Beta, then you will get an A and your pair a C. If you and your pair both put Alpha, you'll both get B-. If you put Beta and you're paired with Alpha, you'll get a C and your pair an A. If you and your pair both put Beta, then you'll both get B+."

So that's the thing you just filled in.

Now before we talk about this, let's just collect this information in a more useful way. So I'm going to remove this for now. We'll discuss this in a second, but why don't we actually record what the game is, that we're playing, first. So this is our grade game, and what I'm going to do, since it's kind of hard to absorb all the information just by reading a paragraph of text, I'm going to make a table to record the information.
So what I'm going to do is I'm going to put me here, and my pair, the person I'm randomly paired with here, and Alpha and Beta, which are the choices I'm going to make here and on the columns Alpha and Beta, the choices my pair is making.

In this table, I'm going to put my grades. So my grade if we both put Alpha is B-, if we both put Beta, was B+. If I put Alpha and she put a Beta, I got an A, and if I put Beta and she put an Alpha, I got a C. Is that correct? That's more or less right? Yeah, okay while we're here, why don't we do the same for my pair? So this is my grades on the left hand table, but now let's look at what my pair will do, what my pair will get.

So I should warn the people sitting at the back that my handwriting is pretty bad, that's one reason for moving forward. The other thing I should apologize at this stage of the class is my accent. I will try and improve the handwriting, there's not much I can do about the accent at this stage.

So once again if you both put Alpha then my pair gets a B-. If we both put Beta, then we both get a B+; in particular, my pair gets a B+. If I put Alpha and my pair puts Beta, then she gets a C. And if I put Beta and she puts Alpha, then she gets an A. So I now have all the information that was on the sheet of paper that you just handed in.

Now there's another way of organizing this that's standard in Game Theory, so we may as well get used to it now on the first day. Rather than drawing two different tables like this, what I'm going to do is I'm going to take the second table and super-impose it on top of the first table. Okay, so let me do that and you'll see what I mean. What I'm going to do is draw a larger table, the same basic structure: I'm choosing Alpha and Beta on the rows, my pair is choosing Alpha and Beta on the columns, but now I'm going to put both grades in. So the easy ones are on the diagonal: you both get B- if we both choose Alpha; we both get B+ if we both choose Beta.
But if I choose Alpha and my pair chooses Beta, I get an A and she gets a C. And if I choose Beta and she chooses Alpha, then it's me who gets the C and it's her who gets the A.

So notice what I did here. The first grade corresponds to the row player, me in this case, and the second grade in each box corresponds to the column player, my pair in this case. So this is a nice succinct way of recording what was in the previous two tables. This is an outcome matrix; this tells us everything that was in the game.

Okay, so now seems a good time to start talking about what people did. So let's just have a show of hands. How many of you chose Alpha? Leave your hands up so that Jude can catch that, so people can see at home, okay. All right and how many of you chose Beta? There's far more Alphas - wave your hands the Beta's okay. All right, there's a Beta here, okay. So it looks like a lot of - well we're going to find out, we're going to count--but a lot more Alpha's than Beta's. Let me try and find out some reasons why people chose.

So let me have the Alpha's up again. So, the woman who's in red here, can we get a mike to the - yeah, is it okay if we ask you? You're not on the run from the FBI? We can ask you why? Okay, so you chose Alpha right? So why did you choose Alpha?

Student: [inaudible] realized that my partner chose Alpha, therefore I chose [inaudible].

Professor Ben Polak: All right, so you wrote out these squares, you realized what your partner was going to do, and responded to that. Any other reasons for choosing Alpha around the room? Can we get the woman here? Try not to be intimidated by these microphones, they're just mikes. It's okay.

Student: The reason I chose Alpha, regardless of what my partner chose, I think there would be better outcomes than choosing Beta.

Professor Ben Polak: All right, so let me ask your names for a second - so your name was?

Student: Courtney.

Professor Ben Polak: Courtney and your name was?

Student: Clara Elise.

Professor Ben Polak: Clara Elise.
So slightly different reasons, same choice Alpha. Clara Elise's reason - what did Clara Elise say? She said, no matter what the other person does, she reckons she'd get a better grade if she chose Alpha. So hold that thought a second, we'll come back to - is it Clara Elise, is that right? We'll come back to Clara Elise in a second. Let's talk to the Beta's a second; let me just emphasize at this stage there are no wrong answers. Later on in the class there'll be some questions that have wrong answers. Right now there's no wrong answers. There may be bad reasons but there's no wrong answers. So let's have the Beta's up again. Let's see the Beta's. Oh come on! There was a Beta right here. You were a Beta right? You backed off the Beta, okay. So how can I get a mike into a Beta? Let's stick in this aisle a bit. Is that a Beta right there? Are you a Beta right there? Can I get the Beta in here? Who was the Beta in here? Can we get the mike in there? Is that possible? In here - you can leave your hand so that - there we go. Just point towards - that's fine, just speak into it, that's fine.

Student: So the reason right?

Professor Ben Polak: Yeah, go ahead.

Student: I personally don't like swings that much and it's the B-/B+ range, so I'd much rather prefer that to a swing from A to C, and that's my reason.

Professor Ben Polak: All right, so you're saying it compresses the range. I'm not sure it does compress the range. I mean if you chose Alpha, you're swinging from A to B-; and from Beta, swinging from B+ to C. I mean those are similar kind of ranges but it certainly is a reason. Other reasons for choosing? Yeah, the guy in blue here, yep, good. That's all right. Don't hold the mike; just let it point at you, that's fine.

Student: Well I guess I thought we could be more collusive and kind of work together, but I guess not. So I

Professor Ben Polak: There's a siren in the background so I missed the answer.
Stand up a second, so we can just hear you.

Student: Sure.

Professor Ben Polak: Sorry, say again.

Student: Sure. My name is Travis. I thought we could work together, but I guess not.

Professor Ben Polak: All right good. That's a pretty good reason.

Student: If you had chosen Beta we would have all gotten B+'s but I guess not.

Professor Ben Polak: Good, so Travis is giving us a different reason, right? He's saying that maybe, some of you in the room might actually care about each other's grades, right? I mean you all know each other in class. You all go to the same college. For example, if we played this game up in the business school--are there any MBA students here today? One or two. If we play this game up in the business school, I think it's quite likely we're going to get a lot of Alpha's chosen, right? But if we played this game up in let's say the Divinity School, all right and I'm guessing that Travis' answer is reflecting what you guys are reasoning here. If you played in the Divinity School, you might think that people in the Divinity School might care about other people's grades, right? There might be ethical reasons--perfectly good, sensible, ethical reasons--for choosing Beta in this game. There might be other reasons as well, but that's perhaps the reason to focus on. And perhaps, the lesson I want to draw out of this is that right now this is not a game. Right now we have actions, strategies for people to take, and we know what the outcomes are, but we're missing something that will make this a game. What are we missing here?

Student: Objectives.

Professor Ben Polak: We're missing objectives. We're missing payoffs. We're missing what people care about, all right. So we can't really start analyzing a game until we know what people care about, and until we know what the payoffs are.
Now let's just say something now, which I'll probably forget to say in any other moment of the class, but today it's relevant.

Game Theory, me, professors at Yale, cannot tell you what your payoff should be. I can't tell you in a useful way what it is that your goals in life should be or whatever. That's not what Game Theory is about. However, once we know what your payoffs are, once we know what your goals are, perhaps Game Theory can help you get there.

So we've had two different kinds of payoffs mentioned here. We had the kind of payoff where we care about our own grade, and Travis has mentioned the kind of payoff where you might care about other people's grades. And what we're going to do today is analyze this game under both those possible payoffs. To start that off, let's put up some possible payoffs for the game. And I promise we'll come back and look at some other payoffs later. We'll revisit the Divinity School later.

Chapter 5. Strictly Dominant versus Strictly Dominated Strategies [00:21:38]

All right, so here once again is our same matrix with me and my pair, choosing actions Alpha and Beta, but this time I'm going to put numbers in here. And some of you will perhaps recognize these numbers, but that's not really relevant for now. All right, so what's the idea here? Well the first idea is that these numbers represent utiles or utilities. They represent what these people are trying to maximize, what they're to achieve, their goals.

The idea is - just to compare this to the outcome matrix - for the person who's me here, (A,C) yields a payoff of--(A,C) is this box--so (A,C) yields a payoff of three, whereas (B-,B-) yields a payoff of 0, and so on. So what's the interpretation? It's the first interpretation: the natural interpretation that a lot of you jumped to straight away. These are people--people with these payoffs are people--who only care about their own grades. They prefer an A to a B+, they prefer a B+ to a B-, and they prefer a B- to a C.
Right, I'm hoping I have the grades in order, otherwise it's going to ruin my curve at the end of the year. So these people only care about their own grades. They only care about their own grades.

What do we call people who only care about their own grades? What's a good technical term for them? In England, I think we refer to these guys - whether it's technical or not - as "evil gits." These are not perhaps the most moral people in the universe. So now we can ask a different question. Suppose, whether these are actually your payoffs or not, pretend they are for now. Suppose these are all payoffs. Now we can ask, not what did you do, but what should you do? Now we have payoffs that can really switch the question to a normative question: what should you do? Let's come back to - was it Clara Elise--where was Clara Elise before? Let's get the mike on you again. So just explain what you did and why again.

Student: Why I chose Alpha?

Professor Ben Polak: Yeah, stand up a second, if that's okay.

Student: Okay.

Professor Ben Polak: You chose Alpha; I'm assuming these were roughly your payoffs, more or less, you were caring about your grades.

Student: Yeah, I was thinking-

Professor Ben Polak: Why did you choose Alpha?

Student: I'm sorry?

Professor Ben Polak: Why did you choose Alpha? Just repeat what you said before.

Student: Because I thought the payoffs - the two different payoffs that I could have gotten--were highest if I chose Alpha.

Professor Ben Polak: Good; so what Clara Elise is saying--it's an important idea--is this (and tell me if I'm paraphrasing you incorrectly but I think this is more or less what you're saying): is no matter what the other person does, no matter what the pair does, she obtains a higher payoff by choosing Alpha. Let's just see that. If the pair chooses Alpha and she chooses Alpha, then she gets 0. If the pair chooses Alpha and she chose Beta, she gets -1. 0 is bigger than -1.
If the pair chooses Beta, then if she chooses Alpha she gets 3, Beta she gets 1, and 3 is bigger than 1. So in both cases, no matter what the other person does, she receives a higher payoff from choosing Alpha, so she should choose Alpha. Does everyone follow that line of reasoning? That's a stronger line of reasoning than the reasoning we had earlier. So the woman, I have immediately forgotten the name of, in the red shirt, whose name was-

Student: Courtney.

Professor Ben Polak: Courtney, so Courtney also gave a reason for choosing Alpha, and it was a perfectly good reason for choosing Alpha, nothing wrong with it, but notice that this reason's a stronger reason. It kind of implies your reason.

So let's get some definitions down here. I think I can fit it in here. Let's try and fit it in here.

Definition: We say that my strategy Alpha strictly dominates my strategy Beta, if my payoff from Alpha is strictly greater than that from Beta, [and this is the key part of the definition], regardless of what others do.

Shall we just read that back? "We say that my strategy Alpha strictly dominates my strategy Beta, if my payoff from Alpha is strictly greater than that from Beta, regardless of what others do." Now it's by no means my main aim in this class to teach you jargon. But a few bits of jargon are going to be helpful in allowing the conversation to move forward and this is certainly one. "Evil gits" is maybe one too, but this is certainly one.

Let's draw out some lessons from this. Actually, so you can still read that, let me bring down and clean this board. So the first lesson of the class, and there are going to be lots of lessons, is a lesson that emerges immediately from the definition of a dominated strategy and it's this. So Lesson One of the course is: do not play a strictly dominated strategy. So with apologies to Strunk and White, this is in the passive form, that's dominated, passive voice. Do not play a strictly dominated strategy. Why?
Somebody want to tell me why? Do you want to get this guy? Stand up - yeah.

Student: Because everyone's going to pick the dominant outcome and then everyone's going to get the worst result - the collectively worst result.

Professor Ben Polak: Yeah, that's a possible answer. I'm looking for something more direct here. So we look at the definition of a strictly dominated strategy. I'm saying never play one. What's a possible reason for that? Let's - can we get the woman there?

Student: [inaudible]

Professor Ben Polak: "You'll always lose." Well, I don't know: it's not about winning and losing. What else could we have? Could we get this guy in the pink down here?

Student: Well, the payoffs are lower.

Professor Ben Polak: The payoffs are lower, okay. So here's an abbreviated version of that, I mean it's perhaps a little bit longer. The reason I don't want to play a strictly dominated strategy is, if instead, I play the strategy that dominates it, I do better in every case. The reason I never want to play a strictly dominated strategy is, if instead I play the strategy that dominates it, whatever anyone else does I'm doing better than I would have done. Now that's a pretty convincing argument. That sounds like a convincing argument. It sounds like too obvious even to be worth stating in class, so let me now try and shake your faith a little bit in this answer.

Chapter 6. Contracts and Collusion [00:29:33]

You're somebody who's wanted by the FBI, right?

Okay, so how about the following argument? Look at the payoff matrix again and suppose I reason as follows. Suppose I reason and say if we, me and my pair, both reason this way and choose Alpha then we'll both get 0. But if we both reasoned a different way and chose Beta, then we'll both get 1. So I should choose Beta: 1 is bigger than 0, I should choose Beta. What's wrong with that argument?
My argument must be wrong because it goes against the lesson of the class and the lessons of the class are gospel right, they're not wrong ever, so what's wrong with that argument? Yes, Ale - yeah good.
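The dominance reasoning in Chapter 5 of the transcript can be checked mechanically. The following is a minimal Python sketch and not part of the lecture: it encodes the row player's payoffs quoted in the transcript (0, 3, -1, 1) and tests the definition of strict dominance against every choice the pair could make. The names ROW_PAYOFF, strictly_dominates, and strictly_dominated_strategies are illustrative assumptions.

```python
# Row player's payoffs as quoted in the transcript:
# ROW_PAYOFF[my_choice][pair_choice]
ROW_PAYOFF = {
    "Alpha": {"Alpha": 0, "Beta": 3},
    "Beta": {"Alpha": -1, "Beta": 1},
}


def strictly_dominates(payoff, s, t):
    """Return True if strategy s pays strictly more than strategy t
    against every choice the other player could make."""
    return all(payoff[s][other] > payoff[t][other] for other in payoff[s])


def strictly_dominated_strategies(payoff):
    """List every strategy that some other strategy strictly dominates."""
    return [
        t
        for t in payoff
        if any(strictly_dominates(payoff, s, t) for s in payoff if s != t)
    ]


if __name__ == "__main__":
    print("Alpha strictly dominates Beta:",
          strictly_dominates(ROW_PAYOFF, "Alpha", "Beta"))
    print("Strictly dominated strategies:",
          strictly_dominated_strategies(ROW_PAYOFF))
```

Under these payoffs the check confirms Clara Elise's argument: Alpha beats Beta whatever the pair does. It says nothing about the collective outcome raised in Chapter 6, where both players choosing Alpha earn 0 each while both choosing Beta would earn 1 each.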

Game Theory (Complete)

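Game Theory

Game theory (博弈论, also called 对策论 or 赛局理论 in Chinese) is a branch of applied mathematics that is now widely applied in biology, economics, international relations, computer science, political science, military strategy, and many other disciplines.

The book 博弈圣经 (The Bible of Games) states: game theory is two parties in an evenly matched contest each using the other's strategy to adjust their own counter-strategy in order to win.

It mainly studies the interaction among formalized incentive structures.

It is a mathematical theory and method for studying phenomena of a conflictual or competitive nature.

It is also an important branch of operations research.

Game theory considers the predicted and actual behavior of individuals in a game and studies their optimal strategies.

Interactions that look different on the surface may exhibit similar incentive structures, in which case they are special cases of the same game.

One famous and interesting application is the Prisoner's Dilemma.

Behavior of a competitive or adversarial nature is called game behavior.

In such behavior, the parties taking part in the conflict or competition each have different goals or interests.

To achieve their own goals and interests, each party must consider the various courses of action open to its opponents and try to choose the course that is most advantageous or most reasonable for itself.

Everyday examples include playing chess and card games.

Game theory is the mathematical theory and method that studies whether the contending parties in a game have a most reasonable course of action, and how to find it.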
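One standard way to formalize "a most reasonable course of action" is a pure-strategy Nash equilibrium: a pair of choices in which neither party can do better by changing its own choice alone. The sketch below searches for such pairs in the Prisoner's Dilemma mentioned above. It is an illustration only: the payoff numbers (years in prison written as negative values) and the names PAYOFFS, STRATEGIES, is_best_response, and pure_nash_equilibria are assumptions, not taken from the text.

```python
from itertools import product

# Hypothetical Prisoner's Dilemma payoffs: (player 0, player 1),
# written as negative years in prison.
PAYOFFS = {
    ("Cooperate", "Cooperate"): (-1, -1),
    ("Cooperate", "Defect"): (-3, 0),
    ("Defect", "Cooperate"): (0, -3),
    ("Defect", "Defect"): (-2, -2),
}
STRATEGIES = ("Cooperate", "Defect")


def is_best_response(player, profile):
    """True if `player` cannot gain by unilaterally switching strategy."""
    current = PAYOFFS[profile][player]
    for alternative in STRATEGIES:
        deviated = list(profile)
        deviated[player] = alternative
        if PAYOFFS[tuple(deviated)][player] > current:
            return False
    return True


def pure_nash_equilibria():
    """Return every strategy pair where both players are best-responding."""
    return [
        profile
        for profile in product(STRATEGIES, repeat=2)
        if is_best_response(0, profile) and is_best_response(1, profile)
    ]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

With these assumed payoffs the only pair that survives is mutual defection, even though mutual cooperation would leave both parties better off, which is what makes the example a dilemma.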

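Biologists use game theory to understand and predict certain outcomes of evolution.

For example, the concept of the "evolutionarily stable strategy", proposed by John Maynard Smith and George R. Price in a paper published in Nature in 1973, uses game theory.

See also evolutionary game theory and behavioral ecology.

Game theory is also applied in other branches of mathematics, such as probability, statistics, and linear programming.

History

Game-theoretic ideas have existed since ancient times. China's ancient The Art of War (孙子兵法) is not only a military treatise but can also be regarded as the earliest work on game theory.

Game theory at first mainly studied winning and losing in chess, bridge, and gambling; people's grasp of game situations remained at the level of experience and was not developed into theory.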

Game Theory (Original Harvard University Course Materials)

Lecture XVII:Dynamic Games withIncomplete InformationMarkus M.M¨o biusMay6,2004•Gibbons,sections4.1and4.2•Osborne,chapter101IntroductionIn the last two lectures I introduced the idea of incomplete information. We analyzed some important simultaneous move games such as sealed bid auctions and public goods.In practice,almost all of the interesting models with incomplete informa-tion are dynamic games also.Before we talk about these games we’ll need a new solution concept called Perfect Bayesian Equilibrium.Intuitively,PBE is to extensive form games with incomplete games what SPE is to extensive form games with complete information.The concept we did last time,BNE is a simply the familiar Nash equilibrium under the Harsanyi representation of incomplete information.In principle,we could use the Harsanyi representation and SPE in dynamic games of incomplete information.However,dynamic games with incomplete information typi-cally don’t have enough subgames to do SPE.Therefore,many’non-credible’threats are possible again and we get too many unreasonable SPE’s.PBE allows subgame reasoning at information sets which are not single nodes whereas SPE only applies at single node information sets of players(because only those can be part of a proper subgame).The following example illustrates some problems with SPE.11.1Example I-SPEOurIts unique SPE is(R,B).The next game looks formally the same-however,SPE is the same as NEThe old SPE survives-all(pR+(1−p)RR,B)for all p is SPE.But there are suddenly strange SPE such as(L,qA+(1−q)B)for q≥1.Player2’s2strategy looks like an non-credible threat again-but out notion of SPE can’t rule it out!Remember:SPE can fail to rule out actions which are not optimal given any’beliefs’about uncertainty.Remark1This problem becomes severe with incomplete information:moves of Nature are not observed by one or both players.Hence the resulting exten-sive form game will have no or few subgames.This and the above example illustrate the need to replace the 
concept of a 'subgame' with the concept of a 'continuation game'.

1.2 Example II: Spence's Job-Market Signalling

The most famous example of a dynamic game with incomplete information is Spence's signalling game. There are two players - a firm and a worker. The worker has some private information about his ability and has the option of acquiring some education. Education is always costly, but less so for more able workers. However, education does not improve the worker's productivity at all! In Spence's model education merely serves as a signal to firms. His model allows equilibria where able workers will acquire education and less able workers won't. Hence firms will pay high wages only to those who acquired education - however, they do this because education has revealed the type of the player rather than improved his productivity.

Clearly this is an extreme assumption - in reality education presumably has dual roles: there is some signalling and some productivity enhancement. But it is an intriguing insight that education might be nothing more than a costly signal which allows more able workers to differentiate themselves from less able ones.

Let's look at the formal set-up of the game:

• Stage 0: Nature chooses the ability θ of a worker. Suppose Θ = {2,3} and that Prob(θ=2) = p and Prob(θ=3) = 1−p.
• Stage I: Player 1 (worker) observes his type and chooses education level e ∈ {0,1}. Education has cost ce/θ. Note that higher-ability workers have lower cost and that getting no education is costless.
• Stage II: Player 2 (the competitive labor market) chooses the wage rate w(e) of workers after observing the education level.

Suppose that u1(e,w;θ) = w − ce/θ and that u2(e,w;θ) = −(w−θ)^2. Note that education does not enter the firm's utility function. Also note that the BR of the firm is to set wages equal to the expected ability of the worker under this utility function. This is exactly what the competitive labor market would do if θ is equal to the productivity of a worker (the dollar amount of output he produces). If a firm pays above the expected
productivity it will run a loss, and if it pays below some other firm would come in and offer more to the worker. So the market should offer exactly the expected productivity. The particular (rather strange-looking) utility function we have chosen implements the market outcome with a single firm - it's a simple shortcut.

Spence's game is a signalling game. Each signalling game has the same three-part structure: nature chooses types, the sender (worker) observes his type and takes an action, the receiver (firm) sees that action but not the worker's type. Hence the firm tries to deduce the worker's type from his action. His action therefore serves as a signal. Spence's game is extreme because the signal (education) has no value to the firm except for its signalling function. This is not the case for all signalling models: think of a car manufacturer who can be of low or high quality and wants to send a signal to the consumer that he is a high-quality producer. He can offer a short or long warranty for the car. The extended warranty will not only signal his type but also benefit the consumer.

The (Harsanyi) extensive form representation of Spence's game (and any other ...)

[Figure: extensive-form game tree; the end of the sentence above and the figure are not recoverable from the extraction.]

1.3 Why does the SPE concept together with the Harsanyi representation not work?

We could find the set of SPE in the Harsanyi representation of the game. The problem is that the game has no proper subgame in the second round when the firm makes its decision. Therefore, the firm can make unreasonable threats such as the following: both workers buy education and the firm pays the educated worker w = 3−p (his expected productivity), and the uneducated worker gets w = −235.11 (or something else). Clearly, every worker would get education, and the firm plays a BR to a worker getting education (check for yourself using the Harsanyi representation).

However, the threat of paying a negative wage is unreasonable. Once the firm sees a worker who has no education it should realize that the worker has at least ability level 2 and should therefore at least get a wage of
w = 2.

2 Perfect Bayesian Equilibrium

Let G be a multistage game with incomplete information and observed actions in the Harsanyi representation. Write $\Theta_i$ for the set of possible types for player i and $H_i$ for the set of possible information sets of player i. For each information set $h_i \in H_i$ denote the set of nodes in the information set by $X_i(h_i)$, and let $X_i = \bigcup_{h_i \in H_i} X_i(h_i)$. A strategy in G is a function $s_i : H_i \to \Delta(A_i)$. Beliefs are a function $\mu_i : H_i \to \Delta(X_i)$ such that the support of the belief $\mu_i(h_i)$ is within $X_i(h_i)$.

Definition 1 A PBE is a strategy profile $s^*$ together with a belief system $\mu$ such that

1. At every information set strategies are optimal given beliefs and opponents' strategies (sequential rationality):
$$\sigma_i^*(h) \text{ maximizes } E_{\mu_i(x \mid h_i)}\, u_i\left(\sigma_i, \sigma_{-i}^* \mid h, \theta_i, \theta_{-i}\right)$$
2. Beliefs are always updated according to Bayes rule when applicable.

The first requirement replaces subgame perfection. The second requirement makes sure that beliefs are derived in a rational manner - assuming that you know the other players' strategies you try to derive as many beliefs as possible. Branches of the game tree which are reached with zero probability cannot be derived using Bayes rule: here you can choose arbitrary beliefs.
However, the precise specification will typically matter for deciding whether an equilibrium is PBE or not.

Remark 2 In the case of complete information and observed actions PBE reduces to SPE because beliefs are trivial: each information set is a singleton and the belief you attach to being there (given that you are in the corresponding information set) is simply 1.

2.1 What's Bayes Rule?

There is a close connection between agents' actions and their beliefs. Think of the job signalling game. We have to specify the beliefs of the firm in the second stage when it does not know the current node for sure, but only the information set. Let's go through various strategies of the worker:

• The high ability worker gets education and the low ability worker does not: e(θ=2) = 0 and e(θ=3) = 1. In this case my beliefs at the information set e = 1 should be Prob(High|e=1) = 1 and similarly Prob(High|e=0) = 0.

• Both workers get education. In this case, we should have:
$$\mathrm{Prob}(\mathrm{High} \mid e=1) = 1-p \qquad (1)$$
The beliefs after observing e = 0 cannot be determined by Bayes rule because it is a probability zero event - we should never see it if players follow their strategies. This means that we can choose beliefs freely at this information set.

• The high ability worker gets education and the low ability worker gets education with probability q. This case is less trivial. What is the probability of seeing a worker get education? It is 1−p+pq. What is the probability of a worker being high ability and getting education? It is 1−p. Hence the probability that the worker is high ability after we have observed him getting education is $\frac{1-p}{1-p+pq}$. This is the non-trivial part of Bayes rule.

Formally, we can derive the beliefs at some information set $h_i$ of player i as follows. There is a probability $p(\theta_j)$ that the other player is of type $\theta_j$.
These probabilities are determined by nature. Player j (i.e. the worker) has taken some action $a_j$ such that the information set $h_i$ was reached. Each type of player j takes action $a_j$ with some probability $\sigma_j^*(a_j \mid \theta_j)$ according to his equilibrium strategy. Applying Bayes rule we can then derive the belief of player i that player j has type $\theta_j$ at information set $h_i$:
$$\mu_i(\theta_j \mid a_j) = \frac{p(\theta_j)\,\sigma_j^*(a_j \mid \theta_j)}{\sum_{\tilde\theta_j \in \Theta_j} p(\tilde\theta_j)\,\sigma_j^*(a_j \mid \tilde\theta_j)} \qquad (2)$$

1. In the job signalling game with separating beliefs Bayes rule gives us exactly what we expect - we believe that a worker who gets education is the high type.

2. In the pooling case Bayes rule gives us $\mathrm{Prob}(\mathrm{High} \mid e=1) = \frac{(1-p)\times 1}{p \times 1 + (1-p)\times 1} = 1-p$. Note that Bayes rule does NOT apply for finding the beliefs after observing e = 0 because the denominator is zero.

3. In the semi-pooling case we get $\mathrm{Prob}(\mathrm{High} \mid e=1) = \frac{(1-p)\times 1}{p \times q + (1-p)\times 1}$. Similarly, $\mathrm{Prob}(\mathrm{High} \mid e=0) = \frac{(1-p)\times 0}{p \times (1-q) + (1-p)\times 0} = 0$.

3 Signalling Games and PBE

It turns out that signalling games are a very important class of dynamic games with incomplete information in applications. Because the PBE concept is much easier to state for the signalling game environment we define it once again in this section for signalling games. You should convince yourself that the more general definition from the previous section reduces to the definition below in the case of signalling games.

3.1 Definitions and Examples

Every signalling game has a sender, a receiver and two periods. The sender has private information about his type and can take an action in the first period. The receiver observes the action (signal) but not the type of the sender, and takes his action in return.

• Stage 0: Nature chooses the type $\theta_1 \in \Theta_1$ of player 1 from probability distribution p.
• Stage 1: Player 1 observes $\theta_1$ and chooses $a_1 \in A_1$.
• Stage 2: Player 2 observes $a_1$ and chooses $a_2 \in A_2$.

The players' utilities are:
$$u_1 = u_1(a_1, a_2; \theta_1) \qquad (3)$$
$$u_2 = u_2(a_1, a_2; \theta_1) \qquad (4)$$

3.1.1 Example 1: Spence's Job Signalling Game

• worker is sender; firm is receiver
• θ is the ability of the worker (private information to him)
• A1 = {educ, no educ}
• A2 = {wage
rate}

3.1.2 Example 2: Initial Public Offering

• player 1 - owner of private firm
• player 2 - potential investor
• Θ - future profitability
• A1 - fraction of company retained
• A2 - price paid by investor for stock

3.1.3 Example 3: Monetary Policy

• player 1 - Fed
• player 2 - firms
• Θ - Fed's preference for inflation/unemployment
• A1 - first period inflation
• A2 - expectation of second period inflation

3.1.4 Example 4: Pretrial Negotiation

• player 1 - defendant
• player 2 - plaintiff
• Θ - extent of defendant's negligence
• A1 - settlement offer
• A2 - accept/reject

3.2 PBE in Signalling Games

A PBE in the signalling game is a strategy profile $(s_1^*(\theta_1), s_2^*(a_1))$ together with beliefs $\mu_2(\theta_1 \mid a_1)$ for player 2 such that

1. Players' strategies are optimal given their beliefs and the opponents' strategies, i.e.
$$s_1^*(\theta_1) \text{ maximizes } u_1(a_1, s_2^*(a_1); \theta_1) \text{ for all } \theta_1 \in \Theta_1 \qquad (5)$$
$$s_2^*(a_1) \text{ maximizes } \sum_{\theta_1 \in \Theta_1} u_2(a_1, a_2; \theta_1)\,\mu_2(\theta_1 \mid a_1) \text{ for all } a_1 \in A_1 \qquad (6)$$

2. Player 2's beliefs are compatible with Bayes' rule. If any type of player 1 plays $a_1$ with positive probability then
$$\mu_2(\theta_1 \mid a_1) = \frac{p(\theta_1)\,\mathrm{Prob}(s_1^*(\theta_1) = a_1)}{\sum_{\theta_1' \in \Theta_1} p(\theta_1')\,\mathrm{Prob}(s_1^*(\theta_1') = a_1)} \text{ for all } \theta_1 \in \Theta_1$$

3.3 Types of PBE in Signalling Games

To help solve for PBE's it helps to think of all PBE's as taking one of the following three forms:

1. Separating - different types take different actions and player 2 learns the type from observing the action
2. Pooling - all types of player 1 take the same action; no info revealed
3. Semi-Separating - one or more types mixes; partial learning (often only type of equilibrium)

Remark 3 In the second stage of the education game the "market" must have an expectation that player 1 is type θ = 2 and attach probability $\mu(2 \mid a_1)$ to the player being type 2. The wage in the second period must be between 2 and 3. This rules out the unreasonable threat of the NE I gave you in the education game (with negative wages). (See footnote 1.)

Remark 4 In the education game suppose the equilibrium strategies are $s_1^*(\theta=2) = 0$ and $s_1^*(\theta=3) = 1$, i.e. only high types get education. Then for any prior (p, 1−p) at the start of the game beliefs must be:

$\mu_2(\theta=2 \mid e=0) = 1$
$\mu_2(\theta=3 \mid e=0) = 0$
$\mu_2(\theta=2 \mid e=1) = 0$
$\mu_2(\theta=3 \mid e=1) = 1$

If player 1's strategy is
$s_1^*(\theta=2) = \tfrac12 \times 0 + \tfrac12 \times 1$ and $s_1^*(\theta=3) = 1$:

$\mu_2(\theta=2 \mid e=0) = 1$
$\mu_2(\theta=3 \mid e=0) = 0$
$\mu_2(\theta=2 \mid e=1) = \frac{p/2}{p/2 + 1 - p} = \frac{p}{2-p}$
$\mu_2(\theta=3 \mid e=1) = \frac{2-2p}{2-p}$

Also note that Bayes rule does NOT apply after an action which should not occur in equilibrium. Suppose $s_1^*(\theta=2) = s_1^*(\theta=3) = 1$; then it's OK to assume

$\mu_2(\theta=2 \mid e=0) = \frac{57}{64}$
$\mu_2(\theta=3 \mid e=0) = \frac{7}{64}$
$\mu_2(\theta=2 \mid e=1) = p$
$\mu_2(\theta=3 \mid e=1) = 1-p$

The first pair is arbitrary.

Footnote 1: It also rules out the unreasonable SPE in the example SPE I which I gave initially. Under any beliefs player 2 should strictly prefer B.

Remark 5 Examples SPE II and SPE III from the introduction now make sense - if players update according to Bayes rule we get the 'reasonable' beliefs of players of being with equal probability in one of the two nodes.

4 Solving the Job Signalling Game

Finally, after 11 tough pages we can solve our signalling game. The solution depends mainly on the cost parameter c.

4.1 Intermediate Costs 2 ≤ c ≤ 3

A separating equilibrium of the model is when only the able worker buys education and the firm pays wage 2 to the worker without education and wage 3 to the worker with education. The firm believes that the worker is able iff he gets educated.

• The beliefs are consistent with the equilibrium strategy profile.
• Now look at optimality. Player 2 sets the wage equal to the worker's expected productivity, so he is maximizing.
• Player 1 of type θ = 2 gets 3 − c/2 ≤ 2 for a1 = 1 and 2 for a1 = 0. Hence he should not buy education.
• Player 1 of type θ = 3 gets 3 − c/3 ≥ 2 for a1 = 1 and 2 for a1 = 0. Hence he should get educated.

1. Note that for too small or too big costs there is no separating PBE.
2. There is no separating PBE where the θ = 2 type gets an education and the θ = 3 type does not.

4.2 Small Costs c ≤ 1

A pooling equilibrium of the model is that both workers buy education and that the firm pays wage w = 3−p if it observes education, and wage 2 otherwise. The firm believes that the worker is able with probability 1−p if it observes education, and that the worker is of low ability if it observes no education.

• The beliefs are consistent with Bayes rule for e = 1. If e = 0 has been observed Bayes rule does not apply because e = 0 should never
occur - hence any belief is fine. The belief that the worker is the low type if he does not get education makes sure that the worker gets punished for not getting educated.
• The firm pays the expected wage - hence it's an optimal response. Taking p = 1/2, so that w = 3 − p = 2.5, the low ability guy won't deviate as long as 2.5 − c/2 ≥ 2 and the high ability type won't deviate as long as 2.5 − c/3 ≥ 2. For c ≤ 1 both conditions are true.

1. While this pooling equilibrium only works for small c there is always another pooling equilibrium where no worker gets education and the firm thinks that any worker who gets education is of the low type.

4.3 1 < c < 2

Assume that p = 1/2 for this section. In the parameter range 1 < c < 2 there is a semi-separating PBE of the model. The high ability worker buys education and the low ability worker buys education with positive probability q. The wage is w = 2 if the firm observes no education and is set to $w = 2 + \frac{1}{1+q}$ if it observes education. The belief that the worker is the high type is zero if he gets no education and $\frac{1}{1+q}$ if he does.

• Beliefs are consistent (check!).
• Firm plays BR.
• Player 1 of low type won't deviate as long as $2 + \frac{1}{1+q} - \frac{c}{2} \le 2$.
• Player 1 of high type won't deviate as long as $2 + \frac{1}{1+q} - \frac{c}{3} \ge 2$.

Set $1 + q = \frac{2}{c}$. It's easy to check that the first condition is binding and the second condition is strictly true. So we are done if we choose $q = \frac{2}{c} - 1$. Note that as c → 2 we get back the separating equilibrium and as c → 1 we get the pooling one.
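The Bayes-rule beliefs and the semi-separating condition from the job signalling game can be checked numerically. The following is a minimal sketch, assuming the lecture's set-up (types θ ∈ {2,3}, prior Prob(θ=2) = p, education cost ce/θ); the function names `posterior_high`, `wage` and `semi_separating_q` are illustrative and not part of the notes.

```python
from fractions import Fraction

def posterior_high(p, q_low, q_high=1):
    """Prob(theta = 3 | e = 1) by Bayes' rule.

    p      : prior probability of the low type (theta = 2)
    q_low  : probability that the low type chooses e = 1
    q_high : probability that the high type chooses e = 1
    Returns None when e = 1 has probability zero (Bayes' rule does not apply).
    """
    p, q_low, q_high = Fraction(p), Fraction(q_low), Fraction(q_high)
    numerator = (1 - p) * q_high
    denominator = p * q_low + (1 - p) * q_high
    return None if denominator == 0 else numerator / denominator

def wage(belief_high):
    """Expected productivity: 2 * (1 - belief) + 3 * belief."""
    return 2 + Fraction(belief_high)

def semi_separating_q(c):
    """Mixing probability q = 2/c - 1 of the low type (section 4.3, p = 1/2)."""
    return Fraction(2) / Fraction(c) - 1

# Worked case: p = 1/2 and c = 3/2, which lies in the range 1 < c < 2.
p, c = Fraction(1, 2), Fraction(3, 2)
q = semi_separating_q(c)
belief = posterior_high(p, q)
print(q, belief, wage(belief))
```

With these inputs the low type's payoff from education, wage(belief) − c/2, equals 2, so he is exactly indifferent, which is what allows him to mix.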

Harvard Game Theory Course (哈佛博弈课)

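[Mind map, keyword analysis: life, willingness, analysis, lesson, technique, information, cost, participants, game theory, games, principles, Harvard, strategy, direction, choice, art, rationality, importance.]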
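Summary
This is a business and management book about Harvard's game theory course. It dissects game theory layer by layer, drawing on everyday examples and current social issues to explain the games we are likely to meet in life, so that readers can apply them. Reading it helps you "know yourself and know your opponent": understand what the other side thinks, what they know, what motivates them, and even how they see you. It helps you "seize the initiative": your choice of strategy should not rest entirely on your own wishes but should also take into account how the other participants will act. And it helps you "influence the other side": turn around the other party's thinking, be able to attack or defend as needed, sharpen your decisiveness, and guide and use others' emotions to steer the outcome your way.
Lesson 1: A Different Kind of Game
Lesson 2: This Is Game Theory
Lesson 3: What Game Theory Brings Us
Tying your own hands; The strategies of the ant and the lion; The contest between the whole and the coalition; The high-salary problem troubling the NBA; The non-cooperative choice
Strategic interactive decisions; The elements of a game; What rationality means; Games are everywhere
Reaching equilibrium; Zero-sum games; Non-zero-sum games; Negative-sum games; Positive-sum games; Repeated versus one-shot games
Lesson 5: The Centipede Game
The evolution of toughness and mildness; Threats and promises together
Will win-win last?; Finding your own position and direction
Free riding; Free riding is everywhere
Lesson 11: You Are Not You; Lesson 12: Optimal Payoff
Lesson 13: Symmetry of Information
Lesson 14: Don't Be This Kind of Fool
Lesson 15: What Do You Really Want
Lesson 16: Communication Skills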

Section 2 (Game Theory Lecture Notes, Harvard University)
GOV 2005: Game Theory
Section 2: Externalities
Alexis Diamond adiamond@
Agenda
• Key terms and definitions
• Complementarity and cross-partial derivatives
• πi = 2(i + j + cij) − i^2
• πj = 2(i + j + cij) − j^2
• Best response functions:
– d(πi)/di = 2 + 2cj − 2i = 0
– i* = 1 + cj … This is our BR function for agent i
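The best response i* = 1 + cj, together with its mirror image j* = 1 + ci, pins down the Nash equilibrium. The sketch below iterates the two best responses until they settle; it assumes 0 ≤ c < 1 (so the iteration converges) and the symmetric best response for agent j, neither of which the slide states explicitly. The helper names are illustrative.

```python
def best_response(other_action, c):
    """Best response from the first-order condition 2 + 2*c*other - 2*own = 0."""
    return 1 + c * other_action

def iterate_to_equilibrium(c, rounds=200):
    """Repeatedly apply both best responses, starting from (0, 0)."""
    i = j = 0.0
    for _ in range(rounds):
        i, j = best_response(j, c), best_response(i, c)
    return i, j

c = 0.5  # assumed value; the cross-partial of pi_i with respect to i and j is 2c > 0
i_star, j_star = iterate_to_equilibrium(c)
closed_form = 1 / (1 - c)  # solves i = 1 + c*i in the symmetric equilibrium
print(i_star, j_star, closed_form)
```

Because the cross-partial derivative 2c is positive, each agent's best response rises with the other's action, which is the complementarity the slide refers to.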
Nash Eqm (TR & DC)
strategy profiles
Pareto Optima
Set of Rationalizable Strategies: {T,D} x {C,R} Set of Weakly Congruent Strategies: {T,D} x {C,R} Set of Best Response Complete: {T,D} x {C, R} or {T,D} x {L,C,R} Set of Congruent Strategies: {T,D} x {C, R}
Cournot Oligopoly: Explanation
Positive Cross-Partial Derivatives = Complementarity
π1 = (1000 − q1 − q2)q1 − 100q1
d(π1)/dq1 = 1000 − 2q1 − q2 − 100
d(d(π1)/dq1)/dq2 = −1 < 0
• Why? What does this mean?
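One way to read the negative cross-partial is through the best response function: setting the first derivative to zero shows that firm 1's optimal quantity falls as q2 rises, so quantities are strategic substitutes rather than complements. The derivation below is a sketch; the last line assumes that firm 2 has the same profit function as firm 1, which the slide does not state.

```latex
\begin{align*}
\pi_1 &= (1000 - q_1 - q_2)\,q_1 - 100\,q_1 \\
\frac{\partial \pi_1}{\partial q_1} &= 900 - 2q_1 - q_2 = 0
  \;\Longrightarrow\; q_1^{*}(q_2) = \frac{900 - q_2}{2} \\
\frac{d q_1^{*}}{d q_2} &= -\tfrac{1}{2} < 0
  \qquad \text{(best response slopes down)} \\
\text{assumed symmetry, } q_1 = q_2 = q &: \quad
  q = \frac{900 - q}{2} \;\Longrightarrow\; q = 300
\end{align*}
```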

Section 1 (Game Theory Lecture Notes, Harvard University)

The decision to help in a dispute depends on one’s ability to influence the outcome, one’s level of motivation, and the costliness of getting involved
Simplifies to...
(PBA + PBC − 1)(UBA − UBC) = (KBA − KBC)
Simplifies to...
[b/(a + b + c)](UBA − UBC) = (KBA − KBC)
Translation: Validity
• What is the point at which B is indifferent? [b/(a + b + c)](UBA − UBC) = (KBA − KBC)
• [b/(a + b + c)] = resources B can contribute
• (UBA − UBC) = B's motivation for A vs. C
• (KBA − KBC) = B's costs for A vs. C
Analysis: Whither Alliances?
• Adopt the perspective of a player: B
• B's utility from an alliance with A = PBA(UBA) + (1 − PBA)(UBC) − KBA (1)
• B's utility from an alliance with C = PBC(UBC) + (1 − PBC)(UBA) − KBC (2)
• What if equation 1 > equation 2?
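The indifference condition on the slides follows from setting equation (1) equal to equation (2). The sketch below spells out the algebra. The last step relies on an assumption that the slides do not show: that win probabilities are resource shares, P_BA = (a+b)/(a+b+c) and P_BC = (b+c)/(a+b+c), where a, b, c are the resources of A, B and C.

```latex
\begin{align*}
P_{BA}U_{BA} + (1-P_{BA})U_{BC} - K_{BA}
  &= P_{BC}U_{BC} + (1-P_{BC})U_{BA} - K_{BC} \\
(P_{BA}+P_{BC}-1)\,U_{BA} - (P_{BA}+P_{BC}-1)\,U_{BC}
  &= K_{BA}-K_{BC} \\
(P_{BA}+P_{BC}-1)(U_{BA}-U_{BC}) &= K_{BA}-K_{BC} \\
% assumed: P_{BA} = (a+b)/(a+b+c), \; P_{BC} = (b+c)/(a+b+c)
P_{BA}+P_{BC}-1 = \frac{(a+b)+(b+c)-(a+b+c)}{a+b+c}
  &= \frac{b}{a+b+c}
\end{align*}
```

When the left-hand side exceeds the right-hand side, equation (1) is larger than equation (2) and B prefers the alliance with A.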

Section 9 (Game Theory Lecture Notes, Harvard University)


• Beliefs mean something new here!
– Probability distribution over location in the information set.
• Strategies corresponding to events off the equilibrium path may be paired with any beliefs, because Bayes' Rule does not pin down beliefs at events that occur with probability zero.
• Two major types of equilibria:
• In Nash equilibrium, the credibility of best responses judged along equilibrium path; off the path, NE may include incredible strategies • In subgame perfect NE, best replies are judged in every subgame • In perfect Bayesian equilibrium, best replies judged at each information set
– Here, all strategies are credible
• Think of the pirate game: (99,0,1) is SPE, but there are many not-credible NE of the form (100-x, 0, x)
• Grim trigger strategies are supported as subgame perfect equilibria

Game Theory (Harvard University original course notes)

Lecture XIII: Repeated Games
Markus M. Möbius
April 19, 2004

• Gibbons, chapter 2.3.B, 2.3.C
• Osborne, chapter 14
• Osborne and Rubinstein, sections 8.3-8.5

1 Introduction

So far one might get a somewhat misleading impression about SPE. When we first introduced dynamic games we noted that they often have a large number of (unreasonable) Nash equilibria. In the models we've looked at so far SPE has 'solved' this problem and given us a unique NE. In fact, this is not really the norm. We'll see today that many dynamic games still have a very large number of SPE.

2 Credible Threats

We introduced SPE to rule out non-credible threats. In many finite horizon games, though, credible threats are common and cause a multiplicity of SPE. Consider the following game:

[Payoff matrix and the start of the following sentence were lost in extraction.]

... actions before choosing the second period actions. Now one way to get a SPE is to play any of the three profiles above followed by another of them (or the same one). We can also, however, use credible threats to get other actions played in period 1, such as:

• Play (B,R) in period 1.
• If player 1 plays B in period 1 play (T,L) in period 2 - otherwise play (M,C) in period 2.

It is easy to see that no single period deviation helps here. In period 2 a NE is played, so a deviation obviously doesn't help.

• In period 1 player 1 gets 4+3 if he follows the strategy and at most 5+1 if he doesn't.
• Player 2 gets 4+1 if he follows and at most 2+1 if he doesn't.

Therefore switching to the (M,C) equilibrium in period 2 is a credible threat.

3 Repeated Prisoner's Dilemma

Note that the PD doesn't have multiple NE, so in a finite horizon we don't have the same easy threats to use. Therefore, the finitely repeated PD has a unique SPE in which every player defects in each period.

[Payoff matrix lost in extraction.]

In the infinite horizon other types of threats are credible.

Proposition 1 In the infinitely repeated PD with δ ≥ 1/2 there exists a SPE in which the outcome is that both players cooperate in every period.
Proof: Consider the following symmetric profile:
$$s_i(h_t) = \begin{cases} C & \text{if both players have played } C \text{ in every period along the path leading to } h_t, \\ D & \text{if either player has ever played } D. \end{cases}$$
To see that there is no profitable single deviation note that at any $h_t$ such that $s_i(h_t) = D$ player i gets
$$0 + \delta 0 + \delta^2 0 + \dots$$
if he follows his strategy and
$$-1 + \delta 0 + \delta^2 0 + \dots$$
if he plays C instead and then follows $s_i$. At any $h_t$ such that $s_i(h_t) = C$ player i gets
$$1 + \delta 1 + \delta^2 1 + \dots = \frac{1}{1-\delta}$$
if he follows his strategy and
$$2 + \delta 0 + \delta^2 0 + \dots = 2$$
if he plays D instead and then follows $s_i$. Neither of these deviations is worthwhile if δ ≥ 1/2. QED

Remark 1 While people sometimes tend to think of this as showing that people will cooperate if they repeatedly interact, it does not show this. All it shows is that there is one SPE in which they do. The correct moral to draw is that there are many possible outcomes.

3.1 Other SPE of repeated PD

1. For any δ it is a SPE to play D every period.
2. For δ ≥ 1/2 there is a SPE where the players play D in the first period and then C in all future periods.
3. For δ > 1/√2 there is a SPE where the players play D in every even period and C in every odd period.
4. For δ ≥ 1/2 there is a SPE where the players play (C,D) in every even period and (D,C) in every odd period.

3.2 Recipe for Checking for SPE

Whenever you are supposed to check that a strategy profile is an SPE you should first try to classify all histories (i.e. all information sets) on and off the equilibrium path. Then you have to apply the SPDP for each class separately.

• Assume you want to check that cooperation with grim trigger punishment is SPE. There are two types of histories you have to check. Along the equilibrium path there is just one history: everybody has cooperated so far. Off the equilibrium path, there is again only one class: one person has defected.

• Assume you want to check that cooperating in even periods and defecting in odd periods plus grim trigger punishment in case of deviation by any player from the above pattern is SPE. There are three types of histories: even and odd periods along the
equilibrium path, and off the equilibrium path histories.

• Assume you want to check that TFT ('Tit for Tat') is SPE (which it isn't - see next lecture). Then you have to check four histories: only the play of both players in the last period matters for future play - so there are four relevant histories, such as players 1 and 2 both cooperated in the last period, player 1 defected and player 2 cooperated, etc.

Sometimes the following result comes in handy.

Lemma 1 If players play Nash equilibria of the stage game in each period in such a way that the particular equilibrium being played in a period is a function of time only and does not depend on previous play, then this strategy is a Nash equilibrium.

The proof is immediate: we check for the SPDP. Assume that there is a profitable deviation. Such a deviation will not affect future play by assumption: if the stage game has two NE, for example, and NE1 is played in even periods and NE2 in odd periods, then a deviation will not affect future play. (See footnote 1.) Therefore, the deviation has to be profitable in the current stage game - but since a NE is being played no such profitable deviation can exist.

Corollary 1 A strategy which has players play the same NE in each period is always SPE.

In particular, the grim trigger strategy is SPE if the punishment strategy in each stage game is a NE (as is the case in the PD).

Footnote 1: If a deviation triggers a switch to only NE1 this statement would no longer be true.

4 Folk Theorem

The examples in 3.1 suggest that the repeated PD has a tremendous number of equilibria when δ is large. Essentially, this means that game theory tells us we can't really tell what is going to happen. This turns out to be an accurate description of most infinitely repeated games.

Let G be a simultaneous move game with action sets $A_1, A_2, \dots, A_I$ and mixed strategy spaces $\Sigma_1, \Sigma_2, \dots, \Sigma_I$ and payoff function $\tilde u_i$.

Definition 1 A payoff vector $v = (v_1, v_2, \dots, v_I) \in \mathbb{R}^I$ is feasible if there exist action profiles $a^1, a^2, \dots, a^k \in A$ and non-negative weights $\omega_1, \dots, \omega_k$ which sum up to 1 such that
$$v_i = \omega_1 \tilde u_i(a^1) + \omega_2 \tilde u_i(a^2) + \dots + \omega_k \tilde u_i(a^k)$$

Definition 2 A payoff vector v is strictly individually rational if
$$v_i > \underline{v}_i = \min_{\sigma_{-i} \in \Sigma_{-i}} \; \max_{\sigma_i(\sigma_{-i}) \in \Sigma_i} u_i(\sigma_i(\sigma_{-i}), \sigma_{-i}) \qquad (1)$$

We can think of this as the lowest payoff a rational player could ever get in equilibrium if he anticipates his opponents' (possibly non-rational) play. Intuitively, the minmax payoff $\underline{v}_i$ is the payoff player i can guarantee to herself even if the other players try to punish her as badly as they can. The minmax payoff is a measure of the punishment other players can inflict.

Theorem 1 (Folk Theorem) Suppose that the set of feasible payoffs of G is I-dimensional. Then for any feasible and strictly individually rational payoff vector v there exists $\underline{\delta} < 1$ such that for all $\delta > \underline{\delta}$ there exists a SPE $s^*$ of $G^\infty$ such that the average payoff to $s^*$ is v, i.e.
$$u_i(s^*) = \frac{v_i}{1-\delta}$$

The normalized (or average) payoff is defined as $P = (1-\delta)\,u_i(s^*)$. It is the payoff which a stage game would have to generate in each period such that we are indifferent between that payoff stream and the one generated by $s^*$:
$$P + \delta P + \delta^2 P + \dots = u_i(s^*)$$

4.1 Example: Prisoner's Dilemma

• The feasible payoff set is the diamond bounded by (0,0), (2,−1), (−1,2) and (1,1). Every point inside can be generated as a convex combination of these payoff vectors.
• The minmax payoff for each player is 0, as you can easily check. The other player can at most punish his rival by defecting, and each player [text and figure lost in extraction] that the equilibria I showed before generate payoffs inside this area.

4.2 Example: BOS

Consider

[Payoff matrix lost in extraction.]

Here each player can guarantee herself at least payoff 2/3, which is the payoff from playing the mixed strategy Nash equilibrium. You can check that whenever player 2 mixes with different probabilities, player 1 can guarantee herself more than this payoff by playing either F or O all the time.

4.3 Idea behind the Proof

1. Have players on the equilibrium path play an action with payoff v (or alternate if necessary to generate this payoff). (See footnote 2.)

2. If some player deviates punish him by
having the other players for T periods choose $\sigma_{-i}$ such that player i gets $\underline{v}_i$.

3. After the end of the punishment phase reward all players (other than i) for having carried out the punishment by switching to an action profile where player i gets $v^P_i$ [the rest of this expression is garbled in the source; the surviving symbols are $v^P$, the indices i and j, and a plus sign].

Footnote 2: For example, in the BoS it is not possible to generate (3/2, 3/2) in the stage game even with mixing. However, if players alternate and play (O,F) and then (F,O) the players can get arbitrarily close for large δ.
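The one-shot deviation check for the grim trigger profile in the repeated Prisoner's Dilemma can be written out directly. The sketch below assumes stage payoffs to the deviating player of 1 for (C,C), 2 for (D,C), −1 for (C,D) and 0 for (D,D), read off the lecture's feasible set; the function name is illustrative.

```python
from fractions import Fraction

# Stage-game payoff to the row player; values assumed from the feasible set
# (0,0), (2,-1), (-1,2), (1,1) given in the lecture.
PAYOFF = {("C", "C"): 1, ("D", "C"): 2, ("C", "D"): -1, ("D", "D"): 0}

def grim_trigger_is_spe(delta):
    """Check both classes of histories with the single period deviation principle."""
    delta = Fraction(delta)
    punish_forever = delta * PAYOFF[("D", "D")] / (1 - delta)  # value of (D,D) from next period on

    # Cooperative histories: follow C forever vs. defect once, then mutual defection.
    follow_coop = PAYOFF[("C", "C")] / (1 - delta)
    deviate_coop = PAYOFF[("D", "C")] + punish_forever

    # Punishment histories: follow D forever vs. play C once against D.
    follow_punish = PAYOFF[("D", "D")] / (1 - delta)
    deviate_punish = PAYOFF[("C", "D")] + punish_forever

    return follow_coop >= deviate_coop and follow_punish >= deviate_punish

for d in (Fraction(49, 100), Fraction(1, 2), Fraction(9, 10)):
    print(d, grim_trigger_is_spe(d))
```

The check passes exactly when 1/(1−δ) ≥ 2, that is for δ ≥ 1/2, in line with Proposition 1.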

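Game Theory Lecture Notes 3

Have you seen the film A Beautiful Mind? (Lao Liang Talks Film)
Wednesday, January 28, 2015
Game Theory and Everyday Life

1. Nash: Genius or Madman?

John Forbes Nash:
Born in 1928 in West Virginia, USA. In 1948 he was admitted to four universities at once, including Princeton and Harvard, and in the end chose Princeton. In 1950 he submitted his doctoral thesis, Non-Cooperative Games, and in the same year published Equilibrium Points in n-Person Games. In 1957 he married Alicia; unfortunately, the following year he was committed to a psychiatric hospital.

Confessing one's love

                 Confess      Don't confess
Confess          (10,10)      (10,10)
Don't confess    (10,10)      (0,0)

[Case 2]
A couple is resting at home when a neighbor knocks on the door to borrow a hammer. They lend it to him reluctantly.

The next day he comes to borrow a saw. The husband says, "We are about to use it this afternoon." The neighbor asks, "Are both of you going?" "Yes." "Wonderful. If you are both off pruning branches you certainly won't be playing golf, so may I borrow your golf clubs?"

4. Nash Equilibrium

Definition of Nash equilibrium: Suppose n players take part in a game. Given the strategies of the others, each player chooses his own optimal strategy (which may or may not depend on the others' strategies) so as to maximize his own utility. The strategies of all players together form a strategy profile. A Nash equilibrium is a strategy profile made up of every player's optimal strategy: given the others' strategies, no one has sufficient reason to break the equilibrium.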
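The definition of Nash equilibrium can be applied mechanically to a small payoff table: a cell is an equilibrium when neither player gains by changing only his own strategy. The sketch below is illustrative; the function name and the prisoner's dilemma payoffs are assumptions (in particular the (-8, -8) cell is hypothetical and does not come from the slides).

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row_strategy, column_strategy) to (u1, u2).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u1, u2 = payoffs[(r, c)]
        row_cannot_gain = all(payoffs[(r2, c)][0] <= u1 for r2 in rows)
        col_cannot_gain = all(payoffs[(r, c2)][1] <= u2 for c2 in cols)
        if row_cannot_gain and col_cannot_gain:
            equilibria.append((r, c))
    return equilibria

# Hypothetical prisoner's dilemma: payoffs are years in prison with a minus sign.
prisoners_dilemma = {
    ("confess", "confess"): (-8, -8),
    ("confess", "silent"): (0, -10),
    ("silent", "confess"): (-10, 0),
    ("silent", "silent"): (-1, -1),
}
print(pure_nash_equilibria(prisoners_dilemma))
```

Under these assumed payoffs confessing is the best choice whatever the other prisoner does, so the only equilibrium is mutual confession.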
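Dominant-Strategy Equilibrium

Take prisoner 1 as an example: whatever strategy prisoner 2 adopts…

[Payoff table for prisoner 1 and prisoner 2, mostly lost in extraction. Surviving fragments: the label "Don't confess" and the payoff cells (0, -10) and (-1, -1).]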
Lecture VI: Existence of Nash equilibrium
Markus M. Möbius
February 26, 2004

• Gibbons, sections 1.3B
• Osborne, chapter 4

1 Nash's Existence Theorem

When we introduced the notion of Nash equilibrium the idea was to come up with a solution concept which is stronger than IDSDS. Today we show that NE is not too strong in the sense that it guarantees the existence of at least one mixed Nash equilibrium in most games (for sure in all finite games). This is reassuring because it tells us that there is at least one way to play most games. (See footnote 1.)

Let's start by stating the main theorem we will prove:

Theorem 1 (Nash Existence) Every finite strategic-form game has a mixed-strategy Nash equilibrium.

For this reason many game theorists regard the set of NE as the lower bound for the set of reasonable solution concepts. A lot of research has gone into refining the notion of NE in order to retain the existence result but get more precise predictions in games with multiple equilibria (such as coordination games).

However, we have already discussed games which are solvable by IDSDS and hence have a unique Nash equilibrium as well (for example, the two thirds of the average game), but subjects in an experiment will not follow those equilibrium prescriptions. Therefore, if we want to describe and predict the behavior of real-world people rather than come up with an explanation of how they should play a game, then the notion of NE and even IDSDS can be too restrictive. Behavioral game theory has tried to weaken the joint assumptions of rationality and common knowledge in order to come up with better theories of how real people play real games. Anyone interested should take David Laibson's course next year.

Despite these reservations about Nash equilibrium it is still a very useful benchmark and a starting point for any game analysis. In the following we will go through three proofs of the Existence Theorem using various levels of mathematical sophistication:

• existence in 2 × 2 games using elementary techniques
• existence in 2 × 2 games using a fixed point approach
• general existence theorem in finite games

You are only required to understand the simplest approach. The rest is for the intellectually curious.

Footnote 1: Note that a pure Nash equilibrium is a (degenerate) mixed equilibrium, too.

2 Nash Existence in 2 × 2 Games

Let us consider the simple 2 × 2 game which we discussed in the previous lecture on mixed Nash equilibria:

[Payoff matrix with rows U, D and columns L, R; the entries were lost in extraction.]

We next draw the best-response curves of both players. Recall that player 1's strategy can be represented by a single number α such that σ1 = αU + (1 − α)D while player 2's strategy is σ2 = βL + (1 − β)R.

Let's find the best response of player 2 to player 1 playing strategy α:

u2(L, αU + (1 − α)D) = 2 − α
u2(R, αU + (1 − α)D) = 1 + 3α

Hence player 2 strictly prefers L when α < 1/4, strictly prefers R when α > 1/4, and is indifferent at α = 1/4.

[The remaining best-response equations, up to equation (3), were lost in extraction.]

We draw both best-response correspondences in a single graph (the graph is in color - so looking at it on the computer screen might help you):

[Figure: best-response correspondences, including BR2(α), in the unit square with α on the horizontal axis and β on the vertical axis; ticks at α = 1/4 and β = 2/3.]

We immediately see that both correspondences intersect in the single point α = 1/4 and β = 2/3, which is therefore the unique (mixed) Nash equilibrium of the game.

What's useful about this approach is that it generalizes to a proof that any two by two game has at least one Nash equilibrium, i.e. its two best response correspondences have to intersect in at least one point. An informal argument runs as follows:

1. The best response correspondence for player 2 maps each α into at least one β. The graph of the correspondence connects the left and right side of the square [0, 1] × [0, 1]. This connection is continuous - the only discontinuity could happen when player 2's best response switches from L to R or vice versa at some α*. But at this switching point player 2 has to be exactly indifferent between both strategies, hence the graph has the value BR2(α*) = [0, 1] at this point and there cannot be a discontinuity. Note that this is precisely why we need mixed strategies - with pure strategies the BR graph would generally be discontinuous at some point.

2. By an analogous argument the BR graph of player 1 connects the upper and lower side of the square [0, 1] × [0, 1].

3. Two lines which connect the left/right side and the upper/lower side of the square respectively have to intersect in at least one point. Hence each 2 by 2 game has a mixed Nash equilibrium.
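Player 2's best-response correspondence in the 2 × 2 example follows from comparing the two expected payoffs u2(L) = 2 − α and u2(R) = 1 + 3α. The sketch below returns the set of optimal β as an interval; the function names and the interval encoding are illustrative choices, not part of the notes.

```python
from fractions import Fraction

def u2_left(alpha):
    """Player 2's expected payoff from L when player 1 plays U with probability alpha."""
    return 2 - alpha

def u2_right(alpha):
    """Player 2's expected payoff from R when player 1 plays U with probability alpha."""
    return 1 + 3 * alpha

def best_response_2(alpha):
    """Optimal probabilities beta of playing L, returned as an interval (low, high)."""
    alpha = Fraction(alpha)
    if u2_left(alpha) > u2_right(alpha):
        return (1, 1)   # L is strictly better: beta = 1
    if u2_left(alpha) < u2_right(alpha):
        return (0, 0)   # R is strictly better: beta = 0
    return (0, 1)       # indifferent: any beta in [0, 1] is optimal

for a in (0, Fraction(1, 4), 1):
    print(a, best_response_2(a))
```

The correspondence takes the whole interval [0, 1] only at α = 1/4, which is the switching point that keeps the best-response graph connected.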