Stanford Cryptography Course 06.2 - collision-resistance-generic-birthday-attack
Identity-Based Encryption from the Weil Pairing

Appears in SIAM J.of Computing,Vol.32,No.3,pp.586-615,2003.An extended abstract of this paper appears in the Proceedings of Crypto2001,volume2139of Lecture Notes in Computer Science,pages 213–229,Springer-Verlag,2001.Identity-Based Encryption from the Weil PairingDan Boneh∗Matthew Franklin†dabo@ franklin@AbstractWe propose a fully functional identity-based encryption scheme(IBE).The scheme has chosen ciphertext security in the random oracle model assuming a variant of the computational Diffie-Hellman problem.Our system is based on bilinear maps between groups.The Weil pairing onelliptic curves is an example of such a map.We give precise definitions for secure identity basedencryption schemes and give several applications for such systems.1IntroductionIn1984Shamir[41]asked for a public key encryption scheme in which the public key can be an arbitrary string.In such a scheme there are four algorithms:(1)setup generates global system parameters and a master-key,(2)extract uses the master-key to generate the private key corresponding to an arbitrary public key string ID∈{0,1}∗,(3)encrypt encrypts messages using the public key ID,and(4)decrypt decrypts messages using the corresponding private key.Shamir’s original motivation for identity-based encryption was to simplify certificate management in e-mail systems.When Alice sends mail to Bob at bob@ she simply encrypts her message using the public key string“bob@”.There is no need for Alice to obtain Bob’s public key certificate.When Bob receives the encrypted mail he contacts a third party,which we call the Private Key Generator(PKG).Bob authenticates himself to the PKG in the same way he would authenticate himself to a CA and obtains his private key from the PKG.Bob can then read his e-mail.Note that unlike the existing secure e-mail infrastructure,Alice can send encrypted mail to Bob even if Bob has not yet setup his public key certificate.Also note that key escrow is inherent in identity-based e-mail systems:the PKG knows Bob’s private key.We discuss key revocation,as well as several new applications for IBE schemes in the next section.Since the problem was posed in1984there have been several proposals for IBE schemes[11,45, 44,31,25](see also[33,p.561]).However,none of these are fully satisfactory.Some solutions require that users not collude.Other solutions require the PKG to spend a long time for each private key generation request.Some solutions require tamper resistant hardware.It is fair to say that until the results in[5]constructing a usable IBE system was an open problem.Interestingly,the related notions of identity-based signature and authentication schemes,also introduced by Shamir[41],do have satisfactory solutions[15,14].In this paper we propose a fully functional identity-based encryption scheme.The performance of our system is comparable to the performance of ElGamal encryption in F∗p.The security of our system is based on a natural analogue of the computational Diffie-Hellman assumption.Based on ∗Supported by DARPA contract F30602-99-1-0530,NSF,and the Packard Foundation.†Supported by an NSF Career Award and the Packard Foundation.this assumption we show that the new system has chosen ciphertext security in the random oracle ing standard techniques from threshold cryptography[20,22]the PKG in our scheme can be distributed so that the master-key is never available in a single location.Unlike common threshold systems,we show that robustness for our distributed PKG is free.Our IBE system can be built from any bilinear map 
e:G1×G1→G2between two groups G1,G2as long as a variant of the Computational Diffie-Hellman problem in G1is hard.We use the Weil pairing on elliptic curves as an example of such a map.Until recently the Weil pairing has mostly been used for attacking elliptic curve systems[32,17].Joux[26]recently showed that the Weil pairing can be used for“good”by using it for a protocol for three party one round Diffie-Hellman key exchange.Sakai et al.[40]used the pairing for key exchange and Verheul[46]used it to construct an ElGamal encryption scheme where each public key has two corresponding private keys.In addition to our identity-based encryption scheme,we show how to construct an ElGamal encryption scheme with“built-in”key escrow,i.e.,where one global escrow key can decrypt ciphertexts encrypted under any public key.To argue about the security of our IBE system we define chosen ciphertext security for identity-based encryption.Our model gives the adversary more power than the standard model for chosen ciphertext security[37,2].First,we allow the attacker to attack an arbitrary public key ID of her choice.Second,while mounting a chosen ciphertext attack on ID we allow the attacker to obtain from the PKG the private key for any public key of her choice,other than the private key for ID.This models an attacker who obtains a number of private keys corresponding to some identities of her choice and then tries to attack some other public key ID of her choice.Even with the help of such queries the attacker should have negligible advantage in defeating the semantic security of the system.The rest of the paper is organized as follows.Several applications of identity-based encryption are discussed in Section1.1.We then give precise definitions and security models in Section2.We describe bilinear maps with certain properties in Section3.Our identity-based encryption scheme is presented in Section4using general bilinear maps.Then a concrete identity based system from the Weil pairing is given in Section5.Some extensions and variations(efficiency improvements,distribution of the master-key)are considered in Section6.Our construction for ElGamal encryption with a global escrow key is described in Section7.Section8gives conclusions and some open problems.The Appendix containsa more detailed discussion of the Weil pairing.1.1Applications for Identity-Based EncryptionThe original motivation for identity-based encryption is to help the deployment of a public key infras-tructure.In this section,we show several other unrelated applications.1.1.1Revocation of Public KeysPublic key certificates contain a preset expiration date.In an IBE system key expiration can be done by having Alice encrypt e-mail sent to Bob using the public key:“bob@ current-year”. 
In doing so Bob can use his private key during the current year only.Once a year Bob needs to obtain a new private key from the PKG.Hence,we get the effect of annual private key expiration.Note that unlike the existing PKI,Alice does not need to obtain a new certificate from Bob every time Bob refreshes his private key.One could potentially make this approach more granular by encrypting e-mail for Bob using “bob@ current-date”.This forces Bob to obtain a new private key every day.This might be possible in a corporate PKI where the PKG is maintained by the corporation.With this approach key revocation is very simple:when Bob leaves the company and his key needs to be revoked, the corporate PKG is instructed to stop issuing private keys for Bob’s e-mail address.As a result,Bob can no longer read his email.The interesting property is that Alice does not need to communicate with any third party certificate directory to obtain Bob’s daily public key.Hence,identity based encryption is a very efficient mechanism for implementing ephemeral public keys.Also note that this approach enables Alice to send messages into the future:Bob will only be able to decrypt the e-mail on the date specified by Alice(see[38,12]for methods of sending messages into the future using a stronger security model).Managing user credentials.A simple extension to the discussion above enables us to manage user credentials using the IBE system.Suppose Alice encrypts mail to Bob using the public key:“bob@ current-year clearance=secret”.Then Bob will only be able to read the email if on the specified date he has secret clearance.Consequently,it is easy to grant and revoke user credentials using the PKG.1.1.2Delegation of Decryption KeysAnother application for IBE systems is delegation of decryption capabilities.We give two example applications.In both applications the user Bob plays the role of the PKG.Bob runs the setup algorithm to generate his own IBE system parameters params and his own master-key.Here we view params as Bob’s public key.Bob obtains a certificate from a CA for his public key params.When Alice wishes to send mail to Bob shefirst obtains Bob’s public key params from Bob’s public key certificate.Note that Bob is the only one who knows his master-key and hence there is no key-escrow with this setup.1.Delegation to a laptop.Suppose Alice encrypts mail to Bob using the current date as the IBE encryption key(she uses Bob’s params as the IBE system parameters).Since Bob has the master-key he can extract the private key corresponding to this IBE encryption key and then decrypt the message.Now,suppose Bob goes on a trip for seven days.Normally,Bob would put his private key on his laptop.If the laptop is stolen the private key is compromised.When using the IBE system Bob could simply install on his laptop the seven private keys corresponding to the seven days of the trip.If the laptop is stolen,only the private keys for those seven days are compromised.The master-key is unharmed.This is analogous to the delegation scenario for signature schemes considered by Goldreich et al.[23].2.Delegation of duties.Suppose Alice encrypts mail to Bob using the subject line as the IBE encryption key.Bob can decrypt mail using his master-key.Now,suppose Bob has several assistants each responsible for a different task(e.g.one is‘purchasing’,another is‘human-resources’,etc.).Bob gives one private key to each of his assistants corresponding to the assistant’s responsibility.Each assistant can then decrypt messages whose subject line falls 
within its responsibilities,but it cannot decrypt messages intended for other assistants.Note that Alice only obtains a single public key from Bob(params),and she uses that public key to send mail with any subject line of her choice.The mail can only be read by the assistant responsible for that subject.More generally,IBE can simplify security systems that manage a large number of public keys.Rather than storing a big database of public keys the system can either derive these public keys from usernames, or simply use the integers1,...,n as distinct public keys.2DefinitionsIdentity-Based Encryption.An identity-based encryption scheme E is specified by four random-ized algorithms:Setup,Extract,Encrypt,Decrypt:Setup:takes a security parameter k and returns params(system parameters)and master-key.The system parameters include a description of afinite message space M,and a description of afinite ciphertext space C.Intuitively,the system parameters will be publicly known,while the master-key will be known only to the“Private Key Generator”(PKG).Extract:takes as input params,master-key,and an arbitrary ID∈{0,1}∗,and returns a private key d.Here ID is an arbitrary string that will be used as a public key,and d is the corresponding private decryption key.The Extract algorithm extracts a private key from the given public key.Encrypt:takes as input params,ID,and M∈M.It returns a ciphertext C∈C.Decrypt:takes as input params,C∈C,and a private key d.It return M∈M.These algorithms must satisfy the standard consistency constraint,namely when d is the private key generated by algorithm Extract when it is given ID as the public key,then∀M∈M:Decrypt(params,C,d)=M where C=Encrypt(params,ID,M)Chosen ciphertext security.Chosen ciphertext security(IND-CCA)is the standard acceptable notion of security for a public key encryption scheme[37,2,13].Hence,it is natural to require that an identity-based encryption scheme also satisfy this strong notion of security.However,the definition of chosen ciphertext security must be strengthened a bit.The reason is that when an adversary attacks a public key ID in an identity-based system,the adversary might already possess the private keys of users ID1,...,ID n of her choice.The system should remain secure under such an attack.Hence,the definition of chosen ciphertext security must allow the adversary to obtain the private key associated with any identity ID i of her choice(other than the public key ID being attacked).We refer to such queries as private key extraction queries.Another difference is that the adversary is challenged on a public key ID of her choice(as opposed to a random public key).We say that an identity-based encryption scheme E is semantically secure against an adaptive chosen ciphertext attack(IND-ID-CCA)if no polynomially bounded adversary A has a non-negligible advantage against the Challenger in the following IND-ID-CCA game:Setup:The challenger takes a security parameter k and runs the Setup algorithm.It givesthe adversary the resulting system parameters params.It keeps the master-key to itself.Phase1:The adversary issues queries q1,...,q m where query q i is one of:–Extraction query ID i .The challenger responds by running algorithm Extract to gen-erate the private key d i corresponding to the public key ID i .It sends d i to theadversary.–Decryption query ID i,C i .The challenger responds by running algorithm Extract to generate the private key d i corresponding to ID i.It then runs algorithm Decrypt todecrypt the ciphertext C i using the private key d 
i.It sends the resulting plaintext tothe adversary.These queries may be asked adaptively,that is,each query q i may depend on the repliesto q1,...,q i−1.Challenge:Once the adversary decides that Phase1is over it outputs two equal lengthplaintexts M0,M1∈M and an identity ID on which it wishes to be challenged.The onlyconstraint is that ID did not appear in any private key extraction query in Phase1.The challenger picks a random bit b∈{0,1}and sets C=Encrypt(params,ID,M b).Itsends C as the challenge to the adversary.Phase2:The adversary issues more queries q m+1,...,q n where query q i is one of:–Extraction query ID i where ID i=ID.Challenger responds as in Phase1.–Decryption query ID i,C i = ID,C .Challenger responds as in Phase1.These queries may be asked adaptively as in Phase1.Guess:Finally,the adversary outputs a guess b ∈{0,1}and wins the game if b=b .We refer to such an adversary A as an IND-ID-CCA adversary.We define adversary A’sadvantage in attacking the scheme E as the following function of the security parameter k (k is given as input to the challenger):Adv E,A(k)= Pr[b=b ]−12 .The probability is over the random bits used by the challenger and the adversary.Using the IND-ID-CCA game we can define chosen ciphertext security for IBE schemes.As usual,we say that a function g:R→R is negligible if for any d>0we have|g(k)|<1/k d for sufficiently large k. Definition2.1.We say that the IBE system E is semantically secure against an adaptive chosen ci-phertext attack if for any polynomial time IND-ID-CCA adversary A the function Adv E,A(k)is negligible. As shorthand,we say that E is IND-ID-CCA secure.Note that the standard definition of chosen ciphertext security(IND-CCA)[37,2]is the same as above except that there are no private key extraction queries and the adversary is challenged on a random public key(rather than a public key of her choice).Private key extraction queries are related to the definition of chosen ciphertext security in the multiuser settings[7].After all,our definition involves multiple public keys belonging to multiple users.In[7]the authors show that that multiuser IND-CCA is reducible to single user IND-CCA using a standard hybrid argument.This does not hold in the identity-based settings,IND-ID-CCA,since the adversary gets to choose which public keys to corrupt during the attack.To emphasize the importance of private key extraction queries we note that our IBE system can be easily modified(by removing one of the hash functions)into a system which has chosen ciphertext security when private extraction queries are disallowed.However,the scheme is completely insecure when extraction queries are allowed.Semantically secure identity based encryption.The proof of security for our IBE system makes use of a weaker notion of security known as semantic security(also known as semantic security against a chosen plaintext attack)[24,2].Semantic security is similar to chosen ciphertext security(IND-ID-CCA)except that the adversary is more limited;it cannot issue decryption queries while attacking the challenge public key.For a standard public key system(not an identity based system)semantic security is defined using the following game:(1)the adversary is given a random public key generated by the challenger,(2)the adversary outputs two equal length messages M0and M1and receives the encryption of M b from the challenger where b is chosen at random in{0,1},(3)the adversary outputs b and wins the game if b=b .The public key system is said to be semantically secure if no polynomial 
time adversary can win the game with a non-negligible advantage.As shorthand we say that a semantically secure public key system is IND-CPA secure.Semantic security captures our intuition that given a ciphertext the adversary learns nothing about the corresponding plaintext.To define semantic security for identity based systems(denoted IND-ID-CPA)we strengthen the standard definition by allowing the adversary to issue chosen private key extraction queries.Similarly, the adversary is challenged on a public key ID of her choice.We define semantic security for identity based encryption schemes using an IND-ID-CPA game.The game is identical to the IND-ID-CCA game defined above except that the adversary cannot make any decryption queries.The adversary can only make private key extraction queries.We say that an identity-based encryption scheme E is semantically secure(IND-ID-CPA)if no polynomially bounded adversary A has a non-negligible advantage against the Challenger in the following IND-ID-CPA game:Setup:The challenger takes a security parameter k and runs the Setup algorithm.It givesthe adversary the resulting system parameters params.It keeps the master-key to itself.Phase1:The adversary issues private key extraction queries ID1,...,ID m.The challengerresponds by running algorithm Extract to generate the private key d i corresponding tothe public key ID i.It sends d i to the adversary.These queries may be asked adaptively.Challenge:Once the adversary decides that Phase1is over it outputs two equal lengthplaintexts M0,M1∈M and a public key ID on which it wishes to be challenged.Theonly constraint is that ID did not appear in any private key extraction query in Phase1.The challenger picks a random bit b∈{0,1}and sets C=Encrypt(params,ID,M b).Itsends C as the challenge to the adversary.Phase2:The adversary issues more extraction queries ID m+1,...,ID n.The only constraintis that ID i=ID.The challenger responds as in Phase1.Guess:Finally,the adversary outputs a guess b ∈{0,1}and wins the game if b=b .We refer to such an adversary A as an IND-ID-CPA adversary.As we did above,theadvantage of an IND-ID-CPA adversary A against the scheme E is the following function of the security parameter k:Adv E,A(k)= Pr[b=b ]−12 .The probability is over the random bits used by the challenger and the adversary.Definition2.2.We say that the IBE system E is semantically secure if for any polynomial time IND-ID-CPA adversary A the function Adv E,A(k)is negligible.As shorthand,we say that E is IND-ID-CPA secure.One way identity-based encryption.One can define an even weaker notion of security called one-way encryption(OWE)[16].Roughly speaking,a public key encryption scheme is a one-way encryption if given the encryption of a random plaintext the adversary cannot produce the plaintext in its entirety. 
One way encryption is a weak notion of security since there is nothing preventing the adversary from, say,learning half the bits of the plaintext.Hence,one-way encryption schemes do not generally provide secure encryption.In the random oracle model one-way encryption schemes can be used for encrypting session-keys(the session-key is taken to be the hash of the plaintext).We note that one can extend the notion of one-way encryption to identity based systems by adding private key extraction queries to the definition.We do not give the full definition here since in this paper we use semantic security as the weakest notion of security.See[5]for the full definition of identity based one-way encryption,and its use as part of an alternative proof strategy for our main result.Random oracle model.To analyze the security of certain natural cryptographic constructions Bel-lare and Rogaway introduced an idealized security model called the random oracle model[3].Roughlyspeaking,a random oracle is a function H:X→Y chosen uniformly at random from the set of all functions{h:X→Y}(we assume Y is afinite set).An algorithm can query the random oracle at any point x∈X and receive the value H(x)in response.Random oracles are used to model crypto-graphic hash functions such as SHA-1.Note that security in the random oracle model does not imply security in the real world.Nevertheless,the random oracle model is a useful tool for validating natural cryptographic constructions.Security proofs in this model prove security against attackers that are confined to the random oracle world.Notation.From here on we use Z q to denote the group{0,...,q−1}under addition modulo q.For a group G of prime order we use G∗to denote the set G∗=G\{O}where O is the identity element in the group G.We use Z+to denote the set of positive integers.3Bilinear maps and the Bilinear Diffie-Hellman AssumptionLet G1and G2be two groups of order q for some large prime q.Our IBE system makes use of a bilinear mapˆe:G1×G1→G2between these two groups.The map must satisfy the following properties: 1.Bilinear:We say that a mapˆe:G1×G1→G2is bilinear ifˆe(aP,bQ)=ˆe(P,Q)ab for all P,Q∈G1 and all a,b∈Z.2.Non-degenerate:The map does not send all pairs in G1×G1to the identity in G2.Observe that since G1,G2are groups of prime order this implies that if P is a generator of G1thenˆe(P,P)is a generator of G2.putable:There is an efficient algorithm to computeˆe(P,Q)for any P,Q∈G1.A bilinear map satisfying the three properties above is said to be an admissible bilinear map.In Section5we give a concrete example of groups G1,G2and an admissible bilinear map between them. 
The group G1is a subgroup of the additive group of points of an elliptic curve E/F p.The group G2is asubgroup of the multiplicative group of afinitefield F∗p2.Therefore,throughout the paper we view G1as an additive group and G2as a multiplicative group.As we will see in Section5.1,the Weil pairing can be used to construct an admissible bilinear map between these two groups.The existence of the bilinear mapˆe:G1×G1→G2as above has two direct implications to these groups.The MOV reduction:Menezes,Okamoto,and Vanstone[32]show that the discrete log problem in G1is no harder than the discrete log problem in G2.To see this,let P,Q∈G1be an instance of the discrete log problem in G1where both P,Q have order q.We wish tofind anα∈Z q such that Q=αP.Let g=ˆe(P,P)and h=ˆe(Q,P).Then,by bilinearity ofˆe we know that h=gα.By non-degeneracy ofˆe both g,h have order q in G2.Hence,we reduced the discrete log problem in G1to a discrete log problem in G2.It follows that for discrete log to be hard in G1we must choose our security parameter so that discrete log is hard in G2(see Section5).Decision Diffie-Hellman is Easy:The Decision Diffie-Hellman problem(DDH)[4]in G1is to dis-tinguish between the distributions P,aP,bP,abP and P,aP,bP,cP where a,b,c are random in Z∗q and P is random in G∗1.Joux and Nguyen[28]point out that DDH in G1is easy.To see this,observe that given P,aP,bP,cP∈G∗1we havec=ab mod q⇐⇒ˆe(P,cP)=ˆe(aP,bP).The Computational Diffie-Hellman problem(CDH)in G1can still be hard(CDH in G1is tofind abP given random P,aP,bP ).Joux and Nguyen[28]give examples of mappingsˆe:G1×G1→G2where CDH in G1is believed to be hard even though DDH in G1is easy.3.1The Bilinear Diffie-Hellman Assumption(BDH)Since the Decision Diffie-Hellman problem(DDH)in G1is easy we cannot use DDH to build cryp-tosystems in the group G1.Instead,the security of our IBE system is based on a variant of the Computational Diffie-Hellman assumption called the Bilinear Diffie-Hellman Assumption(BDH). 
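The observation that DDH in G1 is easy can be written out directly: given ⟨P, aP, bP, cP⟩, accept exactly when ê(P, cP) = ê(aP, bP). Below is a minimal Python sketch of that test; the pairing, the G1 sampler, and the point arithmetic are assumed to be supplied by some pairing library (none is specified here), so they appear only as parameters.

import secrets

def is_dh_tuple(pairing, P, aP, bP, cP):
    # Decide DDH in G1 with the bilinear map:
    # c = a*b (mod q)  if and only if  e(P, cP) == e(aP, bP).
    return pairing(P, cP) == pairing(aP, bP)

def ddh_demo(pairing, random_g1_element, q):
    # `pairing` and `random_g1_element` are placeholders for a pairing
    # library's bilinear map and G1 sampler; `q` is the group order.
    P = random_g1_element()
    a = secrets.randbelow(q - 1) + 1
    b = secrets.randbelow(q - 1) + 1
    c = secrets.randbelow(q - 1) + 1
    genuine = is_dh_tuple(pairing, P, a * P, b * P, ((a * b) % q) * P)   # True
    random_ = is_dh_tuple(pairing, P, a * P, b * P, c * P)               # almost surely False
    return genuine, random_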
Bilinear Diffie-Hellman Problem.Let G1,G2be two groups of prime order q.Letˆe:G1×G1→G2be an admissible bilinear map and let P be a generator of G1.The BDH problem in G1,G2,ˆe is as follows:Given P,aP,bP,cP for some a,b,c∈Z∗q compute W=ˆe(P,P)abc∈G2.An algorithm A has advantage in solving BDH in G1,G2,ˆe ifPr A(P,aP,bP,cP)=ˆe(P,P)abc ≥where the probability is over the random choice of a,b,c in Z∗q,the random choice of P∈G∗1,and the random bits of A.BDH Parameter Generator.We say that a randomized algorithm G is a BDH parameter generator if(1)G takes a security parameter k∈Z+,(2)G runs in polynomial time in k,and(3)G outputs a prime number q,the description of two groups G1,G2of order q,and the description of an admissible bilinear mapˆe:G1×G1→G2.We denote the output of G by G(1k)= q,G1,G2,ˆe .The security parameter k is used to determine the size of q;for example,one could take q to be a random k-bit prime.For i=1,2we assume that the description of the group G i contains polynomial time(in k) algorithms for computing the group action in G i and contains a generator of G i.The generator of G i enables us to generate uniformly random elements in G i.Similarly,we assume that the description of ˆe contains a polynomial time algorithm for computingˆe.We give an example of a BDH parameter generator in Section5.1.BDH Assumption.Let G be a BDH parameter generator.We say that an algorithm A has advan-tage (k)in solving the BDH problem for G if for sufficiently large k:Adv G,A(k)=Pr A(q,G1,G2,ˆe,P,aP,bP,cP)=ˆe(P,P)abc q,G1,G2,ˆe ←G(1k),P←G∗1,a,b,c←Z∗q ≥ (k) We say that G satisfies the BDH assumption if for any randomized polynomial time(in k)algorithm A we have that Adv G,A(k)is a negligible function.When G satisfies the BDH assumption we say that BDH is hard in groups generated by G.In Section5.1we give some examples of BDH parameter generators that are believed to satisfy the BDH assumption.We note that Joux[26](implicitly)used the BDH assumption to construct a one-round three party Diffie-Hellman protocol.The BDH assumption is also needed for constructions in[46,40].Hardness of BDH.It is interesting to study the relationship of the BDH problem to other hard problems used in cryptography.Currently,all we can say is that the BDH problem in G1,G2,ˆe is no harder than the CDH problem in G1or G2.In other words,an algorithm for CDH in G1or G2is sufficient for solving BDH in G1,G2,ˆe .The converse is currently an open problem:is an algorithm for BDH sufficient for solving CDH in G1or in G2?We refer to a survey by Joux[27]for a more detailed analysis of the relationship between BDH and other standard problems.We note that in all our examples(in Section5.1)the isomorphisms from G1to G2induced by the bilinear map are believed to be one-way functions.More specifically,for a point Q∈G∗1define the isomorphism f Q:G1→G2by f Q(P)=ˆe(P,Q).If any one of these isomorphisms turns out to be invertible then BDH is easy in G1,G2,ˆe .Fortunately,an efficient algorithm for inverting f Q for some fixed Q would imply an efficient algorithm for deciding DDH in the group G2.In all our examples DDH is believed to be hard in the group G2.Hence,all the isomorphisms f Q:G1→G2induced by the bilinear map are believed to be one-way functions.4Our Identity-Based Encryption SchemeWe describe our scheme in stages.First we give a basic identity-based encryption scheme which is not secure against an adaptive chosen ciphertext attack.The only reason for describing the basic scheme is to make the presentation easier to follow.Our full 
scheme, described in Section 4.2, extends the basic scheme to get security against an adaptive chosen ciphertext attack (IND-ID-CCA) in the random oracle model. In Section 4.3 we relax some of the requirements on the hash functions. The presentation in this section uses an arbitrary BDH parameter generator G satisfying the BDH assumption. In Section 5 we describe a concrete IBE system using the Weil pairing.

4.1 BasicIdent

To explain the basic ideas underlying our IBE system we describe the following simple scheme, called BasicIdent. We present the scheme by describing the four algorithms: Setup, Extract, Encrypt, Decrypt. We let k be the security parameter given to the setup algorithm. We let G be some BDH parameter generator.

Setup: Given a security parameter k ∈ Z+, the algorithm works as follows:
Step 1: Run G on input k to generate a prime q, two groups G1, G2 of order q, and an admissible bilinear map ê : G1 × G1 → G2. Choose a random generator P ∈ G1.
Step 2: Pick a random s ∈ Z*_q and set P_pub = sP.
Step 3: Choose a cryptographic hash function H1 : {0,1}* → G1*, and a cryptographic hash function H2 : G2 → {0,1}^n for some n. The security analysis will view H1, H2 as random oracles.
The message space is M = {0,1}^n. The ciphertext space is C = G1* × {0,1}^n. The system parameters are params = ⟨q, G1, G2, ê, n, P, P_pub, H1, H2⟩. The master-key is s ∈ Z*_q.

Extract: For a given string ID ∈ {0,1}* the algorithm (1) computes Q_ID = H1(ID) ∈ G1*, and (2) sets the private key d_ID to be d_ID = sQ_ID where s is the master-key.

Encrypt: To encrypt M ∈ M under the public key ID do the following: (1) compute Q_ID = H1(ID) ∈ G1*, (2) choose a random r ∈ Z*_q, and (3) set the ciphertext to be C = ⟨rP, M ⊕ H2(g_ID^r)⟩ where g_ID = ê(Q_ID, P_pub) ∈ G2*.
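To make the four algorithms above concrete, here is a schematic Python sketch of BasicIdent. It is a sketch only: the pairing ê, the hash-to-G1 function H1, the generator P, and all group arithmetic are assumed to come from some pairing library and appear here as parameters or placeholders; H2 is modeled with truncated SHA-256 over an assumed canonical encoding of G2 elements. Nothing here is a secure or complete implementation of the paper's scheme.

import hashlib, secrets

def H2(g2_element, n_bytes):
    # Model H2 : G2 -> {0,1}^n with SHA-256; bytes(...) stands in for a
    # canonical encoding of G2 elements that a real library would provide.
    return hashlib.sha256(bytes(g2_element)).digest()[:n_bytes]

def setup(q, P):
    s = secrets.randbelow(q - 1) + 1            # master-key s in Z_q*
    P_pub = s * P
    return P_pub, s

def extract(H1, s, identity):
    Q_id = H1(identity)                         # Q_ID = H1(ID) in G1*
    return s * Q_id                             # d_ID = s * Q_ID

def encrypt(e, H1, P, P_pub, q, identity, message):
    Q_id = H1(identity)
    r = secrets.randbelow(q - 1) + 1
    g_id = e(Q_id, P_pub)                       # g_ID = e(Q_ID, P_pub) in G2
    mask = H2(g_id ** r, len(message))
    return (r * P, bytes(m ^ k for m, k in zip(message, mask)))

def decrypt(e, d_id, ciphertext):
    U, V = ciphertext
    mask = H2(e(d_id, U), len(V))               # e(d_ID, rP) = e(Q_ID, P_pub)^r
    return bytes(v ^ k for v, k in zip(V, mask))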
Stanford University Machine Learning: Complete Collection of Problems and Answers

CS 229机器学习(问题及答案)斯坦福大学目录(1) 作业1(Supervised Learning) 1(2) 作业1解答(Supervised Learning) 5(3) 作业2(Kernels, SVMs, and Theory)15(4) 作业2解答(Kernels, SVMs, and Theory)19(5) 作业3(Learning Theory and UnsupervisedLearning)27 (6) 作业3解答(Learning Theory and Unsupervised Learning)31 (7) 作业4(Unsupervised Learning and Reinforcement Learning)39 (8) 作业4解答(Unsupervised Learning and Reinforcement Learning)44(9) Problem Set #1: Supervised Learning 56(10) Problem Set #1 Answer 62(11) Problem Set #2: Problem Set #2: Naive Bayes, SVMs, and Theory 78 (12) Problem Set #2 Answer 85CS229Problem Set#11 CS229,Public CourseProblem Set#1:Supervised Learning1.Newton’s method for computing least squaresIn this problem,we will prove that if we use Newton’s method solve the least squares optimization problem,then we only need one iteration to converge toθ∗.(a)Find the Hessian of the cost function J(θ)=1T x(i)−y(i))2.2P i=1(θm(b)Show that thefirst iteration of Newton’s method gives usθ⋆=(X T X)−1X T~y,thesolution to our least squares problem.2.Locally-weighted logistic regressionIn this problem you will implement a locally-weighted version of logistic regression,wherewe weight different training examples differently according to the query point.The locally-weighted logistic regression problem is to maximizeλℓ(θ)=−θTθ+Tθ+2mw(i)hy(i)log hθ(x(i))+(1−y(i))log(1−hθ(x(i)))i. Xi=1The−λTθhere is what is known as a regularization parameter,which will be discussed2θin a future lecture,but which we include here because it is needed for Newton’s method to perform well on this task.For the entirety of this problem you can use the valueλ=0.0001.Using this definition,the gradient ofℓ(θ)is given by∇θℓ(θ)=XT z−λθwhere z∈R m is defined byz i=w(i)(y(i)−hθ(x(i)))and the Hessian is given byH=XT DX−λIwhere D∈R m×m is a diagonal matrix withD ii=−w(i)hθ(x(i))(1−hθ(x(i)))For the sake of this problem you can just use the above formulas,but you should try toderive these results for yourself as well.Given a query point x,we choose compute the weightsw(i)=exp2τ2CS229Problem Set#12(a)Implement the Newton-Raphson algorithm for optimizingℓ(θ)for a new query pointx,and use this to predict the class of x.T he q2/directory contains data and code for this problem.You should implementt he y=lwlr(X train,y train,x,tau)function in the lwlr.mfile.This func-t ion takes as input the training set(the X train and y train matrices,in the formdescribed in the class notes),a new query point x and the weight bandwitdh tau.G iven this input the function should1)compute weights w(i)for each training exam-p le,using the formula above,2)maximizeℓ(θ)using Newton’s method,andfinally3)o utput y=1{hθ(x)>0.5}as the prediction.W e provide two additional functions that might help.The[X train,y train]=load data;function will load the matrices fromfiles in the data/folder.The func-t ion plot lwlr(X train,y train,tau,resolution)will plot the resulting clas-s ifier(assuming you have properly implemented lwlr.m).This function evaluates thel ocally weighted logistic regression classifier over a large grid of points and plots ther esulting prediction as blue(predicting y=0)or red(predicting y=1).Dependingo n how fast your lwlr function is,creating the plot might take some time,so wer ecommend debugging your code with resolution=50;and later increase it to atl east200to get a better idea of the decision boundary.(b)Evaluate the system with a variety of different bandwidth parametersτ.In particular,tryτ=0.01,0.050.1,0.51.0,5.0.How does the classification boundary change whenvarying this 
parameter?Can you predict what the decision boundary of ordinary(unweighted)logistic regression would look like?3.Multivariate least squaresSo far in class,we have only considered cases where our target variable y is a scalar value.S uppose that instead of trying to predict a single output,we have a training set withm ultiple outputs for each example:{(x(i),y(i)),i=1,...,m},x(i)∈R n,y(i)∈R p.Thus for each training example,y(i)is vector-valued,with p entries.We wish to use a linearmodel to predict the outputs,as in least squares,by specifying the parameter matrixΘiny=ΘT x,whereΘ∈R n×p.(a)The cost function for this case isJ(Θ)=12m p2(Θi=1j=1T x(i))j−y(ji)X X(Θ.W rite J(Θ)in matrix-vector notation(i.e.,without using any summations).[Hint: Start with the m×n design matrixX=—(x(1))T——(x(2))T—...—(x(m))T—2CS229Problem Set#13 and the m×p target matrixY=—(y(1))T——(y(2))T—...—(y(m))T—and then work out how to express J(Θ)in terms of these matrices.](b)Find the closed form solution forΘwhich minimizes J(Θ).This is the equivalent tothe normal equations for the multivariate case.(c)Suppose instead of considering the multivariate vectors y(i)all at once,we instead(i)compute each variable yj separately for each j=1,...,p.In this case,we have a p individual linear models,of the form(i)T(i),j=1,...,p. yj=θj x(So here,eachθj∈R n).How do the parameters from these p independent least squares problems compare to the multivariate solution?4.Naive BayesI n this problem,we look at maximum likelihood parameter estimation using the naiveB ayes assumption.Here,the input features x j,j=1,...,n to our model are discrete,b inary-valued variables,so x j∈{0,1}.We call x=[x1x2···x n]T to be the input vector.For each training example,our output targets are a single binary-value y∈{0,1}.Our model is then parameterized by φj|y=0=p(x j=1|y=0),φj|y=1=p(x j=1|y=1),andφy=p(y=1).We model the joint distribution of(x,y)according top(y)=(φy)y(1−φy)1−yp(x|y=0)=nYp(x j|y=0)j=1=nYx j(1−φj|y=0)1−x(φj|y=0)j j=1p(x|y=1)=nYp(x j|y=1)j=1=nYx j(1−φj|y=1)1−x(φj|y=1)j j=1(i),y(i);ϕ)in terms of the(a)Find the joint likelihood functionℓ(ϕ)=log Q i=1p(xmodel parameters given above.Here,ϕrepresents the entire set of parameters {φy,φj|y=0,φj|y=1,j=1,...,n}.(b)Show that the parameters which maximize the likelihood function are the same as3CS229Problem Set#14those given in the lecture notes;i.e.,thatm(i)φj|y=0=P j=1∧yi=11{x(i)=0}mP i=11{y(i)=0} m(i)φj|y=1=P j=1∧yi=11{x(i)=1}mP i=11{y(i)=1}φy=P(i)=1}mi=11{y.m(c)Consider making a prediction on some new data point x using the most likely classestimate generated by the naive Bayes algorithm.Show that the hypothesis returnedby naive Bayes is a linear classifier—i.e.,if p(y=0|x)and p(y=1|x)are the classp robabilities returned by naive Bayes,show that there exists someθ∈R n+1sucht hatT x1p(y=1|x)≥p(y=0|x)if and only ifθ≥0.(Assumeθ0is an intercept term.)5.Exponential family and the geometric distribution(a)Consider the geometric distribution parameterized byφ:p(y;φ)=(1−φ)y−1φ,y=1,2,3,....Show that the geometric distribution is in the exponential family,and give b(y),η,T(y),and a(η).(b)Consider performing regression using a GLM model with a geometric response vari-able.What is the canonical response function for the family?You may use the factthat the mean of a geometric distribution is given by1/φ.(c)For a training set{(x(i),y(i));i=1,...,m},let the log-likelihood of an examplebe log p(y(i)|x(i);θ).By taking the derivative of the log-likelihood with respect toθj,derive the stochastic gradient ascent rule for 
learning using a GLM model withgoemetric responses y and the canonical response function.4CS229Problem Set#1Solutions1CS229,Public CourseProblem Set#1Solutions:Supervised Learning1.Newton’s method for computing least squaresIn this problem,we will prove that if we use Newton’s method solve the least squares optimization problem,then we only need one iteration to converge toθ∗.(a)Find the Hessian of the cost function J(θ)=1T x(i)−y(i))2.2P i=1(θmAnswer:As shown in the class notes∂J(θ)∂θj=mXT x(i)−y(i))x(i)(θj. i=1So∂2J(θ)∂θj∂θk==m∂X T x(i)−y(i))x(i)(θ∂θkj i=1mX(i)(i)T X)jkx j x k=(Xi=1Therefore,the Hessian of J(θ)is H=X T X.This can also be derived by simply applyingrules from the lecture notes on Linear Algebra.(b)Show that thefirst iteration of Newton’s method gives usθ⋆=(X T X)−1X T~y,thesolution to our least squares problem.Answer:Given anyθ(0),Newton’s methodfindsθ(1)according toθ(1)=θ(0)−H−1∇θJ(θ(0))=θ(0)−(X T X)−1(X T Xθ(0)−X T~y)=θ(0)−θ(0)+(X T X)−1X T~y=(XT X)−1X T~y.Therefore,no matter whatθ(0)we pick,Newton’s method alwaysfindsθ⋆after oneiteration.2.Locally-weighted logistic regressionIn this problem you will implement a locally-weighted version of logistic regression,where we weight different training examples differently according to the query point.The locally- weighted logistic regression problem is to maximizeλℓ(θ)=−θTθ+Tθ+2mw(i)hy(i)log hθ(x(i))+(1−y(i))log(1−hθ(x(i)))i. Xi=15CS229Problem Set#1Solutions2 The−λTθhere is what is known as a regularization parameter,which will be discussedθ2in a future lecture,but which we include here because it is needed for Newton’s method to perform well on this task.For the entirety of this problem you can use the valueλ=0.0001.Using this definition,the gradient ofℓ(θ)is given by∇θℓ(θ)=XT z−λθwhere z∈R m is defined byz i=w(i)(y(i)−hθ(x(i)))and the Hessian is given byH=XT DX−λIwhere D∈R m×m is a diagonal matrix withD ii=−w(i)hθ(x(i))(1−hθ(x(i)))For the sake of this problem you can just use the above formulas,but you should try toderive these results for yourself as well.Given a query point x,we choose compute the weightsw(i)=exp2τ2CS229Problem Set#1Solutions3theta=zeros(n,1);%compute weightsw=exp(-sum((X_train-repmat(x’,m,1)).^2,2)/(2*tau));%perform Newton’s methodg=ones(n,1);while(norm(g)>1e-6)h=1./(1+exp(-X_train*theta));g=X_train’*(w.*(y_train-h))-1e-4*theta;H=-X_train’*diag(w.*h.*(1-h))*X_train-1e-4*eye(n);theta=theta-H\g;end%return predicted yy=double(x’*theta>0);(b)Evaluate the system with a variety of different bandwidth parametersτ.In particular,tryτ=0.01,0.050.1,0.51.0,5.0.How does the classification boundary change whenv arying this parameter?Can you predict what the decision boundary of ordinary(unweighted)logistic regression would look like?Answer:These are the resulting decision boundaries,for the different values ofτ.tau = 0.01 tau = 0.05 tau = 0.1tau = 0.5 tau = 0.5 tau = 5For smallerτ,the classifier appears to overfit the data set,obtaining zero training error,but outputting a sporadic looking decision boundary.Asτgrows,the resulting deci-sion boundary becomes smoother,eventually converging(in the limit asτ→∞to theunweighted linear regression solution).3.Multivariate least squaresSo far in class,we have only considered cases where our target variable y is a scalar value. 
Suppose that instead of tryingto predict a single output,we have a training set with7CS229Problem Set#1Solutions4multiple outputs for each example:{(x(i),y(i)),i=1,...,m},x(i)∈R n,y(i)∈R p.Thus for each training example,y(i)is vector-valued,with p entries.We wish to use a linearmodel to predict the outputs,as in least squares,by specifying the parameter matrixΘiny=ΘT x,whereΘ∈R n×p.(a)The cost function for this case isJ(Θ)=12m p2 X X T x(i))j−y(j=1ji)(Θi=1.W rite J(Θ)in matrix-vector notation(i.e.,without using any summations).[Hint: Start with the m×n design matrixX=—(x(1))T——(x(2))T—...—(x(m))T—and the m×p target matrixY=—(y(1))T——(y(2))T—...—(y(m))T—and then work out how to express J(Θ)in terms of these matrices.] Answer:The objective function can be expressed asJ(Θ)=12tr T(XΘ−Y)(XΘ−Y).To see this,note thatJ(Θ)===1tr T(XΘ−Y)(XΘ−Y)212X i T(XΘ−Y)XΘ−Y)ii12X i X(XΘ−Y)2ijj=12m p2i)(Θi=1j=1T x(i))j−y j(X X(Θ8CS229Problem Set#1Solutions5(b)Find the closed form solution forΘwhich minimizes J(Θ).This is the equivalent tothe normal equations for the multivariate case.Answer:First we take the gradient of J(Θ)with respect toΘ.∇ΘJ(Θ)=∇Θ1tr(XΘ−Y)T(XΘ−Y)2=∇ΘT X T XΘ−ΘT X T Y−Y T XΘ−Y T T1trΘ21=∇ΘT X T XΘ)−tr(ΘT X T Y)−tr(Y T XΘ)+tr(Y T Y)tr(Θ21=tr(Θ2∇ΘT X T XΘ)−2tr(Y T XΘ)+tr(Y T Y)12T XΘ+X T XΘ−2X T Y=X=XT XΘ−X T YSetting this expression to zero we obtainΘ=(XT X)−1X T Y.This looks very similar to the closed form solution in the univariate case,except now Yis a m×p matrix,so thenΘis also a matrix,of size n×p.(c)Suppose instead of considering the multivariate vectors y(i)all at once,we instead(i)compute each variable y j separately for each j=1,...,p.In this case,we have a p individual linear models,of theform(i)T(i),j=1,...,p. y j=θxj(So here,eachθj∈R n).How do the parameters from these p independent least squares problems compareto the multivariate solution?Answer:This time,we construct a set of vectors~y j=(1)jyy(2)j...(m)yj,j=1,...,p.Then our j-th linear model can be solved by the least squares solutionθj=(XT X)−1X T~y j.If we line up ourθj,we see that we have the following equation:[θ1θ2···θp]=(X T X)−1X T~y1(X T X)−1X T~y2···(X T X)−1X T~y p=(XT X)−1X T[~y1~y2···~y p]=(XT X)−1X T Y=Θ.Thus,our p individual least squares problems give the exact same solution as the multi- variate least squares. 
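The closed-form solution Θ = (XᵀX)⁻¹XᵀY derived above, and its equivalence to solving the p univariate problems column by column, are easy to check numerically. A minimal Python/numpy sketch on random toy data (this is not the course's starter code):

import numpy as np

# Toy check of the multivariate normal equations Theta = (X^T X)^{-1} X^T Y.
rng = np.random.default_rng(0)
m, n, p = 50, 4, 3
X = rng.normal(size=(m, n))                              # m x n design matrix
Theta_true = rng.normal(size=(n, p))
Y = X @ Theta_true + 0.01 * rng.normal(size=(m, p))      # m x p target matrix

Theta = np.linalg.solve(X.T @ X, X.T @ Y)                # (X^T X)^{-1} X^T Y, without an explicit inverse

# Solving each output column separately gives the same matrix, column by column.
Theta_cols = np.column_stack(
    [np.linalg.solve(X.T @ X, X.T @ Y[:, j]) for j in range(p)])
assert np.allclose(Theta, Theta_cols)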
9CS229Problem Set#1Solutions64.Naive BayesI n this problem,we look at maximum likelihood parameter estimation using the naiveB ayes assumption.Here,the input features x j,j=1,...,n to our model are discrete,b inary-valued variables,so x j∈{0,1}.We call x=[x1x2···x n]T to be the input vector.For each training example,our output targets are a single binary-value y∈{0,1}.Our model is then parameterized by φj|y=0=p(x j=1|y=0),φj|y=1=p(x j=1|y=1),andφy=p(y=1).We model the joint distribution of(x,y)according top(y)=(φy)y(1−φy)1−yp(x|y=0)=nYp(x j|y=0)j=1=nYx j(1−φj|y=0)1−x(φj|y=0)j j=1p(x|y=1)=nYp(x j|y=1)j=1=nY x j(1−φj|y=1)1−x(φj|y=1)jj=1m(i),y(i);ϕ)in terms of the(a)Find the joint likelihood functionℓ(ϕ)=log Qi=1p(xmodel parameters given above.Here,ϕrepresents the entire set of parameters {φy,φj|y=0,φj|y=1,j= 1,...,n}.Answer:mY p(x(i),y(i);ϕ) ℓ(ϕ)=logi=1mYp(x(i)|y(i);ϕ)p(y(i);ϕ) =logi=1(i);ϕ)m nY Y(i)p(y(i);ϕ) =log j|yp(xi=1j=1(i);ϕ)m nX Xlog p(y(i);ϕ)+(i)=j|ylog p(xi=1j=1m"Xy(i)logφy+(1−y(i))log(1−φy) =i=1nX j=1(i)(i)+xlogφj|y(i)+(1−x j)log(1−φj|yj(i))(b)Show that the parameters which maximize the likelihood function are the same as10CS229Problem Set#1Solutions7those given in the lecture notes;i.e.,thatm(i)φj|y=0=P(i)=0}i=11{x j=1∧ymP i=11{y(i)=0} m(i)φj|y=1=P(i)=1}i=11{x j=1∧ymPi=11{y(i)=1} m(i)=1}φy=Pi=11{y.mA nswer:The only terms inℓ(ϕ)which have non-zero gradient with respect toφj|y=0 are those which includeφj|y(i).Therefore,∇φj|y=0ℓ(ϕ)=∇φj|y=0mX i=1(i)(i)(i))x j)log(1−φj|yjlogφj|y(i)+(1−xj)log(1−φj|y=∇φj|y=0mi=1(i)X(i)=0} xjlog(φj|y=0)1{y=(i)(i)=0}+(1−xj)log(1−φj|y=0)1{ym111{y(i)=0}CS229Problem Set#1Solutions8To solve forφy,mi=1y(i)logφy+(1−y(i))log(1−φy) X∇φyℓ(ϕ)=∇φym1−φy1p(y=1|x)≥p(y=0|x)if and only ifθ≥0.(Assumeθ0is an intercept term.)Answer:p(y=1|x)≥p(y=0|x)≥1⇐⇒p(y=1|x)p(y=0|x)⇐⇒Qj=1p(x j|y=1)≥1np(y=1)p(y=0)Q j=1p(x j|y=0)n⇐⇒Qj≥1 nx j(1−φj|y=0)1−xφyj=1(φj|y=0)Qjn j=1(φj|y=1)x j(1−φj|y=1)1−x(1−φy)n+(1−x j)log1−φj|y=0x j logφj|y=0≥0,12CS229Problem Set#1Solutions9 whereθ0=nlog1−φj|y=01y log(1−φ)−log1−φThenb(y)=1η=log(1−φ)T(y)=ya(η)=log1−φφCS229Problem Set#1Solutions10ℓi(θ)=log"exp T x(i)·y(i)−log T x(i)!!#θT x(i)eθ1−eθ=log exp T x(i)·y(i)−log T x(i)−1θ1e−θ∂∂θj=θT x(i)·y(i)+log T x(i)−1e−θT x(i)e−θ(i)(i)jy(−x(i)+ℓi(θ)=x j)e−θT x(i)−1(i)(i)−=x j y1(i)T x(i)xj1−e−θ=y(i)−1T x(i)CS229Problem Set#21 CS229,Public CourseProblem Set#2:Kernels,SVMs,and Theory1.Kernel ridge regressionIn contrast to ordinary least squares which has a cost functionJ(θ)=12mXT x(i)−y(i))2,(θi=1we can also add a term that penalizes large weights inθ.In ridge regression,our least s quares cost is regularized by adding a termλkθk2,whereλ>0is afixed(known)constant (regularization will be discussed at greater length in an upcoming course lecutre).The ridger egression cost function is thenJ(θ)=12mXT x(i)−y(i))2+(θi=1λkθk2.2(a)Use the vector notation described in class tofind a closed-form expreesion for thevalue ofθwhich minimizes the ridge regression cost function.(b)Suppose that we want to use kernels to implicitly represent our feature vectors in ahigh-dimensional(possibly infinite dimensional)ing a feature mappingφ, the ridge regression cost function becomesJ(θ)=12mXTφ(x(i))−y(i))2+(θi=1λkθk2.2Making a prediction on a new input x new would now be done by computingθTφ(x new).S how how we can use the“kernel trick”to obtain a closed form for the predictiono n the new input without ever explicitly computingφ(x new).You may assume thatt he parameter vectorθcan be expressed as a linear combination of the input feature 
vectors;i.e.,θ=P (i))for some set of parametersαi.mi=1αiφ(x[Hint:You mayfind the following identity useful:(λI+BA)−1B=B(λI+AB)−1.If you want,you can try to prove this as well,though this is not required for theproblem.]2.ℓ2norm soft margin SVMsIn class,we saw that if our data is not linearly separable,then we need to modify our support vector machine algorithm by introducing an error margin that must be minimized.Specifically,the formulation we have looked at is known as theℓ1norm soft margin SVM.In this problem we will consider an alternative method,known as theℓ2norm soft margin SVM.This new algorithm is given by the following optimization problem(notice that theslack penalties are now squared):.min w,b,ξ12+Ck wk2P i=1ξ2m2is.t.y(i)(w T x(i)+b)≥1−ξi,i=1,...,m15CS229Problem Set#22(a)Notice that we have dropped theξi≥0constraint in theℓ2problem.Show that thesenon-negativity constraints can be removed.That is,show that the optimal value ofthe objective will be the same whether or not these constraints are present.(b)What is the Lagrangian of theℓ2soft margin SVM optimization problem?(c)Minimize the Lagrangian with respect to w,b,andξby taking the following gradients:∇w L,∂L∂b,and∇ξL,and then setting them equal to0.Here,ξ=[ξ1,ξ2,...,ξm]T.(d)What is the dual of theℓ2soft margin SVM optimization problem?3.SVM with Gaussian kernelC onsider the task of training a support vector machine using the Gaussian kernel K(x,z)=exp(−kx−zk2/τ2).We will show that as long as there are no two identical points in thet raining set,we can alwaysfind a value for the bandwidth parameterτsuch that the SVMa chieves zero training error.(a)Recall from class that the decision function learned by the support vector machinecan be written asf(x)=mXαi y(i)K(x(i),x)+b. i=1A ssume that the training data{(x(1),y(1)),...,(x(m),y(m))}consists of points whicha re separated by at least a distance ofǫ;that is,||x(j)−x(i)||≥ǫfor any i=j.F ind values for the set of parameters{α1,...,αm,b}and Gaussian kernel widthτs uch that x(i)is correctly classified,for all i=1,...,m.[Hint:Letαi=1for all ia nd b=0.Now notice that for y∈{−1,+1}the prediction on x(i)will be correct if|f(x(i))−y(i)|<1,sofind a value ofτthat satisfies this inequality for all i.](b)Suppose we run a SVM with slack variables using the parameterτyou found in part(a).Will the resulting classifier necessarily obtain zero training error?Why or whynot?A short explanation(without proof)will suffice.(c)Suppose we run the SMO algorithm to train an SVM with slack variables,underthe conditions stated above,using the value ofτyou picked in the previous part,and using some arbitrary value of C(which you do not know beforehand).Will thisnecessarily result in a classifier that achieve zero training error?Why or why not?Again,a short explanation is sufficient.4.Naive Bayes and SVMs for Spam ClassificationI n this question you’ll look into the Naive Bayes and Support Vector Machine algorithmsf or a spam classification problem.However,instead of implementing the algorithms your-s elf,you’ll use a freely available machine learning library.There are many such librariesa vailable,with different strengths and weaknesses,but for this problem you’ll use theW EKA machine learning package,available at /ml/weka/.WEKA implements many standard machine learning algorithms,is written in Java,andhas both a GUI and a command line interface.It is not the best library for very large-scaledata sets,but it is very nice for playing around with many different algorithms on mediumsize problems.You can download 
and install WEKA by following the instructions given on the websiteabove.To use it from the command line,youfirst need to install a java runtime environ-ment,then add the weka.jarfile to your CLASSPATH environment variable.Finally,you16CS229Problem Set#23 can call WEKA using the command:java<classifier>-t<training file>-T<test file>For example,to run the Naive Bayes classifier(using the multinomial event model)on ourprovided spam data set by running the command:java weka.classifiers.bayes.NaiveBayesMultinomial-t spam train1000.arff-T spam test.arffT he spam classification dataset in the q4/directory was provided courtesy of ChristianS helton(cshelton@).Each example corresponds to a particular email,and eachf eature correspondes to a particular word.For privacy reasons we have removed the actualw ords themselves from the data set,and instead label the features generically as f1,f2,etc.H owever,the data set is from a real spam classification task,so the results demonstrate thep erformance of these algorithms on a real-world problem.The q4/directory actually con-t ains several different trainingfiles,named spam train50.arff,spam train100.arff,etc(the“.arff”format is the default format by WEKA),each containing the correspondingn umber of training examples.There is also a single test set spam test.arff,which is ah old out set used for evaluating the classifier’s performance.(a)Run the weka.classifiers.bayes.NaiveBayesMultinomial classifier on the datasetand report the resulting error rates.Evaluate the performance of the classifier usingeach of the different trainingfiles(but each time using the same testfile,spam test.arff).Plot the error rate of the classifier versus the number of training examples.(b)Repeat the previous part,but using the weka.classifiers.functions.SMO classifier,which implements the SMO algorithm to train an SVM.How does the performanceof the SVM compare to that of Naive Bayes?5.Uniform convergenceIn class we proved that for anyfinite set of hypotheses H={h1,...,h k},if we pick the hypothesis hˆthat minimizes the training error on a set of m examples,then with probabilityat least(1−δ),12kε(hˆ)≤ε(h i)min log,+2r2mδiwhereε(h i)is the generalization error of hypothesis h i.Now consider a special case(oftenc alled the realizable case)where we know,a priori,that there is some hypothesis in ourc lass H that achieves zero error on the distribution from which the data is drawn.Thenw e could obviously just use the above bound with min iε(h i)=0;however,we can prove ab etter bound than this.(a)Consider a learning algorithm which,after looking at m training examples,choosessome hypothesis hˆ∈H that makes zero mistakes on this training data.(By ourassumption,there is at least one such hypothesis,possibly more.)Show that withprobability1−δε(hˆ)≤1m logkδ.N otice that since we do not have a square root here,this bound is much tighter.[Hint: C onsider the probability that a hypothesis with generalization error greater thanγmakes no mistakes on the training data.Instead of the Hoeffding bound,you might alsofind the following inequality useful:(1−γ)m≤e−γm.]17CS229Problem Set#24(b)Rewrite the above bound as a sample complexity bound,i.e.,in the form:forfixedδandγ,forε(hˆ)≤γto hold with probability at least(1−δ),it suffices that m≥f(k,γ,δ)(i.e.,f(·)is some function of k,γ,andδ).18CS229Problem Set#2Solutions1 CS229,Public CourseProblem Set#2Solutions:Kernels,SVMs,and Theory1.Kernel ridge regressionIn contrast to ordinary least squares which has a cost functionJ(θ)=12mXT x(i)−y(i))2,(θi=1we can 
also add a term that penalizes large weights inθ.In ridge regression,our least s quares cost is regularized by adding a termλkθk2,whereλ>0is afixed(known)constant (regularization will be discussed at greater length in an upcoming course lecutre).The ridger egression cost function is thenJ(θ)=12mX T x(i)−y(i))2+(θi=1λkθk2.2(a)Use the vector notation described in class tofind a closed-form expreesion for thevalue ofθwhich minimizes the ridge regression cost function.Answer:Using the design matrix notation,we can rewrite J(θ)asJ(θ)=12(Xθ−~y)T(Xθ−~y)+T(Xθ−~y)+λθTθ.Tθ.2Then the gradient is∇θJ(θ)=XT Xθ−X T~y+λθ.Setting the gradient to0gives us0=XT Xθ−X T~y+λθθ=(XT X+λI)−1X T~y.(b)Suppose that we want to use kernels to implicitly represent our feature vectors in ahigh-dimensional(possibly infinite dimensional)ing a feature mappingφ, the ridge regression cost function becomesJ(θ)=12mX Tφ(x(i))−y(i))2+(θi=1λkθk2.2Making a prediction on a new input x new would now be done by computingθTφ(x new).S how how we can use the“kernel trick”to obtain a closed form for the predictiono n the new input without ever explicitly computingφ(x new).You may assume that the parameter vectorθcan beexpressed as a linear combination of the input featurem(i))for some set of parametersαi.vectors;i.e.,θ=P i=1αiφ(x19。
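Part (a)'s regularized normal equation, θ = (XᵀX + λI)⁻¹Xᵀy, in a minimal Python/numpy sketch on random toy data (again not the assignment's solution code):

import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: theta = (X^T X + lambda*I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Toy usage with random data only.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=30)
theta = ridge_fit(X, y, lam=0.5)
print(theta)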
Lattice Cryptography, Lecture 6

2 The Micciancio-Voulgaris Algorithm for CVP
The MV algorithm solves CVP on any n-dimensional lattice in 2^{O(n)} time (for simplicity, we ignore polynomial factors in the input length). It is based around the (closed) Voronoi cell of the lattice, which, to recall, is the set of all points in R^n that are as close or closer to the origin than to any other lattice point:

    V̄(L) = {x ∈ R^n : ‖x‖ ≤ ‖x − v‖ for all v ∈ L \ {0}}.

We often omit the argument L when it is clear from context. From the definition it can be seen that for any coset t + L, the set (t + L) ∩ V̄ consists of all the shortest elements of t + L. For any lattice point v, define the halfspace

    H_v = {x : ‖x‖ ≤ ‖x − v‖} = {x : 2⟨x, v⟩ ≤ ⟨v, v⟩}.

It is easy to see that V̄ is the intersection of H_v over all v ∈ L \ {0}. The minimal set V of lattice vectors such that V̄ = ⋂_{v ∈ V} H_v is called the set of Voronoi-relevant vectors; we often call them relevant vectors for short. The following characterizes the relevant vectors of a lattice:

Fact 2.1 (Voronoi). A nonzero lattice vector v ∈ L \ {0} is relevant if and only if ±v are the only shortest vectors in the coset v + 2L.

Corollary 2.2. An n-dimensional lattice has at most 2(2^n − 1) ≤ 2^{n+1} relevant vectors.

Proof. Every relevant vector belongs to some nonzero coset of L/2L, of which there are exactly 2^n − 1. By the above, there are at most two relevant vectors in each such coset.
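The halfspace condition above, ‖x‖ ≤ ‖x − v‖ if and only if 2⟨x, v⟩ ≤ ⟨v, v⟩, is easy to test directly once the relevant vectors are known (computing them is the expensive part of the MV algorithm and is not shown here). A minimal Python sketch, using Z^2 as a toy example where the relevant vectors are just ±e1 and ±e2:

import numpy as np

def in_halfspace(x, v):
    """x lies in H_v  iff  2<x, v> <= <v, v>  (equivalently ||x|| <= ||x - v||)."""
    return 2 * np.dot(x, v) <= np.dot(v, v)

def in_voronoi_cell(x, relevant_vectors):
    """x is in the closed Voronoi cell of L iff it lies in H_v for every
    relevant vector v (and hence for every nonzero lattice vector)."""
    return all(in_halfspace(x, v) for v in relevant_vectors)

# Example: for L = Z^2 the relevant vectors are (+/-1, 0) and (0, +/-1),
# and the Voronoi cell is the square [-1/2, 1/2]^2.
rel = [np.array(v) for v in [(1, 0), (-1, 0), (0, 1), (0, -1)]]
assert in_voronoi_cell(np.array([0.3, -0.4]), rel)
assert not in_voronoi_cell(np.array([0.9, 0.0]), rel)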
Elliptic Curve Cryptography (lecture slides)

(1) Key generation
• The user randomly chooses an integer d as his secret decryption key, with 2 ≤ d ≤ p−2.
• Compute y = α^d mod p and publish y as the public encryption key. • Computing the secret key d from the public key y requires solving a discrete logarithm;
since the computation above is defined over the finite field of integers modulo p, this is called the discrete logarithm problem.
③ Computing y from x is easy, but computing x from y is far harder: with the best algorithms currently known, a carefully chosen p requires at least O(p^{1/2}) operations, so as long as p is large enough, solving the discrete logarithm problem is quite difficult.
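A minimal Python sketch of the key-generation step just described; the parameters p and α below are tiny toy values chosen only for illustration, whereas the slides later require p to be a prime of at least 150 decimal digits.

import secrets

def elgamal_keygen(p, alpha):
    """ElGamal key generation: pick a secret d with 2 <= d <= p-2 and
    publish y = alpha^d mod p."""
    d = secrets.randbelow(p - 3) + 2          # secret decryption key, 2 <= d <= p-2
    y = pow(alpha, d, p)                      # public encryption key
    return y, d

# Toy example only; real parameters need a large prime p with p-1 having a
# large prime factor, as discussed below.
p, alpha = 2579, 2
y, d = elgamal_keygen(p, alpha)
assert pow(alpha, d, p) == y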
is smaller, so ElGamal encryption and decryption are slightly faster than RSA. ② Random number source
The decryption key d and the random number k used in ElGamal should both be high-quality random numbers. Hence, using ElGamal requires a good random number source, that is, one that can quickly produce high-quality random numbers. ③ Choosing the large prime
For ElGamal to be secure, p should be a large prime of at least 150 decimal digits, and p−1 should have a large prime factor.
III. Elliptic Curve Cryptography
2. Elliptic curves
② Defining the inverse element
Let P(x1, y1) and Q(x2, y2) be points on the curve. If x1 = x2 and y1 = −y2, then P(x1, y1) + Q(x2, y2) = O.
This means that the inverse of any point R(x, y) is R(x, −y).
Note: by convention, the point at infinity is its own inverse:
O(∞, ∞) = −O(∞, ∞)
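Over a prime field F_p the rule above becomes −(x, y) = (x, p − y mod p). A minimal Python sketch, with None standing in for the point at infinity O (a representation chosen here only for illustration):

def ec_negate(point, p):
    """Negate a point on an elliptic curve over F_p.
    The point at infinity (represented here as None) is its own inverse."""
    if point is None:                 # O = -O
        return None
    x, y = point
    return (x, (-y) % p)

# Example: if (4, 5) is a point on a curve over F_13, its inverse is (4, 8),
# since 5 + 8 = 13 = 0 (mod 13).
assert ec_negate((4, 5), 13) == (4, 8)
assert ec_negate(None, 13) is None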
Handout #9: Four Example Karel Programs

CleanupKarel program code

/*
 * File: CleanupKarel.java
 * -----------------------
 * Karel starts at (1, 1) facing East and cleans up any
 * beepers scattered throughout his world.
 */

import stanford.karel.*;

public class CleanupKarel extends SuperKarel {

    /* Cleans up a field of beepers, one row at a time */
    public void run() {
        cleanRow();
        while (leftIsClear()) {
            repositionForRowToWest();
            cleanRow();
            if (rightIsClear()) {
                repositionForRowToEast();
                cleanRow();
            } else {
                /*
                 * In rows with an even number of streets, we want
                 * Karel's left to be blocked after he cleans the last
                 * row, so we turn him to the appropriate orientation.
                 */
                turnAround();
            }
        }
    }
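    /*
     * Note: the page break in the handout dropped the two helpers that
     * run() calls before this point. The versions below are reconstructions
     * based on what run() needs, not the original listing.
     */

    /* Cleans all beepers on the row Karel is currently on, moving forward. */
    private void cleanRow() {
        cleanCorner();
        while (frontIsClear()) {
            move();
            cleanCorner();
        }
    }

    /* Picks up the beeper on the current corner, if there is one. */
    private void cleanCorner() {
        if (beepersPresent()) {
            pickBeeper();
        }
    }

    /* Moves Karel up to the next row and leaves him facing West. */
    private void repositionForRowToWest() {
        turnLeft();
        move();
        turnLeft();
    }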
    private void repositionForRowToEast() {
        turnRight();
        move();
        turnRight();
    }
}
Life Code, Level 2 Course

The meaning of the Life Code: for a manager, how to select, use, lead, and retain people (in whatever order) is crucial; for ordinary people, understanding your own and other people's character traits helps you communicate and get along with others better. People of each character type behave in different ways when faced with the same situation. Once you understand this, you can calmly accept your own and other people's shortcomings and stop blaming yourself; you will also discover latent strengths in yourself and others, bring them out, improve yourself, and gain confidence.
First, we introduce the most important chart in the Life Code Level 2 course. Bruce Lee's birthday is 27 November 1940, and his Life Code triangle is as shown below. Among all birth years there is one special case, the year 2000 (2 + 0 = 0, 0 + 0 = 5, rather than 0). Remember that the numbers in the small circles outside the triangle must be 3, 6, or 9; if they are not, the calculation is certainly wrong.
Now let us briefly describe how to compute the triangular Life Code chart from the year, month, and day of birth. First, write Bruce Lee's birth date, 27 November 1940, into the positions shown in the figure and work from left to right. Start with 27 and add its digits down to a single digit: 2 + 7 = 9. Then 11: 1 + 1 = 2; then 19: 1 + 9 = 10, 1 + 0 = 1; then 40: 4 + 0 = 4. Write 9, 2, 1, 4 along the bottom of the triangle. Continue upward: 9 + 2 = 11, 1 + 1 = 2; 1 + 4 = 5; and 2 + 5 = 7. Once the numbers inside the triangle are filled in, we compute the numbers outside it. On the left half, the 9 and 2 in the bottom row are each added to the 2 in the second row of the left half: 2 + 2 = 4; 9 + 2 = 11, 1 + 1 = 2; then 4 + 2 = 6. The right half is computed the same way, giving 6, 9, 6.
How are the 3, 9, 3 at the very top computed? Differently from the left and right parts just described: the 2 and 5 in the second row are each added to the 7 at the top, and the result is placed on the opposite side rather than the same side. 2 + 7 = 9, and although 2 is the left-hand number, the 9 goes on the right; similarly 5 + 7 = 12, 1 + 2 = 3, which goes on the left.
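For concreteness, here is a small sketch of the repeated digit-sum ("reduction") that the chart is built from, reproducing the inner triangle for the Bruce Lee example. The function names are assumptions, and the outer circles, which follow the same pairwise reduction described above, are left out.

def reduce_digits(n):
    # Repeatedly sum decimal digits until a single digit remains.
    while n > 9:
        n = sum(int(ch) for ch in str(n))
    return n

def inner_triangle(day, month, year):
    # Bottom row: reduced day, month, and the two halves of the year;
    # then pairwise-reduce upward to the apex.
    bottom = [reduce_digits(day), reduce_digits(month),
              reduce_digits(year // 100), reduce_digits(year % 100)]
    second = [reduce_digits(bottom[0] + bottom[1]),
              reduce_digits(bottom[2] + bottom[3])]
    apex = reduce_digits(second[0] + second[1])
    return bottom, second, apex

# Bruce Lee, 27 November 1940 -> ([9, 2, 1, 4], [2, 5], 7)
print(inner_triangle(27, 11, 1940))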
Caesar Cipher in C — Course Design
I. Course objectives

Knowledge objectives:
1. Students can understand the basic principle of Caesar encryption and decryption and master the encryption and decryption methods.
2. Students can apply what they have learned to encrypt and decrypt simple Caesar ciphers.
3. Students understand the importance of cryptography in history and in modern communications.

Skill objectives:
1. Through hands-on practice, students develop logical thinking and problem-solving skills.
2. Students learn to analyse problems with mathematical methods, improving their ability to apply mathematics.
3. Through group work, students improve their communication and collaboration skills.

Attitude and values objectives:
1. Students develop an interest in cryptography and a spirit of inquiry.
2. In the process of decryption, students experience the joy of successfully solving a problem and gain confidence.
3. Students recognise the importance of information security and develop a correct awareness of network security.
Analysis of the course nature, student characteristics, and teaching requirements: this is an extension course for the mathematics curriculum, designed for the cognitive level of fifth-grade students, with an emphasis on developing logical thinking and hands-on skills. During teaching, the students' curiosity and desire to learn are fully taken into account: lively, engaging examples guide them to take an active part in class and raise their motivation. At the same time, attention is paid to cultivating teamwork, so that students grow together through interaction and exchange.
Breaking the objectives down into concrete learning outcomes:
1. Students can carry out Caesar encryption and decryption on their own.
2. Students can explain the principle of the Caesar cipher and apply it to practical problems.
3. In group work, students take responsibility and solve problems together with their teammates.
4. Students take an active part in class discussion, share what they have learned, and improve their awareness of network security.
II. Teaching content

The content of this unit centres on basic cryptography, combined with the fifth-grade mathematics curriculum, and is organized around the Caesar encryption and decryption method.
1. Introduction: applications of cryptography in history and today, to spark students' interest in the subject.
2. Basic concepts:
- the definitions of encryption and decryption
- the basic principle of Caesar encryption and decryption
3. Core content:
- the Caesar encryption method
- the Caesar decryption method
- case study: hands-on Caesar encryption and decryption (a small sketch follows this outline)
4. Extension:
- applications of cryptography in other fields
- a brief introduction to modern encryption techniques

The syllabus is arranged as follows:
Lesson 1:
- introduction: applications of cryptography
- basic concepts: definitions of encryption and decryption; the principle of the Caesar cipher
Lesson 2:
- core content: the Caesar encryption method
- practice: students try to encrypt a simple piece of text
Lesson 3:
- core content: the Caesar decryption method
- practice: students try to decrypt text that has already been encrypted
Lesson 4:
- extension: applications of cryptography today
- case study: students discuss and share Caesar cipher examples in groups

Relation to the textbook: this content ties into the textbook chapters on logical thinking, problem solving, and applying mathematics, so that students improve their overall ability to apply what they learn while mastering the basics.
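Although the course itself targets C, the shift rule at the heart of these lessons can be sketched in a few lines. The Python version below is only an illustration of the principle (shift each letter a fixed number of places around the alphabet; shift back by the same amount to decrypt), not course material, and the function name is an assumption.

def caesar(text, shift, decrypt=False):
    # Shift each letter by `shift` places around the 26-letter alphabet;
    # pass decrypt=True to shift back. Non-letters pass through unchanged.
    if decrypt:
        shift = -shift
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

cipher = caesar("Attack at dawn", 3)         # 'Dwwdfn dw gdzq'
print(cipher, caesar(cipher, 3, decrypt=True))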
Stanford Open Course: Machine Learning — Lecture Notes 1 (translation)
CS229 Lecture Notes — Andrew Ng

Supervised learning

Let us start by discussing a few problems concerning supervised learning. Suppose we have a dataset giving the living areas and corresponding prices of 47 houses in Portland, Oregon; we can plot these data on a graph. Given data like this, how can we learn to predict the prices of other houses in Portland as a function of their living area?

To establish notation for future use, we will use x^{(i)} to denote the "input variables" (the living area in this example), also called input features, and y^{(i)} to denote the "output variable", also called the target variable, which is the quantity we are trying to predict (the price in this example). A pair (x^{(i)}, y^{(i)}) is called a training example, and the list of training examples we use to learn, {(x^{(i)}, y^{(i)}); i = 1, …, m}, is called a training set. Note that the superscript "(i)" in this notation is simply an index into the training set; it has nothing to do with exponentiation.
We will use X to denote the space of input values and Y the space of output values; in this example X = Y = ℝ. To describe the prediction problem more formally, our goal is, given a training set, to learn a function h : X → Y so that h(x) is a good predictor for the corresponding value of y. For historical reasons, this function h is called a "hypothesis".
The sequence of the prediction process is illustrated in the usual diagram. When the target variable we are trying to predict is continuous, as with the house prices in our example, we call the learning problem a "regression problem". When the target variable can take on only a small number of discrete values (for example, given a living area, predicting whether the dwelling is a house or an apartment), we call it a "classification problem".

Part I: Linear Regression

To make our housing example more interesting, suppose we also know the number of bedrooms in each house. Here the inputs x are two-dimensional vectors in ℝ². For instance, x₁^{(i)} is the living area of the i-th house in the training set, and x₂^{(i)} is its number of bedrooms. (In general, when designing a learning problem, it is up to you to decide which input features to include, so if you are collecting housing data in Portland you might also decide to include other features, such as whether the house has a fireplace, the number of bathrooms, and so on.)
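As a toy illustration of this two-feature setup, the sketch below fits the hypothesis h(x) = θ₀ + θ₁x₁ + θ₂x₂ by ordinary least squares. The numbers are made-up stand-ins for the notes' housing table, and numpy's lstsq is used purely for convenience.

import numpy as np

# Each row: [living area (ft^2), number of bedrooms]; prices in $1000s (illustrative values).
X = np.array([[2104, 3], [1600, 3], [2400, 3], [1416, 2], [3000, 4]], dtype=float)
y = np.array([400, 330, 369, 232, 540], dtype=float)

# Design matrix with an intercept column; hypothesis h(x) = theta^T x.
X_design = np.column_stack([np.ones(len(X)), X])
theta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print(theta)              # [theta_0, theta_1, theta_2]
print(X_design @ theta)   # the hypothesis evaluated on the training inputs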
Stanford CS229 Machine Learning — Lecture Notes 12
CS229 Lecture Notes — Andrew Ng

Part XIII: Reinforcement Learning and Control

We now begin our study of reinforcement learning and adaptive control. In supervised learning, we saw algorithms that tried to make their outputs mimic the labels y given in the training set. In that setting, the labels gave an unambiguous "right answer" for each of the inputs x. In contrast, for many sequential decision making and control problems, it is very difficult to provide this type of explicit supervision to a learning algorithm. For example, if we have just built a four-legged robot and are trying to program it to walk, then initially we have no idea what the "correct" actions to take are to make it walk, and so do not know how to provide explicit supervision for a learning algorithm to try to mimic.

In the reinforcement learning framework, we will instead provide our algorithms only a reward function, which indicates to the learning agent when it is doing well, and when it is doing poorly. In the four-legged walking example, the reward function might give the robot positive rewards for moving forwards, and negative rewards for either moving backwards or falling over. It will then be the learning algorithm's job to figure out how to choose actions over time so as to obtain large rewards.

Reinforcement learning has been successful in applications as diverse as autonomous helicopter flight, robot legged locomotion, cell-phone network routing, marketing strategy selection, factory control, and efficient web-page indexing. Our study of reinforcement learning will begin with a definition of the Markov decision process (MDP), which provides the formalism in which RL problems are usually posed.

1 Markov decision processes

A Markov decision process is a tuple (S, A, {P_sa}, γ, R), where:

• S is a set of states. (For example, in autonomous helicopter flight, S might be the set of all possible positions and orientations of the helicopter.)

• A is a set of actions. (For example, the set of all possible directions in which you can push the helicopter's control sticks.)

• P_sa are the state transition probabilities. For each state s ∈ S and action a ∈ A, P_sa is a distribution over the state space. We'll say more about this later, but briefly, P_sa gives the distribution over what states we will transition to if we take action a in state s.

• γ ∈ [0, 1) is called the discount factor.

• R : S × A → ℝ is the reward function. (Rewards are sometimes also written as a function of a state S only, in which case we would have R : S → ℝ.)

The dynamics of an MDP proceed as follows: we start in some state s₀, and get to choose some action a₀ ∈ A to take in the MDP. As a result of our choice, the state of the MDP randomly transitions to some successor state s₁, drawn according to s₁ ∼ P_{s₀a₀}. Then, we get to pick another action a₁. As a result of this action, the state transitions again, now to some s₂ ∼ P_{s₁a₁}. We then pick a₂, and so on. Pictorially, we can represent this process as follows:

s₀ —a₀→ s₁ —a₁→ s₂ —a₂→ s₃ —a₃→ …

Upon visiting the sequence of states s₀, s₁, … with actions a₀, a₁, …, our total payoff is given by

R(s₀, a₀) + γR(s₁, a₁) + γ²R(s₂, a₂) + ···.

Or, when we are writing rewards as a function of the states only, this becomes

R(s₀) + γR(s₁) + γ²R(s₂) + ···.

For most of our development, we will use the simpler state-rewards R(s), though the generalization to state-action rewards R(s, a) offers no special difficulties.

Our goal in reinforcement learning is to choose actions over time so as to maximize the expected value of the total payoff:

E[R(s₀) + γR(s₁) + γ²R(s₂) + ···].

Note that the reward at timestep t is discounted by a factor of γᵗ. Thus, to
make this expectation large, we would like to accrue positive rewards as soon as possible (and postpone negative rewards as long as possible). In economic applications where R(·) is the amount of money made, γ also has a natural interpretation in terms of the interest rate (where a dollar today is worth more than a dollar tomorrow).

A policy is any function π : S → A mapping from the states to the actions. We say that we are executing some policy π if, whenever we are in state s, we take action a = π(s). We also define the value function for a policy π according to

V^π(s) = E[R(s₀) + γR(s₁) + γ²R(s₂) + ··· | s₀ = s, π].

V^π(s) is simply the expected sum of discounted rewards upon starting in state s, and taking actions according to π.¹

Given a fixed policy π, its value function V^π satisfies the Bellman equations:

V^π(s) = R(s) + γ ∑_{s′∈S} P_{sπ(s)}(s′) V^π(s′).

This says that the expected sum of discounted rewards V^π(s) for starting in s consists of two terms: first, the immediate reward R(s) that we get right away simply for starting in state s, and second, the expected sum of future discounted rewards. Examining the second term in more detail, we see that the summation term above can be rewritten E_{s′∼P_{sπ(s)}}[V^π(s′)]. This is the expected sum of discounted rewards for starting in state s′, where s′ is distributed according to P_{sπ(s)}, which is the distribution over where we will end up after taking the first action π(s) in the MDP from state s. Thus, the second term above gives the expected sum of discounted rewards obtained after the first step in the MDP.

Bellman's equations can be used to efficiently solve for V^π. Specifically, in a finite-state MDP (|S| < ∞), we can write down one such equation for V^π(s) for every state s. This gives us a set of |S| linear equations in |S| variables (the unknown V^π(s)'s, one for each state), which can be efficiently solved for the V^π(s)'s.

¹ This notation in which we condition on π isn't technically correct because π isn't a random variable, but this is quite standard in the literature.

We also define the optimal value function according to

V*(s) = max_π V^π(s).    (1)

In other words, this is the best possible expected sum of discounted rewards that can be attained using any policy. There is also a version of Bellman's equations for the optimal value function:

V*(s) = R(s) + max_{a∈A} γ ∑_{s′∈S} P_sa(s′) V*(s′).    (2)

The first term above is the immediate reward as before. The second term is the maximum over all actions a of the expected future sum of discounted rewards we'll get after taking action a. You should make sure you understand this equation and see why it makes sense.

We also define a policy π* : S → A as follows:

π*(s) = arg max_{a∈A} ∑_{s′∈S} P_sa(s′) V*(s′).    (3)

Note that π*(s) gives the action a that attains the maximum in the "max" in Equation (2).

It is a fact that for every state s and every policy π, we have

V*(s) = V^{π*}(s) ≥ V^π(s).

The first equality says that V^{π*}, the value function for π*, is equal to the optimal value function V* for every state s. Further, the inequality above says that π*'s value is at least as large as the value of any other policy. In other words, π* as defined in Equation (3) is the optimal policy.

Note that π* has the interesting property that it is the optimal policy for all states s. Specifically, it is not the case that if we were starting in some state s then there'd be some optimal policy for that state, and if we were starting in some other state s′ then there'd be some other policy that's optimal for s′. Specifically, the same policy π* attains the maximum in Equation (1) for all states s. This means that we can use the same policy π* no
matter what the initial state of our MDP is.

2 Value iteration and policy iteration

We now describe two efficient algorithms for solving finite-state MDPs. For now, we will consider only MDPs with finite state and action spaces (|S| < ∞, |A| < ∞).

The first algorithm, value iteration, is as follows:

1. For each state s, initialize V(s) := 0.
2. Repeat until convergence {
       For every state, update V(s) := R(s) + max_{a∈A} γ ∑_{s′} P_sa(s′) V(s′).
   }

This algorithm can be thought of as repeatedly trying to update the estimated value function using the Bellman equations (2). There are two possible ways of performing the updates in the inner loop of the algorithm. In the first, we can first compute the new values for V(s) for every state s, and then overwrite all the old values with the new values. This is called a synchronous update. In this case, the algorithm can be viewed as implementing a "Bellman backup operator" that takes a current estimate of the value function, and maps it to a new estimate. (See homework problem for details.) Alternatively, we can also perform asynchronous updates. Here, we would loop over the states (in some order), updating the values one at a time.

Under either synchronous or asynchronous updates, it can be shown that value iteration will cause V to converge to V*. Having found V*, we can then use Equation (3) to find the optimal policy.

Apart from value iteration, there is a second standard algorithm for finding an optimal policy for an MDP. The policy iteration algorithm proceeds as follows:

1. Initialize π randomly.
2. Repeat until convergence {
       (a) Let V := V^π.
       (b) For each state s, let π(s) := arg max_{a∈A} ∑_{s′} P_sa(s′) V(s′).
   }

Thus, the inner loop repeatedly computes the value function for the current policy, and then updates the policy using the current value function. (The policy π found in step (b) is also called the policy that is greedy with respect to V.) Note that step (a) can be done via solving Bellman's equations as described earlier, which in the case of a fixed policy is just a set of |S| linear equations in |S| variables.

After at most a finite number of iterations of this algorithm, V will converge to V* and π will converge to π*.

Both value iteration and policy iteration are standard algorithms for solving MDPs, and there isn't currently universal agreement over which algorithm is better. For small MDPs, policy iteration is often very fast and converges with very few iterations. However, for MDPs with large state spaces, solving for V^π explicitly would involve solving a large system of linear equations, and could be difficult. In these problems, value iteration may be preferred. For this reason, in practice value iteration seems to be used more often than policy iteration.

3 Learning a model for an MDP

So far, we have discussed MDPs and algorithms for MDPs assuming that the state transition probabilities and rewards are known. In many realistic problems, we are not given state transition probabilities and rewards explicitly, but must instead estimate them from data. (Usually, S, A and γ are known.)
For example, suppose that, for the inverted pendulum problem (see problem set 4), we had a number of trials in the MDP, that proceeded as follows:

s₀^{(1)} —a₀^{(1)}→ s₁^{(1)} —a₁^{(1)}→ s₂^{(1)} —a₂^{(1)}→ s₃^{(1)} —a₃^{(1)}→ …
s₀^{(2)} —a₀^{(2)}→ s₁^{(2)} —a₁^{(2)}→ s₂^{(2)} —a₂^{(2)}→ s₃^{(2)} —a₃^{(2)}→ …
…

Here, s_i^{(j)} is the state we were at at time i of trial j, and a_i^{(j)} is the corresponding action that was taken from that state. In practice, each of the trials above might be run until the MDP terminates (such as if the pole falls over in the inverted pendulum problem), or it might be run for some large but finite number of timesteps.

Given this "experience" in the MDP consisting of a number of trials, we can then easily derive the maximum likelihood estimates for the state transition probabilities:

P_sa(s′) = (# times we took action a in state s and got to s′) / (# times we took action a in state s).    (4)

Or, if the ratio above is "0/0" — corresponding to the case of never having taken action a in state s before — then we might simply estimate P_sa(s′) to be 1/|S|. (I.e., estimate P_sa to be the uniform distribution over all states.)

Note that, if we gain more experience (observe more trials) in the MDP, there is an efficient way to update our estimated state transition probabilities using the new experience. Specifically, if we keep around the counts for both the numerator and denominator terms of (4), then as we observe more trials, we can simply keep accumulating those counts. Computing the ratio of these counts then gives our estimate of P_sa.

Using a similar procedure, if R is unknown, we can also pick our estimate of the expected immediate reward R(s) in state s to be the average reward observed in state s.

Having learned a model for the MDP, we can then use either value iteration or policy iteration to solve the MDP using the estimated transition probabilities and rewards. For example, putting together model learning and value iteration, here is one possible algorithm for learning in an MDP with unknown state transition probabilities:

1. Initialize π randomly.
2. Repeat {
       (a) Execute π in the MDP for some number of trials.
       (b) Using the accumulated experience in the MDP, update our estimates for P_sa (and R, if applicable).
       (c) Apply value iteration with the estimated state transition probabilities and rewards to get a new estimated value function V.
       (d) Update π to be the greedy policy with respect to V.
   }

We note that, for this particular algorithm, there is one simple optimization that can make it run much more quickly. Specifically, in the inner loop of the algorithm where we apply value iteration, if instead of initializing value iteration with V = 0, we initialize it with the solution found during the previous iteration of our algorithm, then that will provide value iteration with a much better initial starting point and make it converge more quickly.
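A minimal sketch of the synchronous value iteration of Section 2, using the state-reward form V(s) := R(s) + max_a γ ∑_{s′} P_sa(s′)V(s′), follows. The tiny two-state MDP and the function names are illustrative assumptions.

import numpy as np

def value_iteration(P, R, gamma, iters=1000, tol=1e-8):
    # P[a][s, s2] = P_sa(s2); R[s] is the state-reward R(s) used in the notes.
    n_actions, n_states = len(P), len(R)
    V = np.zeros(n_states)
    for _ in range(iters):
        # Synchronous Bellman backup over all states at once.
        V_new = np.array([R + gamma * P[a] @ V for a in range(n_actions)]).max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    # Greedy policy with respect to the (near-)converged V, as in Equation (3).
    Q = np.array([R + gamma * P[a] @ V for a in range(n_actions)])
    return V, Q.argmax(axis=0)

# Tiny two-state, two-action MDP; all numbers are arbitrary illustration.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # transition matrix for action 0
     np.array([[0.1, 0.9], [0.9, 0.1]])]   # transition matrix for action 1
R = np.array([0.0, 1.0])
V_star, pi_star = value_iteration(P, R, gamma=0.9)
print(V_star, pi_star)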
Applied Cryptography Courseware — Chapter 5: Asymmetric Ciphers (3)
Figure 5-1: the curve defined by y² ≡ x³ + x + 6.  Figure 5-4: the points of y² ≡ x³ + x + 6 (mod 11).
Comparing the real curve y² ≡ x³ + x + 6 in the plane (Figure 5-1) with the points of y² ≡ x³ + x + 6 (mod 11) (Figure 5-4), there is intuitively not much visible connection between the two.
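A quick way to see how sparse the mod-11 "curve" is: enumerate every pair (x, y) over the integers modulo 11 that satisfies the congruence. This sketch is only an illustration of the point set shown in Figure 5-4.

# All (x, y) with y^2 = x^3 + x + 6 (mod 11).
p, a, b = 11, 1, 6
points = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x ** 3 + a * x + b)) % p == 0]
print(points)
print(len(points))   # 12 affine points; with the point at infinity, 13 in total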
P + Q + R1 = O; hence P + Q = −R1 = R, as shown in Figure 5-2.
Figure 5-2: R = P + Q.
The doubling of a point P is defined as follows: draw the tangent to the elliptic curve at P and let it meet the curve again at R1; then P + P + R1 = O, so 2P = −R1 = R, as shown in Figure 5-3.
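The chord-and-tangent rules above have a standard algebraic form over a prime field, which the following sketch implements for the slides' curve y² ≡ x³ + x + 6 (mod 11). The slope formulas (λ = (y2−y1)/(x2−x1) for distinct points, λ = (3x1²+a)/(2y1) for doubling) are the textbook ones; representing the point at infinity as None is an assumption of this sketch.

P_MOD, A, B = 11, 1, 6     # the slides' curve: y^2 = x^3 + x + 6 (mod 11)
O = None                   # the point at infinity, represented here as None

def ec_add(P, Q):
    # The line through P and Q (or the tangent at P when P == Q) meets the
    # curve in a third point R1, and P + Q is the reflection of R1 in the x-axis.
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O                                                   # Q = -P, so P + Q = O
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD          # chord slope
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

G = (2, 7)                 # on the curve: 7^2 = 5 and 2^3 + 2 + 6 = 5 (mod 11)
print(ec_add(G, G))        # doubling: 2G = (5, 2)
print(ec_add(G, (2, 4)))   # G + (-G) prints None, i.e. the point at infinity

Doubling (2, 7) on this curve gives (5, 2), and adding a point to its inverse returns the point at infinity, matching the inverse rule in ② above.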
This chapter focuses mainly on the first kind of elliptic curve. Figure 5-1 shows the curve defined by y² ≡ x³ + x + 6; the figure can be produced with MATLAB. Clearly the curve is symmetric about the x-axis.
Figure 5-1: the curve defined by y² ≡ x³ + x + 6.
2. Addition on elliptic curves
④ Compute the point (x1, y1) = kP.
⑤ Compute the point (x2, y2) = kQ; if x2 = 0, go back to step ③.
⑥ Compute c = m·x2.
⑦ Send the encrypted data (x1, y1, c) to B.
(4) Decryption process
When entity B decrypts the ciphertext (x1, y1, c) received from A, it performs the following steps:
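The decryption steps are cut off here; in the standard completion of this kind of scheme, B uses its secret key d (with Q = dP) to compute (x2, y2) = d·(x1, y1) and recovers m = c·x2⁻¹ mod p. The sketch below shows the full round trip under that assumption, reusing ec_add and the curve constants from the earlier sketch; the toy key d = 7, the base point, and the helper names are illustrative.

import secrets

# Builds on ec_add and P_MOD from the point-addition sketch above.

def scalar_mult(k, P):
    # Double-and-add computation of kP.
    R = None
    addend = P
    while k:
        if k & 1:
            R = ec_add(R, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return R

P_base = (2, 7)                  # public base point P
d = 7                            # B's secret key (toy value)
Q_pub = scalar_mult(d, P_base)   # B's public key Q = dP

def encrypt(m):
    while True:
        k = secrets.randbelow(12) + 1          # step 3 (assumed): random k; this toy group has 13 points
        x1y1 = scalar_mult(k, P_base)          # step 4: (x1, y1) = kP
        x2y2 = scalar_mult(k, Q_pub)           # step 5: (x2, y2) = kQ
        if x2y2 is not None and x2y2[0] != 0:  # retry if x2 = 0
            break
    c = (m * x2y2[0]) % P_MOD                  # step 6: c = m * x2
    return x1y1, c                             # step 7: send (x1, y1, c) to B

def decrypt(x1y1, c):
    # Assumed completion of the truncated slide: B computes (x2, y2) = d(x1, y1)
    # with its secret d and recovers m = c * x2^(-1) mod p.
    x2, _ = scalar_mult(d, x1y1)
    return (c * pow(x2, -1, P_MOD)) % P_MOD

x1y1, c = encrypt(9)
print(decrypt(x1y1, c))          # recovers 9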