Text Classification
Reiss's Text Typology

E.g.: These statements are worth taking with a basketful of salt.
Here "with a basketful of salt" is adapted from "with a grain of salt" (to accept something with skepticism); replacing "grain" with "basketful" signals a far greater degree of doubt.
Text translation: the translation should convey the aesthetic and artistic form of the original. If the cultural norms of the two languages clash in translation, the value system of the source culture takes the lead. For example, "after-dinner mustard" can be rendered as "饭后上芥末——雨后送伞" (serving mustard after the meal, like offering an umbrella after the rain): the image of "mustard" is best preserved while the functional meaning is still conveyed.
Katharina Reiss (1923–), born in Germany, is a famous translation theorist with a strong command of German and Spanish. She taught successively at Heidelberg University, the University of Würzburg, and the University of Mainz, and was long engaged in translation research and teaching at university.
➢ She applied functional equivalence theory to translation criticism, proposing a relatively sound view of translation criticism that takes functional equivalence as the standard for assessing translation quality.
➢ In the early 1970s she was the first to propose that the concept of equivalence should not stop at the micro level of words, phrases, and sentences, but should extend to the text (discourse) level.
III. Operative texts: their purpose is to appeal to or persuade the reader or "receiver" of the text to act in a certain way. In translation, engaging the listener and moving the reader is the translator's primary task, even if this sometimes entails changes in content, form, and style. This category mainly covers advertisements, speeches, and similar texts.
Classification writing

Classification Essay

What is a classification essay? In a classification essay, a writer organizes, or sorts, things into categories.

How to write an effective classification essay:

1. Determine the categories. Be thorough; don't leave out a critical category. For example, if you say water sports of Hawaii include snorkeling and sailing, but leave out surfing, your essay would be incomplete because surfing is Hawaii's most famous water sport. On the other hand, don't include too many categories, which will blur your classification. For example, if your topic is sports shoes, and your organizing principle is activity, you wouldn't include high heels with running and bowling shoes.

2. Categories should be meaningful and not based on superficial differences. One might write a meaningful essay about American Presidents by dividing them by their politics, their attitudes toward the Supreme Court, or their religious convictions. But dividing Presidents by their horoscopes, their pets, or their nicknames serves little purpose.

3. Classify by a single principle. Once you have categories, make sure that they fit into the same organizing principle. The organizing principle is how you sort the groups. Do not allow a different principle to pop up unexpectedly. For example, if your unifying principle is "tourist-oriented" water sports, don't use another unifying principle, such as "native water sports," which would have different categories: pearl diving, outrigger, or canoe racing.

4. Classification must not use categories that overlap. Example: you could divide your classmates into intelligent and good-looking. However, perhaps some of your classmates are both intelligent and good-looking. That means that those two groups overlap, so it is a bad classification. There is another reason this classification is bad: the types of groups should be consistent. You could use good-looking and ugly, or intelligent and stupid, but you cannot mix the types as in the first example of intelligent and good-looking. However, even with good-looking and ugly, we have a problem. Do you really think that all of your classmates fit into those categories? Probably not. There are probably plenty of average-looking students in your class. (I'm just guessing.) It is very important that your groups include everyone in your class. Classifications have to include all the possible classes to completely cover the topic.

5. To avoid oversimplifying your subject, qualify your remarks and point out possible exceptions: "Most people deal with loss in three ways..."; "In most instances investigators use one of four methods to detect fraud."

6. Announce the topic and explain the method of division. Stating the number of subtopics can help readers recall major points: "People cope with loss in four common ways: anger, depression, denial, and compensation." "Children develop personal values from three major influences: parents, peers, and teachers." "What type of computer should you buy? That will depend upon your purpose. ... The two major categories of computers are IBM and Macintosh."

7. Use parallel patterns to develop categories and items.

8. To make your division easier to follow and remember, consider using bold type or larger fonts to highlight each subtopic.

9. Support each category equally with examples. In general, you should write the same quantity, i.e., give the same number of examples, for each category.
The most important category, usually reserved for last, might require more elaboration.

Below are some sample thesis statements:
1. I am interested in several kinds of work opportunities.
2. Television shows may be classified into five types.
3. My present life is divided into four aspects.
4. My friends (enemies) may be grouped into three major types.
5. Several steps led to former President Nixon's resignation from the White House during the Watergate scandal.
6. Literature is classified into four different forms: short story, novel, drama, and poetry.
7. So far in life I have experienced three kinds of love: parental devotion, deep friendship, and romantic attraction.
8. The people I know may be classified into mere acquaintances, casual friends, bosom buddies, and loved ones.
9. The people I know fall into four different political groups: radical, liberal, conservative, and indifferent.
10. The people in my life are classifiable into three distinct classes, each with its own peculiar way of behaving: lower class, middle class, and snooty.
11. My friends, relatives, and acquaintances fall neatly into three groups: the were-fats, the are-fats, and the will-be-fats.

The following are some sample statements for placing an item in a specific category:
12. Coal is a kind of non-renewable resource.
13. Coal is a type of non-renewable resource.
14. Coal falls under the category of non-renewable resources.
15. Coal belongs to the category of non-renewable resources.
16. Coal is a part of the category of non-renewable resources. / Coal fits into the category of non-renewable resources. / Coal is grouped with non-renewable resources. / Coal is related to other non-renewable resources.
17. Coal is associated with other non-renewable resources.

Reading for fun

Types of Girls You Might See in the Restroom

Sloppy - Skirt drags in the toilet while squatting, pees all over the front of the toilet seat, never uses toilet paper, drags her business all over the seat, forgets to flush and emerges with the back of her skirt caught in her pants.
Timid - Looks under the stall door to see if anyone else is in the can, turns on the faucet full force, backs up to the toilet, squats quickly, flushes for a constant flow of water, coughs, hums, listens intently to learn if any sound other than the faucet can be heard. Ends up with a loud fart, walks out blushing.
Frivolous - Lets the stream go in little squirts to the tune of "Row, Row, Row Your Boat."
Literary - Always takes the book of the month to the can with her. Blames "Forever Amber" for her piles.
Cautious - Has heard of so many girls contracting VD from toilet seats that she straddles the bowl, leans over to flush, pees on her nylons.
Worried - A week past due. Squats thoughtfully, counting the days overdue on her fingers. Uses toilet paper, examines it hopefully. Peers into the toilet before flushing, sighing deeply. Walks out biting her nails after forgetting to wash her hands. Resolves never to go to bed drunk again.
Cross-Eyed - Sits on one cheek on the side of the seat and pees all over the floor. Usually wears rubber boots on her visits to the can, and carries a box of Kleenex in her purse.
Big Time - Always leaves the toilet door open while she chats and brags to the other girls about the guy she "had" last night. Shows the girls her panties with black lace edging and "Welcome" embroidered in the crotch. Has never been to bed with a man.
Selfish - Enters alone and locks the door, saying to the girls following that she will be out in a minute. Leisurely pees, remarks, adjusts her clothes and poses before the mirror, keeping the others squirming outside for an hour.
Conceited - Approaches the toilet with undulating movements.
Raises her dress by the fingertips. Her expression while peeing indicates that such a lovely creature should not be compelled to attend to such lowly duties. Farts silently and disdainfully.
Hardy Girl - Raises her dress with a whoop. Scuttles across the floor beating the other occupant to the toilet. Squats with great force, rattling the windows and causing her breasts to bob up and down, hums a lively tune, peeing in squirts to keep time, farts loudly and with great glee.
Drunk - Wobbles to the toilet. After several attempts manages to raise her dress. Squats on the toilet with shrieks of laughter. Pees for a while, singing happy songs, then suddenly starts to sob broken-heartedly as she realizes that she forgot to pull down her panties. Continues peeing and sobbing.
The I Don't Care Girl - Just squats and fires away.
Stubborn Girl - Believes all public places are contaminated. Stands three feet in front of the toilet, backs up, takes careful aim and fires away; always misses, but will try again.

Types of Men You Meet in Washrooms

EXCITED - Pants are twisted, cannot find the hole, rips his pants in anger.
SOCIABLE - Joins his pals for a pee whether he wants one or not.
TIMID - Cannot pee if anyone is watching. Pretends he has peed and sneaks back later.
NOISY - Whistles loudly. Peeps over the partition to have a look at the other fellow's tool.
INDIFFERENT - All urinals being occupied, uses the sink.
CLEVER - Pees without holding his tool, shows off by adjusting his tie at the same time.
VAIN - Undoes five buttons to take out his tool when two would have done.
ABSENT-MINDED - Opens his jacket, takes out his tie and pees in his pants.
WORRIED - Not quite sure what he has been up to lately, makes a furtive but close inspection of his tool while peeing.
DISGRUNTLED - Stands for a while, grunts, farts, tries to pee, fails, farts and walks away.
SNEAKY - Drops a silent fart while peeing, sniffs and looks at the bloke standing next to him.
SLOPPY - Pees down into his shoe, walks out with his zip open and adjusts his balls ten minutes later.
LEARNED - Reads a book or newspaper while peeing.
CHILDISH - Looks at the bottom of the urinal to watch the bubbles while peeing.
STRONG - Bangs his tool on the side of the urinal to knock the drops off.
DRUNKEN - Pulls out his tool, sees two, puts one back and pees in his trousers.
EMBARRASSED - Covers his tool with both hands as he stands there and pees through his fingers.
COCKEYED - Stands in one cubicle and pees in the next.
SCARED - Looks at the wall because he is scared to look at what he's holding.

Types of Boyfriends

Boyfriends can be of many types. Here are the categories from which you can evaluate where your boyfriend fits in and find out more about his traits and personality:
• Mr Family Man - Perfect marriage material, this type of guy is always ready to help you with household chores, cuddle you and pamper you. He is well behaved and just a sweetie darling. He is not so popular with other guys, though, who consider him a soft-boiled egg. Even you may feel that he is a little too compassionate and lacks willpower.
• Mr Grumpy - This guy has a lot to complain about; everything and anything in the world and everybody is either stupid or evil for him. He rarely ventures out of the house and is a predictable jerk. It is very difficult to fare with such a person for long.
• Mr Creampuff - Always ready to say sorry, this guy is just too soft. He is cute, has an innocent look on his face, and trembles and jumps entertainingly when he is startled by any loud noise, including the slamming of doors or a sudden increase in the volume of the TV.
He is eager to surrender and gets spooked very easily.
• Mr Bigfoot - Big, strong and dumb, this type of boyfriend is quite handy when it comes to rearranging furniture or hauling heavy stuff in a jiffy when you change your home. He is easily fooled too, but you have to bear with his heavy sweating, and be careful lest he break you in half while hugging you.
• Mr Parasite - A couch potato and probably a drug addict, you can easily get your hands on him. However, he thinks he has a right to use and abuse everything you own and will hardly be able to fulfill your dreams. Get rid of him quickly or he will sponge off a big chunk of your money very quickly.
• Mr Sneaky - This type of guy loves to sneak up on you and may even hire a private detective to keep an eye on everything you do. He may use hidden cameras and may even go to the lengths of desiring to know your each and every word. You can never be sure whether he is having the time of his life or is really having pangs of guilt.
• Mr Dreamer - Probably a struggling artist or a philosopher, or simply a buffoon, this type of guy has no idea what his career and growth prospects are or how he is going to achieve his goals. Yet he always dreams of being rich and famous someday. He is good at telling interesting tall tales but may turn into Mr Grumpy after some time.
• Mr Right - The perfect man of everyone's dreams and the answer to everyone's prayers, this man is rich, handsome, has perfect manners, owns almost every luxury in the world and loves you like God. However, he has long been hunted to extinction.

Fun moment: search the website OkCupid.

Topics for practice:
1. Girls/boys in my class/school.
English Thesis: English Puns and Their Translation

[Abstract] The English pun is one of the important figures of speech, and it is widely used in various literary works, such as poems, novels, stories, advertisements and riddles. Based on the definitions of the English pun, this paper points out that homonyms, homophones, and homographs are all available to construct puns. According to the characteristics and functions of their formation, Lv Xu divided English puns into three types: homophonic puns, homographic puns, and puns on both pronunciation and meaning. The English pun takes advantage of its distinctive features and deliberately produces ambiguity in order to achieve the effect of "aiming at a pigeon and shooting at a crow" (saying one thing while meaning another). English puns can achieve many effects: humor, satire, and vivid expression of the characters' feelings, which increase the beauty of the language and improve readability so as to attract the readers' interest. However, the translation of English puns is always considered extremely difficult; many people even consider puns "untranslatable". Since there is much difference between Chinese and English in phonology and morphology, it is difficult for a translator to find equivalence in both sound and meaning in the target language. But no source text is absolutely untranslatable; the translation of puns is possible to a certain extent. This paper introduces three main ways to translate English puns: literal translation, free translation, and annotated translation.

[Key Words] English pun; classification; characteristics; function; translation

[Chinese Abstract] The English pun is one of the important figures of speech. It is widely used in various literary works, such as poems, novels, stories, advertisements and riddles.
Bag of Tricks for Efficient Text Classification

arXiv:1607.01759v2 [cs.CL] 7 Jul 2016

Bag of Tricks for Efficient Text Classification

Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
Facebook AI Research
{ajoulin,egrave,bojanowski,tmikolov}@

Abstract

This paper proposes a simple and efficient approach for text classification and representation learning. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.

1 Introduction

Building good representations for text classification is an important task with many applications, such as web search, information retrieval, ranking and document classification (Deerwester et al., 1990; Pang and Lee, 2008). Recently, models based on neural networks have become increasingly popular for computing sentence representations (Bengio et al., 2003; Collobert and Weston, 2008). While these models achieve very good performance in practice (Kim, 2014; Zhang and LeCun, 2015; Zhang et al., 2015), they tend to be relatively slow both at train and test time, limiting their use on very large datasets.

At the same time, simple linear models have also shown impressive performance while being very computationally efficient (Mikolov et al., 2013; Levy et al., 2015). They usually learn word-level representations that are later combined to form sentence representations. In this work, we propose an extension of these models to directly learn sentence representations. We show that by incorporating additional statistics such as bag of n-grams, we reduce the gap in accuracy between linear and deep models, while being many orders of magnitude faster.

Our work is closely related to standard linear text classifiers (Joachims, 1998; McCallum and Nigam, 1998; Fan et al., 2008). Similar to Wang and Manning (2012), our motivation is to explore simple baselines inspired by models used for learning unsupervised word representations. As opposed to Le and Mikolov (2014), our approach does not require sophisticated inference at test time, making its learned representations easily reusable on different problems. We evaluate the quality of our model on two different tasks, namely tag prediction and sentiment analysis.

2 Model architecture

A simple and efficient baseline for sentence classification is to represent sentences as bag of words (BoW) and train a linear classifier, for example a logistic regression or support vector machine (Joachims, 1998; Fan et al., 2008). However, linear classifiers do not share parameters among features and classes, possibly limiting their generalization. Common solutions to this problem are to factorize the linear classifier into low-rank matrices (Schutze, 1992; Mikolov et al., 2013) or to use multilayer neural networks (Collobert and Weston, 2008; Zhang et al., 2015). In the case of neural networks, the information is shared via the hidden

Figure 1: Model architecture for fast sentence classification.
layers. Figure 1 shows a simple model with 1 hidden layer. The first weight matrix can be seen as a look-up table over the words of a sentence. The word representations are averaged into a text representation, which is in turn fed to a linear classifier. This architecture is similar to the cbow model of Mikolov et al. (2013), where the middle word is replaced by a label. The model takes a sequence of words as input and produces a probability distribution over the predefined classes. We use a softmax function to compute these probabilities.

Training such a model is similar in nature to word2vec, i.e., we use stochastic gradient descent and backpropagation (Rumelhart et al., 1986) with a linearly decaying learning rate. Our model is trained asynchronously on multiple CPUs.

2.1 Hierarchical softmax

When the number of targets is large, computing the linear classifier is computationally expensive. More precisely, the computational complexity is O(Kd) where K is the number of targets and d the dimension of the hidden layer. In order to improve our running time, we use a hierarchical softmax (Goodman, 2001) based on a Huffman coding tree (Mikolov et al., 2013). During training, the computational complexity drops to O(d log2(K)). In this tree, the targets are the leaves.

The hierarchical softmax is also advantageous at test time when searching for the most likely class. Each node is associated with a probability that is the probability of the path from the root to that node. If the node is at depth l+1 with parents n_1, ..., n_l, its probability is

P(n_{l+1}) = \prod_{i=1}^{l} P(n_i).

This means that the probability of a node is always lower than that of its parent. Exploring the tree with a depth-first search and tracking the maximum probability among the leaves allows us to discard any branch associated with a smaller probability. In practice, we observe a reduction of the complexity to O(d log2(K)) at test time. This approach is further extended to compute the T top targets at the cost of O(log(T)), using a binary heap.

2.2 N-gram features

Bag of words is invariant to word order, but taking this order into account explicitly is often computationally very expensive. Instead, we use a bag of n-grams as additional features to capture some partial information about the local word order. This is very efficient in practice while achieving comparable results to methods that explicitly use the order (Wang and Manning, 2012).

We maintain a fast and memory-efficient mapping of the n-grams by using the hashing trick (Weinberger et al., 2009) with the same hashing function as in Mikolov et al. (2011), and 10M bins if we only use bigrams, and 100M otherwise.
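To make the architecture concrete, here is a minimal Python sketch of the approach the paper describes: word and hashed-bigram embeddings are averaged into a text representation and fed to a linear softmax classifier trained by SGD. This is only an illustration, not the released fastText implementation; the toy corpus, Python's built-in hash in place of the paper's hashing function, the bin count, and the plain synchronous SGD loop are all assumptions.

```python
# Minimal fastText-style classifier sketch (illustrative, not the released fastText).
import numpy as np

N_BINS = 1_000_000   # hashed bigram bins (the paper uses 10M for bigrams)
DIM = 10             # hidden dimension (the paper reports results with 10 units)

def features(tokens, vocab):
    """Word ids plus hashed-bigram ids (offset past the word vocabulary)."""
    ids = [vocab[t] for t in tokens if t in vocab]
    ids += [len(vocab) + hash(a + " " + b) % N_BINS
            for a, b in zip(tokens, tokens[1:])]
    return ids

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def train(docs, labels, n_classes, epochs=5, lr=0.1):
    vocab = {t: i for i, t in enumerate({t for d in docs for t in d})}
    E = np.random.uniform(-0.1, 0.1, (len(vocab) + N_BINS, DIM))  # look-up table
    W = np.zeros((n_classes, DIM))                                # linear classifier
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            ids = features(doc, vocab)
            h = E[ids].mean(axis=0)       # average embeddings into a text vector
            p = softmax(W @ h)
            g = p.copy(); g[y] -= 1.0     # gradient of cross-entropy wrt logits
            E[ids] -= lr * (W.T @ g) / len(ids)
            W -= lr * np.outer(g, h)
    return vocab, E, W

docs = [["good", "movie"], ["bad", "movie"], ["great", "film"], ["awful", "film"]]
labels = [1, 0, 1, 0]
vocab, E, W = train(docs, labels, n_classes=2)
h = E[features(["good", "film"], vocab)].mean(axis=0)
print(softmax(W @ h))  # probability over the two classes
```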
3 Experiments

3.1 Sentiment analysis

Datasets and baselines. We employ the same 8 datasets and evaluation protocol of Zhang et al. (2015). We report the N-grams and TFIDF baselines from Zhang et al. (2015), as well as the character-level convolutional model (char-CNN) of Zhang and LeCun (2015) and the very deep convolutional network (VDCNN) of Conneau et al. (2016). We also compare to Tang et al. (2015), following their evaluation protocol, and report their main baselines as well as their approaches based on recurrent networks.

Table 1: Test accuracy [%] on sentiment datasets. fastText has been run with the same parameters for all the datasets; it has 10 hidden units and is evaluated with and without bigrams. For VDCNN and char-CNN, the best reported numbers without data augmentation are shown.

Model                               AG    Sogou  DBpedia  Yelp P.  Yelp F.  Yah. A.  Amz. F.  Amz. P.
BoW (Zhang et al., 2015)            88.8  92.9   96.6     92.2     58.0     68.9     54.6     90.4
ngrams (Zhang et al., 2015)         92.0  97.1   98.6     95.6     56.3     68.5     54.3     92.0
ngrams TFIDF (Zhang et al., 2015)   92.4  97.2   98.7     95.4     54.8     68.5     52.4     91.5
char-CNN (Zhang and LeCun, 2015)    87.2  95.1   98.3     94.7     62.0     71.2     59.5     94.5
VDCNN (Conneau et al., 2016)        91.3  96.8   98.7     95.7     64.7     73.4     63.0     95.7

[Table residue: per-epoch training times on the same eight datasets (AG, Sogou, DBpedia, Yelp P., Yelp F., Yah. A., Amz. F., Amz. P.); char-CNN and VDCNN train in hours to days (roughly 1h to 5d), while fastText trains in seconds (3s to 52s per dataset).]

Comparison following the protocol of Tang et al. (2015):

Model      Yelp'13  Yelp'14  Yelp'15  IMDB
fastText   64.2     66.2     66.6     45.2

Qualitative tag-prediction examples (Table 4 residue): the caption "taiyoucon 2011 digitals: individuals digital photos from the anime convention taiyoucon 2011 in mesa, arizona. if you know the model and/or the character, please comment. #cosplay" is tagged #24mm #anime #animeconvention #arizona #canon #con #convention #cos #cosplay #costume #mesa #play #taiyou #taiyoucon; "beagle enjoys the snowfall #snow" is tagged #2007 #beagle #hillsboro #january #maddison #maddy #oregon #snow; "euclid avenue #newyorkcity" is tagged #cleveland #euclidavenue.

Table 5: Prec@1 on the test set for tag prediction on YFCC100M, with training and test times. Test time is reported for a single thread, while training uses 20 threads for both models.

Model              prec@1  Train time  Test time
Freq. baseline     2.2     -           -
Tagspace, h = 50   30.1    3h8         6h
Tagspace, h = 200  35.6    5h32        15h

Table 4 shows some qualitative examples. fastText learns to associate words in the caption with their hashtags, e.g., "christmas" with "#christmas". It also captures simple relations between words, such as "snowfall" and "#snow". Finally, using bigrams also allows it to capture relations such as "twin cities" and "#minneapolis".

4 Discussion and conclusion

In this work, we have developed fastText, which extends word2vec to tackle sentence and document classification. Unlike unsupervisedly trained word vectors from word2vec, our word features can be averaged together to form good sentence representations. In several tasks, we have obtained performance on par with recently proposed methods inspired by deep learning, while observing a massive speed-up. Although deep neural networks have in theory much higher representational power than shallow models, it is not clear if simple text classification problems such as sentiment analysis are the right ones to evaluate them. We will publish our code so that the research community can easily build on top of our work.

References

[Bengio et al. 2003] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Jauvin. 2003. A neural probabilistic language model. JMLR.
[Collobert and Weston 2008] Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. In ICML.
[Conneau et al. 2016] Alexis Conneau, Holger Schwenk, Loïc Barrault, and Yann LeCun. 2016. Very deep convolutional networks for natural language processing. arXiv preprint arXiv:1606.01781.
[Deerwester et al. 1990] Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and Richard Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science.
[Fan et al. 2008] Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. JMLR.
[Goodman 2001] Joshua Goodman. 2001. Classes for fast maximum entropy training. In ICASSP.
[Joachims 1998] Thorsten Joachims. 1998. Text categorization with support vector machines: Learning with many relevant features. Springer.
[Kim 2014] Yoon Kim. 2014. Convolutional neural networks for sentence classification. In EMNLP.
[Le and Mikolov 2014] Quoc V. Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. arXiv preprint arXiv:1405.4053.
[Levy et al. 2015] Omer Levy, Yoav Goldberg, and Ido Dagan. 2015. Improving distributional similarity with lessons learned from word embeddings. TACL.
[McCallum and Nigam 1998] Andrew McCallum and Kamal Nigam. 1998. A comparison of event models for naive Bayes text classification. In AAAI Workshop on Learning for Text Categorization.
[Mikolov et al. 2011] Tomáš Mikolov, Anoop Deoras, Daniel Povey, Lukáš Burget, and Jan Černocký. 2011. Strategies for training large scale neural network language models. In Workshop on Automatic Speech Recognition and Understanding. IEEE.
[Mikolov et al. 2013] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[Ni et al. 2015] Karl Ni, Roger Pearce, Kofi Boakye, Brian Van Essen, Damian Borth, Barry Chen, et al. 2015. Large-scale deep learning on the YFCC100M dataset. arXiv preprint arXiv:1502.03409.
[Pang and Lee 2008] Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval.
[Rumelhart et al. 1986] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. 1986. Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press.
[Schutze 1992] Hinrich Schutze. 1992. Dimensions of meaning. In Supercomputing.
[Tang et al. 2015] Duyu Tang, Bing Qin, and Ting Liu. 2015. Document modeling with gated recurrent neural network for sentiment classification. In EMNLP.
[Wang and Manning 2012] Sida Wang and Christopher D. Manning. 2012. Baselines and bigrams: Simple, good sentiment and topic classification. In ACL.
[Weinberger et al. 2009] Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. 2009. Feature hashing for large scale multitask learning. In ICML.
[Weston et al. 2011] Jason Weston, Samy Bengio, and Nicolas Usunier. 2011. WSABIE: Scaling up to large vocabulary image annotation. In IJCAI.
[Weston et al. 2014] Jason Weston, Sumit Chopra, and Keith Adams. 2014. #TagSpace: Semantic embeddings from hashtags. In EMNLP.
[Zhang and LeCun 2015] Xiang Zhang and Yann LeCun. 2015. Text understanding from scratch. arXiv preprint arXiv:1502.01710.
[Zhang et al. 2015] Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In NIPS.
7. Classification

Choosing a topic
Avoid topics that are colorless, flat, insipid, boring, unimaginative, monotonous, and BROAD. They require no thought to write, no thought to read, and no effort to forget. Topics with two extremes and a middle position are almost always destined for dullness. Bad idea: intelligent thinkers, unintelligent thinkers, and average thinkers. How about this instead: academic intelligence (raw IQ), commonsense intelligence (good practical judgment), and street-smart intelligence (cunning)? Be new, be original! (Free-ranging thinking/free associations.) Example: a hairdresser's customers classified as dogs: some pretty and lovable like the cocker spaniel, some growly and assertive like the Doberman, and some feisty and temperamental like the poodle.
Classification: Establishing Groups
Interpretability of Text Classification

Text classification is the natural language processing task of assigning a document to one or more categories.
Its applications are very broad, covering spam filtering, automatic tagging, and any setting where text needs to be filed automatically.
In machine learning terms, text classification is supervised learning. The workflow is: manually annotate document categories, train a model on the annotated corpus, and then use the model to predict the categories of new documents.
Text classification (Text Classification or Text Categorization, TC), also called automatic text categorization, refers to the process by which a computer maps a text carrying information onto one or more predefined category topics; the algorithmic model that implements this process is called a classifier.
Text classification counts as a classic problem in natural language processing.
Depending on the predefined categories, text classification comes in two flavors: binary and multi-class; multi-class classification can be implemented via binary classification.
In terms of label annotation, text classification can also be divided into single-label and multi-label, because many texts can be associated with several categories at once.
Text classification was initially performed with hand-written expert rules (patterns), using knowledge engineering to build expert systems. This solved the problem in an intuitive way, but it was time-consuming and labor-intensive, and both its coverage and its accuracy were limited.
Later, with the development of statistical learning methods, especially the growth of online text on the Internet after the 1990s and the rise of machine learning as a discipline, a classic recipe for large-scale text classification gradually took shape: feature engineering plus shallow classification models. Approaches today further divide into traditional machine-learning methods and deep-learning text classification methods.
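As a minimal sketch of the workflow just described (annotate, train, predict) in the feature-engineering-plus-shallow-classifier style, the following uses scikit-learn; the toy spam/ham corpus and the choice of TF-IDF features with logistic regression are illustrative assumptions, not a prescribed setup.

```python
# A minimal supervised text-classification pipeline sketch (scikit-learn assumed installed).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 1. Manually labelled corpus (in practice, thousands of annotated documents).
train_docs = ["cheap pills buy now", "meeting moved to friday",
              "win a free prize today", "quarterly report attached"]
train_labels = ["spam", "ham", "spam", "ham"]

# 2. Train a model on the corpus: feature engineering + shallow classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_docs, train_labels)

# 3. Use the trained model to predict the category of new documents.
print(model.predict(["free pills prize", "see the attached report"]))
```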
Modern College English Intermediate Writing Course Lesson Plans (Selected Edition)

Modern College English Intermediate Writing Course Lesson Plans

Textbook: 《现代大学英语中级写作》 (Modern College English: Intermediate Writing), Xu Kerong, Foreign Language Teaching and Research Press. Lesson plans for Intermediate English Writing (Volume One).

I. Lecture topic: Unit One: We Learn As We Grow
1. Teaching objectives and requirements:
(1) Master:
a. The basics of exemplification: definition; kinds of examples; sources of examples.
b. Outlining expository essays.
Key points: the definition and introduction of exposition and the essay. Exposition is explanatory writing; its purpose is to explain or clarify a point. An essay is a related group of paragraphs written for some purpose.
(2) Be familiar with: practicing the basics of exemplification; practicing outlining.
Key points: patterns of exposition; the choice of appropriate examples; the organization of an exemplification essay; types of essays; the basic structure and elements of an expository essay; types of outline; rules concerning outlines.
(3) Understand: patterns of exposition, types of essays, and types of outline: process analysis, cause-effect analysis, comparison and contrast, classification, definition and analogy; narrative, descriptive, expository and argumentative essays.
2. Key and difficult points: Key: exemplification; types of outline. Difficult: the sentence outline and the topic outline.
3. Class hours: 4.
4. Teaching methods: lecturing, in-class rapid reading exercises, in-class questioning, guided writing practice.
5. Basic content: Lesson One: Exemplification; Elements of the Essay: Outlining.
6. References: 《英语写作手册》, 《美国大学英语写作》.
7. Assignments and questions for consideration:
First: read on the subject and write an example paper of 200-250 words on the given topic.
Second: read on the subject and write an essay of 200-250 words on the given topic, using either a single extended example or two or three short ones to develop your thesis statement.
Third: ask students to practice outlining.
8. Post-class summary: emphasis on the writing procedure:
→ Prewriting: choosing a topic and exploring ideas
→ Drafting: getting your ideas on paper
→ Revising: strengthening your essay
→ Editing and proofreading: eliminating technical errors

II. Lecture topic: Unit Two: I Made It
1. Teaching objectives and requirements:
(1) Master:
a. The basics of process analysis: definition; uses; types; methods.
b. Writing the thesis statement.
Key points: the definition and introduction of process analysis; the function of process analysis; the differences between a thesis statement and a topic sentence.
(2) Be familiar with: the areas in which process analysis is usually used.
Key points: functions of process analysis (giving instructions, giving information, giving the history); major types of process analysis (directive process analysis, informative process analysis); writing an effective thesis statement.
(3) Understand: the basics of process writing and the thesis statement.
2. Key and difficult points: Key: the organization of a process paper; practice of the effective thesis statement. Difficult: guidelines on process analysis; writing an effective thesis statement.
3. Class hours: 4.
4. Teaching methods: lecturing, in-class rapid reading exercises, in-class questioning, guided writing practice.
5. Basic content: Lesson Two: Process Analysis; Elements of the Essay: The Thesis Statement.
6. References: 《英语写作手册》, 《美国大学英语写作》.
7. Assignments and questions for consideration:
First: read on the subject and write an informative process paper describing how you succeeded in doing something.
Second: read on the subject and write a directive process paper telling first-year students how to adjust to life at college.
Third: ask students to practice writing the thesis statement.
8. Post-class summary: emphasis on the writing procedure: prewriting (choosing a topic and exploring ideas); drafting (getting your ideas on paper); revising (strengthening your essay); editing and proofreading (eliminating technical errors).

III. Lecture topic: Unit Three: College Is Not a Paradise
1. Teaching objectives and requirements:
(1) Master:
a. The basics of cause-effect analysis: definition; uses; patterns.
b. Writing an introduction to expository essays: what to include in the introduction; how to write an effective introduction.
Key points: the definition and introduction of cause-effect analysis; the function of cause-effect analysis; the writing of an effective introduction.
(2) Be familiar with: the functions and areas in which cause-effect analysis is usually used.
Key points: functions of cause-effect analysis (explaining why certain things happen; analyzing what will happen as a result); major types of cause-effect analysis (focusing on causes and focusing on effects); how to start and write an effective introduction.
(3) Understand: the basics of cause-effect analysis and writing an effective introduction.
2. Key and difficult points: how to focus on causes or effects; how to start and write an effective introduction.
3. Class hours: 4.
4. Teaching methods: lecturing, in-class rapid reading exercises, in-class questioning, guided writing practice.
5. Basic content: Lesson Three: Cause-Effect Analysis; Parts of the Essay: The Introduction.
6. References: 《英语写作手册》, 《美国大学英语写作》.
7. Assignments and questions for consideration:
First: read on the subject and write an essay on any of the given topics analyzing causes.
Second: read on the subject and write, from your own experience, an essay analyzing the effects of anything taught in class.
Third: ask students to practice writing the introduction.
8. Post-class summary: emphasis on the writing procedure (prewriting, drafting, revising, editing and proofreading).

IV. Lecture topic: Unit Four: What Makes the Differences
1. Teaching objectives and requirements:
(1) Master:
a. The basics of comparison and contrast: definition; uses; patterns; methods.
b. Developing the body of expository essays: what its structure looks like; what it includes.
Key points: the definition and introduction of comparison and contrast; the writing of an effective body.
(2) Be familiar with: the functions and areas in which comparison/contrast is usually used; the general structure of the body of an essay.
Key points: functions of comparison/contrast (clarifying something unknown; bringing one or both of the subjects into sharper focus); three patterns of comparison/contrast (subject by subject, point by point, mixed sequence); familiarity with the general structure of the body of an essay.
(3) Understand: the basics of comparison and contrast and the general structure of the body of an essay.
2. Key and difficult points: Key: the three patterns of comparison/contrast (subject by subject, point by point, mixed sequence); the general structure of the body (beginning, body, end). Difficult: how to organize a comparison/contrast essay; how to develop body paragraphs.
3. Class hours: 4.
4. Teaching methods: lecturing, in-class rapid reading exercises, in-class questioning, guided writing practice.
5. Basic content: Lesson Four: Comparison/Contrast; Parts of the Essay: The Body.
6. References: 《英语写作手册》, 《美国大学英语写作》.
7. Assignments and questions for consideration:
First: read on the subject and write a subject-by-subject essay of comparison/contrast on any of the given topics.
Second: read on the subject and write a point-by-point essay of comparison/contrast on any of the given topics.
Third: ask students to practice writing the body of the essay.
8. Post-class summary: emphasis on the writing procedure (prewriting, drafting, revising, editing and proofreading).
V. Lecture topic: Unit Five: It Takes All Sorts to Make a World
1. Teaching objectives and requirements:
(1) Master:
a. The basics of classification: definition; uses; methods. What is classification? What is classification used for?
b. Writing the conclusion of expository essays.
Key points: the definition and introduction of classification; the function of classification; writing an effective classification.
(2) Be familiar with: the functions and areas in which classification is usually used; the conclusion of expository essays.
Key points: functions of classification (to organize and perceive the world around us; to present a mass of material by means of some orderly system; to deal with complex or abstract topics by breaking a broad subject into smaller, neatly sorted categories); the general pattern of classification; sentence patterns in classification; familiarity with the conclusion of expository essays.
(3) Understand: the functions and areas in which classification is usually used; the conclusion of expository essays.
2. Key and difficult points: Key: sentence patterns in classification; the conclusion of expository essays. Difficult: parts of the conclusion (a summary of the main points, or a restatement of your thesis in different words).
3. Class hours: 4.
4. Teaching methods: lecturing, in-class rapid reading exercises, in-class questioning, guided writing practice.
5. Basic content: Lesson Five: Classification; Parts of the Essay: The Conclusion.
6. References: 《英语写作手册》, 《美国大学英语写作》.
7. Assignments and questions for consideration:
First: read on the subject and write a classification essay on any of the given topics.
Second: write an essay of 200-250 words on any of the given topics.
Third: ask students to practice writing the conclusion of the essay.
8. Post-class summary: emphasis on the writing procedure (prewriting, drafting, revising, editing and proofreading).

VI. Lecture topic: Unit Six: What Does It Mean
1. Teaching objectives and requirements:
(1) Master:
a. The basics of definition: definition; types; methods of organization.
b. Writing the title of expository essays. What is definition? Types of definition.
Key points: the standard/formal definition; the connotative/personal definition; the extended definition.
(2) Be familiar with: the functions and areas in which definition is usually used; the title of expository essays.
Key points: functions and patterns of definition. The standard/formal definition is used to explain a term or concept your audience or reader may not know or understand. The connotative/personal definition is used to explain any word or concept that doesn't have the same meaning for everyone. The extended definition is used to explore a topic by examining its various meanings and implications.
(3) Understand: how to write an extended definition; how to organize an extended-definition essay.
2. Key and difficult points: Key: functions and patterns of definition; how to write an extended definition; how to write the title of an expository essay. Difficult: how to organize an extended-definition essay; how to write the title of an expository essay.
3. Class hours: 4.
4. Teaching methods: lecturing, in-class rapid reading exercises, in-class questioning, guided writing practice.
5. Basic content: Lesson Six: Definition; Parts of the Essay: The Title.
6. References: 《英语写作手册》, 《美国大学英语写作》.
7. Assignments and questions for consideration:
First: read on the subject and write a definition essay on any of the given topics.
Second: write an essay of 200-250 words on any of the given topics.
Third: ask students to practice writing the title of the essay.
8. Post-class summary: emphasis on the writing procedure (prewriting, drafting, revising, editing and proofreading).

Unit Six, Task One: Definition

I. What is definition?
In talking with other people, we sometimes offer informal definitions to explain just what we mean by a particular term. That is, to avoid confusion or misunderstanding, we have to define a word, term, or concept which is unfamiliar to most readers or open to various interpretations. Suppose, for example, we say to a friend: "Forrest is really an inconsiderate person." We might then explain what we mean by "inconsiderate" by saying, "He borrowed my accounting book overnight but didn't return it for a week. And when I got it back, it was covered with coffee stains." Definition is the explanation of the meaning of a word or concept, and it is also a method of developing an essay.

II. The ways to define a word or term
There are three basic ways to define a word or term:
A. Give a synonym. For example: "To mend is to repair." Or: "A fellow is a man or a boy."
B. Use a sentence (often with an attributive clause). For example, ink may be defined in a sentence: "Ink is colored water which we use for writing."
C. Write a paragraph or even an essay. A synonym or a sentence cannot give a satisfactory definition of an abstract term whose meaning is complex. We have to write a paragraph or an essay with examples or negative examples (what the term does not mean), with analogies or comparisons, with classification or cause-effect analysis.

III. When we give a definition, we should observe certain principles:
1. First, we should avoid circular definitions. "Democracy is the democratic process" and "An astronomer is one who studies astronomy" are circular definitions.
2. Second, we should avoid long lists of synonyms if the term to be defined is an abstract one. For example: "By imagination, I mean the power to form mental images of objects, the power to form new ideas, the gift of employing images in writing, and the tendency to attribute reality to unreal things, situations and states." (This just picks up words and expressions from a dictionary in the hope that one will hit.)
3. Third, we should avoid loaded definitions. Loaded definitions do not explain terms but make an immediate appeal for emotional approval. A definition like "By state enterprise, I mean high cost and poor efficiency" is loaded with pejorative emotional connotation. Conversely, "By state enterprise, I mean one of the great blessings of democratic planning" is loaded with favorable emotional connotation. Such judgments can be vigorous in a discussion, but they lead to argument, not clarification, when offered as definitions.

IV. Types of definition
1. Standard/formal definition: denotation is a word's core, direct, and literal meaning.
2. Connotative/personal definition: explains what you mean by a certain term or concept that could have different meanings for others. Connotation, by contrast, is the implied, suggested meaning of a word; it refers to the emotional response stimulated by the associations the word carries with it.
A. For Americans, Watergate is associated with a political scandal that means dishonesty, and more English words are now created with the suffix -gate to mean some scandal: thus Iran Gate, Intelligence Gate.
B. Dogs, in Chinese culture, may be quite a negative image; it is insulting to call someone a dog.
What about Western people? In their eyes, the dog is lovely and has good associated meanings. They say, "Love me, love my dog."
C. Imperialism carries quite negative meaning for us Chinese, while some Western people may be proud of being imperial, and of imperialism itself.
D. People everywhere may also share some connotations for some words. These are general connotations: mother means love, care, selflessness, etc.
E. "Let's get the gang together for a party tonight." (a group) "Don't go around with that gang or you'll come to no good." (a degraded group of people, or a group of criminals)
Connotation can make all the difference. It is the mirror of your attitude.
3. Extended definition: an essay-length piece of writing using this method of development.

V. How to write an extended definition
Follow four rules for a good definition:
1. Don't use the words "when" and "where" in giving a definition. A common practice is to define a noun with a noun, an adjective with an adjective, and so on.
2. Remember that a definition is not a repetition.
3. Use simple and well-known terms in your explanation.
4. Point out the distinguishing features of the term.

Unit Six, Task Two: The Title

I. What is a title?
A title is a very brief summary of what your paper is about. It is often no more than several words. You may find it easier to write the title after you have completed your paper. A title may be a phrase which can indicate a topic of interest (i.e. your focus) and at the same time point towards a particular kind of discussion (your mode of argument). Accordingly, your title needs not only to indicate what the essay will be about, but also to indicate the point of view it will adopt concerning whatever it is about.

II. The purpose of the title
To give the reader an idea of what the essay is about; to provide focus for the essay; to arouse the reader's interest.

III. How to write a good title
Make it clear, concise and precise; use a phrase rather than a sentence; exclude all extra words.

IV. Other rules to obey
Center it at the top of the first page. Use no period at the end and no quotation marks. Capitalize the first and last words. Capitalize all other words except articles (a, the), the "to" in infinitives, prepositions of one syllable, and coordinating conjunctions (and, but, or, etc.). A good title leads, but a poor title misleads. Be sure that it is appropriate, and be careful with the capitalization.

Exercise: write an appropriate title for each of the introductory paragraphs that follow.

1. Title: Reactions to Disappointment
Ben Franklin said that the only sure things in life are death and taxes. He left something out, however: disappointment. No one gets through life without experiencing many disappointments. Strangely, though, most people seem unprepared for disappointment and react to it in negative ways. They feel depressed or try to escape their troubles instead of using disappointments as an opportunity for growth.

2. Title: Annoying People
President Richard Nixon used to keep an "enemies list" of all the people he didn't especially like. I am ashamed to confess it, but I, too, have an enemies list, a mental one. On this list are the people I would gladly live without, the ones who cause my blood pressure to rise to the boiling point. The top three places on the list go to people with annoying nervous habits, people who talk in movie theatres, and people who talk on car phones while driving.

3. Title: The Meaning of Maturity
Being a mature student does not mean being an old-timer. Maturity is not measured by the number of years a person has lived.
Instead, the yardstick of maturity is marked by the qualities of self-denial, determination, and dependability.

4. Title: College Stress
Jack's heart pounds as he casts panicky looks around the classroom. He doesn't recognize the professor, he doesn't know any of the students, and he can't even figure out what the subject is. In front of him is a test. At the last minute his roommate awakens him. It's only another anxiety dream. The very fact that dreams like Jack's are common suggests that college is a stressful situation for young people. The causes of this stress can be academic, financial, and personal.

5. Title: How to Complain
I'm not just a consumer; I'm a victim. If I order a product, it is sure to arrive in the wrong color, size, or quantity. If I hire people to do repairs, they never arrive on the day scheduled. If I owe a bill, the computer is bound to overcharge me. Therefore, in self-defense, I have developed the following consumer's guide to complaining effectively.

VII. Lecture topic: Unit Seven: The Insight I Gained
1. Teaching objectives and requirements:
(1) Master:
a. The basics of analogy: definition; uses; methods of organization.
b. Using transitions. What is analogy? The difference between analogy and comparison.
Key points: the fields in which analogy is used; the difference between analogy and comparison; the patterns of analogy.
(2) Be familiar with: the functions and areas in which analogy is usually used; learning to use transitions.
Key points: functions and patterns of analogy. A comparison explains two obviously similar things and considers both their differences and similarities, whereas an analogy compares two apparently unlike things and focuses only on their major similarities. An analogy is thus an extended metaphor, the figure of speech that declares one thing to be another.
(3) Understand: how to organize an analogy subject by subject; how to organize an analogy point by point.
2. Key and difficult points: Key: functions and patterns of analogy; the differences between comparison and analogy; learning to use transitions; organizing an analogy subject by subject or point by point. Difficult: learning to use transitions; organizing an analogy subject by subject or point by point.
3. Class hours: 4.
4. Teaching methods: lecturing, in-class rapid reading exercises, in-class questioning, guided writing practice.
5. Basic content: Lesson Seven: Analogy; Using Transitions.
6. References: 《英语写作手册》, 《美国大学英语写作》.
7. Assignments and questions for consideration:
First: read on the subject and write an essay on any of the given topics.
Second: write an essay of 200-250 words on any of the given topics.
Third: ask students to practice using transitions.
8. Post-class summary: emphasis on the writing procedure (prewriting, drafting, revising, editing and proofreading).
Writing: Classification

About creating categories
Remember that the categories of a classification must not overlap or contain items already included within another category; otherwise, the classification becomes illogical.
Exercise
Mattresses: queen; twin; firm; double
Single basis:
size
Exercises: fatigue; swimming; jogging; gymnastics
Single basis:
field
Churches: Roman Catholic; Baptist; Protestant; Orthodox
Single basis:
branch
Vacations: seashore; winter; summer; weekends
Single basis:
time
Related Expression
About creating categories
Once you are given a topic, you will create categories by organizing elements according to a common feature. Decide how to organize elements of your topic into categories. With your categories created, identify common features of each category.
[Figure residue: an example of D labeled documents with class assignments c_1, …, c_D, containing words such as Milk, Bread, Pet, Dog, Cat, Eat, Food, and Dry.]
Simple model for topics

Given the topic, words are independent: the probability of a word w given a topic z is θ_{wz}.

[Plate-diagram residue: a class variable C generating words W, with W inside a plate of size N_d, repeated over D documents.]
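A small sketch of this assumption: once a topic z is fixed, every word of the document is drawn i.i.d. from that topic's word distribution θ_{·z}. The vocabulary and probability values below are invented toy numbers.

```python
# Sketch of "words are independent given the topic": each document draws its
# words i.i.d. from the chosen topic's word distribution theta[:, z].
import numpy as np

rng = np.random.default_rng(0)
vocab = ["dog", "cat", "food", "milk", "bread"]
theta = np.array([[0.40, 0.05],   # theta[w, z] = P(w | z); each column sums to 1
                  [0.30, 0.05],
                  [0.20, 0.10],
                  [0.05, 0.40],
                  [0.05, 0.40]])

def sample_document(z, n_words):
    return list(rng.choice(vocab, size=n_words, p=theta[:, z]))

print(sample_document(z=0, n_words=6))  # a "pets"-flavoured document
print(sample_document(z=1, n_words=6))  # a "groceries"-flavoured document
```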
Inferring model parameters

One can find the distribution of θ by sampling:

P(θ | c, w, α) = P(w | c, θ, α) P(θ | α) / ∫ dθ' P(w | c, θ', α) P(θ' | α)

A need to define conceptual closeness.
Feature Vector representation
From: Modeling the Internet and the Web: Probabilistic methods and Algorithms, Pierre Baldi, Paolo Frasconi, Padhraic Smyth
Classification: assigning words to topics

Different models for the data:
• Discrete classifier (e.g., an SVM): predicts a categorical output by modeling the boundaries between the different classes of the data
Naive Bayes, multinomial:

P({w, C}) = ∫ dθ P(θ) ∏_d P(C_d) ∏_{n_d} P(w_{n_d} | C_d, θ)

[Plate-diagram residue: prior α over parameters θ, with class C per document generating words w.]

The generative parameters θ_{wj} = P(w | c = j) must satisfy ∑_w θ_{wj} = 1; therefore the integration is over the simplex (the space of vectors with non-negative elements that sum up to 1). θ might have a Dirichlet prior with parameter α.
Making use of the topics model in cognitive science…

• The need for dimensionality reduction
• Classification methods: Naive Bayes, the LDA model
Document/Term count matrix

        LOVE  SOUL  RESEARCH  SCIENCE  …
Doc1     34    12      0         0     …
Doc2      0     0     19        16     …
Doc3      3     2      6         1     …

A high-dimensional space, though not as high as |V|. SVD is applied to this matrix to map words and documents into a semantic space.
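A brief sketch of that SVD step on the toy count matrix above: a truncated SVD places documents and words in a low-dimensional semantic space. The choice of two dimensions is an assumption for illustration.

```python
# LSA sketch: truncated SVD of the document/term count matrix projects both
# documents and words into a low-dimensional semantic space.
import numpy as np

terms = ["LOVE", "SOUL", "RESEARCH", "SCIENCE"]
X = np.array([[34, 12, 0, 0],    # Doc1
              [0, 0, 19, 16],    # Doc2
              [3, 2, 6, 1]])     # Doc3

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_vecs = U[:, :k] * s[:k]      # documents in the k-dim semantic space
term_vecs = Vt[:k].T * s[:k]     # each word becomes a single point
for t, v in zip(terms, term_vecs):
    print(f"{t:8s} {v.round(2)}")
```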
Naive Bayes classifier: words and topics

A set of labeled documents is given: {C_d, w_d : d = 1, …, D}. Note: the classes are mutually exclusive.

• Model assumptions
• Inference by Gibbs sampling
• Results: applying the model to massive datasets
Michal Rosen-Zvi, UCI 2004
The need for dimensionality reduction
The Naive Bayes classifier

Assumes that each of the data points is distributed independently. This results in a trivial learning algorithm, and the classifier usually does not suffer from overfitting.
• Naive Bayes
• The LDA model
• Topics model and semantic representation
• The Author-Topic Model: model assumptions; inference by Gibbs sampling; results: applying the model to massive datasets
This is a point estimate of the PDF; under some conditions it provides the mean of the posterior PDF, whereas sampling provides the full PDF.
Where are we?
Making use of the MAP:

P(c | w, θ, α) = P(w | c, θ, α) P(c) / P(w | θ, α) ∝ P(w | c, θ, α) P(c)

θ_{w,j} = (α + n_j^{(w)}) / (αV + ∑_{l=1}^{|V|} n_j^{(l)})
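A small sketch implementing the smoothed estimate above and the MAP classification rule; the toy corpus and the value of α are assumptions.

```python
# Sketch of the smoothed MAP estimate: theta[w, j] = (alpha + n_j(w)) /
# (alpha*V + sum_l n_j(l)), then classification via P(c | w) ∝ P(w | c) P(c).
import numpy as np

docs = [["dog", "food", "dog"], ["cat", "milk"], ["bread", "milk", "bread"]]
labels = [0, 0, 1]                       # 0 = pets, 1 = groceries
vocab = {w: i for i, w in enumerate(sorted({w for d in docs for w in d}))}
V, C, alpha = len(vocab), 2, 1.0

n = np.zeros((V, C))                     # n[w, j]: count of word w in class j
for d, j in zip(docs, labels):
    for w in d:
        n[vocab[w], j] += 1
theta = (alpha + n) / (alpha * V + n.sum(axis=0))  # each column sums to 1
prior = np.bincount(labels, minlength=C) / len(labels)

def log_posterior(doc):
    idx = [vocab[w] for w in doc if w in vocab]
    return np.log(prior) + np.log(theta[idx]).sum(axis=0)

print(log_posterior(["dog", "milk"]).argmax())  # predicted class index
```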
Sampling in the LDA model

The update rule, for fixed α and β and with θ integrated out, provides point estimates of θ and distributions over the latent variables z.

The likelihood factorizes over documents as ∏_{d=1…D} P(c_d = j) ∏_i P(w_i | θ, c_d = j). Here θ_{jw} is the probability of word w given topic j, and n_j^{(w)} is the number of times the word w is assigned to topic j.
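A compact sketch of collapsed Gibbs sampling for LDA, consistent with the update rule discussed above (the standard rule with θ integrated out); the toy corpus, the number of topics, and the hyperparameters are assumptions.

```python
# Minimal collapsed-Gibbs sketch for LDA with fixed alpha, beta. The resampling
# probability p(z_i = t | z_-i, w) ∝ (n_wt + beta)/(n_t + V*beta) * (n_dt + alpha).
import numpy as np

rng = np.random.default_rng(0)
docs = [[0, 1, 0, 2], [3, 4, 3], [0, 2, 4, 4]]    # word ids per document
V, T, alpha, beta = 5, 2, 0.5, 0.1

z = [[rng.integers(T) for _ in d] for d in docs]  # random initial assignments
n_wt = np.zeros((V, T)); n_dt = np.zeros((len(docs), T)); n_t = np.zeros(T)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        t = z[d][i]; n_wt[w, t] += 1; n_dt[d, t] += 1; n_t[t] += 1

for _ in range(200):                              # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]                           # remove current assignment
            n_wt[w, t] -= 1; n_dt[d, t] -= 1; n_t[t] -= 1
            p = (n_wt[w] + beta) / (n_t + V * beta) * (n_dt[d] + alpha)
            t = rng.choice(T, p=p / p.sum())      # resample the topic
            z[d][i] = t; n_wt[w, t] += 1; n_dt[d, t] += 1; n_t[t] += 1

theta_hat = (n_wt + beta) / (n_t + V * beta)      # point estimate of P(w | t)
print(theta_hat.round(2))
```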
• Density estimator: models the distribution of the data points themselves (generative models, e.g., Naive Bayes)
A Spatial Representation: Latent Semantic Analysis (Landauer & Dumais, 1997)
Text Classification
Michal Rosen-Zvi University of California, Irvine
Outline
• The need for dimensionality reduction
• Classification methods: Naive Bayes, the LDA model
• Topics model and semantic representation
• The Author-Topic Model
Content-Based Ranking:
• Ranking matching documents in a search engine according to their relevance to the user
• Presenting documents as vectors in the word space: the 'bag of words' representation
• It is a sparse representation, V >> |D|
[Figure: words such as LOVE, SOUL, RESEARCH, and SCIENCE plotted as points; each word is a single point in a semantic space.]
Where are we?
• The need for dimensionality reduction
• Classification methods
[Plate-diagram residue: plates of size N_d and D.]

P({w, C} | θ) = ∏_d P(C_d) ∏_{n_d} P(w_{n_d} | C_d, θ)
Learning model parameters

Estimating θ from the probability:

P({w, c} | θ) = ∏_{i=1…N_D} P(w_i | c_i, θ) P(c_i)
LDA: A generative model for topics
Latent Dirichlet Allocation is a model that assigns Dirichlet priors to multinomial distributions and assumes that a document is a mixture of topics.
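A generative sketch of that assumption: each document draws its own topic mixture θ_d from a Dirichlet prior, and each word first draws a topic from θ_d and then a word from that topic's multinomial. All sizes and hyperparameters below are toy assumptions.

```python
# Generative sketch of LDA's "document is a mixture of topics" assumption.
import numpy as np

rng = np.random.default_rng(0)
V, T, alpha, beta = 5, 2, 0.5, 0.1
phi = rng.dirichlet([beta] * V, size=T)      # per-topic word distributions

def generate_document(n_words):
    theta_d = rng.dirichlet([alpha] * T)     # per-document topic mixture
    zs = rng.choice(T, size=n_words, p=theta_d)
    return [int(rng.choice(V, p=phi[z])) for z in zs]

print(generate_document(8))                  # a document as word ids
```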