NAIVE BAYESIAN CLASSIFICATION


Naïve Bayesian Classifier

TRAINSIMPLENB(D)
  C ← ExtractDifferentClass(D)
  At ← ExtractDifferentAttribute(D)   // extract the distinct attributes, for example: A, B...
  for each i ∈ At
    for each j ∈ At[i]
      // extract the possible values of each attribute At[i], for example: A=m, B=h...
      A[i][j] ← ExtractDifferentValuesOfDifferentAttribute(D, At[i])
  N ← CountTrainingExamples(D)   // N is the total number of training examples
  for each c ∈ C
    do Nc ← CountTrainingExamplesInClass(D, c)   // Nc is the number of training examples in class c
       prior[c] ← Nc/N   // prior[c] is the probability of class c
       // concatenate the attributes with each label, for example: A=m, C=t
       attrc ← ConcatenateEachAttributeInClass(A, c)
       for each a ∈ A
         // count the occurrences of such tokens in the training examples
         do Act ← CountTokensOfTerm(attrc, a)
       for each a ∈ A
         // conditional probability of such a token: the fraction of class-c examples that contain a
         do condprob[a][c] ← Act / Nc
  return C, A, prior, condprob
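The training steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `train_simple_nb`, the `(attributes, label)` input layout, and the toy data are all assumptions. It estimates each conditional probability as a relative frequency (count of the attribute value within the class divided by Nc) and applies no smoothing.

```python
from collections import Counter, defaultdict

def train_simple_nb(examples):
    """Estimate priors and conditional probabilities by counting.

    examples: list of (attributes, label) pairs, where attributes is a dict
    such as {"A": "m", "B": "h"}. This layout is an assumption of the sketch.
    """
    n = len(examples)                                  # N
    class_counts = Counter(label for _, label in examples)  # Nc per class
    prior = {c: nc / n for c, nc in class_counts.items()}

    values = defaultdict(set)   # attribute name -> set of observed values
    pair_counts = Counter()     # (attribute, value, class) -> count (Act)
    for attrs, label in examples:
        for name, value in attrs.items():
            values[name].add(value)
            pair_counts[(name, value, label)] += 1

    # condprob[(attribute, value, class)] = Act / Nc, no smoothing
    condprob = {}
    for name, vals in values.items():
        for value in vals:
            for c, nc in class_counts.items():
                condprob[(name, value, c)] = pair_counts[(name, value, c)] / nc

    return sorted(class_counts), dict(values), prior, condprob

# Hypothetical toy data: two attributes (A, B) and two classes (t, f).
data = [
    ({"A": "m", "B": "h"}, "t"),
    ({"A": "m", "B": "l"}, "t"),
    ({"A": "g", "B": "h"}, "f"),
    ({"A": "m", "B": "h"}, "f"),
]
classes, values, prior, condprob = train_simple_nb(data)
print(prior)
print(condprob[("A", "m", "t")], condprob[("A", "m", "f")])
```

Because no smoothing is applied, an attribute value never seen with a class gets probability 0, which zeroes the whole product for that class at prediction time.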

APPLYSIMPLENB(C, A, prior, condprob, d)
  W ← ExtractTokensFromTest(A, d)   // attribute values of d that also occur in A
  for each c ∈ C
    do score[c] ← prior[c]
       for each a ∈ W
         do score[c] *= condprob[a][c]
  return argmax c∈C score[c]
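The scoring loop above can be sketched as follows. The function name `apply_simple_nb`, the tuple-keyed `condprob` dictionary, and every probability value in the model are hypothetical and chosen only to make the arithmetic easy to follow. Attribute values missing from the model are treated as probability 0, which is also an assumption.

```python
def apply_simple_nb(classes, prior, condprob, example):
    """Return the highest-scoring class and all scores for one test example.

    example: dict of attribute -> value, e.g. {"A": "m", "B": "h"}.
    """
    scores = {}
    for c in classes:
        score = prior[c]
        # multiply in P(attribute=value | c) for every attribute of the example
        for name, value in example.items():
            score *= condprob.get((name, value, c), 0.0)
        scores[c] = score
    best = max(classes, key=lambda c: scores[c])
    return best, scores

# Hypothetical trained model (values invented for illustration).
classes = ["f", "t"]
prior = {"t": 0.6, "f": 0.4}
condprob = {
    ("A", "m", "t"): 0.5, ("A", "m", "f"): 0.25,
    ("B", "h", "t"): 0.5, ("B", "h", "f"): 0.75,
}

best, scores = apply_simple_nb(classes, prior, condprob, {"A": "m", "B": "h"})
print(best, scores)
```

With these numbers the score for t is 0.6 × 0.5 × 0.5 = 0.15 and the score for f is 0.4 × 0.25 × 0.75 = 0.075, so t is returned. With many attributes the product can underflow; summing log-probabilities instead is a common alternative.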

Naïve Bayesian Text Classification

TRAINMULTINOMIALNB(C, D)
  V ← EXTRACTVOCABULARY(D)
  N ← COUNTDOCS(D)   // N is the total number of documents
  for each c ∈ C
    do Nc ← COUNTDOCSINCLASS(D, c)   // Nc is the number of documents in class c
       prior[c] ← Nc/N
       textc ← CONCATENATETEXTOFALLDOCSINCLASS(D, c)
       for each t ∈ V
         // Tct is the number of occurrences of t in training documents from class c,
         // including multiple occurrences of a term in a document
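The counting steps listed above (vocabulary, priors, and the per-class term counts Tct) can be sketched in Python. The function name `count_multinomial_nb`, the whitespace tokenization, and the toy documents are assumptions. The sketch deliberately stops at the counts and does not compute conditional probabilities.

```python
from collections import Counter

def count_multinomial_nb(classes, docs):
    """Compute the vocabulary, class priors, and term counts Tct.

    docs: list of (text, label) pairs; tokens are split on whitespace,
    which is a simplifying assumption of this sketch.
    """
    vocabulary = sorted({tok for text, _ in docs for tok in text.split()})
    n = len(docs)                                   # N
    prior, term_counts = {}, {}
    for c in classes:
        class_texts = [text for text, label in docs if label == c]
        prior[c] = len(class_texts) / n             # Nc / N
        text_c = " ".join(class_texts)              # concatenated text of class c
        tokens = Counter(text_c.split())
        # Tct counts every occurrence, including repeats within one document
        term_counts[c] = {t: tokens[t] for t in vocabulary}
    return vocabulary, prior, term_counts

# Hypothetical toy corpus.
docs = [
    ("buy cheap pills cheap", "spam"),
    ("meeting at noon", "ham"),
    ("cheap meeting room", "ham"),
]
vocabulary, prior, term_counts = count_multinomial_nb(["spam", "ham"], docs)
print(vocabulary)
print(term_counts["spam"]["cheap"], term_counts["ham"]["cheap"])
```

Here "cheap" appears twice in the single spam document, so its count for that class is 2 even though only one document contains it.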
