NAIVE BAYESIAN CLASSIFICATION
Naïve Bayesian Classifier
TRAINSIMPLENB(D)
C ← ExtractDifferentClass(D) //C is the set of distinct class labels in D
At ← ExtractDifferentAttribute(D) //At is the set of distinct attributes, for example: A, B...
for each i ∈ At
  //extract the possible values of each attribute At[i], for example: A=m, B=h...
  do A[i] ← ExtractDifferentValuesOfDifferentAttribute(D, At[i])
N ← CountTrainingExamples(D) //N is the total number of training examples
for each c ∈ C
  do Nc ← CountTrainingExamplesInClass(D, c) //Nc is the number of training examples in class c
     prior[c] ← Nc/N //prior[c] is the probability of class c
     //concatenate the attribute with each label, for example: A=m, C=t
     attrc ← ConcatenateEachAttributeInClass(A, c)
     for each a ∈ A
       //calculate the count of such tokens in training examples
       do Act[a] ← CountTokensOfTerm(attrc, a)
     for each a ∈ A
       //calculate the conditional probability of such token given class c
       do condprob[a][c] ← Act[a] / Nc
return C, A, prior, condprob
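The training procedure above can be sketched in Python roughly as follows. This is a minimal illustration, not a definitive implementation: the function name `train_simple_nb`, the `(attributes, label)` input layout, and the toy data are all assumptions made for the example. Each conditional probability is estimated as the number of examples in class c containing the attribute=value token, divided by Nc, with no smoothing.

```python
from collections import Counter, defaultdict


def train_simple_nb(examples):
    """Estimate priors and conditional probabilities from labelled examples.

    `examples` is assumed to be a list of (attributes, label) pairs, where
    `attributes` maps an attribute name to its value, e.g. {"A": "m"}.
    """
    n = len(examples)  # N: total number of training examples
    class_counts = Counter(label for _, label in examples)  # Nc per class
    prior = {c: n_c / n for c, n_c in class_counts.items()}

    values = defaultdict(set)            # distinct values seen per attribute
    token_counts = defaultdict(Counter)  # token_counts[c][(attr, value)]
    for attrs, label in examples:
        for attr, value in attrs.items():
            values[attr].add(value)
            token_counts[label][(attr, value)] += 1

    # condprob[(attr, value)][c] = count of the token in class c / Nc.
    # No smoothing (an assumption of this sketch): a token never seen
    # with class c gets probability 0.
    condprob = defaultdict(dict)
    for c, n_c in class_counts.items():
        for attr, vals in values.items():
            for value in vals:
                condprob[(attr, value)][c] = token_counts[c][(attr, value)] / n_c
    return sorted(class_counts), dict(values), prior, dict(condprob)


# Hypothetical toy data: two attributes (A, B) and two classes (t, f).
TOY_EXAMPLES = [
    ({"A": "m", "B": "h"}, "t"),
    ({"A": "m", "B": "l"}, "t"),
    ({"A": "g", "B": "h"}, "f"),
    ({"A": "m", "B": "h"}, "f"),
]
classes, values, prior, condprob = train_simple_nb(TOY_EXAMPLES)
```

Because the estimates are unsmoothed, any attribute value that never occurs with a class drives that class's score to zero at prediction time.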
APPLYSIMPLENB(C, A, prior, condprob, d)
W ← ExtractTokensFromTest(A, d) //W holds the attribute=value tokens found in test example d
for each c ∈ C
  do score[c] ← prior[c]
     for each a ∈ W
       do score[c] *= condprob[a][c]
return argmax c∈C score[c]
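The scoring step above can be sketched in Python as below. The function name `apply_simple_nb`, the dictionary layout, and the hand-written `PRIOR` and `CONDPROB` tables are hypothetical values chosen for illustration, not results from any real data set. Treating a token missing from the table as probability 0 is also an assumption of this sketch.

```python
def apply_simple_nb(classes, prior, condprob, d):
    """Return the highest-scoring class for test example `d` and all scores.

    `d` maps attribute names to values; each (attribute, value) pair is
    treated as one token, mirroring the attribute=value tokens above.
    """
    tokens = list(d.items())
    score = {}
    for c in classes:
        score[c] = prior[c]
        for token in tokens:
            # Assumption: a token absent from the table contributes 0.
            score[c] *= condprob.get(token, {}).get(c, 0.0)
    best = max(classes, key=lambda c: score[c])
    return best, score


# Hypothetical model parameters, written by hand for the example.
PRIOR = {"t": 0.5, "f": 0.5}
CONDPROB = {
    ("A", "m"): {"t": 1.0, "f": 0.5},
    ("A", "g"): {"t": 0.0, "f": 0.5},
    ("B", "h"): {"t": 0.5, "f": 1.0},
    ("B", "l"): {"t": 0.5, "f": 0.0},
}

label, scores = apply_simple_nb(["t", "f"], PRIOR, CONDPROB, {"A": "m", "B": "l"})
```

With many attributes the product of probabilities can underflow, so practical implementations often sum log-probabilities instead; the plain product is kept here to stay close to the pseudocode.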
Naïve Bayesian Text Classification
TRAINMULTINOMIALNB(C, D)
V ← EXTRACTVOCABULARY(D)
N ← COUNTDOCS(D) //N is the total number of documents
for each c ∈ C
  do Nc ← COUNTDOCSINCLASS(D, c) //Nc is the number of documents in class c
     prior[c] ← Nc/N
     textc ← CONCATENATETEXTOFALLDOCSINCLASS(D, c)
     for each t ∈ V
       //Tct is the number of occurrences of t in training documents from class c, including multiple
       //occurrences of a term in a document