Compute semantic PMI for two words using Wikipedia

  • 1.  Compute semantic PMI for two words using Wikipedia

    Posted 02-10-2021 07:05

    I am trying to compute pointwise mutual information (PMI) using Wikipedia as the data source. Given two words, PMI measures how strongly the two words are associated. The formula is below.

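    pmi(word1, word2) = log [ P(word1, word2) / (P(word1) * P(word2)) ]

    Here P(word1, word2) is the probability that both words appear together in a document, and P(word1) and P(word2) are the probabilities that each word appears in a document.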

    Hence, to compute PMI, I need the joint and individual probabilities of word1 and word2. I looked at the Wikipedia Miner relatedness score between two words, which implements the Milne and Witten algorithm. However, for defining topic similarities, PMI is a better score.
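
To make the quantities concrete, here is a minimal sketch of the document-level estimate, not a Wikipedia-specific implementation. It assumes the articles have already been extracted and tokenized into lists of words (that step is not shown), uses the natural logarithm, and estimates each probability as a fraction of documents. The function name `document_pmi` and its parameters are hypothetical.

```python
import math


def document_pmi(documents, word1, word2):
    """Estimate PMI of two words from document-level co-occurrence.

    `documents` is an iterable of token lists (an assumed input format).
    P(w) is the fraction of documents containing w, and P(w1, w2) is
    the fraction of documents containing both words.
    """
    n_docs = n1 = n2 = n12 = 0
    for tokens in documents:
        vocab = set(tokens)  # presence per document, not raw frequency
        has1 = word1 in vocab
        has2 = word2 in vocab
        n_docs += 1
        n1 += has1
        n2 += has2
        n12 += has1 and has2

    if n1 == 0 or n2 == 0:
        # PMI is undefined when either word never occurs.
        raise ValueError("both words must occur in at least one document")
    if n12 == 0:
        # The words never co-occur, so log(0) is treated as -infinity.
        return float("-inf")

    p1 = n1 / n_docs
    p2 = n2 / n_docs
    p12 = n12 / n_docs
    return math.log(p12 / (p1 * p2))


# Toy corpus standing in for tokenized Wikipedia articles.
docs = [["cat", "dog"], ["cat"], ["dog"], ["cat", "dog"]]
print(document_pmi(docs, "cat", "dog"))
```

On a real dump the same counts could be taken from a search index (document frequency of each word and of the conjunction) instead of a pass over the text. A positive value means the words co-occur more often than independence would predict, and a negative value means less often.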

    Does anyone know how to compute a PMI score for two words using DBpedia, Wikipedia Miner, or any other software?



    ------------------------------
    navya sri
    software developer
    hkr trainings
    ------------------------------