Interactive topic model

This notebook shows an example chemical topic model, as described in the paper "Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach" by N. Schneider, N. Fechner, G. A. Landrum, and N. Stiefl, JCIM (2017), DOI: 10.1021/acs.jcim.7b00249.


http://pubs.acs.org/doi/abs/10.1021/acs.jcim.7b00249
In [1]:
import pandas as pd
import numpy as np

from collections import defaultdict
from IPython.display import display,HTML

import utilsInteractiveTM

from ChemTopicModel import chemTopicModel
In [2]:
data = pd.read_csv('data/datasetA_oneChemblIDforInteractiveModel.csv')

The topic model was generated for data set A using the following parameters:

  • number of topics = 60
  • fragment method = Morgan FP
  • rare fragment threshold = 0.01
  • common fragment threshold = 0.1
  • seed = 57
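
One way to read the two thresholds is as document-frequency cutoffs: a fragment is discarded if it occurs in too few or in too many molecules. The sketch below illustrates that idea on toy data. It is an assumption about the filtering semantics, not the ChemTopicModel implementation; `filter_fragments`, the inclusive boundaries, and the fragment labels are hypothetical.

```python
from collections import Counter

def filter_fragments(mol_fragments, rare_thres=0.01, common_thres=0.1):
    """Keep fragments found in a fraction of molecules within [rare_thres, common_thres].

    mol_fragments: list of sets, one set of fragment identifiers per molecule.
    Assumption: both thresholds are fractions of the molecules containing a fragment.
    """
    n_mols = len(mol_fragments)
    # Count in how many molecules each fragment occurs (document frequency).
    doc_freq = Counter(frag for frags in mol_fragments for frag in set(frags))
    return {frag for frag, count in doc_freq.items()
            if rare_thres <= count / n_mols <= common_thres}

# Toy data: 200 molecules with hypothetical fragment labels.
mols = [{"ubiquitous"} for _ in range(200)]   # present in 100% of the molecules
for i in range(10):
    mols[i].add("informative")                # present in 5% of the molecules
mols[0].add("singleton")                      # present in 0.5% of the molecules

kept = filter_fragments(mols)
print(sorted(kept))
```

With the settings listed above, only the fragment seen in 5% of the toy molecules survives; the ubiquitous one is too common to discriminate between topics, and the singleton is too rare to support one.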
In [3]:
numTopics = 60
# Parameters as listed above: Morgan fragments, rare/common fragment thresholds, fixed seed.
chemblTopicModel = chemTopicModel.ChemTopicModel(sizeSampleDataSet=1.0, fragmentMethod='Morgan',
                                                 rareThres=0.01, commonThres=0.1, randomState=57)
chemblTopicModel.loadData(data)
chemblTopicModel.generateFragments()
chemblTopicModel.buildTopicModel(numTopics)

# Pre-render one hidden <div> per topic; the dropdown built in the next cell toggles their visibility.
topicSVGs = ''
for topicID in range(numTopics):
    tmp = '<div id="Topic_{0}" style="display: none">'.format(topicID)
    # Table with the top 8 fragments of this topic.
    frags = utilsInteractiveTM.drawFragmentsbyTopic(chemblTopicModel, topicID, n_top_frags=8, numRowsShown=1.2,
                                                    numColumns=8, tableHeader='Top 8 fragments of topic '+str(topicID))

    # Molecules associated with this topic, subject to the topic probability threshold.
    mols = utilsInteractiveTM.drawMolsByTopic(chemblTopicModel, topicID, idsLabelToShow=[0,2], topicProbThreshold=0.1,
                                              baseRad=0.9, numRowsShown=2, color=(0.7,0.7,0.99))
    topicSVGs += (tmp+frags+mols+'</div>\n\n')
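
For each topic, the loop above shows the eight highest-ranked fragments and the molecules that pass a topic-probability cutoff. A minimal sketch of these two selections on plain Python lists follows. The data layout, the helper names `top_fragments` and `mols_in_topic`, and the rule that a molecule is listed under its most probable topic are assumptions made for illustration, not the code in utilsInteractiveTM.

```python
def top_fragments(topic_weights, n_top=8):
    """Return fragment indices sorted by descending weight, truncated to n_top."""
    order = sorted(range(len(topic_weights)),
                   key=lambda i: topic_weights[i], reverse=True)
    return order[:n_top]

def mols_in_topic(doc_topic, topic_id, prob_threshold=0.1):
    """Return indices of molecules whose most probable topic is topic_id
    and whose probability for that topic is at least prob_threshold."""
    selected = []
    for mol_idx, probs in enumerate(doc_topic):
        best = max(range(len(probs)), key=lambda t: probs[t])
        if best == topic_id and probs[topic_id] >= prob_threshold:
            selected.append(mol_idx)
    return selected

# Toy example: one topic over 3 fragments, and 3 molecules over 2 topics.
weights = [0.2, 0.7, 0.1]
doc_topic = [[0.9, 0.1], [0.4, 0.6], [0.55, 0.45]]

print(top_fragments(weights, n_top=2))        # fragment indices, best first
print(mols_in_topic(doc_topic, topic_id=0))   # molecules listed under topic 0
```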
In [4]:
mydropdown = """
<script>

var oldVar = '';

function showMe(sel) {
    var newId = sel.options[sel.selectedIndex].value;
    // Hide the previously shown topic, if any.
    if (newId != oldVar && oldVar != '') {
        document.getElementById(oldVar).style.display = "none";
    }
    // Show the newly selected topic and remember it.
    document.getElementById(newId).style.display = "block";
    oldVar = newId;
}
</script>

<b>Select your topic to explore:</b>   """

mydropdown += '<select onchange="showMe(this);">'
for i in range(numTopics):
    mydropdown += '<option>Topic_{0}</option>'.format(i)
mydropdown += '</select><br><br>'
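
Because the handler runs only on a `change` event, the entry that is preselected when the page loads cannot be picked first. One possible variant, sketched below, adds a disabled placeholder entry and gives every option an explicit `value` that matches the id of its `<div>`. `build_dropdown` is a hypothetical helper, not part of utilsInteractiveTM.

```python
from html import escape

def build_dropdown(num_topics, handler="showMe(this);"):
    """Build a <select> whose option values are the ids of the per-topic <div> elements."""
    parts = ['<select onchange="{0}">'.format(escape(handler, quote=True))]
    # Placeholder entry, so that choosing Topic_0 still triggers a change event.
    parts.append('<option value="" selected disabled>Choose a topic</option>')
    for i in range(num_topics):
        parts.append('<option value="Topic_{0}">Topic_{0}</option>'.format(i))
    parts.append('</select><br><br>')
    return ''.join(parts)

dropdown_html = build_dropdown(3)
print(dropdown_html)
```

The returned string could be appended to the script and label text in place of the loop above.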

Overall statistics of the chemical topic model

In [5]:
utilsInteractiveTM.showOverallStatistics(chemblTopicModel)

Explore the model in detail

In [6]:
display(HTML(mydropdown+topicSVGs))