Termite: A Visualization Tool for Inspecting Topic Models
Termite is a visualization tool for inspecting the output of statistical topic models such as LDA. Given a corpus as a .csv file, it generates an interactive interface for inspecting topics. Starting in 2014, the project was moved to a separate GitHub project, split into two components, and unfortunately abandoned. This fork is based on the earlier version of Termite, which does not rely on a database and provides a simpler input format for the data to be analyzed and displayed.
Why this fork?
Currently, there are some dependency "hiccups" when running Termite out of the box, and this fork seeks to minimally document workarounds to get it running. A long-term goal of this project is to decouple the Termite visualization from the Termite data pipeline by reducing the required input to a topic-term matrix and a document-topic matrix instead of the raw corpus. In doing so, other implementations of LDA, and of topic modeling in general, can leverage the interface provided by Termite. A great example of this decoupling is LDAvis, an R visualization package that serves the same purpose and was inspired by Termite.
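To make the two matrices concrete, here is a minimal Python sketch of what a decoupled input could look like. The toy vocabulary, the values, and the helper names top_terms and dominant_topic are all hypothetical and chosen only for illustration; this is not Termite's actual input format or API.

```python
# Hypothetical toy data: none of these names or values come from Termite itself.
terms = ["gene", "dna", "court", "law"]

# topic_term[k][v]: probability of term v under topic k (each row sums to 1).
topic_term = [
    [0.50, 0.40, 0.05, 0.05],
    [0.05, 0.05, 0.40, 0.50],
]

# doc_topic[d][k]: share of document d attributed to topic k (each row sums to 1).
doc_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
]


def top_terms(topic_term, terms, n=2):
    """Return the n highest-probability terms for each topic."""
    result = []
    for row in topic_term:
        ranked = sorted(range(len(row)), key=lambda v: row[v], reverse=True)
        result.append([terms[v] for v in ranked[:n]])
    return result


def dominant_topic(doc_topic):
    """Return, for each document, the index of its highest-weight topic."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in doc_topic]


if __name__ == "__main__":
    print(top_terms(topic_term, terms))
    print(dominant_topic(doc_topic))
```

Any topic modeling implementation that can export these two matrices, plus the vocabulary, could in principle feed a visualization without re-running the corpus through the original pipeline.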
Goal of the Project
Get a panoramic view of the various building blocks of the Termite system, which is used for topic modeling visualization, and decouple them into smaller systems.
Objective 1: Inspect the main() function to find out which flags are activated when running the file from the command line. Progress can be tracked here: https://github.com/sailuh/termite/wiki/Code-Walkthrough.
References
Termite: Visualization Techniques for Assessing Textual Topic Models. Jason Chuang, Christopher D. Manning, Jeffrey Heer. Computer Science Dept, Stanford University.
Termite paper: http://vis.stanford.edu/files/2012-Termite-AVI.pdf
GitHub links:
Older version: https://github.com/sailuh/termite/wiki
New version: https://github.com/uwdata/termite-visualizations