We present WordGraph, a Python package for exploring the topics of document corpora. WordGraph builds causal graphical models from the vocabulary of text data and offers interactive visualizations of term networks. The easy-to-use package ships with a prebuilt pipeline that exposes the main modules through Jupyter widgets. As a result, the whole vocabulary exploration process is encapsulated in a single Jupyter notebook cell, with straightforward parameter settings and interactive plots. The WordGraph pipeline is fully customizable: widgets can be added or removed, and default parameters can be changed. To assist users who have no background in Python or Jupyter notebooks but want to explore the topics of large corpora, we also provide automatic generation of a web-application-style dashboard from the customizable Jupyter notebook pipeline. WordGraph is available through a GitHub repository.
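As a rough illustration of what a term network is, the sketch below builds a plain co-occurrence graph using only the Python standard library. It is not WordGraph's API and does not reproduce its causal graphical models; co-occurrence counting is a simpler stand-in, and the function name `cooccurrence_graph`, the `min_count` parameter, and the sample documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(documents, min_count=1):
    """Return {(term_a, term_b): count} for terms that share a document.

    Hypothetical helper for illustration only; not part of WordGraph.
    """
    edges = Counter()
    for text in documents:
        # Deduplicate and sort terms so each pair has one canonical ordering.
        terms = sorted(set(text.lower().split()))
        edges.update(combinations(terms, 2))
    # Keep only the edges seen in at least `min_count` documents.
    return {pair: n for pair, n in edges.items() if n >= min_count}


# Toy corpus (made up for this example).
docs = [
    "graph model text",
    "text graph widget",
    "widget dashboard",
]
graph = cooccurrence_graph(docs, min_count=2)
```

Each key in the returned dictionary is an edge between two terms and each value is the number of documents in which both appear, so raising `min_count` prunes weak links before the network is drawn.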