This article presents TraCo (Transport Plan and Context-aware Hierarchical Topic Model), a new approach to hierarchical topic modeling. Hierarchical topic modeling aims to discover latent topics in a document corpus and organize them into a hierarchy that makes the documents easier to understand. Existing models, however, struggle to produce topic hierarchies with high affinity, rationality, and diversity. TraCo targets these attributes with two components: a transport plan dependency method and a context-aware disentangled decoder. The transport plan dependency method keeps the dependencies between topics sparse and balanced, while the decoder improves the rationality of the hierarchy. Experiments show that TraCo outperforms existing models and improves performance on downstream tasks.
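
To make the idea of a sparse and balanced transport plan more concrete, the following is a minimal sketch of entropic-regularized optimal transport solved with Sinkhorn iterations. It is an illustration under stated assumptions, not TraCo's actual implementation: the embeddings are random, the uniform marginals, the squared Euclidean cost, and the names `sinkhorn_plan`, `child`, and `parent` are all hypothetical, and the paper should be consulted for the real formulation.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, eps=0.2, n_iters=1000):
    """Approximate an entropic-regularized transport plan (hypothetical helper).

    cost: (n, m) cost matrix; a: (n,) row marginals; b: (m,) column marginals.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Assumption: random vectors stand in for learned topic embeddings.
rng = np.random.default_rng(0)
child = rng.normal(size=(6, 8))      # 6 hypothetical child topics
parent = rng.normal(size=(3, 8))     # 3 hypothetical parent topics

# Squared Euclidean distance as transport cost, scaled to [0, 1].
cost = ((child[:, None, :] - parent[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

# Uniform marginals: every child carries equal mass and every parent
# receives equal mass, which is what keeps the plan balanced.
a = np.full(6, 1.0 / 6)
b = np.full(3, 1.0 / 3)

plan = sinkhorn_plan(cost, a, b)
# Row-normalizing gives, for each child, a distribution over parents.
dependency = plan / plan.sum(axis=1, keepdims=True)
print(np.round(dependency, 3))
```

In this sketch the marginal constraints prevent any single parent from absorbing all children, and a smaller `eps` pushes each child's mass toward fewer parents, which is one way to read "balance" and "sparsity" in the summary above.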

Publication date: 26 Jan 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2401.14113