Code for the CDI project, including transcript-caption alignment, story segmentation, and topic modeling.
+ ner Automated NER (Named Entity Recognition) scripts.
+ py-story-clust [DEPRECATED] greedy story clustering algorithm and bi-clustering.
+ tools Tools and helper scripts.
+ story-classification [DEPRECATED]
+ sw-clust Unified story segmentation and clustering framework based on the Swendsen-Wang Cuts algorithm.
+ transcript-aligner Transcript-caption aligner.