    Repositories list

    • 0000 Updated Dec 16, 2024
    • 0000 Updated Nov 16, 2024
    • AAST

      Public
      Academic Article Survey Tables (AAST) Dataset
      0000 Updated Oct 28, 2024
    • Jupyter Notebook
      0000 Updated Oct 2, 2024
    • ConvLogRecaller-dataset
      0000 Updated Apr 12, 2024
    • Self-ICL

      Public
      Python
      2410 Updated Dec 3, 2023
    • FECS

      Public
      Python
      0700 Updated Nov 28, 2023
    • ZARA

      Public
      0110 Updated Nov 5, 2023
    • Contrastively learning participant representations per round in thread-based debates.
      Python
      0200 Updated Oct 25, 2023
    • The ContributionSum Dataset
      GNU General Public License v3.0
      0100 Updated Aug 14, 2023
    • Citation Intent Classification and Its Supporting Evidence Extraction for Citation Graph Construction
      0110 Updated Aug 13, 2023
    • AMDRD

      Public
      Analysis Model of Discourse Relations within a Document (AMDRD)
      Python
      GNU General Public License v3.0
      1200 Updated Aug 11, 2023
    • Life Event Dialog contains fine-grained personal life event annotations on DailyDialog.
      0800 Updated May 2, 2023
    • Provides images for the NTU AI Center shared platform.
      Apache License 2.0
      0000 Updated Apr 29, 2023
    • A Traditional-Chinese instruction-following model with datasets based on Alpaca.
      Python
      Apache License 2.0
      1813520 Updated Mar 28, 2023
    • C2RC2

      Public
      Categorizing Citation Relations in Scientific Papers Based on the Contributions of Cited Papers
      MIT License
      0200 Updated Nov 21, 2022
    • 0000 Updated Nov 14, 2022
    • 0000 Updated Nov 3, 2022
    • tw-eH

      Public
      Learning to Generate Explanation from e-Hospital Services for Medical Suggestion
      Python
      MIT License
      0300 Updated Nov 3, 2022
    • 0000 Updated Oct 17, 2022
    • ICDA

      Public
      Interactive Clinical Diagnostic Assistant for Medical Interview
      Python
      0200 Updated Sep 7, 2022
    • SEEN

      Public
      SEEN: Structured Event Enhancement Network for Explainable Need Detection of Information Recall Assistance
      Python
      MIT License
      0400 Updated Aug 21, 2022
    • PRRCA

      Public
      Peer Review and Rebuttal Counter-Arguments Dataset
      0100 Updated Aug 13, 2022
    • NTUSD

      Public
      Sentiment words are employed to compute the tendency of a sentence, and then a document. To detect sentiment words in Chinese documents, a Chinese sentiment dictionary is indispensable. However, a small dictionary may suffer from the problem of coverage. A method to learn sentiment words and their strengths from multiple resources is developed i…
      MIT License
      2800 Updated Jun 20, 2022
    • A dialogue dataset is an indispensable resource for building a dialogue system. Additional information like emotions and interpersonal relationships labeled on conversations enables the system to capture the emotion flow of the participants in the dialogue. However, there is no publicly available Chinese dialogue dataset with emotion and relatio…
      0400 Updated Jun 1, 2022
    • A word similarity dataset with a high proportion of multi-sense words, designed to facilitate more reliable evaluation of sense embeddings.
      0200 Updated Jun 1, 2022
    • Numerals are a crucial part of financial documents. To understand the details of opinions in financial documents, we should not only analyze the text but also examine the numeric information in depth. Because of its informal writing style, social media data is more challenging to analyze than news and official documents. …
      1300 Updated Jun 1, 2022
    • FinProLex provides 5,162 tokens from professional analysts' reports and financial social media posts, each with an expert-like score. The expert-like scores are calculated from pointwise mutual information (PMI).
      1400 Updated Jun 1, 2022
    • Numerals are a crucial part of narratives, especially in financial documents. We should not only analyze the text but also examine the numeric information in depth. Numeracy-600K is a dataset for testing the numeracy of machines.
      1300 Updated Jun 1, 2022
    • NTUSD-Fin provides various scoring methods for its tokens, including frequency, CFIDF, chi-squared value, market sentiment score, and word vector. Only tokens that appear at least ten times and show a significant difference between expected and observed frequency under a chi-squared test are retained in our dictionary. The predetermined significanc…
      2510 Updated Jun 1, 2022
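
The FinProLex entry above says its expert-like scores are based on pointwise mutual information (PMI). As a rough illustration only, the sketch below computes plain PMI between a token and the source it appears in, PMI(t, s) = log2(P(t, s) / (P(t) P(s))). The function name `pmi_scores`, the source labels "expert" and "crowd", and the toy tokens are all hypothetical; the repository's actual scoring formula, preprocessing, and smoothing are not described here and may differ.

```python
import math
from collections import Counter


def pmi_scores(docs):
    """Compute PMI for each (token, source label) pair that occurs.

    docs: iterable of (label, tokens) pairs. Hypothetical input format,
    not taken from the FinProLex repository.
    """
    pair_counts = Counter()
    token_counts = Counter()
    label_counts = Counter()
    total = 0
    for label, tokens in docs:
        for tok in tokens:
            pair_counts[(tok, label)] += 1
            token_counts[tok] += 1
            label_counts[label] += 1
            total += 1

    scores = {}
    for (tok, label), n in pair_counts.items():
        p_joint = n / total
        p_tok = token_counts[tok] / total
        p_label = label_counts[label] / total
        # PMI > 0: the token is over-represented in this source.
        scores[(tok, label)] = math.log2(p_joint / (p_tok * p_label))
    return scores


# Toy data, invented for illustration.
toy_docs = [
    ("expert", ["revenue", "guidance", "margin", "revenue"]),
    ("crowd", ["moon", "revenue", "hodl", "moon"]),
]
scores = pmi_scores(toy_docs)
for (tok, label), value in sorted(scores.items()):
    print(f"{tok:10s} {label:7s} {value:+.3f}")
```

Pairs that never co-occur are simply absent from the result, since PMI is undefined (negative infinity) for a zero joint count; a real lexicon would likely need a smoothing or minimum-frequency rule, which is left out of this sketch.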