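Implementations of common NLP tasks, including new word discovery, as well as PyTorch-based word embeddings, Chinese text classification, named entity recognition, abstractive text summarization, sentence similarity, triple extraction, and pretrained models.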
Updated May 20, 2023 - Python
Spanish word embeddings computed with different methods and from different corpora
🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy
Tools for shrinking fastText models (in gensim format)
Text to abstract art generation for the holidays!
A monolingual and cross-lingual meta-embedding generation and evaluation framework
Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )
PyTorch repository for text categorization and NER experiments in Turkish and English.
An evaluation of word-embeddings for classification
Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Repository for the experiments described in the paper "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Language models for the Spanish legal domain, developed at BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).
Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning
Ensemble PhoBERT with FastText Embedding to improve performance on Vietnamese Sentiment Analysis tasks.
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Romanian word embeddings. Pre-trained word embeddings are provided here. Current methods: CBOW, Skip-Gram, and fastText (from the Gensim library). The .vec and .model files are available for download in a single archive.
Biomedical Word embeddings generated from Spanish Biomedical corpora.
Repository for the free online book Oddly Satisfying Deep Learning from Scratch (link below!)
Machine learning-based solution to the problem of duplicate reports in a bug report repository.
Spanish Word Embeddings computed from large corpora and different sizes using fastText.