A modified dataset consisting of English dialogs between a user and an assistant discussing movie preferences in natural language.
Updated Sep 29, 2023
Synthetically Generating Intent-Aware Information-Seeking Dialogues! Useful for tasks such as training and evaluating user intent predictors, with the option of training or evaluating on real human dialogues. The backbone LLM of SOLID is Zephyr-7b-beta.
Efficiently fetch eksisozluk.com entries and perform sentiment analysis on them (Turkish only) using Rust
Collection of ETL scripts used to create a dataset of text in Spanish to train Large Language Models.
Source code of many well-known Python repositories, collected in this repo as plain localdocs for training code AI
PARROT (Performance Assessment of Reasoning and Responses On Trivia) is a novel benchmarking framework designed to evaluate Large Language Models (LLMs) on real-world, complex, and ambiguous QA tasks.
A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ
A collection of LLM-related papers, theses, tools, datasets, courses, open source models, and benchmarks
Repository for organizing datasets and papers used in Open LLM.
Collection of text2cypher datasets, evaluations, and fine-tuning instructions