Source code related to the article "Leveraging Large Language Models in Code Question Answering: Baselines and Issues"

IU-AES-AI4Code/CodeQuestionAnswering

Leveraging Large Language Models in Code Question Answering: Baselines and Issues

This repository contains the source code for the training, inference, and evaluation described in the article "Leveraging Large Language Models in Code Question Answering: Baselines and Issues".

The repository has two main folders: Training and Testing. The Training folder contains the code to fine-tune the StarCoder and DeepSeek-Coder models, while the Testing folder contains the code to generate model predictions and evaluate them.

Links to models:

  1. StarCoder with Grammar Correction

  2. DeepSeek-Coder with Grammar Correction

  3. CodeT5+ for Summaries Generation

Links to datasets:

  1. Unified Dataset

  2. Unified Dataset with Grammatical Corrections

  3. Unified Dataset with Generated Summaries

  4. Testing Dataset Based on ClassEval Dataset

  5. High Quality Subset of Unified Dataset

If you have any questions about the code, send them by email to georgyandryuschenko@gmail.com, or feel free to open a GitHub issue.
