SemEnr

Code Semantic Enrichment for Deep Code Search

Dependencies

Tested on Ubuntu 16.04 with:

Python 3.6
Keras 2.1.3
Tensorflow-gpu 1.7.0
lucene 7.7.1

Usage

DataSets

The datasets used in our paper can be found at: https://drive.google.com/drive/folders/1j-0xukLQWGrJ8-Lxw7vFAbubFTyXJT2C?usp=sharing

Data Process

To reprocess the data into the form the model expects, follow these steps:

1. Build a corpus for each feature (i.e., description, tokens):

python createCorpus.py
python createVocab.py
python vocab2pkl.py

2. Process the training and testing data according to the corpus:

python txt2pkl.py
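
For orientation, the two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in the repository's scripts: the function names (`build_vocab`, `encode`, `save_pkl`), the reserved `<pad>` and `<unk>` tokens, the `max_size` cutoff, and the whitespace tokenization are all assumptions.

```python
import pickle
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # assumed reserved tokens, not taken from the repository


def build_vocab(lines, max_size=10000):
    """Map the most frequent whitespace-separated tokens to integer ids."""
    counts = Counter(tok for line in lines for tok in line.split())
    vocab = {PAD: 0, UNK: 1}
    for tok, _ in counts.most_common(max_size - len(vocab)):
        if tok not in vocab:  # never overwrite the reserved ids
            vocab[tok] = len(vocab)
    return vocab


def encode(line, vocab):
    """Turn one line of text into a list of ids, using UNK for unseen tokens."""
    return [vocab.get(tok, vocab[UNK]) for tok in line.split()]


def save_pkl(obj, path):
    """Serialize a vocabulary or an encoded dataset to a pickle file."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)
```

Under these assumptions, a vocabulary would be built once per feature (one for descriptions, one for tokens), and the training and testing files would then be encoded against it and pickled.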

Code Enrichment Module

Build the retrieval base: python Index.py

Perform search: python Search.py

Remove stop words: python deleteStopWords.py
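
The index, search, and stop-word steps can be pictured with the toy sketch below. It is not the repository's implementation: the repository relies on Lucene, whereas this sketch swaps in a plain-Python ranking by Jaccard token overlap so that it stays self-contained. The function names, the stop-word list, and the idea that each indexed entry pairs code tokens with a description are assumptions made for illustration.

```python
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}  # illustrative list only


def build_index(entries):
    """Store each (code, description) entry together with its code token set."""
    return [(set(code.split()), desc) for code, desc in entries]


def search(index, query_code, top_k=1):
    """Rank indexed entries by Jaccard overlap with the query tokens."""
    query = set(query_code.split())
    scored = []
    for tokens, desc in index:
        union = query | tokens
        score = len(query & tokens) / len(union) if union else 0.0
        scored.append((score, desc))
    # Highest overlap first; the sort is stable, so ties keep index order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [desc for _, desc in scored[:top_k]]


def remove_stop_words(text):
    """Drop words found in STOP_WORDS, comparing case-insensitively."""
    return " ".join(w for w in text.split() if w.lower() not in STOP_WORDS)
```

A real Lucene index scores matches differently (BM25 by default in Lucene 7), so the ranking from this sketch should not be expected to match the output of Search.py.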

Code Search Module

Configuration

Place the dataset in the data/github directory under keras.

Edit the hyper-parameters and settings in config.py.
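
As a rough picture of what such a settings file can look like, here is a hypothetical sketch. It does not reproduce the repository's config.py: the function name `get_config`, every key, and every value are placeholders, apart from the data/github directory mentioned above.

```python
def get_config():
    """Return a settings dictionary; every value here is a placeholder."""
    return {
        "data_params": {
            "data_dir": "./data/github/",  # directory named in the step above
            "vocab_size": 10000,           # hypothetical
            "max_seq_len": 50,             # hypothetical
        },
        "training_params": {
            "batch_size": 128,             # hypothetical
            "epochs": 100,                 # hypothetical
            "learning_rate": 0.001,        # hypothetical
        },
    }
```

Check the actual config.py for the real parameter names before changing anything.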

Train and Evaluate

python main.py --mode train
python main.py --mode eval
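
A `--mode` switch like the one above is commonly wired up with `argparse`. The sketch below shows one way this could work. It is an assumption about the general shape, not the contents of main.py: `parse_args` and `run` are hypothetical names, and the returned strings stand in for the real training and evaluation routines.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only --mode is taken from the README."""
    parser = argparse.ArgumentParser(description="Train or evaluate a code search model")
    parser.add_argument("--mode", choices=["train", "eval"], default="train",
                        help="train: fit the model; eval: score a trained model")
    return parser.parse_args(argv)


def run(mode):
    """Dispatch to a placeholder routine for the selected mode."""
    if mode == "train":
        return "training"    # placeholder for the training loop
    if mode == "eval":
        return "evaluating"  # placeholder for the evaluation pass
    raise ValueError("unknown mode: {}".format(mode))
```

Restricting `--mode` with `choices` makes a typo such as `--mode evla` fail immediately with a usage message rather than silently doing nothing.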

