- Install dependencies: INSTALLING.md
- Process dataset: DataFuncs.py
- Extract features:
  - Extract parse trees: add_parse_tree.py
  - Extract parse tree features: add_path_features.py, with utility functions defined in instance_parser.py
- Run example experiments: run_all.py
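
The steps above can be sketched as a small driver that runs the scripts in order. This is only an illustration: it assumes each script can be run from the repository root as a standalone program with no arguments, which this README does not confirm. The directory prefixes are taken from the tree below, and `build_commands` and `run_pipeline` are hypothetical helper names, not part of the project.

```python
import subprocess
import sys

# Order follows the steps listed above; directory prefixes follow the repository tree.
# Assumption: every script is runnable on its own, without arguments.
PIPELINE = [
    "data/DataFuncs.py",
    "feature_extraction/add_parse_tree.py",
    "feature_extraction/add_path_features.py",
    "experimentation/run_all.py",
]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per pipeline step."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(root=".", dry_run=True, python=sys.executable):
    """Print each command; execute it from `root` unless dry_run is True."""
    commands = build_commands(python)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True, cwd=root)
    return commands


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_pipeline(dry_run=True)
```

Calling `run_pipeline(dry_run=False)` from the repository root would execute the steps for real. Installing the dependencies described in INSTALLING.md comes first and is not automated here.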
https://github.com/rvente/NLP-Final-Project/blob/release/Code/orpheus
├── analysis Use these notebooks for analysis, reading from and writing to /results.
│ ├── analysis.ipynb
│ ├── chart_nb_x_alpha.ipynb
│ ├── chart_prev_seen.ipynb
│ ├── chart_svc_prev_seen.ipynb
│ ├── generate_charts.ipynb
│ ├── Presentation.ipynb
│ ├── Presentation-NB.ipynb
│ └── Presentation-SVC.ipynb
├── data Store data and feature extraction output here.
│ ├── 1000A30D__doc+pos.pkl
│ ├── 1000A30D_with_doc.pkl
│ ├── 100A50D.csv
│ ├── 100A50D__doc+pos.pkl
│ ├── 100A50D_POS.pkl
│ ├── DataFuncs.py
│ ├── Run_All.py
│ ├── ...
│ ├── small_with_doc.pkl
│ └── small.xlsx
├── experimentation Configure and run the machine learning models
│ ├── l0_100a_50d.py
│ ├── __pycache__
│ ├── run_all.py Outlines the most general combinations of hyper-parameters.
│ ├── run_prev_seen.py
│ ├── sandbox.py
│ └── svc.py
├── feature_extraction
│ ├── add_parse_tree.py
│ ├── add_path_features.py
│ ├── instance_parser.py
│ └── __pycache__
├── figures Figures generated by the analysis scripts.
│ ├── nb_x_alpha.pdf
│ ├── nb_x_alpha.svg
│ ├── nb_x_prev_seen.pdf
│ └── svm_x_prev_seen.pdf
├── INSTALLING.md How to install and configure
├── logs gitignored: The filesystem database of experiments
│ ├── 1
│ ├── 10
│ ├── 100
│ ├── 101
│ ├── ...
│ └── _sources
├── prev_seen_logs not gitignored: view sample logs here on another branch
│ ├── 1
│ ├── 10
│ ├── 11
│ ├── ...
│ └── _sources
├── requirements_2.txt
├── requirements.txt
├── results
│ ├── nb_df_acc.pkl
│ ├── nb_df_f1.pkl
│ ├── nb_x_alpha_df_acc.pkl
│ ├── svc_df_acc.pkl
│ ├── svc_df_f1.pkl
│ └── svm_x_prev_seen.pkl
├── software_citations.bib
└── virtualenv We recommend a virtual environment for installing packages.
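
The run directories under `logs` and `prev_seen_logs` appear in the tree in lexicographic order (1, 10, 100, 101, ...). A minimal sketch for listing them in numeric order follows. It assumes only what the tree shows: numbered subdirectories alongside a `_sources` directory. `list_run_ids` is a hypothetical helper, not part of the project.

```python
from pathlib import Path


def list_run_ids(log_dir):
    """Return the numeric run-directory names under log_dir as sorted integers.

    Non-numeric entries such as `_sources` are skipped. A missing directory
    (for example, the gitignored `logs` on a fresh clone) yields an empty list.
    """
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        return []
    return sorted(
        int(entry.name)
        for entry in log_dir.iterdir()
        if entry.is_dir() and entry.name.isdecimal()
    )


if __name__ == "__main__":
    print(list_run_ids("logs"))
```

Sorting as integers avoids the 1, 10, 100 ordering seen in the listing, which makes it easier to find the most recent run.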