Experimental Data: Boosting Binary Optimization via Binary Classification

quasiquasar/gta-jobshop-data

Boosting Binary Optimization via Binary Classification: A Case Study of Job Shop Scheduling

1 Introduction

This repository contains the experimental data used in the paper “Boosting Binary Optimization via Binary Classification: A Case Study of Job Shop Scheduling” by O.V. Shylo and H. Shams. It also contains the code for generating the probability dominance plots.

The data is located in the data folder. The code for generating the plots is in the Jupyter notebook generate-plots.ipynb, and the notebook analyze-regression.ipynb provides the code for generating the accuracy plots for the logistic regression.

2 Performance log file structure

Below is a snapshot of a typical log file that we use to compare algorithms.

  algo     expid   problem  runid  mpirank        seed   obj  time  epoch
0  gta  gtacurve  ta11.txt     10        0  1168868254  1395     6      0
1  gta  gtacurve  ta11.txt     10        0  1168868254  1394     9      1
2  gta  gtacurve  ta11.txt     10        0  1168868254  1393    10      1
3  gta  gtacurve  ta11.txt     10        0  1168868254  1392    12      1
4  gta  gtacurve  ta11.txt     10        0  1168868254  1391    12      1

The columns are described in the following table.

column   description
algo     algorithm name
expid    name of the experiment
problem  problem name
runid    id of the run (there may be two distinct runs with the same id)
mpirank  not used
seed     random initialization seed (unique for each distinct run)
obj      objective value of the solution that was found
time     total seconds after the start, when the above objective was found
epoch    epoch number, when the above objective was found
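
As a rough illustration of how these columns can be used, the sketch below parses the snapshot shown earlier from an in-memory string and reports the best objective per run together with the time at which it was first reached. It uses only the Python standard library. The on-disk format of the log files is not described here, so the whitespace-separated layout and the helper names `parse_snapshot` and `best_per_run` are assumptions made for illustration. Runs are keyed by seed because run ids may repeat, and the sketch assumes that a lower obj is better.

```python
SNAPSHOT = """\
  algo     expid   problem  runid  mpirank        seed   obj  time  epoch
0  gta  gtacurve  ta11.txt     10        0  1168868254  1395     6      0
1  gta  gtacurve  ta11.txt     10        0  1168868254  1394     9      1
2  gta  gtacurve  ta11.txt     10        0  1168868254  1393    10      1
3  gta  gtacurve  ta11.txt     10        0  1168868254  1392    12      1
4  gta  gtacurve  ta11.txt     10        0  1168868254  1391    12      1
"""

INT_COLUMNS = ("runid", "mpirank", "seed", "obj", "time", "epoch")


def parse_snapshot(text):
    """Parse a whitespace-separated snapshot: one header line, then index-prefixed rows."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    records = []
    for line in lines[1:]:
        fields = line.split()[1:]  # drop the leading row index
        record = dict(zip(header, fields))
        for name in INT_COLUMNS:
            record[name] = int(record[name])
        records.append(record)
    return records


def best_per_run(records):
    """Map (algo, expid, problem, seed) to (best obj, time it was first found).

    Assumption: lower obj is better.
    """
    best = {}
    for r in records:
        key = (r["algo"], r["expid"], r["problem"], r["seed"])
        candidate = (r["obj"], r["time"])
        # Tuple comparison prefers the lower objective, then the earlier time.
        if key not in best or candidate < best[key]:
            best[key] = candidate
    return best


records = parse_snapshot(SNAPSHOT)
for key, (obj, seconds) in best_per_run(records).items():
    print(key, "best obj:", obj, "found after", seconds, "s")
```

For the five rows of the snapshot, this reports an objective of 1391 reached after 12 seconds.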

For the gta algorithm, the experiment id gtacurve corresponds to the runs that were limited to 200 epochs. The experiment id gtacurvetime corresponds to the runs that were limited to 30 minutes. The experiment id gtacurve0001 corresponds to the last experiment described in the paper, in which theta-min was changed to 0.0001.

For the stabu algorithm, the experiment id tabu0 corresponds to the runs that were limited to 200 epochs. The experiment id tabutime corresponds to the runs that were limited to 30 minutes.
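
When filtering logs by experiment, it can help to keep these run limits in a small lookup table. The sketch below is one possible encoding, not part of the repository: the `Experiment` type, the `EXPERIMENTS` table and the `time_limited` helper are hypothetical names. The run limit of gtacurve0001 is not stated above, so it is left unset.

```python
from typing import NamedTuple, Optional


class Experiment(NamedTuple):
    max_epochs: Optional[int]   # epoch limit, if the runs were epoch-limited
    max_minutes: Optional[int]  # time limit in minutes, if the runs were time-limited
    note: str = ""


# (algo, expid) -> limits as described in the text above.
EXPERIMENTS = {
    ("gta", "gtacurve"): Experiment(200, None),
    ("gta", "gtacurvetime"): Experiment(None, 30),
    # Only the theta-min change is stated for this experiment, not its run limit.
    ("gta", "gtacurve0001"): Experiment(None, None, "theta-min = 0.0001"),
    ("stabu", "tabu0"): Experiment(200, None),
    ("stabu", "tabutime"): Experiment(None, 30),
}


def time_limited(algo, expid):
    """Return True if the experiment is described as limited by time."""
    return EXPERIMENTS[(algo, expid)].max_minutes is not None


print([key for key in EXPERIMENTS if time_limited(*key)])
```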

2.1 Structure of the data/performance folder

All the performance log files are located in the data/performance folder. The subfolders are nested by algorithm, then by experiment id, and then by problem instance.

For example, the folder data/performance/gta/gtacurve/ta11/ contains all the logs produced by the algorithm gta with the experiment id gtacurve on the problem ta11.
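
The folder layout can be expressed as a small path helper. This is a minimal sketch. The function name `performance_dir` is hypothetical. Dropping a trailing ".txt" is an assumption inferred from the problem column (ta11.txt) and the folder name (ta11).

```python
from pathlib import PurePosixPath


def performance_dir(algo, expid, problem, root="data/performance"):
    """Build the folder path for one algorithm, experiment id and problem instance."""
    # The problem column reads e.g. "ta11.txt" while the folder is named "ta11",
    # so a trailing ".txt" is dropped (assumed convention).
    instance = problem[:-4] if problem.endswith(".txt") else problem
    return PurePosixPath(root) / algo / expid / instance


print(performance_dir("gta", "gtacurve", "ta11.txt"))
# prints: data/performance/gta/gtacurve/ta11
```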

3 Logistic regression log file structure

Below is a snapshot of a typical log file that we use to plot the optimal regression parameters.

    algo  expid   problem  runid    ...     time  epoch     muopt  accuracy
0  stabu  tabu0  ta48.txt    141    ...        9      1  0.009183  0.714767
1  stabu  tabu0  ta48.txt    141    ...       19      2  0.012534  0.713215
2  stabu  tabu0  ta48.txt    141    ...       29      3  0.020770  0.734721
3  stabu  tabu0  ta48.txt    141    ...       39      4  0.025433  0.749809
4  stabu  tabu0  ta48.txt    141    ...       48      5  0.033475  0.750561

[5 rows x 9 columns]

The columns are described in the following table.

column    description
algo      algorithm name
expid     name of the experiment
problem   problem name
runid     id of the run (there may be two distinct runs with the same id)
seed      random initialization seed (unique for each distinct run)
time      total seconds after the start, when the optimal logreg parameter was calculated
epoch     epoch number, when the optimal logreg parameter was calculated
muopt     optimal logreg parameter found at the above time/epoch
accuracy  accuracy of the logistic regression at the above time/epoch
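
As a sketch of how these columns might be read, the example below parses the five snapshot rows shown earlier and finds the epoch with the highest accuracy. It uses only the Python standard library. The column hidden behind "..." in the snapshot is skipped rather than guessed, and the names `ROWS`, `COLUMNS` and `parse_rows` are hypothetical.

```python
ROWS = """\
0  stabu  tabu0  ta48.txt    141    ...        9      1  0.009183  0.714767
1  stabu  tabu0  ta48.txt    141    ...       19      2  0.012534  0.713215
2  stabu  tabu0  ta48.txt    141    ...       29      3  0.020770  0.734721
3  stabu  tabu0  ta48.txt    141    ...       39      4  0.025433  0.749809
4  stabu  tabu0  ta48.txt    141    ...       48      5  0.033475  0.750561
"""

# Only the columns visible in the snapshot; the elided column is not reconstructed.
COLUMNS = ("algo", "expid", "problem", "runid", "time", "epoch", "muopt", "accuracy")


def parse_rows(text):
    """Parse index-prefixed, whitespace-separated rows, ignoring the '...' placeholder."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = [f for f in line.split()[1:] if f != "..."]
        record = dict(zip(COLUMNS, fields))
        for name in ("runid", "time", "epoch"):
            record[name] = int(record[name])
        for name in ("muopt", "accuracy"):
            record[name] = float(record[name])
        records.append(record)
    return records


records = parse_rows(ROWS)
best = max(records, key=lambda r: r["accuracy"])
print("highest accuracy", best["accuracy"], "at epoch", best["epoch"])
```

In this snapshot the highest accuracy, 0.750561, occurs at epoch 5.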

3.1 Structure of the data/onlineregression folder

All the regression log files are located in the data/onlineregression folder. The subfolders are nested by algorithm, then by experiment id, and then by problem instance.

For example, the folder data/onlineregression/stabu/tabu0/ta11/ contains all the logistic regression logs produced by the algorithm stabu with the experiment id tabu0 on the problem ta11.
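
Going the other way, a folder path can be split back into its components. The sketch below assumes the data/<kind>/<algo>/<expid>/<problem> layout described in this document. The function name `describe_log_dir` and the key names in its result are hypothetical.

```python
from pathlib import PurePosixPath


def describe_log_dir(path):
    """Split a log folder path into its kind, algorithm, experiment id and problem."""
    parts = PurePosixPath(path).parts
    # Expect the path to end with: data / <kind> / <algo> / <expid> / <problem>
    if len(parts) < 5 or parts[-5] != "data":
        raise ValueError(f"unexpected log folder layout: {path}")
    kind, algo, expid, problem = parts[-4:]
    return {"kind": kind, "algo": algo, "expid": expid, "problem": problem}


print(describe_log_dir("data/onlineregression/stabu/tabu0/ta11/"))
```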
