reproducibility versus replicability #7

valdanchev · 2021-12-24T18:14:03Z

Great to see this implemented. Definitions of reproducibility and replication differ across domains, so it would probably be helpful to clarify them in a few places. Happy to add these. Under what is probably the most widely accepted definition now, reproducibility means that independent researchers use the same data sets, techniques, scripts, and framework to obtain the same results. Replication in this setting is a bit tricky though: would the only difference be in the implemented framework, TensorFlow versus PyTorch? Are there other underlying differences between the two frameworks, which may contribute to differences also in how the model is trained or in the results, depending on whether model training or results are replicated?

VictorSanh · 2022-01-04T21:12:47Z

Thanks for raising that point, @valdanchev!

> Replication in this setting is a bit tricky though: would the only difference be in the implemented framework, TensorFlow versus PyTorch? Are there other underlying differences between the two frameworks, which may contribute to differences also in how the model is trained or in the results, depending on whether model training or results are replicated?

The main differences for the replication of the training will be:

- framework: TensorFlow vs PyTorch
- optimizer: Adafactor vs Adam
- data processing: example packing vs no example packing

(Under the folder evaluation, I used "reproduce", but as you noted, a better term would be "replicate".)
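
For readers unfamiliar with the data processing difference, here is a minimal sketch of what "example packing vs no example packing" can mean. It assumes a simple greedy scheme over already-tokenized examples. The function names `pad_examples` and `pack_examples`, the `pad_id` value, and the toy token ids are all hypothetical and are not taken from the actual training code. Real pipelines usually also track segment boundaries so that attention does not cross packed examples, which is omitted here.

```python
from typing import List


def pad_examples(examples: List[List[int]], max_len: int, pad_id: int = 0) -> List[List[int]]:
    """No packing: one example per row, truncated and padded to max_len."""
    rows = []
    for ex in examples:
        ex = ex[:max_len]
        rows.append(ex + [pad_id] * (max_len - len(ex)))
    return rows


def pack_examples(examples: List[List[int]], max_len: int, pad_id: int = 0) -> List[List[int]]:
    """Greedy packing: concatenate consecutive examples into one row
    until the next example would overflow max_len, then start a new row."""
    rows: List[List[int]] = []
    current: List[int] = []
    for ex in examples:
        ex = ex[:max_len]
        if current and len(current) + len(ex) > max_len:
            # Flush the current row, padding whatever space is left.
            rows.append(current + [pad_id] * (max_len - len(current)))
            current = []
        current = current + ex
    if current:
        rows.append(current + [pad_id] * (max_len - len(current)))
    return rows


# Toy token ids (hypothetical); 0 is reserved for padding.
examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
unpacked = pad_examples(examples, max_len=6)
packed = pack_examples(examples, max_len=6)

print(len(unpacked), "rows without packing")
print(len(packed), "rows with packing")
```

The same data yields a different number of rows, and a different ratio of real tokens to padding per batch, so two runs with an identical nominal batch size do not see the same tokens per optimizer step. That is one reason this counts as a replication difference rather than a pure framework swap.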
