Improve speed for test #188

tikkss · 2023-12-26T21:34:29Z

Running all tests takes about 70 seconds, even with cached files in my environment. During implementation, I tested frequently and waiting for 70 seconds each time was not realistic.

As a workaround, I've been running tests for the implemented dataset only, which completes in about 2.5 seconds using the following command:

ruby -I lib test/run-test.rb -t "HouseOfRepresentativeTest" -v

However, when submitting a pull request and running all tests, it takes about 70 seconds.

The tests for red-datasets use test-unit. I couldn't find a way to speed them up, such as running them in parallel.

Is there any effective solution? Thanks!

My environment:

OS: macOS Sonoma 14.2.1
CPU: Intel Core i5 2.3GHz 4-core Processor
Memory: 16GB

Running all tests with cached files in my environment on 723bdf2 below:

$ test/run-test.rb -v
Loaded suite test
Started
AFINNTest:
  test: #each:                                                  .: (0.080627)
  #metadata:
    test: #description:                                         .: (0.000977)
AdultTest:
  #metadata:
    test: #description:                                         .: (0.005996)
  test:
    test: #each:                                                .: (1.315731)
  train:
    test: #each:                                                .: (2.806436)
AozoraBunkoTest:
  test: #new:                                                   .: (0.113628)
  Book:
    test: #clear_cache! removes all cache files:                .: (0.034559)
    #html:
      test: not readable:                                       .: (0.000677)
      readable:
        test: encoding is ShiftJIS:                             .: (0.000690)
        test: encoding is UTF-8:                                .: (0.000354)
    #text:
      test: not readable:                                       .: (0.000187)
      test: readable:                                           .: (0.000918)
    converting boolean:
      test: #copyrighted?:                                      .: (0.098200)
      test: #person_copyrighted?:                               .: (0.099811)
CIFARTest:
  cifar-10:
    test:
      test: #each:                                              .: (0.002878)
    train:
      test: #each:                                              .: (0.004444)
  cifar-100:
    test:
      test: #each:                                              .: (0.002528)
      test: #to_table:                                          .: (0.002000)
    train:
      test: #each:                                              .: (0.002576)
      test: #to_table:                                          .: (0.001986)
  invalid:
    test: type:                                                 .: (0.000348)
CLDRPluralsTest:
  test: #each:                                                  .: (0.048240)
  #metadata:
    test: #description:                                         .: (0.018892)
CaliforniaHousingTest:
  test: #each:                                                  .: (1.359899)
  #metadata:
    test: #description:                                         .: (0.000205)
CommunitiesTest:
  test: #each:                                                  .: (0.186278)
  #metadata:
    test: #description:                                         .: (0.004938)
DiamondsTest:
  test: #each:                                                  .: (3.823073)
  #metadata:
    test: #description:                                         .: (0.000963)
DictionaryTest:
  test: #each:                                                  .: (0.102608)
  test: #id:                                                    .: (0.093453)
  test: #ids:                                                   .: (0.090334)
  test: #length:                                                .: (0.083336)
  test: #size:                                                  .: (0.089598)
  test: #value:                                                 .: (0.084973)
  test: #values:                                                .: (0.093050)
DownloaderTest:
  #download:
    test: too many redirection:                                 .: (0.001245)
EStatJapanTest:
  anomaly responses:
    test: forbidden access with invalid app_id:                 .: (0.001666)
  app_id:
    test: configure:                                            .: (0.000370)
    test: constructor:                                          .: (0.000199)
    test: env:                                                  .: (0.000195)
    test: env & configure:                                      .: (0.000194)
    test: env & configure & constructor:                        .: (0.000322)
    test: nothing:                                              .: (0.000164)
  parsing records:
    test: parsing records with default option:                  .: (0.001693)
    test: parsing records with hierarchy_selection:             .: (0.002348)
    test: parsing records with skip_nil_(column|row):           .: (0.001604)
  url generation:
    test: generates url correctly:                              .: (0.000370)
FashionMNISTTest:
  Abnormal:
    test: invalid type:                                         .: (0.000285)
  Normal:
    test:
      test: #each:                                              .: (0.041159)
      test: #to_table:                                          .: (0.399451)
      #metadata:
        test: #id:                                              .: (0.000313)
        test: #name:                                            .: (0.000291)
    train:
      test: #each:                                              .: (0.310068)
      test: #to_table:                                          .: (2.539012)
      #metadata:
        test: #id:                                              .: (0.000267)
        test: #name:                                            .: (0.000087)
FuelEconomyTest:
  test: #each:                                                  .: (0.016510)
  #metadata:
    test: #description:                                         .: (0.001305)
GeoloniaTest:
  test: #each:                                                  .: (7.531156)
  #metadata:
    test: #description:                                         .: (0.001492)
HepatitisTest:
  test: #each:                                                  .: (0.004290)
  #metadata:
    test: #description:                                         .: (0.048317)
HouseOfCouncillorTest:
  :bill:
    test: #each:                                                .: (2.304549)
  :in_house_group:
    test: #each:                                                .: (0.002019)
  :member:
    test: #each:                                                .: (0.047492)
  :question:
    test: #each:                                                .: (0.869110)
HouseOfRepresentativeTest:
  test: #each:                                                  .: (2.884005)
ITACorpusTest:
  #metadata:
    test: #description:                                         .: (0.001511)
  type:
    test: emotion:                                              .: (0.001433)
    test: invalid:                                              .: (0.000172)
    test: recitation:                                           .: (0.001565)
IrisTest:
  test: #each:                                                  .: (0.004085)
  #metadata:
    test: #description:                                         .: (0.009697)
JapaneseDateParserTest:
  test: #parse[month and day with leading a space in Heisei]:   .: (0.000289)
  test: #parse[month         with leading a space in Heisei]:   .: (0.000104)
  test: #parse[          day with leading a space in Heisei]:   .: (0.000108)
  test: #parse[           without leading a space in Heisei]:   .: (0.000161)
  test: #parse[year, month and day with leading a space in Reiwa]:      .: (0.000109)
  test: #parse[year, month         with leading a space in Reiwa]:      .: (0.000073)
  test: #parse[year,           day with leading a space in Reiwa]:      .: (0.000068)
  test: #parse[year,            without leading a space in Reiwa]:      .: (0.000066)
  test: #parse[boundary within Heisei]:                         .: (0.000093)
  test: #parse[boundary within Reiwa]:                          .: (0.000067)
  test: unsupported era initial range:                          .: (0.000203)
KuzushijiMNISTTest:
  Abnormal:
    test: invalid type:                                         .: (0.000206)
  Normal:
    test:
      test: #each:                                              .: (0.040145)
      test: #to_table:                                          .: (0.424113)
      #metadata:
        test: #id:                                              .: (0.000201)
        test: #name:                                            .: (0.000079)
    train:
      test: #each:                                              .: (0.294359)
      test: #to_table:                                          .: (2.597233)
      #metadata:
        test: #id:                                              .: (0.000624)
        test: #name:                                            .: (0.000089)
LIBSVMDatasetListTest:
  test: #each:                                                  .: (0.003166)
  #metadata:
    test: #description:                                         .: (0.005273)
LIBSVMDatasetTest:
  test: :default_feature_value:                                 .: (0.003167)
  test: :note:                                                  .: (0.004178)
  test: classification:                                         .: (0.003153)
  test: multi-label:                                            .: (1.426006)
  test: regression:                                             .: (1.012187)
  test: string:                                                 .: (0.000055)
LicenseTest:
  .try_convert:
    test: String:                                               .: (0.000224)
    test: {name:, url:}:                                        .: (0.000095)
    test: {spdx_id:}:                                           .: (0.000256)
LivedoorNewsTest:
  #metadata:
    test: #description:                                         .: (0.000859)
  type:
    test: dokujo_tsushin:                                       .: (0.311255)
    test: invalid:                                              .: (0.000263)
    test: it_life_hack:                                         .: (0.297497)
    test: kaden_channel:                                        .: (0.295787)
    test: livedoor_homme:                                       .: (0.285076)
    test: movie_enter:                                          .: (0.297599)
    test: peachy:                                               .: (0.294360)
    test: smax:                                                 .: (0.298369)
    test: sports_watch:                                         .: (0.292065)
    test: topic_news:                                           .: (0.283887)
MNISTTest:
  Abnormal:
    test: invalid type:                                         .: (0.000362)
  Normal:
    test:
      test: #each:                                              .: (0.028889)
      test: #to_table:                                          .: (0.350796)
      #metadata:
        test: #id:                                              .: (0.000482)
        test: #name:                                            .: (0.000120)
    train:
      test: #each:                                              .: (0.184738)
      test: #to_table:                                          .: (2.296102)
      #metadata:
        test: #id:                                              .: (0.000323)
        test: #name:                                            .: (0.000116)
MetadataTest:
  #licenses:
    test: String:                                               .: (0.000483)
    test: Symbol:                                               .: (0.000232)
    test: [String]:                                             .: (0.000120)
    test: {name:, url:}:                                        .: (0.000114)
MushroomTest:
  test: #each:                                                  .: (0.206307)
  #metadata:
    test: #description:                                         .: (0.006712)
NagoyaUniversityConversationCorpusTest:
  #metadata:
    test: #description:                                         .: (0.000188)
  each:
    test: #participants:                                        .: (1.405967)
    test: #sentences:                                           .: (1.211507)
    test: others:                                               .: (1.210386)
PMJTDatasetListTest:
  test: #each:                                                  .: (0.134741)
PenguinsTest:
  Adelie:
    test: #each:                                                .: (0.021625)
  Chinstrap:
    test: #each:                                                .: (0.047549)
  Gentoo:
    test: #each:                                                .: (0.018874)
  Penguins:
    test: #each:                                                .: (0.052839)
    test: data cleansing:                                       .: (0.050738)
    test: order of species:                                     .: (0.051520)
  PenguinsRawData::SpeciesBase:
    test: #data_path:                                           .: (0.000588)
PennTreebankTest:
  type:
    test: invalid:                                              .: (0.000426)
    test: test:                                                 .: (0.051926)
    test: train:                                                .: (0.621158)
    test: valid:                                                .: (0.039625)
PostalCodeJapanTest:
  :reading:
    test: :lowercase:                                           .: (0.227016)
    test: :romaji:                                              .: (0.233065)
    test: :uppercase:                                           .: (0.175865)
QuoraDuplicateQuestionPairTest:
  test: #each:                                                  .: (22.387830)
RdatasetTest:
  Rdataset:
    test: invalid package name:                                 .: (0.205192)
    datasets:
      test: invalid dataset name:                               .: (0.214120)
      AirPassengers:
        test: #each:                                            .: (0.048332)
        test: #metadata.description:                            .: (0.100297)
        test: #metadata.id:                                     .: (0.041614)
      airquality:
        test: #each:                                            .: (0.045113)
      attenu:
        test: #each:                                            .: (0.070130)
    drc:
      germination:
        test: #each:                                            .: (0.088386)
    validate:
      nace_rev2:
        test: #each:                                            .: (0.308739)
  RdatasetList:
    #each:
      test: with package_name:                                  .: (0.177569)
      test: without package_name:                               .: (0.208933)
SeabornTest:
  attention:
    test_each:                                                  .: (0.003711)
  flights:
    test_each:                                                  .: (0.004510)
  fmri:
    test_each:                                                  .: (0.056336)
  list:
    test_each:                                                  .: (0.001493)
  penguins:
    test_each:                                                  .: (0.019277)
SudachiSynonymDictionaryTest:
  test: #each:                                                  .: (1.022899)
  #metadata:
    test: #description:                                         .: (0.045505)
TableTest:
  test: #column_names:                                          .: (0.005359)
  test: #dictionary_encode:                                     .: (0.005118)
  test: #each:                                                  .: (0.004869)
  test: #each_column:                                           .: (0.004619)
  test: #each_record:                                           .: (0.004896)
  test: #label_encode:                                          .: (0.004576)
  test: #n_columns:                                             .: (0.004048)
  test: #n_rows:                                                .: (0.004077)
  test: #to_h:                                                  .: (0.004782)
  #[]:
    test: index:                                                .: (0.004264)
    test: name:                                                 .: (0.004362)
  #fetch_values:
    test: found:                                                .: (0.004786)
    not found:
      test: with block:                                         .: (0.004534)
      test: without block:                                      .: (0.004533)
  #find_record:
    test: negative:                                             .: (0.004053)
    test: negative - over:                                      .: (0.004574)
    test: positive:                                             .: (0.004088)
    test: positive - over:                                      .: (0.004790)
TestDataset:
  #clear_cache!:
    test: when the dataset is downloaded:                       .: (0.006043)
    test: when the dataset is not downloaded:                   .: (0.004488)
WikipediaKyotoJapaneseEnglishTest:
  test: description:                                            .: (0.000219)
  test: invalid:                                                .: (0.000179)
  article:
    test: #each:                                                /Users/zzz/src/github.com/red-data-tools/red-datasets/lib/datasets/tar-gz-readable.rb:7: warning: attempt to close unfinished zstream; reset forced.
.: (0.004294)
  lexicon:
    test: #each:                                                .: (1.681613)
WikipediaTest:
  en:
    articles:
      test: #each:                                              Failed to read bzcat input: Errno::EPIPE: Broken pipe
.: (1.693823)
      #metadata:
        test: #description:                                     .: (0.000463)
        test: #id:                                              .: (0.000331)
        test: #name:                                            .: (0.000198)
WineTest:
  test: #each:                                                  .: (0.014964)
  #metadata:
    test: #description:                                         .: (0.062812)

Finished in 73.414974 seconds.
-------------------------------------------------------------------------------------
199 tests, 240 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
-------------------------------------------------------------------------------------
2.71 tests/s, 3.27 assertions/s

kou · 2023-12-28T01:50:40Z

Thanks. It seems that the following test is the slowest test:

QuoraDuplicateQuestionPairTest:
  test: #each:                                                  .: (22.387830)

Let's look into this as the first step.
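Picking out the slowest test by eye works for one run, but the same ranking can be scripted. The following is a minimal sketch, assuming the verbose log format shown above (`test: NAME: .: (SECONDS)` under an unindented suite line); `slowest_tests` is a hypothetical helper, not part of red-datasets or test-unit.

```ruby
# Rank tests from `test/run-test.rb -v` output by elapsed time.
# Assumed line formats (taken from the log in this issue):
#   "SomeTest:"                          -> suite name, no indentation
#   "  test: #each:      .: (1.234567)"  -> one test result with its time
# Lines that do not match either pattern (nested contexts, warnings that
# interrupt a result line, the summary) are skipped.
def slowest_tests(log, limit = 5)
  suite = nil
  timings = []
  log.each_line do |line|
    line = line.chomp
    if (m = line.match(/\A(\S.*):\z/))
      suite = m[1]
    elsif (m = line.match(/\A\s+(.+?):\s+\.: \((\d+\.\d+)\)\z/))
      timings << [suite, m[1].strip, Float(m[2])]
    end
  end
  # max_by(n) returns the n largest entries, largest first.
  timings.max_by(limit) { |_, _, seconds| seconds }
end

log = <<~'LOG'
  GeoloniaTest:
    test: #each:                                                  .: (7.531156)
    #metadata:
      test: #description:                                         .: (0.001492)
  QuoraDuplicateQuestionPairTest:
    test: #each:                                                  .: (22.387830)
LOG

slowest_tests(log, 2).each do |suite, name, seconds|
  puts format("%10.6f  %s / %s", seconds, suite, name)
end
```

Saving the full run to a file and passing its contents to `slowest_tests` gives a starting list of candidates to look into one by one.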

tikkss · 2024-01-04T21:44:23Z

I'll give it a try to improve the speed of QuoraDuplicateQuestionPairTest. Thanks!

Because date_time and float are not included in tsv.

Before this change:

```bash
$ time ruby -I lib test/run-test.rb -t "/QuoraDuplicateQuestionPairTest/" -v
Loaded suite test
Started
QuoraDuplicateQuestionPairTest:
  test: #each:                                                  .: (24.666772)

Finished in 24.667106 seconds.
-------------------------------------------------------------------------------------------------------------------------------------------------
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
-------------------------------------------------------------------------------------------------------------------------------------------------
0.04 tests/s, 0.04 assertions/s

real	0m25.270s
user	0m24.633s
sys	0m0.451s
```

After this change:

```bash
$ time ruby -I lib test/run-test.rb -t "/QuoraDuplicateQuestionPairTest/" -v
Loaded suite test
Started
QuoraDuplicateQuestionPairTest:
  test: #each:                                                  .: (17.476536)

Finished in 17.476912 seconds.
-------------------------------------------------------------------------------------------------------------------------------------------------
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
-------------------------------------------------------------------------------------------------------------------------------------------------
0.06 tests/s, 0.06 assertions/s

real	0m18.117s
user	0m17.554s
sys	0m0.383s
```

Refs #188.
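The commit message does not show the diff, so the following is only a sketch of one plausible reading: that the dataset stopped asking Ruby's `csv` library to try date/time and float conversion on every field, since the file contains neither. The sample TSV and its column names are made up for illustration; `converters: :all` and `converters: :integer` are standard options of the `csv` library.

```ruby
require "csv"

# Hypothetical two-row TSV with integer IDs and free text only.
tsv = "id\tqid1\tquestion1\n" \
      "0\t1\tWhat is Ruby?\n" \
      "1\t3\tWhy are tests slow?\n"

# :all tries the date_time and numeric (integer, float) converters
# on every field.
with_all = CSV.parse(tsv, col_sep: "\t", headers: true, converters: :all)

# :integer tries only integer conversion; everything else stays a String.
with_integer = CSV.parse(tsv, col_sep: "\t", headers: true, converters: :integer)

# For data without dates or floats, both settings yield the same rows.
puts with_all.map(&:to_h) == with_integer.map(&:to_h)
p with_integer[0].to_h
```

When the parsed rows are identical, the narrower converter list simply does less work per field, which is consistent with the drop from about 24.7 to 17.5 seconds reported above.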

Because tsv file is too big (404,290 rows).

Before this change:

```bash
$ time ruby -I lib test/run-test.rb -t "/QuoraDuplicateQuestionPairTest/" -v
Loaded suite test
Started
QuoraDuplicateQuestionPairTest:
  test: #each:                                                  .: (17.476536)

Finished in 17.476912 seconds.
-------------------------------------------------------------------------------------
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
-------------------------------------------------------------------------------------
0.06 tests/s, 0.06 assertions/s

real	0m18.117s
user	0m17.554s
sys	0m0.383s
```

After this change:

```bash
$ time ruby -I lib test/run-test.rb -t "/QuoraDuplicateQuestionPairTest/" -v
Loaded suite test
Started
QuoraDuplicateQuestionPairTest:
  test: #each:                                                  .: (0.001124)

Finished in 0.001461 seconds.
-------------------------------------------------------------------------------------
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
-------------------------------------------------------------------------------------
684.46 tests/s, 684.46 assertions/s

real	0m0.595s
user	0m0.376s
sys	0m0.126s
```

Refs #188.

---------

Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
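The diff itself is not shown here, but a drop from about 17 seconds to about 1 millisecond is what one would expect if the test stopped materializing every row and checked only the first few. The sketch below illustrates that general technique with a hypothetical `BigDataset` class standing in for a real dataset; it is not the red-datasets implementation.

```ruby
# Hypothetical stand-in for a dataset whose #each walks a large file.
# It counts how many rows were actually produced.
class BigDataset
  include Enumerable

  attr_reader :rows_read

  def initialize(n_rows)
    @n_rows = n_rows
    @rows_read = 0
  end

  def each
    # Without a block, return an Enumerator so callers can stop early.
    return to_enum(__method__) unless block_given?

    @n_rows.times do |i|
      @rows_read += 1
      yield({id: i})
    end
  end
end

dataset = BigDataset.new(404_290)

# Taking the first records through the enumerator stops iteration
# as soon as enough rows have been yielded.
first_rows = dataset.each.first(2)
p first_rows
puts "rows read for first(2): #{dataset.rows_read}"

# Converting to an array walks every row.
all_rows = dataset.each.to_a
puts "rows read in total: #{dataset.rows_read}"
puts "array size: #{all_rows.size}"
```

The trade-off is coverage: a test that looks only at the head of the file no longer notices a malformed row near the end, so row counts and last records go unchecked unless they are verified some other way.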

GitHub: GH-188

Because csv file is too big (277,656 rows).

Before this change:

```console
$ time ruby test/run-test.rb -t GeoloniaTest --verbose=important-only
Finished in 5.998956 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real	0m6.603s
user	0m6.146s
sys	0m0.346s
```

After this change:

```console
$ time ruby test/run-test.rb -t GeoloniaTest --verbose=important-only
Finished in 0.001634 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real	0m0.564s
user	0m0.388s
sys	0m0.087s
```

GitHub: GH-188

Because date_time is not included in the following csvs.

* diamonds.csv (diamonds dataset inherits from `Ggplot2Dataset`)
* mpg.csv (fuel economy dataset inherits from `Ggplot2Dataset`)

Before this change:

```console
$ time ruby test/run-test.rb -t DiamondsTest --verbose=important-only
Finished in 3.702616 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real	0m4.665s
user	0m4.136s
sys	0m0.226s
```

After this change:

```console
$ time ruby test/run-test.rb -t DiamondsTest --verbose=important-only
Finished in 3.367821 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real	0m4.179s
user	0m3.738s
sys	0m0.202s
```

After this change, the fuel economy dataset test also passed:

```console
$ ruby test/run-test.rb -t FuelEconomyTest --verbose=important-only
Finished in 0.017332 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
```

GitHub: GH-188

Because csv file is too big (53,940 rows).

Before this change:

```console
$ time ruby test/run-test.rb -t DiamondsTest --verbose=important-only
Finished in 3.367821 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real	0m4.179s
user	0m3.738s
sys	0m0.202s
```

After this change:

```console
$ time ruby test/run-test.rb -t DiamondsTest --verbose=important-only
Finished in 0.002048 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real	0m0.551s
user	0m0.379s
sys	0m0.085s
```

GitHub: GH-188

Because csv file is too big (10,779 rows and 41 columns).

Before this change:

```console
$ time ruby test/run-test.rb -t HouseOfRepresentativeTest --verbose=important-only
Finished in 2.679117 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m3.295s
user 0m3.031s
sys 0m0.152s
```

After this change:

```console
$ time ruby test/run-test.rb -t HouseOfRepresentativeTest --verbose=important-only
Finished in 0.095506 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m0.659s
user 0m0.465s
sys 0m0.103s
```

GitHub: GH-188

Because csv file is too big (32,561 rows).

Before this change:

```console
$ time ruby test/run-test.rb -t AdultTest::train --verbose=important-only
Finished in 2.742848 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m3.366s
user 0m3.064s
sys 0m0.183s
```

After this change:

```console
$ time ruby test/run-test.rb -t AdultTest::train --verbose=important-only
Finished in 0.001817 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m0.587s
user 0m0.395s
sys 0m0.093s
```

GitHub: GH-188

Because csv file is too big (9,462 rows and 39 columns).

Before this change:

```console
$ time ruby test/run-test.rb -t HouseOfCouncillorTest:::bill --verbose=important-only
Finished in 2.352206 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m2.953s
user 0m2.701s
sys 0m0.153s
```

After this change:

```console
$ time ruby test/run-test.rb -t HouseOfCouncillorTest:::bill --verbose=important-only
Finished in 0.002077 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m0.566s
user 0m0.382s
sys 0m0.092s
```

GitHub: GH-188

Because text file is too big (134,555 rows in total).

Before this change:

```console
$ time ruby test/run-test.rb -t NagoyaUniversityConversationCorpusTest --verbose=important-only
Finished in 3.65658 seconds.
4 tests, 4 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m4.550s
user 0m3.723s
sys 0m0.568s
```

After this change:

```console
$ time ruby test/run-test.rb -t NagoyaUniversityConversationCorpusTest --verbose=important-only
Finished in 0.060458 seconds.
4 tests, 4 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m0.613s
user 0m0.428s
sys 0m0.096s
```

---------

Co-authored-by: Sutou Kouhei <kou@clear-code.com>

GitHub: GH-188

Because csv file is too big (20,640 rows).

Before this change:

```console
$ time ruby test/run-test.rb -t CaliforniaHousingTest --verbose=important-only
Finished in 1.453959 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m3.029s
user 0m1.745s
sys 0m0.161s
```

After this change:

```console
$ time ruby test/run-test.rb -t CaliforniaHousingTest --verbose=important-only
Finished in 0.331599 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m1.126s
user 0m0.738s
sys 0m0.143s
```

GitHub: GH-188

Because csv file is too big (67,753 rows).

Before this change:

```console
$ time ruby test/run-test.rb -t SudachiSynonymDictionaryTest --verbose=important-only
Finished in 1.296801 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m1.964s
user 0m1.740s
sys 0m0.118s
```

After this change:

```console
$ time ruby test/run-test.rb -t SudachiSynonymDictionaryTest --verbose=important-only
Finished in 0.010658 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m0.634s
user 0m0.455s
sys 0m0.092s
```

GitHub: GH-188

Because csv file is too big (16,281 rows).

Before this change:

```console
$ time ruby test/run-test.rb -t AdultTest::test --verbose=important-only
Finished in 1.355024 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m1.980s
user 0m1.777s
sys 0m0.113s
```

After this change:

```console
$ time ruby test/run-test.rb -t AdultTest::test --verbose=important-only
Finished in 0.002213 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications

real 0m0.619s
user 0m0.442s
sys 0m0.089s
```
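
The commit messages above report timings but not the diffs themselves. As a rough sketch of why checking fewer records can help, assume a dataset whose `each` returns an enumerator that produces records one at a time. Asking only for the first record then stops iteration early, while `to_a` walks every row. `FakeDataset` and `parsed_rows` are hypothetical stand-ins for illustration, not red-datasets classes, and this is not necessarily what the merged pull requests do:

```ruby
# Hypothetical stand-in for a large dataset; counts how many rows were produced.
class FakeDataset
  include Enumerable

  attr_reader :parsed_rows

  def initialize(n_rows)
    @n_rows = n_rows
    @parsed_rows = 0
  end

  # Yields one record per row, like a dataset's #each.
  def each
    return to_enum(__method__) unless block_given?

    @n_rows.times do |i|
      @parsed_rows += 1
      yield({index: i, value: i * 2})
    end
  end
end

# Checking every record forces all rows to be produced.
full = FakeDataset.new(53_940)
all_records = full.each.to_a
full_parsed = full.parsed_rows

# Checking only the first record stops after one row.
reduced = FakeDataset.new(53_940)
first_record = reduced.each.first
reduced_parsed = reduced.parsed_rows

puts "to_a produced #{full_parsed} rows, first produced #{reduced_parsed} row(s)"
```

The trade-off is coverage: a test that looks only at the first record no longer notices a problem in later rows.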

tikkss · 2024-09-13T22:40:36Z

I tried to improve the execution time with the following approach:

First, improve the slowest test
Next, improve the next slowest test
Repeat until no slow tests remain
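
One way to find the slowest test at each step is to rank the per-test timings printed by `test/run-test.rb -v`. The following is only a sketch: `slowest_tests` and `TIMING_LINE` are hypothetical names, and the regular expression assumes the verbose line layout shown in the first comment of this issue (`test: <name>: .: (<seconds>)`). It keeps only the test name, so tests that share a name such as `#each` are not told apart by test case.

```ruby
# Hypothetical pattern for one passing test line in `-v` output, e.g.
#   "  test: #each:            .: (0.080627)"
TIMING_LINE = /^\s*test: (?<name>.+?):\s+\.: \((?<seconds>\d+\.\d+)\)\s*$/

# Returns up to `limit` [name, seconds] pairs, slowest first.
def slowest_tests(verbose_output, limit)
  timings = verbose_output.each_line.filter_map do |line|
    match = TIMING_LINE.match(line)
    [match[:name], match[:seconds].to_f] if match
  end
  timings.max_by(limit) { |_name, seconds| seconds }
end

# Sample lines copied from the verbose output quoted in this issue.
sample = <<~'OUTPUT'
  AFINNTest:
    test: #each:                                                  .: (0.080627)
  AdultTest:
    #metadata:
      test: #description:                                         .: (0.005996)
    test:
      test: #each:                                                .: (1.315731)
    train:
      test: #each:                                                .: (2.806436)
OUTPUT

top = slowest_tests(sample, 2)
top.each { |name, seconds| puts format("%-20s %.6f", name, seconds) }
```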

We reduced the test execution time from about 73 seconds to about 24 seconds (3 times faster!).

It was difficult to further reduce the execution time of the following top 5 slow tests, so this is as far as we can go within red-datasets.

$ ruby -I ../../test-unit/test-unit/lib test/run-test.rb --report-slow-tests --progress-row-max=72
Loaded suite test
Started
|/Users/zzz/src/github.com/red-data-tools/red-datasets/lib/datasets/tar-gz-readable.rb:7: warning: attempt to close unfinished zstream; reset forced.
-Failed to read bzcat input: Errno::EPIPE: Broken pipe
Finished in 23.527626 seconds.
------------------------------------------------------------------------
Top 5 slow tests
test: #to_table(FashionMNISTTest::Normal::train):               2.519765
--location /Users/zzz/src/github.com/red-data-tools/red-datasets/test/test-fashion-mnist.rb:42
test: #to_table(MNISTTest::Normal::train):                      2.305772
--location /Users/zzz/src/github.com/red-data-tools/red-datasets/test/test-mnist.rb:41
test: #to_table(KuzushijiMNISTTest::Normal::train):             2.266870
--location /Users/zzz/src/github.com/red-data-tools/red-datasets/test/test-kuzushiji-mnist.rb:42
test: #each(WikipediaTest::en::articles):                       1.690996
--location /Users/zzz/src/github.com/red-data-tools/red-datasets/test/test-wikipedia.rb:9
test: #each(WikipediaKyotoJapaneseEnglishTest::lexicon):        1.494162
--location /Users/zzz/src/github.com/red-data-tools/red-datasets/test/test-wikipedia-kyoto-japanese-english.rb:137
------------------------------------------------------------------------
201 tests, 242 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
------------------------------------------------------------------------
8.54 tests/s, 10.29 assertions/s

More improvements will come from implementing parallelization support in test-unit/test-unit#235.
We'll focus on that next.
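
To illustrate the general idea only, here is a minimal sketch of running independent jobs on a small pool of worker threads. It is not test-unit's API or its planned design. `run_in_parallel` is a hypothetical helper, and the lambdas stand in for independent tests. In CRuby, threads mainly help I/O-bound work, so CPU-bound tests such as CSV parsing would likely need processes or Ractors instead.

```ruby
# Hypothetical helper: runs each job (a callable) on one of n_workers threads
# and returns the results in the same order as the jobs.
def run_in_parallel(jobs, n_workers)
  queue = Queue.new
  jobs.each_with_index { |job, index| queue << [index, job] }
  queue.close # after closing, pop returns nil once the queue is empty

  results = Array.new(jobs.size)
  workers = Array.new(n_workers) do
    Thread.new do
      while (item = queue.pop)
        index, job = item
        results[index] = job.call
      end
    end
  end
  workers.each(&:join)
  results
end

# Stand-ins for independent tests; each returns a value instead of asserting.
jobs = [
  -> { sleep(0.01); :mnist },
  -> { sleep(0.01); :fashion_mnist },
  -> { sleep(0.01); :kuzushiji_mnist },
]
parallel_results = run_in_parallel(jobs, 2)
puts parallel_results.inspect
```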

Thanks!

tikkss mentioned this issue Jan 8, 2024: Support parallelization test-unit/test-unit#235 (Open)
tikkss mentioned this issue Jan 15, 2024: quora: reduce needless converters #189 (Merged)
tikkss mentioned this issue Jan 19, 2024: test quora: reduce checking #190 (Merged)
tikkss mentioned this issue Jun 28, 2024: Show N slowest tests test-unit/test-unit#253 (Closed)
tikkss mentioned this issue Aug 30, 2024: test geolonia: reduce checking #202 (Merged)
tikkss mentioned this issue Aug 31, 2024: test diamonds: reduce needless converters #205 (Merged)
tikkss mentioned this issue Sep 2, 2024: test diamonds: reduce checking #206 (Merged)
tikkss mentioned this issue Sep 2, 2024: test house-of-representative: reduce checking #208 (Merged)
tikkss mentioned this issue Sep 3, 2024: test adult-train: reduce checking #211 (Merged)
tikkss mentioned this issue Sep 4, 2024: test house-of-councillor: reduce checking #213 (Merged)
tikkss mentioned this issue Sep 5, 2024: test nagoya-university-conversation-corpus: reduce checking #217 (Merged)
tikkss mentioned this issue Sep 7, 2024: test california-housing: reduce checking #222 (Merged)
tikkss mentioned this issue Sep 8, 2024: test sudachi-synonym-dictionary: reduce checking #225 (Merged)
tikkss mentioned this issue Sep 9, 2024: test adult-test: reduce checking #228 (Merged)
