CM/CMX v3.5.2 (#1377)

- added `-i` flag to print memory and disk use before running CM/CMX commands: #1375
- added utils.get_disk_use
- added utils.get_memory_use
- formatted Python modules from the internal repository using autopep8
mlcommons · Dec 20, 2024 · commit b607edb
2 parents: 7f66e24 + 2dd17a6

Showing 10 changed files with 236 additions and 14 deletions.
diff --git a/.github/workflows/test-cm.yml b/.github/workflows/test-cm.yml
@@ -16,7 +16,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.7", "3.8", "3.9", "3.10", "3.11", "3.12"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
         on: [ubuntu-latest, windows-latest, macos-latest]
         exclude:
           - python-version: "3.7"

diff --git a/HISTORY.CM.md b/HISTORY.CM.md
@@ -0,0 +1,91 @@
+This document narrates the history of the creation and design of CM and CM4MLOps (also known as CK2) 
+by [Grigori Fursin](https://cKnowledge.org/gfursin). It also highlights the donation of this open-source technology to MLCommons, 
+aimed at benefiting the broader community and fostering its ongoing development as a collaborative, community-driven initiative:
+
+* Jan 28, 2021: After delivering an invited ACM TechTalk'21 about the Collective Knowledge framework (CK1) 
+  and reproducibility initiatives for conferences, as well as CK-MLOps and MLPerf automations, 
+  Grigori received lots of positive feedback and suggestions for improvements to workflow automations:
+  https://learning.acm.org/techtalks/reproducibility. 
+
+  Following this, Grigori began prototyping CK2 (later CM) to streamline CK1, CK-MLOps and MLPerf benchmarking. 
+  The goal was to dramatically simplify CK1 workflows by introducing just a few core and portable automations, 
+  which eventually evolved into `CM script` and `CM cache`.
+
+  At that time, the cTuning foundation hosted CK1 and all the prototypes for the CM framework at https://github.com/ctuning/ck:
+  [ref1](https://github.com/mlcommons/ck/commit/9e57934f4999db23052531e92160772ab831463a), 
+  [ref2](https://github.com/mlcommons/ck/tree/9e57934f4999db23052531e92160772ab831463a),
+  [ref3](https://github.com/mlcommons/ck/tree/9e57934f4999db23052531e92160772ab831463a/incubator).
+
+* Sep 23, 2021: donated CK1, CK-MLOps, MLPerf automations and early prototypes of CM from the cTuning repository to MLCommons:
+  [ref1](https://web.archive.org/web/20240803140223/https://octo.ai/blog/octoml-joins-the-community-effort-to-democratize-mlperf-inference-benchmarking),
+  [ref2](https://github.com/mlcommons/ck/tree/228f80b0bf44610c8244ff0c3f6bec5bbd25aa6c/incubator),
+  [ref3](https://github.com/mlcommons/ck/tree/695c3843fd8121bbdde6c453cd6ec9503986b0c6?tab=readme-ov-file#author-and-coordinator),
+  [ref4](https://github.com/mlcommons/ck/tree/master/ck),
+  [ref5](https://github.com/mlcommons/ck-mlops).
+
+* Mar 1, 2022: started developing cm-mlops: [ref](https://github.com/octoml/cm-mlops/commit/0ae94736a420dfa84f7417fc62d323303b8760c6).
+
+* Mar 24, 2022: after successfully stabilizing the initial prototype of CM, donated it to MLCommons to benefit the entire community:
+  [ref1](https://github.com/mlcommons/ck/tree/c7918ad544f26b6c499c2fc9c07431a9640fca5a/ck2), 
+  [ref2](https://github.com/mlcommons/ck/tree/c7918ad544f26b6c499c2fc9c07431a9640fca5a/ck2#coordinators),
+  [ref3](https://github.com/mlcommons/ck/commit/3c146cb3c75a015363f7a96758adf6dcc43032d6),
+  [ref4](https://github.com/mlcommons/ck/commit/3c146cb3c75a015363f7a96758adf6dcc43032d6#diff-d97f0f6f5a32f16d6ed18b9600ffc650f7b25512685f7a2373436c492c6b52b3R48).
+
+* Apr 6, 2022: started transitioning previous MLOps and MLPerf automations from the mlcommons/ck-mlops format 
+  to the new CM format using the cm-mlops repository (later renamed to cm4mlops):
+  [ref1](https://github.com/octoml/cm-mlops/commit/d1efdc30fb535ce144020d4e88f3ed768c933176),
+  [ref2](https://github.com/octoml/cm-mlops/blob/d1efdc30fb535ce144020d4e88f3ed768c933176/CONTRIBUTIONS).
+
+* Apr 22, 2022: began architecting "Intelligent Components" in the CM-MLOps repository, 
+  which were later renamed to `CM Script`:
+  [ref1](https://github.com/octoml/cm-mlops/commit/b335c609c47d2c547afe174d9df232652d57f4f8),
+  [ref2](https://github.com/octoml/cm-mlops/tree/b335c609c47d2c547afe174d9df232652d57f4f8),
+  [ref3](https://github.com/octoml/cm-mlops/blob/b335c609c47d2c547afe174d9df232652d57f4f8/CONTRIBUTIONS).
+
+  At the same time, prototyped other core CM automations, including IC, Docker, and Experiment:
+  [ref1](https://github.com/octoml/cm-mlops/tree/b335c609c47d2c547afe174d9df232652d57f4f8/automation),
+  [ref2](https://github.com/mlcommons/ck/commits/master/?before=7f66e2438bfe21b4ce2d08326a5168bb9e3132f6+7001).
+
+* Apr 28, 2022: donated CM-MLOps (later renamed to CM4MLOps) to MLCommons:
+  [ref](https://github.com/mlcommons/ck/commit/456e4861056c0e39c4d689c03da91f90a44be058).
+
+* May 9, 2022: developed the initial set of core IC automations for MLOps (aka CM scripts):
+  [ref1](https://github.com/octoml/cm-mlops/commit/4a4a027f4088ce7e7abcec29c39d98981bf09d4c),
+  [ref2](https://github.com/octoml/cm-mlops/tree/4a4a027f4088ce7e7abcec29c39d98981bf09d4c),
+  [ref3](https://github.com/octoml/cm-mlops/blob/7692240becd6397a96c3975388913ea082002e7a/CONTRIBUTIONS).
+
+* May 11, 2022: After successfully prototyping CM and CM-MLOps, deprecated the CK1 framework in favor of CM. 
+  Welcomed Arjun as a maintainer and tester for CM and CM-MLOps:
+  [ref](https://github.com/octoml/cm-mlops/blob/17405833665bc1e93820f9ff76deb28a0f543bdb/CONTRIBUTIONS).
+
+  Created a [file](https://github.com/mlcommons/ck/blob/master/cm-mlops/CHANGES.md) 
+  to document and track our public developments at MLCommons.
+
+* Jun 8, 2022: renamed the 'IC' automation to the more intuitive 'CM script' automation:
+  [ref1](https://github.com/mlcommons/ck/tree/5ca4e2c33e58a660ac20a545d8aa5143ab6e8e81/cm-devops/automation/script),
+  [ref2](https://github.com/mlcommons/ck/tree/5ca4e2c33e58a660ac20a545d8aa5143ab6e8e81),
+  [ref3](https://github.com/octoml/cm-mlops/commit/7910fb7ffc62a617d987d2f887d6f9981ff80187).
+
+* Jun 16, 2022: prototyped the `CM cache` automation to facilitate caching and reuse of the outputs from CM scripts:
+  [ref1](https://github.com/mlcommons/ck/commit/1f81aae8cebd5567ec4ca55f693beaf32b49fb48),
+  [ref2](https://github.com/mlcommons/ck/tree/1f81aae8cebd5567ec4ca55f693beaf32b49fb48),
+  [ref3](https://github.com/mlcommons/ck/tree/1f81aae8cebd5567ec4ca55f693beaf32b49fb48?tab=readme-ov-file#contacts).
+
+* Sep 6, 2022: delivered CM demo to run MLPerf while deprecating CK1 automations for MLPerf:
+  [ref1](https://github.com/mlcommons/ck/commit/2c5d5c5c944ae5f252113c62af457c7a4c5e877a#diff-faac2c4ecfd0bfb928dafc938d3dad5651762fbb504a2544752a337294ee2573R224),
+  [ref2](https://github.com/mlcommons/ck/blob/2c5d5c5c944ae5f252113c62af457c7a4c5e877a/CONTRIBUTING.md#author-and-coordinator).
+
+  Welcomed Arjun Suresh as a contributor to CM: [ref](https://github.com/mlcommons/ck/blob/2c5d5c5c944ae5f252113c62af457c7a4c5e877a/CONTRIBUTING.md#contributors-in-alphabetical-order).
+
+* From September 2022: coordinated community development of CM and CM4MLOps 
+  to [modularize and automate MLPerf](https://docs.mlcommons.org/inference)
+  and support [reproducibility initiatives at ML and Systems conferences](https://cTuning.org/ae) 
+  through the MLCommons Task Force on Automation and Reproducibility.
+
+* Starting in April 2024, began the gradual transfer of ongoing maintenance and enhancement 
+  responsibilities for CM and CM4MLOps, including MLPerf automations, to MLCommons.
+  Welcomed Anandhu Sooraj as a maintainer and contributor to CM4MLOps with MLPerf automations.
+
+For more details, please refer to this [white paper](https://arxiv.org/abs/2406.16791) 
+and [ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339).
+
diff --git a/cm/CHANGES.md b/cm/CHANGES.md
@@ -1,4 +1,7 @@
-## V3.5.1.1
+## V3.5.2
+   - added `-i` flag to print memory and disk use before running CM/CMX commands: 
+     https://github.com/mlcommons/ck/issues/1375
+   - added utils.get_disk_use
    - added utils.get_memory_use
    - formatted Python modules from the internal repository using autopep8
 

diff --git a/cm/HISTORY.md b/cm/HISTORY.md
@@ -0,0 +1,91 @@
+This document narrates the history of the creation and design of CM and CM4MLOps (also known as CK2) 
+by [Grigori Fursin](https://cKnowledge.org/gfursin). It also highlights the donation of this open-source technology to MLCommons, 
+aimed at benefiting the broader community and fostering its ongoing development as a collaborative, community-driven initiative:
+
+* Jan 28, 2021: After delivering an invited ACM TechTalk'21 about the Collective Knowledge framework (CK1) 
+  and reproducibility initiatives for conferences, as well as CK-MLOps and MLPerf automations, 
+  Grigori received lots of positive feedback and suggestions for improvements to workflow automations:
+  https://learning.acm.org/techtalks/reproducibility. 
+
+  Following this, Grigori began prototyping CK2 (later CM) to streamline CK1, CK-MLOps and MLPerf benchmarking. 
+  The goal was to dramatically simplify CK1 workflows by introducing just a few core and portable automations, 
+  which eventually evolved into `CM script` and `CM cache`.
+
+  At that time, the cTuning foundation hosted CK1 and all the prototypes for the CM framework at https://github.com/ctuning/ck:
+  [ref1](https://github.com/mlcommons/ck/commit/9e57934f4999db23052531e92160772ab831463a), 
+  [ref2](https://github.com/mlcommons/ck/tree/9e57934f4999db23052531e92160772ab831463a),
+  [ref3](https://github.com/mlcommons/ck/tree/9e57934f4999db23052531e92160772ab831463a/incubator).
+
+* Sep 23, 2021: donated CK1, CK-MLOps, MLPerf automations and early prototypes of CM from the cTuning repository to MLCommons:
+  [ref1](https://web.archive.org/web/20240803140223/https://octo.ai/blog/octoml-joins-the-community-effort-to-democratize-mlperf-inference-benchmarking),
+  [ref2](https://github.com/mlcommons/ck/tree/228f80b0bf44610c8244ff0c3f6bec5bbd25aa6c/incubator),
+  [ref3](https://github.com/mlcommons/ck/tree/695c3843fd8121bbdde6c453cd6ec9503986b0c6?tab=readme-ov-file#author-and-coordinator),
+  [ref4](https://github.com/mlcommons/ck/tree/master/ck),
+  [ref5](https://github.com/mlcommons/ck-mlops).
+
+* Mar 1, 2022: started developing cm-mlops: [ref](https://github.com/octoml/cm-mlops/commit/0ae94736a420dfa84f7417fc62d323303b8760c6).
+
+* Mar 24, 2022: after successfully stabilizing the initial prototype of CM, donated it to MLCommons to benefit the entire community:
+  [ref1](https://github.com/mlcommons/ck/tree/c7918ad544f26b6c499c2fc9c07431a9640fca5a/ck2), 
+  [ref2](https://github.com/mlcommons/ck/tree/c7918ad544f26b6c499c2fc9c07431a9640fca5a/ck2#coordinators),
+  [ref3](https://github.com/mlcommons/ck/commit/3c146cb3c75a015363f7a96758adf6dcc43032d6),
+  [ref4](https://github.com/mlcommons/ck/commit/3c146cb3c75a015363f7a96758adf6dcc43032d6#diff-d97f0f6f5a32f16d6ed18b9600ffc650f7b25512685f7a2373436c492c6b52b3R48).
+
+* Apr 6, 2022: started transitioning previous MLOps and MLPerf automations from the mlcommons/ck-mlops format 
+  to the new CM format using the cm-mlops repository (later renamed to cm4mlops):
+  [ref1](https://github.com/octoml/cm-mlops/commit/d1efdc30fb535ce144020d4e88f3ed768c933176),
+  [ref2](https://github.com/octoml/cm-mlops/blob/d1efdc30fb535ce144020d4e88f3ed768c933176/CONTRIBUTIONS).
+
+* Apr 22, 2022: began architecting "Intelligent Components" in the CM-MLOps repository, 
+  which were later renamed to `CM Script`:
+  [ref1](https://github.com/octoml/cm-mlops/commit/b335c609c47d2c547afe174d9df232652d57f4f8),
+  [ref2](https://github.com/octoml/cm-mlops/tree/b335c609c47d2c547afe174d9df232652d57f4f8),
+  [ref3](https://github.com/octoml/cm-mlops/blob/b335c609c47d2c547afe174d9df232652d57f4f8/CONTRIBUTIONS).
+
+  At the same time, prototyped other core CM automations, including IC, Docker, and Experiment:
+  [ref1](https://github.com/octoml/cm-mlops/tree/b335c609c47d2c547afe174d9df232652d57f4f8/automation),
+  [ref2](https://github.com/mlcommons/ck/commits/master/?before=7f66e2438bfe21b4ce2d08326a5168bb9e3132f6+7001).
+
+* Apr 28, 2022: donated CM-MLOps (later renamed to CM4MLOps) to MLCommons:
+  [ref](https://github.com/mlcommons/ck/commit/456e4861056c0e39c4d689c03da91f90a44be058).
+
+* May 9, 2022: developed the initial set of core IC automations for MLOps (aka CM scripts):
+  [ref1](https://github.com/octoml/cm-mlops/commit/4a4a027f4088ce7e7abcec29c39d98981bf09d4c),
+  [ref2](https://github.com/octoml/cm-mlops/tree/4a4a027f4088ce7e7abcec29c39d98981bf09d4c),
+  [ref3](https://github.com/octoml/cm-mlops/blob/7692240becd6397a96c3975388913ea082002e7a/CONTRIBUTIONS).
+
+* May 11, 2022: After successfully prototyping CM and CM-MLOps, deprecated the CK1 framework in favor of CM. 
+  Welcomed Arjun as a maintainer and tester for CM and CM-MLOps:
+  [ref](https://github.com/octoml/cm-mlops/blob/17405833665bc1e93820f9ff76deb28a0f543bdb/CONTRIBUTIONS).
+
+  Created a [file](https://github.com/mlcommons/ck/blob/master/cm-mlops/CHANGES.md) 
+  to document and track our public developments at MLCommons.
+
+* Jun 8, 2022: renamed the 'IC' automation to the more intuitive 'CM script' automation:
+  [ref1](https://github.com/mlcommons/ck/tree/5ca4e2c33e58a660ac20a545d8aa5143ab6e8e81/cm-devops/automation/script),
+  [ref2](https://github.com/mlcommons/ck/tree/5ca4e2c33e58a660ac20a545d8aa5143ab6e8e81),
+  [ref3](https://github.com/octoml/cm-mlops/commit/7910fb7ffc62a617d987d2f887d6f9981ff80187).
+
+* Jun 16, 2022: prototyped the `CM cache` automation to facilitate caching and reuse of the outputs from CM scripts:
+  [ref1](https://github.com/mlcommons/ck/commit/1f81aae8cebd5567ec4ca55f693beaf32b49fb48),
+  [ref2](https://github.com/mlcommons/ck/tree/1f81aae8cebd5567ec4ca55f693beaf32b49fb48),
+  [ref3](https://github.com/mlcommons/ck/tree/1f81aae8cebd5567ec4ca55f693beaf32b49fb48?tab=readme-ov-file#contacts).
+
+* Sep 6, 2022: delivered CM demo to run MLPerf while deprecating CK1 automations for MLPerf:
+  [ref1](https://github.com/mlcommons/ck/commit/2c5d5c5c944ae5f252113c62af457c7a4c5e877a#diff-faac2c4ecfd0bfb928dafc938d3dad5651762fbb504a2544752a337294ee2573R224),
+  [ref2](https://github.com/mlcommons/ck/blob/2c5d5c5c944ae5f252113c62af457c7a4c5e877a/CONTRIBUTING.md#author-and-coordinator).
+
+  Welcomed Arjun Suresh as a contributor to CM: [ref](https://github.com/mlcommons/ck/blob/2c5d5c5c944ae5f252113c62af457c7a4c5e877a/CONTRIBUTING.md#contributors-in-alphabetical-order).
+
+* From September 2022: coordinated community development of CM and CM4MLOps 
+  to [modularize and automate MLPerf](https://docs.mlcommons.org/inference)
+  and support [reproducibility initiatives at ML and Systems conferences](https://cTuning.org/ae) 
+  through the MLCommons Task Force on Automation and Reproducibility.
+
+* Starting in April 2024, began the gradual transfer of ongoing maintenance and enhancement 
+  responsibilities for CM and CM4MLOps, including MLPerf automations, to MLCommons.
+  Welcomed Anandhu Sooraj as a maintainer and contributor to CM4MLOps with MLPerf automations.
+
+For more details, please refer to this [white paper](https://arxiv.org/abs/2406.16791) 
+and [ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339).
+
diff --git a/cm/cmind/__init__.py b/cm/cmind/__init__.py
@@ -2,7 +2,7 @@
 #
 # Written by Grigori Fursin
 
-__version__ = "3.5.1.1"
+__version__ = "3.5.2"
 
 from cmind.core import access
 from cmind.core import x

diff --git a/cm/cmind/core.py b/cm/cmind/core.py
@@ -858,7 +858,7 @@ def x(self, i, out = None):
           'h', 'help', 'version', 'out', 'j', 'json', 
           'save_to_json_file', 'save_to_yaml_file', 'common', 
           'ignore_inheritance', 'log', 'logfile', 'raise', 'repro',
-          'f', 'time', 'profile']]
+          'i', 'f', 'time', 'profile']]
 
         delayed_error = ''
 
@@ -882,6 +882,8 @@ def x(self, i, out = None):
 
         self_profile = control.get('profile', False)
 
+        self_info = control.get('i', False)
+
         # Check repro
         use_log = str(control_flags.pop('log', '')).strip().lower()
         log_file = control_flags.pop('logfile', '')
@@ -955,6 +957,13 @@ def x(self, i, out = None):
             self.log(f"x input: {spaces} ({i})", "debug")
 
         # Call access helper
+        if not x_was_called and self_info:
+            utils.get_memory_use(True)
+            print ('')
+            utils.get_disk_use('/', True)
+            print ('')
+
+
         if not x_was_called and self_profile:
             # https://docs.python.org/3/library/profile.html#module-cProfile
             import cProfile, pstats, io
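
The hunks above register `i` as a control flag and, when it is set, print memory and disk use before the command runs. A minimal Python sketch of that gating pattern follows. `CONTROL_FLAGS`, `split_control`, `run` and the `report` callback are hypothetical names used only for illustration; they are not part of `cmind`, whose actual flag handling may differ.

```python
# Hypothetical subset of the control-flag names listed in the diff.
CONTROL_FLAGS = {"i", "f", "time", "profile"}


def split_control(i):
    """Separate control flags from the remaining action input."""
    control = {k: v for k, v in i.items() if k in CONTROL_FLAGS}
    rest = {k: v for k, v in i.items() if k not in CONTROL_FLAGS}
    return control, rest


def run(i, action, report=print):
    """Call `action` on the non-control input, reporting first if 'i' is set."""
    control, rest = split_control(i)

    # Mirrors `self_info = control.get('i', False)` in the diff.
    if control.get("i", False):
        report("memory and disk use would be printed here")

    return action(rest)


messages = []
result = run({"i": True, "tags": "demo"}, lambda rest: rest, messages.append)
```

Passing the reporter as a parameter keeps the sketch testable; the commit itself prints directly to the console.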

diff --git a/cm/cmind/utils.py b/cm/cmind/utils.py
@@ -2219,3 +2219,41 @@ def get_memory_use(console = False):
                         'total_memory': total_memory,
                         'total_memory_gb': total_memory_gb}
 
+##############################################################################
+def get_disk_use(path = '/', console = False):
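+    """
+    Get disk space use for a given path
+
+    Args:
+        path (str): path on the file system to check ('/' by default)
+        console (bool): if True, print to console
+
+    Returns:
+       total (int): total space in bytes
+       total_gb (float): total space in GB (1e9 bytes)
+       used (int): used space in bytes
+       used_gb (float): used space in GB
+       free (int): free space in bytes
+       free_gb (float): free space in GB
+    """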
+
+    import shutil
+
+    total, used, free = shutil.disk_usage(path)
+
+    total_gb = total / 1e9
+    used_gb = used / 1e9
+    free_gb = free / 1e9
+
+    if console:
+        print(f"Total disk space: {total_gb:.2f} GB")
+        print(f"Used disk space: {used_gb:.2f} GB")
+        print(f"Free disk space: {free_gb:.2f} GB")
+
+    return {'return':0, 
+            'total': total,
+            'total_gb': total_gb,
+            'used': used,
+            'used_gb': used_gb,
+            'free': free,
+            'free_gb': free_gb}
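
The new helper is a thin wrapper around the standard-library call `shutil.disk_usage`. The sketch below reproduces the same calculation outside `cmind`, assuming only the standard library; `disk_use_summary` is a hypothetical name. As in the diff, sizes are reported in decimal gigabytes (1e9 bytes), not GiB (2**30 bytes).

```python
import shutil


def disk_use_summary(path="/"):
    """Return total, used and free space for `path` in bytes and decimal GB."""
    total, used, free = shutil.disk_usage(path)

    return {
        "return": 0,
        "total": total,
        "total_gb": total / 1e9,
        "used": used,
        "used_gb": used / 1e9,
        "free": free,
        "free_gb": free / 1e9,
    }


# Query the file system that holds the current directory.
info = disk_use_summary(".")
print(f"Free disk space: {info['free_gb']:.2f} GB")
```

On some file systems `used + free` is less than `total` because part of the space is reserved, so the three values are best read independently.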
diff --git a/cm4abtf/README.md b/cm4abtf/README.md
diff --git a/cm4mlops/README.md b/cm4mlops/README.md
diff --git a/cm4mlperf/README.md b/cm4mlperf/README.md