-
Notifications
You must be signed in to change notification settings - Fork 584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate CMake rules to download and import models #10167
Changes from 4 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
# Copyright 2022 The IREE Authors | ||
# | ||
# Licensed under the Apache License v2.0 with LLVM Exceptions. | ||
# See https://llvm.org/LICENSE.txt for license information. | ||
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
|
||
################################################################################ | ||
# Tests | ||
################################################################################ | ||
|
||
benchmark_tool_py_test( | ||
NAME | ||
cmake_rule_generator_test | ||
SRC | ||
"cmake_rule_generator_test.py" | ||
) |
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
@@ -0,0 +1,204 @@ | ||||
## Copyright 2022 The IREE Authors | ||||
# | ||||
# Licensed under the Apache License v2.0 with LLVM Exceptions. | ||||
# See https://llvm.org/LICENSE.txt for license information. | ||||
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||||
"""Generator that generates CMake rules from python defined benchmarks. | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Zooming out a bit, I'm wondering if CMake is the really right tool for building / running benchmarks. This moves one step away from CMake (generating rules via Python, instead of adding more logic to the rules themselves or authoring them manually), but what could this look like if we took another step in the same direction? What is CMake (a build system / build system generator) actually providing for us here, and would other scripts / tools / build systems do a better job?
Not really provided:
Moving away from CMake entirely seems like a stretch, but here's another idea: what if we made the benchmarks their own CMake project (or even their own repository)? We'd have more flexibility there with defining benchmark-specific options, including dependencies, and organizing the file structure. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm only planning to use CMake to import and build the artifacts, since those have dependencies on the tools built by the build system. The benchmark list will be expressed in JSON and read by benchmark tools or other CI tools. I agree it might be a good idea to have a separate project for benchmarks, but I don't see lots of benefits to do that for now, especially in the separate benchmark project, we need to write some extra CMake rules to find the import/compiler tools from the main IREE project. But it should be fairly easy to move these codes out when we need to. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That's a very weak dependency though - the tools are often built separately.
Finding a few files doesn't seem particularly tricky? This function isn't doing much: iree/build_tools/cmake/iree_macros.cmake Line 144 in a456db6
For local iteration, it doesn't seem that unreasonable to me to require first building the tools, then running a benchmark script with the path to the tools. I don't think having the benchmarks and the compiler tools in the same CMake project adds that much value. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (I think the mechanics of this PR itself are fine, but I'm trying to think through the design space to see if there's a more flexible / easier to maintain configuration that we could be building towards) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think using CMake to manage dependencies between things like tflite -> mlir -> vmfb is good, let's not reinvent dependency management. But I think I agree with Scott that this could be a separate CMake project (doesn't need to be in this PR). The separate project could accept a path to an IREE install directory, just as we do for our crosscompile builds. In fact, it could still import the iree_macros.cmake, I think and so could reuse functions without actually being the same project. OTOH, what do we actually gain from the separate CMake project? We could make all benchmark options dependent options, so they're hidden unless you enable benchmarks, if the concern is just with that. Although I guess if they're a separate project they don't need to be as well namespaced, especially since it's a project we wouldn't need to craft so that someone else could use it in their own build. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If we had a standalone A separate CMake project in the same repository offers some of those benefits, but I think it's useful to consider where the design could eventually end up. Running benchmarks on presubmit using tools and benchmark definitions from the same repository is a good argument to keep them in the same repository, but maybe once we have more programs in the test suite that will get tricky. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think most of the advantages will come from having a separate repository. I can expect that things will get tricky when we start adding more larger benchmarks (or even the benchmarks of other ML frameworks), especially right now everything is under But the new repository introduces the works of a separate CI and problems of sync between benchmarks and the core IREE repo (like how to make sure the changes on both sides won't break each other, since we want to run benchmarks for each commit, not just the nightly). Nevertheless, I feel like this topic is stretching a lot from this PR, should we open a dedicated issue for it? : ) To me the current implementation isn't harmful for now and isn't hard to move out from IREE (mostly just refactoring some CMake rules). Just need to make sure to do the required refactoring/separation before introducing the new benchmarks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
SGTM, thanks for the discussion so far :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Filed the issue #10244 |
||||
|
||||
The rules will build required artifacts to run benchmarks. | ||||
""" | ||||
|
||||
from dataclasses import dataclass | ||||
from typing import List, Optional | ||||
import os | ||||
import pathlib | ||||
import string | ||||
import urllib.parse | ||||
|
||||
from .definitions import common_definitions, iree_definitions | ||||
from . import iree_benchmarks | ||||
|
||||
# Template dir: build_tools/benchmarks/suites/../../cmake | ||||
TEMPLATE_DIR = pathlib.Path(__file__).parents[2] / "cmake" | ||||
pzread marked this conversation as resolved.
Show resolved
Hide resolved
|
||||
DOWNLOAD_ARTIFACT_CMAKE_TEMPLATE = string.Template( | ||||
open(TEMPLATE_DIR / "iree_download_artifact_template.cmake", "r").read()) | ||||
TFLITE_IMPORT_CMAKE_TEMPLATE = string.Template( | ||||
open(TEMPLATE_DIR / "iree_tflite_import_template.cmake", "r").read()) | ||||
TF_IMPORT_CMAKE_TEMPLATE = string.Template( | ||||
open(TEMPLATE_DIR / "iree_tf_import_template.cmake", "r").read()) | ||||
|
||||
|
||||
@dataclass | ||||
class ModelRule(object): | ||||
target_name: str | ||||
file_path: str | ||||
cmake_rule: str | ||||
|
||||
|
||||
@dataclass | ||||
class IreeModelImportRule(object): | ||||
target_name: str | ||||
output_file_path: str | ||||
mlir_dialect_type: str | ||||
cmake_rule: Optional[str] | ||||
|
||||
|
||||
class CommonRuleFactory(object): | ||||
"""Generates common cmake rules.""" | ||||
|
||||
def __init__(self, model_artifacts_dir: str): | ||||
"""Constructs a CommonRuleFactory. | ||||
|
||||
Args: | ||||
model_artifacts_dir: root directory to store model files. Can contain | ||||
CMake variable syntax in the path. | ||||
""" | ||||
self._model_artifacts_dir = model_artifacts_dir | ||||
self._model_rules = {} | ||||
|
||||
def add_model_rule(self, model: common_definitions.Model) -> ModelRule: | ||||
"""Adds a rule to fetch a model. Reuses the existing rule when possible.""" | ||||
if model.id in self._model_rules: | ||||
return self._model_rules[model.id] | ||||
|
||||
# Model target: <package_name>-model-<model_id> | ||||
target_name = f"model-{model.id}" | ||||
|
||||
model_url = urllib.parse.urlparse(model.source_url) | ||||
_, file_ext = os.path.splitext(model_url.path) | ||||
# Model path: <model_artifacts_dir>/<model_id>_<model_name>.<file ext> | ||||
model_path = f"{self._model_artifacts_dir}/{model.id}_{model.name}{file_ext}" | ||||
|
||||
if model_url.scheme == "https": | ||||
cmake_rule = DOWNLOAD_ARTIFACT_CMAKE_TEMPLATE.substitute( | ||||
__TARGET_NAME=target_name, | ||||
__OUTPUT_PATH=model_path, | ||||
__SOURCE_URL=model.source_url) | ||||
else: | ||||
raise ValueError("Unsupported model url: {model.source_url}.") | ||||
|
||||
model_rule = ModelRule(target_name=target_name, | ||||
file_path=model_path, | ||||
cmake_rule=cmake_rule) | ||||
|
||||
self._model_rules[model.id] = model_rule | ||||
return model_rule | ||||
|
||||
def generate_cmake_rules(self) -> List[str]: | ||||
"""Dump all cmake rules in a correct order.""" | ||||
return [rule.cmake_rule for rule in self._model_rules.values()] | ||||
|
||||
|
||||
class IreeRuleFactory(object): | ||||
"""Generates IREE benchmark cmake rules.""" | ||||
|
||||
def __init__(self, iree_artifacts_dir): | ||||
"""Constructs an IreeRuleFactory. | ||||
|
||||
Args: | ||||
iree_artifacts_dir: root directory to store generated IREE artifacts. Can | ||||
contain CMake variable syntax in the path. | ||||
""" | ||||
self._iree_artifacts_dir = iree_artifacts_dir | ||||
self._import_model_rules = {} | ||||
self._compile_module_rules = {} | ||||
self._generate_flagfile_rules = {} | ||||
|
||||
def add_import_model_rule( | ||||
self, | ||||
model_id: str, | ||||
model_name: str, | ||||
model_source_type: common_definitions.ModelSourceType, | ||||
source_model_rule: ModelRule, | ||||
) -> IreeModelImportRule: | ||||
"""Adds a rule to fetch the model and import into MLIR. Reuses the rule when | ||||
possible.""" | ||||
|
||||
if model_id in self._import_model_rules: | ||||
return self._import_model_rules[model_id] | ||||
|
||||
# If the source model is MLIR, no import rule is needed. | ||||
if model_source_type == common_definitions.ModelSourceType.EXPORTED_LINALG_MLIR: | ||||
import_model_rule = IreeModelImportRule( | ||||
target_name=source_model_rule.target_name, | ||||
output_file_path=source_model_rule.file_path, | ||||
mlir_dialect_type="linalg", | ||||
cmake_rule=None) | ||||
self._import_model_rules[model_id] = import_model_rule | ||||
return import_model_rule | ||||
|
||||
# Import target: <package_name>_iree-import-model-<model_id> | ||||
target_name = f"iree-import-model-{model_id}" | ||||
|
||||
# Imported MLIR path: <iree_artifacts_dir>/<model_id>_<model_name>/<model_name>.mlir | ||||
output_file_path = f"{self._iree_artifacts_dir}/{model_id}_{model_name}/{model_name}.mlir" | ||||
|
||||
if model_source_type == common_definitions.ModelSourceType.EXPORTED_TFLITE: | ||||
cmake_rule = TFLITE_IMPORT_CMAKE_TEMPLATE.substitute( | ||||
__TARGET_NAME=target_name, | ||||
__SOURCE_MODEL_PATH=source_model_rule.file_path, | ||||
__OUTPUT_PATH=output_file_path) | ||||
mlir_dialect_type = "tosa" | ||||
elif model_source_type == common_definitions.ModelSourceType.EXPORTED_TF: | ||||
cmake_rule = TF_IMPORT_CMAKE_TEMPLATE.substitute( | ||||
__TARGET_NAME=target_name, | ||||
__SOURCE_MODEL_PATH=source_model_rule.file_path, | ||||
__OUTPUT_PATH=output_file_path) | ||||
mlir_dialect_type = "mhlo" | ||||
else: | ||||
raise ValueError( | ||||
f"Unsupported source type '{model_source_type}' of the model '{model_id}'." | ||||
) | ||||
|
||||
import_model_rule = IreeModelImportRule(target_name=target_name, | ||||
output_file_path=output_file_path, | ||||
mlir_dialect_type=mlir_dialect_type, | ||||
cmake_rule=cmake_rule) | ||||
|
||||
self._import_model_rules[model_id] = import_model_rule | ||||
return import_model_rule | ||||
|
||||
def generate_cmake_rules(self) -> List[str]: | ||||
"""Dump all cmake rules in a correct order.""" | ||||
import_model_rules = [ | ||||
rule.cmake_rule for rule in self._import_model_rules.values() | ||||
] | ||||
return import_model_rules | ||||
|
||||
|
||||
def _generate_iree_benchmark_rules(common_rule_factory: CommonRuleFactory, | ||||
iree_artifacts_dir: str) -> List[str]: | ||||
iree_rule_factory = IreeRuleFactory(iree_artifacts_dir) | ||||
compile_specs, _ = iree_benchmarks.Linux_x86_64_Benchmarks.generate() | ||||
for compile_spec in compile_specs: | ||||
model = compile_spec.model | ||||
|
||||
source_model_rule = common_rule_factory.add_model_rule(model) | ||||
iree_rule_factory.add_import_model_rule(model_id=model.id, | ||||
model_name=model.name, | ||||
model_source_type=model.source_type, | ||||
source_model_rule=source_model_rule) | ||||
|
||||
# TODO(pzread): Generate the compilation and run rules. | ||||
|
||||
return iree_rule_factory.generate_cmake_rules() | ||||
|
||||
|
||||
def generate_benchmark_rules(model_artifacts_dir: str, | ||||
iree_artifacts_dir: str) -> List[str]: | ||||
"""Generates cmake rules for all benchmarks. | ||||
|
||||
Args: | ||||
model_artifacts_dir: root directory to store model files. Can contain CMake | ||||
variable syntax in the path. | ||||
iree_artifacts_dir: root directory to store generated IREE artifacts. Can | ||||
contain CMake variable syntax in the path. | ||||
Returns: | ||||
List of CMake rules. | ||||
""" | ||||
common_rule_factory = CommonRuleFactory(model_artifacts_dir) | ||||
iree_rules = _generate_iree_benchmark_rules(common_rule_factory, | ||||
iree_artifacts_dir) | ||||
# Currently the rules are simple so the common rules can be always put at the | ||||
# top. Need a topological sort once the dependency gets complicated. | ||||
return common_rule_factory.generate_cmake_rules() + iree_rules |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
#!/usr/bin/env python3 | ||
## Copyright 2022 The IREE Authors | ||
# | ||
# Licensed under the Apache License v2.0 with LLVM Exceptions. | ||
# See https://llvm.org/LICENSE.txt for license information. | ||
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
|
||
from suites.definitions import common_definitions | ||
from suites import cmake_rule_generator | ||
import unittest | ||
|
||
|
||
class CommonRuleFactoryTest(unittest.TestCase): | ||
|
||
def setUp(self): | ||
self._factory = cmake_rule_generator.CommonRuleFactory("root/models") | ||
|
||
def test_add_model_rule(self): | ||
model = common_definitions.Model( | ||
id="1234", | ||
name="abcd", | ||
tags=[], | ||
source_type=common_definitions.ModelSourceType.EXPORTED_TFLITE, | ||
source_url="https://example.com/xyz.tflite", | ||
entry_function="main", | ||
input_types=["1xf32"]) | ||
|
||
rule = self._factory.add_model_rule(model) | ||
|
||
self.assertEqual(rule.target_name, "model-1234") | ||
self.assertEqual(rule.file_path, "root/models/1234_abcd.tflite") | ||
|
||
def test_generate_cmake_rules(self): | ||
model_1 = common_definitions.Model( | ||
id="1234", | ||
name="abcd", | ||
tags=[], | ||
source_type=common_definitions.ModelSourceType.EXPORTED_TFLITE, | ||
source_url="https://example.com/xyz.tflite", | ||
entry_function="main", | ||
input_types=["1xf32"]) | ||
model_2 = common_definitions.Model( | ||
id="5678", | ||
name="abcd", | ||
tags=[], | ||
source_type=common_definitions.ModelSourceType.EXPORTED_TFLITE, | ||
source_url="https://example.com/xyz.tflite", | ||
entry_function="main", | ||
input_types=["1xf32"]) | ||
rule_1 = self._factory.add_model_rule(model_1) | ||
rule_2 = self._factory.add_model_rule(model_2) | ||
|
||
rules = self._factory.generate_cmake_rules() | ||
|
||
self.assertEqual(len(rules), 2) | ||
self.assertRegex(rules[0], rule_1.target_name) | ||
self.assertRegex(rules[1], rule_2.target_name) | ||
|
||
|
||
class IreeRuleFactoryTest(unittest.TestCase): | ||
|
||
def setUp(self): | ||
self._factory = cmake_rule_generator.IreeRuleFactory("root/iree") | ||
|
||
def test_add_import_model_rule_import_model(self): | ||
rule = self._factory.add_import_model_rule( | ||
model_id="1234", | ||
model_name="abcd", | ||
model_source_type=common_definitions.ModelSourceType.EXPORTED_TFLITE, | ||
source_model_rule=cmake_rule_generator.ModelRule( | ||
target_name="model-1234", file_path="aaa", cmake_rule="bbb")) | ||
|
||
self.assertEqual(rule.target_name, "iree-import-model-1234") | ||
self.assertEqual(rule.mlir_dialect_type, "tosa") | ||
self.assertEqual(rule.output_file_path, "root/iree/1234_abcd/abcd.mlir") | ||
|
||
def test_add_import_model_rule_forward_mlir(self): | ||
model_rule = cmake_rule_generator.ModelRule( | ||
target_name="model-1234", | ||
file_path="root/models/1234.mlir", | ||
cmake_rule="bbb") | ||
|
||
rule = self._factory.add_import_model_rule( | ||
model_id="1234", | ||
model_name="abcd", | ||
model_source_type=common_definitions.ModelSourceType. | ||
EXPORTED_LINALG_MLIR, | ||
source_model_rule=model_rule) | ||
|
||
self.assertEqual(rule.target_name, model_rule.target_name) | ||
self.assertEqual(rule.mlir_dialect_type, "linalg") | ||
self.assertEqual(rule.output_file_path, model_rule.file_path) | ||
|
||
def test_generate_cmake_rules(self): | ||
rule_1 = self._factory.add_import_model_rule( | ||
model_id="1234", | ||
model_name="abcd", | ||
model_source_type=common_definitions.ModelSourceType.EXPORTED_TFLITE, | ||
source_model_rule=cmake_rule_generator.ModelRule( | ||
target_name="model-1234", file_path="aaa", cmake_rule="bbb")) | ||
rule_2 = self._factory.add_import_model_rule( | ||
model_id="5678", | ||
model_name="efgh", | ||
model_source_type=common_definitions.ModelSourceType.EXPORTED_TF, | ||
source_model_rule=cmake_rule_generator.ModelRule( | ||
target_name="model-5678", file_path="ccc", cmake_rule="eee")) | ||
|
||
rules = self._factory.generate_cmake_rules() | ||
|
||
self.assertEqual(len(rules), 2) | ||
self.assertRegex(rules[0], rule_1.target_name) | ||
self.assertRegex(rules[1], rule_2.target_name) | ||
|
||
|
||
if __name__ == "__main__": | ||
unittest.main() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Fetch the model from "$__SOURCE_URL" | ||
add_custom_command( | ||
OUTPUT "$__OUTPUT_PATH" | ||
COMMAND | ||
"$${Python3_EXECUTABLE}" "$${IREE_ROOT_DIR}/build_tools/scripts/download_file.py" | ||
"$__SOURCE_URL" -o "$__OUTPUT_PATH" | ||
DEPENDS | ||
"$${IREE_ROOT_DIR}/build_tools/scripts/download_file.py" | ||
COMMENT "Downloading $__SOURCE_URL" | ||
) | ||
add_custom_target( | ||
"$${_PACKAGE_NAME}_$__TARGET_NAME" | ||
DEPENDS | ||
"$__OUTPUT_PATH" | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Import the Tensorflow model "$__SOURCE_MODEL_PATH" | ||
iree_import_tf_model( | ||
TARGET_NAME "$${_PACKAGE_NAME}_$__TARGET_NAME" | ||
SOURCE "$__SOURCE_MODEL_PATH" | ||
OUTPUT_MLIR_FILE "$__OUTPUT_PATH" | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Import the TFLite model "$__SOURCE_MODEL_PATH" | ||
iree_import_tflite_model( | ||
TARGET_NAME "$${_PACKAGE_NAME}_$__TARGET_NAME" | ||
SOURCE "$__SOURCE_MODEL_PATH" | ||
OUTPUT_MLIR_FILE "$__OUTPUT_PATH" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I might have missed this in an earlier review / design doc)
Are the generated CMake rules going to be checked in, or just materialized when building benchmarks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't fully decided yet. But I believe it's possible to generate and include a CMakefile during the configuration? I'll prefer not to check in the generated CMakefile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. seems tricky but possible: https://stackoverflow.com/questions/52556785/code-generator-generating-its-own-cmake-files-and-targets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm that seems a bit scary into deep CMake land, which is not a nice land