Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update root-cause-analysis-model-card.md #1684

Merged
merged 7 commits into from
May 16, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 8 additions & 72 deletions models/model-cards/root-cause-analysis-model-card.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,63 +21,49 @@ limitations under the License.
# Model Overview

## Description:

* Root cause analysis is a binary classifier differentiating between ordinary logs and errors/problems/root causes in the log files. <br>

## References(s):

* Devlin J. et al. (2018), BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805 <br>

## Model Architecture:

**Architecture Type:**

* Transformers <br>

**Network Architecture:**

* BERT <br>

## Input: (Enter "None" As Needed)

**Input Format:**

* CSV <br>

**Input Parameters:**

* kern.log file contents <br>

**Other Properties Related to Output:**

* N/A <br>

## Output: (Enter "None" As Needed)

**Output Format:**

* Binary Results, Root Cause or Ordinary <br>

**Output Parameters:**

* N/A <br>

**Other Properties Related to Output:**

* N/A <br>

## Software Integration:

**Runtime(s):**

* Morpheus <br>

**Supported Hardware Platform(s):** <br>

* Ampere/Turing <br>

**Supported Operating System(s):** <br>

* Linux <br>

## Model Version(s):
Expand All @@ -88,67 +74,31 @@ limitations under the License.
## Training Dataset:

**Link:**

* https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/models/datasets/training-data/root-cause-training-data.csv <br>

**Properties (Quantity, Dataset Descriptions, Sensor(s)):**

* kern.log files from DGX machines <br>

## Evaluation Dataset:

**Link:**

* https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/models/datasets/validation-data/root-cause-validation-data-input.jsonlines <br>

**Properties (Quantity, Dataset Descriptions, Sensor(s)):**

* kern.log files from DGX machines <br>

## Inference:

**Engine:**

* Triton <br>

**Test Hardware:** <br>

* Other <br>

# Subcards

## Model Card ++ Bias Subcard

### What is the gender balance of the model validation data?
* Not Applicable

### What is the racial/ethnicity balance of the model validation data?
* Not Applicable

### What is the age balance of the model validation data?
* Not Applicable

### What is the language balance of the model validation data?
* Not Applicable

### What is the geographic origin language balance of the model validation data?
* Not Applicable

### What is the educational background balance of the model validation data?
* Not Applicable

### What is the accent balance of the model validation data?
* Not Applicable

### What is the face/key point balance of the model validation data?
* Not Applicable

### What is the skin/tone balance of the model validation data?
* Not Applicable

### What is the religion balance of the model validation data?
* Not Applicable

### Individuals from the following adversely impacted (protected classes) groups participate in model design and testing.
* Not Applicable

Expand All @@ -160,26 +110,24 @@ limitations under the License.
### Name example applications and use cases for this model.
* The model is primarily designed for testing purposes and serves as a small pre-trained model specifically used to evaluate and validate the Root Cause Analysis pipeline. This model is an example of customized transformer-based root cause analysis. It can be used for pipeline testing purposes. It needs to be re-trained for specific root cause analysis or predictive maintenance needs with the fine-tuning scripts in the repo. The hyperparameters can be optimised to adjust to get the best results with another dataset. The aim is to get the model to predict some false positives that could be previously unknown error types. Users can use this root cause analysis approach with other log types too. If they have known failures in their logs, they can use them to train along with ordinary logs and can detect other root causes they weren't aware of before.

### Fill in the blank for the model technique.

### Intended Users.
* This model is designed for developers seeking to test the root cause analysis pipeline with a small pre-trained model trained on a very small `kern.log` file from a DGX.

### Name who is intended to benefit from this model.

* The intended beneficiaries of this model are developers who aim to test the functionality of the DFP pipeline using synthetic datasets

### Describe the model output.
* This model output can be used as a binary result, Root cause or Ordinary

### List the steps explaining how this model works.
### Describe how this model works.
* A BERT model gets fine-tuned with the kern.log dataset and in the inference it predicts one of the binary classes. Root cause or Ordinary.

### Name the adversely impacted groups (protected classes) this has been tested to deliver comparable outcomes regardless of:
* Not Applicable

### List the technical limitations of the model.
* For different log types and content, different models need to be trained.

### Has this been verified to have met prescribed NVIDIA quality standards?
* Yes

### What performance metrics were used to affirm the model's performance?
* F1

Expand All @@ -195,10 +143,7 @@ limitations under the License.
### Link the location of the training dataset's repository.
* https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/models/datasets/training-data/root-cause-training-data.csv

### Is the model used in an application with physical safety impact?
* No

### Describe physical safety impact (if present).
### Describe the life critical impact (if present).
* None

### Was model and dataset assessed for vulnerability for potential form of attack?
Expand All @@ -210,12 +155,6 @@ limitations under the License.
### Name use case restrictions for the model.
* Different models need to be trained depending on the log types.

### Has this been verified to have met prescribed quality standards?
* No

### Name target quality Key Performance Indicators (KPIs) for which this has been tested.
* N/A

### Is the model and dataset compliant with National Classification Management Society (NCMS)?
* No

Expand All @@ -232,7 +171,7 @@ limitations under the License.


### Generatable or reverse engineerable personally-identifiable information (PII)?
* Neither
* None

### Was consent obtained for any PII used?
* N/A
Expand All @@ -249,12 +188,9 @@ limitations under the License.
### If PII collected for the development of this AI model, was it minimized to only what was required?
* N/A

### Is data in dataset traceable?
### Is there data provenance?
* Original raw logs are not saved. The small sample in the repo is saved for testing the pipeline.

### Are we able to identify and trace source of dataset?
* N/A

### Does data labeling (annotation, metadata) comply with privacy laws?
* N/A

Expand Down
Loading