add bad case analysis for text classification #3385
@@ -3,13 +3,16 @@
**Table of contents**
* [analysis module overview](#analysis模块介绍)
Capitalizing the "A" in "Analysis" would be more standard here.
Fixed.
@@ -3,13 +3,16 @@
**Table of contents**
* [analysis module overview](#analysis模块介绍)
* [Model evaluation](#模型评估)
* [Bad case analysis](#错误样例分析)
* [Sparse data selection approach](#稀疏数据筛选方案)
* [Dirty data cleaning approach](#脏数据清洗方案)
* [Data augmentation strategy](#数据增强策略方案)

## analysis module overview
Analysis
Fixed.
**Install TrustAI**
```shell
pip install trustai==0.1.7
```
Suggest constraining the version with `>=` instead.
Also ask the TrustAI team to keep releases compatible as far as possible.
Fixed.
**Install interpretdl** (optional): if you use the word-level interpretability analysis method GradShap, you need to install interpretdl
Write "Install InterpretDL"; make sure product and brand names are capitalized correctly.
Changed to InterpretDL.
Configurable parameters:

* `device`: which device to use for training; select cpu, gpu, xpu, npu; defaults to "gpu".
Change "select cpu, gpu, xpu, npu;" to "can be cpu, gpu, xpu, or npu;" so the values read as options.
Fixed in all documents.
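To illustrate the documented `device` options, here is a minimal sketch of how such a flag could be declared with `argparse`. The thread only shows the documentation line, so the `choices` list, the help text, and the `build_device_parser` helper are assumptions for illustration, not the PR's actual code.

```python
import argparse


def build_device_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the documented --device option."""
    parser = argparse.ArgumentParser()
    # Assumption: restricting values with `choices` mirrors the documented
    # options (cpu, gpu, xpu, npu) and the documented default ("gpu").
    parser.add_argument(
        "--device",
        choices=["cpu", "gpu", "xpu", "npu"],
        default="gpu",
        help="Device to train on; can be cpu, gpu, xpu, or npu. Defaults to gpu.",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_device_parser().parse_args(["--device", "cpu"])
print(args.device)
```

Declaring the options through `choices` makes `argparse` reject unsupported values with a usage error, so the help text and the accepted values cannot drift apart.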
parser.add_argument("--top_k", type=int, default=3, help="Top K important training data.")
parser.add_argument("--train_file", type=str, default="train.txt", help="Train dataset file name")
parser.add_argument("--interpret_file", type=str, default="bad_case.txt", help="interpretation file name")
parser.add_argument("--interpreted_file", type=str, default="sent_interpret.txt", help="interpreted file name")
The help text for these two files makes them hard to understand and tell apart.
Renamed to interpret_input_file and interpret_result_file.
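For reference, the rename agreed on in this thread might look like the sketch below. Only the new names `interpret_input_file` and `interpret_result_file` come from the reply. Carrying over the original defaults, the reworded help strings, and the `build_parser` wrapper are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of the argument block after the rename discussed above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--top_k", type=int, default=3,
                        help="Top K important training data.")
    parser.add_argument("--train_file", type=str, default="train.txt",
                        help="Train dataset file name")
    # Renamed from --interpret_file. Default kept from the original line
    # (assumption); help text is illustrative.
    parser.add_argument("--interpret_input_file", type=str, default="bad_case.txt",
                        help="Input file with the examples to interpret")
    # Renamed from --interpreted_file. Default kept from the original line
    # (assumption); help text is illustrative.
    parser.add_argument("--interpret_result_file", type=str, default="sent_interpret.txt",
                        help="Output file for sentence-level interpretation results")
    return parser


# Parse an empty argument list to show the defaults.
args = build_parser().parse_args([])
print(args.interpret_input_file, args.interpret_result_file)
```

Naming the two options by role (input versus result) addresses the review concern that the original pair differed by only two letters.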
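* `top_k`: number of supporting training evidence examples to select; defaults to 3.
* `train_file`: file name of the training set in the local dataset; defaults to "train.txt".
* `interpret_file`: file name of the data to analyze in the local dataset; defaults to "bad_case.txt".
* `interpreted_file`: file name for saving sentence-level interpretability results; defaults to "sent_interpret.txt".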
Please reconsider the names of these two args; they are not easy to tell apart.
How about distinguishing them with more direct names such as interpret_input/interpret_result?
Renamed to interpret_input_file and interpret_result_file.
LGTM
PR types
Others
PR changes
Others
Description
Add a bad case analysis solution
- Token-level trustworthy analysis (LIME, GradShap, InterGrad)
- Feature-level trustworthy analysis (FeatureSimilarity)
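
As a rough illustration of the feature-level idea, the sketch below ranks training examples by cosine similarity to a bad case's feature vector and keeps the top `k` as supporting evidence. It is a generic, dependency-free sketch written on the assumption that feature-level analysis retrieves similar training examples. It is not TrustAI's FeatureSimilarity implementation, and the function names and toy vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_evidence(
    query: Sequence[float],
    train_features: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (training index, similarity) pairs for the k most similar examples."""
    scored = [
        (idx, cosine_similarity(query, feat))
        for idx, feat in enumerate(train_features)
    ]
    # Highest similarity first; ties keep their original training order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy feature vectors standing in for model representations (hypothetical data).
train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
bad_case = [1.0, 0.0]
for idx, score in top_k_evidence(bad_case, train, k=2):
    print(idx, round(score, 3))
```

Here `k` plays the role of the `top_k` parameter documented earlier in the PR; in a real pipeline the vectors would come from the classifier's hidden representations rather than hand-written lists.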