support other ext tasks except aso task and fix sentiment analysis based on SKEP #4357
Merged
152 commits
37d27f9
initial commit
1649759610 d65208e
refine readme
1649759610 ac4d644
refine codestyle
1649759610 3f433b9
refine readme
1649759610 d3f6ada
refine readme
1649759610 54ed34b
fix model saving bug
1649759610 63b0a76
Merge branch 'develop' into develop
4669194
initial commit
1649759610 f6f93e1
Merge branch 'PaddlePaddle:develop' into develop
1649759610 fb3ade1
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 a83a902
initial commit
1649759610 7bd988a
initial commit
1649759610 700810a
Merge branch 'PaddlePaddle:develop' into develop
1649759610 68e025a
Merge branch 'PaddlePaddle:develop' into develop
1649759610 02a997b
use common metric instead of eval_metrics.py and remove unuseful code
1649759610 1500e5f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 faaf5f5
Merge branch 'develop' into develop
1649759610 6a512a0
Merge branch 'PaddlePaddle:develop' into develop
1649759610 a99fc68
Merge branch 'PaddlePaddle:develop' into develop
1649759610 4b5fa30
Merge branch 'PaddlePaddle:develop' into develop
1649759610 b837b67
mv stage project to ASO_analysis
1649759610 f415740
add unified sentiment analysis
1649759610 41b020d
Merge branch 'PaddlePaddle:develop' into develop
1649759610 252860d
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 aed2d64
refine readme
1649759610 8899a12
refine readme
1649759610 425a273
refnie readme
1649759610 acd9add
add unified sentiment analysis
1649759610 4796016
refine readme
1649759610 e857c6a
Merge branch 'PaddlePaddle:develop' into develop
1649759610 88649f4
initial commit
1649759610 4141916
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 12628a7
initial commit
1649759610 70ba157
refine readme
1649759610 531428e
add taskflow for sentiment analysis with UIE
1649759610 8e8b853
refine Readme
1649759610 410f0e4
refine readme.md
1649759610 64edd33
support sentiment analysis (UIE) with inputing by file format
1649759610 1f9637c
refine readme
1649759610 93509cd
delete predict scripts
1649759610 6473b10
refine readme
1649759610 09d0f12
delete unuseful files
1649759610 e546952
add pipeline for sentiment_analysis
1649759610 447fc14
merging with the newest code
1649759610 83252b0
merging code with the newest code
1649759610 f57df8c
fix to convert data without synonyms
1649759610 249a8a9
add senta pipeline
1649759610 cd3f4e7
refine readme
1649759610 a1de96d
drop functions: inputting file and saving results
1649759610 a5f83b1
add UIE-seta-[base, medium, mini, micro, nano]
1649759610 1da02d0
modify .gitignore to trace deploy code
1649759610 c4c135a
add deploy with SimpleServer
1649759610 363963b
add debug mode
1649759610 a0c8608
fix debug mode
1649759610 5afa387
update the loading method of UIE
1649759610 7109f24
Merge branch 'PaddlePaddle:develop' into develop
1649759610 bb7da20
refine readme
1649759610 1af076f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 8c104f2
fix bug caused by version updating
1649759610 332a1e4
fix hard coding for model name.
1649759610 5e550bd
refine codestyle
1649759610 98c7084
modify readme according the way of 'step by step'
1649759610 7ec344f
refine codestyl
1649759610 6af902d
change saving txt to json files
1649759610 a7f57ea
download font automatically when not input font_path
1649759610 7c331d2
change readme in the way 'step by step'
1649759610 d78f73b
add model prediction by batch
1649759610 e6f0359
add uie-senta-x to support_schema_list
1649759610 e17d624
update sentiment analysis in taskflow
1649759610 485044c
add prediction with saved offline model
1649759610 267509c
change the exception exposure way
1649759610 30e3044
add description for visual schema
1649759610 7c88879
delete comments
1649759610 3e8c777
remove comments
1649759610 f3429b7
remove unused code and comments
1649759610 6c2a712
convert uie-senta-x model params to fit ernie/uie
1649759610 6f15864
refine readme for sentiment analysis
1649759610 55276fa
add running time
1649759610 6c4fd93
refine readme for senta pipeline
1649759610 d0729c0
change uie-base to uie-senta-base
1649759610 46fc8fe
load uie-senta-x with auto module
1649759610 ee5938b
add deploy with SimpleServer
1649759610 46c417c
refine codestyle
1649759610 96bd92f
refine readme
1649759610 fbf6567
add uie-senta-x to support_schema_list
1649759610 ac7c5ed
fix hard coding for mdoel anme
1649759610 d21452b
refine codestyle
1649759610 4128753
refine codestyl
1649759610 4d072d7
refine codestyle
1649759610 661e944
refine codestyle
1649759610 77c090f
refine codestyle
1649759610 128d154
refine codestyle
1649759610 8c59f76
refine codestyle
1649759610 8651ae2
refine codestyle
1649759610 e14ff4a
refine codestyle
1649759610 aae9da1
fix senta response
1649759610 bb441ca
add uie_senta_x
1649759610 d99c204
refine codestyle
1649759610 5650404
remove lambda expressions
1649759610 1614c6c
add link of senta pipeline
1649759610 5f89ceb
refine codestyle
1649759610 9a9be2c
remove local path
1649759610 87782d3
Merge branch 'develop' into develop
1649759610 288aaab
fix typos
1649759610 92f1278
refine readme
1649759610 0190ca1
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 485f3cf
refine readme
1649759610 9400392
Merge branch 'automodel' into develop
1649759610 432e5e2
load uie-senta-x with automodel
1649759610 422ac9a
remove commented code
1649759610 98779c9
restore auto
1649759610 99fe1cb
Merge branch 'develop' into develop
1649759610 22394fa
add link of hotel dataset to readme.
1649759610 48b20c5
add link for downloading test_hotel.txt
1649759610 93073c5
fix url problem for server and client
1649759610 9d1243f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 075a4ae
Merge branch 'PaddlePaddle:develop' into develop
1649759610 5783f19
refine readme
1649759610 e61ad7b
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 19265f3
Merge branch 'PaddlePaddle:develop' into develop
1649759610 ff664d2
fix for senta_examples.py
1649759610 69932d3
update visualization function
1649759610 f922241
update visualization function
1649759610 0b5c224
refine readme and update visualization description
1649759610 7d618a8
update visualization function
1649759610 a62d8c5
Merge branch 'PaddlePaddle:develop' into develop
1649759610 ce9825a
refine readme and update visualization function
1649759610 a3ec63a
change logger in PaddleNLP to log information
1649759610 c344b8c
fix running time for skep and uie
1649759610 df1ad55
Merge branch 'PaddlePaddle:develop' into develop
1649759610 7f00796
fix bug to solve tokenizer updating problem
1649759610 1ea8757
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 3123147
refine label-studio readme
1649759610 88060da
refine label-studio readme
1649759610 ab18fc8
Merge branch 'PaddlePaddle:develop' into develop
1649759610 fc52a00
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 ac89d49
refine label-studio readme
1649759610 579631b
optimize example construction for a, o, as, ao extraction task
1649759610 b4364cf
Merge branch 'PaddlePaddle:develop' into develop
1649759610 30ea32f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 4bfab1c
add the labeling method for ext task: a, as, ao and so on.
1649759610 091e216
add note for visual_analysis.py
1649759610 0a5b991
change link for downloading data and refine log output
1649759610 070bfda
refine log output
1649759610 c34bedb
refine readme
1649759610 e5a94c2
expose options interface
1649759610 e624418
refine readme
1649759610 a021936
modify typos
1649759610 b754bb1
expose options for customing sentiment analysis
1649759610 561f607
README.md
1649759610 7a2b5ca
Merge branch 'PaddlePaddle:develop' into develop
1649759610 1792554
Merge branch 'PaddlePaddle:develop' into develop
1649759610
@@ -15,6 +15,7 @@
import argparse
import copy
import json
import re
from collections import defaultdict
from functools import partial

@@ -46,11 +47,17 @@ def concate_aspect_and_opinion(text, aspect, opinions):
    return aspect_text


def remove_blanks(example):
    example["text"] = re.sub(" +", "", example["text"])
    return example

Comment: Same as above.
Reply: Same as above.

def predict_ext(args):
    # load dict and dataset
    model_name = "skep_ernie_1.0_large_ch"
    ext_label2id, ext_id2label = load_dict(args.ext_label_path)
    datasets = load_dataset("text", data_files={"test": args.test_path})
    datasets["test"] = datasets["test"].map(remove_blanks)

    tokenizer = SkepTokenizer.from_pretrained(model_name)
    trans_func = partial(
I'm curious about this: why modify the original input text?
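This removes the spaces from the original text. The current tokenizer ignores spaces when encoding, so the length of input_ids != the length of the original text, which causes problems when matching predictions back to the text.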
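The length mismatch described in this reply can be illustrated with a small sketch. It does not call SkepTokenizer; `toy_char_tokenize` is a hypothetical stand-in that emits one token per non-space character, which is only an assumption about how a space-dropping tokenizer behaves (a real tokenizer would also add special tokens). `remove_blanks` is copied from the diff.

```python
import re


def toy_char_tokenize(text):
    """Hypothetical stand-in for a tokenizer that drops spaces:
    one token per non-whitespace character, no special tokens."""
    return [ch for ch in text if not ch.isspace()]


def remove_blanks(example):
    # Same normalization as in the diff: delete runs of spaces from the text.
    example["text"] = re.sub(" +", "", example["text"])
    return example


raw = {"text": "房间 很 干净"}

# With spaces left in, there are fewer tokens than characters, so a token
# index can no longer be used directly as a character index into the text.
tokens_raw = toy_char_tokenize(raw["text"])
misaligned = len(tokens_raw) != len(raw["text"])

# After removing spaces, token i corresponds to character i again.
clean = remove_blanks(dict(raw))
tokens_clean = toy_char_tokenize(clean["text"])
aligned = all(tok == clean["text"][i] for i, tok in enumerate(tokens_clean))
```

In this toy setting the predicted label positions can be sliced straight out of the cleaned text, which is the matching step the reply refers to.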