Tele-Knowledge Pre-training for Fault Analysis
Authors: Zhuo Chen†, Wen Zhang†, Yufeng Huang, Mingyang Chen, Yuxia Geng, Hongtao Yu, Zhen Bi, Yichi Zhang, Zhen Yao, Huajun Chen (College of Computer Science, Zhejiang University); Wenting Song, Xinliang Wu, Yi Yang, Mingyi Chen, Zhaoyang Lian, Yingying Li, Lei Cheng (NAIE PDU, Huawei Technologies Co., Ltd.)

In this paper we propose a tele-domain pre-trained language model named TeleBERT to learn general semantic knowledge in the telecommunication field, together with its improved version KTeleBERT, which incorporates the implicit information in machine log data and the explicit knowledge contained in our Tele-product Knowledge Graph (Tele-KG).
- The data examples are available here. Considering the sensitivity of some data, we cannot publish all of them.
- Visualization for Numerical Data
- Visualization for Abnormal KPI Detection Data
Dependencies:
- transformers >= 4.21.2
- PyTorch >= 1.6.0
- tqdm
- ltp
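A minimal environment setup might look like the following; this is a sketch assuming the packages are installed from PyPI (pick the PyTorch build that matches your CUDA version):

```bash
# Install the listed dependencies from PyPI (version pins follow the list above;
# the exact torch wheel for your CUDA setup may differ).
pip install "transformers>=4.21.2" "torch>=1.6.0" tqdm ltp
```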
For more details, see config.py. The main command-line options are listed below; a sample invocation is sketched after the list.
--train_strategy
--batch_size
--batch_size_ke
--batch_size_od
--batch_size_ad
--epoch
--save_model {0,1}
--save_pretrain {0,1}
--from_pretrain {0,1}
--dump_path Experiment dump path
--random_seed
--train_ratio ratio for train/test
--final_mlm_probability
--mlm_probability_increase {linear,curve}
--mask_stratege {rand,wwm,domain}
--ernie_stratege
--use_mlm_task {0,1}
--add_special_word {0,1}
--freeze_layer {0,1,2,3,4}
--special_token_mask {0,1}
--emb_init {0,1}
--cls_head_init {0,1}
--use_awl {0,1}
--mask_loss_scale
--ke_norm
--ke_dim
--ke_margin
--neg_num
--adv_temp temperature for sampling in self-adversarial negative sampling
--ke_lr
--only_ke_loss
--use_NumEmb
--contrastive_loss {0,1}
--l_layers L_LAYERS
--use_kpi_loss
--only_test {0,1}
--mask_test {0,1}
--embed_gen {0,1}
--ke_test {0,1}
--ke_test_num
--path_gen
--order_load
--order_num
--od_type {linear_cat,vertical_attention}
--eps EPS label smoothing
--num_od_layer
--plm_emb_type {cls,last_avg}
--order_test_name
--order_threshold
--rank RANK process rank for distributed training
--dist DIST whether to use distributed training
--device DEVICE device id (e.g. 0, or 0,1, or cpu)
--world-size WORLD_SIZE number of distributed processes
--dist-url DIST_URL URL used to set up distributed training
--local_rank LOCAL_RANK
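For illustration, a direct invocation might combine a few of these options as below. This is a sketch only: the flag names come from the list above, but the entry-point script name (main.py) and the concrete values are assumptions; check the repository and the .sh scripts for the actual entry point and defaults.

```bash
# Hypothetical training invocation; flag names are taken from the option list,
# but the script name and values here are assumed, not from the repository.
python main.py \
    --batch_size 32 \
    --epoch 10 \
    --use_mlm_task 1 \
    --mask_stratege wwm \
    --from_pretrain 1 \
    --save_model 1 \
    --random_seed 42
```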
- train: bash run.sh
- test: bash test.sh
Note:
- You can open the .sh files to modify the default parameters.
Please consider citing this paper if you use the code from our work. Thanks a lot :)
@inproceedings{DBLP:conf/icde/00070HCGYBZYSWY23,
author = {Zhuo Chen and
Wen Zhang and
Yufeng Huang and
Mingyang Chen and
Yuxia Geng and
Hongtao Yu and
Zhen Bi and
Yichi Zhang and
Zhen Yao and
Wenting Song and
Xinliang Wu and
Yi Yang and
Mingyi Chen and
Zhaoyang Lian and
Yingying Li and
Lei Cheng and
Huajun Chen},
title = {Tele-Knowledge Pre-training for Fault Analysis},
booktitle = {{ICDE}},
pages = {3453--3466},
publisher = {{IEEE}},
year = {2023}
}