This repository has been archived by the owner on Dec 24, 2023. It is now read-only.

Project files for the NLP project of Fundamentals of Data Science, spring 2022, NJU.

CybCom/NLP

COURSE COMPLETED. ARCHIVED.


NLP

This project is released under the GNU GPL v3 license.

WARNING! Please REMOVE the files in the "output" directory before committing, or the repository will exceed GitHub's size limit.
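
One way to make this cleanup harder to forget is a small helper run before each commit. The following is only a sketch: the function name `clear_output_dir` is hypothetical, and it assumes the directory is named "output" and sits at the repository root, as the warning suggests.

```python
from pathlib import Path
import shutil


def clear_output_dir(output_dir: Path) -> int:
    """Delete everything inside output_dir but keep the directory itself.

    Returns the number of top-level entries removed.
    """
    if not output_dir.is_dir():
        return 0
    removed = 0
    for entry in output_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a whole subdirectory tree
        else:
            entry.unlink()  # remove a regular file or a symlink
        removed += 1
    return removed
```

Calling `clear_output_dir(Path("output"))` from the repository root empties the directory. Deletion is permanent, so copy any results you still need elsewhere first.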

Project Authors

CybCom & Zhou

Preparation

ML&DL

Coursera: Machine Learning, for the basics https://www.coursera.org/learn/machine-learning

National Taiwan University: Hung-yi Lee's Machine Learning course, for BERT https://speech.ee.ntu.edu.tw/~hylee/ml/2021-spring.php

CS224n for Natural Language Processing, including word2vec http://web.stanford.edu/class/cs224n/index.html

Web Crawler

https://www.zhihu.com/question/20899988

http://c.biancheng.net/python_spider/what-is-spider.html

https://zhuanlan.zhihu.com/p/73742321

Structure

Data Source

The provided data sheet, used for training.

Data collected by a web crawler from a cluster of government websites.
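
The crawling step can be illustrated with a minimal standard-library sketch. This is not the project's actual crawler: the names `LinkCollector`, `extract_links` and `fetch`, the User-Agent string, and the rule of following only same-host links are all assumptions made for the example.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> List[str]:
    """Return same-host links found in html, in order, without duplicates."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    seen = set()
    result = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in seen:
            seen.add(link)
            result.append(link)
    return result


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download one page as text (requires network access)."""
    # The User-Agent value is a placeholder chosen for this sketch.
    request = Request(url, headers={"User-Agent": "course-project-crawler"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A crawl loop would call `fetch` on a start page, pass the result to `extract_links`, and queue the returned URLs. A real crawler should also respect robots.txt and rate limits.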

Data Process

Preliminary filtering using rule-based checks and string similarity.
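
A rough sketch of what such a first-pass filter could look like follows. The specific rule (a minimum length), the 0.6 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions for illustration, not the project's actual settings.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] based on difflib's matching blocks."""
    return SequenceMatcher(None, a, b).ratio()


def prefilter(candidates: Iterable[str], references: List[str],
              min_length: int = 4, threshold: float = 0.6) -> List[str]:
    """Keep candidates that pass a simple rule and resemble a reference."""
    kept = []
    for text in candidates:
        text = text.strip()
        # Rule-based check (hypothetical): drop empty or very short strings.
        if len(text) < min_length:
            continue
        # Similarity check: keep the text if it is close to any reference.
        if any(similarity(text, ref) >= threshold for ref in references):
            kept.append(text)
    return kept
```

Texts that survive this cheap filter would then go to the more expensive classifier, which is the usual motivation for a two-stage design.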

Second-stage classification using word2vec embeddings with a CNN.
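
To show the idea behind a CNN over word vectors, here is a pure-Python forward pass with toy numbers. It is a sketch under strong assumptions: the embedding table stands in for trained word2vec vectors, the kernels and output weights are hand-set rather than learned, and all function names are hypothetical. A real model would normally be built and trained with a deep learning framework.

```python
import math
from typing import Dict, List

Vector = List[float]


def embed(tokens: List[str], table: Dict[str, Vector], dim: int) -> List[Vector]:
    """Look up each token; unknown tokens map to a zero vector."""
    return [table.get(tok, [0.0] * dim) for tok in tokens]


def conv1d(sequence: List[Vector], kernel: List[Vector], bias: float) -> List[float]:
    """Slide a kernel spanning len(kernel) word vectors over the sequence, then ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(sequence) - width + 1):
        total = bias
        for offset in range(width):
            word = sequence[start + offset]
            total += sum(x * w for x, w in zip(word, kernel[offset]))
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs


def text_cnn_score(tokens: List[str], table: Dict[str, Vector],
                   kernels: List[List[Vector]], biases: List[float],
                   out_weights: List[float], out_bias: float, dim: int) -> float:
    """Return a probability-like score for a binary classification."""
    sequence = embed(tokens, table, dim)
    pooled = []
    for kernel, bias in zip(kernels, biases):
        feature_map = conv1d(sequence, kernel, bias)
        # Max-over-time pooling; 0.0 if the text is shorter than the kernel.
        pooled.append(max(feature_map) if feature_map else 0.0)
    logit = out_bias + sum(p * w for p, w in zip(pooled, out_weights))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid
```

Each kernel acts as a detector for a short word pattern, and max-over-time pooling keeps only its strongest response, so the score does not depend on where in the text the pattern occurs.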
