faridani/PyNLP

PyNLP is designed to perform simple NLP procedures, such as tokenizing, removing stop words, and frequency analysis.
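
To make the three procedures named above concrete, here is a minimal standard-library sketch. It is not PyNLP's API: the function names (tokenize, remove_stop_words, frequencies), the regular expression, and the tiny stop-word list are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not PyNLP's own list).
STOP_WORDS = frozenset({"a", "an", "and", "is", "of", "the", "to"})


def tokenize(text):
    """Lowercase the text and split it into runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not in the stop-word set, preserving order."""
    return [token for token in tokens if token not in stop_words]


def frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


text = "The cat and the hat: the cat is back."
tokens = remove_stop_words(tokenize(text))
print(frequencies(tokens).most_common(1))  # [('cat', 2)]
```

Keeping each step as a separate function means any one of them can be swapped out, for example to use a larger stop-word list, without touching the others.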


Design features:
	- Lightweight
	- Multi-core friendly: PyNLP uses all available cores for its operations.
	- Unit tests ship with the source code for each module and cover most functions.
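
The README does not say how PyNLP spreads work across cores. One common way to do this in Python is a process pool that counts tokens per document and then merges the partial results. The sketch below assumes that approach; count_tokens, merge_counts, and parallel_frequencies are hypothetical names, not functions from PyNLP.

```python
import re
from collections import Counter
from multiprocessing import Pool


def count_tokens(document):
    """Count the tokens of one document; runs inside a worker process."""
    return Counter(re.findall(r"[a-z0-9']+", document.lower()))


def merge_counts(counters):
    """Add up the per-document counters into a single Counter."""
    total = Counter()
    for counter in counters:
        total.update(counter)
    return total


def parallel_frequencies(documents, processes=None):
    """Count tokens across documents using a pool of worker processes.

    With processes=None, Pool picks its default worker count, which is
    based on the number of CPUs the machine reports.
    """
    with Pool(processes=processes) as pool:
        return merge_counts(pool.map(count_tokens, documents))
```

Call parallel_frequencies(documents) from under an if __name__ == "__main__": guard, which multiprocessing needs on platforms that start workers by launching a fresh interpreter. For small inputs the cost of starting processes can outweigh the gain, so a pool like this pays off mainly on large collections.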


Ideas for the future:
	- Combine Tokenize and Preprocess into one package

Help:
	- preprocess contains functions for tokenizing
	- dataTools helps you import your data


Please note: This project is still at a very early stage. Please report bugs to faridani@berkeley.edu



Siamak Faridani
UC Berkeley
July 2010
faridani@berkeley.edu 



About

... just because nltk is too heavy
