
Kraken Architecture


Overview

The Kraken Architecture is a machine learning framework for dynamic text generation. It uses the Hugging Face transformers library to orchestrate multiple causal language models (CLMs) and routes each input to a different model based on the context and content of the input text. The architecture is driven by a custom configuration class (KrakenConfig) that handles the integration and management of its components: tokenizers, models, and the routing mechanism.
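The actual schema is defined by the repo's config.json. As a rough illustration only, a custom configuration along these lines could bundle the experts, their tokenizers, and the router (the field names below are assumptions for this sketch, not the repo's real schema):

```python
from transformers import PretrainedConfig

class KrakenConfig(PretrainedConfig):
    """Illustrative configuration bundling experts, tokenizers, and a router.

    Field names are assumptions for this sketch; the repository's
    config.json defines the actual schema.
    """
    model_type = "kraken"

    def __init__(self, models=None, tokenizers=None, router=None, **kwargs):
        self.models = models or {}          # expert name -> checkpoint path
        self.tokenizers = tokenizers or {}  # expert name -> tokenizer path
        self.router = router                # sequence-classification router checkpoint
        super().__init__(**kwargs)
```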

Features

- Dynamic Model Routing: Uses a sequence classification model to route inputs to the most suitable language model based on the input's characteristics.
- Multiple Language Models: Supports integration of various pre-trained causal language models, allowing for flexible, context-appropriate responses.
- Customizable Templates: Includes support for input formatting using predefined templates, enhancing the model's adaptability to different conversational contexts.
- Extensible Configuration: Leverages a custom configuration setup that can be easily extended and adapted for various use cases involving causal language modeling.
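To make the routing idea concrete, here is a minimal sketch of how a sequence classification model can pick an expert, not the repo's actual implementation; the checkpoint name and the label-to-expert mapping are made up for illustration:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# "router-checkpoint" is a placeholder, e.g. the router trained in the
# kraken_train_router.ipynb notebook.
router_tokenizer = AutoTokenizer.from_pretrained("router-checkpoint")
router = AutoModelForSequenceClassification.from_pretrained("router-checkpoint")
router.eval()

# Illustrative mapping from the router's class labels to expert models.
EXPERTS = {0: "expert-coding-model", 1: "expert-chat-model"}

def route(text: str) -> str:
    """Return the name of the expert the router classifies `text` into."""
    inputs = router_tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = router(**inputs).logits
    return EXPERTS[int(logits.argmax(dim=-1))]

print(route("Write a quicksort function in Python."))
```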

Requirements

- Python 3.11+
- transformers 4.40+
- torch 2.2+
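Assuming a Python 3.11+ environment is already available, the library dependencies can be installed with pip, for example:

```bash
pip install "transformers>=4.40" "torch>=2.2"
```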

How to Use

(Optional) I. Run the Jupyter notebook kraken_prepare_trainingdata to prepare training data based on your use-case datasets.

(Optional) II. Run the Jupyter notebook kraken_train_router.ipynb to train a router that will later be imported as the router in the Kraken CoE Architecture.

  1. Run kraken_lm_save.ipynb, which loads a router (for example, the one you trained in step II) and sets up a model following the Kraken CoE Architecture according to config.json. This generates a subfolder ./kraken_model.

  2. Run kraken_lm_load.ipynb to see how to load the newly created model; a minimal loading sketch follows this list.
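The notebook kraken_lm_load.ipynb is authoritative; as a minimal sketch, loading the saved model might look like this (trust_remote_code is generally required for custom architectures):

```python
from transformers import AutoModelForCausalLM

# Loads the model produced by kraken_lm_save.ipynb from ./kraken_model.
# trust_remote_code=True lets transformers execute the custom Kraken
# modeling code shipped alongside the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "./kraken_model",
    trust_remote_code=True,
)
```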

Cite As

Fernando Fernandes Neto, David Golchinfar, Lucas Atkins, Eric Hartford - Kraken: An OpenSource Collection of Experts Model, 2024
