Support for `DBRX` #1363

megha95 · 2024-03-27T14:48:11Z

This PR adds support for DBRX. DBRX is a Mixture-of-Experts (MoE) model trained by Databricks with 132B total parameters and 36B active parameters. More details about the model can be found in the DBRX Technical Blog.

Model weights can be found in HF repo:

| Model | Link | Description |
| --- | --- | --- |
| DBRX Base | HF | Pre-trained base model |
| DBRX Instruct | HF | Fine-tuned model for instruction following |

The model can be run in 16-bit (BF16 or FP16) or 8-bit (INT8 weight-only + INT8 KV cache) precision. Detailed instructions for building and running TensorRT engines are in the README under the examples/dbrx directory.

Note: Given that the model has 132B total parameters, a minimum of 4x80GB GPUs is suggested for 16-bit inference and 2x80GB GPUs for 8-bit inference.
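The GPU counts in the note can be sanity-checked with rough arithmetic. The sketch below is only an estimate under stated assumptions: it counts weight memory alone (no KV cache, activations, or runtime overhead), treats 1 GB as 10^9 bytes, and assumes 80 GB per GPU. The helper `min_gpus_for_weights` is a hypothetical name for illustration, not part of TensorRT-LLM.

```python
import math


def min_gpus_for_weights(total_params: float, bytes_per_param: int,
                         gpu_mem_gb: float = 80.0) -> tuple[float, int]:
    """Return (weight size in GB, smallest GPU count whose combined memory holds the weights).

    Hypothetical helper: ignores KV cache, activations, and framework overhead,
    so real deployments need headroom beyond this lower bound.
    """
    weight_gb = total_params * bytes_per_param / 1e9
    return weight_gb, math.ceil(weight_gb / gpu_mem_gb)


TOTAL_PARAMS = 132e9  # total parameter count stated in the PR description

# 16-bit (BF16/FP16): 2 bytes per parameter
bf16_gb, bf16_gpus = min_gpus_for_weights(TOTAL_PARAMS, bytes_per_param=2)

# 8-bit (INT8 weight-only): 1 byte per parameter
int8_gb, int8_gpus = min_gpus_for_weights(TOTAL_PARAMS, bytes_per_param=1)

print(f"16-bit: {bf16_gb:.0f} GB of weights, at least {bf16_gpus} x 80GB GPUs")
print(f" 8-bit: {int8_gb:.0f} GB of weights, at least {int8_gpus} x 80GB GPUs")
```

Under these assumptions the weights alone take about 264 GB in 16-bit and 132 GB in 8-bit, which is consistent with the suggested 4x80GB and 2x80GB minimums.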

juney-nvidia · 2024-03-27T16:39:41Z

@megha95 Thanks for the great contributions!
We will soon start merging your MRs into our internal repo and will later push them to GitHub. We will keep you posted once that is done.

Thanks
June

kaiyux · 2024-04-09T09:06:34Z

Hi @megha95 , thanks a lot for your contribution!

The changes have been merged into the main branch and we added you as co-author, so I'm closing this PR now. Let us know if you have any questions. #1427

Thanks for your support.

EwoutH · 2024-04-24T08:07:54Z

@kaiyux could you open a discussion with the core team about transitioning to fully open-source development? This repo is currently just a code dump from another repo where the actual development happens.

By not including external contributors in regular development, you lose out on many of the advantages of open-source development. You won't have the same level of community engagement and you won't grow a community of external developers. They will always lag behind what's actually happening, as long as everything takes place behind closed doors.

@megha95 isn't properly credited now, because of this cherry-pick to a shadow repo followed by a squash merge back to this "public" repo. He should be the full author of his changeset.

I hope you and your team are willing to open the discussion on how to transition to true open-source development.

dbrx init

3ae5dc4

dskhudia mentioned this pull request Mar 28, 2024

How inference efficiency is measured databricks/dbrx#9

Open

kaiyux mentioned this pull request Apr 9, 2024

Update TensorRT-LLM #1427

Merged

kaiyux closed this Apr 9, 2024
