
Support for DBRX #1363
Closed · wants to merge 1 commit
Conversation

megha95 (Contributor) commented Mar 27, 2024

This PR adds support for DBRX. DBRX is a Mixture-of-Experts (MoE) model trained by Databricks, with 132B total parameters of which 36B are active on any given input. More details about the model can be found in the DBRX Technical Blog.

Model weights are available on the HF Hub:

| Model | Link | Description |
|---|---|---|
| DBRX Base | HF | Pre-trained base model |
| DBRX Instruct | HF | Fine-tuned model for instruction following |

The model can be run in 16-bit (BF16 or FP16) or 8-bit (INT8 weight-only quantization plus INT8 KV cache). Detailed instructions for building and running TensorRT engines are in the README under the examples/dbrx directory; a rough sketch of the flow is shown below.
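For readers who want a feel for the workflow before opening the README, here is a sketch following the pattern of the other TensorRT-LLM examples. The script names, flags, and paths are assumptions based on that pattern, not verbatim commands — treat the examples/dbrx README as the source of truth:

```bash
# Sketch only -- exact commands live in examples/dbrx/README.md.

# 1. Convert the HF checkpoint to TensorRT-LLM format, sharded
#    across 4 GPUs (tensor parallelism) in BF16.
python convert_checkpoint.py --model_dir dbrx-base \
                             --dtype bfloat16 \
                             --tp_size 4 \
                             --output_dir trt_ckpt/bf16_tp4

# 2. Build TensorRT engines from the converted checkpoint.
trtllm-build --checkpoint_dir trt_ckpt/bf16_tp4 \
             --output_dir trt_engines/bf16_tp4

# 3. Run inference, one MPI rank per GPU.
mpirun -n 4 python3 ../run.py --engine_dir trt_engines/bf16_tp4 \
                              --tokenizer_dir dbrx-base \
                              --input_text "What is a Mixture-of-Experts model?"
```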

Note: Given that the model has 132B total parameters, we suggest a minimum of 4x80GB GPUs for 16-bit inference and 2x80GB GPUs for 8-bit inference.
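As a rough weights-only sanity check on those numbers (KV cache and activations need extra headroom on top): 132B parameters × 2 bytes each in BF16/FP16 is about 264 GB, which needs the 320 GB that 4x80GB cards provide, while 132B × 1 byte each in INT8 is about 132 GB, which fits in the 160 GB of 2x80GB cards.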

juney-nvidia (Collaborator) commented

@megha95 Thanks for the great contributions!
We will soon start merging your MRs into our internal repo and will later push them to GitHub. When that's done, we'll keep you posted.

Thanks
June

kaiyux (Member) commented Apr 9, 2024

Hi @megha95, thanks a lot for your contribution!

The changes have been merged into the main branch in #1427, and we added you as a co-author, so I'm closing this PR now. Let us know if you have any questions.

Thanks for your support.

kaiyux closed this Apr 9, 2024
EwoutH commented Apr 24, 2024

@kaiyux could you open a discussion with the core team about transitioning to fully open-source development? Right now this repo is just a code dump from another repo where the actual development happens.

By not including external contributors in regular development, you lose out on many of the advantages of open-source development. You won't have the same level of community engagement, and you won't grow a community of external developers. They will always lag behind what's actually happening, as long as everything takes place behind closed doors.

@megha95 now isn't properly credited: his change was cherry-picked into a shadow repo and then squash-merged back to this "public" repo. He should be the full author of his changeset.

I hope you and your team are willing to open the discussion on how to transition to true open-source development.
