Welcome to xlstm-jax Discussions! #4
Replies: 2 comments
-
Hey @superbock, and of course hello to the other team members as well! Many thanks for releasing this JAX-based xLSTM codebase. I am very excited to try it out and am currently reading through it! I have already trained a basic, small research xLSTM on German Wikipedia (https://huggingface.co/stefan-it/xlstm-german-wikipedia) with another implementation, so I am curious to try this JAX implementation as well. My long-term goal is a kind of "bi-directional" setting, as shown in the Flair Embeddings paper, where the "normal" LSTM (forward + backward LSTM) is replaced with xLSTM and used for Token Classification or Text Classification tasks. One of my first questions would be: do you think the current codebase also works on TPUs? Thanks to TRC, this would massively boost xLSTM pretraining. Many thanks in advance!
-
Very cool! Is there a way to load the pretrained xLSTM 7B safetensors weights to finetune the model on a custom dataset?
-
👋 Welcome!
We’re using Discussions as a place to connect with other members of our community. We hope that you:
build together 💪.
To get started, comment below with an introduction of yourself and tell us about what you do with this community.