-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple improvements: #16
Conversation
…te PR merges); but also add .prettierrc and yarn script to auto-format the entire repo to encourage the decision to enable this ability
…as not been downloaded and processed yet then return an error message with instructions for how to
…ying where to put it with "LLAMA_DIR" environment variable
I've confirmed that download and quantization fixes work. The UI does, too. |
Thanks @fermigas This patch should be a meaningful improvement for everyone that is playing with this I think @cocktailpeanut would love a code review and merge, or pointers for what to update :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I manually made the changes I reviewed below and tested them.
Downloads and quantization code changes work fine. Approved.
All 4 models work with the new UI changes. Approved.
I did not apply the changes in the files I haven't marked "viewed".
@@ -9,14 +9,14 @@ const start = (port) => { | |||
dalai.http(httpServer) | |||
app.use(express.static(path.resolve(__dirname, 'public'))) | |||
app.use(express.json()); | |||
app.use(express.urlencoded()); | |||
app.use(express.urlencoded({ extended: true })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good. Fixes bug where server emits this line on stating:
body-parser deprecated undefined extended: provide extended option .npm/_npx/3c737cbb02d79cc9/node_modules/dalai/bin/web/index.js:12:19
<select id="model"> | ||
<!-- options: 7B, 13B, 30B, 65B --> | ||
<option value="7B">7B</option> | ||
<option value="13B">13B</option> | ||
<option value="30B">30B</option> | ||
<option value="65B">65B</option> | ||
</select> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really useful, and works fine.
Might consider adding a text input box for # of threads, and temperature too. I'm finding the model pretty sensitive to temperature settings.
console.log(`Skip file download, it already exists: ${file}`) | ||
continue; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very useful. The 6th parameter file of the 65B download kept failing for me. I had to pull it from a torrent to get it to work, then had to do the quantization manually. This and the quantization file check below improve error recover a lot.
@@ -117,12 +132,19 @@ class Dalai { | |||
} | |||
} | |||
async query(req, cb) { | |||
console.log(`> query:`, req) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very useful. You might consider logging the model's output to the console, too.
const outputFile1 = `./models/${model}/ggml-model-f16.bin${suffix}` | ||
const outputFile2 = `./models/${model}/ggml-model-q4_0.bin${suffix}` | ||
if (fs.existsSync(path.resolve(this.home, outputFile1)) && fs.existsSync(path.resolve(this.home, outputFile2))) { | ||
console.log(`Skip quantization, files already exists: ${outputFile1} and ${outputFile2}}`) | ||
continue | ||
} | ||
await this.exec(`./quantize ${outputFile1} ${outputFile2} 2`, this.home) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, this is a big improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
👍 🙏 |
dalai llama
script idempotent:1: Don't re-download files that have already been downloaded
2: Don't re-create models that have already been created
dalai llama <MODEL>
)yarn dalai:llama <optional MODEL>
andyarn start
cwd
by default, and allow for specifying where to put it with theLLAMA_DIR
environment variable./dalai
script in dalai repo to allow for easier local development testing of the CLIdalai llama
succeeds