Conversation

@ochafik ochafik commented Nov 17, 2024

This should "fix" #7

Not sure this is doing the right thing, to be honest; feedback welcome:

  • The input is currently truncated silently. Making truncation opt-in (and erroring out when the option isn't set) risks most people never enabling the option and then failing unexpectedly the first time they feed larger inputs (in prod).
  • Aligned n_batch = n_ubatch = n_ctx to avoid crashes in llama.cpp (possibly very inefficient?). Also, n_ctx now defaults to the model's n_ctx_train.
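A minimal sketch of the two behaviors described above, with hypothetical helper and struct names (not the PR's actual code, which goes through the llama.cpp bindings):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the context parameters passed to llama.cpp.
struct CtxParams {
    uint32_t n_ctx;
    uint32_t n_batch;
    uint32_t n_ubatch;
};

// Default n_ctx to the model's training context (n_ctx_train) when the
// caller passes 0, then size both batch parameters to match so llama.cpp
// never receives a batch larger than the context.
CtxParams make_ctx_params(uint32_t requested_n_ctx, uint32_t n_ctx_train) {
    const uint32_t n_ctx = requested_n_ctx != 0 ? requested_n_ctx : n_ctx_train;
    return CtxParams{n_ctx, /*n_batch=*/n_ctx, /*n_ubatch=*/n_ctx};
}

// Silently drop tokens past n_ctx instead of erroring out.
std::vector<int32_t> truncate_to_ctx(std::vector<int32_t> tokens, uint32_t n_ctx) {
    if (tokens.size() > n_ctx) {
        tokens.resize(n_ctx);
    }
    return tokens;
}
```

Whether silent truncation is the right default (versus failing loudly) is exactly the open question raised in the first bullet.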

Tested with all-MiniLM-L6-v2.e4ce9877.q8_0.gguf & nomic-embed-text-v1.5.Q8_0.gguf.
