Conversation

@RaymondLi0 (Contributor)
✨ Description

Small changes needed to load Apriel-1.5-15B:

  • Remove projector_intermediate_size from the llava_hybrid and llava converters
  • Add HF's gelu activation

The diff under review adds a new member to the activation-type enum:

```python
    An enum for the available activation types for the MLP layer.
    """

    gelu_gaussian = "gelu_gaussian"
```
Collaborator

There is no point in having a separate gelu because we're not going for an exact numerical match, and we already have bigger numerical differences anyway. HF's gelu and gelu_pytorch_tanh should both map to our standard gelu.
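The collapsing the reviewer suggests could be sketched as a many-to-one name mapping in the converter. This is a minimal illustration only; the dictionary and function names below are hypothetical and not taken from the actual converter code.

```python
# Hypothetical sketch: collapse HF activation-name variants onto one
# internal gelu, as the reviewer suggests. Names here are illustrative,
# not the real converter API.
_HF_TO_INTERNAL_ACTIVATION = {
    "gelu": "gelu",               # HF's exact (erf-based) gelu
    "gelu_pytorch_tanh": "gelu",  # HF's tanh approximation
    "silu": "silu",
    "relu": "relu",
}


def convert_activation(hf_name: str) -> str:
    """Map an HF activation name to the internal activation type."""
    try:
        return _HF_TO_INTERNAL_ACTIVATION[hf_name]
    except KeyError:
        raise ValueError(f"Unsupported activation: {hf_name}") from None
```

With this shape, both HF gelu variants resolve to the same internal activation, so no separate enum member is needed.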

