       description: "Leaderboard to evaluate vision language models.",
       id: "opencompass/open_vlm_leaderboard",
     },
     {
-      description: "Vision language models arena, where models are ranked by votes of users.",
-      id: "WildVision/vision-arena",
-    },
-    {
-      description: "Powerful vision-language model assistant.",
-      id: "akhaliq/Molmo-7B-D-0924",
-    },
-    {
-      description: "Powerful vision language assistant that can understand multiple images.",
-      id: "HuggingFaceTB/SmolVLM2",
-    },
-    {
-      description: "An application for chatting with an image-text-to-text model.",
-      id: "GanymedeNil/Qwen2-VL-7B",
-    },
-    {
-      description: "An application that parses screenshots into actions.",
-      id: "showlab/ShowUI",
+      description: "An application that compares object detection capabilities of different vision language models.",
+      id: "sergiopaniego/vlm_object_understanding",
     },
     {
-      description: "An application that detects gaze.",
-      id: "moondream/gaze-demo",
+      description: "An application to compare different OCR models.",
+      id: "prithivMLmods/Multimodal-OCR",
     },
   ],
   summary:
     "Image-text-to-text models take in an image and text prompt and output text. These models are also called vision-language models, or VLMs. The difference from image-to-text models is that these models take an additional text input, not restricting the model to certain use cases like image captioning, and may also be trained to accept a conversation as input.",

-      description: "Image enhancer application for low light.",
-      id: "keras-io/low-light-image-enhancement",
+      description: "Image editing application.",
+      id: "black-forest-labs/FLUX.1-Kontext-Dev",
     },
     {
-      description: "Style transfer application.",
-      id: "keras-io/neural-style-transfer",
+      description: "Image relighting application.",
+      id: "lllyasviel/iclight-v2-vary",
     },
     {
-      description: "An application that generates images based on segment control.",
-      id: "mfidabel/controlnet-segment-anything",
-    },
-    {
-      description: "Image generation application that takes image control and text prompt.",
-      id: "hysts/ControlNet",
-    },
-    {
-      description: "Colorize any image using this app.",
-      id: "ioclab/brightness-controlnet",
-    },
-    {
-      description: "Edit images with instructions.",
-      id: "timbrooks/instruct-pix2pix",
+      description: "An application for image upscaling.",
+      id: "jasperai/Flux.1-dev-Controlnet-Upscaler",
     },
   ],
   summary:
     "Image-to-image is the task of transforming an input image through a variety of possible manipulations and enhancements, such as super-resolution, image inpainting, colorization, and more.",