@@ -3,6 +3,7 @@ description: 'Dataset containing 1 million articles from Wikipedia and their vec
 sidebar_label: 'dbpedia dataset'
 slug: /getting-started/example-datasets/dbpedia-dataset
 title: 'dbpedia dataset'
+keywords: ['semantic search', 'vector similarity', 'approximate nearest neighbours', 'embeddings']
 ---

 The [dbpedia dataset](https://huggingface.co/datasets/Qdrant/dbpedia-entities-openai3-text-embedding-3-large-1536-1M) contains 1 million articles from Wikipedia and their vector embeddings generated using the `text-embedding-3-large` model from OpenAI.
@@ -84,8 +85,10 @@ SELECT id, title
 FROM dbpedia
 ORDER BY cosineDistance(vector, (SELECT vector FROM dbpedia WHERE id = '<dbpedia:The_Remains_of_the_Day>')) ASC
 LIMIT 20
+```

-```response title="Response" ┌─id────────────────────────────────────────┬─title───────────────────────────┐
+```response title="Response"
+    ┌─id────────────────────────────────────────┬─title───────────────────────────┐
  1. │ <dbpedia:The_Remains_of_the_Day> │ The Remains of the Day │
  2. │ <dbpedia:The_Remains_of_the_Day_(film)> │ The Remains of the Day (film) │
  3. │ <dbpedia:Never_Let_Me_Go_(novel)> │ Never Let Me Go (novel) │
@@ -122,7 +125,6 @@ Run the following SQL to define and build a vector similarity index on the `vect
 ```sql
 ALTER TABLE dbpedia ADD INDEX vector_index vector TYPE vector_similarity('hnsw', 'cosineDistance', 1536, 'bf16', 64, 512);

-
 ALTER TABLE dbpedia MATERIALIZE INDEX vector_index;
 ```

@@ -136,7 +138,7 @@ _Approximate Nearest Neighbours_ or ANN refers to group of techniques (e.g., spe

 Once the vector similarity index has been built, vector search queries will automatically use the index:

-```sql
+```sql title="Query"
 SELECT
     id,
     title
@@ -147,8 +149,10 @@ ORDER BY cosineDistance(vector, (
         WHERE id = '<dbpedia:Glacier_Express>'
     )) ASC
 LIMIT 20
+```

-```response title="Response" ┌─id──────────────────────────────────────────────┬─title─────────────────────────────────┐
+```response title="Response"
+    ┌─id──────────────────────────────────────────────┬─title─────────────────────────────────┐
  1. │ <dbpedia:Glacier_Express> │ Glacier Express │
  2. │ <dbpedia:BVZ_Zermatt-Bahn> │ BVZ Zermatt-Bahn │
  3. │ <dbpedia:Gornergrat_railway> │ Gornergrat railway │
@@ -172,6 +176,7 @@ LIMIT 20
     └─────────────────────────────────────────────────┴───────────────────────────────────────┘
 #highlight-next-line
 20 rows in set. Elapsed: 0.025 sec. Processed 32.03 thousand rows, 2.10 MB (1.29 million rows/s., 84.80 MB/s.)
+```

 ## Generating embeddings for search query {#generating-embeddings-for-search-query}

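
As a reading aid for the queries touched in this diff: `ORDER BY cosineDistance(vector, ...) ASC LIMIT 20` ranks rows by one minus the cosine similarity to a reference vector and keeps the 20 closest. The Python sketch below shows that exact (brute-force) ranking, which the vector similarity index approximates. It is an illustration only, not how ClickHouse implements the search; the function names `cosine_distance` and `nearest` and the 3-dimensional toy vectors are hypothetical stand-ins for the 1536-dimensional embeddings.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(rows, query_vector, limit=20):
    """Exact search: score every row, sort ascending by distance, keep `limit` ids."""
    scored = sorted(rows, key=lambda row: cosine_distance(row[1], query_vector))
    return [row_id for row_id, _ in scored[:limit]]


# Hypothetical (id, vector) rows standing in for the dbpedia table.
rows = [
    ("a", [1.0, 0.0, 0.0]),
    ("b", [0.9, 0.1, 0.0]),
    ("c", [0.0, 1.0, 0.0]),
    ("d", [-1.0, 0.0, 0.0]),
]

# Use the vector of row "a" as the reference, like the subquery in the SQL.
print(nearest(rows, rows[0][1], limit=3))  # prints ['a', 'b', 'c']
```

The reference row always ranks first with distance 0, which matches the first row of both response tables in the diff.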