Commit a295b30

author
Zhi Zhou
committed
Add video samples
1 parent 31e63fb commit a295b30

2 files changed, +119 -20 lines changed

Basic_Samples/GPT-4V/video_chatcompletions_example_restapi.ipynb

Lines changed: 31 additions & 13 deletions
@@ -13,7 +13,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 19,
+"execution_count": 3,
 "id": "f4b3d21a",
 "metadata": {},
 "outputs": [],
@@ -37,7 +37,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 20,
+"execution_count": 4,
 "id": "fd85fb30",
 "metadata": {},
 "outputs": [],
@@ -54,7 +54,7 @@
 "vision_api_endpoint = config_details['VISION_API_ENDPOINT']\n",
 "\n",
 "# Insert your video SAS URL, e.g. https://<your-storage-account-name>.blob.core.windows.net/<your-container-name>/<your-video-name>?<SAS-token>\n",
-"video_SAS_url = config_details[\"VIDEO_SAS_URL\"] \n",
+"video_SAS_url = \"https://gpt4vsamples.blob.core.windows.net/videos/Microsoft%20Copilot%20Short.mp4\" #config_details[\"VIDEO_SAS_URL\"] \n",
 "\n",
 "# This index name must be unique\n",
 "video_index_name = config_details[\"VIDEO_INDEX_NAME\"]\n",
@@ -72,10 +72,21 @@
 },
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": 5,
 "id": "704ffbda",
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"201 {\"name\":\"new-test-zhi-v5\",\"userData\":{},\"features\":[{\"name\":\"vision\",\"modelVersion\":\"2023-05-31\",\"domain\":\"surveillance\"},{\"name\":\"speech\",\"modelVersion\":\"2023-06-30\",\"domain\":\"generic\"}],\"eTag\":\"\\\"d9b271e7335e40b2aab35f0c69562040\\\"\",\"createdDateTime\":\"2023-12-05T18:14:55.4017333Z\",\"lastModifiedDateTime\":\"2023-12-05T18:14:55.4017333Z\"}\n",
+"202 {\"name\":\"my-ingestion\",\"state\":\"Running\",\"batchName\":\"e8adb161-5bb5-4a77-bd2c-738086c815a2\",\"createdDateTime\":\"2023-12-05T18:14:56.2923663Z\",\"lastModifiedDateTime\":\"2023-12-05T18:14:56.6048786Z\"}\n",
+"{'value': [{'name': 'my-ingestion', 'state': 'Completed', 'batchName': 'e8adb161-5bb5-4a77-bd2c-738086c815a2', 'createdDateTime': '2023-12-05T18:14:56.2923663Z', 'lastModifiedDateTime': '2023-12-05T18:18:37.8844717Z'}]}\n",
+"Ingestion completed.\n"
+]
+}
+],
 "source": [
 "# You only need to run this cell once to create the index\n",
 "process_video_indexing(vision_api_endpoint, vision_api_key, video_index_name, video_SAS_url, video_id)"
@@ -90,24 +101,31 @@
 },
 {
 "cell_type": "code",
-"execution_count": 18,
+"execution_count": 6,
 "id": "b6165c63",
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"The video advertisement appears to focus on a company that specializes in advanced technology and space exploration.\n",
-"Product characteristics highlighted include a small handheld device with a screen, possibly a gadget related to space technology, along with detailed whiteboards filled with technical drawings and equations, hinting at the company's focus on innovation and complex engineering.\n",
+"The video appears to be a visually stimulating advertisement for a digital or software product.\n",
+"It begins with a blurred image that sets a serene mood with pastel colors.\n",
+"At timestamp 00:00:04.6000000, the video presents a dynamic logo with a vibrant color palette of blues, pinks, and purples, suggesting creativity and innovation.\n",
+"By 00:00:09.2000000, we see a user interface that seems to be for a computer application or operating system, indicating the product's utility in organizing and enhancing digital workspaces.\n",
+"\n",
+"At 00:00:13.8000000, there's a close-up of a digital screen with simulated features and a disclaimer noting that the screens are simulated, implying that the product is in development and features may vary.\n",
+"The focus on the visuals continues with abstract, soft textures and colors at 00:00:18.4000000, which could represent the ease and fluidity of using the software.\n",
+"\n",
+"The video continues to highlight the software's features through various screen displays set against a backdrop of tranquil and dreamy environments, as seen at timestamps 00:00:23 and 00:00:32.2000000.\n",
+"These images are combined with text overlays such as \"Sustainable Design: What, Why, and How\" and email interfaces, hinting at the software's potential applications in productivity and creativity.\n",
 "\n",
-"The background images include a bustling office environment, a view of the Earth from space, and a conceptual Mars habitat with a geodesic dome structure, indicating the company's involvement in space missions and habitat development.\n",
-"The lighting throughout the video is bright and clear, suggesting a transparent and futuristic atmosphere.\n",
-"The color palette consists of whites, reds, and blues, which are often associated with technology and space exploration.\n",
+"At 00:00:27.6000000 and 00:00:36.8000000, the video includes call-to-action buttons with phrases like \"Organize my plans\" and \"Help me relax,\" which suggest the software's functionality in assisting with personal organization and stress relief.\n",
 "\n",
-"The human characteristics shown in the video, although faces are not visible, include individuals in professional attire, such as a white lab coat and a shirt with the company logo, which portrays a sense of expertise and professionalism within the company.\n",
+"The final frame at 00:00:41.4000000 depicts a neatly organized workspace in a peaceful evening setting, reinforcing the message of tranquility and control over one's digital environment.\n",
 "\n",
-"In summary, the main message of the advertisement video seems to be highlighting the company's role in pioneering space technology and exploration, showcasing their expertise, innovative products, and the futuristic vision they are contributing towards.\n"
+"In summary, the main message of the advertisement seems to be showcasing a software product that is designed to enhance creativity, productivity, and personal well-being through a visually appealing and intuitive interface.\n",
+"It emphasizes the software's role in creating a harmonious digital space that can adapt to various user needs, from organization to relaxation.\n"
 ]
 }
 ],
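
Note on the ingestion output in cell `704ffbda`: the captured stdout shows the ingestion moving from `Running` to `Completed` before `Ingestion completed.` is printed. The body of `process_video_indexing` is not part of this diff, so the following is only a sketch of how such a wait loop might look. The names `wait_for_ingestion` and `fetch_ingestions`, the `Failed` state, and the default intervals are assumptions for illustration; only the `value`/`state` response shape and the `Running` and `Completed` values are taken from the output above.

```python
import time


def wait_for_ingestion(fetch_ingestions, poll_seconds=10.0,
                       timeout_seconds=600.0, sleep=time.sleep):
    """Poll until every ingestion in the listing reports 'Completed'.

    fetch_ingestions is a hypothetical zero-argument callable that returns
    the parsed JSON listing, shaped like {'value': [{'state': ...}, ...]}.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        payload = fetch_ingestions()
        states = [item.get("state") for item in payload.get("value", [])]
        # Done only when at least one ingestion exists and all are complete.
        if states and all(state == "Completed" for state in states):
            return payload
        # 'Failed' is an assumed terminal state; it does not appear in the diff.
        if "Failed" in states:
            raise RuntimeError(f"Ingestion failed: {payload}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Ingestion did not complete in time")
        sleep(poll_seconds)
```

Passing the fetch step in as a callable keeps the loop independent of any particular HTTP client, and the injectable `sleep` makes it possible to exercise the loop without waiting.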

Basic_Samples/GPT-4V/video_chunk_chatcompletions_example_restapi.ipynb

Lines changed: 88 additions & 7 deletions
@@ -13,7 +13,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 1,
 "id": "f4b3d21a",
 "metadata": {},
 "outputs": [],
@@ -40,7 +40,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 2,
 "id": "fd85fb30",
 "metadata": {},
 "outputs": [],
@@ -57,13 +57,13 @@
 "vision_api_endpoint = config_details['VISION_API_ENDPOINT']\n",
 "\n",
 "# Insert your video SAS URL, e.g. https://<your-storage-account-name>.blob.core.windows.net/<your-container-name>/<your-video-name>?<SAS-token>\n",
-"video_SAS_url = config_details[\"VIDEO_SAS_URL\"] \n",
+"video_SAS_url = \"https://gpt4vsamples.blob.core.windows.net/videos/Redwire%20Field%20Trip%20-%203D%20Printing%20a%20Zune.mkv\" #config_details[\"VIDEO_SAS_URL\"]\n",
 "\n",
 "# This index name must be unique\n",
 "video_index_name = config_details[\"VIDEO_INDEX_NAME\"]\n",
 "\n",
 "# This video ID must be unique\n",
-"video_id = config_details[\"VIDEO_INDEX_ID\"] # This video ID must be unique"
+"video_id = config_details[\"VIDEO_INDEX_ID\"]"
 ]
 },
 {
{
@@ -75,7 +75,7 @@
7575
},
7676
{
7777
"cell_type": "code",
78-
"execution_count": null,
78+
"execution_count": 6,
7979
"id": "704ffbda",
8080
"metadata": {},
8181
"outputs": [],
@@ -93,9 +93,90 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 5,
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"Video Length: 437.28 seconds\n",
+"Segment 1: How many scenes from 0s to 20s?\n",
+"Segment 2: How many scenes from 20s to 40s?\n",
+"Segment 3: How many scenes from 40s to 60s?\n",
+"Segment 4: How many scenes from 60s to 80s?\n",
+"Segment 5: How many scenes from 80s to 100s?\n",
+"Segment 6: How many scenes from 100s to 120s?\n",
+"Segment 7: How many scenes from 120s to 140s?\n",
+"Segment 8: How many scenes from 140s to 160s?\n",
+"Segment 9: How many scenes from 160s to 180s?\n",
+"Segment 10: How many scenes from 180s to 200s?\n",
+"Segment 11: How many scenes from 200s to 220s?\n",
+"Segment 12: How many scenes from 220s to 240s?\n",
+"Segment 13: How many scenes from 240s to 260s?\n",
+"Segment 14: How many scenes from 260s to 280s?\n",
+"Segment 15: How many scenes from 280s to 300s?\n",
+"Segment 16: How many scenes from 300s to 320s?\n",
+"Segment 17: How many scenes from 320s to 340s?\n",
+"Segment 18: How many scenes from 340s to 360s?\n",
+"Segment 19: How many scenes from 360s to 380s?\n",
+"Segment 20: How many scenes from 380s to 400s?\n",
+"Segment 21: How many scenes from 400s to 420s?\n",
+"Segment 22: How many scenes from 420s to 437.28s?\n",
+"From 420s to 437.28s, there is one continuous scene.\n",
+"This scene is a continuation from the previous segment, showing more of the facility's advanced technology.\n",
+"\n",
+"Combining this with the previous segments, the total number of scenes from 0s to 437.28s is twenty-three:\n",
+"\n",
+"1.\n",
+"From 0s to 20s, one continuous scene.\n",
+"2.\n",
+"From 20s to 40s, a scene featuring a red and silver object with a circular control interface.\n",
+"3.\n",
+"From 40s to 60s, a scene with an intense orange and white color scheme.\n",
+"4.\n",
+"From 60s to 80s, a close-up of a green and orange electronic device.\n",
+"5.\n",
+"From 80s to 100s, a scene featuring a 3D printer.\n",
+"6.\n",
+"From 100s to 120s, a space-themed animation with a cartoon dog.\n",
+"7.\n",
+"From 120s to 140s, continuation of the close-up of the red electronic device.\n",
+"8.\n",
+"From 140s to 160s, a bright star-like object transitioning to a Mars-like landscape.\n",
+"9.\n",
+"From 160s to 180s, a scene in a laboratory environment.\n",
+"10.\n",
+"From 180s to 200s, continuation within the laboratory environment.\n",
+"11.\n",
+"From 200s to 220s, the same high-tech laboratory environment.\n",
+"12.\n",
+"From 220s to 240s, a scene with a robotic arm.\n",
+"13.\n",
+"From 240s to 260s, an indoor setting with a modern doorway.\n",
+"14.\n",
+"From 260s to 280s, continuation of the indoor setting with the transportation pod.\n",
+"15.\n",
+"From 280s to 300s, one continuous scene extending from the previous segment.\n",
+"16.\n",
+"From 300s to 320s, a scene inside a control room with various monitors and equipment.\n",
+"17.\n",
+"From 320s to 340s, continuation of the scene with the \"MADE IN SPACE\" equipment inside the facility.\n",
+"18.\n",
+"From 340s to 360s, continuation of the scene inside the facility focusing on the \"MADE IN SPACE\" equipment.\n",
+"19.\n",
+"From 360s to 380s, continuation of the scene showing a close-up of the \"MADE IN SPACE\" equipment and its components.\n",
+"20.\n",
+"From 380s to 400s, continuation of the scene featuring the \"MADE IN SPACE\" equipment within the facility.\n",
+"21.\n",
+"From 400s to 420s, continuation of the scene, further showcasing the interior of the facility and its technology.\n",
+"22.\n",
+"From 420s to 437.28s, continuation of the scene, highlighting advanced technology at the facility.\n",
+"\n",
+"These twenty-three scenes together provide a comprehensive visual exploration of the different settings and technological advancements highlighted in the video.\n"
+]
+}
+],
 "source": [
 "# Call GPT-4V API with Video Index on Each Video Chunk Sequentially\n",
 "\n",

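Note on the chunked output: the captured stdout suggests the notebook splits the 437.28-second video into 20-second windows and asks the same question for each window. The cell source is truncated in this diff, so the following is a sketch under that assumption rather than the notebook's actual code; `segment_prompts`, `chunk_seconds`, and `_fmt` are hypothetical names.

```python
def _fmt(seconds):
    """Render whole seconds without a trailing '.0' (20, not 20.0)."""
    return str(int(seconds)) if float(seconds).is_integer() else str(seconds)


def segment_prompts(video_length, chunk_seconds=20):
    """Build one 'how many scenes' prompt per fixed-size window.

    The last window is clipped to the video length, so a 437.28 s video
    ends with a window from 420 s to 437.28 s.
    """
    prompts = []
    start = 0
    while start < video_length:
        end = min(start + chunk_seconds, video_length)
        prompts.append(
            f"Segment {len(prompts) + 1}: "
            f"How many scenes from {_fmt(start)}s to {_fmt(end)}s?"
        )
        start = end
    return prompts


if __name__ == "__main__":
    for line in segment_prompts(437.28):
        print(line)
```

For a length of 437.28 seconds this yields 22 prompts, matching the `Segment 1` through `Segment 22` lines in the recorded output. The recorded model response likewise enumerates 22 entries, although its summary sentence says twenty-three.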