Commit c8acf55

update readme (#26)
Improvements to structure and comprehensiveness. We also need to update the screenshot but I'd like to wait until we fix the transcriptions
1 parent 61095e3 commit c8acf55

1 file changed

+37
-14
lines changed

README.md

Lines changed: 37 additions & 14 deletions
@@ -1,42 +1,65 @@
11
<img src="./.github/assets/app-icon.png" alt="Voice Agent App Icon" width="100" height="100">
22

3-
# Swift Voice Agent
3+
# Swift Voice Agent starter app
44

5-
This is a starter template for [LiveKit Agents](https://docs.livekit.io/agents/overview/) that provides a simple voice interface using the [LiveKit Swift SDK](https://github.com/livekit/client-sdk-swift). It supports [voice](https://docs.livekit.io/agents/start/voice-ai), [transcriptions](https://docs.livekit.io/agents/build/text/), and [virtual avatars](https://docs.livekit.io/agents/integrations/avatar/).
5+
This starter app template for [LiveKit Agents](https://docs.livekit.io/agents/overview/) provides a simple voice interface using the [LiveKit Swift SDK](https://github.com/livekit/client-sdk-swift). It supports [voice](https://docs.livekit.io/agents/start/voice-ai), [transcriptions](https://docs.livekit.io/agents/build/text/), [live video input](https://docs.livekit.io/agents/build/vision/#video), and [virtual avatars](https://docs.livekit.io/agents/integrations/avatar/).
66

77
This template is compatible with iOS, iPadOS, macOS, and visionOS and is free for you to use or modify as you see fit.
88

99
<img src="./.github/assets/screenshot.png" alt="Voice Agent Screenshot" height="500">
1010

1111
## Getting started
1212

13-
The easiest way to get this app running is with the [Sandbox for LiveKit Cloud](https://cloud.livekit.io/projects/p_/sandbox) and the [LiveKit CLI](https://docs.livekit.io/home/cli/cli-setup/).
13+
First, you'll need a LiveKit agent to speak with. Try our starter agent for [Python](https://github.com/livekit-examples/agent-starter-python), [Node.js](https://github.com/livekit-examples/agent-starter-node), or [create your own from scratch](https://docs.livekit.io/agents/start/voice-ai/).
1414

15-
First, create a new [Sandbox Token Server](https://cloud.livekit.io/projects/p_/sandbox/templates/token-server) for your LiveKit Cloud project.
15+
Second, you need a token server. The easiest way to set this up is with the [Sandbox for LiveKit Cloud](https://cloud.livekit.io/projects/p_/sandbox) and the [LiveKit CLI](https://docs.livekit.io/home/cli/cli-setup/).
1616

17-
Then, run the following command to automatically clone this template and connect it to LiveKit Cloud.
17+
To do this, create a new [Sandbox Token Server](https://cloud.livekit.io/projects/p_/sandbox/templates/token-server) for your LiveKit Cloud project.
18+
Then, run the following command to automatically clone this template and connect it to LiveKit Cloud. This will create a new Xcode project in the current directory.
1819

1920
```bash
2021
lk app create --template agent-starter-swift --sandbox <token_server_sandbox_id>
2122
```
2223

23-
Built and run the app from Xcode by opening `VoiceAgent.xcodeproj`. You may need to adjust your app signing settings to run the app on your device.
24-
25-
You'll also need an agent to speak with. Try our starter agent for [Python](https://github.com/livekit-examples/agent-starter-python), [Node.js](https://github.com/livekit-examples/agent-starter-node), or [create your own from scratch](https://docs.livekit.io/agents/start/voice-ai/).
24+
Then, build and run the app from Xcode by opening `VoiceAgent.xcodeproj`. You may need to adjust your app signing settings to run the app on your device.
2625

2726
> [!NOTE]
2827
> To set up without the LiveKit CLI, clone the repository and then either create a `VoiceAgent/.env.xcconfig` with a `LIVEKIT_SANDBOX_ID` (if using a [Sandbox Token Server](https://cloud.livekit.io/projects/p_/sandbox/templates/token-server)), or open `TokenService.swift` and add your [manually generated](#token-generation-in-production) URL and token.
2928
30-
## Token generation
29+
## Feature overview
3130

32-
In a production environment, you will be responsible for developing a solution to [generate tokens for your users](https://docs.livekit.io/home/server/generating-tokens/) which is integrated with your authentication solution. You should disable your sandbox token server and modify `TokenService.swift` to use your own token server.
31+
This starter app supports several features of the agents framework. Each one can be enabled or disabled in code as you adapt this template to your own use case.
3332

34-
## Chat transcription
33+
### Text, video, and voice input
3534

36-
The app supports agent [transcriptions](https://docs.livekit.io/agents/build/text/). It requires some client-side processing to aggregate the partial results into messages. `TranscriptionStreamReceiver` is responsible for this aggregation. It buffers stream chunks and publishes complete messages when the transcription is finished. Messages have unique IDs and timestamps to help with ordering and display in the UI.
35+
This app supports text, video, and/or voice input according to the needs of your agent. To update the features enabled in the app, edit `VoiceAgent/VoiceAgentApp.swift` and update `AgentFeatures.current` to include or exclude the features you need.
3736

38-
> [!NOTE]
39-
> Text streams are fully supported in LiveKit Agents v1, for v0.x, you'll need to use legacy [transcription events](https://docs.livekit.io/agents/build/text/#transcription-events) as shown in `TranscriptionDelegateReceiver.swift`.
37+
By default, only voice and text input are enabled.
38+
39+
Available input types:
40+
- `.voice`: Allows the user to speak to the agent using their microphone. **Requires microphone permissions.**
41+
- `.text`: Allows the user to type to the agent. See [the docs](https://docs.livekit.io/agents/build/text/) for more details.
42+
- `.video`: Allows the user to share their camera or screen with the agent. This requires a supported model like the Gemini Live API. See [the docs](https://docs.livekit.io/agents/build/vision/#video) for more details.
43+
44+
If you have trouble with screen sharing, refer to [the docs](https://docs.livekit.io/home/client/tracks/screenshare/) for more setup instructions.
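
As a rough illustration of the feature toggle described above, the sketch below models `AgentFeatures` as a Swift `OptionSet`. The actual definition in `VoiceAgent/VoiceAgentApp.swift` may differ: the `OptionSet` shape and the raw values are assumptions, and only the names `.voice`, `.text`, `.video`, and `current` come from this README.

```swift
// Hypothetical sketch: the real AgentFeatures in VoiceAgentApp.swift may be declared differently.
struct AgentFeatures: OptionSet {
    let rawValue: Int

    static let voice = AgentFeatures(rawValue: 1 << 0) // microphone input
    static let text  = AgentFeatures(rawValue: 1 << 1) // typed input
    static let video = AgentFeatures(rawValue: 1 << 2) // camera or screen share

    // Default described in this README: voice and text only.
    static let current: AgentFeatures = [.voice, .text]
}

// To enable video as well, the set would become [.voice, .text, .video].
let withVideo: AgentFeatures = AgentFeatures.current.union(.video)
print(withVideo.contains(.video))
```

With a set-based design like this, dropping an input type is a one-line change to `current`, and the UI can check `contains(_:)` before showing the matching control.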
45+
46+
### Preconnect audio buffer
47+
48+
This app uses `withPreConnectAudio` to capture and buffer audio before the room connection completes. This allows the connection to appear "instant" from the user's perspective and makes your app more responsive. To disable this feature, remove the call to `withPreConnectAudio` as below:
49+
50+
- Location: `connectWithVoice()` in `VoiceAgent/App/AppViewModel.swift`
51+
- To disable preconnect buffering but keep voice:
52+
  - Replace the `withPreConnectAudio { ... }` block with a standard `room.connect` call and enable the microphone after connect, for example:
53+
    - Connect with `connectOptions: .init(enableMicrophone: true)` without wrapping in `withPreConnectAudio`, or
54+
    - Connect with microphone disabled and call `room.localParticipant.setMicrophone(enabled: true)` after connection.
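
To make the idea behind preconnect buffering concrete, here is a small conceptual model in plain Swift. It is not the LiveKit SDK's implementation and does not call the SDK: `PreConnectBuffer` and its members are hypothetical names used only to show why audio captured before the connection completes is not lost.

```swift
// Conceptual model only: hypothetical types, not the LiveKit SDK's implementation.
struct PreConnectBuffer {
    private(set) var pending: [String] = [] // chunks captured while still connecting
    private(set) var sent: [String] = []    // chunks delivered to the agent, in order
    private(set) var isConnected = false

    // Before the connection completes, audio is buffered; afterwards it is sent directly.
    mutating func capture(_ chunk: String) {
        if isConnected {
            sent.append(chunk)
        } else {
            pending.append(chunk)
        }
    }

    // On connect, flush the buffered audio first so ordering is preserved.
    mutating func didConnect() {
        isConnected = true
        sent.append(contentsOf: pending)
        pending.removeAll()
    }
}

var buffer = PreConnectBuffer()
buffer.capture("hello")       // spoken while connecting
buffer.capture("agent")       // spoken while connecting
buffer.didConnect()
buffer.capture("how are you") // spoken after connecting
print(buffer.sent)
```

Without the buffer, the first two chunks in this model would have nowhere to go, which corresponds to the user's opening words being dropped when the microphone is only enabled after connecting.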
55+
56+
### Virtual avatar support
57+
58+
If your agent publishes a [virtual avatar](https://docs.livekit.io/agents/integrations/avatar/), this app will automatically render the avatar’s camera feed in `AgentParticipantView` when available.
59+
60+
## Token generation in production
61+
62+
In a production environment, you will be responsible for developing a solution to [generate tokens for your users](https://docs.livekit.io/home/server/generating-tokens/) which is integrated with your authentication solution. You should disable your sandbox token server and modify `TokenService.swift` to use your own token server.
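
One possible shape for the client side of a custom token server is sketched below. Everything specific here is an assumption for illustration: the `ConnectionDetails` fields, the bearer-credential header, and the helper names are hypothetical and should be matched to your own server and to the existing code in `TokenService.swift`.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on Linux
#endif

// Hypothetical response shape; match it to whatever your token server returns.
struct ConnectionDetails: Decodable, Equatable {
    let serverUrl: String
    let participantToken: String
}

// Builds the request to your own token endpoint, passing the user's session
// credential so the server can authenticate the user before issuing a token.
func makeTokenRequest(endpoint: URL, sessionCredential: String) -> URLRequest {
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("Bearer \(sessionCredential)", forHTTPHeaderField: "Authorization")
    return request
}

// Decodes the JSON body returned by the token endpoint.
func decodeConnectionDetails(from data: Data) throws -> ConnectionDetails {
    try JSONDecoder().decode(ConnectionDetails.self, from: data)
}

// Example with a canned response instead of a live network call.
let sample = Data(#"{"serverUrl":"wss://example.invalid","participantToken":"abc"}"#.utf8)
let details = try decodeConnectionDetails(from: sample)
print(details.serverUrl)
```

Keeping request construction and response decoding as separate pure functions makes both easy to unit test; the actual network call can then be a thin wrapper around `URLSession`.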
4063

4164
## Contributing
4265
