Commit 476026f

feat: Support Voice rcv (#1288)
* feat: Support voice rcv
* style: linter pass
* fix: further fixes to voice rcv (see desc.)
  - Handle rtp header ext without discarding
  - ffmpeg now respects decoder settings
  - fix errant decode_float opus def
  - add `get_ssrc` method to recorder
  - sockets are no longer blocking
  - irregular sounds (voice) will no longer crash the opus decoder
* ci: correct from checks.
* feat: add ctx.voice_state
* feat 💥: support recording to file
  breaking as `start`/`stop` recording are now async
* fix: prevent re-use of recorders
  new recorders are now created as needed
* ci: correct from checks.
* style: linter pass
* fix: ensure recorder can exit when no audio received
* feat: add elapsed time to recorder
* ci: correct from checks.
* fix: ensure socket buffer is purged before recording
* feat: send udp keepalive packets
* feat: switch to monotonic clock instead of using the audio timestamp
* ci: correct from checks.
* docs: add voice recorder docs
* fix: ensure non-ffmpeg encoders output to file when expected
* refactor: update to comply with linter

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 40bd6d7 commit 476026f

13 files changed, +976 -48 lines changed

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+::: interactions.api.voice.recorder

docs/src/Guides/23 Voice.md

Lines changed: 29 additions & 10 deletions
@@ -55,20 +55,39 @@ async def play_file(ctx: interactions.InteractionContext):
 
 Check out [Active Voice State](/interactions.py/API Reference/API Reference/models/Internal/active_voice_state/) for a list of available methods and attributes.
 
-## Okay, but what about Soundcloud?
+# Voice Recording
 
-interactions.py has an extension library called [`NAFFAudio`](https://github.com/NAFTeam/NAFF-Audio) which can help with that.
-It has an object called `YTAudio` which can be used to play audio from Soundcloud and other video platforms.
+So you've got a bot that can play music, but what about recording? Well, you're in luck! We've got you covered.
 
-```
-pip install naff_audio
-```
+Let's start with a simple example:
 
 ```python
-from naff_audio import YTAudio
+import asyncio
+import interactions
+
+@interactions.slash_command("record", "record some audio")
+async def record(ctx: interactions.InteractionContext):
+    voice_state = await ctx.author.voice.channel.connect()
+
+    # Start recording
+    await voice_state.start_recording()
+    await asyncio.sleep(10)
+    await voice_state.stop_recording()
+    await ctx.send(files=[interactions.File(file, file_name="user_id.mp3") for user_id, file in voice_state.recorder.output.items()])
+```
+This code will connect to the author's voice channel, start recording, wait 10 seconds, stop recording, and send a file for each user that was recorded.
 
-audio = await YTAudio.from_url("https://soundcloud.com/rick-astley-official/never-gonna-give-you-up-4")
-await voice_state.play(audio)
+But what if you didn't want to use `mp3` files? Well, you can change that too! Just pass the encoding you want to use to `start_recording`.
+
+```python
+await voice_state.start_recording(encoding="wav")
 ```
 
-`NAFFAudio` also contains other useful features for audio-bots. Check it out if that's your *jam*.
+For a list of available encodings, check out Recorder's [documentation](/interactions.py/API Reference/API_Communication/voice/recorder.md)
+
+Are you going to be recording for a long time? You are going to want to write the files to disk instead of keeping them in memory. You can do that too!
+
+```python
+await voice_state.start_recording(output_dir="folder_name")
+```
+This will write the files to the folder `folder_name` in the current working directory, please note that the library will not create the folder for you, nor will it delete the files when you're done.
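
Since the guide text says the library will not create the output folder, a caller would presumably have to create it before recording starts. The sketch below shows one way to do that with the standard library. It also builds a per-user file name, because the example in the guide passes the literal string `"user_id.mp3"` for every upload. Both `ensure_output_dir` and `recording_file_name` are hypothetical helpers for illustration and are not part of interactions.py.

```python
from pathlib import Path


def ensure_output_dir(folder_name: str) -> Path:
    """Create the recording folder if it is missing and return its path.

    Hypothetical helper, not part of interactions.py.
    """
    path = Path(folder_name)
    # parents=True creates intermediate folders; exist_ok=True makes the
    # call safe to repeat when the folder already exists.
    path.mkdir(parents=True, exist_ok=True)
    return path


def recording_file_name(user_id: int, encoding: str = "mp3") -> str:
    """Build a per-user file name such as "1234.mp3" (hypothetical helper)."""
    return f"{user_id}.{encoding}"
```

Inside the command, the call could then read `await voice_state.start_recording(output_dir=str(ensure_output_dir("folder_name")))`, and the upload could pass `file_name=recording_file_name(user_id)`. Both uses assume the `start_recording` and `interactions.File` signatures shown in the guide text.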

interactions/api/voice/audio.py

Lines changed: 90 additions & 10 deletions
@@ -5,19 +5,70 @@
 import time
 from abc import ABC, abstractmethod
 from pathlib import Path
-from typing import Union, Optional
+from typing import Union, Optional, TYPE_CHECKING
 
-__all__ = (
-    "AudioBuffer",
-    "BaseAudio",
-    "Audio",
-    "AudioVolume",
-)
+__all__ = ("AudioBuffer", "BaseAudio", "Audio", "AudioVolume", "RawInputAudio")
 
 from interactions.client.const import get_logger
 from interactions.api.voice.opus import Encoder
 from interactions.client.utils import FastJson
 
+if TYPE_CHECKING:
+    from interactions.api.voice.recorder import Recorder
+
+
+class RawInputAudio:
+    decoded: bytes
+    """The decoded audio"""
+    pcm: bytes
+    """The raw PCM audio"""
+    sequence: int
+    """The audio sequence"""
+    audio_timestamp: int
+    """The current timestamp for this audio"""
+    timestamp_ns: float
+    """The time this audio was received, in nanoseconds"""
+    timestamp: float
+    """The time this audio was received, in seconds"""
+    ssrc: int
+    """The source of this audio"""
+    _recoder: "Recorder"
+    """A reference to the audio recorder managing this object"""
+
+    def __init__(self, recorder: "Recorder", data: bytes) -> None:
+        self.decoded: bytes = b""
+        self._recorder = recorder
+        self.timestamp_ns = time.monotonic_ns()
+        self.timestamp = self.timestamp_ns / 1e9
+        self.pcm = b""
+
+        self.ingest(data)
+
+    def ingest(self, data: bytes) -> bytes | None:
+        data = bytearray(data)
+        header = data[:12]
+
+        decrypted: bytes = self._recorder.decrypt(header, data[12:])
+        self.ssrc = int.from_bytes(header[8:12], byteorder="big")
+        self.sequence = int.from_bytes(header[2:4], byteorder="big")
+        self.audio_timestamp = int.from_bytes(header[4:8], byteorder="big")
+
+        if not self._recorder.recording_whitelist or self.user_id in self._recorder.recording_whitelist:
+            # noinspection PyProtectedMember
+            if decrypted[0] == 0xBE and decrypted[1] == 0xDE:
+                # rtp header extension, remove it
+                header_ext_length = int.from_bytes(decrypted[2:4], byteorder="big")
+                decrypted = decrypted[4 + 4 * header_ext_length :]
+            self.decoded = self._recorder.get_decoder(self.ssrc).decode(decrypted)
+            return self.decoded
+
+    @property
+    def user_id(self) -> Optional[int]:
+        """The ID of the user who made this audio."""
+        while not self._recorder.state.ws.user_ssrc_map.get(self.ssrc):
+            time.sleep(0.05)
+        return self._recorder.state.ws.user_ssrc_map.get(self.ssrc)["user_id"]
+
 
 
 class AudioBuffer:
     def __init__(self) -> None:
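
The header handling in `RawInputAudio.ingest` can be tried in isolation. The sketch below reads the same three fields from the fixed 12-byte RTP header (sequence at bytes 2-4, timestamp at 4-8, SSRC at 8-12, all big-endian) and strips a `0xBEDE` header extension the same way the hunk does. It deliberately leaves out decryption and Opus decoding, which the real code delegates to the recorder. `RtpHeader`, `parse_rtp_header`, and `strip_header_extension` are hypothetical names and not part of the library.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    """The three RTP header fields that the hunk above extracts."""

    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Read sequence, timestamp and SSRC from the first 12 bytes of a packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least a 12-byte header")
    # ">xxHII": big-endian, skip 2 bytes, then a uint16 sequence number,
    # a uint32 timestamp and a uint32 SSRC (12 bytes in total).
    sequence, timestamp, ssrc = struct.unpack(">xxHII", packet[:12])
    return RtpHeader(sequence, timestamp, ssrc)


def strip_header_extension(payload: bytes) -> bytes:
    """Drop an RTP header extension marked by 0xBE 0xDE, if one is present."""
    if len(payload) >= 4 and payload[0] == 0xBE and payload[1] == 0xDE:
        # The 16-bit length counts the 32-bit words after the 4-byte preamble.
        ext_words = int.from_bytes(payload[2:4], byteorder="big")
        return payload[4 + 4 * ext_words :]
    return payload
```

Unlike the hunk, this sketch checks the payload length before indexing, so a payload shorter than four bytes is returned unchanged instead of raising `IndexError`.
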
@@ -38,25 +89,54 @@ def extend(self, data: bytes) -> None:
         with self._lock:
             self._buffer.extend(data)
 
-    def read(self, total_bytes: int) -> bytearray:
+    def read(self, total_bytes: int, *, pad: bool = True) -> bytearray:
         """
         Read `total_bytes` bytes of audio from the buffer.
 
         Args:
             total_bytes: Amount of bytes to read.
+            pad: Whether to pad incomplete frames with 0's.
 
         Returns:
             Desired amount of bytes
+
+        Raises:
+            ValueError: If `pad` is False and the buffer does not contain enough data.
         """
         with self._lock:
             view = memoryview(self._buffer)
             self._buffer = bytearray(view[total_bytes:])
             data = bytearray(view[:total_bytes])
             if 0 < len(data) < total_bytes:
-                # pad incomplete frames with 0's
-                data.extend(b"\0" * (total_bytes - len(data)))
+                if pad:
+                    # pad incomplete frames with 0's
+                    data.extend(b"\0" * (total_bytes - len(data)))
+                else:
+                    raise ValueError(
+                        f"Buffer does not contain enough data to fulfill request {len(data)} < {total_bytes}"
+                    )
             return data
 
+    def read_max(self, total_bytes: int) -> bytearray:
+        """
+        Read up to `total_bytes` bytes of audio from the buffer.
+
+        Args:
+            total_bytes: Maximum amount of bytes to read.
+
+        Returns:
+            Desired amount of bytes
+
+        Raises:
+            EOFError: If the buffer is empty.
+        """
+        with self._lock:
+            if len(self._buffer) == 0:
+                raise EOFError("Buffer is empty")
+            view = memoryview(self._buffer)
+            self._buffer = bytearray(view[total_bytes:])
+            return bytearray(view[:total_bytes])
+
 
 class BaseAudio(ABC):
     """Base structure of the audio."""

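To make the difference between the padded read, the strict read, and `read_max` concrete, here is a simplified stand-in for the buffer that follows the same rules as the `AudioBuffer` hunk. `SketchBuffer` is a hypothetical class written for illustration. It uses slice deletion instead of the `memoryview` copy in the diff, so it is a sketch of the behaviour and not the library's implementation.

```python
import threading


class SketchBuffer:
    """Simplified stand-in for AudioBuffer (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._buffer = bytearray()
        self._lock = threading.Lock()

    def extend(self, data: bytes) -> None:
        with self._lock:
            self._buffer.extend(data)

    def read(self, total_bytes: int, *, pad: bool = True) -> bytearray:
        """Return total_bytes bytes, zero-padding a short final frame if pad is set."""
        with self._lock:
            data = self._buffer[:total_bytes]
            # As in the diff, the bytes are consumed before the length check.
            del self._buffer[:total_bytes]
            if 0 < len(data) < total_bytes:
                if not pad:
                    raise ValueError(
                        f"Buffer does not contain enough data: {len(data)} < {total_bytes}"
                    )
                data.extend(b"\0" * (total_bytes - len(data)))
            return data

    def read_max(self, total_bytes: int) -> bytearray:
        """Return up to total_bytes bytes, raising EOFError on an empty buffer."""
        with self._lock:
            if not self._buffer:
                raise EOFError("Buffer is empty")
            data = self._buffer[:total_bytes]
            del self._buffer[:total_bytes]
            return data
```

One consequence worth knowing when choosing `pad=False`: because the slice is removed before the length check, a strict read that raises `ValueError` has already consumed the partial data, both in this sketch and in the hunk above.
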