Realtime simultaneous recording and playback (#5)
* feat: added new mic recording and audio playback

* chore: bump lib version

* chore: updated readme
demchuk-alex authored Dec 22, 2024
1 parent 285c5fc commit 3c182ad
Showing 12 changed files with 1,015 additions and 95 deletions.
7 changes: 6 additions & 1 deletion .gitignore
@@ -54,4 +54,9 @@ yarn-debug.log
yarn-error.log

# Expo
.expo/*
.expo/*

# Example
/example/ios/*
/example/android/*
example/ios/expoaudiostreamexample.xcodeproj/project.pbxproj
171 changes: 141 additions & 30 deletions README.md
@@ -1,51 +1,150 @@
# Expo Play Audio Stream 🎶

The Expo Play Audio Stream module is a powerful tool for streaming audio data in your Expo-based React Native applications. It provides a seamless way to play audio chunks in real-time, allowing you to build audio-centric features like voice assistants, audio players, and more.
The Expo Play Audio Stream module is a powerful tool for recording and streaming audio data in your Expo-based React Native applications. It provides a seamless way to record audio from the microphone and play audio chunks in real-time, allowing you to build audio-centric features like voice assistants, audio players, voice recorders, and more.

## Motivation 🎯

Expo's built-in audio capabilities are limited to playing pre-loaded audio files. The Expo Audio Stream module was created to address this limitation, enabling developers to stream audio data dynamically and have more control over the audio playback process.
Expo's built-in audio capabilities are limited to playing pre-loaded audio files and basic recording. The Expo Audio Stream module was created to address these limitations, enabling developers to record high-quality audio with real-time streaming and to have more control over both recording and playback. The module also provides dual-stream output (the original stream and a 16 kHz version), which is particularly useful for voice activity detection and speech recognition applications.

## Example Usage 🚀

Here's an example of how you can use the Expo Audio Stream module to play a sequence of audio chunks:
Here's how you can use the Expo Play Audio Stream module for different scenarios:

### Standard Recording and Playback

```javascript
import { ExpoPlayAudioStream } from 'expo-audio-stream';

// Assuming you have some audio data in base64 format
const sampleA = 'base64EncodedAudioDataA';
const sampleB = 'base64EncodedAudioDataB';

useEffect(() => {
async function playAudioChunks() {
try {
await ExpoPlayAudioStream.setVolume(100);
await ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleA);
console.log('Streamed A');
await ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleB);
console.log('Streamed B');
console.log('Streaming A & B');
ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleA);
ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleB);
} catch (error) {
console.error(error);
}
// Example of standard recording and playback
async function handleStandardRecording() {
try {
// Set volume for playback
await ExpoPlayAudioStream.setVolume(0.8);

// Start recording with configuration
const { recordingResult, subscription } = await ExpoPlayAudioStream.startRecording({
onAudioStream: (event) => {
console.log('Received audio stream:', {
audioDataBase64: event.data,
audioData16kHzBase64: event.data16kHz, // typically used for voice activity detection, e.g. with Silero VAD models
position: event.position,
eventDataSize: event.eventDataSize,
totalSize: event.totalSize
});
}
});

// After some time, stop recording
setTimeout(async () => {
const recording = await ExpoPlayAudioStream.stopRecording();
console.log('Recording stopped:', recording);

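// Read the file from the fileUri and convert to base64
// (readFileAsBase64 is a hypothetical helper, not part of this module)
const base64Content = await readFileAsBase64(recording.fileUri);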

// Play the recorded audio
const turnId = 'example-turn-1';
await ExpoPlayAudioStream.playAudio(base64Content, turnId);

// Clean up
subscription?.remove();
}, 5000);

} catch (error) {
console.error('Audio handling error:', error);
}
}

// You can also subscribe to audio events from anywhere
const audioSubscription = ExpoPlayAudioStream.subscribeToAudioEvents(async (event) => {
console.log('Audio event received:', {
data: event.data,
data16kHz: event.data16kHz
});
});
// Don't forget to clean up when done
// audioSubscription.remove();
```
playAudioChunks();
}, []);
### Simultaneous Recording and Playback
```javascript
import { ExpoPlayAudioStream } from 'expo-audio-stream';

// Example of simultaneous recording and playback with voice processing
async function handleSimultaneousRecordAndPlay() {
try {
// Start microphone with voice processing
const { recordingResult, subscription } = await ExpoPlayAudioStream.startMicrophone({
onAudioStream: (event) => {
console.log('Received audio stream with voice processing:', {
audioDataBase64: event.data,
audioData16kHz: event.data16kHz
});
}
});

// Play audio while recording is active
const turnId = 'response-turn-1';
await ExpoPlayAudioStream.playSound(someAudioBase64, turnId);

// Example of controlling playback during recording
setTimeout(async () => {
// Interrupt current playback
await ExpoPlayAudioStream.interruptSound();

// Resume playback
await ExpoPlayAudioStream.resumeSound();

// Stop microphone recording
await ExpoPlayAudioStream.stopMicrophone();

// Clean up
subscription?.remove();
}, 5000);

} catch (error) {
console.error('Simultaneous audio handling error:', error);
}
}
```
## API 📚
The Expo Play Audio Stream module provides the following API:
The Expo Play Audio Stream module provides the following methods:
### Standard Audio Operations
- `startRecording(recordingConfig: RecordingConfig)`: Starts microphone recording with the specified configuration. Returns a promise with recording result and audio event subscription.
- `stopRecording()`: Stops the current microphone recording and returns the audio recording data.
- `playAudio(base64Chunk: string, turnId: string)`: Plays a base64 encoded audio chunk with the specified turn ID.
- `pauseAudio()`: Pauses the current audio playback.
- `stopAudio()`: Stops the currently playing audio.
- `streamRiff16Khz16BitMonoPcmChunk(base64Chunk: string): Promise<void>`: Streams a base64-encoded audio chunk in the RIFF format with 16 kHz, 16-bit, mono PCM encoding.
- `setVolume(volume: number): Promise<void>`: Sets the volume of the audio playback, where `volume` is a value between 0 and 100.
- `pause(): Promise<void>`: Pauses the audio playback.
- `start(): Promise<void>`: Starts the audio playback.
- `stop(): Promise<void>`: Stops the audio playback and clears any remaining audio data.
- `setVolume(volume: number)`: Sets the volume for audio playback (0.0 to 1.0).
- `clearPlaybackQueueByTurnId(turnId: string)`: Clears the playback queue for a specific turn ID.
- `subscribeToAudioEvents(onMicrophoneStream: (event: AudioDataEvent) => Promise<void>)`: Subscribe to recording events from anywhere in your application. Returns a subscription that should be cleaned up when no longer needed.
### Simultaneous Recording and Playback
These methods are specifically designed for scenarios where you need to record and play audio at the same time:
- `startMicrophone(recordingConfig: RecordingConfig)`: Starts microphone streaming with voice processing enabled. Returns a promise with recording result and audio event subscription.
- `stopMicrophone()`: Stops the microphone streaming when in simultaneous mode.
- `playSound(audio: string, turnId: string)`: Plays a sound while recording is active. Uses voice processing to prevent feedback.
- `interruptSound()`: Interrupts the current sound playback in simultaneous mode.
- `resumeSound()`: Resumes the current sound playback in simultaneous mode.
All methods are static and most return Promises that resolve when the operation is complete. Error handling is built into each method, with descriptive error messages if operations fail.
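
The API list above shows `setVolume` moving from a 0 to 100 scale to a 0.0 to 1.0 scale. Callers that still hold percentage values may want a small conversion step. The sketch below is an illustration only: `percentToUnitVolume` is a hypothetical helper and is not part of the module.

```javascript
// Hypothetical helper (not part of the module): convert a legacy 0-100
// volume into the 0.0-1.0 range, clamping out-of-range or non-numeric input.
function percentToUnitVolume(percent) {
  const clamped = Math.min(100, Math.max(0, Number(percent) || 0));
  return clamped / 100;
}
```

A call site could then read `await ExpoPlayAudioStream.setVolume(percentToUnitVolume(80));`.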
## Swift Implementation 🍎
@@ -55,10 +154,22 @@ The Swift implementation of the Expo Audio Stream module uses the `AVFoundation`
The Kotlin implementation of the Expo Audio Stream module uses the `AudioTrack` class from the Android framework to handle audio playback. It uses a concurrent queue to manage the audio chunks and a coroutine-based playback loop to ensure efficient and asynchronous processing of the audio data.
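
To make the queue-plus-playback-loop idea concrete, here is a minimal JavaScript model of a chunk queue that is drained by a single asynchronous loop and can drop pending chunks by turn ID. It is a sketch of the general pattern, not the native Kotlin or Swift code: `TurnQueue`, `enqueue`, `clearByTurnId`, and the `playChunk` callback are all hypothetical names.

```javascript
// Hypothetical in-memory model of a turn-tagged playback queue.
// It mirrors the described design (queue + single playback loop),
// but it is not the module's native implementation.
class TurnQueue {
  constructor(playChunk) {
    this.playChunk = playChunk; // async (chunk) => void, supplied by the caller
    this.items = [];            // pending { chunk, turnId } entries, in order
    this.draining = false;      // true while a playback loop is running
  }

  // Add a chunk and make sure a playback loop is running.
  enqueue(chunk, turnId) {
    this.items.push({ chunk, turnId });
    return this.drain();
  }

  // Drop every chunk that has not started playing yet for the given turn.
  clearByTurnId(turnId) {
    this.items = this.items.filter((item) => item.turnId !== turnId);
  }

  // Play pending chunks one at a time until the queue is empty.
  async drain() {
    if (this.draining) return; // another loop is already consuming the queue
    this.draining = true;
    try {
      while (this.items.length > 0) {
        const { chunk } = this.items.shift();
        await this.playChunk(chunk);
      }
    } finally {
      this.draining = false;
    }
  }
}
```

A chunk that is already playing is not affected by `clearByTurnId` in this model; only chunks still waiting in the queue are removed.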
## Voice Processing and Isolation 🎤
The module implements several audio optimizations for voice recording:
- On iOS 15 and later, users are prompted with system voice isolation options (`microphoneModes`), allowing them to choose their preferred voice isolation level.
- When simultaneous recording and playback is enabled, the module uses iOS voice processing which includes:
  - Noise reduction
  - Echo cancellation
  - Voice optimization
Note: Voice processing may result in lower audio levels as it optimizes for voice clarity over volume. This is a trade-off made to ensure better voice quality and reduce background noise.
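
One way to observe the level difference is to compute the RMS of a recorded chunk with and without voice processing. The sketch below assumes the chunk is base64-encoded, headerless, 16-bit little-endian PCM; that layout is an assumption rather than something documented here, and a chunk that carries a header would skew the result. `rmsOfBase64Pcm16` is a hypothetical helper, and it relies on a global `atob`, which may need a polyfill on some React Native runtimes.

```javascript
// Hypothetical helper: decode base64 into 16-bit little-endian PCM samples
// and return the RMS level normalized to the range [0, 1].
// Assumes headerless PCM; this layout is an assumption, not documented behavior.
function rmsOfBase64Pcm16(base64) {
  const binary = atob(base64); // one character per decoded byte
  const sampleCount = Math.floor(binary.length / 2);
  if (sampleCount === 0) return 0;

  let sumSquares = 0;
  for (let i = 0; i < sampleCount; i++) {
    const lo = binary.charCodeAt(2 * i);
    const hi = binary.charCodeAt(2 * i + 1);
    let sample = (hi << 8) | lo;              // unsigned 16-bit value
    if (sample >= 0x8000) sample -= 0x10000;  // reinterpret as signed
    sumSquares += (sample / 32768) ** 2;
  }
  return Math.sqrt(sumSquares / sampleCount);
}
```

Comparing the value for chunks from `startRecording` and `startMicrophone` under the same conditions gives a rough sense of how much gain, if any, to apply downstream.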
## Limitations and Considerations ⚠️
- The Expo Play Audio Stream module is designed to work with specific audio formats (RIFF, 16 kHz, 16-bit, mono PCM). If your audio data is in a different format, you may need to convert it before using the module.
- The module does not provide advanced features like audio effects, mixing, or recording. It is primarily focused on real-time audio streaming.
- The module does not provide advanced features like audio effects or mixing. It is primarily focused on real-time audio streaming and recording.
- The performance of the module may depend on the device's hardware capabilities and the complexity of the audio data being streamed.
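
If your source produces raw PCM samples without a container, the format named in the first limitation can be produced by prepending a RIFF/WAVE header. The following is a minimal sketch that assumes the canonical 44-byte PCM header; `wrapPcmAsWav` is a hypothetical helper and not part of the module, and encoding the result to base64 is left to the caller.

```javascript
// Hypothetical helper: prepend a canonical 44-byte RIFF/WAVE header to raw PCM.
// Defaults match the format named above: 16 kHz, 16-bit, mono.
function wrapPcmAsWav(pcmBytes, sampleRate = 16000, channels = 1, bitsPerSample = 16) {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const out = new Uint8Array(44 + pcmBytes.length);
  const view = new DataView(out.buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) out[offset + i] = text.charCodeAt(i);
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + pcmBytes.length, true); // file size minus the first 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true);                  // size of the fmt chunk for PCM
  view.setUint16(20, 1, true);                   // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, 'data');
  view.setUint32(40, pcmBytes.length, true);     // size of the sample data
  out.set(pcmBytes, 44);
  return out;
}
```

All multi-byte fields are written little-endian, as the RIFF format requires.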
## Contributions 🤝
4 changes: 4 additions & 0 deletions example/.gitignore
@@ -33,3 +33,7 @@ yarn-error.*

# typescript
*.tsbuildinfo

# Example
ios/*
android/*
36 changes: 6 additions & 30 deletions example/App.tsx
@@ -11,10 +11,9 @@ import { Audio } from 'expo-av';

const ANDROID_SAMPLE_RATE = 16000;
const IOS_SAMPLE_RATE = 48000;
const BIT_DEPTH = 16;
const CHANNELS = 1;
const ENCODING = "pcm_16bit";
const RECORDING_INTERVAL = 50;
const RECORDING_INTERVAL = 100;

const turnId1 = 'turnId1';
const turnId2 = 'turnId2';
@@ -23,30 +22,7 @@ const turnId2 = 'turnId2';
export default function App() {


const eventListenerSubscriptionRef = useRef<Subscription | null>(null);

useEffect(() => {
async function run() {
try {
// console.log("setPlayAndRecord");
// //await ExpoPlayAudioStream.setVolume(100);
// await ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleB);
// await ExpoPlayAudioStream.setPlayAndRecord();
// console.log("after setPlayAndRecord");
// //await new Promise((resolve) => setTimeout(resolve, 2000));
// await ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleB);
// console.log("streamed A");
// await ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleB);
// console.log("streamed B");
// console.log("streaming A & B");
//ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleA);
//ExpoPlayAudioStream.streamRiff16Khz16BitMonoPcmChunk(sampleB);
} catch (error) {
console.error(error);
}
}
run();
}, []);
const eventListenerSubscriptionRef = useRef<Subscription | undefined>(undefined);

const onAudioCallback = async (audio: AudioDataEvent) => {
console.log(audio.data.slice(0, 100));
@@ -59,7 +35,7 @@ export default function App() {
onPress={async () => {
await ExpoPlayAudioStream.playAudio(sampleB, turnId1);
}}
title="Stream B"
title="Play sample B"
/>
<View style={{ height: 10, marginBottom: 10 }}>
<Text>====================</Text>
@@ -68,7 +44,7 @@ export default function App() {
onPress={async () => {
await ExpoPlayAudioStream.pauseAudio();
}}
title="Pause"
title="Pause Audio"
/>
<View style={{ height: 10, marginBottom: 10 }}>
<Text>====================</Text>
@@ -77,7 +53,7 @@ export default function App() {
onPress={async () => {
await ExpoPlayAudioStream.playAudio(sampleA, turnId2);
}}
title="Stream A"
title="Play sample A"
/>
<View style={{ height: 10, marginBottom: 10 }}>
<Text>====================</Text>
@@ -113,7 +89,7 @@ export default function App() {
await ExpoPlayAudioStream.stopRecording();
if (eventListenerSubscriptionRef.current) {
eventListenerSubscriptionRef.current.remove();
eventListenerSubscriptionRef.current = null;
eventListenerSubscriptionRef.current = undefined;
}
}}
title="Stop Recording"