The Cartesia TypeScript library provides convenient access to the Cartesia API from TypeScript.
npm i -s @cartesia/cartesia-js
A full reference for this library is available here.
Instantiate and use the client with the following:
import { CartesiaClient } from "@cartesia/cartesia-js";
import process from "node:process";
import fs from "node:fs";
// Set up the client.
const client = new CartesiaClient({ apiKey: process.env.CARTESIA_API_KEY });
// Call the TTS API's bytes endpoint, which returns binary audio data as an ArrayBuffer.
const response = await client.tts.bytes({
modelId: "sonic-english",
transcript: "Hello, world!",
voice: {
mode: "id",
id: "694f9389-aac1-45b6-b726-9d9369183238",
},
language: "en",
outputFormat: {
container: "wav",
sampleRate: 44100,
encoding: "pcm_f32le",
},
});
// Write the response to a file.
fs.writeFileSync("sonic.wav", new Uint8Array(response));
import { CartesiaClient } from "@cartesia/cartesia-js";
const cartesia = new CartesiaClient({
apiKey: process.env.CARTESIA_API_KEY,
});
// Initialize the WebSocket. Make sure the output format you specify is supported.
const websocket = cartesia.tts.websocket({
container: "raw",
encoding: "pcm_f32le",
sampleRate: 44100,
});
try {
await websocket.connect();
} catch (error) {
console.error(`Failed to connect to Cartesia: ${error}`);
throw error;
}
// Create a stream.
const response = await websocket.send({
modelId: "sonic-english",
voice: {
mode: "id",
id: "a0e99841-438c-4a64-b679-ae501e7d6091",
},
transcript: "Hello, world!",
// The WebSocket sets output_format on your behalf.
});
// Access the raw messages from the WebSocket.
response.on("message", (message) => {
// Raw message.
console.log("Received message:", message);
});
// You can also access messages using a for-await-of loop.
for await (const message of response.events("message")) {
// Raw message.
console.log("Received message:", message);
}
const contextOptions = {
contextId: "my-context",
modelId: "sonic-english",
voice: {
mode: "id",
id: "a0e99841-438c-4a64-b679-ae501e7d6091",
},
};
// Initial request on the context uses websocket.send().
// This response object will aggregate the results of all the inputs sent on the context.
const response = await websocket.send({
...contextOptions,
transcript: "Hello, world!",
});
// Subsequent requests on the same context use websocket.continue().
await websocket.continue({
...contextOptions,
transcript: " How are you today?",
});
See the input streaming docs for more information.
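As a rough illustration of preparing text for input streaming, the sketch below splits a transcript into sentence-sized chunks and plans which call each chunk would go to: the first to `send()`, the rest to `continue()`. The helper names `splitIntoChunks` and `planCalls` are hypothetical and not part of the SDK. The sketch also assumes that chunks should concatenate back to the original text, which is why whitespace is preserved (as in the leading space of " How are you today?" above).

```typescript
// Hypothetical helpers, not part of @cartesia/cartesia-js.

// Split text after runs of sentence terminators, keeping all whitespace
// so that splitIntoChunks(text).join("") === text.
function splitIntoChunks(text: string): string[] {
  // Either: non-terminator characters followed by optional terminators,
  // or: a run of terminators on its own (e.g. text that starts with "...").
  return text.match(/[^.!?]+[.!?]*|[.!?]+/g) ?? [];
}

type PlannedCall = { method: "send" | "continue"; transcript: string };

// The first chunk opens the context with send(); later chunks use continue().
function planCalls(text: string): PlannedCall[] {
  return splitIntoChunks(text).map(
    (transcript, i): PlannedCall => ({
      method: i === 0 ? "send" : "continue",
      transcript,
    }),
  );
}
```

Each planned call would then be combined with the shared context options (such as `contextOptions` above) before being passed to the WebSocket.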
(The WebPlayer class only supports playing audio in the browser, and only the raw PCM format with pcm_f32le encoding.)
// If you're using the client in the browser, you can control audio playback using our WebPlayer:
import { WebPlayer } from "@cartesia/cartesia-js";
console.log("Playing stream...");
// Create a Player object.
const player = new WebPlayer();
// Play the audio. (`response` includes a custom Source object that the Player can play.)
// The call resolves when the audio finishes playing.
await player.play(response.source);
console.log("Done playing.");
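To make the raw PCM format mentioned above more concrete, here is a minimal sketch of interpreting `pcm_f32le` bytes as samples: each sample is a 32-bit float stored little-endian, so four bytes per sample. The helpers `decodePcmF32le` and `durationSeconds` are hypothetical and not part of the SDK, and the duration calculation assumes single-channel audio.

```typescript
// Hypothetical helpers, not part of @cartesia/cartesia-js.

// Interpret raw little-endian 32-bit float PCM bytes as samples.
function decodePcmF32le(bytes: Uint8Array): Float32Array {
  if (bytes.byteLength % 4 !== 0) {
    throw new RangeError("pcm_f32le data must be a multiple of 4 bytes");
  }
  // DataView reads are explicit about byte order, so this works
  // regardless of the platform's native endianness.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(bytes.byteLength / 4);
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getFloat32(i * 4, true); // true = little-endian
  }
  return samples;
}

// Duration in seconds, assuming mono audio at the given sample rate.
function durationSeconds(samples: Float32Array, sampleRate: number): number {
  return samples.length / sampleRate;
}
```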
The SDK exports all request and response types as TypeScript interfaces. Simply import them with the following namespace:
import { Cartesia } from "@cartesia/cartesia-js";
const request: Cartesia.VoiceChangerBytesRequest = {
...
};
When the API returns a non-success status code (4xx or 5xx response), a subclass of the following error will be thrown.
import { CartesiaError } from "@cartesia/cartesia-js";
try {
await client.tts.bytes(...);
} catch (err) {
if (err instanceof CartesiaError) {
console.log(err.statusCode);
console.log(err.message);
console.log(err.body);
}
}
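If you want to log these errors in one place, a small formatter can help. The sketch below reads only the three fields shown above (`statusCode`, `message`, `body`) and uses a structural stand-in type rather than importing `CartesiaError`, so it runs on its own. `ApiErrorLike`, `isApiErrorLike`, and `describeError` are hypothetical names, not SDK exports.

```typescript
// Hypothetical stand-in for the fields read off the SDK error above.
interface ApiErrorLike {
  statusCode?: number;
  message: string;
  body?: unknown;
}

// Narrow an unknown thrown value to something with a string message.
function isApiErrorLike(err: unknown): err is ApiErrorLike {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { message?: unknown }).message === "string"
  );
}

// Produce a one-line summary suitable for logging.
function describeError(err: unknown): string {
  if (!isApiErrorLike(err)) {
    return `Unknown error: ${String(err)}`;
  }
  const status = err.statusCode;
  let kind = "no status";
  if (status !== undefined) {
    kind = status >= 500 ? "server error" : status >= 400 ? "client error" : "unexpected status";
  }
  return `${kind} (${status ?? "n/a"}): ${err.message}`;
}
```

In a real `catch` block you would likely keep the `instanceof CartesiaError` check shown above and pass the caught value to a formatter like this one.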
The SDK is instrumented with automatic retries with exponential backoff. A request will be retried as long as the request is deemed retriable and the number of retry attempts has not grown larger than the configured retry limit (default: 2).
A request is deemed retriable when any of the following HTTP status codes is returned:
Use the maxRetries request option to configure this behavior.
const response = await client.tts.bytes(..., {
maxRetries: 0 // override maxRetries at the request level
});
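For readers unfamiliar with the pattern, the sketch below shows what retrying with exponential backoff can look like in general. It is not the SDK's implementation: the delay schedule (`baseDelayMs * 2 ** attempt`), the `isRetriable` callback, and the names `withRetries` and `backoffDelayMs` are all assumptions made for illustration.

```typescript
// Illustrative only; the SDK's actual retry schedule is not documented here.
interface RetryOptions {
  maxRetries: number; // number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry (assumed value)
}

// Delay doubles with each retry: base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** attempt;
}

async function withRetries<T>(
  task: () => Promise<T>,
  isRetriable: (err: unknown) => boolean,
  options: RetryOptions,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up once the retry budget is spent or the error is not retriable.
      if (attempt >= options.maxRetries || !isRetriable(err)) {
        throw err;
      }
      const delay = backoffDelayMs(attempt, options.baseDelayMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

With `maxRetries: 0` the task runs exactly once, which mirrors the effect of the request-level override shown above.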
The SDK defaults to a 60-second timeout. Use the timeoutInSeconds option to configure this behavior.
const response = await client.tts.bytes(..., {
timeoutInSeconds: 30 // override timeout to 30s
});
The SDK allows users to abort requests at any point by passing in an abort signal.
const controller = new AbortController();
const response = await client.tts.bytes(..., {
abortSignal: controller.signal
});
controller.abort(); // aborts the request
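Abort signals also make it easy to enforce your own deadline across one or more calls. The sketch below uses only the standard `AbortController` and `setTimeout`; the `withDeadline` helper is hypothetical, and `task` stands for any function that accepts an `AbortSignal`, for example a wrapper that forwards it as the `abortSignal` request option.

```typescript
// Hypothetical helper, not part of @cartesia/cartesia-js.
// Runs `task` with a signal that is aborted after `seconds`.
async function withDeadline<T>(
  task: (signal: AbortSignal) => Promise<T>,
  seconds: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), seconds * 1000);
  try {
    return await task(controller.signal);
  } finally {
    // Always clear the timer so it cannot fire after the task settles.
    clearTimeout(timer);
  }
}
```

Whether the task actually stops depends on it honoring the signal; this helper only triggers the abort.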
The SDK defaults to node-fetch but will use the global fetch client if present. The SDK works in the following runtimes:
- Node.js 18+
- Vercel
- Cloudflare Workers
- Deno v1.25+
- Bun 1.0+
- React Native
The SDK provides a way for you to customize the underlying HTTP client / Fetch function. If you're running in an unsupported environment, this gives you a way to break glass and ensure the SDK works.
import { CartesiaClient } from "@cartesia/cartesia-js";
const client = new CartesiaClient({
...
fetcher: // provide your implementation here
});
While we value open-source contributions to this SDK, this library is generated programmatically. Additions made directly to this library would have to be moved over to our generation code, otherwise they would be overwritten upon the next generated release. Feel free to open a PR as a proof of concept, but know that we will not be able to merge it as-is. We suggest opening an issue first to discuss with us!
On the other hand, contributions to the README are always very welcome!