chore(docs): update #86

Merged · 2 commits · Mar 11, 2024
Changes from all commits
2 changes: 1 addition & 1 deletion .env.test
@@ -1,3 +1,3 @@
ENDPOINT=https://bam-api.res.ibm.com
API_KEY=
RUN_LANGCHAIN_CHAT_TESTS=false
158 changes: 50 additions & 108 deletions README.md
@@ -4,7 +4,7 @@
This is not the [watsonx.ai](https://www.ibm.com/products/watsonx-ai) Node.js SDK.

You can start a trial version or request a demo via https://www.ibm.com/products/watsonx-ai.

This library provides convenient access to the Generative AI API from Node.js applications. For a full description of the API, please visit the [Tech Preview API Documentation](https://bam.res.ibm.com/docs/api-reference).

The SDK supports both TypeScript and JavaScript as well as ESM and CommonJS.

@@ -20,18 +20,17 @@
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Usage](#usage)
- [API Reference](#api-reference)
- [Langchain](#langchain)

![-----------------------------------------------------](./docs/img/rainbow.png)

## Key features

- 📌 Aligned with the REST API - clear structure that mirrors service endpoints and data
- Integrations
  - ⛓️ LangChain - build applications with LLMs through composability

@@ -57,118 +56,70 @@
yarn add @ibm-generative-ai/node-sdk

### Usage

To use the SDK, first you need to create a client. The API key can be passed to the client as a parameter or by setting the `GENAI_API_KEY` environment variable.

```typescript
import { Client } from '@ibm-generative-ai/node-sdk';

const client = new Client({ apiKey: 'pak-.....' });
```
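
If the key is set in the environment, it can be omitted from the constructor call. A minimal sketch, assuming the client falls back to `GENAI_API_KEY` when no key is passed explicitly:

```typescript
import { Client } from '@ibm-generative-ai/node-sdk';

// Assumes GENAI_API_KEY is set in the environment (fallback behavior assumed)
const client = new Client();
```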

The client contains various services backed by the REST API endpoints. Select the service you'd like to use and call CRUDL-like methods on it.

```typescript
const output = await client.text.generation.create({
  model_id: 'google/flan-ul2',
  input: 'What is the capital of the United Kingdom?',
  parameters: {
    decoding_method: 'greedy',
    min_new_tokens: 1,
    max_new_tokens: 10,
  },
});
```
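
Each call returns a promise, so multiple inputs can be processed in parallel with standard promise utilities. A minimal sketch using only the method shown above:

```typescript
const inputs = [
  {
    model_id: 'google/flan-ul2',
    input: 'What is the capital of the United Kingdom?',
  },
  {
    model_id: 'google/flan-ul2',
    input: 'What is the capital of Mexico?',
  },
];

// Fire all requests concurrently and await every result
const outputs = await Promise.all(
  inputs.map((input) => client.text.generation.create(input)),
);
```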

#### Streams

Some services support output streaming; you can easily recognize streaming methods by their `_stream` suffix.

```typescript
const stream = await client.text.generation.create_stream({
  model_id: 'google/flan-ul2',
  input: 'What is the capital of the United Kingdom?',
});

for await (const output of stream) {
  console.log(output);
}
```
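
Errors raised mid-stream are thrown from the loop, so wrap the iteration in `try`/`catch` when you need to handle them; a minimal sketch:

```typescript
try {
  for await (const output of stream) {
    console.log(output);
  }
} catch (err) {
  // Network failures and aborts surface here
  console.error('stream failed', err);
}
```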

#### Cancellation

All service methods support cancellation via [AbortSignal](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal). Use the options argument to pass a signal into the method.

```typescript
const output = await client.text.generation.create(
  {
    model_id: 'google/flan-ul2',
    input: 'What is the capital of the United Kingdom?',
  },
  { signal: AbortSignal.timeout(5000) },
);
```
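
For cancellation driven by application logic rather than a timeout, the same option accepts a signal from a plain [AbortController](https://developer.mozilla.org/en-US/docs/Web/API/AbortController); a minimal sketch:

```typescript
const controller = new AbortController();

const promise = client.text.generation.create(
  {
    model_id: 'google/flan-ul2',
    input: 'What is the capital of the United Kingdom?',
  },
  { signal: controller.signal },
);

// Later, e.g. when the user cancels the operation
controller.abort();
```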

Refer to [examples](./examples/) for further guidance.

### API Reference

The SDK structure closely follows the [REST API](https://bam.res.ibm.com/docs/api-reference) endpoints. To use the desired functionality, first locate a [service](./src/services/) and then call the appropriate method on it.

```typescript
// Signature template
const output = await client.service[.subservice].method(input, options);

// POST /v2/text/generation
const output = await client.text.generation.create(input, options);
```

Input and output of each method are forwarded to the corresponding endpoint; the SDK exports [typing](./src/schema.ts) for each input and output.

A standalone API reference is NOT available at the moment; please refer to the [REST API Reference](https://bam.res.ibm.com/docs/api-reference) to find the functionality you're looking for and the input/output semantics.
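
A sketch of how the exported typings might be used; the type names below are hypothetical stand-ins, the real ones live in [schema.ts](./src/schema.ts):

```typescript
import { Client } from '@ibm-generative-ai/node-sdk';
// Hypothetical type names for illustration only; check src/schema.ts
import type {
  TextGenerationCreateInput,
  TextGenerationCreateOutput,
} from '@ibm-generative-ai/node-sdk';

const client = new Client({ apiKey: 'pak-.....' });

const input: TextGenerationCreateInput = {
  model_id: 'google/flan-ul2',
  input: 'What is the capital of the United Kingdom?',
};
const output: TextGenerationCreateOutput =
  await client.text.generation.create(input);
```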

## LangChain

[LangChain](https://js.langchain.com/docs/getting-started/guide-llm) is a framework for developing applications powered by language models.
The following example showcases how you can integrate GenAI into your project.
@@ -185,7 +136,7 @@

```typescript
const model = new GenAIModel({
  // ... (model and client configuration, collapsed in this diff)
});
```

### Basic usage

```typescript
const response = await model.invoke(
  // ... (prompt arguments, collapsed in this diff)
);
console.log(response); // Fantasy Sockery
```

### LLM Chain + Prompt Template

```typescript
import { PromptTemplate } from '@langchain/core/prompts';
// ... (chain setup, collapsed in this diff)
const { text } = await chain.call({ product: 'clothes' });
console.log(text); // ArcticAegis
```
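
The collapsed setup builds an ordinary LangChain chain on top of the `model` created earlier; a sketch of that pattern (the prompt wording and the `langchain/chains` import path are assumptions):

```typescript
import { PromptTemplate } from '@langchain/core/prompts';
import { LLMChain } from 'langchain/chains';

// Assumed prompt; the real one is collapsed in this diff
const prompt = new PromptTemplate({
  template: 'What would be a good name for a company that makes {product}?',
  inputVariables: ['product'],
});
const chain = new LLMChain({ llm: model, prompt });
```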

### Streaming

```typescript
import { GenAIModel } from '@ibm-generative-ai/node-sdk/langchain';
// ... (model setup, collapsed in this diff)
await model.invoke('Tell me a joke.', {
  // ... (streaming callbacks, collapsed in this diff)
});
```
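
The collapsed portion registers token callbacks; a minimal sketch of that pattern using LangChain's standard `callbacks` option:

```typescript
await model.invoke('Tell me a joke.', {
  callbacks: [
    {
      // Invoked once per generated token as it streams in
      handleLLMNewToken(token: string) {
        process.stdout.write(token);
      },
    },
  ],
});
```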

### Chat support

```typescript
import { GenAIChatModel } from '@ibm-generative-ai/node-sdk/langchain';
import { SystemMessage, HumanMessage } from '@langchain/core/messages';

const client = new GenAIChatModel({
  model_id: 'meta-llama/llama-2-70b-chat',
  configuration: {
    endpoint: process.env.ENDPOINT,
    apiKey: process.env.API_KEY,
  },
  parameters: {
    // ... (other parameters, collapsed in this diff)
    max_new_tokens: 25,
    repetition_penalty: 1.5,
  },
});

const response = await client.invoke([
  new SystemMessage(
    'You are a helpful assistant that translates English to Spanish.',
  ),
  // ... (human message, collapsed in this diff)
]);
console.info(response.content); // "Me encanta la programación."
```

### Prompt Templates (GenAI x LangChain)

To use GenAI prompt templates in LangChain, the templates must be converted between the two syntaxes.
This can be done via helper classes provided within our SDK.
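
A sketch of what that conversion might look like; the helper class and method names below are hypothetical stand-ins, see the SDK's langchain entry point for the real API. GenAI templates use `{{mustache}}`-style variables while LangChain's `PromptTemplate` uses `{placeholder}`-style, so a conversion step is needed:

```typescript
// Hypothetical helper for illustration only; the actual class and method
// names are provided by the SDK's langchain entry point.
import { GenAIPromptTemplate } from '@ibm-generative-ai/node-sdk/langchain';

// Convert a GenAI-style template into a LangChain PromptTemplate
const prompt = GenAIPromptTemplate.toLangChain('Tell me a {{adjective}} joke.');
```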