Analyze API
Isaac edited this page Apr 3, 2022
·
5 revisions
https://www.diffbot.com/dev/docs/analyze/
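// Assumes `diffbot` is an already-initialized client instance.
// Top-level await needs an ES module; otherwise wrap this in an async function.
let analyze = await diffbot.analyze({
  url: 'https://four-all-ice-creame.myshopify.com/collections/ice-cream-cubes-individual/products/ice-cream-cubes-individual',
  body: 'optional-html-post-body', // optional: HTML markup sent as the POST body
});
console.log(analyze.humanLanguage); // detected language of the page
console.log(analyze.title);         // page title
console.log(analyze.type);          // detected page type, e.g. "product" or "other"
console.log(analyze.objects);       // extracted objects for fully-extracted pages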
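The fields logged above can be condensed into a small summary before further processing. The following is a minimal sketch, assuming the response is a plain object with a `type` string and an optional `objects` array, as the example suggests. The `summarizeAnalyze` helper and the sample data are hypothetical and are not part of the diffbot client.

```javascript
// Hypothetical helper: condense an Analyze response into a few fields.
// Assumes `type` is a string and `objects` is an array that may be absent.
function summarizeAnalyze(analyze) {
  const objects = Array.isArray(analyze.objects) ? analyze.objects : [];
  return {
    pageType: analyze.type,
    // True when at least one object was extracted from the page
    extracted: objects.length > 0,
    objectTypes: objects.map((o) => o.type),
    // Only set when the `fallback` parameter routed the page through another API
    fallbackType: analyze.fallbackType ?? null,
  };
}

// Made-up sample data for illustration only, not real API output.
const sample = {
  type: 'product',
  title: 'Example product',
  humanLanguage: 'en',
  objects: [{ type: 'product', title: 'Example product' }],
};

const summary = summarizeAnalyze(sample);
console.log(summary);
```

Checking `Array.isArray` first keeps the helper safe for responses that carry no `objects` field at all.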
Param | Type | Required | Description |
---|---|---|---|
url | string | Yes | URL of the web page to analyze |
mode | string | No | By default the Analyze API will fully extract all pages that match an existing Automatic API (articles, products or image pages). Set mode to a specific page-type (e.g., mode=article) to extract content only from that specific page-type. All other pages will simply return the default Analyze fields. |
fallback | string | No | Force any non-extracted pages (those with a type of "other") through a specific API. For example, to route all "other" pages through the Article API, pass &fallback=article. Pages that use this functionality will return a fallbackType field at the top level of the response and an originalType field within each extracted object, both of which indicate the fallback API used. |
fields | string[] | No | Specify optional fields to be returned from any fully-extracted pages, e.g.: &fields=querystring,links . See available fields within each API's individual documentation pages. |
paging | boolean | No | (Undocumented) Pass paging=false to disable automatic concatenation of multiple-page articles. (By default, Diffbot will concatenate up to 20 pages of a single article.) |
discussion | boolean | No | Pass discussion=false to disable automatic extraction of comments or reviews from pages identified as articles or products. This will not affect pages identified as discussions. |
timeout | number | No | Time in milliseconds to wait for the content to be fetched from the requested URL. The default timeout for the third-party response is 30 seconds (30000). |
callback | string | No | Callback function name for JSONP requests. Needed for cross-domain AJAX. |
proxy | string | No | Used to specify the IP address of a custom proxy that will be used to fetch the target page, instead of Diffbot's default IPs/proxies. (Ex: &proxy=168.212.226.204 ) |
proxyAuth | string | No | Used to specify the authentication parameters that will be used with the proxy specified in the &proxy parameter. (Ex: &proxyAuth=username:password ) |
body | string | No | Optional HTML markup to pass as POST body |
customJS | string | No | This functionality is currently in beta. See docs for details: https://docs.diffbot.com/docs/en/api-analyze#custom-javascript |
customHeaders | object | No | This functionality is currently in beta. See docs for details: https://docs.diffbot.com/docs/en/api-analyze#custom-headers |
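To illustrate how the parameters in the table map onto a request, here is a minimal sketch that builds a query string for the raw HTTP API. It assumes the Analyze endpoint is `https://api.diffbot.com/v3/analyze` and that it accepts a `token` query parameter. The `buildAnalyzeUrl` helper, the placeholder token, and the example URL are hypothetical, and this is not necessarily how the client library assembles its requests.

```javascript
// Assumption: v3 Analyze endpoint of the raw HTTP API.
const ANALYZE_ENDPOINT = 'https://api.diffbot.com/v3/analyze';

// Hypothetical helper: turn a params object into a request URL.
function buildAnalyzeUrl(token, params) {
  const query = new URLSearchParams({ token });
  for (const [key, value] of Object.entries(params)) {
    if (value === undefined || value === null) continue; // skip unset options
    // string[] params such as `fields` are sent as a comma-separated list;
    // URLSearchParams percent-encodes the commas and the target URL.
    query.set(key, Array.isArray(value) ? value.join(',') : String(value));
  }
  return `${ANALYZE_ENDPOINT}?${query.toString()}`;
}

const requestUrl = buildAnalyzeUrl('YOUR_TOKEN', {
  url: 'https://example.com/some-page', // placeholder page
  fallback: 'article',                  // route "other" pages through the Article API
  fields: ['querystring', 'links'],     // optional fields
  discussion: false,                    // skip comment/review extraction
  timeout: 15000,                       // milliseconds
});
console.log(requestUrl);
```

Booleans and numbers are stringified, so `discussion: false` is sent as `discussion=false`, matching the form the table describes.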