Apify API client for Python

apify_client is the official library to access the Apify API from your Python applications. It provides useful features like automatic retries and convenience functions that improve the experience of using the Apify API.

Quick Start
Features
Usage concepts
- Nested clients
- Pagination
API Reference

Installation

Requires Python 3.7+

You can install the client from its PyPI listing. To do that, simply run pip install apify-client.

Quick Start

from apify_client import ApifyClient

apify_client = ApifyClient('MY-APIFY-TOKEN')

# Start an actor and wait for it to finish
actor_call = apify_client.actor('john-doe/my-cool-actor').call()

# Fetch results from the actor's default dataset
dataset_items = apify_client.dataset(actor_call['defaultDatasetId']).list_items().items

Features

Besides greatly simplifying the process of querying the Apify API, the client provides other useful features.

Automatic parsing and error handling

Based on the endpoint, the client automatically extracts the relevant data and returns it in the expected format. Date strings are automatically converted to datetime.datetime objects. When a request fails, the client raises an ApifyApiError, which wraps the plain JSON error returned by the API and enriches it with additional context for easier debugging.
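To illustrate the date conversion, the sketch below walks a decoded JSON response and turns date-like strings into datetime.datetime objects. It is not the client's own code: the helper name parse_dates, the createdAt field and the timestamp layout (ISO 8601 with milliseconds and a trailing Z) are assumptions made for this example.

```python
from datetime import datetime, timezone

# Assumed timestamp layout, e.g. '2021-05-01T12:30:00.123Z' (an assumption of this sketch)
DATE_FORMAT = '%Y-%m-%dT%H:%M:%S.%fZ'


def parse_dates(value):
    """Recursively convert date-like strings in decoded JSON to datetime objects."""
    if isinstance(value, dict):
        return {key: parse_dates(item) for key, item in value.items()}
    if isinstance(value, list):
        return [parse_dates(item) for item in value]
    if isinstance(value, str):
        try:
            return datetime.strptime(value, DATE_FORMAT).replace(tzinfo=timezone.utc)
        except ValueError:
            return value  # Not a date string, keep it unchanged
    return value


# Hypothetical response fragment used only for illustration
raw = {'name': 'my-actor', 'createdAt': '2021-05-01T12:30:00.123Z'}
parsed = parse_dates(raw)
```

After the call, parsed['createdAt'] is a timezone-aware datetime, while values that do not match the assumed layout are left as they were.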

Retries with exponential backoff

Network communication sometimes fails. The client automatically retries requests that fail due to a network error, an internal error of the Apify API (HTTP 500+), or a rate limit error (HTTP 429). By default, it retries up to 8 times. The first retry is attempted after ~500 ms, the second after ~1000 ms, and so on. You can configure these parameters using the max_retries and min_delay_between_retries_millis options of the ApifyClient constructor.
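The retry schedule can be approximated with a few lines of arithmetic. This sketch assumes that each delay simply doubles the previous one with no random jitter; the "~" in the description suggests the real delays vary somewhat, so treat the numbers as rough estimates. The function name backoff_delays is made up for this example.

```python
def backoff_delays(max_retries=8, min_delay_between_retries_millis=500):
    """Return the approximate delay (in ms) before each retry, doubling every time."""
    return [min_delay_between_retries_millis * 2 ** attempt for attempt in range(max_retries)]


delays = backoff_delays()
# Total time spent waiting if every retry is used up, in seconds
total_wait_secs = sum(delays) / 1000
```

Under these assumptions, the default settings can keep a failing request waiting for roughly two minutes before the client gives up, which is worth keeping in mind when tuning max_retries.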

Convenience functions and options

Some actions can't be performed by the API itself, such as waiting indefinitely for an actor run to finish (because of network timeouts). The client provides the convenience functions call() and wait_for_finish() that do that. Key-value store records can be retrieved as objects, buffers or streams via the respective options, and dataset items can be fetched as individual objects or as serialized data. Better stream support and async iterators are planned.

Usage concepts

The ApifyClient interface follows a generic pattern that is applicable to all of its components. Calling individual methods of ApifyClient creates specific clients which target individual API resources. There are two types of these clients: a client for managing a single resource and a client for a collection of resources.

from apify_client import ApifyClient
apify_client = ApifyClient('MY-APIFY-TOKEN')

# Collection clients do not require a parameter
actor_collection_client = apify_client.actors()
# Create an actor with the name: my-actor
my_actor = actor_collection_client.create(name='my-actor')
# List all of your actors
actor_list = actor_collection_client.list().items

# Collection clients do not require a parameter
dataset_collection_client = apify_client.datasets()
# Get (or create, if it doesn't exist) a dataset with the name of my-dataset
my_dataset = dataset_collection_client.get_or_create(name='my-dataset')

# Resource clients accept an ID of the resource
actor_client = apify_client.actor('john-doe/my-actor')
# Fetch the john-doe/my-actor object from the API
my_actor = actor_client.get()
# Start the run of john-doe/my-actor and return the Run object
my_actor_run = actor_client.start()

# Resource clients accept an ID of the resource
dataset_client = apify_client.dataset('john-doe/my-dataset')
# Append items to the end of john-doe/my-dataset
dataset_client.push_items([{'foo': 1}, {'bar': 2}])

The ID of the resource can be either the resource's own ID or a combination of your username and the resource name, in the form username/resource-name.

This is really all you need to remember, because all resource clients follow the pattern you see above.

Nested clients

Sometimes clients return other clients. That's to simplify working with nested collections, such as runs of a given actor.

actor_client = apify_client.actor('john-doe/my-actor')
runs_client = actor_client.runs()
# List the last 10 runs of the john-doe/my-actor actor
actor_runs = runs_client.list(limit=10, desc=True).items

# Select the last run of the john-doe/my-actor actor that finished with a SUCCEEDED status
last_succeeded_run_client = actor_client.last_run(status='SUCCEEDED')
# Fetch items from the run's dataset
dataset_items = last_succeeded_run_client.dataset().list_items().items

Pagination

Most methods named list or list_something return a ListPage object, containing properties items, total, offset, count and limit. There are some exceptions though, like list_keys or list_head which paginate differently. The results you're looking for are always stored under items and you can use the limit property to get only a subset of results. Other properties can be available depending on the method.
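As a rough illustration of walking through several pages, the sketch below keeps requesting pages with offset and limit until the reported total is reached. The helper iterate_all and the stand-in fake_list are invented for this example; only the items and total properties and the offset and limit parameters come from the description above.

```python
from types import SimpleNamespace


def iterate_all(list_page_fn, page_size=100):
    """Yield every item by requesting consecutive pages until `total` is reached."""
    offset = 0
    while True:
        page = list_page_fn(offset=offset, limit=page_size)
        yield from page.items
        offset += len(page.items)
        # Stop when the page is empty or all `total` items have been seen
        if not page.items or offset >= page.total:
            break


# Stand-in for a list() method so the sketch runs without network access
def fake_list(*, offset=0, limit=None):
    data = ['a', 'b', 'c', 'd', 'e']
    items = data[offset:offset + limit]
    return SimpleNamespace(items=items, total=len(data), offset=offset, count=len(items), limit=limit)


all_items = list(iterate_all(fake_list, page_size=2))
```

With a real client, passing a method such as apify_client.actors().list in place of fake_list should work in the same way, since that method accepts offset and limit.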

API Reference

All public classes, methods and their parameters can be inspected in this API reference.

ApifyClient

The Apify API client.

__init__()
actor()
actors()
build()
builds()
run()
runs()
dataset()
datasets()
key_value_store()
key_value_stores()
request_queue()
request_queues()
webhook()
webhooks()
webhook_dispatch()
webhook_dispatches()
schedule()
schedules()
log()
task()
tasks()
user()

`ApifyClient.__init__(token=None, *, api_url=None, max_retries=8, min_delay_between_retries_millis=500)`

Initialize the Apify API Client.

Parameters
- token (str, optional) – The Apify API token
- api_url (str, optional) – The URL of the Apify API server to which to connect. Defaults to https://api.apify.com
- max_retries (int, optional) – How many times to retry a failed request at most
- min_delay_between_retries_millis (int, optional) – How long will the client wait between retrying requests (increases exponentially from this value)

`ApifyClient.actor(actor_id)`

Retrieve the sub-client for manipulating a single actor.

Parameters
- actor_id (str) – ID of the actor to be manipulated
Return type

ActorClient

`ApifyClient.actors()`

Retrieve the sub-client for manipulating actors.

Return type

ActorCollectionClient

`ApifyClient.build(build_id)`

Retrieve the sub-client for manipulating a single actor build.

Parameters
- build_id (str) – ID of the actor build to be manipulated
Return type

BuildClient

`ApifyClient.builds()`

Retrieve the sub-client for querying multiple builds of a user.

Return type

BuildCollectionClient

`ApifyClient.run(run_id)`

Retrieve the sub-client for manipulating a single actor run.

Parameters
- run_id (str) – ID of the actor run to be manipulated
Return type

RunClient

`ApifyClient.runs()`

Retrieve the sub-client for querying multiple actor runs of a user.

Return type

RunCollectionClient

`ApifyClient.dataset(dataset_id)`

Retrieve the sub-client for manipulating a single dataset.

Parameters
- dataset_id (str) – ID of the dataset to be manipulated
Return type

DatasetClient

`ApifyClient.datasets()`

Retrieve the sub-client for manipulating datasets.

Return type

DatasetCollectionClient

`ApifyClient.key_value_store(key_value_store_id)`

Retrieve the sub-client for manipulating a single key-value store.

Parameters
- key_value_store_id (str) – ID of the key-value store to be manipulated
Return type

KeyValueStoreClient

`ApifyClient.key_value_stores()`

Retrieve the sub-client for manipulating key-value stores.

Return type

KeyValueStoreCollectionClient

`ApifyClient.request_queue(request_queue_id, *, client_key=None)`

Retrieve the sub-client for manipulating a single request queue.

Parameters
- request_queue_id (str) – ID of the request queue to be manipulated
- client_key (str) – A unique identifier of the client accessing the request queue
Return type

RequestQueueClient

`ApifyClient.request_queues()`

Retrieve the sub-client for manipulating request queues.

Return type

RequestQueueCollectionClient

`ApifyClient.webhook(webhook_id)`

Retrieve the sub-client for manipulating a single webhook.

Parameters
- webhook_id (str) – ID of the webhook to be manipulated
Return type

WebhookClient

`ApifyClient.webhooks()`

Retrieve the sub-client for querying multiple webhooks of a user.

Return type

WebhookCollectionClient

`ApifyClient.webhook_dispatch(webhook_dispatch_id)`

Retrieve the sub-client for accessing a single webhook dispatch.

Parameters
- webhook_dispatch_id (str) – ID of the webhook dispatch to access
Return type

WebhookDispatchClient

`ApifyClient.webhook_dispatches()`

Retrieve the sub-client for querying multiple webhook dispatches of a user.

Return type

WebhookDispatchCollectionClient

`ApifyClient.schedule(schedule_id)`

Retrieve the sub-client for manipulating a single schedule.

Parameters
- schedule_id (str) – ID of the schedule to be manipulated
Return type

ScheduleClient

`ApifyClient.schedules()`

Retrieve the sub-client for manipulating schedules.

Return type

ScheduleCollectionClient

`ApifyClient.log(build_or_run_id)`

Retrieve the sub-client for retrieving logs.

Parameters
- build_or_run_id (str) – ID of the actor build or run for which to access the log
Return type

LogClient

`ApifyClient.task(task_id)`

Retrieve the sub-client for manipulating a single task.

Parameters
- task_id (str) – ID of the task to be manipulated
Return type

TaskClient

`ApifyClient.tasks()`

Retrieve the sub-client for manipulating tasks.

Return type

TaskCollectionClient

`ApifyClient.user(user_id=None)`

Retrieve the sub-client for querying users.

Parameters
- user_id (str, optional) – ID of user to be queried. If None, queries the user belonging to the token supplied to the client
Return type

UserClient

ActorClient

Sub-client for manipulating a single actor.

get()
update()
delete()
start()
call()
build()
builds()
runs()
last_run()
versions()
version()
webhooks()

`ActorClient.get()`

Retrieve the actor.

https://docs.apify.com/api/v2#/reference/actors/actor-object/get-actor

Returns

The retrieved actor
Return type

dict, optional

`ActorClient.update(*, name=None, title=None, description=None, seo_title=None, seo_description=None, versions=None, restart_on_error=None, is_public=None, is_deprecated=None, is_anonymously_runnable=None, categories=None, default_run_build=None, default_run_memory_mbytes=None, default_run_timeout_secs=None, example_run_input_body=None, example_run_input_content_type=None)`

Update the actor with the specified fields.

https://docs.apify.com/api/v2#/reference/actors/actor-object/update-actor

Parameters
- name (str, optional) – The name of the actor
- title (str, optional) – The title of the actor (human-readable)
- description (str, optional) – The description for the actor
- seo_title (str, optional) – The title of the actor optimized for search engines
- seo_description (str, optional) – The description of the actor optimized for search engines
- versions (list of dict, optional) – The list of actor versions
- restart_on_error (bool, optional) – If true, the main actor run process will be restarted whenever it exits with a non-zero status code.
- is_public (bool, optional) – Whether the actor is public.
- is_deprecated (bool, optional) – Whether the actor is deprecated.
- is_anonymously_runnable (bool, optional) – Whether the actor is anonymously runnable.
- categories (list of str, optional) – The categories to which the actor belongs.
- default_run_build (str, optional) – Tag or number of the build that you want to run by default.
- default_run_memory_mbytes (int, optional) – Default amount of memory allocated for the runs of this actor, in megabytes.
- default_run_timeout_secs (int, optional) – Default timeout for the runs of this actor in seconds.
- example_run_input_body (Any, optional) – Input to be prefilled as default input to new users of this actor.
- example_run_input_content_type (str, optional) – The content type of the example run input.
Returns

The updated actor
Return type

dict

`ActorClient.delete()`

Delete the actor.

https://docs.apify.com/api/v2#/reference/actors/actor-object/delete-actor

Return type

None

`ActorClient.start(*, run_input=None, content_type=None, build=None, memory_mbytes=None, timeout_secs=None, wait_for_finish=None, webhooks=None)`

Start the actor and immediately return the Run object.

https://docs.apify.com/api/v2#/reference/actors/run-collection/run-actor

Parameters
- run_input (Any, optional) – The input to pass to the actor run.
- content_type (str, optional) – The content type of the input.
- build (str, optional) – Specifies the actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the default run configuration for the actor (typically latest).
- memory_mbytes (int, optional) – Memory limit for the run, in megabytes. By default, the run uses a memory limit specified in the default run configuration for the actor.
- timeout_secs (int, optional) – Optional timeout for the run, in seconds. By default, the run uses the timeout specified in the default run configuration for the actor.
- wait_for_finish (int, optional) – The maximum number of seconds the server waits for the run to finish. By default, it is 0, the maximum value is 300.
- webhooks (list of dict, optional) – Optional ad-hoc webhooks (https://docs.apify.com/webhooks/ad-hoc-webhooks) associated with the actor run which can be used to receive a notification, e.g. when the actor finished or failed. If you already have a webhook set up for the actor or task, you do not have to add it again here. Each webhook is represented by a dictionary containing these items:
  - event_types: list of WebhookEventType values which trigger the webhook
  - request_url: URL to which to send the webhook HTTP request
  - payload_template (optional): Optional template for the request payload
Returns

The run object
Return type

dict
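The webhook dictionaries described for the webhooks parameter can be assembled with a small helper. This is only a sketch: make_webhook is a hypothetical helper, the URL is a placeholder, and plain strings are used where the text calls for WebhookEventType values, on the assumption that the strings shown correspond to those values.

```python
def make_webhook(event_types, request_url, payload_template=None):
    """Build one ad-hoc webhook dictionary with the keys listed above."""
    webhook = {'event_types': list(event_types), 'request_url': request_url}
    if payload_template is not None:
        webhook['payload_template'] = payload_template
    return webhook


# The event type strings and the URL are placeholders chosen for illustration
webhooks = [
    make_webhook(['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED'], 'https://example.com/run-finished'),
]
```

The resulting list could then be passed as actor_client.start(webhooks=webhooks).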

`ActorClient.call(*, run_input=None, content_type=None, build=None, memory_mbytes=None, timeout_secs=None, webhooks=None, wait_secs=None)`

Start the actor and wait for it to finish before returning the Run object.

It waits indefinitely, unless the wait_secs argument is provided.

https://docs.apify.com/api/v2#/reference/actors/run-collection/run-actor

Parameters
- run_input (Any, optional) – The input to pass to the actor run.
- content_type (str, optional) – The content type of the input.
- build (str, optional) – Specifies the actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the default run configuration for the actor (typically latest).
- memory_mbytes (int, optional) – Memory limit for the run, in megabytes. By default, the run uses a memory limit specified in the default run configuration for the actor.
- timeout_secs (int, optional) – Optional timeout for the run, in seconds. By default, the run uses the timeout specified in the default run configuration for the actor.
- webhooks (list, optional) – Optional webhooks (https://docs.apify.com/webhooks) associated with the actor run, which can be used to receive a notification, e.g. when the actor finished or failed. If you already have a webhook set up for the actor, you do not have to add it again here.
- wait_secs (int, optional) – The maximum number of seconds the server waits for the run to finish. If not provided, waits indefinitely.
Returns

The run object
Return type

dict
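Conceptually, call() combines start() with RunClient.wait_for_finish(). The sketch below shows one way to express that combination using only methods documented in this reference; it is not the library's implementation. It assumes that the Run object exposes its ID under the 'id' key, and it uses a MagicMock stand-in so that it runs offline.

```python
from unittest.mock import MagicMock


def start_and_wait(client, actor_id, *, run_input=None, wait_secs=None):
    """Start an actor run, then block until it finishes or wait_secs elapses."""
    started_run = client.actor(actor_id).start(run_input=run_input)
    # Assumption: the Run object exposes its ID under the 'id' key
    return client.run(started_run['id']).wait_for_finish(wait_secs=wait_secs)


# Stand-in client so the sketch runs offline; pass a real ApifyClient instead
fake_client = MagicMock()
fake_client.actor.return_value.start.return_value = {'id': 'run-1', 'status': 'RUNNING'}
fake_client.run.return_value.wait_for_finish.return_value = {'id': 'run-1', 'status': 'SUCCEEDED'}

finished_run = start_and_wait(fake_client, 'john-doe/my-actor', wait_secs=60)
```

Splitting the two steps like this can be useful when you want to record the run ID right after the start and only wait for the result later.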

`ActorClient.build(*, version_number, beta_packages=None, tag=None, use_cache=None, wait_for_finish=None)`

Build the actor.

https://docs.apify.com/api/v2#/reference/actors/build-collection/build-actor

Parameters
- version_number (str) – Actor version number to be built.
- beta_packages (bool, optional) – If True, then the actor is built with beta versions of Apify NPM packages. By default, the build uses latest stable packages.
- tag (str, optional) – Tag to be applied to the build on success. By default, the tag is taken from the actor version’s buildTag property.
- use_cache (bool, optional) – If true, the actor’s Docker container will be rebuilt using layer cache (https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#leverage-build-cache). This is to enable quick rebuild during development. By default, the cache is not used.
- wait_for_finish (int, optional) – The maximum number of seconds the server waits for the build to finish before returning. By default it is 0, the maximum value is 300.
Returns

The build object
Return type

dict

`ActorClient.builds()`

Retrieve a client for the builds of this actor.

Return type

BuildCollectionClient

`ActorClient.runs()`

Retrieve a client for the runs of this actor.

Return type

RunCollectionClient

`ActorClient.last_run(*, status=None)`

Retrieve the client for the last run of this actor.

Last run is retrieved based on the start time of the runs.

Parameters
- status (ActorJobStatus, optional) – Consider only runs with this status.
Returns

The resource client for the last run of this actor.
Return type

RunClient

`ActorClient.versions()`

Retrieve a client for the versions of this actor.

Return type

ActorVersionCollectionClient

`ActorClient.version(version_number)`

Retrieve the client for the specified version of this actor.

Parameters
- version_number (str) – The version number for which to retrieve the resource client.
Returns

The resource client for the specified actor version.
Return type

ActorVersionClient

`ActorClient.webhooks()`

Retrieve a client for webhooks associated with this actor.

Return type

WebhookCollectionClient

ActorCollectionClient

Sub-client for manipulating actors.

list()
create()

`ActorCollectionClient.list(*, my=None, limit=None, offset=None, desc=None)`

List the actors the user has created or used.

https://docs.apify.com/api/v2#/reference/actors/actor-collection/get-list-of-actors

Parameters
- my (bool, optional) – If True, will return only actors which the user has created themselves.
- limit (int, optional) – How many actors to list
- offset (int, optional) – What actor to include as first when retrieving the list
- desc (bool, optional) – Whether to sort the actors in descending order based on their creation date
Returns

The list of available actors matching the specified filters.
Return type

ListPage

`ActorCollectionClient.create(*, name, title=None, description=None, seo_title=None, seo_description=None, versions=None, restart_on_error=None, is_public=None, is_deprecated=None, is_anonymously_runnable=None, categories=None, default_run_build=None, default_run_memory_mbytes=None, default_run_timeout_secs=None, example_run_input_body=None, example_run_input_content_type=None)`

Create a new actor.

https://docs.apify.com/api/v2#/reference/actors/actor-collection/create-actor

Parameters
- name (str) – The name of the actor
- title (str, optional) – The title of the actor (human-readable)
- description (str, optional) – The description for the actor
- seo_title (str, optional) – The title of the actor optimized for search engines
- seo_description (str, optional) – The description of the actor optimized for search engines
- versions (list of dict, optional) – The list of actor versions
- restart_on_error (bool, optional) – If true, the main actor run process will be restarted whenever it exits with a non-zero status code.
- is_public (bool, optional) – Whether the actor is public.
- is_deprecated (bool, optional) – Whether the actor is deprecated.
- is_anonymously_runnable (bool, optional) – Whether the actor is anonymously runnable.
- categories (list of str, optional) – The categories to which the actor belongs.
- default_run_build (str, optional) – Tag or number of the build that you want to run by default.
- default_run_memory_mbytes (int, optional) – Default amount of memory allocated for the runs of this actor, in megabytes.
- default_run_timeout_secs (int, optional) – Default timeout for the runs of this actor in seconds.
- example_run_input_body (Any, optional) – Input to be prefilled as default input to new users of this actor.
- example_run_input_content_type (str, optional) – The content type of the example run input.
Returns

The created actor.
Return type

dict

ActorVersionClient

Sub-client for manipulating a single actor version.

get()
update()
delete()

`ActorVersionClient.get()`

Return information about the actor version.

https://docs.apify.com/api/v2#/reference/actors/version-object/get-version

Returns

The retrieved actor version data
Return type

dict, optional

`ActorVersionClient.update(*, build_tag=None, env_vars=None, apply_env_vars_to_build=None, source_type=None, source_code=None, base_docker_image=None, source_files=None, git_repo_url=None, tarball_url=None, github_gist_url=None)`

Update the actor version with specified fields.

https://docs.apify.com/api/v2#/reference/actors/version-object/update-version

Parameters
- build_tag (str, optional) – Tag that is automatically set to the latest successful build of the current version.
- env_vars (list of dict, optional) – Environment variables that will be available to the actor run process, and optionally also to the build process. See the API docs for their exact structure.
- apply_env_vars_to_build (bool, optional) – Whether the environment variables specified for the actor run will also be set to the actor build process.
- source_type (ActorSourceType, optional) – What source type is the actor version using.
- source_code (str, optional) – Source code as a single JavaScript/Node.js file, using the base Docker image specified in baseDockerImage. Required when source_type is ActorSourceType.SOURCE_CODE.
- base_docker_image (str, optional) – The base Docker image to use for single-file actors. Required when source_type is ActorSourceType.SOURCE_CODE.
- source_files (list of dict, optional) – Source code comprised of multiple files, each an item of the array. Required when source_type is ActorSourceType.SOURCE_FILES. See the API docs for the exact structure.
- git_repo_url (str, optional) – The URL of a Git repository from which the source code will be cloned. Required when source_type is ActorSourceType.GIT_REPO.
- tarball_url (str, optional) – The URL of a tarball or a zip archive from which the source code will be downloaded. Required when source_type is ActorSourceType.TARBALL.
- github_gist_url (str, optional) – The URL of a GitHub Gist from which the source will be downloaded. Required when source_type is ActorSourceType.GITHUB_GIST.
Returns

The updated actor version
Return type

dict

`ActorVersionClient.delete()`

Delete the actor version.

https://docs.apify.com/api/v2#/reference/actors/version-object/delete-version

Return type

None

ActorVersionCollectionClient

Sub-client for manipulating actor versions.

list()
create()

`ActorVersionCollectionClient.list()`

List the available actor versions.

https://docs.apify.com/api/v2#/reference/actors/version-collection/get-list-of-versions

Returns

The list of available actor versions.
Return type

ListPage

`ActorVersionCollectionClient.create(*, version_number, build_tag=None, env_vars=None, apply_env_vars_to_build=None, source_type, source_code=None, base_docker_image=None, source_files=None, git_repo_url=None, tarball_url=None, github_gist_url=None)`

Create a new actor version.

https://docs.apify.com/api/v2#/reference/actors/version-collection/create-version

Parameters
- version_number (str) – Major and minor version of the actor (e.g. 1.0)
- build_tag (str, optional) – Tag that is automatically set to the latest successful build of the current version.
- env_vars (list of dict, optional) – Environment variables that will be available to the actor run process, and optionally also to the build process. See the API docs for their exact structure.
- apply_env_vars_to_build (bool, optional) – Whether the environment variables specified for the actor run will also be set to the actor build process.
- source_type (ActorSourceType) – What source type is the actor version using.
- source_code (str, optional) – Source code as a single JavaScript/Node.js file, using the base Docker image specified in baseDockerImage. Required when source_type is ActorSourceType.SOURCE_CODE.
- base_docker_image (str, optional) – The base Docker image to use for single-file actors. Required when source_type is ActorSourceType.SOURCE_CODE.
- source_files (list of dict, optional) – Source code comprised of multiple files, each an item of the array. Required when source_type is ActorSourceType.SOURCE_FILES. See the API docs for the exact structure.
- git_repo_url (str, optional) – The URL of a Git repository from which the source code will be cloned. Required when source_type is ActorSourceType.GIT_REPO.
- tarball_url (str, optional) – The URL of a tarball or a zip archive from which the source code will be downloaded. Required when source_type is ActorSourceType.TARBALL.
- github_gist_url (str, optional) – The URL of a GitHub Gist from which the source will be downloaded. Required when source_type is ActorSourceType.GITHUB_GIST.
Returns

The created actor version
Return type

dict
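Because the required arguments depend on source_type, it can help to check them before calling create(). The following sketch encodes the requirements stated in the parameter list above. The helper missing_fields is hypothetical, and plain strings stand in for the ActorSourceType members.

```python
# Field(s) each source type requires, as listed in the parameter descriptions above.
# Plain strings stand in for the ActorSourceType members in this sketch.
REQUIRED_FIELDS = {
    'SOURCE_CODE': ('source_code', 'base_docker_image'),
    'SOURCE_FILES': ('source_files',),
    'GIT_REPO': ('git_repo_url',),
    'TARBALL': ('tarball_url',),
    'GITHUB_GIST': ('github_gist_url',),
}


def missing_fields(source_type, **fields):
    """Return the names of required fields that are absent or None."""
    return [name for name in REQUIRED_FIELDS[source_type] if fields.get(name) is None]


# Placeholder values used only for illustration
ok = missing_fields('GIT_REPO', git_repo_url='https://github.com/john-doe/my-actor')
incomplete = missing_fields('SOURCE_CODE', source_code='console.log("hi")')
```

An empty list means all required fields for the given source type are present; otherwise the list names what still has to be supplied.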

RunClient

Sub-client for manipulating a single actor run.

get()
abort()
wait_for_finish()
metamorph()
resurrect()
dataset()
key_value_store()
request_queue()
log()

`RunClient.get()`

Return information about the actor run.

https://docs.apify.com/api/v2#/reference/actor-runs/run-object/get-run

Returns

The retrieved actor run data
Return type

dict

`RunClient.abort(*, gracefully=None)`

Abort the actor run which is starting or currently running and return its details.

https://docs.apify.com/api/v2#/reference/actor-runs/abort-run/abort-run

Parameters
- gracefully (bool, optional) – If True, the actor run will abort gracefully. It will send aborting and persistStates events into the run and force-stop the run after 30 seconds. It is helpful in cases where you plan to resurrect the run later.
Returns

The data of the aborted actor run
Return type

dict

`RunClient.wait_for_finish(*, wait_secs=None)`

Wait synchronously until the run finishes or the server times out.

Parameters
- wait_secs (int, optional) – How long the client waits for the run to finish, in seconds. Use None to wait indefinitely.
Returns

The actor run data. If the status on the object is not one of the terminal statuses (SUCCEEDED, FAILED, TIMED_OUT, ABORTED), then the run has not yet finished.
Return type

dict, optional
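Since wait_for_finish() can return before the run is done, callers usually need to check the status of the returned object. A minimal sketch of such a check follows. It assumes that the status is stored under the 'status' key, and because this reference spells the timed-out status both as TIMED_OUT and TIMED-OUT, the sketch accepts either spelling.

```python
# Terminal statuses; the exact strings are an assumption of this sketch
TERMINAL_STATUSES = {'SUCCEEDED', 'FAILED', 'TIMED_OUT', 'TIMED-OUT', 'ABORTED'}


def has_finished(run):
    """Return True if the run data returned by wait_for_finish() is in a terminal state."""
    return run is not None and run.get('status') in TERMINAL_STATUSES


still_running = has_finished({'id': 'run-1', 'status': 'RUNNING'})
done = has_finished({'id': 'run-1', 'status': 'SUCCEEDED'})
```

A caller could, for example, invoke wait_for_finish(wait_secs=60) in a loop and stop as soon as has_finished() returns True.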

`RunClient.metamorph(*, target_actor_id, target_actor_build=None, run_input=None, content_type=None)`

Transform an actor run into a run of another actor with a new input.

https://docs.apify.com/api/v2#/reference/actor-runs/metamorph-run/metamorph-run

Parameters
- target_actor_id (str) – ID of the target actor that the run should be transformed into
- target_actor_build (str, optional) – The build of the target actor. It can be either a build tag or build number. By default, the run uses the build specified in the default run configuration for the target actor (typically the latest build).
- run_input (Any, optional) – The input to pass to the new run.
- content_type (str, optional) – The content type of the input.
Returns

The actor run data.
Return type

dict

`RunClient.resurrect()`

Resurrect a finished actor run.

Only finished runs, i.e. runs with status SUCCEEDED, FAILED, ABORTED or TIMED-OUT, can be resurrected. The run status will be updated to RUNNING and its container will be restarted with the same default storages.

https://docs.apify.com/api/v2#/reference/actor-runs/resurrect-run/resurrect-run

Returns

The actor run data.
Return type

dict
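Graceful abort and resurrect can be combined to restart a run while keeping its default storages. The sketch below chains the three RunClient methods documented above. It assumes that abort() may return before the run has fully stopped, which is why it waits for the run to finish before resurrecting it; restart_run is a hypothetical helper, and a MagicMock stands in for a real RunClient.

```python
from unittest.mock import MagicMock, call


def restart_run(run_client):
    """Gracefully abort a run, wait for it to stop, then resurrect it."""
    run_client.abort(gracefully=True)
    # Assumption: abort() may return before the run has fully stopped,
    # so wait for a finished status before resurrecting
    run_client.wait_for_finish()
    return run_client.resurrect()


# Stand-in for a RunClient so the sketch runs offline
fake_run_client = MagicMock()
fake_run_client.resurrect.return_value = {'id': 'run-1', 'status': 'RUNNING'}

resurrected_run = restart_run(fake_run_client)
```

With a real client, the same function could be called as restart_run(apify_client.run(run_id)), where run_id is the ID of the run to restart.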

`RunClient.dataset()`

Get the client for the default dataset of the actor run.

https://docs.apify.com/api/v2#/reference/actors/last-run-object-and-its-storages

Returns

A client allowing access to the default dataset of this actor run.
Return type

DatasetClient

`RunClient.key_value_store()`

Get the client for the default key-value store of the actor run.

https://docs.apify.com/api/v2#/reference/actors/last-run-object-and-its-storages

Returns

A client allowing access to the default key-value store of this actor run.
Return type

KeyValueStoreClient

`RunClient.request_queue()`

Get the client for the default request queue of the actor run.

https://docs.apify.com/api/v2#/reference/actors/last-run-object-and-its-storages

Returns

A client allowing access to the default request_queue of this actor run.
Return type

RequestQueueClient

`RunClient.log()`

Get the client for the log of the actor run.

https://docs.apify.com/api/v2#/reference/actors/last-run-object-and-its-storages

Returns

A client allowing access to the log of this actor run.
Return type

LogClient

RunCollectionClient

Sub-client for listing actor runs.

list()

`RunCollectionClient.list(*, limit=None, offset=None, desc=None, status=None)`

List all actor runs (either of a single actor, or all user’s actors, depending on where this client was initialized from).

https://docs.apify.com/api/v2#/reference/actors/run-collection/get-list-of-runs
https://docs.apify.com/api/v2#/reference/actor-runs/run-collection/get-user-runs-list

Parameters
- limit (int, optional) – How many runs to retrieve
- offset (int, optional) – What run to include as first when retrieving the list
- desc (bool, optional) – Whether to sort the runs in descending order based on their start date
- status (ActorJobStatus, optional) – Retrieve only runs with the provided status
Returns

The retrieved actor runs
Return type

ListPage

BuildClient

Sub-client for manipulating a single actor build.

get()
abort()
wait_for_finish()

`BuildClient.get()`

Return information about the actor build.

https://docs.apify.com/api/v2#/reference/actor-builds/build-object/get-build

Returns

The retrieved actor build data
Return type

dict, optional

`BuildClient.abort()`

Abort the actor build which is starting or currently running and return its details.

https://docs.apify.com/api/v2#/reference/actor-builds/abort-build/abort-build

Returns

The data of the aborted actor build
Return type

dict

`BuildClient.wait_for_finish(*, wait_secs=None)`

Wait synchronously until the build finishes or the server times out.

Parameters
- wait_secs (int, optional) – How long the client waits for the build to finish, in seconds. Use None to wait indefinitely.
Returns

The actor build data. If the status on the object is not one of the terminal statuses (SUCCEEDED, FAILED, TIMED_OUT, ABORTED), then the build has not yet finished.
Return type

dict, optional

BuildCollectionClient

Sub-client for listing actor builds.

list()

`BuildCollectionClient.list(*, limit=None, offset=None, desc=None)`

List all actor builds (either of a single actor, or all user’s actors, depending on where this client was initialized from).

https://docs.apify.com/api/v2#/reference/actors/build-collection/get-list-of-builds
https://docs.apify.com/api/v2#/reference/actor-builds/build-collection/get-user-builds-list

Parameters
- limit (int, optional) – How many builds to retrieve
- offset (int, optional) – What build to include as first when retrieving the list
- desc (bool, optional) – Whether to sort the builds in descending order based on their start date
Returns

The retrieved actor builds
Return type

ListPage

DatasetClient

Sub-client for manipulating a single dataset.

get()
update()
delete()
list_items()
iterate_items()
download_items()
stream_items()
push_items()

`DatasetClient.get()`

Retrieve the dataset.

https://docs.apify.com/api/v2#/reference/datasets/dataset/get-dataset

Returns

The retrieved dataset, or None, if it does not exist
Return type

dict, optional

`DatasetClient.update(*, name=None)`

Update the dataset with specified fields.

https://docs.apify.com/api/v2#/reference/datasets/dataset/update-dataset

Parameters
- name (str, optional) – The new name for the dataset
Returns

The updated dataset
Return type

dict

`DatasetClient.delete()`

Delete the dataset.

https://docs.apify.com/api/v2#/reference/datasets/dataset/delete-dataset

Return type

None

`DatasetClient.list_items(*, offset=None, limit=None, clean=None, desc=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_hidden=None)`

List the items of the dataset.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items

Parameters
- offset (int, optional) – Number of items that should be skipped at the start. The default value is 0
- limit (int, optional) – Maximum number of items to return. By default there is no limit.
- desc (bool, optional) – By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- clean (bool, optional) – If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is just a shortcut for skip_hidden=True and skip_empty=True parameters. Note that since some objects might be skipped from the output, the result might contain fewer items than the limit value.
- fields (list of str, optional) – A list of fields which should be picked from the items, only these fields will remain in the resulting record objects. Note that the fields in the outputted items are sorted the same way as they are specified in the fields parameter. You can use this feature to effectively fix the output format.
- omit (list of str, optional) – A list of fields which should be omitted from the items.
- unwind (str, optional) – Name of a field which should be unwound. If the field is an array then every element of the array will become a separate record and merged with parent object. If the unwound field is an object then it is merged with the parent object. If the unwound field is missing or its value is neither an array nor an object and therefore cannot be merged with a parent object, then the item gets preserved as it is. Note that the unwound items ignore the desc parameter.
- skip_empty (bool, optional) – If True, then empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.
- skip_hidden (bool, optional) – If True, then hidden fields are skipped from the output, i.e. fields starting with the # character.
Returns

A page of the list of dataset items according to the specified filters.
Return type

ListPage
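
A minimal sketch of how the filters might be combined in practice: pick a fixed set of fields and drop empty items and hidden fields in one call. The helper name `fetch_clean_columns` is hypothetical, and `dataset_client` stands for any DatasetClient.

```python
from typing import List


def fetch_clean_columns(dataset_client, columns: List[str], limit: int = 100) -> List[dict]:
    """Fetch up to `limit` cleaned items restricted to the given fields.

    Hypothetical helper built on list_items().
    """
    page = dataset_client.list_items(
        fields=columns,  # keep only these fields, in this order
        clean=True,      # shortcut for skip_hidden=True and skip_empty=True
        limit=limit,
    )
    # Because empty items are skipped, fewer than `limit` items may come back.
    return list(page.items)
```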

`DatasetClient.iterate_items(*, offset=0, limit=None, clean=None, desc=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_hidden=None)`

Iterate over the items in the dataset.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items

Parameters
- offset (int, optional) – Number of items that should be skipped at the start. The default value is 0
- limit (int, optional) – Maximum number of items to return. By default there is no limit.
- desc (bool, optional) – By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- clean (bool, optional) – If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is just a shortcut for skip_hidden=True and skip_empty=True parameters. Note that since some objects might be skipped from the output, the result might contain fewer items than the limit value.
- fields (list of str, optional) – A list of fields which should be picked from the items, only these fields will remain in the resulting record objects. Note that the fields in the outputted items are sorted the same way as they are specified in the fields parameter. You can use this feature to effectively fix the output format.
- omit (list of str, optional) – A list of fields which should be omitted from the items.
- unwind (str, optional) – Name of a field which should be unwound. If the field is an array then every element of the array will become a separate record and merged with parent object. If the unwound field is an object then it is merged with the parent object. If the unwound field is missing or its value is neither an array nor an object and therefore cannot be merged with a parent object, then the item gets preserved as it is. Note that the unwound items ignore the desc parameter.
- skip_empty (bool, optional) – If True, then empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.
- skip_hidden (bool, optional) – If True, then hidden fields are skipped from the output, i.e. fields starting with the # character.
Yields

dict – An item from the dataset
Return type

Generator
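
Because iterate_items() yields one item at a time, it suits aggregations that do not need the whole dataset in memory. The following is a sketch only; `count_values` is a hypothetical helper and `dataset_client` stands for any DatasetClient.

```python
from collections import Counter


def count_values(dataset_client, field: str) -> Counter:
    """Count how often each value of `field` occurs across the dataset.

    Hypothetical helper: requests only the one field it needs.
    """
    counts: Counter = Counter()
    for item in dataset_client.iterate_items(fields=[field], skip_empty=True):
        # Items lacking the field are counted under None.
        counts[item.get(field)] += 1
    return counts
```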

`DatasetClient.download_items(*, item_format='json', offset=None, limit=None, desc=None, clean=None, bom=None, delimiter=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_header_row=None, skip_hidden=None, xml_root=None, xml_row=None)`

Download the items in the dataset as raw bytes.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items

Parameters
- item_format (str) – Format of the results, possible values are: json, jsonl, csv, html, xlsx, xml and rss. The default value is json.
- offset (int, optional) – Number of items that should be skipped at the start. The default value is 0
- limit (int, optional) – Maximum number of items to return. By default there is no limit.
- desc (bool, optional) – By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- clean (bool, optional) – If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is just a shortcut for skip_hidden=True and skip_empty=True parameters. Note that since some objects might be skipped from the output, the result might contain fewer items than the limit value.
- bom (bool, optional) – All text responses are encoded in UTF-8 encoding. By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM), while json, jsonl, xml, html and rss files are not. If you want to override this default behavior, specify bom=True query parameter to include the BOM or bom=False to skip it.
- delimiter (str, optional) – A delimiter character for CSV files. The default delimiter is a simple comma (,).
- fields (list of str, optional) – A list of fields which should be picked from the items, only these fields will remain in the resulting record objects. Note that the fields in the outputted items are sorted the same way as they are specified in the fields parameter. You can use this feature to effectively fix the output format.
- omit (list of str, optional) – A list of fields which should be omitted from the items.
- unwind (str, optional) – Name of a field which should be unwound. If the field is an array then every element of the array will become a separate record and merged with parent object. If the unwound field is an object then it is merged with the parent object. If the unwound field is missing or its value is neither an array nor an object and therefore cannot be merged with a parent object, then the item gets preserved as it is. Note that the unwound items ignore the desc parameter.
- skip_empty (bool, optional) – If True, then empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.
- skip_header_row (bool, optional) – If True, then header row in the csv format is skipped.
- skip_hidden (bool, optional) – If True, then hidden fields are skipped from the output, i.e. fields starting with the # character.
- xml_root (str, optional) – Overrides default root element name of xml output. By default the root element is items.
- xml_row (str, optional) – Overrides default element name that wraps each page or page function result object in xml output. By default the element name is item.
Returns

The dataset items as raw bytes
Return type

bytes
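
Since download_items() returns raw bytes, the result can be written to disk without decoding. The sketch below exports a dataset as CSV; `export_csv` is a hypothetical helper, and the choice to drop the BOM and to request cleaned items is an assumption made for the example.

```python
from pathlib import Path


def export_csv(dataset_client, path: str, delimiter: str = ',') -> int:
    """Write the dataset to `path` as CSV and return the number of bytes written.

    Hypothetical helper built on download_items().
    """
    data = dataset_client.download_items(
        item_format='csv',
        delimiter=delimiter,
        bom=False,   # skip the UTF-8 BOM that CSV output carries by default
        clean=True,  # drop empty items and hidden fields
    )
    Path(path).write_bytes(data)
    return len(data)
```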

`DatasetClient.stream_items(*, item_format='json', offset=None, limit=None, desc=None, clean=None, bom=None, delimiter=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_header_row=None, skip_hidden=None, xml_root=None, xml_row=None)`

Retrieve the items in the dataset as a file-like object.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items

Parameters
- item_format (str) – Format of the results, possible values are: json, jsonl, csv, html, xlsx, xml and rss. The default value is json.
- offset (int, optional) – Number of items that should be skipped at the start. The default value is 0
- limit (int, optional) – Maximum number of items to return. By default there is no limit.
- desc (bool, optional) – By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- clean (bool, optional) – If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is just a shortcut for skip_hidden=True and skip_empty=True parameters. Note that since some objects might be skipped from the output, the result might contain fewer items than the limit value.
- bom (bool, optional) – All text responses are encoded in UTF-8 encoding. By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM), while json, jsonl, xml, html and rss files are not. If you want to override this default behavior, specify bom=True query parameter to include the BOM or bom=False to skip it.
- delimiter (str, optional) – A delimiter character for CSV files. The default delimiter is a simple comma (,).
- fields (list of str, optional) – A list of fields which should be picked from the items, only these fields will remain in the resulting record objects. Note that the fields in the outputted items are sorted the same way as they are specified in the fields parameter. You can use this feature to effectively fix the output format.
- omit (list of str, optional) – A list of fields which should be omitted from the items.
- unwind (str, optional) – Name of a field which should be unwound. If the field is an array then every element of the array will become a separate record and merged with parent object. If the unwound field is an object then it is merged with the parent object. If the unwound field is missing or its value is neither an array nor an object and therefore cannot be merged with a parent object, then the item gets preserved as it is. Note that the unwound items ignore the desc parameter.
- skip_empty (bool, optional) – If True, then empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.
- skip_header_row (bool, optional) – If True, then header row in the csv format is skipped.
- skip_hidden (bool, optional) – If True, then hidden fields are skipped from the output, i.e. fields starting with the # character.
- xml_root (str, optional) – Overrides default root element name of xml output. By default the root element is items.
- xml_row (str, optional) – Overrides default element name that wraps each page or page function result object in xml output. By default the element name is item.
Returns

The dataset items as a file-like object
Return type

io.IOBase
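
For large datasets, the file-like object can be copied to disk in chunks instead of holding everything in memory. This is a sketch under the assumption that the returned object yields bytes from read() and supports close(), as io.IOBase objects do; `stream_to_file` is a hypothetical helper.

```python
import shutil


def stream_to_file(dataset_client, path: str, item_format: str = 'jsonl') -> None:
    """Copy the dataset items to `path` chunk by chunk.

    Hypothetical helper: assumes stream_items() returns a binary file-like object.
    """
    stream = dataset_client.stream_items(item_format=item_format)
    try:
        with open(path, 'wb') as target:
            # copyfileobj reads and writes in fixed-size chunks.
            shutil.copyfileobj(stream, target)
    finally:
        # Release the underlying connection even if the copy fails.
        stream.close()
```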

`DatasetClient.push_items(items)`

Push items to the dataset.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/put-items

Parameters
- items (Union[str, int, float, bool, None, Dict[str, Any], List[Any]]) – The items to push to the dataset. Either a stringified JSON, a dictionary, or a list of strings or dictionaries.
Return type

None
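
When many items need to be stored, one option is to push them in smaller lists so that no single request carries the whole payload. The sketch below illustrates the idea; `push_in_batches` is a hypothetical helper and the default batch size of 500 is an arbitrary assumption, not a documented limit.

```python
from typing import Any, Dict, List


def push_in_batches(dataset_client, items: List[Dict[str, Any]], batch_size: int = 500) -> int:
    """Push `items` in consecutive batches and return the number of calls made.

    Hypothetical helper built on push_items().
    """
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')
    calls = 0
    for start in range(0, len(items), batch_size):
        # Each call receives a list of dictionaries, one of the accepted shapes.
        dataset_client.push_items(items[start:start + batch_size])
        calls += 1
    return calls
```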

DatasetCollectionClient

Sub-client for manipulating datasets.

list()
get_or_create()

`DatasetCollectionClient.list(*, unnamed=None, limit=None, offset=None, desc=None)`

List the available datasets.

https://docs.apify.com/api/v2#/reference/datasets/dataset-collection/get-list-of-datasets

Parameters
- unnamed (bool, optional) – Whether to include unnamed datasets in the list
- limit (int, optional) – How many datasets to retrieve
- offset (int, optional) – Position of the first dataset to include in the list
- desc (bool, optional) – Whether to sort the datasets in descending order based on their modification date
Returns

The list of available datasets matching the specified filters.
Return type

ListPage

`DatasetCollectionClient.get_or_create(*, name=None)`

Retrieve a named dataset, or create a new one when it doesn’t exist.

https://docs.apify.com/api/v2#/reference/datasets/dataset-collection/create-dataset

Parameters
- name (str, optional) – The name of the dataset to retrieve or create.
Returns

The retrieved or newly-created dataset.
Return type

dict
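
A common pattern is to look up a named dataset and then work with it through a DatasetClient. The sketch below assumes that the top-level client exposes `datasets()` for the collection alongside the `dataset()` accessor used in the Quick Start, and that the returned dict carries the dataset ID under the `id` key; `open_named_dataset` is a hypothetical helper.

```python
def open_named_dataset(apify_client, name: str):
    """Return a client for the dataset called `name`, creating it if needed.

    Hypothetical helper: assumes the dataset dict has an 'id' key.
    """
    dataset = apify_client.datasets().get_or_create(name=name)
    return apify_client.dataset(dataset['id'])
```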

KeyValueStoreClient

Sub-client for manipulating a single key-value store.

get()
update()
delete()
list_keys()
get_record()
set_record()
delete_record()

`KeyValueStoreClient.get()`

Retrieve the key-value store.

https://docs.apify.com/api/v2#/reference/key-value-stores/store-object/get-store

Returns

The retrieved key-value store, or None if it does not exist
Return type

dict, optional

`KeyValueStoreClient.update(*, name=None)`

Update the key-value store with specified fields.

https://docs.apify.com/api/v2#/reference/key-value-stores/store-object/update-store

Parameters
- name (str, optional) – The new name for key-value store
Returns

The updated key-value store
Return type

dict

`KeyValueStoreClient.delete()`

Delete the key-value store.

https://docs.apify.com/api/v2#/reference/key-value-stores/store-object/delete-store

Return type

None

`KeyValueStoreClient.list_keys(*, limit=None, exclusive_start_key=None)`

List the keys in the key-value store.

https://docs.apify.com/api/v2#/reference/key-value-stores/key-collection/get-list-of-keys

Parameters
- limit (int, optional) – Number of keys to be returned. Maximum value is 1000
- exclusive_start_key (str, optional) – All keys up to and including this one are skipped from the result
Returns

The list of keys in the key-value store matching the given arguments
Return type

dict
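
Since at most 1000 keys are returned per call, listing a large store means following exclusive_start_key from page to page. The sketch below shows one way this might look. The dict keys `items`, `key`, `isTruncated` and `nextExclusiveStartKey` are assumptions about the shape of the returned dict, and `iterate_keys` is a hypothetical helper.

```python
from typing import Iterator


def iterate_keys(store_client, page_size: int = 1000) -> Iterator[str]:
    """Yield every key in the store, following the pagination cursor.

    Hypothetical helper: the response field names used here are assumed.
    """
    start_key = None
    while True:
        page = store_client.list_keys(limit=page_size, exclusive_start_key=start_key)
        for entry in page.get('items', []):
            yield entry['key']
        if not page.get('isTruncated'):
            break
        start_key = page.get('nextExclusiveStartKey')
        if start_key is None:
            # No cursor to continue from, so stop rather than loop forever.
            break
```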

`KeyValueStoreClient.get_record(key, *, as_bytes=False, as_file=False)`

Retrieve the given record from the key-value store.

https://docs.apify.com/api/v2#/reference/key-value-stores/record/get-record

Parameters
- key (str) – Key of the record to retrieve
- as_bytes (bool, optional) – Whether to retrieve the record as unparsed bytes, default False
- as_file (bool, optional) – Whether to retrieve the record as a file-like object, default False
Returns

The requested record, or None, if the record does not exist
Return type

dict, optional
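
Because get_record() returns None for a missing record, callers usually need a guard before reading the payload. The following sketch assumes the record dict stores the parsed payload under a `value` key; `read_record_value` is a hypothetical helper.

```python
def read_record_value(store_client, key: str, default=None):
    """Return the value stored under `key`, or `default` if the record is missing.

    Hypothetical helper: assumes the record dict has a 'value' key.
    """
    record = store_client.get_record(key)
    if record is None:
        return default
    return record.get('value', default)
```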

`KeyValueStoreClient.set_record(key, value, content_type=None)`

Set a value to the given record in the key-value store.

https://docs.apify.com/api/v2#/reference/key-value-stores/record/put-record

Parameters
- key (str) – The key of the record to save the value to
- value (Any) – The value to save into the record
- content_type (str, optional) – The content type of the saved value
Return type

None

`KeyValueStoreClient.delete_record(key)`

Delete the specified record from the key-value store.

https://docs.apify.com/api/v2#/reference/key-value-stores/record/delete-record

Parameters
- key (str) – The key of the record to delete
Return type

None

KeyValueStoreCollectionClient

Sub-client for manipulating key-value stores.

list()
get_or_create()

`KeyValueStoreCollectionClient.list(*, unnamed=None, limit=None, offset=None, desc=None)`

List the available key-value stores.

https://docs.apify.com/api/v2#/reference/key-value-stores/store-collection/get-list-of-key-value-stores

Parameters
- unnamed (bool, optional) – Whether to include unnamed key-value stores in the list
- limit (int, optional) – How many key-value stores to retrieve
- offset (int, optional) – Position of the first key-value store to include in the list
- desc (bool, optional) – Whether to sort the key-value stores in descending order based on their modification date
Returns

The list of available key-value stores matching the specified filters.
Return type

ListPage

`KeyValueStoreCollectionClient.get_or_create(*, name=None)`

Retrieve a named key-value store, or create a new one when it doesn’t exist.

https://docs.apify.com/api/v2#/reference/key-value-stores/store-collection/create-key-value-store

Parameters
- name (str, optional) – The name of the key-value store to retrieve or create.
Returns

The retrieved or newly-created key-value store.
Return type

dict

RequestQueueClient

Sub-client for manipulating a single request queue.

get()
update()
delete()
list_head()
add_request()
get_request()
update_request()
delete_request()

`RequestQueueClient.get()`

Retrieve the request queue.

https://docs.apify.com/api/v2#/reference/request-queues/queue/get-request-queue

Returns

The retrieved request queue, or None, if it does not exist
Return type

dict, optional

`RequestQueueClient.update(*, name=None)`

Update the request queue with specified fields.

https://docs.apify.com/api/v2#/reference/request-queues/queue/update-request-queue

Parameters
- name (str, optional) – The new name for the request queue
Returns

The updated request queue
Return type

dict

`RequestQueueClient.delete()`

Delete the request queue.

https://docs.apify.com/api/v2#/reference/request-queues/queue/delete-request-queue

Return type

None

`RequestQueueClient.list_head(*, limit=None)`

Retrieve a given number of requests from the beginning of the queue.

https://docs.apify.com/api/v2#/reference/request-queues/queue-head/get-head

Parameters
- limit (int, optional) – How many requests to retrieve
Returns

The desired number of requests from the beginning of the queue.
Return type

dict

`RequestQueueClient.add_request(request, *, forefront=None)`

Add a request to the queue.

https://docs.apify.com/api/v2#/reference/request-queues/request-collection/add-request

Parameters
- request (dict) – The request to add to the queue
- forefront (bool, optional) – If True, the request is added to the head of the queue; otherwise it is added to the end
Returns

The added request.
Return type

dict
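
A short sketch of enqueuing several URLs follows. The request dict fields `url` and `uniqueKey` are assumptions about the request format expected by the API, and `enqueue_urls` is a hypothetical helper; `queue_client` stands for any RequestQueueClient.

```python
from typing import Iterable, List


def enqueue_urls(queue_client, urls: Iterable[str], forefront: bool = False) -> List[dict]:
    """Add one request per URL and return the responses in the same order.

    Hypothetical helper: uses the URL itself as the assumed 'uniqueKey'.
    """
    added = []
    for url in urls:
        request = {'url': url, 'uniqueKey': url}
        added.append(queue_client.add_request(request, forefront=forefront))
    return added
```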

`RequestQueueClient.get_request(request_id)`

Retrieve a request from the queue.

https://docs.apify.com/api/v2#/reference/request-queues/request/get-request

Parameters
- request_id (str) – ID of the request to retrieve
Returns

The retrieved request, or None, if it did not exist.
Return type

dict, optional

`RequestQueueClient.update_request(request, *, forefront=None)`

Update a request in the queue.

https://docs.apify.com/api/v2#/reference/request-queues/request/update-request

Parameters
- request (dict) – The updated request
- forefront (bool, optional) – If True, the updated request is put at the beginning of the queue; otherwise it is put at the end
Returns

The updated request
Return type

dict

`RequestQueueClient.delete_request(request_id)`

Delete a request from the queue.

https://docs.apify.com/api/v2#/reference/request-queues/request/delete-request

Parameters
- request_id (str) – ID of the request to delete.
Return type

None

RequestQueueCollectionClient

Sub-client for manipulating request queues.

list()
get_or_create()

`RequestQueueCollectionClient.list(*, unnamed=None, limit=None, offset=None, desc=None)`

List the available request queues.

https://docs.apify.com/api/v2#/reference/request-queues/queue-collection/get-list-of-request-queues

Parameters
- unnamed (bool, optional) – Whether to include unnamed request queues in the list
- limit (int, optional) – How many request queues to retrieve
- offset (int, optional) – Position of the first request queue to include in the list
- desc (bool, optional) – Whether to sort the request queues in descending order based on their modification date
Returns

The list of available request queues matching the specified filters.
Return type

ListPage

`RequestQueueCollectionClient.get_or_create(*, name=None)`

Retrieve a named request queue, or create a new one when it doesn’t exist.

https://docs.apify.com/api/v2#/reference/request-queues/queue-collection/create-request-queue

Parameters
- name (str, optional) – The name of the request queue to retrieve or create.
Returns

The retrieved or newly-created request queue.
Return type

dict

LogClient

Sub-client for manipulating logs.

get()
stream()

`LogClient.get()`

Retrieve the log as text.

https://docs.apify.com/api/v2#/reference/logs/log/get-log

Returns

The retrieved log, or None, if it does not exist.
Return type

str, optional
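
Since get() returns the whole log as a single string, or None when there is no log, a small wrapper can make it easier to inspect just the end of it. This is an illustrative sketch; `tail_log` is a hypothetical helper and `log_client` stands for any LogClient.

```python
from typing import List


def tail_log(log_client, line_count: int = 20) -> List[str]:
    """Return the last `line_count` lines of the log, or an empty list.

    Hypothetical helper built on LogClient.get().
    """
    if line_count <= 0:
        return []
    text = log_client.get()
    if text is None:
        # The log does not exist.
        return []
    return text.splitlines()[-line_count:]
```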

`LogClient.stream()`

Retrieve the log as a file-like object.

https://docs.apify.com/api/v2#/reference/logs/log/get-log

Returns

The retrieved log as a file-like object, or None, if it does not exist.
Return type

io.IOBase, optional

WebhookClient

Sub-client for manipulating a single webhook.

get()
update()
delete()
test()
dispatches()

`WebhookClient.get()`

Retrieve the webhook.

https://docs.apify.com/api/v2#/reference/webhooks/webhook-object/get-webhook

Returns

The retrieved webhook, or None if it does not exist
Return type

dict, optional

`WebhookClient.update(*, event_types=None, request_url=None, payload_template=None, actor_id=None, actor_task_id=None, actor_run_id=None, ignore_ssl_errors=None, do_not_retry=None, is_ad_hoc=None)`

Update the webhook.

https://docs.apify.com/api/v2#/reference/webhooks/webhook-object/update-webhook

Parameters
- event_types (list of WebhookEventType, optional) – List of event types that should trigger the webhook. At least one is required.
- request_url (str, optional) – URL that will be invoked once the webhook is triggered.
- payload_template (str, optional) – Specification of the payload that will be sent to request_url
- actor_id (str, optional) – Id of the actor whose runs should trigger the webhook.
- actor_task_id (str, optional) – Id of the actor task whose runs should trigger the webhook.
- actor_run_id (str, optional) – Id of the actor run which should trigger the webhook.
- ignore_ssl_errors (bool, optional) – Whether the webhook should ignore SSL errors returned by request_url
- do_not_retry (bool, optional) – If True, the webhook does not retry sending the payload to request_url upon failure.
- is_ad_hoc (bool, optional) – Set to True if you want the webhook to be triggered only the first time the condition is fulfilled. Only applicable when actor_run_id is filled.
Returns

The updated webhook
Return type

dict

`WebhookClient.delete()`

Delete the webhook.

https://docs.apify.com/api/v2#/reference/webhooks/webhook-object/delete-webhook

Return type

None

`WebhookClient.test()`

Test a webhook.

Creates a webhook dispatch with a dummy payload.

https://docs.apify.com/api/v2#/reference/webhooks/webhook-test/test-webhook

Returns

The webhook dispatch created by the test
Return type

dict, optional

`WebhookClient.dispatches()`

Get dispatches of the webhook.

https://docs.apify.com/api/v2#/reference/webhooks/dispatches-collection/get-collection

Returns

A client allowing access to dispatches of this webhook using its list method
Return type

WebhookDispatchCollectionClient

WebhookCollectionClient

Sub-client for manipulating webhooks.

list()
create()

`WebhookCollectionClient.list(*, limit=None, offset=None, desc=None)`

List the available webhooks.

https://docs.apify.com/api/v2#/reference/webhooks/webhook-collection/get-list-of-webhooks

Parameters
- limit (int, optional) – How many webhooks to retrieve
- offset (int, optional) – Position of the first webhook to include in the list
- desc (bool, optional) – Whether to sort the webhooks in descending order based on their date of creation
Returns

The list of available webhooks matching the specified filters.
Return type

ListPage

`WebhookCollectionClient.create(*, event_types, request_url, payload_template=None, actor_id=None, actor_task_id=None, actor_run_id=None, ignore_ssl_errors=None, do_not_retry=None, idempotency_key=None, is_ad_hoc=None)`

Create a new webhook.

You have to specify exactly one out of actor_id, actor_task_id or actor_run_id.

https://docs.apify.com/api/v2#/reference/webhooks/webhook-collection/create-webhook

Parameters
- event_types (list of WebhookEventType) – List of event types that should trigger the webhook. At least one is required.
- request_url (str) – URL that will be invoked once the webhook is triggered.
- payload_template (str, optional) – Specification of the payload that will be sent to request_url
- actor_id (str, optional) – Id of the actor whose runs should trigger the webhook.
- actor_task_id (str, optional) – Id of the actor task whose runs should trigger the webhook.
- actor_run_id (str, optional) – Id of the actor run which should trigger the webhook.
- ignore_ssl_errors (bool, optional) – Whether the webhook should ignore SSL errors returned by request_url
- do_not_retry (bool, optional) – If True, the webhook does not retry sending the payload to request_url upon failure.
- idempotency_key (str, optional) – A unique identifier of a webhook. You can use it to ensure that you won’t create the same webhook multiple times.
- is_ad_hoc (bool, optional) – Set to True if you want the webhook to be triggered only the first time the condition is fulfilled. Only applicable when actor_run_id is filled.
Returns

The created webhook
Return type

dict
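
The rule that exactly one of actor_id, actor_task_id or actor_run_id must be given can be checked on the client side before the request is sent. The sketch below is one way to do so; `create_webhook_for` is a hypothetical wrapper, and the event types are passed through unchanged (in real use they would be WebhookEventType values).

```python
def create_webhook_for(webhooks_client, *, event_types, request_url,
                       actor_id=None, actor_task_id=None, actor_run_id=None, **extra):
    """Create a webhook after checking that exactly one target is specified.

    Hypothetical wrapper around WebhookCollectionClient.create().
    """
    targets = {
        'actor_id': actor_id,
        'actor_task_id': actor_task_id,
        'actor_run_id': actor_run_id,
    }
    given = {name: value for name, value in targets.items() if value is not None}
    if len(given) != 1:
        raise ValueError('specify exactly one of actor_id, actor_task_id or actor_run_id')
    if not event_types:
        raise ValueError('at least one event type is required')
    # Only the chosen target is forwarded, plus any other optional arguments.
    return webhooks_client.create(
        event_types=event_types, request_url=request_url, **given, **extra)
```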

WebhookDispatchClient

Sub-client for querying information about a webhook dispatch.

get()

`WebhookDispatchClient.get()`

Retrieve the webhook dispatch.

https://docs.apify.com/api/v2#/reference/webhook-dispatches/webhook-dispatch-object/get-webhook-dispatch

Returns

The retrieved webhook dispatch, or None if it does not exist
Return type

dict, optional

WebhookDispatchCollectionClient

Sub-client for listing webhook dispatches.

list()

`WebhookDispatchCollectionClient.list(*, limit=None, offset=None, desc=None)`

List all webhook dispatches of a user.

https://docs.apify.com/api/v2#/reference/webhook-dispatches/webhook-dispatches-collection/get-list-of-webhook-dispatches

Parameters
- limit (int, optional) – How many webhook dispatches to retrieve
- offset (int, optional) – Position of the first webhook dispatch to include in the list
- desc (bool, optional) – Whether to sort the webhook dispatches in descending order based on the date of their creation
Returns

The retrieved webhook dispatches of a user
Return type

ListPage

TaskClient

Sub-client for manipulating a single task.

get()
update()
delete()
start()
call()
get_input()
update_input()
runs()
last_run()
webhooks()

`TaskClient.get()`

Retrieve the task.

https://docs.apify.com/api/v2#/reference/actor-tasks/task-object/get-task

Returns

The retrieved task
Return type

dict, optional

`TaskClient.update(*, name=None, task_input=None, build=None, memory_mbytes=None, timeout_secs=None)`

Update the task with specified fields.

https://docs.apify.com/api/v2#/reference/actor-tasks/task-object/update-task

Parameters
- name (str, optional) – Name of the task
- build (str, optional) – Actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the task settings (typically latest).
- memory_mbytes (int, optional) – Memory limit for the run, in megabytes. By default, the run uses a memory limit specified in the task settings.
- timeout_secs (int, optional) – Optional timeout for the run, in seconds. By default, the run uses timeout specified in the task settings.
- task_input (dict, optional) – Task input dictionary
Returns

The updated task
Return type

dict

`TaskClient.delete()`

Delete the task.

https://docs.apify.com/api/v2#/reference/actor-tasks/task-object/delete-task

Return type

None

`TaskClient.start(*, task_input=None, build=None, memory_mbytes=None, timeout_secs=None, wait_for_finish=None, webhooks=None)`

Start the task and immediately return the Run object.

https://docs.apify.com/api/v2#/reference/actor-tasks/run-collection/run-task

Parameters
- task_input (dict, optional) – Task input dictionary
- build (str, optional) – Specifies the actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the task settings (typically latest).
- memory_mbytes (int, optional) – Memory limit for the run, in megabytes. By default, the run uses a memory limit specified in the task settings.
- timeout_secs (int, optional) – Optional timeout for the run, in seconds. By default, the run uses timeout specified in the task settings.
- wait_for_finish (int, optional) – The maximum number of seconds the server waits for the run to finish. By default, it is 0, the maximum value is 300.
- webhooks (list of dict, optional) – Optional ad-hoc webhooks (https://docs.apify.com/webhooks/ad-hoc-webhooks) associated with the actor run which can be used to receive a notification, e.g. when the actor finished or failed. If you already have a webhook set up for the actor or task, you do not have to add it again here. Each webhook is represented by a dictionary containing these items:
  - event_types: list of WebhookEventType values which trigger the webhook
  - request_url: URL to which to send the webhook HTTP request
  - payload_template (optional): Optional template for the request payload
Returns

The run object
Return type

dict
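
To illustrate the webhook dictionary described above, the sketch below builds one ad-hoc webhook from the three documented items and passes it to start(). Both function names are hypothetical, and the event types are whatever WebhookEventType values the caller supplies.

```python
from typing import Iterable, Optional


def build_ad_hoc_webhook(event_types: Iterable, request_url: str,
                         payload_template: Optional[str] = None) -> dict:
    """Build a webhook dictionary with the items listed for the webhooks parameter."""
    webhook = {'event_types': list(event_types), 'request_url': request_url}
    if payload_template is not None:
        webhook['payload_template'] = payload_template
    return webhook


def start_with_notification(task_client, request_url: str, event_types: Iterable,
                            task_input: Optional[dict] = None) -> dict:
    """Start the task with a single ad-hoc webhook attached (hypothetical helper)."""
    webhook = build_ad_hoc_webhook(event_types, request_url)
    return task_client.start(task_input=task_input, webhooks=[webhook])
```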

`TaskClient.call(*, task_input=None, build=None, memory_mbytes=None, timeout_secs=None, webhooks=None, wait_secs=None)`

Start a task and wait for it to finish before returning the Run object.

It waits indefinitely, unless the wait_secs argument is provided.

https://docs.apify.com/api/v2#/reference/actor-tasks/run-collection/run-task

Parameters
- task_input (dict, optional) – Task input dictionary
- build (str, optional) – Specifies the actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the task settings (typically latest).
- memory_mbytes (int, optional) – Memory limit for the run, in megabytes. By default, the run uses a memory limit specified in the task settings.
- timeout_secs (int, optional) – Optional timeout for the run, in seconds. By default, the run uses timeout specified in the task settings.
- webhooks (list, optional) – Specifies optional webhooks associated with the actor run, which can be used to receive a notification e.g. when the actor finished or failed. Note: if you already have a webhook set up for the actor or task, you do not have to add it again here.
- wait_secs (int, optional) – The maximum number of seconds the server waits for the task run to finish. If not provided, waits indefinitely.
Returns

The run object
Return type

dict
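
A typical use of call() is to run a task, confirm that it succeeded, and then read its results. The sketch below assumes the run object exposes a `status` key; the `defaultDatasetId` key is the one used in the Quick Start. `run_task_for_dataset` is a hypothetical helper.

```python
from typing import Optional


def run_task_for_dataset(task_client, task_input: Optional[dict] = None,
                         wait_secs: Optional[int] = None) -> str:
    """Run the task and return the ID of its default dataset.

    Hypothetical helper: raises if the run did not end with SUCCEEDED.
    """
    run = task_client.call(task_input=task_input, wait_secs=wait_secs)
    status = run.get('status')
    if status != 'SUCCEEDED':
        # Covers failed, aborted and timed-out runs, and runs still in
        # progress when wait_secs ran out.
        raise RuntimeError('task run ended with status {!r}'.format(status))
    return run['defaultDatasetId']
```

The returned ID can then be passed to the dataset() accessor of the top-level client, as the Quick Start does.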

`TaskClient.get_input()`

Retrieve the default input for this task.

https://docs.apify.com/api/v2#/reference/actor-tasks/task-input-object/get-task-input

Returns

Retrieved task input
Return type

dict, optional

`TaskClient.update_input(*, task_input)`

Update the default input for this task.

https://docs.apify.com/api/v2#/reference/actor-tasks/task-input-object/update-task-input

Parameters
- task_input (dict) – Task input dictionary
Returns

The updated task input
Return type

dict

`TaskClient.runs()`

Retrieve a client for the runs of this task.

Return type

RunCollectionClient

`TaskClient.last_run(*, status=None)`

Retrieve the client for the last run of this task.

Last run is retrieved based on the start time of the runs.

Parameters
- status (ActorJobStatus, optional) – Consider only runs with this status.
Returns

The resource client for the last run of this task.
Return type

RunClient

`TaskClient.webhooks()`

Retrieve a client for webhooks associated with this task.

Return type

WebhookCollectionClient

TaskCollectionClient

Sub-client for manipulating tasks.

list()
create()

`TaskCollectionClient.list(*, limit=None, offset=None, desc=None)`

List the available tasks.

https://docs.apify.com/api/v2#/reference/actor-tasks/task-collection/get-list-of-tasks

Parameters
- limit (int, optional) – How many tasks to list
- offset (int, optional) – Position of the first task to include in the list
- desc (bool, optional) – Whether to sort the tasks in descending order based on their creation date
Returns

The list of available tasks matching the specified filters.
Return type

ListPage

`TaskCollectionClient.create(*, actor_id, name, build=None, timeout_secs=None, memory_mbytes=None, task_input=None)`

Create a new task.

https://docs.apify.com/api/v2#/reference/actor-tasks/task-collection/create-task

Parameters
- actor_id (str) – ID of the actor that should be run
- name (str) – Name of the task
- build (str, optional) – Actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the task settings (typically latest).
- memory_mbytes (int, optional) – Memory limit for the run, in megabytes. By default, the run uses a memory limit specified in the task settings.
- timeout_secs (int, optional) – Optional timeout for the run, in seconds. By default, the run uses timeout specified in the task settings.
- task_input (dict, optional) – Task input object.
Returns

The created task.
Return type

dict
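
The list() and create() methods can be combined into a simple "find by name, else create" helper, sketched below. The helper name `get_or_create_task` is hypothetical, `tasks_client` is assumed to be the object returned by `apify_client.tasks()`, and the sketch assumes each listed task is a dict with a `name` key.

```python
def get_or_create_task(tasks_client, *, actor_id, name, task_input=None):
    """Return the task called `name`, creating it if it is not listed."""
    # Assumption: every item on the page is a dict carrying a 'name' key.
    for task in tasks_client.list().items:
        if task.get('name') == name:
            return task
    # No match on the listed page, so create the task with the given settings.
    return tasks_client.create(actor_id=actor_id, name=name, task_input=task_input)
```

This checks only the page returned by a single list() call; accounts with many tasks would need to walk all pages before deciding to create.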

ScheduleClient

Sub-client for manipulating a single schedule.

get()
update()
delete()
get_log()

`ScheduleClient.get()`

Return information about the schedule.

https://docs.apify.com/api/v2#/reference/schedules/schedule-object/get-schedule

Returns

The retrieved schedule
Return type

dict, optional

`ScheduleClient.update(*, cron_expression=None, is_enabled=None, is_exclusive=None, name=None, actions=None, description=None, timezone=None)`

Update the schedule with specified fields.

https://docs.apify.com/api/v2#/reference/schedules/schedule-object/update-schedule

Parameters
- cron_expression (str, optional) – The cron expression used by this schedule
- is_enabled (bool, optional) – True if the schedule should be enabled
- is_exclusive (bool, optional) – When set to true, the actor or actor task is not started if it is still running from the previous scheduled run.
- name (str, optional) – The new name of the schedule.
- actions (list of dict, optional) – Actors or tasks that should be run on this schedule. See the API documentation for exact structure.
- description (str, optional) – Description of this schedule
- timezone (str, optional) – Timezone in which your cron expression runs (TZ database name from https://en.wikipedia.org/wiki/List_of_tz_database_time_zones)
Returns

The updated schedule
Return type

dict
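
Because every field of update() is optional, a single field can be changed without touching the rest. The sketch below enables or disables a schedule, returning None when the schedule does not exist. The helper name `set_schedule_enabled` is hypothetical; `schedule_client` is assumed to come from `apify_client.schedule(schedule_id)`.

```python
def set_schedule_enabled(schedule_client, enabled):
    """Enable or disable a schedule; return None if it does not exist."""
    # get() is documented as returning `dict, optional`.
    if schedule_client.get() is None:
        return None
    # Only is_enabled is passed, so the other fields are left unspecified.
    return schedule_client.update(is_enabled=bool(enabled))
```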

`ScheduleClient.delete()`

Delete the schedule.

https://docs.apify.com/api/v2#/reference/schedules/schedule-object/delete-schedule

Return type

None

`ScheduleClient.get_log()`

Return log for the given schedule.

https://docs.apify.com/api/v2#/reference/schedules/schedule-log/get-schedule-log

Returns

Retrieved log of the given schedule
Return type

list, optional

ScheduleCollectionClient

Sub-client for manipulating schedules.

list()
create()

`ScheduleCollectionClient.list(*, limit=None, offset=None, desc=None)`

List the available schedules.

https://docs.apify.com/api/v2#/reference/schedules/schedules-collection/get-list-of-schedules

Parameters
- limit (int, optional) – How many schedules to retrieve
- offset (int, optional) – Index of the first schedule to include in the list
- desc (bool, optional) – Whether to sort the schedules in descending order based on their modification date
Returns

The list of available schedules matching the specified filters.
Return type

ListPage

`ScheduleCollectionClient.create(*, cron_expression, is_enabled, is_exclusive, name=None, actions=[], description=None, timezone=None)`

Create a new schedule.

https://docs.apify.com/api/v2#/reference/schedules/schedules-collection/create-schedule

Parameters
- cron_expression (str) – The cron expression used by this schedule
- is_enabled (bool) – True if the schedule should be enabled
- is_exclusive (bool) – When set to true, the actor or actor task is not started if it is still running from the previous scheduled run.
- name (str, optional) – The name of the schedule to create.
- actions (list of dict, optional) – Actors or tasks that should be run on this schedule. See the API documentation for exact structure.
- description (str, optional) – Description of this schedule
- timezone (str, optional) – Timezone in which your cron expression runs (TZ database name from https://en.wikipedia.org/wiki/List_of_tz_database_time_zones)
Returns

The created schedule.
Return type

dict
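
The following sketch shows one way create() might be called to run a task once a day. The helper names `build_task_action` and `create_daily_schedule` are hypothetical, and the shape of the action dict (`type` and `actorTaskId` keys) is an assumption that should be checked against the API documentation linked above. `schedules_client` is assumed to come from `apify_client.schedules()`.

```python
def build_task_action(task_id):
    """Build one schedule action that runs an actor task."""
    # Assumption: this dict shape follows the Apify API docs for schedule actions.
    return {'type': 'RUN_ACTOR_TASK', 'actorTaskId': task_id}


def create_daily_schedule(schedules_client, task_id, *, hour=6, timezone='UTC', name=None):
    """Create an enabled, exclusive schedule that fires daily at `hour`:00."""
    if not 0 <= hour <= 23:
        raise ValueError('hour must be between 0 and 23')
    return schedules_client.create(
        cron_expression=f'0 {hour} * * *',  # minute 0 of the given hour, every day
        is_enabled=True,
        is_exclusive=True,  # skip a run if the previous one is still going
        name=name,
        actions=[build_task_action(task_id)],
        timezone=timezone,
    )
```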

UserClient

Sub-client for querying user data.

get()

`UserClient.get()`

Return information about the user account.

You receive all or only public info based on your token permissions.

https://docs.apify.com/api/v2#/reference/users

Returns

The retrieved user data, or None if the user does not exist.
Return type

dict, optional

ListPage

A single page of items returned from a list() method.

Instance attributes

Name	Type	Description
`items`	`list`	List of returned objects on this page
`offset`	`int`	The offset of the first object specified in the API call
`limit`	`int`	The limit on the number of returned objects specified in the API call
`count`	`int`	Count of the returned objects on this page
`total`	`int`	Total number of objects matching the API call criteria
`desc`	`bool`	Whether the listing is descending or not
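
The attributes above are enough to walk through every page of a listing. The generator below is a minimal sketch of that loop; the name `iterate_all` and the default page size are hypothetical, and `list_method` is assumed to be any bound list() method that accepts `limit` and `offset`, for example `apify_client.tasks().list`.

```python
def iterate_all(list_method, *, page_size=100, **filters):
    """Yield every object from a paginated list() method, page by page."""
    offset = 0
    while True:
        page = list_method(limit=page_size, offset=offset, **filters)
        yield from page.items
        # Advance by the number of objects actually returned on this page.
        offset += page.count
        # Stop on an empty page or once every matching object has been seen.
        if page.count == 0 or offset >= page.total:
            break
```

For example, `for task in iterate_all(apify_client.tasks().list, page_size=50, desc=True): ...` would visit all tasks, newest first.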

ActorJobStatus

Available statuses for actor jobs (runs or builds).

READY
RUNNING
SUCCEEDED
FAILED
TIMING_OUT
TIMED_OUT
ABORTING
ABORTED

`ActorJobStatus.READY`

Actor job initialized but not started yet

`ActorJobStatus.RUNNING`

Actor job in progress

`ActorJobStatus.SUCCEEDED`

Actor job finished successfully

`ActorJobStatus.FAILED`

Actor job or build failed

`ActorJobStatus.TIMING_OUT`

Actor job currently timing out

`ActorJobStatus.TIMED_OUT`

Actor job timed out

`ActorJobStatus.ABORTING`

Actor job currently being aborted by user

`ActorJobStatus.ABORTED`

Actor job aborted by user
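
Going by the descriptions above, four of these statuses describe a job that will not change any further, while the others are initial or transitional. The sketch below encodes that split. The names `TERMINAL_STATUSES` and `is_finished` are hypothetical, and the code assumes a run or build dict exposes its status as a string under a `status` key whose value matches the member names listed here.

```python
# Statuses after which a job no longer changes state, per the descriptions above.
TERMINAL_STATUSES = frozenset({'SUCCEEDED', 'FAILED', 'TIMED_OUT', 'ABORTED'})


def is_finished(job):
    """Return True if the run or build dict is in a terminal status."""
    # Assumption: the job dict stores its status as a plain string under 'status'.
    return job is not None and job.get('status') in TERMINAL_STATUSES
```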

ActorSourceType

Available source types for actors.

SOURCE_CODE
SOURCE_FILES
GIT_REPO
TARBALL
GITHUB_GIST

`ActorSourceType.SOURCE_CODE`

Actor source code is a single JavaScript/Node.js file

`ActorSourceType.SOURCE_FILES`

Actor source code is comprised of multiple files

`ActorSourceType.GIT_REPO`

Actor source code is cloned from a Git repository

`ActorSourceType.TARBALL`

Actor source code is downloaded using a tarball or Zip file

`ActorSourceType.GITHUB_GIST`

Actor source code is taken from a GitHub Gist

WebhookEventType

Events that can trigger a webhook.

ACTOR_RUN_CREATED
ACTOR_RUN_SUCCEEDED
ACTOR_RUN_FAILED
ACTOR_RUN_TIMED_OUT
ACTOR_RUN_ABORTED
ACTOR_RUN_RESURRECTED

`WebhookEventType.ACTOR_RUN_CREATED`

The actor run was created

`WebhookEventType.ACTOR_RUN_SUCCEEDED`

The actor run has succeeded

`WebhookEventType.ACTOR_RUN_FAILED`

The actor run has failed

`WebhookEventType.ACTOR_RUN_TIMED_OUT`

The actor run has timed out

`WebhookEventType.ACTOR_RUN_ABORTED`

The actor run was aborted

`WebhookEventType.ACTOR_RUN_RESURRECTED`

The actor run was resurrected
