Update docs for 0.7
perklet committed Jun 26, 2024
1 parent d1a64ff commit de0dbd7
Showing 7 changed files with 59 additions and 18 deletions.
1 change: 1 addition & 0 deletions FUNDING.yml
@@ -0,0 +1 @@
buy_me_a_coffee: yifei
21 changes: 21 additions & 0 deletions README-zh.md
@@ -109,6 +109,9 @@ print(r.json())

However, only Chrome-like browsers are supported. Progress on Firefox support is tracked in [#59](https://github.com/yifeikong/curl_cffi/issues/59).

New impersonate versions are only added when a browser release actually changes its fingerprint. If a version
appears to be skipped, its fingerprint did not change; simply use the previous version together with the newer
headers (see the example after the notes below).

- chrome99
- chrome100
- chrome101
@@ -118,6 +121,8 @@ print(r.json())
- chrome116 <sup>[1]</sup>
- chrome119 <sup>[1]</sup>
- chrome120 <sup>[1]</sup>
- chrome123 <sup>[3]</sup>
- chrome124 <sup>[3]</sup>
- chrome99_android
- edge99
- edge101
@@ -129,6 +134,7 @@ print(r.json())
Notes:
1. Added in version `0.6.0`.
2. Fixed in version `0.6.0`; the previous http2 fingerprints were [incorrect](https://github.com/lwthiker/curl-impersonate/issues/215).
3. Added in version `0.7.0`.
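
For example, here is a minimal sketch of reusing an existing fingerprint while overriding the headers (the User-Agent string below is only a placeholder, not an exact Chrome UA):

```python
from curl_cffi import requests

# Keep the chrome124 TLS/HTTP2 fingerprint, but present a newer browser
# version in the headers. Replace the placeholder UA string with the real
# one for the Chrome version you want to claim.
headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
    ),
}
r = requests.get("https://tls.browserleaks.com/json",
                 impersonate="chrome124", headers=headers)
print(r.json())
```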

### asyncio

@@ -207,3 +213,18 @@ JSON data. Proxy switching is available out of the box on all subscription plans.
## Sponsors

<img src="assets/alipay.jpg" style="width: 512px;" />

## Citation

If you find this project useful, please cite it as below:

```
@software{Kong2023,
author = {Yifei Kong},
title = {curl_cffi - A Python HTTP client for impersonating browser TLS and HTTP/2 fingerprints},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/yifeikong/curl_cffi},
}
```
2 changes: 1 addition & 1 deletion README.md
@@ -17,7 +17,7 @@ website for no obvious reason, you can give `curl_cffi` a try.

------

<a href="https://scrapfly.io/?utm_source=github&utm_medium=sponsoring&utm_campaign=curl_cffi" target="_blank"><img src="assets/scrapfly.png" alt="Scrapfly.io" width="149"></a>
<a href="https://scrapfly.io/?utm_source=github&utm_medium=sponsoring&utm_campaign=curl_cffi" target="_blank"><img src="https://raw.githubusercontent.com/yifeikong/curl_cffi/main/assets/scrapfly.png" alt="Scrapfly.io" width="149"></a>

[Scrapfly](https://scrapfly.io/?utm_source=github&utm_medium=sponsoring&utm_campaign=curl_cffi)
is an enterprise-grade solution providing Web Scraping API that aims to simplify the
12 changes: 9 additions & 3 deletions docs/changelog.rst
@@ -5,13 +5,19 @@ v0.7
----

- v0.7.0
- Added more recent impersonate versions, up to Chrome 124
- Upgraded libcurl to 8.5.0
- Added more recent impersonate versions, up to Chrome 124.
- Upgraded libcurl to 8.7.1.
- Added support for lists of tuples in post fields (see the sketch below).
- Updated header strategy: always exclude empty headers, never send Expect header.
- Changed default redirect limit to 30.
- Prefer not to send CONNECT for plain HTTP proxies.
- Fixed the Windows build.
- Fixed Safari stream priority.
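
A minimal sketch of the list-of-tuples post fields (the echo URL and payload are illustrative; ``data`` is the usual requests-like parameter):

.. code-block:: python

    from curl_cffi import requests

    # A list of tuples allows repeated field names, which a plain dict cannot express.
    fields = [("tag", "python"), ("tag", "http"), ("q", "curl_cffi")]
    r = requests.post("https://httpbin.org/post", data=fields)
    print(r.status_code)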

v0.6
----

The minimum Python version is now 3.8.
The minimum Python version is now 3.8. Windows fingerprints are wrong in 0.6.x.

- v0.6.1
- ``AsyncSession.close`` is now a coroutine.
2 changes: 1 addition & 1 deletion docs/faq.rst
@@ -37,7 +37,7 @@ a better proxy IP provider and use browser automation tools like playwright.
If you are in a hurry or just want the professionals to take care of the hard parts,
you can consider the commercial solutions from our sponsors:

- `Scraply <https://scrapfly.io/?utm_source=github&utm_medium=sponsoring&utm_campaign=curl_cffi>`_, Cloud-based scraping platform.
- `Scrapfly <https://scrapfly.io/?utm_source=github&utm_medium=sponsoring&utm_campaign=curl_cffi>`_, Cloud-based scraping platform.
- `Yescaptcha <https://yescaptcha.com/i/stfnIO>`_, captcha resolver and proxy service for bypassing Cloudflare.
- `ScrapeNinja <https://scrapeninja.net/?utm_source=github&utm_medium=banner&utm_campaign=cffi>`_, Managed web scraping API.

26 changes: 17 additions & 9 deletions docs/impersonate.rst
@@ -17,6 +17,8 @@ However, only Chrome-like browsers are supported. Firefox support is tracked in
- chrome116 :sup:`1`
- chrome119 :sup:`1`
- chrome120 :sup:`1`
- chrome123 :sup:`3`
- chrome124 :sup:`3`
- chrome99_android
- edge99
- edge101
@@ -29,12 +31,13 @@ Notes:

1. Added in version `0.6.0`.
2. Fixed in version `0.6.0`, previous http2 fingerprints were `not correct <https://github.com/lwthiker/curl-impersonate/issues/215>`_.
3. Added in version `0.7.0`.

Which version to use?
---------------------

Generally speaking, you should use the latest Chrome or Safari versions. As of 0.6, they're
``chrome120``, ``safari17_0`` and ``safari17_2_ios``. To always impersonate the latest avaiable
Generally speaking, you should use the latest Chrome or Safari versions. As of 0.7, they're
``chrome124``, ``safari17_0`` and ``safari17_2_ios``. To always impersonate the latest available
browser versions, you can simply use ``chrome``, ``safari`` and ``safari_ios``.

.. code-block:: python
@@ -87,11 +90,15 @@ For Akamai http2 fingerprints, you can fully customize the 3 parts:
* ``CURLOPT_HTTP2_SETTINGS`` sets the settings frame values, for example `1:65536;3:1000;4:6291456;6:262144` (non-standard HTTP/2 options created for this project).
* ``CURLOPT_HTTP2_WINDOW_UPDATE`` sets the initial window update value for http2, for example `15663105` (non-standard HTTP/2 options created for this project).

For a complete list of options and explanations, see the `curl-impersonate README`_. A Python sketch of setting the http2 options above is shown below.

.. _curl-impersonate README: https://github.com/yifeikong/curl-impersonate?tab=readme-ov-file#libcurl-impersonate
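
A minimal sketch of setting these from the low-level ``Curl`` API, assuming the non-standard options are exposed on ``CurlOpt`` under the same names without the ``CURLOPT_`` prefix (the values are the illustrative ones quoted above, not a verified browser fingerprint):

.. code-block:: python

    from io import BytesIO

    from curl_cffi import Curl, CurlOpt

    buffer = BytesIO()
    c = Curl()
    c.setopt(CurlOpt.URL, b"https://tls.browserleaks.com/json")
    c.setopt(CurlOpt.WRITEDATA, buffer)

    # Akamai http2 fingerprint parts; these option names are assumptions
    # based on the CURLOPT_* names listed above.
    c.setopt(CurlOpt.HTTP2_SETTINGS, b"1:65536;3:1000;4:6291456;6:262144")
    c.setopt(CurlOpt.HTTP2_WINDOW_UPDATE, 15663105)

    c.perform()
    c.close()
    print(buffer.getvalue().decode())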


Should I randomize my fingerprints for each request?
-----------------------------------------------------

You can use a random from the list above, like:
You can choose a random version from the list above, like:

.. code-block:: python
@@ -106,15 +113,16 @@ random fingerprints, it is easy for the server to tell that you are not using a typical
If you were thinking about ``ja3``, and not ``ja3n``, then the fingerprint is already
randomized, due to the ``extension permutation`` feature introduced in Chrome 110.

AFAIK, most websites use an allowlist, not a blocklist to filter out bot traffic. So I
don’t think random ja3 fingerprints would work in the wild.
As far as we know, most websites use an allowlist, not a blocklist, to filter out bot
traffic, so do not expect random ja3 fingerprints to work in the wild.

Can I change JS fingerprints with this library?
Can I change JavaScript fingerprints with this library?
--------------------------------------------------------

No, you can not. As the name suggests, JavaScript fingerprints are generated using Javascript
No, you can not. As the name suggests, JavaScript fingerprints are generated using JavaScript
APIs provided by real browsers. ``curl_cffi`` is a Python binding to a C library, with no
browser or JavaScript runtime under the hood.

If you need to impersonate browsers on the JavaScript perspective, you can search for "Anti-detect
Browser", "Playwright stealth" and similar keywords. Or simply use a commercial plan from our sponsors.
If you need to impersonate browsers from the JavaScript perspective, you can search for
"Anti-detect Browser", "Playwright stealth" and similar keywords. Or simply use a
commercial plan from our sponsors.
13 changes: 9 additions & 4 deletions docs/index.rst
@@ -129,21 +129,26 @@ requests-like
from curl_cffi import requests
url = "https://tls.browserleaks.com/json"
url = "https://tools.scrapfly.io/api/fp/ja3"
# Notice the impersonate parameter
r = requests.get(url, impersonate="chrome")
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110")
print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's
# To keep using the latest browser version as `curl_cffi` updates,
# simply set impersonate="chrome" without specifying a version.
# Other similar values are: "safari" and "safari_ios"
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")
# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get(url, impersonate="chrome", proxies=proxies)
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110", proxies=proxies)
proxies = {"https": "socks://localhost:3128"}
r = requests.get(url, impersonate="chrome", proxies=proxies)
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110", proxies=proxies)
Sessions
~~~~~~~~~~
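
A minimal sketch of session usage (assuming ``Session`` mirrors the requests-like API shown above):

.. code-block:: python

    from curl_cffi import requests

    s = requests.Session()
    # Cookies and the underlying connection are reused across requests.
    r = s.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")
    print(r.json())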
