Commit

HTTP_TIMEOUT
yindaheng98 committed Dec 18, 2023
1 parent ec5e726 commit cec2799
Showing 3 changed files with 11 additions and 5 deletions.
10 changes: 7 additions & 3 deletions README.md
@@ -49,9 +49,11 @@ python -m dblp_crawler neo4j -h
 usage: __main__.py neo4j [-h] [--auth AUTH] --uri URI

 optional arguments:
-  -h, --help   show this help message and exit
-  --auth AUTH  Auth to neo4j database.
-  --uri URI    URI to neo4j database.
+  -h, --help           show this help message and exit
+  --username USERNAME  Auth username to neo4j database.
+  --password PASSWORD  Auth password to neo4j database.
+  --uri URI            URI to neo4j database.
+  --select             Mark keyword-matched publications in database (set selected=true).
 ```

 ### Config environment variables
@@ -67,6 +69,8 @@ optional arguments:
   * default: `30`
 * `HTTP_PROXY`
   * Set it `http://your_user:your_password@your_proxy_url:your_proxy_port` if you want to use proxy
+* `HTTP_TIMEOUT`
+  * Timeout for each http request, in seconds
 * `HTTP_CONCORRENT`
   * Concurrent HTTP requests
   * default: `8`
4 changes: 3 additions & 1 deletion dblp_crawler/downloader.py
@@ -69,7 +69,9 @@ async def download_item(path: str, cache_days: int) -> Optional[ElementTree.Elem
     async with http_sem:
         try:
             async with aiohttp.ClientSession(connector=aiohttp.TCPConnector(verify_ssl=False)) as session:
-                async with session.get(url, proxy=os.getenv("HTTP_PROXY")) as response:
+                async with session.get(url,
+                                       proxy=os.getenv("HTTP_PROXY"),
+                                       timeout=os.getenv("HTTP_TIMEOUT") or 30) as response:
                     logger.info(" download: %s" % path)
                     html = await response.text()
                     data = ElementTree.fromstring(html)
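
A note on the `timeout` argument in this hunk: `os.getenv` returns a string whenever the variable is set, so `HTTP_TIMEOUT=10` reaches `session.get` as `"10"` rather than as a number. The following is a minimal sketch of one way to normalise the value first. The helper `read_http_timeout` and the constant `DEFAULT_HTTP_TIMEOUT` are hypothetical names, not part of this commit, and the 30-second fallback is taken from the `or 30` in the diff.

```python
import math
import os
from typing import Mapping, Optional

# Seconds; mirrors the `or 30` fallback used in the diff.
DEFAULT_HTTP_TIMEOUT = 30.0


def read_http_timeout(env: Optional[Mapping[str, str]] = None,
                      default: float = DEFAULT_HTTP_TIMEOUT) -> float:
    """Hypothetical helper: parse HTTP_TIMEOUT into a positive number of seconds.

    Falls back to `default` when the variable is unset, empty,
    non-numeric, non-finite, or not greater than zero.
    """
    env = os.environ if env is None else env
    raw = env.get("HTTP_TIMEOUT")
    if raw is None or raw.strip() == "":
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    if not math.isfinite(value) or value <= 0:
        return default
    return value
```

The result could then be passed as `timeout=aiohttp.ClientTimeout(total=read_http_timeout())`, which keeps the value numeric regardless of how the environment variable was written.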
2 changes: 1 addition & 1 deletion setup.py
@@ -17,7 +17,7 @@

 setup(
     name='dblp_crawler',
-    version='2.1',
+    version='2.1.1',
     author='yindaheng98',
     author_email='[email protected]',
     url='https://github.com/yindaheng98/dblp-crawler',