The script appears to be designed to update `edgar.filing_docs` incrementally.
However, `Def14_a` never gets updated, so the `while` loop scrapes the same set of files endlessly.
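To illustrate the kind of loop that would avoid this, here is a minimal Python sketch. It is not the script's actual code: `scrape_incrementally`, `get_pending`, and `scrape_batch` are hypothetical names. The assumption is that the set of outstanding files can be re-queried after every pass, so the loop can stop (or fail loudly) instead of reprocessing a stale list.

```python
def scrape_incrementally(get_pending, scrape_batch):
    """Scrape until nothing is pending, re-querying the pending set each pass.

    get_pending  -- hypothetical callable returning the items still to scrape
    scrape_batch -- hypothetical callable that scrapes items and records them as done
    """
    passes = 0
    pending = get_pending()
    while pending:
        scrape_batch(pending)
        # Refresh the pending set; a stale copy is what makes the loop endless.
        refreshed = get_pending()
        if len(refreshed) >= len(pending):
            # Nothing was recorded as done, so another pass would repeat the same work.
            raise RuntimeError("no progress: pending set did not shrink")
        pending = refreshed
        passes += 1
    return passes
```

Because the pending set must shrink on every pass, the loop either finishes or raises, rather than scraping the same files again.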
The SEC also enforces traffic controls. Running in parallel on 8 cores gets the IP blocked. In my tests, 2 cores with a 0.5 s sleep between requests worked (at least on my server). The key is to avoid submitting more than 10 requests per second; otherwise the SEC blocks the IP for 10 minutes.
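One way to stay under the limit regardless of worker count is to space requests explicitly rather than rely on a fixed sleep. The following is a minimal Python sketch, not code from this repository: the `Throttle` class and its parameters are hypothetical, and it only coordinates threads within a single process. With several processes, each one would need its own share of the 10 requests per second budget.

```python
import threading
import time


class Throttle:
    """Space calls at least 1 / max_per_second seconds apart across threads."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock  # injectable so the behavior can be checked without waiting
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        """Block until the caller's reserved slot; return the delay in seconds."""
        # Reserve the next free slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self.min_interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)
        return delay
```

Calling `throttle.wait()` immediately before each request, with something like `Throttle(max_per_second=8)` shared by all worker threads, keeps the combined rate below the threshold while leaving some headroom.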