Page loaded before all requests done #319
On the topic of in-flight requests tracking, see also PR #274. It is imho better to handle the registry in one place and not mix registry handling code between |
After more work on the topic, it appears my fix for unsupported content is wrong. The last commit I added tried to fix normal content downloading by removing the The reason for this is that another callback connected to this slot most likely reads the reply's buffer for normal content but not for unsupported content. I'll push an update shortly.
0a1da28 to ddea4b3
Everything should be fine now.
Would it be possible to get feedback on this PR? Is it of any interest to you?
3dfa605 to 6a031c5
@EvaSDK Could be nice to get this one rebased on |
Actually, I have tried to clean up that branch a couple of times already, but with dev now being Python 3 only, it is unlikely that I'll work on merging it: the production I run still uses Python 2, and that is what I am targeting for the coming weeks.
@EvaSDK As |
Add missing Mozilla/ Signed-off-by: Gilles Dartiguelongue <[email protected]>
Signed-off-by: Gilles Dartiguelongue <[email protected]>
Keep version introspectable while avoiding ImportError when dealing with setup.py.
Unsupported content goes through NetworkAccessManager as well, no need to make it special for downloading.
Some responses take a while to download, so add some logs to see what is going on. This code should probably be enhanced to skip small downloads and/or start emitting logs only if a download takes more than a predefined amount of time, but for now it is more helpful as is for debugging network problems in the current code.
The method actually calls peek, not read. A new method will be added that uses read and does consume the reply buffer data.
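The difference between peeking and reading can be illustrated without Qt. The sketch below uses Python's standard io.BufferedReader as a stand-in for the reply object (an assumption made for illustration only; it is not the QNetworkReply API): peek returns buffered bytes without advancing the position, while read consumes them.

```python
import io

# Stand-in for a network reply buffer (illustrative only, not QNetworkReply).
payload = b"chunk-of-a-downloaded-file"
reply = io.BufferedReader(io.BytesIO(payload))

# peek() may return more bytes than requested, so slice the result.
peeked = reply.peek(5)[:5]
position_after_peek = reply.tell()   # position is unchanged by peek

# read() consumes the data and advances the position.
consumed = reply.read(5)
position_after_read = reply.tell()

# Whatever was not consumed is still available to the next reader.
rest = reply.read()
```

A second callback that only peeks sees the same bytes as the first one, whereas a callback that reads leaves nothing behind for the others.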
Note that this seems to reveal a problem with requests still being in flight while the page is considered loaded, which might break scripts that relied on the previously broken behavior. Will fix it in an upcoming merge request.
Also read files in binary mode, as this is the expected behavior for this kind of HTTP transfer.
Behave more like a real browser and only apply a text encoding when the Content-Type is text/*. Other content is now intended to be available as bytes. Update unit tests to reflect this. Fixes tests under PyQt4 as well.
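A minimal sketch of that rule, assuming a hypothetical helper named decode_body (not part of Ghost.py) and using the standard library's email.message.Message to parse the header: text/* bodies are decoded to str, everything else is returned as bytes untouched.

```python
from email.message import Message


def decode_body(content_type, data, default_charset="utf-8"):
    """Hypothetical helper: decode text/* bodies, pass other bodies through."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_maintype() != "text":
        # Binary content (images, archives, PDFs...) stays as raw bytes.
        return data
    charset = msg.get_content_charset() or default_charset
    try:
        return data.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name in the header: fall back to the default.
        return data.decode(default_charset, errors="replace")


html = decode_body("text/html; charset=utf-8", "café".encode("utf-8"))
pdf = decode_body("application/pdf", b"%PDF-1.4")
```

The fallback charset is an assumption of this sketch; a real browser applies more elaborate sniffing rules.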
As written at [1], this might be a cause for the segfaults observed at interpreter shutdown time. [1] http://enki-editor.org/2014/08/23/Pyqt_mem_mgmt.html
Also replace the generic Exception with the more specific RuntimeError.
Just call into Qt event processing and avoid unneeded sleep time.
Reduce time spent just sleeping and allow more Qt event processing to happen, according to the actual time value passed to sleep and wait_for.
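One way to picture this, as a sketch under assumptions rather than the actual Ghost.py code: instead of one long blocking sleep, loop until a deadline and hand control to the event loop on every iteration. The function name, the process_events callback and the tick value are all hypothetical.

```python
import time


def sleep_processing_events(duration, process_events, tick=0.01):
    """Hypothetical helper: wait `duration` seconds while pumping events.

    `process_events` stands in for the toolkit's event-processing call.
    Returns the number of times events were processed.
    """
    deadline = time.monotonic() + duration
    calls = 0
    while True:
        process_events()
        calls += 1
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return calls
        # Never sleep past the deadline, and never longer than one tick.
        time.sleep(min(tick, remaining))


seen = []
count = sleep_processing_events(0.05, lambda: seen.append(1))
```

Short ticks keep signals flowing during the wait, at the cost of a few more wake-ups.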
Because super is super.
Most of the time, QtWebkit emits pageLoaded when all resources are indeed loaded, however when downloading a file directly for example, the signal is emitted even though content is still flowing down. Keeping a registry allows delaying closing the session until all requests created during the session are indeed complete.
Cannot get my mind around this problem so implement a workaround for now.
With all signals properly connected, I could not find a reason to keep this around.
6a031c5 to a34ae95
I was trying to compute checksums of resources downloaded by Ghost.py, but some sites stream their content. This is not a problem for regular web content, but files that go through unsupported_content do not behave properly. One resource is created for each chunk of the file received, which obviously does not help with getting the complete content we expect. The fix here is to just let NetworkAccessManager do its job.

The second problem detected with this case is that QtWebKit emits pageLoaded even though the QNetworkReply object in charge of downloading the file is still running. Hence keep a registry of in-flight requests and only allow wait_for_page_loaded to return when both pageLoaded has been emitted and no queries are still running.

This is basically what was reported in PR #265, hopefully with clearer wording.
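The two-part condition can be sketched in plain Python, without Qt. Everything here is hypothetical and for illustration only: InFlightRegistry, wait_until_loaded_and_idle and the fake event pump are not the names or signatures used in Ghost.py.

```python
import time


class InFlightRegistry:
    """Hypothetical registry of requests that started but have not finished."""

    def __init__(self):
        self._pending = set()

    def started(self, request_id):
        self._pending.add(request_id)

    def finished(self, request_id):
        self._pending.discard(request_id)

    def is_idle(self):
        return not self._pending


def wait_until_loaded_and_idle(is_loaded, registry, process_events, timeout=5.0):
    """Return True once the page is loaded AND no request is in flight."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        process_events()  # let pending signals be delivered
        if is_loaded() and registry.is_idle():
            return True
        time.sleep(0.001)
    return False


# Simulation: the page reports "loaded" immediately, but a file download
# only completes on the third pass through the event loop.
registry = InFlightRegistry()
registry.started("file.bin")
ticks = {"n": 0}


def fake_process_events():
    ticks["n"] += 1
    if ticks["n"] == 3:
        registry.finished("file.bin")


result = wait_until_loaded_and_idle(lambda: True, registry, fake_process_events)
```

In the simulation the wait does not return on the first pass even though the page is already flagged as loaded; it returns only after the pending download is removed from the registry.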