Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force ViewBackend dispatch when the bridge connection is lost #161

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

psaavedra
Copy link
Member

issue

Case of use:

  1. WebView 1 load a Page Domain1/page (surface 1)
  2. WebView 1 load a Page Domain2/page (surface 2) (but is filtered by the Content Filters Policy (surface 2 is inmediately destroyed)
  3. WebView 2 load a Page Domain1/page (surface 3)
  4. WebView 2 is closed
  5. WebView 1 not updated anymore.

Explanation:

The figure shows the situation in 2):

At that point, during the creation of the new surface (4) will trigger the ViewBackend:dispatchFrameCallbacks() but for the surface number 4, being inhibited the previous one (the surface 3). So this Surface 3 will not process the callbacks associated to the frame reached inmediately after the register action is done (2nd red box) and it will be never received the commit from the WebProcess because this will keep waiting for the wl_callback_send_done() triggered in the Surface::dispatchFrameCallbacks().

The 1b95b25 attempts to address that situation adding an Instance::dispatchFrameCallbacks() of the previous surface when the new surface is registered but that does not guarantee that you get the frame ACK message associated to the previous surface after the register (the orange box just after the blue box). The result: the previous surface never sends the wl_callback_send_done() associated for this surface.

So far the real problem. From here how the bug is manifested: ...

In the state in which everything is we have:

  • The new surface (4) was created but destroyed immediately afterwards (but not by an unregisterSurface() but by a wl_client_destroy)
  • The previous surface (3) did not send the wl_callback_send_done() to the target renderer backend so the WebProcess stops to generate a new frame.

Then 3 scenarios:

  • If a page from a different domain is loaded on the same view: Then the situation can be recovered since a new Surface is going to be created and the previous one dies
  • If the new request is in the same domain. GAME_OVER. There is no new a registerSurface() and the Surface 3 is still pending a Surface::dispatchFrameCallbacks()
  • Creation a new page in a view ... apparently everything is still working. We will continue ...

... The new page load works in the new WebView because: 1) it is a new surface 2) associated with a new ViewBackend but recovers the situation? The answer is no: ...

A priori you could thinkg that the registerSurface() will shoot the ViewBackend::dispatchFrameCallbacks() for the previous surface. That is True but for the previous surface in ViewBackend but this is new one so only has one Surface (number 5) therefore it will not dispatch anything. As a consequence once this WebView is closed returns to the previous View, we will have that Surface 3 is still in the same inconsistent state.

So where can we make sure the dispatch for a previous Surface is going to work when those lazy frame happends?. I said before... there is not an unregisterSurface() for Surface 4, it dies abruptly therefore the place where to make sure that the previous frame of a previous surface is dispatched is in the bridgeConnectionLost().

aperezdc and others added 6 commits May 5, 2021 01:48
As each ViewBackend (and hence web view) can get its contents rendered
by different WPEWebProcess due to PSON, keep around a list of all the
differen bridgeIds which have been associated with it in a std::vector.
Newly added elements are always pushed at the back, so the most recent
association (i.e. the "current" active one) is always the last element.
Keeping the list allows getting back to the previously used ones.
Arrange for calling ViewBackend::unregisterSurface() also when a Surface
is destroyed. This is a no-op if the nested compositor had the chance
to receive and handle a FdoIPC::UnregisterSurface message before the
surface is destroyed; but allows undoing the bridge association when the
Wayland client connection is closed beforehand. The destruction callback
for the surface is guaranteed to always be called by libwayland, even
on dropped connections, to allow cleaning up reagardless of the fate
of its clients.

Among others, this covers the case in which a WPEWebProcess is abruptly
exited, for example due to a crash or a top-level resource load being
blocked by a content filter.

Co-authored-by: Adrian Perez de Castro <[email protected]>
Acked-by: Adrian Perez de Castro <[email protected]>
Signed-off-by: Pablo Saavedra <[email protected]>
…urface

In an un/register surface action is taken; make sure that surfaces other than
the new active one do not have callbacks pending to be dispatched.

Co-authored-by: Adrian Perez de Castro <[email protected]>
Acked-by: Adrian Perez de Castro <[email protected]>
Signed-off-by: Pablo Saavedra <[email protected]>
The ~ViewBackend sets now the m_backend to NULL and the
ViewBackend::dispatchFrameCallbacks() checks if m_backend is not NULL
before call wpe_view_backend_dispatch_frame_displayed().

Acked-by: Adrian Perez de Castro <[email protected]>
Signed-off-by: Pablo Saavedra <[email protected]>
This patch address the following scenario:

* View 1: Opens a non-blocked page (surface1) (domain1/pageA)
* View 1: Opens a blocked page (surface2) (domain2/pageA)
  * That add a new bridge Id in the bridgeIds
  * Frames related to the surface1 arrives late (after the new surface is
    registered). There is not a dispatch notification because the
    `bridgeIds.back()`, at that moment, is associated to the `surface2`.
    Therefore `wpe_view_backend_dispatch_frame_displayed(m_backend);` is
    inhibited
  * Destroy surface2 because the content is filtered
* View 2: Opens a non-blocked page (surface3) (domain3/pageC)
* View 2: Closed
  * Destroy ViewBackend
  * Destroy Surface3
* View 1: Open a non-blocked page in the same domain that the related to
  the surface1 (domain1/pageB)

* Issue: New frames will not be generated because the previous
  `wpe_view_backend_dispatch_frame_displayed` was not called for related
  WebProcess backend (1)

Signed-off-by: Pablo Saavedra <[email protected]>
... instead of do it throught the ViewBackend::unregisterSurface().

During destruction we only want to remove the surface-viewbackend
association but we do not want to call dispatchFrameCallbacks because
the nested compositor client is dying

Acked-by: Pablo Saavedra <[email protected]>
@psaavedra psaavedra requested a review from aperezdc May 7, 2021 12:09
Base automatically changed from aperezdc/bridgeid-list to master May 12, 2021 15:38
@jameshilliard
Copy link

So I tried applying this on top of 1.10.0 to see if it would have any effect on #166. I got the following error.

(cog:398): Cog-Core-WARNING **: 18:20:23.914: <app:///> Crash!: The renderer process crashed. Reloading the page may fix intermittent failures.
--398-- REDIR: 0xb21d1d0 (libstdc++.so.6:operator delete(void*)) redirected to 0x484f064 (operator delete(void*))
==398== Invalid read of size 4
==398==    at 0xB158D68: WS::Instance::unregisterSurface(WS::Surface*) (ws.cpp:551)
==398==    by 0xB158DBB: operator() (ws.cpp:157)
==398==    by 0xB158DBB: WS::s_compositorInterface::{lambda(wl_client*, wl_resource*, unsigned int)#1}::operator()(wl_client, wl_resource, unsigned int) const::{lambda(wl_resource)#1}::_FUN(wl_resource) (ws.cpp:159)
==398==    by 0xB944F83: destroy_resource (wayland-server.c:724)
==398==    by 0xB94916B: for_each_helper (wayland-util.c:372)
==398==    by 0xB9496BF: wl_map_for_each (wayland-util.c:385)
==398==    by 0xB945107: wl_client_destroy (wayland-server.c:883)
==398==    by 0xB9451DB: wl_client_connection_data (wayland-server.c:337)
==398==    by 0xB947033: wl_event_loop_dispatch (event-loop.c:1027)
==398==    by 0xB157CDF: operator() (ws.cpp:75)
==398==    by 0xB157CDF: WS::ServerSource::{lambda(_GSource*, int (*)(void*), void*)#3}::_FUN(_GSource*, int (*)(void*), void*) (ws.cpp:84)
==398==    by 0x80F91A3: g_main_dispatch (gmain.c:3337)
==398==    by 0x80F91A3: g_main_context_dispatch (gmain.c:4055)
==398==    by 0x80F9403: g_main_context_iterate.constprop.0 (gmain.c:4131)
==398==    by 0x80F94EB: g_main_context_iteration (gmain.c:4196)
==398==  Address 0x50810ea8 is 8 bytes inside a block of size 24 free'd
==398==    at 0x484F0E8: operator delete(void*) (vg_replace_malloc.c:802)
==398==    by 0xB159103: deallocate (new_allocator.h:133)
==398==    by 0xB159103: deallocate (alloc_traits.h:492)
==398==    by 0xB159103: _M_deallocate_node_ptr (hashtable_policy.h:2064)
==398==    by 0xB159103: _M_deallocate_node (hashtable_policy.h:2054)
==398==    by 0xB159103: _M_erase (hashtable.h:1888)
==398==    by 0xB159103: std::_Hashtable<unsigned int, std::pair<unsigned int const, WS::Surface*>, std::allocator<std::pair<unsigned int const, WS::Surface*> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::erase(std::__detail::_Node_const_iterator<std::pair<unsigned int const, WS::Surface*>, false, false>) (hashtable.h:1863)
==398==    by 0xB158D83: erase (hashtable.h:807)
==398==    by 0xB158D83: erase (unordered_map.h:797)
==398==    by 0xB158D83: WS::Instance::unregisterSurface(WS::Surface*) (ws.cpp:549)
==398==    by 0xB158DBB: operator() (ws.cpp:157)
==398==    by 0xB158DBB: WS::s_compositorInterface::{lambda(wl_client*, wl_resource*, unsigned int)#1}::operator()(wl_client, wl_resource, unsigned int) const::{lambda(wl_resource)#1}::_FUN(wl_resource) (ws.cpp:159)
==398==    by 0xB944F83: destroy_resource (wayland-server.c:724)
==398==    by 0xB94916B: for_each_helper (wayland-util.c:372)
==398==    by 0xB9496BF: wl_map_for_each (wayland-util.c:385)
==398==    by 0xB945107: wl_client_destroy (wayland-server.c:883)
==398==    by 0xB9451DB: wl_client_connection_data (wayland-server.c:337)
==398==    by 0xB947033: wl_event_loop_dispatch (event-loop.c:1027)
==398==    by 0xB157CDF: operator() (ws.cpp:75)
==398==    by 0xB157CDF: WS::ServerSource::{lambda(_GSource*, int (*)(void*), void*)#3}::_FUN(_GSource*, int (*)(void*), void*) (ws.cpp:84)
==398==    by 0x80F91A3: g_main_dispatch (gmain.c:3337)
==398==    by 0x80F91A3: g_main_context_dispatch (gmain.c:4055)
==398==  Block was alloc'd at
==398==    at 0x484CD80: operator new(unsigned long) (vg_replace_malloc.c:417)
==398==    by 0xB1581FF: allocate (new_allocator.h:115)
==398==    by 0xB1581FF: allocate (alloc_traits.h:460)
==398==    by 0xB1581FF: _M_allocate_node<std::pair<unsigned int const, WS::Surface*> > (hashtable_policy.h:2032)
==398==    by 0xB1581FF: _Scoped_node<std::pair<unsigned int const, WS::Surface*> > (hashtable.h:272)
==398==    by 0xB1581FF: _M_emplace<std::pair<unsigned int const, WS::Surface*> > (hashtable.h:1673)
==398==    by 0xB1581FF: insert<std::pair<unsigned int const, WS::Surface*> > (hashtable_policy.h:1021)
==398==    by 0xB1581FF: insert (unordered_map.h:586)
==398==    by 0xB1581FF: WS::Instance::registerSurface(unsigned int, WS::Surface*) (ws.cpp:407)
==398==    by 0xB4CC7A7: ffi_call_SYSV (sysv.S:114)
==398==    by 0xB4CBFCF: ffi_call_int (ffi.c:747)
==398==    by 0xB9482C3: wl_closure_invoke (connection.c:1018)
==398==    by 0xB945443: wl_client_connection_data (wayland-server.c:432)
==398==    by 0xB947033: wl_event_loop_dispatch (event-loop.c:1027)
==398==    by 0xB157CDF: operator() (ws.cpp:75)
==398==    by 0xB157CDF: WS::ServerSource::{lambda(_GSource*, int (*)(void*), void*)#3}::_FUN(_GSource*, int (*)(void*), void*) (ws.cpp:84)
==398==    by 0x80F91A3: g_main_dispatch (gmain.c:3337)
==398==    by 0x80F91A3: g_main_context_dispatch (gmain.c:4055)
==398==    by 0x80F9403: g_main_context_iterate.constprop.0 (gmain.c:4131)
==398==    by 0x80F94EB: g_main_context_iteration (gmain.c:4196)
==398==    by 0x7F5AFEF: g_application_run (gapplication.c:2560)
==398== 

(cog:398): WPE-FDO-WARNING **: 18:20:25.015: Instance::dispatchFrameCallbacks(): Cannot find surface with bridgeId 1 in view backend map. Probably the associated surface is gone due to a premature exit in the client side

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants