Page MenuHomePhabricator

elm_win crashes if deleted during event callbacks
Closed, ResolvedPublic


_elm_win_focus_out() calls _elm_widget_top_win_focused_set() which triggers evas callbacks on the object. if a user deletes the win object during this callback then the _elm_win_focus_out() function will continue after the _elm_widget_top_win_focused_set() call and attempt operations which use deleted objects, e.g., starting a job with _elm_win_focus_highlight_reconfigure_job_start().

it's likely that most/all callbacks in elm have similar risks, so I think an audit should be done for all widgets which intercept/override callbacks like these.

zmike created this task.Aug 9 2017, 11:18 AM
cedric added a comment.Aug 9 2017, 2:57 PM

I guess we should review the location that do call callback_call and make sure that we do ref the object before the call.

jpeg added a comment.Aug 9 2017, 9:58 PM

efl_event_callback_call like all eo functions will ref/unref the object.
_elm_win_focus_highlight_reconfigure_job_start needs to be fixed.

The pattern that @zmike is describing is when a callback is triggered and you continue to use the object after you call callback call, it is at risk of the object dying during when coming back from the callback call. I don't see a way to fix the problem other that reviewing all user of callback call.

jpeg removed jpeg as the assignee of this task.Aug 9 2017, 10:28 PM
jpeg added a subscriber: jpeg.

OK so let's review all of EFL where we call a function which by some crazy side effect may delete an object we have access the data to AND isn't the object currently being called on (because that one won't die thanks to EO internal ref). I don't see the link with callbacks here.
The problem here is that ecore_evas is unsafe, as it doesn't use EO for its own events, and so during an ecore_evas callback like ecore_evas_callback_focus_out_set the object can die.

jpeg added a comment.Aug 9 2017, 10:30 PM

In the given scenario the code should crash at the line

sd->focus_highlight.cur.visible = EINA_FALSE;

inside _elm_win_focus_out. If it crashes in the job cb then it is easy to use the EO id instead of sd as the data passed to said job.

zmike added a comment.Aug 10 2017, 5:11 AM

The crash was in the job.

Also calling this a "crazy side effect" is misleading. Deleting objects during callbacks is very common in asynchronous programming.

jpeg added a comment.Aug 10 2017, 3:47 PM
In T5869#94400, @zmike wrote:

The crash was in the job.

Also calling this a "crazy side effect" is misleading. Deleting objects during callbacks is very common in asynchronous programming.

My point was that almost every single function call in EFL can lead to this situation, unless the object is protected by being the one EO is calling.

jpeg added a comment.Aug 10 2017, 6:44 PM

In this specific case, the object is invalid right after the callback.
The program doesn't crash because EO tries really hard to avoid crashes by using eina_freeq and eina_trash. This is the code doing the actual memory free in EO:

if ((klass->objects.trash_count <= 8) && (EINA_LIKELY(!_eo_trash_bypass))) // most likely true for a window
     eina_trash_push(&klass->objects.trash, obj); // keeps data alive
     eina_freeq_ptr_main_add(obj, free, klass->obj_size); // delays deletion unless specific env vars are set

But in theory the program should have crashed at the following line inside _elm_win_focus_out:

sd->focus_highlight.cur.visible = EINA_FALSE;

All other evas objects (that is, besides elm win) use manual free, which means they will be deleted later, not instantly. Elm Win is deleted as soon as its refs reach 0. This makes sense because manual free relies on the canvas to draw two more frames after deleting an object but the window itself is the canvas.

The job should be safely deleted before being run. Right now the job is deleted before being added because group_del is called during evas_object_del/efl_del.

All I can think of is wrapping every ecore_evas callback inside efl_ui_win.c by efl_ref / efl_unref on the window. The rest of the code seems safe, assuming EO is used to make calls on the window.

jpeg added a comment.Aug 10 2017, 7:24 PM

Just pester me enough an I'll fix your bugs.