Random frequent crash of E
Closed, Resolved, Public

Description

After patch cffb31f4a768ba6112646960f6e197f85f02f2ca, enlightenment keeps crashing while trying to save some profile. The profile seems to actually be dead already. This is only happening on my laptop. It might be because I have a more aggressive saving setup on it than on my desktop, but I cannot figure out which setting triggers the crashes.

cedric created this task. Apr 10 2018, 5:22 PM
cedric triaged this task as High priority.

just going to copy my reply to the patch comment here

really? i haven't seen this. let me run under valgrind and see...

running e under valgrind... i can't see any issues involving elm config at all. i see some epoll things (that i think are false positives), some "Conditional jump or move depends on uninitialised value(s)" stuff with glob matching in fnmatch (i think it's an optimization, so a false positive), and some others with the e_pan smart object, and these i think are real... but no invalid accesses that would cause a crash. i did things that would cause config to be modified and saved - i even changed scaling, which would affect elm's config.

so i can't reproduce. do you have backtraces or valgrind info or anything specific to reproduce this?

==8138== Invalid read of size 8
==8138==    at 0x7C234BF: _elm_config_derived_save (elm_config.c:869)
==8138==    by 0x7C237F8: _elm_config_profile_save.part.9.constprop.22 (elm_config.c:2215)
==8138==    by 0x7C2342C: _elm_config_save (elm_config.c:2261)
==8138==    by 0x1A8D17: _e_config_save_cb (e_config.c:2317)
==8138==    by 0x1A63C0: e_config_save_flush (e_config.c:1718)
==8138==    by 0x29E197: e_sys_action_do (e_sys.c:398)
==8138==    by 0x14A9DF: _e_actions_cb_exit_dialog_ok (e_actions.c:2078)
==8138==    by 0x2AB5F9: _e_wid_activate_hook (e_widget_button.c:123)
==8138==    by 0x2AB692: _click (e_widget_button.c:144)
==8138==    by 0xB08996E: _event_callback_call (eo_base_class.c:1618)
==8138==    by 0xB08996E: _efl_object_event_callback_legacy_call (eo_base_class.c:1691)
==8138==    by 0xB084846: efl_event_callback_legacy_call (eo_base_class.c:1694)
==8138==    by 0xB084846: efl_event_callback_legacy_call (eo_base_class.c:1694)
==8138==    by 0x8392BAA: edje_match_callback_exec_check_finals (edje_match.c:556)
==8138==    by 0x8392BAA: edje_match_callback_exec (edje_match.c:711)
==8138==    by 0x8399F11: _edje_emit_cb (edje_program.c:1592)
==8138==    by 0x8399F11: _edje_emit_handle (edje_program.c:1544)
==8138==    by 0x83946EE: _edje_message_queue_process.part.3 (edje_message_queue.c:893)
==8138==    by 0x83948A8: _edje_message_queue_process (edje_message_queue.c:859)
==8138==    by 0x83948A8: _edje_job (edje_message_queue.c:260)
==8138==    by 0x5EA317A: _ecore_job_event_handler (ecore_job.c:98)
==8138==    by 0x5EA88B8: _ecore_event_message_handler_efl_loop_message_handler_message_call (ecore_event_message_handler.c:359)
==8138==    by 0x5EAF6EE: efl_loop_message_handler_message_call (efl_loop_message_handler.eo.c:14)
==8138==    by 0x5EAB878: _efl_loop_message_process (efl_loop.c:628)
==8138==    by 0x5EAA4B6: efl_loop_message_process (efl_loop.c:658)
==8138==    by 0x5EA50CE: _ecore_main_loop_iterate_internal (ecore_main.c:2418)
==8138==    by 0x5EA594C: _ecore_main_loop_begin (ecore_main.c:1174)
==8138==    by 0x5EAB7A8: _efl_loop_begin (efl_loop.c:83)
==8138==    by 0x5EAA766: efl_loop_begin (efl_loop.eo.c:28)
==8138==    by 0x5EA5A16: ecore_main_loop_begin (ecore_main.c:1247)
==8138==    by 0x25F059: main (e_main.c:1088)
==8138==  Address 0x6c is not stack'd, malloc'd or (recently) free'd
==8138==

Sorry, I didn't have time to turn it back on on my laptop. 100% reproducible here.

oooh under derived save... it may be that my e isn't doing that path... i'll look into it.

this one is bizarre. eet_data_read is seemingly returning a struct with junk!

_elm_config_derived_load() is called to get the derived profiles data/config, which is just a struct with a linked list. the struct is allocated ok (with ptr 0x55639a6e8680, for example), but derived->profiles is "junk": it's 0x64 for some reason... it SHOULD be NULL if there is no list, or a valid list ptr. instead it's basically garbage, and that is not good. this at least seems to be the core "bug begins here" point... just an update. i still need to look into why eet is decoding so badly.
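One plausible way to connect the junk pointer 0x64 with the faulting address 0x6c in the valgrind report: if the save path walks the list and reads a pointer-sized field sitting one pointer past the start of the node, then on a 64-bit build the read lands at 0x64 + 8 = 0x6c, matching "Invalid read of size 8". The sketch below only does that arithmetic. `struct list_node` and `next_field_address` are hypothetical names, and the layout (data pointer first, then next/prev) is an assumption for illustration, not a statement about the real list type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout, assumed for illustration only: a data pointer
 * followed by next/prev links, as in a typical doubly linked list. */
struct list_node {
    void *data;
    struct list_node *next;
    struct list_node *prev;
};

/* Address that a read of node->next would touch, computed from the raw
 * pointer value without dereferencing the (possibly bogus) pointer. */
uintptr_t
next_field_address(uintptr_t node_ptr_value)
{
    return node_ptr_value + offsetof(struct list_node, next);
}
```

With 8-byte pointers, `next_field_address(0x64)` evaluates to 0x6c. That is consistent with the trace but does not prove which field was actually being read at elm_config.c:869.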

hmmm the edd got shut down and not initted again... but how is eet decoding anything? the edd is NULL... this is crazy... :)
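A minimal sketch of the failure mode described here, and of one defensive pattern against it: a data descriptor that is freed at shutdown and never created again, with the load and save paths refusing to run while it is missing. Everything in it is a hypothetical stand-in (`Descriptor`, `Derived`, `desc_init`, `desc_shutdown`, `derived_load`, `derived_save`). None of these are eet or elementary functions, and this is not the fix that was applied to the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the eet or elementary APIs. */
typedef struct { int version; } Descriptor;
typedef struct { void *profiles; } Derived;

/* Module-level descriptor, valid only between init and shutdown. */
static Descriptor *derived_desc = NULL;

void
desc_init(void)
{
    if (!derived_desc) derived_desc = calloc(1, sizeof(Descriptor));
}

void
desc_shutdown(void)
{
    free(derived_desc);
    derived_desc = NULL; /* reset so later callers can detect the shutdown */
}

/* Returns NULL instead of attempting a decode when the descriptor is gone. */
Derived *
derived_load(void)
{
    if (!derived_desc) return NULL;
    return calloc(1, sizeof(Derived)); /* zeroed, so profiles == NULL */
}

/* Refuses to save when either the descriptor or the data is missing. */
bool
derived_save(const Derived *d)
{
    if (!derived_desc || !d) return false;
    return true; /* a real implementation would encode and write here */
}
```

In this sketch, calling `derived_load()` after `desc_shutdown()` without another `desc_init()` yields NULL rather than a struct with undefined contents, so a caller can bail out before touching `profiles`.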