
eo: do not free these values here
Abandoned, Public

Authored by bu5hm4n on Dec 6 2019, 10:45 AM.

Details

Summary

recall is already doing exactly these operations; because of mempool usage,
this had not been discovered yet.
Additionally, do not use EINA_LIST_FREE, altering the list while in
EINA_LIST_FREE might cause weird bugs.
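
A minimal sketch of the hazard described in the summary, using a hand-rolled list rather than Eina. All names here (Node, Owner, owner_cancel, owner_cancel_all) are hypothetical and are not taken from EFL or from this diff. The assumption is that the per-entry function already unlinks and frees its own node, so a draining loop that also removed and freed the node would touch it twice. The sketch re-reads the list head on every round and leaves the unlinking to the per-entry function:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not EFL types. */
typedef struct Node { struct Node *next; int id; } Node;
typedef struct Owner { Node *pending; int cancelled; } Owner;

/* Prepend a new entry to the owner's pending list. */
static void owner_add(Owner *o, int id)
{
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->id = id;
    n->next = o->pending;
    o->pending = n;
}

/* Per-entry cleanup: unlinks AND frees the matching node itself. */
static void owner_cancel(Owner *o, int id)
{
    for (Node **p = &o->pending; *p != NULL; p = &(*p)->next) {
        if ((*p)->id == id) {
            Node *dead = *p;
            *p = dead->next;
            free(dead);
            o->cancelled++;
            return;
        }
    }
}

/* Drain the list without freeing here: the head is re-read each round,
 * so the loop never holds a pointer to a node that owner_cancel freed. */
static int owner_cancel_all(Owner *o)
{
    int rounds = 0;
    while (o->pending != NULL) {
        owner_cancel(o, o->pending->id);
        rounds++;
    }
    return rounds;
}
```

A destructive iteration macro that keeps its own cursor while the body shrinks the same list would not have this property, which is the kind of bug the summary warns about.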

ref T8490

Diff Detail

Repository
rEFL core/efl
Branch
master
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 14862
bu5hm4n created this revision. Dec 6 2019, 10:45 AM
bu5hm4n requested review of this revision. Dec 6 2019, 10:45 AM
Peter2121 requested changes to this revision. Dec 6 2019, 12:38 PM
Peter2121 added a subscriber: Peter2121.

It's definitely a step forward: I no longer see multiple instances of efreetd started.
But the SIGSEGV problem is still here.
To reproduce it:

  • start E (startup error - cannot connect to efreetd)
  • only one instance of efreetd is present
  • start ephoto - still one instance of efreetd is present
  • start efreetd from command line - it exits immediately
  • start efreetd several times from command line - it exits after a 1-2 sec delay
  • the only started efreetd process is consuming 100% of CPU
  • start efreetd several times from command line - one of them crashes with the same error as before: the same coredump, zeros in 'value'.
This revision now requires changes to proceed. Dec 6 2019, 12:38 PM
bu5hm4n updated this revision to Diff 27390. Dec 7 2019, 3:59 AM
bu5hm4n edited the summary of this revision.

add debug values

@Peter2121 can you check again with this state of the revision?

I set up FreeBSD in a VM and installed everything; however, I cannot reproduce your issue :(

When you crash it now, can you paste the output here? Maybe we can make some sense of it this way.

Peter2121 requested changes to this revision. Dec 7 2019, 6:32 AM

With the new revision I cannot reproduce the crash. Starting ephoto does not produce several instances of efreetd.
BUT!
If I manually start many efreetd instances (just 'efreetd' in terminology, without arguments), at some point I see multiple instances running, consuming 100% of CPU:

9512 peter      92  20 42024 31416 T  0.0  0.4  0:51.17 /usr/local/bin/efreetd
9590 peter     120  19 42024 32584 R 99.9  0.4  7:49.05 /usr/local/bin/efreetd
9607 peter     120  19 42024 32584 R 96.9  0.4  7:48.64 /usr/local/bin/efreetd
9624 peter      52  19 42024 32568 S  0.0  0.4  0:00.68 /usr/local/bin/efreetd

The processes don't respond to SIGTERM.
If I attach a debugger, I see the following backtrace:

(lldb) bt
* thread #1, name = 'efreetd'
  * frame #0: 0x00000008005c805a libc.so.7`_select + 10
    frame #1: 0x00000008003f6522 libthr.so.3`___lldb_unnamed_symbol44$$libthr.so.3 + 66
    frame #2: 0x000000080033126a libecore.so.1`_ecore_main_select(obj=0x00004000000002fe, pd=0x00000008016130b0, timeout=<unavailable>) at ecore_main.c:1859
    frame #3: 0x000000080032ffa1 libecore.so.1`_ecore_main_loop_iterate_internal(obj=0x00004000000002fe, pd=0x00000008016130b0, once_only=0) at ecore_main.c:2461
    frame #4: 0x000000080033012d libecore.so.1`_ecore_main_loop_begin(obj=0x00004000000002fe, pd=0x00000008016130b0) at ecore_main.c:1200
    frame #5: 0x00000008003356b6 libecore.so.1`_efl_loop_begin(obj=0x00004000000002fe, pd=0x00000008016130b0) at efl_loop.c:57
    frame #6: 0x0000000800335136 libecore.so.1`efl_loop_begin(obj=0x00004000000002fe) at efl_loop.eo.c:28
    frame #7: 0x0000000800330223 libecore.so.1`ecore_main_loop_begin at ecore_main.c:1285
    frame #8: 0x000000000020449c efreetd`main(argc=<unavailable>, argv=<unavailable>) at efreetd.c:82
    frame #9: 0x000000000020411b efreetd`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1.c:76

Nothing is shown on the command line.
The cache is still not updated, and Enlightenment shows an error on start: 'Efreetd cannot be connected to'. The socket '=0' is present in /tmp/xdg-9U9z02/.ecore/efreetd
When I start ephoto from terminology I see the following output in console:


There are no previews in ephoto.

It seems that previously Raster fixed something related to 'select' in libc.so.

This revision now requires changes to proceed. Dec 7 2019, 6:32 AM

If I kill ALL efreetd processes and start ephoto, I end up with one or two efreetd processes, consuming 100% of CPU.
The console log:


If I attach a debugger, I see the same backtrace that I mentioned in the previous post.

Well, but the crash seems fixed, which is progress I guess.
The issue then is that EINA_LIST_FREE is evil ... :)

@cedric can you check the ephoto.log.txt above? There are lookups that are 0x0 (in _efl_event_future_recall), which is not good, I think.

Sure, we are going in the right direction, thank you :)

bu5hm4n updated this revision to Diff 27391. Dec 7 2019, 7:42 AM
bu5hm4n edited the summary of this revision.

remove debugging output

Okay, I made a mistake in this revision; this cannot work :( mhm

Peter2121 added a comment. Edited Dec 7 2019, 8:43 AM

Status of sockets, having only one instance of efreetd started (PID 10298):

% sockstat -u | grep efreet
peter    efreetd    10298 26 stream -> /var/run/dbus/system_bus_socket
peter    efreetd    10298 29 stream /tmp/xdg-9U9z02/.ecore/efreetd/0
peter    efreetd    10298 995 stream/tmp/xdg-9U9z02/.ecore/efreetd/0
peter    enlightenm 9403  35 stream -> /tmp/xdg-9U9z02/.ecore/efreetd/0

Is it normal that there are two lines for the same PID of efreetd (10298) and the same socket (/tmp/...) with different FDs - 29 and 995?

If I filter to show only connected sockets, I see only one FD:

% sockstat -uc | grep efreet
peter    efreetd    10298 26 stream -> /var/run/dbus/system_bus_socket
peter    efreetd    10298 995 stream/tmp/xdg-9U9z02/.ecore/efreetd/0
peter    enlightenm 9403  35 stream -> /tmp/xdg-9U9z02/.ecore/efreetd/0
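
On the question of two FDs for the same path: a Unix-domain server normally holds one listening descriptor plus one descriptor per accepted client, and both report the bound path as their local address. The listening descriptor is not connected, which would explain why it disappears with the -c filter. A minimal sketch, assuming this is what sockstat is showing here (the thread does not confirm it); same_local_path and the path passed to it are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Returns 1 if the listening fd and the accepted fd both report `path`
 * as their local address, 0 if they differ, -1 on any socket error. */
static int same_local_path(const char *path)
{
    struct sockaddr_un addr, a_listen, a_accept;
    socklen_t len_listen = sizeof a_listen, len_accept = sizeof a_accept;
    int lfd = -1, cfd = -1, afd = -1, result = -1;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof addr.sun_path, "%s", path);
    unlink(path); /* remove a stale socket file from an earlier run */

    lfd = socket(AF_UNIX, SOCK_STREAM, 0); /* server: listening fd */
    cfd = socket(AF_UNIX, SOCK_STREAM, 0); /* client side */
    if (lfd < 0 || cfd < 0) goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    if (listen(lfd, 1) < 0) goto done;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    afd = accept(lfd, NULL, NULL); /* server: second fd, same path */
    if (afd < 0) goto done;

    memset(&a_listen, 0, sizeof a_listen);
    memset(&a_accept, 0, sizeof a_accept);
    if (getsockname(lfd, (struct sockaddr *)&a_listen, &len_listen) < 0) goto done;
    if (getsockname(afd, (struct sockaddr *)&a_accept, &len_accept) < 0) goto done;
    result = strcmp(a_listen.sun_path, path) == 0 &&
             strcmp(a_accept.sun_path, path) == 0;
done:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    unlink(path);
    return result;
}
```

If that reading is right, two lines for one PID and one path are expected for a server with a single connected client.
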
zmike added a subscriber: zmike. Dec 23 2019, 6:50 AM

What's going on here?

bu5hm4n added a comment. Edited Dec 23 2019, 8:01 AM

Tracing BSD issues.

BTW, on a PC where I have the problem, /tmp is filled with efreetd_host-peter.bimp.local_djTpjB.log, which has the following content repeating continuously:

ecore<2049> ../src/lib/ecore/ecore_main.c:1993 _ecore_main_fd_handlers_bads_rem() No bad fd found. EEEK!
ecore<2049> ../src/lib/ecore/ecore_main.c:1944 _ecore_main_fd_handlers_bads_rem() Removing bad fds
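
For context, a common way for an event loop to hunt for a bad descriptor after select() fails with EBADF is to probe every watched fd with fcntl(fd, F_GETFD). The sketch below shows that generic probe only; it is an assumption that the log lines above come from a check of this kind, and count_bad_fds is a hypothetical helper, not an ecore function:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Counts how many descriptors in fds[0..n-1] are not open in this process.
 * fcntl(F_GETFD) fails with EBADF exactly when the fd is not open. */
static size_t count_bad_fds(const int *fds, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        errno = 0;
        if (fcntl(fds[i], F_GETFD) == -1 && errno == EBADF)
            bad++;
    }
    return bad;
}
```

If such a probe finds nothing while select() keeps failing, the loop has nothing to remove and retries immediately, which would be consistent with both the repeating log and the 100% CPU usage reported earlier.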

Other interesting log records from /tmp/efreetd_host-peter.bimp.local_xxxxxx.log:

ecore_con<1892> ../src/lib/ecore_con/efl_net_socket_unix.c:67 _efl_net_socket_unix_efl_loop_fd_fd_set() getpeername(1337): Socket is not connected
eina_mmap<1892> ../src/lib/eina/eina_mmap.c:95 _eina_mmap_safe_sigbus() Invalid object - BUS_OBJERR. SIGBUS!!!
bu5hm4n abandoned this revision. Feb 14 2020, 8:23 AM

this is wrong.