
efl 1.20.6 edje_cc illegal instruction error on armhf
Closed, Resolved (Public)

Description

Hello,

when building efl 1.20.6 I get a reproducible illegal instruction error in edje_cc:


(experimental_armhf-dchroot)ametzler@abel:~/EFL/efl-1.20.6$ env EFL_SHD_REGEN=1 make
make --no-print-directory all-recursive
Making all in src
make all-recursive
\
/bin/mkdir -p modules/ethumb/emotion; \
EFL_RUN_IN_TREE=1 ../src/bin/edje/edje_cc -v -id . -fd . -id ./modules/ethumb/emotion modules/ethumb/emotion/template.edc modules/ethumb/emotion/template.edj
/bin/bash: line 2: 21000 Illegal instruction EFL_RUN_IN_TREE=1 ../src/bin/edje/edje_cc -v -id . -fd . -id ./modules/ethumb/emotion modules/ethumb/emotion/template.edc modules/ethumb/emotion/template.edj
Makefile:54289: recipe for target 'modules/ethumb/emotion/template.edj' failed
make[4]: *** [modules/ethumb/emotion/template.edj] Error 132
Makefile:51883: recipe for target 'all-recursive' failed

This is on arm-linux-gnueabihf Debian/sid:


(experimental_armhf-dchroot)ametzler@abel:~/EFL/efl-1.20.6$ head -n 9 /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 2 (v7l)
BogoMIPS : 50.00
Features : half thumb fastmult vfp edsp thumbee vfpv3 tls idiva idivt vfpd32 lpae
CPU implementer : 0x56
CPU architecture: 7
CPU variant : 0x2
CPU part : 0x584
CPU revision : 2
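
Note that the Features line above lists vfpv3 and vfpd32 but no neon flag. As a rough illustration only (this is not code from EFL; the helper name `features_has` is made up for this sketch), a whole-token check of such a line could look like this:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return true if `name` appears as a whole
 * whitespace-separated token in the value part of a /proc/cpuinfo
 * "Features" line. Partial matches ("vfp" inside "vfpv3") do not count. */
static bool features_has(const char *line, const char *name)
{
    const char *p = strchr(line, ':');
    size_t len = strlen(name);

    p = p ? p + 1 : line;          /* skip the "Features :" label if present */
    while (*p && *p != '\n') {
        while (*p == ' ' || *p == '\t')
            p++;                   /* skip separators before the next token */
        const char *end = p;
        while (*end && *end != ' ' && *end != '\t' && *end != '\n')
            end++;                 /* find the end of the current token */
        if ((size_t)(end - p) == len && strncmp(p, name, len) == 0)
            return true;
        p = end;
    }
    return false;
}
```

Called on the Features line quoted above, this returns true for "vfpv3" and false for "neon", which matches the SIGILL on a NEON instruction reported below.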

ametzler created this task.Feb 18 2018, 4:24 AM


(gdb) run -v -id . -fd . -id ./modules/ethumb/emotion modules/ethumb/emotion/template.edc modules/ethumb/emotion/template.edj
Starting program: /home/ametzler/EFL/efl-1.20.6/src/bin/edje/.libs/edje_cc -v -id . -fd . -id ./modules/ethumb/emotion modules/ethumb/emotion/template.edc modules/ethumb/emotion/template.edj
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
[New Thread 0xb58e3f40 (LWP 22272)]
[New Thread 0xb50e2f40 (LWP 22273)]

Thread 1 "edje_cc" received signal SIGILL, Illegal instruction.
evas_common_cpu_neon_test () at lib/evas/common/evas_cpu.c:103
103 asm volatile (

(gdb) bt full
#0 0xb6e87948 in evas_common_cpu_neon_test ()
at lib/evas/common/evas_cpu.c:103
#1 0xb6e879c8 in evas_common_cpu_feature_test (feature=0xb6e87949 <evas_common_cpu_neon_test>) at lib/evas/common/evas_cpu.c:147
act =
{__sigaction_handler = {sa_handler = 0xb6e87961 <evas_common_cpu_catch_segv>, sa_sigaction = 0xb6e87961 <evas_common_cpu_catch_segv>}, sa_mask = {__val = {0 <repeats 32 times>}}, sa_flags = 268435456, sa_restorer = 0xb67ae174 <_eo_nostep_alloc>}
oact =
{__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0, 0, 2147484741, 3070225080, 69, 2147484999, 2147484999, 3070225080, 1850140416, 2147484741, 2147484741, 3061506504, 0, 2147484999, 2147484999, 2147484741, 2147484741, 4961104, 3061382143, 5, 0, 3070225080, 69, 2147484741, 2147484741, 3070225080, 71, 2147484999, 2147484999, 3070225080, 69, 2147484612}}, sa_flags = 0, sa_restorer = 0x0}
oact2 =
{__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0, 0, 2147484741, 3070225080, 69, 2147484999, 2147484999, 3070225080, 1850140416, 2147484741, 2147484741, 3061506504, 0, 2147484999, 2147484999, 2147484741, 2147484741, 4961104, 3061382143, 5, 0, 3070225080, 69, 2147484741, 2147484741, 3070225080, 71, 2147484999, 2147484999, 3070225080, 69, 2147484612}}, sa_flags = 0, sa_restorer = 0x0}
#2 0xb6e87ae8 in evas_common_cpu_init () at lib/evas/common/evas_cpu.c:241
called = 1
#3 0xb6e87c64 in evas_common_init () at lib/evas/common/evas_draw_main.c:133
#4 0xb6e36c7a in efl_canvas_output_engine_info_set (output=0x4bbd48, info=info@entry=0x4bbd90) at lib/evas/canvas/evas_out.c:135
e = 0x4bb058
#5 0xb6ddc7bc in evas_engine_info_set (obj=<optimized out>, info=info@entry=0x4bbd90) at lib/evas/canvas/evas_main.c:479
e = <optimized out>
#6 0xb6f8d676 in ecore_evas_buffer_allocfunc_new (w=w@entry=1, h=h@entry=1, alloc_func=<optimized out>, free_func=
0xb6f8c8e9 <_ecore_evas_buffer_pix_free>, data=<optimized out>,
data@entry=0x0) at lib/ecore_evas/ecore_evas_buffer.c:804
einfo = 0x4bbd90
bdata = 0x48fa58
ee = 0x48f888
rmethod = 1
__FUNCTION__ = "ecore_evas_buffer_allocfunc_new"
#7 0xb6f8d8d8 in ecore_evas_buffer_new (w=w@entry=1, h=h@entry=1)
at lib/ecore_evas/ecore_evas_buffer.c:846
#8 0x0040986a in _data_image_sets_size_set () at bin/edje/edje_cc_out.c:3423
evas = <optimized out>
ee = <optimized out>
set = <optimized out>
simg = <optimized out>
preimg = <optimized out>
l = <optimized out>
entries = <optimized out>
i = <optimized out>
__FUNCTION__ = "_data_image_sets_size_set"
#9 0x0040ce60 in data_process_lookups () at bin/edje/edje_cc_out.c:4047
img = <optimized out>
set_realloc = <optimized out>
images_unused_list = <optimized out>
i = <optimized out>
de = <optimized out>
de_last = <optimized out>
set = <optimized out>
set_last = <optimized out>
set_e = <optimized out>
pc = <optimized out>
it = <optimized out>
part = 0x4b6cf8
program = <optimized out>
group = <optimized out>
image = <optimized out>
model = <optimized out>
l2 = <optimized out>
l = <optimized out>
images_in_use = <optimized out>
models_in_use = <optimized out>
group_name = <optimized out>
is_lua = <optimized out>
iui = <optimized out>
__FUNCTION__ = "data_process_lookups"
#10 0x00407fa8 in main (argc=<optimized out>, argv=<optimized out>)
at bin/edje/edje_cc.c:419
i = <optimized out>
st =
{st_dev = 65025, __pad1 = 0, __st_ino = 1986518, st_mode = 33188, st_nlink = 1, st_uid = 2571, st_gid = 2571, st_rdev = 0, __pad2 = 0, st_size = 1086, st_blksize = 4096, st_blocks = 8, st_atim = {tv_sec = 1518950590, tv_nsec = 36994035}, st_mtim = {tv_sec = 1511353427, tv_nsec = 0}, st_ctim = {tv_sec = 1518950547, tv_nsec = 447441844}, st_ino = 1986518}
rpath = "/home/ametzler/EFL/efl-1.20.6/src/modules/ethumb/emotion/template.edc", '\000' <repeats 2571 times>...
rpath2 = "/home/ametzler/EFL/efl-1.20.6/src/modules/ethumb/emotion/template.edj\000\250\265", '\000' <repeats 12 times>, "\017\000\000\000\260\303\376\266\000\000\000\000\001\000\000\000\000\000\000\000P9\250\265\060\364\377\276\000\002\000\000\177ELF\001\001\001\000\000\000\000\000\000\000\000\000\003\000(\000\001\000\000\000\230$\000\000\064\000\000\000\330\322\000\000\000\004\000\005\064\000 \000\006\000(\000\034\000\033\000\001", '\000' <repeats 15 times>, "\240\275\000\000\240\275\000\000\005\000\000\000\000\000\001\000"...
__FUNCTION__ = "main"
(gdb) x/16i $pc
=> 0xb6e87948 <evas_common_cpu_neon_test>: vqadd.u8 d0, d1, d0
0xb6e8794c <evas_common_cpu_neon_test+4>: bx lr
0xb6e8794e: nop
0xb6e87950 <evas_common_cpu_catch_ill>:
ldr r0, [pc, #8] ; (0xb6e8795c <evas_common_cpu_catch_ill+12>)
0xb6e87952 <evas_common_cpu_catch_ill+2>: movs r1, #1
0xb6e87954 <evas_common_cpu_catch_ill+4>: push {r3, lr}
0xb6e87956 <evas_common_cpu_catch_ill+6>: add r0, pc
0xb6e87958 <evas_common_cpu_catch_ill+8>:
blx 0xb6dc226c <__longjmp_chk@plt>
0xb6e8795c <evas_common_cpu_catch_ill+12>: andeq pc, r9, r6, ror #9
0xb6e87960 <evas_common_cpu_catch_segv>: push {r3, lr}
0xb6e87962 <evas_common_cpu_catch_segv+2>:
bl 0xb6e87950 <evas_common_cpu_catch_ill>
0xb6e87966: nop
0xb6e87968 <evas_common_cpu_feature_test>:
ldr r1, [pc, #160] ; (0xb6e87a0c <evas_common_cpu_feature_test+164>)
0xb6e8796a <evas_common_cpu_feature_test+2>:
ldr r2, [pc, #164] ; (0xb6e87a10 <evas_common_cpu_feature_test+168>)
0xb6e8796c <evas_common_cpu_feature_test+4>: push {r4, r5, r6, lr}
0xb6e8796e <evas_common_cpu_feature_test+6>: add r1, pc
(gdb) thread apply all backtrace

Thread 3 (Thread 0xb50e2f40 (LWP 22273)):
#0 0xb6cdcf54 in __libc_do_syscall ()
at /lib/arm-linux-gnueabihf/libpthread.so.0
#1 0xb6cd8258 in pthread_cond_wait@@GLIBC_2.4 ()
at /lib/arm-linux-gnueabihf/libpthread.so.0
#2 0xb6e9ceb6 in eina_condition_wait (cond=0xb6f27070 <evas_thread_queue_condition>) at ../src/lib/eina/eina_inline_lock_posix.x:351
#3 0xb6e9ceb6 in evas_thread_worker_func (data=<optimized out>, thread=<optimized out>) at lib/evas/common/evas_thread_render.c:134
#4 0xb6d2f50c in _eina_internal_call (context=0x47ddc0)
at lib/eina/eina_thread.c:148
#5 0xb6cd35f4 in start_thread () at /lib/arm-linux-gnueabihf/libpthread.so.0
#6 0xb6aed15c in () at /lib/arm-linux-gnueabihf/libc.so.6

Thread 2 (Thread 0xb58e3f40 (LWP 22272)):
#0 0xb6cdcf54 in __libc_do_syscall ()
at /lib/arm-linux-gnueabihf/libpthread.so.0
#1 0xb6cdacd6 in __lll_lock_wait ()
at /lib/arm-linux-gnueabihf/libpthread.so.0
#2 0xb6cd551c in pthread_mutex_lock ()
at /lib/arm-linux-gnueabihf/libpthread.so.0

Thread 1 (Thread 0xb58ed010 (LWP 22269)):
#0 0xb6e87948 in evas_common_cpu_neon_test ()
at lib/evas/common/evas_cpu.c:103
#1 0xb6e879c8 in evas_common_cpu_feature_test (feature=0xb6e87949 <evas_common_cpu_neon_test>) at lib/evas/common/evas_cpu.c:147
#2 0xb6e87ae8 in evas_common_cpu_init () at lib/evas/common/evas_cpu.c:241
#3 0xb6e87c64 in evas_common_init () at lib/evas/common/evas_draw_main.c:133
#4 0xb6e36c7a in efl_canvas_output_engine_info_set (output=0x4bbd48, info=info@entry=0x4bbd90) at lib/evas/canvas/evas_out.c:135
#5 0xb6ddc7bc in evas_engine_info_set (obj=<optimized out>, info=info@entry=0x4bbd90) at lib/evas/canvas/evas_main.c:479
#6 0xb6f8d676 in ecore_evas_buffer_allocfunc_new (w=w@entry=1, h=h@entry=1, alloc_func=<optimized out>, free_func=
0xb6f8c8e9 <_ecore_evas_buffer_pix_free>, data=<optimized out>,
data@entry=0x0) at lib/ecore_evas/ecore_evas_buffer.c:804
#7 0xb6f8d8d8 in ecore_evas_buffer_new (w=w@entry=1, h=h@entry=1)
at lib/ecore_evas/ecore_evas_buffer.c:846
#8 0x0040986a in _data_image_sets_size_set () at bin/edje/edje_cc_out.c:3423
#9 0x0040ce60 in data_process_lookups () at bin/edje/edje_cc_out.c:4047
#10 0x00407fa8 in main (argc=<optimized out>, argv=<optimized out>)
at bin/edje/edje_cc.c:419

(gdb) layout asm shows:
│0xb6e87948 <evas_common_cpu_neon_test> vqadd.u8 d0, d1, d0

config.log reads:


configure:26685: checking whether to use NEON instructions
configure:26700: gcc -c -mfpu=neon -ftree-vectorize -g -O2 -fdebug-prefix-map=/home/ametzler/EFL/efl-1.20.6=. -fstack-protector-strong -Wformat -Werror=format-security -fvisibility=hidden -Wdate-time -D_FORTIFY_SOURCE=2 conftest.c >&5

I am not sure, but it looks to me like configure.ac only checks whether NEON code can be compiled, while at runtime the binaries do not check whether NEON is actually usable.

Hmmm, but evas_common_cpu_neon_test looks like a runtime check to see if NEON is usable. evas_common_cpu_feature_test calls it with a handler installed for SIGILL. For some reason, it's not being trapped though.

raster added a subscriber: raster.Feb 27 2018, 1:41 AM

what @rvandegrift said. efl does try to execute instructions to do runtime neon (also mmx, sse etc.) tests and sets up a SIGILL handler to trap this and recover (and thus turn off support for that instruction set). i don't actually have any arm devices today without neon support, but i used to (a tegra 2), and i remember it working there, so i've never seen it fail. on x86 the exact same scheme is used and i've seen it work perfectly many times too.

so something disallows SIGILL trapping in this environment, and that, at least i think, is an invalid thing to do unless there is a very good explanation otherwise.
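
For readers unfamiliar with the scheme described above, here is a minimal sketch of a probe guarded by a SIGILL handler. It is not the actual evas_cpu.c code: all names are hypothetical, and `raise(SIGILL)` stands in for an unsupported instruction so the example runs on any POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

static sigjmp_buf probe_env;   /* jump target used by the handler */

/* Handler: abandon the probe and jump back into cpu_feature_probe(). */
static void probe_on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Run `probe` with a temporary SIGILL handler installed.
 * Returns true if the probe returned normally, false if it trapped. */
static bool cpu_feature_probe(void (*probe)(void))
{
    struct sigaction act, old;
    volatile bool supported = false;

    memset(&act, 0, sizeof act);
    act.sa_handler = probe_on_sigill;
    sigemptyset(&act.sa_mask);
    sigaction(SIGILL, &act, &old);

    /* savemask = 1 so the signal mask is restored after the jump */
    if (sigsetjmp(probe_env, 1) == 0) {
        probe();
        supported = true;
    }

    sigaction(SIGILL, &old, NULL);   /* put the previous handler back */
    return supported;
}

/* Stand-ins for real instruction tests (assumptions for this sketch). */
static void probe_supported(void)   { }
static void probe_unsupported(void) { raise(SIGILL); }
```

The guard only covers instructions executed inside the probe. If the compiler emits an unsupported instruction elsewhere, outside any such guard, the process still dies with SIGILL.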

Hello,

with ManMower's commit the SIGILL happens at a slightly different location.

Find attached a gdb log for a -O0 build.

Ah - gcc is using NEON to zero an unsigned long long. That's because 77d2e0cb959 enabled -mfpu=neon -ftree-vectorize to fix breaks due to mixed NEON/non-NEON fpu code. Since Debian supports non-NEON armhf, I think we need --disable-neon.

raster added a comment.Mar 3 2018, 8:40 PM

Totally disabling neon then hurts 99% of people with neon ARM systems (non-neon is very rare these days). Perhaps we only turn on -mfpu=neon -ftree-vectorize when compiling the neon assembly C files and just those? their interface to the rest of evas is regular core arm registers (no floating point or neon registers)... ?

But since gcc wants to use NEON even for constant assignments, I'm worried that -mfpu=neon will result in fragile binaries on non-NEON. Debian's armhf policy currently requires supporting non-NEON, even though full NEON is best for many people.

Rereading the above reference commit - it actually makes a user-supplied -mfpu parameter take precedence over EFL's default. So I guess that means that EFL's NEON support works without gcc's? If so, maybe the easiest solution is for Debian packages to drop -mfpu=neon -ftree-vectorize on armhf only.

raster added a comment.Mar 4 2018, 8:24 PM

If the neon opcodes are in functions we never call because we already determined the system doesn't support neon, then it doesn't matter. Right? It just means having to isolate the neon code into their own files - that'd be for the cpu checks. the other neon asm for the pixel pushing is already isolated, though #included into a parent src file.

the asm neon code in efl is separate from gcc's neon optimizations, but we've found we do need -mfpu=neon etc. to bind it into the rest of the code so it compiles and works. that's why i was mentioning providing those flags only for these files. the functions that are then called in these files interface to the rest of the code with regular core arm registers/integers (well, i'd hope/assume it wouldn't decide to use neon registers for passing normal pointers, integers etc. as arguments), so it wouldn't matter what these neon optimized funcs do internally, as long as they don't get called on non-neon systems.
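
A rough sketch of that isolation idea, under stated assumptions: the function names are invented for illustration, and the "neon" variant here is a plain C stand-in for a routine that would live in its own source file built with -mfpu=neon. Callers only ever see a function pointer chosen once from the runtime check.

```c
#include <stddef.h>
#include <stdint.h>

/* Common interface: only pointers and integers cross the boundary. */
typedef void (*add_sat_u8_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Portable fallback: saturating unsigned 8-bit add, the same operation
 * that vqadd.u8 performs per lane. */
static void add_sat_u8_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)dst[i] + src[i];
        dst[i] = sum > 255 ? 255 : (uint8_t)sum;
    }
}

/* Stand-in (assumption): in a real build this body would be NEON code
 * compiled separately, so no NEON opcode leaks into other objects. */
static void add_sat_u8_neon_standin(uint8_t *dst, const uint8_t *src, size_t n)
{
    add_sat_u8_generic(dst, src, n);
}

/* Pick the implementation once, based on the runtime probe result. */
static add_sat_u8_fn add_sat_u8_select(int have_neon)
{
    return have_neon ? add_sat_u8_neon_standin : add_sat_u8_generic;
}
```

With `have_neon` set to 0 the NEON variant is never called, so whatever instructions it contains are never executed on a non-NEON CPU.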

I see, makes sense to me. Thanks for explaining.

bu5hm4n assigned this task to ManMower.Jun 11 2018, 1:19 AM
bu5hm4n added a subscriber: bu5hm4n.

You handled that in the past, can you handle it more so we can close it ? :)

bu5hm4n triaged this task as High priority.Jun 11 2018, 1:20 AM
zmike edited projects, added Restricted Project; removed efl.Jun 11 2018, 6:50 AM
bu5hm4n edited projects, added efl: layout engine; removed Restricted Project.Jun 11 2018, 9:22 AM
ManMower closed this task as Resolved.Jun 11 2018, 9:24 AM

This should be resolved by the referenced commits.