Enlightenment uses 100% CPU every few seconds once the screen blanks
Closed, Resolved Public

Description

  1. Wait for E to blank the screens. (Note, I have E blank the screens and then lock after 10 seconds).
  2. SSH in from another machine.
  3. Watch E use 100% of CPU every few seconds.

Name : enlightenment-git
Version : 0.24.99.24451.g4051c18a4-1
Build Date : Fri 05 Mar 2021 11:20:00 AM EST
Install Date : Fri 05 Mar 2021 11:22:21 AM EST

Name : efl-git
Version : 1.25.99.66451.g6899bd034c-1
Build Date : Fri 05 Mar 2021 11:14:38 AM EST
Install Date : Fri 05 Mar 2021 11:18:36 AM EST

Name : nvidia-dkms
Version : 460.56-1
Build Date : Thu 25 Feb 2021 11:49:07 AM EST
Install Date : Sun 28 Feb 2021 06:02:27 PM EST

abyomi0 created this task. Mar 5 2021, 6:52 PM
ProhtMeyhet triaged this task as Pending on user input priority. Edited Mar 6 2021, 11:16 AM
ProhtMeyhet added a subscriber: ProhtMeyhet.

You have to be more specific. How do you check if E uses 100% CPU?

For example: If your system load is way below the number of CPU cores you have, then that just means E is running, but everything else is not - the other processes are asleep. As such it is just E running.
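The load-versus-cores comparison described here can be sketched as follows. This is a minimal illustration assuming Python 3 on Linux; the helper name load_per_core is made up for this example, while os.getloadavg() and os.cpu_count() are standard library calls. A value well under 1.0 suggests the machine as a whole is not saturated, though a single process can still be pinning one core.

```python
import os


def load_per_core(load_1min, cores):
    """Return the 1-minute load average divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_1min / cores


def current_load_per_core():
    """Read the live values from the OS and return load per core."""
    load_1min, _, _ = os.getloadavg()   # 1, 5 and 15 minute averages
    cores = os.cpu_count() or 1         # cpu_count() may return None
    return load_per_core(load_1min, cores)
```

Calling current_load_per_core() over SSH while the screen is blanked gives a single number to report alongside a screenshot.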

Checking via bpytop.
It looks like this...

ProhtMeyhet closed this task as Invalid. Edited Mar 6 2021, 2:06 PM

Yes, expected and exactly what I've told you: The CPU will always have some work to do when the Computer is running.

The point in your video is the upper right corner showing OVERALL CPU usage. You are connected via SSH = that alone is a burden on the CPU as it has to encrypt and decrypt a lot. On top of that Firefox (Web Content, WebExtensions are also Firefox) is doing some JS stuff, Discord is also querying if there is something new "in the pipe", you've got some python running... Still it is only about 30% of your CPU that is being used. Even the load is only 1.07.

Now you have Xorg and E using most of the CPU time of that 1/3. That is because windows (not Microsoft) "talk" to Xorg and Xorg wants E to update the Windows and E has to react.

Maxing out a CPU would mean that a process is using 100% of the CPU while it is running at 100%. But, again, what you have is the relative % of E and other processes using your current CPU time which is quite low.
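One way to tell "share of the whole machine" apart from "one process pinning one core" is to sample the process's own CPU time. The sketch below is an illustration under stated assumptions, not what bpytop does internally: it reads utime and stime (fields 14 and 15 of /proc/PID/stat, as described in proc(5)) at two points in time and reports the percentage of a single core used in between. All function names are hypothetical.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/PID/stat line."""
    # The command name (field 2) may contain spaces or ')', so split after
    # the last ')'. The remaining fields start at field 3 (state).
    rest = stat_line.rsplit(")", 1)[1].split()
    utime, stime = int(rest[11]), int(rest[12])   # fields 14 and 15
    return utime + stime


def percent_of_one_core(ticks_before, ticks_after, seconds, ticks_per_sec):
    """CPU time used during the interval as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_sec * seconds)


def read_ticks(pid):
    """Read the current CPU tick total for a running process."""
    with open(f"/proc/{pid}/stat") as f:
        return cpu_ticks(f.read())


def watch(pid, interval=1.0, samples=10):
    """Print one reading per interval so short spikes become visible."""
    hz = os.sysconf("SC_CLK_TCK")   # clock ticks per second
    before = read_ticks(pid)
    for _ in range(samples):
        time.sleep(interval)
        after = read_ticks(pid)
        print(f"{percent_of_one_core(before, after, interval, hz):6.1f}% of one core")
        before = after
```

Running watch() with the PID printed by `pidof enlightenment` would show whether the process itself reaches roughly 100% of one core during a spike. A multi-threaded process can report more than 100%.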

raster added a subscriber: raster. Sat, Apr 17, 2:12 AM

perf top might be useful. perf top -p PIDOFENLIGHTENMENT ...

raster reopened this task as Open. Sat, Apr 17, 2:27 AM

Worth looking into, but "I don't see the problem". Apps running CAN cause E to use CPU - if something interacts with the XServer and that then causes E to have to do something in response - this definitely can happen. It then depends on what is happening whether E is using too much CPU or not. For comparison, take an idle X session with no apps doing anything. This is an idle e session with screen set to lock after blanking (blanking set to 30 sec, dimming to 10 sec), current git master e and efl:

http://www.enlightenment.org/ss/e-607aad751ad7f0.56121349.jpg

nothing... not a peep. i THINK i spotted e using 0.1 cpu at some point... bpytop uses almost all the cpu here with sshd next...

so it's something special for you... what - i can't see. perf top may say more.

After E blanks the screen.

Samples: 325K of event 'cycles', 4000 Hz, Event count (approx.): 42638079867 lost: 0/0 drop: 0/0
Overhead  Shared Object                   Symbol
   8.13%  libeo.so.1.25.99                [.] _eo_obj_pointer_get
   7.86%  libeo.so.1.25.99                [.] _efl_object_call_resolve
   2.60%  libeo.so.1.25.99                [.] _efl_data_scope_get
   2.50%  libeo.so.1.25.99                [.] _vtable_func_get
   2.40%  libedje.so.1.25.99              [.] _edje_part_recalc
   1.75%  libeo.so.1.25.99                [.] _eo_table_data_get
   1.54%  libc-2.33.so                    [.] __strcmp_avx2
   1.54%  libeo.so.1.25.99                [.] _efl_unref_internal
   1.53%  libpthread-2.33.so              [.] __pthread_getspecific
   1.45%  libeo.so.1.25.99                [.] _efl_object_call_end
   1.43%  libeina.so.1.25.99              [.] _eina_chained_mp_pool_key_cmp
   1.43%  libembryo.so.1.25.99            [.] embryo_program_run
   1.37%  libeo.so.1.25.99                [.] _eo_obj_pointer_done
   1.34%  libeo.so.1.25.99                [.] efl_isa
   1.25%  libeo.so.1.25.99                [.] _eo_table_data_table_get
   1.14%  libc-2.33.so                    [.] _int_malloc
   1.13%  libeo.so.1.25.99                [.] _apply_auto_unref
   1.11%  libc-2.33.so                    [.] _int_free
   1.09%  libeina.so.1.25.99              [.] _eina_share_common_cmp
   1.03%  libc-2.33.so                    [.] __libc_calloc
   1.00%  libeina.so.1.25.99              [.] eina_hash_superfast
   0.95%  libpthread-2.33.so              [.] __pthread_mutex_lock
   0.93%  libpthread-2.33.so              [.] pthread_spin_lock
   0.93%  libeo.so.1.25.99                [.] eina_tls_get
   0.91%  libpthread-2.33.so              [.] __pthread_mutex_unlock_usercnt
   0.78%  libeo.so.1.25.99                [.] _efl_data_scope_safe_get
   0.73%  libeina.so.1.25.99              [.] _eina_share_common_node
   0.69%  libembryo.so.1.25.99            [.] _str_snprintf
   0.65%  libeina.so.1.25.99              [.] eina_rbtree_inline_lookup
   0.65%  libeina.so.1.25.99              [.] eina_freeq_ptr_add
   0.65%  libembryo.so.1.25.99            [.] embryo_data_string_get
   0.65%  libedje.so.1.25.99              [.] _edje_part_recalc_single
   0.59%  libc-2.33.so                    [.] malloc_consolidate
   0.59%  libeo.so.1.25.99                [.] _efl_ref
   0.57%  libedje.so.1.25.99              [.] _edje_program_run
   0.57%  libeina.so.1.25.99              [.] eina_rbtree_inline_lookup
   0.56%  libeo.so.1.25.99                [.] _eo_class_pointer_get
   0.56%  libeina.so.1.25.99              [.] _eina_freeq_process
   0.54%  libeo.so.1.25.99                [.] efl_data_scope_get
   0.50%  libedje.so.1.25.99              [.] _edje_real_part_image_internal_set
   0.49%  libeo.so.1.25.99                [.] efl_unref
   0.49%  libeo.so.1.25.99                [.] _efl_super_cast
   0.47%  libeo.so.1.25.99                [.] _eo_kls_itr_next
   0.45%  libedje.so.1.25.99              [.] _edje_signal_source_key_cmp
   0.45%  libeina.so.1.25.99              [.] eina_inlist_remove
   0.44%  libeina.so.1.25.99              [.] eina_rbtree_inline_insert
   0.42%  libedje.so.1.25.99              [.] _edje_part_description_apply
   0.40%  libeina.so.1.25.99              [.] eina_share_common_add_length
   0.40%  libc-2.33.so                    [.] cfree@GLIBC_2.2.5
   0.38%  libeina.so.1.25.99              [.] eina_convert_itoa
   0.37%  libembryo.so.1.25.99            [.] _embryo_native_call
   0.36%  libeo.so.1.25.99                [.] _eo_table_data_table_get
   0.35%  libedje.so.1.25.99              [.] _edje_image_recalc_apply
   0.35%  libedje.so.1.25.99              [.] eina_rbtree_inline_lookup
   0.35%  libedje.so.1.25.99              [.] _edje_match_fn
   0.34%  libembryo.so.1.25.99            [.] embryo_data_string_set
   0.34%  libevas.so.1.25.99              [.] efl_canvas_image_internal_class_get
   0.34%  libc-2.33.so                    [.] __strlen_avx2
For a higher level overview, try: perf top --sort comm,dso

Name : enlightenment-git
Version : 0.24.99.24551.ga8fb84524-1
Build Date : Sat 17 Apr 2021 09:48:11 AM EDT
Install Date : Sat 17 Apr 2021 09:50:47 AM EDT

Name : efl-git
Version : 1.25.99.66550.g287834b0da-1
Build Date : Sat 17 Apr 2021 09:44:19 AM EDT
Install Date : Sat 17 Apr 2021 09:46:56 AM EDT

well nothing special there - something from theme is running. try this: remove the cpufreq gadget from your shelf?
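To see at a glance which libraries dominate a perf top listing like the one above, the Overhead column can be summed per shared object. The following is a rough sketch, assuming the text layout shown in this task (percentage, shared object, "[.]", symbol); the function name and the regular expression are this example's own, and the sample is only the first few lines copied from the listing.

```python
import re
from collections import defaultdict

# Matches lines such as "   8.13%  libeo.so.1.25.99   [.] _eo_obj_pointer_get"
LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+\[.\]\s+(\S+)")


def overhead_by_object(perf_text):
    """Sum the Overhead column per shared object, largest first."""
    totals = defaultdict(float)
    for line in perf_text.splitlines():
        m = LINE.match(line)
        if m:   # header and hint lines do not match and are skipped
            totals[m.group(2)] += float(m.group(1))
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


sample = """\
Overhead  Shared Object                   Symbol
   8.13%  libeo.so.1.25.99                [.] _eo_obj_pointer_get
   7.86%  libeo.so.1.25.99                [.] _efl_object_call_resolve
   2.40%  libedje.so.1.25.99              [.] _edje_part_recalc
   1.43%  libembryo.so.1.25.99            [.] embryo_program_run
"""

for dso, pct in overhead_by_object(sample):
    print(f"{pct:6.2f}%  {dso}")
```

Time showing up under libedje and libembryo is consistent with theme code (edje parts and embryo scripts) being run, which is the reading given in the comment above.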

abyomi0 added a comment. Edited Sun, Apr 18, 9:23 AM

I don't have that on the shelf. Used to, but not any more.

hmm well something is seemingly animating and using embryo script along the way... what is animating will maybe indicate what is going on... ?

What I have on the shelf currently:
iBar, Pager, Temp, Clock, Keyboard, System, Mixer.

I don't know what might be animating, though.
So far, I've tried removing certain items from the shelf and deleting the shelf entirely, but it still happens.

well embryo does turn up so something is going through script... i really am not sure what is doing this... something is really odd on your system there.

Hmm. Maybe I ought to wipe out my config...

Well, that wasn't it.
Is there some way I could try to figure out what embryo is doing?

unload all modules, then load 1 at a time? at least that will tell us if it's related to a module or not and if so, which one. another you can do is update your build to current git master... :)
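The "load one at a time" procedure suggested above can be written down as a simple loop. This is only a sketch of the idea: spikes_with is a hypothetical check supplied by whoever runs the test (for instance, load the given module by hand in the module settings, let the screen blank, and answer whether the spikes appear), and the module names in the example are placeholders.

```python
def find_culprit(modules, spikes_with):
    """Try each module on its own and return the first that reproduces the spike.

    `spikes_with(enabled)` is a hypothetical callback: it receives the list of
    modules currently loaded and returns True when the CPU spikes are observed.
    """
    # First confirm the problem disappears with every module unloaded;
    # otherwise no single module can be blamed.
    if spikes_with([]):
        return None
    for module in modules:
        if spikes_with([module]):
            return module
    return None


# Simulated run with placeholder names: pretend "module_c" causes the spikes.
candidates = ["module_a", "module_b", "module_c", "module_d"]
print(find_culprit(candidates, lambda enabled: "module_c" in enabled))
```

Testing modules one by one is linear in the number of modules. It also misses problems that only appear when two modules are loaded together, which this sketch does not attempt to cover.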

abyomi0 added a comment. Edited Mon, Apr 19, 5:11 PM

It appears to be the clock module.

Here are my clock settings.

http://www.enlightenment.org/ss/e-607e1c13be9117.60773762.jpg

...and it seems it's got something to do with having seconds being displayed. It doesn't happen if I disable that...

well well... but how is it using so much cpu? well ok - I can think of something - when a digit flips the clock in the old dark theme uses an animation that makes it flicker in like broken fluorescent tube.

perhaps disable seconds display - or.. update to git master as now flat theme has dropped and it does no animation for clock digit flips.

Still happens on git master, but since disabling seconds works around it, that's good for now.

Updated from: 

Name            : enlightenment-git
Version         : 0.24.99.24551.ga8fb84524-1
Build Date      : Sat 17 Apr 2021 09:48:11 AM EDT
Install Date    : Sat 17 Apr 2021 09:50:47 AM EDT

Name            : efl-git
Version         : 1.25.99.66550.g287834b0da-1
Build Date      : Sat 17 Apr 2021 09:44:19 AM EDT
Install Date    : Sat 17 Apr 2021 09:46:56 AM EDT

to: 

Name            : enlightenment-git
Version         : 0.24.99.24553.g18c0fb89c-1
Build Date      : Mon 19 Apr 2021 09:50:51 PM EDT
Install Date    : Mon 19 Apr 2021 09:53:01 PM EDT

Name            : efl-git
Version         : 1.25.99.66560.g0f6ff82d2a-1
Build Date      : Mon 19 Apr 2021 09:46:18 PM EDT
Install Date    : Mon 19 Apr 2021 09:50:50 PM EDT

but just waking up once a second to flip the digit should be cheap and not need that much cpu. i actually was wrong - the new flat theme still has a transition - it fades the digit out over 0.1 sec ... so 6 frames every 60... with screen blanked and seconds enabled i see:

http://www.enlightenment.org/ss/e-607e8dc0f207e3.13780197.png

so 0.1-0.2% cpu (this is on my old i5-4200u laptop) to have it tick over every second. it will tick that timer every second even when the screen is off but it'll not render anything (e freezes the compositor renderer) so the object will logically change state but not produce any output until it's unfrozen when screen un-blanks. i'm sure i could probably save a bit of power and freeze the clock object when screen blanks. i guess i don't have seconds on because it's a level of accuracy i don't need and it takes space, and so haven't noticed, but even with seconds on ... i see a tiny amount of cpu which is about right. what is the cpu used now with seconds on and screen off, clock in shelf?
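The "6 frames every 60" figure mentioned above is just the transition length multiplied by the frame rate. A tiny worked version follows, assuming a 60 fps animator (the frame rate is implied by the wording, not stated) and using helper names invented for this example.

```python
def frames_in_transition(duration_s, fps):
    """Number of frames rendered during a transition of the given length."""
    return round(duration_s * fps)


def animating_fraction(duration_s, period_s):
    """Fraction of each period during which the animation is running."""
    return duration_s / period_s


# A 0.1 s fade once per second at an assumed 60 fps.
print(frames_in_transition(0.1, 60))
print(animating_fraction(0.1, 1.0))
```

So with seconds displayed the clock animates for about a tenth of each second, and sits idle for the rest, which fits a very small CPU figure rather than a full core.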

7f02c5709298cfe915fe1dc0436d74932a11f6f1 + f98a8e82d2fb4b9a5385dd0403b94e47587fe42a

these above now freeze the clock entirely so it'll be totally idle (other bits of e may wake up to do things but not the clock now - not to tick over seconds/minutes etc.)

still - e should not be using so much cpu just ticking over. as i said - even on an old old laptop (i5-4200U CPU @ 1.60GHz - when blanked e forces it down to 800mhz and doesn't allow it to clock-up so e will use more % cpu than if at top clock speeds because the cpu is running slower to save power) ... 0.1 -> 0.2% cpu according to bpytop (use the same tool as you to stay consistent). i am baffled as to why you see so much cpu % being used. also memory used is massively more. 332m. so if your system is idle - no apps running (rendering stuff) ... what does bpytop say? not with the above commits - right now before you update?. my changes above will make the clock stop ticking entirely while the screen is off.
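The remark about the 800mhz clock can be made concrete with a little arithmetic. The sketch below assumes an idealized model in which a task costs a fixed number of CPU cycles, so the time it takes (and hence the % cpu shown) scales inversely with clock speed. Real workloads deviate from this because memory stalls do not scale with frequency, and the function name is made up for this example.

```python
def percent_at_clock(percent_at_ref, ref_mhz, actual_mhz):
    """Scale a CPU percentage measured at ref_mhz to a different clock speed.

    Assumes the work costs a fixed number of cycles, which ignores memory
    stalls and other effects that do not scale with frequency.
    """
    if actual_mhz <= 0:
        raise ValueError("actual_mhz must be positive")
    return percent_at_ref * ref_mhz / actual_mhz


# Work that shows as 0.1% at 1600 MHz, viewed at 800 MHz.
print(percent_at_clock(0.1, 1600, 800))
```

Under this model halving the clock doubles the reported percentage, which is a factor of two, nowhere near enough to turn a fraction of a percent into a full core.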

No apps running, screen off (without the new commits): Every few seconds bpytop (with or without per-core enabled) shows 100% of a core being used.

that is totally bizarre. when screen is on it does the same even when idle? like 60-100% cpu? can you run perf top again and specifically speed up its sampling rate and look at what it says when e's cpu spikes?

perf top -p `pidof enlightenment` -F 50000 -z -d 1

(man perf-top - 50000 might be too high a sample rate - may have to lower it).
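If 50000 turns out to be too high, one option is to cap the requested rate at the kernel's limit before building the command. The sketch below assumes a Linux system exposing the perf_event_max_sample_rate sysctl under /proc/sys/kernel; the helper names are this example's own, and the flags are the ones from the command above. Depending on version, perf may already lower the rate on its own and print a warning.

```python
def read_max_sample_rate(path="/proc/sys/kernel/perf_event_max_sample_rate"):
    """Return the kernel's current perf sampling ceiling in Hz."""
    with open(path) as f:
        return int(f.read().strip())


def perf_top_argv(pid, requested_hz, max_hz):
    """Build the perf top command line with the frequency capped at max_hz."""
    freq = min(requested_hz, max_hz)
    # -p: target PID, -F: sample frequency, -z: zero history each refresh,
    # -d: seconds between refreshes
    return ["perf", "top", "-p", str(pid), "-F", str(freq), "-z", "-d", "1"]


# Example with a made-up PID and a made-up ceiling of 25000 Hz.
print(" ".join(perf_top_argv(1234, 50000, 25000)))
```

On a real system the ceiling would come from read_max_sample_rate() and the PID from `pidof enlightenment`; the resulting list can be passed to subprocess.run.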

i wonder if this is an nvidia problem? I don't see how it can be as e suspends rendering... ? do you have the option of switching to intel gfx entirely? do you use nvidia or nouveau? perhaps perf is not picking up the cpu usage in the nvidia blob?

When the screen is on and idle, everything is fine. The CPU spikes don't happen. As soon as E blanks, it starts.

I currently use nvidia. I'll be switching to either AMD or Intel (discrete) eventually.

Hmm. Could switch to Intel for the time being, but I'd need to grab a DVI to DisplayPort cable.
That could work for now.


Seems fast enough...

hmm so ONLY when screen is off... please just try intel to see if it goes away. also try software rendering (no re-plugging/cables needed). i'm trying to figure out if this has to do with the nvidia driver blob. right now it SHOULDN'T be rendering anything and that should mean the nvidia blob is not involved while the screen is off. to be sure i'd throw some printf's in src/modules/evas/engines/gl_x11/evas_x_main.c in eng_outbuf_flush() to printf every time a frame is rendered... it should not be rendering, but maybe it is and disabling of rendering is not working? this then might make sense. e tries to render, swap a buffer, and nvidia tries to show the buffer but because no vsync signal is happening on displays it sits in some spinlock trying to swap to a screen that is off or to query the buffer state of a new backbuffer before rendering to know how to partial-render. that would still be an nvidia blob bug - it should know screen is off and just insta-swap without vsync.

now my current theory is based on assuming it is rendering otherwise nvidia driver would have no effect. so ... switching to intel will help and software rendering will stop using the blob to render anything. anyway - a printf in that function will trigger every time a frame is rendered and swapped to screen. you should see silence (no output) when the screen is off. this code is not driver-dependent so it's happening everywhere (evas_x_main.c).

fyi i just threw in a printf here and i get no renders happening on my old test laptop here. so .. as best i can see the "freeze rendering" works ... so what i see says it can't be the nvidia blob, but it's the only thing that makes sense right now. perhaps the nvidia blob has some thread in the background doing something?

my other theory is it may be the vsync animator thread? if you set:

export ECORE_NO_VSYNC=1

before e runs... it'll disable vsync animators and use good-old cpu-side system clock timers for animating.

Alright, those two commits solve that problem.

Thanks, Raster.

abyomi0 closed this task as Resolved. Tue, Apr 20, 9:50 PM
abyomi0 claimed this task.
abyomi0 removed abyomi0 as the assignee of this task.

well they don't REALLY solve it .. they hide it by being even more aggressive at not doing anything when blanked... :) but the massive cpu % being used is still the problem... it should not be using that much.