Page MenuHomePhabricator

[Evas GL Thread 2] evas: (GL thread) Added GL thread renderer logics based of SW thread renderer
Needs ReviewPublic

Authored by haegeun on Dec 28 2016, 10:49 PM.

Details

Summary

Depends on D4525

  • Modified existing thread logic to manage multiple worker thread
  • Added functions for thread enqueue, thread-id query
  • Added a new threading mode 'FINISH' for waiting main-thread until command processed

Change-Id: I6cf0228a4adb04ea776e1c41936ab327ec06730e

Diff Detail

Repository
rEFL core/efl
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 3317
Build 3382: arc lint + arc unit
haegeun updated this revision to Diff 10374.Dec 28 2016, 10:49 PM
haegeun retitled this revision from to [Evas GL Thread 2] evas: (GL thread) Added GL thread renderer logics based of SW thread renderer.
haegeun updated this object.
haegeun edited the test plan for this revision. (Show Details)
jpeg requested changes to this revision.Jan 3 2017, 8:10 PM
jpeg edited edge metadata.

I haven't checked everything. But the mechanism for FINISH looks wrong to me. The locking around the waiting_cmd variable is inconsistent and will eventually not work as expected. I believe finish should simply be a flag on the command, not a property of the thread. The main loop then waits until the thread has completed all its commands (empty queue).

src/lib/evas/common/evas_thread_render.c
125

waiting_cmd is written with queue_lock held but read here with finish_lock...

208

This is a race condition waiting to happen! waiting_cmd is set under the lock queue_lock but read here without the lock. And then written again with another lock (finish_lock). Something is fishy with the locking here.

210

Well the thread better finish successfully otherwise we deadlock here.

282

needs NULL safety (for paranoia)

333

That Eevas- name is ugly :)

338

Shouldn't those threads be created once an engine is created? So we only create a thread for GL if we indeed use the GL engine? Same applies to SW (unless GL engines somehow require the SW async thread, for whatever reason).

This revision now requires changes to proceed.Jan 3 2017, 8:10 PM
haegeun updated this revision to Diff 10427.Jan 4 2017, 12:27 AM
haegeun edited edge metadata.
haegeun marked an inline comment as done.
  1. Changed thread names Eevas -> Evas
  2. Moved thread creation from evas_init() to initializing (sw or gl) backend
haegeun marked 2 inline comments as done.Jan 4 2017, 12:31 AM
haegeun updated this revision to Diff 10429.Jan 4 2017, 4:12 AM
haegeun edited edge metadata.

Refactored finish waiting mechanism (with condition)

  • finish flag is added inside each commmand
  • removed waiting_cmd
  • tuned lock/cond section
haegeun marked 2 inline comments as done.Jan 4 2017, 4:14 AM
haegeun marked an inline comment as done.Jan 4 2017, 8:40 PM
haegeun updated this revision to Diff 10440.Jan 5 2017, 12:59 AM

Fix undefined symbol at runtime

Please do not submit it until we tested and verified it :) just for review

jpeg requested changes to this revision.EditedJan 8 2017, 10:27 PM
jpeg edited edge metadata.

It's better than the previous patch...

But there is still a problem with the finish mechanism.

@cedric it looks like we just leak anything we pass in the commands? i.e. we don't destroy the commands unless we actually run them. I'm quite dubious about the queue cache, I don't quite get it :)

src/lib/evas/common/evas_thread_render.c
124

You can't simply wait on a condition without looping and checking that a variable has changed. Indeed, spurious wakeups may happen, eg. if a signal was received by the thread, or for a bunch of other similar reasons. This means that the return value of wait() needs to be checked, a variable needs to be read after the condition wait, and loop again in case there was no error but the variable hasn't changed (eg. EINTR wouldn't be an error in this situation, it's just that the thread was interrupted by a signal). Also it's not clear here that the command that triggered the finish condition to wake up is the same we're actually waiting for.

I wonder if a semaphore (on the command itself) would be more appropriate here? Then no need for a lock. Just append and wait on the semaphore. By changing cmd->waiting into a malloc'ed semaphore you wouldn't use more memory (due to struct padding & alignment) and would be able to get rid of finish_condition. @cedric thoughts?

169

@cedric what's the point of this queue cache? I don't get it

210
  1. No need to take the lock if the cmd is not a FINISH command. if (cmd->waiting) does not require any lock as the flag is part of the command struct, so not a shared variable. lock take should be inside the if().
  2. You can't signal a condition without also changing a variable.
235

const char *

This revision now requires changes to proceed.Jan 8 2017, 10:27 PM
haegeun updated this revision to Diff 10468.Jan 9 2017, 3:49 AM
haegeun edited edge metadata.

Clear waiting variable

(Do not submit, Only for review)

haegeun updated this revision to Diff 10469.Jan 9 2017, 3:58 AM
haegeun marked an inline comment as done.
haegeun edited edge metadata.

Remove unused variable - finish lock

jpeg added a comment.Jan 9 2017, 4:11 AM

Seems better. (I was at first surprised by the move to ev_thread but I think it makes sense this way)

haegeun updated this revision to Diff 10517.Jan 12 2017, 8:30 PM
  • Stabilized via test
  • Fixed build warning
haegeun updated this revision to Diff 10518.Jan 12 2017, 8:35 PM
  • Revert migration codes
raster added inline comments.Jan 23 2017, 10:53 PM
src/lib/evas/common/evas_thread_render.c
101–102

why not just use eina_threadqueue that does all the correct sleeping, waking, zero-copy queue etc. instead of manually doing conditions where you have to remember to lock the condition before i think either waiting or signalling it (i don't remember) ... a semaphore and thus a nice known queue of N things to read off the queue...

or you just don't want to redo this code here? because if you want a 2 way channel... 2 threadqueues would do the job just fine... :)

238–239

if you used threadqueue, you wouldn't need this. every command is a block of memory with a thread queue msg header plus payload - it can be any size. ... it's simpler... :) no need for the cache

239–240

all of this (lock, condition) can be removed if you just use a threadqueue :)

haegeun added inline comments.Jan 24 2017, 12:05 AM
src/lib/evas/common/evas_thread_render.c
101–102

I think eina_thread_queue may be useful and I can try :)
I agree about functionality and replaceability, but I'm not sure if it has no problem with performance (which one is faster in heavy situation? or side effects?)

Maybe @cedric(original author) have a opinion for this?
Actually, I was focused modding for gl-threading and refactoring itself as my way was careful :)

raster edited edge metadata.Jan 25 2017, 7:26 PM

if there is a performance issue with threadqueue - we'll fix it there and everyone wins. :) the only thing i can think of that would be maybe needed is to tune the thread_queue block pool (it already has a block pool of spare message blocks to re-use again and again and again). maybe improve finding of a fitting block. right now all blocks are 4kb (minus header sizes) and messages added to a queue are put into a block until its full then another block is appended, so every 4kb or a bit under worth of messages a new block needs to be found (or allocated). threadqueue will keep about 20 spare blocks around in a pool to share between thread_queues.. that can be tuned/expanded and so on...

haegeun updated this revision to Diff 10669.Feb 2 2017, 1:05 AM

Refactored thread renderer to use eina_thread_queue

raster requested changes to this revision.Feb 5 2017, 11:09 PM

really mostly there just - replies as per inline comments. :)

src/lib/evas/common/evas_thread_render.c
73

wait... shouldn't this be getting the reply? like an int, bool, whatever value? this has no "value" in a reply. just something that syncs and waits with no reply VALUE fetched... ? wait... or are you replying on the cb to write the return somewhere?

211

here too ... shouldnt this be SENDING the reply in the message struct? this would be far far far cleaner to do it this way... in fact please send replied in the message queue - if needed use a union and a type field or whatever to transport it back... but have replies in the thread queue for replies.

This revision now requires changes to proceed.Feb 5 2017, 11:09 PM
haegeun updated this revision to Diff 10685.Feb 5 2017, 11:42 PM
haegeun edited edge metadata.
haegeun marked 2 inline comments as done.

Added a confirm variable to finish reply :)

haegeun marked 2 inline comments as done.Feb 5 2017, 11:44 PM

And please comment about that :) (about naming... or assertion...)

haegeun updated this revision to Diff 10689.Feb 6 2017, 10:26 PM
This comment was removed by haegeun.
raster added a comment.Feb 7 2017, 1:22 AM

hmm the finish reply has the cmd. what i was talking about is the value. eg the EGLBoolean returned by eglSwapBuffers() or EGLBoolean returned by eglQuerySurface() or GLuint returned by glCreateProgram() or glCreateShader() or GLint returned by glGetAttribLocation() or GLenum returned by glGetError() or const GLubyte * returned by glGetString() and so on... the return values of a gl api call... shouldnt the return value be in the trhead_queue message reply?

hmm the finish reply has the cmd. what i was talking about is the value. eg the EGLBoolean returned by eglSwapBuffers() or EGLBoolean returned by eglQuerySurface() or GLuint returned by glCreateProgram() or glCreateShader() or GLint returned by glGetAttribLocation() or GLenum returned by glGetError() or const GLubyte * returned by glGetString() and so on... the return values of a gl api call... shouldnt the return value be in the trhead_queue message reply?

This was already implemented at return_value of the passing structure, For an example

typedef struct
{
   EGLBoolean return_value;
   EGLDisplay dpy;
   EGLSurface surface;
} Evas_Thread_Command_eglSwapBuffers;

And the value was assigned in worker callback function itself

static void
_gl_thread_eglSwapBuffers(void *data)
{
   Evas_Thread_Command_eglSwapBuffers *thread_param = data;
   thread_param->return_value = eglSwapBuffers(thread_param->dpy, thread_param->surface);
}

And it was returned after the end of thread invoking

EGLBoolean
evas_eglSwapBuffers_th(EGLDisplay dpy, EGLSurface surface)
{
   Evas_Thread_Command_eglSwapBuffers thread_data;

   if (!evas_gl_thread_enabled())
     return eglSwapBuffers(dpy, surface);

   thread_data.dpy = dpy;
   thread_data.surface = surface;

   evas_gl_thread_cmd_enqueue(EVAS_GL_THREAD_TYPE_GL,
                              _gl_thread_eglSwapBuffers,
                              &thread_data,
                              EVAS_GL_THREAD_MODE_FINISH);

///////////// AFTER THREAD INVOKE (MAIN-THREAD WAKE-UP) /////////////////
   return thread_data.return_value;
}

Maybe something infrastructure can be created to support 'async-return'. :)
But I think evas_thread_renderer is very low-level, infrastrcutre of evas lib.
(every gl-engine developer may not know how evas_thread_renderer works,
and they want just use suitable evas_glXXX_th() functions for thread rendering)

So instead of evas_eglSwapBuffers_th() (sync version),
evas_eglSwapBuffers_th_async() and evas_eglSwapBuffers_th_async_done() can be implemented for engine developers.
for an example

evas_eglSwapBuffers_th_async(dpy, surface);
...
Some jobs
...
EGLBoolean ret = evas_eglSwapBuffers_th_async_done();
check ret

I think it is enough to implement using Evas_Thread_Command_eglSwapBuffers structure with dynamic allocation.
And it is more high-level and will be more portable for various cases. in my opinion :)


Firstly, I simply thought that finish thread queue is needed for just waiting purposes, so I minimized passing values.
I'm not sure there are some internal trouble without 'payloads' of Eina_Thread_Queue_Msg. (however it works,,, but now i agree it must be exist)

When passing a value, cb(callback function ptr) and data(parameter structure ptr including return_value) was considered,
And I choose cb. but I think no need to verification. Both are not needed actually :)
(data is already known to evas_XXXX_th() functions)

raster requested changes to this revision.Feb 21 2017, 1:45 AM

yes yes - you pass in the struct to fill with the return value... what i'm saying is return value should probably be delivered VIA the thread queue. Evas_Thread_Command_eglSwapBuffers struct should actually be a thread queue message with the thread queue message header and ONLY have the params, then another thread queue message with the reply in it. the thread queue itself would then hold all the parameters. this is necessary in general to be able to "send the message and forget" for api's that have NO reply value. the message content stays around until the other end of the thread queue consumes it in a wait. the calling function that sends the request can return immediately when it says its done with the thread queue message. if the re is a RETURN value it should come back via the thread queue message payload too. right now you are not using the thread queue message payload for this parameter and return value data. you should. this would ALLOW some calls to become async "over time". for example eglquerysurface when we query for buffer age... that SPECIFICALLY could NOT WAIT for a reply. it can return immediately... but before the next frame renders it can WAIT for the reply from the last buffer age surface query. you do a buffer age query IMMEDIATLEY after a swapbuffer but do not wait for the reply. pick it up next "start of frame". this would remove a "forced round trip" for buffer age. remove the boolean return value from swwapbuffer too and 1 less forces round-trip. remove enough of these and we will get decent performance.

but ALL of this depends on using thread queue properly and that is placing your function params IN the thread queue message payload AND reply values in the message payload in the reply. :) at least do it this way so it can be extended nicely in future like above.

This revision now requires changes to proceed.Feb 21 2017, 1:45 AM
JoogabYun added inline comments.
src/lib/evas/common/evas_thread_render.c
207

Doesen't it need an eina_thread_queue_wait_done?
The memory continuously increase if there is no eina_thread_queue_wait_done()

haegeun marked an inline comment as done.Mar 1 2017, 10:08 PM
haegeun added inline comments.
src/lib/evas/common/evas_thread_render.c
207

Yes :) I will attach a blueprint code with fix it. currently we are testing

haegeun updated this revision to Diff 10821.Mar 1 2017, 10:14 PM
haegeun edited edge metadata.
haegeun marked an inline comment as done.

Apply async finish (we are not yet tested)

haegeun updated this revision to Diff 10824.Mar 1 2017, 11:00 PM

We have tested about async finish and now upload it

An example for ASYNC FINISH

<before (FINISH)>

if (!evas_eglQuerySurface_th(ob->egl_disp, ob->egl_surface, EGL_BUFFER_AGE_EXT, &age))
   ERR("QuerySurface Failed!");

<after (ASYNC FINISH)>

void *querysurface_ref = evas_eglQuerySurfaceASYNC_begin_th(ob->egl_disp, ob->egl_surface, EGL_BUFFER_AGE_EXT, &age);

(immediately returned with ref pointer)
(do some work)

if (evas_eglQuertSurfaceASYNC_end_th(querysurface_ref))
   ERR("QuerySurface Failed!");

Internally it can be implemented using evas_gl_thread_cmd_wait() functions in evas_gl_thread_XXX.c/h
Here is an example

typedef struct
{
   EGLBoolean return_value;
   Eina_Bool finished;

   EGLDisplay dpy;
   EGLSurface surface;
   EGLint attribute;
   EGLint *value;
} Evas_Thread_Command_eglQuerySurfaceASYNC;

static void
_gl_thread_eglQuerySurfaceASYNC(void *data)
{
   Evas_Thread_Command_eglQuerySurfaceASYNC *thread_data = data;

   thread_data->return_value = eglQuerySurface(thread_data->dpy,
                                               thread_data->surface,
                                               thread_data->attribute,
                                               thread_data->value);
   thread_data->finished = EINA_TRUE;
}

void *
evas_eglQuerySurfaceASYNC_begin_th(EGLDisplay dpy, EGLSurface surface, EGLint attribute, EGLint *value)
{
   Evas_Thread_Command_eglQuerySurfaceASYNC *thread_data = NULL;
   thread_data = eina_mempool_malloc(_mp_command,
                                     sizeof(Evas_Thread_Command_eglQuerySurfaceASYNC));
   if (!thread_data)
      return NULL;

   if (!evas_gl_thread_enabled())
     {
        thread_data->return_value = eglQuerySurface(dpy, surface, attribute, value);
        thread_data->finished = EINA_TRUE;
        return thread_data;
     }

   thread_data->dpy = dpy;
   thread_data->surface = surface;
   thread_data->attribute = attribute;
   thread_data->value = value;
   thread_data->finished = EINA_FALSE;

   evas_gl_thread_cmd_enqueue(EVAS_GL_THREAD_TYPE_GL,
                              _gl_thread_eglQuerySurfaceASYNC,
                              &thread_data,
                              EVAS_GL_THREAD_MODE_ASYNC_FINISH);

   return thread_data;
}

EGLBoolean
evas_eglQuerySurfaceASYNC_end_th(void *ref)
{
   Evas_Thread_Command_eglQuerySurfaceASYNC *thread_data = (Evas_Thread_Command_eglQuerySurfaceASYNC *)ref;

   if (!thread_data->finished)
      evas_gl_thread_cmd_wait(EVAS_GL_THREAD_TYPE_GL, thread_data, &thread_data->finished);

   EGLBoolean return_value = thread_data->return_value;
   eina_mempool_free(_mp_command, thread_data);

   return return_value;
}

it monitors 'finished' variable until it becomes true.
Any feedbacks(void pointer ref...)? :)

jpeg added a comment.Mar 2 2017, 3:31 AM

Off-topic comment:
I think a general solution to synchronizing async render (even SW) with the main loop may be required. See T5225 for a reason why (image save on a snapshot object).

raster accepted this revision.Apr 3 2017, 2:58 AM

:) ok. we can take it from here. thanks!