Page MenuHomePhabricator

efl: reduce PI calculations
Needs ReviewPublic

Authored by ali.alzyod on Jun 8 2020, 9:22 AM.

Details

Test Plan
//gcc main.c -O3 -o app
#include<math.h>
#include<time.h>
#include<stdio.h>
#include<stdlib.h>

long long num = 0;
void nothing(double a)
{
  if( a > 10)
  num++;
}

int main()
{
  double arr[100000] = {0};

  for(int i = 0 ; i < 100000; i++){
    arr[i] = rand()%1000;
  }

  double tmp = 0;
  clock_t start , end;
  double time_used = 0;
  int i,j;

  start = clock();
  for(i = 0 ; i < 100000 ; i++)
  {
    nothing(arr[i] / (M_PI / 2));
  }
  end = clock();
  time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
  printf("time before = %f\n", time_used);

  start = clock();
  for(i = 0 ; i < 100000 ; i++)
  {
    nothing(arr[i] * M_2_PI);
  }
  end = clock();
  time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
  printf("time after  = %f\n", time_used);

  printf("force processing number = %lli\n", num);
}

output:
time before = 0.001408
time after = 0.000218

Diff Detail

Repository
rEFL core/efl
Branch
arcpatch-D11947
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 17186
Build 11455: arc lint + arc unit
ali.alzyod created this revision.Jun 8 2020, 9:22 AM

It seems that this patch has no reviewers specified. If you are unsure who can review your patch, please check this wiki page and see if anyone can be added: https://phab.enlightenment.org/w/maintainers_reviewers/

ali.alzyod requested review of this revision.Jun 8 2020, 9:22 AM

I dont mind if we land this or not. But this is not really impacting anything, const folding covered the cases i checked, and if not, then the static double assignment will fail...

src/lib/elput/elput_evdev.c
1523

Just as a rule of thumb, that also applies to the stuff above: whenever the compiler can calculate a value during compile time. Then the compiler can also do constant folding. So this is more like hand applying compiler optimizations, not really improving the situation afaics.

vtorri added a subscriber: vtorri.Jun 8 2020, 1:26 PM
vtorri added inline comments.
src/bin/elementary/test_flip_page.c
416

why not multiply by M_2_PI instead ? division is slower than multiplication

src/bin/elementary/test_flip_page_eo.c
392

idem

src/lib/elementary/efl_ui_flip.c
566

idem

src/lib/elementary/elm_gengrid.c
1725–1726

i am wondering why we multiply by 1 here, below and in other files

ali.alzyod added a comment.EditedJun 8 2020, 11:04 PM

I dont mind if we land this or not. But this is not really impacting anything, const folding covered the cases i checked, and if not, then the static double assignment will fail...

it will impact the speed for most cases because M_PI is multipled with variables.

but of-course not super speed booster, maybe no one will detect speed up :), but it is good to have this enhancement anyway

ali.alzyod added inline comments.Jun 8 2020, 11:11 PM
src/lib/elput/elput_evdev.c
1523

I think I changed all M_PI values without looking deep into them, you are right about this case, no real time speed up will happen here (at least with good compiler like gcc)

ali.alzyod edited the test plan for this revision. (Show Details)Jun 8 2020, 11:25 PM
ali.alzyod marked 3 inline comments as done.Jun 8 2020, 11:40 PM
ali.alzyod updated this revision to Diff 30576.Jun 8 2020, 11:40 PM

change 1/M_PI_2 to M_2_PI

ali.alzyod edited the test plan for this revision. (Show Details)Jun 8 2020, 11:44 PM
ali.alzyod edited the test plan for this revision. (Show Details)Jun 9 2020, 12:00 AM
raster requested changes to this revision.Jun 9 2020, 2:24 AM
raster added a subscriber: raster.

just saying... -O0 is not a way to say "oh i've made it faster at O0". there is a line between more readable and faster code. your code changes here are NO faster as the default is -O1 anyway so the compiler converts these to constants at compile time UNLESS you use -O0 ... which is just invalid to consider a case to speed up.

so my point is - no speedup at all here. does this patch make the code less readable or not? this is very subjective. my argument would be it's a bit less readable, which means - no.

but let me finish with... please do look at optimizing, but don't waste time on speeding up -O0 as it is a waste of time. look at cases that speed up -O2 and -O3 etc. so.. focus on real places to improve. :) profile. understand the flow and pipeline. find problems there that are real... :)

This revision now requires changes to proceed.Jun 9 2020, 2:24 AM

Additionally to what @raster said. Take objdump -a and check out the produced assembler when compiling something with -O3 you will see, that the two loops boil down to absolutely the same asm. Anyways, additionally, even with -O0 the speedup is so little that the bias is higher than the save up, which is more showing that this might save *something* but surely not enough to have an impact.

I also see that we are in the same situation here than in the past. The benchmark here is not really representing the usecase of efl.

Additionally, check what compilers are capable of doing and what not. Simply check out https://en.wikipedia.org/wiki/Constant_folding which is automatically doing what got done here.

If we are talking about GCC and its optimization, then the whole thing does not give much enhancement.

The impacts are so small: yes
Only valid in -O0 : yes

but again not accepting it because it is not super speed boost does not make sense to me, Almost 99% of the time I build efl with -O0.

ali.alzyod requested review of this revision.Aug 4 2020, 1:22 AM
ali.alzyod edited the test plan for this revision. (Show Details)

I updated testing sample code with -O3 optimization to show the speed up.

vtorri added inline comments.Aug 4 2020, 1:41 AM
src/examples/evas/evas-multi-touch.c
36

float insteadof double ? It seems M_PI_1_180 is used only with float ?

see remark in my other comment below

130–132

2 thoughts :

  • float rx = -atan2(vheight,vy * M_PI_1_180 + 90.0f;
  • atan2f instead of atan2 (with cast of M_PI_2 to float if my computation above is not what you want) ?

same with cosf and sinf

remark: there will be maybe a lost in precision and a gain in speed