Page MenuHomePhabricator

Speedup finding new line, prevent readind invalid memory
ClosedPublic

Authored by ali.alzyod on Mar 28 2019, 5:45 AM.

Details

Summary

1- Speed up detecting new lines.

if (!strncmp(text, "<br", 3) || !strncmp(text, "<ps", 3))

This will cause 6 comparisons (if one of conditions did not meet), or at least 3 comparisons.

this is changed to

if (!strncmp(text, "<", 1))

2- Speedup detecting lines

If this condition is true, we should increment the string for next iteration 3 times, not just one

if (!strncmp(text, "<br", 3) || !strncmp(text, "<ps", 3))

if '<' founded then 'pr' or 'br', we will skip 3 characters for next iteration.

if (!strncmp(text, "<", 1))
      {
         text++;
         len--;
         if (!strncmp(text, "br", 2) || !strncmp(text, "ps", 2))
         {
            text += 2;
            len -= 2;

3- Prevent reading invalid memory out of the string

if (text[3] == '>' || ((text[3] == '/') && (text[4] == '>')))

string could reach last char in string (original string ends with "<br")

but now we will check if remaining string length allow comparison :

if (text[0] == '>' || (len > 1 && ((text[0] == '/') && (text[1] == '>'))))
Test Plan
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

static int

oldFunc(const char *text)
{
    if (!text)
        return 0;

    while (*text)
    {
        if (!strncmp(text, "<br", 3) || !strncmp(text, "<ps", 3))
        {
            if (text[3] == '>' || ((text[3] == '/') && (text[4] == '>')))
            {
                return 1;
            }
        }
        text++;
    }
    return 0;
}

static int
newFunc(const char *text)
{
    if (!text)
        return 0;
    char *pTemp = (char *)text;

    while (pTemp = strchr(pTemp, '<'))
    {
        pTemp++;
        if (!strncmp(pTemp, "br", 2) || !strncmp(pTemp, "ps", 2))
        {
            pTemp += 2;
            if (pTemp[0] != '\0' && (pTemp[0] == '>' || (pTemp[0] == '/' && pTemp[1] == '>')))
            {
                return 1;
            }
        }
    }

    return 0;
}

int main()
{

    int counter = 1000;
    srand(time(NULL));
    char pStr[50001] = {0};
    char AllChars[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789<>";

    int AllCharsLen = strlen(AllChars);
    for (int i = 0; i < 50000; i++)
        pStr[i] = AllChars[rand() % AllCharsLen];

    clock_t start, end;
    double total_Time1 = 0;
    int i;

    for (int j = 0; j < 3; j++)
    {
        if (j == 0)
        {
            printf("random String\n");
        }
        else if (j == 1)
        {
            printf("With Random <br/>\n");
            int location = rand()%(5000 - 5);
            pStr[location++] = '<';
            pStr[location++] = 'b';
            pStr[location++] = 'r';
            pStr[location++] = '/';
            pStr[location++] = '>';
        }
        else if (j == 2)
        {
            printf("With Random <ps>\n");
            int location = rand()%(5000 - 4);
            pStr[location++] = '<';
            pStr[location++] = 'p';
            pStr[location++] = 's';
            pStr[location++] = '>';
        }

        start = clock();
        for (i = 0; i < counter; i++)
            oldFunc(pStr);
        end = clock();
        total_Time1 = ((double)(end - start)) / CLOCKS_PER_SEC;
        printf("original = %f has new Line = %i\n", total_Time1, oldFunc(pStr));

        start = clock();
        for (i = 0; i < counter; i++)
            newFunc(pStr);
        end = clock();
        total_Time1 = ((double)(end - start)) / CLOCKS_PER_SEC;
        printf("modified = %f has new line = %i\n\n", total_Time1, newFunc(pStr));
    }
}

output:

random String
original = 2.523000 has new Line = 0
modified = 0.090000 has new line = 0

With Random <br/>
original = 0.081000 has new Line = 1
modified = 0.003000 has new line = 1

With Random <ps>
original = 0.016000 has new Line = 1
modified = 0.001000 has new line = 1

Diff Detail

Repository
rEFL core/efl
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
ali.alzyod created this revision.Mar 28 2019, 5:45 AM

It seems that this patch has no reviewers specified. If you are unsure who can review your patch, please check this wiki page and see if anyone can be added: https://phab.enlightenment.org/w/maintainers_reviewers/

ali.alzyod requested review of this revision.Mar 28 2019, 5:45 AM
This comment was removed by ali.alzyod.
ali.alzyod edited the summary of this revision. (Show Details)Mar 28 2019, 5:58 AM
zmike requested changes to this revision.Mar 28 2019, 6:29 AM
zmike added a subscriber: zmike.

Seems like this could be simplified a bit? And if you're looking to optimize this kind of thing then probably you'd also want to improve performance of _entry_remove_new_line since this function does a full copy of the string and then 4 full iterations over it?

src/lib/elementary/elm_entry.c
2873

Why not just loop using strchr(text, '<') if you're going to make this kind of change?

This revision now requires changes to proceed.Mar 28 2019, 6:29 AM
In D8497#153941, @zmike wrote:

Seems like this could be simplified a bit? And if you're looking to optimize this kind of thing then probably you'd also want to improve performance of _entry_remove_new_line since this function does a full copy of the string and then 4 full iterations over it?

I think strchr will make it more complex, since we want to know the length of the string, and we will have to create more variables to collect the values.

(I need the length for this part ` if (text[3] == '>' || ((text[3] == '/') && (text[4] == '>')))` so we will not allow reading memory out of the string

Also, any performance-improving patch should be accompanied of numbers (and how to obtain them). Otherwise, how do you know if you're actually speeding up anything?

Also, any performance-improving patch should be accompanied of numbers (and how to obtain them). Otherwise, how do you know if you're actually speeding up anything?

Sure, I am still writing on this patch.

zmike added a comment.Mar 28 2019, 7:08 AM
In D8497#153941, @zmike wrote:

Seems like this could be simplified a bit? And if you're looking to optimize this kind of thing then probably you'd also want to improve performance of _entry_remove_new_line since this function does a full copy of the string and then 4 full iterations over it?

I think strchr will make it more complex, since we want to know the length of the string, and we will have to create more variables to collect the values.

(I need the length for this part ` if (text[3] == '>' || ((text[3] == '/') && (text[4] == '>')))` so we will not allow reading memory out of the string

If this is a string, then it must be terminated by a \0. Thus, if text[0] is not \0 (as it must not be in order for it to be '/'), text[1] can safely be examined.

ali.alzyod edited the test plan for this revision. (Show Details)Mar 28 2019, 7:19 AM
ali.alzyod edited the test plan for this revision. (Show Details)Mar 28 2019, 7:21 AM
ali.alzyod updated this revision to Diff 21054.Mar 28 2019, 7:44 AM

update code to use strchr

ali.alzyod edited the test plan for this revision. (Show Details)Mar 28 2019, 7:46 AM
ali.alzyod added reviewers: woohyun, bowonryu.
ali.alzyod added inline comments.
src/lib/elementary/elm_entry.c
2873

Done

zmike added a comment.Mar 28 2019, 8:19 AM

Your test case should probably use a string that has some real <br> and <ps/> tags?

ali.alzyod edited the test plan for this revision. (Show Details)Mar 28 2019, 9:40 AM
In D8497#153974, @zmike wrote:

Your test case should probably use a string that has some real <br> and <ps/> tags?

updated

Hi, i can see that this giantly impoves the time spent in this function. So this is fine from my side. However, lease note that if you want to fix performance problems in EFL that such tiny functions will not matter that much in total. As @zmike pointed out, there is a other function which is a bit more wastefull in this regard.

Further more, running entry examples with perf for roughly a second does not even bring this function up for me. So yes this is a performance improvement of this function, but no, it will not matter much for the overall performance of EFL. I don't know if the overall performance of EFL was your goal.

ali.alzyod added a comment.EditedMar 29 2019, 5:05 AM

Hi, i can see that this giantly impoves the time spent in this function. So this is fine from my side. However, lease note that if you want to fix performance problems in EFL that such tiny functions will not matter that much in total. As @zmike pointed out, there is a other function which is a bit more wastefull in this regard.

Further more, running entry examples with perf for roughly a second does not even bring this function up for me. So yes this is a performance improvement of this function, but no, it will not matter much for the overall performance of EFL. I don't know if the overall performance of EFL was your goal.

@bu5hm4n yes It is a small performance enhancement, also it prevent reading invalid memory :)

for the other function I will check if we can do something there.

I think all the single line mode should be working in different way,(may be I am wrong), I think we should only disable multiline support in the underline textblock instead of changing the content.

zmike accepted this revision.Mar 29 2019, 6:52 AM

Thanks for the time you've spent on this, looks good to me!

This revision is now accepted and ready to land.Mar 29 2019, 6:52 AM
This revision was automatically updated to reflect the committed changes.
vtorri added a subscriber: vtorri.Mar 29 2019, 7:46 AM

can't strncmp(pTemp, "br", 2) be replaced with 2 tests (same for "ps") ?

pTemp[0] == 'b' && pTemp[1]=='r'

?

@vtorri I will check if it will effect performance.