
evas_textblock: rainbow flag emoji treated as two clusters(update unibreak to version 4.2)
ClosedPublic

Authored by AbdullehGhujeh on Apr 21 2020, 4:32 AM.

Details

Summary

If the text contains the rainbow flag emoji (🏳️‍🌈), the mouse or keyboard can move the cursor inside it, because we split it into two clusters: we break before U+1F308.

This is wrong: an emoji should be treated as a single cluster, per the rule in the Unicode segmentation standard, “Do not break within emoji modifier sequences or emoji ZWJ sequences” (https://unicode.org/reports/tr29/#GB11).

This happens because we don’t give U+1F308 its correct grapheme break property value. I think this is a bug in the unibreak library: U+1F308 should have a word break class value of Glue_After_ZWJ (based on https://www.unicode.org/reports/tr29/tr29-31.html#Glue_After_Zwj_WB and http://unicode.org/Public/emoji/5.0/emoji-zwj-sequences.txt), which would prevent the break and give us a single cluster.

I noticed that the unibreak lib currently used in EFL seems to implement Unicode 9 (the latest is Unicode 13), which uses grapheme break properties that are now obsolete and unused, such as E_Modifier and Glue_After_ZWJ. So if a new emoji is introduced (the rainbow flag was introduced after Unicode 9) and, under Unicode 9, it should use the E_Modifier or Glue_After_ZWJ property, we will have issues with it.

So I have updated the unibreak lib to the latest released version of unibreak (4.2), which implements Unicode 12.

I needed to remove BREAK_AFTER(i) to pass the tests in D1140, as spaces do not break with the latest update (also related to T995).

This should fix T8665 and T8688.

Diff Detail

Repository
rEFL core/efl
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
AbdullehGhujeh created this revision on Apr 21 2020, 4:32 AM.

It seems that this patch has no reviewers specified. If you are unsure who can review your patch, please check this wiki page and see if anyone can be added: https://phab.enlightenment.org/w/maintainers_reviewers/

AbdullehGhujeh requested review of this revision on Apr 21 2020, 4:32 AM.
AbdullehGhujeh edited the summary of this revision on Apr 21 2020, 5:20 AM.
AbdullehGhujeh edited the summary of this revision on Apr 21 2020, 5:23 AM.
segfaultxavi added a subscriber, segfaultxavi, on Apr 21 2020, 6:17 AM (edited).

I'm curious, is this related to T8663? The problem description looks very similar.
My apologies, now I see this is a subtask of that one.

AbdullehGhujeh edited the summary of this revision on Apr 21 2020, 6:24 AM.
ali.alzyod edited the summary of this revision on Apr 24 2020, 3:12 AM.

I didn't update the whole library, as that would cause issues with word breaking and we would need to update our EFL code (for example, updating the word break code changes the way we treat spaces, so the tests added in https://phab.enlightenment.org/D1140 would fail).

Things have improved, but I think there are still issues with word movement that should be fixed. Can you create a subtask for this patch, so we can work on fixing the D1140 failures?

For example:

#include <Elementary.h>


EAPI_MAIN int
elm_main(int argc EINA_UNUSED, char **argv EINA_UNUSED)
{
   Evas_Object *win;

   elm_policy_set(ELM_POLICY_QUIT, ELM_POLICY_QUIT_LAST_WINDOW_CLOSED);
   win = elm_win_util_standard_add("emoji-test", "emoji-test");
   elm_win_autodel_set(win, EINA_TRUE);

   evas_object_resize(win, 320, 320);

   Evas_Object *box;
   box = elm_box_add(win);
   evas_object_size_hint_weight_set(box, EVAS_HINT_EXPAND, EVAS_HINT_EXPAND);
   evas_object_size_hint_align_set(box, EVAS_HINT_FILL, EVAS_HINT_FILL);
   elm_win_resize_object_add(win, box);
 
   Evas_Object *entry;
   entry = elm_entry_add(box);

   elm_entry_text_style_user_push(entry, "DEFAULT='font_size=40'");
   elm_entry_entry_set(entry, "&#x1f3f3;&#xfe0f;&#x200d;&#x1f308;");
   elm_entry_cursor_end_set(entry);
   Evas_Textblock_Cursor *cur = evas_object_textblock_cursor_get(elm_entry_textblock_get(entry));
   /* THIS IS THE PROBLEM */
   evas_textblock_cursor_word_start(cur);

   evas_object_size_hint_weight_set(entry, EVAS_HINT_EXPAND, EVAS_HINT_EXPAND);
   evas_object_size_hint_align_set(entry, EVAS_HINT_FILL, EVAS_HINT_FILL);

   elm_box_pack_end(box, entry);

   evas_object_show(entry);
   evas_object_show(box);

   evas_object_show(win);

   elm_run();
   return 0;
}
ELM_MAIN()

This reproduces the same problem.

Fully updated the unibreak lib and removed unneeded code.

AbdullehGhujeh edited the summary of this revision on May 19 2020, 2:47 AM.
AbdullehGhujeh retitled this revision from Textblock : rainbow flag emoji treated as two clusters(partially update unibreak) to Textblock : rainbow flag emoji treated as two clusters(update unibreak).
AbdullehGhujeh edited the summary of this revision on May 19 2020, 2:49 AM.
ali.alzyod retitled this revision from Textblock : rainbow flag emoji treated as two clusters(update unibreak) to evas_textblock: rainbow flag emoji treated as two clusters(update unibreak) on Jun 8 2020, 9:59 AM.
ali.alzyod requested changes to this revision on Sep 1 2020, 12:51 AM.

Needs Rebase

This revision now requires changes to proceed. (Sep 1 2020, 12:51 AM)
ali.alzyod retitled this revision from evas_textblock: rainbow flag emoji treated as two clusters(update unibreak) to evas_textblock: rainbow flag emoji treated as two clusters(update unibreak to version 4.2) on Sep 1 2020, 3:17 AM.
ali.alzyod accepted this revision on Sep 1 2020, 3:25 AM.
This revision is now accepted and ready to land. (Sep 1 2020, 3:25 AM)
This revision was automatically updated to reflect the committed changes.
vtorri added a subscriber, vtorri, on Sep 1 2020, 4:04 AM.

Just for your information, 4.3 is out (the title mentions 4.2).