
Imlib2: Some performance issues
Open, Incoming Queue, Public



A couple issues/observations related to performance:

  1. Images with an alpha channel render slowly. The old sxiv code seems to work around this by using a separate image for the alpha and then blending the actual image onto it. In some small tests this gave about a 2x improvement on my machine. Here's the code-snippet for it.
  2. Scaling without anti-aliasing seems to take a pretty big performance hit as well: about 13x slower compared to no scaling.
  3. Scaling _with_ anti-aliasing is even slower: about 40x slower compared to no scaling.

I discovered these while trying to investigate a bug report on nsxiv about multi-frame image animation slowing down when zoomed in. As far as I can tell, the culprit here is scaling+aa. Although, I do think that there's room for optimizing how we render animated images on our end, so I'll look into that.

However, that brings me to another question: what's Imlib2's stance on multi-frame images? Are there any plans for adding an API for dealing with multi-frame/animated images?

NRK created this task. Mon, Nov 22, 1:54 AM
NRK updated the task description. Mon, Nov 22, 1:59 AM
kwo added a comment. Mon, Nov 22, 11:17 AM

1: If you don't have the "workaround", imlib2 will grab the X window content, blend the image onto the grabbed copy, and then write the blended result back.
If you do have the workaround you will do the blending in a forward flow not involving the X server and just write the result.
So wrt. X operations the first is like read-modify-write whereas the second is just write.
Non-alpha images don't need a blend so you can just do a write.
Since X-operations are relatively expensive (and particularly reads cause round-trip delays) it is always important to avoid them when possible.
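
To make the "forward flow" concrete, here is a minimal sketch in plain C of blending an ARGB image onto an opaque background buffer entirely client-side, so that only the finished result would have to be written to X. This is not Imlib2's or sxiv's actual code; the function names `blend_channel` and `composite_over` and the 0xAARRGGBB pixel layout are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend one 8-bit channel: src over dst with alpha a in [0, 255].
 * Adding 127 before the division rounds to nearest instead of truncating. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t a)
{
    return (uint8_t)((src * a + dst * (255 - a) + 127) / 255);
}

/* Composite n ARGB32 pixels (assumed layout 0xAARRGGBB) from src onto the
 * opaque buffer dst, in place. No read-back from the X server is needed
 * because the background already lives in client memory. */
static void composite_over(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t a = (uint8_t)(src[i] >> 24);
        uint32_t r = blend_channel((src[i] >> 16) & 0xff, (dst[i] >> 16) & 0xff, a);
        uint32_t g = blend_channel((src[i] >> 8) & 0xff, (dst[i] >> 8) & 0xff, a);
        uint32_t b = blend_channel(src[i] & 0xff, dst[i] & 0xff, a);
        dst[i] = 0xff000000u | (r << 16) | (g << 8) | b; /* result is opaque */
    }
}
```

Because the output is fully opaque, the final step can be a plain write to the drawable, which is the "just write" case described above.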

2,3: Hmm.. I don't see such numbers. From a quick test (using nsxiv from git) I find roughly x2 due to scaling and almost nothing due to aa.
I'm not entirely confident in those numbers but that's what I see just now doing a trivial time measurement around the rendering in img_render().
This is on x86_64 with asm mmx functions. Tested with a few random alpha and non-alpha images.
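
A trivial time measurement of this kind could look roughly like the sketch below. It is only an illustration: `time_call_ms` and `busy_work` are hypothetical names, and `busy_work` merely stands in for the rendering call being measured. It assumes a POSIX system with `clock_gettime()`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* expose clock_gettime() under strict -std modes */
#endif
#include <time.h>

/* Run fn(arg) once and return the elapsed wall-clock time in milliseconds.
 * CLOCK_MONOTONIC is used so that system clock adjustments do not skew the result. */
static double time_call_ms(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec) * 1e3 +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Hypothetical stand-in for the code under test (e.g. a render call). */
static void busy_work(void *arg)
{
    volatile unsigned long long *sum = arg;

    for (unsigned long long i = 0; i < 100000; i++)
        *sum += i;
}
```

A single measurement is noisy, so timing many renders and comparing averages is likely to give more trustworthy ratios between the scaled and unscaled cases.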

Multi-frame images:
Well, I have considered if it would be useful to add functions to query the number of images (frames?) in (multi-frame) images and fetch a specific one.
But I haven't had specific use for it myself.
For testing purposes I have implemented, in the ico and webp loaders, the possibility to select a specific image "index" by means of the "key" function; that is, appending e.g. ":6" to the file name should select the 6th frame.
I don't consider that a proper API though.
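
For anyone wanting to try the ":n" suffix, a small helper for building such a file name might look like this. It is a sketch only: `keyed_path` is a hypothetical helper, not part of Imlib2, and whether the index counts from 0 or 1 is not stated here.

```c
#include <stdio.h>

/* Write "<path>:<index>" into buf.
 * Returns 0 on success, -1 if index is negative or the result does not fit. */
static int keyed_path(char *buf, size_t size, const char *path, int index)
{
    int n;

    if (index < 0)
        return -1;
    n = snprintf(buf, size, "%s:%d", path, index);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The resulting string would then be passed to `imlib_load_image()` in place of the plain file name, with the caveat that only the ico and webp loaders are said to honor the suffix.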

NRK added a comment. Mon, Nov 22, 8:16 PM

Thanks for the X explanation, it makes sense.

Interesting that your measurements aren't lining up with mine. I originally investigated this because I saw a good amount of time being spent in __imlib_ScaleAARGBA() in the perf report. Anyway, I'll try to get some more people to test it; it could be something specific to my system/environment.

Function to query and get a specific image would be useful indeed. I wasn't aware of the ":n" trick, I'll play around with it. Is there any function to query the "frame delay"?

As far as I can see, being able to load a specific image and query its delay should be enough for us to get image animation going. It seems to be a pretty popular request too (I saw feature requests for animated images on feh as well, but they were shut down as wontfix since Imlib2 doesn't support it). It could also prevent a lot of duplicated work; currently there's a good amount of duplication between Imlib2's gif loader and the sxiv gif loader.

NRK added a comment. Fri, Nov 26, 10:06 AM

Hi, the "key" thing on animated webp was always resulting in selecting the last frame for me. I've attached a patch fixing it. (Not sure what the workflow here is for patches, sorry).

kwo added a comment. Mon, Nov 29, 9:18 AM

Yeah, it looks like my uninitialized iter usually had num_frames set to some large value, so it seemed to work.
I'm fine with patches like this, or sent to me directly.
Thanks :)