Page MenuHomePhabricator

ecore_file: check for a valid url header
Needs RevisionPublic

Authored by myoungwoon on Feb 7 2018, 12:00 AM.

Details

Summary

This patch checks whether the input url has a valid header or not.
For example, when user uses a wrong url such as "htp://phab.enlightenment.org" which has a typo "htp://",
ecore_file_download returns EINA_FALSE.

Test Plan

Execute test suite

Diff Detail

Repository
rEFL core/efl
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 5747
Build 6380: arc lint + arc unit
myoungwoon requested review of this revision.Feb 7 2018, 12:00 AM
myoungwoon created this revision.
raster added a comment.Feb 7 2018, 2:50 AM

ummm you know HTTP:// is also valid, as it Http:// and HtTp:// etc. ... case insensitive. also ftps://, file:// and probably other schemes that curl may support. you basically need to handle all the possible url schemes that curl may support here if you want to verify first... :) here's a quick list i found:

https://ec.haxx.se/protocols-curl.html

raster requested changes to this revision.Feb 7 2018, 2:50 AM
This revision now requires changes to proceed.Feb 7 2018, 2:50 AM
myoungwoon updated this revision to Diff 13824.Feb 7 2018, 8:48 PM

Currently available protocols of ecore_file_download are file://, http://, https://, ftp://
from ecore_file_download_available().

myoungwoon added a comment.EditedFeb 7 2018, 9:16 PM
In D5794#99066, @raster wrote:

ummm you know HTTP:// is also valid, as it Http:// and HtTp:// etc. ... case insensitive. also ftps://, file:// and probably other schemes that curl may support. you basically need to handle all the possible url schemes that curl may support here if you want to verify first... :) here's a quick list i found:

https://ec.haxx.se/protocols-curl.html

Yeap, I looked over protocols-curl site and understood about case insensitive handling...
IMO, there are two things to be considered.

  1. available protocols: ecore_file_download supports all 22 protocols curl supports with case insensitive handling...
  2. do we need to support checking whether the input url is vaild or not.

I think

  1. is out of our EFL libraries because it need a API which checks whether returned acks from named remote server(or local server) is success or fail.
  2. Currently ecore_file_download supports ecore_file_download_protocol_available API which is checking just following protocols: file://, http://, https://, ftp://

If we make the final decision of supported protocals which curl supports for ecore_file_download with case insensitive handling, ecore_file_dowload_protocol_available API should be updated together.

What is your opinion on these - choose supported protocals curl supports, support insensitive handling?

myoungwoon updated this revision to Diff 13825.Feb 7 2018, 9:18 PM

update error messages.

raster added a comment.Feb 8 2018, 2:05 AM

ummm... you mean ecore_file_download_protocol_available right? well technically that should be fixed to list more schemes. ftps is kind of important for starters (ssl ftp).

i really am not sure we should be even checking this here. if we pass an invalid scheme or url to curl... it will fail later on anyway... so what is the purpose of pre-checking these things? why check here in ecore_file? whats the thought process behind having to check here?

In D5794#99151, @raster wrote:

ummm... you mean ecore_file_download_protocol_available right? well technically that should be fixed to list more schemes. ftps is kind of important for starters (ssl ftp).

No, I also think ecore_file_download_protocol_available should be updated to list more schemes.

i really am not sure we should be even checking this here. if we pass an invalid scheme or url to curl... it will fail later on anyway... so what is the purpose of pre-checking these things? why check here in ecore_file? whats the thought process behind having to check here?

Here, I think we should consider two facts about the invalid schemes or urls.
First, is the urls grammatically correct? - for example, missing protocol:// or some typo of protocal name (htp/Flie/fPt Etc.)
Second, the urls is grammatically correct however, as I mentioned, download fails because the domain of urs is not serviced or wong domain, or wong ip address.

For the first, I think ecore_file_download should return more meaningful return values. for example, error codes.
Yes, I think my patch is for the second.
My flow is like this:

  1. ecore_file_download is checking input url is NULL and protocol format.
  2. ecore_file_download had been checked the available protocols like my patch submitted by Vincen Torri ago.
  3. I think ecore_file_download would like to support pre-checking the url by 1, 2, and ecore_file_download_protocol_available API. In other words,
    • if url is null, it just returns EINA_FALSE
    • if url has wrong protocols format or name, it just returns EINA_FALSE because those conditions finally make a failure of downloading.

I am not sure we should be even checking this here as well. However, It looks good to check here like before though checking is not perfect to support more schemes based on curl.

myoungwoon updated this revision to Diff 13836.Feb 8 2018, 9:10 PM

replacee checking codes.

For the first, I think ecore_file_download should return more meaningful return values. for example, error codes.

Can't do that as return is Eina_Bool ... so only true or false... changing that would break ABI/API.

:)

The reason i ask is... curl will eventually check everything including scheme,s syntax, formatting and whatever as well as then connect and check if the target exists and thus end up with a result. The more code we put here, the more work we have to do instead of leaving it all to curl and if that is really valuable to write and maintain such code. the patch still doesn't handle case insensitivity and imho it along with available protocols should be extended to at least everything that validly can "download a file".

it should at least add ftps as that is just ftp over ssl.

In D5794#99187, @raster wrote:

For the first, I think ecore_file_download should return more meaningful return values. for example, error codes.

Can't do that as return is Eina_Bool ... so only true or false... changing that would break ABI/API.

:)

I know.. we can not make such a situation of ABI/API. :)

The more code we put here, the more work we have to do instead of leaving it all to curl and if that is really valuable to write and maintain such code.

I think so. However, if we do, I do not know if we are still producing the "ABI/API" situation. ?

the patch still doesn't handle case insensitivity and imho it along with available protocols should be extended to at least everything that validly can "download a file".

it should at least add ftps as that is just ftp over ssl.

First of all, why not add "case insensitivity handling" and "ftps" into current code?
If you agree, I will update the code with "case insensitivity handling" internal function and "ftps" protocol.

I think so. However, if we do, I do not know if we are still producing the "ABI/API" situation. ?

that is kind of my point about the schemes. if the checking only supports a SUBSET of schemes (the short list you have there) then it MAHY break existing code that used e.g. ftps or sftp or gopher( unlikely but in theory possible), smb etc. that used to work and now will not. so as long as the checking checks every scheme protocol that curl can do then it won't break existing support. :)

cedric added a comment.Mar 6 2018, 5:47 PM

Does curl provide a list of the protocol it support at run time ? Would be best here.

i dont know of a curl api that does this... i had to hunt fairly hard for docs that have it.

There is no curl function which validates url strings; this only occurs once an operation is initiated.

I think the issue here is that ecore_file_download_protocol_available() exists at all, when it seems to objectively be useless. It includes hardcoded strings which are only a small subset of what the corresponding functions support, and it checks the strings in an invalid way based on HTTP spec.

The best solution here seems to be deprecating ecore_file_download_protocol_available() and removing all usage of since it serves no purpose. Then this patch can be changed to just add a note in the deprecated functions docs specifying that it should never be used and perhaps linking to this page https://ec.haxx.se/protocols-curl.html

netstar requested changes to this revision.Sep 4 2018, 2:56 AM
netstar added a subscriber: netstar.

I think maybe strncasecmp would be more suitable here?

This revision now requires changes to proceed.Sep 4 2018, 2:56 AM