Page MenuHomePhabricator

efl_task: remove env from this object
ClosedPublic

Authored by bu5hm4n on Dec 26 2018, 2:40 AM.

Details

Summary

the env operations are moved to the efl.core.env objects, which can be
used there.

Depends on D7510

Diff Detail

Repository
rEFL core/efl
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
bu5hm4n created this revision.Dec 26 2018, 2:40 AM
bu5hm4n requested review of this revision.Dec 26 2018, 2:40 AM
bu5hm4n updated this revision to Diff 18112.Dec 26 2018, 9:09 AM
bu5hm4n edited the summary of this revision. (Show Details)

now possible due to better api

The problem I see (if I am reading this right?) is that the process is now:

  1. create env object
  2. set env vars to env object
  3. set env object to task object
  4. task object internally sets env vars before executing

Except that now the env vars have already been set to the parent process in step 2 since the env object automatically applies env vars. I think perhaps the env object should have a method like 'apply' where the env vars are actually applied so that the env object can be modified without changing the environment unintentionally?

zmike requested changes to this revision.Dec 26 2018, 10:55 AM

Rejecting for discussion purposes.

This revision now requires changes to proceed.Dec 26 2018, 10:55 AM

The workflow is how you said. However, step 2 does not edit the environment. Maybe the API was not clear:
Creating a env object and setting unsetting env vars there does not edit the environment in any case, (I think i wrote this in the class documentation)

Having a object that changes the actual environment is only possible with the Efl.Core.Prov_Env object, that is a singleton.

The reason for the env object to work like this is, to have something that can be used as templates, Create a state -> fork it & edit it -> apply it to a exe instance (as an example)

raster added a subscriber: raster.Dec 26 2018, 11:15 AM

how is this better? it makes creation of a process with a modified env more work. multiple objects for no apparent reason (surely not given here in this patch). having env be part of a task is realistically what an environment is. it's part of a process (executable) and threads (they share the same env and it's implemented this way to modify the same shared env as that is the definition of an env within a process) and on an exe it's setting up an env in preparation for the actual execution (after which it can't be modified just like an environment actually works). why an object vs part of an existing class/interface? why is this better?

a singleton env object is a problem with threads. this has to then be a shared object and this becomes wonderful fun in things like js and lua where you cannot share data between threads (they have to have their own vm states/stacks) and the only way to share is via an api that on the native side does the sharing. in a lua/js env a new thread would be passed its "Self" object and thus be able to share the env. with a singleton, you have to pass in lots of global things (not a thread local object like "self") thus complicating adding thread support for such languages. we should really avoid singletons as much as possible IMHO if there is another way to do it.

raster requested changes to this revision.Dec 26 2018, 11:16 AM

From a technical point of view:

  • The old API is not really definite, the API specifies that the API sometimes mirrors to environment variables, sometimes not. Considering the situation of this within a thread is quite a problem, since i don't know if i am altering the environment variables of the process or not, same argument for the efl_exe class.
  • The new API allows you to create a template object with all your envvars once and you can pass it to any efl_exe that you like, which allows more separation in the workflow of creating a new process, (you don't have to handle env var creation all the time). Further more, the new structure allows you to capture the environment state at a certain point in time, (Remembering enlightenments problem with PA under wayland, you could capture the environment in e's startup, and clear them after that. Later on when you spawn processes, you can pass that env var object again, which results in the same environment variables as e's startup)

Additionally, i like to design APIs with SOLID in mind, as it leads to slim, easy to use, and expandable APIs. (https://en.wikipedia.org/wiki/SOLID)
Considering those points given there, separation of concerns was my main driving factor for doing it like this. A task itself does not need/use/require environments at all, A application however has a environment (which is the one in the Singleton), Additionally a exe object can have a environment that is passed to the spawned process. So the concern of saving / handling the env vars is in the object, the concern of applying them to the entity is in the objects that take or provide such objects.
Further more, having "rich" features like the fork call is not possible at all with the current design, (or at least not as easy like this).

Considering your reasoning with shared objects: You are completely right with this, however, the situation we have right now in master is not any different, REAL environment variables are only edited when you call the env API on the app object, which in fact locks out usage from threads. So all in all, we are in the same situation than before, beside the fact that we can enable setting env vars by making the singleton object shared, where in the situation before we would need to make the app object shared, which would infact make the main loop a shared object, which is probably not what we want.

The old API is not really definite, the API specifies that the API sometimes mirrors to environment variables, sometimes not. Considering the situation of this within a thread is quite a problem, since i don't know if i am altering the environment variables of the process or not, same argument for the efl_exe class.

that's because the task class itself doesn't do this - its the implementation in the exe or thread/app that defines the details by overriding. that's what OO is about - you inherit and modify the behavior of things appropriately for that object type. perhaps it could be clearer that the resulting class needs to define this more clearly, but this is a documentation issue rather than design i guess. i agree that doc-wise this could be better.

The new API allows you to create a template object with all your envvars once and you can pass it to any efl_exe that you like

here is the problem. the task class has an env (if its an env object you supply or the current api - same thing. the object has an environment of some sort). they are different things. so an exe INHERITS its environment from the current state of the process. the app/thread objects (the 2 loops) SHARE the same global env of the process. it's a persistent global env for the process and these interfaces are just ways of accessing it and modifying it portability and safely (read up on the specifics of putenv vs setenv for example - putenv means u have to ensure the string u pass is ALWAYS available after the putenv until overridden by another putenv...). if u now create an env and "set it" ... how does this interact with the global env. what is other libs or code does putenv's? how do u seed this env object correctly at creation and then ensure it behaves right. an env object for use with the app (not exe) should at all times mirror the real global env. one for use in an exe should only mirror the global env on creation but once modified should "copy on write" for use in the exe as that is how fork+exec does work with an env (it inherits parent env and this allows you to make some modifications to that env inside the fork before the exec happens which is common enough - and you prepare those changes via the env api on the task class). an env object has the problem that you do not know the target use case for it yet thus don't know how it should behave yet.

you'd need to have an obj you NEVEr create - you only get it from the obj , then modify it via set/s etc. ... but then what is the point - it's just another object when it couldbe part of the existing one and STILL the behavior would vary - where that obj now would either need to be a shared obj (thus threadsafe) and shared between all app/thread loop objects or it's have to be duplicated per loop/thread (be private/local) but like the current implementation all back onto the sam,e global env store ... but then the same env obj will behave differently on exe's - and your same complaint in #1 exists but we now added more complexity in having to get an obj THEN modify it rather than just mopdify the loop/exe we already have the handle to. :)

my point here is that your design doesn't improve anything. the env obj is still just as ambiguous. if we replace the env obj with a hash table it's even worse as we can't share a hash across threads, thus we need to make local duplicates and then every time u do some mods u have to set the hash back and then the hash has to be walked along with the env to see what changed then the changes implemented. bad things then happen if u forget to set the hash back so it's easier to make mistakes and have bugs... :)

so if we remove the env obj from the loop stuff and have a singleton it still behaves differently to the env object needed for exe's as this modifies the global env vars and the env obj for exes modifies only its internal state

the situation we have right now in master is not any different, REAL environment variables are only edited when you call the env API on the app object, which in fact locks out usage from threads.

correct and not correct. a shred object cannot be referenced or be a child of a non-shared object (or vice-versa). by exposing a shared singleton global we make it easier for people to make mistakes as we are now forcing them to use one.but my point remains then that binding/runtime implementations need to expose this global object then and that becomes more work to remember to do that. (specifically things like lua and js - i'm going to use the lua example here - you'd need a whole new luastate "object" per thread and then have to import globals like this into it and mirror them with a local variable - culd have class funcs return this obj but then need to ensure the reference count is > 1 before returning to ensure these bindings don't go accidentally unref the obj).

basically my thoughts here are that IMHO this is better cleared up with documentation. creating more objects just makes the code longer and more verbose for what I at least see as no actual benefits as the same behavior issues exist and would need documenting too (that the global singleton maps to the process global env) and for exe creation it needs to be documented that you should get the env from the exe first, then set it back when done, and not create env then set (unless u want to clear the env out and have a custom from scratch created env...). or .. we need ways to dup the global env into a enw custom env ... in the end you end up with documentation and nasties here to explaining how to do this and it's still more lines of code than it is now. :)

i get your concerns and it is a bit ambiguous as it stands now - but as i said. i think that's a matter for documentation, not redesign.

Saying "that's what OO is about - you inherit and modify the behavior of things appropriately for that object type. ", without any form of constraint is just wrong and dangerous. OOP is NOT about dropping behaviour and defining new ones just like it serves best. Definitly NOT. There are patterns that allow something like this, but in general just changing behaviour is just very dangerous :(

Additionally, having env on a thread is _again_ a bit weird as the thread will not EDIT the enviroment variables at all, it just has a copy of the env vars. Without any connection. And that even without a clear notice that this is the case. And if there is only a *not-applied* env var object in that task, that cannot be used as template, nor safed somewhere, then why is it there ? Looks like useless API to me.

The rest of your reply does contain a bit of wrong things, in the new API you just set the env var object if you want to change the env vars, if not, the spawned process is just created with the same old environment variables as the spawning process. So no need to create an object that you never use :)

Additionally, if it is the same, i am still wondering how i can do that what i am currently doing with the fork call on the Efl.Core.Env object :)

As of your last sentence: Its not really possible to document the new behaviour of a function that is overridden, so your idea of just documenting it a bit different is not working, while the singleton API can clearly provide the sentence "THIS IS MIRRORED TO THE PROCESS" that must be seen when getting the singleton object.

All in all I still don't see any benefit of the old solution that is not possible with the new one, there are a few rough edges that can also be copied over, (the syncing stuff) but all in all the new approach gives way more flexibility, than the former one. With the benefit of also respecting OOP principles (SOLID). There are indeed more objects with this idea, however, we are talking about min. 1 single object, or a few that are created by the user. Additionally, this approach is very simular to the python way of working for spawning processes, which is a good used & understood API for years ..., which actaully brings up the argument of understandability :)

bu5hm4n requested review of this revision.Jan 2 2019, 5:39 AM

Saying "that's what OO is about - you inherit and modify the behavior of things appropriately for that object type. ", without any form of constraint is just wrong and dangerous.

Did you read the word "appropriately" ? How did I say that it is without constraint... I said "appropriately". In this case I see it as totally appropriate. For an Exe object it's about preparing the environment for that object that by nature of how executing child processes work, start by inheriting their parent's environment UNLESS you explicitly do something to make that otherwise. system(), fork(), exec() etc. all do this (with exec family of functions using the CURRENT env not the parent's), and the same exact methods then manipulating the process environment when used on objects that represent the process (mainloop or thread) and the way these are defined already at the system level, threads share the same environment variables of the process as a whole, so the same API works consistently with the underlying nature of the things they act on. I'd say this is appropriate behavior.

Additionally, having env on a thread is _again_ a bit weird as the thread will not EDIT

Actually any thread today can call putenv() or getenv() and act on the same global env. so threads can modify the env. it is possible and does sometimes happen (they have to modify them in a safe way though and this actually ensures that is done at least at the basic string ownership level). Now the reason this is on the thread object is that threads will be told/passed THEIR thread object. it's passed in the args event callbacks so that would be considered the initial init func for the thread like it is for an app too. it's identical and works the same way. in something like lua, the thread would be set up with its own private luastate and this thread object would then be exposed to that private sate on the first call. you'd need to have a new .lua file with probably some standards defined function to call like args() maybe as would the main lua src/ code too for the main loop (so the runtime can find the right function symbol from lua and bind it in). since it needs a custom lua state, it cannot access globals unless you explicitly go global by global and expose them to the new luastate. this requires an env obj be a singleton shared obj and then the runtime code for every runtime needs to know about all such globals designed this way and explicitly expose them. this just leaves more room for bugs in forgetting to expose them and more work to create those runtimes and maintain such lists of globals that need exposing. for languages like lua, the thread would not see the mainloop object or anything not explicitly exposed and if it has to support threads it HAS to and WILL be given its own thread object that is local to the thread.

And that even without a clear notice that this is the case

As i said - a documentation matter.

And if there is only a *not-applied* env var object in that task

??? what? if you set an env on the app main loop obj or thread obj it is applied. always. with locking and ownership handled. if it's on an exe it is also applied as well TO the exe when it is run. it's setting up the env for that process. if it never actually runs then it's a pointless task, but that's be the same story if it were a special env obj you created to run an exe.

The rest of your reply does contain a bit of wrong things, in the new API you just set the env var object if you want to change the env vars, if not, the spawned process is just created with the same old environment variables as the spawning process. So no need to create an object that you never use :)

I'm talking about the cases you do want a customized env for the target exe. so when you do need to change the env... you have to define creation - does it start empty? as a mirror of current implicit process global env? what is the base you start from to modify? if you made it owned by the exe then its clear that its controlled by that and the exe comes from an object that traces back to the loop or thread object and thus inherits from there which is the process global. if you want to create a new env obj (efl_add() it) then how? does it have to use the exe as the parent?? then why add THEN set when it's more obvious if you get and set? less to go wrong.

Additionally, if it is the same, i am still wondering how i can do that what i am currently doing with the fork call on the Efl.Core.Env object :)

i don't see any need to have a fork() from this. it's just a copy semantic to create a new object that's all. i actually have never had the need in any code i have written to do something like this with an environment. never had to store one and then fork it for modifications. i have had to run an executable in a modified environment with env vars added, changed etc. and i've written a reasonable amount of code in this domain compared to most. so i see it as "feature we don't need - certainly not as a method and if anything should be a copy operator thing in efl if it were needed".

As of your last sentence: Its not really possible to document the new behavior of a function that is overridden, so your idea of just documenting it a bit different is not working, while the singleton API can clearly provide the sentence "THIS IS MIRRORED TO THE PROCESS" that must be seen when getting the singleton object.

it is possible - in the class documentation. it's actually better there as it is a general "subtle change" as to where the data is stored and shared and you don't want to document every method that is overridden, just the overall idea that changes. so it's possible and it would go in the right place.

All in all I still don't see any benefit of the old solution that is not possible with the new one,

what i see is code that is longer and more code to write in C to do it your way. i've covered that multiple times. even if both were about equal in every other way, this alone would be reason to not do it. we should not be making code longer and more complex and easier to get wrong in apps even for the sake of a little bit of purity. the wisdom of decades of writing code is that you realize purity is a great way to make things worse if you are not careful. pragmatic "it's just less lines of code to do it this way even if its not as pure" is pretty much the better way to go, unless there is a VERY good reason otherwise. at best i think you have something that is "about the same" and we're nitpicking, but one thing that is definite is that your way is more lines of code for the person using the api, and that makes it worse IMHO.

bu5hm4n updated this revision to Diff 18844.Jan 25 2019, 1:38 AM

code update

Ignoring the smokebomb about "did you read" (yes i did, still your sentence indicates something else, and ignores basic facts about how things are inherited and things are changed).

#1 Yes threads can call putenv, setenv. How is that possible with your solution ? You can just edit two objects, which is then altering the *same* environment, is that a model of the realitiy ? No. Is that obvious No. So my thinking about that: Bad API.

#2 You can customize objects, by:

  • duplicating the proc_env object, doing your change, passing it in.
  • just creating a env object, apply your change, passing it in.

This disucssion meanwhile have spread with the same arguments over 3 revisions and 1 task. I would really appreciate, if you would keep it in one place, and read my replies.

bu5hm4n updated this revision to Diff 18960.Jan 28 2019, 6:37 AM

remove tuple

cedric requested changes to this revision.Jan 28 2019, 10:04 AM

I am pretty sure @segfaultxavi would complain to about the lack of documentation ;-)

src/lib/ecore/efl_exe.eo
45

Please add some documentation for the env property here too.

This revision now requires changes to proceed.Jan 28 2019, 10:04 AM
bu5hm4n updated this revision to Diff 19013.Jan 29 2019, 8:59 AM

rebase & fixup

zmike requested changes to this revision.Jan 31 2019, 5:55 AM

Minor changes needed.

src/lib/ecore/efl_exe.c
194

Nice catch, crucial for performance reasons.

src/lib/ecore/efl_exe.eo
45

This needs to be resolved.

This revision now requires changes to proceed.Jan 31 2019, 5:55 AM

Please apply this patch locally, your review comments do not apply to the state that you get when you apply the patch.

bu5hm4n requested review of this revision.Feb 3 2019, 3:19 AM
bu5hm4n marked 2 inline comments as done.Feb 7 2019, 7:17 AM
zmike accepted this revision.Feb 11 2019, 9:06 AM
This revision was not accepted when it landed; it landed in state Needs Review.Feb 12 2019, 2:20 AM
Closed by commit rEFLc3d69f66a69c: efl_task: remove env from this object (authored by Marcel Hollerbach <mail@marcel-hollerbach.de>). · Explain Why
This revision was automatically updated to reflect the committed changes.