-
Notifications
You must be signed in to change notification settings - Fork 4
Added OpportunisticResourceEstimator module #45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
kamaradclimber
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I've reviewed only the beginning of the implementation. There are a few changes already.
This review needs to come with tests (even simple ones).
CMakeLists.txt
Outdated
| ${CMAKE_SOURCE_DIR}/src/RunningContext.cpp | ||
| ${CMAKE_SOURCE_DIR}/src/ConfigurationParser.cpp | ||
| ${CMAKE_SOURCE_DIR}/src/ModulesFactory.cpp | ||
| ${CMAKE_SOURCE_DIR}/src/OpportunisticResourceEstimator.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be named ResourceEstimator if nothing in its implementation is opportunistic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point i have renamed it CommandResourceEstimator to avoid confusion with mesos::ResourceEstimator
src/ModulesFactory.cpp
Outdated
| ::mesos::slave::ResourceEstimator* createResourceEstimator( | ||
| const ::mesos::Parameters& parameters) { | ||
| Configuration cfg = ConfigurationParser::parse(parameters); | ||
| return new OpportunisticResourceEstimator(cfg.prepareCommand, cfg.isDebugSet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it named prepareCommand? It should be something related to what the binary does (something like resourceEstimatorCommand)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to change the Configuration file, i added a oversubscribableCommand variable since the function to use the command is oversubscribable
| Try<Resources> _resources = ::mesos::Resources::parse( | ||
| "[{\"name\" : \"cpus\", \"type\":\"SCALAR\", \"scalar\" : {\"value\" : " | ||
| "\"0\"}, \"role\" : \"*\"}]"); | ||
| Resources resources = _resources.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to return a valid resource if to set?
Shouldn't we return something like a Failure (the "null" case of Try class)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@treefell could you answer my comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually try to return NULL and nullptr it does not compile while i do that (i get a problem of compatibility with resources). So i went for resources with cpus at 0 as values. I will try to find another way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try means you can return a "success" with resources or a "failure" with an error message. Try returning a failure
|
|
||
| Try<string> output = cmdrunner.run(m_oversubscribableCommand.get(), input); | ||
| if (output.isError()) { | ||
| return resources; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should:
- log the error
- return the error case of the Try class (unless mesos reacts weirdly to this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a LOG(INFO) to retrieve the error
I am still investigating what i could give instead of resources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
Pull request has been modified.
src/CommandResourceEstimator.cpp
Outdated
| using namespace process; | ||
| using namespace mesos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using namespace is pure evil. This is a bad practice and you don't need it since you have the other using below :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hum i guess i was confused with the way namespace should be use. It have been removed
src/CommandResourceEstimator.cpp
Outdated
| const Option<Command>& oversubscribableCommand, bool isDebugMode); | ||
|
|
||
| virtual process::Future<Resources> oversubscribable(); | ||
| virtual process::Future<Resources> _oversubscribable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this identifier is not causing any problems regarding the C++ standard, I would advise against starting an identifier with an underscore altogether.
Certain sets of names and function signatures are always reserved to the implementation:
Each name that contains a double underscore __ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.
Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
If you have trouble getting the right pointer to the overloaded function please look at https://en.cppreference.com/w/cpp/language/overloaded_address
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have change most of the variable that had an _ in front of the name.
src/CommandResourceEstimator.cpp
Outdated
| Future<Resources> CommandResourceEstimatorProcess::_oversubscribable( | ||
| const ResourceUsage& usage) { | ||
| // Mocking a resources to have valid return | ||
| Try<Resources> _resources = ::mesos::Resources::parse( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, no underscore
src/CommandResourceEstimator.cpp
Outdated
| return resources; | ||
| } | ||
|
|
||
| Try<Resources> _cmdresources = ::mesos::Resources::parse(output.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
….hpp and remove some _ un naming
Pull request has been modified.
src/CommandResourceEstimator.hpp
Outdated
| @@ -0,0 +1,43 @@ | |||
| #ifndef __OPPORTUNISTIC_RESOURCE_ESTIMATOR_HPP__ | |||
| #define __OPPORTUNISTIC_RESOURCE_ESTIMATOR_HPP__ | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You renamed the file but not the guards ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for noticing it! should be fixed
kamaradclimber
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to add tests!
Command Resource estimator -unexisting command -existing regular command ModuleFactory for CommandResourceEstimator -empty command parameter -existing command parameter
Pull request has been modified.
kamaradclimber
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tests!
src/CommandQoSController.cpp
Outdated
|
|
||
| list<::mesos::slave::QoSCorrection> CommandQoSControllerProcess::m_corrections( | ||
| const ResourceUsage& m_usage) { | ||
| LOG(INFO) << "Start Corrections timer_start_module QoSController"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would be more appropriate for debug level I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is more for debug, It is use to show time taken between different part of the module. like: start of module, start of the script, end of script, end of module
| // wished output = [{QoSCorrection}, {QoSCorrection}, ...] | ||
|
|
||
| std::string stroutput = output.get(); | ||
| std::string delimiter = ";"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure to get why the script could not return the correct format (which seems to be map nicely with json)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the problem come from the way message are supported.
Since the return must be an array of QoSCorrection there is noway to call for any message converting like JsonStringtoMessage().
I tried to convert the JsonArray to Array in c++ but did not find and easy way to do it, a part from calling external library.
Each QoSCorrection look like this:
{"type":"KILL","kill":{"executorId":{"value":"app"},"frameworkId"{"value":"frameworkID"},"containerId":{"value":"containerID"}}}
Which kind of complicated the the way to split it so i use a special character as separator
If you have any good way to do i would gladly take it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Greg, this part looks a bit fragile. I've read your answer but have somle additional questions:
- what is that "QoSscript timer_start_script QoSController" script? If this is something on our side, should it not return directly a json, like all other commands run?
- if it's not in our hands, maybe there should be an intermediate wrapper to do it anyway, because I don't feel this kind of things needs to be done in this module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the feeback, the "QoSscript timer_start_script QoScontroller" is an output line (just use to get a timestamp) to notify the start of the commandRunner running the QoSscript.rb that give the QoScorrection after receiving the ResourceUsage.
It should totally give a JSON, i will try to rework it to return a JSON.
src/CommandQoSController.cpp
Outdated
| const Option<Command>& _corrections, | ||
| bool isDebugMode) | ||
| : m_correctionsCommand(_corrections), m_isDebugMode(isDebugMode) { | ||
| LOG(INFO) << "new QoSController"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
| LOG(INFO) << output.error(); | ||
| // return Failure("output from CommandRunner gave error: " + | ||
| // output.error()); | ||
| return resources; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it sounds dangerous to return "0 cpus" whenever the command fail. Maybe we could:
- store last result from the script (if any)
- otherwise, explicitely fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually i think returning previous result is more dangerous since if it keep failing it will always have the same amount thus allocating more revocable task that the Agent could take on.
Returning 0 prevent any revocable task to be allocated. In my tough if the Resource Estimator is broken revocable task should not be able to allocate.
I did try to fail, since the resource are returned to the slave it does not take fail from oversubscribable command, apart from modifying the slave i am not sure how we could make it work, if you have any suggestion i am interested
src/CommandResourceEstimator.cpp
Outdated
| const std::string& name, const Option<Command>& _oversubscribable, | ||
| bool isDebugMode) | ||
| : m_oversubscribableCommand(_oversubscribable), m_isDebugMode(isDebugMode) { | ||
| LOG(INFO) << "new resourceEstimator"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can drop this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
Pull request has been modified.
| // wished output = [{QoSCorrection}, {QoSCorrection}, ...] | ||
|
|
||
| std::string stroutput = output.get(); | ||
| std::string delimiter = ";"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Greg, this part looks a bit fragile. I've read your answer but have somle additional questions:
- what is that "QoSscript timer_start_script QoSController" script? If this is something on our side, should it not return directly a json, like all other commands run?
- if it's not in our hands, maybe there should be an intermediate wrapper to do it anyway, because I don't feel this kind of things needs to be done in this module.
| ${CMAKE_SOURCE_DIR}/src/Helpers.hpp | ||
| ${CMAKE_SOURCE_DIR}/src/Logger.hpp | ||
| ${CMAKE_SOURCE_DIR}/src/ModulesFactory.hpp | ||
| ${CMAKE_SOURCE_DIR}/src/CommandResourceEstimator.hpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please provide a small documentation header in hpp files to quickly describe what are those methods supposed to do, what info they expect when running, ... You can get inspiration for the format if you look at Hook and Isolator headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the feeback, i will make it more understandable!
| @@ -0,0 +1,3 @@ | |||
| #!/bin/bash | |||
|
|
|||
| # do nothing here, so output file remains empty | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OPT: You can likely change your test to avoid the need of having an empty bash script, using some standard commands for example.
tests/scripts/oversubscribable.sh
Outdated
| "role" : "*"}] | ||
| EOM | ||
|
|
||
| cat $INPUT_FILE > /tmp/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove this debug thing (and also comments above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
Added the creation of opportunisticResourceEstimator inheriting resourceEstimator module.
Using the CommandRunner from Mesos-Command-Module to call a binary/script. The module also have access to ResourceUsage, which give the information about the resources on the agent/slave.