Skip to content

Conversation

@treefell
Copy link

Added the creation of opportunisticResourceEstimator inheriting resourceEstimator module.
Using the CommandRunner from Mesos-Command-Module to call a binary/script. The module also have access to ResourceUsage, which give the information about the resources on the agent/slave.

Copy link
Contributor

@kamaradclimber kamaradclimber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I've reviewed only the beginning of the implementation. There are a few changes already.

This review needs to come with tests (even simple ones).

CMakeLists.txt Outdated
${CMAKE_SOURCE_DIR}/src/RunningContext.cpp
${CMAKE_SOURCE_DIR}/src/ConfigurationParser.cpp
${CMAKE_SOURCE_DIR}/src/ModulesFactory.cpp
${CMAKE_SOURCE_DIR}/src/OpportunisticResourceEstimator.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be named ResourceEstimator if nothing in its implementation is opportunistic

Copy link
Author

@treefell treefell Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point i have renamed it CommandResourceEstimator to avoid confusion with mesos::ResourceEstimator

::mesos::slave::ResourceEstimator* createResourceEstimator(
const ::mesos::Parameters& parameters) {
Configuration cfg = ConfigurationParser::parse(parameters);
return new OpportunisticResourceEstimator(cfg.prepareCommand, cfg.isDebugSet);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it named prepareCommand? It should be something related to what the binary does (something like resourceEstimatorCommand)

Copy link
Author

@treefell treefell Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to change the Configuration file, i added a oversubscribableCommand variable since the function to use the command is oversubscribable

Try<Resources> _resources = ::mesos::Resources::parse(
"[{\"name\" : \"cpus\", \"type\":\"SCALAR\", \"scalar\" : {\"value\" : "
"\"0\"}, \"role\" : \"*\"}]");
Resources resources = _resources.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to return a valid resource if to set?
Shouldn't we return something like a Failure (the "null" case of Try class)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@treefell could you answer my comment here?

Copy link
Author

@treefell treefell Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually try to return NULL and nullptr it does not compile while i do that (i get a problem of compatibility with resources). So i went for resources with cpus at 0 as values. I will try to find another way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try means you can return a "success" with resources or a "failure" with an error message. Try returning a failure


Try<string> output = cmdrunner.run(m_oversubscribableCommand.get(), input);
if (output.isError()) {
return resources;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should:

  • log the error
  • return the error case of the Try class (unless mesos reacts weirdly to this)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a LOG(INFO) to retrieve the error
I am still investigating what i could give instead of resources

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@mergify mergify bot dismissed kamaradclimber’s stale review September 11, 2020 08:18

Pull request has been modified.

Comment on lines 13 to 14
using namespace process;
using namespace mesos;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using namespace is pure evil. This is a bad practice and you don't need it since you have the other using below :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hum i guess i was confused with the way namespace should be use. It have been removed

const Option<Command>& oversubscribableCommand, bool isDebugMode);

virtual process::Future<Resources> oversubscribable();
virtual process::Future<Resources> _oversubscribable(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this identifier is not causing any problems regarding the C++ standard, I would advise against starting an identifier with an underscore altogether.

Certain sets of names and function signatures are always reserved to the implementation:

Each name that contains a double underscore __ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.

Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.

If you have trouble getting the right pointer to the overloaded function please look at https://en.cppreference.com/w/cpp/language/overloaded_address

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have change most of the variable that had an _ in front of the name.

Future<Resources> CommandResourceEstimatorProcess::_oversubscribable(
const ResourceUsage& usage) {
// Mocking a resources to have valid return
Try<Resources> _resources = ::mesos::Resources::parse(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, no underscore

return resources;
}

Try<Resources> _cmdresources = ::mesos::Resources::parse(output.get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@mergify mergify bot dismissed St0rmingBr4in’s stale review September 11, 2020 09:58

Pull request has been modified.

@@ -0,0 +1,43 @@
#ifndef __OPPORTUNISTIC_RESOURCE_ESTIMATOR_HPP__
#define __OPPORTUNISTIC_RESOURCE_ESTIMATOR_HPP__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You renamed the file but not the guards ;)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for noticing it! should be fixed

Copy link
Contributor

@kamaradclimber kamaradclimber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add tests!

Charles Huang added 2 commits October 7, 2020 10:21
Command Resource estimator
  -unexisting command
  -existing regular command
ModuleFactory for CommandResourceEstimator
  -empty command parameter
  -existing command parameter
@mergify mergify bot dismissed kamaradclimber’s stale review November 5, 2020 10:17

Pull request has been modified.

Copy link
Contributor

@kamaradclimber kamaradclimber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests!


list<::mesos::slave::QoSCorrection> CommandQoSControllerProcess::m_corrections(
const ResourceUsage& m_usage) {
LOG(INFO) << "Start Corrections timer_start_module QoSController";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would be more appropriate for debug level I think

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is more for debug, It is use to show time taken between different part of the module. like: start of module, start of the script, end of script, end of module

// wished output = [{QoSCorrection}, {QoSCorrection}, ...]

std::string stroutput = output.get();
std::string delimiter = ";";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure to get why the script could not return the correct format (which seems to be map nicely with json)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the problem come from the way message are supported.
Since the return must be an array of QoSCorrection there is noway to call for any message converting like JsonStringtoMessage().
I tried to convert the JsonArray to Array in c++ but did not find and easy way to do it, a part from calling external library.

Each QoSCorrection look like this:
{"type":"KILL","kill":{"executorId":{"value":"app"},"frameworkId"{"value":"frameworkID"},"containerId":{"value":"containerID"}}}
Which kind of complicated the the way to split it so i use a special character as separator

If you have any good way to do i would gladly take it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Greg, this part looks a bit fragile. I've read your answer but have somle additional questions:

  • what is that "QoSscript timer_start_script QoSController" script? If this is something on our side, should it not return directly a json, like all other commands run?
  • if it's not in our hands, maybe there should be an intermediate wrapper to do it anyway, because I don't feel this kind of things needs to be done in this module.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the feeback, the "QoSscript timer_start_script QoScontroller" is an output line (just use to get a timestamp) to notify the start of the commandRunner running the QoSscript.rb that give the QoScorrection after receiving the ResourceUsage.
It should totally give a JSON, i will try to rework it to return a JSON.

const Option<Command>& _corrections,
bool isDebugMode)
: m_correctionsCommand(_corrections), m_isDebugMode(isDebugMode) {
LOG(INFO) << "new QoSController";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove this line

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

LOG(INFO) << output.error();
// return Failure("output from CommandRunner gave error: " +
// output.error());
return resources;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it sounds dangerous to return "0 cpus" whenever the command fail. Maybe we could:

  • store last result from the script (if any)
  • otherwise, explicitely fail

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually i think returning previous result is more dangerous since if it keep failing it will always have the same amount thus allocating more revocable task that the Agent could take on.
Returning 0 prevent any revocable task to be allocated. In my tough if the Resource Estimator is broken revocable task should not be able to allocate.
I did try to fail, since the resource are returned to the slave it does not take fail from oversubscribable command, apart from modifying the slave i am not sure how we could make it work, if you have any suggestion i am interested

const std::string& name, const Option<Command>& _oversubscribable,
bool isDebugMode)
: m_oversubscribableCommand(_oversubscribable), m_isDebugMode(isDebugMode) {
LOG(INFO) << "new resourceEstimator";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can drop this line

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@mergify mergify bot dismissed kamaradclimber’s stale review November 18, 2020 14:50

Pull request has been modified.

// wished output = [{QoSCorrection}, {QoSCorrection}, ...]

std::string stroutput = output.get();
std::string delimiter = ";";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Greg, this part looks a bit fragile. I've read your answer but have somle additional questions:

  • what is that "QoSscript timer_start_script QoSController" script? If this is something on our side, should it not return directly a json, like all other commands run?
  • if it's not in our hands, maybe there should be an intermediate wrapper to do it anyway, because I don't feel this kind of things needs to be done in this module.

${CMAKE_SOURCE_DIR}/src/Helpers.hpp
${CMAKE_SOURCE_DIR}/src/Logger.hpp
${CMAKE_SOURCE_DIR}/src/ModulesFactory.hpp
${CMAKE_SOURCE_DIR}/src/CommandResourceEstimator.hpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide a small documentation header in hpp files to quickly describe what are those methods supposed to do, what info they expect when running, ... You can get inspiration for the format if you look at Hook and Isolator headers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the feeback, i will make it more understandable!

@@ -0,0 +1,3 @@
#!/bin/bash

# do nothing here, so output file remains empty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPT: You can likely change your test to avoid the need of having an empty bash script, using some standard commands for example.

"role" : "*"}]
EOM

cat $INPUT_FILE > /tmp/test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this debug thing (and also comments above).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants