-
Notifications
You must be signed in to change notification settings - Fork 260
Surface the manifest filenames for higher level logic #893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
This file was deleted.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need to downgrade this dependency outside of OLM doesn't work well with the latest v1.41.x version? Can we instead replace pin on the OLM side-of-things while we work through the changes?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some explanation in these issue comments and in this article. TL;DR; you don't pin a version but specify a minimum requirement.
I wish the issue with grpc v1.41.0 would have been further investigated when discovered. It is not a bug in the grpc library but a conscious behavior change. Now if you don't use keepalive it is the normal behavior for a connection to get IDLE if no rpc call is made. The issue is that you are only testing for READY:
I could change the line and test for IDLE as well. The one thing is that when connections have been IDLE for a defined time (OS setting) they can be garbage collected. The grpc client would have two options:
Option 1 is better if it is important not to incur any delay due to the reestablishment of a TCP connection. Option 2 is generally advised as it allows freeing up unused resources (network connections, filedescriptors, memory on the host) and don't unnecessarily create additional network bandwidth consumption (the keepalive packets)
I don't know enough of the way you use the grpc server to say whether the behavior change would break other things than the tests, hence the downgrade of the grpc library. This is however no sustainable approach, just building up some additional technical debt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That gRPC explanation makes a lot of sense -- I also started testing for IDLE when trying to address test flakes which did help the situation. It looks like the version bump on the registry side did affect performance.
I think Option 2 would be better since the client (OLM) needs to talk to the servers (catalog pods) only intermittently, when doing a sync, instead of all the time. It would also help reduce the resource consumption of OLM somewhat versus keeping connections open all the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation - I think downgrading is reasonable to me, but I wonder whether this points to a need for a simple integration tests with OLM so we can catch this kind of regression quicker? That said, that would increase the coupling between the two repositories, and they both should have their own lifecycle for managing dependencies, so maybe the value add there is watered down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just closing the loop on this: we ended up downgrading this dependency to v1.40.0 to mitigate a flake in the OLM e2e testing suite. It looks like github is still showing this change in the diff - do we need to rebase against the master branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timflannagan I am waiting for final review of operator-framework/operator-lifecycle-manager#2552. With the new approach we would not need this PR anymore as the match is now based on the resource identification through group, kind, namespace, name rather than filename. If things go as planned I would then close this PR