Skip to content

Conversation

@fgiloux
Copy link

@fgiloux fgiloux commented Jan 6, 2022

Signed-off-by: Frederic Giloux fgiloux@redhat.com

Description of the change:

This pull request surfaces the manifest filenames. There is currently no metadata, only the content of a manifest file carried over by bundle.Object. By returning the names of the manifest files it is possible to attach properties that can influence OLM behavior in regards to a specific manifest. This can then be leveraged for making manifests optional

Motivation for the change:

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /docs
  • Commit messages sensible and descriptive

@openshift-ci openshift-ci bot requested review from benluddy and njhale January 6, 2022 13:26
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 6, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fgiloux
To complete the pull request process, please assign benluddy after the PR has been reviewed.
You can assign the PR to them by writing /assign @benluddy in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 6, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 6, 2022

Hi @fgiloux. Thanks for your PR.

I'm waiting for a operator-framework member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@timflannagan
Copy link
Member

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 6, 2022
@codecov
Copy link

codecov bot commented Jan 6, 2022

Codecov Report

Merging #893 (1511eec) into master (9999f79) will increase coverage by 0.01%.
The diff coverage is 60.00%.

❗ Current head 1511eec differs from pull request most recent head 0b64b11. Consider uploading reports for the commit 0b64b11 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #893      +/-   ##
==========================================
+ Coverage   52.21%   52.22%   +0.01%     
==========================================
  Files         103      103              
  Lines        9117     9119       +2     
==========================================
+ Hits         4760     4762       +2     
  Misses       3451     3451              
  Partials      906      906              
Impacted Files Coverage Δ
pkg/configmap/configmap.go 76.47% <60.00%> (+0.96%) ⬆️
alpha/declcfg/declcfg.go 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9999f79...0b64b11. Read the comment docs.

@fgiloux
Copy link
Author

fgiloux commented Feb 3, 2022

Needed to downgrade grpc version due to: operator-framework/operator-lifecycle-manager#2423 (review)

golang.org/x/net v0.0.0-20210825183410-e898025ed96a
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c
google.golang.org/grpc v1.41.0
google.golang.org/grpc v1.40.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we need to downgrade this dependency outside of OLM doesn't work well with the latest v1.41.x version? Can we instead replace pin on the OLM side-of-things while we work through the changes?

Copy link
Author

@fgiloux fgiloux Feb 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some explanation in these issue comments and in this article. TL;DR; you don't pin a version but specify a minimum requirement.
I wish the issue with grpc v1.41.0 would have been further investigated when discovered. It is not a bug in the grpc library but a conscious behavior change. Now if you don't use keepalive it is the normal behavior for a connection to get IDLE if no rpc call is made. The issue is that you are only testing for READY:

catalog.Status.GRPCConnectionState.LastConnectTime.After(after.LastUpdateTime.Time) && catalog.Status.GRPCConnectionState.LastObservedState == "READY" {

I could change the line and test for IDLE as well. The one thing is that when connections have been IDLE for a defined time (OS setting) they can be garbage collected. The grpc client would have two options:

  1. use keepalive
  2. gracefully handle disconnection and reestablish a connection when required

Option 1 is better if it is important not to incur any delay due to the reestablishment of a TCP connection. Option 2 is generally advised as it allows freeing up unused resources (network connections, filedescriptors, memory on the host) and don't unnecessarily create additional network bandwidth consumption (the keepalive packets)

I don't know enough of the way you use the grpc server to say whether the behavior change would break other things than the tests, hence the downgrade of the grpc library. This is however no sustainable approach, just building up some additional technical debt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That gRPC explanation makes a lot of sense -- I also started testing for IDLE when trying to address test flakes which did help the situation. It looks like the version bump on the registry side did affect performance.

I think Option 2 would be better since the client (OLM) needs to talk to the servers (catalog pods) only intermittently, when doing a sync, instead of all the time. It would also help reduce the resource consumption of OLM somewhat versus keeping connections open all the time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation - I think downgrading is reasonable to me, but I wonder whether this points to a need for a simple integration tests with OLM so we can catch this kind of regression quicker? That said, that would increase the coupling between the two repositories, and they both should have their own lifecycle for managing dependencies, so maybe the value add there is watered down.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just closing the loop on this: we ended up downgrading this dependency to v1.40.0 to mitigate a flake in the OLM e2e testing suite. It looks like github is still showing this change in the diff - do we need to rebase against the master branch?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timflannagan I am waiting for final review of operator-framework/operator-lifecycle-manager#2552. With the new approach we would not need this PR anymore as the match is now based on the resource identification through group, kind, namespace, name rather than filename. If things go as planned I would then close this PR

Copy link
Member

@timflannagan timflannagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think this looks reasonable to me. I think I'd like to see us add some testing around the filenames that get returned from the Load(...) method to ensure we don't regress on the registry side-of-things down the line.

@fgiloux
Copy link
Author

fgiloux commented Feb 4, 2022

I think I'd like to see us add some testing around the filenames that get returned from the Load(...) method to ensure we don't regress on the registry side-of-things down the line.

Fair enough. I have added some.

Copy link
Member

@timflannagan timflannagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're getting close to the finish line here. I see a couple of pkg/lib/indexer/index.Dockerfile* changes that are present in this PR. Any idea what they're needed for, or were they generated locally, and we're not correctly filtering them out in the .gitignore properly?

It looks like codecov is also complaining about a couple of early return statements, and their path isn't being tested. We've had issues with codecov failing in the past due to regressing on overall code coverage on a per-PR-basis over time, which blocked merging entirely. I don't think this is something we'd block the PR from merging given there's a corresponding OLM PR open, but it would be nice to ensure we're not regressing silently when we cut registry releases.

}

// equalSlices compares two string slices and has tolerate different orderings
func equalSlices(a, b []string) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a custom testing helper function here? Can we piggy back off of assert.ElementsMatch(...) instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I made the change

@fgiloux
Copy link
Author

fgiloux commented Feb 4, 2022

I see a couple of pkg/lib/indexer/index.Dockerfile* changes that are present in this PR. Any idea what they're needed for, or were they generated locally, and we're not correctly filtering them out in the .gitignore properly?

I have done some cleansing. I believe that they are generated by the unit tests for mirror.
When opening files it would be advisable to use: TemFile without specifying a dir. It would then use the OS default tmp dir, /tmp for Linux.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 26, 2022
…igher level logic

Signed-off-by: Frederic Giloux <fgiloux@redhat.com>
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 1, 2022
@fgiloux
Copy link
Author

fgiloux commented Mar 30, 2022

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 30, 2022
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 31, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 31, 2022

@fgiloux: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@perdasilva
Copy link
Contributor

closing PR as stale. Please re-open if it's still important.

@perdasilva perdasilva closed this Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants