Update eager: make sure client secret can be specified as env var by cosmicBboy · Pull Request #2720 · flyteorg/flytekit

cosmicBboy · 2024-08-29T16:59:17Z

Why are the changes needed?

Currently, the @eager workflows are too strict with its handling of the client secret values. Both secret group and key need to be specified for the secret request to be passed down to the underlying task. There is also an issue with the async event loop not being available for the downstream data persistence step in the task execution.

What changes were proposed in this pull request?

This PR loosens this requirement to be compatible with Flyte backends that do not require secret groups. This PR also:

Allows the user to bind the client secret to an environment variable in cases where the FlyteRemote object can authenticate with just the client secret
Makes sure that an async loop is available for downstream data persistence step in the task execution

How was this patch tested?

The following workflow was tested on flyte sandbox

from flytekit import task, ImageSpec
from flytekit.configuration import Config
from flytekit.experimental import eager
from flytekit.remote import FlyteRemote

flytekit = "git+https://github.com/flyteorg/flytekit@54c4471a56311ff8201ba76d9bda726bc5b4689b"

image = ImageSpec(
    name="polars",
    packages=["flytekit", flytekit],
    registry="localhost:30000",
    apt_packages=["git"],
)


@task
def add_one(x: int) -> int:
    return x + 1


@task
def double(x: int) -> int:
    return x * 2


@eager(
    container_image=image,
    remote=FlyteRemote(
        config=Config.for_sandbox(),
        default_project="flytesnacks",
        default_domain="development",
    ),
)
async def simple_eager_workflow(x: int) -> int:
    out = await add_one(x=x)
    if out < 0:
        return -1
    return await double(x=out)

I updated the documentation accordingly.
All new and existing tests passed.
All commits are signed-off.

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

codecov · 2024-08-29T20:27:23Z

Codecov Report

Attention: Patch coverage is 30.76923% with 9 lines in your changes missing coverage. Please review.

Project coverage is 71.82%. Comparing base (65bf9e7) to head (f0b5d67).
Report is 2 commits behind head on master.

Files with missing lines	Patch %	Lines
flytekit/experimental/eager_function.py	30.76%	9 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           master    #2720       +/-   ##
===========================================
+ Coverage   45.47%   71.82%   +26.34%     
===========================================
  Files         193      193               
  Lines       19516    19560       +44     
  Branches     2818     3845     +1027     
===========================================
+ Hits         8875    14049     +5174     
+ Misses      10208     4832     -5376     
- Partials      433      679      +246

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

cosmicBboy · 2024-08-30T14:56:53Z

flytekit/bin/entrypoint.py

    for k, v in output_file_dict.items():
        utils.write_proto_to_file(v.to_flyte_idl(), os.path.join(ctx.execution_state.engine_dir, k))

+    # make sure an event loop exists for data persistence step


this issue comes up not just with eager. Frameworks like Langchain that have async functionality will cause the event loop to end before the flyte task can run the async data persistence operation.

Could we use a separate event loop just for flytekit?

sounds right, I think someone more well-versed in async can help with that

flytekit/bin/entrypoint.py

Future-Outlier

thank you for this!
can we have

register the task to remote cluster (sandbox or dogfood gcp) to prove it works?
add workflow in unit tests.

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

cosmicBboy · 2024-08-30T15:28:37Z

thank you for this! can we have

register the task to remote cluster (sandbox or dogfood gcp) to prove it works?

add workflow in unit tests.

Tests for this exist here and here. The changes made to the PR (adding the new client_secret_env_var argument) and loosening the strictness of secrets only really applies to Union serverless.

The code snippet in the PR description should work on flyte sandbox

Future-Outlier

LGTM, thank you

thomasjpfan · 2024-08-30T20:26:53Z

flytekit/experimental/eager_function.py

+            remote_cls = type(remote)
+            return remote_cls()


This assumes that the other FlyteRemote object can be created without a Config and uses an environment variable. Is this okay for OSS?

Also, should this carry over the default_domain and default_project from the original remote obj? Or should the other remote object figure out the domain and project from the execution environment?

ooo yeah it should probably carry over those args.

admittedly this code path is basically: "tell me you only run on serverless without saying you run on serverless" 🙃

Basically client_secret_env_var only really applies to serverless (this will fail with Flyte), unless at some point down the road Flyte provides an single API key UX.

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

wild-endeavor

some questions, but this event loop thing... can you explain why it's needed in entrypoint.py but there doesn't seem to be another one for local execution? Won't both need it?

wild-endeavor · 2024-09-04T08:07:01Z

flytekit/experimental/eager_function.py

+    try:
        secret_requests.append(Secret(group=client_secret_group, key=client_secret_key))
+    except ValueError:
+        pass


what if there's nothing in the client_secret_env_var environment variable?

client_secret_env_var doesn't figure into this try except block, or am I missing something?

no yeah i know, i mean, what happens if all three (client_secret_env_var, client_secret_group and client_secret_key) are None?

wild-endeavor · 2024-09-04T08:08:01Z

flytekit/experimental/eager_function.py

    :param local_entrypoint: If True, the eager workflow will can be executed locally but use the provided
        :py:func:`~flytekit.remote.FlyteRemote` object to create task/workflow executions. This is useful for local
        testing against a remote Flyte cluster.
+    :param client_secret_env_var: if specified, binds the client secret to the specified environment variable for


okay with this for now, but I think we should see if we can't move this to some higher level in the future. the secret here we're talking about is the secret for hitting admin itself right?

this should be implemented in Secret but we haven't moved on that for a while: #1726

wild-endeavor · 2024-09-04T08:11:02Z

flytekit/experimental/eager_function.py

+
+    if client_secret_env_var is not None:
+        # this creates a remote client where the env var client secret is sufficient for authentication
+        os.environ[client_secret_env_var] = client_secret


confused what's happening here. if we're removing the asserts, then that means the group/key might be None. If they're none, won't the secrets_manager.get fail? So then what is the client_secret? Or is there some default in the secrets manager that somehow gets returned?
I don't get why we're setting the environment variable here to the value of client secret.

this is to make it easy to authenticate with UnionRemote on serverless (via UNION_SERVERLESS_API_KEY env var). Secrets on serverless only needs a secret key (no group).

just wondering, in the following scenario:

client_secret_key is None

client_secret_group is None
(and this scenario is now valid because the asserts are now removed)
after line 626 client_secret = secrets_manager.get(client_secret_group, client_secret_key)
what is the value of client_secret?

going through the secretsmanager, it looks like it raises an IndexError if you call get(group=None, key=None), so I don't think it ever gets past line 626. Could we move the assertion that at least one of those has be be non-None into the get function?

wild-endeavor

Had a few more comments and went over this with @eapolinario. Could we

Add an assert to secretsmanager.get that at least one of group/key should be non-None
Could we move the async loop stuff to another PR? I think it'd be helpful to discuss it separately. Also do you remember how you triggered the issue? I wanna repro locally to understand more what langchain is doing. Ideally we handle in such a way that it's completely independent of any other library.

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

cosmicBboy · 2024-09-05T14:13:45Z

Had a few more comments and went over this with @eapolinario. Could we

Add an assert to secretsmanager.get that at least one of group/key should be non-None

Updated PR with client secret key/group assertion.

Could we move the async loop stuff to another PR? I think it'd be helpful to discuss it separately. Also do you remember how you triggered the issue? I wanna repro locally to understand more what langchain is doing. Ideally we handle in such a way that it's completely independent of any other library.

Deleted this. For some reason I'm unable to repro the async issue I was seeing earlier on eager or with langchain async document loaders https://github.com/unionai-oss/union-rag/blob/main/union_rag/langchain.py#L138-L141

cosmicBboy added 4 commits August 29, 2024 10:18

fix eager mode

9b65742

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

bind secret to env var

4e04b82

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

add remote creation error handling

5abc9ad

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

update new arg to client_secret_env_var

9b9dd83

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

cosmicBboy requested review from Future-Outlier, eapolinario, kumare3, pingsutw, samhita-alla, thomasjpfan and wild-endeavor as code owners August 29, 2024 16:59

cosmicBboy added 7 commits August 29, 2024 13:03

fix lint

f8f97b5

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

fix bug

54c4471

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

fix kwargs

b526f6f

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

try creating secret

66c6bd4

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

add event loop if needed

9ec46fb

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

debug

d21559f

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

debug

aec338a

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

cosmicBboy commented Aug 30, 2024

View reviewed changes

Future-Outlier reviewed Aug 30, 2024

View reviewed changes

flytekit/bin/entrypoint.py Outdated Show resolved Hide resolved

Future-Outlier reviewed Aug 30, 2024

View reviewed changes

update error

f0b5d67

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

Future-Outlier previously approved these changes Aug 30, 2024

View reviewed changes

thomasjpfan reviewed Aug 30, 2024

View reviewed changes

pass default domain and project to new remote

351a8c6

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

cosmicBboy dismissed Future-Outlier’s stale review via 351a8c6 August 30, 2024 20:57

cosmicBboy enabled auto-merge (squash) September 3, 2024 16:42

wild-endeavor reviewed Sep 4, 2024

View reviewed changes

wild-endeavor reviewed Sep 5, 2024

View reviewed changes

add client secret assertion

2425ed4

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

wild-endeavor approved these changes Sep 5, 2024

View reviewed changes

cosmicBboy merged commit 9f9f197 into master Sep 5, 2024

cosmicBboy deleted the update-eager branch September 9, 2024 14:11

Conversation

cosmicBboy commented Aug 29, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why are the changes needed?

What changes were proposed in this pull request?

How was this patch tested?

Uh oh!

codecov bot commented Aug 29, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Future-Outlier left a comment

Choose a reason for hiding this comment

Uh oh!

cosmicBboy commented Aug 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Future-Outlier left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cosmicBboy Aug 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wild-endeavor left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wild-endeavor left a comment

Choose a reason for hiding this comment

Uh oh!

cosmicBboy commented Sep 5, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

cosmicBboy commented Aug 29, 2024 •

edited

Loading

codecov bot commented Aug 29, 2024 •

edited

Loading

cosmicBboy commented Aug 30, 2024 •

edited

Loading

cosmicBboy Aug 30, 2024 •

edited

Loading