Skip to content

Conversation

@allisonschiang
Copy link
Member

@allisonschiang allisonschiang commented Nov 11, 2025

Old error:
2025-11-10T22:55:40.444Z ERROR client gosdk/gosdk.go:28 context deadline exceeded; context deadline exceeded; mDNS query failed to find a candidate

new error:
2025-11-11T16:13:31.449Z ERROR client gosdk/gosdk.go:28 failed to connect to robot within time limit. check network connection and try again. see http://docs.viam.com/dev/tools/common-errors/#conn-time-out for troubleshooting steps.

note: this only changes CLI and Go SDK logs, as other SDKs have their own conn logic

note2: currently no viam link system to shorten links so itll stay long

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Nov 11, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 11, 2025
@aldenh-viam
Copy link
Contributor

aldenh-viam commented Nov 12, 2025

Same as other PRs: Would there be any significant performance improvement if we use the errors.Is pattern instead of strings.Contains?

@allisonschiang allisonschiang requested a review from a team November 12, 2025 21:44
@allisonschiang
Copy link
Member Author

Same as other PRs: Would there be any significant performance improvement if we use the errors.Is pattern instead of strings.Contains?

the error returned when the user has no internet and all three connection types fail is "context deadline exceeded; context deadline exceeded; mDNS query failed to find a candidate" which all individually don't have an error type. I also believe context deadline exceeded is returned from an external package, so would have to check it coming in if its that error with strings.Contains() and then make it into a Viam error type anyways? unless there is another way to do it

@cheukt
Copy link
Member

cheukt commented Nov 13, 2025

would prefer the change to be made where the error is getting returned, so in goutils in the dial function. There it'll be easier to use errors.Is

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 13, 2025
@allisonschiang
Copy link
Member Author

would prefer the change to be made where the error is getting returned, so in goutils in the dial function. There it'll be easier to use errors.Is

connectWithLock() seems to be calling dial() once for webRTC and then once for gRPC, so not sure how to make the changes in dial() when we want both dial() attempts to have the contextDeadlineExceeded before we return "failed to connect to robot within time limit". I did make the change to use Errors.is() on dial returns though.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 13, 2025
@cheukt
Copy link
Member

cheukt commented Nov 14, 2025

for example, we can have https://github.com/viamrobotics/goutils/blob/0101be5c0b20620db2ee121dd37dccdcec751361/rpc/dial.go#L103 return a different error (maybe something like all connection attempts timed out) if all connection methods failed with deadline exceeded or mDNS query failed to find a candidate. Then here, we can check that the last error is all connection attempts timed out and then return this custom error. I suggest doing it this way because goutils shouldn't return errors that link specifically to viam doc sites, so something intermediate would be preferred.

As it is, I think we're only checking that one out of the possibly three bundled errors is a context deadline exceeded error, which would be incorrect

@cheukt
Copy link
Member

cheukt commented Nov 18, 2025

talked offline - go clients makes a second dial if the first dial fails. the first dial already (unintentionally) swallows the mDNS timeout, which was why I was a bit confused. I think it makes sense to make the change here instead of a combination of both goutils and rdk, but to make sure that the err is deadline exceeded, and grpcErr comprises of both deadline exceeded and mDNS query failed to find a candidate

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 18, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 18, 2025
}

// instead of returning two context.DeadlineExceeded errors, tell the user how to fix
if errors.Is(err, context.DeadlineExceeded) && errors.Is(grpcErr, context.DeadlineExceeded) && errors.Is(grpcErr, rpc.ErrMDNSQuery) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ErrMDNSQuery gets exported with my matching goutils changes

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 19, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 20, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 20, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 20, 2025
@allisonschiang
Copy link
Member Author

tested again and got error message: failed to connect to machine within time limit. check network connection, whether the viam-server is running, and try again. see https://docs.viam.com/dev/tools/common-errors/#conn-time-out for troubleshooting steps

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 21, 2025
// connection and that the machine is on. This should be more helpful than simply returning a chain of context.DeadlineExceeded
// and candidate not found errors.
const connTimeoutURL = "https://docs.viam.com/dev/tools/common-errors/#conn-time-out"
if errors.Is(err, context.DeadlineExceeded) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should do this check before we combine the errors in L538

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 21, 2025
Copy link
Member

@cheukt cheukt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@allisonschiang
Copy link
Member Author

LGTM

oops sorry I rerequested review

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 21, 2025
@allisonschiang allisonschiang merged commit 0ecbc0d into viamrobotics:main Nov 22, 2025
18 checks passed
@allisonschiang allisonschiang deleted the APP-8799 branch November 22, 2025 05:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test This pull request is marked safe to test from a trusted zone

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants