-
Notifications
You must be signed in to change notification settings - Fork 935
Fix json decode error #2653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix json decode error #2653
Conversation
|
Netflix internal testing[1398] @ 7594e0f |
| # All data read, exit main loop | ||
| break | ||
| else: | ||
| if len(events): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When does this happen? So we got some event (like file close?) and no data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If poll() returned an event (any event) AND read() returned 0 bytes then it must be EOF:
- If it was POLLIN (data ready), then read() would have returned data
- If read() returned 0 bytes despite an event, it must be POLLHUP (writer closed) or a stale POLLIN that resolved to EOF
7594e0f to
ba8226f
Compare
ba8226f to
d946e02
Compare
Problem
Intermittent
JSONDecodeErrorwhen multiple environments are resolved concurrently during deployment:Root Cause
Race condition in FIFO-based IPC between deployer subprocess and parent process:
close(), reader detects process exit and breaks on empty readSolution
Changed
read_from_fifo_when_ready()to use a hybrid approach:select.poll()to wait for datafcntl()to removeO_NONBLOCKflagread()callsread()returns EOF (0 bytes) ONLY after writer closes AND all kernel pipe buffers are drained