-
Notifications
You must be signed in to change notification settings - Fork 7.1k
[release test] move default env vars to docker file #59791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors how default environment variables for release tests are handled by moving them from the Python test runner code into the byod.Dockerfile. This is a positive change as it makes these environment variables a fundamental part of the test image. The related changes in the Python code and tests are correct. My feedback focuses on improving the Dockerfile by consolidating the new ENV instructions to reduce image layers, following Docker best practices.
| ENV RAY_BACKEND_LOG_JSON=1 | ||
|
|
||
| # Logs the full stack trace from Ray Data in case of exception, | ||
| # which is useful for debugging failures. | ||
| ENV RAY_DATA_LOG_INTERNAL_STACK_TRACE_TO_STDOUT=1 | ||
|
|
||
| # To make ray data compatible across multiple pyarrow versions. | ||
| ENV RAY_DATA_AUTOLOAD_PYEXTENSIONTYPE=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reduce the number of layers in the Docker image and improve maintainability, it's a good practice to group related ENV instructions into a single one. This also allows for a consolidated comment block explaining all the variables. I've also added a comment for RAY_BACKEND_LOG_JSON for clarity and consistency.
# Enable JSON formatted logs for the Ray backend.
# Logs the full stack trace from Ray Data in case of exception for debugging.
# Makes Ray Data compatible across multiple pyarrow versions.
ENV RAY_BACKEND_LOG_JSON=1 \
RAY_DATA_LOG_INTERNAL_STACK_TRACE_TO_STDOUT=1 \
RAY_DATA_AUTOLOAD_PYEXTENSIONTYPE=1
17c7947 to
5cc509b
Compare
| ) | ||
|
|
||
| return default | ||
| return _convert_env_list_to_dict(self["cluster"]["byod"].get("runtime_env", [])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests using anyscale/ray images lose default ENV vars
Moving default environment variables from get_byod_runtime_env() to byod.Dockerfile causes a regression for tests using anyscale/ray images (when ray_version is specified). These tests previously received RAY_BACKEND_LOG_JSON, RAY_DATA_LOG_INTERNAL_STACK_TRACE_TO_STDOUT, and RAY_DATA_AUTOLOAD_PYEXTENSIONTYPE via the runtime env mechanism, but now won't have them since those official Anyscale images aren't built from byod.Dockerfile. The RAY_DATA_AUTOLOAD_PYEXTENSIONTYPE variable in particular could cause PyArrow compatibility issues in affected tests.
Additional Locations (1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change is actually a good change. test using ray_version are mostly templates, and should not use these feature-flag-ish env vars.
new cli/sdk does not have cluster env runtime env any more Signed-off-by: Lonnie Liu <[email protected]>
5cc509b to
b4d718a
Compare
new cli/sdk does not have cluster env runtime env any more