-
Notifications
You must be signed in to change notification settings - Fork 16.4k
perf: cache dashboard bootstrap data #11234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #11234 +/- ##
==========================================
- Coverage 61.47% 59.05% -2.42%
==========================================
Files 832 797 -35
Lines 39445 38232 -1213
Branches 3598 3396 -202
==========================================
- Hits 24248 22579 -1669
- Misses 15015 15471 +456
Partials 182 182
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
superset/connectors/base/models.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to add this to SQLAlchemy events, just like what we did with slices, but it was somehow triggered multiple times (4 or 5), which could have performance risks when a datasource is used in multiple dashboards.
So I moved it to this BaseDatasource API instead. The drawback is it may not trigger if the datasource is updated from FAB CRUD view (an unlikely use case).
superset/models/dashboard.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove quotes for consistency.
superset/models/dashboard.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__repr__ is used in cache key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
superset/charts/commands/delete.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't use before_delete event for Slice because the association record in the relationship table datasource_slice is deleted before the Slice object is deleted.
superset/dashboards/dao.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will throw 500 errors when chart don't have uuid, which could happen when chart is deleted but the dashboard cache is not purged for some reason.
tests/superset_test_config.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we can override the configs from SUPERSET_CONFIG_PATH (I've set it to ~/.superset/config.py).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this do anything? FEATURE_FLAGS is empty by default in the config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But SUPERSET_CONFIG_PATH (a config file) is loaded before SUPERSET_CONFIG (a Python module). This allows users to override the feature flags in SUPERSET_CONFIG_PATH (a file not tracked in Git) when starting superset like this:
SUPERSET_CONFIG=tests.superset_test_config superset run
ebe41c8 to
ee36591
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaned up to break circular imports, because I want to import DruidMetric and DruidColumn from dashboard.py.
superset/models/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a duplicate of https://github.com/apache/incubator-superset/blob/93fdf1d64465b3378de65fb1b058e1a04742a70f/superset/models/dashboard.py#L573
cc @suddjian
Removed so that it's possible to import models.core from dashboard.py. It makes more sense to let Dashboard depends on core (if needed) than vice versa.
superset/models/dashboard.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did some refactor for consistency.
4ba6fc2 to
1183dde
Compare
superset/views/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes a 500 error when duplicating a dashboard with deleted slices.
96590c2 to
19c6395
Compare
etr2460
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we add unit tests for the new decorators?
graceguo-supercat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
d78ab7c to
325e5b8
Compare
325e5b8 to
7939b0a
Compare
Added some basic tests. |
ca38c89 to
985fdb0
Compare

SUMMARY
Large and complex dashboards can be very slow to load because they need to read all the related slices and datasources and they all have go through SQL queries with no caching enabled.
A previous attempt to cache the rendered dashboard page had to be reverted because it did not consider user-specific data on that page. Why don't we cache the non-user-specific bootstrap data instead?
This PR extracts the data building logics from the
/dashboard/:id_or_slugFlask view to theDashboardmodel and uses Flack caching decorator to manage the cache. Cache will be cleaned up when any one of the dashboard, the associated slices, datasources, or table columns/metrics is updated.When tested with a dashboard of 300+ slices in our staging environment, this change can save up to more than 3 seconds of page load time.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A
TEST PLAN
Make sure the dashboard page still work as expected.
ADDITIONAL INFORMATION