Skip to content

Conversation

jpacik
Copy link
Contributor

@jpacik jpacik commented Oct 26, 2018

No description provided.

@@ -318,6 +319,7 @@ func (c *Controller) processQuery(q *Query) (pop bool, err error) {
return true, errors.New("failed to transition query into executing state")
}
q.alloc = new(execute.Allocator)
// TODO: pass the plan to the executor here
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can go away.

Copy link
Contributor

@nathanielc nathanielc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just gave it a read through.


## Data Frames
type PhysicalPlanner interface {
Plan(*PlanSpec) (*PlanSpec, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an API design standpoint, is it possible to use the logical planner, skip the physical planner, send the plan spec to the executor, and get a valid (albeit not optimized) result?

If it's not, I think we should change the API to have the physical planner return a physical plan object or something that gives us a strong type guarantee that the correct path has been followed.

Another thing I haven't been clear about is if there's a reason why, on the interface level alone, these two are separated.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically, today it's not possible to use just the logical planner and skip physical planning, because a each from needs a range to be pushed into it, and that push down is done as a physical optimization.

Some more depth: right now all our procedure specs are both logical procedure specs and physical procedure specs. That is, they all have a Cost method and can be translated directly into an instance of Transformation or Source. But that won't always be the case; at some point we'll have (for example) a JoinProcedureSpec which is logical (no Cost method and can't be converted to a Transformation) and the physical planner will need to choose either MergeJoinProcedureSpec or NestedLoopProcedureSpec for the physical plan.

(I need to add the above to the docs.)

I think you're right that the logical planner should return something more opaque. Josh and I have talked about breaking up PlanSpec into logical and physical counterparts.

As to why the logical/physical divide is exposed on the interface level, that's a good question. Honestly, I was following the example of what was there in the control package already. I can think of good reasons for the divide, like maybe wanting to cache logical plans to use later on, but that seems to be a ways off.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like the answer and I'm really looking forward to a further API-level differentiation between the two. I can easily see now why the difference would be needed, but we do need to have them as separate APIs. The influxql query engine suffered for a long time because the outward facing APIs made a lot of assumptions about the previous steps that were invoked without having them encoded as part of the type system so it's important that we do that before having a stable API.

// Rewrite attempts to rewrite a `from -> range` into a `FromRange`
func (rule MergeFromRangeRule) Rewrite(node plan.PlanNode) (plan.PlanNode, bool, error) {
from := node.Predecessors()[0]
fromSpec := from.ProcedureSpec().(*FromProcedureSpec)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we did matching rules on the actual types and the kind was only used for JSON encoding. I always feel a bit uncomfortable with encoding the kind into logic like this, but it's everywhere else so this comment is by no means a suggestion to change this.

I would like to revisit it in the future though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to using something else. Ultimately, we want our patterns to be able to not just recognize a procedure of a particular type, but some class of procedures within a taxonomy (like any source that can have a filter pushed into it, any aggregate function, etc).

@jsternberg
Copy link
Contributor

Even if the commits are messy, this is likely best to merge without attempting any rebase or squashing any commits.

Christopher M. Wolff and others added 11 commits October 26, 2018 14:42
* remove old planner

* rename planner package to plan package

* rename all occurrences of planner to plan
* incorrect removal of predecessors and successors

* move GroupMode to functions package due to circular dependency

* need to register sources

* push range down into from

* make rule names consistent

Also use intersect semantics for pushing
multiple range operations into a from operation.

* fix remove yields

* change rule name
* register procedures with side effects

* do not generate yield nodes for side effect operations

* use mock procedure spec in plantest
@jpacik jpacik merged commit 39f39ce into master Oct 26, 2018
@jpacik jpacik deleted the feat-planner branch October 26, 2018 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants