definitions

package
v0.15.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 21, 2025 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Overview

Definitions resolves paths to sets of ast.Node. It is used in the LSP for "jump-to-definition" functionality, amongst others. A path is either an ast.Ident, or a CUE expression followed by zero or more idents, all chained together by dots.

Introduction

In the text that follows, subscripts are used in order to make identifiers (idents) unique for the purpose of explanation, but they should not be considered part of the ident itself, from the point of view of CUE.

For example, in the code:

x₁: 17
y: x₂

If the user places their cursor on x₂ and invokes "jump-to-definition", the cursor should move to x₁. In CUE, there can be several nodes that define a binding. For example:

x₁: 17
y: x₂
x₃: int

Now, if the user places their cursor on x₂ and invokes "jump-to-definition", they should see both x₁ and x₃ as targets to which they can jump.

The implementation is a lazy, memoized, call-by-need evaluator. The only purpose of this evaluator is to calculate what each element of each path resolves to; there is no calculation of fixed-points, no subsumption, no unification. And the little that this evaluator does do is imprecise. For example, it does not test field names (even when known) against patterns. It does not compute the names of dynamic fields, even when it is trivial to do so statically. It is a MAY-analysis and not a MUST-analysis. This means that it may offer jump-to-definition targets that do not occur during full evaluation, but which we are unable to dismiss with only the simple evaluation offered here. A good example of this is with disjunctions:

x₁: {a₁: 3} | {a₂: 4}
y₁: x₂
y₂: a₃: <n₁
n₂: 5
z₁: y₃.a₄

Here, a₄ will resolve to both a₁ and a₂, even though the constraint via a₃ may (or may not) eliminate one (or both!) branches of the disjunction.

Algorithm 1: simplified CUE

In CUE, a path such as x.y.z, where x is an ident, is only legal if x is defined in the same lexical scope as the path x.y.z, or any ancestor lexical scope. There is one exception to this which is the package scope, which arguably doesn't exist lexically. We return to the package scope much later on.

This restriction on paths complicates the algorithm. For example:

x₁: y₁: x₂.a₁
x₃: {
	x₄: a₂: 17
	z₁: x₅.a₃
}
x₆: a₄: 18

Here, x₂ refers to x₁, x₃, and x₆, whilst x₅ refers only to x₄. Similarly, a₁ refers to a₄, but a₃ refers to a₂.

To explain this evaluator, we start with a simplified version of CUE which does not place this restriction on paths: i.e. the first (and possibly only) element of a path may resolve to a definition that does *not* exist in the same lexical scope (or ancestor of) as that path.

In this evaluator, an "astNode" is a collection of bindings, i.e. key-value pairs. The values are themselves astNodes. An astNode is created with one or more unprocessed ast.Node values, for example, an ast.File, or an ast.StructLit.

When an astNode is evaluated, its unprocessed values are unpacked. An ast.StructLit for example contains a number of ast.Decl nodes, which are themselves then processed. When a astNode encounters an ast.Field, the astNode ensures a binding exists for the field's name, and adds the field's value to the binding's astNode's unprocessed values. Thus if the same field definition is encountered multiple times, its values are accumulated into a single astNode. Note that evaluation of an astNode is not recursive: its bindings are not automatically evaluated. Thus an astNode is the unit of evaluation; by adding new astNodes you can create new points where evaluation can pause (and potentially resume later on).

If, during evaluation, an astNode encounters a path, the path will correspond either to the value of a field (i.e. the astNode is for something like x: y), or an embedding into a struct. The astNode keeps track of these embedded paths and once processing of the astNode's values is complete, it then resolves the embedded paths to further astNodes, and records that this astNode itself resolves to these other astNodes (the resolvesTo field).

The consequence is that the evaluation of an astNode creates and fully populates (with their unprocessed values) all of its bindings before any resolution of paths occurs. Thus evaluation can be driven by demand: if a path is encountered that accesses one of the astNode's bindings (or any binding of an ancestor astNode), then it is guaranteed that the binding (if it exists) contains its complete set of values before it is accessed, and so it is safe to evaluate.

Consider this example:

x: y
y: {
	a: 3
	b: y.a
}

Evaluating the outermost astNode will create two bindings, one for x (with the path y as its value), and one for y (containing the ast.StructLit as its value). If the astNode for y is evaluated, it will create its own bindings for a (containing the ast.BasicLit 3), and for b (containing the path y.a).

Imagine we want to resolve, in the outermost astNode, the path x.a. We first evaluate the outermost astNode, then inspect its bindings. We find an x in there, so we grab that astNode. This completes resolving the x of x.a. We now wish to find an a within that astNode, so we evaluate it. This astNode contains only the path y and so we have to resolve y and record that result within our astNode.

Every astNode knows its own parent astNode. This astNode containing the path y will inspect its own bindings for y, and find nothing. It asks its ancestors whether they know of a binding for y. Its parent does have a binding for y, so we grab that astNode. This completes the resolution of y, and thus the evaluation of the astNode that contains the path y. We now ask this same astNode whether it contains a binding for a. It doesn't, but we also inspect all the astNodes that this node resolves to. There is one resolved-to astNode, and it does contain a binding for a, so we grab that. This completes the resolution of x.a.

In summary: this algorithm traverses the AST breadth first and incrementally, to lazily merge together bindings that share the same path into astNodes.

Unmentioned is that there are various ast.Expr types that can use paths but not declare their own bindings, for example an interpolated string. When these are encountered during evaluation, the astNode accumulates and processes them in the same way as embedded paths. The only difference is they don't need to be recorded within the node's resolves-to set.

Querying

In the previous section, we walked through the example of attempting to resolve the path x.a in the outermost astNode. But this isn't what an LSP client will ask. An LSP client doesn't know what path the cursor is on, nor anything about the current scope or how these may correspond to astNodes. The LSP client knows only the cursor's line and column number.

To facilitate an API that allows querying by file-coordinates, astNodes are extended with a rangeset. For each ast.Node that an astNode processes, it adds to its rangeset the range from the node's start file-offset to its end file-offset. Then, when asked to resolve whatever exists at some file-coordinate, we only need to evaluate the astNodes that contain the file-coordinate in question.

Algorithm 2: real CUE

If we stuck to algorithm 1, it would mean that in:

a₁: b: c: a₂
a₃: b: a₄: 5

a₂ would resolve to a₄. It also means that you get scary collisions with aliases, for example:

a: l₁=b: c: l₂.x
a: x: l₃.c

Here, l₃ resolves to l₁, or b. So the rule that if the the first element of any path is an ident, then it can only be resolved lexically, must be implemented. This means that this evaluator must model "lexical bindings" which are candidates for resolving the first element of a path, separately from "navigable bindings" which are candidates for resolving the rest of the path (as you navigate the path...). The lexical bindings do not have the "merging" behaviour of algorithm 1, for example:

x₁: y₁: 6
x₂: y₂: 7

Whereas before (in Algorithm 1) the evaluator would create one binding for x, now the evaluator creates two bindings for x, each having a distinct astNode value. Both of those astNodes share a "navigable bindings" struct and so any children that either of these astNodes have, can be grouped together appropriately via their shared "navigable bindings". Thus in this example, the evaluation of the outermost astNode creates two bindings for x; their distinct astNodes share a "navigable bindings", and also have one binding each for their respective y fields. These y fields are grouped together within the shared "navigable bindings".

This means that when resolving the first element of a path, we can walk up the lexical bindings only, and then once that's resolved, switch to the navigable bindings for the rest of the path.

For aliases, comprehensions and one or two other things, a binding can be created in the current astNode which is not added to the astNode's navigable bindings. This means it can only ever be found and used as the first element of a path. A navigable binding is always also a lexical binding, but a lexical binding need not be a navigable binding.

File and Package scopes

CUE states that fields declared at the top level of a file are not in the file's scope, but are in fact in the package's scope. At construction, the file astNodes all share a "navigable bindings". Thus if two different files in the same package both declare the same field, they will be correctly grouped together within that navigable bindings.

When a file astNode processes an ast.File, lexical and navigable bindings will be created as normal. When resolving the first element of a path in some deeper astNode, it can be the case that after walking up the chain of ancestor astNodes, no matching lexical binding is found even within the relevant file's astNode's bindings. In this case, it is safe to directly inspect the file's navigable bindings, which amount to the package's lexical bindings. In this way, a path's first element can be an ident that is only declared in some separate file within the same package, and yet it can still be resolved.

Field declaration keys

We wish for jump-to-definition from a field declaration's key to resolve to other declarations of the same field. For example:

x₁: y₁: int
x₂: y₂: 7

x₁ and x₂ should resolve to each other. Similarly y₁ and y₂. To achieve this, when a field is encountered and a new binding added, the new binding's astNode itself adds a new child astNode that contains a [fieldDeclExpr] as its unprocessed value. This value, when evaluated, walks up the navigable bindings ancestors, gathering their names, and stopping when either the package root is reached, or the navigable binding has no name. From this oldest ancestor, the calculated path is then resolved using the normal mechanics for path resolution. This path will resolve to all the declarations of the field in question. Imagine that in the above example, both int and 7 are replaced with the path x.y.

Find Usages / References

When a field usage is resolved to its definition navigable binding:

  1. we record that the file offsets of the field use resolve to this navigable binding;
  2. we record the field usage node in that navigable binding.

For example:

x: y
y: 3

The y on line 1 will resolve to the navigable bindings that represent the y field of line 2. We record that the file offsets for the y of line 1 resolve to that navigable binding. Inside that navigable binding we record it is "used by" the y node of line 1.

So, when the user asks for Find Usages for a given file offset, we first find the definition of that offset as described above. Field usages resolve to their declarations, and field declarations resolve to themselves. Having found the correct navigable binding, we now need to force the evaluation of every astNode that could possibly make use of this navigable binding; evaluating an astNode is enough to add entries to the relevant navigable bindings.

In general, a field can be accessed from anywhere, and so having found the definition of a field, the simple approach is to evaluate every astNode that that contributes to the definition's navigable bindings, all of the descendants of those astNodes, all of the ancestors of those astNodes, and also those ancestors' descendants. In other words, everything. This could get expensive, so there are a couple of ways we can curtail evaluation.

x: {h: 3, y: 4}.y

If the user is asking for the usages of h, we evaluate everything within the in-line struct lit, but we can detect that h is not used by x and so we do not need to evaluate anything further up the tree.

x: {i: h: 3, y: 4}.i

Here though, we don't care just about h, we have to also care about the path to h from within this in-line struct, i.e. [i,h]. We can detect that i is used by x, and so h has escaped from the in-line struct and so we need to fully evaluated further up the tree where there might be an x.h.

Even if a field does escape an in-line struct, other expressions can block it from being accessed.

x: struct.MaxFields({i: h: 3}.i, 1)

Although h escapes from the in-line struct, it does not escape the function call, so again we do not need to evaluate anything further up the tree.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Definitions

type Definitions struct {
	// contains filtered or unexported fields
}

Definitions provides methods to resolve file offsets to their definitions.

func Analyse

func Analyse(ip ast.ImportPath, importCanonicalisation map[string]ast.ImportPath, forPackage DefinitionsForPackageFunc, pkgImporters ImportersFunc, files ...*ast.File) *Definitions

Analyse creates and performs initial configuration of a new Definitions value. It does not perform any analysis eagerly. All files provided will be treated as if they are part of the same package. The set of files cannot be modified after construction; instead, construction is cheap, so the intention is you replace the whole Definitions value.

For the ip, importCanonicalisation, forPackage, and pkgImporters parameters, see the corresponding fields in Definitions.

func (*Definitions) ForFile

func (dfns *Definitions) ForFile(filename string) *FileDefinitions

ForFile looks up the FileDefinitions for the given filename.

func (*Definitions) Reset

func (dfns *Definitions) Reset()

Reset clears all cached analysis from Definitions. The Definitions are returned to the same state as after initial construction and before any analysis (evaluation) has taken place. This is useful for when foreign packages (which are (transitively) imported by this package, or import this package) have been modified but this package itself has not been.

type DefinitionsForPackageFunc

type DefinitionsForPackageFunc = func(importPath ast.ImportPath) *Definitions

DefinitionsForPackageFunc is a callback function used to resolve package definitions by their import path. It returns the Definitions for the given import path, or nil if the package cannot be resolved.

type FileDefinitions

type FileDefinitions struct {

	// File is the original [ast.File] that was passed to [Analyse].
	File *ast.File
	// contains filtered or unexported fields
}

FileDefinitions provides methods to resolve file offsets within a certain file to their definitions.

func (*FileDefinitions) CompletionsForOffset

func (fdfns *FileDefinitions) CompletionsForOffset(offset int) (fields, embeds []string, startOffset, fieldEndOffset, embedEndOffset int)

CompletionsForOffset reports the set of strings that can form a new path element following the path element indicated by the offset (number of bytes from the start of the file).

func (*FileDefinitions) DefinitionsForOffset

func (fdfns *FileDefinitions) DefinitionsForOffset(offset int) []ast.Node

DefinitionsForOffset reports the definitions that the file offset (number of bytes from the start of the file) resolves to.

func (*FileDefinitions) DocCommentsForOffset

func (fdfns *FileDefinitions) DocCommentsForOffset(offset int) map[ast.Node][]*ast.CommentGroup

DocCommentsForOffset is very similar to DefinitionsForOffset. It reports the doc comments associated with then definitions that the file offset (number of bytes from the start of the file) resolves to.

func (*FileDefinitions) UsagesForOffset

func (fdfns *FileDefinitions) UsagesForOffset(offset int, includeDefinitions bool) []ast.Node

UsagesForOffset reports the nodes that make use of whatever the file offset (number of bytes from the start of the file) resolves to.

type ImportersFunc

type ImportersFunc = func() []*Definitions

ImportersFunc is a callback function to fetch the package definitions for packages that directly import the current package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL