website/blog/typed-napi.md
82 additions & 62 deletions
@@ -122,13 +122,9 @@ Thanks to Tree-Sitter's design, we can leverage this rich type information to bu
122
122
123
123
Our new API follows a progressive enhancement approach to type safety:
124
124
125
-
**Preserve untyped AST access**
125
+
**Preserve untyped AST access**. The existing untyped API remains available by default, ensuring backward compatibility
126
126
127
-
The existing untyped API remains available by default, ensuring backward compatibility
128
-
129
-
**Optional type safety on demand**
130
-
131
-
Users can opt into typed AST nodes either manually or automatically for enhanced type checking and autocompletion
127
+
**Optional type safety on demand**. Users can opt into typed AST nodes either manually or automatically for enhanced type checking and autocompletion
132
128
133
129
However, it is a bumpy ride to transition to a new typed API via the path of Tree-sitter's static type.
134
130
@@ -138,28 +134,31 @@ Second, the JSON contains a lot of unnamed kinds, which are not useful to users.
138
134
139
135
Finally, as mentioned earlier, the JSON contains alias types. We need to resolve the alias type to its concrete type, which is also covered in the next section.
140
136
141
-
## Define Type
137
+
## Define Types
138
+
139
+
The new API's core involves several key new types and extensions to existing types.
142
140
143
-
### Give `SgNode` its type
141
+
### Let `SgNode` Have Type
144
142
145
-
We add two type parameters to `SgNode` to represent the language type map and the node's kind.
146
-
`SgNode<M, K>` is the main type in the new API. It is a generic type that represents a node with kind `K` of language type map `M`. By default, it is a union of all possible kinds of nodes.
143
+
The `SgNode` class, the cornerstone of our new API, now accepts two new optional type parameters.
  fields: M[K]['fields'] // demo definition, real one is more complex
152
149
}
153
150
```
154
151
155
-
It provides a **correct** interface for an AST node in a specific language. While it is still **robust** enough to not trigger compiler error when no type information is available.
152
+
It represents a node in a language with type map `M` that has a specific kind `K`. For example, `SgNode<TypeScript, "function_declaration">` means a function declaration node in TypeScript. When used without a specific kind parameter, `SgNode` defaults to accepting any valid node kind in the language.
153
+
154
+
`SgNode` provides a **correct** AST interface for a specific language. At the same time, it is still **robust** enough not to trigger compiler errors when no type information is available.
156
155
157
156
158
157
### `ResolveType<M, T>`
159
158
160
-
TreeSitter's type alias is helpful to reduce the generated JSON file size but it is not useful to users because the alias is never directly used as a node's kind nor is used as `kind` in ast-grep rule. For example, `declaration` mentioned above can never be used as `kind` in ast-grep rule.
159
+
While Tree-sitter's type aliases help keep the JSON type definitions compact, they present a challenge: these aliases never appear as actual node kinds in ast-grep rules.
161
160
162
-
We need to use a type alias to **correctly** resolve the alias type to its concrete type.
161
+
To handle this, we created `ResolveType` to **correctly** map aliases to their concrete kinds:
163
162
164
163
```typescript
165
164
type ResolveType<M, T extends keyof M> =
@@ -168,122 +167,143 @@ type ResolveType<M, T extends keyof M> =
168
167
    : T
169
168
```
170
169
170
+
This type recursively resolves aliases until it reaches actual node types that developers work with.
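To make the recursion concrete, here is a minimal sketch against a toy type map. `ToyTypeMap`, `ResolveKind`, and the exact shape of the `subtypes` field are assumptions made for illustration; the real `ResolveType` and the generated type maps in ast-grep may differ.

```typescript
// Hypothetical, hand-written type map; real maps are generated from Tree-sitter's JSON.
type ToyTypeMap = {
  // `declaration` is an alias: it only lists the concrete kinds it stands for.
  declaration: {
    type: 'declaration'
    subtypes: [{ type: 'function_declaration' }, { type: 'class_declaration' }]
  }
  function_declaration: { type: 'function_declaration' }
  class_declaration: { type: 'class_declaration' }
}

// Sketch of alias resolution (not the library's definition):
// if a kind lists subtypes, resolve each of them; otherwise the kind is concrete.
type ResolveKind<M, T extends keyof M> =
  T extends keyof M // distribute over unions of kinds
    ? M[T] extends { subtypes: ReadonlyArray<{ type: infer S }> }
      ? ResolveKind<M, Extract<S, keyof M>>
      : T
    : never

// Resolves to 'function_declaration' | 'class_declaration'
type Concrete = ResolveKind<ToyTypeMap, 'declaration'>

const concreteKinds: Concrete[] = ['function_declaration', 'class_declaration']
// const rejected: Concrete = 'declaration' // compile error: the alias itself is not a concrete kind
```

The alias name never survives resolution, which matches the observation that it cannot be used as a `kind`.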
171
+
171
172
### `Kinds<M>`
172
173
173
-
Having a collection of possible AST node kinds is awesome, but it is sometime too clumsy to use a big string literal union type.
174
-
Using a type alias to **concisely** represent all possible kinds of nodes is a huge UX improvement.
174
+
Having access to all possible AST node types is powerful, but it is unwieldy to work with large string literal union types. It can be a huge UX improvement to use a type alias to **concisely** represent all possible kinds of nodes.
175
175
176
-
Also, TreeSitter's static type contains a lot of unnamed kinds, which are not useful to users. Including them in the union type is too noisy. We need to allow users to opt-in to use the kind, and fallback to a plain `string` type, creating a more **robust** API.
176
+
Additionally, Tree-sitter's static type contains a bunch of noisy unnamed kinds. But excluding them from the union type can lead to an incomplete type signature. ast-grep instead bundles them into a plain `string` type, creating a more **robust** API.
The above type is a lenient string type that is compatible with any string type. But it also uses a well-known trick to take advantage of TypeScript's type priority to prefer the `keyof M` type in completion over the `string & {}` type. To make it more self-explanatory, the `string & {}` type is aliased to `LowPriorityString`.
183
+
The above type is a lenient string type that is compatible with any string type. But it also uses a [well-known trick](https://stackoverflow.com/a/61048124/2198656) to take advantage of TypeScript's type priority to prefer the `ResolveType` in completion over the `string & {}` type.
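The trick can be seen in isolation with a small sketch. `ToyKinds` and `describeKind` are hypothetical names used only for this illustration, and the two literal kinds stand in for the named kinds of a real language.

```typescript
// `string & {}` accepts every string, but TypeScript does not collapse it into
// plain `string`, so the literal members of the union survive for editor completion.
type LowPriorityString = string & {}

// Hypothetical kinds: two named kinds plus the open-ended fallback.
type ToyKinds = 'function_declaration' | 'class_declaration' | LowPriorityString

function describeKind(kind: ToyKinds): string {
  return `kind: ${kind}`
}

describeKind('function_declaration') // a named kind, offered by completion
describeKind('{')                    // an unnamed kind, still accepted as a string
```

Had the union been written as `'function_declaration' | 'class_declaration' | string`, TypeScript would simplify it to `string` and the literal suggestions would be lost.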
184
+
184
185
185
-
Problem? open-ended union is not [well](https://github.com/microsoft/TypeScript/issues/33471) [supported](https://github.com/microsoft/TypeScript/issues/26277) in TypeScript.
186
+
We alias `string & {}` to `LowPriorityString` to make the code's intent clearer. This approach creates a more intuitive developer experience, though it does run into [some limitations](https://github.com/microsoft/TypeScript/issues/33471) with TypeScript's handling of [open-ended unions](https://github.com/microsoft/TypeScript/issues/26277).
186
187
187
-
We need other tricks to make it work better. Introducing `RefineNode` type.
188
+
We need other tricks to address these limitations. Introducing the `RefineNode` type.
188
189
189
190
### Bridging general nodes and specific nodes via `RefineNode`
190
191
191
-
There are two categories of nodes:
192
-
* general `string`ly typed SgNode
193
-
* precisely typed SgNode
192
+
A key challenge in our type system was handling two distinct categories of nodes:
194
193
195
-
general node is like the untyped old API (but with better completion)
196
-
precisely typed node is a union type of all possible kinds of nodes
194
+
1. **General Nodes**: String-based typing (like our original API, but with enhanced completion), `SgNode<M, Kinds<M>>`.
195
+
2. **Specific Nodes**: Precisely typed nodes with known kinds, `SgNode<M, 'specific_kind'>`.
197
196
198
-
The previous general node is typed as `SgNode<M, Kinds<M>>`, the later is typed as `SgNode<M, 'specific_kind'>`.
197
+
When dealing with nodes that could be several specific kinds, we faced an interesting type system challenge. Consider these two approaches:
199
198
200
-
when it comes to a node that can have several specific kinds, it is better to use a union type of all possible kinds of nodes.
199
+
```typescript
200
+
// Approach 1: Union in the type parameter
201
+
let single: SgNode<'expression' | 'type'>
201
202
202
-
Which kind of union should we use?
203
+
// Approach 2: Union of specific nodes
204
+
let union: SgNode<'expression'> | SgNode<'type'>
205
+
```
203
206
204
-
Note `SgNode<'expression' | 'type'>` is different from `SgNode<'expression'> | SgNode<'type'>`
205
-
TypeScript has difficulty in narrowing the previous type, because it not safe to assume the former is equivalent to the later.
207
+
These approaches behave differently in TypeScript, for a [good reason](https://x.com/hd_nvim/status/1868706176281854151):
206
208
207
209
```typescript
208
210
let single: SgNode<'expression' | 'type'>
209
211
if (single.kind === 'expression') {
210
-
  single // Still SgNode<'expression' | 'type'>, not narrowed
212
+
  single // Remains SgNode<'expression' | 'type'> - not narrowed!
211
213
}
214
+
212
215
let union: SgNode<'expression'> | SgNode<'type'>
213
216
if (union.kind === 'expression') {
214
-
  union // SgNode<'expression'>, narrowed
217
+
  union // Successfully narrowed to SgNode<'expression'>
215
218
}
216
219
```
217
220
218
-
However, `SgNode` is covariant in the kind parameter and this means it is okay.
219
-
it is general okay to distribute the type constructor over union type if the parameter is covariant.
220
-
but TypeScript does not support this feature.
221
-
222
-
So ast-grep uses a trick via the type `RefineNode<M, K>` to let you refine the former one to the later one.
221
+
`SgNode` is technically covariant in its kind parameter, meaning it's safe to distribute the type constructor over unions. However, TypeScript doesn't support this automatically. (We will not go down the rabbit hole of type constructor variance here. But interested readers can check out [this wiki](https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)).)
223
222
224
-
If we don't have confidence to narrow the type, that is, the union type `K` contains a constituent of `string` type, it is equivalent to `SgNode<M, Kinds<M>>`.
225
-
Otherwise, we can refine the node to a union type of all possible kinds of nodes.
223
+
To bridge this gap, we introduced the `RefineNode` type:
1. When `K` includes a string type, it preserves the general node behavior
233
+
2. Otherwise, it refines the node into a union of specific types, using TypeScript's [distributive conditional types](https://www.typescriptlang.org/docs/handbook/2/conditional-types.html#distributive-conditional-types).
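The two cases above can be sketched with a stripped-down node type. `DemoNode`, `RefineDemo`, and `label` are hypothetical names for this illustration, and the language type map `M` is left out to keep the example short; the real `RefineNode<M, K>` may be defined differently.

```typescript
// A toy node that only carries its kind (assumption: real nodes have many more members).
interface DemoNode<K extends string = string> {
  kind: K
}

// Sketch of the refinement:
// 1. if K can be any string, keep one general node type;
// 2. otherwise distribute over the union, producing one node type per kind.
type RefineDemo<K extends string> =
  string extends K
    ? DemoNode<K>
    : K extends string ? DemoNode<K> : never

// RefineDemo<'expression' | 'type'> is DemoNode<'expression'> | DemoNode<'type'>,
// so checking `kind` narrows the node as a discriminated union.
function label(node: RefineDemo<'expression' | 'type'>): string {
  if (node.kind === 'expression') {
    const expr: DemoNode<'expression'> = node // narrowed
    return `expression node: ${expr.kind}`
  }
  return `type node: ${node.kind}`
}

// The general case stays a single, string-typed node.
const generalNode: RefineDemo<string> = { kind: 'anything' }
```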
234
234
235
-
Again, having both untyped and typed API is a good trade-off between **correct** and **robust** type checking. You want the compiler to infer as much as possible if a clue of the node type is given, but you also want to allow writing code without type.
235
+
This approach, inspired by [Biome's Rowan API](https://github.com/biomejs/biome/blob/09a04af727b3cdba33ac35837d112adb55726add/crates/biome_rowan/src/ast/mod.rs#L108-L120), achieves our dual goals: it remains **correct** by preserving proper type relationships and stays **robust** by gracefully handling both typed and untyped usage.
236
236
237
+
This hybrid approach gives developers the best of both worlds: strict type checking when types are known, with the flexibility to fall back to string-based typing when needed.
237
238
238
239
## Refine Type
239
240
240
241
Now let's talk about how to refine the general node to a specific node in ast-grep/napi.
241
-
242
-
Both manual and automatic refinement are **concise** and idiomatic in TypeScript.
242
+
We've implemented two concise and idiomatic approaches in TypeScript: manual and automatic refinement.
243
243
244
244
### Refine Node, Manually
245
245
246
-
You can do runtime checking via `sgNode.is("kind")`
246
+
#### Runtime Type Checking
247
+
248
+
The first manual approach uses runtime verification through the `is` method:
249
+
247
250
```typescript
248
251
class SgNode<M, K> {
249
252
  is<T extends K>(kind: T): this is SgNode<M, T>
250
253
}
251
254
```
252
255
253
-
It can offer one time type narrowing
256
+
This enables straightforward type narrowing:
254
257
255
258
```typescript
256
259
if (sgNode.is("function_declaration")) {
257
260
  sgNode.kind // narrow to 'function_declaration'
258
261
}
259
262
```
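For readers who want to try the pattern outside of ast-grep, the following self-contained sketch shows how a `this is` type predicate narrows a node. `GuardNode` and `summarize` are hypothetical stand-ins, not the library's `SgNode`, and the type map parameter is omitted.

```typescript
// A toy node class with a runtime-checked type guard (illustrative only).
class GuardNode<K extends string = string> {
  constructor(public readonly kind: K) {}

  // Compares the kind at runtime and narrows the static type when it matches.
  is<T extends K>(kind: T): this is GuardNode<T> {
    return this.kind === kind
  }
}

function summarize(node: GuardNode): string {
  if (node.is('function_declaration')) {
    // Inside this branch, `node.kind` has the literal type 'function_declaration'.
    const kind: 'function_declaration' = node.kind
    return `function: ${kind}`
  }
  // Outside the branch the node is still the general, string-typed node.
  return `other: ${node.kind}`
}
```

Because the check happens at runtime, a mismatched kind simply takes the other branch instead of producing a wrongly typed node.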
260
263
261
-
Another way is to provide an optional type parameter to the traversal method to refine the node to a specific kind, in case you are confident that the node is always of a specific kind and want to skip runtime check.
264
+
#### Type Parameter Specification
262
265
263
-
This is like the `document.querySelector<T>` method in the [DOM API](https://www.typescriptlang.org/docs/handbook/dom-manipulation.html#the-queryselector-and-queryselectorall-methods). It returns a general `Element` type, but you can refine it to a specific type like `HTMLDivElement` by providing generic argument.
266
+
Another manual approach lets you explicitly specify node types through type parameters. This is particularly useful when you're certain about a node's kind and want to skip runtime checks for better performance.
264
267
265
-
For example `sgNode.parent<"program">()`. This will refine the node to a specific kind `SgNode<TS, "program">`.
268
+
This pattern may feel familiar if you've worked with the [DOM API](https://www.typescriptlang.org/docs/handbook/dom-manipulation.html#the-queryselector-and-queryselectorall-methods)'s `querySelector<T>`. Just as `querySelector` can be refined from a general `Element` to a specific `HTMLDivElement`, we can refine our nodes:
This uses the interesting overloading feature of TypeScript
274
+
275
+
The type parameter approach uses an interesting overloaded signature:
268
276
269
277
```typescript
270
278
interface NodeMethod<M, K> {
271
-
  (): SgNode
272
-
  <T extends K>(): RefineNode<M, T>
279
+
  (): SgNode<M> // Untyped version
280
+
  <T extends K>(): RefineNode<M, T> // Typed version
273
281
}
274
282
```
275
-
If no type is provided, it returns a general node, `SgNode<M>`.
276
-
If a type is provided, it returns a specific node, `SgNode<M, K>`.
277
283
278
-
The reason why we use two overloading signatures here is to distinguish the two cases. If we use a single generic signature, TypeScript will always return the single version `SgNode<M, K1|K2>` or always returns a union of different `SgNode`s.
284
+
If no type is provided, it returns a general node, `SgNode<M>`. If a type is provided, it returns a specific node, `SgNode<M, K>`.
285
+
286
+
This dual-signature typing avoids the limitations of a single generic signature, which would either always return `SgNode<M, K1|K2>` or always produce a union of `SgNode`s.
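A rough, self-contained sketch of the dual-signature idea follows. `ParentDemoNode`, `ParentMethod`, and `getParent` are hypothetical names, and the typed overload returns a plain `ParentDemoNode<T>` rather than a refined union, which is a simplification of the `RefineNode<M, T>` return type shown above.

```typescript
// A toy node that only carries its kind (illustrative only).
interface ParentDemoNode<K extends string = string> {
  kind: K
}

interface ParentMethod {
  (): ParentDemoNode                      // no type argument: general node
  <T extends string>(): ParentDemoNode<T> // type argument: caller-asserted kind
}

// The cast mirrors the trade-off of this approach: nothing verifies T at runtime.
const getParent = ((): ParentDemoNode => ({ kind: 'program' })) as ParentMethod

const generalParent = getParent()             // ParentDemoNode<string>
const programParent = getParent<'program'>()  // ParentDemoNode<'program'>
const programKind: 'program' = programParent.kind
```

Calling `getParent<'identifier'>()` would compile just as well even though the runtime value is a `program` node, which is why this form suits only cases where the kind is known in advance.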
287
+
288
+
#### Choosing the Right Type
289
+
290
+
When should you use each manual refinement method? Here are some guidelines:
291
+
292
+
✓ Use `is()` when:
293
+
* You need a runtime type check
294
+
* Node types might vary
295
+
* Type safety is crucial
279
296
297
+
✓ Use type parameters when:
280
298
281
-
:::tip When to use type parameter and when `is`?
299
+
* You're completely certain of the node type
300
+
* Performance is critical
301
+
* The node type is fixed
282
302
283
-
If you cannot guarantee the node kind and want to do runtime check, use `is` method.
303
+
:::tip Safety Tip
284
304
285
-
If you are 100% sure about the node kind and want to avoid the runtime check overhead, use type parameter.
286
-
Note this option can break type safety if misused. This command can help you to audit.
305
+
Be cautious with type parameters as they bypass runtime checks. They can break type safety if misused.