website/blog/typed-napi.md
+67 -63 (67 additions & 63 deletions)
Original file line number
Diff line number
Diff line change
@@ -4,88 +4,65 @@ sidebar: false
4
4
5
5
# Improve Napi Typing
6
6
7
-
I'm thrilled to announce that [@ast-grep/napi] now supports typed, solving a [long standing issue](https://github.com/ast-grep/ast-grep/issues/48) in our feature request.
7
+
> _Design, Define, Refine, and Confine: Crafting Balanced TypeScript Types_
8
8
9
-
In this blog post, we will walk through the problem and the [design](https://github.com/ast-grep/ast-grep/issues/1669) of the new feature. It will also be a valuable resource to write a good TypeScript type in general.
9
+
We're thrilled to introduce typed AST in [@ast-grep/napi], addressing a [long-requested feature](https://github.com/ast-grep/ast-grep/issues/48) for AST manipulation from the early days of this project.
10
10
11
-
## What's type safety? Why?
11
+
In this blog post, we will delve into the challenges addressed by this feature and explore [the design](https://github.com/ast-grep/ast-grep/issues/1669) that shaped its implementation. _We also believe this post can serve as a general guide to crafting balanced TypeScript types._
12
12
13
-
Writing AST manipulation code is hard. Even if we have a lot of [helpful](https://astexplorer.net/)[interactive](https://ast-grep.github.io/playground.html)[tool](https://github.com/sxzz/ast-kit), it's still hard to handle all edge cases.
13
+
## Why Type Safety Matters in AST
14
14
15
-
AST types are good guide-rail to write comprehensive AST manipulation code. It guides one to write comprehensive AST manipulation code (in case people forget to handle some cases). Using exhaustive checking, one can ensure that all cases are handled.
15
+
Working with Abstract Syntax Trees (ASTs) is complex. Even with [excellent](https://astexplorer.net/) [AST](https://ast-grep.github.io/playground.html) [tools](https://github.com/sxzz/ast-kit), handling all edge cases remains challenging.
16
16
17
-
While ast-grep napi is a convenient tool to programmatically process AST , but it lacks the type information to guide user to write robust logic to handle all potential code. Thank to [Mohebifar](https://github.com/mohebifar) from [codemod](https://codemod.com/), `ast-grep/napi` now can provide type information via nodejs API.
17
+
Type information serves as a crucial safety net when writing AST manipulation code. It guides developers toward handling all possible cases and enables exhaustive checking to ensure complete coverage.
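
To make "exhaustive checking" concrete, here is a minimal sketch using TypeScript's `never` type. The `DeclKind` union and both helper functions are hypothetical names chosen for illustration; they are not part of `@ast-grep/napi`.

```typescript
// Hypothetical node kinds, for illustration only (not the generated ast-grep types).
type DeclKind = 'function_declaration' | 'class_declaration' | 'lexical_declaration'

// Accepts only `never`, so it type-checks only when every case has been handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled kind: ${String(value)}`)
}

function describeKind(kind: DeclKind): string {
  switch (kind) {
    case 'function_declaration':
      return 'function'
    case 'class_declaration':
      return 'class'
    case 'lexical_declaration':
      return 'variable'
    default:
      // If DeclKind gains a member that is not handled above,
      // `kind` is no longer `never` here and compilation fails.
      return assertNever(kind)
  }
}
```

Removing any `case` turns the `default` branch into a compile error, which is the safety net that typed AST kinds make possible.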
18
18
19
-
The solution to solve the problem is generating types from the static information provided by AST parser library, and using several TypeScript tricks to provide a good typing API.
19
+
While `ast-grep/napi` has been a handy tool for programmatic AST processing, it previously lacked type information to help users write robust code. Thanks to [Mohebifar](https://github.com/mohebifar) from [codemod](https://codemod.com/), we've now bridged this gap. Our solution generates types from parsers' metadata and employs TypeScript tricks to create an idiomatic API.
20
20
21
-
## What are good TypeScript types?
21
+
## Qualities of Good TypeScript Types
22
22
23
-
before we talk about how we achieve the goal, let's talk about what are good TypeScript types.
23
+
Before diving into our implementation, let's explore what makes TypeScript definitions truly effective. In today's JavaScript ecosystem, creating a great library involves more than just intuitive APIs and thorough documentation – it requires thoughtful type definitions that enhance developer experience.
24
24
25
-
Designing a good library in the modern JavaScript world is not only about providing good API naming, documentation and examples, but also about providing good TypeScript types. A good API type should be:
25
+
A well-designed type system should balance four key qualities:
26
26
27
-
***Correct**: reject invalid code and accept valid code
28
-
***Concise**: easy to read, especially in hover and completion
29
-
***Robust**: if compiler fails to infer your type, it should either graciously grant you the permission to be wild, or gracefully give you a easy to understand error message. it should not report a huge error that doesn't fit a screen
30
-
***Performant**: fast to compile. complex types can slow down the compiler
27
+
* **Correct**: Types should act as reliable guardrails, rejecting invalid code while allowing all valid use cases.
28
+
* **Concise**: Types should be easy to understand, whether in IDE hovers or code completions. Clear, readable types help developers quickly grasp your API.
29
+
* **Robust**: In case type inference fails, the compiler should either graciously tolerate untyped code, or gracefully provide clear error messages. Cryptic type errors that span multiple screens are daunting and unhelpful.
30
+
* **Performant**: Both type checking and runtime code should be fast. Complex types can significantly slow down compilation, while unnecessary API calls made only to satisfy the type checker can hurt runtime performance.
31
31
32
-
It is really hard to provide a type system that is both [Sound and Complete](https://logan.tw/posts/2014/11/12/soundness-and-completeness-of-the-type-system/#:~:text=A%20type%2Dsystem%20is%20sound,any%20false%20positive%20%5B2%5D.). This is similar to provide a good typing API.
32
+
Balancing these qualities is a demanding job because they often compete with each other, just like creating a type system that is both [sound and complete](https://logan.tw/posts/2014/11/12/soundness-and-completeness-of-the-type-system/#:~:text=A%20type%2Dsystem%20is%20sound,any%20false%20positive%20%5B2%5D.). Many TS libraries lean heavily toward strict correctness – for instance, implementing elaborate types to validate routing parameters. While powerful, [type gymnastics](https://www.octomind.dev/blog/navigating-the-typescript-gymnastics-on-developer-dogma-2) can come with significant trade-offs in complexity and compile-time performance. Sometimes, being slightly less strict can lead to a dramatically better developer experience.
33
33
34
+
We will explore how ast-grep balances these qualities through _Design, Define, Refine, and Confine_.
34
35
35
-
TS libs nowaday probably pay too much attention to correctness IMHO.
36
-
Having a type to check your path parameter in your routing is cool, but what's the cost?
36
+
## Design Types
37
37
38
-
Designing a good TypeScript type is essentially a trade-off of these four aspects.
38
+
Let's return to ast-grep's challenge and learn some background knowledge on how Tree-sitter, our underlying parser library, handles types.
39
39
40
+
### Tree-sitter's Core API
40
41
41
-
## Design Type
42
+
At its heart, Tree-sitter provides a language-agnostic API for traversing syntax trees. Its base API is intentionally untyped, offering a consistent interface across all programming languages:
42
43
43
-
Let's come back to ast-grep's problem.
44
-
45
-
The design principle of the new API is to progressively provide a more strict code checking and completion when the user gives more type information.
46
-
47
-
1.**Allow untyped AST access if no type information is provided**
48
-
49
-
Existing untyped API is still available and it is the default behavior.
50
-
The new feature should not break the existing code.
51
-
52
-
2.**Allow user to type AST node and enjoy more type safety**
53
-
54
-
The user can give types to AST nodes either manually or automatically.
55
-
Both approaches should refine the general untyped AST nodes to typed AST nodes and bring type check and intelligent completion to the user.
56
-
57
-
### TreeSitter's types
58
-
59
-
ast-grep is based on Tree-Sitter. Tree-Sitter's official API is untyped. It provies a uniform API to access the syntax tree across different languages. A node in Tree-Sitter has several common methods to access its node type, children, parent, and text content.
60
-
61
-
```TypeScript
44
+
```typescript
62
45
class Node {
63
-
kind():string//get the node type
64
-
field(name:string):Node//get a child node by field name
65
-
parent():Node
66
-
children():Node[]
67
-
text():string
46
+
  kind(): string // Get the type of node, e.g., 'function_declaration'
47
+
  field(name: string): Node // Get a specific child by its field name
48
+
  parent(): Node // Navigate to the parent node
49
+
  children(): Node[] // Get all child nodes
50
+
  text(): string // Get the actual source code text
68
51
}
69
52
```
70
-
The API is simple and easy to use, but it lacks type information.
71
53
72
-
In contrast, a specific language's syntax tree, like [estree](https://github.com/estree/estree/blob/0362bbd130e926fed6293f04da57347a8b1e2325/es5.md), has a more specific structure. For example, a function declaration in JavaScript has a `function` keyword, a name, a list of parameters, and a body. Other AST parser libraries encode this structure in their AST object types. For example, a `function_declaration` has fields like `parameters` and `body`.
54
+
This API is elegantly simple, but its generality comes at the cost of type safety.
73
55
74
-
Fortunately tree-sitter provides static node types in json.
75
-
There are several challenges to generate TypeScript types from tree-sitter's static node types.
56
+
In contrast, traditional language-specific parsers bake AST structures directly into their types. Consider [estree](https://github.com/estree/estree/blob/0362bbd130e926fed6293f04da57347a8b1e2325/es5.md). It encodes rich structural information about each node type in JavaScript. For instance, a `FunctionDeclaration` is a specific structure with the function's `id` (its name), `params` list, and `body` fields.
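
To make the contrast concrete, here is a simplified sketch of how an ESTree-style parser might type a function declaration. The interfaces are trimmed from the ESTree ES5 spec (most fields are omitted, and parameters are assumed to be plain identifiers), and `signature` is a hypothetical helper written for this example.

```typescript
// Trimmed ESTree-style node types; the real spec defines many more fields.
interface Identifier {
  type: 'Identifier'
  name: string
}

interface BlockStatement {
  type: 'BlockStatement'
  body: unknown[]
}

interface FunctionDeclaration {
  type: 'FunctionDeclaration'
  id: Identifier
  params: Identifier[] // assumption: plain identifier parameters only
  body: BlockStatement
}

// Hypothetical helper: the structure is known statically, so field access
// is checked by the compiler and completed by the editor.
function signature(fn: FunctionDeclaration): string {
  const params = fn.params.map((p) => p.name).join(', ')
  return `${fn.id.name}(${params})`
}
```

With the untyped `Node` API, the equivalent lookup would go through `field('name')` with an arbitrary string, and a misspelled field name would only surface at runtime.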
76
57
77
-
1. json is hosted by parser library repo
78
-
We needs type generation (it is like F-sharp's type provider)
79
-
2. json contains a lot unnamed kinds
80
-
You are writing a compiler plugin, not elementary school math homework
81
-
3. json has alias type
82
-
For example, `declaration` is an alias of `function_declaration`, `class_declaration` and other declaration kinds.
58
+
Fortunately, Tree-sitter hasn't left us entirely without type information. It provides detailed static type information in JSON format, which gives us an opportunity to enchant the flexible runtime API with type-safe magic.
83
59
84
-
### TreeSitter's `TypeMap`
85
-
The new typed API will consume TreeSitte's [static node types](https://tree-sitter.github.io/tree-sitter/using-parsers#static-node-types) like below:
60
+
### Tree-sitter's `TypeMap`
61
+
62
+
Tree-sitter provides [static node types](https://tree-sitter.github.io/tree-sitter/using-parsers#static-node-types) for library authors to consume. The type information has the following form, expressed as a TypeScript interface:
86
63
87
64
```typescript
88
-
interfaceTypeMpa {
65
+
interface TypeMap {
89
66
[kind:string]: {
90
67
type:string
91
68
named:boolean
@@ -94,12 +71,13 @@ interface TypeMpa {
94
71
types: { type:string, named:boolean }[]
95
72
}
96
73
}
74
+
children?: { name:string, type:string }[]
97
75
subtypes?: { type:string, named:boolean }[]
98
76
}
99
77
}
100
78
```
101
-
What is `TypeMaps`? It is a type that contains all static node types. It is a map from kind to the static type of the kind.
102
-
Here is a simplified example of the TypeScript static type.
79
+
80
+
`TypeMap` is a comprehensive catalog of all possible node types in a language's syntax tree. Let's break this down with a concrete example from TypeScript:
103
81
104
82
```typescript
105
83
type TypeScript = {
@@ -111,9 +89,19 @@ type TypeScript = {
111
89
body: {
112
90
types: [ { type:"statement_block", named:true } ]
113
91
},
114
-
...
115
92
}
116
93
},
94
+
...
95
+
}
96
+
```
97
+
98
+
The structure contains information about the node's kind, whether it is named, and its fields and children.
99
+
`fields` is a map from field name to the type of the field, which encodes the AST structure like traditional parsers.
100
+
101
+
Tree-sitter also has a special property called `subtypes`, which marks a kind as an alias for a list of other kinds.
102
+
103
+
```typescript
104
+
type TypeScript = {
117
105
// node type alias
118
106
declaration: {
119
107
type:"declaration",
@@ -126,13 +114,29 @@ type TypeScript = {
126
114
}
127
115
```
128
116
129
-
The type information is encoded in a JSON object. Syntax node's static type contains the kind, whether it is named, and the fields of the node.
130
-
`fields` is a map from field name to the type of the field, which encodes the structure of the AST like other parser libraries.
117
+
In this example, `declaration` is an alias of `function_declaration`, `class_declaration` and other kinds. The alias type is used to reduce the redundancy in the static type JSON and will NOT be a node's actual kind.
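
As a rough illustration of what "alias" means at the type level, the sketch below expands an alias kind into the concrete kinds it stands for. `MiniTypeMap` is a hand-written, heavily trimmed stand-in for a generated type map, and `ResolveKind` is a hypothetical helper; neither is claimed to match the actual `@ast-grep/napi` implementation.

```typescript
// Hand-written stand-in for a generated type map (assumption, for illustration).
type MiniTypeMap = {
  declaration: {
    type: 'declaration'
    named: true
    subtypes: [
      { type: 'function_declaration'; named: true },
      { type: 'class_declaration'; named: true },
    ]
  }
  function_declaration: { type: 'function_declaration'; named: true }
  class_declaration: { type: 'class_declaration'; named: true }
}

// Hypothetical helper: if a kind has `subtypes`, recurse into them;
// otherwise the kind is already concrete and is returned as-is.
type ResolveKind<M, K extends keyof M> = K extends keyof M
  ? M[K] extends { subtypes: readonly { type: infer T }[] }
    ? ResolveKind<M, Extract<T, keyof M>>
    : K
  : never

// Resolves to 'function_declaration' | 'class_declaration'
type ConcreteDeclaration = ResolveKind<MiniTypeMap, 'declaration'>

const concreteKinds: ConcreteDeclaration[] = ['function_declaration', 'class_declaration']

// The alias itself is not a valid concrete kind, so this assignment is rejected.
// @ts-expect-error 'declaration' is an alias, not an actual node kind
const notAKind: ConcreteDeclaration = 'declaration'
```

The `@ts-expect-error` line documents the point made above: the alias never appears as a node's actual kind, so it should not survive in the resolved union.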
118
+
119
+
Thanks to Tree-sitter's design, we can leverage this rich type information to build our typed APIs!
120
+
121
+
### Design Principles of ast-grep/napi
122
+
123
+
Our new API follows a progressive enhancement approach to type safety:
124
+
125
+
**Preserve untyped AST access**
126
+
127
+
The existing untyped API remains available by default, ensuring backward compatibility.
128
+
129
+
**Optional type safety on demand**
130
+
131
+
Users can opt into typed AST nodes either manually or automatically for enhanced type checking and autocompletion.
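
One way to picture this progressive approach is a generic parameter with a permissive default. The sketch below is an assumption-laden toy: `SketchNode`, `JsFields`, and `narrowTo` are hypothetical names invented here and do not reflect the real `@ast-grep/napi` API.

```typescript
// Hypothetical map from node kind to its allowed field names.
interface JsFields {
  function_declaration: 'name' | 'parameters' | 'body'
  class_declaration: 'name' | 'body'
}

// `Fields` defaults to `string`, so code that never mentions types keeps working.
class SketchNode<Fields extends string = string> {
  private kindName: string
  private fieldNodes: Record<string, SketchNode>

  constructor(kindName: string, fieldNodes: Record<string, SketchNode>) {
    this.kindName = kindName
    this.fieldNodes = fieldNodes
  }

  kind(): string {
    return this.kindName
  }

  // Untyped nodes accept any field name; narrowed nodes accept only known ones.
  field(name: Fields): SketchNode | null {
    return this.fieldNodes[name] ?? null
  }

  // Manual opt-in: check the kind at runtime, then narrow the allowed field names.
  narrowTo<K extends keyof JsFields>(kind: K): SketchNode<JsFields[K]> | null {
    return this.kindName === kind ? (this as unknown as SketchNode<JsFields[K]>) : null
  }
}

const bodyNode = new SketchNode('statement_block', {})
const fnNode = new SketchNode('function_declaration', { body: bodyNode })

fnNode.field('anything') // untyped access: any string compiles
const typedFn = fnNode.narrowTo('function_declaration')
typedFn?.field('body') // typed access: only 'name' | 'parameters' | 'body' compile
```

The design choice worth noting is that narrowing is additive: existing untyped call sites need no changes, and stricter checks apply only after the user asks for them.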
132
+
133
+
However, it is a bumpy ride to transition to a new typed API via the path of Tree-sitter's static type.
131
134
132
-
Tree-sitter also provides alias types where a kind is an alias of a list of other kinds. For example, `declaration` is an alias of `function_declaration`, `class_declaration` and other kinds. The alias type is used to reduce the number of kinds in the static type.
135
+
First, the type information JSON is hosted in each parser library's repository. ast-grep/napi uses [a dedicated script](https://github.com/ast-grep/ast-grep/blob/main/crates/napi/scripts/generateTypes.ts) to fetch the JSON and generate the types. An [F#-like type provider](https://learn.microsoft.com/en-us/dotnet/fsharp/tutorials/type-providers/) is on my TypeScript wishlist.
133
136
134
-
We want to both type a node's kind and its fields.
137
+
Second, the JSON contains a lot of unnamed kinds, which are not useful to users. Including them in the union type is too noisy. We will address this in the next section.
135
138
139
+
Finally, as mentioned earlier, the JSON contains alias types. We need to resolve the alias type to its concrete type, which is also covered in the next section.