From 65856c9047cd3080c0dfd103de33c4875f2cb347 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 08:48:11 -0400 Subject: [PATCH 01/45] Copy template (as reStructuredText for table of contents) --- proposals/accepted/000-ast-parser-libs.rst | 56 ++++++++++++++++++++++ 1 file changed, 56 insertions(+) create mode 100644 proposals/accepted/000-ast-parser-libs.rst diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst new file mode 100644 index 00000000..a35d4e7a --- /dev/null +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -0,0 +1,56 @@ +Haskell Foundation Project Template - place your title here +=========================================================== + +*This template is for Haskell Foundation Technical Proposals that are projects to be executed by the Haskell Foundation, without necessarily having any further involvement by the proposer.* + +*Please delete the italic text before submitting.* + +Abstract +-------- + +*This section should provide a summary of the proposal that identifies the key problems to be solved and summarizes the solution.* + +Background +---------- + +*This section should explain any background (targeting a casual audience) needed to understand the proposal’s motivation (e.g. a high level overview of the technical details and some history).* + +Problem Statement +----------------- + +*This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. +It should also enumerate the requirements against which a solution should be evaluated.* + +Prior Art and Related Efforts +----------------------------- + +*This section should describe prior attempts to solve the problem, other relevant prior work, and what others in the community are doing to address the problem. +It should describe the relationship between the proposed work and the existing efforts. 
+If past attempts did not succeed, this section should provide a theory of why not.* + +Technical Content +----------------- + +*This section should describe the work that is being proposed to the community for comment, including both technical aspects (choices of system architecture, integration with existing tools and workflows) and community governance (how the developed project will be administered, maintained, and otherwise cared for in the future). +It should also describe the benefits, drawbacks, and risks that are associated with these decisions. +It can be a good idea to describe alternative approaches here as well, and why the proposer prefers the current approach.* + +Timeline +-------- + +*Are there any deadlines that the HF needs to be aware of?* + +Budget +------ + +*How much money is needed to accomplish the goal? How will it be used?* + +Stakeholders +------------ + +*Who stands to gain or lose from the implementation of this proposal? Proposals should identify stakeholders so that they can be contacted for input, and a final decision should not occur without having made a good-faith effort to solicit representative feedback from important stakeholder groups.* + +Success +------- + +*Under what conditions will the project be considered a success?* From 0e9a77e37c9f273c62206969e3174f3478418756 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 10:18:56 -0400 Subject: [PATCH 02/45] Write abstract and background --- proposals/accepted/000-ast-parser-libs.rst | 91 +++++++++++++++++++--- 1 file changed, 79 insertions(+), 12 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index a35d4e7a..45d0cffc 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -1,19 +1,89 @@ -Haskell Foundation Project Template - place your title here +Split out AST and Parser libraries from GHC 
=========================================================== -*This template is for Haskell Foundation Technical Proposals that are projects to be executed by the Haskell Foundation, without necessarily having any further involvement by the proposer.* - -*Please delete the italic text before submitting.* - Abstract -------- -*This section should provide a summary of the proposal that identifies the key problems to be solved and summarizes the solution.* +The community lacks AST and Parser libraries for Haskell that are both self-contained and up-to-date. +Experience has shown that there is only way one way to meet each criterion: + +- Be used by GHC, so the library cannot fall behind new language development + +- Be separate from GHC, so the library is forced to be self-contained + +However, no library has so far done both, to meet both criteria. +The purpose of this proposal is to make that library finally exist. + +Background, Prior Art, and Related Efforts +------------------------------------------ + +Making such a library has long been a goal of the Haskell community. +This section highlights various past and ongoing efforts accordingly. + +*This section is purely informative; readers familiar with this backstory can skip this section and move on to the proposal proper.* + +``haskell-src-exts`` +~~~~~~~~~~~~~~~~~~~~ + +An older attempt is the venerable `haskell-src-exts `_ library. +This is actually was part of a larger project called the `"Haskell Suite" `_, the purpose of which was "to implement the whole Haskell compiler as a set of libraries". +However, the whole compiler doesn't appear to exist, and the project as a whole ceased development . +``haskell-src-exts`` lasted longer, but had great trouble keeping up with GHC, and is now also unmaintained since 2020. + +The lessons are clear: +``haskell-src-exts`` succeeded in being modular and self-contained, failed in trying to keep up with GHC. 
+Given the size of the community, competing head-on with GHC or trying to keep up with it is very difficult. +Being used by GHC, so keeping up happens automatically, is the clearest way to avoid this problem. + +``ghc-lib-parser`` +~~~~~~~~~~~~~~~~~~ -Background ----------- +A newer attempt is `ghc-lib-parser `_ library. +Prior users of ``haskell-src-exts`` `like HLint `_ have largely migrated from ``haskell-src-exts`` to ``ghc-lib-parser``. +But ``ghc-lib-parser`` is *not* actually a separately developed library; rather it is one that is extracted from GHC's own source. +This ensures it is up to date with the latest behavior. +But this extraction process is complex, and results in a far larger library than is desired: see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. +All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a harder time presenting any sort of stable interface. -*This section should explain any background (targeting a casual audience) needed to understand the proposal’s motivation (e.g. a high level overview of the technical details and some history).* +The developers of ``ghc-lib-parser`` would not dispute the above criticisms, for this state of affairs was never intended to be a permanent solution, but rather just a stop gap. +Whereas ``ghc-lib-parser`` succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. +Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. + +Trees that grow +~~~~~~~~~~~~~~~ + +As we can see, each of these prior two attempts did one of the two things right, and correspondingly met one of our two criteria. +There is, however, a third project, that over the years has aimed to allow us to finally hit both criteria: "Trees that grow". 
+The name comes from `this paper `_. +There are also +`some GHC Wiki pages `_, +and a `GHC Issue Label `_ for it. + +The goal of the Trees that Grow paper was to allow creating variants of Haskell AST to more faithfully capture the input and output of each compilation pass, and also the ``template-haskell`` library. +It presents these data types: + +.. code-block:: haskell + + data Component = Compiler Pass | TemplateHaskell + + data Pass = Parser | Renamer | TypeChecker + +The idea that they are "promoted" via ``DataKinds``, and then type families used in the AST will have instances for these promoted values. +This allows those consumers to "adjust" the AST for their purpose. + + It might sound like the goal is only different usages within GHC, but remember that ``template-haskell`` is a separate library used by users of Haskell not just developers of Haskell. + A goal of at least some usage outside GHC was always there. + +The Trees That Grow project is now 6 years old, and has met great success in avoiding partiality in the compiler, "making illegal states unrepresentable" as many Haskellers would put it. +But progress on `reducing AST & parser dependencies `_ has been less easily forthcoming. +I have separated out the modules defining the AST under `Language.Haskell.Syntax.*` we wish to split out, and we have tests to track progress reducing their deps, and the parser's deps. +But progress is unsteady and unpredictable. + +The basic problem is that the benefits don't actually kick in until the deps are *all* gone, and the code is actually separated out. +Partial progress isn't really directly useful to anyone, and these counters just scoreboard by which we hope to get closer to the end goal. +It is thus hard to do this work with volunteers only, because it is emphatically *not* `"itch scratching" `_ work where incremental progress leads immediate incremental benefits to the contributor. 
+ +The Haskell Foundation's support in getting this "over the finish line", at which the community *will* benefit, and benefit greatly, is thus a crucial way we can surmount the coordination failure the lack of incremental payoff causes. Problem Statement ----------------- @@ -21,9 +91,6 @@ Problem Statement *This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. It should also enumerate the requirements against which a solution should be evaluated.* -Prior Art and Related Efforts ------------------------------ - *This section should describe prior attempts to solve the problem, other relevant prior work, and what others in the community are doing to address the problem. It should describe the relationship between the proposed work and the existing efforts. If past attempts did not succeed, this section should provide a theory of why not.* From 01c48e69572fa3e4c81dcb48e233f236f5bb5c3f Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 10:25:28 -0400 Subject: [PATCH 03/45] Title, author, TOC, bump headings --- proposals/accepted/000-ast-parser-libs.rst | 33 ++++++++++++++-------- 1 file changed, 21 insertions(+), 12 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 45d0cffc..aaeed2b5 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -1,8 +1,17 @@ +=========================================== Split out AST and Parser libraries from GHC -=========================================================== +=========================================== + +:Date: August 2023 +:Authors: + John Ericson, + ??? + +.. sectnum:: +.. contents:: Abstract --------- +======== The community lacks AST and Parser libraries for Haskell that are both self-contained and up-to-date. 
Experience has shown that there is only way one way to meet each criterion: @@ -15,7 +24,7 @@ However, no library has so far done both, to meet both criteria. The purpose of this proposal is to make that library finally exist. Background, Prior Art, and Related Efforts ------------------------------------------- +========================================== Making such a library has long been a goal of the Haskell community. This section highlights various past and ongoing efforts accordingly. @@ -23,7 +32,7 @@ This section highlights various past and ongoing efforts accordingly. *This section is purely informative; readers familiar with this backstory can skip this section and move on to the proposal proper.* ``haskell-src-exts`` -~~~~~~~~~~~~~~~~~~~~ +-------------------- An older attempt is the venerable `haskell-src-exts `_ library. This is actually was part of a larger project called the `"Haskell Suite" `_, the purpose of which was "to implement the whole Haskell compiler as a set of libraries". @@ -36,7 +45,7 @@ Given the size of the community, competing head-on with GHC or trying to keep up Being used by GHC, so keeping up happens automatically, is the clearest way to avoid this problem. ``ghc-lib-parser`` -~~~~~~~~~~~~~~~~~~ +------------------ A newer attempt is `ghc-lib-parser `_ library. Prior users of ``haskell-src-exts`` `like HLint `_ have largely migrated from ``haskell-src-exts`` to ``ghc-lib-parser``. @@ -50,7 +59,7 @@ Whereas ``ghc-lib-parser`` succeeds in keeping up with GHC because it *is* GHC, Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. Trees that grow -~~~~~~~~~~~~~~~ +--------------- As we can see, each of these prior two attempts did one of the two things right, and correspondingly met one of our two criteria. There is, however, a third project, that over the years has aimed to allow us to finally hit both criteria: "Trees that grow". 
@@ -86,7 +95,7 @@ It is thus hard to do this work with volunteers only, because it is emphatically The Haskell Foundation's support in getting this "over the finish line", at which the community *will* benefit, and benefit greatly, is thus a crucial way we can surmount the coordination failure the lack of incremental payoff causes. Problem Statement ------------------ +================= *This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. It should also enumerate the requirements against which a solution should be evaluated.* @@ -96,28 +105,28 @@ It should describe the relationship between the proposed work and the existing e If past attempts did not succeed, this section should provide a theory of why not.* Technical Content ------------------ +================= *This section should describe the work that is being proposed to the community for comment, including both technical aspects (choices of system architecture, integration with existing tools and workflows) and community governance (how the developed project will be administered, maintained, and otherwise cared for in the future). It should also describe the benefits, drawbacks, and risks that are associated with these decisions. It can be a good idea to describe alternative approaches here as well, and why the proposer prefers the current approach.* Timeline --------- +======== *Are there any deadlines that the HF needs to be aware of?* Budget ------- +====== *How much money is needed to accomplish the goal? How will it be used?* Stakeholders ------------- +============ *Who stands to gain or lose from the implementation of this proposal? 
Proposals should identify stakeholders so that they can be contacted for input, and a final decision should not occur without having made a good-faith effort to solicit representative feedback from important stakeholder groups.* Success -------- +======= *Under what conditions will the project be considered a success?* From fff9c1d8c41d141dec4fda02dc8db1f2e6f96af3 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 10:26:54 -0400 Subject: [PATCH 04/45] Fix typo --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index aaeed2b5..2f645e88 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -92,7 +92,7 @@ The basic problem is that the benefits don't actually kick in until the deps are Partial progress isn't really directly useful to anyone, and these counters just scoreboard by which we hope to get closer to the end goal. It is thus hard to do this work with volunteers only, because it is emphatically *not* `"itch scratching" `_ work where incremental progress leads immediate incremental benefits to the contributor. -The Haskell Foundation's support in getting this "over the finish line", at which the community *will* benefit, and benefit greatly, is thus a crucial way we can surmount the coordination failure the lack of incremental payoff causes. +The Haskell Foundation's support in getting this "over the finish line", at which point the community *will* benefit, and benefit greatly, is thus a crucial way we can surmount the coordination failure the lack of incremental payoff causes. 
Problem Statement ================= From 7d048d933fc428b1a0c8dc6135c0cf40da461c3a Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 10:28:49 -0400 Subject: [PATCH 05/45] Convert quote to footnote --- proposals/accepted/000-ast-parser-libs.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 2f645e88..fa953caa 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -68,7 +68,7 @@ There are also `some GHC Wiki pages `_, and a `GHC Issue Label `_ for it. -The goal of the Trees that Grow paper was to allow creating variants of Haskell AST to more faithfully capture the input and output of each compilation pass, and also the ``template-haskell`` library. +The goal of the Trees that Grow paper was to allow creating variants of Haskell AST to more faithfully capture the input and output of each compilation pass, and also the ``template-haskell`` library. [#intra]_ It presents these data types: .. code-block:: haskell @@ -80,6 +80,7 @@ It presents these data types: The idea that they are "promoted" via ``DataKinds``, and then type families used in the AST will have instances for these promoted values. This allows those consumers to "adjust" the AST for their purpose. +.. [#intra] It might sound like the goal is only different usages within GHC, but remember that ``template-haskell`` is a separate library used by users of Haskell not just developers of Haskell. A goal of at least some usage outside GHC was always there. 
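The "Trees that grow" passage touched by the patch above describes promoting ``Pass`` with ``DataKinds`` and giving type families one instance per promoted value. A minimal sketch of that idea follows. The names ``XVar``, ``Expr``, ``rename`` and ``uniques`` are hypothetical illustrations, not GHC's actual definitions, and ``Int`` and ``String`` are stand-ins for whatever a real renamer or type checker would attach.

.. code-block:: haskell

    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE KindSignatures #-}
    {-# LANGUAGE TypeFamilies #-}

    module Main where

    import Data.Kind (Type)

    -- The pass index from the proposal text, promoted to a kind via DataKinds.
    data Pass = Parser | Renamer | TypeChecker

    -- Hypothetical extension family: the extra payload a variable occurrence
    -- carries, chosen separately for each pass.
    type family XVar (p :: Pass) :: Type
    type instance XVar 'Parser      = ()      -- nothing is known after parsing
    type instance XVar 'Renamer     = Int     -- stand-in for a unique name
    type instance XVar 'TypeChecker = String  -- stand-in for an inferred type

    -- A toy expression tree indexed by the pass that produced it.
    data Expr (p :: Pass)
      = Var (XVar p) String
      | App (Expr p) (Expr p)

    -- A toy "renamer": consumes the parser's tree and fills in the payload.
    rename :: Expr 'Parser -> Expr 'Renamer
    rename (Var () name) = Var (length name) name
    rename (App f x)     = App (rename f) (rename x)

    -- Collect the payloads; this only type checks for the renamed tree.
    uniques :: Expr 'Renamer -> [Int]
    uniques (Var u _) = [u]
    uniques (App f x) = uniques f ++ uniques x

    main :: IO ()
    main = print (uniques (rename (App (Var () "map") (Var () "f"))))
    -- prints [3,1]

Because the payload type is computed from the index, a function such as ``uniques`` cannot be applied to a tree that has only been parsed. That is the sense in which each consumer "adjusts" the AST for its own purpose.
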
From 25ffdebf295620861d1f8b26de86fd7f73eb41a4 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 10:44:13 -0400 Subject: [PATCH 06/45] Split abstract into two parts --- proposals/accepted/000-ast-parser-libs.rst | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index fa953caa..867b5a13 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -13,6 +13,9 @@ Split out AST and Parser libraries from GHC Abstract ======== +Problem +------- + The community lacks AST and Parser libraries for Haskell that are both self-contained and up-to-date. Experience has shown that there is only way one way to meet each criterion: @@ -21,7 +24,14 @@ Experience has shown that there is only way one way to meet each criterion: - Be separate from GHC, so the library is forced to be self-contained However, no library has so far done both, to meet both criteria. + +Solution +-------- + The purpose of this proposal is to make that library finally exist. +The Haskell Foundation will finance the completion of the existing "Trees that grow" project, decoupling GHC's AST and parser from the rest of the compiler so they can be moved to separate libraries. +Those libaries will be "normal" haskell libraries, without any weird dependencies or build process, and published on Hackage. +Those libraries will be used by GHC, ensuring they are maintained. 
Background, Prior Art, and Related Efforts ========================================== From 7600c24860a4793827f31969e1a701cefa7be682 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 10:44:58 -0400 Subject: [PATCH 07/45] Move footnote --- proposals/accepted/000-ast-parser-libs.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 867b5a13..90c50373 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -90,10 +90,6 @@ It presents these data types: The idea that they are "promoted" via ``DataKinds``, and then type families used in the AST will have instances for these promoted values. This allows those consumers to "adjust" the AST for their purpose. -.. [#intra] - It might sound like the goal is only different usages within GHC, but remember that ``template-haskell`` is a separate library used by users of Haskell not just developers of Haskell. - A goal of at least some usage outside GHC was always there. - The Trees That Grow project is now 6 years old, and has met great success in avoiding partiality in the compiler, "making illegal states unrepresentable" as many Haskellers would put it. But progress on `reducing AST & parser dependencies `_ has been less easily forthcoming. I have separated out the modules defining the AST under `Language.Haskell.Syntax.*` we wish to split out, and we have tests to track progress reducing their deps, and the parser's deps. @@ -105,6 +101,10 @@ It is thus hard to do this work with volunteers only, because it is emphatically The Haskell Foundation's support in getting this "over the finish line", at which point the community *will* benefit, and benefit greatly, is thus a crucial way we can surmount the coordination failure the lack of incremental payoff causes. +.. 
[#intra] + It might sound like the goal is only different usages within GHC, but remember that ``template-haskell`` is a separate library used by users of Haskell not just developers of Haskell. + A goal of at least some usage outside GHC was always there. + Problem Statement ================= From 4c215af1652225f1bdfea4c9c2d6677dc42da822 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 11:01:20 -0400 Subject: [PATCH 08/45] Start reworkin the TOC for the rest --- proposals/accepted/000-ast-parser-libs.rst | 57 ++++++++++++++-------- 1 file changed, 36 insertions(+), 21 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 90c50373..42731d24 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -105,39 +105,54 @@ The Haskell Foundation's support in getting this "over the finish line", at whic It might sound like the goal is only different usages within GHC, but remember that ``template-haskell`` is a separate library used by users of Haskell not just developers of Haskell. A goal of at least some usage outside GHC was always there. -Problem Statement -================= - -*This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. -It should also enumerate the requirements against which a solution should be evaluated.* - -*This section should describe prior attempts to solve the problem, other relevant prior work, and what others in the community are doing to address the problem. -It should describe the relationship between the proposed work and the existing efforts. 
-If past attempts did not succeed, this section should provide a theory of why not.* - -Technical Content -================= +Roadmap +======= *This section should describe the work that is being proposed to the community for comment, including both technical aspects (choices of system architecture, integration with existing tools and workflows) and community governance (how the developed project will be administered, maintained, and otherwise cared for in the future). It should also describe the benefits, drawbacks, and risks that are associated with these decisions. It can be a good idea to describe alternative approaches here as well, and why the proposer prefers the current approach.* -Timeline -======== - *Are there any deadlines that the HF needs to be aware of?* -Budget -====== - *How much money is needed to accomplish the goal? How will it be used?* +The project is split into two separate steps: separating the parser, and separating the AST. +Each step has a method, time estimate, and (most importantly) clear success criteria, including use by downstream projects to ensure value is delivered. +The intent is thus that they are self-contained, and can be individually funded. + +Separate the Parser +------------------- + +Split library +~~~~~~~~~~~~~ + +**Time Estimate:** ?? + +Proof of success: Use by Haddock +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +**Time Estimate:** ?? + +Separate the AST +---------------- + +Split library +~~~~~~~~~~~~~ + +**Time Estimate:** ?? + +Proof of success: Use by HLint +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +**Time Estimate:** ?? + Stakeholders ============ *Who stands to gain or lose from the implementation of this proposal? 
Proposals should identify stakeholders so that they can be contacted for input, and a final decision should not occur without having made a good-faith effort to solicit representative feedback from important stakeholder groups.* -Success -======= +- GHC Developers + +- Haddock Developers -*Under what conditions will the project be considered a success?* +- HLint Developers From 70de9599f117cad7220f59179a766c6bcbb756ff Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 12 Aug 2023 12:08:19 -0400 Subject: [PATCH 09/45] Add some stray notes --- proposals/accepted/000-ast-parser-libs.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 42731d24..a6a3ff2c 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -133,6 +133,8 @@ Proof of success: Use by Haddock **Time Estimate:** ?? +https://gitlab.haskell.org/ghc/ghc/-/issues/21592#note_519447 Note how this use-case only needs the AST not parser. + Separate the AST ---------------- @@ -156,3 +158,10 @@ Stakeholders - Haddock Developers - HLint Developers + +Future Work +=========== + +Factored out pretty print (exact print) + +Depends on resolution of things like https://gitlab.haskell.org/ghc/ghc/-/issues/23447 From 5bf9d912943776c53ebcccd74bbcf3c7a6addf69 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sun, 13 Aug 2023 11:37:30 -0400 Subject: [PATCH 10/45] [ast-parser-libs]: background-section: provide additional color --- proposals/accepted/000-ast-parser-libs.rst | 44 ++++++++++++---------- 1 file changed, 25 insertions(+), 19 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index a6a3ff2c..16527173 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -41,32 +41,38 @@ This section highlights various past and ongoing efforts accordingly. 
*This section is purely informative; readers familiar with this backstory can skip this section and move on to the proposal proper.* -``haskell-src-exts`` --------------------- +.. |haskell-src-exts| replace:: ``haskell-src-exts`` +.. _haskell-src-exts: https://hackage.haskell.org/package/haskell-src-exts -An older attempt is the venerable `haskell-src-exts `_ library. -This is actually was part of a larger project called the `"Haskell Suite" `_, the purpose of which was "to implement the whole Haskell compiler as a set of libraries". -However, the whole compiler doesn't appear to exist, and the project as a whole ceased development . -``haskell-src-exts`` lasted longer, but had great trouble keeping up with GHC, and is now also unmaintained since 2020. +.. |ghc-lib-parser| replace:: ``ghc-lib-parser`` +.. _ghc-lib-parser: https://hackage.haskell.org/package/ghc-lib-parser + +|haskell-src-exts| +------------------ + +An older attempt is the venerable "Haskell-Source with Extensions (HSE)" package, also known as |haskell-src-exts|_ . This is actually part of a larger project called the `"Haskell Suite" `_, the purpose of which was "to implement the whole Haskell compiler as a set of libraries". However, the whole compiler doesn't appear to exist, and the project as a whole ceased development . + +`HLint `_, the Haskell linter project developed continuously since 2008 relied heavily on |haskell-src-exts| which lasted longer than the rest of suite, but had great trouble keeping up with GHC. In Februrary 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". The lessons are clear: -``haskell-src-exts`` succeeded in being modular and self-contained, failed in trying to keep up with GHC. 
-Given the size of the community, competing head-on with GHC or trying to keep up with it is very difficult. -Being used by GHC, so keeping up happens automatically, is the clearest way to avoid this problem. -``ghc-lib-parser`` +- |haskell-src-exts| succeeded in being modular and self-contained but failed trying to keep up with GHC; +- given the size of the community, competing head-on with GHC or trying to keep up with it is very difficult: + + - being used by GHC, so keeping up happens automatically, is the clearest way to avoid this problem. + +|ghc-lib-parser| ------------------ -A newer attempt is `ghc-lib-parser `_ library. -Prior users of ``haskell-src-exts`` `like HLint `_ have largely migrated from ``haskell-src-exts`` to ``ghc-lib-parser``. -But ``ghc-lib-parser`` is *not* actually a separately developed library; rather it is one that is extracted from GHC's own source. -This ensures it is up to date with the latest behavior. -But this extraction process is complex, and results in a far larger library than is desired: see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. -All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a harder time presenting any sort of stable interface. +In June 2019, `HLint began transitition `_ to |ghc-lib-parser| and in May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#exampleghclibparserusers]_. + +A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library [#ghcinception]_. 
This ensures it is up to date with the latest behavior but this extraction process is complex, requires constant patching to keep pace with GHC evolution, and results in a far larger library than is desired: see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a hard time presenting a stable interface. + +Whereas |ghc-lib-parser| succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. + +.. [#ghcinception] The extraction process was enabled by insights gained from the `"GHCinception" `_ or "GHC in GHCi" initative. -The developers of ``ghc-lib-parser`` would not dispute the above criticisms, for this state of affairs was never intended to be a permanent solution, but rather just a stop gap. -Whereas ``ghc-lib-parser`` succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. -Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. +.. [#exampleghclibparserusers] Today for example, notable users include `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. 
Trees that grow --------------- From f977acf170f74969e5eabf1702dd9cba972add34 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 15:35:32 -0400 Subject: [PATCH 11/45] More work on the proposal proper --- proposals/accepted/000-ast-parser-libs.rst | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index a6a3ff2c..50ee146b 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -116,17 +116,24 @@ It can be a good idea to describe alternative approaches here as well, and why t *How much money is needed to accomplish the goal? How will it be used?* -The project is split into two separate steps: separating the parser, and separating the AST. +The project is split into two separate steps: separating the AST, and separating the parser. Each step has a method, time estimate, and (most importantly) clear success criteria, including use by downstream projects to ensure value is delivered. The intent is thus that they are self-contained, and can be individually funded. -Separate the Parser -------------------- + +Separate the AST +---------------- Split library ~~~~~~~~~~~~~ -**Time Estimate:** ?? +**Time Estimate:** 1--2 Weeks + +The first step is just separating data definitions. +We don't need to worry about code entangling, just data entangling. + +The timeline for this is pretty short because there exists an easy last-resort way to decouple anything: +just add another TTG type family. Proof of success: Use by Haddock ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -135,11 +142,8 @@ Proof of success: Use by Haddock https://gitlab.haskell.org/ghc/ghc/-/issues/21592#note_519447 Note how this use-case only needs the AST not parser. -Separate the AST ----------------- - -Split library -~~~~~~~~~~~~~ +Separate the Parser +------------------- **Time Estimate:** ?? 
From 9420a5f3d91490d6400192769e661e56b6676d8b Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 15:37:32 -0400 Subject: [PATCH 12/45] Restore breaking on punctuation This makes the diff easier to read with (crude, bad for prose) line-based diffing. --- proposals/accepted/000-ast-parser-libs.rst | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index c892b8c3..1b8a2ef5 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -50,9 +50,12 @@ This section highlights various past and ongoing efforts accordingly. |haskell-src-exts| ------------------ -An older attempt is the venerable "Haskell-Source with Extensions (HSE)" package, also known as |haskell-src-exts|_ . This is actually part of a larger project called the `"Haskell Suite" `_, the purpose of which was "to implement the whole Haskell compiler as a set of libraries". However, the whole compiler doesn't appear to exist, and the project as a whole ceased development . +An older attempt is the venerable "Haskell-Source with Extensions (HSE)" package, also known as |haskell-src-exts|_ . +This is actually part of a larger project called the `"Haskell Suite" `_, the purpose of which was "to implement the whole Haskell compiler as a set of libraries". +However, the whole compiler doesn't appear to exist, and the project as a whole ceased development . -`HLint `_, the Haskell linter project developed continuously since 2008 relied heavily on |haskell-src-exts| which lasted longer than the rest of suite, but had great trouble keeping up with GHC. In Februrary 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". 
+`HLint `_, the Haskell linter project developed continuously since 2008 relied heavily on |haskell-src-exts| which lasted longer than the rest of suite, but had great trouble keeping up with GHC. +In Februrary 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". The lessons are clear: @@ -64,11 +67,16 @@ The lessons are clear: |ghc-lib-parser| ------------------ -In June 2019, `HLint began transitition `_ to |ghc-lib-parser| and in May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#exampleghclibparserusers]_. +In June 2019, `HLint began transitition `_ to |ghc-lib-parser| and in May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. +Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#exampleghclibparserusers]_. -A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library [#ghcinception]_. This ensures it is up to date with the latest behavior but this extraction process is complex, requires constant patching to keep pace with GHC evolution, and results in a far larger library than is desired: see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a hard time presenting a stable interface. +A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library [#ghcinception]_. 
+This ensures it is up to date with the latest behavior but this extraction process is complex, requires constant patching to keep pace with GHC evolution, and results in a far larger library than is desired: +see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. +All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a hard time presenting a stable interface. -Whereas |ghc-lib-parser| succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. +Whereas |ghc-lib-parser| succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. +Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. .. [#ghcinception] The extraction process was enabled by insights gained from the `"GHCinception" `_ or "GHC in GHCi" initative. 
From 5be86812ce8d444ae5ea7e984a947254c57af2d7 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 15:41:06 -0400 Subject: [PATCH 13/45] Add blank line between bullets --- proposals/accepted/000-ast-parser-libs.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 1b8a2ef5..01a92503 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -60,6 +60,7 @@ In Februrary 2019, |ghc-lib-parser| `was released Date: Sun, 13 Aug 2023 15:47:13 -0400 Subject: [PATCH 14/45] Add Shayne Fletcher as coauthor :) --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 01a92503..9ec1f89c 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -5,7 +5,7 @@ Split out AST and Parser libraries from GHC :Date: August 2023 :Authors: John Ericson, - ??? + Shayne Fletcher .. sectnum:: .. contents:: From 5f9055298fe919ce3f7992683522e86901e32749 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 16:11:46 -0400 Subject: [PATCH 15/45] Pull HLint into new "downstream projects" section --- proposals/accepted/000-ast-parser-libs.rst | 51 +++++++++++++++------- 1 file changed, 36 insertions(+), 15 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 9ec1f89c..a8ef9b1c 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -47,41 +47,62 @@ This section highlights various past and ongoing efforts accordingly. .. |ghc-lib-parser| replace:: ``ghc-lib-parser`` .. _ghc-lib-parser: https://hackage.haskell.org/package/ghc-lib-parser +.. 
_HLint: https://hackage.haskell.org/package/hlint + |haskell-src-exts| ------------------ An older attempt is the venerable "Haskell-Source with Extensions (HSE)" package, also known as |haskell-src-exts|_ . This is actually part of a larger project called the `"Haskell Suite" `_, the purpose of which was "to implement the whole Haskell compiler as a set of libraries". -However, the whole compiler doesn't appear to exist, and the project as a whole ceased development . - -`HLint `_, the Haskell linter project developed continuously since 2008 relied heavily on |haskell-src-exts| which lasted longer than the rest of suite, but had great trouble keeping up with GHC. -In Februrary 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". - -The lessons are clear: - -- |haskell-src-exts| succeeded in being modular and self-contained but failed trying to keep up with GHC; - -- given the size of the community, competing head-on with GHC or trying to keep up with it is very difficult: - - - being used by GHC, so keeping up happens automatically, is the clearest way to avoid this problem. +However, the whole compiler doesn't appear to exist, and the project as a whole ceased development. +``haskell-src-exts`` lasted longer, but had great trouble keeping up with GHC, and is now also unmaintained since 2020. |ghc-lib-parser| ------------------ -In June 2019, `HLint began transitition `_ to |ghc-lib-parser| and in May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. -Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#exampleghclibparserusers]_. 
+In Februrary 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library [#ghcinception]_. This ensures it is up to date with the latest behavior but this extraction process is complex, requires constant patching to keep pace with GHC evolution, and results in a far larger library than is desired: see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a hard time presenting a stable interface. +The lesson from |haskell-src-exts| are clear: + +- it succeeded in being modular and self-contained but failed trying to keep up with GHC; + +- given the size of the community, competing head-on with GHC or trying to keep up with it is very difficult: + + - being used by GHC, so keeping up happens automatically, is the clearest way to avoid this problem. + Whereas |ghc-lib-parser| succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. .. [#ghcinception] The extraction process was enabled by insights gained from the `"GHCinception" `_ or "GHC in GHCi" initative. -.. [#exampleghclibparserusers] Today for example, notable users include `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. +Downstream projects +------------------- + +In addition to going over the major AST/parser libraries for Haskell, it is also useful to talk about the most notable projects that use them. 
+Ultimately, it is those projects we want to help out. + +HLint_, the Haskell linter project developed continuously since 2008 is the most notable one, not just because its longstanding and wide use, but also because many of the developers that worked on the previous two libraries also worked on it --- use by HLint served as proof the libraries were fit-for-purpose. + +In June 2019, +following the release of |ghc-lib-parser| and deprecation of |haskell-src_exts| earlier that year as described above, +`HLint began the transitition `_ to |ghc-lib-parser|. +In May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. + +Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#exampleghclibparserusers]_. +But just because all these projects are using |ghc-lib-parser| doesn't mean everything is well. +**Insert quote about maintainence overhead.** +The cost of detail with changes to the AST is inevitable --- supporting new language features will inevitably cost developer time. +But all the other busywork of re-extracting the code, etc., is entirely avoidable, *not* inherent to the task at hand. + +It is the opinion of the authors of this proposal that should an independent AST parser libraries be maintained upstream with GHC, the costs saved for downstream developers should _greatly_ exceed any costs incurred by GHC developers. +The goal is thus *not* to simply shift a burden from one group of community members to another, but create a positive-sum outcome where there is far less busywork and more flourishing tooling than before. + +.. [#exampleghclibparserusers] Today for example, notable users include HLint_, `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. 
Trees that grow --------------- From f865e4285ffb4fbd6aa35cc7515bcefa5a4d5a98 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sun, 13 Aug 2023 16:56:51 -0400 Subject: [PATCH 16/45] typo --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index a8ef9b1c..bab71a9a 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -89,7 +89,7 @@ Ultimately, it is those projects we want to help out. HLint_, the Haskell linter project developed continuously since 2008 is the most notable one, not just because its longstanding and wide use, but also because many of the developers that worked on the previous two libraries also worked on it --- use by HLint served as proof the libraries were fit-for-purpose. In June 2019, -following the release of |ghc-lib-parser| and deprecation of |haskell-src_exts| earlier that year as described above, +following the release of |ghc-lib-parser| and deprecation of |haskell-src-exts| earlier that year as described above, `HLint began the transitition `_ to |ghc-lib-parser|. In May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. From b24e67899ce6aeded278b6b53dee713f1ad424bd Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 16:56:12 -0400 Subject: [PATCH 17/45] Talk about the type family trade-off --- proposals/accepted/000-ast-parser-libs.rst | 25 ++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index bab71a9a..1fbdd655 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -170,6 +170,31 @@ We don't need to worry about code entangling, just data entangling. 
The timeline for this is pretty short because there exists an easy last-resort way to decouple anything: just add another TTG type family. +This came up with some acrimony in `GHC Issue #21628 `_, discussing whether it was better to try to change GHC's ``FastString`` or abstract over it. +The purpose of this proposal isn't to relitigate that issue, but because this proposal *is* about resource allocation, something does need to be said on the broader tradeoffs at play + +There is no disagreement that as-is, that data type is not suitable for a nice self-contained library. [#faststring-unsuitable]_ +The disagreement is whether TTG should be blocked on reworking ``FastString`` somehow to be better for GHC and non-GHC alike, or whether we should just side-step the issue entirely. + +I make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorthms might take `**Days to Weeks**`, we can side-step the issue with a new ``StringP`` type family "extension point" like the existing ``IdP`` one in **minutes**. [#extension-point]_ + +Out of a basic fiduciary towards the ``Haskell Foundation``, we thus declare that unless "Plan A" works out very quickly, "Plan B" of just introducing another extension point should be used. +We can also revisit the issue later, *after* we have our factored-out AST library. + +.. [#faststring-unsuitable] + Everyone agrees it is insuitable in its current state because things like: + + - Global state because of `string interning `, with a global variable baked into the RTS no less! + + - Memoizing features for other parts of the compiler unrelated to parsing, such as the `"Z-Encoding" ` GHC happens to use for object file symbol `name mangling `. + + Everyone *also* agrees that it is worth revising whether these algorithmic decision still make sense given modern hardware, see `GHC Issue #17259 `_. + +.. 
[#extension-point] + "Extension point" is Trees That Grow parlance for such a type family. + The idea is that the AST library no longer refers to a data type like ``FastString`` directory, but instead refers to an abstract ``StringP p``. + Then, GHC can define `StringP (GhcPass _) = FastString`` to use it client side, across all compilation passes. + All term-level code continues to works exactly the same as before without modification. Proof of success: Use by Haddock ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From 3399bc162f666b5ed16ecca4fd6c9b2a084e650f Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 17:00:21 -0400 Subject: [PATCH 18/45] Fix typo --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 1fbdd655..26f9e70f 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -176,7 +176,7 @@ The purpose of this proposal isn't to relitigate that issue, but because this pr There is no disagreement that as-is, that data type is not suitable for a nice self-contained library. [#faststring-unsuitable]_ The disagreement is whether TTG should be blocked on reworking ``FastString`` somehow to be better for GHC and non-GHC alike, or whether we should just side-step the issue entirely. -I make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorthms might take `**Days to Weeks**`, we can side-step the issue with a new ``StringP`` type family "extension point" like the existing ``IdP`` one in **minutes**. 
[#extension-point]_ +I make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorthms might take **Days to Weeks**, we can side-step the issue with a new ``StringP`` type family "extension point" like the existing ``IdP`` one in **minutes**. [#extension-point]_ Out of a basic fiduciary towards the ``Haskell Foundation``, we thus declare that unless "Plan A" works out very quickly, "Plan B" of just introducing another extension point should be used. We can also revisit the issue later, *after* we have our factored-out AST library. From 23dbec93708e1de8c9bee0b67c7fa9c35c765ce1 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 17:00:50 -0400 Subject: [PATCH 19/45] Try to fix en dash --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 26f9e70f..6e2dd0d2 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -163,7 +163,7 @@ Separate the AST Split library ~~~~~~~~~~~~~ -**Time Estimate:** 1--2 Weeks +**Time Estimate:** 1 -- 2 Weeks The first step is just separating data definitions. We don't need to worry about code entangling, just data entangling. From 2f38b7c3a4dabbf0e2ec68ec900b8594674436c2 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 17:01:18 -0400 Subject: [PATCH 20/45] Fix more formatting --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 6e2dd0d2..152c60e1 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -193,7 +193,7 @@ We can also revisit the issue later, *after* we have our factored-out AST librar .. 
[#extension-point] "Extension point" is Trees That Grow parlance for such a type family. The idea is that the AST library no longer refers to a data type like ``FastString`` directory, but instead refers to an abstract ``StringP p``. - Then, GHC can define `StringP (GhcPass _) = FastString`` to use it client side, across all compilation passes. + Then, GHC can define ``StringP (GhcPass _) = FastString`` to use it client side, across all compilation passes. All term-level code continues to works exactly the same as before without modification. Proof of success: Use by Haddock From 838057f81421a304d3618cb5b8d66ebdeb82790e Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 17:22:26 -0400 Subject: [PATCH 21/45] Discuss work to be done --- proposals/accepted/000-ast-parser-libs.rst | 32 ++++++++++++++++++++-- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 152c60e1..c5188cff 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -128,7 +128,7 @@ This allows those consumers to "adjust" the AST for their purpose. The Trees That Grow project is now 6 years old, and has met great success in avoiding partiality in the compiler, "making illegal states unrepresentable" as many Haskellers would put it. But progress on `reducing AST & parser dependencies `_ has been less easily forthcoming. -I have separated out the modules defining the AST under `Language.Haskell.Syntax.*` we wish to split out, and we have tests to track progress reducing their deps, and the parser's deps. +I have separated out the modules defining the AST under ``Language.Haskell.Syntax.*`` we wish to split out, and we have tests to track progress reducing their deps, and the parser's deps. But progress is unsteady and unpredictable. 
The basic problem is that the benefits don't actually kick in until the deps are *all* gone, and the code is actually separated out. @@ -163,10 +163,19 @@ Separate the AST Split library ~~~~~~~~~~~~~ -**Time Estimate:** 1 -- 2 Weeks +**Time Estimate:** 1 – 2 Weeks The first step is just separating data definitions. We don't need to worry about code entangling, just data entangling. +We have already separated those data definitions into modules in the ``Language.Haskell.Syntax.*`` namespace. + +Concretely, the work in this step is to: + +#. Modify those modules to not import any other modules in ``ghc`` (``GHC.*`` modules). + +#. Move those modules to a new separate AST library in the GHC repo. + +#. Adjust ``build-depends`` across the repo so ``ghc`` and any other Haskell Package gets those modules from the new library instead, and CI passes. The timeline for this is pretty short because there exists an easy last-resort way to decouple anything: just add another TTG type family. @@ -201,18 +210,35 @@ Proof of success: Use by Haddock **Time Estimate:** ?? -https://gitlab.haskell.org/ghc/ghc/-/issues/21592#note_519447 Note how this use-case only needs the AST not parser. +It might seem odd that there is a real-world use case for an AST without a Parser, but we do in fact have one: a Summer of Haskell project reducing Haddock's dependencies on GHC. +The situation is nicely described by Laurent who is mentoring the project `here `_, but we'll recap the basics: + +Haddock as a whole is still using the complete ``ghc`` library, and parsing is continuing to happen that way. +Individual rendering backends, however, are being split out into separate packages, and those are only using the ``Language.Haskell.Syntax.*`` modules. + +That is all being done by the Summer of Haskell project. +What is to be done in this step is to make those backend packages just depend on the new AST library. 
+If the Summer of Haskell projects succeeds, this should be very easy since it is precisely those ``Language.Haskell.Syntax.*`` modules that will end up in the AST library. +All code should continue to work as before, since ``ghc`` will also use the new AST library, and thus the parsing initiated by the frontend and the backends should automatically agree on data structures. Separate the Parser ------------------- **Time Estimate:** ?? +This work is more uncertain, because the parser and post-processing steps necessary to get an actual AST may use utility functions currently entangled with the rest of the compiler. +It maybe be the case that we need to finish the far more certain first step (AST library) to get better clarity on what work remains for the parser, and thus price this step accurately. + Proof of success: Use by HLint ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Time Estimate:** ?? +We will continue the tradition discussed in the background section of using HLint to validate that parsers for Haskell are usable by real-world programs that are not GHC. + +The migration from |haskell-src-exts| to |ghc-lib-parser| was quite difficult because those libraries are nothing alike. +In contrast, we expect the migration from |ghc-lib-parser| to the new AST and parser libraries to be quite simple and pleasant, because the two new libraries should be very similar to |ghc-lib-parser|, and where they differ they should be strictly easier to use than before. 
+ Stakeholders ============ From f74b4f02a279d25e171c80a9880c17a8241fde07 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 17:32:53 -0400 Subject: [PATCH 22/45] Try adding TODO --- proposals/accepted/000-ast-parser-libs.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index c5188cff..1bed7dd9 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -239,6 +239,9 @@ We will continue the tradition discussed in the background section of using HLin The migration from |haskell-src-exts| to |ghc-lib-parser| was quite difficult because those libraries are nothing alike. In contrast, we expect the migration from |ghc-lib-parser| to the new AST and parser libraries to be quite simple and pleasant, because the two new libraries should be very similar to |ghc-lib-parser|, and where they differ they should be strictly easier to use than before. +.. todo:: + Any more detail we can write here? + Stakeholders ============ From 5fe4b808f516461097f62008c7581fdb4593a594 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 18:07:01 -0400 Subject: [PATCH 23/45] A few more things --- proposals/accepted/000-ast-parser-libs.rst | 39 ++++++++++++++++++---- 1 file changed, 33 insertions(+), 6 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 1bed7dd9..ceb1e612 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -62,7 +62,7 @@ However, the whole compiler doesn't appear to exist, and the project as a whole In Februrary 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". 
-A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library [#ghcinception]_. +A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library [#ghc-inception]_. This ensures it is up to date with the latest behavior but this extraction process is complex, requires constant patching to keep pace with GHC evolution, and results in a far larger library than is desired: see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a hard time presenting a stable interface. @@ -78,7 +78,7 @@ The lesson from |haskell-src-exts| are clear: Whereas |ghc-lib-parser| succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. -.. [#ghcinception] The extraction process was enabled by insights gained from the `"GHCinception" `_ or "GHC in GHCi" initative. +.. [#ghc-inception] The extraction process was enabled by insights gained from the `"GHCinception" `_ or "GHC in GHCi" initative. Downstream projects ------------------- @@ -93,7 +93,7 @@ following the release of |ghc-lib-parser| and deprecation of |haskell-src-exts| `HLint began the transitition `_ to |ghc-lib-parser|. In May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. -Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#exampleghclibparserusers]_. +Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#example-ghc-lib-parser-users]_. But just because all these projects are using |ghc-lib-parser| doesn't mean everything is well. 
**Insert quote about maintainence overhead.** The cost of detail with changes to the AST is inevitable --- supporting new language features will inevitably cost developer time. @@ -102,7 +102,7 @@ But all the other busywork of re-extracting the code, etc., is entirely avoidabl It is the opinion of the authors of this proposal that should an independent AST parser libraries be maintained upstream with GHC, the costs saved for downstream developers should _greatly_ exceed any costs incurred by GHC developers. The goal is thus *not* to simply shift a burden from one group of community members to another, but create a positive-sum outcome where there is far less busywork and more flourishing tooling than before. -.. [#exampleghclibparserusers] Today for example, notable users include HLint_, `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. +.. [#example-ghc-lib-parser-users] Today for example, notable users include HLint_, `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. Trees that grow --------------- @@ -229,6 +229,9 @@ Separate the Parser This work is more uncertain, because the parser and post-processing steps necessary to get an actual AST may use utility functions currently entangled with the rest of the compiler. It maybe be the case that we need to finish the far more certain first step (AST library) to get better clarity on what work remains for the parser, and thus price this step accurately. +.. todo:: + Any more detail we can write here? + Proof of success: Use by HLint ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -249,13 +252,37 @@ Stakeholders - GHC Developers + The proposal is asking that we change out code in GHC is organized, so it is crucial that we solicit feedback from the broader `GHC Team `_, and the narrow `GHC HQ group `_ in particular. 
+ It is John's understanding that the GHC developers are broadly supportive of the goal here in the abstract, + (after all, SPJ was an author of the Trees That Grow paper), + but some of the specific details needed to get this done in a timely manner may be more controversial. + + In particular, introducing more extension points to ensure rapid progress was very controversial before, and in return for putting up with such a thing as stop-gap, the GHC HQ might want something in return, like an additional phase of work to eliminate the new extension points afterwords. + - Haddock Developers + The Haddock maintainers will likewise be maintaining the result of the Summer of Code project, along with the integration work done as part of this. + We should ensure that they are satisfied with the work being done here and it comports with their overall desires for the project. + - HLint Developers + The HLint developers have been heavily involved with reusable AST and parser work every step of the way, and should continue to be involved with this too. + In addition, we've chosen HLint to be the integration step for the second half just like Haddock was in the first. + Thankfully, one of the HLint developers, Shayne Fletcher, is also a co-author of this proposal! + Future Work =========== -Factored out pretty print (exact print) +Pretty-printing +--------------- + +Just as it is nice to accompany the AST with logic to convert raw text syntax to it (the parser), +so it is nice to also accompany the AST with logic to do the opposite: render back to text (the pretty-printer). + +There has been much work to allow this to be done in a faithful round trip, know as "exact-print" functionality. +However, the detail of how this works are still fast-evolving. [#exact-print-evolving] + +We therefore think it is best to leave factoring out the pretty-printer into a reusable library (either part of the parser library, or a new 3rd reusable library) as a future work. 
-Depends on resolution of things like https://gitlab.haskell.org/ghc/ghc/-/issues/23447 +.. [#exact-print-evolving] + See `GHC Issue #23447 `_ for example. From cfa7907e93f4469ed7f6f8dfc472c0bb02f06877 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 18:07:08 -0400 Subject: [PATCH 24/45] Discuss reinstallable GHC lib --- proposals/accepted/000-ast-parser-libs.rst | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index ceb1e612..d5d58c55 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -141,6 +141,20 @@ The Haskell Foundation's support in getting this "over the finish line", at whic It might sound like the goal is only different usages within GHC, but remember that ``template-haskell`` is a separate library used by users of Haskell not just developers of Haskell. A goal of at least some usage outside GHC was always there. +Reinstallable GHC Lib +--------------------- + +One of the problems ``ghc-lib-parser`` aims to solve is that ``ghc`` the library is current cumbersome to install as a regular haskell library (as opposed to by switching toolchains entirely). +There is currently work in flight to solve that. +One that is done, projects like HLint_ *could* just depend on ``ghc`` directly, and still be easily buildable (with Cabal / with Stack / from Hackage) as today. + +Just doing this isn't a good solution though, because ``ghc`` exposes a much a wider surface area than what these projects actually want. +For stability's sake, it is better that those libraries dependent on narrower parsing / AST libraries that only provide what they need. 
+And longer term, we hope the "tug of war" of between GHC and these projects as consumers of those libraries, versus just the others having to deal with whatever GHC does with just itself in mind, will result in a higher-quality, more flexible, and overall friendlier library. + +In `this comment `_, it is suggested that factoring out the AST and parser can be a good first step making a more modular in GHC in general. +This proposal wish to *stay neutral* on the merits of such a future direction, but it would be remiss not to at least highlight it as one possible outcome. + Roadmap ======= From 2802338888c5aa845fc37dde0f4a5cc2632fbc58 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 18:13:43 -0400 Subject: [PATCH 25/45] Slight reformat subsections not bullets --- proposals/accepted/000-ast-parser-libs.rst | 29 ++++++++++++---------- 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index d5d58c55..08d56611 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -264,25 +264,28 @@ Stakeholders *Who stands to gain or lose from the implementation of this proposal? Proposals should identify stakeholders so that they can be contacted for input, and a final decision should not occur without having made a good-faith effort to solicit representative feedback from important stakeholder groups.* -- GHC Developers +GHC Developers +-------------- - The proposal is asking that we change out code in GHC is organized, so it is crucial that we solicit feedback from the broader `GHC Team `_, and the narrow `GHC HQ group `_ in particular. - It is John's understanding that the GHC developers are broadly supportive of the goal here in the abstract, - (after all, SPJ was an author of the Trees That Grow paper), - but some of the specific details needed to get this done in a timely manner may be more controversial. 
+The proposal is asking that we change out code in GHC is organized, so it is crucial that we solicit feedback from the broader `GHC Team `_, and the narrow `GHC HQ group `_ in particular. +It is John's understanding that the GHC developers are broadly supportive of the goal here in the abstract, +(after all, SPJ was an author of the Trees That Grow paper), +but some of the specific details needed to get this done in a timely manner may be more controversial. - In particular, introducing more extension points to ensure rapid progress was very controversial before, and in return for putting up with such a thing as stop-gap, the GHC HQ might want something in return, like an additional phase of work to eliminate the new extension points afterwords. +In particular, introducing more extension points to ensure rapid progress was very controversial before, and in return for putting up with such a thing as stop-gap, the GHC HQ might want something in return, like an additional phase of work to eliminate the new extension points afterwords. -- Haddock Developers +Haddock Developers +------------------ - The Haddock maintainers will likewise be maintaining the result of the Summer of Code project, along with the integration work done as part of this. - We should ensure that they are satisfied with the work being done here and it comports with their overall desires for the project. +The Haddock maintainers will likewise be maintaining the result of the Summer of Code project, along with the integration work done as part of this. +We should ensure that they are satisfied with the work being done here and it comports with their overall desires for the project. -- HLint Developers +HLint Developers +---------------- - The HLint developers have been heavily involved with reusable AST and parser work every step of the way, and should continue to be involved with this too. 
- In addition, we've chosen HLint to be the integration step for the second half just like Haddock was in the first. - Thankfully, one of the HLint developers, Shayne Fletcher, is also a co-author of this proposal! +The HLint developers have been heavily involved with reusable AST and parser work every step of the way, and should continue to be involved with this too. +In addition, we've chosen HLint to be the integration step for the second half just like Haddock was in the first. +Thankfully, one of the HLint developers, Shayne Fletcher, is also a co-author of this proposal! Future Work =========== From 3117c53578df6270eff465a001d19733ef94f6a2 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 23:32:36 -0400 Subject: [PATCH 26/45] Fix footnote ref, normalize footnote syntax --- proposals/accepted/000-ast-parser-libs.rst | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 08d56611..a13bdc0c 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -78,7 +78,8 @@ The lesson from |haskell-src-exts| are clear: Whereas |ghc-lib-parser| succeeds in keeping up with GHC because it *is* GHC, it fails in being self-contained because modularity cannot be `"fixed in post" [production] `_. Code that is intended to be separate from any one consumer must be developed with those boundaries enforced during development. -.. [#ghc-inception] The extraction process was enabled by insights gained from the `"GHCinception" `_ or "GHC in GHCi" initative. +.. [#ghc-inception] + The extraction process was enabled by insights gained from the `"GHCinception" `_ or "GHC in GHCi" initative. 
Downstream projects ------------------- @@ -102,7 +103,8 @@ But all the other busywork of re-extracting the code, etc., is entirely avoidabl It is the opinion of the authors of this proposal that should an independent AST parser libraries be maintained upstream with GHC, the costs saved for downstream developers should _greatly_ exceed any costs incurred by GHC developers. The goal is thus *not* to simply shift a burden from one group of community members to another, but create a positive-sum outcome where there is far less busywork and more flourishing tooling than before. -.. [#example-ghc-lib-parser-users] Today for example, notable users include HLint_, `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. +.. [#example-ghc-lib-parser-users] + Today for example, notable users include HLint_, `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. Trees that grow --------------- @@ -297,7 +299,7 @@ Just as it is nice to accompany the AST with logic to convert raw text syntax to so it is nice to also accompany the AST with logic to do the opposite: render back to text (the pretty-printer). There has been much work to allow this to be done in a faithful round trip, know as "exact-print" functionality. -However, the detail of how this works are still fast-evolving. [#exact-print-evolving] +However, the detail of how this works are still fast-evolving. [#exact-print-evolving]_ We therefore think it is best to leave factoring out the pretty-printer into a reusable library (either part of the parser library, or a new 3rd reusable library) as a future work. From f13ee3157f34d591e1b6e20429a09292a3707af4 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sun, 13 Aug 2023 23:34:04 -0400 Subject: [PATCH 27/45] "Haskell Foundation" is not a code snippet! 
--- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index a13bdc0c..e9fa8408 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -203,7 +203,7 @@ The disagreement is whether TTG should be blocked on reworking ``FastString`` so I make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorthms might take **Days to Weeks**, we can side-step the issue with a new ``StringP`` type family "extension point" like the existing ``IdP`` one in **minutes**. [#extension-point]_ -Out of a basic fiduciary towards the ``Haskell Foundation``, we thus declare that unless "Plan A" works out very quickly, "Plan B" of just introducing another extension point should be used. +Out of a basic fiduciary towards the Haskell Foundation, we thus declare that unless "Plan A" works out very quickly, "Plan B" of just introducing another extension point should be used. We can also revisit the issue later, *after* we have our factored-out AST library. .. 
[#faststring-unsuitable] From 219a1439646b5a94376b8e8ca2541d511ebc9d57 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Tue, 15 Aug 2023 20:25:19 -0400 Subject: [PATCH 28/45] Include the time Shayne spends maintaining HLint --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index e9fa8408..e7ca10ab 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -96,7 +96,7 @@ In May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#example-ghc-lib-parser-users]_. But just because all these projects are using |ghc-lib-parser| doesn't mean everything is well. -**Insert quote about maintainence overhead.** +Shayne Fetcher reports that keeping up with the latest GHC chagnes with the |ghc-lib-parser|/``ghc-lib``/``ghc-lib-parser-ex``/HLint stack generally costs him **an hour or two a week, and often more**. The cost of detail with changes to the AST is inevitable --- supporting new language features will inevitably cost developer time. But all the other busywork of re-extracting the code, etc., is entirely avoidable, *not* inherent to the task at hand. 
From cca090addf01f09f994d176399e65daec890dd2d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Laurent=20P=2E=20Ren=C3=A9=20de=20Cotret?= Date: Thu, 17 Aug 2023 07:37:46 -0400 Subject: [PATCH 29/45] Mention that the SoH project will be finished by Laurent if need be --- proposals/accepted/000-ast-parser-libs.rst | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index e7ca10ab..11ed235e 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -5,7 +5,8 @@ Split out AST and Parser libraries from GHC :Date: August 2023 :Authors: John Ericson, - Shayne Fletcher + Shayne Fletcher, + Laurent P. René de Cotret .. sectnum:: .. contents:: @@ -224,17 +225,17 @@ We can also revisit the issue later, *after* we have our factored-out AST librar Proof of success: Use by Haddock ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -**Time Estimate:** ?? +**Time Estimate:** 4 weeks of part-time work -It might seem odd that there is a real-world use case for an AST without a Parser, but we do in fact have one: a Summer of Haskell project reducing Haddock's depedencies on GHC. +It might seem odd that there is a real-world use case for an AST without a Parser, but we do in fact have one: a `Haskell Foundation Technical Proposal `_ and associated `Summer of Haskell `_ project reducing Haddock's depedencies on GHC. The situation is nicely described by Laurent who is mentoring the project `here `_, but we'll recap the basics: -Haddock as whole is still using the complete ``ghc`` library, and parsing is continuing to happen that way. +Haddock as a whole is still using the complete ``ghc`` library, and parsing is continuing to happen that way. Individual rendering backends, however, are being split out into separate packages, and those are only using the ``Language.Haskell.Syntax.*`` modules. -That is all being done by the Summer of Haskell project. 
+That is all being done by the Summer of Haskell project, and will be finished by Laurent if need be once the project is over. What is to be done in this step is to make those backend packages just depend on the new AST library. -If the Summer of Haskell projects succeeds, this should be very easy since it is precisely those ``Language.Haskell.Syntax.*`` modules that will end up in the AST library. +This should be straightforward since it is precisely those ``Language.Haskell.Syntax.*`` modules that will end up in the AST library. All code should continue to work as before, since ``ghc`` will also use the new AST library, and thus the parsing initiated by the frontend and the backends should automatically agree on data structures. Separate the Parser From 10b4bb9e4315d495367e77952cafc5a04bff57a9 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Mon, 21 Aug 2023 16:59:50 -0400 Subject: [PATCH 30/45] Update volunteer info --- proposals/accepted/000-ast-parser-libs.rst | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 11ed235e..50f9d116 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -180,6 +180,8 @@ Separate the AST Split library ~~~~~~~~~~~~~ +**Executor**: Haskell Foundation + **Time Estimate:** 1 – 2 Weeks The first step is just separating data definitions. @@ -225,6 +227,8 @@ We can also revisit the issue later, *after* we have our factored-out AST librar Proof of success: Use by Haddock ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +**Executor**: Laurent P. René de Cotret (volunteer) + **Time Estimate:** 4 weeks of part-time work It might seem odd that there is a real-world use case for an AST without a Parser, but we do in fact have one: a `Haskell Foundation Technical Proposal `_ and associated `Summer of Haskell `_ project reducing Haddock's depedencies on GHC. 
@@ -241,6 +245,8 @@ All code should continue to work as before, since ``ghc`` will also use the new Separate the Parser ------------------- +**Executor**: Haskell Foundation + **Time Estimate:** ?? This work is more uncertain, because the parser and post-processing steps necessary to get an actual AST may use utility functions currently entangled with the rest of the compiler. @@ -252,15 +258,16 @@ It maybe be the case that we need to finish the far more certain first step (AST Proof of success: Use by HLint ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -**Time Estimate:** ?? +**Executor**: Shayne Fletcher (volunteer) + +**Time Estimate:** 6 weeks of part-time work We will continue the tradition discussed in the background section of using HLint to validate that parsers for Haskell are usable by real-world programs that are not GHC. The migration from |haskell-src-exts| to |ghc-lib-parser| was quite difficult because those libraries are nothing alike. In contrast, we expect the migration from |ghc-lib-parser| to the new AST and parser libraries to be quite simple and pleasant, because the two new libraries should be very similar to |ghc-lib-parser|, and where they differ they should be strictly easier to use than before. -.. todo:: - Any more detail we can write here? +Shayne Fetcher volunteers to lead this integration as a core HLint maintainer. 
Stakeholders ============ From 7c5171a4bd2f01177596ae35741ead9736b3bd2e Mon Sep 17 00:00:00 2001 From: John Ericson Date: Mon, 28 Aug 2023 00:49:12 -0400 Subject: [PATCH 31/45] Discuss issues foreseen by Shayne Fletcher with the HLint integration John decided to bump up the estimate accordingly --- proposals/accepted/000-ast-parser-libs.rst | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 50f9d116..111b2b2d 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -260,14 +260,30 @@ Proof of success: Use by HLint **Executor**: Shayne Fletcher (volunteer) -**Time Estimate:** 6 weeks of part-time work +**Time Estimate:** 8 weeks of part-time work We will continue the tradition discussed in the background section of using HLint to validate that parsers for Haskell are usable by real-world programs that are not GHC. The migration from |haskell-src-exts| to |ghc-lib-parser| was quite difficult because those libraries are nothing alike. In contrast, we expect the migration from |ghc-lib-parser| to the new AST and parser libraries to be quite simple and pleasant, because the two new libraries should be very similar to |ghc-lib-parser|, and where they differ they should be strictly easier to use than before. -Shayne Fetcher volunteers to lead this integration as a core HLint maintainer. +Note that HLint does use a few other things behind the AST and Parser that currently make it into |ghc-lib-parser|, but which we might want in our new libraries. + +#. It uses GHC's multi-purpose ``Outputable`` instead of some more dedicated exact-printing machinary + +#. It uses ``parseDynamicFilePragma``, and thus GHC's infamous ``DynFlags`` to support pragmas like ``{-# LANGUAGE ... #-}`` and ``{-# OPTIONS_GHC ... #-}``. 
+ +For the first case, we might consider factoring ``Outputable`` into a separate library too. +Or we can prioritize a more dedicated exact-print solution to use instead of ``Outputable`` (see the future work section). + +For the second case, we might have to do something temporary like e.g. continuing to use an auto-extracted library liek |ghc-lib-parser|, but depending on our newly factored-output libraries, to get this functionality for HLint_. +But longer term, we refer to the discussion of ``OPTIONS_GHC`` in "Modularizing GHC" [modularizing-ghc]_. +The steps advocated there will avoid this problem entirely by restricting ``OPTIONS_GHC`` and giving it a more minimal data structure that is easily to factor out. + +Shayne Fetcher volunteers to lead the HLint integration as a core HLint maintainer. + +.. [modularizing-ghc] + https://hsyl20.fr/home/files/papers/2022-ghc-modularity.pdf Stakeholders ============ @@ -311,5 +327,7 @@ However, the detail of how this works are still fast-evolving. [#exact-print-evo We therefore think it is best to leave factoring out the pretty-printer into a reusable library (either part of the parser library, or a new 3rd reusable library) as a future work. +That said, if ``Outputable`` becomes too much of a hassle for HLint_, as described above, we might prioritize this. + .. [#exact-print-evolving] See `GHC Issue #23447 `_ for example. 
From bb671f06887a6dce63ee6dcc30f0b71385d7669f Mon Sep 17 00:00:00 2001 From: John Ericson Date: Mon, 28 Aug 2023 00:50:56 -0400 Subject: [PATCH 32/45] Don't need to reinclude paper name for cite (It's not a footnote) --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 111b2b2d..cbb62df3 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -277,7 +277,7 @@ For the first case, we might consider factoring ``Outputable`` into a separate l Or we can prioritize a more dedicated exact-print solution to use instead of ``Outputable`` (see the future work section). For the second case, we might have to do something temporary like e.g. continuing to use an auto-extracted library liek |ghc-lib-parser|, but depending on our newly factored-output libraries, to get this functionality for HLint_. -But longer term, we refer to the discussion of ``OPTIONS_GHC`` in "Modularizing GHC" [modularizing-ghc]_. +But longer term, we refer to the discussion of ``OPTIONS_GHC`` in [modularizing-ghc]_. The steps advocated there will avoid this problem entirely by restricting ``OPTIONS_GHC`` and giving it a more minimal data structure that is easily to factor out. Shayne Fetcher volunteers to lead the HLint integration as a core HLint maintainer. 
From 62c4acaafe7e994b2d7f349978d83aa26077af15 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Mon, 28 Aug 2023 01:06:05 -0400 Subject: [PATCH 33/45] Some final changes --- proposals/accepted/000-ast-parser-libs.rst | 42 ++++++++++++---------- 1 file changed, 23 insertions(+), 19 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index cbb62df3..7320bc99 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -24,7 +24,7 @@ Experience has shown that there is only way one way to meet each criterion: - Be separate from GHC, so the library is forced to be self-contained -However, no library has so far done both, to meet both criteria. +However, no library has so far done both, nor met meet both criteria. Solution -------- @@ -63,7 +63,7 @@ However, the whole compiler doesn't appear to exist, and the project as a whole In Februrary 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". -A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library [#ghc-inception]_. +A |ghc-lib-parser|_ package contains GHC compiler sources packaged as a library. [#ghc-inception]_ This ensures it is up to date with the latest behavior but this extraction process is complex, requires constant patching to keep pace with GHC evolution, and results in a far larger library than is desired: see the *hundreds* of modules included inside it, many of which have nothing to do with the Haskell surface language. All this `"bycatch" `_ in the extraction process results in a library that daunting to use, and which has a hard time presenting a stable interface. 
@@ -88,24 +88,31 @@ Downstream projects In addition to going over the major AST/parser libraries for Haskell, it is also useful to talk about the most notable projects that use them. Ultimately, it is those projects we want to help out. -HLint_, the Haskell linter project developed continuously since 2008 is the most notable one, not just because its longstanding and wide use, but also because many of the developers that worked on the previous two libraries also worked on it --- use by HLint served as proof the libraries were fit-for-purpose. +HLint_, the Haskell linter project developed continuously since 2008 is the most notable one, not just because its longstanding and wide use, but also because many of the developers that worked on the previous two libraries also worked on it — use by HLint served as proof the libraries were fit-for-purpose. In June 2019, following the release of |ghc-lib-parser| and deprecation of |haskell-src-exts| earlier that year as described above, `HLint began the transitition `_ to |ghc-lib-parser|. In May 2020, the release of HLint-3.0 which "uses the GHC parser" `was announced `_. -Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser| [#example-ghc-lib-parser-users]_. +Today, most users of |haskell-src-exts| have largely migrated from |haskell-src-exts| to |ghc-lib-parser|. [#example-ghc-lib-parser-users]_ But just because all these projects are using |ghc-lib-parser| doesn't mean everything is well. Shayne Fetcher reports that keeping up with the latest GHC chagnes with the |ghc-lib-parser|/``ghc-lib``/``ghc-lib-parser-ex``/HLint stack generally costs him **an hour or two a week, and often more**. -The cost of detail with changes to the AST is inevitable --- supporting new language features will inevitably cost developer time. +The cost of detail with changes to the AST is inevitable — supporting new language features will inevitably cost developer time. 
But all the other busywork of re-extracting the code, etc., is entirely avoidable, *not* inherent to the task at hand. -It is the opinion of the authors of this proposal that should an independent AST parser libraries be maintained upstream with GHC, the costs saved for downstream developers should _greatly_ exceed any costs incurred by GHC developers. +It is the opinion of the authors of this proposal that should an independent AST parser libraries be maintained upstream with GHC, the costs saved for downstream developers should *greatly* exceed any costs incurred by GHC developers. The goal is thus *not* to simply shift a burden from one group of community members to another, but create a positive-sum outcome where there is far less busywork and more flourishing tooling than before. .. [#example-ghc-lib-parser-users] - Today for example, notable users include HLint_, `ormolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_ & `stylish-haskell `_. + Today for example, notable users include + HLint_, + `ormolu `_, + `ghcide `_, + `hls-hlint-plugin `_, + `hindent `_, + and + `stylish-haskell `_. Trees that grow --------------- @@ -131,7 +138,7 @@ This allows those consumers to "adjust" the AST for their purpose. The Trees That Grow project is now 6 years old, and has met great success in avoiding partiality in the compiler, "making illegal states unrepresentable" as many Haskellers would put it. But progress on `reducing AST & parser dependencies `_ has been less easily forthcoming. -I have separated out the modules defining the AST under ``Language.Haskell.Syntax.*`` we wish to split out, and we have tests to track progress reducing their deps, and the parser's deps. +We have separated out the modules defining the AST under ``Language.Haskell.Syntax.*`` we wish to split out, and we have tests to track progress reducing their deps, and the parser's deps. But progress is unsteady and unpredictable. 
The basic problem is that the benefits don't actually kick in until the deps are *all* gone, and the code is actually separated out. @@ -161,14 +168,6 @@ This proposal wish to *stay neutral* on the merits of such a future direction, b Roadmap ======= -*This section should describe the work that is being proposed to the community for comment, including both technical aspects (choices of system architecture, integration with existing tools and workflows) and community governance (how the developed project will be administered, maintained, and otherwise cared for in the future). -It should also describe the benefits, drawbacks, and risks that are associated with these decisions. -It can be a good idea to describe alternative approaches here as well, and why the proposer prefers the current approach.* - -*Are there any deadlines that the HF needs to be aware of?* - -*How much money is needed to accomplish the goal? How will it be used?* - The project is split into two separate steps: separating the AST, and separating the parser. Each step has a method, time estimate, and (most importantly) clear success criteria, including use by downstream projects to ensure value is delivered. The intent is thus that they are self-contained, and can be individually funded. @@ -204,10 +203,15 @@ The purpose of this proposal isn't to relitigate that issue, but because this pr There is no disagreement that as-is, that data type is not suitable for a nice self-contained library. [#faststring-unsuitable]_ The disagreement is whether TTG should be blocked on reworking ``FastString`` somehow to be better for GHC and non-GHC alike, or whether we should just side-step the issue entirely. -I make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorthms might take **Days to Weeks**, we can side-step the issue with a new ``StringP`` type family "extension point" like the existing ``IdP`` one in **minutes**. 
[#extension-point]_ +We make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorthms might take **Days to Weeks**, we can side-step the issue with a new ``StringP`` type family "extension point" like the existing ``IdP`` one in **minutes**. [#extension-point]_ + +Out of a basic fiduciary duty, we thus declare that unless "Plan A" works out almost as quickly, "Plan B" of just introducing another extension point should be used. +We can also revisit getting rid of any newly-added extension points later, *after* we have our factored-out AST library. -Out of a basic fiduciary towards the Haskell Foundation, we thus declare that unless "Plan A" works out very quickly, "Plan B" of just introducing another extension point should be used. -We can also revisit the issue later, *after* we have our factored-out AST library. +N.B. Third-party code (e.g. HLint_ will often also need ``Data`` instances for the AST. +We could consider making those polymorphic again as they used to be, and factoring them out accordingly. +Or, we can just let downstream projects define their own instances specialized do their own extension type (as GHC does with ``GhcPass``). +The latter is a good cheap "plan B" to delay dealing with those instances so they don't block this milestone. .. 
[#faststring-unsuitable] Everyone agrees it is unsuitable in its current state because of things like: From 4925007d4a1a94737ad773f279c3b5c577b9425b Mon Sep 17 00:00:00 2001 From: John Ericson Date: Mon, 28 Aug 2023 01:12:00 -0400 Subject: [PATCH 34/45] Turn TODO into note --- proposals/accepted/000-ast-parser-libs.rst | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 7320bc99..b6310ba8 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -256,8 +256,13 @@ Separate the Parser This work is more uncertain, because the parser and post-processing steps necessary to get an actual AST may use utility functions currently entangled with the rest of the compiler. It may be the case that we need to finish the far more certain first step (AST library) to get better clarity on what work remains for the parser, and thus price this step accurately. -.. todo:: - Any more detail we can write here? +.. note:: + + We have a couple of options on how to deal with the uncertainty here. + For example: + + - We could remove this and the HLint integration from the current proposal, saving it for a future proposal. + - Or we could accept the whole proposal, but make sure we update this section with more information once the previous two are completed, before this step is started. 
Proof of success: Use by HLint ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From 10a63d03c647747a9780a1e006cb4b300700a7e7 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Tue, 29 Aug 2023 11:59:11 -0400 Subject: [PATCH 35/45] Talk about GHC Issue #21592 --- proposals/accepted/000-ast-parser-libs.rst | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index b6310ba8..8cd0a7c8 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -172,7 +172,6 @@ The project is split into two separate steps: separating the AST, and separating Each step has a method, time estimate, and (most importantly) clear success criteria, including use by downstream projects to ensure value is delivered. The intent is thus that they are self-contained, and can be individually funded. - Separate the AST ---------------- @@ -303,10 +302,13 @@ GHC Developers -------------- The proposal is asking that we change how code in GHC is organized, so it is crucial that we solicit feedback from the broader `GHC Team `_, and the narrower `GHC HQ group `_ in particular. -It is John's understanding that the GHC developers are broadly supportive of the goal here in the abstract, +It is John's understanding that the GHC developers are broadly supportive of the goal here in the abstract (after all, SPJ was an author of the Trees That Grow paper), -but some of the specific details needed to get this done in a timely manner may be more controversial. +and also of the approach of tackling the AST and Parser separately. +`GHC Issue #21592 `_ from @alt-romes +contains a very good summary of that initial consensus, including relevant quotes from key people from various previous discussion threads scattered about and potentially hard to find otherwise. +However, some of the specific details needed to get this done in a timely manner may be more controversial. 
In particular, introducing more extension points to ensure rapid progress was very controversial before, and for putting up with such a thing as a stop-gap, the GHC HQ might want something in return, like an additional phase of work to eliminate the new extension points afterwards. Haddock Developers From a7eb382980c4e063183e1f67d838957f38db5c1b Mon Sep 17 00:00:00 2001 From: John Ericson Date: Thu, 31 Aug 2023 11:51:52 -0400 Subject: [PATCH 36/45] Fix typos Thanks! Co-authored-by: David Thrane Christiansen --- proposals/accepted/000-ast-parser-libs.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 8cd0a7c8..fc600da9 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -31,7 +31,7 @@ Solution The purpose of this proposal is to make that library finally exist. The Haskell Foundation will finance the completion of the existing "Trees that grow" project, decoupling GHC's AST and parser from the rest of the compiler so they can be moved to separate libraries. -Those libaries will be "normal" haskell libraries, without any weird dependencies or build process, and published on Hackage. +Those libraries will be "normal" Haskell libraries, without any unusual dependencies or build processes, and published on Hackage. Those libraries will be used by GHC, ensuring they are maintained. Background, Prior Art, and Related Efforts @@ -154,7 +154,7 @@ The Haskell Foundation's support in getting this "over the finish line", at whic Reinstallable GHC Lib --------------------- -One of the problems ``ghc-lib-parser`` aims to solve is that ``ghc`` the library is current cumbersome to install as a regular haskell library (as opposed to by switching toolchains entirely). 
+One of the problems ``ghc-lib-parser`` aims to solve is that ``ghc`` the library is currently cumbersome to install as a regular Haskell library (as opposed to by switching toolchains entirely). There is currently work in flight to solve that. Once that is done, projects like HLint_ *could* just depend on ``ghc`` directly, and still be easily buildable (with Cabal / with Stack / from Hackage) as today. @@ -163,7 +163,7 @@ For stability's sake, it is better that those libraries dependent on narrower pa And longer term, we hope the "tug of war" between GHC and these projects as consumers of those libraries, versus just the others having to deal with whatever GHC does with just itself in mind, will result in a higher-quality, more flexible, and overall friendlier library. In `this comment `_, it is suggested that factoring out the AST and parser can be a good first step toward making GHC more modular in general. -This proposal wish to *stay neutral* on the merits of such a future direction, but it would be remiss not to at least highlight it as one possible outcome. +This proposal wishes to *stay neutral* on the merits of such a future direction, but it would be remiss not to at least highlight it as one possible outcome. Roadmap ======= @@ -284,7 +284,7 @@ Note that HLint does use a few other things behind the AST and Parser that curre For the first case, we might consider factoring ``Outputable`` into a separate library too. Or we can prioritize a more dedicated exact-print solution to use instead of ``Outputable`` (see the future work section). -For the second case, we might have to do something temporary like e.g. continuing to use an auto-extracted library liek |ghc-lib-parser|, but depending on our newly factored-output libraries, to get this functionality for HLint_. +For the second case, we might have to do something temporary like e.g. 
continuing to use an auto-extracted library like |ghc-lib-parser|, but depending on our newly factored-out libraries, to get this functionality for HLint_. But longer term, we refer to the discussion of ``OPTIONS_GHC`` in [modularizing-ghc]_. The steps advocated there will avoid this problem entirely by restricting ``OPTIONS_GHC`` and giving it a more minimal data structure that is easy to factor out. From ab50270e8cdda37e1707aa175a32b1f45cd44883 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Thu, 31 Aug 2023 12:44:42 -0400 Subject: [PATCH 37/45] Add some conditions to assuage concerns about ongoing costs to GHC dev --- proposals/accepted/000-ast-parser-libs.rst | 26 ++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index fc600da9..bc12e8eb 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -227,6 +227,27 @@ The latter is a good cheap "plan B" to delay dealing with those instances so the Then, GHC can define ``StringP (GhcPass _) = FastString`` to use it client side, across all compilation passes. All term-level code continues to work exactly the same as before without modification. +Conditions +^^^^^^^^^^ + +It is important to make clear what must *not* happen as a side-effect of this, so that we are careful to avoid extra costs. + +- The new AST library must live in the same repo, and not require an extra Git submodule. + Synchronizing changes across Git submodules is a drag on GHC development today, and we must not make that problem worse. + +- The new AST library must be loadable in the same GHCi session as the rest of GHC. + Having to restart tools to switch between libraries is a major productivity drag in the Haskell ecosystem, and we wouldn't want to impose it on GHC. 
+ There is existing prior art of loading ``ghc`` and ``ghc-bin`` at the same time, and also recent developments in Cabal that allow doing this in a less "hacky" manner. + +- Build / CI times should not be impacted. + Since Hadrian builds Haskell modules individually, it doesn't much care about library boundaries. + Redividing the same modules into different libraries thus should have negligible impact on build times. + +- Some layer violations are not actually impediments to splitting out an AST library, and thus we should *not* prioritize fixing them with Haskell Foundation funds. + + For example, the type family ``GhcNoTc`` doesn't belong in a GHC-agnostic parsing library, as indicated by its name, but as it doesn't incur any imports into the rest of GHC from those modules, it doesn't actually pose a problem. + We can simply include it in the AST library, and non-GHC clients can write instances for it like they do for the proper extension points. + Proof of success: Use by Haddock ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -263,6 +284,11 @@ It maybe be the case that we need to finish the far more certain first step (AST - We could remove this and the HLint integration from the current proposal, saving it for a future proposal. - Or we could accept the whole proposal, but make sure we update this section with more information once the previous two are completed, before this step is started. +Conditions +^^^^^^^^^^ + +- The same conditions on splitting a library without negatively impacting GHC development are imposed as in the "Separate the AST" step. 
+ Proof of success: Use by HLint ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From f0fe5f20a063b4b135453784b276a8a6177c235e Mon Sep 17 00:00:00 2001 From: John Ericson Date: Fri, 1 Sep 2023 11:55:35 -0400 Subject: [PATCH 38/45] Fix reST issues --- proposals/accepted/000-ast-parser-libs.rst | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index bc12e8eb..ff0caa63 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -59,7 +59,7 @@ However, the whole compiler doesn't appear to exist, and the project as a whole ``haskell-src-exts`` lasted longer, but had great trouble keeping up with GHC, and has also been unmaintained since 2020. |ghc-lib-parser| ------------------- +---------------- In February 2019, |ghc-lib-parser| `was released `_ and in May 2019 an `announcement `_ that there would be no further |haskell-src-exts| releases followed and the advice given for anyone wishing to parse Haskell programs to "use the GHC API, specifically you can use |ghc-lib-parser|". @@ -107,12 +107,12 @@ The goal is thus *not* to simply shift a burden from one group of community memb .. [#example-ghc-lib-parser-users] Today for example, notable users include HLint_, - `ormolu `_, - `ghcide `_, - `hls-hlint-plugin `_, - `hindent `_, - and - `stylish-haskell `_. + `ormolu `_, + `ghcide `_, + `hls-hlint-plugin `_, + `hindent `_, + and + `stylish-haskell `_. Trees that grow --------------- @@ -269,6 +269,9 @@ All code should continue to work as before, since ``ghc`` will also use the new Separate the Parser ------------------- +Split library +~~~~~~~~~~~~~ + **Executor**: Haskell Foundation **Time Estimate:** ?? 
From 2531a472b9c392b5359aa93baa7809d9a12447cd Mon Sep 17 00:00:00 2001 From: John Ericson Date: Fri, 1 Sep 2023 11:56:32 -0400 Subject: [PATCH 39/45] Fix two more reST errors --- proposals/accepted/000-ast-parser-libs.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index ff0caa63..5a1080a3 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -215,9 +215,9 @@ The latter is a good cheap "plan B" to delay dealing with those instances so the .. [#faststring-unsuitable] Everyone agrees it is unsuitable in its current state because of things like: - Global state because of `string interning `, with a global variable baked into the RTS no less! + - Global state because of `string interning `_, with a global variable baked into the RTS no less! - - Memoizing features for other parts of the compiler unrelated to parsing, such as the `"Z-Encoding" ` GHC happens to use for object file symbol `name mangling `. + - Memoizing features for other parts of the compiler unrelated to parsing, such as the `"Z-Encoding" `_ GHC happens to use for object file symbol `name mangling `_. Everyone *also* agrees that it is worth revisiting whether these algorithmic decisions still make sense given modern hardware, see `GHC Issue #17259 `_. 
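The ``StringP`` extension point that this series keeps returning to (``StringP (GhcPass _) = FastString``) can be illustrated with a small standalone sketch. This is not GHC's actual code: ``HsLit``, ``GhcPass``, ``Pass`` and ``FastString`` below are heavily simplified stand-ins, and ``Tool``, ``litLength`` and ``ghcLitLength`` are hypothetical names introduced only for illustration. It assumes a single ``StringP`` family rather than separate per-domain families.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

module Main where

import Data.Kind (Type)

-- Assumed extension point: each client of the AST picks its own string type.
type family StringP p :: Type

-- Simplified stand-in for a literal node; the real HsLit has more
-- constructors and extension fields.
data HsLit p
  = HsString (StringP p)
  | HsChar Char

-- Simplified stand-ins for GHC's pass index and its interned string type.
data Pass = Parsed | Renamed | Typechecked

data GhcPass (c :: Pass)

newtype FastString = FastString String

-- The compiler side keeps FastString for every pass.
type instance StringP (GhcPass _) = FastString

-- A hypothetical non-GHC client chooses plain String instead.
data Tool

type instance StringP Tool = String

-- Client code sees an ordinary String inside the node.
litLength :: HsLit Tool -> Int
litLength (HsString s) = length s
litLength (HsChar _)   = 1

-- Compiler-side code still pattern matches on FastString as before.
ghcLitLength :: HsLit (GhcPass 'Parsed) -> Int
ghcLitLength (HsString (FastString s)) = length s
ghcLitLength (HsChar _)                = 1

main :: IO ()
main = do
  print (litLength (HsString "hello"))
  print (ghcLitLength (HsString (FastString "hi")))
```

The point of the sketch is that the module defining ``HsLit`` never imports ``FastString``; only the instance for ``GhcPass`` mentions it, so that instance could live on the compiler side of a library boundary.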
From 29693b50753260628252f121defd764001387274 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 2 Sep 2023 14:10:45 -0400 Subject: [PATCH 40/45] Improve wording Thanks @david-christiansen --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 5a1080a3..49ecd99c 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -204,7 +204,7 @@ The disagreement is whether TTG should be blocked on reworking ``FastString`` so We make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorthms might take **Days to Weeks**, we can side-step the issue with a new ``StringP`` type family "extension point" like the existing ``IdP`` one in **minutes**. [#extension-point]_ -Out of a basic fiduciary duty, we thus declare that unless "Plan A" works out almost as quickly, "Plan B" of just introducing another extension point should be used. +Out of a basic desire to minimize costs where possible, we thus declare that unless "Plan A" works out almost as quickly, "Plan B" of just introducing another extension point should be used. We can also revisit getting rid of any newly-added extension points later, *after* we have our factored-out AST library. N.B. Third-party code (e.g. HLint_) will often also need ``Data`` instances for the AST. From eb8b09112cd6bce0e3a266691248f65ad6394650 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 2 Sep 2023 14:13:07 -0400 Subject: [PATCH 41/45] Fix typo Thanks! 
Co-authored-by: David Thrane Christiansen --- proposals/accepted/000-ast-parser-libs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 49ecd99c..76ed3f7c 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -142,7 +142,7 @@ We have separated out the modules defining the AST under ``Language.Haskell.Synt But progress is unsteady and unpredictable. The basic problem is that the benefits don't actually kick in until the deps are *all* gone, and the code is actually separated out. -Partial progress isn't really directly useful to anyone, and these counters just scoreboard by which we hope to get closer to the end goal. +Partial progress isn't really directly useful to anyone, and these counters are just a scoreboard by which we hope to get closer to the end goal. It is thus hard to do this work with volunteers only, because it is emphatically *not* `"itch scratching" `_ work where incremental progress leads to immediate incremental benefits to the contributor. The Haskell Foundation's support in getting this "over the finish line", at which point the community *will* benefit, and benefit greatly, is thus a crucial way we can surmount the coordination failure the lack of incremental payoff causes. From 95a6a1f11f6fa7172c5494779d8d42d51d528656 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 9 Sep 2023 18:24:36 -0400 Subject: [PATCH 42/45] Start discussing `FastString` some more --- proposals/accepted/000-ast-parser-libs.rst | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 76ed3f7c..2957912f 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -194,15 +194,33 @@ Concretely, the work in this step is to: #. 
Adjust ``build-depends`` across the repo so ``ghc`` and any other Haskell Package gets those modules from the new library instead, and CI passes. +``FastString`` +^^^^^^^^^^^^^^ + The timeline for this is pretty short because there exists an easy last-resort way to decouple anything: just add another TTG type family. This came up with some acrimony in `GHC Issue #21628 `_, discussing whether it was better to try to change GHC's ``FastString`` or abstract over it. -The purpose of this proposal isn't to relitigate that issue, but because this proposal *is* about resource allocation, something does need to be said on the broader tradeoffs at play +The purpose of this proposal isn't to relitigate that issue, but because this proposal *is* about resource allocation, something does need to be said on the broader trade-offs at play. + +``FastString`` is currently *directly* used in these places inside ``Language.Haskell.Syntax.*`` modules: + +- ``FieldLabelString`` (newtype) +- ``HsOverLabel`` (``HsExpr``) +- ``HsQuasiQuote`` (``HsUntypedSplice``) +- ``HsString`` (``HsLit``) +- ``HsIsString`` (``OverLitVal``) --- maybe only used by extension point?! +- ``ModuleName`` (newtype) +- ``HsIPName`` (newtype) +- ``HsStrTy`` (``HsTyLit``) + +Note that overall ``FastString`` is used far more commonly in identifiers, but the AST is *already* parameterized over the choice of identifier type. +That means that even though identifiers are extremely common in AST values, they do not induce dependencies from the ``Language.Haskell.Syntax.*`` modules, since the choices of identifier types that use ``FastString`` are made downstream in the rest of GHC. There is no disagreement that as-is, that data type is not suitable for a nice self-contained library. [#faststring-unsuitable]_ The disagreement is whether TTG should be blocked on reworking ``FastString`` somehow to be better for GHC and non-GHC alike, or whether we should just side-step the issue entirely. 
+We make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorithms might take **Days to Weeks**, we can side-step the issue with new type family "extension points" like the existing ``IdP`` one in **minutes** for these use-cases. [#extension-point]_ +(We could use a single ``StringP`` type family, but it might be nicer to use separate ones, like ``ModuleNameP``, so our extension points remain oriented to the "domain" of what we are doing.) Out of a basic desire to minimize costs where possible, we thus declare that unless "Plan A" works out almost as quickly, "Plan B" of just introducing another extension point should be used. We can also revisit getting rid of any newly-added extension points later, *after* we have our factored-out AST library. From fbf25c7507f36434e2a5308ec2ac349ee4d186a4 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Sat, 9 Sep 2023 20:42:53 -0400 Subject: [PATCH 43/45] Normalize "Trees that Grow" --- proposals/accepted/000-ast-parser-libs.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index 2957912f..be1f5650 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -136,7 +136,7 @@ It presents these data types: The idea is that they are "promoted" via ``DataKinds``, and then type families used in the AST will have instances for these promoted values. This allows those consumers to "adjust" the AST for their purpose. 
-The Trees That Grow project is now 6 years old, and has met great success in avoiding partiality in the compiler, "making illegal states unrepresentable" as many Haskellers would put it. +The Trees that Grow project is now 6 years old, and has met great success in avoiding partiality in the compiler, "making illegal states unrepresentable" as many Haskellers would put it. But progress on `reducing AST & parser dependencies `_ has been less easily forthcoming. We have separated out the modules defining the AST under ``Language.Haskell.Syntax.*`` we wish to split out, and we have tests to track progress reducing their deps, and the parser's deps. But progress is unsteady and unpredictable. @@ -198,7 +198,7 @@ Concretely, the work in this step is to: ^^^^^^^^^^^^^^ The timeline for this is pretty short because there exists an easy last-resort way to decouple anything: -just add another TTG type family. +just add another Trees that Grow type family. This came up with some acrimony in `GHC Issue #21628 `_, discussing whether it was better to try to change GHC's ``FastString`` or abstract over it. The purpose of this proposal isn't to relitigate that issue, but because this proposal *is* about resource allocation, something does need to be said on the broader trade-offs at play. @@ -217,7 +217,7 @@ Note that overall ``FastString`` is used far more commonly in identifiers, but t That means that even though identifiers are extremely common in AST values, they do not induce dependencies from the ``Language.Haskell.Syntax.*`` modules, since the choices of identifier types that use ``FastString`` are made downstream in the rest of GHC. There is no disagreement that as-is, that data type is not suitable for a nice self-contained library. [#faststring-unsuitable]_ -The disagreement is whether TTG should be blocked on reworking ``FastString`` somehow to be better for GHC and non-GHC alike, or whether we should just side-step the issue entirely. 
+The disagreement is whether Trees that Grow should be blocked on reworking ``FastString`` somehow to be better for GHC and non-GHC alike, or whether we should just side-step the issue entirely. We make no claims about what is better in the long term for GHC, but when reworking ``FastString`` and benchmarking the new algorithms might take **Days to Weeks**, we can side-step the issue with new type family "extension points" like the existing ``IdP`` one in **minutes** for these use-cases. [#extension-point]_ (We could use a single ``StringP`` type family, but it might be nicer to use separate ones, like ``ModuleNameP``, so our extension points remain oriented to the "domain" of what we are doing.) From 03c97a4cdad30705ac7bbe362fe19739d064a5f3 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Tue, 19 Sep 2023 10:30:06 -0400 Subject: [PATCH 44/45] Link fourmolu too as a widely-used tool --- proposals/accepted/000-ast-parser-libs.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index be1f5650..ee982489 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -108,6 +108,7 @@ The goal is thus *not* to simply shift a burden from one group of community memb Today for example, notable users include HLint_, `ormolu `_, + `fourmolu `_, `ghcide `_, `hls-hlint-plugin `_, `hindent `_, From ebc481dc1693d5249a68fa6dc3125b88291f8471 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Tue, 19 Sep 2023 12:20:08 -0400 Subject: [PATCH 45/45] GHCIDE no longer uses `ghc-lib-parser` It uses `ghc` directly. 
--- proposals/accepted/000-ast-parser-libs.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/proposals/accepted/000-ast-parser-libs.rst b/proposals/accepted/000-ast-parser-libs.rst index ee982489..2fd1ee62 100644 --- a/proposals/accepted/000-ast-parser-libs.rst +++ b/proposals/accepted/000-ast-parser-libs.rst @@ -109,7 +109,6 @@ The goal is thus *not* to simply shift a burden from one group of community memb HLint_, `ormolu `_, `fourmolu `_, - `ghcide `_, `hls-hlint-plugin `_, `hindent `_, and