VSCode: Repo with not really 21K shell scripts locks up language server in "Analyzing" loop #301

Closed
@kkm000

I have a repo with a few submodules, nothing out of the ordinary: gRPC, Kaldi, and a couple of lesser libs. gRPC has a dozen submodules of its own. Kaldi is written half in C++ and half in Bash as glue. gRPC and friends also have a non-negligible number of shell scripts. But some of these files are counted multiple times, close to a hundred, because of systematic symlinking. There aren't really 21K files, in fact "only" 4K, but some of them are reachable through symlinks with high multiplicity.

$ find -name \*.sh | wc -l
4104
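
The gap between this count and the 21K the server reports is consistent with symlink multiplicity: `find` does not descend into symlinked directories unless given `-L`, while the server's glob apparently does. A minimal sketch for comparing the two views, assuming GNU find and coreutils `realpath` and file names without embedded newlines; the function name `count_scripts` is made up for illustration:

```shell
#!/usr/bin/env bash
# Count *.sh files under a directory in three ways (illustrative helper,
# not part of any tool mentioned in this issue).
count_scripts() {
  local root=${1:-.}
  local plain followed unique

  # Same as the command above: symlinked directories are not traversed.
  plain=$(find "$root" -name '*.sh' | wc -l | tr -d ' ')

  # With -L, symlinked directories are traversed, so one real file can be
  # reported under many different paths.
  followed=$(find -L "$root" -type f -name '*.sh' | wc -l | tr -d ' ')

  # Resolve every path to its real location and drop the duplicates.
  unique=$(find -L "$root" -type f -name '*.sh' -print0 \
             | xargs -0 -r realpath | sort -u | wc -l | tr -d ' ')

  echo "plain=$plain followed=$followed unique=$unique"
}
```

Running `count_scripts .` at the repo root should show `followed` well above `plain` when directories are symlinked many times, with `unique` giving the number of distinct files.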

The server seems to crash and restart constantly. VSCode logs non-stop, and I have never seen anything like code completion working:

[Info  - 11:48:45 PM] BashLanguageServer initializing...
Analyzing files matching glob "**/*@(.sh|.inc|.bash|.command)" inside /home/kkm/work/ikke
Glob resolved with 21045 files after 2.389 seconds
Analyzing file:///home/kkm/work/ikke/ext/fmt/test/fuzzing/build.sh
Analyzing file:///home/kkm/work/ikke/ext/grpc/bazel/update_mirror.sh

 ... Only 16646 total "Analyzing" or "Skipping" records, 4399 short of 21045 ...

Analyzing file:///home/kkm/work/ikke/ext/kaldi/egs/swahili/s5/utils/mkgraph_lookahead.sh
Analyzing file:///home/kkm/work/ikke/ext/kaldi/egs/swahili/s5/utils/mkgraph.sh
[Info  - 11:49:33 PM] Connection to server got closed. Server will restart.
[Info  - 11:49:34 PM] BashLanguageServer initializing...
Analyzing files matching glob "**/*@(.sh|.inc|.bash|.command)" inside /home/kkm/work/ikke
Glob resolved with 21045 files after 2.474 seconds

...and there we go again. And again.
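
The shortfall noted in the log (21045 resolved, 16646 records, 4399 missing) can be recomputed for every restart. A rough sketch, assuming the output panel text has been saved to a file and that skipped files are logged on lines starting with `Skipping` (their exact format is not shown above); `log_progress` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# For each server run in a saved log, print how many files the glob resolved
# and how many per-file records were logged before the run ended.
log_progress() {
  local log=$1
  awk '
    function report() {
      if (started)
        printf "resolved=%d logged=%d missing=%d\n", total, seen, total - seen
    }
    # A "Glob resolved" line marks the start of a new run.
    /^Glob resolved with [0-9]+ files/ { report(); started = 1; total = $4; seen = 0; next }
    # Per-file records; "Analyzing files matching glob" does not match here.
    /^(Analyzing file:|Skipping)/ { seen++ }
    END { report() }
  ' "$log"
}
```

If the `missing` figure is similar on every restart, that would suggest the server dies at about the same point in the scan each time.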

Ideally, in this project I'd prefer to exclude all submodule files from the scan, or, in fact, all files: we have just a dozen standalone scripts here. The other library scripts, including Kaldi's, are irrelevant; they just happen to be pulled in by Git submodules.

But I'm also a member of the Kaldi team, and working on Kaldi itself is a different story. There I'd probably want to select only specific directories with common library scripts (just three, in fact). All subdirectories of egs/ are independent experiments ("ee Jeez", singular "ee jee"; likely from "e.g.," but this terminology predates my joining the project :)) that practically never call each other. Most correspond to published research and were contributed by various people, and we keep them for reproducibility of the published results. Every eg invariably links to the two common script libraries, utils and steps. For example, the swahili/s5 eg in the last pre-crash log message above reached utils/mkgraph.sh through the utils symlink. At the moment we have exactly 99 egs and a PR for the 100th, so every file in these two libraries is counted 99 times by the language server.¹

So my "if I had a magic wand" configuration for standalone Kaldi would be to scan only the files of the single subdirectory under egs/ where I have opened a file, plus these two utility libraries (either through the symlinks, or configured manually if you could teach the server to ignore symlinks altogether; adding 2 directories to the VSCode config is no sweat). In fact, of the ≈4100 shell scripts, ≈3500 are in this egs/ directory. I don't know whether you consider that too large a number to do anything special about. As long as it works for me, anything goes. :)
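
To make the wished-for scope concrete, here is a sketch that lists what it would cover: the scripts of one eg, without following its symlinks, plus each shared library listed once from its real directory. It only illustrates the file set and is not a language server setting; `scoped_scripts` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# List the scripts in the desired scan scope.
#   $1      the eg directory being worked on, e.g. egs/swahili/s5
#   $2...   real (non-symlink) library directories, e.g. egs/wsj/s5/utils
scoped_scripts() {
  local eg_dir=$1 lib
  shift
  {
    # Without -L, find does not descend into the utils/steps symlinks,
    # so only the scripts that belong to the eg itself are listed.
    find "$eg_dir" -type f -name '*.sh'
    # Each shared library is listed exactly once, from its real location.
    for lib in "$@"; do
      find "$lib" -type f -name '*.sh'
    done
  } | sort
}
```

For example, `scoped_scripts egs/swahili/s5 egs/wsj/s5/utils egs/wsj/s5/steps` run from the Kaldi root would list one eg plus the two libraries instead of all 99 copies.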

—————————
¹ Yes, it could have been done better, but the toolkit is a de facto standard in our research area and is approaching 20 years of age, so we are very careful with radical changes. Users are familiar with this layout. Even weirder, steps and utils are real subdirectories of the (first ever?) eg, wsj/s5, and the next one started the 20-year tradition of symlinking them.

Labels: bug (Something isn't working), priority ⭐️ (Triaged and deemed a priority)
