Commit 1aafc22

feat: Generic git host support (local & remote) (#307)
1 parent bbdd9e7 commit 1aafc22

62 files changed: +6258 -507 lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
77

88
## [Unreleased]
99

10+
### Added
11+
- Added support for indexing generic git hosts given a remote clone url or local path. [#307](https://github.com/sourcebot-dev/sourcebot/pull/307)
12+
1013
## [3.2.0] - 2025-05-12
1114

1215
### Added

Makefile

Lines changed: 3 additions & 0 deletions
@@ -14,6 +14,9 @@ zoekt:
1414
export CTAGS_COMMANDS=ctags
1515

1616
clean:
17+
redis-cli FLUSHALL
18+
yarn dev:prisma:migrate:reset
19+
1720
rm -rf \
1821
bin \
1922
node_modules \

docs/docs.json

Lines changed: 11 additions & 4 deletions
@@ -38,11 +38,21 @@
3838
"docs/connections/bitbucket-data-center",
3939
"docs/connections/gitea",
4040
"docs/connections/gerrit",
41+
"docs/connections/generic-git-host",
42+
"docs/connections/local-repos",
4143
"docs/connections/request-new"
4244
]
4345
}
4446
]
4547
},
48+
{
49+
"group": "Search",
50+
"pages": [
51+
"docs/search/syntax-reference",
52+
"docs/search/multi-branch-indexing",
53+
"docs/search/search-contexts"
54+
]
55+
},
4656
{
4757
"group": "Agents",
4858
"pages": [
@@ -53,11 +63,8 @@
5363
{
5464
"group": "More",
5565
"pages": [
56-
"docs/more/syntax-reference",
57-
"docs/more/multi-branch-indexing",
5866
"docs/more/roles-and-permissions",
59-
"docs/more/mcp-server",
60-
"docs/more/search-contexts"
67+
"docs/more/mcp-server"
6168
]
6269
}
6370
]
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
1+
---
2+
title: Other Git hosts
3+
---
4+
5+
import GenericGitHost from '/snippets/schemas/v3/genericGitHost.schema.mdx'
6+
7+
Sourcebot can sync code from any Git host (by clone URL). This is helpful when you want to search code that is not on a [supported code host](/docs/connections/overview#supported-code-hosts).
8+
9+
## Getting Started
10+
11+
To connect to a Git host, create a new [connection](/docs/connections/overview) with type `git` and specify the clone url in the `url` property. For example:
12+
13+
```json
14+
{
15+
"type": "git",
16+
"url": "https://github.com/sourcebot-dev/sourcebot"
17+
}
18+
```
19+
20+
Note that only `http` & `https` URLs are supported at this time.
21+
22+
## Schema reference
23+
24+
<Accordion title="Reference">
25+
[schemas/v3/genericGitHost.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/genericGitHost.json)
26+
27+
<GenericGitHost />
28+
29+
</Accordion>
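
The scheme rule described in this page (`http` and `https` for remote hosts, with `file://` reserved for local paths elsewhere in this commit) can be sketched as a small pre-flight check. This is a minimal sketch, not Sourcebot's actual validation code: the function name `classify_connection_url` is hypothetical, and it relies only on Python's standard `urllib.parse`.

```python
from urllib.parse import urlparse


def classify_connection_url(url: str) -> str:
    """Return 'remote' for http(s) clone URLs and 'local' for file:// paths.

    Hypothetical helper: it mirrors the documented rule, not Sourcebot's code.
    """
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return "remote"
    if scheme == "file":
        return "local"
    # Anything else (ssh://, scp-style git@host:path, ...) is rejected.
    raise ValueError(f"unsupported URL scheme {scheme!r} in {url!r}")
```

Running a check like this before writing the connection config catches SSH-style clone URLs early, since those are not accepted by the `git` connection type at this time.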

docs/docs/connections/local-repos.mdx

Lines changed: 87 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,87 @@
1+
---
2+
title: Local Git repositories
3+
---
4+
5+
import GenericGitHost from '/snippets/schemas/v3/genericGitHost.schema.mdx'
6+
7+
<Note>
8+
This feature is only supported when [self-hosting](/self-hosting/overview).
9+
</Note>
10+
11+
Sourcebot can sync code from generic git repositories stored in a local directory. This can be helpful when you already have a large number of repos checked out. Local repositories are treated as **read-only**, meaning Sourcebot will **not** `git fetch` new revisions.
12+
13+
## Getting Started
14+
15+
<Warning>
16+
Only folders that contain a git repository at their root **and** have a `remote.origin.url` set in their git config are supported at this time. All other folders will be skipped.
17+
</Warning>
18+
19+
Let's assume we have a `repos` directory located at `$(pwd)` with a collection of git repositories:
20+
21+
```sh
22+
repos/
23+
├─ repo_1/
24+
├─ repo_2/
25+
├─ repo_3/
26+
├─ ...
27+
```
28+
29+
To get Sourcebot to index these repositories:
30+
31+
<Steps>
32+
<Step title="Mount a volume">
33+
We need to mount a docker volume to the `repos` directory so Sourcebot can read its contents. Sourcebot will **not** write to local repositories, so we can mount a separate **read-only** volume:
34+
35+
``` bash
36+
docker run \
37+
-v $(pwd)/repos:/repos:ro \
38+
/* additional args */ \
39+
ghcr.io/sourcebot-dev/sourcebot:latest
40+
```
41+
</Step>
42+
43+
<Step title="Create a connection">
44+
We can now create a new git [connection](/docs/connections/overview), specifying local paths with the `file://` prefix. Glob patterns are supported. For example:
45+
46+
```json
47+
{
48+
"type": "git",
49+
"url": "file:///repos/*"
50+
}
51+
```
52+
53+
Sourcebot will expand this glob pattern into paths `/repos/repo_1`, `/repos/repo_2`, etc. and index all valid git repositories.
54+
</Step>
55+
</Steps>
56+
57+
## Examples
58+
59+
60+
<AccordionGroup>
61+
<Accordion title="Sync individual repo">
62+
```json
63+
{
64+
"type": "git",
65+
"url": "file:///path/to/git_repo"
66+
}
67+
```
68+
</Accordion>
69+
<Accordion title="Sync multiple repos using glob patterns">
70+
```json
71+
// Attempt to sync directories contained in `repos/` (non-recursive)
72+
{
73+
"type": "git",
74+
"url": "file:///repos/*"
75+
}
76+
```
77+
</Accordion>
78+
</AccordionGroup>
79+
80+
## Schema reference
81+
82+
<Accordion title="Reference">
83+
[schemas/v3/genericGitHost.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/genericGitHost.json)
84+
85+
<GenericGitHost />
86+
87+
</Accordion>
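
To preview which folders a `file://` glob would pick up, the expansion and the two documented requirements (a git repository at the folder root and a configured `remote.origin.url`) can be approximated as follows. This is a rough sketch under stated assumptions, not Sourcebot's implementation: both function names are hypothetical, only non-bare repositories (those with a `.git` entry) are considered, and the remote check shells out to the real `git config --get` command.

```python
import glob
import os
import subprocess
from urllib.parse import unquote, urlparse


def expand_local_url(url: str) -> list[str]:
    """Expand a file:// URL (optionally containing `*`) into directories."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URL, got {url!r}")
    # `*` is non-recursive: /repos/* matches /repos/repo_1, not /repos/a/b.
    pattern = unquote(parsed.path)
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def looks_indexable(repo_dir: str) -> bool:
    """Approximate the documented rule: repo at root plus an origin URL."""
    # Assumption: only non-bare repositories, identified by a `.git` entry.
    if not os.path.exists(os.path.join(repo_dir, ".git")):
        return False
    result = subprocess.run(
        ["git", "-C", repo_dir, "config", "--get", "remote.origin.url"],
        capture_output=True,
        text=True,
    )
    # `git config --get` exits non-zero when the key is not set.
    return result.returncode == 0 and result.stdout.strip() != ""
```

For the layout above, `expand_local_url("file:///repos/*")` would list the folders inside `/repos`, and filtering that list with `looks_indexable` gives an estimate of which ones would be indexed rather than skipped.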

docs/docs/connections/overview.mdx

Lines changed: 2 additions & 0 deletions
@@ -30,6 +30,8 @@ There are two ways to define connections:
3030
<Card horizontal title="Bitbucket Data Center" icon="bitbucket" href="/docs/connections/bitbucket-data-center" />
3131
<Card horizontal title="Gitea" href="/docs/connections/gitea" />
3232
<Card horizontal title="Gerrit" href="/docs/connections/gerrit" />
33+
<Card horizontal title="Other Git hosts" icon="git-alt" href="/docs/connections/generic-git-host" />
34+
<Card horizontal title="Local Git repos" icon="folder" href="/docs/connections/local-repos" />
3335
</CardGroup>
3436

3537
<Note>Missing your code host? [Submit a feature request on GitHub](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas).</Note>

docs/docs/more/multi-branch-indexing.mdx renamed to docs/docs/search/multi-branch-indexing.mdx

Lines changed: 1 addition & 0 deletions
@@ -90,4 +90,5 @@ Additional info:
9090
| Bitbucket Data Center ||
9191
| Gitea ||
9292
| Gerrit ||
93+
| Generic git host ||
9394

docs/docs/more/search-contexts.mdx renamed to docs/docs/search/search-contexts.mdx

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@ Like other prefixes, contexts can be negated using `-` or combined using `or`:
105105
- `-context:web` excludes frontend repositories from results
106106
- `( context:web or context:backend )` searches across both frontend and backend code
107107

108-
See [this doc](/docs/more/syntax-reference) for more details on the search query syntax.
108+
See [this doc](/docs/search/syntax-reference) for more details on the search query syntax.
109109

110110
## Schema reference
111111

docs/docs/more/syntax-reference.mdx renamed to docs/docs/search/syntax-reference.mdx

Lines changed: 1 addition & 1 deletion
@@ -32,4 +32,4 @@ Expressions can be prefixed with certain keywords to modify search behavior. Som
3232
| `rev:` | Filter results from a specific branch or tag. By default **only** the default branch is searched. | `rev:beta` - Filter results to branches that match regex `/beta/` |
3333
| `lang:` | Filter results by language (as defined by [linguist](https://github.com/github-linguist/linguist/blob/main/lib/linguist/languages.yml)). By default all languages are searched. | `lang:TypeScript` - Filter results to TypeScript files<br/>`-lang:YAML` - Ignore results from YAML files |
3434
| `sym:` | Match symbol definitions created by [universal ctags](https://ctags.io/) at index time. | `sym:\bmain\b` - Filter results to symbols that match regex `/\bmain\b/` |
35-
| `context:` | Filter results to a predefined [search context](/self-hosting/more/search-contexts). | `context:web` - Filter results to the web context<br/>`-context:pipelines` - Ignore results from the pipelines context |
35+
| `context:` | Filter results to a predefined [search context](/docs/search/search-contexts). | `context:web` - Filter results to the web context<br/>`-context:pipelines` - Ignore results from the pipelines context |
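
The `sym:\bmain\b` example in the table relies on regex word boundaries. The sketch below illustrates that pattern with Python's `re` module purely as a stand-in; the symbol list is made up, and Sourcebot's own matching (for example, its case-sensitivity rules) may differ.

```python
import re

# The same pattern used in the `sym:` example above.
SYMBOL_PATTERN = re.compile(r"\bmain\b")

# Hypothetical symbol names, for illustration only.
symbols = ["main", "domain", "main_loop", "Main", "run.main"]

# `\b` requires a word/non-word transition on both sides of "main",
# so "domain" and "main_loop" are not matched.
matches = [name for name in symbols if SYMBOL_PATTERN.search(name)]
print(matches)
```

Without the `\b` anchors, the pattern would also match `domain` and `main_loop`, which is why the anchored form is useful for symbol lookups.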

docs/self-hosting/overview.mdx

Lines changed: 2 additions & 0 deletions
@@ -82,6 +82,8 @@ Sourcebot is open source and can be self-hosted using our official [Docker image
8282
<Card horizontal title="Bitbucket Data Center" icon="bitbucket" href="/docs/connections/bitbucket-data-center" />
8383
<Card horizontal title="Gitea" href="/docs/connections/gitea" />
8484
<Card horizontal title="Gerrit" href="/docs/connections/gerrit" />
85+
<Card horizontal title="Other Git hosts" icon="git-alt" href="/docs/connections/generic-git-host" />
86+
<Card horizontal title="Local Git repos" icon="folder" href="/docs/connections/local-repos" />
8587
</CardGroup>
8688

8789
<Note>Missing your code host? [Submit a feature request on GitHub](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas).</Note>
