+In this example, we will build an embedding index based on Google Drive files and perform semantic search.
+
+It continuously updates the index as files are added / updated / deleted in the source folders. It keeps the index in sync with the source folders in real time.
+
+We appreciate a star ⭐ at [CocoIndex GitHub](https://github.com/cocoindex-io/cocoindex) if this is helpful.
+1. We will ingest files from Google Drive folders.
+2. For each file, perform chunking (recursively split) and then embedding.
+3. We will save the embeddings and the metadata in Postgres with PGVector.
+
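To make the "recursively split" idea in step 2 concrete, here is a minimal sketch in plain Python. It is not CocoIndex's implementation: the function name `split_recursively`, the separator list, and the `chunk_size` parameter are all assumptions chosen for illustration. The idea is to try the coarsest separator first and fall back to finer ones only for pieces that are still too large.

```python
from typing import List, Sequence

# Hypothetical separators, tried from coarsest to finest.
SEPARATORS: Sequence[str] = ("\n\n", "\n", ". ", " ")


def split_recursively(
    text: str,
    chunk_size: int = 200,
    separators: Sequence[str] = SEPARATORS,
) -> List[str]:
    """Split text into chunks of at most chunk_size characters."""
    text = text.strip()
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    head, rest = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""

    def flush() -> None:
        # Emit the chunk being assembled, skipping whitespace-only chunks.
        nonlocal current
        if current.strip():
            chunks.append(current.strip())
        current = ""

    for piece in text.split(head):
        candidate = f"{current}{head}{piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        flush()
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too large: recurse with finer separators.
            chunks.extend(split_recursively(piece, chunk_size, rest))
    flush()
    return chunks
```

Each resulting chunk would then be passed to the embedding step. Packing adjacent pieces back together keeps chunks close to `chunk_size` instead of producing many tiny fragments.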
+### Query
+We will match against user-provided text by a SQL query, and reuse the embedding operation in the indexing flow.
 
 ## Prerequisite
 
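As a rough illustration of the Query section, the sketch below builds a pgvector similarity query in Python. The table name `doc_embeddings` and the columns `filename`, `text`, and `embedding` are hypothetical, as are both helper functions; the real names depend on how the flow exports to Postgres. The `<=>` operator is pgvector's cosine distance, and the `%s` placeholders assume a psycopg-style driver. The query vector is assumed to come from the same embedding operation used during indexing, otherwise the distances are not comparable.

```python
from typing import List, Sequence, Tuple


def to_pgvector_literal(vector: Sequence[float]) -> str:
    """Format a vector in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


def build_search_query(
    query_vector: Sequence[float],
    table: str = "doc_embeddings",  # hypothetical table name
    top_k: int = 5,
) -> Tuple[str, List[object]]:
    """Return (sql, params) for a cosine-distance nearest-neighbour search."""
    # Table names cannot be bound as parameters, so validate before formatting.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unsafe table name: {table!r}")
    sql = (
        "SELECT filename, text, embedding <=> %s::vector AS distance "
        f"FROM {table} ORDER BY distance LIMIT %s"
    )
    return sql, [to_pgvector_literal(query_vector), top_k]
```

The returned pair can be handed to a database cursor, for example `cursor.execute(sql, params)`. Smaller distances mean closer matches, so the query orders by distance in ascending order.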
@@ -25,32 +42,31 @@ Before running the example, you need to:
 
 ## Run
 
-Install dependencies:
-
-```sh
-pip install -e .
-```
+- Install dependencies:
 
-Setup:
+```sh
+pip install -e .
+```
 
-```sh
-cocoindex setup main.py
-```
+- Setup:
 
-Run:
+```sh
+cocoindex setup main.py
+```
 
-```sh
-python main.py
-```
+- Run:
+
+```sh
+python main.py
+```
 
 While running, it keeps observing changes in the source folders and updates the index automatically.
 At the same time, it accepts queries from the terminal, and performs search on top of the up-to-date index.
 
 
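The "keep observing changes and update the index" behaviour can be pictured as comparing two snapshots of the source folder. The sketch below is only an illustration under that assumption, not a description of how CocoIndex detects changes. `ChangeSet` and `diff_snapshots` are hypothetical names, and each snapshot is assumed to map a file ID to some version marker such as a modified time or content hash.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class ChangeSet:
    added: Set[str]
    updated: Set[str]
    deleted: Set[str]


def diff_snapshots(previous: Dict[str, str], current: Dict[str, str]) -> ChangeSet:
    """Compare two {file_id: version} snapshots of a source folder."""
    prev_ids, cur_ids = set(previous), set(current)
    return ChangeSet(
        # Present now but not before: needs chunking and embedding.
        added=cur_ids - prev_ids,
        # Present in both with a different version: re-chunk and re-embed.
        updated={f for f in cur_ids & prev_ids if previous[f] != current[f]},
        # Present before but gone now: remove its rows from the index.
        deleted=prev_ids - cur_ids,
    )
```

Only the files in `added` and `updated` need to be re-embedded, and only the files in `deleted` need their rows removed, so unchanged files cost nothing on each refresh.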
 ## CocoInsight
-CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3 minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9).
-
-Run CocoInsight to understand your RAG data pipeline:
+I used CocoInsight (Free beta now) to troubleshoot the index generation and understand the data lineage of the pipeline.
+It just connects to your local CocoIndex server, with zero pipeline data retention. Run the following command to start CocoInsight:
 
 ```sh
 cocoindex server -ci main.py
@@ -62,4 +78,6 @@ You can also add a `-L` flag to make the server keep updating the index to refle
 cocoindex server -ci -L main.py
 ```
 
-Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight).
+Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight).