migrate: update docs and examples for new standalone CLI #530

Merged 12 commits on May 20, 2025.

2 changes: 1 addition & 1 deletion docs/docs/core/basics.md
@@ -101,4 +101,4 @@ As an indexing flow is long-lived, it needs to store intermediate data to keep t
CocoIndex uses internal storage for this purpose.

Currently, CocoIndex uses Postgres database as the internal storage.
See [Initialization](initialization) for configuring its location, and `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates tables for the internal storage.
See [Settings](settings#databaseconnectionspec) for configuring its location. The `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates tables for the internal storage.
67 changes: 33 additions & 34 deletions docs/docs/core/cli.mdx
@@ -8,60 +8,59 @@ import TabItem from '@theme/TabItem';

# CocoIndex CLI

CocoIndex CLI embeds CLI functionality in your program.
It provides a bunch of commands for easily managing and inspecting your flows and indexes.
CocoIndex CLI is a standalone tool for easily managing and inspecting your flows and indexes.

## Enable CocoIndex CLI
## Invoke the CLI

### Use Packaged Main
Once CocoIndex is installed, you can invoke the CLI directly using the `cocoindex` command. Most commands require an `APP_TARGET` argument, which tells the CLI where your flow definitions are located.

The easiest way is to use a packaged main function:
### APP_TARGET Format

<Tabs>
<TabItem value="python" label="Python" default>
The `APP_TARGET` can be:
1. A **path to a Python file** defining your flows (e.g., `main.py`, `path/to/my_flows.py`).
2. An **installed Python module name** that contains your flow definitions (e.g., `my_package.flows`).
3. For commands that operate on a *specific flow* (like `show`, `update`, `evaluate`), you can combine the application reference with a flow name:
    * `path/to/my_flows.py:MyFlow`
    * `my_package.flows:MyFlow`
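
To make the `APP_TARGET` format concrete, here is a minimal sketch of how such a string could be split into an application reference and an optional flow name. The helpers `split_app_target` and `is_file_target` are hypothetical names used for illustration; this is not the CLI's actual parsing code. It assumes the last `:` separates the flow name, so it does not handle paths that themselves contain a colon (such as Windows drive letters).

```python
from typing import Optional, Tuple


def split_app_target(target: str) -> Tuple[str, Optional[str]]:
    """Split 'app_ref[:FlowName]' into (app_ref, flow_name).

    Hypothetical helper: assumes the last ':' separates the flow name.
    """
    app_ref, sep, flow_name = target.rpartition(":")
    if not sep:
        # No ':' present, so the whole string is the application reference.
        return target, None
    if not app_ref or not flow_name:
        raise ValueError(f"Malformed APP_TARGET: {target!r}")
    return app_ref, flow_name


def is_file_target(app_ref: str) -> bool:
    """Treat references ending in '.py' as file paths, anything else as a module name."""
    return app_ref.endswith(".py")


for example in ("main.py", "path/to/my_flows.py:MyFlow", "my_package.flows:MyFlow"):
    ref, flow = split_app_target(example)
    kind = "file" if is_file_target(ref) else "module"
    print(f"{example} -> {kind} {ref!r}, flow {flow!r}")
```

The file-versus-module check by `.py` suffix is a simplification chosen for the sketch. The real CLI may distinguish the two differently.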

```python title="main.py"
import cocoindex
### Environment Variables

@cocoindex.main_fn()
def main():
    ...
```
The CocoIndex library reads its settings from environment variables, as described in [CocoIndex Settings](settings#list-of-environment-variables).

</TabItem>
</Tabs>
You can set environment variables in an environment file.

With this, when the program is executed with `cocoindex` as its first argument, CocoIndex CLI will take over the control. For example:
* By default, the `cocoindex` CLI searches upward from the current directory for a `.env` file.
* You can use `--env-file <path>` to specify one explicitly:

```sh
$ python main.py cocoindex ls # Run "ls" subcommand: list all flows
```
```sh
cocoindex --env-file path/to/custom.env <COMMAND> ...
```

Loaded variables do *NOT* override existing system ones.
If no file is found, only existing system environment variables are used.
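
The lookup and precedence rules above can be illustrated with a small standard-library sketch. The functions `find_env_file` and `load_env_file` are hypothetical and are not the CLI's implementation. The parser is deliberately simplified: it assumes plain `KEY=VALUE` lines and ignores quoting, `export` prefixes, and variable expansion.

```python
from pathlib import Path
from typing import MutableMapping, Optional


def find_env_file(start: Path) -> Optional[Path]:
    """Return the nearest `.env` at or above `start`, or None if there is none."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def load_env_file(path: Path, environ: MutableMapping[str, str]) -> None:
    """Add KEY=VALUE pairs from `path` to `environ` without overriding existing keys."""
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps a value that is already set, so system variables win.
        environ.setdefault(key.strip(), value.strip())
```

Under these assumptions, calling `load_env_file(path, os.environ)` on the result of `find_env_file(Path.cwd())` mirrors the default behavior described above. Passing an explicit path corresponds to `--env-file <path>`.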

You may also provide a `cocoindex_cmd` argument to the `main_fn` decorator to change the command from `cocoindex` to something else.
### Global Options

### Explicitly CLI Invoke
CocoIndex CLI supports the following global options:

An alternative way is to use `cocoindex.cli.cli` (with type [`click.Group`](https://click.palletsprojects.com/en/stable/api/#click.Group)).
For example, you may invoke the CLI explicitly with additional arguments:
* `--env-file <path>`: Load environment variables from a specified `.env` file. If not provided, a `.env` file found by searching upward from the current directory is loaded if it exists.
* `--version`: Show the CocoIndex version and exit.
* `--help`: Show the main help message and exit.

<Tabs>
<TabItem value="python" label="Python" default>
:::caution Deprecated Usage

```python
cocoindex.cli.cli.main(args)
```
The old method of invoking the CLI using `python main.py cocoindex ...` via the `@cocoindex.main_fn()` decorator is now deprecated. Please remove `@cocoindex.main_fn()` from your scripts and use the standalone `cocoindex` command as described above.

</TabItem>
</Tabs>
:::

## Subcommands

The following subcommands are available:

| Subcommand | Description |
| ---------- | ----------- |
| `ls` | List all flows present in the current process. Or list all persisted flows under the current app namespace if `--all` is specified. |
| `show` | Show the spec for a specific flow. |
| `ls` | List all flows present in the given file/module. Or list all persisted flows under the current app namespace if no file/module is specified. |
| `show` | Show the spec and schema for a specific flow. |
| `setup` | Check and apply backend setup changes for flows, including the internal and target storage (to export). |
| `drop` | Drop the backend setup for specified flows. |
| `update` | Update the index defined by the flow. |
@@ -71,6 +70,6 @@ The following subcommands are available:
Use `--help` to see the full list of subcommands, and `subcommand --help` to see the usage of a specific one.

```sh
python main.py cocoindex --help # Show all subcommands
python main.py cocoindex show --help # Show usage of "show" subcommand
cocoindex --help # Show all subcommands
cocoindex show --help # Show usage of "show" subcommand
```
2 changes: 1 addition & 1 deletion docs/docs/core/flow_def.mdx
@@ -313,7 +313,7 @@ Following metrics are supported:

### Getting App Namespace

You can use the [`app_namespace` setting](initialization#app-namespace) or `COCOINDEX_APP_NAMESPACE` environment variable to specify the app namespace,
You can use the [`app_namespace` setting](settings#app-namespace) or `COCOINDEX_APP_NAMESPACE` environment variable to specify the app namespace,
to organize flows across different environments (e.g., dev, staging, production), team members, etc.

In the code, you can call `flow.get_app_namespace()` to get the app namespace, and use it to name certain backends. It takes the following arguments:
20 changes: 5 additions & 15 deletions docs/docs/core/flow_methods.mdx
@@ -1,13 +1,13 @@
---
title: Flow Running
title: Run a Flow
toc_max_heading_level: 4
description: Run a CocoIndex Flow, including build / update data in the target storage and evaluate the flow without changing the target storage.
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Running a CocoIndex Flow
# Run a CocoIndex Flow

After a flow is defined as discussed in [Flow Definition](/docs/core/flow_def), you can start to transform data with it.

@@ -30,17 +30,7 @@ def demo_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataSco
```

It creates a `demo_flow` object in `cocoindex.Flow` type.
To enable CLI, you also need to make sure you have a main function decorated with `@cocoindex.main_fn()`:


```python title="main.py"
@cocoindex.main_fn()
def main():
    ...

if __name__ == "__main__":
    main()
```
</TabItem>
</Tabs>

@@ -78,7 +68,7 @@ The `cocoindex update` subcommand creates/updates data in the target storage.
Once it's done, the target data is fresh up to the moment when the function is called.

```sh
python main.py cocoindex update
cocoindex update main.py
```

#### Library API
@@ -115,7 +105,7 @@ Change capture mechanisms enable CocoIndex to continuously capture changes from
To perform live update, run the `cocoindex update` subcommand with `-L` option:

```sh
python main.py cocoindex update -L
cocoindex update main.py -L
```

If there's at least one data source with change capture mechanism enabled, it will keep running until aborted (e.g. by `Ctrl-C`).
@@ -232,7 +222,7 @@ It takes the following options:
Example:

```sh
python main.py cocoindex evaluate --output-dir ./eval_output
cocoindex evaluate main.py --output-dir ./eval_output
```

### Library API
134 changes: 0 additions & 134 deletions docs/docs/core/initialization.mdx

This file was deleted.
