Commit 771d558

Merge pull request #28 from codefuse-ai/doc-preview
[doc] Add jekyll build and deployment CI
2 parents 80f67cb + 2e87172 commit 771d558

11 files changed: +543 -1 lines changed

.github/workflows/pages.yml

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+# This workflow uses actions that are not certified by GitHub.
+# They are provided by a third-party and are governed by
+# separate terms of service, privacy policy, and support
+# documentation.
+
+# Sample workflow for building and deploying a Jekyll site to GitHub Pages
+name: Deploy Jekyll site to Pages
+
+on:
+  push:
+    branches: ["main", "doc-preview"]
+    paths:
+      - "doc/**"
+
+  # Allows you to run this workflow manually from the Actions tab
+  workflow_dispatch:
+
+# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+# Allow one concurrent deployment
+concurrency:
+  group: "pages"
+  cancel-in-progress: true
+
+jobs:
+  # Build job
+  build:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: doc
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v3
+      - name: Setup Ruby
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: '3.1' # Not needed with a .ruby-version file
+          bundler-cache: true # runs 'bundle install' and caches installed gems automatically
+          cache-version: 0 # Increment this number if you need to re-download cached gems
+          working-directory: '${{ github.workspace }}/doc'
+      - name: Generate COREF API Documents
+        run: python3 tools/build.py
+      - name: Setup Pages
+        id: pages
+        uses: actions/configure-pages@v3
+      - name: Build with Jekyll
+        # Outputs to the './_site' directory by default
+        run: bundle exec jekyll build --baseurl "${{ steps.pages.outputs.base_path }}"
+        env:
+          JEKYLL_ENV: production
+      - name: Upload artifact
+        # Automatically uploads an artifact from the './_site' directory by default
+        uses: actions/upload-pages-artifact@v1
+        with:
+          path: "doc/_site/"
+
+  # Deployment job
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    needs: build
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v2
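
Note on the push trigger: the workflow combines a `branches` filter with a `paths` filter, so a push starts it only when both are satisfied. The Python sketch below approximates that rule. The function name `should_trigger` and the plain prefix check are assumptions made for illustration; GitHub's real glob matching is richer than a prefix test, and manual `workflow_dispatch` runs are not subject to these filters.

```python
from typing import Iterable, Sequence

BRANCHES = ("main", "doc-preview")  # copied from the workflow's `branches` list
PATH_PREFIX = "doc/"                # simplified stand-in for the "doc/**" glob


def should_trigger(branch: str,
                   changed_files: Iterable[str],
                   branches: Sequence[str] = BRANCHES,
                   prefix: str = PATH_PREFIX) -> bool:
    """Approximate the push filter: the branch must be listed AND
    at least one changed file must live under the doc/ directory."""
    return branch in branches and any(f.startswith(prefix) for f in changed_files)
```

Under this approximation, a push to `main` that only touches `README.md` would not start a deployment, while a push that also touches `doc/index.md` would.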

README.md

Lines changed: 1 addition & 0 deletions
@@ -118,6 +118,7 @@ CodeFuse-Query provides the following data-cleaning capabilities for the CodeFuse code LLM:
 - [Installation, configuration, and running](./doc/3_install_and_run.md)
 - [Introduction to the Gödel query language](./doc/4_godelscript_language.md)
 - [VSCode development extension](./doc/5_toolchain.md)
+- [COREF API](https://codefuse-ai.github.io/CodeFuse-Query/godel-api/coref_library_reference.html)
 
 ## Tutorial
 - [Online tutorial](./tutorial/README.md)

doc/1_abstract.md

Lines changed: 2 additions & 0 deletions
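@@ -1,7 +1,9 @@
 As large-scale software development becomes widespread, demand for scalable and easily adaptable static code analysis is growing. Traditional static analysis tools, such as Clang Static Analyzer (CSA) or PMD, have proven effective at checking programming rules and style issues. However, these tools are usually designed for specific goals and often cannot meet the varied and diverse needs of modern software development environments. These needs may involve quality of service (QoS), various programming languages, different algorithmic requirements, and various performance requirements. For example, a security team may need sophisticated algorithms, such as context-sensitive taint analysis, to review smaller codebases, while a project manager may need a relatively lightweight algorithm, such as one that computes cyclomatic complexity, to measure developer productivity on larger codebases.
+
 These diverse needs, together with the computing resource constraints common in large organizations, pose a major challenge. Because traditional tools use problem-specific computation, they often fail to scale in such environments. We therefore introduce CodeQuery, a centralized data platform designed for large-scale static analysis.
 In implementing CodeQuery, we treat source code and analysis results as data and treat execution as big-data processing, which differs markedly from the traditional tool-centric approach. We build on systems common in large organizations, such as data warehouses, data computing facilities like MaxCompute and Hive, OSS object storage, and elastic computing resources like Kubernetes, so that CodeQuery can integrate seamlessly into them. This approach makes CodeQuery highly maintainable and scalable, able to support diverse needs and to cope effectively with changing requirements. In addition, CodeQuery's open architecture encourages interoperability among internal systems, enabling seamless interaction and data exchange. This integration and interaction capability not only raises the level of automation within the organization but also improves efficiency and reduces the likelihood of manual errors. By breaking down information silos and promoting a more connected, more automated environment, CodeQuery significantly improves the overall productivity and efficiency of the software development process.
 Moreover, CodeQuery's data-centric approach has unique advantages in handling the domain-specific challenges of static source code analysis. For example, source code is typically a highly structured and interconnected dataset, with strong informational links to other code and configuration files. By treating code as data, CodeQuery can handle these issues deftly, which makes it particularly suitable for large organizations, where codebases evolve continuously but incrementally and most code stays stable while receiving small daily changes. CodeQuery also supports use cases such as business intelligence (BI) based on code data, generating reports and dashboards that assist monitoring and decision-making. In addition, CodeQuery plays an important role in analyzing training data for large language models (LLMs), providing in-depth insights that improve the overall effectiveness of these models.
+
 In the current static analysis landscape, CodeQuery brings a new paradigm. It not only meets the needs of large-scale, complex codebase analysis but also adapts to changing and diverse static analysis scenarios. Its data-centric approach gives it unique advantages when handling code analysis problems in big-data environments. CodeQuery is designed to solve static analysis problems in large-scale software development environments. It treats source code and analysis results as data, which allows it to integrate flexibly into the various systems of large organizations. This approach not only handles large codebases effectively but also copes with a wide range of complex analysis requirements, making static analysis more efficient and accurate.
 
 The features and advantages of CodeQuery can be summarized as follows: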

doc/5_toolchain.md

Lines changed: 3 additions & 1 deletion
@@ -79,5 +79,7 @@ code --install-extension [path to the extension vsix file]
 - `godelScript.libraryDirectoryPath`
   - Specifies the path to the GödelScript library folder; empty by default. When needed, replace it with the absolute path to the GödelScript library folder.
   - If Sparrow CLI has already been downloaded, the library folder path is `[sparrow cli root]/lib-1.0`
-# Intelligent Assistant
+
+# Intelligent Assistant
+
 Not yet available, stay tuned!

doc/Gemfile

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+source 'https://rubygems.org'
+
+gem "jekyll", "~> 4.3.2" # installed by `gem jekyll`
+# gem "webrick" # required when using Ruby >= 3 and Jekyll <= 4.2.2
+
+gem "just-the-docs", "0.7.0" # pinned to the current release
+# gem "just-the-docs" # always download the latest release
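
Note on the version constraints: `"~> 4.3.2"` uses RubyGems' pessimistic operator, which accepts versions at or above 4.3.2 but below 4.4, whereas `"0.7.0"` pins just-the-docs to exactly that release. The Python sketch below illustrates the range arithmetic only. The helper names (`pessimistic_bounds`, `satisfies`) are hypothetical, and the sketch handles purely numeric versions, not prereleases or platform suffixes.

```python
from typing import Sequence, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a purely numeric version such as '4.3.2' into integer segments."""
    return tuple(int(part) for part in text.split("."))


def _key(segments: Sequence[int]) -> Tuple[int, ...]:
    """Drop trailing zeros so that 4.4.0 and 4.4 compare as equal."""
    trimmed = list(segments)
    while len(trimmed) > 1 and trimmed[-1] == 0:
        trimmed.pop()
    return tuple(trimmed)


def pessimistic_bounds(requirement: str) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Return (inclusive lower bound, exclusive upper bound) for '~> requirement'."""
    lower = parse_version(requirement)
    # Drop the last segment (when there is more than one), then bump the new last one.
    head = list(lower[:-1]) if len(lower) > 1 else list(lower)
    head[-1] += 1
    return lower, tuple(head)


def satisfies(version: str, requirement: str) -> bool:
    """Check whether `version` falls inside the '~> requirement' range."""
    lower, upper = pessimistic_bounds(requirement)
    return _key(lower) <= _key(parse_version(version)) < _key(upper)
```

With these helpers, `satisfies("4.3.9", "4.3.2")` holds while `satisfies("4.4.0", "4.3.2")` does not, which is why a later 4.3.x patch release could be picked up on a fresh resolve but a 4.4 release could not.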

doc/Gemfile.lock

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+GEM
+  remote: https://rubygems.org/
+  specs:
+    addressable (2.8.5)
+      public_suffix (>= 2.0.2, < 6.0)
+    colorator (1.1.0)
+    concurrent-ruby (1.2.2)
+    em-websocket (0.5.3)
+      eventmachine (>= 0.12.9)
+      http_parser.rb (~> 0)
+    eventmachine (1.2.7)
+    ffi (1.15.5)
+    forwardable-extended (2.6.0)
+    google-protobuf (3.24.3-arm64-darwin)
+    google-protobuf (3.24.3-x86_64-linux)
+    http_parser.rb (0.8.0)
+    i18n (1.14.1)
+      concurrent-ruby (~> 1.0)
+    jekyll (4.3.2)
+      addressable (~> 2.4)
+      colorator (~> 1.0)
+      em-websocket (~> 0.5)
+      i18n (~> 1.0)
+      jekyll-sass-converter (>= 2.0, < 4.0)
+      jekyll-watch (~> 2.0)
+      kramdown (~> 2.3, >= 2.3.1)
+      kramdown-parser-gfm (~> 1.0)
+      liquid (~> 4.0)
+      mercenary (>= 0.3.6, < 0.5)
+      pathutil (~> 0.9)
+      rouge (>= 3.0, < 5.0)
+      safe_yaml (~> 1.0)
+      terminal-table (>= 1.8, < 4.0)
+      webrick (~> 1.7)
+    jekyll-include-cache (0.2.1)
+      jekyll (>= 3.7, < 5.0)
+    jekyll-sass-converter (3.0.0)
+      sass-embedded (~> 1.54)
+    jekyll-seo-tag (2.8.0)
+      jekyll (>= 3.8, < 5.0)
+    jekyll-watch (2.2.1)
+      listen (~> 3.0)
+    just-the-docs (0.7.0)
+      jekyll (>= 3.8.5)
+      jekyll-include-cache
+      jekyll-seo-tag (>= 2.0)
+      rake (>= 12.3.1)
+    kramdown (2.4.0)
+      rexml
+    kramdown-parser-gfm (1.1.0)
+      kramdown (~> 2.0)
+    liquid (4.0.4)
+    listen (3.8.0)
+      rb-fsevent (~> 0.10, >= 0.10.3)
+      rb-inotify (~> 0.9, >= 0.9.10)
+    mercenary (0.4.0)
+    pathutil (0.16.2)
+      forwardable-extended (~> 2.6)
+    public_suffix (5.0.3)
+    rake (13.0.6)
+    rb-fsevent (0.11.2)
+    rb-inotify (0.10.1)
+      ffi (~> 1.0)
+    rexml (3.2.6)
+    rouge (4.1.3)
+    safe_yaml (1.0.5)
+    sass-embedded (1.67.0-arm64-darwin)
+      google-protobuf (~> 3.23)
+    sass-embedded (1.67.0-x86_64-linux-gnu)
+      google-protobuf (~> 3.23)
+    terminal-table (3.0.2)
+      unicode-display_width (>= 1.1.1, < 3)
+    unicode-display_width (2.4.2)
+    webrick (1.8.1)
+
+PLATFORMS
+  arm64-darwin-21
+  arm64-darwin-23
+  x86_64-linux
+
+DEPENDENCIES
+  jekyll (~> 4.3.2)
+  just-the-docs (= 0.7.0)
+
+BUNDLED WITH
+   2.3.26

doc/_config.yml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+title: CodeFuse-Query Documentation
+description: A starter template for a Jekyll site using the Just the Docs theme!
+theme: just-the-docs
+
+url: https://codefuse-ai.github.io/CodeFuse-Query
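
Note on `url` and `--baseurl`: the workflow passes `--baseurl` from the `base_path` output of the configure-pages step. For a project site like this one, that value is presumably the path component of the published URL; this is an assumption, not something the diff states. The Python sketch below shows how the configured `url` splits into an origin and a base path, and how the README's COREF API link composes from them. The helper names `split_site_url` and `page_url` are hypothetical.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_site_url(site_url: str) -> Tuple[str, str]:
    """Split a published site URL into (origin, base path without trailing slash)."""
    parts = urlsplit(site_url)
    origin = f"{parts.scheme}://{parts.netloc}"
    base_path = parts.path.rstrip("/")
    return origin, base_path


def page_url(site_url: str, page_path: str) -> str:
    """Compose the absolute URL of a generated page below the site's base path."""
    origin, base_path = split_site_url(site_url)
    return f"{origin}{base_path}/{page_path.lstrip('/')}"
```

For `https://codefuse-ai.github.io/CodeFuse-Query`, this yields the base path `/CodeFuse-Query`, and `page_url(..., "godel-api/coref_library_reference.html")` reproduces the link added to the README in this commit.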

doc/index.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+title: Home
+layout: default
+nav_order: 1
+---
+## Documentation
+
+See the [repository home page](https://github.com/codefuse-ai/CodeFuse-Query)

doc/tools/build.py

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+import subprocess
+
+print("Download Sparrow CLI")
+subprocess.run([
+    "curl",
+    "-L",
+    "https://github.com/codefuse-ai/CodeFuse-Query/releases/download/2.0.2/sparrow-cli-2.0.2.linux.tar.gz",
+    "-o",
+    "sparrow-cli.tar.gz"
+])
+subprocess.run([
+    "tar",
+    "-xvzf",
+    "sparrow-cli.tar.gz"
+])
+print("Copy ../assets into ./doc/assets")
+subprocess.run(["cp", "-r", "../assets", "./"])
+print("Concat coref library from ../language into ./.coref-api-build")
+subprocess.run(["python3", "tools/generate_coref_library.py", "../language"])
+print("Generate markdown documents into ./godel-api")
+subprocess.run(["python3", "tools/generate_markdown.py", "./sparrow-cli/godel-script/usr/bin/godel"])
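
Review note: `subprocess.run` does not raise on a non-zero exit status unless `check=True` is passed, so as committed, a failed download or extraction would not stop the script and the CI step could still report success. The sketch below is one possible hardening, not the committed code. It keeps the same five commands in the same order and only adds `check=True`. The names `build_steps` and `run_steps`, the injectable `runner` parameter, and the "Extract Sparrow CLI" message are assumptions introduced for illustration.

```python
import subprocess
from typing import Callable, List, Tuple

SPARROW_URL = (
    "https://github.com/codefuse-ai/CodeFuse-Query/releases/download/"
    "2.0.2/sparrow-cli-2.0.2.linux.tar.gz"
)

Step = Tuple[str, List[str]]


def build_steps() -> List[Step]:
    """Return (message, command) pairs mirroring the commands in build.py."""
    return [
        ("Download Sparrow CLI",
         ["curl", "-L", SPARROW_URL, "-o", "sparrow-cli.tar.gz"]),
        ("Extract Sparrow CLI",  # hypothetical message, not in the original script
         ["tar", "-xvzf", "sparrow-cli.tar.gz"]),
        ("Copy ../assets into ./doc/assets",
         ["cp", "-r", "../assets", "./"]),
        ("Concat coref library from ../language into ./.coref-api-build",
         ["python3", "tools/generate_coref_library.py", "../language"]),
        ("Generate markdown documents into ./godel-api",
         ["python3", "tools/generate_markdown.py",
          "./sparrow-cli/godel-script/usr/bin/godel"]),
    ]


def run_steps(steps: List[Step], runner: Callable = subprocess.run) -> None:
    """Run each step in order; check=True raises CalledProcessError on failure."""
    for message, command in steps:
        print(message)
        runner(command, check=True)
```

Calling `run_steps(build_steps())` from the `doc` directory would execute the build. Passing a stub as `runner` allows a dry run without touching the network.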

doc/tools/generate_coref_library.py

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+import sys
+import os
+
+if len(sys.argv) != 2:
+    print("Usage: python this_file.py language_library_directory")
+    exit(-1)
+
+input_language_dir = sys.argv[1]
+
+print("Generate library from", input_language_dir)
+if not os.path.exists("./.coref-api-build"):
+    os.mkdir("./.coref-api-build")
+
+mapper = {
+    "coref.go.gdl": input_language_dir + "/go/lib",
+    "coref.java.gdl": input_language_dir + "/java/lib",
+    "coref.javascript.gdl": input_language_dir + "/javascript/lib",
+    "coref.python.gdl": input_language_dir + "/python/lib",
+    "coref.xml.gdl": input_language_dir + "/xml/lib",
+}
+
+for key in mapper.keys():
+    output_file = "./.coref-api-build/" + key
+    result = ""
+    for root, ignored, files in os.walk(mapper[key]):
+        for file in files:
+            result += open(root + "/" + file, "r").read() + "\n"
+    open(output_file, "w").write(result)
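
Review note: the script concatenates every file under each language's `lib` directory into a single `.gdl` file, in whatever order `os.walk` yields them and without explicitly closing the file handles. The sketch below shows the same concatenation written with `pathlib`, a sorted traversal for a reproducible result, and a context manager. It is an illustrative alternative, not the committed code. The function name `concat_library` and the UTF-8 encoding are assumptions, and the sorted order differs from the original's traversal order.

```python
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def concat_library(lib_dir: PathLike, output_file: PathLike) -> int:
    """Concatenate all files under lib_dir into output_file.

    Files are visited in sorted path order so repeated runs produce the
    same output. Returns the number of files that were concatenated.
    """
    files = sorted(p for p in Path(lib_dir).rglob("*") if p.is_file())
    with open(output_file, "w", encoding="utf-8") as out:
        for path in files:
            # Each file is followed by a newline, as in the original script.
            out.write(path.read_text(encoding="utf-8") + "\n")
    return len(files)
```

A call such as `concat_library("../language/go/lib", "./.coref-api-build/coref.go.gdl")` would correspond to one entry of the `mapper` dictionary, assuming the output directory already exists.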
