move design docs to a folder #821


Merged
4 changes: 2 additions & 2 deletions README.md
@@ -52,13 +52,13 @@ Done predicting. Predict table : iris.predict

- [Installation](doc/installation.md)
- [Running a Demo](doc/demo.md)
- [Extended SQL Syntax](doc/syntax.md)
- [User Guide](doc/user_guide.md)

## Contributions

- [Build from source](doc/build.md)
- [The walkthrough of the source code](doc/walkthrough.md)
- [The choice of parser generator](doc/sql_parser.md)
- [The choice of parser generator](doc/design/design_sql_parser.md)

## Roadmap

@@ -1,4 +1,4 @@
# Proof of Concept: ALPS Submitter
# _Design:_ ALPS Submitter

ALPS (Ant Learning and Prediction Suite) provides a common algorithm-driven framework at Ant Financial, focusing on giving users an efficient, easy-to-use machine learning programming framework together with machine learning algorithm solutions for financial scenarios.

2 changes: 1 addition & 1 deletion doc/analyzer_design.md → doc/design/design_analyzer.md
@@ -1,4 +1,4 @@
# Design: Analyze the Machine Learning Mode in SQLFlow
# _Design:_ Analyze the Machine Learning Mode in SQLFlow

## Concept

File renamed without changes.
4 changes: 2 additions & 2 deletions doc/auth_design.md → doc/design/design_auth.md
@@ -1,4 +1,4 @@
# Design: SQLFlow Authentication and Authorization
# _Design:_ SQLFlow Authentication and Authorization

## Concepts

@@ -72,7 +72,7 @@ that case, we store session data into a reliable storage service like
The figure below demonstrates the overall workflow for authorization and
authentication.

<img src="figures/sqlflow_auth.png">
<img src="../figures/sqlflow_auth.png">

Users can access the JupyterHub web page using their own username and password.
The user's identity will be verified by the [SSO](https://en.wikipedia.org/wiki/Single_sign-on)
4 changes: 2 additions & 2 deletions doc/cluster_design.md → doc/design/design_clustermodel.md
@@ -1,4 +1,4 @@
# Design: Clustering in SQLflow to analyze patterns in data
# _Design:_ Clustering in SQLflow to analyze patterns in data

## ClusterModel introduction

@@ -9,7 +9,7 @@ This design document introduces how to support the `Cluster Model` in SQLFlow.

The figure below demonstrates the overall workflow for cluster model training, which includes both the pre_train autoencoder model and the clustering model (reference: https://www.dlology.com/blog/how-to-do-unsupervised-clustering-with-keras/).

<div align=center> <img width="460" height="550" src="figures/cluster_model_train_overview.png"> </div>
<div align=center> <img width="460" height="550" src="../figures/cluster_model_train_overview.png"> </div>

1. The first part is used to load a pre_trained model. We use the output of the trained encoder layer as the input to the clustering model.
2. Then, the clustering model starts training with randomly initialized weights, and generates clusters after multiple iterations.
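
The two stages above can be sketched with plain numpy, assuming a stand-in encoder in place of the Keras autoencoder the design actually uses; every name below is illustrative, not SQLFlow's API:

```python
import numpy as np

# Minimal numpy sketch of the two-stage workflow: a stand-in "encoder"
# plays the role of the pre_trained autoencoder's encoder layer, and a
# k-means-style loop refines randomly initialized cluster centers.
rng = np.random.default_rng(0)

def encoder(x):
    # stand-in for the output of the trained encoder layer
    return x @ np.array([[1.0, 0.5], [0.5, 1.0]])

# two well-separated blobs of raw input rows
x = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
z = encoder(x)                  # step 1: encode inputs with the pre-trained model

k = 2
centers = z[rng.choice(len(z), k, replace=False)]  # step 2: random initialization
for _ in range(10):             # iterate assignment / update steps
    labels = np.argmin(((z[:, None, :] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([z[labels == i].mean(0) if np.any(labels == i)
                        else centers[i] for i in range(k)])
```

After the loop, `labels` holds a cluster id per input row, mirroring how the clustering model generates clusters from encoder output.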
@@ -1,4 +1,4 @@
# Design Doc: Define Models for SQLFlow
# _Design:_ Define Models for SQLFlow

SQLFlow enables SQL programs to call deep learning models defined in Python. This document is about how to define models for SQLFlow.

@@ -1,4 +1,4 @@
# Compatibility with Various SQL Engines
# _Design:_ Compatibility with Various SQL Engines

SQLFlow interacts with SQL engines like MySQL and Hive. Since different SQL engines use variants of SQL syntax, it is important for SQLFlow to have an abstraction layer that hides such differences.

@@ -8,7 +8,7 @@ SQLFlow calls Go's [standard database API](https://golang.org/pkg/database/sql/)

### Data Retrieval

The basic idea of SQLFlow is to extend the SELECT statement of SQL to have the TRAIN and PREDICT clauses. For more discussion, please refer to the [syntax design](/doc/syntax.md). SQLFlow translates such "extended SQL statements" into submitter programs, which forward the part from SELECT to TRAIN or PREDICT, which we call the "standard part", to the SQL engine. SQLFlow also accepts the SELECT statement without TRAIN or PREDICT clauses and would forward such "standard statements" to the engine. It is noticeable that the "standard part" or "standard statements" are not standardized. For example, various engines use different syntax for `FULL OUTER JOIN`.
The basic idea of SQLFlow is to extend the SELECT statement of SQL to have the TRAIN and PREDICT clauses. For more discussion, please refer to the [syntax design](/doc/design/design_syntax.md). SQLFlow translates such "extended SQL statements" into submitter programs, which forward the part from SELECT to TRAIN or PREDICT, which we call the "standard part", to the SQL engine. SQLFlow also accepts the SELECT statement without TRAIN or PREDICT clauses and would forward such "standard statements" to the engine. It is noticeable that the "standard part" or "standard statements" are not standardized. For example, various engines use different syntax for `FULL OUTER JOIN`.

- Hive supports `FULL OUTER JOIN` directly.
- MySQL doesn't have `FULL OUTER JOIN`. However, a user can emulate `FULL OUTER JOIN` using `LEFT JOIN`, `UNION` and `RIGHT JOIN`.
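
The emulation mentioned above can be sketched against SQLite, whose older versions also lack `FULL OUTER JOIN` and `RIGHT JOIN`; the second branch therefore swaps the operands of a `LEFT JOIN`, which is equivalent. Table and column names are illustrative, not from SQLFlow:

```python
import sqlite3

# Toy tables; names (a, b, k, va, vb) are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (k INTEGER, va TEXT);
CREATE TABLE b (k INTEGER, vb TEXT);
INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
INSERT INTO b VALUES (2, 'b2'), (3, 'b3');
""")

# FULL OUTER JOIN emulated as a LEFT JOIN plus, via UNION, the
# right-side rows that matched nothing on the left.
rows = conn.execute("""
    SELECT a.k, va, vb FROM a LEFT JOIN b ON a.k = b.k
    UNION
    SELECT b.k, va, vb FROM b LEFT JOIN a ON a.k = b.k
    WHERE a.k IS NULL
""").fetchall()

print(sorted(rows))  # [(1, 'a1', None), (2, 'a2', 'b2'), (3, None, 'b3')]
```

The unmatched rows on either side survive with `NULL` in the missing columns, which is exactly the behavior a native `FULL OUTER JOIN` would give.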
File renamed without changes.
@@ -1,4 +1,4 @@
# Design: Feature Derivation
# _Design:_ Feature Derivation

This file discusses the details and implementations of "Feature Derivation".
Please refer to [this](https://medium.com/@SQLFlow/feature-derivation-the-conversion-from-sql-data-to-tensors-833519db1467) blog to
@@ -6,10 +6,10 @@ As SQLFlow is supporting more and more machine learning toolkits, the correspond

The core `sql` package should include the following functionalities:
1. The entry point of running extended SQL statements.
1. The [parsing](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/sql_parser.md) of extended SQL statements.
1. The [parsing](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/design/design_sql_parser.md) of extended SQL statements.
1. The verification of extended SQL statements, including verifying the syntax, the existence of the selected fields.
1. The [feature derivation](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/feature_derivation.md), including name, type, shape, and preprocessing method of the select fields.
1. The [training data and validation data split](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/training_and_validation.md).
1. The [feature derivation](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/design/design_feature_derivation.md), including name, type, shape, and preprocessing method of the select fields.
1. The [training data and validation data split](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/design/design_training_and_validation.md).

With these functionalities, the `sql` package can translate user-typed extended SQL statements to an IR as an exposed Go struct. The codegen package takes the IR and returns a generated Python program for the `sql` package to execute.
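
The flow above can be sketched end to end; the real `sql` package is written in Go and its IR is a Go struct, so every function and field name below is an illustrative assumption, not SQLFlow's actual API:

```python
# Hypothetical sketch of the `sql` package pipeline described above.
def parse(stmt):
    # entry point + parsing: separate the standard SELECT part
    # from the extended TRAIN clause
    select, _, estimator = stmt.partition(" TRAIN ")
    return {"select": select.strip(), "estimator": estimator.strip()}

def verify(ir):
    # verification: syntax and selected-field existence checks go here
    assert ir["select"].upper().startswith("SELECT")

def derive_features(ir):
    # feature derivation: record name/type/shape of the selected fields
    cols = ir["select"].split(" FROM ")[0][len("SELECT"):]
    ir["features"] = [c.strip() for c in cols.split(",")]

def split(ir):
    # training/validation split on a random column
    return (ir["select"] + " WHERE sqlflow_random < 0.8",
            ir["select"] + " WHERE sqlflow_random >= 0.8")

def codegen(ir, train_sql, val_sql):
    # the codegen package turns the IR into a Python submitter program
    return f"train({ir['estimator']!r}, {ir['features']!r}, {train_sql!r}, {val_sql!r})"

ir = parse("SELECT sepal_len, class FROM iris TRAIN DNNClassifier")
verify(ir)
derive_features(ir)
program = codegen(ir, *split(ir))
```

Each step maps to one numbered functionality in the list, with `codegen` as the hand-off point between the two packages.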

4 changes: 2 additions & 2 deletions doc/pipe.md → doc/design/design_pipe.md
@@ -1,9 +1,9 @@
# Piping Responses
# _Design:_ Piping Responses


## Streaming Responses

As described in the [overall design](doc/syntax.md), a SQLFlow job could be a standard or an extended SQL statement, where an extended SQL statement will be translated into a Python program. Therefore, each job might generate up to the following data streams:
As described in the [overall design](design_syntax.md), a SQLFlow job could be a standard or an extended SQL statement, where an extended SQL statement will be translated into a Python program. Therefore, each job might generate up to the following data streams:

1. standard output, where each element is a line of text,
1. standard error, where each element is a line of text,
2 changes: 1 addition & 1 deletion doc/sql_parser.md → doc/design/design_sql_parser.md
@@ -1,4 +1,4 @@
# Extended SQL Parser Design
# _Design:_ Extended SQL Parser

This documentation explains the technical decision made in building a SQL
parser in Go. It is used to parse the extended SELECT syntax of SQL that
4 changes: 2 additions & 2 deletions doc/submitter.md → doc/design/design_submitter.md
@@ -1,12 +1,12 @@
# Submitter
# _Design:_ Submitter

A submitter is a pluggable module in SQLFlow that is used to submit an ML job to a third party computation service.

## Workflow

When a user types in an extended SQL statement, SQLFlow first parses and semantically verifies the statement. Then SQLFlow either runs the ML job locally or submits the ML job to a third party computation service.

![](figures/sqlflow-arch2.png)
![](../figures/sqlflow-arch2.png)

In the latter case, SQLFlow produces a job description (`TrainDescription` or `PredictDescription`) and hands it over to the submitter. For a training SQL, SQLFlow produces `TrainDescription`; for prediction SQL, SQLFlow produces `PredDescription`. The concrete definition of the description looks like the following

2 changes: 1 addition & 1 deletion doc/syntax.md → doc/design/design_syntax.md
@@ -1,4 +1,4 @@
# SQLFlow: Design Doc
# _Design:_ SQLFlow

## What is SQLFlow

@@ -1,11 +1,11 @@
# Design: Training and Validation
# _Design:_ Training and Validation

A common ML training job usually involves two kinds of data sets: training data and validation data. These two data sets will be generated automatically by SQLFlow through randomly splitting the select results.

## Overall
SQLFlow generates a temporary table from the user-specified query result, then trains and evaluates a model.

<img src="./figures/training_and_validation.png" width="60%">
<img src="../figures/training_and_validation.png" width="60%">

Note that this post covers the **train** process.
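
A minimal sketch of this splitting idea, using SQLite in place of a real engine; only the `sqlflow_random` column name comes from this design, while the table, data, and threshold are illustrative:

```python
import sqlite3

# Copy the SELECT result into a temporary table with an appended
# sqlflow_random column in [0, 1], then carve out the two subsets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE iris (sepal_len REAL, class TEXT);
INSERT INTO iris VALUES (5.1, 'setosa'), (4.9, 'setosa'),
                        (6.2, 'virginica'), (7.0, 'versicolor');
CREATE TEMPORARY TABLE iris_split AS
    SELECT *, ABS(RANDOM()) / 9223372036854775807.0 AS sqlflow_random
    FROM iris;
""")

train = conn.execute(
    "SELECT sepal_len, class FROM iris_split WHERE sqlflow_random < 0.8").fetchall()
validation = conn.execute(
    "SELECT sepal_len, class FROM iris_split WHERE sqlflow_random >= 0.8").fetchall()

# every source row lands in exactly one of the two subsets
assert len(train) + len(validation) == 4
```

Because the random value is materialized once in the temporary table, the two complementary `WHERE` clauses partition the rows deterministically.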

@@ -125,4 +125,4 @@ In the end, SQLFlow removes the temporary table to release resources.

- If the column sqlflow_random already exists, SQLFlow chooses to quit.
  Note that *a column name starting with an underscore is invalid in Hive*
- Any discussion to implement a better splitting is welcomed
- Any discussion to implement a better splitting is welcomed
@@ -1,4 +1,4 @@
# Design Doc: XGBoost on SQLFlow
# _Design:_ XGBoost on SQLFlow

## Introduction

2 changes: 1 addition & 1 deletion doc/text_classification_demo.md
@@ -5,7 +5,7 @@ Note that the steps in this tutorial may change during the development
of SQLFlow; we only provide a way that works for the current version.

To support custom models like CNN text classification, you may check out the
current [design](https://github.com/sql-machine-learning/models/blob/develop/doc/customized%2Bmodel.md)
current [design](https://github.com/sql-machine-learning/models/blob/develop/doc/design/design_customized_model.md)
for ongoing development.

In this tutorial, we use two datasets for English and Chinese text classification.