add user guide for ant-xgboost #772

sperlingxx · 2019-09-03T14:43:26Z

doc/ant-xgboost_user_guide.md

wangkuiyi

Thanks for this PR. I think we need a clear goal for this document. It contains too much about Ant-XGBoost itself, and that material belongs in the Ant-XGBoost repo rather than here in the SQLFlow repo.

It is titled "user guide", but it doesn't cover building and setting up SQLFlow with the Ant-XGBoost codegen. It would also help to link to the Jupyter Notebook setup, so users can follow along and type the examples.

It would be easier to follow if we explained novel concepts, such as the XGBoost Estimator, before showing how to call them.

Also, please follow the Markdown syntax used by GitHub: https://guides.github.com/features/mastering-markdown/


wangkuiyi · 2019-09-04T03:38:11Z

doc/ant-xgboost_user_guide.md

@@ -0,0 +1,265 @@
+### _user guide:_ Ant-XGBoost on sqlflow


`###` => `#`

wangkuiyi · 2019-09-04T03:38:19Z

doc/ant-xgboost_user_guide.md

@@ -0,0 +1,265 @@
+### _user guide:_ Ant-XGBoost on sqlflow
+
+#### Overview


`####` => `##`

wangkuiyi · 2019-09-04T03:43:03Z

doc/ant-xgboost_user_guide.md

+    While the current `auto_train` method is a very simple approach, we are working on better strategies to further scale up hyperparameter tuning in XGBoost training.
+
+
+### Helpful Backports of XGBoost master


Everything above is about Ant-XGBoost and should be part of the Ant-XGBoost documentation rather than SQLFlow's. I would recommend moving the above content to github.com/alipay/ant-xgboost and putting a link in this document that points to that repo.

wangkuiyi · 2019-09-04T03:44:18Z

doc/ant-xgboost_user_guide.md

+
+<br>
+
+## Quick Start


This document is supposed to be about the design of antxgboost_codegen.go, but it doesn't discuss this code generator at all.


wangkuiyi · 2019-09-04T03:47:35Z

doc/ant-xgboost_user_guide.md

+
+<br>
+
+## Overall SQL Syntax


I don't think this document is about the syntax that SQLFlow adds to SQL. Do you want to explain the part of the SQLFlow syntax that the Ant-XGBoost codegen uses?

@wangkuiyi Yes, I want to tell users about the overall SQLFlow syntax related to Ant-XGBoost, so we have renamed this section to "Overall SQL Syntax for AntXGBoost".


sperlingxx · 2019-09-04T13:52:37Z

@wangkuiyi Thanks for the comments! I have refined this guide and added a tutorial for AntXGBoost.

Yancey0623 · 2019-09-05T09:39:32Z

doc/ant-xgboost_user_guide.md

+[Ant-XGBoost](https://github.com/alipay/ant-xgboost) is fork of [dmlc/xgboost](https://github.com/dmlc/xgboost), which is maintained by active contributors of dmlc/xgboost in Alipay Inc.
+
+Ant-XGBoost extends `dmlc/xgboost` with the capability of running on Kubernetes and automatic hyper-parameter estimation. 
+In particular, Ant-XGBoost includes `auto_train` methods for automatic training and introduces an additional parameter `convergence_criteria` for generalized early stopping strategy.


Maybe we can add links to references for `auto_train` and `convergence_criteria`, so that users can understand these concepts clearly.

Yancey0623 · 2019-09-05T09:42:57Z

doc/ant-xgboost_user_guide.md

+
+## Tutorial
+We provide an [interactive tutorial](../example/jupyter/tutorial_antxgb.ipynb) via jupyter notebook, which can be run out-of-the-box in [sqlflow playground](https://play.sqlflow.org).
+If you want to run it locally, you need to install sqlflow first. You can learn how to install sqlflow at [here](../doc/installation.md).


sqlflow => SQLFlow

Yancey0623 · 2019-09-05T09:44:42Z

doc/ant-xgboost_user_guide.md

+
+* xgboost.Regressor 
+
+  Estimator for regression task, set `train.objective` to `reg:squarederror`(`reg:linear`). 


Do users need to set `objective=train.objective` in the WITH clause or not? If not, what would the value of `objective` be?

Yancey0623 · 2019-09-05T09:48:17Z

doc/ant-xgboost_user_guide.md

+### Columns
+
+#### Feature Columns
+For now, two feature column schemas are available.


two feature column schemas
=>
two kinds of feature columns?

Yancey0623 · 2019-09-05T09:50:30Z

doc/ant-xgboost_user_guide.md

+
+First one is `dense schema`, which concatenate numeric table columns transparently, such as `COLUMN f1, f2, f3, f4`.
+
+Second one is `sparse key-value schema`, which received string sparse feature formatted like `$k1:$v1,$k2:$v2,...`.


What do `k` and `v` mean?
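
To make the question concrete, here is a minimal sketch of how the two feature column forms quoted above might be read. It assumes each `k` is an integer feature index and each `v` is that feature's numeric value; the guide excerpt does not state this, so treat it as one plausible interpretation. `parse_sparse_kv` and `dense_row` are hypothetical helper names for illustration only, not part of SQLFlow or Ant-XGBoost.

```python
from typing import Dict, List


def parse_sparse_kv(feature: str) -> Dict[int, float]:
    """Parse a 'k1:v1,k2:v2,...' string into {feature_index: value}.

    Assumption: keys are integer feature indices and values are floats.
    """
    parsed: Dict[int, float] = {}
    for pair in feature.split(","):
        pair = pair.strip()
        if not pair:
            # Skip empty segments, e.g. from an empty string or a trailing comma.
            continue
        key, value = pair.split(":", 1)
        parsed[int(key)] = float(value)
    return parsed


def dense_row(*columns: float) -> List[float]:
    """Concatenate numeric column values (e.g. f1, f2, f3, f4) into one vector."""
    return [float(c) for c in columns]


sparse = parse_sparse_kv("3:0.5,7:1.25")
dense = dense_row(1, 2, 3, 4)
```

If the keys are meant to be something other than indices (for example, feature names), the guide should say so explicitly, since that changes how the string has to be parsed.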


Yancey0623

LGTM!

sperlingxx added 3 commits September 3, 2019 22:01

add user guide for ant-xgboost

0b8bee5

refine

041802b

refine

b4d9b26

sperlingxx requested review from Yancey0623, BlackPoint-CX, weiguoz, typhoonzero, Echo9573 and tonyyang-svail September 3, 2019 14:43

tonyyang-svail reviewed Sep 3, 2019


realign

ff877d1

wangkuiyi reviewed Sep 4, 2019


sperlingxx added 3 commits September 4, 2019 21:37

refine doc

e61298d

refine

4bbf8e0

fix

6cb472d

fix

f13ca81

sperlingxx mentioned this pull request Sep 5, 2019

[TODO List] Ant-XGBoost on sqlflow #630

Closed

13 tasks

Yancey0623 reviewed Sep 5, 2019


refine

87f29d3

Yancey0623 approved these changes Sep 5, 2019


sperlingxx merged commit 8fe6cde into sql-machine-learning:develop Sep 5, 2019

sperlingxx deleted the antxgb_doc branch September 5, 2019 12:48

