# How to Contribute SQLFlow Models

This guide introduces how to contribute SQLFlow models. For the design, see [Define SQLFlow Models](/doc/customized+model.md).

## Develop an SQLFlow Model

1. Open the [SQLFlow models repo](https://github.com/sql-machine-learning/models) in your web browser, and fork the official repo to your account.

1. Clone the forked repo to your host:

    ``` bash
    git clone https://github.com/<Your Github ID>/models.git
    ```

1. Set up your local Python environment by running `make setup && source venv/bin/activate`. If you are using [PyCharm](https://www.jetbrains.com/pycharm/), you can simply run `make setup` and then import the `models` folder as a new project.

1. Add a new model definition Python script under the [sqlflow_models](/sqlflow_models) folder. For example, add a new Python script `mydnnclassifier.py`:

    ``` text
    `-sqlflow_models
      |- dnnclassifier.py
      `- mydnnclassifier.py
    ```

1. You can choose whatever name you like for your model. Your model definition should be a [Keras subclassed model](https://keras.io/models/about-keras-models/#model-subclassing):

    ``` python
    import tensorflow as tf

    class MyDNNClassifier(tf.keras.Model):
        def __init__(self, feature_columns, hidden_units=[10, 10], n_classes=2):
            ...
        ...
    ```

1. Import `MyDNNClassifier` in [sqlflow_models/\_\_init__.py](/sqlflow_models/__init__.py):

    ``` python
    ...
    from .mydnnclassifier import MyDNNClassifier
    ```

1. Test your `MyDNNClassifier` by adding a new unit test script `tests/test_mydnnclassifier.py` and running it as `python tests/test_mydnnclassifier.py`:

    ``` python
    from sqlflow_models import MyDNNClassifier
    from tests.base import BaseTestCases

    import tensorflow as tf
    import unittest


    class TestMyDNNClassifier(BaseTestCases.BaseTest):
        def setUp(self):
            self.features = {...}
            self.label = [...]
            feature_columns = [...]
            self.model = MyDNNClassifier(feature_columns=feature_columns)


    if __name__ == '__main__':
        unittest.main()
    ```
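In step 5 above, the layer definitions are elided with `...`. As an illustrative sketch of what a complete subclassed model might look like (the layer choices below are assumptions for demonstration, not requirements of SQLFlow; feature-column handling is omitted and the inputs are assumed to be already-dense tensors):

``` python
import tensorflow as tf


class MyDNNClassifier(tf.keras.Model):
    def __init__(self, feature_columns, hidden_units=[10, 10], n_classes=2):
        super(MyDNNClassifier, self).__init__()
        # feature_columns is kept to match the signature in this guide;
        # this sketch assumes the inputs are already dense tensors.
        self.feature_columns = feature_columns
        self.hidden_layers = [
            tf.keras.layers.Dense(units, activation='relu')
            for units in hidden_units
        ]
        self.prediction_layer = tf.keras.layers.Dense(
            n_classes, activation='softmax')

    def call(self, inputs):
        # Pass the dense input through each hidden layer, then the
        # softmax output layer that produces per-class probabilities.
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        return self.prediction_layer(x)
```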
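The test in step 7 inherits from `BaseTestCases.BaseTest`. Nesting the shared base case inside a plain holder class is a common unittest idiom: it keeps the test runner from collecting and running the abstract base on its own, so only concrete subclasses execute. Since `tests/base.py` is not shown in this guide, the shared check below is a hypothetical stand-in that sketches the idiom:

``` python
import unittest


class BaseTestCases:
    # The base case is nested inside a plain class so that unittest's
    # discovery does not collect and run it directly; only concrete
    # subclasses that provide setUp() are executed.
    class BaseTest(unittest.TestCase):
        def test_model_exists(self):
            # A shared check inherited by every concrete subclass.
            self.assertIsNotNone(self.model)


class TestExampleModel(BaseTestCases.BaseTest):
    def setUp(self):
        self.model = object()  # stand-in for a real Keras model


if __name__ == '__main__':
    unittest.main()
```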

## Test Your SQLFlow Model

If you have developed a new model, please run the integration test with the SQLFlow gRPC server to make sure it works well with SQLFlow.

1. Launch an SQLFlow all-in-one Docker container:

    ``` bash
    cd ./models
    docker run --rm -it -v $PWD:/models -p 8888:8888 sqlflow/sqlflow
    ```

1. Update `sqlflow_models` in the SQLFlow all-in-one Docker container:

    ``` bash
    docker exec -it <container-id> pip install -U /models
    ```

1. Open a web browser and go to `localhost:8888` to access the Jupyter Notebook. Use your custom model by setting the `TRAIN` clause of the SQLFlow extended SQL to `TRAIN sqlflow_models.MyDNNClassifier`:

    ``` sql
    SELECT * FROM iris.train
    TRAIN sqlflow_models.MyDNNClassifier
    WITH n_classes = 3, hidden_units = [10, 20]
    COLUMN sepal_length, sepal_width, petal_length, petal_width
    LABEL class
    INTO sqlflow_models.my_dnn_model;
    ```

## Publish Your Model in the SQLFlow All-in-One Docker Image

If you have already tested your code, please create a pull request and invite other developers to review it. Once one of the developers **approves** your pull request, you can merge it into the develop branch.
Travis CI builds the SQLFlow all-in-one Docker image with the latest models code every night and pushes it to Docker Hub with the tag `sqlflow/sqlflow:nightly`, so you can find your latest model in it the next day.