Open
Description
This is an umbrella ticket for many sub-tasks.
The existing ML path selection is implemented in the utbot-analytics module.
It suffers from several problems:
- It uses external ML libraries for model inference, which inflates the size of the jar
- It uses the Smile library for inference (it would be better to use scikit-learn and provide a model importer)
- The Smile wrapper for BLAS is used for matrix multiplication
- A Kotlin implementation without an external runtime is too slow (we need our own native implementation of 1-3 operations such as matrix multiplication); multik could probably help
- DJL inference is too slow
- Imported models are stored in JSON/txt format
- We measure the metrics on the contest data
- The utbot-analytics module is de facto unused.
- There are a lot of ML-related settings mixed together with other settings in UtSettings
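To make the matrix multiplication item concrete, here is a minimal sketch of a dependency-free multiplication in pure Kotlin. The names `DenseMatrix` and `matMul` are hypothetical and do not exist in utbot-analytics; the sketch assumes a row-major `DoubleArray` layout and uses the i-k-j loop order, which tends to be friendlier to the cache than the naive i-j-k order. Whether this is fast enough for path selection would still need to be measured.

```kotlin
/** Hypothetical row-major dense matrix; not part of utbot-analytics. */
class DenseMatrix(
    val rows: Int,
    val cols: Int,
    val data: DoubleArray = DoubleArray(rows * cols)
) {
    init {
        require(data.size == rows * cols) { "data size must equal rows * cols" }
    }

    operator fun get(r: Int, c: Int): Double = data[r * cols + c]
}

/**
 * Multiplies a (n x k) by b (k x m) without external libraries.
 * The i-k-j loop order reads b and writes the result row by row.
 */
fun matMul(a: DenseMatrix, b: DenseMatrix): DenseMatrix {
    require(a.cols == b.rows) { "inner dimensions must match" }
    val out = DenseMatrix(a.rows, b.cols)
    for (i in 0 until a.rows) {
        val outRow = i * b.cols
        for (k in 0 until a.cols) {
            val aik = a.data[i * a.cols + k]
            val bRow = k * b.cols
            for (j in 0 until b.cols) {
                out.data[outRow + j] += aik * b.data[bRow + j]
            }
        }
    }
    return out
}
```

A flat array avoids the per-row object overhead of `Array<DoubleArray>`, which is one reason this layout is a common starting point before reaching for a native implementation or multik.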
Expected behavior
- The utbot-analytics module and its inheritors should be easy to enable or disable from the intellij/cli modules
- Scripts for training should be structured and isolated
- Deployed ML models should be part of the jar
- No external libraries in the utbot-analytics module
- External settings should be extracted to UtMLSettings
- Models are located in resources and packed with the plugin
- Models are not larger than 100 KB (zipped or saved in alternative binary format, not json or txt)
- utbot-analytics module contains only interfaces and pure Kotlin implementations
- utbot contains separate modules for custom model inference implementations (such as DJL)
- Different path selectors can be easily compared, and the results can be displayed as a report
- New path selection metrics are created
- We reach significantly better numbers in the metrics
- Obtained models are ranked and well described
- The training process and hyperparameter tuning are well described and published.
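As an illustration of the "alternative binary format" requirement for models, the following sketch stores model weights as gzip-compressed binary floats instead of JSON or txt. The function names `encodeWeights` and `decodeWeights` and the layout (an `Int` length followed by the values) are assumptions made for this example, not an existing utbot format; only JDK classes are used, so no external library is needed.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.DataInputStream
import java.io.DataOutputStream
import java.io.InputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

/** Hypothetical format: gzip stream containing an Int count followed by that many floats. */
fun encodeWeights(weights: FloatArray): ByteArray {
    val bytes = ByteArrayOutputStream()
    // Closing the outer stream finishes the gzip trailer before we read the bytes.
    DataOutputStream(GZIPOutputStream(bytes)).use { out ->
        out.writeInt(weights.size)
        for (w in weights) out.writeFloat(w)
    }
    return bytes.toByteArray()
}

/** Reads weights written by encodeWeights from any stream, e.g. a classpath resource. */
fun decodeWeights(input: InputStream): FloatArray =
    DataInputStream(GZIPInputStream(input)).use { data ->
        val size = data.readInt()
        FloatArray(size) { data.readFloat() }
    }
```

A model packed with the plugin could then be read with `getResourceAsStream` on a resource path of the project's choosing and passed to `decodeWeights`. The actual compressed size depends on the model and would need to be checked against the 100 KB limit.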
Related issues
- 1. Enable utbot-analytics module in utbot-intellij module
- 2. Split training scripts and bash scripts for project needs
- 3. Move DJL models and PyTorch wrapper to a separate module
- 4. Mathematical operations in utbot-analytics should be written in pure Kotlin to reduce the size of the plugin
- 5. Path selector quality analysis.
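For the path selector quality analysis, a comparison report could start from something as small as the sketch below. Everything here is hypothetical: the `SelectorResult` type, its fields, and the use of instruction coverage as the ranking metric are assumptions for illustration, since the ticket does not fix the metrics yet.

```kotlin
import kotlin.math.roundToInt

/** Hypothetical per-selector measurement; the metric and field names are illustrative only. */
data class SelectorResult(val selector: String, val covered: Int, val total: Int) {
    val coverage: Double
        get() = if (total == 0) 0.0 else covered.toDouble() / total
}

/** Ranks selectors by coverage (best first) and renders one plain-text line per selector. */
fun renderReport(results: List<SelectorResult>): String =
    results.sortedByDescending { it.coverage }
        .mapIndexed { index, r ->
            val percent = (r.coverage * 100).roundToInt()
            "${index + 1}. ${r.selector}: ${r.covered}/${r.total} ($percent%)"
        }
        .joinToString("\n")
```

Keeping the report a pure function of a list of results makes it easy to feed it from different runs (for example, one per path selector) and to swap in other metrics later.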
Metadata
Status
Todo