model_builder scikit-learn integration

I am working on making an existing scikit-learn model pipeline produce probabilistic output. To do that, I used model_builder to make a PyMC model that could integrate into a scikit-learn `Pipeline`, including standardization of inputs and outputs. However, the current API doesn't seem suitable for this. I made my own modifications to the `ModelBuilder` class and the [example](https://www.pymc.io/projects/examples/en/latest/howto/model_builder.html) `LinearModel` subclass to get it to work. I think the main change was to have the `fit` and `predict` methods take `X` and `y` as separate parameters rather than as members of a data dict with specially named keys. My references for the scikit-learn estimator API are the scikit-learn [documentation](https://scikit-learn.org/stable/developers/develop.html) and the [template](https://github.com/scikit-learn-contrib/project-template/blob/master/skltemplate/_template.py) for `TemplateEstimator`.

I may very well be on the wrong track (or at least on a different one than model_builder intends), but what I came up with seems to work: I can apply `sklearn.preprocessing.StandardScaler` to inputs, and to point outputs by using `sklearn.compose.TransformedTargetRegressor`. These seem like reasonable things for `ModelBuilder` subclasses to integrate with, so tests and/or examples of this would be good.
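To make the goal concrete, here is a rough sketch of the kind of composition I mean. It does not use PyMC or the real `ModelBuilder` at all: `SketchLinearRegressor` is a hypothetical stand-in that fits by ordinary least squares, just to show an estimator whose `fit(X, y)` and `predict(X)` take arrays directly. I'd assume a real subclass would build and sample a PyMC model in `fit` and return something like a posterior predictive mean from `predict`.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class SketchLinearRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in for a ModelBuilder subclass (least squares, no PyMC)."""

    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept

    def _design(self, X):
        # Prepend a column of ones when an intercept is requested.
        if self.fit_intercept:
            return np.column_stack([np.ones(len(X)), X])
        return X

    def fit(self, X, y):
        # X and y arrive as separate arguments, as scikit-learn expects.
        X, y = check_X_y(X, y)
        self.coef_, *_ = np.linalg.lstsq(self._design(X), y, rcond=None)
        self.n_features_in_ = X.shape[1]
        return self

    def predict(self, X):
        check_is_fitted(self, "coef_")
        X = check_array(X)
        return self._design(X) @ self.coef_


# Synthetic data on a deliberately non-standardized scale.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 100.0 + rng.normal(scale=0.1, size=200)

# Inputs are standardized inside the Pipeline; the target is standardized
# (and predictions un-standardized) by TransformedTargetRegressor.
model = TransformedTargetRegressor(
    regressor=Pipeline(
        [("scale", StandardScaler()), ("model", SketchLinearRegressor())]
    ),
    transformer=StandardScaler(),
)
model.fit(X, y)
pred = model.predict(X)
```

Because the estimator only relies on the `fit(X, y)` / `predict(X)` convention plus `get_params` from `BaseEstimator`, scikit-learn can clone it and wrap it without knowing anything about the model inside.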

Any thoughts? I'm happy to contribute what I can.

model_builder scikit-learn integration #155
