model_builder scikit-learn integration #155

Closed
@pdb5627

I am working on making an existing scikit-learn model pipeline produce probabilistic output. To do that, I used model_builder to build a PyMC model that could be integrated into a scikit-learn Pipeline, including standardization of inputs and outputs. However, the current API doesn't seem suitable for this. I made my own modifications to the ModelBuilder class and the example LinearModel subclass to get it to work. I think the main change was to have the fit and predict methods take X and y as separate parameters rather than as members of a data dict with specially named keys. My reference for the scikit-learn estimator API is the scikit-learn documentation and its TemplateEstimator template.

I may well be on the wrong track (or at least on a different one from what model_builder intends), but what I came up with seems to work: I can apply sklearn.preprocessing.StandardScaler to the inputs, and to the point outputs via sklearn.compose.TransformedTargetRegressor. These seem like reasonable things for ModelBuilder subclasses to integrate with, so tests and/or examples covering them might be good to have.
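To make the shape of the interface concrete, here is a minimal sketch of what I mean, not the actual modified ModelBuilder code. `ConjugateLinearRegressor` is a hypothetical stand-in: it uses a closed-form Bayesian linear regression in NumPy instead of a PyMC model so that it runs without sampling, and its parameter names are made up for illustration. The only point is that `fit(X, y)` and `predict(X)` take separate arrays, which is what lets the estimator be wrapped by `Pipeline` and `TransformedTargetRegressor`.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class ConjugateLinearRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in for a ModelBuilder subclass (not the real API).

    A closed-form Bayesian linear regression replaces the PyMC model here;
    only the fit(X, y) / predict(X) signatures are the point of the sketch.
    """

    def __init__(self, prior_precision=1.0, noise_precision=1.0):
        # Assumed hyperparameters: Gaussian prior on weights, known noise.
        self.prior_precision = prior_precision
        self.noise_precision = noise_precision

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        A = np.column_stack([np.ones(len(X)), X])  # add intercept column
        precision = (
            self.prior_precision * np.eye(A.shape[1])
            + self.noise_precision * A.T @ A
        )
        self.posterior_cov_ = np.linalg.inv(precision)
        self.posterior_mean_ = self.noise_precision * self.posterior_cov_ @ A.T @ y
        return self

    def predict(self, X):
        # Point prediction: posterior mean of the linear predictor.
        check_is_fitted(self)
        X = check_array(X)
        A = np.column_stack([np.ones(len(X)), X])
        return A @ self.posterior_mean_


# Synthetic data on a deliberately awkward scale, so scaling matters.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 100.0 + rng.normal(scale=0.5, size=200)

# Inputs are standardized by the pipeline, the target by the outer wrapper.
model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), ConjugateLinearRegressor()),
    transformer=StandardScaler(),
)
model.fit(X, y)
y_pred = model.predict(X)  # predictions are mapped back to the original y scale
```

Note that this only covers point predictions. `TransformedTargetRegressor` inverse-transforms the output of `predict`, so anything probabilistic (posterior samples, intervals) would need to be transformed back separately.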

Any thoughts? I'm happy to contribute what I can.
