Description
Before
def make_initial_point_fn(
    *,
    free_rvs: Sequence[TensorVariable],
    rvs_to_transforms: dict[TensorVariable, Transform],
    initval_strategies: dict[TensorVariable, np.ndarray | Variable | str | None],
    jitter_rvs: set[TensorVariable] | None = None,
    default_strategy: str = "support_point",
    return_transformed: bool = False,
) -> list[TensorVariable]:
    """Pre-change ("Before") excerpt showing where jitter is applied.

    NOTE(review): this signature and ``list[TensorVariable]`` return type are
    identical to ``make_initial_point_expression`` in the "After" section —
    the function name shown here may be a transcription slip in the issue
    text; confirm against the actual codebase.
    """
    # ... existing code ...
    # The jitter bounds are hard-coded to Uniform(-1, 1) here — exactly the
    # limitation this issue proposes to lift by making them configurable.
    if variable in jitter_rvs:
        jitter = pt.random.uniform(-1, 1, size=value.shape)
        jitter.name = f"{variable.name}_jitter"
        value = value + jitter
After
def make_initial_point_fn(
    *,
    model,
    overrides: StartDict | None = None,
    jitter_rvs: set[TensorVariable] | None = None,
    jitter_bounds: tuple[float, float] = (-1, 1),  # <---- new: (low, high) for the jitter Uniform
    default_strategy: str = "support_point",
    return_transformed: bool = True,
) -> Callable:
    """Proposed "After" version with configurable jitter bounds.

    The new ``jitter_bounds`` keyword defaults to ``(-1, 1)``, so existing
    callers see unchanged behaviour; it is simply threaded through to
    ``make_initial_point_expression``.
    """
    # ... existing code ...
    initial_values = make_initial_point_expression(
        free_rvs=model.free_RVs,
        rvs_to_transforms=model.rvs_to_transforms,
        initval_strategies=initval_strats,
        jitter_rvs=jitter_rvs,
        jitter_bounds=jitter_bounds,  # <---- new: forward the user-chosen bounds
        default_strategy=default_strategy,
        return_transformed=return_transformed,
    )
    # ... rest of existing code ...
def make_initial_point_fns_per_chain(
    *,
    model,
    overrides: StartDict | Sequence[StartDict | None] | None,
    jitter_rvs: set[TensorVariable] | None = None,
    jitter_bounds: tuple[float, float] = (-1, 1),  # <---- new: (low, high) for the jitter Uniform
    chains: int,
) -> list[Callable]:
    """Proposed "After" version: build one initial-point fn per chain,
    passing the configurable ``jitter_bounds`` down to each fn."""
    if isinstance(overrides, dict) or overrides is None:
        # Single overrides dict (or none): every chain shares the same fn.
        ipfns = [
            make_initial_point_fn(
                model=model,
                overrides=overrides,
                jitter_rvs=jitter_rvs,
                jitter_bounds=jitter_bounds,  # <---- new
                return_transformed=True,
            )
        ] * chains
    elif len(overrides) == chains:
        # One overrides entry per chain: build a distinct fn for each.
        ipfns = [
            make_initial_point_fn(
                model=model,
                jitter_rvs=jitter_rvs,
                jitter_bounds=jitter_bounds,  # <---- new
                overrides=chain_overrides,
                return_transformed=True,
            )
            for chain_overrides in overrides
        ]
    # NOTE(review): the excerpt ends here — the error branch for
    # len(overrides) != chains and the return of ``ipfns`` are elided;
    # confirm they remain unchanged in the actual file.
def make_initial_point_expression(
    *,
    free_rvs: Sequence[TensorVariable],
    rvs_to_transforms: dict[TensorVariable, Transform],
    initval_strategies: dict[TensorVariable, np.ndarray | Variable | str | None],
    jitter_rvs: set[TensorVariable] | None = None,
    jitter_bounds: tuple[float, float] = (-1, 1),  # <---- new: (low, high) for the jitter Uniform
    default_strategy: str = "support_point",
    return_transformed: bool = False,
) -> list[TensorVariable]:
    """Proposed "After" version: jitter is drawn from user-supplied bounds
    instead of the previously hard-coded Uniform(-1, 1)."""
    # ... existing code ...
    if variable in jitter_rvs:
        # Draw jitter from the configurable (low, high) interval; the
        # default (-1, 1) reproduces the old behaviour exactly.
        jitter = pt.random.uniform(
            jitter_bounds[0],  # <---- new: lower bound
            jitter_bounds[1],  # upper bound
            size=value.shape
        )
        jitter.name = f"{variable.name}_jitter"
        value = value + jitter
    # ... existing code ...
Context for the issue:
To assist multi-path Pathfinder in exploring complicated posteriors (i.e., multimodal, flat or saddle-point regions, or posteriors with several local modes that get stuck during optimisation), each single Pathfinder needs to be initialised over a broader region. This requires random initialisation points drawn from a distribution wider than Uniform(-1, 1). Attached is an image from Figure 11 of the paper comparing narrow and wide random initialisations.
Making the uniform distribution bounds input parameters would let users of the multi-path Pathfinder algorithm better adjust the initialisations to their scenario.
I'm happy to work on this feature :) Any suggestions on how you'd like the changes to be made?
References:
Zhang, L., Carpenter, B., Gelman, A., & Vehtari, A. (2022). Pathfinder: Parallel quasi-Newton variational inference. Journal of Machine Learning Research, 23(306), 1–49.