Wrong average embedding during inference due to a small bug in neuralcoref.pyx #336

Open
@valedica

The average embeddings can be wrongly calculated during inference due to a small bug in neuralcoref.pyx:

if token.lower not in PUNCTS:

PUNCTS is a list of strings, while token.lower is an integer hash, so the membership test never matches and the condition is always true.
As a result, punctuation embeddings are included in the average embeddings of spans, which can cause a mismatch between training and inference features.
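
The effect can be reproduced without spaCy or neuralcoref. The sketch below is only an illustration: FakeToken, average, and the contents of PUNCTS here are hypothetical stand-ins, not the library's code. It assumes, as in spaCy, that a token exposes an integer `lower` and a string `lower_`, and it suggests (as an assumption, not a confirmed patch) that comparing `lower_` against PUNCTS would restore the intended filtering.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for the PUNCTS list in neuralcoref (a list of strings).
PUNCTS = [".", ",", "!", "?"]


@dataclass
class FakeToken:
    """Hypothetical token mimicking spaCy's `lower` (int) / `lower_` (str) pair."""
    text: str
    vector: List[float]

    @property
    def lower_(self) -> str:
        return self.text.lower()

    @property
    def lower(self) -> int:
        # spaCy returns an integer hash of the lowercase form;
        # Python's hash() is used here only as a stand-in.
        return hash(self.text.lower())


def average(tokens: List[FakeToken], key: Callable[[FakeToken], object]) -> List[float]:
    """Average the vectors of tokens whose key is not in PUNCTS."""
    kept = [t.vector for t in tokens if key(t) not in PUNCTS]
    if not kept:
        return []
    dim = len(kept[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]


span = [
    FakeToken("Hello", [1.0, 1.0]),
    FakeToken(",", [9.0, 9.0]),
    FakeToken("world", [3.0, 3.0]),
]

# Integer compared against strings: never matches, so "," is kept.
buggy = average(span, lambda t: t.lower)
# String compared against strings: "," is filtered out.
fixed = average(span, lambda t: t.lower_)
```

With the integer key the comma's vector is averaged in, so `buggy` differs from `fixed`; with the string key only the two word vectors contribute.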
