Open
Description
The average embeddings can be wrongly calculated during inference due to a small bug in neuralcoref.pyx:
neuralcoref/neuralcoref/neuralcoref.pyx
Line 896 in 60338df
PUNCTS
is a list of strings, while token.lower
is an integer hash.
This means that punctuation embeddings will be added to the average embeddings of spans, causing a potential mismatch between training and inference features.
Metadata
Metadata
Assignees
Labels
No labels