Series import: filter out invalid candidates #821

Closed
@php-coder

We should filter out candidates with short names because otherwise they can match when we use LIKE, which leads to a wrong category:

r.m.w.s.SeriesInfoExtractorServiceImpl   : Determining category from a fragment: 'ДАГОМЕЯ ? М? ДОИСТОРИЧЕСКАЯ ФАУНА ДИНОЗАВРЫ АВИАПОЧТА 3 БЕЗЗУБЦОВЫЕ МАРКИ С ПОЛЯМИ КУПИТЬ! (39)'
r.m.w.s.SeriesInfoExtractorServiceImpl   : Possible candidates: [С, М?, ФАУНА, (39), 3, БЕЗЗУБЦОВЫЕ, ПОЛЯМИ, ДОИСТОРИЧЕСКАЯ, ДИНОЗАВРЫ, МАРКИ, АВИАПОЧТА, ДАГОМЕЯ, ?, КУПИТЬ!]
r.m.w.s.SeriesInfoExtractorServiceImpl   : Found categories: []
r.m.w.s.SeriesInfoExtractorServiceImpl   : Possible candidate: 'С%'
r.m.w.s.SeriesInfoExtractorServiceImpl   : Found categories: [2]

The result is wrong because 2 is the id of the Sport category, while I expected it to find the "Prehistoric animals" category.

Also let's filter out candidates with long names.
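
A minimal sketch of what such a filter might look like. The class name `CandidateFilter`, the method `filter`, and the length limits passed to it are hypothetical and aren't taken from the existing `SeriesInfoExtractorServiceImpl` code; the actual thresholds would need to be chosen based on real category names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of the existing code base.
final class CandidateFilter {

    private CandidateFilter() {
    }

    /**
     * Keeps only candidates whose length lies within [minLength, maxLength].
     * Both limits are assumptions; the issue doesn't specify exact values.
     */
    static Set<String> filter(List<String> candidates, int minLength, int maxLength) {
        // LinkedHashSet removes duplicates and preserves the original order
        Set<String> result = new LinkedHashSet<>();
        for (String candidate : candidates) {
            // Count code points rather than chars so that non-BMP symbols count once
            int length = candidate.codePointCount(0, candidate.length());
            if (length >= minLength && length <= maxLength) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

For example, `CandidateFilter.filter(List.of("С", "М?", "ФАУНА"), 3, 15)` would keep only `ФАУНА`, so a lookup with `'С%'` would never happen. Note that a length check alone would still let through candidates like `(39)`, so it might be worth combining it with other checks.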

Perhaps after fixing #819 it will find the "Prehistoric animals" category on the first attempt (via aliases) and won't produce a wrong result (because it won't do a lookup with LIKE).
