Skip to content

Reformat transactions, add docs, etc. #45

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 14, 2022
Merged

Reformat transactions, add docs, etc. #45

merged 5 commits into from
Mar 14, 2022

Conversation

philvarner
Copy link
Collaborator

Related Issue(s):

Description:

PR Checklist:

  • Code is formatted and linted (run pre-commit run --all-files)
  • Tests pass (run make test)
  • Documentation has been updated to reflect changes, if applicable, and docs build successfully (run make docs)
  • Changes are added to the CHANGELOG, if applicable

@@ -1,57 +1,53 @@
repos:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just reformatting

}
},
# Default all other strings not otherwise specified to keyword
{"strings": {"match_mapping_type": "string", "mapping": {"type": "keyword"}}},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"strings" is probably the most important one -- this ensures that all string values that aren't otherwise noted (like title and description) are keywords, so they don't get parsed. Without this, you can't do things like sort on id or filter on id when the id has a non-word character (like underscore) in it.

]

mappings = {
"numeric_detection": False,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this turns off the behavior of trying to index numeric-valued strings as numbers, e.g., it maintains that "10" is indexed as the keyword string "10" rather than the number 10.

"created": {"type": "date"},
"updated": {"type": "date"},
# Satellite Extension https://github.com/stac-extensions/sat
"sat:absolute_orbit": {"type": "integer"},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are both explicitly integral values, so don't let them be floats

},
# Default all other strings not otherwise specified to keyword
{"strings": {"match_mapping_type": "string", "mapping": {"type": "keyword"}}},
{"numerics": {"match_mapping_type": "long", "mapping": {"type": "float"}}},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make numeric values floats instead of integral. Without this, the first value for a field that comes in could be a floating-point that just happens to be an integer, and thereafter all of the values will be truncated. E.g., without this, if the first item had a value for eo:cloud_cover as 10.3, it would be indexed as the long 10, and a query eo:cloud_cover <= 10 would match on it, even though the Item itself would still have 10.3 as the value!

ignore=400, # ignore 400 already exists code
)

def create_item(self, model: stac_types.Item, **kwargs):
@overrides
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed the parameters for all of these methods that override ones in the base class to match the names in the base class, and put them in the same order they are in the base class.

"""Bulk item insertion using es."""
transactions_client = TransactionsClient()
transactions_client._create_item_index()
transactions_client.create_item_index()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed this method to remove the leading _ because this is a usage outside the class.

@@ -178,10 +256,13 @@ def bulk_sync(self, processed_items):
]
helpers.bulk(self.client, actions)

def bulk_item_insert(self, items: Items, **kwargs) -> str:
@overrides
def bulk_item_insert(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the new signature for the method in stac-fastapi.

index="stac_items",
body=mapping,
body={"mappings": self.mappings},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply the mappings when creating stac_items

@philvarner philvarner changed the title WIP - improve elasticsearch mappings Reformat transactions, add docs, etc. Mar 14, 2022
@philvarner
Copy link
Collaborator Author

I pulled the actual changes to the mappings out of this one, since there were so many other formatting and cleanup changes.

@philvarner philvarner marked this pull request as ready for review March 14, 2022 21:18
@philvarner philvarner requested a review from jonhealy1 March 14, 2022 21:18
Copy link
Collaborator

@jonhealy1 jonhealy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@philvarner philvarner merged commit e385f14 into main Mar 14, 2022
@philvarner philvarner deleted the pv/es-mapping branch March 14, 2022 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants