Handling the base URI while evaluating against an instance

This issue is about the architectural principles that:
* You can resolve all uses of the base URI as a pre-processing step
* Once you do that, evaluating a schema object is the same regardless of its parent

Technically, draft 2019-09 violates that, although in practice we can wiggle around it.  But we should decide whether these principles hold, and make sure that our spec meets our own principles!  Options include:

* We change how we describe the behavior of a few keywords, and keep both principles _(I'm leaning towards this as explained at the end)_
* We decide that you can resolve `$id` and `$ref` as a pre-processing step, but in general you might need to keep track of base URIs during evaluation with an instance
* We decide not to promote the idea of a pre-processing step at all (although you do need one to at least discover all of the static URIs that can be reference targets, so you never entirely get rid of it)

------

`$id` and `$ref` (the only draft-07 keywords that rely on the base URI) can be pre-processed during schema _loading_ by simply setting their values to the full URIs at the same time that you find the various schemas and cache them in some sort of URI-lookup thingy.
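To make that concrete, here is a rough sketch of such a loading pass in Python. The function name `preprocess` and the plain-dict `registry` are made up for illustration, and the walk is deliberately naive: it treats every nested object as a schema, which a real implementation would not do for things like `enum` or `const` values.

```python
from urllib.parse import urljoin


def preprocess(schema, base_uri, registry):
    """Rewrite $id and $ref to full URIs and index schemas by their $id.

    `registry` is a hypothetical URI-lookup cache (a plain dict here).
    """
    if isinstance(schema, dict):
        if isinstance(schema.get("$id"), str):
            # $id changes the base URI for this object and its children.
            base_uri = urljoin(base_uri, schema["$id"])
            schema["$id"] = base_uri
            registry[base_uri] = schema
        if isinstance(schema.get("$ref"), str):
            schema["$ref"] = urljoin(base_uri, schema["$ref"])
        for key, value in schema.items():
            if key not in ("$id", "$ref"):
                preprocess(value, base_uri, registry)
    elif isinstance(schema, list):
        for item in schema:
            preprocess(item, base_uri, registry)
    return schema


registry = {}
root = {
    "$id": "https://example.com/root.json",
    "properties": {
        "a": {"$id": "sub/a.json", "items": {"$ref": "b.json#/definitions/x"}},
        "c": {"$ref": "#/definitions/y"},
    },
}
preprocess(root, "https://example.com/", registry)
print(root["properties"]["a"]["items"]["$ref"])
print(root["properties"]["c"]["$ref"])
```

After this pass, nothing in the tree needs to know its parent's base URI to follow a `$ref`.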

However, in the latest draft, `$anchor`, `$recursiveAnchor`, and `$recursiveRef` rely on the base URI as well.

For `$anchor`, since it only adds a URI to associate with that schema in your cache, there's no further processing with the base URI to do there.  So it's not a problem in practice.

But `$recursiveAnchor` and `$recursiveRef` are problems.  At least in theory.  In practice, because we restrict `$recursiveAnchor` to resource root schemas and `$recursiveRef` to only have a value of `"#"`, you can handle these without needing to know the base URI.  That is, in fact, why those restrictions exist.

So we can kind-of get away with ignoring this problem for now, and we could change how we talk about these keywords to remove the base URI stuff.

---------

In general, though, these keywords as we currently describe them work by dynamically re-calculating the base URI of the URI-reference in `$recursiveRef` depending on `$recursiveAnchor`.  This was done so that we could later lift those restrictions on the value of `$recursiveRef`, or replace these keywords with a more general `$dynamicAnchor` and `$dynamicRef`, since the name "recursive" wouldn't be entirely accurate anymore.

This works as follows:

* Resolve the URI-reference in `$recursiveRef` just as you would for `$ref` to get the _initial target_
* If the initial target contains `"$recursiveAnchor": true`, walk back up the dynamic scope to find the outermost scope that also has `"$recursiveAnchor": true`, to get the _intermediate target_
* ***Re-resolve*** the URI-reference in `$recursiveRef` against the base URI from the intermediate target, to produce the _final target_
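
A minimal Python sketch of those three steps, under some loudly stated assumptions: the dynamic scope is modeled as a list of resource base URIs (outermost first), `registry` is a hypothetical dict from full URI to schema object, and the initial target is assumed to be a registered resource root (which holds under the current `"#"` restriction). None of these names come from the spec.

```python
from urllib.parse import urljoin


def resolve(base_uri, ref):
    """Resolve ref against base_uri, dropping an empty trailing fragment."""
    return urljoin(base_uri, ref).rstrip("#")


def resolve_recursive_ref(ref, current_base, dynamic_scope, registry):
    """Return the URI of the final target of a $recursiveRef.

    dynamic_scope: base URIs of the schema resources entered so far,
    outermost first (an assumed representation).
    """
    # Step 1: resolve like $ref to get the initial target.
    initial_uri = resolve(current_base, ref)
    if registry[initial_uri].get("$recursiveAnchor") is not True:
        return initial_uri

    # Step 2: find the outermost dynamic scope that also has
    # "$recursiveAnchor": true; that is the intermediate target.
    intermediate_base = current_base
    for base_uri in dynamic_scope:
        if registry[base_uri].get("$recursiveAnchor") is True:
            intermediate_base = base_uri
            break

    # Step 3: re-resolve the original reference against the
    # intermediate target's base URI to get the final target.
    return resolve(intermediate_base, ref)


registry = {
    "https://example.com/tree": {"$recursiveAnchor": True},
    "https://example.com/strict-tree": {"$recursiveAnchor": True},
}
scope = ["https://example.com/strict-tree", "https://example.com/tree"]
final = resolve_recursive_ref("#", "https://example.com/tree", scope, registry)
print(final)
```

Note that step 3 needs the original, unresolved `ref` plus two base URIs at evaluation time.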

The nice thing about this is that it works with any URI-reference in `$recursiveRef` (although some sorts of URI-references don't make much sense, which is the other reason we restricted it).  And in practice, the `"#"` restriction means that the final target is **always** the same as the intermediate target, so you never actually need to re-resolve the URI reference.  Once you find the intermediate target, you're done.

But in the general case, where the intermediate and final targets could be different, you need to know, *at runtime*, the base URI for both the initial and intermediate targets.  You can't even resolve the `$recursiveRef` to a full URI because you need to know which part was the original reference in order to re-resolve it against the intermediate target's base URI.
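
A small illustration of why the pre-resolved URI is not enough, using two made-up references and made-up base URIs. Both references resolve to the same full URI against the initial base, but they re-resolve differently against another base, so the full URI alone cannot tell you what the final target should be.

```python
from urllib.parse import urljoin

initial_base = "https://example.com/tree"
other_base = "https://example.com/v2/strict"

# Two different original references (hypothetical values).
ref_a = "#/definitions/node"
ref_b = "tree#/definitions/node"

# Against the initial base they are indistinguishable...
same_a = urljoin(initial_base, ref_a)
same_b = urljoin(initial_base, ref_b)

# ...but against another base they point at different places.
moved_a = urljoin(other_base, ref_a)
moved_b = urljoin(other_base, ref_b)

print(same_a == same_b)
print(moved_a)
print(moved_b)
```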

-----------

If we want to keep the ability to preprocess the base URI to the point where we never need to worry about parent schema objects, the best way to do that would be to reserve a keyword (`$base`? `$_base`?) where an implementation could safely store the base URI during preprocessing if there's a keyword in that object that would need it.  Then, when `$recursiveRef` or `$recursiveAnchor` is encountered, an implementation could just look at that reserved location.
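
As a sketch of what that could look like (assuming `$base` as the reserved name, which is just one of the candidates above, and a hypothetical `annotate_base` helper), the preprocessing pass would stash the base URI only in objects that carry one of the two keywords:

```python
from urllib.parse import urljoin

BASE_KEY = "$base"  # hypothetical reserved keyword, not part of any draft
NEEDS_BASE = ("$recursiveRef", "$recursiveAnchor")


def annotate_base(schema, base_uri):
    """Store the in-scope base URI next to keywords that need it later."""
    if isinstance(schema, dict):
        if isinstance(schema.get("$id"), str):
            base_uri = urljoin(base_uri, schema["$id"])
        if any(keyword in schema for keyword in NEEDS_BASE):
            schema[BASE_KEY] = base_uri
        # Copy the items so the walk is unaffected by the key added above.
        for key, value in list(schema.items()):
            if key != BASE_KEY:
                annotate_base(value, base_uri)
    elif isinstance(schema, list):
        for item in schema:
            annotate_base(item, base_uri)
    return schema


tree = {
    "$id": "https://example.com/tree",
    "$recursiveAnchor": True,
    "properties": {
        "children": {"type": "array", "items": {"$recursiveRef": "#"}},
    },
}
annotate_base(tree, "https://example.com/")
print(tree["properties"]["children"]["items"][BASE_KEY])
```

With that in place, evaluating the `items` subschema no longer requires looking at its parents to find the base URI.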

There are subtleties there like what to do if someone actually does try to use it as a keyword, etc.  But that's what comes to mind for me.

Thoughts?  At the moment, I'm leaning towards changing how we talk about `$recursiveAnchor` and `$recursiveRef` and saying something like "these keywords adjust the reference target within the dynamic scope" instead of "these keywords change the base URI of the reference."  That gets a bit messy with another architectural principle of "always identify things with URIs", but I feel like that is more easily finessed.  Besides, we at least _start_ with a URI, and we can figure out the URI of the final target if we wanted to.



Handling the base URI while evaluating against an instance #868
