v3.2: Support ordered multipart including streaming #4589

Open: wants to merge 4 commits into base: v3.2-dev

handrews
Member

Fixes:

This adds support for all multipart media types that do not have named parts, including support for streaming such media types. Note that multipart/mixed defines the basic processing rules for all multipart types, and implementations that encounter unrecognized multipart subtypes are required to process them as multipart/mixed. Therefore support for multipart/mixed addresses all other subtypes to some degree.
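As a rough illustration of the fallback rule described above, the sketch below uses Python's standard `email` package to parse a body whose subtype, `multipart/x-example`, is made up for this example. The choice of parser and the helper name `part_types` are assumptions of the sketch, not anything this pull request depends on. The parser splits any `multipart/*` body on its boundary, so the parts come back as an ordered list regardless of the subtype.

```python
from email import message_from_bytes

# A body with an unrecognized (made-up) multipart subtype.
RAW = (
    b"Content-Type: multipart/x-example; boundary=aaa\r\n"
    b"\r\n"
    b"--aaa\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"data": ""}\r\n'
    b"--aaa\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"test file\r\n"
    b"--aaa--\r\n"
)


def part_types(raw: bytes) -> list[str]:
    """Return the content type of each top-level part, in order."""
    msg = message_from_bytes(raw)
    if not msg.is_multipart():
        return [msg.get_content_type()]
    # For multipart messages, get_payload() is the ordered list of parts.
    return [part.get_content_type() for part in msg.get_payload()]


print(part_types(RAW))  # ['application/json', 'text/plain']
```

The ordered list of parts is what makes an array a natural model for this family of media types.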

This builds on the recent support for sequential media types:

  • multipart/mixed and similar types meet the definition of a sequential media type, which requires them to be modeled as an array. This does use an expansive definition of "repeating the same structure", where the structure is literally any content with a media type.
  • As a sequential media type, multipart/mixed also supports itemSchema.
  • Adding a parallel itemEncoding is the obvious solution for multipart/mixed streams, which require an Encoding Object.
  • We have regularly received requests to support truly mixed multipart/mixed payloads, and we previously claimed such support from 3.0.0 onwards without actually providing it. Adding prefixEncoding along with itemEncoding supports this use case with a clear parallel to prefixItems, which is the schema construct needed for it.
  • There is no need for a prefixSchema field because the streaming use case requires a repetition of the same schema for each item. Therefore all mixed use cases can use schema and prefixItems.
  • schema changes are included in this pull request
  • schema changes are needed for this pull request but not done yet
  • no schema changes are needed for this pull request
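One way to picture how prefixEncoding and itemEncoding could pair up with the parts of a payload is the sketch below. It assumes that positions beyond the end of prefixEncoding fall back to itemEncoding, mirroring how prefixItems relates to the rest of an array schema; that fallback rule, the helper name `encoding_for_position`, and the sample Media Type Object are all assumptions made for illustration and are not taken from the specification text.

```python
from typing import Any, Optional


def encoding_for_position(
    media_type: dict[str, Any], index: int
) -> Optional[dict[str, Any]]:
    """Pick the Encoding Object for the part at `index` (hypothetical helper).

    Assumption: positional entries in prefixEncoding win, and any later
    position falls back to itemEncoding when it is present.
    """
    prefix = media_type.get("prefixEncoding", [])
    if index < len(prefix):
        return prefix[index]
    return media_type.get("itemEncoding")


# Hypothetical Media Type Object for a multipart/mixed payload:
# one JSON part followed by any number of PNG images.
media_type = {
    "schema": {
        "type": "array",
        "prefixItems": [{"type": "object"}],
        "items": {},
    },
    "prefixEncoding": [{"contentType": "application/json"}],
    "itemEncoding": {"contentType": "image/png"},
}

for i in range(3):
    print(i, encoding_for_position(media_type, i))
```

Under that assumption, a streaming payload would use itemEncoding alone, while a fixed-layout payload would use prefixEncoding alone.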

We do not seem to run tests on the 3.2 schemas, and I couldn't quickly figure out how to add that, so we should do that separately and include coverage for this and other new fields.

Also paging @thecheatah, @jeremyfiel

@handrews added this to the v3.2.0 milestone May 15, 2025
@handrews requested review from a team as code owners May 15, 2025 18:54
@handrews added the "enhancement" and "media and encoding" labels May 15, 2025
@jeremyfiel
Contributor

Thanks @handrews for taking this on. I'm really happy to see it coming to fruition, and hopefully the tooling catches up with it sooner rather than later.

I couldn't immediately make out if this would support nested multipart.

POST /things HTTP/1.1
content-type: multipart/mixed;boundary=aaa

--aaa
content-type: application/json

{
   "data": ""
}
--aaa
content-type: multipart/mixed;boundary=bbb

        --bbb
        content-type: application/json

        {
            "more_data": ""
        }
        --bbb
        content-type: text/plain

        test file
        --bbb
        content-type: application/zip

        <binary data>
        --bbb
        content-type: application/pdf

        <binary data>
        --bbb--
--aaa--

multipart/mixed:
  schema:
    prefixItems:
      - type: object
        properties:
          data:
            type: string
      - prefixItems:
          - type: object
            properties:
              more_data:
                type: string
          - {}
          - {}
          - {}
  prefixEncoding:
    - {}
    - contentType: multipart/mixed
    # not sure how to further document a nested structure here.

@handrews
Member Author

@jeremyfiel aww... I was hoping no one would bring up nested multipart... 😵‍💫

I think it would be hard to do that, because there isn't anywhere to put the nested Encoding Object. I think we'd have to add encoding, prefixEncoding, and itemEncoding to the Encoding Object as well as the Media Type Object. I'm a bit hesitant to do that, but we could talk about it at the Thursday call and I could submit it as a follow-up if it gains traction.

Alternatively, we could recommend trying that as an extension given that it adds significant complexity and is a rare case that is deprecated by the current RFC (I know that's small consolation when you're the "rare case" and built things in good faith using older RFCs when they were current).

The complexity is not just the recursive structure, but also that you are now correlating two separate trees of structure.
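To make the "two trees" point concrete, the sketch below parses a trimmed version of the nested example from earlier in this thread with Python's standard `email` package and returns the shape of the payload as nested lists. The parser choice and the helper name `content_tree` are assumptions of this sketch. A description of such a payload would need a schema tree and an encoding tree that both match this shape.

```python
from email import message_from_bytes
from email.message import Message

# Trimmed version of the nested multipart/mixed example in this thread.
RAW = (
    b"Content-Type: multipart/mixed; boundary=aaa\r\n\r\n"
    b"--aaa\r\n"
    b"Content-Type: application/json\r\n\r\n"
    b'{"data": ""}\r\n'
    b"--aaa\r\n"
    b"Content-Type: multipart/mixed; boundary=bbb\r\n\r\n"
    b"--bbb\r\n"
    b"Content-Type: application/json\r\n\r\n"
    b'{"more_data": ""}\r\n'
    b"--bbb\r\n"
    b"Content-Type: text/plain\r\n\r\n"
    b"test file\r\n"
    b"--bbb--\r\n"
    b"--aaa--\r\n"
)


def content_tree(msg: Message):
    """Return a content type string, or a list of subtrees for multipart."""
    if msg.is_multipart():
        return [content_tree(part) for part in msg.get_payload()]
    return msg.get_content_type()


tree = content_tree(message_from_bytes(RAW))
print(tree)  # ['application/json', ['application/json', 'text/plain']]
```

The inner list is the part that has no home today: the outer prefixEncoding entry can say `contentType: multipart/mixed`, but nothing describes the encodings of the parts inside it.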

@jeremyfiel
Contributor

I'm not entirely sure it is correct to include multipart/mixed in this statement. It is registered in the IANA registry, and it does technically have an envelope with the boundary parameter.

Sequential Media Types

Within this specification, a sequential media type is defined as any media type that consists of a repeating structure, without any sort of header, footer, envelope, or other metadata in addition to the sequence.
Some examples of sequential media types (including some that are not IANA-registered but are in common use) are:

  application/jsonl
  application/x-ndjson
  application/json-seq
  application/geo+json-seq
  text/event-stream
  multipart/mixed

@handrews
Member Author

handrews commented May 27, 2025

[EDIT: This goes with the nested multipart discussion]

@jeremyfiel the problem is that instead of just re-using the Media Type Object, we came up with the contentType field :-(

@jeremyfiel
Contributor

I totally understand the complexity, just trying to confirm my initial impression.

@handrews
Member Author

@jeremyfiel That statement only says that some of the listed types are not registered. application/json-seq, application/geo+json-seq, and multipart/mixed are all registered.

I decided not to get into the preamble and postamble of multipart because AFAICT they're supposed to be ignored and are there for historical purposes. Media type parameters are not part of the actual media type content, and the boundaries in the content are no more (or less) significant than the various differences in the three sequential JSON media type delimiters.
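A small sketch of that point, again using Python's standard `email` package as an assumed stand-in for a multipart-aware reader: the parser keeps the preamble and epilogue on the side, and the list of parts is the same whether or not they are present. The helper name `split_envelope` is hypothetical.

```python
from email import message_from_bytes

RAW = (
    b"Content-Type: multipart/mixed; boundary=aaa\r\n\r\n"
    b"This preamble is not part of any item.\r\n"
    b"--aaa\r\n"
    b"Content-Type: text/plain\r\n\r\n"
    b"first\r\n"
    b"--aaa\r\n"
    b"Content-Type: text/plain\r\n\r\n"
    b"second\r\n"
    b"--aaa--\r\n"
    b"This epilogue is not part of any item either.\r\n"
)


def split_envelope(raw: bytes):
    """Return (preamble, part bodies, epilogue) for a multipart message."""
    msg = message_from_bytes(raw)
    # Text before the first boundary and after the closing boundary is
    # exposed separately from the parts themselves.
    bodies = [part.get_payload().strip() for part in msg.get_payload()]
    return msg.preamble, bodies, msg.epilogue


preamble, bodies, epilogue = split_envelope(RAW)
print(bodies)
```

Only `bodies` corresponds to the array that a schema would describe; the other two values can be dropped without changing it.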

@handrews
Member Author

@jeremyfiel I added some clarifications about the envelope/preamble/epilogue and the lack of nesting support.
