[Discussion] Add initial design tenets/goals for S3 TransferManager #1120
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
# Project Tenets (unless you know better ones) | ||
|
||
1. Meeting customers in their problem space allows them to deliver value | ||
quickly. | ||
2. Meeting customer expectations drives usability. | ||
3. Discoverability drives usage. | ||
|
||
# Introduction | ||
|
||
This project provides a much improved experience for S3 customers who need to | |
upload and download objects to and from S3 by providing `S3TransferManager`, a | |
high-level library built on the S3 client. | |
|
||
# Project Goals | ||
|
||
1. For the use cases it addresses, i.e. the transfer of objects to and from S3, | ||
S3TransferManager is the preferred solution. It is easier and more intuitive | ||
than using the S3 client. In the majority of situations, it is more | ||
performant. | ||
1. S3TransferManager provides a truly asynchronous, non-blocking API that | ||
conforms to the norms present in the rest of the SDK. | ||
1. S3TransferManager makes efficient use of system resources. | ||
1. S3TransferManager supplements rather than replaces the lower-level S3 client. | |
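To make the non-blocking goal concrete, here is a minimal sketch of what such an API shape could look like. It assumes the `CompletableFuture`-returning convention used by the SDK's async clients. `ObjectTransfer` and `InMemoryTransfer` are hypothetical names invented for this illustration; they are not SDK types and not the proposed design.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical API shape; every name here is illustrative, not an SDK type. */
interface ObjectTransfer {
    /** Starts an upload and returns immediately; the future completes with the byte count. */
    CompletableFuture<Long> upload(String bucket, String key, byte[] content);
}

/** In-memory stand-in (an assumption) so the sketch runs without AWS access. */
final class InMemoryTransfer implements ObjectTransfer {
    @Override
    public CompletableFuture<Long> upload(String bucket, String key, byte[] content) {
        // supplyAsync runs the work on the common ForkJoinPool, so the caller is not blocked.
        return CompletableFuture.supplyAsync(() -> (long) content.length);
    }
}
```

A caller would chain follow-up work with `thenAccept` or `thenCompose` instead of waiting on the result, which keeps the calling thread free while the transfer is in flight.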
|
||
# Project Non-Goals | |
|
||
1. Ability to use the blocking, synchronous client. | ||
|
||
Using a blocking client would severely impede the ability to deliver on goals | ||
#2 and #3. | ||
|
||
# Customer-Requested Changes from 1.11.x | ||
|
||
* S3TransferManager supports progress listeners that are easier to use. | ||
|
||
Ref: https://github.com/aws/aws-sdk-java-v2/issues/37#issuecomment-316218667 | ||
More of a meta-question: do we know yet whether these will be S3-specific, or built into the core?

I think it makes sense for them to be the same thing. Or at least, TransferManager supports both the core and an "enhanced" TransferManager-specific version if it makes sense to have one. |
||
|
||
* S3TransferManager provides bandwidth limiting of uploads and downloads. | ||
|
||
Ref: https://github.com/aws/aws-sdk-java/issues/1103 | ||
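One common way to limit bandwidth is a token bucket. The sketch below is only an illustration of that technique, not the mechanism chosen for S3TransferManager. The class `TokenBucket` and its parameters are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.

```java
/** Hypothetical token bucket for byte-rate limiting; not an SDK class. */
final class TokenBucket {
    private final double bytesPerSecond;
    private final double capacityBytes;
    private double availableBytes;
    private long lastRefillNanos;

    TokenBucket(long bytesPerSecond, long capacityBytes, long nowNanos) {
        if (bytesPerSecond <= 0 || capacityBytes <= 0) {
            throw new IllegalArgumentException("rate and capacity must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
        this.capacityBytes = capacityBytes;
        this.availableBytes = capacityBytes; // the bucket starts full
        this.lastRefillNanos = nowNanos;
    }

    /** Reserves the given number of bytes if the budget allows it; never blocks. */
    synchronized boolean tryAcquire(long bytes, long nowNanos) {
        refill(nowNanos);
        if (bytes > availableBytes) {
            return false;
        }
        availableBytes -= bytes;
        return true;
    }

    /** Adds the budget earned since the last refill, capped at the bucket capacity. */
    private void refill(long nowNanos) {
        long elapsedNanos = Math.max(0L, nowNanos - lastRefillNanos);
        double earned = elapsedNanos / 1_000_000_000.0 * bytesPerSecond;
        availableBytes = Math.min(capacityBytes, availableBytes + earned);
        lastRefillNanos = Math.max(lastRefillNanos, nowNanos);
    }
}
```

Because `tryAcquire` returns immediately, a non-blocking caller that is refused can reschedule the write for later rather than sleeping on a thread.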
|
||
* The size of resources used by S3TransferManager and configured by the user | |
should not affect its stability. | |
|
||
For example, the configured size of a thread pool should be irrelevant to | |
S3TransferManager's ability to successfully perform an operation. | |
|
||
Ref: https://github.com/aws/aws-sdk-java/issues/939 | ||
|
||
* S3TransferManager supports parallel downloads of any object. | ||
|
||
Any object stored in S3 should be downloadable in multiple parts | ||
simultaneously, not just those uploaded using the Multipart API. | ||
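Downloading an arbitrary object in parts is typically done with ranged GET requests, one byte range per part. The sketch below only shows how the ranges could be planned; `RangePlanner` and `planRanges` are hypothetical names, and how the part size is chosen is left as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits an object into HTTP byte ranges; not an SDK class. */
final class RangePlanner {
    private RangePlanner() {
    }

    /** Returns Range header values covering bytes 0..objectSize-1 in parts of at most partSize bytes. */
    static List<String> planRanges(long objectSize, long partSize) {
        if (objectSize < 0 || partSize <= 0) {
            throw new IllegalArgumentException("objectSize must be >= 0 and partSize must be > 0");
        }
        List<String> ranges = new ArrayList<>();
        for (long start = 0; start < objectSize; start += partSize) {
            // HTTP byte ranges are inclusive at both ends.
            long end = start + Math.min(partSize, objectSize - start) - 1;
            ranges.add("bytes=" + start + "-" + end);
        }
        return ranges;
    }
}
```

Each returned value could be sent as the `Range` header of a separate GET request, and the parts written to their offsets in the destination as they arrive.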
|
||
* S3TransferManager has the ability to upload to and download from a pre-signed | ||
URL. | ||
Does this include parallel downloads (see previous goal)?

For downloads definitely. I wasn't sure if it's possible for uploads so I didn't want to specify parallel in case it's not. I'll confirm and then update the wording if necessary.

In that case I'd probably just leave it as is for now, or add a caveat that parallel downloads are supported where possible. |
||
|
||
* S3TransferManager allows uploads and downloads from and to memory. | ||
|
||
Ref: https://github.com/aws/aws-sdk-java/issues/474 | ||
|
||
* Ability to easily use canned ACL policies with all transfers to S3. | ||
|
||
Ref: https://github.com/aws/aws-sdk-java/issues/1207 | ||
|
||
* Trailing checksums for parallel uploads and downloads. |
Not sure where you separate client vs HTTP client but I think it should be a primary goal to support sync HTTP clients like Apache. Each HTTP client has different levels of configuration and allowing customers to use their preferred one should be a requirement.

I guess both as in the blocking service client and HTTP clients. I can't see how supporting a sync HTTP client as a goal can coexist with the async/non-blocking, efficiency, and performance goals. We already saw a lot of this when we explored whether we could expose sync HTTP clients as async and vice versa.
To me, a big part of the original TransferManager was to provide the async/non-blocking experience so by focusing on using the async HTTP clients, we further enhance that experience.
If it's a matter of configuration I would think supporting a config present in Apache but not Netty would be easier to achieve than supporting Apache/a non blocking HTTP client in TransferManager.

I think just limiting it to non-blocking clients is fine, as long as we have support for HTTP proxies in Netty. Sync clients and async abstractions like TransferManager don't mix well.

So no shared connection pool if customers are using the sync S3 client? Which is made worse given that you have to use both the low-level client and transfer manager.
No lightweight JDK client?
No GAE support?
A worse debugging experience?

In my view, TransferManager is an enhancement of the async client, where this is all true right now as well.
I think it's important that customers who can't use the async clients for the reasons given above are not left without an option, but I think that means we provide a similar, but separate library for non-async clients.