S3 Multipart Upload
S3 provides a set of multipart upload operations that let you upload an object in multiple parts. If your object is larger than 5GB you are required to use the multipart operations, but multipart also has the advantage that if one part fails to upload you don't need to re-upload the whole object, just the parts that failed. Finally, you can upload multiple parts at the same time and thus improve your upload speed. Uploading multiple parts at the same time isn't supported by Soto at the moment, but it is on the list of improvements to make. Amazon's documentation on multipart upload covers the operations in more detail.
Multipart upload has three stages. First you initiate the upload with S3.CreateMultipartUpload(). Next, you upload each part using S3.UploadPart(), and then you complete the upload by calling S3.CompleteMultipartUpload(). If there is an error and you don't want to finish the upload, you need to call S3.AbortMultipartUpload(). The code to implement this can get quite complex, so Soto provides a function that implements all of this for you.
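    let request = S3.CreateMultipartUploadRequest(bucket: "MyBucket", key: "MyFile.txt")
    // Upload the file in 5MB parts and abort the whole upload if any part fails.
    let response = try await s3.multipartUpload(
        request,
        partSize: 5*1024*1024,
        filename: "/Users/home/myfile.txt",
        abortOnFail: true
    ) { progress in
        // Called after each part with a value between 0 and 1.
        print(progress)
    }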
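To make the three stages concrete, here is a minimal sketch of the control flow that a helper like multipartUpload has to manage. It does not call Soto or S3: MultipartStore, InMemoryStore, PartFailure and upload(_:key:to:) are hypothetical names invented for this illustration, and the in-memory store only records which stage was called.

```swift
// Hypothetical stand-in for the four S3 multipart operations (not Soto's API).
protocol MultipartStore: AnyObject {
    func create(key: String) throws -> String                                  // returns an upload id
    func uploadPart(id: String, number: Int, data: [UInt8]) throws -> String  // returns a tag for the part
    func complete(id: String, tags: [String]) throws
    func abort(id: String) throws
}

struct PartFailure: Error { let number: Int }

// Records the calls it receives; can be told to fail on one part number.
final class InMemoryStore: MultipartStore {
    var failingPart: Int? = nil
    private(set) var log: [String] = []

    func create(key: String) throws -> String {
        log.append("create")
        return "upload-1"
    }
    func uploadPart(id: String, number: Int, data: [UInt8]) throws -> String {
        if number == failingPart { throw PartFailure(number: number) }
        log.append("part \(number)")
        return "tag-\(number)"
    }
    func complete(id: String, tags: [String]) throws { log.append("complete \(tags.count)") }
    func abort(id: String) throws { log.append("abort") }
}

// Stage 1: create. Stage 2: upload every part. Stage 3: complete.
// On any error the upload is aborted and the original error is rethrown.
func upload(_ parts: [[UInt8]], key: String, to store: MultipartStore) throws {
    let id = try store.create(key: key)
    do {
        var tags: [String] = []
        for (index, data) in parts.enumerated() {
            // S3 part numbers start at 1.
            tags.append(try store.uploadPart(id: id, number: index + 1, data: data))
        }
        try store.complete(id: id, tags: tags)
    } catch {
        try? store.abort(id: id)
        throw error
    }
}
```

The abort in the catch block corresponds to abortOnFail being true. Skipping it leaves the already uploaded parts in place so the upload can be resumed later.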
The function parameters for multipartUpload are as follows:
request is the request object you would create to call S3.CreateMultipartUpload.
partSize is the size of each part you upload. The minimum size for a part is 5MB.
filename is the full path to the file you want to upload.
abortOnFail is a flag indicating whether you want S3.AbortMultipartUpload to be called when a part upload fails. If you set this flag to false the abort function will not be called and the error S3ErrorType.multipart.abortedUpload(resumeRequest:error:) will be thrown. This holds an S3.ResumeMultipartUploadRequest object which can be used in the function S3.resumeMultipartUpload to resume the multipart upload. In this situation, if you do not call the resume function you should call S3.AbortMultipartUpload with the uploadId contained in the S3.ResumeMultipartUploadRequest to delete the parts you have already uploaded.
on indicates the EventLoop the upload should run on.
threadPoolProvider: The file loading requires a ThreadPool to run. You can either provide your own, or have the function create its own, which it will destroy once the function is complete.
progress is a closure that gets called after every part is loaded. It is called with a value between 0 and 1 indicating how far we are through the multipart upload. You can also use the progress closure as a way to cancel the upload, by throwing an error.
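The interaction between partSize and progress can be illustrated with a small simulation. This is a sketch only: simulateUpload and CancelledByUser are hypothetical names, no data is sent anywhere, and reporting progress as bytes uploaded divided by file size is an assumption rather than a statement of how Soto computes the value.

```swift
// Hypothetical error a caller might throw from the progress closure to cancel.
struct CancelledByUser: Error {}

// Splits `fileSize` bytes into parts of at most `partSize` bytes and calls
// `progress` after each simulated part. Returns the number of parts.
func simulateUpload(
    fileSize: Int,
    partSize: Int,
    progress: (Double) throws -> Void
) throws -> Int {
    precondition(fileSize > 0 && partSize > 0)
    let partCount = (fileSize + partSize - 1) / partSize   // ceiling division
    var uploaded = 0
    for _ in 1...partCount {
        uploaded += min(partSize, fileSize - uploaded)     // the last part may be shorter
        // An error thrown by the closure propagates out and stops the loop.
        try progress(Double(uploaded) / Double(fileSize))
    }
    return partCount
}

do {
    let mb = 1024 * 1024
    let parts = try simulateUpload(fileSize: 12 * mb, partSize: 5 * mb) { fraction in
        print("Progress \(fraction * 100)")
        // Cancel once more than half of the file has been handled.
        if fraction > 0.5 { throw CancelledByUser() }
    }
    print("uploaded \(parts) parts")
} catch {
    print("upload stopped: \(error)")
}
```

A 12MB file with 5MB parts is split into three parts, the last one being 2MB. Throwing from the closure ends the simulated upload early, which mirrors how the progress closure can be used to cancel.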
Resuming an upload
As mentioned above, if you call s3.multipartUpload(_:filename:abortOnFail:) with abortOnFail set to false, you can resume the upload if it fails. You can use the function resumeMultipartUpload(_:filename:) in the following manner. While not implemented here, you can also set abortOnFail to false again and resume the upload once more if the first resumeMultipartUpload(_:filename:) fails.
    let request = S3.CreateMultipartUploadRequest(bucket: name, key: name)
    do {
        // Keep the uploaded parts on failure so the upload can be resumed.
        return try await s3.multipartUpload(
            request,
            filename: filename,
            abortOnFail: false
        ) {
            print("Progress \($0 * 100)")
        }
    } catch S3ErrorType.MultipartError.abortedUpload(let resumeRequest, let error) {
        // resumeRequest describes the interrupted upload; error is the original failure.
        return try await s3.resumeMultipartUpload(resumeRequest, filename: filename) {
            print("Progress \($0 * 100)")
        }
    }
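Resuming more than once, as mentioned above, amounts to a loop around the resume call. The following is a minimal sketch of that loop using stand-in types, not Soto's API: ResumeToken plays the role of S3.ResumeMultipartUploadRequest, UploadError.aborted plays the role of the abortedUpload error, and uploadWithResumes, NetworkGlitch and the closures passed in are all hypothetical.

```swift
// Hypothetical stand-ins for the resume request and the aborted-upload error.
struct ResumeToken { let completedParts: Int }
enum UploadError: Error { case aborted(resume: ResumeToken, underlying: Error) }
struct NetworkGlitch: Error {}

// Runs `start` once, then calls `resume` up to `maxResumes` times while the
// upload keeps failing with a resumable error. Other errors propagate as-is.
func uploadWithResumes(
    maxResumes: Int,
    start: () throws -> String,
    resume: (ResumeToken) throws -> String
) throws -> String {
    var token: ResumeToken? = nil
    var resumesLeft = maxResumes
    while true {
        do {
            if let token = token { return try resume(token) }
            return try start()
        } catch UploadError.aborted(let next, let underlying) {
            // Out of attempts: rethrow with the latest token so the caller
            // can still abort the upload and delete the uploaded parts.
            guard resumesLeft > 0 else {
                throw UploadError.aborted(resume: next, underlying: underlying)
            }
            resumesLeft -= 1
            token = next
        }
    }
}
```

With Soto, the start closure would correspond to multipartUpload with abortOnFail set to false, and the resume closure to resumeMultipartUpload with abortOnFail set to false again. When the attempts run out, the remaining error still carries what is needed to call S3.AbortMultipartUpload.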