public interface RewriteManifests extends SnapshotUpdate<RewriteManifests>
This API accumulates manifest files, produces a new Snapshot of the table described only by the manifest files that were added, and commits that snapshot as the current.
This API can be used to rewrite matching manifests according to a clustering function as well as to replace specific manifests. Manifests that are deleted or added directly are ignored during the rewrite process. The set of active files in replaced manifests must be the same as in new manifests.
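The interplay between the rewrite predicate and the clustering function can be sketched in plain Java. This is only a simplified model under stated assumptions: `SimpleDataFile`, `SimpleManifest`, and `rewrite` are hypothetical stand-ins invented for illustration, not Iceberg classes, and the sketch ignores manifest size limits and splitting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical, simplified model for illustration; these are NOT Iceberg's classes.
class ManifestRewriteSketch {

    // Stand-in for a data file: a path plus a partition value.
    static final class SimpleDataFile {
        final String path;
        final String partition;

        SimpleDataFile(String path, String partition) {
            this.path = path;
            this.partition = partition;
        }
    }

    // Stand-in for a manifest: a name plus the data files it tracks.
    static final class SimpleManifest {
        final String name;
        final List<SimpleDataFile> files;

        SimpleManifest(String name, List<SimpleDataFile> files) {
            this.name = name;
            this.files = files;
        }
    }

    // Manifests rejected by the predicate are kept as-is; files from the
    // remaining manifests are regrouped so that each cluster key maps to
    // exactly one new manifest.
    static List<SimpleManifest> rewrite(
            List<SimpleManifest> current,
            Predicate<SimpleManifest> rewriteIf,
            Function<SimpleDataFile, Object> clusterBy) {
        List<SimpleManifest> result = new ArrayList<>();
        Map<Object, List<SimpleDataFile>> clusters = new LinkedHashMap<>();
        for (SimpleManifest manifest : current) {
            if (!rewriteIf.test(manifest)) {
                result.add(manifest); // not selected for rewrite
                continue;
            }
            for (SimpleDataFile file : manifest.files) {
                clusters.computeIfAbsent(clusterBy.apply(file), key -> new ArrayList<>()).add(file);
            }
        }
        int index = 0;
        for (List<SimpleDataFile> cluster : clusters.values()) {
            result.add(new SimpleManifest("rewritten-" + index++, cluster));
        }
        return result;
    }

    // Rewrites manifests with fewer than three files, clustering by partition.
    static List<SimpleManifest> demo() {
        List<SimpleManifest> current = List.of(
            new SimpleManifest("m1", List.of(
                new SimpleDataFile("a.parquet", "2024-01"),
                new SimpleDataFile("b.parquet", "2024-02"))),
            new SimpleManifest("m2", List.of(
                new SimpleDataFile("c.parquet", "2024-01"))),
            new SimpleManifest("m3", List.of(
                new SimpleDataFile("d.parquet", "2024-03"),
                new SimpleDataFile("e.parquet", "2024-03"),
                new SimpleDataFile("f.parquet", "2024-03"))));
        return rewrite(current, m -> m.files.size() < 3, f -> f.partition);
    }

    public static void main(String[] args) {
        for (SimpleManifest m : demo()) {
            System.out.println(m.name + " -> " + m.files.size() + " file(s)");
        }
    }
}
```

In this model `m3` is left untouched, while the files of `m1` and `m2` are regrouped into one manifest per partition value. The total set of data files is unchanged. Against a real table, the corresponding call chain would start from `table.rewriteManifests()`, assuming a `Table` handle named `table`.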
When committing, these changes will be applied to the latest table snapshot. Commit conflicts will be resolved by applying the changes to the new latest snapshot and reattempting the commit.
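The retry behavior described above follows the general shape of an optimistic-concurrency loop. The sketch below is a minimal illustration of that pattern, not Iceberg's implementation: `commitWithRetry`, the `AtomicReference` standing in for the table's current snapshot pointer, and the `maxAttempts` limit are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical sketch of optimistic commit with retry; not Iceberg code.
class OptimisticCommitSketch {

    // Re-applies the pending changes to whatever snapshot is latest and
    // tries to swap it in; if another writer got there first, start over.
    static <S> S commitWithRetry(
            AtomicReference<S> current, UnaryOperator<S> applyChanges, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            S base = current.get();               // latest snapshot at this moment
            S updated = applyChanges.apply(base); // changes applied on top of it
            if (current.compareAndSet(base, updated)) {
                return updated;                   // no conflicting commit happened
            }
        }
        throw new IllegalStateException("commit failed after " + maxAttempts + " attempts");
    }

    // Simulates one conflicting commit that lands during the first attempt.
    static String demo() {
        AtomicReference<String> current = new AtomicReference<>("s1");
        int[] calls = {0};
        return commitWithRetry(current, base -> {
            if (calls[0]++ == 0) {
                current.set("s2"); // simulated concurrent writer
            }
            return base + "+rewrite";
        }, 3);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo the first attempt is built on `s1` and is rejected because the pointer has moved to `s2`; the second attempt is built on `s2` and succeeds.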
Modifier and Type | Method and Description |
---|---|
RewriteManifests | addManifest(ManifestFile manifest): Adds a manifest file to the table. |
RewriteManifests | clusterBy(java.util.function.Function<DataFile,java.lang.Object> func): Groups an existing DataFile by a cluster key produced by a function. |
RewriteManifests | deleteManifest(ManifestFile manifest): Deletes a manifest file from the table. |
RewriteManifests | rewriteIf(java.util.function.Predicate<ManifestFile> predicate): Determines which existing ManifestFile for the table should be rewritten. |
Methods inherited from interface SnapshotUpdate: deleteWith, scanManifestsWith, set, stageOnly, toBranch
Methods inherited from interface PendingUpdate: apply, commit, updateEvent
RewriteManifests clusterBy(java.util.function.Function<DataFile,java.lang.Object> func)
Groups an existing DataFile by a cluster key produced by a function. The cluster key determines which manifest a data file is associated with. All data files with the same cluster key will be written to the same manifest (unless that manifest is large and is split into multiple files). Manifests deleted via deleteManifest(ManifestFile) or added via addManifest(ManifestFile) are ignored during the rewrite process.
func - Function used to cluster data files to manifests.
RewriteManifests rewriteIf(java.util.function.Predicate<ManifestFile> predicate)
Determines which existing ManifestFile for the table should be rewritten. Manifests that do not match the predicate are kept as-is. If this method is not called and no predicate is set, then all manifests will be rewritten.
predicate - Predicate used to determine which manifests to rewrite. If it returns true, the manifest file will be included for rewrite. If it returns false, the manifest is kept as-is.
RewriteManifests deleteManifest(ManifestFile manifest)
Deletes a manifest file from the table.
manifest - a manifest to delete
RewriteManifests addManifest(ManifestFile manifest)
Adds a manifest file to the table. The added manifest cannot contain new or deleted files.
By default, the manifest will be rewritten to ensure all entries have explicit snapshot IDs. In that case, it is always the responsibility of the caller to manage the lifecycle of the original manifest.
If manifest entries are allowed to inherit the snapshot ID assigned on commit, the manifest should never be deleted manually if the commit succeeds as it will become part of the table metadata and will be cleaned up on expiry. If the manifest gets merged with others while preparing a new snapshot, it will be deleted automatically if this operation is successful. If the commit fails, the manifest will never be deleted and it is up to the caller whether to delete or reuse it.
manifest - a manifest to add
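The requirement that replaced and new manifests cover the same set of active files can be pictured as a simple set comparison. The following is a minimal sketch of that invariant only; it makes no claim about how Iceberg validates it internally. Each manifest is modeled as a plain list of data file paths, and `activeFiles` and `sameActiveFiles` are hypothetical helper names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical check of the "same active files" invariant; not Iceberg code.
class ReplaceManifestsCheckSketch {

    // Collects the data file paths referenced by a group of manifests.
    static Set<String> activeFiles(List<List<String>> manifests) {
        Set<String> files = new HashSet<>();
        for (List<String> manifest : manifests) {
            files.addAll(manifest);
        }
        return files;
    }

    // True when the deleted and added manifests reference exactly the same files.
    static boolean sameActiveFiles(List<List<String>> deleted, List<List<String>> added) {
        return activeFiles(deleted).equals(activeFiles(added));
    }

    public static void main(String[] args) {
        List<List<String>> deleted = List.of(List.of("a.parquet", "b.parquet"), List.of("c.parquet"));
        List<List<String>> merged = List.of(List.of("a.parquet", "b.parquet", "c.parquet"));
        List<List<String>> incomplete = List.of(List.of("a.parquet", "b.parquet"));
        System.out.println(sameActiveFiles(deleted, merged));     // same files, one manifest
        System.out.println(sameActiveFiles(deleted, incomplete)); // c.parquet is missing
    }
}
```

Replacing two manifests with a single merged one passes the check, while a replacement that drops a file does not.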