public interface DeletedDataFileScanTask extends ChangelogScanTask, ContentScanTask<DataFile>
A scan task for deletes generated by removing a data file from the table.
Note that all historical delete files added earlier must be applied while reading the data file. This ensures the task outputs only those data records that were still live when the data file was removed.
Suppose snapshot S1 contains data files F1, F2, F3. Then snapshot S2 adds a position delete file, D1, that deletes records from F2, and snapshot S3 removes F2 entirely. A scan for changes generated by S3 should include the following task:
DeletedDataFileScanTask(file=F2, existing-deletes=[D1], snapshot=S3)
Readers consuming these tasks should produce deleted records with metadata like change ordinal and commit snapshot ID.
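The reader-side behavior described above can be sketched with a simplified model. This is only an illustration, not the Iceberg reader: `DeletedDataFileReplay`, `DeletedRow`, and `replay` are hypothetical names, the data file is modeled as a list of strings indexed by row position, and D1 is modeled as a set of deleted positions. The change ordinal of 0 assumes the scan covers only S3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these types are Iceberg classes.
public class DeletedDataFileReplay {

  /** A deleted record annotated with changelog metadata. */
  static final class DeletedRow {
    final String value;
    final int changeOrdinal;
    final long commitSnapshotId;

    DeletedRow(String value, int changeOrdinal, long commitSnapshotId) {
      this.value = value;
      this.changeOrdinal = changeOrdinal;
      this.commitSnapshotId = commitSnapshotId;
    }

    @Override
    public String toString() {
      return "DELETE " + value + " (ordinal=" + changeOrdinal
          + ", snapshot=" + commitSnapshotId + ")";
    }
  }

  /**
   * Emits one DELETE row for every record that was still live when the
   * data file was removed. Positions listed in deletedPositions were
   * already removed by earlier delete files and must be skipped.
   */
  static List<DeletedRow> replay(
      List<String> rows,
      Set<Integer> deletedPositions,
      int changeOrdinal,
      long commitSnapshotId) {
    List<DeletedRow> out = new ArrayList<>();
    for (int pos = 0; pos < rows.size(); pos++) {
      if (deletedPositions.contains(pos)) {
        continue; // already deleted by an earlier delete file (D1 in S2)
      }
      out.add(new DeletedRow(rows.get(pos), changeOrdinal, commitSnapshotId));
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> f2 = List.of("a", "b", "c"); // contents of F2 by position
    Set<Integer> d1 = Set.of(1);              // D1 deleted position 1 in S2
    // S3 removes F2: only "a" and "c" are reported as deleted.
    for (DeletedRow row : replay(f2, d1, 0, 3L)) {
      System.out.println(row);
    }
  }
}
```

Skipping position 1 is the point of the example: the record "b" was already deleted in S2, so reporting it again as a change of S3 would be incorrect.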
Modifier and Type | Method and Description |
---|---|
java.util.List<DeleteFile> | existingDeletes(): A list of previously added delete files to apply when reading the data file in this task. |
default int | filesCount(): The number of files that will be opened by this scan task. |
default ChangelogOperation | operation(): Returns the type of changes produced by this task. |
default long | sizeBytes(): The number of bytes that should be read by this scan task. |
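Methods inherited from interface ChangelogScanTask: changeOrdinal, commitSnapshotId
Methods inherited from interface ContentScanTask: estimatedRowsCount, file, length, partition, residual, start
Methods inherited from interface PartitionScanTask: spec
Methods inherited from interface ScanTask: asCombinedScanTask, asDataTask, asFileScanTask, isDataTask, isFileScanTask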
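java.util.List<DeleteFile> existingDeletes()
A list of previously added delete files to apply when reading the data file in this task.

default ChangelogOperation operation()
Description copied from interface: ChangelogScanTask
Returns the type of changes produced by this task.
Specified by: operation in interface ChangelogScanTask

default long sizeBytes()
Description copied from interface: ScanTask
The number of bytes that should be read by this scan task.
Specified by: sizeBytes in interface ContentScanTask<DataFile>
Specified by: sizeBytes in interface ScanTask

default int filesCount()
Description copied from interface: ScanTask
The number of files that will be opened by this scan task.
Specified by: filesCount in interface ScanTask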
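
The descriptions of filesCount() and sizeBytes() do not spell out how the defaults are computed. One plausible reading, since the task opens the removed data file and every existing delete file, is sketched below. This is an assumption for illustration rather than the documented implementation, and `TaskCostSketch` and its parameters are hypothetical stand-ins, not Iceberg types.

```java
import java.util.List;

// Hypothetical stand-in for the task's cost estimates; not the Iceberg interface.
public class TaskCostSketch {

  /** Assumed file count: the removed data file plus each earlier delete file. */
  static int filesCount(int existingDeleteCount) {
    return 1 + existingDeleteCount;
  }

  /**
   * Assumed byte count: the scanned length of the data file plus the
   * full size of every delete file that must be applied to it.
   */
  static long sizeBytes(long dataLength, List<Long> deleteFileSizes) {
    long total = dataLength;
    for (long size : deleteFileSizes) {
      total += size;
    }
    return total;
  }

  public static void main(String[] args) {
    // F2 with 1000 bytes to scan, plus one 200-byte delete file D1.
    System.out.println(filesCount(1));                // prints 2
    System.out.println(sizeBytes(1000L, List.of(200L))); // prints 1200
  }
}
```

Under this reading, a planner that balances work by sizeBytes() would account for the cost of applying historical deletes, not just the cost of reading the data file.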