Background Sweep¶
How Background Sweep Works¶
The Background Sweep Job works by sweeping one table at a time. The Background Sweep Job determines which table to sweep by estimating which would be most beneficial based on I/O activity and frequency, considering the following criteria:
- The number of cells written to a table since it was last swept.
- The number of cells that were deleted the last time a table was swept.
- The number of cells that were not deleted the last time a table was swept.
- The amount of time that has passed since the it was last swept.
Configuration¶
The background sweeper can be disabled by setting the enabled
property in the Sweep Config block in the AtlasDB Runtime Config block to false.
Metrics¶
We now expose Dropwizard metrics to allow easier tracking of the background sweeper’s actions. For more information, see Dropwizard Metrics.
Additional logging for Background Sweep¶
By default, the background sweeper only logs errors. If you’d like to watch the background sweeper’s progress, add the following in atlasdb.log.properties
:
#-------------------------------------------------------------------------------- # Sweep Logging #-------------------------------------------------------------------------------- # enable background sweep logging by setting this to 'debug'; for less verbose logging, use 'error'. log4j.logger.com.palantir.atlasdb.sweep.BackgroundSweeperImpl=debug, sweepAppend log4j.logger.com.palantir.atlasdb.sweep.SpecificTableSweeper=debug, sweepAppend log4j.logger.com.palantir.atlasdb.sweep.SweepTaskRunner=debug, sweepAppend log4j.logger.com.palantir.atlasdb.sweep.CellSweeper=debug, sweepAppend # set additivity to false to make these logs only show up in background-sweeper.log log4j.additivity.com.palantir.atlasdb.sweep.BackgroundSweeperImpl=false # configure a basic file appender log4j.appender.sweepAppend.layout=com.palantir.monitoring.logging.log4j.PalantirPatternLayout log4j.appender.sweepAppend.layout.ConversionPattern=%m%n log4j.appender.sweepAppend=com.palantir.util.logging.ArchivedDailyRollingFileAppender log4j.appender.sweepAppend.threshold=debug log4j.appender.sweepAppend.file=log/background-sweeper.log log4j.appender.sweepAppend.datePattern='.'yyyy-MM-dd log4j.appender.sweepAppend.MaxRollFileCount=90
This will create a log file log/background-sweeper.log
where sweep information will be logged.
Querying the Sweep Metadata Tables¶
You can also query the Atlas table sweep.progress
using Atlas Console.
sweep.progress
contains at most a single row detailing information for the current table the background sweeper is sweeping.
You can also query sweep.priority
to get a breakdown per table of:
write_count
- Approximate number of writes to this table since the last time it was swept.last_sweep_time
- Wall clock time the last time this table was swept.cells_deleted
- The numbers of stale values deleted last time this table was swept.cells_examined
- The number of cell-timestamp pairs in the table total the last time this table was swept.
Note
If one investigates the sweep.priority
table, one may find information regarding the number of cell-timestamp
pairs examined or deleted on earlier runs as well. However, the sweep.priority
table itself can be swept, thus
the table may not contain information concerning all historical runs of sweep.