Changelog

Note

For versions of AtlasDB above v0.151.1, please refer to GitHub’s releases page for release notes. This page is kept around as a document of changes up to v0.151.1 inclusive.

v0.151.1

2 Jul 2019

Identical to v0.151.0. This version was released owing to deployment infrastructure issues with v0.151.0.

v0.151.0

28 Jun 2019

Type	Change
IMPROVED	Upgraded gradle-baseline to improve compile time static analysis checks. (Pull Request)
DEV BREAK	`KeyValueService` implementations now have a new endpoint `deleteRows()` that allows row-level deletes to be performed. Users who extend `KeyValueService` will need to implement this method, but can refer to `AbstractKeyValueService` for a functional implementation (as this method can be written in terms of `deleteRange()`). (Pull Request)
IMPROVED	Targeted Sweep now supports reading write information and dedicated row information from multiple partitions at a time; the same overall sweep batch limit is still respected. This is expected to be particularly relevant at installations where write information is sparsely distributed over many partitions (e.g. transactions2 deployments, especially such deployments with thorough tables). (Pull Request)

v0.150.0

20 Jun 2019

NEW	Added ability for timelock to rate limit targeted sweep lock requests to 2 per second. This is to reduce load on timelock for bad atlas clients. (Pull Request)
IMPROVED	Relaxed concurrency model of `MetricsManager` allowing for more concurrency. (Pull Request)

v0.149.0

11 Jun 2019

Type	Change
IMPROVED	Added runtime configurable behavior to targeted sweep to allow multiple consecutive iterations on the same targeted sweep shard, all while holding the relevant lock. This option should improve targeted sweep throughput without adding additional load on Timelock for places where the downtime between targeted sweep iterations is a bottleneck. (Pull Request)

v0.148.0

11 Jun 2019

Type	Change
FIXED	The default value for the length of pause between targeted sweep iterations, `pauseMillis`, has been changed back to 500ms. (Pull Request)

v0.147.0

10 Jun 2019

Type	Change
DEV BREAK	Cassandra clients are now required to supply credentials in the KVS config. (Pull Request)

v0.146.0

7 Jun 2019

Type

Change

IMPROVED

Async initialization callbacks are run with an instrumented TransactionManager. (Pull Request)

IMPROVED

If using DbKVS or JDBC KVS, we no longer spin up a background thread attempting to install new transaction schema versions. Furthermore, we now log at WARN if configuration indicates that a new schema version should be installed in these cases where it won’t be respected. Previously, we would always read to and write from transactions1, even though the installer might actually have installed a non-1 schema version (and logged that this happened!). (Pull Request)

v0.145.0

4 Jun 2019

Type	Change
FIXED	Fixed a bug causing connection leaks to timelock. (Pull Request)
IMPROVED	The default value for the length of pause between targeted sweep iterations, `pauseMillis`, has been reduced to 50ms. (Pull Request)
IMPROVED	Changed the default values in `PaxosRuntimeConfiguration` as (#3943) changed it on test configs only. `leader-ping-response-wait-in-ms` was reduced to 2000 ms from 5000 ms. `maximum-wait-before-proposal-in-ms` was reduced to 300 ms from 1000 ms. `ping-rate-in-ms` was reduced to 50 ms from 5000 ms. These settings have empirically improved the performance of timelock when the leader node goes down without negatively affecting stability. (Pull Request)
IMPROVED	Clients now log at most once every 10 seconds when a `TimelockService` level request is slow. Please see the JavaDocs of `ProfilingTimelockService` for more information. (Pull Request)

v0.144.0

20 May 2019

Type	Change
DEV BREAK	`AutoDelegate` only works on interfaces now. (Pull Request)
FIXED	Fixed a bug in `TransactionManagers` introduced by a recently added caching layer, causing NPE’s. (Pull Request)
IMPROVED	The pause time between iterations of targeted sweep for each background thread is now configurable by the targeted sweep runtime configuration `pauseMillis`. The default value has also changed from 5000 milliseconds to 500. (Pull Request)

v0.143.1

16 May 2019

Type	Change
FIXED	`InsufficientConsistencyException` and `NoSuchElementException` will now not cause nodes to be blacklisted from the Cassandra client pool. Previously this could happen - even though these exceptions are not reflective of the individual node in question being unable to service requests. (Pull Request)
FIXED	Removed the DB username from the Hikari connection pool name. (Pull Request)
FIXED	Fixed incorrect delegation that took place in AutoDelegate transaction managers. (Pull Request)

v0.143.0

16 May 2019

Type	Change
IMPROVED DEPRECATED	Replaced all usages of Guava Supplier by Java Supplier. Please note that while Guava Supplier endpoints may still exist, they will be removed in a future release. (Pull Request 1, Pull Request 2 and Pull Request 3)
FIXED	Coordination service metrics no longer throw `NullPointerException` when attempting to read the metric value before reading anything from the coordination store. (Pull Request)
FIXED	AtlasDB now maintains a finite length for the delete executor’s work queue, to avoid OOMs on services with high conflict rates for transactions. In the event the queue length is reached, we will not proactively schedule cleanup of values written by a transaction that was rolled back. Note that it is not essential that these deletes are carried out immediately, as targeted sweep will eventually clear them out. Previously, this queue was unbounded, meaning that service nodes could end up using lots of memory. (Pull Request)
IMPROVED	The coordination store now retries when reading a value at a given sequence number that no longer exists (as opposed to throwing). This is necessary for supporting cleanup of the coordination store. Note that if one is performing rolling upgrades to a version that sweeps the coordination store, one MUST upgrade from at least this version. (Pull Request)

v0.142.2

14 May 2019

This version is equivalent to v0.142.0, but was re-tagged because of publishing issues.

v0.142.1

14 May 2019

This version is equivalent to v0.142.0, but was re-tagged because of publishing issues.

v0.142.0

14 May 2019

Type	Change
IMPROVED	The default configuration for the number of targeted sweep shards has been increased to 8. This enables us to increase the speed of targeted sweep if processing the queue starts falling behind. Previously, we could only increase the speed of processing future entries, as we cannot sweep entries with higher parallelism than the number of shards active when the writes were made. (Pull Request)
IMPROVED	AtlasDB now throws an `IllegalArgumentException` when attempting to create a column range selection that is invalid (has end before start). Previously, exceptions were thrown from the underlying KVS, but these were implementation-dependent. (Pull Request)
DEV BREAK	`AtlasDbHttpClients.createProxyWithFailover()` now requires `UserAgent` parameter. (Pull Request)

v0.141.0

9 May 2019

Type	Change
NEW	`TransactionManagers` now has a new builder option `lockImmutableTsOnReadOnlyTransactions()`. If it is set to `true` all transactions (including read-only ones) will grab immutable ts lock, enabling migrating to thorough sweep without downtime. Please contact the AtlasDB team before using this feature. (Pull Request)
IMPROVED	The Timelock Availability Health check should not timeout if we can’t reach other nodes. This should stop the health check firing erroneously. (Pull Request)
NEW	Setting `lockImmutableTsOnReadOnlyTransactions()` to `true` disables background sweep. This aims to prevent Cassandra load caused by Conservative to Thorough sweep migration. (Pull Request)

v0.140.0

8 May 2019

Type	Change
METRICS IMPROVED	Client side tombstone filtering is now instrumented more exhaustively. (Pull Request)
DEV BREAK	`TimelockDeprecatedConfig` and `TimeLockServerConfiguration` have been removed. Note that these configurations were only used for the dropwizard timelock server, which should only be used in tests. Now, the dropwizard server launcher uses a similar setup to the use in production, forcing the use of client request limits and lock time limiter. Note that this requires converting your existing `TimeLockServerConfiguration` to a `CombinedTimeLockServerConfiguration`. For an example of this conversion, refer to the PR below. (Pull Request)
IMPROVED	Changed the default values in `PaxosConfiguration`. `leader-ping-response-wait-in-ms` was reduced to 2000 ms from 5000 ms. `maximum-wait-before-proposal-in-ms` was reduced to 300 ms from 1000 ms. `ping-rate-in-ms` was reduced to 50 ms from 5000 ms. These settings have empirically improved the performance of timelock when the leader node goes down without negatively affecting stability. (Pull Request)

v0.139.0

30 Apr 2019

Type	Change
METRICS CHANGED	All instrumentation AtlasDB metrics now use a `SlidingTimeWindowArrayReservoir`. Previously, they used an exponentially decaying reservoir. (Pull Request)

v0.138.0

25 Apr 2019

Type	Change
USER BREAK	AtlasDB Cassandra KVS now depends on rescue 4.4.0 (was previously 3.22.0). (Pull Request)

v0.137.0

25 Apr 2019

Type	Change
FIXED	Coordination service now checks for semantic equality of `VersionedInternalSchemaMetadata` payloads as opposed to byte equality when deciding whether to reuse an existing value agreed on. Previously, using byte equality meant that multi-node clusters could end up spuriously writing the stored value many times, causing unnecessarily wide rows. (Pull Request)

v0.136.1

23 Apr 2019

This release is equivalent to v0.136.0 but was re-tagged due to a publishing issue.

v0.136.0

23 Apr 2019

Type	Change
IMPROVED DEV BREAK	Usage metrics for the coordination store have been added. Users should provide a MetricsRegistry when creating their coordination services. Also, `CoordinationService.createDefault()` now handles instrumentation of both the coordination service and store. (Pull Request)
FIXED	lock-api now declares a minimum dependency on timelock-server 0.59.0. (Pull Request)

v0.135.0

19 Apr 2019

Type	Change
IMPROVED	Coordination service now only initiates one request to perpetuate the bound forward at a time. This should avoid unnecessarily many CAS operations taking place when we need to do this. (Pull Request)

v0.134.0

18 Apr 2019

Type	Change
FIXED	We now close Cassandra clients properly when verifying that one’s Cassandra configuration makes sense. (Pull Request)

v0.133.0

16 Apr 2019

Type	Change
IMPROVED	AtlasDB now logs diagnostic information about usage of classes that utilise smart batching (e.g. when starting transactions, verifying leadership, _transactions2 put-unless-exists, etc.). (Pull Request)

v0.132.0

11 Apr 2019

Type

Change

FIXED DEV BREAK

Stop memoizing the Supplier of TimestampService, as we must get a fresh instance on each Supplier.get() call to ensure correctness after leadership elections. Without it, there is a possibility of data corruption if you are running atlas with a leader block in a multi-node configuration. Services using External Timelock, Embedded or Leader with 1 node will not be affected. Dev break to force AtlasDbFactory and ServiceDiscoveringAtlasSupplier to return ManagedTimestampService which unifies TimestampService and TimestampManagementService. (Pull Request)

IMPROVED

Removed unnecessary memory allocations in the lock refresher, and in several other classes, by using Lists.partition(…) instead of Iterables.partition(…). (Pull Request)

v0.131.0

9 Apr 2019

Type	Change
FIXED	Cassandra client input and output transports are now properly closed. (Pull Request)

v0.130.0

4 Apr 2019

Type	Change
NEW	AtlasDB now supports _transactions2 if backed by Cassandra KVS or In-Memory KVS. This is expected to improve transaction performance by making `putUnlessExists` faster, and increase stability by avoiding hotspotting of the transactions table in Cassandra. This is a beta feature; please contact the AtlasDB team if you are interested to use _transactions2. (Many PRs; key PRs include Pull Request 1, Pull Request 2, Pull Request 3, Pull Request 4)
FIXED	`putUnlessExists` in Cassandra KVS now produces correct cell names when failing with a `KeyAlreadyExistsException`. Previously, Cassandra KVS used to produce incorrect cell names (that were the concatenation of the correct cell name and an encoding of the AtlasDB timestamp). (Pull Request)
NEW	A new configuration option `lockImmutableTsOnReadOnlyTransactions` is added under `atlas-runtime.transaction`. Default value for this flag is `false`, and setting it to `true` enables running read-only transactions on thorough sweep tables; but introduces a perf overhead to read-only transactions on conservative sweep tables. This is an experimental feature, please do not change the default value for this flag without talking to AtlasDB team. (Pull Request)
FIXED	`WriteBatchingTransactionService` now tries requests for duplicated timestamps in subsequent batches. Previously, we would immediately throw a `SafeIllegalStateException` when seeing a duplicate in a single batch, which was causing unnecessary failures in transactions which could handle the `KeyAlreadyExistsException` safely. (Pull Request)
FIXED	Fixed a rare situation in which interrupting a thread could possibly leave dangling locks. (Pull Request)
FIXED	Coordination services now only perpetuate an existing value on value-preserving transformations if the existing bound is invalid at a fresh sequence number. Previously, we would perpetuate the bound regardless, meaning that when the bound is crossed in a multi-threaded environment, each in-flight transaction that tries to determine its transaction schema version will independently attempt to perpetuate the bound. This may lead to multiple unnecessary updates to the coordinated value in a short space of time. Note that updates that do change the value will be applied regardless, and could potentially still race if applied in parallel. (Pull Request)
IMPROVED	`LockRefresher` now logs at INFO when locks cannot be refreshed in that the server does not indicate that they were refreshed, along with a sample of the lock tokens involved. (Pull Request)
IMPROVED	Reduced dependency footprint by replacing dependency on groovy-all with dependencies on groovy, groovy-groovysh, and groovy-json. (Pull Request)

v0.129.0

28 Mar 2019

Type	Change
FIXED	Oracle KVS now deletes old entries correctly if using targeted sweep. Previously, there were situations where it would not delete values that could safely be deleted. (Pull Request)
IMPROVED	The Cassandra KVS `CellLoader` now supports cross-column batching for requests which query a variety of columns for a few rows. Previously, we would make separate requests for each of these columns in parallel, creating additional load on Cassandra. Internal benchmarks reflect a 4-5x improvement in read p99s for such workflows (e.g. small numbers of rows with static columns, or rows with dynamic columns when the column key is varied and known in advance). (Pull Request)
IMPROVED	Concurrent calls to TimelockService.startIdentifiedAtlasDbTransaction() now coalesced into a single Timelock rpc to reduce load on Timelock. (Pull Request)
DEV BREAK	RemoteTimelockServiceAdapter is now closeable. Users of this class should invoke close() before termination to avoid thread leaks. (Pull Request)
USER BREAK FIXED	AtlasDB Cassandra KVS now depends on sls-cassandra 3.31.0 (was 3.31.0-rc3). We do not want to stay on an RC version now that a full release is available. Note that this means that you must use this version of the sls-cassandra server if you want to use Cassandra KVS. (Pull Request)

v0.128.0

27 Mar 2019

Type	Change
USER BREAK	AtlasDB Cassandra KVS now depends on sls-cassandra 3.31.0-rc3 (was 3.27.0). This version of Cassandra KVS supports a `multiget_multislice` operation which retrieves different columns across different rows in a single query. Note that this means that you must use this version of the sls-cassandra server if you want to use Cassandra KVS. (Pull Request)
FIXED DEV BREAK	Callbacks specified in TransactionManagers will no longer be run synchronously when `initializeAsync` is set to true, even if initialization succeeds in the first, synchronous attempt. Previously, we would attempt to run the callbacks synchronously when synchronous initialization succeeds, but this prevented use cases where the callback must block until an external resource is available. Consequently, even if the initialization of a transaction manager created with asynchronous initialization succeeds synchronously, readiness of the returned object must be checked because transaction managers are not ready to be used until callbacks successfully run. (Pull Request)
CHANGED	Postgres 9.5.2+ requirement temporarily rescinded. (Pull Request)
LOGS	Added extra debug/trace logging to log the state of the Cassandra pool / application when running into cassandra pool exhaustion errors. (Pull Request)

v0.127.0

25 Mar 2019

Type	Change
FIXED	Fixed an issue where the `transformAgreedValue` of the `KeyValueServiceCoordinationStore` would throw an NPE when check and set fails on KVSs that do not support detail on CAS failure (DbKvs). (Pull Request)
FIXED USER BREAK	Background Sweep will now continue to prioritise tables accordingly, if writes to the sweep queue are enabled but targeted sweep is disabled on startup. Previously, Background Sweep would not prioritise new writes for sweeping if writes to the sweep queue were enabled. (Pull Request)
CHANGED IMPROVED	We’ve rolled back the change from 0.117.0 that introduces an extra delay after leader election as we are no longer pursuing leadership leases. (Pull Request)
IMPROVED DEV BREAK	AtlasDbHttpClients, FeignOkHttpClients and AtlasDbFeignTargetFactory are refactored to get rid of deprecated methods and overused overloads. (Pull Request)

v0.126.0

18 Mar 2019

Type	Change
CHANGED USER BREAK	Removed functionality for marking tables as deprecated as part of the schema definition and automatically dropping deprecated tables on startup. (Pull Request)
IMPROVED	Improved the startup check that verifies the correctness of the timestamp source to impose tighter constraints. Now uses a recent value from the puncher store rather than the unreadable timestamp. (Pull Request)
FIXED	`KeyValueService` and `CassandraKeyValueService` in particular now has tighter consistency guarantees in the presence of failures. Previously, inconsistent deletes to thoroughly swept tables could result in readers serving stale versions of cells. (Pull Request)
DEV BREAK	The contract of `deleteAllTimestamps` has been strengthened, and the default implementation has been removed. Please contact the AtlasDB team if you think this affects your workflows. (Pull Request)
FIXED	Fixed a bug in `PaxosQuorumChecker` causing a new timelock leader to block for 5 seconds before being able to serve requests if another node was unreachable. (Pull Request)

v0.125.0

07 Mar 2019

Type	Change
IMPROVED	`CassandraKeyValueService` now exposes a lightweight method for obtaining row keys. If you believe you need to use this method, you should reach out to the AtlasDB team first to assess your options. (Pull Request)
CHANGED USER BREAK	The minimum Postgres version is now 9.5.2 (Pull Request)
FIXED	Some race conditions in `TableRemappingKeyValueService` and `KvTableMappingService` have been fixed. Previously, it was possible to run into unexpected instances of `NullPointerException` and `IllegalStateException` when reading from tables, even when other (completely disjoint) sets of tables were created or dropped. It is likely that there remain more bugs here, though we have fixed several more egregious ones. (Pull Request)

v0.122.0

22 Feb 2019

Type	Change
IMPROVED DEV BREAK	Clients talking to Timelock will now throw instead of making a request with a payload larger than 50MB. This addresses several internal issues concerning Timelock stability. This is a devbreak in several AtlasDB utility classes used to create clients, where an additional boolean parameter has been added controlling whether its requests should be limited. (Pull Request)
IMPROVED	Timelock clients now use leased lock tokens to reduce number of RPC’s to Timelock server, and improve transaction performance. (Pull Request)
DEV BREAK	startIdentifiedAtlasDbTransaction() and lockImmutableTimestamp() now being called without an IdentifiedTimeLockRequest parameter. (Pull Request)

v0.121.0

21 Feb 2019

Type	Change
IMPROVED	We now use jetty-alpn-agent 2.0.9 (Pull Request)
IMPROVED DEV BREAK	All usage of remoting-api and remoting3 have been replaced by their equivalents in com.palantir.tracing, com.palantir.conjure.java.api, and com.palantir.conjure.java.runtime. (Pull Request)

v0.120.0

19 Feb 2019

Type	Change
DEV BREAK	The deprecated startAtlasDbTransaction() method is removed from TimelockService. (Pull Request)
DEV BREAK	startIdentifiedAtlasDbTransaction now accepts IdentifiedTimeLockRequest as a parameter rather than StartIdentifiedAtlasDbTransactionRequest. Moving the requestorId information to TimelockClient from the caller. (Pull Request)
FIXED	`FailoverFeignTarget` now retries correctly if calls to individual nodes take a long time and eventually fail with an exception. Previously, we could fail out without having tried all nodes under certain circumstances, even when there existed a node that could legitimately service a request. (Pull Request)
FIXED	Fixed cases where column range scans could result in NullPointerExceptions when there were concurrent writes to the same range. (Pull Request)

v0.119.0

13 Feb 2019

Type	Change
CHANGED IMPROVED	TimeLock will now no longer create its high level paxos directory at configuration de-serialization time. Instead it waits until creating each individual learner or acceptor log directory, allowing timelock to rely more accurately on directory existence as a proxy for said timelock node being new or not. (Pull Request)

v0.118.0

8 Feb 2019

Type	Change
IMPROVED DEV BREAK	AtlasDB Cassandra KVS now depends on `com.palantir.cassandra` instead of `org.apache.cassandra`. This version of Cassandra thrift client supports a `put_unless_exists` operation that can update multiple columns in the same row simultaneously. The Cassandra KVS putUnlessExists method has been updated to use the above call. Note that this means that you must use sls-cassandra server 3.27.0 if you want to use Cassandra KVS. (Pull Request)
DEV BREAK IMPROVED	The TableMetadata class has been refactored to use Immutables. (Pull Request)
NEW METRICS	Transaction services now expose timer metrics indicating how long committing or getting values takes. (Pull Request)
FIXED	Entries with the same value in adjacent ranges in a timestamp partitioning map will now be properly coalesced, and for the purposes of coordination will not be written as new values. Previously, these were stored as separate entries, meaning that unnecessary values may have been written to the coordination store; this does not affect correctness, but is unperformant. (Pull Request)
IMPROVED	AtlasDB now allows you to enable a new transaction retry strategy with exponential backoff via configs. (Pull Request)

v0.117.0

28 Jan 2019

Type	Change
CHANGED	Timelock service no longer supports synchronous lock endpoints. Users who explicitly stated timelock to use synchronous resources by setting install.asyncLock.useAsyncLockService to false (default is true) should migrate to AsyncLockService before taking this upgrade. (Pull Request)
DEV BREAK	Key value services now require their `CheckAndSetCompatibility` to be specified. Refer to the contract of `KeyValueService#getCheckAndSetCompatibility` and the `CheckAndSetCompatibility` enum class to guide this decision. Please be very careful if you are explicitly setting this to `CheckAndSetCompatibility.SUPPORTED_DETAIL_ON_FAILURE`. (Pull Request)
IMPROVED	AtlasDB now has an extra delay after leader elections; this lays the groundwork for leadership leases. (Pull Request)
IMPROVED	We now correctly handle host restart in the clock skew monitor. (Pull Request)

v0.116.1

20 Dec 2018

Type	Change
FIXED	The completion service in the Paxos leader election service should be more resilient to individual nodes being slow. Previously, if one individual node had a full thread pool, the service would throw a `RejectedExecutionException` even if other nodes were able to service the request. (Pull Request)

v0.116.0

14 Dec 2018

Type	Change
NEW	AtlasDB now writes to the _coordination table, a new table which is used to coordinate changes to schema metadata internal to AtlasDB across a multi-node cluster. Services which want to adopt _transactions2 will need to go through this version, to ensure that nodes are able to reach a consensus on when to switch the transaction schema version forwards. (Pull Request)
FIXED	AtlasDB transaction services no longer throw exceptions when performing Thorough Sweep on tables with sentinels. Previously, the services would throw when trying to delete the sentinel, meaning that Background and Targeted Sweep would become stuck if sweeping thorough tables that used to be conservative, or tables that had undergone hard delete via the scrubber. (Pull Request)
CHANGED DEV BREAK	AtlasDB transaction services now no longer support negative timestamps. Users are unlikely to be affected, since using transaction services with negative timestamps was already broken in the past owing to the use of negative numbers for special values (like sentinels or a marker meaning that a transaction was rolled back). (Pull Request)
DEV BREAK	With the introduction of _coordination, creation of `TransactionService` now requires a `CoordinationService<InternalSchemaMetadata>`. Users may create a `CoordinationService` via the `CoordinationServices` factory, if needed, or retrieve it from the relevant `TransactionManager`. Generally speaking, `TransactionService` should not be directly used by standard AtlasDB consumers; abusing it can result in SEVERE DATA CORRUPTION. (Pull Request 1 and Pull Request 2)
DEV BREAK	Transaction Managers now expose a `getTransactionService()` method. Users with custom subclasses of `TransactionManager` will need to implement this. (Pull Request)
NEW METRICS	With the introduction of _coordination, we expose new metrics indicating the point (in logical timestamps) till which the coordination service knows what has been agreed, as well as the transactions schema version that will eventually be applied. (Pull Request)
FIXED USER BREAK	Cassandra KVS getMetadataForTables method now returns a map where table reference keys have capitalisation matching the table names in Cassandra. Previously there was no strict guarantee on the keys’ capitalisation, but it was in most cases all lowercase. (Pull Request)
FIXED	Cassandra KVS getMetadataForTables method now does not contain entries for tables that do not exist in Cassandra. Previously, when a table was dropped, an empty byte array would be written into the _metadata table to mark it as deleted. Now, we delete all rows of the _metadata table containing entries pertaining to the dropped table. Note that this involves a range scan over a part of the _metadata table. While it is not expected that this significantly affects performance of table dropping, please contact the AtlasDB team if this causes issues. (Pull Request)

v0.115.0

07 Dec 2018

Type	Change
FIXED	Cassandra KVS now correctly decommissions servers from the client pool that do not appear in the current token range if autoRefreshNodes is set to true (default value). Previously, refresh would only add discovered new servers, but never remove decommissioned hosts. The new behaviour enables live decommissioning of Cassandra nodes, without having to update the configuration and restart of AtlasDB to stop trying to talk to that server. (Pull Request)
FIXED	The @AutoDelegate annotation now works correctly for interfaces which have static methods, and for simple cases of generics. Previously, the annotation processor would generate code that wouldn’t compile. Note that some cases (e.g. sub-interfaces of generics that refine type parameters) are still not supported correctly. (Pull Request)
IMPROVED	TimeLock Server now logs that a new client has been registered the first time a service makes a request (for each lifetime of each server). (Pull Request)
IMPROVED	Adds com.palantir.common.collect.IterableView#stream method for simplified conversion to Java Stream API usage. (Pull Request)

v0.114.0

03 Dec 2018

Type

Change

USER BREAK

As part of preparatory work to migrate to a new transactions table, this version of AtlasDB and all versions going forward expect to be using a version of TimeLock that supports the startIdentifiedAtlasDbTransaction endpoint. Support for previous versions of TimeLock has been dropped; please update your TimeLock server. Products should depend on TimeLock 0.51.0 or higher, or ignore this dependency altogether if they do not expect to use TimeLock. Note that new versions of the TimeLock server still expose the old endpoints, so old clients may still safely use a new TimeLock server. Also note that some momentary issues may be faced if one is performing a rolling upgrade of embedded services, though once the upgrades settle services should work normally. Note that for or across this version, blue-green deployment of embedded services is not supported. (Pull Request)

v0.113.0

03 Dec 2018

Type	Change
FIXED	KVS Migration CLI no longer migrates the checkpoint table if it exists on the source KVS. Previously, existence of an old checkpoint table on the source KVS could cause a migration to silently skip migrating data. Furthermore, in the cleanup stage of migration, the checkpoint table is now dropped instead of truncated. (Pull Request)
IMPROVED	Read transactions on thoroughly swept tables requires one less RPC to timelock now. This improves the read performance and reduces load on timelock. (Pull Request)
FIXED	Fix warning in stream-store generated code. (Pull Request)

v0.112.1

26 Nov 2018

Type	Change
FIXED	Wrap shutdown callback running in try-catch. This guards against any shutdown hooks throwing unchecked exceptions, which would cause other hooks to not run. (Pull Request)

v0.112.0

26 Nov 2018

Type	Change
FIXED	Remove a memory leak due to usages of Runtime#addShutdownHook to cleanup resources. This only applies where multiple TransactionManager s might exist in a single VM and they are created an shutdown repeatedly. (Pull Request)

v0.111.0

20 Nov 2018

Type	Change
FIXED	Fixed a bug where lock and timestamp services were not closed when transaction managers were closed. (Pull Request)

v0.110.0

20 Nov 2018

Type	Change
IMPROVED	Numerous small internal improvements that did not include release notes.

v0.109.0

14 Nov 2018

Type	Change
DEV BREAK	PaxosQuorumChecker now takes an ExecutorService as opposed to an Executor. (Pull Request)
FIXED	Re-introduced the distinct bounded thread pools to PaxosLeaderElectionService for communication with other PaxosLearners and PingableLeaders. Previously, a single unbounded thread pool was used, which could cause starvation and OOMs under high load if any learners or leaders in the cluster were slow to fulfil requests. This change also improves visibility as to which specific communication workflows may be suffering from issues. (Pull Request)
FIXED	Targeted sweep now handles table truncations with conservative sweeps correctly. (Pull Request)
IMPROVED	No longer calls deprecated OkHttpClient.Builder().sslSocketFactory() method, now passes in X509TrustManager. (Pull Request)
IMPROVED	Sha256Hash now caches its Java hashCode method. (Pull Request)
IMPROVED	The version of javapoet had previously been bumped to 1.11.1 from 1.9.0. However this was not done consistently across the repository. The atlasdb-client and atlasdb-processors subprojects now also use the newer version. (Pull Request)

v0.108.0

7 Nov 2018

Type	Change
FIXED	Cassandra KVS no longer uses the schema mutation lock and instead creates tables using an id deterministically generated from the Cassandra keyspace and the table name. As part of this change, table deletion now truncates the table before dropping it in Cassandra, therefore requiring all Cassandra nodes to be available to drop tables. This fixes a bug where it was possible to create two instances of the same table on two different Cassandra nodes, resulting in schema version inconsistency that required manual intervention. (Pull Request)
IMPROVED	Introduced runtime checks on the client side for timestamps retrieved from timelock. This aims to prevent data corruption if timestamps go back in time, possibly caused by a misconducted timelock migration. This is a best effort for catching abnormalities on timestamps at runtime, and does not provide absolute protection. (Pull Request)
USER BREAK	Qos Service: The experimental QosService for rate-limiting clients has been removed. (Pull Request)
FIXED	Fixed a bug in the `AsyncInitializer.cancelInitialization` method that caused asynchronously initialized `CassandraKeyValueServiceImpl` and `CassandraClientPoolImpl` objects unable to be closed and shut down, respectively. (Pull Request)
FIXED	Targeted sweep now deletes certain sweep queue rows faster than before, which should reduce table bloat (particularly on space constrained systems). (Pull Request)
IMPROVED FIXED	Schema mutations against the Cassandra KVS are now HA. Previously, Cassandra KVS required that after some schema mutations all cassandra nodes must agree on the schema version. Now, all reachable nodes must agree and at least a quorum of nodes must be reachable, instead. (Pull Request)
DEV BREAK	The AutoDelegate annotation no longer supports a typeToExtend parameter. Users should instead annotate the desired class or interface directly. (Pull Request)
FIXED	Targeted sweep does better with missing tables, and also with the empty namespace. Previously, it would just cycle on the error and never sweep. A highly undesirable condition. (Pull Request)
FIXED	KeyValueServicePuncherStore``s ``getMillisForTimestamp method now does a much more efficient `_punch` table lookup. This affects the performance of calculating the `millisSinceLastSweptTs` metric for targeted sweep. Also, the above mentioned metric will now consistently report falling behind if no new entries are being punhed into the punch table. (Pull Request)
IMPROVED	The HikariConnectionClientPool now allows specification of a use-case. If specified, threads created will have the use-case in their name, and log messages about pool statistics will be prefaced by the use-case as well. This may be useful for debugging when users run multiple such pools. (Pull Request)
NEW	Old deprecated tables can now be added to a schema to be cleaned up on startup. (Pull Request)
FIXED	Fixed a bug where `AwaitingLeadershipProxy` stops trying to gain leadership, causing client calls to leader to throw `NotCurrentLeaderException`. (Pull Request)
NEW	TimeLock now exposes a `startIdentifiedAtlasDbTransaction` endpoint. This may be used by AtlasDB clients for some key value services to achieve better data distribution and performance as far as the transactions table is concerned. (Pull Request)
DEV BREAK	The schema metadata service has been removed, as the AtlasDB team does not intend to pursue extracting sweep to its own separate service in the short to medium term, and it was causing support issues. If you were consuming this service, please contact the AtlasDB team. (Pull Request)
IMPROVED	On Oracle backed DbKvs, schema changes that would require the addition of an overflow column will now throw upon application. Previously, puts would instead fail at runtime when the column did not exist. (Pull Request)
IMPROVED	The index cleanup task for stream stores now only fetches the first column for each stream ID when determining whether the stream is still in use. Previously, we would fetch the entire row which is unnecessary and causes read pressure on the key-value-service for highly referenced streams. (Pull Request)
FIXED	Live-reloading HTTP proxies and HTTP proxies with failover now refresh themselves after encountering a large number of cumulative requests or consecutive exceptions. This was previously implemented to work around several issues with our usage of OkHttp, but was not implemented for the proxies with failover (which includes proxies to TimeLock). (Pull Request)

v0.107.0

10 Oct 2018

Type	Change
IMPROVED	Targeted sweep now stores even less data in the sweepable cells table due to dictionary encoding table references instead of storing them as strings. (Pull Request)
IMPROVED	The legacy lock service’s lock state logger now logs additional information about the lock service’s internal synchronization state. This includes details of queueing threads on each underlying sync object, as well as information on the progress of inflight requests. (Pull Request 1 and Pull Request 2)

v0.106.0

2 Oct 2018

Type	Change
FIXED DEV BREAK	Reverted the PR #3505, which was modifying PaxosLeaderElectionService to utilise distinct bounded thread pools, as this PR uncovered some resiliency issues with PaxosLeaderElectionService. It will be re-merged after fixing those issues. (Pull Request)
FIXED	Targeted sweep now stores much less data in the sweepable cells table due to more efficient encoding. (Pull Request)
NEW	TransactionManager``s now expose a ``TimestampManagementService, allowing clients to fast-forward timestamps when necessary. This functionality is intended for libraries that extend AtlasDB functionality; it is unlikely that users should directly require the TimestampManagementService. (Pull Request)
FIXED	Targeted sweep no longer chokes if a table in the queue no longer exists, and was deleted by a different host while this host was online and sweeping. (Pull Request)
IMPROVED	Add versionId to SimpleTokenInfo to improve logging for troubleshooting. (Pull Request)
IMPROVED	Increase maximum allowed rescue dependency version to 4.X.X. (Pull Request)
LOGS CHANGED	Changed the origin for logs when queries were slow from kvs-slow-log to kvs-slow-log-2. (Pull Request)

v0.105.0

20 Sep 2018

Type	Change
FIXED	Improved threading for MetricsManager’s metricsRegistry (Pull Request)
LOGS METRICS	Improved visibility into sources of high DB load. We log when a query returns a high number of timestamps that need to be looked up in the database, and tag some additional metrics with the tablename we were querying. (Pull Request) (Pull Request)
CHANGED	Upgrade http-remoting 3.41.1 -> 3.43.0 to make tracing delegate nicely. (Pull Request)
IMPROVED	Users may now provide their own executors to instances of `BasicSQL` and to `BasicSQLUtils.runUninterruptably`. Previously users were forced to use a default executor which had an unbounded thread-pool and fixed keep-alive timeouts. (Pull Request)
FIXED	TargetedSweepMetrics#millisSinceLastSweptTs updates periodically, even if targeted sweep is failing to successfully run. (Pull Request)
FIXED	Targeted sweep no longer chokes if a table in the queue no longer exists. (Pull Request)
FIXED	Targeted sweep threads will no longer die if Timelock unlock calls fail. (Pull Request)
IMPROVED	PaxosLeaderElectionService now utilises distinct bounded thread pools for communication with other PaxosLearners and PingableLeaders. Previously, a single unbounded thread pool was used, which could cause starvation and OOMs under high load if any learners or leaders in the cluster were slow to fulfil requests. This change also improves visibility as to which specific communication workflows may be suffering from issues. (Pull Request)
FIXED	Fixed an issue in timelock where followers were publishing metrics with `isCurrentSuspectedLeader` tag set to `true`. (Pull Request)
FIXED	Background sweep will now choose between priority tables uniform randomly if there are multiple priority tables. Previously, if multiple priority tables were specified, background sweep would repeatedly pick the same table to be swept, meaning that the other priority tables would all never be swept. (Pull Request)
IMPROVED	A few timelock ops edge cases have been removed. Timelock users must now indicate whether they are booting their servers for the first time or subsequent times, to avoid the situation where a timelock node becomes newly misconfigured and thinks it is booting up for the first time again. Additionally, timestamps no longer overflow when they hit Long.MAX_VALUE; this would only happen due to a bug, but at least now the DB will become read only and not corrupt. (Pull Request)
DEV BREAK	PaxosQuorumChecker now takes an ExecutorService as opposed to an Executor. (Pull Request)

v0.104.0

4 Sep 2018

Type	Change
FIXED	The Jepsen tests no longer assume that users have installed Python or DateUtil, and will install these itself if needed. (Pull Request)
CHANGED	Bumps com.palantir.remoting3 dependency to 3.41.1 from 3.22.0. (Pull Request)

v0.103.0

30 Aug 2018

Type	Change
IMPROVED	Targeted sweep queue now hard fails if it is unable to read table metadata to determine sweep strategy. Previously, we assumed the strategy was conservative, which could result in sweeping tables that should never be swept. (Pull Request)
FIXED	Fixed an issue where targeted sweep would fail to increase the number of shards and error out if the default number of shards was ever persisted into the progress table. (Pull Request)
FIXED	Several exceptions (such as when creating cells with overly long names or executors in illegal configurations) now contain numerical parameters correctly. Previously, the exceptions thrown would erroneously contain `{}` values. (Pull Request)
FIXED	Cassandra Key Value Service now no longer logs spurious ERROR warning messages when failing to read new-format table metadata. (Pull Request)
IMPROVED	Throw more specific CommittedTransactionException when operating on a committed transaction. (Pull Request)

v0.102.0

24 Aug 2018

Type	Change
FIXED	CQL queries are now logged correctly (with safe and unsafe arguments respected). Previously, these versions would log all arguments as part of the format string as it eagerly did the string substitution. AtlasDB versions 0.100.0 through 0.101.0 (inclusive both ends) are affected. (Pull Request)
DEV BREAK IMPROVED	CqlQuery is now an abstract class and must now be created through its builder. This makes the intention that the query string provided is safe considerably more explicit. (Pull Request)
IMPROVED	DbKvs now implements its own version of `deleteAllTimestamps` instead of using the default AbstractKvs implementation. This facilitates better performance of targeted sweep on DbKvs. (Pull Request)
FIXED	LockRefreshingLockService now batches calls to refresh locks in batches of 650K. Previously, trying to refresh a larger number of locks could trigger the 50MB limit in payload size. (Pull Request)
LOGS	Reduce logging level for locks not being refreshed. (Pull Request)

v0.101.0

16 Aug 2018

Type	Change
CHANGED	Targeted Sweep is now enabled by default. Products using atlasdb-cassandra library need to declare a dependency on Rescue 3 or ignore that dependency altogether. (Pull Request)
FIXED	Fixed a bug that when filtering the row results for `getRows` in `SnapshotTransaction` could cause an exception due to duplicate keys in a map builder. (Pull Request)
IMPROVED	AtlasDB now correctly closes the targeted sweeper on shutdown, and logs less by default. (Pull Request)
IMPROVED DEV BREAK	The atlasdb-commons package has had its dependency tree greatly pruned of unused cruft. This may introduce a devbreak to users transitively relying on these old dependencies. (Pull Request)
CHANGED	`CassandraRequestExceptionHandler` is set to use `Conservative` exception handler by default. Main differences are: Conservative exception handler backs off for larger subset of exceptions Backoff period is exponentially increasing (but cannot go beyond `MAX_BACKOFF`) Retries are executed on a different host rather than the same host for a larger subset of exceptions (Pull Request)
IMPROVED LOGS	CassandraKVS’s `ExecutorService` is now instrumented. This ExecutorService is responsible for submitting queries to the underlying DB. It being throttled will increase the latency of queries and transactions. The following metrics are available: `com.palantir.atlasdb.keyvalue.cassandra.CassandraKeyValueService.executorService.submitted` `com.palantir.atlasdb.keyvalue.cassandra.CassandraKeyValueService.executorService.running` `com.palantir.atlasdb.keyvalue.cassandra.CassandraKeyValueService.executorService.completed` `com.palantir.atlasdb.keyvalue.cassandra.CassandraKeyValueService.executorService.duration` (Pull Request)
NEW DEV BREAK	`TransactionManagers` has a new builder option named `validateLocksOnReads()`; set to `true` by default. This option is passed to `TransactionManager`’s constructor, to be used in initialization of `Transaction`. A transaction will validate pre-commit conditions and immutable ts lock after every read operation if underlying table is thoroughly swept (Default behavior). Setting `validateLocksOnReads` to `false` will stop transaction to do the mentioned validation on read operations; causing validations to take place only at commit time for the sake of reducing number of round-trips to improve overall transaction perf. This change will cause a devbreak if you are constructing a `TransactionManager` outside of `TransactionManagers`. This can be resolved by adding an additional boolean parameter to the constructor (`true` if you would like to keep previous behaviour) (Pull Request)

v0.100.0

2 Aug 2018

Type	Change
FIXED	Cassandra KVS now correctly accepts check-and-set operations if one is working with multiple columns in the relevant row. Previously, if there were multiple columns in the row where one was trying to do a CAS, the CAS would be rejected even if the column value matched the cell. Similarly, for put-unless-exists, the PUE would be rejected if there were any other cells in the relevant row (even if they had a different column name). We now perform the operations correctly only considering the value (or absence of value) in the relevant cell. (Pull Request)
IMPROVED DEV BREAK	We have removed the `sleepForBackoff(int)` method from `AbstractTransactionManager` as there were no known users and its presence led to user confusion. AtlasDB does not actually backoff between attempts of running a user’s transaction task. If your service overrides this method, please contact the AtlasDB team. (Pull Request)
IMPROVED	Sequential sweep now sleeps longer between iterations if there was nothing to sweep. Previously we would sleep for 2 minutes between runs, but it is unlikely that anything has changed dramatically in 2 minutes so we sleep for longer to prevent scanning the sweep priority table too often. Going forward the most likely explanation for there being nothing to sweep is that we have switched to targeted sweep. We don’t stop completely or sleep for too long just in case configuration changes and a table is eligible to sweep again. (Pull Request)
IMPROVED	TimeLockAgent now exposes the number of active clients and the configured maximum. This makes it easier for a service to expose these via a health check. (Pull Request)

v0.99.0

25 July 2018

Type	Change
FIXED	Fixed an issue where a failure to punch a value into the _punch table would suppress any future attempts to punch. Previously, if the asynchronous job that punches a timestamp every minute ever threw an exception, the unreadable timestamp would be stuck until the service is restarted. (Pull Request)
IMPROVED	TimeLock by default now has a client limit of 500. Previously, this used to be 100 - however we have run into issues internally where stacks legitimately reach this threshold. Note that we still need to maintain the client limit to avoid a possible DOS attack with users creating arbitrarily many clients. (Pull Request)
NEW METRICS	Added metrics for the number of active clients and maximum number of clients in TimeLock Server. These are useful to identify stacks that may be in danger of breaching their maxima. (Pull Request)

v0.98.0

25 July 2018

Type	Change
NEW METRICS	Targeted sweep now exposes tagged metrics for the outcome of each iteration, analogous to the legacy sweep outcome metrics. The reported outcomes for targeted sweep are: `SUCCESS`, `NOTHING_TO_SWEEP`, `DISABLED`, `NOT_ENOUGH_DB_NODES_ONLINE`, and `ERROR`. (Pull Request)
IMPROVED	Changed the range scan behavior for the sweep priority table so that reads scan less data in Cassandra. (Pull Request)

v0.97.0

20 July 2018

Type	Change
IMPROVED	TimeLock Server now exposes a `startAtlasDbTransaction` endpoint which locks an immutable timestamp and then gets a fresh timestamp (in a single round-trip call); new TimeLock clients call this endpoint. This saves an estimated one TimeLock round-trip of latency when starting a transaction. Note that the old endpoints are still exposed (so TimeLock remains compatible with older Atlas clients), and there is an automated adapter for new TimeLock clients to talk to old TimeLock servers that don’t have this endpoint. (Pull Request)
IMPROVED LOGS	Reduced the logging level of various log messages. (Pull Request)
CHANGED METRICS	CassandraClientPoolingContainer metrics are tagged by pool name. Previosly pool name was embedded in metric name. (Pull Request)
IMPROVED	Added the `CallbackInitializable` interface to simplify asynchronous initialization of resources using transaction manager callbacks. (Pull Request)
IMPROVED	The timestamp cache size is now actually live reloaded, and uses Caffeine instead of Guava for better performance. The read only transaction manager (almost unused) now no longer constructs a thread pool. (Pull Request)
IMPROVED DEV BREAK	Transactions now have meters recording their outcomes (e.g. successful commits, lock expiry, being rolled back, read-write conflicts, etc.) In the cases of write-write and read-write conflicts, the first table on which a conflict occurred will be tagged on to the conflict meter if it is safe for logging. Note that some metric names have changed; in particular, `SerializableTransaction.SerializableTransactionConflict` and `SnapshotTransaction.SnapshotTransactionConflict` are now tracked as `readWriteConflicts` and `writeWriteConflicts` respectively under `TransactionOutcomeMetrics`. This is also an improvement in terms of clarity, as serializable transactions that experienced write-write conflicts were previously marked as snapshot transaction conflicts. (Pull Request)

v0.96.0

11 July 2018

Type	Change
FIXED	Targeted sweep metrics will no longer range scan the punch table if the last swept timestamp was issued more than one week ago. Previously, we would range scan the table even if the last swept timestamp was -1, which would force a range scan of the entire table. (Pull Request)
FIXED DEPRECATED	Atlas clients using Cassandra can specify type of kvs as cassandra rather then CassandraKeyValueServiceRuntimeConfig in runtime configuration. The CassandraKeyValueServiceRuntimeConfig type is now deprecated. (Pull Request)
IMPROVED	Startup and schema change performance improved for Cassandra users with large numbers of tables. (Pull Request)

v0.95.0

9 July 2018

Type	Change
IMPROVED	The atlas console metadata query now returns more table metadata, such as sweep strategy and conflict handler information. (Pull Request)
DEV BREAK	The `putUnlessExists` API has been removed from AtlasDB tables, as it was misleading (it only did the put if the given row, column and value triple were already present, as opposed to the more intuitive condition of the row and column value pair being present). Please replace any uses of the table-level `putUnlessExists` with a get, check and put if appropriate - these will still be transactional because of the AtlasDB transaction protocol. Note that this is not the same as the KVS `putUnlessExists` API, which is still used by the transaction protocol. This API has already been deprecated since August 2017 (11 months from time of writing). (Pull Request)
IMPROVED	We will no longer continue to update `sweep.priority` if writes are persisted to the targeted sweep queue. This means that assuming `targetedSweep.enableSweepQueueWrites` remains on, the background sweeper will eventually run out of things to sweep without further intervention. At this point, the background sweeper will start reporting `NOTHING_TO_SWEEP`, and the background sweeper may safely be disabled. (Pull Request)
FIXED	Writes to the targeted sweep queue are now done using the start timestamp of the transaction that makes the call. Previously, the writes were done at timestamp 0, which was interfering with Cassandra compactions. (Pull Request)
FIXED	The sweep CLI will no longer perform in-process compactions after sweeping a table. For DbKvs, this operation is handled by the background compaction thread; Cassandra performs its own compactions. Note that the sweep CLI itself has been deprecated in favour of using the sweep priority override configuration, possibly in conjunction with the thread count (Docs). (Pull Request)
NEW	Three new conflict handlers `SERIALIZABLE_CELL`, `SERIALIZABLE_INDEX` and `SERIALIZABLE_LOCK_LEVEL_MIGRATION` are added. `SERIALIZABLE_CELL` conflict handler is same as `SERIALIZABLE`, but checks for conflicts by locking cells during commit instead of locking rows. Cell locks are more fine-grained, so this will produce less contention at the expense of requiring more locks to be acquired. `SERIALIZABLE_INDEX` conflict handler is designed to be used by index tables. As any write/write conflict on an index table will necessarily also be a write/write conflict on base table, this conflict handler does not check write/write conflicts. Read/write conflicts should still need to be checked, since we do not need to read the index table with the main table. This conflict handler also locks at cell level. If your schema already has a table with `SERIALIZABLE` conflict handler, and you would like to migrate it to `SERIALIZABLE_CELL` or `SERIALIZABLE_INDEX` with a rolling upgrade (without a shutdown); then you should first migrate it to `SERIALIZABLE_LOCK_LEVEL_MIGRATION` conflict handler to avoid data corruption. (Pull Request)
DEV BREAK	Removed the token range skewness logger from the Cassandra KVS. We’ve not been relying on it to catch issues and it produces a very large output that is cumbersome. (Pull Request)

v0.94.0

28 June 2018

Type	Change
IMPROVED	Snapshot transaction `getRowsColumnRange` performance has been improved by using an `ImmutableSortedMap.Builder` and constructing the map at the end. We previously used a `SortedSet` which would incur overhead in rebalancing the underlying red-black tree as the data was already mostly sorted. We have seen a 7 percent speedup for reading all columns from a wide row (50,000 columns). We have also seen a 6 percent speedup for reading 50,000 columns from a wide row, where a random 2 percent of these rows are from uncommitted transactions. (Pull Request)
DEV BREAK	Snapshot transactions now return immutable maps when calling `getRows` and `getRowsColumnRange`. These used to return mutable maps - please make a copy of the map if you need it to be mutable. (Pull Request)
NEW	Multiple `BackgroundSweeper` threads can now run simultaneously. To enable this, set the runtime option `sweep/sweepThreads` to the desired number of threads and restart any Atlas client. If running multiple clients, these threads will be randomly split across them. Due to the load it may place on Cassandra, this option is not recommended for long-term use for Cassandra-backed installations. (Pull Request)
IMPROVED	Sweep progress is now stored per-table, meaning that if background sweep of a table is interrupted (for example, because sweep priority config changed), next time the background sweeper selects that table, it will pick up where it left off. Previously, the table would be swept from the start, potentially leading to several days of work being redone. (Pull Request)
DEV BREAK	The `BackgroundSweeper` is no longer a `Runnable`. Its job is now to manage `BackgroundSweeperThread` instances, which are `Runnable`. (Pull Request)
IMPROVED	Targeted sweep now stops reading from the sweep queue immediately if it encounters an entry known to be committed after the sweep timestamp. Previously, we would read an entire batch before checking commit timestamps so that lookups can be batched, but this is not necessary if the commit timestamp is cached from a previous iteration. (Pull Request)
IMPROVED	Write transactions now unlock their row locks and immutable timestamp locks asynchronously after committing. This saves an estimated two TimeLock round-trips of latency when committing a transaction. (Pull Request)
NEW	AtlasDB clients now batch calls to unlock row locks and immutable timestamp locks across transactions. This should reduce request volumes on TimeLock Server. (Pull Request)
FIXED	Snapshot transactions now write detailed profiling logs of the form `Committed {} bytes with locks...` only once every 5 seconds per `TransactionManager` used. Previously, they were written on every transaction. (Pull Request)
FIXED	AtlasDB Benchmarks, CLIs and Console now shutdown properly under certain read patterns. Previously, if these tools needed to delete a value that a failed transaction had written, the delete executor was never closed, thereby preventing an orderly JVM shutdown. (Pull Request)
FIXED	Fixed a bug in C* retry logic where number of retries over all the hosts were used as number of retries on a single host, which may cause unexpected blacklisting behaviour. (Pull Request)

v0.93.0

25 June 2018

Type	Change
IMPROVED METRICS	Snapshot Transaction metrics now track the post-commit step of unlocking the transaction row locks. Also, the `nonPutOverhead` and `nonPutOverheadMillionths` metrics now account for this step as well. (Pull Request)
IMPROVED	Targeted sweep now uses timelock locks to synchronize background threads on multiple hosts. This avoids multiple hosts doing the same sweeps. Targeted sweep also no longer forcibly sets the number of shards to at least the number of threads. (Pull Request)
FIXED	Cassandra deleteRows now avoids reading any information in the case that we delete the whole row. (Pull Request)
USER BREAK	The `scyllaDb` option in Cassandra KVS config has been removed. Please contact the AtlasDB team if you deploy AtlasDB with scyllaDb (this was never supported). (Pull Request)
FIXED LOGS	Fixed a bug where Cassandra client pool was erroneously logging host removal from blacklist, even the host was not blacklisted in the first place. (Pull Request)

v0.92.2

22 June 2018

Type	Change
FIXED	With targeted sweep, we now only call timelock once per set of range tombstones we leave, rather than once per cell. (Pull Request)

v0.92.1

21 June 2018

Type	Change
FIXED	We now consider only one row at a time when getting rows from the KVS with sweepable cells. (Pull Request)
FIXED	Cassandra retry messages now log bounds on attempts correctly. Previously, they would log the supplier of these bounds (instead of the actual bounds, which users are more likely to be interested in). (Pull Request)

v0.92.0

20 June 2018

Type	Change
IMPROVED METRICS	We now publish metrics for more individual stages of the commit stage in a SnapshotTransaction. We also now publish metrics for the total non-KVS overhead - both the absolute time involved as well as a ratio of this to the total time spent in the commit stage. (Pull Request)
NEW LOGS	Snapshot transactions now, up to once every 5 real-time seconds, log an overview of how long each step in the commit phase took. These logs will help the Atlas team better understand which parts of committing transactions may be slow, so that we can improve on it. (Pull Request)
METRICS IMPROVED	The `millisSinceLastSweptTs` metric for targeted sweep now updates at the same frequency as the `lastSweptTimestamp` metric. This will result in a much smoother graph for the former metric instead of the current sawtooth graph. (Pull Request)
FIXED	We now page with a smaller batch size when looking at the sweepable cells. We also batch targeted sweep deletes in smaller batches. (Pull Request)
FIXED	Fixed an issue in targeted sweep where reading from the sweep queue when there are more than the specified batch size entries can cause some entries to be skipped. This is unlikely to have affected anyone because the default batch size used was very large. (Pull Request)
IMPROVED METRICS	AtlasDB now publishes timers tracking time taken to setup a transaction task before it is run, and time taken to tear down the task after it is done before runTaskWith* returns. (Pull Request)
IMPROVED LOGS	Added logging for leadership election code. (Pull Request)

v0.91.0

18 June 2018

Type	Change
DEV BREAK	AtlasDB metrics are no longer a static singleton, and are now created upon construction of relevant classes. This allows internal users to construct multiple AtlasDBs and get meaningful metrics. Many constructors have been broken due to this change. (Pull Request)
DEV BREAK	Refactored the TransactionManager inheritance tree to consolidate all relevant methods into a single interface. Functionally, any TransactionManager created using TransactionManagers will provide the serializable and snapshot isolation guarantees provided by a SerializableTransactionManager. Constructing TransactionManagers via this class should result in only a minor dev break as a result of this change. This will make it easier to transparently wrap TransactionManagers to extend their functionality. (Pull Request)
FIXED	The delete executor now uses daemon threads, so is less likely to cause failure to shutdown. (Pull Request)
FIXED	Fixed an issue where starting an HA Oracle-backed client may fail due to constraint violation. The issue occurred when multiple nodes attempted to insert the same metadata. (Pull Request)
CHANGED METRICS	Sweep metrics have been reworked based on their observed usefulness in the field. `tableBeingSwept` is removed, as it is observed that it is not ingested as expected. Users can use service logs to track the table being swept. `cellTimestampPairsExamined`, `staleValuesDeleted` and `sweepTimeSweeping` are being tracked by counters, instead of meters now. This change is done as it is observed that periodically sampled gauge readings are not useful if the frequency is lower than gauge update frequency. Now, these values will be accumulating over time. Users can take the difference of values of two successive points to track the process. Sweep now exposes the following metrics with the common prefix `com.palantir.atlasdb.sweep.metrics.LegacySweepMetrics.` (To be better distinguished from `TargetedSweepMetrics`): `cellTimestampPairsExamined` `staleValuesDeleted` `sweepTimeSweeping` `sweepTimeElapsedSinceStart` `sweepError` (Pull Request)

v0.90.0

11 June 2018

Type	Change
IMPROVED	When writing to Cassandra, the internal write timestamp for writes of sweep sentinels, range tombstones and deletes to regular tables are now approximately fresh timestamps from the timestamp service, as opposed to being an arbitrary hardcoded value or related to the transaction’s start timestamp. This should improve Cassandra’s ability to purge droppable tombstones at compaction time, particularly in tables that see heavy volumes of overwrites and sweeping. Note that this only applies if you have created your Transaction Manager through the `TransactionManagers` factory. If you are creating your transaction manager elsewhere, you should supply a suitable `freshTimestampProvider` in initialization. (Pull Request)
NEW IMPROVED	Targeted sweep now also sweeps stream stores. (Pull Request) Note that targeted sweep is considered a beta feature as it is not fully functional yet. Consult with the AtlasDB team if you wish to use targeted sweep in addition to, or instead of, standard sweep.
FIXED	Targeted sweep will no longer sweep cells from transactions that were committed after the sweep timestamp. Instead, targeted sweep will not proceed for that shard and strategy until the sweep timestamp progresses far enough. (Pull Request)
FIXED	Fixed an issue where `getRowsColumnRange` would return no results if the number of rows was more than the batch hint. (Pull Request)
DEV BREAK	Dropwizard transitive dependencies have been removed from the `atlasdb-config` subproject. Usages of `AtlasDbConfigs` for config parsing still support discovering subtypes of config, as we ship AtlasDB with an implementation of Dropwizard’s DiscoverableSubtypeResolver. (Pull Request)
DEV BREAK	`AtlasDbFactory` now takes an additional `LongSupplier` parameter when creating a key-value-service that is intended to be a source of fresh timestamps from the timestamp service. Please contact the AtlasDB team if you are uncertain what should be passed here. (Pull Request)
IMPROVED	The unbounded `CommitTsLoader` has been renamed to `CommitTsCache` and now has an eviction policy to prevent memory leaks. Background sweep now reuses this cache for iterations of sweep instead of recreating it every iteration. (Pull Request)
FIXED	Some users of AtlasDB rely on being able to abort transactions which are in progress. Until the last release of AtlasDB, this worked successfully, however this was only the case because before an assert could throw an AssertionError, an NPE was thrown by different code. Now, the assertion error is not thrown. (Pull Request)

v0.89.0

6 June 2018

Type	Change
FIXED	When determining if large sets of candidate cells were part of committed transactions, Background and Targeted Sweep will now read smaller batches of timestamps from the transaction service in serial. Previously, though these reads were re-partitioned into smaller batches, the batch requests were made in parallel which could monopolise Atlas client-side as well as KVS-side resources. There may be a small performance regression here, though this change promotes better stability for the underlying key-value-service especially in the presence of wide rows. (Pull Request)
USER BREAK	The size of batches that are used when the `CommitTsLoader` loads timestamps as part of sweep is now set to be a non-configurable 50,000. This used to be configured via the `fetchBatchSize` parameter in Cassandra config. Other workflows that use this parameter continue to respect it. If you have a use case for configuring this specifically, please contact the AtlasDB team. (Pull Request)
FIXED DEV BREAK	The `Transaction.getRowsColumnRange` method that returns an iterator now throws for `SERIALIZABLE` conflict handlers. This functionality was never implemented correctly and never offered the serializable guarantee. The method now throws an `UnsupportedOperationException` in this case. (Pull Request)
DEV BREAK	Due to lack of use, we have deleted the AtlasDB Dropwizard bundle. Users who need Atlas Console and CLI functionality are encouraged to use the respective distributions. (Pull Request)
NEW METRICS	Added a new tagged metric for targeted sweep showing approximate time in milliseconds since the last swept timestamp has been issued. This metric can be used to estimate how far targeted sweep is lagging behind the current moment in time. The metric is `com.palantir.atlasdb.sweep.metrics.TargetedSweepMetrics.millisSinceLastSweptTs` and is tagged with the sweep strategy used. (Pull Request)
FIXED	Atlas no longer throws if you read the same column range twice in a serializable transaction. (Pull Request)
FIXED	We no longer treat CAS failure in Cassandra as a Cassandra level issue, meaning that we won’t blacklist connections due to a failed CAS. (Pull Request)
IMPROVED	`SnapshotTransaction` now asynchronously deletes values for transactions that get rolled back. This restores the behaviour from before the previous fix, except that the parent transaction no longer waits for the delete to finish. (Pull Request)
FIXED	Fixed an issue occurring during transaction commits, where a failure to putUnlessExists a commit timestamp caused an NPE, leading to a confusing error message. Previously, the method determining whether the transaction had committed successfully or been aborted would hit a code path that would always result in an NPE. (Pull Request)
IMPROVED	Increased PTExecutors default thread timeout from 100 milliseconds to 5 seconds to avoid recreating threads unnecessarily. (Pull Request)

v0.88.0

30 May 2018

Type	Change
DEV BREAK NEW	KVS method `deleteAllTimestamps` now also takes a boolean argument specifying if garbage deletion sentinels should also be deleted. (Pull Request)
NEW	AtlasDB now implements targeted sweep using a sweep queue. As long as the `enableSweepQueueWrites` property of the `targetedSweep` configuration is set to true, information about each transactional write and delete will be persisted into the sweep queue. If `targetedSweep` is enabled in AtlasDB runtime configurations, background threads will read the persisted information from the sweep queue and delete stale data from the kvs. For more details on targeted sweep, please refer to the targeted sweep docs. (Pull Request) Note that targeted sweep is considered a beta feature as it is not fully functional yet. Consult with the AtlasDB team if you wish to use targeted sweep in addition to, or instead of, standard sweep.
NEW METRICS	Added tagged targeted sweep metrics for conservative and thorough sweep. The metrics show the cumulative number of enqueued writes, entries read, tombstones put, and aborted cells deleted. Additionally, there are metrics for the sweep timestamp of the last sweep iteration and for the lowest last swept timestamp across all shards. The metrics, tagged with the sweep strategy used, are as follws (with the common prefix `com.palantir.atlasdb.sweep.metrics.TargetedSweepMetrics.`): `enqueuedWrites` `entriesRead` `tombstonesPut` `abortedWritesDeleted` `sweepTimestamp` `lastSweptTimestamp` (Pull Request)
IMPROVED LOGS	Added logging of the values used to determine which table to sweep, provides more insight into why tables are being swept and others aren’t. (Pull Request)
IMPROVED	http-remoting has been upgraded to 3.22.0 (was 3.14.0). This release fixes several issues with communication between Atlas servers and a QoS service, if configured (especially in HA configurations). Note that this change does not affect communication between timelock nodes, or between an Atlas client and timelock, as these do not currently use remoting. (Pull Request)

v0.87.0

25 May 2018

Type	Change
FIXED	`SnapshotTransaction` will no longer attempt to delete values for transactions that get rolled back. The deletes were (necessarily) run at consistency `ALL`, meaning that if aborted data was present, read transactions had significantly impaired performance if a database node was down. (Pull Request)

v0.86.0

23 May 2018

Type

Change

FIXED

The Cassandra key value service is now guaranteed to return getRowsColumnRange results in the correct order. Previously while paging over row dynamic columns, the first batchHint results are ordered lexicographically, whilst the remainder are hashmap ordered in chunks of batchHint. In practice, when paging this can lead to entirely incorrect, duplicate results being returned. Now, they are returned in order. (Pull Request)

FIXED

Fixed a race condition where requests to a node can fail with NotCurrentLeaderException, even though that node just gained leadership. (Pull Request)

v0.85.0

18 May 2018

Type

Change

FIXED

Snapshot transaction is now guaranteed to return getRowsColumnRange results in the correct order. Previously while paging over row dynamic columns, if uncommitted or aborted transaction data was seen, it would be placed at the end of the list, instead of at the start, meaning that the results are mostly (but not entirely) in sorted order. In practice, this leads to duplicate results in paging, and on serializable tables, transactions that paradoxically conflict with themselves. Now, they are guaranteed to be returned in order, which removes this issue. (Pull Request)

v0.84.0

16 May 2018

Type	Change
IMPROVED	Timelock will now have more debugging info if the paxos directories fail to be created on startup. (Pull Request)
IMPROVED	Move a complicated and elsewhere overridden method from AbstractKeyValueService into DbKvs (Pull Request)
FIXED	The (Thrift-backed) `CassandraKeyValueService` now returns correctly for CQL queries that return null. Previously, they would throw an exception when we attempted to log information about the response. (Pull Request)

v0.83.0

10 May 2018

Type	Change
IMPROVED	If we make a successful request to a Cassandra client, we now remove it from the overall Cassandra service’s blacklist. Previously, removal from the blacklist would only occur after a background thread successfully refreshed the pool, meaning that requests may become stuck if Cassandra was rolling restarted. (Pull Request)
FIXED	The Cassandra client pool now respects the `maxRetriesOnHost` config option, and will not try a single operation beyond that many times on the same node. Previously, under certain kinds of exceptions (such as `TTransportException`), we would repeatedly retry the operation on the same node up to `maxTriesTotal` times. (Pull Request)
FIXED DEV BREAK	Any ongoing Cassandra schema mutations are now given two minutes to complete upon closing a transaction manager, decreasing the chance that the schema mutation lock is lost. Some exceptions thrown due to schema mutation failures now have type `UncheckedExecutionException`. (Pull Request)

v0.82.2

4 May 2018

Type	Change
FIXED	`SerializableTransaction` now initialises internal state correctly. Previously, we would throw an exception if multiple equivalent column range selections for different rows needed to be checked in the same transaction. (Pull Request)

v0.82.1

1 May 2018

Type	Change
FIXED	Specifying tables in configuration for sweep priority overrides now works properly. Previously, attempting to deserialize configurations with these overrides would cause errors. (Pull Request)

v0.82.0

1 May 2018

Type	Change
FIXED	AtlasDB now partitions versions of cells to be swept into batches more robustly and more efficiently. Previously, this could cause stack overflows when sweeping a very wide row, because the partitioning algorithm attempted to traverse a recursive hierarchy of sublists. Also, previously, partitioning would require time quadratic in the number of versions present in the row; it now takes linear time. (Pull Request)
NEW	Users can now explicitly specify specific tables for the background sweeper to (1) prioritise above other tables, or (2) blacklist. This is done as part of live-reloadable configuration, though note that background sweep will conclude its current iteration before switching to a priority table / away from a blacklisted table, as appropriate. Please see Sweep Priority Overrides for more details. (Pull Request)
FIXED	Transaction managers now shut down threads associated with the QoS client and TimeLock lock refresher when they are closed. Previously, these threads would continue running and needlessly using resources. (Pull Request)
FIXED	The `_locks` table is now created with a deterministic column family ID. This means that multi-node installations will no longer create multiple locks tables on first start-up. Note that new installations using versions of Cassandra prior to 2.1.13, 2.2.5, 3.0.3 or 3.2 will fail to create this table, as we rely on syntax introduced by the fix to CASSANDRA-9179. Existing installations will be unaffected. (Pull Request)
FIXED	OkHttp is not handling `Thread.interrupt()` well, and calling interrupts repeatedly may cause corrupted http clients. This would cause TimeLock clients to appear silent (requests would not be accepted or logged), but would not have affected data integrity. To avoid this issue, our Feign client is now wrapped with an `ExceptionCountingRefreshingClient`, which will detect and refresh corrupted clients. (Pull Request)
FIXED DEV BREAK	`LoggingArgs::isSafeForLogging(TableReference, Cell)` was removed, as it behaved unexpectedly and could leak information. Previously, it returned true only if the cell’s row matched the name of a row component which was declared as safe. However, knowledge of the existence of such a cell may not actually be safe. There currently isn’t an API for declaring specific row or dynamic column components as safe; please contact the AtlasDB team if you have such a use case. (Pull Request)
IMPROVED LOGS	Expired lock refreshes now tell you which locks expired, instead of just their refreshing token id. (Pull Request)
IMPROVED	The strategy for choosing the table to compact was adjusted to avoid the case when the same table is chosen multiple times in a row, even if it was not swept between compactions. Previously, the strategy to choose the table to compact was: if possible choose a table that was swept but not compacted otherwise choose a table for which the time passed between last compact and last swept was longer When all tables are swept and afterward compacted, the last point above could choose to compact the same table because `lastSweptTime - lastCompactTime` is negative and largest among all tables. The new strategy is: if possible choose a table that was swept but not compacted if there is no uncompacted table then choose a table swept further after it was compacted otherwise, randomly choose a table after filtering out the ones compacted in the past hour (Pull Request)
IMPROVED LOGS	`kvs-slow-log` messages now also include start time of the operation for easier debugging. (Pull Request)
IMPROVED LOGS	AtlasDB internal tables will no longer produce warning messages about hotspotting. (Pull Request)

v0.81.0

19 April 2018

Type	Change
IMPROVED METRICS	Async TimeLock Service metric timers are now tagged with (1) the relevant clients, and (2) whether the current node is the leader or not. This allows for easier analysis and consumption of these metrics. (Pull Request)
IMPROVED	Common annotations can now be imported via the commons-annotations library, instead of needing to pull in atlasdb-commons. Existing code that uses atlasdb-commons for the annotations will still be able to resolve them. (Pull Request)
FIXED	Logs in `CassandraRequestExceptionHandler` are logged using a logger named after that class instead of `CassandraClientPool`. (Pull Request)
IMPROVED DEV BREAK	Bumped several libraries to get past known security vulns: Cassandra Thrift and CQL libs Jackson Logback Netty (indirectly via cassandra lib bump) (Pull Request)

v0.80.0

04 April 2018

Type	Change
FIXED DEV BREAK	Centralize how `PersistentLockManager` is created in a dagger context. Also, removed the old constructor for `CellsSweeper`. (Pull Request)
IMPROVED LOGS	Downgraded “Tried to connect to Cassandra {} times” logs from `ERROR` to `WARN`, and stopped printing the stack trace. An exception is thrown to the service who made the request; this service has the opportunity to log at a higher level if desired. (Pull Request)
NEW	AtlasDB now supports runtime configuration of throttling for stream stores when streams are written block by block in a non-transactional fashion. Previously, such streams would be written using a separate transaction for each block, though in cases where data volume is high this may still cause load on the key-value-service Atlas is using. Please note that if you wish to use this feature, you will need to regenerate your Atlas schemas and suitably inject the stream persistence configuration into your stream stores. However, if you do not intend to use this feature, no action is required, and your stream stores’ behaviour will not be changed. Note that enabling throttling may make nontransactional `storeStream` operations take longer, though the length of constituent transactions should not be affected. (Pull Request)
NEW	AtlasDB now schedules KVS compactions on a background thread, as opposed to triggering a compaction after each table was swept. This allows for better control over KVS load arising from compactions. (Pull Request)
NEW	AtlasDB now supports configuration of a maintenance mode for compactions. If compactions are run in maintenance mode, AtlasDB may perform more costly operations which may be able to recover more space. For example, for Oracle KVS, `SHRINK SPACE` (which acquires locks on the entire table) will only be run if compactions are carried out in maintenance mode. (Pull Request)
FIXED	Fixed a bug that causes Cassandra clients to return to the pool even if they have thrown blacklisted exceptions. (Pull Request)
FIXED	Fix NPE if PaxosLeaderElectionServiceBuilder’s new field onlyLogOnQuorumFailure is never set. (Pull Request)
NEW	If using TimeLock, AtlasDB now checks the value of a fresh timestamp against the unreadable timestamp on startup, failing if the fresh timestamp is smaller. That implies clocks went backwards; doing this mitigates the damage that a bogus TimeLock migration or other corruption of TimeLock can do. (Pull Request)
IMPROVED	Applications can now easily determine whether their Timelock cluster is healthy by querying `TransactionManager.getTimelockServiceStatus().isHealthy()`. This returns true only if a healthy connection to timelock service is established. (Pull Request)

v0.79.0

20 March 2018

Type	Change
IMPROVED DEV BREAK	Guava has been updated from 21.0 to 23.6-jre. This unblocks users using libraries which have dependencies on more recent versions of Guava, owing to API changes in `SimpleTimeLimiter`, among other classes. (Pull Request)
IMPROVED METRICS	Sweep metrics are now updated to the result value of the last run iteration of sweep instead of the cumulative values for the run of sweep on the table. This has been done in order to improve the granularity of the metrics, since cumulative results can be several orders of magnitude larger, thus obfuscating the delta. (Pull Request)
NEW	Added a new parameter `addressTranslation` to `CassandraKeyValueServiceConfig`. This parameter is a static map specifying how internal Cassandra endpoints should be translated to InetSocketAddresses. (Pull Request)
FIXED	The Cassandra client pool is now cleaned up in the event of a failure to construct the Cassandra KVS (e.g. because we lost our connection to Cassandra midway). Previously, the client pool was not shut down, leading to a thread leak. (Pull Request)
IMPROVED LOGS	Log an ERROR in the case of failure to create a Cell due to a key greater than 1500 bytes. Previously we logged at DEBUG. (Pull Request)
FIXED	`clean-cass-locks-state` command is now using Atlas namespace as Cassandra keyspace if provided. (Pull Request)
IMPROVED LOGS	Logging exceptions in the case of quorum is runtime configurable now, using `only-log-on-quorum-failure` flag, for external timelock services. Previously it was set to true by default. (Pull Request)

v0.78.0

2 March 2018

Type	Change
NEW	The `TransactionManagers` builder now optionally accepts a `Callback` object. If `initializeAsync` is set to true, then this callback will be run after all the initialization prerequisites for the TransactionManager have been met, and the TransactionManager will start returning true on calls to its `isInitialized()` method only once the callback has returned. If `initializeAsync` is set to false, then this callback will be run just before the TransactionManager is returned, blocking until it is done. (Pull Request)
FIXED	SerializableTransactionManager can now be closed even if it is not initialized yet. (Pull Request)
NEW CHANGED METRICS	Sweep metrics have been reworked based on their observed usefulness in the field. In particular, histograms and most of the meters were replaced with gauges that expose last known values of tracked sweep results. Tagged metrics have been removed as well, and were replaced by a gauge `tableBeingSwept` that exposes the name of the table being swept, if it is safe for logging. Sweep metrics `cellTimestampPairsExamined` and `staleValuesDeleted` are now updated after every batch of deletes instead of waiting until all of the batches are processed. Sweep now exposes the following metrics with the common prefix `com.palantir.atlasdb.sweep.metrics.SweepMetric.`: `tableBeingSwept` `cellTimestampPairsExamined` `staleValuesDeleted` `sweepTimeSweeping` `sweepTimeElapsedSinceStart` `sweepError` (Pull Request)
FIXED	LoggingArgs no longer throws when it tries to hydrate invalid table metadata. This fixes an issue that prevented AtlasDB to start after performing a KVS migration. (Pull Request)
CHANGED	Changes the default scrubber behavior to aggressive scrub (synchronous with scrub request). (Pull Request)
FIXED	Fixed a bug that can causes the background sweep thread to fail to shut down cleanly, hanging the application. (Pull Request)
IMPROVED	Remove a round trip from read only transactions not involving thoroughly swept tables. (Pull Request)

v0.77.0

16 February 2018

Type	Change
CHANGED	AtlasDB migration CLI no longer drops the temporary table used during migration and instead truncates it. This avoids an issue where AtlasDB would refuse to start after a migration because it would try to hydrate empty table metadata for the above table. (Pull Request)
CHANGED	Upgraded Postgres jdbc driver to 42.2.1 (from 9.4.1209). (Pull Request)
FIXED	Fix NPE when warming conflict detection cache if table is being created. (Pull Request)
IMPROVED DEV BREAK	Introduced configurable `writeThreshold` and `writeSizeThreshold` parameters for when to write stats for the Sweep prioritization. Also reduce the defaults to flush write stats on 32MB overall write size and 2k cells. (Pull Request)
FIXED	Fix `SnapshotTransaction#getRows` to apply `ColumnSelection` when there are local writes. (Pull Request)
FIXED	CassandraKVS sstable size in MB was not being correctly set. This resulted in requirements on the entire cluster being up during startup of certain stacks. CF metadata mismatch messages are also now correctly safety marked for logging. (Pull Request)

v0.76.0

12 February 2018

Type	Change
FIXED	Fixed a bug which would make sweep deletes not be compacted by Cassandra. Over time this would lead to tombstones being accumulated in the DB and disk space not being reclaimed. (Pull Request)
FIXED	When TransactionManagers doesn’t return successfully, we leaked resources depending on which step of the initialization failed. Now resources are properly closed and freed. (Pull Request)
FIXED	Fixed a bug where Cassandra clients’ input buffers were left in an invalid state before returning the client to the pool, manifesting in NPEs in the Thrift layer. (Pull Request)
IMPROVED NEW	Added a new parameter `conservativeRequestExceptionHandler` to `CassandraKeyValueServiceRuntimeConfig`. Setting this parameter to true will enable more conservative retrying logic for requests, including longer backoffs and not retrying on the same host when encoutering an exception that is indicative of high Cassandra load, e.g., TimeoutExceptions. This parameter is live-reloadable, and reloading it will affect in-flight requests, with the caveat that once a request gives up on a node, it will not retry that node again even if we disable conservative retrying. (Pull Request)
IMPROVED	AtlasDB CLIs now allow a runtime config to be passed in. This allows the CLIs to be used with products that are configured to use timelock and have the timelock block in the runtime config. (Pull Request)
IMPROVED DEV BREAK	AtlasDbConfigs now supports parsing of both install and runtime configuration. As part of these changes, `load`, `loadFromString` and other methods in `AtlasDbConfigs` now take a type parameter. To fix existing usage, please pass in `AtlasDbConfig.class` as the type parameter to these functions. (Pull Request)
FIXED	Fixed a bug where the CleanCassLocksState CLI would not start because the Cassandra locks were in a bad state. (Pull Request)
FIXED	Close AsyncInitializer executors. This should reduce memory pressure of clients after startup. (Pull Request)
IMPROVED	Added a TimeLock healthcheck for signalling that no leader election has been triggered. This will allow TimeLock itself to broadcast a HEALTHY status even without a leader. (Pull Request)
IMPROVED	Index tables can now be marked as safe for logging. If you use indexes, please add `allSafeForLogging()` on their definition (where reasonable). This makes all AtlasDB tables able to be marked as safe for logging. (Pull Request)
IMPROVED	Make some values of `CassandraKeyValueServiceConfig` live-reloadable. To check which parameters are live-reloadable, check the `CassandraKeyValueServiceRuntimeConfig` class. Docs about this config can be found here and here. (Pull Request)
DEV BREAK	Renamed the method used to create LockAndTimestampServices by the CLI commands and AtlasConsole. Please update usages of `createLockAndTimestampServices` to `createLockAndTimestampServicesForCli`. (Pull Request)
IMPROVED LOGS	Sweep now logs the number of cells it is deleting when performing a single batch of deletes. This is useful for visibility of Sweep progress; previously, Sweep would only log when a top-level batch was complete, meaning that for highly versioned rows Sweep would only log after deleting all stale versions of said row. (Pull Request)
IMPROVED	The sweep-table endpoint now returns HTTP status 400 instead of 500, when asked to sweep a non-existent table. (Pull Request)
IMPROVED METRICS	Atlas now records the number of cells written over time, if you are using Cassandra KVS. This metric is reported under `com.palantir.atlasdb.keyvalue.cassandra.CassandraClient.cellsWritten`. (Pull Request) (Pull Request)
IMPROVED	`ExecutorInheritableThreadLocal` from `commons-executors` has been split out into a `commons-executors-api` dependent project with no dependencies. This allows api projects outside of atlasdb to use `ExecutorInheritableThreadLocal` without pulling in the dependencies of `commons-executors`. (Pull Request)

v0.75.0

29 January 2018

Type	Change
FIXED USER BREAK	AtlasDB will now fail to start if a TimeLock block is included in the initial runtime configuration, but the install configuration is set up with a leader block or with remote timestamp and lock blocks. Previously, AtlasDB would start successfully under these conditions, but the TimeLock block in the runtime configuration would be silently ignored. Note that the decision on whether to use TimeLock or another source of timestamps and locks is made at install-time. (Pull Request)
IMPROVED USER BREAK	AtlasDB users can now specify the usage of TimeLock Server purely by including a TimeLock block in the initial runtime configuration. Previously, AtlasDB users would need to specify that they were using TimeLock in the install configuration, possibly with an empty object (`timelock: {}`). This is a change from previous behaviour in cases where users specified an embedded install configuration but a TimeLock block in the runtime configuration; previously, the embedded configuration would have been selected, while now the TimeLock configuration will be selected. Also, users with scripts that depend on supplying a default runtime configuration may need to be careful to ensure that TimeLock configuration is preserved when such scripts are run. That said, AtlasDB will fail to start if trying to access a key-value service where TimeLock has been used as a source of timestamps without going through TimeLock, so we don’t think there is a risk of data corruption. (Pull Request)
FIXED METRICS	Fixed metric re-registration log spam in `TokenRangeWriteLogger`. (Pull Request)
FIXED DEV BREAK	AtlasDB clients will receive a `QosException.Throttle` for requests that are throttled and http-remoting should handle them appropriately based on the backOff strategy provided by the application. Note that this is an experimental feature and we do not expect it to be enabled anywhere. This is a dev break as the exception type has changed from `RateLimitExceededException` to `QosException.Throttle`. (Throttle) (Pull Request)
IMPROVED METRICS	Use tags in sweep outcome metrics instead of using each name per outcome. (Pull Request)
IMPROVED LOGS	Log message for leaked sweep/backup lock is now WARN rather than INFO. (Pull Request)
IMPROVED LOGS METRICS	`TokenRangeWrite` metrics are calculated every 1000 writes so we can chart metrics for smaller tables. Logging now happens every 6 hours regardless of number of writes (although there must be at least 1). (Pull Request)
IMPROVED LOGS	`CassandraClient` kvs-slow-logs have been improved. They now contain the duration of the call and information about the results from the KVS. (Pull Request)
CHANGED	Updated our Guava dependency from 18.0 to 20.0. This should unblock downstream products from upgrading to Guava 22.0. (Pull Request)
CHANGED	Updated our http-remoting dependency from 3.5.1 to 3.14.0. (Pull Request)

v0.74.0

23 January 2018

Type	Change
IMPROVED LOGS	AtlasDB internal table names are now safe for logging. (Pull Request)
IMPROVED METRICS	`BackgroundSweeperImpl` now logs if there’s an uncaught exception. Added 2 new outcomes for normal and abnormal shutdown to allow closer monitoring. (Pull Request)
IMPROVED	The `LockAwareTransactionManager` pre-commit checks that verify that locks are still held have been generalized to support arbitrary pre-commit checks. (Pull Request)
DEV BREAK	`AtlasDbConstants.GENERIC_TABLE_METADATA` is now safe for logging, if you are using this as the metadata to create table names that shouldn’t be logged in the internal logging framework, do not use this metadata. (Pull Request)
DEV BREAK IMPROVED	The `partitionStrategy` parameter in AtlasDB table metadata has been removed; products that explicitly specify partition strategies in their schemas will need to remove them. The value of this parameter was never actually read; behaviour would have been identical regardless of what this was specified to (if at all). This change was made to simplify the API and also remove any illusion that specifying the `partitionStrategy` would have done anything. (Pull Request)
DEV BREAK	The protobuf library has been upgraded to 3.5.1. Dependent projects will need to update their dependencies. (Pull Request)
FIXED	V2 Schemas which use `ValueType.BLOB` will now compile. Previously, compilation failed with an `IllegalArgumentException` from Java Poet, as we assumed Java versions of `ValueType` were always associated with object types. (Pull Request)
FIXED METRICS	`TokenRangeWriteLogger` now registers different metric names per table even if all are unsafe. We instead tag with an obfuscated version of the name which is safe for logging. (Pull Request)
FIXED	Stop sweeping when the sweep thread is interrupted. Previously, when services were shutting down, the background sweeper thread continuously logged warnings due to a closed `TransactionManager`. (Pull Request)
DEV BREAK	Removed `CassandraKeyValueServiceConfigManager`. If you’re affected by this, please contact the AtlasDB team. (Pull Request)

v0.73.1

16 January 2018

Type	Change
FIXED	Fix a NPE in that could happen in the Sweep background thread. In this scenario, sweep would get stuck and not be able to proceed. The regression was introduced with (#2860), in version 0.73.0. (Pull Request)
FIXED	Qos clients will query the service every 2 seconds instead of every client request. This should prevent too many requests to the service. (Pull Request)
FIXED	All Atlas executor services now run tasks wrapped in http-remoting utilities to preserve trace logging. (Pull Request) (Pull Request)

v0.73.0

16 January 2018

Type	Change
IMPROVED	On Cassandra KVS, sweep reads data from Cassandra in parallel, resulting in improved performance. The parallelism can be changed by adjusting `sweepReadThreads` in Cassandra KVS config (default 16). (Pull Request)
IMPROVED	AtlasDB now throws an error during schema code generation stage if index table name length exceeds KVS table name length limits. To override this, please specify `ignoreTableNameLengthChecks()` on your schema. (Pull Request)

v0.73.0-rc2

12 January 2018

Type	Change
NEW	Qos Service: AtlasDB now supports a QosService which can rate-limit clients. Please note that this feature is currently experimental; if you wish to use it, please contact the AtlasDB team. (Pull Request)
NEW	The JDBC URL for Oracle can now be overridden in the configuration. The parameter path is `keyValueService/connection/url`. (Pull Request)

v0.73.0-rc1

11 January 2018

Type	Change
IMPROVED LOGS METRICS	Allow StreamStore table names to be marked as safe. This will make StreamStore tables appear correctly on our logs and metrics. When building a StreamStore, please use `.tableNameLogSafety(TableMetadataPersistence.LogSafety.SAFE)` to mark the table name as safe. (Pull Request)
IMPROVED	Sweep stats are updated more often when large writes are being made. `SweepStatsKVS` now tracks the size of modifications being made to the underlying KVS and will write when a threshold is passed. Previously, sweep stats were updated every 65536 writes, but this could be a significant amount of data if written to the stream store. We now also track the size of the writes and if this is greater than 1GB, we flush the stats. (Pull Request)
IMPROVED	Improvements to how sweep prioritises which tables to sweep; should allow better reclaiming of space from stream stores. Stream store value tables are now more likely to be chosen because they contain lots of data per write. We ensure we sweep index tables before value tables, and allow a gap after sweeping index tables and before sweeping value tables. We wait 3 days between sweeps of a value table to prevent unnecessary work, allow other tables to be swept and tombstones to be compacted away. (Pull Request)
FIXED	`SweepResults.getCellTsPairsExamined` now returns the correct result when sweep is run over multiple batches. Previously, the result would only count cell-ts pairs examined in the last batch. (Pull Request)
FIXED	Further reduced memory pressure on sweep for Cassandra KVS, by rewriting one of the CQL queries. This removes a significant cause of occurrences of Cassandra OOMs that have been seen in the field recently. However, performance is significantly degraded on tables with few columns and few overwrites (fixed in 0.73.0). (Pull Request 1 and Pull Request 2)
FIXED LOGS	Safe and Unsafe table name logging args are now different, fixed unreleased bug where tables names were logged as Safe (Pull Request)
LOGS	Messages to the `slow-lock-log` now log at `WARN` rather than `INFO`, these messages can indicate a problem so we should be sure they are visible. (Pull Request)
DEV BREAK	For clarity, we renamed `ForwardingLockService` to `SimplifyingLockService`, since this class also overwrote some of its parent’s methods. Also, its `delegate` method is now public. (Pull Request)
IMPROVED	Tritium was upgraded to 0.9.0 (from 0.8.4), which provides functionality for de-registration of tagged metrics. (Pull Request)

v0.72.0

13 December 2017

Type	Change
NEW IMPROVED METRICS LOGS	Sweep metrics were reworked. Sweep now exposes metrics indicating the total number of cells examined, cells deleted, time spent sweeping, and time elapsed since sweep started on the current table that are updated after each iteration of sweep and separate metrics that are updated after each table is fully swept. Additionally, sweep now exposes metrics tagged with table names that expose the total number of cells examined, cells deleted, time spent sweeping per iteration for each table separately. Logs will also include the new timing information. Sweep now exposes the following metrics with the common prefix `com.palantir.atlasdb.sweep.metrics.SweepMetric.`: `cellTimestampPairsExamined.meter.currentIteration` `cellTimestampPairsExamined.histogram.currentTable` `cellTimestampPairsExamined.histogram.currentIteration` (tagged) `staleValuesDeleted.meter.currentIteration` `staleValuesDeleted.histogram.currentTable` `staleValuesDeleted.histogram.currentIteration` (tagged) `sweepTimeSweeping.meter.currentIteration` `sweepTimeSweeping.histogram.currentTable` `sweepTimeSweeping.histogram.currentIteration` (tagged) `sweepTimeElapsedSinceStart.currentValue.currentIteration` `sweepTimeElapsedSinceStart.histogram.currentTable` (Pull Request)
IMPROVED	AtlasDB now provides a configurable `compactInterval` (0 by default) option for Postgres, in the Postgres DDL Config. A vacuum will be kicked off an a table only if there hasn’t been one on the same table in the last `compactInterval`. This will prevent increasing load on Postgres due to queued up vacuums. We would suggest a value of 1-2 days (e.g. `2d` or `2 days`) for this config option and would encourage users to test this out and report the results back. We will modify the defaults once this has been field tested. (Pull Request)
FIXED	The `LeaderPingHealthCheck` supplied by `PaxosLeadershipCreator` now correctly reports the leadership state of nodes that believe themselves to be the leader. Previously, the health check would ping every other node in the cluster, resulting in leader nodes reporting that there are no leaders. (Pull Request)
FIXED	Fixed a bug in LockServiceImpl (caused by a bug in AbstractQueuedSynchronizer) where a race condition could cause a lock to become stuck indefinitely. (Pull Request)
DEV BREAK	Deleted the TTL duration field from the `Cell` class. The interface `ExpiringKeyValueService` and implementations `CassandraExpiringKeyValueService` and `CqlExpiringKeyValueService` have also been removed. Additionally, `StreamTableDefinitionBuilder.expirationStrategy` has been removed. We don’t believe that any of these fields or classes were used. (Pull Request)

v0.71.1

8 December 2017

Type	Change
FIXED	Removed an unused dependency from `atlasdb-api`, fixing a dependency clash in a downstream product. (Pull Request)

v0.71.0

7 December 2017

Type	Change
NEW	AtlasDB QoS: AtlasDB now allows clients to live-reloadably configure read and write limits (in terms of bytes) to rate-limit requests to Cassandra. AtlasDB clients will receive a `RateLimitExceededException` for requests that are throttled and should handle them appropriately. We provide an exception mapper `RateLimitExceededExceptionMapper` to map the throttling exceptions to `429`, but it is upto the application to register the exception mapper. Note that this is an experimental feature and applications should generally not enable it by default yet, unless the application has hard read-write limits. This should allow us to throttle dynamically in situations where the load on Cassandra is high. (Pull Request)
IMPROVED	AtlasDB publish of new releases is now done through the internal circleCI build instead of external circleCI. (Pull Request)
IMPROVED	AtlasDB no longer logs Cassandra retries at level WARN, thus reducing the volume of WARN logs by a significant factor. These logs are now available at INFO. (Pull Request)
FIXED	Sweep can now make progress after a restore and after the clean transactions CLI is run. Earlier, it would fail throwing a `NullPointerException` due to failure to read the commit ts. This would cause sweep to keep retrying without realising that it will never proceed forward. (Pull Request)
FIXED	Sweep will no longer run during KVS Migrations. (Pull Request)
NEW LOGS	Cassandra KVS now records how many writes have been made into each token range for each table. That information is logged at info every time a table is written to more than a threshold of times (currently 100 000 writes). These logs will be invaluable in more easily identifying hotspotting and for using targeted sweep. (Pull Request)
NEW METRICS	New metric added which reports the probability that a table is being written to unevenly. `com.palantir.atlasdb.keyvalue.cassandra.TokenRangeWritesLogger.probabilityDistributionIsNotUniform` is tagged with the table reference (where safe) and is a probability from 0.0 to 1.0 that the token ranges are being written to unevenly. Cassandra KVS only. (Pull Request)
NEW	`TimeLockAgent` exposes a new method, `getStatus()`, to be used by the internal TimeLock instance in order to provide a health check. (Pull Request)
DEV BREAK	Removed several utility methods that are used by AtlasDB code. `MathUtils` has been moved to our large internal product, which was the only place to use it. `MathUtils` (entire class) `IterableUtils` (`getFirst(it, defaultValue)`, `mergeIterators`, `partitionByHash`, `prepend`, `transformIterator`) `IteratorUtils.iteratorDifference` (Pull Request)
NEW	Added a CLI to read the punch table. The CLI receives an epoch time, in millis, and returns an approximation of the AtlasDB timestamp strictly before the given timestamp. (Pull Request)

v0.70.1

30 November 2017

Type

Change

DEV BREAK IMPROVED

The TransactionManagers builder now hooks up the metric registries passed in so that AtlasDB metrics are registered on the specified metric registries. Applications no longer should use the AtlasDbMetrics.setMetricRegistry method to specify a metric registry for AtlasDB.

TransactionManagers.builder()
    .config(config)
    .userAgent("ete test")
    .globalMetricRegistry(new MetricRegistry())
    .globalTaggedMetricRegistry(DefaultTaggedMetricRegistry.getDefault())
    .registrar(environment.jersey()::register)
    .addAllSchemas(ETE_SCHEMAS)
    .build()
    .serializable()

(Pull Request)

v0.70.0

30 November 2017

Type	Change
IMPROVED	When BackgroundSweeper decides to sweep a StreamStore VALUE table, first sweep the respective StreamStore INDEX table. Before we just swept the VALUE table, which ended up not deleting any values in the backing store. (Pull Request)
DEV BREAK METRICS	The method `AtlasDbMetrics.setMetricsRegistries` was added, to register both the MetricRegistry and the TaggedMetricRegistry. Please use it instead of the old `AtlasDbMetrics.setMetricsRegistry`. (Pull Request)
IMPROVED LOGS	All logging in `SnapshotTransaction` now marks its placeholder log arguments as either safe or unsafe. (Pull Request)
DEV BREAK IMPROVED	The previously deprecated `TransactionManagers.create()` methods have been removed. To create a `SerializableTransactionManager` please use the `TransactionManagers.builder()` to create a `TransactionManagers` object and then call its `serializable()` method. Furthermore, this builder now requires a `taggedMetricRegistry` argument, and is a staged builder, requiring all mandatory parameters to be specified in the following order: `TransactionManagers.config().userAgent().metricRegistry().taggedMetricRegistry()`. This avoid runtime errors due to failure to specify all required arguments. (Pull Request)
FIXED	Fixed a bug where setting `compressBlocksInDb` for stream store definitions would result in a much bigger than intended block size. This option is also deprecated, as we recommend `compressStreamsInClient` instead. (Pull Request)
FIXED	Fixed an edge case where sweep would loop infinitely on tables that contained only tombstones. (Pull Request)
FIXED METRICS	`MetricsManager` no longer outputs stack traces to WARN when a metric is registered for a second time. The stack trace can still be accessed by turning on TRACE logging for `com.palantir.atlasdb.util.MetricsManager`. (Pull Request)
IMPROVED DEV BREAK	AtlasDB now wraps `NotCurrentLeaderException` in `AtlasDbDependencyException` when this exception is thrown by TimeLock. (Pull Request)
IMPROVED	Sweep no longer fetches any values from Cassandra in CONSERVATIVE mode. This results in significantly less data being transferred from Cassandra to the client when sweeping tables with large values, such as stream store tables. (Pull Request)

v0.69.0

23 November 2017

Type	Change
FIXED	Fixed the deletion of values from the StreamStore when configured to hash rowComponents. Previously, due to a deserialization bug, we wouldn’t delete the streamed data. If you think you’re affected by this bug, please contact the AtlasDB team to migrate away from this behavior. (Pull Request)
FIXED	We now avoid Cassandra timeouts caused by running unbounded CQL range scans during sweep. In order to assign a bound, we prefetch row keys using thrift, and use these bounds to page internally through rows. This issue affected tables configured to use THOROUGH sweep strategy — which could accumulate many rows entirely made up of tombstones — when Cassandra is configured as the backing store. (Pull Request)
IMPROVED	Applications can now easily determine whether their AtlasDB cluster is healthy by querying `TransactionManager.getKeyValueServiceStatus().isHealthy()`. This returns true only if all key value service nodes are up; applications that have sweep and scrub disabled and do not perform schema mutations can also treat `KeyValueServiceStatus.HEALTHY_BUT_NO_SCHEMA_MUTATIONS_OR_DELETES` as a healthy state. (Pull Request)
IMPROVED DEV BREAK	AtlasDB will now consistently throw an `AtlasDbDependencyException` when requests fail due to TimeLock being unavailable. (Pull Request)
FIXED	`Throwables.createPalantirRuntimeException` once again throws `PalantirInterruptedException` if the original exception was either `InterruptedException` or `InterruptedIOException`. This reverts behaviour introduced in 0.67.0, where we instead threw `PalantirRuntimeException`. (Pull Request)
IMPROVED	Sweep now waits 1 day after generating a large number of tombstones before sweeping a table again. This behavior only applies when using Cassandra. (Pull Request)
FIXED LOGS	`CassandraKeyValueServiceImpl.compactInternally` no longer logs an error when no compaction manager is configured. This message is instead logged once, when the CKVS is instantiated. (Pull Request)

v0.68.0

16 November 2017

Type	Change
FIXED	HTTP clients for endpoints relating to the Paxos protocols (`PingableLeader`, `PaxosAcceptor` and `PaxosLearner`) now reset themselves after 500 million requests have been executed. This was implemented as a workaround for OkHttp #3670 where HTTP/2 connections managed in OkHttp would fail after just over a billion requests owing to an unexpected integer overflow. (Pull Request)

v0.67.0

15 November 2017

Type	Change
NEW	AtlasDB clients are now able to live reload TimeLock URLs. This is required for internal work on running services in Kubernetes. We still require that clients are configured to use TimeLock (as opposed to a leader, remote timestamp/lock or embedded service) at install time. Note that this change does not affect TimeLock Server, which still requires knowledge of the entire cluster as well. Please consult the documentation for more detail regarding the config changes needed. (Pull Request 1 and Pull Request 2)
DEPRECATED	The `servers` block within an AtlasDB `timelock` block is now deprecated. Please use the live-reloadable `servers` block within the `timelockRuntime` block of the runtime configuration instead. (Pull Request 2)
IMPROVED	AtlasDB clients using TimeLock can now start up with knowledge of zero TimeLock nodes. Requests to TimeLock will throw `ServiceUnavailableException` until the config is live reloaded with one or more nodes. If live reloading causes the number of nodes to fall to zero, we also fail gracefully; `ServiceUnavailableException` will be thrown until the config is live reloaded with one or more nodes. Note that this does not affect remote timestamp, lock or leader configurations; those still require at least one server. Also, note that if one is using TimeLock without async initialization, then one still needs to provide information about the TimeLock cluster on startup. (Pull Request 3)
IMPROVED DEV BREAK	`ServerListConfig` can now be created with zero servers, as part of work supporting Atlas clients starting up without knowing TimeLock nodes. This is strictly more permissive, but may affect developers that use `ServerListConfig` directly, especially if it is being serialized. (Pull Request 3)
IMPROVED LOGS	kvs-slow-log was added on all Cassandra calls. As with the original kvs-slow-log logs, the added logs have the `kvs-slow-log` origin. To see the exact log messages, check the `ProfilingCassandraClient` class. (Pull Request)
NEW METRICS	Metrics were added on all Cassandra calls. The `CassandraClient` interface was Tritium instrumented. The following metrics have been added, with the common prefix (package) `com.palantir.atlasdb.keyvalue.cassandra.`: `CassandraClient.multiget_slice` `CassandraClient.get_range_slices` `CassandraClient.batch_mutate` `CassandraClient.get` `CassandraClient.cas` `CassandraClient.execute_cql3_query` Note that the table calls mainly use the first three metrics of the above list. (Pull Request)
NEW METRICS	Metrics recording the number of Cassandra requests, and the amount of bytes read and written from and to Cassandra were added: The following metrics have been added, with the common prefix (package) `com.palantir.atlasdb.keyvalue.cassandra.`: `QosMetrics.numReadRequests` `QosMetrics.numWriteRequests` `QosMetrics.bytesRead` `QosMetrics.bytesWritten` (Pull Request)
NEW METRICS	Added metrics for cells read. The read cells can be post-filtered at the CassandraKVS layer, when there are multiple versions of the same cell. The filtered cells are recorded in the following metrics have been added, with the common prefix (package) `com.palantir.atlasdb.keyvalue.cassandra.`: `TimestampExtractor.notLatestVisibleValueCellFilterCount` `ValueExtractor.notLatestVisibleValueCellFilterCount` The cells returned from the KVS layer are then recorded at the metric with the prefix (package) `com.palantir.atlasdb.transaction.impl.`: `SnapshotTransaction.numCellsRead` Such cells can also be filtered out at the transaction layer, due to the Transaction Protocol. The filtered out cells are recorded in the metrics: `SnapshotTransaction.commitTsGreaterThatTxTsCellFilterCount` `SnapshotTransaction.invalidStartTsTsCellFilterCount` `SnapshotTransaction.invalidCommitTsCellFilterCount` `SnapshotTransaction.emptyValuesCellFilterCount` At last, the metric that record the number of cells actually returned to the AtlasDB client is: `SnapshotTransaction.numCellsReturnedAfterFiltering` (Pull Request)
NEW METRICS	Added metrics for written bytes at the Transaction layer: `com.palantir.atlasdb.transaction.impl.SnapshotTransaction.bytesWritten` (Pull Request)
NEW METRICS	A metric was added for the cases where a large read was made: `com.palantir.atlasdb.transaction.impl.SnapshotTransaction.tooManyBytesRead` Note that we also log a warning in these cases, with the message “A single get had quite a few bytes…”. (Pull Request)
IMPROVED DEV BREAK	AtlasDB will now consistently throw a `InsufficientConsistencyException` if Cassandra reports an `UnavailableException`. Also, all exceptions thrown at the KVS layer, as `KeyAlreadyExists` or `TTransportException` and `NotInitializedException` were wrapped in `AtlasDbDependencyException` in the interest of consistent exceptions. (Pull Request)
IMPROVED LOGS	`SweeperServiceImpl` now logs when it starts sweeping and makes it clear if it is running full sweep or not (Pull Request)
FIXED METRICS	`MetricsManager` now logs failures to register metrics at `WARN` instead of `ERROR`, as failure to do so is not necessarily a systemic failure. Also, we now log the name of the metric as a Safe argument (previously it was logged as Unsafe). (Pull Request)
FIXED	`SweepBatchConfig` values are now decayed correctly when there’s an error. `SweepBatchConfig` should be decreased until sweep succeeds, however the config actually oscillated between values, these were normally small but could be larger than the original config. This was caused by us fixing one of the values at 1. `SweepBatchConfig` values will now be halved with each failure until they reach 1 (previously they only went to about 30% due to another bug). This ensures we fully backoff and gives us the best possible chance of success. Values will slowly increase with each successful run until they are back to their default level. (Pull Request)
IMPROVED	AtlasDB now depends on Tritium 0.8.4, which depends on the same version of `com.palantir.remoting3` and `HdrHistogram` as AtlasDB. (Pull Request)
FIXED	Check that immutable timestamp is locked on write transactions with no writes. This could cause long-running readers to read an incorrect empty value when the table had the `Sweep.THOROUGH` strategy. (Pull Request)
FIXED	Paxos value information is now correctly being logged when applicable leader events are happening. (Pull Request)

v0.66.0

7 November 2017

Type	Change
IMPROVED	AtlasDB now depends on Tritium 0.8.3, allowing products to upgrade Tritium without running into `NoClassDefFound` and `NoSuchField` errors. (Pull Request)

v0.66.0-rc2

6 November 2017

Type	Change
IMPROVED	AtlasDB now depends on Tritium 0.8.1. (Pull Request)
IMPROVED	AtlasDB can now tag RC releases. (Pull Request)

v0.66.0-rc1

This version was skipped due to issues on release. No artifacts with this version were ever published.

v0.65.2

6 November 2017

Type	Change
FIXED	Reverted the Cassandra KVS executor PR (Pull Request) that caused a performance regression. (Pull Request)
FIXED	`CassandraTimestampBackupRunner` now logs the backup bound correctly when performing a backup as part of TimeLock migration. Previously, the bound logged would have been logged as `null` or as a relatively arbitrary byte array, depending on the content of the timestamp table when performing migration. (Pull Request)

v0.65.1

4 November 2017

Type	Change
IMPROVED	AtlasDB now depends on Tritium 0.8.0, allowing products to upgrade Tritium without running into `NoClassDefFound` errors. (Pull Request)
IMPROVED	Sweep is now more efficient and less susceptible to OOMs on Cassandra. Also, the default value for the sweep batch config parameter `candidateBatchSize` has been bumped up from `1` from `1024`. (Pull Request) (Pull Request)
FIXED	Fixed cursor leak when sweeping on oracle/postgres. (Pull Request)
IMPROVED	Sweep progress is now persisted as a blob and uses a KVS level table. This allows us to use check and set to avoid versioning the entries in the sweep progress table. As a result, loading of the persisted SweepResult which was previously linear in the size of the table being swept can be done in constant time. No migration is necessary as the data is persisted to a new table `_sweep_progress2`, however, sweep will ignore any previously persisted sweep progress. Note that this in particular means that any in-progress sweep will be abandoned and background sweep will choose a new table to sweep. (Pull Request)
IMPROVED LOGS	AtlasDB tables will now be logged as `ns.tablename` instead of `map[namespace:map[name:ns] tablename:tablename]`. (Pull Request)
FIXED	TracingKVS now has spans with safe names. (Pull Request)

v0.65.0

This version was skipped due to issues on release. No artifacts with this version were ever published.

v0.64.0

1 November 2017

Type	Change
FIXED	UUIDs can now be used in schemas again. Previously, schemas generated with UUIDs would reference the `java.util.UUID` class without importing it. (Pull Request)
IMPROVED	The executor used by the Cassandra KVS is now allowed to grow larger so that we can better delegate blocking to the underlying Cassandra client pools. Please note that for Cassandra users this may result in increased Atlas thread counts when receiving spikes in requests. The underlying throttling is the same, however, so Cassandra load shouldn’t be impacted. (Pull Request)
IMPROVED METRICS	`BackgroundSweeperImpl` now records additional metrics on how each run of sweep went. Metrics report the number of occurrences of each outcome in the 1 minute prior to the metrics being gathered. We may change this duration in the future. (Pull Request)
IMPROVED LOGS	Log host names in Cassandra* classes. (Pull Request)
FIXED	The executors used when async initializing objects are never shutdown anymore. You should be affected by this bug only if you had `AtlasDbConfig.initializeAsync = true`. Previously, we would shut down the executors after a successful initialization, which could lead to a race condition with the submission of a `cancelInitialization` task. (Pull Request)

v0.63.0

27 October 2017

Type	Change
FIXED	Fixed the deprecated `TransactionManagers.create` methods, by specifying a default user agent if none was provided. Previously, `TransactionManager` creation would have failed at runtime. (Pull Request)
IMPROVED METRICS	Metrics are now recorded for put/get operations around commit timestamps. (Pull Request)

v0.62.1

27 October 2017

Type	Change
FIXED	Updated our dependency on `http-remoting` to version `3.5.1`. (Pull Request)

v0.62.0

26 October 2017

Improvements

Type	Change
IMPROVED	`getRange` is now more efficient when scanning over rows with many updates in Cassandra if just a single column is requested. Previously, a range request in Cassandra would always retrieve all columns and all historical versions of each column, regardless of which columns were requested. Now, we only request the latest version of the specific column requested, if only one column is requested. Requesting multiple columns still results in the previous behavior, however this will also be optimized in a future release. (Pull Request)
DEPRECATED IMPROVED	`SerializableTransactionManager` is now created via an immutable builder instead of a long list of individual arguments. Use `TransactionManagers.builder()` to get the builder and once completely configured, build the transaction manager via the builder’s `.buildSerializable()` method. The existing `create` methods are deprecated and will be removed by November 15th, 2017. (Pull Request)
DEV BREAK IMPROVED	`TransactionManagers.builder()` no longer has a `callingClass(..)` method and now requires the consumer to directly specify their user agent via the previously optional method `userAgent(..)`. All of the `TransactionManagers.create(..)` methods are still deprecated, and can be used to specify an empty user-agent. We use the user-agent on logs and metrics, so specifying it helps us to diagnose issues in the future. (Pull Request)
IMPROVED	The duration between attempts of whitelist Cassandra nodes was reduced from 5 minutes to 2 minutes, and the minimum period a node is blacklisted for was reduced from 2 minutes to 30 seconds. This means we check the health of a blacklisted Cassandra node and whitelist it faster than before. (Pull Request)
DEV BREAK IMPROVED	The size of the transaction cache is now configurable. It is not anticipated end users will need to touch this; it is more likely that this will be configured via per-service overrides for the services for whom the current cache size is inadequate. If needed, configuring this parameter is available under the AtlasDbRuntimeConfig with the name timestampCacheSize. This is a small API change for users manually constructing a TransactionManager, which now requires a transaction cache size parameter. Please add it from the AtlasDbRuntimeConfig, or instead of manually creating a TransactionManager, utilize the builder in TransactionManagers to have this done for you. Note that even if the config is changed at runtime, the size of the cache doesn’t change dynamically until 2565 is resolved. (Pull Request 1) (Pull Request 2)
IMPROVED	Exposes another version of `getRanges` that uses a configurable concurrency level when not explicitly provided a value. This defaults to 8 and can be configured with the `KeyValueServiceConfig#defaultGetRangesConcurrency` parameter. Check the full configuration docs here. (Pull Request)
DEV BREAK IMPROVED	Simplify and annotate the constructors for `SerializableTransactionManager`. This should make the code of that class more maintainable. If you used one of the deleted or deprecated constructors, use the static `create` method. (Pull Request)

Logs and Metrics

Type	Change
IMPROVED METRICS	`SweepMetrics` are now updated at the end of every batch rather than cumulative metrics at the end of every table. This will provide more accurate metrics for when sweep is doing something. Sweeping through the sweep endpoint will now also contribute to these metrics — before it didn’t update any metrics which again distorted the view of what work sweep was doing on the DB. (Pull Request)
NEW METRICS	AtlasDB clients now emit metrics that track the immutable timestamp, unreadable timestamp, and current timestamp. These metrics should help in performing diagnosis of issues concerning Sweep and/or the lock service. (Pull Request)
FIXED METRICS	Timelock server no longer appends client names to metrics. Instead, each metric is aggregated across all clients. (Pull Request)
NEW METRICS	We now report metrics for Transaction conflicts. The metrics are a meter reported under the name `SerializableTransaction.SerializableTransactionConflict`. (Pull Request)
IMPROVED LOGS	Specified which logs from Cassandra* classes were Safe or Unsafe for collection, improving the data that we can collect for debugging purposes. (Pull Request)
IMPROVED USER BREAK LOGS	The `ProfilingKeyValueService` now reports its multipart log lines as a single line. This should improve log readability in log ingestion tools when AtlasDB is run in multithreaded environments. (Pull Request)
FIXED LOGS	TimeLock Server’s `ClockSkewMonitor` now attempts to contact all other nodes in the TimeLock cluster, even in the presence of remoting exceptions or clock skews. Previously, we would stop querying nodes once we encountered a remoting exception or detected clock skew. Also, the log line `ClockSkewMonitor threw an exception` which was previously logged every second when a TimeLock node was down or otherwise uncontactable is now restricted to once every 10 minutes. Note that the `clock.monitor-exception` metric is still incremented on every call, even if we do not log. (Pull Request)
FIXED LOGS	`ProfilingKeyValueService` now logs correctly when logging a message for `getRange`, `getRangeOfTimestamps` and `deleteRange`. Previously, the table reference was omitted, such that one might receive lines of the form `Call to KVS.getRange on table RangeRequest{reverse=false} with range 1504 took {} ms.`. (Pull Request)
FIXED METRICS	`MetricsManager` now supports de-registration of metrics for a given prefix. Previously, this would crash with a `ConcurrentModificationException` if metrics were actually being removed. (Pull Request)

Bug fixes

Type	Change
FIXED	`ProfilingKeyValueService` now logs correctly when logging a message for `getRange`, `getRangeOfTimestamps` and `DeleteRange`. Previously, the table reference was omitted, such that one might receive lines of the form `Call to KVS.getRange on table RangeRequest{reverse=false} with range 1504 took {} ms.`. (Pull Request)
FIXED	When AtlasDB thinks all Cassandra nodes are non-healthy, it logs a message containing “There are no known live hosts in the connection pool … We’re choosing one at random …”. The level of this log was reduced from ERROR to WARN, as it was spammy in periods of a Cassandra outage. (Pull Request)
FIXED	Timelock server will try to gain leadership synchronously when the first time a new client namespace is requested. Previously, the first request would always return 503. (Pull Request)
FIXED	`SerializableErrorDecoder` will decode errors properly instead of throwing `NullPointerException`. (Pull Request)
FIXED	Async Initialization now works with TimeLock Server. Previously, for Cassandra we would attempt to immediately migrate the timestamp bound from Cassandra to TimeLock on startup, which would fail if either of them was unavailable. For DBKVS or other key-value services, we would attempt to ping TimeLock on startup, which would fail if TimeLock was unavailable (though the KVS need not be available). (Pull Request)
FIXED	`AsyncInitializer` now shuts down its executor after initialization has completed. Previously, the executor service wasn’t shut down, which could lead to the initializer thread hanging around unnecessarily. (Pull Request)
FIXED	Fixed an issue where a `waitForLocks` request could retry unnecessarily. (Pull Request)
FIXED	`InMemoryAtlasDbConfig` now has an empty namespace, instead of “test”. This means that internal products will no longer have to set their root-level namespace to “test” in order to use `InMemoryKeyValueService` for testing. (Pull Request)
DEV BREAK FIXED	Move `@CancelableServerCall` to a more fitting package that matches internal codebase. (Pull Request)

v0.61.1

19 October 2017

Type	Change
IMPROVED	Reverted the Sweep rewrite for Cassandra as it would unnecessarily load values into memory which could cause Cassandra to OOM if the values are large enough. (Pull Request)

v0.61.0

18 October 2017

Type	Change
IMPROVED	Sweep is now more efficient on Postgres and Oracle. (Pull Request)
IMPROVED	The `SweeperService` endpoint registered on all clients will now sweeps the full table by default, rather than a single batch. It also now returns information about how much data was swept. (Pull Request)
FIXED LOGS	Sweep candidate batches are now logged correctly. Previously, we would log a `SafeArg` for these batches that had no content. (Pull Request)

v0.60.1

16 October 2017

Type	Change
NEW IMPROVED	AtlasDB now supports asynchronous initialization, where `TransactionManagers.create()` creates a `SerializableTransactionManager` even when initialization fails, for instance because the KVS is not up yet. To enable asynchronous initialization, a new config option `initializeAsync` was added to AtlasDbConfig. If this option is set to true, `TransactionManagers.create()` first attempts to create a `SerializableTransactionManager` synchronously, i.e., consistent with current behaviour. If this fails, it returns a `SerializableTransactionManager` for which the necessary initialization is scheduled in the background and which throws a `NotInitializedException` on any method call until the initialization completes - this is, until the backing store becomes available. While waiting for AtlasDB to be ready, clients can poll `isInitialized()` on the returned `SerializableTransactionManager`. The default value for the config is `false` in order to preserve previous behaviour. (Pull Request 1 and Pull Request 2)
NEW	Timelock server can now be configured to persist the timestamp bound in the database, specifically in Cassandra/Postgres/Oracle. We recommend this to be configured only for cases where you absolutely need to persist all state in the database, for example, in special cases where backups are simply database dumps and do not have any mechanism for storing timestamps. This will help support large internal product’s usage of the Timelock server. (Pull Request)
DEV BREAK IMPROVED	In order to limit the access to inner methods, and to make the implementation of asynchronous initialization feasible, we’ve extracted interfaces and renamed the following classes: `CassandraClientPool` `CassandraKeyValueService` `LockStore` `PersistentTimestampService` Now the factory methods for the above classes return the interfaces. The actual implementation of such classes was moved to their corresponding *Impl files. (Pull Request)
DEV BREAK IMPROVED	`LockRefreshingTimelockService` has been moved to the `lock-api` project under the package name `com.palantir.lock.client`, and now implements `AutoCloseable`, shutting down its internal executor service. (Pull Request)
FIXED	`PersistentLockManager` can now reacquire the persistent lock if another process unilaterally clears the lock. Previously in this case, sweep would continually fail to acquire the lock until the service restarts. (Pull Request)
FIXED	`CassandraClientPool` no longer logs stack traces twice for every failed attempt to connect to Cassandra. Instead, the exception is logged once only, when we run out of retries. (Pull Request)
FIXED	The Sweep endpoint and CLI now accept start rows regardless of the case these are presented in. Previously, giving a start row with hex characters in lower case e.g. `deadbeef` would result in an `IllegalArgumentException` being thrown. (Pull Request)
DEV BREAK	Removed the following unnecessary classes related to wrapping KVSes: `NamespacedKeyValueService` `NamespaceMappingKeyValueService` `NamespacedKeyValueServices` `StaticTableMappingService` (Pull Request)
FIXED	Lock state logging will dump `expiresIn` of refreshed token, instead of original, which was negative after refreshing. (Pull Request)
FIXED	When using the TimeLock block and either the timestamp or the lock service threw an exception, we were throwing InvocationTargetException instead. We now throw the actual cause for the invocation exception. (Pull Request)

v0.60.0

This version was skipped due to issues on release. No artifacts with this version were ever published.

v0.59.1

04 October 2017

Type	Change
IMPROVED	Allow passing a ProxyConfiguration to allow setting custom proxy on the TimeLock clients. (Pull Request)

v0.59.0

04 October 2017

Type	Change
IMPROVED	Timestamp batching has now been enabled by default. Please see Timestamp Client Options for details. This should improve throughput and latency, especially if load is heavy and/or clients are communicating with a TimeLock cluster which is used by many services. Note that there may be an increase in latency under light load (e.g. 2-4 threads). (Pull Request)
NEW	AtlasDB now offers a simplified version of the schema API by setting the `enableV2Table()` flag in your TableDefinition. This would generate an additional table class with some easy to use functions such as `putColumn(key, value)`, `getColumn(key)`, `deleteColumn(key)`. We only provide these methods for named columns, and don’t currently support dynamic columns. You can add this to your current Schema, and use the new simplified APIs by using the V2 generated table. (Pull Request)
NEW	AtlasDB now offers specifying `hashFirstNRowComponents(n)` in Table and Index definitions. This prevents hotspotting by prepending the hashed concatenation of the row components to the row key. When using with prefix range requests, the hashed components must also be specified in the prefix. Adding this to an existing Schema is not supported, as that would require a data migration. (Pull Request)
NEW	AtlasDB now offers specifying `hashRowComponents()` in StreamStore definitions. This prevents hotspotting in Cassandra by prepending the hashed concatenation of the `streamId` and `blockId` to the row key. Adding this to an existing StreamStore is not supported, as that would require a data migration. (Pull Request)
FIXED	The `lock/log-current-state` endpoint now correctly logs the number of outstanding lock requests. (Pull Request)
FIXED	Fixed migration from JDBC KeyValueService by adding a missing dependency to the CLI distribution. (Pull Request)
FIXED	Oracle auto-shrink is now disabled by default. This is an experimental feature allowing Oracle non-EE users to compact automatically. We decided to turn it off by default since we have observed timeouts for large amounts of data, until we find a better retry mechanism for shrink failures. (Pull Request)
LOGS USER BREAK	AtlasDB no longer tries to register Cassandra metrics for each pool with the same names. We now add poolN to the metric name in CassandraClientPoolingContainer, where N is the pool number. This will prevent spurious stacktraces in logs due to failure in registering metrics with the same name. (Pull Request)
DEV BREAK FIXED	Adjusted the remoting-api library version to match the version used by remoting3. Developers may need to check your dependencies, but no other actions should be required. (Pull Request)
FIXED	Adjusted optimizer hints for getRange() to prevent Oracle from picking a bad query plan. (Pull Request)

v0.58.0

22 September 2017

Type	Change
LOGS	AtlasDB now logs slow queries CQL queries (via `kvs-slow-log`) used for sweep (Pull Request)
DEV BREAK FIXED	AtlasDB now depends on okhttp 3.8.1. This is expected to fix an issue where connections would constantly throw “shutdown” exceptions, which was likely due to a documented bug in okhttp 3.4.1. (Pull Request)
DEV BREAK IMPROVED	Upgraded all uses of http-remoting from remoting2 to remoting3, except for serialization of errors (preserved for backwards wire compatibility). Developers may need to check their dependencies, as well as update instantiation of their calls to `TransactionManagers.create()` to use the remoting3 API. Note that users of AtlasDB clients are not affected, in that the wire format of configuration files has not changed. (Pull Request)
FIXED	KVS migration no longer fails when the old `_scrub` table is present. This unblocks KVS migrations for users who have data in `_scrub` but have not migrated from `_scrub` to `_scrub2` yet. (Pull Request)
FIXED	Path and query parameters for TimeLock endpoints have now been marked as safe. Several logging parameters in TimeLock (e.g. in `PaxosTimestampBoundStore` and `PaxosSynchronizer`) have also been marked as safe. (Pull Request)
IMPROVED	The `LockServiceImpl` now, in addition to lock tokens and grants (which are unsafe for logging), also logs token and grant IDs (which are big-integer IDs) as safe. (Pull Request)
FIXED	Sweep log priority has been increased to INFO for logs of when a table 1. is starting to be swept, 2. will be swept with another batch, and 3. has just been completely swept. (Pull Request)

v0.57.0

19 September 2017

Type	Change
METRICS CHANGED	From this version onwards, AtlasDB’s metrics no longer have unbounded multiplicity. This means that AtlasDB can be whitelisted in the internal metrics aggregator tool.
METRICS USER BREAK	AtlasDB no longer embeds Cassandra host names in its metrics. Aggregate metrics are retained in both CassandraClientPool and CassandraClientPoolingContainer. This was necessary for compatibility with an internal log-ingestion tool. (Pull Request)
METRICS USER BREAK	AtlasDB no longer embeds table names in Sweep metrics. Sweep aggregate metrics continue to be reported. This was necessary for compatibility with an internal log-ingestion tool. (Pull Request)
DEV BREAK FIXED	AtlasDB now depends on okhttp 3.8.1. This is expected to fix an issue where connections would constantly throw “shutdown” exceptions, which was likely due to a documented bug in okhttp 3.4.1. (Pull Request)
DEV BREAK FIXED	The `ConcurrentStreams` class has been deleted and replaced with calls to `MoreStreams.blockingStreamWithParallelism`, from streams. (Pull Request)
DEV BREAK IMPROVED	TimeLockAgent’s constructor now accepts a Supplier instead of an RxJava Observable. This reduces the size of the TimeLock Agent jar, and removes the need for a dependency on RxJava. To convert an RxJava Observable to a Supplier that always returns the most recent value, consider the method blockingMostRecent as implemented here. (Pull Request)
IMPROVED	`BatchingVisitableView` methods `immutableCopy`, `immutableSetCopy`, and `copyInto` use the default batch hint of 1000, instead of a batch hint of 100,000. We previously defaulted to the higher value because the result set was assumed to be small; however, in practice this has turned out not to be the case, leading to timeouts and OOMs in the field. To use a custom batch hint, set the `batchHint` property for your `RangeRequest`. Alternatively, call `BatchingVisitableView.hintBatchSize(int)` before making a copy. (Pull Request)
IMPROVED	AtlasDB table definitions now support specifying log safety without having to also specify value byte order for row components. (Pull Request)

v0.56.1

14 September 2017

Type	Change
IMPROVED	The new concurrent version of Transaction#getRanges did not correctly guarantee ordering of the results returned in its stream. We now make sure the resulting ordering matches that of the input RangeRequests. (Pull Request)

v0.56.0

12 September 2017

Type	Change
NEW	TimelockServer now exposes the LockService instead of the RemoteLockService if using the synchronous lock service. This will provide a more comprehensive API which is required by the large internal products. (Pull Request)
USER BREAK NEW	Timelock clients now report tritium metrics for the lock requests with the prefix `LockService` instead of `RemoteLockService`. (Pull Request)
DEV BREAK	LockAwareTransactionManager now returns a LockService instead of a RemoteLockService in order to expose the new API. Any products that extend this class will have to change their class definition. (Pull Request)
NEW	Added two new methods to Transaction, getRangesLazy and a concurrent version of getRanges, which are also exposed in the Table API. If you expect to only use a small amount of the rows in the provided ranges, it is often advisable to use the new getRangesLazy method and serially iterate over the results. Otherwise, you should use the new version of getRanges that allows explicitly operating on the resulting visitables in parallel. (Pull Request)
DEPRECATED	The existing getRanges method has been deprecated as it would eagerly load the first page of all ranges, potentially concurrently. This often caused more data to be fetched than necessary or higher concurrency than expected. Recommended alternative is to use getRanges with a specified concurrency level, or getRangesLazy. (Pull Request)
USER BREAK FIXED	AtlasDB no longer embeds user-agents in metric names. This affects both AtlasDB clients as well as TimeLock Server. All metrics are still available; however, metrics which previously included a user-agent component will no longer do so. For example, the timer `com.palantir.timestamp.TimestampService.myUserAgent_version.getFreshTimestamp` is now named `com.palantir.timestamp.TimestampService.getFreshTimestamp`. This was necessary for compatibility with an internal log-ingestion tool. (Pull Request)
IMPROVED	LockServerOptions now provides a builder, which means constructing one should not require overriding methods. (Pull Request)
NEW	Oracle will now validate connections by running the test query when getting a new connection from the HikariPool. (Pull Request)
IMPROVED	Cassandra range concurrency defaults lowered from 64x to 32x, to reflect default connection pool sizes that have shrank over time, and to be more appropriate for fairly common smaller 3-node clusters. (Pull Request)

v0.55.0

01 September 2017

Type	Change
USER BREAK	If AtlasDB is used with TimeLock, and the TimeLock client name is different than either the Cassandra `keyspace`, Postgres `dbName`, or Oracle `sid`, AtlasDB will fail to start. This was done to avoid the risk of data corruption if these are accidentally changed independently. If the above parameters contradict, please contact the AtlasDB team to change the TimeLock client name. Changing it in config without additional action may result in severe data corruption. (Pull Request)
NEW	AtlasDB introduces a top-level `namespace` configuration parameter, which is used to set the `keyspace` in the keyValueService when using Cassandra and the `client` in TimeLock. Following the previous change, we unify both the configs that cannot be changed separately in one single config. Therefore it is suggested that AtlasDB users follow and use the new parameter to specify both the deprecated ones. Note that if the new `namespace` config contradicts with either the Cassandra `keyspace` and/or the TimeLock `client` configs, AtlasDB will fail to start. Please consult the documentation for AtlasDB Configuration for details on how to set this up. (Pull Request)
DEPRECATED	As a followup of the `namespace` change, the Cassandra `keyspace` and TimeLock `client` configs were deprecated. As said previously, please use the `namespace` root level config to specify both of these parameters. (Pull Request)
NEW	Oracle SE will now automatically trigger table data shrinking to recover space after sweeping a table. You can disable the compaction by setting `enableShrinkOnOracleStandardEdition` to `false` in the Oracle DDL config. (Pull Request)
FIXED	Fixed an issue where sweep logs would get rolled over sooner than expected. The number of log files stored on disk was increased from 10 to 90 before rolling over. (Pull Request)

v0.54.0

25 August 2017

Type	Change
NEW	Timelock clients now report tritium metrics for the `TimestampService` even if they are using the request batching service. Note when setting up metric graphs, the timestamp service metrics are named with `...Timelock.<getFreshTimestamp/getFreshTimestamps>` when not using request batching, but as `...Timestamp.<getFreshTimestamp/getFreshTimestamps>` if using request batching. The lock service metrics are always reported as `...Timelock.<lock/unlock/etc>` for timelock clients. (Pull Request)
FIXED	AtlasDB clients now report tritium metrics for the `TimestampService` and `LockService` endpoints just once instead of twice. In the past, every request would be reported twice leading to number bloat and more load on the metric collector service. (Pull Request)
FIXED	`kvs-slow-log` now uses `logsafe` to support sls-compatible logging. (Pull Request)
FIXED	The scrubber queue no longer grows without bound if the same cell is overwritten multiple times by hard delete transactions. (Pull Request)
IMPROVED	If `enableOracleEnterpriseFeatures` is configured to be false, you will now see warnings asking you to run Oracle compaction manually. This will help make non-EE Oracle users aware of potential database bloat. (Pull Request)
FIXED	Fixed a case where logging an expection suppressing itself would cause a stack overflow. See LOGBACK-1027. (Pull Request)
NEW	AtlasDB now produces a new artifact, `timelock-agent`. Users who wish to run TimeLock Server outside of a Dropwizard environment should now be able to do so more easily, by supplying the TimeLock Agent with a registrar that knows how to register Java resources and expose suitable HTTP endpoints. (Pull Request)
IMPROVED	TimeLock now creates client namespaces the first time they are requested, rather than requiring them to be specified in config. This means that specifying a list of clients in Timelock configuration will no longer have any effect. Further, a new configuration property called `max-number-of-clients` has been introduced in `TimeLockRuntimeConfiguration`. This can be used to limit the number of clients that will be created dynamically, since each distinct client has some memory, disk space, and CPU overhead. (Pull Request)
DEPRECATED	`putUnlessExists` methods in schema generated code have been marked as deprecated as the naming can be misleading, leading to accidental value overwrites. The recommended alternative is doing a separate read and write in a single transaction. (Pull Request)
FIXED	CharacterLimitType now has fields marked as final. (Pull Request)
CHANGED	The `RangeMigrator` interface now contains an additional method `logStatus(int numRangeBoundaries)`. This method is used to log the state of migration for each table when starting or resuming a KVS migration. (Pull Request)
CHANGED	Updated our dependency on `sls-packaging` from 2.3.1 to 2.4.0. (Pull Request)

v0.53.0

9 August 2017

Type	Change
FIXED	KVS migrations will no longer verify equality between the from and to KVSes for the sweep priority and progress tables. Note that these tables are still migrated across, as they provide heuristics for timely sweeping of tables. However, these tables may change during the migration, without affecting correctness (e.g. the from-kvs could be swept). Previously, we would attempt to check that the sweep tables were equal on both KVSes, leading to spurious validation failures. (Pull Request)
NEW	AtlasDB now supports specifying the safety of table names as well as row and column component names following the palantir/safe-logging library. Please consult the documentation for Tables and Indices for details on how to set this up. As AtlasDB regenerates its metadata on startup, changes will take effect after restarting your AtlasDB client (in particular, you do NOT need to rerender your schemas.) Previously, all table names, row component names and column names were always treated as unsafe. (Pull Request 1, Pull Request 2 and Pull Request 3)
IMPROVED	The `ProfilingKeyValueService` and `SpecificTableSweeper` now log table names as safe arguments, if and only if these have been specified as safe in one’s schemas. Previously, these were always logged as unsafe. (Pull Request)
DEV BREAK	AtlasDB now throws an error during schema code generation stage if table length exceeds KVS limits. To override this, please specify `ignoreTableNameLengthChecks()` on your schema. (Pull Request)
DEV BREAK	`NameComponentDescription` is now a `final` class and has a builder instead of constructors. This will affect any products which have subclassed `NameComponentDescription`, although we are not aware of any. (Pull Request)
DEV BREAK	IteratorUtils.forEach removed; it’s not needed in a Java 8 codebase. (Pull Request)

v0.52.0

1 August 2017

Type

Change

FIXED

Fixed a critical bug in Oracle that limits the number of writes with values greater than 2000 bytes to Integer.MAX_VALUE. (Pull Request)

FIXED

Change schemas in the codebase so that they use JAVA8 Optionals instead of Guava. (Pull Request)

DEV BREAK

Removed unused classes on AtlasDB.

FutureClosableIteratorTask

ClosableMergedIterator

ThrowingKeyValueService

If any issues arise from this change, please contact the development team. (Pull Request)

v0.51.0

28 July 2017

Type	Change
FIXED	For DbKvs, the `actualValues` field is now populated when a `CheckAndSetException` is thrown. (Pull Request)

v0.50.0

27 July 2017

Type	Change
FIXED USER BREAK	TimeLock Server, if configured to use the async lock service, will now throw if a client attempts to start a transaction via the sync lock service. Previously, users which have clients (for the same namespace) running both pre- and post-0.49.0 versions of AtlasDB were able to run transactions against the sync and async lock services concurrently, thus breaking the guarantees of the lock service. AtlasDB does not support having clients (for the same namespace) running both pre- and post-0.49.0 versions. Note that TimeLock users which have clients (for different namespaces) running both pre- and post-0.49.0 versions will need to turn this feature off for clients on pre-0.49.0 versions to continue working with TimeLock, and should exercise caution in ensuring that, for each namespace, clients use only pre- or post-0.49.0 versions of AtlasDB. Please see Async Lock Service Configuration for documentation. (Pull Request)
USER BREAK	TimeLock Server has moved its parameter `useAsyncLockService` to be within an `asyncLock` block. This was done as we wanted to keep the configuration options for the async lock service together. The parameter remains optional, and users not configuring this parameter are unaffected. (Pull Request)
IMPROVED	`gc_grace_seconds` will now be automatically updated for services running against CassandraKVS on startup. We reduced `gc_grace_seconds` from four days to one hour in 0.42.0 but that is enforced for new tables and not the existing ones. Updating `gc_grace_seconds` can be an expensive operation and users should expect the service to block for a while on startup. However, this shouldn’t be a concern unless the count of tables is in the order of 100s. If you think this will be an issue, please configure the `gcGraceSeconds` parameter in Cassandra KVS config to 4 days (`4 * 24 * 60 * 60`), which was the previous default. (Pull Request)
FIXED	`RequestBatchingTimestampService` now works for AtlasDB clients using TimeLock Server once again. Previously in 0.49.0, clients using TimeLock Server and request batching would still request timestamps one at a time from the TimeLock Server. (Pull Request)
FIXED	`PaxosQuorumChecker` will now interrupt outstanding requests after a quorum response has been collected. This prevents the number of paxos request threads from growing without bound. (Pull Request)
IMPROVED DEV BREAK	OkHttp clients (created with `FeignOkHttpClients`) will no longer silently retry connections. We have already implemented retries, including retries from connection failures, at the Feign level in `FailoverFeignTarget`. If you require silent retry, please contact the AtlasDB team. (Pull Request)
IMPROVED USER BREAK	AtlasConsole database mutation commands (namely `put()` and `delete()`) are now disabled by default. To enable them, run AtlasConsole with the `--mutations_enabled` flag (Pull Request)
FIXED	Fixed a bug in AtlasConsole that caused valid table names not to be recognized. (Pull Request)
NEW	TimeLock Server now supports a `NonBlockingFileAppenderFactory` which prevents requests from blocking if the request log queue is full. To use this appender, the `type` property should be set to `non-blocking-file` in the logging appender configuration. Note that using this appender may result in request logs being dropped. (Docs) (Pull Request)
FIXED	Fixed a potential deadlock in `PersistentLockManager` that could prevent clients from shutting down if the persistent backup lock could not be acquired. (Pull Request)
NEW	New metrics have been added for tracking Cassandra’s approximate pool size, number of idle connections, and number of active connections. (Docs) (Pull Request)

v0.49.0

18 July 2017

Type	Change
IMPROVED	TimeLock Server now can process lock requests using async Jetty servlets, rather than blocking request threads. This leads to more stability and higher throughput during periods of heavy lock contention. To enable this behavior, use the `useAsyncLockService` option to switch between the new and old lock service implementation. This option defaults to `true`. (Pull Request)
DEV BREAK IMPROVED	The maximum time that a transaction will block while waiting for commit locks is now configurable, and defaults to 1 minute. This can be configured via the `transaction.lockAcquireTimeoutMillis` option in `AtlasDbRuntimeConfig`. This differs from the previous behavior, which was to block indefinitely. However, the previous behavior can be effectively restored by configuring a large timeout. If creating a `SerializableTransactionManager` directly, use the new constructor which accepts a timeout parameter. (Pull Request)
DEV BREAK	`randomBitCount` and `maxAllowedBlockingDuration` are deprecated and no longer configurable in `LockServerOptions`. If specified, they will be silently ignored. If your service relies on either of these configuration options, please contact the AtlasDB team. (Pull Request)
USER BREAK	This version of the AtlasDB client will require a version of Timelock server that exposes the new `/timelock` endpoints. Note that this only applies if running against Timelock server; clients running with embedded leader mode are not affected. (Pull Request)
USER BREAK	The timestamp batching functionality introduced in 0.48.0 is temporarily no longer supported when running with Timelock server. We will re-enable support for this in a future release.
FIXED	Fixed the broken `put()` command in AtlasConsole. You should now be able to insert and update data using Console. (Pull Request)
FIXED	Fixed an issue that could cause AtlasConsole to print unnecessary amounts of input when commands were run. (Pull Request)
USER BREAK	Remove Cassandra config option `safetyDisabled`; users should instead move to a more specific config for their situation, which are: `ignoreNodeTopologyChecks`, `ignoreInconsistentRingChecks`, `ignoreDatacenterConfigurationChecks`, `ignorePartitionerChecks` (Pull Request)
FIXED	`commons-executors` now excludes the `safe-logging` Java8 jar to support Java 6 clients. (Pull Request)
NEW	`TransactionManagers` exposes a method in which it is possible to specify the user agent to be used. (Pull Request)

v0.48.0

Type	Change
FIXED	If sweep configs are specified in the AtlasDbConfig block, they will be ignored, but AtlasDB will no longer fail to start. This effectively fixes the Sweep-related user break change of version `0.47.0`. Note that users of products that upgraded from `0.45.0` to `0.48.0` will need to move configuration overrides from the regular `atlasdb` config to the `atlasdb-runtime` config for them to continue taking effect. Please reference the Sweep configuration docs for more details. (Pull Request)
NEW	AtlasDB now supports batching of timestamp requests on the client-side; see Timestamp Client Options for details. On internal benchmarks, the AtlasDB team has obtained an almost 2x improvement in timestamp throughput and latency under modest load (32 threads), and an over 10x improvement under heavy load (8,192 threads). There may be a very small increase in latency under extremely light load (e.g. 2-4 threads). Note that this is not enabled by default. (Pull Request)
DEV BREAK	The `RateLimitingTimestampService` in `timestamp-impl` has been renamed to `RequestBatchingTimestampService`, to better reflect what the service does and avoid confusion with the `ThreadPooledLockService` (which performs resource-limiting). Products that do not use this class directly are not affected. (Pull Request)
FIXED DEV BREAK	`TransactionManager.close()` now closes the lock service (provided it is closeable), and also shuts down the Background Sweeper. Previously, the lock service’s background threads as well as background sweeper would continue to run (potentially indefinitely) even after a transaction manager was closed. Note that services that relied on the lock service being available even after a transaction manager was shut down may no longer behave properly, and should ensure that the transaction manager is not shut down while the lock service is still needed. (Pull Request)
FIXED	`commons-executors` now uses Java 6 when compiling from source and generates classes targeting Java 6. Java 6 support was removed in AtlasDB `0.41.0` and blocks certain internal products from upgrading to subsequent versions. (Pull Request)
FIXED	`LockServiceImpl.close()` is now idempotent. Previously, calling the referred method more than once could fail an assertion and throw an exception. (Pull Request)

v0.47.0

11 July 2017

Type

Change

IMPROVED

ErrorProne is enabled and not ignored on all AtlasDB projects. This means that AtlasDB can be whitelisted in the internal logging aggregator tool from this version ownards. (Pull Request) (Pull Request) (Pull Request) (Pull Request)

NEW IMPROVED

Background Sweep is enabled by default on AtlasDB. To understand what Background Sweep is, please check the sweep docs, in particular, the background sweep docs. (Pull Request)

USER BREAK DEV BREAK IMPROVED

Added support for live-reloading sweep configurations. TransactionManagers.create() methods now accept a Supplier of AtlasDbRuntimeConfig in addition to an AtlasDbConfig. If needed, the helper method defaultRuntimeConfig() can be used to create a runtime config with the default values. As part of this improvement, we made the sweep options of AtlasDbConfig unavailable. The following options now may not be specified in the install config and must instead be specified in the runtime config:

AtlasDbConfig

AtlasDbRuntimeConfig

enableSweep

enabled

sweepPauseMillis

pauseMillis

sweepReadLimit

readLimit

sweepCandidateBatchHint

candidateBatchHint

sweepDeleteBatchHint

deleteBatchHint

Specifying any of the above install options will result in AtlasDB failing to start. Check the full configuration docs here. (Pull Request)

FIXED

Fixed a bug that caused AtlasDB internal tables (e.g. the Transactions table or the Punch table) to be wiped when read from the AtlasDB Console. (Pull Request)

USER BREAK

The Atomix algorithm implementation for the TimeLock server and the corresponding configurations have been removed. The default algorithm for TimeLockServer has been changed to Paxos. This should not affect users as Atomix should not have been used due to known bugs in the implementation. (Pull Request)

USER BREAK

The previously deprecated RocksDBKVS has been removed. Developers that relied on RocksDB for testing should move to H2 on JdbcKvs. (Pull Request)

FIXED IMPROVED

Sweep now dynamically adjusts the number of (cell, ts) pairs across runs:

On a failure run, sweep halves the number of pairs to read and to delete on subsequent runs.
On a success run, sweep slowly increases the number of (cell, ts) pairs to read and to delete on subsequent runs, up to a configurable maximum.

This should fix the issue where we were unable to sweep cells with a high number of mutations. (Pull Request)

IMPROVED

Default configs which tune sweep runs were lowered, to ensure that sweep works in any situation. For more information, please check the sweep docs. Please delete any config overrides regarding sweep and use the default values, to ensure a sane run of sweep. (Pull Request)

NEW

AtlasDB now instruments embedded timestamp and lock services when no leader block is present in the config, to expose aggregate response time and service call metrics. Note that this may cause a minor performance hit. If that is a concern, the instrumentation can be turned off by setting the tritium system properties instrument.com.palantir.timestamp.TimestampService and instrument.com.palantir.lock.RemoteLockService to false and restarting the service. (Pull Request)

NEW

AtlasDB now adds endpoints for sweeping a specific table, with options for startRow and batch config parameters. This should be used in place of the deprecated sweep CLIs. Check the endpoints documentation here. (Pull Request)

IMPROVED

Improved performance of timestamp and lock requests on clusters with a leader block and a single node. If a single leader is configured, timestamp and lock requests will no longer use HTTPS/Jetty. In addition to the minor perf improvement, this fixes an issue causing livelock/deadlock when the leader is under heavy load. We recommend HA clusters under heavy load switch to using a standalone timestamp service, as they may also be vulnerable to this failure mode. (Pull Request)

IMPROVED DEV BREAK

The dropwizard independent implementation of the TimeLock server has been separated into a new project, timelock-impl. This should not affect users directly, unless they depended on classes from within the TimeLock server. (Pull Request)

FIXED

JDBC KVS now batches cells in put/delete operations via the config parameter batchSizeForMutations. This will prevent the driver from throwing due to many parameters in the resulting SQL select query. Also, the batch size for getRows is now controlled by a config parameter rowBatchSize. (Pull Request)

FIXED

AtlasDB clients now retry lock requests if the server loses leadership while the request is blocking. In the past, this scenario would cause the server to return 500 responses that were not retried by the client. Now the server returns 503 responses, which triggers the correct retry logic. (Pull Request)

FIXED

AtlasDB now generates Maven POM files for shadowed jars correctly. Previously, we would regenerate the XML for shadow dependencies by creating a node with corresponding groupId, artifactId, scope and version tags only, which is incorrect because it loses information about, for example, specific or transitive exclusions. We now respect these additional tags. (Pull Request)

FIXED

Fixed a bug where a timelock server instance could get stuck in an infinite loop if cutoff from the other nodes and failed to achieve a quorum. (Pull Request)

IMPROVED USER BREAK

Improved the way rows and named columns are outputted in AtlasConsole to be more intuitive and easier to use. Note that this may break existing AtlasConsole scripts. (Pull Request)

FIXED

Added backwards compatibility for the changes introduced in #2067, in particular, for passing row values into AtlasConsole functions. (Pull Request)

v0.46.0

This version was skipped due to issues on release. No artifacts with this version were ever published.

v0.45.0

19 June 2017

Type	Change
DEV BREAK IMPROVED	Upgraded all usages of http-remoting to remoting2. Previously, depending on the use case, AtlasDB would use http-remoting, remoting1 and remoting2. Developers may need to check their dependencies, as well as update instantiation of their calls to `TransactionManagers.create()` to use the remoting2 API. (Pull Request)
DEV BREAK IMPROVED	AtlasDB has updated Feign to 8.17.0 and OkHttp to 3.4.1, following remoting2 in the palantir/http-remoting library. We previously used Feign 8.6.1 and OkHttp 2.5.0. Developers may need to check their dependencies, especially if they relied on AtlasDB to pull in Feign and OkHttp as there were package name changes. (Pull Request 1) and (Pull Request 2)
DEV BREAK IMPROVED	AtlasDB now shades Feign and Okio (same as palantir/http-remoting). This was done to enable us to synchronize with remoting2 while limiting breaks for users of older versions of Feign, especially given an API break in Feign 8.16. Users who previously relied on AtlasDB to pull in these libraries may experience a compile break, and should consider explicitly depending on them. (Pull Request)
DEV BREAK IMPROVED	Converted all compile time uses of Guava’s `com.google.common.base.Optional` class to the Java8 equivalent `java.util.Optional`. This change should not directly affect users as there is no change to json or yml representations of AtlasDB configurations. This is a relatively straightforward compile time break for products consuming AtlasDB libraries. (Pull Request)
DEPRECATED IMPROVED	AssertUtils logging methods will now ask for an SLF4J logger to log to, instead of using a default logger. This should make log events from AssertUtils easier to filter. (Pull Request)
FIXED	JDBC KVS now batches cells in get operations via the config parameter `batchSizeForReads`. This will prevent the driver from throwing due to many parameters in the resulting SQL select query. (Pull Request)
FIXED	The CLI distribution can now be run against JDBC with hikari connection pools. In the past, it would fail to resolve the configuration due to a missing runtime dependency. Note: this is not a problem if running with the dropwizard bundle. (Pull Request)
FIXED	Fixed an issue where the lock service was not properly shut down after losing leadership, which could result in threads blocking unnecessarily. (Pull Request)
FIXED	Lock refresh requests are no longer restricted by lock service threadpool limiting. This allows transactions to make progress even when the threadpool is full. (Pull Request)
FIXED	Lock service now ensures that locks are reaped in a more timely manner. Previously the lock service could allow locks to be held past expiration, if they had a timeout shorter than the longest timeout in the expiration queue. (Pull Request)
NEW	Added a getRow() command to AtlasConsole for retrieving a single row. (Pull Request)
NEW	Added a rowComponents() function to the AtlasConsole table() command to allow you to easily view the fields that make up a row key. (Pull Request)
NEW	The default lock timeout is now configurable. Currently, the default lock timeout is 2 minutes. This can cause a large delay if a lock requester’s connection has died at the time it receives the lock. Since TransactionManagers#create provides an auto-refreshing lock service, it is safe to lower the default timeout to reduce the delay that happens in this case. (Pull Request)
IMPROVED	The priority of logging on background sweep was increased from debug to info or warn. (Pull Request)
IMPROVED	The lock service state logger now has a reduced memory footprint. It also now logs the locking mode for each lock. (Pull Request)
IMPROVED	Reduced the logging level of some messages relating to check-and-set operations in `CassandraTimestampBoundStore` to reduce noise in the logs. These were designed to help debugging the `MultipleRunningTimestampServicesException` issues but we no longer require them to log all the time. (Pull Request)

v0.44.0

8 June 2017

Type	Change
IMPROVED	`TimestampService` now uses atomic variables rather than locking, and refreshes the bound synchronously rather than asynchronously. This should improve performance somewhat under heavy load, although there will be a short pause in responses when the bound needs to be refreshed (currently, once every 1 million timestamps). (Pull Request)
IMPROVED	Added new meter metrics for cells swept/deleted and failures to acquire persistent lock. (Pull Request)
IMPROVED	Cassandra thrift driver has been bumped to version 3.10. This will fix a bug (#1654) that caused Atlas probing downed Cassandra nodes every few minutes to see if they were up and working yet to eventually take out the entire cluster by steadily building up leaked connections, due to a bug in the underlying driver. (Pull Request)
IMPROVED	Read-only transactions will no longer make a remote call to fetch a timestamp, if no work is done on the transaction. This will benefit services that execute read-only transactions around in-memory cache operations, and frequently never fall through to perform a read. (Pull Request)
IMPROVED	Timelock service now includes user agents for all inter-node requests. (Pull Request)
NEW	Timelock now tracks metrics for leadership elections, including leadership gains, losses, and proposals. (Pull Request)
FIXED	Fixed a severe performance regression in getRange() on Oracle caused by an inadequate query plan being chosen sometimes. (Pull Request)
FIXED	Fixed a potential out-of-memory issue by limiting the number of rows getRange() can request from Postgres at once. (Pull Request)
FIXED	KVS migration CLI will now clear the checkpoint tables that are required while the migration is in progress but not after the migration is complete. The tables were previously left hanging and the user had to delete/truncate them. (Pull Request)
DEV BREAK	Some downstream projects were using empty table metadata for dev-laziness reasons in their tests. This is no longer permitted, as it leads to many (unsolved) questions about how to deal with such a table. If this breaks your tests, you can fix it with making real schema for tests or by switching to AtlasDbConstants.GENERIC_TABLE_METADATA (Pull Request)
USER BREAK FIXED	Fixed a bug that caused Cassandra to always use the minimum compression block size of 4KB instead of the requested compression block size. Users must explicitly rewrite table metadata for any tables that requested explicit compression, as any tables that were created previously will not respect the compression block size in the schema. This can have a very large performance impact (both positive and negative in different cases), so users may need to remove the explicit compression request from their schema if this causes a performance regression. Users that previously attempted to set a compression block size that was not a power of 2 will also need to update their schema because Cassandra only allows this value to be a power of 2. (Pull Request)
FIXED	Fixed a potential out-of-memory issue by limiting the number of rows getRange() can request from Postgres at once. (Pull Request)

v0.43.0

25 May 2017

Type	Change
FIXED	For requests that fail due to to networking or other IOException, the AtlasDB client now backs off before retrying. (Pull Request)
USER BREAK IMPROVED	The `acquire-backup-lock` endpoint of `PersistentLockService` now returns a 400 response instead of a 500 response when no reason for acquiring the lock is provided. (Pull Request)
FIXED	`PaxosTimestampBoundStore` now throws `NotCurrentLeaderException`, invalidating the timestamp store, if a bound update fails because another timestamp service on the same node proposed a smaller bound for the same sequence number. This was added to address a very specific race condition leading to an infinite loop that would saturate the TimeLock cluster with spurious Paxos messages; see issue 1941 for more detail. (Pull Request)
DEPRECATED	The FastForwardTimestamp and FetchTimestamp CLIs have been deprecated. Please use the `timestamp-management/fast-forward` and `timestamp/fresh-timestamp` endpoints instead. (Pull Request)
IMPROVED	Sweep now batches delete calls before executing them. This should improve performance on relatively clean tables by deleting more cells at a time, leading to fewer DB operations and taking out the backup lock less frequently. The new configuration parameter `sweepDeleteBatchHint` determines the approximate number of (cell, timestamp) pairs deleted in a single batch. Please refer to the documentation for details of how to configure this. (Pull Request)
CHANGED	Sweep metrics now record counts of cell-timestamp pairs examined rather than the count of entire cells examined. This provides more accurate insight on the work done by the sweeper. (Pull Request)
DEPRECATED	The Sweep CLI configuration parameters `--batch-size` and `--cell-batch-size` have been deprecated, as we now batch on cell-timestamp pairs rather than by rows and cells. Please use the `--candidate-batch-hint` (batching on cells) instead of `--batch-hint` (batching on rows), and `--read-limit` instead of `--cell-batch-size` (docs). (Pull Request)
DEPRECATED	The background sweep configuration parameters `sweepBatchSize` (which used to batch on rows) and `sweepCellBatchSize` have been deprecated in favour of `sweepCandidateBatchHint` (which now batches on cells) and `sweepReadLimit` respectively. If your application configures either of these values, please look at more details in the docs. (Pull Request)
FIXED	After the Pull Request #1808 the TimeLock Server did not gate the lock service behind the `AwaitingLeadershipProxy`. This could lead to data corruption in very rare scenarios. The affected TimeLock server versions are not distributed anymore internally. (Pull Request)
FIXED	`TimestampAllocationFailures` now correctly propagates `ServiceNotAvailableException` if thrown from the timestamp bound store. Previously, a `NotCurrentLeaderException` that was thrown from the timestamp store would be wrapped in `RuntimeException` before being thrown out, meaning that TimeLock clients saw 500s instead of the intended 503s. This could lead to inneficient retry logic. (Pull Request)
DEV BREAK	New `KeyValueService` method `getCandidateCellsForSweeping()` that should eventually replace `getRangeOfTimestamps()`. (Pull Request)

v0.42.2

25 May 2017

Type

Change

FIXED

PaxosTimestampBoundStore now throws TerminalTimestampStoreException if a bound update fails because another timestamp service on the same node proposed a smaller bound, or if another node proposed a bound update we were not expecting. Previously, a NotCurrentLeaderException that was thrown from the timestamp store would be wrapped in RuntimeException before being thrown out, meaning that TimeLock clients saw 500s instead of the intended 503s. (Pull Request)

v0.42.1

24 May 2017

Type

Change

FIXED

PaxosTimestampBoundStore now throws NotCurrentLeaderException, invalidating the timestamp store, if a bound update fails because another timestamp service on the same node proposed a smaller bound for the same sequence number. This was added to address a very specific race condition leading to an infinite loop that would saturate the TimeLock cluster with spurious Paxos messages; see issue 1941 for more detail. (Pull Request)

v0.42.0

23 May 2017

Type	Change
FIXED	`PaxosTimestampBoundStore`, the bound store for Timelock, will now throw `NotCurrentLeaderException` instead of `MultipleRunningTimestampServiceError` when a bound update fails. The cases where this can happen are explained by a race condition that can occur after leadership change, and it is safe to let requests be retried on another server. (Pull Request)
FIXED	A 500 ms backoff has been added to the our retry logic when the client has queried all the servers of a cluster and received a `NotCurrentLeaderException`. Previously in this case, our retry logic would dictate infinitely many retries with a 1 ms backoff. The new backoff should reduce contention during leadership elections, when all nodes throw `NotCurrentLeaderException`. (Pull Request)
IMPROVED	Timelock server can now start with an empty clients list. Note that you currently need to restart timelock when adding clients to the configuration. (Pull Request)
IMPROVED	Default `gc_grace_seconds` set by AtlasDB for Cassandra tables has been changed from four days to one hour, allowing Cassandra to start cleaning up swept data sooner after sweeping. This parameter is set at table creation time, and it will only apply for new tables. Existing customers can update the `gc_grace_seconds` of existing tables to be one hour if they would like to receive this benefit now. We will also be adding functionality to auto-update this for existing tables in a future release. There is no issue with having tables with different values for `gc_grace_seconds`, and this can be updated at any time. (Pull Request)
IMPROVED	`ProfilingKeyValueService` now has some additional logging mechanisms for logging long-running operations on WARN level, enabled by default. (Pull Request)

v0.41.0

17 May 2017

Type	Change
USER BREAK CHANGED	Projects `atlasdb-commons`, `commons-annotations`, `commons-api`, `commons-executors`, `commons-proxy`, and `lock-api` no longer force Java 6 compatibility. This eliminates the need for a Java 6 compiler to compile AtlasDB. However, users can no longer compile against AtlasDB artifacts using Java 6 or 7; they must use Java 8 if depending on these AtlasDB projects. (Pull Request)
DEV BREAK IMPROVED	The format of serialized exceptions occurring on a remote host has been brought in line with that of the palantir/http-remoting library. This should generally improve readability and also allows for more meaningful messages to be sent; we would previously return message bodies with no content for some exceptions (such as `NotCurrentLeaderException`). In particular, the assumption that a status code of 503 definitively means that the node being contacted is not the leader is no longer valid. That said, existing AtlasDB clients will still behave correctly even with a new TimeLock. (Pull Request 1, Pull Request 2)
NEW	Timelock server now has jar publication in addition to dist publication. (Pull Request)
NEW FIXED	TimeLock clients may now receive an HTTP response with status code 503, encapsulating a `BlockingTimeoutException`. This response is returned if a client makes a lock request that blocks for long enough that the server’s idle timeout expires; clients may (immediately) retry the request. Previously, these requests would be failed with a HTTP-level exception that the stream was closed. We have rewritten clients constructed via `AtlasDbHttpClients` to account for this new behaviour, but custom clients directly accessing the lock service may be affected. This feature is disabled by default, but can be enabled following the TimeLock server configuration docs. (Pull Request)
FIXED	`DbKvs.getRangeOfTimestamps()` now returns the entire range of timestamps requested for. Previously, this would only return a range corresponding to the first page of results. (Pull Request)
FIXED	AtlasDB Console no longer errors on range requests that used a column selection and had more than one batch of results. (Pull Request)
IMPROVED	The `PaxosQuorumChecker` thread pool which is used to dispatch requests to other nodes during leadership elections is now instrumented with Dropwizard metrics. This will be useful for debugging PaxosQuorumChecker can leave hanging threads. (Pull Request)
FIXED	Import ordering and license generation in generated IntelliJ project files now respect Baseline conventions. (Pull Request)
FIXED IMPROVED	Cassandra thrift depedencies have been bumped to newer versions; should fix a bug (#1654) that caused Atlas probing downed Cassandra nodes every few minutes to see if they were up and working yet to eventually take out the entire cluster by steadily building up leaked connections, due to a bug in the underlying driver. (Pull Request)

v0.40.1

4 May 2017

This release contains (almost) exclusively baseline-related changes.

Type	Change
DEV BREAK	The Lock Descriptor classes (`AtlasCellLockDescriptor` etc.), static factories (e.g. `LockCollections`) and `LockClient` have been made final. If this is a concern, please contact the AtlasDB team. (Pull Request)
DEV BREAK	Removed package `atlasdb-exec`. If you require this package, please file a ticket to have it reinstated. (Pull Request)
CHANGED	Our dependency on immutables was bumped from 2.2.4 to 2.4.0, in order to fix an issue with static code analysis reporting errors in generated code. (Pull Request)
DEV BREAK	Renamed the following classes to match baseline rules. In each case, acronyms were lowercased, e.g. `CQL` becomes `Cql`. `CqlExpiringKeyValueService` `CqlKeyValueService` `CqlKeyValueServices` `CqlStatementCache` `KvTableMappingService` `TransactionKvsWrapper` (Pull Request)
DEV BREAK	Relax the signature of KeyValueService.addGarbageCollectionSentinelValues() to take an Iterable instead of a Set. (Pull Request)

v0.40.0

28 Apr 2017

Type	Change
USER BREAK	AtlasDB will refuse to start if backed by Postgres 9.5.0 or 9.5.1. These versions contain a known bug that causes incorrect results to be returned for certain queries. (Pull Request)
USER BREAK IMPROVED	The lock server now will dump all held locks and outstanding lock requests in YAML file, when logging state requested, for easy readability and further processing. This will make debuging lock congestions easier. Lock descriptors are changed with places holders and can be decoded using descriptors file, which will be written in the folder. Information like requesting clients, requesting threads and other details can be found in the YAML. Note that this change modifies serialization of lock tokens by adding the name of the requesting thread to the lock token; thus, TimeLock Servers are no longer compatible with AtlasDB clients from preceding versions. (Pull Request)
DEV BREAK FIXED	Correct `TransactionManagers.createInMemory(...)` to conform with the rest of the api by accepting a `Set<Schema>` object. (Pull Request)
NEW	The lock server can now limit the number of concurrent open lock requests from the same client. This behavior can be enabled with the flag `useClientRequestLimit`. It is disabled by default. For more information, see the docs. (Pull Request)
DEV BREAK NEW	The `TransactionManager` and `KeyValueService` interfaces have new methods that must be implemented by applications that have custom implementations of those interfaces. These new methods are `TransactionManager.getKeyValueServiceStatus()` and `KeyValueService.getClusterAvailabilityStatus()`. Applications can now call `TransactionManager.getKeyValueServiceStatus()` to determine the health of the underlying KVS. This is designed for applications to implement their availability status taking into account the kvs health. (Pull Request)
IMPROVED	On graceful shutdown, the background sweeper will now release the backup lock if it holds it. This should reduce the need for users to manually reset the `_persisted_locks` table in the event that they restarted a service while it was holding the lock. (Pull Request)
IMPROVED	Improved performance of `getRange()` on DbKvs. Range requests are now done with a single round trip to the database. (Pull Request)
DEV BREAK	`atlasdb-config` now pulls in two more dependencies - the Jackson JDK 8 and JSR 310 modules (`jackson-datatype-jdk8` and `jackson-datatype-jsr310`). These are required by the palantir/http-remoting library. This behaviour is consistent with our existing behaviour for Jackson modules (JDK 7, Guava and Joda Time). If you do encounter breaks due to this addition, please contact the AtlasDB team for support. (Pull Request)
DEPRECATED	`ConflictDetectionManagers.createDefault(KeyValueService)` has been deprecated. If you use this method, please replace it with `ConflictDetectionManagers.create(KeyValueService)`. (Pull Request 1) and (Pull Request 2)

v0.39.0

19 Apr 2017

Type	Change
IMPROVED	Refactored `AvailableTimestamps` reducing overzealous synchronization. Giving out timestamps is no longer blocking on refreshing the timestamp bound if there are enough timestamps to give out with the current bound. This improves latency of timestamp requests under heavy load; we have seen an approximately 30 percent improvement on internal benchmarks. (Pull Request)
NEW	The lock server now has a `SlowLockLogger`, which logs at INFO in the service logs if a lock request receives a response after at least a given amount of time (10 seconds by default). This is likely to be useful for debugging issues with long-running locks in production. Specifically, the timelock server has a configuration parameter `slowLockLogTriggerMillis` which defaults to `10000`. Setting this parameter to zero (or any negative number) will disable the new logger. If not using timelock, an application can modify the trigger value through `LockServerOptions` during initialization in `TransactionManagers.create`. (Pull Request)
DEPRECATED	Deprecated `InMemoryAtlasDbFactory#createInMemoryTransactionManager`, please instead use the supported `TransactionManagers.createInMemory(...)` for your testing. (Pull Request)
FIXED	Proxies created via `AtlasDbHttpClients` now parse `Retry-After` headers correctly. This manifests as Timelock clients failing over and trying other nodes when receiving a 503 with a `Retry-After` header from a remote (e.g. from a TimeLock non-leader). Previously, these proxies would immediately retry the connection on the node with a 503 two times (for a total of three attempts) before failing over. (Pull Request)
NEW	The `atlasdb-config` project now shadows the `error-handling` and `jackson-support` libraries from http-remoting. This will be used to handle exceptions in a future release, and was done in this way to avoid causing dependency issues in upstream products. (Pull Request)

v0.38.0

6 Apr 2017

Type	Change
IMPROVED	The default `sweepBatchSize` has been changed from 1000 to 100. This has empirically shown to be a better batch size because it puts less stress on the underlying KVS. For a full list of tunable sweep parameters and default settings, see sweep tunable options. (Pull Request)
FIXED	Reverted #1524, which caused dependency issues in upstream products. Once we have resolved these issues, we will reintroduce the change, which was originally part of AtlasDB 0.37.0. (Pull Request)
FIXED	Creating a postgres table with a long name now throws a `RuntimeException` if the truncated name (first sixty characters) is the same as that of a different existing table. (Pull Request)
FIXED	Fixed a performance regression introduced in #582, which caused sub-optimal batching behaviour when getting large sets of rows in Cassandra. The benchmark, intentionally set up in #1770 to highlight the break, shows a 10x performance improvement. (Pull Request)
FIXED	Correctness issue fixed in the `clean-transactions-range` CLI. This CLI is responsible for deleting potentially inconsistent transactions in the KVS upon restore from backup. The CLI was not reading the entire `_transactions` table, and as a result missed deleting transactions that started before and committed after. (Pull Request)
DEV BREAK	The `atlasdb-remoting` project was removed. We don’t believe this was used anywhere, but if you encounter any problems due to the project having being removed, please contact AtlasDB support. (Pull Request)
NEW	`InMemoryAtlasDbFactory` now supports creating an in-memory transaction manager with multiple schemas. (Pull Request)
IMPROVED	Timelock users who start an embedded timestamp and lock service without reverse-migrating now encounter a more informative error message. (Pull Request)

v0.37.0

Removed 6 Apr 2017 due to dependency issues. Please use 0.38.0 instead.

Released 29 Mar 2017

Type	Change
FIXED	Fixed an issue where a `MultipleRunningTimestampServicesError` would not be propagated from the asynchronous refresh job that increases the timestamp bound. This could result in a state where two timestamp services are simultaneously handing out timestamps until the older service’s buffer of 1M timestamps is exhausted and fails. Now we immediately fail, alerting users much sooner that a `MultipleRunningTimestampServicesError` has occurred. Note that users would still see the error prior to the fix, we now just ensure it is discovered sooner. This failure does not affect the Timelock server. Furthermore, we improved the logic for increasing the timestamp bound when the allocation buffer is exhausted. (Pull Request)
NEW	Added Dropwizard metrics for sweep, exposing aggregate and table-specific counts of cells examined and stale values deleted. (Pull Request)
NEW	Added a benchmark `TimestampServiceBenchmarks` for parallel requesting of fresh timestamps from the TimestampService. (Pull Request)
FIXED	KVS migrations now maintain the guarantee of the timestamp service to hand out monotonically increasing timestamps. Previously, we would reset the timestamp service to 0 after a migration, but now we use the correct logical timestamp. (Pull Request)
IMPROVED	Improved performance of paging over dynamic columns on Oracle DBKVS: the time required to page through a large wide row is now linear rather than quadratic in the length of the row. (Pull Request)
DEPRECATED	`GenericStreamStore.loadStream` has been deprecated. Use `loadSingleStream`, which returns an `Optional<InputStream>`, instead. (Pull Request)
DEV BREAK	`getAsyncRows` and `getAsyncRowsMultimap` methods have been removed from generated code. They do not appear valuable to the API and use an unintuitive and custom `AsyncProxy` that was also removed. We believe they are unused by upstream applications, but if you do encounter breaks due to this removal please file a ticket with the dev team for immediate support. (Pull Request)
FIXED	RemoteLockService clients will no longer silently retry on connection failures to the Timelock server. This is used to mitigate issues with frequent leadership changes owing to #1680. Previously, because of Jetty’s idle timeout and OkHttp’s silent connection retrying, we would generate an endless stream of lock requests if using HTTP/2 and blocking for more than the Jetty idle timeout for a single lock. This would lead to starvation of other requests on the TimeLock server, since a lock request blocked on acquiring a lock consumes a server thread. (Pull Request)

v0.36.0

15 Mar 2017

Type	Change
FIXED	Fixed DBKVS sweep OOM issue (#982) caused by very wide rows. `DbKvs.getRangeOfTimestamps` uses an adjustable cell batch size to avoid loading too many timestamps. One can set the batch size by calling `DbKvs.setMaxRangeOfTimestampsBatchSize`. In case of a single row that is too wide, this may result in `getRangeOfTimestamps` returning multiple `RowResult` to include all timestamps. It is, however, guaranteed that each `RowResult` will contain all timestamps for each included column. (Pull Request)
FIXED	Actions run by the `ReadOnlyTransactionManager` can no longer bypass necessary protections when using `getRowsColumnRange()`. These protections disallow reads against `THOROUGH` swept tables as read only transactions do not acquire the appropriate locks to guarantee transactionality. (Pull Request)
FIXED	Fixed an unnecessarily long-held connection in Oracle table name mapping code. (Pull Request)
FIXED	Fixed an issue where we excessively log after successful transactions. (Pull Request)
FIXED	Fixed an issue where the `_persisted_locks` table was unnecessarily logged as not having persisted metadata. The `_persisted_locks` table is a hidden table, and thus it does not need to have persisted metadata. (Pull Request)
NEW	AtlasDB now instruments services to expose aggregate response time and service call metrics for keyvalue, timestamp, and lock services. (Pull Request)
DEV BREAK IMPROVED	`TransactionManager` now explicitly declares a `close` method that does not throw exceptions. This makes the `TransactionManager` significantly easier to develop against. Clients who have implemented a concrete `TransactionManager` throwing checked exceptions are encouraged to wrap said exceptions as unchecked exceptions. (Pull Request)
NEW	Added the following benchmarks for paging over columns of a very wide row: `TransactionGetRowsColumnRangeBenchmarks` `KvsGetRowsColumnRangeBenchmarks` (Pull Request)
DEPRECATED	The public `PaxosLeaderElectionService` constructor is now deprecated to mitigate risks of users supplying parameters in the wrong order. `PaxosLeaderElectionServiceBuilder` should be used instead. (Pull Request)

v0.35.0

3 Mar 2017

Type	Change
IMPROVED	Timelock server now specifies minimum and maximum heap size of 512 MB. This should improve GC performance per the comments in #1594. (Pull Request)
FIXED	The background sweeper now uses deleteRange instead of truncate when clearing the `sweep.progress` table. This allows users with Postgres to perform backups via the normal pg_dump command while running background sweep. Previous it was possible for a backup to fail if sweep were performing a truncate at the same time. (Pull Request)
IMPROVED	Cassandra now attempts to truncate when performing a `deleteRange(RangeRequest.All())` in an effort to build up less garbage. This is relevant for when sweep is operating on its own sweep tables. (Pull Request)
NEW	Users can now create a Docker image and run containers of the Timelock Server, by running `./gradlew timelock-server:dockerTag`. This can be useful for quickly spinning up a Timelock instance (e.g. for testing purposes). Note that we are not yet publishing this image. (Pull Request)
FIXED	AtlasDB CLIs run via the Dropwizard bundle can now work with a Timelock block, and will contact the relevant Timelock server for timestamps or locks in this case. Previously, these CLIs would throw an error that a leader block was not specified. Note that CLIs will not perform automated migrations. (Pull Request)
IMPROVED	Cassandra truncates that are going to fail will do so faster. (Pull Request)
DEV BREAK	The persistent lock endpoints now use `PersistentLockId` instead of `LockEntry`. (Pull Request)
FIXED	The `CheckAndSetException` now gets mapped to the correct response for compatibility with http-remoting. Previously, any consumer using http-remoting would have to deal with deserialization errors. (Pull Request)
DEV BREAK	The persistent lock release endpoint has now been renamed to `releaseBackupLock` since it is currently only supposed to be used for the backup lock. (Pull Request)

v0.34.0

23 Feb 2017

Type

Change

NEW

Timelock server now supports HTTP/2, and the AtlasDB HTTP clients enable a required GCM cipher suite. This feature improves performance of the Timelock server. Any client that wishes to connect to the timelock server via HTTP/2 must add jetty_alpn_agent as a javaAgent JVM argument, otherwise connections will fall back to HTTP/1.1 and performance will be considerably slower.

For an example of how to add this dependency, see our timelock-server/build.gradle file. (Pull Request)

FIXED

AtlasDB Perf CLI can now output KVS-agnostic benchmark data (such as HttpBenchmarks) to a file. Previously running these benchmarks whilst attempting to write output to a file would fail. (Pull Request)

v0.33.0

22 Feb 2017

Type	Change
FIXED	AtlasDB HTTP clients are now compatible with OkHttp 3.3.0+, and no longer assume that header names are specified in Train-Case. This fix enables the Timelock server and AtlasDB clients to use HTTP/2. (Pull Request)
FIXED	Canonicalised SQL strings will now have contiguous whitespace rendered as a single space as opposed to the first character of said whitespace. This is important for backwards compatibility with an internal product. (Pull Request)
NEW	Added the option to perform a dry run of sweep via the Sweep CLI. When `--dry-run` is set, sweep will tell you how many cells would have been deleted, but will not actually delete any cells. This feature was introduced to avoid accidentally generating more tombstones than the Cassandra tombstone threshold (default 100k) introduced in CASSANDRA-6117. If you delete more than 100k cells and thus cross the Cassandra threshold, then Cassandra may reject read requests until the tombstones have been compacted away. Customers wishing to run Sweep should first run with the `--dry-run` option and only continue if the number of cells to be deleted is fewer than 100k. (Pull Request)
FIXED	Fixed atlasdb-commons Java 1.6 compatibility by removing tracing from `InterruptibleProxy`. (Pull Request)
FIXED	Persisted locks table is now considered an Atomic Table. `ATOMIC_TABLES` are those that must always exist on KVSs that support check-and-set (CAS) operations. This is particularly relevant for AtlasDB clients that make use of the TableSplittingKVS and want to keep tables on different KVSs. (Pull Request)
FIXED	Reverted PR #1577 in 0.32.0 because this change prevents AtlasDB clients from downgrading to earlier versions of AtlasDB. We will merge a fix for MRTSE once we have a solution that allows a seamless rollback process. This change is also reverted on 0.32.1. (Pull Request)
IMPROVED	Reduced contention on `PersistentTimestampService.getFreshTimestamps` to provide performance improvements to the Timestamp service under heavy request load. (Pull Request)

v0.32.1

21 Feb 2017

Type	Change
FIXED	Reverted PR #1577 in 0.32.0 because this change prevents AtlasDB clients from downgrading to earlier versions of AtlasDB. We will merge a fix for MRTSE once we have a solution that allows a seamless rollback process. This change is also reverted on develop. (Pull Request)

v0.32.0

16 Feb 2017

Type	Change
FIXED	Fixed erroneous occurrence of `MultipleRunningTimestampServicesError` (see #1000) where the timestamp service was unaware of successfully writing the new timestamp limit to the DB. This fix only applies to Cassandra backed AtlasDB clients who are not using the external Timelock service. (Pull Request)
IMPROVED	AtlasDB HTTP clients will now have a user agent of `<project.name>-atlasdb (project.version)` as opposed to `okhttp/2.5.0`. This should make distinguishing AtlasDB request logs from application request logs much easier. (Pull Request)
NEW	Sweep now takes out a lock to ensure data is not corrupted during online backups. Users performing live backups should grab this lock before performing a backup of the underlying KVS, and then release the lock once the backup is complete. This enables the backup to safely run alongside either the background sweeper or the sweep CLI. (Pull Request)
NEW	Initial support for tracing Key Value Services integrating with http-remoting tracing. (Pull Request)
IMPROVED	Improved heap usage during heavy DBKVS querying by reducing mallocs in `SQLString.canonicalizeString()`. (Pull Request)
IMPROVED	Removed an unused hamcrest import from the timestamp-impl project. This should reduce the size of our transitive dependencies, and therefore the size of product binaries. (Pull Request)
FIXED	Fixed schema generation with Java 8 optionals. To use Java8 optionals, supply `OptionalType.JAVA8` as an additional constructor argument when creating your `Schema` object. (Pull Request)
DEV BREAK	Modified the type signature of `BatchingVisitableView#of` to no longer accept `final BatchingVisitable<? extends T> underlyingVisitable` and instead accept `final BatchingVisitable<T> underlyingVisitable`. This will resolve an issue where newer versions of Intellij fail to compile AtlasDB. (Pull Request)
IMPROVED	Reduced logging noise from large Cassandra gets and puts by removing ERROR messages and only providing stacktraces at DEBUG. (Pull Request)
NEW	Upon startup of an AtlasDB client with a `timeblock` config block, the client will now automatically migrate its timestamp to the the external Timelock cluster. The client will fast-forward the Timelock Server’s timestamp bound to that of the embedded service. The client will now also invalidate the embedded service’s bound, backing this up in a separate row in the timestamp table. Automated migration is only supported for Cassandra KVS at the moment. If using DBKVS or other key-value services, it remains the user’s responsibility to ensure that they have performed the migration detailed in Migration to External Timelock Services. (Pull Request 1, Pull Request 2, and Pull Request 3)
FIXED	Fixed multiple scenarios where DBKVS can run into deadlocks due to unnecessary connections. (Pull Request)

v0.31.0

8 Feb 2017

Type	Change
IMPROVED DEV BREAK	Improved Oracle performance on DBKVS by preventing excessive reads from the _namespace table when initializing SweepStrategyManager. Replaced `mapToFullTableNames()` with `generateMapToFullTableNames()` in `com.palantir.atlasdb.keyvalue.TableMappingService`. (Pull Request)
DEV BREAK	Removed the unused `TieredKeyValueService` which offered the ability to spread tables across multiple KVSs that exist in a stacked hierarchy (primary & secondary). If you require this KVS please file a ticket to have it reinstated. (Pull Request)
DEV BREAK	Fast forwarding a persistent timestamp service to `Long.MIN_VALUE` will now throw an exception, whereas previously it would be a no-op. Calling the `fast-forward` endpoint without specifying the fast-forward timestamp parameter will now default to submitting `Long.MIN_VALUE`, and thus return a HTTP 400 response. We are introducing this break to prevent accidental corruption by forgetting to submit the fast-forward timestamp. (Pull Request)
FIXED	Oracle queries now use the correct hints when generating the query plan. This will improve performance for Oracle on DB KVS. (Pull Request)
USER BREAK	Oracle table names can now have a maximum length of 27 characters instead of the previous limit of 30. This is to ensure consistency in naming the primary key constraint which adds a prefix of `pk_` to the table name. This will break any installation of Oracle with the `useTableMapping` flag set to `true`. Since Oracle support is still in beta, we are not providing an automatic migration path from older versions of AtlasDB. (Pull Request)
FIXED	Support for Oracle 12c batch responses. (Pull Request)

v0.30.0

27 Jan 2017

Type	Change
FIXED DEV BREAK	Fixed schema generation with Java 8 optionals. To use Java8 optionals, supply `OptionalType.JAVA8` as an additional constructor argument when creating your `Schema` object. Additionally, this fix requires all AtlasDB clients to regenerate their schema, even if they do not use the Java 8 optionals. (Pull Request)
FIXED	Prevent deadlocks in an edge case where we perform parallel reads with a small connection pool on DB KVS. (Pull Request)
NEW	Added support for benchmarking custom Key Value Stores. In the future this will enable performance regression testing for Oracle. See our performance writing documentation for details. (Pull Request)
IMPROVED	Don’t retry interrupted remote calls. This should have the effect of shutting down faster in situations where we receive a `InterruptedException`. (Pull Request)
IMPROVED	Added request and exception rates metrics in CassandraClientPool. This will provide access to 1-, 5-, and 15-minute moving averages. (Pull Request)
IMPROVED	More informative logging around retrying of transactions. If a transaction succeeds after being retried, we log the success (at the INFO level). If a transaction failed, but will be retried, we now also log the number of failures so far (at INFO). (Pull Request)
IMPROVED	Updated our dependency on `gradle-java-distribution` from 1.2.0 to 1.3.0. See gradle-java-distribution release notes for details. (Pull Request)

v0.29.0

17 Jan 2017

Type	Change
NEW	Returned `RemotingKeyValueService` and associated remoting classes to the AtlasDB code base. These now live in `atlasdb-remoting`. This KVS will pass remote calls to a local delegate KVS. (Pull Request)
FIXED	Stream store compression, introduced in 0.27.0, no longer creates a transaction inside a transaction when streaming directly to a file. Additionally, a check was added to enforce the condition imposed in 0.28.0, namely that the caller of `AbstractGenericStreamStore.loadStream` should not call `InputStream.read()` within the transaction that was used to fetch the stream. (Pull Request)
IMPROVED	AtlasDB timestamp and lock HTTPS communication now use JVM optimized cipher suite CBC over the slower GCM. (Pull Request)
NEW	Added a new `KeyValueService` API method, `checkAndSet`. This is to be used in upcoming backup lock changes, and is not intended for other usage. If you think your application would benefit from using this directly, please contact the AtlasDB dev team. This is supported for Cassandra, Postgres, and Oracle, but in the latter case support is only provided for tables which are not overflow tables. `checkAndSet` is not supported for RocksDB or JDBC. (Pull Request)
FIXED	Reverted the `devbreak` in AtlasDB 0.28.0 by returning the `DebugLogger` to its original location. (Pull Request)

v0.28.0

13 Jan 2017

Type	Change
DEV BREAK	The `DebugLogger` class was moved from package `com.palantir.timestamp` in project `timestamp-impl` to `com.palantir.util` in project `atlasdb-commons`. This break is reverted in the next release (AtlasDB 0.29.0) and will not affect services who skip this release. (Pull Request)
IMPROVED	Increase default Cassandra pool size from minimum of 20 and maximum of 5x the minimum (100 if minimum not modified) connections to minimum of 30 and maximum of 100 connections. This has empirically shown better handling of bursts of requests that would otherwise require creating many new connections to Cassandra from the clients. (Pull Request)
NEW	Added metrics to SnapshotTransaction to monitor durations of various operations such as `get`, `getRows`, `commit`, etc. AtlasDB users should use `AtlasDbMetrics.setMetricRegistry` to set a `MetricRegistry`. (Pull Request)
NEW	Added metrics in Cassandra clients to record connection pool statistics and exception rates. These metrics use the global `AtlasDbRegistry` metrics. (Pull Request)
NEW	There is now a `TimestampMigrationService` with the `fast-forward` method that can be used to migrate between timestamp services. You will simply need to fast-forward the new timestamp service using the latest timestamp from the old service. This can be done using the timestamp forward cli when your AtlasDB services are offline. This capability was added so we can automate the migration to an external Timelock service in a future release. (Pull Request)
FIXED	Allow tables declared with `SweepStrategy.THOROUGH` to be migrated during a KVS migration. (Pull Request)
FIXED	Fix an issue with stream store where pre-loading the first block of an input stream caused us to create a transaction inside another transaction. To avoid this issue, it is now the caller’s responsibility to ensure that `InputStream.read()` is not called within the transaction used to fetch the stream. (Pull Request)
IMPROVED	`atlasdb-rocksdb` is no longer required by `atlasdb-cli` and therefore will no longer be packaged with AtlasDB clients pulling in `atlasdb-dropwizard-bundle`. (Pull Request)
FIXED	All SnapshotTransaction `get` methods are now safe for tables declared with SweepStrategy.THOROUGH. Previously, a validation check was omitted for `getRowsColumnRange`, `getRowsIgnoringLocalWrites`, and `getIgnoringLocalWrites`, which in very rare cases could have resulted in deleted values being returned by a long-running read transaction. (Pull Request)
USER BREAK	Users must not create a client named `leader`. AtlasDB Timelock Server will fail to start if this is found. Previously, using `leader` would have silently failed, since the JAXRS 3.7.2 algorithm does not include backtracking over root resource classes (so either leader election or timestamp requests would have failed). (Pull Request)

v0.27.2

10 Jan 2017

Type

Change

FIXED

Fixed an issue with StreamStore.loadStream’s underlying BlockGetter where, for non-default block size and in-memory thresholds, we would incorrectly throw an exception instead of allowing the stream to be created. This caused an issue when the in-memory threshold was many times larger than the default (47MB for the default block size), or when the block size was many times smaller (7KB for the default in-memory threshold). (Pull Request)

v0.27.1

6 Jan 2017

Type	Change
FIXED	Fixed an edge case in stream stores where we throw an exception for using the exact maximum number of bytes in memory. This behavior was introduced in 0.27.0 and does not affect stream store usage pre-0.27.0. (Pull Request)
IMPROVED	Backoff when receiving a socket timeout to Cassandra to put back pressure on client and to spread out load incurred on remaining servers when a failover occurs. (Pull Request)

v0.27.0

6 Jan 2017

Type	Change
NEW	AtlasDB now supports stream store compression. Streams can be compressed client-side by adding the `compressStreamInClient` option to the stream definition. Reads from the stream store will transparently decompress the data. For information on using the stream store, see Streams. (Pull Request)
IMPROVED	`StreamStore.loadStream` now actually streams data if it does not fit in memory. This means that getting the first byte of the stream now has constant-time performance, rather than linear in terms of stream length as it was previously. (Pull Request)
IMPROVED	Increased Cassandra connection pool idle timeout to 10 minutes, and reduced eviction check frequency to 20-30 seconds at 1/10 of connections. This should reduce bursts of stress on idle Cassandra clusters. (Pull Request)
NEW	There is a new configuration called `maxConnectionBurstSize`, which configures how large the pool is able to grow when receiving a large burst of requests. Previously this was hard-coded to 5x the `poolSize` (which is now the default for the parameter). See Cassandra KVS Config for details on configuring AtlasDB with Cassandra. (Pull Request)
IMPROVED	Improved the performance of Oracle queries by making the table name cache global to the KVS level. Keeping the mapping in a cache saves one DB lookup per query, when the table has already been used. (Pull Request)
FIXED	Oracle value style caching limited in scope to per-KVS, previously per-JVM, which could have in extremely rare cases caused issues for users in non-standard configurations. This would have caused issues for users doing a KVS migration to move from one Oracle DB to another. (Pull Request)
NEW	We now publish a runnable distribution of AtlasCli that is available for download directly from Bintray. (Pull Request 1) and (Pull Request 2)
IMPROVED	Enabled garbage collection logging for CircleCI builds. This may be useful for investigating pre-merge build failures. (Pull Request)
IMPROVED	Updated our dependency on `gradle-java-distribution` from 1.0.1 to 1.2.0. See gradle-java-distribution release notes for details. (Pull Request)
NEW	Add KeyValueStore.deleteRange(); makes large swathes of row deletions faster, like transaction sweeping. Also can be used as a fallback option for people having issues with their backup solutions not allowing truncate() during a backup (Pull Request)

v0.26.0

5 Dec 2016

Type	Change
IMPROVED	Added Javadocs to `CassandraKeyValueService.java`, documented the behaviour of `CassandraKeyValueService` when one or more nodes in the Cassandra cluster are down. (Pull Request)
IMPROVED	Substantially improved performance of the DBKVS implementation of the single-iterator version of getRowsColumnRange. Two new performance benchmarks were added as part of this PR: `KvsGetRowsColumnRangeBenchmarks.getAllColumnsAligned` `KvsGetRowsColumnRangeBenchmarks.getAllColumnsUnaligned` These benchmarks show a 2x improvement on Postgres, and an AtlasDB client has observed an order of magnitude improvement experimentally. (Pull Request)
IMPROVED	OkHttpClient connection pool configured to have 100 idle connections with 10 minute keep-alive, reducing the number of connections that need to be created when a large number of transactions begin. (Pull Request)
IMPROVED	Commit timestamp lookups are now cached across transactions. This provided a near 2x improvement in our performance benchmark testing. See comments on the pull request for details. (Pull Request)
IMPROVED	`LockAwareTransactionManager.runTaskWithLocksWithRetry` now fails faster if given lock tokens that time out in a way that cannot be recovered from. (Pull Request)
IMPROVED	When we hit the `MultipleRunningTimestampServicesError` issue, we now automatically log thread dumps to a separate file (file path specified in service logs). The full file path of the `atlas-timestamps-log` file will be outputted to the service logs. (Pull Request 1, Pull Request 2)

v0.25.0

25 Nov 2016

Type	Change
FIXED	`--config-root` and other global parameters can now be passed into dropwizard CLIs. (Pull Request)
USER BREAK	The migration `--config-root` shorthand (`-r`) can no longer be used as it conflicted with the timestamp command `--row`. (Pull Request)
NEW	Dbkvs: ConnectionSupplier consumers can now choose to receive a brand new unshared connection. (Pull Request)
NEW	AtlasDB now supports Cassandra 3.7 as well as Cassandra 2.2.8. (Pull Request)
IMPROVED	Oracle perf improvement; table names now cached, resulting in fewer round trips to the database. (Pull Request)
IMPROVED	`SweepStatsKeyValueService` will no longer flush a final batch of statistics during shutdown. This avoids potentially long pauses that could previously occur when closing a `Cleaner`. (Pull Request)
IMPROVED	Better support for AtlasDB clients running behind load balancers. In particular, if an AtlasDB client falls down and its load balancer responds with “503: Service Unavailable”, the request will be attempted on other clients rather than aborting. (Pull Request)
FIXED	Oracle will not drop a table that already exists on `createTable` calls when multiple AtlasDB clients make the call to create the same table. (Pull Request)
FIXED	Certain Oracle KVS calls no longer attempt to leak connections created internally. (Pull Request)
FIXED	OracleKVS: `TableSizeCache` now invalidates the cache on table delete. (Pull Request)
DEV BREAK	Our Jackson version has been updated from 2.5.1 to 2.6.7 and Dropwizard version from 0.8.2 to 0.9.3. (Pull Request)
IMPROVED	Additional debugging available for those receiving ‘name must be no longer than 1500 bytes’ errors. (Pull Request)
DEV BREAK	`Cell.validateNameValid` is now private; consider `Cell.isNameValid` instead. (Pull Request)

v0.24.0

15 Nov 2016

Type	Change
USER BREAK	All Oracle table names will be truncated and be of the form: `<prefix>_<2-letter-namespace>__<table-name>_<5-digit-int>`. Previously we only truncated names that exceeded the character limit for Oracle table names. This should improve legibility as all table names for a particular application will have identical formatting. Oracle is in beta, and thus we have not built a migration path from old table names to new table names. (Pull Request)
FIXED	The fetch timestamp CLI correctly handles `--file` inputs containing non-existent directories by creating any missing intermediate directories. Previously, the CLI would throw an exception and fail in such cases. (Pull Request)
FIXED	When using DBKVS with Oracle, `TableRemappingKeyValueService` does not throw a RuntimeException when performing `getMetaData` and `dropTable` operations on a non-existent table. (Pull Request)
FIXED	The KVS migration CLI will now decrypt encrypted values in your KVS configuration. (Pull Request)
IMPROVED	If using the Dropwizard command to run a KVS migration, the Dropwizard config will be used as the `--migrateConfig` config if none is specified. Running the KVS migration command as a deployable CLI still requires `--migrateConfig`. See the documentation for details on how to use the KVS migration command. (Pull Request)
FIXED	The timestamp bound store now works with Oracle as a relational backing store. (Pull Request)
IMPROVED	CLIs now output to standard out, standard error, and the service logs, rather than only printing to the service logs. This should greatly improve usability for service admins using the CLIs. (Pull Request)
IMPROVED	Remove usage of `createUnsafe` in generated Schema code. You can regenerate your schema to get rid of the deprecation warnings. (Pull Request)
IMPROVED	`atlasdb-cassandra` now depends on `cassandra-thrift` instead of `cassandra-all`. Applications that support CassandraKVS will see a 20MB (10%) decrease in their Cassandra dependency footprint. (Pull Request)
NEW	Add support for generating schemas with Java8 Optionals instead of Guava Optionals. To use Java8 optionals, supply `OptionalType.JAVA8` as an additional constructor argument when creating your `Schema` object. (Pull Request)

v0.23.0

8 Nov 2016

Type	Change
DEV BREAK	All KVSs now as a guarantee throw a RuntimeException on attempts to truncate a non-existing table, so services should check the existence of a table before attempting to truncate. Previously we would only throw exceptions for the Cassandra KVS. (Pull Request)
FIXED	The KVS migration command now supports the `--offline` flag and can be run as an offline CLI. (Pull Request)
DEPRECATED	`TableReference.createUnsafe` is now deprecated to prevent mishandling of table names. `createWithEmptyNamespace` or `createFromFullyQualifiedName` should be used instead. Schema generated code still contains use of `TableReference.createUnsafe` and is being tracked for removal on #1172. (Pull Request)
NEW	We now provide Oracle support (beta) for all valid schemas. Oracle table names exceeding 30 characters are now mapped to shorter names by truncating and appending a sequence number. Support for Oracle is currently in beta and services wishing to deploy against Oracle should contact the AtlasDB team. See Oracle Table Mapping for details on how table names are mapped. (Pull Request)
CHANGED	We now test against Cassandra 2.2.8, rather than Cassandra 2.2.7. (Pull Request)
IMPROVED	Added a significant amount of logging aimed at tracking down the `MultipleRunningTimestampServicesError`. If clients are hitting this error, then they should add TRACE logging for `com.palantir.timestamp`. These logs can also be directed to a separate file, see the documentation for more details. (Pull Request)
IMPROVED	Retrying a Cassandra operation now retries against distinct hosts. Previously, this would independently select hosts randomly, meaning that we might unintentionally try the same operation on the same servers. (Pull Request)
FIXED	AtlasDB clients can start when a single Cassandra node is unreachable. (Pull Request).
IMPROVED	Removed spurious error logging during first-time startup against a brand new Cassandra cluster. (Pull Request)
IMPROVED	Improved the reliability of starting up against a degraded Cassandra cluster. (Pull Request)
FIXED	No longer publish a spurious junit dependency in atlasdb-client compile. (Pull Request)

v0.22.0

28 Oct 2016

Type

Change

IMPROVED

The clean-cass-locks-state CLI clears the schema mutation lock by setting it to a special “cleared” value in the same way that normal lockholders clear the lock. Previously the CLI would would drop the whole _locks table to clear the schema mutation lock.

See Schema Mutation Lock (Cassandra only) for details on how the schema mutation lock works. (Pull Request)

FIXED

Fixed an issue where some locks were not being tracked for continuous refreshing due to one of the lock methods not being overridden by the LockRefreshingLockService. This resulted in locks that appeared to be refreshed properly, but then would mysteriously time out at the end of a long-running operation. (Pull Request)

IMPROVED

Sweep no longer immediately falls back to a sweepBatchSize of 1 after receiving an error.

See sweep tuning documentation for more information on sweep tuning parameters. (Pull Request)

v0.21.1

24 Oct 2016

Type

Change

FIXED

Fixed a regression with Cassandra KVS where you could no longer create a table if it has the same name as another table in a different namespace.

To illustrate the issue, assume you have namespace namespace1 and the table table1, and you would like to add a column to table1 and version the table by using the new namespace namespace2. On disk you already have the Cassandra table namespace1__table1, and now you are trying to create namespace2__table1. Creating namespace2__table1 would fail because Cassandra KVS believes that the table already exists. This is relevant if you use multiple namespaces when performing schema migrations.

Note that namespace is an application level abstraction defined as part of a AtlasDB schema and is not the same as Cassandra keyspace. (Pull Request)

v0.21.0

21 Oct 2016

Type	Change
NEW	Sweep now supports batching on a per-cell level via the `sweepCellBatchSize` parameter in your AtlasDB config. This can decrease Sweep memory consumption on the client side if your tables have large cells or many columns (i.e. wide rows). For information on how to configure Sweep batching, see the sweep documentation. (Pull Request)
FIXED	If `hashFirstRowComponent()` is used in a table or index definition, we no longer throw `IllegalStateException` when generating schema code. (Pull Request)

v0.20.0

19 Oct 2016

Type	Change
DEV BREAK	Hotspotting warnings, previously logged at ERROR, will now throw `IllegalStateException` when generating your schema code. Products who hit this warning will need to add `ignoreHotspottingChecks()` to the relevant tables of their schema, or modify their schemas such that the first row component is not a VAR_STRING, a VAR_LONG, a VAR_SIGNED_LONG, or a SIZED_BLOB. See documentation on primitive value types and partitioners for information on how to address your schemas. (Pull Request)
FIXED	The AtlasDB Console included in the Dropwizard bundle can startup in an “online” mode, i.e. it can connect to a running cluster. See AtlasDB Console for information on how to use AtlasDB console. (Pull Request)
FIXED	The `atlasdb-dagger` project now publishes a shadowed version so we do not rely on the version of dagger on the classpath. This fixes the issue where running the CLIs would cause a `ClassNotFoundException` if your application also makes use of dagger. (Pull Request)
NEW	Oracle is supported via DBKVS if you have runtime dependency on an Oracle driver that resolves the JsonType “jdbcHandler”. Due to an Oracle limitation, all table names in the schema must be less than 30 characters long. See Oracle KVS Configuration for details on how to configure your service to use Oracle. (Pull Request)
FIXED	The DBKVS config now enforces that the namespace must always be empty for `metadataTable` in the `ddl` block. The `metadataTable` parameter defaults to an empty name space, and if this was configured to be anything else previously, DBKVS would not start. (Pull Request)
FIXED	We have changed the default `tablePrefix` for `OracleDdlConfig` to be `a_`. Previously this would default to be empty and so user-defined tables could have a leading underscore, which is an invalid table name for Oracle. This change is specific to Oracle and does not affect DBKVS on Postgres. (Pull Request)
FIXED	The `metadataTableName` for Oracle is now `atlasdb_metadata` instead of `_metadata`. This is due to Oracle’s restriction of not allowing table names with a leading underscore. (Pull Request)

v0.19.0

11 Oct 2016

Type	Change
DEV BREAK	Removed KeyValueService `initializeFromFreshInstance`, `tearDown`, and `getRangeWithHistory`. It is likely all callers of tearDown just want to call close, and getRangeWithHistory has been replaced with `getRangeOfTimestamps`. Also removed Partitioning and Remoting KVSs, which were unused and had many unimplemented methods. (Pull Request)
FIXED	In Cassandra KVS, we now no longer take out the schema mutation lock in calls to `createTables` if tables already exist. This fixes the issue that prevented the `clean-cass-locks-state` CLI from running correctly. (Pull Request)
FIXED	Added a wait period before declaring someone dead based on lack of heartbeat. This will ensure we handle delayed heartbeats in high load situations (eg. on circleci). (Pull Request)
DEV BREAK	Removed the following classes and interfaces that appeared to be unused: `AbstractStringCollector` `BatchRowVisitor` `ChunkedRowVisitor` `CloseShieldedKeyValueService` `DBMgrConfigurationException` `IdGenerator` `ManyHostPoolingContainer` `MapCollector` `PalantirSequenceEnabledSqlConnection` `PalantirSqlConnectionRunner` `PaxosLearnerPersistence` `PaxosPingablePersistence` `PaxosProtos` `PostgresBlobs` `RowWrapper` `SqlConnectionImpl` `SqlStackLogWrapper` `StringCollector` `TLongQueue` Please reach out to us if you are adversely affected by these removals. (Pull Request 1 and Pull Request 2)
CHANGED	The SQL connection manager will no longer temporarily increase the pool size by eleven connections when the pool is exhausted. (Pull Request)

v0.18.0

3 Oct 2016

Type	Change
FIXED	Fixed a bug introduced in 0.17.0, where products upgraded to 0.17.0 would see a “dead heartbeat” error on first start-up, requiring users to manually truncate the `_locks` table. Upgrading to AtlasDB 0.18.0 from any previous version will work correctly without requiring manual intervention. (Pull Request)
FIXED	Dropping a table and then creating it again no longer adds an additional row to the `_metadata` table. Historical versions of the metadata entry before the most recent one are not deleted, so if you routinely drop and recreate the same table, you might consider sweeping the `_metadata` table. (Pull Request)
IMPROVED	Users of DBKVS can now set arbitrary connection parameters. This is useful if, for example, you wish to boost performance by adjusting the default batch size for fetching rows from the underlying database. See the documentation for how to set these parameters, and the JDBC docs for a full list. (Pull Request)

v0.17.0

28 Sept 2016

Type	Change
IMPROVED	The schema mutation lock holder now writes a “heartbeat” to the database to indicate that it is still responsive. Other processes that are waiting for the schema mutation lock will now be able to see this heartbeat, infer that the lock holder is still working, and wait for longer. This should reduce the need to manually truncate the locks table. (Pull Request)
NEW	`hashFirstRowComponent` can now be used on index definitions to prevent hotspotting when creating schemas. For more information on using `hashFirstRowComponent`, see the Partitioners documentation. (Pull Request)

v0.16.0

26 Sept 2016

Type	Change
DEV BREAK	Removed `TransactionManager` implementations `ShellAwareReadOnlyTransactionManager` and `AtlasDbBackendDebugTransactionManager`. These are no longer supported by AtlasDB and products are not expected to use them. (Pull Request)
IMPROVED	`TransactionMangers.create()` now accepts `LockServerOptions` which can be used to apply configurations to the embedded LockServer instance running in the product. The other `create()` methods will continue to use `LockServerOptions.DEFAULT`. (Pull Request)
FIXED	Column paging Sweep (in beta) correctly handles cases where table names have both upper and lowercase characters and cases where sweep is run multiple times on the same table. If you are using the regular implementation of Sweep (i.e. you do not specify `timestampsGetterBatchSize` in your AtlasDB config), then you are not affected. (Pull Request)

v0.15.0

14 Sept 2016

Type	Change
IMPROVED	We have removed references to temp tables and no longer attempt to drop temp tables when aborting transactions. Temp tables are not currently being used by any KVSs, yet we were still calling `dropTempTables()` when we abort transactions. Since dropping tables is a schema mutation, this has the side effect of increasing the likelihood that we lose the schema mutation lock when there are many concurrent transactions. Removing temp tables entirely should reduce the need to manually truncate the locks table. (Pull Request)
DEV BREAK	All TransactionManagers are now AutoCloseable and implement a close method that will free up the underlying resources. If your service implements a `TransactionManager` and does not extend `AbstractTransactionManager`, you now have to add a close method to the implementation. No operations can be performed using the TransactionManager once it is closed. (Pull Request)
NEW	AtlasDB Sweep now uses column paging via the `timestampsGetterBatchSize` parameter to better handle sweeping cells with many historical versions. By paging over historical versions of cells during sweeping, we can avoid out of memory exceptions in Cassandra when we have particularly large cells or many historical versions of cells. This feature is only implemented for Cassandra KVS and is disabled by default; please reach out to the AtlasDB dev team if you would like to enable it. (Pull Request)
NEW	Added a second implementation of `getRowsColumnRange` method which allows you to page through dynamic columns in a single iterator. This is expected to perform better than the previous `getRowsColumnRange`, which allows you to page through columns per row with certain KVS stores (e.g. DB KVS). The new method should be preferred unless it is necessary to page through the results for different rows separately. Products or clients using wide rows should consider using `getRowsColumnRange` instead of `getRows` in `KeyValueService`. (Pull Request)
NEW	Added an offline CLI called `clean-cass-locks-state` to truncate the locks table when the schema mutation lock has been lost. This is useful on Cassandra KVS if an AtlasDB client goes down during a schema mutation and does not release the schema mutation lock, preventing other clients from continuing. Previously an error message would direct users to manually truncate this table with CQL, but now this error message references the CLI. (Pull Request)
CHANGED	Reverted our Dagger dependency from 2.4 to 2.0.2 and shadowed it so that it won’t conflict with internal products. (Pull Request)

v0.14.0

8 Sept 2016

Type	Change
USER BREAK	`TransactionManagers.create()` no longer takes in an argument of `Optional<SSLSocketFactory> sslSocketFactory`. Instead, security settings between AtlasDB clients are now specified directly in configuration via the new optional parameter `sslConfiguration` located in the `leader`, `timestamp`, and `lock` blocks. Details can be found in the Leader Configuration documentation. To assist with back compatibility, we have introduced a helper method `AtlasDbConfigs.addFallbackSslConfigurationToAtlasDbConfig`, which will add the provided `sslConfiguration` to `config` if the SSL configuration is not specified directly in the `leader`, `timestamp`, or `lock` blocks. (Pull Request 1 and Pull Request 2)
FIXED	AtlasDB could startup with a leader configuration that is nonsensical, such as specifying both a `leader` block as well as a remote `timestamp` and `lock` blocks. AtlasDB will now fail to start if your configuration is invalid with a sensible message, per #790, rather than potentially breaking in unexpected ways. Please refer to Example Leader Configurations for guidance on valid configurations. (Pull Request)
FIXED	Fixed and standardized serialization and deserialization of AtlasDBConfig. This prevented CLIs deployed via the Dropwizard bundle from loading configuration properly. (Pull Request)
DEV BREAK	Updated our Dagger dependency from 2.0.2 to 2.4, so that our generated code matches with that of internal products. This also bumps our Guava dependency from 18.0 to 19.0 to accommodate a Dagger compile dependency. We plan on shading Dagger in the next release of AtlasDB, but products can force a Guava 18.0 runtime dependency to workaround the issue in the meantime. (Pull Request)

v0.13.0

30 Aug 2016

Type	Change
DEV BREAK	`AtlasDbServer` has been renamed to `AtlasDbServiceServer`. Any products that are using this should switch to using the standard AtlasDB java API instead. (Pull Request)
FIXED	The method `updateManyUnregisteredQuery(String sql)` has been removed from the `SqlConnection` interface, as it was broken, unused, and unnecessary. Use `updateManyUnregisteredQuery(String sql, Iterable<Object[] list>)` instead. (Pull Request)
IMPROVED	Improved logging for schema mutation lock timeouts and added logging for obtaining and releasing locks. Removed the advice to restart the client, as it will not help in this scenario. (Pull Request)
FIXED	Connections to Cassandra can be established over arbitrary ports. Previously AtlasDB clients would assume the default Cassandra port of 9160 despite what is specified in the Cassandra keyValueService configuration. (Pull Request)
FIXED	Fixed an issue when starting an AtlasDB client using the Cassandra KVS where we always grab the schema mutation lock, even if we are not making schema mutations. This reduces the likelihood of clients losing the schema mutation lock and having to manually truncate the _locks table. (Pull Request)
IMPROVED	Performance and reliability enhancements to the in-beta CQL KVS. (Pull Request)

v0.12.0

22 Aug 2016

Type	Change
USER BREAK	AtlasDB will always try to register timestamp and lock endpoints for your application, whereas previously this only occurred if you specify a Leader Configuration. This ensures that CLIs will be able to run against your service even in the single node case. For Dropwizard applications, this is only a breaking change if you try to initialize your KeyValueService after having initialized the Dropwizard application. Note: If you are initializing the KVS post-Dropwizard initialization, then your application will already fail when starting multiple AtlasDB clients. (Pull Request)
NEW	There is now a Dropwizard bundle which can be added to Dropwizard applications. This will add startup commands to launch the AtlasDB console and CLIs suchs as `sweep` and `timestamp`, which is needed to perform backup-restore. These commands will only work if the server is started with a leader block in its configuration. (Pull Request 1 and Pull Request 2)
FIXED	DB passwords are no longer output as part of the connection configuration `toString()` methods. (Pull Request)
NEW	All KVSs now come wrapped with ProfilingKeyValueService, which at the TRACE level provides timing information per KVS operation performed by AtlasDB. See Logging Configuration for more details. (Pull Request)

v0.11.4

29 Jul 2016

Type	Change
FIXED	Correctly checks the Cassandra client version that determines if Cassandra supports Check And Set operations. This is a critical bug fix that ensures we actually use our implementation from #436, which prevents data loss due to the Cassandra concurrent table creation bug described in #431. (Pull Request)

v0.11.2

29 Jul 2016

Type

Change

USER BREAK

Reverting behavior introduced in AtlasDB 0.11.0 so the ssl property continues to take precedence over the sslConfiguration block to allow back-compatibility when using SSL with CassandraKVS. This means that products can add default truststore and keystore configuration to their AtlasDB config without overriding previously made SSL decisions (setting ssl: false should cause SSL to not be used).

This only affects end users who have deployed products with AtlasDB 0.11.0 or 0.11.1; users upgrading from earlier versions will not see changed behavior. See Communicating Over SSL for details on how to configure CassandraKVS with SSL. (Pull Request)

v0.11.1

28 Jul 2016

Type	Change
FIXED	Removed a check enforcing a leader block config when one was not required. This prevents AtlasDB 0.11.0 clients from starting if a leader configuration is not specified (i.e. single node clusters). (Pull Request)
IMPROVED	Updated schema table generation to optimize reads with no ColumnSelection specified against tables with fixed columns. To benefit from this improvement you will need to re-generate your schemas. (Pull Request)

v0.11.0

27 Jul 2016

Type	Change
IMPROVED	Clarified the logging when multiple timestamp servers are running to state that CLIs could be causing the issue. (Pull Request)
CHANGED	Updated cassandra client from 2.2.1 to 2.2.7 and cassandra docker testing version from 2.2.6 to 2.2.7. (Pull Request)
FIXED	The leader config now contains a new `lockCreator` option, which specifies the single node that creates the locks table when starting your cluster for the very first time. This configuration prevents an extremely unlikely race condition where multiple clients can create the locks table simultaneously. Full details on the failure scenario can be found on #444. If left blank, `lockCreator` will default to the first host in the `leaders` list, but we recommend setting this explicitly to ensure that the lockCreater is the same value across all your clients for a specific service. This configuration is only relevant for new clusters and does not affect existing AtlasDB clusters. Full details for configuring the leader block, see cassandra configuration. (Pull Request)
FIXED	A utility method was removed in the previous release, breaking an internal product that relied on it. This method has now been added back. (Pull Request)
FIXED	Removed unnecessary error message for missing _timestamp metadata table. _timestamp is a hidden table, and it is expected that _timestamp metadata should not be retrievable from public API. (Pull Request)
IMPROVED	Trace logging is more informative and will log all failed calls. To enable trace logging, see Enabling Cassandra Tracing. (Pull Request)
NEW	The Cassandra KVS now supports specifying SSL options via the new `sslConfiguration` block, which takes precedence over the now deprecated `ssl` property. The `ssl` property will be removed in a future release, and consumers leveraging the Cassandra KVS are encouraged to use the `sslConfiguration` block instead. See the Cassandra SSL Configuration documentation for more details. (Pull Request)

v0.10.0

13 Jul 2016

Type	Change
CHANGED	Updated HikariCP dependency from 2.4.3 to 2.4.7 to comply with updates in internal products. Details of the HikariCP changes can be found here. (Pull Request)
NEW	AtlasDB currently allows you to create dynamic columns (wide rows), but you can only retrieve entire rows or specific columns. Typically with dynamic columns, you do not know all the columns you have in advance, and this features allows you to page through dynamic columns per row, reducing pressure on the underlying KVS. Products or clients (such as AtlasDB Sweep) making use of wide rows should consider using `getRowsColumnRange` instead of `getRows` in `KeyValueService`. (Pull Request) Note: This is considered a beta feature and is not yet being used by AtlasDB Sweep.
FIXED	We properly check that cells are not set to empty (zero-byte) or null. (Pull Request)
IMPROVED	Cassandra client connection pooling will now evict idle connections over a longer period of time and has improved logic for deciding whether or not a node should be blacklisted. This should result in less connection churn and therefore lower latency. (Pull Request)

v0.9.0

11 Jul 2016

Type

Change

DEV BREAK

Inserting an empty (size = 0) value into a Cell will now throw an IllegalArgumentException. (#156) Likely empty values include empty strings and empty protobufs.

AtlasDB cannot currently distinguish between empty and deleted cells. In previous versions of AtlasDB, inserting an empty value into a Cell would delete that cell. Thus, in this snippet,

Transaction.put(table, ImmutableMap.of(myCell, new byte[0]))
Transaction.get(table, ImmutableSet.of(myCell)).get(myCell)

the second line will return null instead of a zero-length byte array.

To minimize confusion, we explicitly disallow inserting an empty value into a cell by throwing an IllegalArgumentException.

In particular, this change will break calls to Transaction.put(TableReference tableRef, Map<Cell, byte[]> values), as well as generated code which uses this method, if any entry in values contains a zero-byte array. If your product does not need to distinguish between empty and non-existent values, simply make sure all the values entries have positive length. If the distinction is necessary, you will need to explicitly differentiate the two cases (for example, by introducing a sentinel value for empty cells).

If any code deletes cells by calling Transaction.put(...) with an empty array, use Transaction.delete(...) instead.

Note: Existing cells with empty values will be interpreted as deleted cells, and will not lead to Exceptions when read. (Pull Request)

IMPROVED

The warning emitted when an attempted leadership election fails is now more descriptive. (Pull Request)

FIXED

Code generation for the hashCode of *IdxColumn classes now uses deepHashCode for its arrays such that it returns consistent hash codes for use with hash-based collections (HashMap, HashSet, HashTable). This issue will only appear if you are instantiating columns in multiple places and storing columns in hash collections.

If you are using Indices we recommend you upgrade as a precaution and ensure you are not relying on logic related to the hashCode of auto-generated *IdxColumn classes. You will need to regenerate your schema code in order to see this fix. (Pull Request)

v0.8.0

5 Jul 2016

Type	Change
FIXED	Some logging was missing important information due to use of the wrong substitution placeholder. This version should be taken in preference to 0.7.0 to ensure logging is correct. (Pull Request)

v0.7.0

4 Jul 2016

Type	Change
NEW	AtlasDB can now be backed by Postgres via DB KVS. This is a very early release for this feature, so please contact us if you plan on using it. Please see the documentation for more details.
FIXED	The In Memory Key Value Service now makes defensive copies of any data stored or retrieved. This may lead to a slight performance degradation to users of In Memory Key Value Service. In Memory Key Value Service is recommended for testing environments only and production instances should use DB KVS or Cassandra KVS for data that needs to be persisted. (Pull Request)
FIXED	AtlasDB will no longer log incorrect errors stating “Couldn’t grab new token ranges for token aware cassandra mapping” when running against a single node and single token Cassandra cluster. (Pull Request)
IMPROVED	Read heavy workflows with Cassandra KVS will now use substantially less heap. In worst-case testing this change resulted in a 10-100x reduction in client side heap size. However, this is very dependent on the particular scenario AtlasDB is being used in and most consumers should not expect a difference of this size. (Pull Request)

v0.6.0

26 May 2016

Type	Change
FIXED	A potential race condition could cause timestamp allocation to never complete on a particular node (#462).
FIXED	An innocuous error was logged once for each TransactionManager about not being able to allocate enough timestamps. The error has been downgraded to INFO and made less scary.
FIXED	Serializable Transactions that read a column selection could consistently report conflicts when there were none.
FIXED	An excessively long Cassandra related logline was sometimes printed (#501).

v0.5.0

16 May 2016

Type	Change
CHANGED	Only bumping double minor version in artifacts for long-term stability fixes.

v0.4.1

17 May 2016

Type	Change
FIXED	Prevent _metadata tables from triggering the Cassandra 2.x schema mutation bug 431 (444 not yet fixed).
FIXED	Required projects are now Java 6 compliant.