Changelog
Note
For versions of AtlasDB above v0.151.1, please refer to GitHub’s releases page for release notes. This page is kept around as a document of changes up to v0.151.1 inclusive.
v0.151.1
2 Jul 2019
Identical to v0.151.0. This version was released owing to deployment infrastructure issues with v0.151.0.
v0.151.0
28 Jun 2019
Type |
Change |
---|---|
IMPROVED |
Upgraded gradle-baseline to improve compile time static analysis checks. (Pull Request) |
DEV BREAK |
|
IMPROVED |
Targeted Sweep now supports reading write information and dedicated row information from multiple partitions at a time; the same overall sweep batch limit is still respected. This is expected to be particularly relevant at installations where write information is sparsely distributed over many partitions (e.g. transactions2 deployments, especially such deployments with thorough tables). (Pull Request) |
v0.150.0
20 Jun 2019
Type |
Change |
---|---|
NEW |
Added the ability for timelock to rate limit targeted sweep lock requests to 2 per second. This reduces the load that misbehaving AtlasDB clients place on timelock. (Pull Request) |
IMPROVED |
Relaxed concurrency model of |
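As an illustration of the "2 per second" limit on targeted sweep lock requests described in this release, here is a minimal fixed-window rate limiter. It is only a sketch: the class name, the fixed-window strategy, and the explicit `nowNanos` parameter are assumptions made for the example, and the changelog does not say how timelock implements its limit.

```java
// Hypothetical illustration; not timelock's actual rate limiter.
final class FixedWindowRateLimiterSketch {
    private static final long WINDOW_NANOS = 1_000_000_000L;

    private final int permitsPerSecond;
    private boolean windowOpen = false;
    private long windowStartNanos;
    private int permitsUsed;

    FixedWindowRateLimiterSketch(int permitsPerSecond) {
        if (permitsPerSecond <= 0) {
            throw new IllegalArgumentException("permitsPerSecond must be positive");
        }
        this.permitsPerSecond = permitsPerSecond;
    }

    /** Returns true if a request arriving at nowNanos may proceed, false if it should be rejected. */
    synchronized boolean tryAcquire(long nowNanos) {
        // Open a fresh one-second window on the first call, or once the current window has elapsed.
        if (!windowOpen || nowNanos - windowStartNanos >= WINDOW_NANOS) {
            windowOpen = true;
            windowStartNanos = nowNanos;
            permitsUsed = 0;
        }
        if (permitsUsed < permitsPerSecond) {
            permitsUsed++;
            return true;
        }
        return false;
    }
}
```

A caller would typically pass `System.nanoTime()` as `nowNanos`. Taking the time as a parameter keeps the sketch deterministic and easy to test.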
v0.149.0
11 Jun 2019
Type |
Change |
---|---|
IMPROVED |
Added runtime configurable behavior to targeted sweep to allow multiple consecutive iterations on the same targeted sweep shard, all while holding the relevant lock. This option should improve targeted sweep throughput without adding additional load on Timelock for places where the downtime between targeted sweep iterations is a bottleneck. (Pull Request) |
v0.148.0
11 Jun 2019
Type |
Change |
---|---|
FIXED |
The default value for the length of pause between targeted sweep iterations, |
v0.147.0
10 Jun 2019
Type |
Change |
---|---|
DEV BREAK |
Cassandra clients are now required to supply credentials in the KVS config. (Pull Request) |
v0.146.0
7 Jun 2019
Type |
Change |
---|---|
IMPROVED |
Async initialization callbacks are run with an instrumented TransactionManager. (Pull Request) |
IMPROVED |
If using DbKVS or JDBC KVS, we no longer spin up a background thread attempting to install new transaction schema versions. Furthermore, we now log at WARN if configuration indicates that a new schema version should be installed in these cases where it won’t be respected. Previously, we would always read from and write to transactions1, even though the installer might actually have installed a non-1 schema version (and logged that this happened!). (Pull Request) |
v0.145.0
4 Jun 2019
Type |
Change |
---|---|
FIXED |
Fixed a bug causing connection leaks to timelock. (Pull Request) |
IMPROVED |
The default value for the length of pause between targeted sweep iterations, |
IMPROVED |
Changed the default values in |
IMPROVED |
Clients now log at most once every 10 seconds when a |
v0.144.0
20 May 2019
Type |
Change |
---|---|
DEV BREAK |
|
FIXED |
Fixed a bug in |
IMPROVED |
The pause time between iterations of targeted sweep for each background thread is now configurable by the targeted sweep runtime configuration |
v0.143.1
16 May 2019
Type |
Change |
---|---|
FIXED |
|
FIXED |
Removed the DB username from the Hikari connection pool name. (Pull Request) |
FIXED |
Fixed incorrect delegation that took place in AutoDelegate transaction managers. (Pull Request) |
v0.143.0
16 May 2019
Type |
Change |
---|---|
IMPROVED DEPRECATED |
Replaced all usages of Guava Supplier with Java Supplier. Please note that while Guava Supplier endpoints may still exist, they will be removed in a future release. (Pull Request 1, Pull Request 2 and Pull Request 3) |
FIXED |
Coordination service metrics no longer throw |
FIXED |
AtlasDB now maintains a finite length for the delete executor’s work queue, to avoid OOMs on services with high conflict rates for transactions. In the event the queue length is reached, we will not proactively schedule cleanup of values written by a transaction that was rolled back. Note that it is not essential that these deletes are carried out immediately, as targeted sweep will eventually clear them out. Previously, this queue was unbounded, meaning that service nodes could end up using lots of memory. (Pull Request) |
IMPROVED |
The coordination store now retries when reading a value at a given sequence number that no longer exists (as opposed to throwing). This is necessary for supporting cleanup of the coordination store. Note that if one is performing rolling upgrades to a version that sweeps the coordination store, one MUST upgrade from at least this version. (Pull Request) |
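The bounded delete executor queue described in this release can be pictured with a plain `ThreadPoolExecutor`. The sketch below is not AtlasDB's delete executor: the class name, the single worker thread, and the skip-when-full behaviour via `AbortPolicy` are assumptions chosen for illustration. It shows how a finite queue caps memory use: once the queue is full, further cleanup tasks are skipped rather than buffered, which is tolerable only because something else (targeted sweep, per the entry above) eventually removes the values.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not AtlasDB's actual delete executor.
final class BoundedDeleteExecutorSketch {
    private final ThreadPoolExecutor executor;

    BoundedDeleteExecutorSketch(int queueCapacity) {
        // One worker thread and a finite queue. AbortPolicy makes execute() throw
        // RejectedExecutionException when the queue is full.
        this.executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    /** Returns true if the cleanup task was scheduled, false if it was skipped. */
    boolean tryScheduleCleanup(Runnable cleanupTask) {
        try {
            executor.execute(cleanupTask);
            return true;
        } catch (RejectedExecutionException e) {
            // Queue is full: skip the proactive cleanup instead of buffering it.
            return false;
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

With a capacity of 1 and a worker that is busy, one further task is queued and the next one is skipped. Memory use therefore stays bounded by the queue capacity, however many rollbacks occur.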
v0.142.2
14 May 2019
This version is equivalent to v0.142.0, but was re-tagged because of publishing issues.
v0.142.1
14 May 2019
This version is equivalent to v0.142.0, but was re-tagged because of publishing issues.
v0.142.0
14 May 2019
Type |
Change |
---|---|
IMPROVED |
The default configuration for the number of targeted sweep shards has been increased to 8. This enables us to increase the speed of targeted sweep if processing the queue starts falling behind. Previously, we could only increase the speed of processing future entries, as we cannot sweep entries with higher parallelism than the number of shards active when the writes were made. (Pull Request) |
IMPROVED |
AtlasDB now throws an |
DEV BREAK |
|
v0.141.0
9 May 2019
Type |
Change |
---|---|
NEW |
|
IMPROVED |
The Timelock Availability Health check should no longer time out if we can’t reach other nodes. This should stop the health check firing erroneously. (Pull Request) |
NEW |
Setting |
v0.140.0
8 May 2019
Type |
Change |
---|---|
METRICS IMPROVED |
Client side tombstone filtering is now instrumented more exhaustively. (Pull Request) |
DEV BREAK |
|
IMPROVED |
Changed the default values in |
v0.139.0
30 Apr 2019
Type |
Change |
---|---|
METRICS CHANGED |
All instrumentation AtlasDB metrics now use a |
v0.138.0
25 Apr 2019
Type |
Change |
---|---|
USER BREAK |
AtlasDB Cassandra KVS now depends on rescue 4.4.0 (was previously 3.22.0). (Pull Request) |
v0.137.0
25 Apr 2019
Type |
Change |
---|---|
FIXED |
Coordination service now checks for semantic equality of |
v0.136.1
23 Apr 2019
This release is equivalent to v0.136.0 but was re-tagged due to a publishing issue.
v0.136.0
23 Apr 2019
Type |
Change |
---|---|
IMPROVED DEV BREAK |
Usage metrics for the coordination store have been added.
Users should provide a MetricsRegistry when creating their coordination services.
Also, |
FIXED |
lock-api now declares a minimum dependency on timelock-server 0.59.0. (Pull Request) |
v0.135.0
19 Apr 2019
Type |
Change |
---|---|
IMPROVED |
Coordination service now only initiates one request to perpetuate the bound forward at a time. This should avoid unnecessarily many CAS operations taking place when we need to do this. (Pull Request) |
v0.134.0
18 Apr 2019
Type |
Change |
---|---|
FIXED |
We now close Cassandra clients properly when verifying that one’s Cassandra configuration makes sense. (Pull Request) |
v0.133.0
16 Apr 2019
Type |
Change |
---|---|
IMPROVED |
AtlasDB now logs diagnostic information about usage of classes that utilise smart batching (e.g. when starting transactions, verifying leadership, _transactions2 put-unless-exists, etc.). (Pull Request) |
v0.132.0
11 Apr 2019
Type |
Change |
---|---|
FIXED DEV BREAK |
Stop memoizing the |
IMPROVED |
Removed unnecessary memory allocations in the lock refresher, and in several other classes, by using Lists.partition(…) instead of Iterables.partition(…). (Pull Request) |
v0.131.0
9 Apr 2019
Type |
Change |
---|---|
FIXED |
Cassandra client input and output transports are now properly closed. (Pull Request) |
v0.130.0
4 Apr 2019
Type |
Change |
---|---|
NEW |
AtlasDB now supports _transactions2 if backed by Cassandra KVS or In-Memory KVS.
This is expected to improve transaction performance by making |
FIXED |
|
NEW |
A new configuration option |
FIXED |
|
FIXED |
Fixed a rare situation in which interrupting a thread could possibly leave dangling locks. (Pull Request) |
FIXED |
Coordination services now only perpetuate an existing value on value-preserving transformations if the existing bound is invalid at a fresh sequence number. Previously, we would perpetuate the bound regardless, meaning that when the bound is crossed in a multi-threaded environment, each in-flight transaction that tries to determine its transaction schema version will independently attempt to perpetuate the bound. This may lead to multiple unnecessary updates to the coordinated value in a short space of time. Note that updates that do change the value will be applied regardless, and could potentially still race if applied in parallel. (Pull Request) |
IMPROVED |
|
IMPROVED |
Reduced dependency footprint by replacing dependency on groovy-all with dependencies on groovy, groovy-groovysh, and groovy-json. (Pull Request) |
v0.129.0
28 Mar 2019
Type |
Change |
---|---|
FIXED |
Oracle KVS now deletes old entries correctly if using targeted sweep. Previously, there were situations where it would not delete values that could safely be deleted. (Pull Request) |
IMPROVED |
The Cassandra KVS |
IMPROVED |
Concurrent calls to TimelockService.startIdentifiedAtlasDbTransaction() are now coalesced into a single Timelock RPC to reduce load on Timelock. (Pull Request) |
DEV BREAK |
RemoteTimelockServiceAdapter is now closeable. Users of this class should invoke close() before termination to avoid thread leaks. (Pull Request) |
USER BREAK FIXED |
AtlasDB Cassandra KVS now depends on sls-cassandra 3.31.0 (was 3.31.0-rc3). We do not want to stay on an RC version now that a full release is available. Note that this means that you must use this version of the sls-cassandra server if you want to use Cassandra KVS. (Pull Request) |
v0.128.0
27 Mar 2019
Type |
Change |
---|---|
USER BREAK |
AtlasDB Cassandra KVS now depends on sls-cassandra 3.31.0-rc3 (was 3.27.0).
This version of Cassandra KVS supports a |
FIXED DEV BREAK |
Callbacks specified in TransactionManagers will no longer be run synchronously when |
CHANGED |
Postgres 9.5.2+ requirement temporarily rescinded. (Pull Request) |
LOGS |
Added extra debug/trace logging to log the state of the Cassandra pool / application when running into Cassandra pool exhaustion errors. (Pull Request) |
v0.127.0
25 Mar 2019
Type |
Change |
---|---|
FIXED |
Fixed an issue where the |
FIXED USER BREAK |
Background Sweep will now continue to prioritise tables accordingly, if writes to the sweep queue are enabled but targeted sweep is disabled on startup. Previously, Background Sweep would not prioritise new writes for sweeping if writes to the sweep queue were enabled. (Pull Request) |
CHANGED IMPROVED |
We’ve rolled back the change from 0.117.0 that introduced an extra delay after leader election, as we are no longer pursuing leadership leases. (Pull Request) |
IMPROVED DEV BREAK |
AtlasDbHttpClients, FeignOkHttpClients and AtlasDbFeignTargetFactory are refactored to get rid of deprecated methods and overused overloads. (Pull Request) |
v0.126.0
18 Mar 2019
Type |
Change |
---|---|
CHANGED USER BREAK |
Removed functionality for marking tables as deprecated as part of the schema definition and automatically dropping deprecated tables on startup. (Pull Request) |
IMPROVED |
Improved the startup check that verifies the correctness of the timestamp source to impose tighter constraints. Now uses a recent value from the puncher store rather than the unreadable timestamp. (Pull Request) |
FIXED |
|
DEV BREAK |
The contract of |
FIXED |
Fixed a bug in |
v0.125.0
07 Mar 2019
Type |
Change |
---|---|
IMPROVED |
|
CHANGED USER BREAK |
The minimum Postgres version is now 9.5.2 (Pull Request) |
FIXED |
Some race conditions in |
v0.122.0
22 Feb 2019
Type |
Change |
---|---|
IMPROVED DEV BREAK |
Clients talking to Timelock will now throw instead of making a request with a payload larger than 50MB. This addresses several internal issues concerning Timelock stability. This is a devbreak in several AtlasDB utility classes used to create clients, where an additional boolean parameter has been added controlling whether its requests should be limited. (Pull Request) |
IMPROVED |
Timelock clients now use leased lock tokens to reduce the number of RPCs to the Timelock server and improve transaction performance. (Pull Request) |
DEV BREAK |
startIdentifiedAtlasDbTransaction() and lockImmutableTimestamp() are now called without an IdentifiedTimeLockRequest parameter. (Pull Request) |
v0.121.0
21 Feb 2019
Type |
Change |
---|---|
IMPROVED |
We now use jetty-alpn-agent 2.0.9 (Pull Request) |
IMPROVED DEV BREAK |
All usages of remoting-api and remoting3 have been replaced by their equivalents in com.palantir.tracing, com.palantir.conjure.java.api, and com.palantir.conjure.java.runtime. (Pull Request) |
v0.120.0
19 Feb 2019
Type |
Change |
---|---|
DEV BREAK |
The deprecated startAtlasDbTransaction() method is removed from TimelockService. (Pull Request) |
DEV BREAK |
startIdentifiedAtlasDbTransaction now accepts IdentifiedTimeLockRequest as a parameter rather than StartIdentifiedAtlasDbTransactionRequest. This moves the requestorId information from the caller to TimelockClient. (Pull Request) |
FIXED |
|
FIXED |
Fixed cases where column range scans could result in NullPointerExceptions when there were concurrent writes to the same range. (Pull Request) |
v0.119.0
13 Feb 2019
Type |
Change |
---|---|
CHANGED IMPROVED |
TimeLock no longer creates its high level paxos directory at configuration de-serialization time. Instead, it waits until it creates each individual learner or acceptor log directory, allowing timelock to rely more accurately on directory existence as a proxy for whether said timelock node is new. (Pull Request) |
v0.118.0
8 Feb 2019
Type |
Change |
---|---|
IMPROVED DEV BREAK |
AtlasDB Cassandra KVS now depends on |
DEV BREAK IMPROVED |
The TableMetadata class has been refactored to use Immutables. (Pull Request) |
NEW METRICS |
Transaction services now expose timer metrics indicating how long committing or getting values takes. (Pull Request) |
FIXED |
Entries with the same value in adjacent ranges in a timestamp partitioning map will now be properly coalesced, and for the purposes of coordination will not be written as new values. Previously, these were stored as separate entries, meaning that unnecessary values may have been written to the coordination store; this does not affect correctness, but is inefficient. (Pull Request) |
IMPROVED |
AtlasDB now allows you to enable a new transaction retry strategy with exponential backoff via configs. (Pull Request) |
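To make the exponential backoff retry strategy mentioned in this release concrete, here is a generic sketch of how such a delay is usually computed. The changelog does not give AtlasDB's configuration keys or parameters, so the class name, the base delay, the cap, and the use of random jitter are all assumptions for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration of exponential backoff; not AtlasDB's retry strategy.
final class ExponentialBackoffSketch {
    private final long baseDelayMillis;
    private final long maxDelayMillis;

    ExponentialBackoffSketch(long baseDelayMillis, long maxDelayMillis) {
        if (baseDelayMillis <= 0 || maxDelayMillis < baseDelayMillis) {
            throw new IllegalArgumentException("require 0 < baseDelayMillis <= maxDelayMillis");
        }
        this.baseDelayMillis = baseDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Upper bound on the delay before retry number attempt (0-based): base * 2^attempt, capped. */
    long maxDelayForAttempt(int attempt) {
        long delay = baseDelayMillis;
        // Stop doubling once the cap is reached, so large attempt numbers stay cheap.
        // This sketch assumes maxDelayMillis is at most Long.MAX_VALUE / 2.
        for (int i = 0; i < attempt && delay < maxDelayMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMillis);
    }

    /** A random delay in [0, maxDelayForAttempt(attempt)], so that retrying clients spread out. */
    long jitteredDelayForAttempt(int attempt) {
        return ThreadLocalRandom.current().nextLong(maxDelayForAttempt(attempt) + 1);
    }
}
```

With a base of 100 ms and a cap of 1000 ms, the upper bounds for successive attempts are 100, 200, 400, 800, and then 1000 for every later attempt.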
v0.117.0
28 Jan 2019
Type |
Change |
---|---|
CHANGED |
Timelock service no longer supports synchronous lock endpoints. Users who explicitly configured timelock to use synchronous resources by setting install.asyncLock.useAsyncLockService to false (default is true) should migrate to AsyncLockService before taking this upgrade. (Pull Request) |
DEV BREAK |
Key value services now require their |
IMPROVED |
AtlasDB now has an extra delay after leader elections; this lays the groundwork for leadership leases. (Pull Request) |
IMPROVED |
We now correctly handle host restart in the clock skew monitor. (Pull Request) |
v0.116.1
20 Dec 2018
Type |
Change |
---|---|
FIXED |
The completion service in the Paxos leader election service should be more resilient to individual nodes being slow.
Previously, if one individual node had a full thread pool, the service would throw a |
v0.116.0
14 Dec 2018
Type |
Change |
---|---|
NEW |
AtlasDB now writes to the _coordination table, a new table which is used to coordinate changes to schema metadata internal to AtlasDB across a multi-node cluster. Services which want to adopt _transactions2 will need to go through this version, to ensure that nodes are able to reach a consensus on when to switch the transaction schema version forwards. (Pull Request) |
FIXED |
AtlasDB transaction services no longer throw exceptions when performing Thorough Sweep on tables with sentinels. Previously, the services would throw when trying to delete the sentinel, meaning that Background and Targeted Sweep would become stuck if sweeping thorough tables that used to be conservative, or tables that had undergone hard delete via the scrubber. (Pull Request) |
CHANGED DEV BREAK |
AtlasDB transaction services now no longer support negative timestamps. Users are unlikely to be affected, since using transaction services with negative timestamps was already broken in the past owing to the use of negative numbers for special values (like sentinels or a marker meaning that a transaction was rolled back). (Pull Request) |
DEV BREAK |
With the introduction of _coordination, creation of |
DEV BREAK |
Transaction Managers now expose a |
NEW METRICS |
With the introduction of _coordination, we expose new metrics indicating the point (in logical timestamps) till which the coordination service knows what has been agreed, as well as the transactions schema version that will eventually be applied. (Pull Request) |
FIXED USER BREAK |
Cassandra KVS getMetadataForTables method now returns a map where table reference keys have capitalisation matching the table names in Cassandra. Previously there was no strict guarantee on the keys’ capitalisation, but it was in most cases all lowercase. (Pull Request) |
FIXED |
Cassandra KVS getMetadataForTables method now does not contain entries for tables that do not exist in Cassandra. Previously, when a table was dropped, an empty byte array would be written into the _metadata table to mark it as deleted. Now, we delete all rows of the _metadata table containing entries pertaining to the dropped table. Note that this involves a range scan over a part of the _metadata table. While it is not expected that this significantly affects performance of table dropping, please contact the AtlasDB team if this causes issues. (Pull Request) |
v0.115.0
07 Dec 2018
Type |
Change |
---|---|
FIXED |
Cassandra KVS now correctly decommissions servers from the client pool that do not appear in the current token range if autoRefreshNodes is set to true (default value). Previously, refresh would only add newly discovered servers, but never remove decommissioned hosts. The new behaviour enables live decommissioning of Cassandra nodes, without having to update the configuration and restart AtlasDB to stop it trying to talk to that server. (Pull Request) |
FIXED |
The @AutoDelegate annotation now works correctly for interfaces which have static methods, and for simple cases of generics. Previously, the annotation processor would generate code that wouldn’t compile. Note that some cases (e.g. sub-interfaces of generics that refine type parameters) are still not supported correctly. (Pull Request) |
IMPROVED |
TimeLock Server now logs that a new client has been registered the first time a service makes a request (for each lifetime of each server). (Pull Request) |
IMPROVED |
Adds com.palantir.common.collect.IterableView#stream method for simplified conversion to Java Stream API usage. (Pull Request) |
v0.114.0
03 Dec 2018
Type |
Change |
---|---|
USER BREAK |
As part of preparatory work to migrate to a new transactions table, this version of AtlasDB and all versions going forward expect to be using a version of TimeLock that supports the |
v0.113.0
03 Dec 2018
Type |
Change |
---|---|
FIXED |
KVS Migration CLI no longer migrates the checkpoint table if it exists on the source KVS. Previously, existence of an old checkpoint table on the source KVS could cause a migration to silently skip migrating data. Furthermore, in the cleanup stage of migration, the checkpoint table is now dropped instead of truncated. (Pull Request) |
IMPROVED |
Read transactions on thoroughly swept tables now require one fewer RPC to timelock. This improves read performance and reduces load on timelock. (Pull Request) |
FIXED |
Fix warning in stream-store generated code. (Pull Request) |
v0.112.1
26 Nov 2018
Type |
Change |
---|---|
FIXED |
Wrap shutdown callback running in try-catch. This guards against any shutdown hooks throwing unchecked exceptions, which would cause other hooks to not run. (Pull Request) |
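The guard described in this release follows a common pattern: run each callback inside its own try-catch so that one failure cannot prevent the remaining callbacks from running. The sketch below is a generic version of that pattern; the class and method names are hypothetical and are not taken from AtlasDB.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not AtlasDB's shutdown code.
final class SafeShutdownSketch {
    private SafeShutdownSketch() {}

    /**
     * Runs every callback in order. An unchecked exception from one callback is
     * recorded and does not stop the callbacks after it from running.
     */
    static List<RuntimeException> runAll(List<Runnable> callbacks) {
        List<RuntimeException> failures = new ArrayList<>();
        for (Runnable callback : callbacks) {
            try {
                callback.run();
            } catch (RuntimeException e) {
                failures.add(e);
            }
        }
        return failures;
    }
}
```

Returning the collected failures lets the caller log them after every hook has had its chance to run.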
v0.112.0
26 Nov 2018
Type |
Change |
---|---|
FIXED |
Remove a memory leak due to usages of Runtime#addShutdownHook to clean up resources. This only applies where multiple TransactionManagers might exist in a single VM and they are created and shut down repeatedly. (Pull Request) |
v0.111.0
20 Nov 2018
Type |
Change |
---|---|
FIXED |
Fixed a bug where lock and timestamp services were not closed when transaction managers were closed. (Pull Request) |
v0.110.0
20 Nov 2018
Type |
Change |
---|---|
IMPROVED |
Numerous small internal improvements that did not include release notes. |
v0.109.0
14 Nov 2018
Type |
Change |
---|---|
DEV BREAK |
PaxosQuorumChecker now takes an ExecutorService as opposed to an Executor. (Pull Request) |
FIXED |
Re-introduced the distinct bounded thread pools to PaxosLeaderElectionService for communication with other PaxosLearners and PingableLeaders. Previously, a single unbounded thread pool was used, which could cause starvation and OOMs under high load if any learners or leaders in the cluster were slow to fulfil requests. This change also improves visibility as to which specific communication workflows may be suffering from issues. (Pull Request) |
FIXED |
Targeted sweep now handles table truncations with conservative sweeps correctly. (Pull Request) |
IMPROVED |
No longer calls deprecated OkHttpClient.Builder().sslSocketFactory() method, now passes in X509TrustManager. (Pull Request) |
IMPROVED |
Sha256Hash now caches the result of its Java hashCode method. (Pull Request) |
IMPROVED |
The version of javapoet had previously been bumped to 1.11.1 from 1.9.0. However this was not done consistently across the repository. The atlasdb-client and atlasdb-processors subprojects now also use the newer version. (Pull Request) |
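The Sha256Hash entry in this release refers to a standard optimisation for immutable value types: compute the hash code once and reuse it. The class below is a hypothetical stand-in rather than the real Sha256Hash, and it is only meant to show the shape of the technique.

```java
import java.util.Arrays;

// Hypothetical stand-in for an immutable hash value; not AtlasDB's Sha256Hash.
final class CachedHashSketch {
    private final byte[] bytes;
    // Safe to compute once because the bytes never change after construction.
    private final int cachedHashCode;

    CachedHashSketch(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy keeps the instance immutable
        this.cachedHashCode = Arrays.hashCode(this.bytes);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CachedHashSketch)) {
            return false;
        }
        return Arrays.equals(bytes, ((CachedHashSketch) other).bytes);
    }

    @Override
    public int hashCode() {
        return cachedHashCode; // no per-call traversal of the byte array
    }
}
```

Caching pays off when instances are used heavily as keys in hash-based collections, since every lookup would otherwise rehash the full array.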
v0.108.0
7 Nov 2018
Type |
Change |
---|---|
FIXED |
Cassandra KVS no longer uses the schema mutation lock and instead creates tables using an id deterministically generated from the Cassandra keyspace and the table name. As part of this change, table deletion now truncates the table before dropping it in Cassandra, therefore requiring all Cassandra nodes to be available to drop tables. This fixes a bug where it was possible to create two instances of the same table on two different Cassandra nodes, resulting in schema version inconsistency that required manual intervention. (Pull Request) |
IMPROVED |
Introduced runtime checks on the client side for timestamps retrieved from timelock. This aims to prevent data corruption if timestamps go back in time, possibly caused by an incorrectly performed timelock migration. This is a best effort for catching abnormalities on timestamps at runtime, and does not provide absolute protection. (Pull Request) |
USER BREAK |
Qos Service: The experimental QosService for rate-limiting clients has been removed. (Pull Request) |
FIXED |
Fixed a bug in the |
FIXED |
Targeted sweep now deletes certain sweep queue rows faster than before, which should reduce table bloat (particularly on space constrained systems). (Pull Request) |
IMPROVED FIXED |
Schema mutations against the Cassandra KVS are now HA. Previously, Cassandra KVS required that after some schema mutations all cassandra nodes must agree on the schema version. Now, all reachable nodes must agree and at least a quorum of nodes must be reachable, instead. (Pull Request) |
DEV BREAK |
The AutoDelegate annotation no longer supports a typeToExtend parameter. Users should instead annotate the desired class or interface directly. (Pull Request) |
FIXED |
Targeted sweep now handles missing tables and the empty namespace better. Previously, it would cycle on the error and never sweep, which is highly undesirable. (Pull Request) |
FIXED |
|
IMPROVED |
The HikariConnectionClientPool now allows specification of a use-case. If specified, threads created will have the use-case in their name, and log messages about pool statistics will be prefaced by the use-case as well. This may be useful for debugging when users run multiple such pools. (Pull Request) |
NEW |
Old deprecated tables can now be added to a schema to be cleaned up on startup. (Pull Request) |
FIXED |
Fixed a bug where |
NEW |
TimeLock now exposes a |
DEV BREAK |
The schema metadata service has been removed, as the AtlasDB team does not intend to pursue extracting sweep to its own separate service in the short to medium term, and it was causing support issues. If you were consuming this service, please contact the AtlasDB team. (Pull Request) |
IMPROVED |
On Oracle backed DbKvs, schema changes that would require the addition of an overflow column will now throw upon application. Previously, puts would instead fail at runtime when the column did not exist. (Pull Request) |
IMPROVED |
The index cleanup task for stream stores now only fetches the first column for each stream ID when determining whether the stream is still in use. Previously, we would fetch the entire row which is unnecessary and causes read pressure on the key-value-service for highly referenced streams. (Pull Request) |
FIXED |
Live-reloading HTTP proxies and HTTP proxies with failover now refresh themselves after encountering a large number of cumulative requests or consecutive exceptions. This was previously implemented to work around several issues with our usage of OkHttp, but was not implemented for the proxies with failover (which includes proxies to TimeLock). (Pull Request) |
v0.107.0
10 Oct 2018
Type |
Change |
---|---|
IMPROVED |
Targeted sweep now stores even less data in the sweepable cells table due to dictionary encoding table references instead of storing them as strings. (Pull Request) |
IMPROVED |
The legacy lock service’s lock state logger now logs additional information about the lock service’s internal synchronization state. This includes details of queueing threads on each underlying sync object, as well as information on the progress of inflight requests. (Pull Request 1 and Pull Request 2) |
v0.106.0
2 Oct 2018
Type |
Change |
---|---|
FIXED DEV BREAK |
Reverted PR #3505, which modified PaxosLeaderElectionService to utilise distinct bounded thread pools, as this PR uncovered some resiliency issues with PaxosLeaderElectionService. It will be re-merged after fixing those issues. (Pull Request) |
FIXED |
Targeted sweep now stores much less data in the sweepable cells table due to more efficient encoding. (Pull Request) |
NEW |
|
FIXED |
Targeted sweep no longer chokes if a table in the queue no longer exists, and was deleted by a different host while this host was online and sweeping. (Pull Request) |
IMPROVED |
Add versionId to SimpleTokenInfo to improve logging for troubleshooting. (Pull Request) |
IMPROVED |
Increase maximum allowed rescue dependency version to 4.X.X. (Pull Request) |
LOGS CHANGED |
Changed the origin for logs when queries were slow from kvs-slow-log to kvs-slow-log-2. (Pull Request) |
v0.105.0
20 Sep 2018
Type |
Change |
---|---|
FIXED |
Improved threading for MetricsManager’s metricsRegistry (Pull Request) |
LOGS METRICS |
Improved visibility into sources of high DB load. We log when a query returns a high number of timestamps that need to be looked up in the database, and tag some additional metrics with the tablename we were querying. (Pull Request) (Pull Request) |
CHANGED |
Upgrade http-remoting 3.41.1 -> 3.43.0 to make tracing delegate nicely. (Pull Request) |
IMPROVED |
Users may now provide their own executors to instances of |
FIXED |
TargetedSweepMetrics#millisSinceLastSweptTs updates periodically, even if targeted sweep is failing to successfully run. (Pull Request) |
FIXED |
Targeted sweep no longer chokes if a table in the queue no longer exists. (Pull Request) |
FIXED |
Targeted sweep threads will no longer die if Timelock unlock calls fail. (Pull Request) |
IMPROVED |
PaxosLeaderElectionService now utilises distinct bounded thread pools for communication with other PaxosLearners and PingableLeaders. Previously, a single unbounded thread pool was used, which could cause starvation and OOMs under high load if any learners or leaders in the cluster were slow to fulfil requests. This change also improves visibility as to which specific communication workflows may be suffering from issues. (Pull Request) |
FIXED |
Fixed an issue in timelock where followers were publishing metrics with |
FIXED |
Background sweep will now choose uniformly at random between priority tables if there are multiple priority tables. Previously, if multiple priority tables were specified, background sweep would repeatedly pick the same table to be swept, meaning that the other priority tables would never be swept. (Pull Request) |
IMPROVED |
A few timelock ops edge cases have been removed. Timelock users must now indicate whether they are booting their servers for the first time or subsequent times, to avoid the situation where a timelock node becomes newly misconfigured and thinks it is booting up for the first time again. Additionally, timestamps no longer overflow when they hit Long.MAX_VALUE; this would only happen due to a bug, but at least now the DB will become read only and not corrupt. (Pull Request) |
DEV BREAK |
PaxosQuorumChecker now takes an ExecutorService as opposed to an Executor. (Pull Request) |
v0.104.0
4 Sep 2018
Type |
Change |
---|---|
FIXED |
The Jepsen tests no longer assume that users have installed Python or DateUtil, and will install these themselves if needed. (Pull Request) |
CHANGED |
Bumps com.palantir.remoting3 dependency to 3.41.1 from 3.22.0. (Pull Request) |
v0.103.0
30 Aug 2018
Type |
Change |
---|---|
IMPROVED |
Targeted sweep queue now hard fails if it is unable to read table metadata to determine sweep strategy. Previously, we assumed the strategy was conservative, which could result in sweeping tables that should never be swept. (Pull Request) |
FIXED |
Fixed an issue where targeted sweep would fail to increase the number of shards and error out if the default number of shards was ever persisted into the progress table. (Pull Request) |
FIXED |
Several exceptions (such as when creating cells with overly long names or executors in illegal configurations) now contain numerical parameters correctly.
Previously, the exceptions thrown would erroneously contain |
FIXED |
Cassandra Key Value Service no longer logs spurious ERROR messages when failing to read new-format table metadata. (Pull Request) |
IMPROVED |
Throw more specific CommittedTransactionException when operating on a committed transaction. (Pull Request) |
v0.102.0
24 Aug 2018
Type |
Change |
---|---|
FIXED |
CQL queries are now logged correctly (with safe and unsafe arguments respected). Previously, AtlasDB would log all arguments as part of the format string, because it eagerly performed the string substitution. AtlasDB versions 0.100.0 through 0.101.0 (both inclusive) are affected. (Pull Request) |
DEV BREAK IMPROVED |
CqlQuery is now an abstract class and must now be created through its builder. This makes the intention that the query string provided is safe considerably more explicit. (Pull Request) |
IMPROVED |
DbKvs now implements its own version of |
FIXED |
LockRefreshingLockService now batches calls to refresh locks in batches of 650K. Previously, trying to refresh a larger number of locks could trigger the 50MB limit in payload size. (Pull Request) |
LOGS |
Reduce logging level for locks not being refreshed. (Pull Request) |
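The batching of lock refresh calls described in this release amounts to splitting one large request into fixed-size chunks so that no single payload grows without bound. The sketch below shows the splitting step only, using the JDK alone. The class and method names are hypothetical, and the real LockRefreshingLockService may partition differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of request batching; not AtlasDB's LockRefreshingLockService.
final class BatchedRefreshSketch {
    private BatchedRefreshSketch() {}

    /** Splits items into consecutive batches of at most batchSize elements each. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            // Compute the batch length from the remaining element count to avoid int overflow.
            int length = Math.min(batchSize, items.size() - start);
            batches.add(new ArrayList<>(items.subList(start, start + length)));
        }
        return batches;
    }
}
```

A caller would then issue one refresh request per batch, for example with `batchSize` set to 650,000 as in the entry above, and combine the results.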
v0.101.0
16 Aug 2018
Type |
Change |
---|---|
CHANGED |
Targeted Sweep is now enabled by default. Products using the atlasdb-cassandra library need to declare a dependency on Rescue 3 or ignore that dependency altogether. (Pull Request) |
FIXED |
Fixed a bug that when filtering the row results for |
IMPROVED |
AtlasDB now correctly closes the targeted sweeper on shutdown, and logs less by default. (Pull Request) |
IMPROVED DEV BREAK |
The atlasdb-commons package has had its dependency tree greatly pruned of unused cruft. This may introduce a devbreak to users transitively relying on these old dependencies. (Pull Request) |
CHANGED |
|
IMPROVED LOGS |
CassandraKVS’s
|
NEW DEV BREAK |
|
v0.100.0
2 Aug 2018
Type |
Change |
---|---|
FIXED |
Cassandra KVS now correctly accepts check-and-set operations if one is working with multiple columns in the relevant row. Previously, if there were multiple columns in the row where one was trying to do a CAS, the CAS would be rejected even if the column value matched the cell. Similarly, for put-unless-exists, the PUE would be rejected if there were any other cells in the relevant row (even if they had a different column name). We now perform the operations correctly only considering the value (or absence of value) in the relevant cell. (Pull Request) |
IMPROVED DEV BREAK |
We have removed the |
IMPROVED |
Sequential sweep now sleeps longer between iterations if there was nothing to sweep. Previously we would sleep for 2 minutes between runs, but it is unlikely that anything has changed dramatically in 2 minutes so we sleep for longer to prevent scanning the sweep priority table too often. Going forward the most likely explanation for there being nothing to sweep is that we have switched to targeted sweep. We don’t stop completely or sleep for too long just in case configuration changes and a table is eligible to sweep again. (Pull Request) |
IMPROVED |
TimeLockAgent now exposes the number of active clients and the configured maximum. This makes it easier for a service to expose these via a health check. (Pull Request) |
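The corrected check-and-set semantics from the v0.100.0 entry above can be modelled with a small in-memory sketch. It is a toy model, not the Cassandra KVS implementation: `InMemoryCellStore` and its methods are hypothetical, and values are plain strings. The point it shows is that the comparison looks only at the target cell, so other columns in the same row do not cause a rejection.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical toy store, for illustration only; not part of AtlasDB.
final class InMemoryCellStore {
    // row name -> (column name -> value)
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    synchronized void put(String row, String column, String value) {
        rows.computeIfAbsent(row, unused -> new HashMap<>()).put(column, value);
    }

    synchronized String get(String row, String column) {
        Map<String, String> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    /**
     * Writes update only if the target cell currently holds expected.
     * Passing expected == null models put-unless-exists for that cell.
     * Other columns in the same row are never consulted.
     */
    synchronized boolean checkAndSet(String row, String column, String expected, String update) {
        if (!Objects.equals(get(row, column), expected)) {
            return false;
        }
        put(row, column, update);
        return true;
    }
}
```

With this model, a put-unless-exists on column `b` succeeds even when column `a` of the same row already has a value, which matches the behaviour the entry describes as correct.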
v0.99.0
25 July 2018
Type |
Change |
---|---|
FIXED |
Fixed an issue where a failure to punch a value into the _punch table would suppress any future attempts to punch. Previously, if the asynchronous job that punches a timestamp every minute ever threw an exception, the unreadable timestamp would be stuck until the service is restarted. (Pull Request) |
IMPROVED |
TimeLock by default now has a client limit of 500. Previously, this used to be 100 - however we have run into issues internally where stacks legitimately reach this threshold. Note that we still need to maintain the client limit to avoid a possible DOS attack with users creating arbitrarily many clients. (Pull Request) |
NEW METRICS |
Added metrics for the number of active clients and maximum number of clients in TimeLock Server. These are useful to identify stacks that may be in danger of breaching their maxima. (Pull Request) |
v0.98.0
25 July 2018
Type |
Change |
---|---|
NEW METRICS |
Targeted sweep now exposes tagged metrics for the outcome of each iteration, analogous to the legacy sweep outcome metrics.
The reported outcomes for targeted sweep are: |
IMPROVED |
Changed the range scan behavior for the sweep priority table so that reads scan less data in Cassandra. (Pull Request) |
v0.97.0
20 July 2018
Type |
Change |
---|---|
IMPROVED |
TimeLock Server now exposes a |
IMPROVED LOGS |
Reduced the logging level of various log messages. (Pull Request) |
CHANGED METRICS |
CassandraClientPoolingContainer metrics are tagged by pool name. Previously, the pool name was embedded in the metric name. (Pull Request) |
IMPROVED |
Added the |
IMPROVED |
The timestamp cache size is now actually live reloaded, and uses Caffeine instead of Guava for better performance. The read only transaction manager (almost unused) now no longer constructs a thread pool. (Pull Request) |
IMPROVED DEV BREAK |
Transactions now have meters recording their outcomes (e.g. successful commits, lock expiry, being rolled back, read-write conflicts, etc.)
In the cases of write-write and read-write conflicts, the first table on which a conflict occurred will be tagged on to the conflict meter if it is safe for logging.
Note that some metric names have changed; in particular, |
v0.96.0
11 July 2018
Type |
Change |
---|---|
FIXED |
Targeted sweep metrics will no longer range scan the punch table if the last swept timestamp was issued more than one week ago. Previously, we would range scan the table even if the last swept timestamp was -1, which would force a range scan of the entire table. (Pull Request) |
FIXED DEPRECATED |
Atlas clients using Cassandra can specify the type of kvs as cassandra rather than CassandraKeyValueServiceRuntimeConfig in runtime configuration. The CassandraKeyValueServiceRuntimeConfig type is now deprecated. (Pull Request) |
IMPROVED |
Startup and schema change performance improved for Cassandra users with large numbers of tables. (Pull Request) |
v0.95.0
9 July 2018
Type |
Change |
---|---|
IMPROVED |
The atlas console metadata query now returns more table metadata, such as sweep strategy and conflict handler information. (Pull Request) |
DEV BREAK |
The |
IMPROVED |
We will no longer continue to update |
FIXED |
Writes to the targeted sweep queue are now done using the start timestamp of the transaction that makes the call. Previously, the writes were done at timestamp 0, which was interfering with Cassandra compactions. (Pull Request) |
FIXED |
The sweep CLI will no longer perform in-process compactions after sweeping a table. For DbKvs, this operation is handled by the background compaction thread; Cassandra performs its own compactions. Note that the sweep CLI itself has been deprecated in favour of using the sweep priority override configuration, possibly in conjunction with the thread count (Docs). (Pull Request) |
NEW |
Three new conflict handlers |
DEV BREAK |
Removed the token range skewness logger from the Cassandra KVS. We’ve not been relying on it to catch issues and it produces a very large output that is cumbersome. (Pull Request) |
v0.94.0
28 June 2018
Type |
Change |
---|---|
IMPROVED |
Snapshot transaction |
DEV BREAK |
Snapshot transactions now return immutable maps when calling |
NEW |
Multiple |
IMPROVED |
Sweep progress is now stored per-table, meaning that if background sweep of a table is interrupted (for example, because sweep priority config changed), next time the background sweeper selects that table, it will pick up where it left off. Previously, the table would be swept from the start, potentially leading to several days of work being redone. (Pull Request) |
DEV BREAK |
The |
IMPROVED |
Targeted sweep now stops reading from the sweep queue immediately if it encounters an entry known to be committed after the sweep timestamp. Previously, we would read an entire batch before checking commit timestamps so that lookups can be batched, but this is not necessary if the commit timestamp is cached from a previous iteration. (Pull Request) |
IMPROVED |
Write transactions now unlock their row locks and immutable timestamp locks asynchronously after committing. This saves an estimated two TimeLock round-trips of latency when committing a transaction. (Pull Request) |
NEW |
AtlasDB clients now batch calls to unlock row locks and immutable timestamp locks across transactions. This should reduce request volumes on TimeLock Server. (Pull Request) |
FIXED |
Snapshot transactions now write detailed profiling logs of the form |
FIXED |
AtlasDB Benchmarks, CLIs and Console now shutdown properly under certain read patterns. Previously, if these tools needed to delete a value that a failed transaction had written, the delete executor was never closed, thereby preventing an orderly JVM shutdown. (Pull Request) |
FIXED |
Fixed a bug in C* retry logic where the number of retries over all the hosts was used as the number of retries on a single host, which could cause unexpected blacklisting behaviour. (Pull Request) |
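The retry-counting fix in the last v0.94.0 entry above distinguishes between attempts on one host and attempts across all hosts. A minimal sketch of keeping the two counts separate follows. `RetryTracker`, its methods and the threshold semantics are assumptions made for illustration; the actual Cassandra client pool logic may decide blacklisting differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker, for illustration only; not part of AtlasDB.
final class RetryTracker {
    private final Map<String, Integer> failuresByHost = new HashMap<>();
    private int totalFailures;

    /** Records one failed attempt against the given host. */
    void recordFailure(String host) {
        failuresByHost.merge(host, 1, Integer::sum);
        totalFailures++;
    }

    /** Failed attempts on this host only. */
    int failuresOn(String host) {
        return failuresByHost.getOrDefault(host, 0);
    }

    /** Failed attempts summed over every host. */
    int totalFailures() {
        return totalFailures;
    }

    /** Uses the per-host count, not the total, when deciding about a single host. */
    boolean shouldBlacklist(String host, int maxFailuresOnSingleHost) {
        return failuresOn(host) >= maxFailuresOnSingleHost;
    }
}
```

If the total count were used in `shouldBlacklist`, a host could be penalised after a single failure of its own simply because other hosts had also failed, which is the kind of unexpected behaviour the entry refers to.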
v0.93.0
25 June 2018
Type |
Change |
---|---|
IMPROVED METRICS |
Snapshot Transaction metrics now track the post-commit step of unlocking the transaction row locks.
Also, the |
IMPROVED |
Targeted sweep now uses timelock locks to synchronize background threads on multiple hosts. This avoids multiple hosts doing the same sweeps. Targeted sweep also no longer forcibly sets the number of shards to at least the number of threads. (Pull Request) |
FIXED |
Cassandra deleteRows now avoids reading any information in the case that we delete the whole row. (Pull Request) |
USER BREAK |
The |
FIXED LOGS |
Fixed a bug where the Cassandra client pool was erroneously logging host removal from the blacklist, even if the host was not blacklisted in the first place. (Pull Request) |
v0.92.2
22 June 2018
Type |
Change |
---|---|
FIXED |
With targeted sweep, we now only call timelock once per set of range tombstones we leave, rather than once per cell. (Pull Request) |
v0.92.1
21 June 2018
Type |
Change |
---|---|
FIXED |
We now consider only one row at a time when getting rows from the KVS with sweepable cells. (Pull Request) |
FIXED |
Cassandra retry messages now log bounds on attempts correctly. Previously, they would log the supplier of these bounds (instead of the actual bounds, which users are more likely to be interested in). (Pull Request) |
v0.92.0
20 June 2018
Type |
Change |
---|---|
IMPROVED METRICS |
We now publish metrics for more individual stages of the commit stage in a SnapshotTransaction. We also now publish metrics for the total non-KVS overhead - both the absolute time involved as well as a ratio of this to the total time spent in the commit stage. (Pull Request) |
NEW LOGS |
Snapshot transactions now, up to once every 5 real-time seconds, log an overview of how long each step in the commit phase took. These logs will help the Atlas team better understand which parts of committing transactions may be slow, so that we can improve on it. (Pull Request) |
METRICS IMPROVED |
The |
FIXED |
We now page with a smaller batch size when looking at the sweepable cells. We also batch targeted sweep deletes in smaller batches. (Pull Request) |
FIXED |
Fixed an issue in targeted sweep where reading from the sweep queue when there are more than the specified batch size entries can cause some entries to be skipped. This is unlikely to have affected anyone because the default batch size used was very large. (Pull Request) |
IMPROVED METRICS |
AtlasDB now publishes timers tracking time taken to setup a transaction task before it is run, and time taken to tear down the task after it is done before runTaskWith* returns. (Pull Request) |
IMPROVED LOGS |
Added logging for leadership election code. (Pull Request) |
v0.91.0
18 June 2018
Type |
Change |
---|---|
DEV BREAK |
AtlasDB metrics are no longer a static singleton, and are now created upon construction of relevant classes. This allows internal users to construct multiple AtlasDBs and get meaningful metrics. Many constructors have been broken due to this change. (Pull Request) |
DEV BREAK |
Refactored the TransactionManager inheritance tree to consolidate all relevant methods into a single interface. Functionally, any TransactionManager created using TransactionManagers will provide the serializable and snapshot isolation guarantees provided by a SerializableTransactionManager. Constructing TransactionManagers via this class should result in only a minor dev break as a result of this change. This will make it easier to transparently wrap TransactionManagers to extend their functionality. (Pull Request) |
FIXED |
The delete executor now uses daemon threads, so is less likely to cause failure to shutdown. (Pull Request) |
FIXED |
Fixed an issue where starting an HA Oracle-backed client may fail due to constraint violation. The issue occurred when multiple nodes attempted to insert the same metadata. (Pull Request) |
CHANGED METRICS |
Sweep metrics have been reworked based on their observed usefulness in the field.
|
v0.90.0
11 June 2018
Type |
Change |
---|---|
IMPROVED |
When writing to Cassandra, the internal write timestamp for writes of sweep sentinels, range tombstones and deletes to regular tables are now approximately fresh timestamps from the timestamp service, as opposed to being an arbitrary hardcoded value or related to the transaction’s start timestamp. This should improve Cassandra’s ability to purge droppable tombstones at compaction time, particularly in tables that see heavy volumes of overwrites and sweeping. Note that this only applies if you have created your Transaction Manager through the |
NEW IMPROVED |
Targeted sweep now also sweeps stream stores. (Pull Request) Note that targeted sweep is considered a beta feature as it is not fully functional yet. Consult with the AtlasDB team if you wish to use targeted sweep in addition to, or instead of, standard sweep. |
FIXED |
Targeted sweep will no longer sweep cells from transactions that were committed after the sweep timestamp. Instead, targeted sweep will not proceed for that shard and strategy until the sweep timestamp progresses far enough. (Pull Request) |
FIXED |
Fixed an issue where |
DEV BREAK |
Dropwizard transitive dependencies have been removed from the |
DEV BREAK |
|
IMPROVED |
The unbounded |
FIXED |
Some users of AtlasDB rely on being able to abort transactions which are in progress. Until the last release of AtlasDB, this worked successfully; however, this was only the case because, before an assert could throw an AssertionError, an NPE was thrown by different code. Now, the assertion error is not thrown. (Pull Request) |
v0.89.0
6 June 2018
Type |
Change |
---|---|
FIXED |
When determining if large sets of candidate cells were part of committed transactions, Background and Targeted Sweep will now read smaller batches of timestamps from the transaction service in serial. Previously, though these reads were re-partitioned into smaller batches, the batch requests were made in parallel which could monopolise Atlas client-side as well as KVS-side resources. There may be a small performance regression here, though this change promotes better stability for the underlying key-value-service especially in the presence of wide rows. (Pull Request) |
USER BREAK |
The size of batches that are used when the |
FIXED DEV BREAK |
The |
DEV BREAK |
Due to lack of use, we have deleted the AtlasDB Dropwizard bundle. Users who need Atlas Console and CLI functionality are encouraged to use the respective distributions. (Pull Request) |
NEW METRICS |
Added a new tagged metric for targeted sweep showing approximate time in milliseconds since the last swept timestamp has been issued.
This metric can be used to estimate how far targeted sweep is lagging behind the current moment in time.
The metric is |
FIXED |
Atlas no longer throws if you read the same column range twice in a serializable transaction. (Pull Request) |
FIXED |
We no longer treat CAS failure in Cassandra as a Cassandra level issue, meaning that we won’t blacklist connections due to a failed CAS. (Pull Request) |
IMPROVED |
|
FIXED |
Fixed an issue occurring during transaction commits, where a failure to putUnlessExists a commit timestamp caused an NPE, leading to a confusing error message. Previously, the method determining whether the transaction had committed successfully or been aborted would hit a code path that would always result in an NPE. (Pull Request) |
IMPROVED |
Increased PTExecutors default thread timeout from 100 milliseconds to 5 seconds to avoid recreating threads unnecessarily. (Pull Request) |
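The effect of the thread timeout change in the last v0.89.0 entry above can be sketched with the JDK's own `ThreadPoolExecutor`. This is not the PTExecutors implementation: `CachedPools.newCachedPool` is a hypothetical helper, and the pool shape (zero core threads, a `SynchronousQueue`) is an assumption modelled on `Executors.newCachedThreadPool`. It shows how a longer keep-alive lets idle threads survive between bursts of work instead of being recreated.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not part of AtlasDB.
final class CachedPools {
    private CachedPools() {}

    /**
     * Creates a cached-style pool whose idle threads are kept for the given time
     * before being terminated. A longer keep-alive means fewer thread creations
     * when tasks arrive in bursts separated by short pauses.
     */
    static ThreadPoolExecutor newCachedPool(long keepAlive, TimeUnit unit) {
        return new ThreadPoolExecutor(
                0,                     // no core threads: all threads may time out
                Integer.MAX_VALUE,     // grow on demand
                keepAlive,
                unit,
                new SynchronousQueue<>());
    }
}
```

For example, `CachedPools.newCachedPool(5, TimeUnit.SECONDS)` keeps an idle worker around for five seconds, whereas a 100 millisecond keep-alive would discard it almost immediately.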
v0.88.0
30 May 2018
Type |
Change |
---|---|
DEV BREAK NEW |
KVS method |
NEW |
AtlasDB now implements targeted sweep using a sweep queue.
As long as the Note that targeted sweep is considered a beta feature as it is not fully functional yet. Consult with the AtlasDB team if you wish to use targeted sweep in addition to, or instead of, standard sweep. |
NEW METRICS |
Added tagged targeted sweep metrics for conservative and thorough sweep.
The metrics show the cumulative number of enqueued writes, entries read, tombstones put, and aborted cells deleted.
Additionally, there are metrics for the sweep timestamp of the last sweep iteration and for the lowest last swept timestamp across all shards.
The metrics, tagged with the sweep strategy used, are as follows (with the common prefix
|
IMPROVED LOGS |
Added logging of the values used to determine which table to sweep. This provides more insight into why some tables are being swept and others aren’t. (Pull Request) |
IMPROVED |
http-remoting has been upgraded to 3.22.0 (was 3.14.0). This release fixes several issues with communication between Atlas servers and a QoS service, if configured (especially in HA configurations). Note that this change does not affect communication between timelock nodes, or between an Atlas client and timelock, as these do not currently use remoting. (Pull Request) |
v0.87.0
25 May 2018
Type |
Change |
---|---|
FIXED |
|
v0.86.0
23 May 2018
Type |
Change |
---|---|
FIXED |
The Cassandra key value service is now guaranteed to return getRowsColumnRange results in the correct order. Previously, while paging over row dynamic columns, the first batchHint results were ordered lexicographically, whilst the remainder were hashmap ordered in chunks of batchHint. In practice, when paging, this could lead to entirely incorrect, duplicate results being returned. Now, they are returned in order. (Pull Request) |
FIXED |
Fixed a race condition where requests to a node can fail with NotCurrentLeaderException, even though that node just gained leadership. (Pull Request) |
v0.85.0
18 May 2018
Type |
Change |
---|---|
FIXED |
Snapshot transaction is now guaranteed to return getRowsColumnRange results in the correct order. Previously while paging over row dynamic columns, if uncommitted or aborted transaction data was seen, it would be placed at the end of the list, instead of at the start, meaning that the results are mostly (but not entirely) in sorted order. In practice, this leads to duplicate results in paging, and on serializable tables, transactions that paradoxically conflict with themselves. Now, they are guaranteed to be returned in order, which removes this issue. (Pull Request) |
v0.84.0
16 May 2018
Type |
Change |
---|---|
IMPROVED |
Timelock will now have more debugging info if the paxos directories fail to be created on startup. (Pull Request) |
IMPROVED |
Move a complicated and elsewhere overridden method from AbstractKeyValueService into DbKvs. (Pull Request) |
FIXED |
The (Thrift-backed) |
v0.83.0
10 May 2018
Type |
Change |
---|---|
IMPROVED |
If we make a successful request to a Cassandra client, we now remove it from the overall Cassandra service’s blacklist. Previously, removal from the blacklist would only occur after a background thread successfully refreshed the pool, meaning that requests may become stuck if Cassandra was rolling restarted. (Pull Request) |
FIXED |
The Cassandra client pool now respects the |
FIXED DEV BREAK |
Any ongoing Cassandra schema mutations are now given two minutes to complete upon closing a transaction manager, decreasing the chance that the schema mutation lock is lost.
Some exceptions thrown due to schema mutation failures now have type |
v0.82.2
4 May 2018
Type |
Change |
---|---|
FIXED |
|
v0.82.1
1 May 2018
Type |
Change |
---|---|
FIXED |
Specifying tables in configuration for sweep priority overrides now works properly. Previously, attempting to deserialize configurations with these overrides would cause errors. (Pull Request) |
v0.82.0
1 May 2018
Type |
Change |
---|---|
FIXED |
AtlasDB now partitions versions of cells to be swept into batches more robustly and more efficiently. Previously, this could cause stack overflows when sweeping a very wide row, because the partitioning algorithm attempted to traverse a recursive hierarchy of sublists. Also, previously, partitioning would require time quadratic in the number of versions present in the row; it now takes linear time. (Pull Request) |
NEW |
Users can now explicitly specify specific tables for the background sweeper to (1) prioritise above other tables, or (2) blacklist. This is done as part of live-reloadable configuration, though note that background sweep will conclude its current iteration before switching to a priority table / away from a blacklisted table, as appropriate. Please see Sweep Priority Overrides for more details. (Pull Request) |
FIXED |
Transaction managers now shut down threads associated with the QoS client and TimeLock lock refresher when they are closed. Previously, these threads would continue running and needlessly using resources. (Pull Request) |
FIXED |
The |
FIXED |
OkHttp is not handling |
FIXED DEV BREAK |
|
IMPROVED LOGS |
Expired lock refreshes now tell you which locks expired, instead of just their refreshing token id. (Pull Request) |
IMPROVED |
The strategy for choosing the table to compact was adjusted to avoid the case when the same table is chosen multiple times in a row, even if it was not swept between compactions. Previously, the strategy to choose the table to compact was:
When all tables are swept and afterward compacted, the last point above could choose to compact the same table because The new strategy is:
|
IMPROVED LOGS |
|
IMPROVED LOGS |
AtlasDB internal tables will no longer produce warning messages about hotspotting. (Pull Request) |
v0.81.0
19 April 2018
Type |
Change |
---|---|
IMPROVED METRICS |
Async TimeLock Service metric timers are now tagged with (1) the relevant clients, and (2) whether the current node is the leader or not. This allows for easier analysis and consumption of these metrics. (Pull Request) |
IMPROVED |
Common annotations can now be imported via the commons-annotations library, instead of needing to pull in atlasdb-commons. Existing code that uses atlasdb-commons for the annotations will still be able to resolve them. (Pull Request) |
FIXED |
Logs in |
IMPROVED DEV BREAK |
Bumped several libraries to get past known security vulns:
|
v0.80.0
04 April 2018
Type |
Change |
---|---|
FIXED DEV BREAK |
Centralize how |
IMPROVED LOGS |
Downgraded “Tried to connect to Cassandra {} times” logs from |
NEW |
AtlasDB now supports runtime configuration of throttling for stream stores when streams are written block by block in a non-transactional fashion.
Previously, such streams would be written using a separate transaction for each block, though in cases where data volume is high this may still cause load on the key-value-service Atlas is using.
Please note that if you wish to use this feature, you will need to regenerate your Atlas schemas and suitably inject the stream persistence configuration into your stream stores.
However, if you do not intend to use this feature, no action is required, and your stream stores’ behaviour will not be changed.
Note that enabling throttling may make nontransactional |
NEW |
AtlasDB now schedules KVS compactions on a background thread, as opposed to triggering a compaction after each table was swept. This allows for better control over KVS load arising from compactions. (Pull Request) |
NEW |
AtlasDB now supports configuration of a maintenance mode for compactions.
If compactions are run in maintenance mode, AtlasDB may perform more costly operations which may be able to recover more space.
For example, for Oracle KVS, |
FIXED |
Fixed a bug that causes Cassandra clients to return to the pool even if they have thrown blacklisted exceptions. (Pull Request) |
FIXED |
Fix NPE if PaxosLeaderElectionServiceBuilder’s new field onlyLogOnQuorumFailure is never set. (Pull Request) |
NEW |
If using TimeLock, AtlasDB now checks the value of a fresh timestamp against the unreadable timestamp on startup, failing if the fresh timestamp is smaller. That implies clocks went backwards; doing this mitigates the damage that a bogus TimeLock migration or other corruption of TimeLock can do. (Pull Request) |
IMPROVED |
Applications can now easily determine whether their Timelock cluster is healthy by querying |
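The startup check from the v0.80.0 entry above (comparing a fresh timestamp against the unreadable timestamp) can be sketched as a simple guard. The names `TimestampSanityCheck` and `checkFreshTimestampNotBehind` are hypothetical, and how the two timestamps are obtained is outside the scope of this sketch; it only shows the comparison and the fail-fast behaviour.

```java
// Hypothetical guard, for illustration only; not part of AtlasDB.
final class TimestampSanityCheck {
    private TimestampSanityCheck() {}

    /**
     * Fails fast if the timestamp service hands out a value smaller than a timestamp
     * that is already known to have been issued, which would mean timestamps went backwards.
     */
    static void checkFreshTimestampNotBehind(long freshTimestamp, long unreadableTimestamp) {
        if (freshTimestamp < unreadableTimestamp) {
            throw new IllegalStateException(
                    "Fresh timestamp " + freshTimestamp
                            + " is smaller than the unreadable timestamp " + unreadableTimestamp
                            + "; refusing to start");
        }
    }
}
```

Running such a check once at startup, before any transactions are served, limits the damage from a timestamp source that has been reset or migrated incorrectly.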
v0.79.0
20 March 2018
Type |
Change |
---|---|
IMPROVED DEV BREAK |
Guava has been updated from 21.0 to 23.6-jre.
This unblocks users using libraries which have dependencies on more recent versions of Guava, owing to API changes in |
IMPROVED METRICS |
Sweep metrics are now updated to the result value of the last run iteration of sweep instead of the cumulative values for the run of sweep on the table. This has been done in order to improve the granularity of the metrics, since cumulative results can be several orders of magnitude larger, thus obfuscating the delta. (Pull Request) |
NEW |
Added a new parameter |
FIXED |
The Cassandra client pool is now cleaned up in the event of a failure to construct the Cassandra KVS (e.g. because we lost our connection to Cassandra midway). Previously, the client pool was not shut down, leading to a thread leak. (Pull Request) |
IMPROVED LOGS |
Log an ERROR in the case of failure to create a Cell due to a key greater than 1500 bytes. Previously we logged at DEBUG. (Pull Request) |
FIXED |
|
IMPROVED LOGS |
Logging exceptions in the case of quorum is runtime configurable now, using |
v0.78.0
2 March 2018
Type |
Change |
---|---|
NEW |
The |
FIXED |
SerializableTransactionManager can now be closed even if it is not initialized yet. (Pull Request) |
NEW CHANGED METRICS |
Sweep metrics have been reworked based on their observed usefulness in the field.
In particular, histograms and most of the meters were replaced with gauges that expose last known values of tracked sweep results.
Tagged metrics have been removed as well, and were replaced by a gauge
|
FIXED |
LoggingArgs no longer throws when it tries to hydrate invalid table metadata. This fixes an issue that prevented AtlasDB from starting after performing a KVS migration. (Pull Request) |
CHANGED |
Changes the default scrubber behavior to aggressive scrub (synchronous with scrub request). (Pull Request) |
FIXED |
Fixed a bug that could cause the background sweep thread to fail to shut down cleanly, hanging the application. (Pull Request) |
IMPROVED |
Remove a round trip from read only transactions not involving thoroughly swept tables. (Pull Request) |
v0.77.0
16 February 2018
Type |
Change |
---|---|
CHANGED |
AtlasDB migration CLI no longer drops the temporary table used during migration and instead truncates it. This avoids an issue where AtlasDB would refuse to start after a migration because it would try to hydrate empty table metadata for the above table. (Pull Request) |
CHANGED |
Upgraded Postgres jdbc driver to 42.2.1 (from 9.4.1209). (Pull Request) |
FIXED |
Fix NPE when warming conflict detection cache if table is being created. (Pull Request) |
IMPROVED DEV BREAK |
Introduced configurable |
FIXED |
Fix |
FIXED |
CassandraKVS sstable size in MB was not being correctly set. This resulted in requirements on the entire cluster being up during startup of certain stacks. CF metadata mismatch messages are also now correctly safety marked for logging. (Pull Request) |
v0.76.0
12 February 2018
Type |
Change |
---|---|
FIXED |
Fixed a bug which would make sweep deletes not be compacted by Cassandra. Over time this would lead to tombstones being accumulated in the DB and disk space not being reclaimed. (Pull Request) |
FIXED |
When TransactionManagers doesn’t return successfully, we leaked resources depending on which step of the initialization failed. Now resources are properly closed and freed. (Pull Request) |
FIXED |
Fixed a bug where Cassandra clients’ input buffers were left in an invalid state before returning the client to the pool, manifesting in NPEs in the Thrift layer. (Pull Request) |
IMPROVED NEW |
Added a new parameter |
IMPROVED |
AtlasDB CLIs now allow a runtime config to be passed in. This allows the CLIs to be used with products that are configured to use timelock and have the timelock block in the runtime config. (Pull Request) |
IMPROVED DEV BREAK |
AtlasDbConfigs now supports parsing of both install and runtime configuration.
As part of these changes, |
FIXED |
Fixed a bug where the CleanCassLocksState CLI would not start because the Cassandra locks were in a bad state. (Pull Request) |
FIXED |
Close AsyncInitializer executors. This should reduce memory pressure of clients after startup. (Pull Request) |
IMPROVED |
Added a TimeLock healthcheck for signalling that no leader election has been triggered. This will allow TimeLock itself to broadcast a HEALTHY status even without a leader. (Pull Request) |
IMPROVED |
Index tables can now be marked as safe for logging.
If you use indexes, please add |
IMPROVED |
Make some values of |
DEV BREAK |
Renamed the method used to create LockAndTimestampServices by the CLI commands and AtlasConsole.
Please update usages of |
IMPROVED LOGS |
Sweep now logs the number of cells it is deleting when performing a single batch of deletes. This is useful for visibility of Sweep progress; previously, Sweep would only log when a top-level batch was complete, meaning that for highly versioned rows Sweep would only log after deleting all stale versions of said row. (Pull Request) |
IMPROVED |
The sweep-table endpoint now returns HTTP status 400 instead of 500, when asked to sweep a non-existent table. (Pull Request) |
IMPROVED METRICS |
Atlas now records the number of cells written over time, if you are using Cassandra KVS.
This metric is reported under |
IMPROVED |
|
v0.75.0
29 January 2018
Type |
Change |
---|---|
FIXED USER BREAK |
AtlasDB will now fail to start if a TimeLock block is included in the initial runtime configuration, but the install configuration is set up with a leader block or with remote timestamp and lock blocks. Previously, AtlasDB would start successfully under these conditions, but the TimeLock block in the runtime configuration would be silently ignored. Note that the decision on whether to use TimeLock or another source of timestamps and locks is made at install-time. (Pull Request) |
IMPROVED USER BREAK |
AtlasDB users can now specify the usage of TimeLock Server purely by including a TimeLock block in the initial runtime configuration.
Previously, AtlasDB users would need to specify that they were using TimeLock in the install configuration, possibly with an empty object ( Also, users with scripts that depend on supplying a default runtime configuration may need to be careful to ensure that TimeLock configuration is preserved when such scripts are run. That said, AtlasDB will fail to start if trying to access a key-value service where TimeLock has been used as a source of timestamps without going through TimeLock, so we don’t think there is a risk of data corruption. (Pull Request) |
FIXED METRICS |
Fixed metric re-registration log spam in |
FIXED DEV BREAK |
AtlasDB clients will receive a |
IMPROVED METRICS |
Use tags in sweep outcome metrics instead of using each name per outcome. (Pull Request) |
IMPROVED LOGS |
Log message for leaked sweep/backup lock is now WARN rather than INFO. (Pull Request) |
IMPROVED LOGS METRICS |
|
IMPROVED LOGS |
|
CHANGED |
Updated our Guava dependency from 18.0 to 20.0. This should unblock downstream products from upgrading to Guava 22.0. (Pull Request) |
CHANGED |
Updated our http-remoting dependency from 3.5.1 to 3.14.0. (Pull Request) |
v0.74.0
23 January 2018
Type |
Change |
---|---|
IMPROVED LOGS |
AtlasDB internal table names are now safe for logging. (Pull Request) |
IMPROVED METRICS |
|
IMPROVED |
The |
DEV BREAK |
|
DEV BREAK IMPROVED |
The |
DEV BREAK |
The protobuf library has been upgraded to 3.5.1. Dependent projects will need to update their dependencies. (Pull Request) |
FIXED |
V2 Schemas which use |
FIXED METRICS |
|
FIXED |
Stop sweeping when the sweep thread is interrupted.
Previously, when services were shutting down, the background sweeper thread continuously logged warnings
due to a closed |
DEV BREAK |
Removed |
v0.73.1
16 January 2018
Type |
Change |
---|---|
FIXED |
Fix an NPE that could happen in the Sweep background thread. In this scenario, sweep would get stuck and not be able to proceed. The regression was introduced with (#2860), in version 0.73.0. (Pull Request) |
FIXED |
Qos clients will query the service every 2 seconds instead of on every client request. This should prevent too many requests to the service. (Pull Request) |
FIXED |
All Atlas executor services now run tasks wrapped in http-remoting utilities to preserve trace logging. (Pull Request) (Pull Request) |
v0.73.0
16 January 2018
Type |
Change |
---|---|
IMPROVED |
On Cassandra KVS, sweep reads data from Cassandra in parallel, resulting in improved performance.
The parallelism can be changed by adjusting |
IMPROVED |
AtlasDB now throws an error during schema code generation stage if index table name length exceeds KVS table name length limits.
To override this, please specify |
v0.73.0-rc2
12 January 2018
Type |
Change |
---|---|
NEW |
Qos Service: AtlasDB now supports a QosService which can rate-limit clients. Please note that this feature is currently experimental; if you wish to use it, please contact the AtlasDB team. (Pull Request) |
NEW |
The JDBC URL for Oracle can now be overridden in the configuration.
The parameter path is |
v0.73.0-rc1
11 January 2018
Type |
Change |
---|---|
IMPROVED LOGS METRICS |
Allow StreamStore table names to be marked as safe. This will make StreamStore tables appear correctly on our logs and metrics.
When building a StreamStore, please use |
IMPROVED |
Sweep stats are updated more often when large writes are being made.
|
IMPROVED |
Improvements to how sweep prioritises which tables to sweep; should allow better reclaiming of space from stream stores. Stream store value tables are now more likely to be chosen because they contain lots of data per write. We ensure we sweep index tables before value tables, and allow a gap after sweeping index tables and before sweeping value tables. We wait 3 days between sweeps of a value table to prevent unnecessary work, allow other tables to be swept and tombstones to be compacted away. (Pull Request) |
FIXED |
|
FIXED |
Further reduced memory pressure on sweep for Cassandra KVS, by rewriting one of the CQL queries. This removes a significant cause of occurrences of Cassandra OOMs that have been seen in the field recently. However, performance is significantly degraded on tables with few columns and few overwrites (fixed in 0.73.0). (Pull Request 1 and Pull Request 2) |
FIXED LOGS |
Safe and Unsafe table name logging args are now different; this fixes an unreleased bug where table names were logged as Safe. (Pull Request) |
LOGS |
Messages to the |
DEV BREAK |
For clarity, we renamed |
IMPROVED |
Tritium was upgraded to 0.9.0 (from 0.8.4), which provides functionality for de-registration of tagged metrics. (Pull Request) |
v0.72.0
13 December 2017
Type |
Change |
---|---|
NEW IMPROVED METRICS LOGS |
Sweep metrics were reworked. Sweep now exposes metrics indicating the total number of cells examined, cells deleted, time spent sweeping, and time elapsed since sweep started on the current table; these are updated after each iteration of sweep. Separate metrics are updated after each table is fully swept.
Additionally, sweep now exposes metrics tagged with table names that report the total number of cells examined, cells deleted, and time spent sweeping per iteration for each table separately.
Logs will also include the new timing information.
Sweep now exposes the following metrics with the common prefix
|
IMPROVED |
AtlasDB now provides a configurable |
FIXED |
The |
FIXED |
Fixed a bug in LockServiceImpl (caused by a bug in AbstractQueuedSynchronizer) where a race condition could cause a lock to become stuck indefinitely. (Pull Request) |
DEV BREAK |
Deleted the TTL duration field from the |
v0.71.1
8 December 2017
Type |
Change |
---|---|
FIXED |
Removed an unused dependency from |
v0.71.0
7 December 2017
Type |
Change |
---|---|
NEW |
AtlasDB QoS: AtlasDB now allows clients to live-reloadably configure read and write limits (in terms of bytes) to rate-limit requests to Cassandra.
AtlasDB clients will receive a |
IMPROVED |
AtlasDB publish of new releases is now done through the internal circleCI build instead of external circleCI. (Pull Request) |
IMPROVED |
AtlasDB no longer logs Cassandra retries at level WARN, thus reducing the volume of WARN logs by a significant factor. These logs are now available at INFO. (Pull Request) |
FIXED |
Sweep can now make progress after a restore and after the clean transactions CLI is run.
Earlier, it would fail throwing a |
FIXED |
Sweep will no longer run during KVS Migrations. (Pull Request) |
NEW LOGS |
Cassandra KVS now records how many writes have been made into each token range for each table. That information is logged at INFO every time the number of writes to a table exceeds a threshold (currently 100 000 writes). These logs should make it easier to identify hotspotting and to use targeted sweep. (Pull Request) |
NEW METRICS |
New metric added which reports the probability that a table is being written to unevenly. |
NEW |
|
DEV BREAK |
Removed several utility methods that are used by AtlasDB code.
|
NEW |
Added a CLI to read the punch table. The CLI receives an epoch time, in millis, and returns an approximation of the AtlasDB timestamp strictly before the given timestamp. (Pull Request) |
v0.70.1
30 November 2017
Type |
Change |
---|---|
DEV BREAK IMPROVED |
The
|
v0.70.0
30 November 2017
Type |
Change |
---|---|
IMPROVED |
When BackgroundSweeper decides to sweep a StreamStore VALUE table, it now first sweeps the respective StreamStore INDEX table. Previously, we swept only the VALUE table, which ended up not deleting any values in the backing store. (Pull Request) |
DEV BREAK METRICS |
The method |
IMPROVED LOGS |
All logging in |
DEV BREAK IMPROVED |
The previously deprecated |
FIXED |
Fixed a bug where setting |
FIXED |
Fixed an edge case where sweep would loop infinitely on tables that contained only tombstones. (Pull Request) |
FIXED METRICS |
|
IMPROVED DEV BREAK |
AtlasDB now wraps |
IMPROVED |
Sweep no longer fetches any values from Cassandra in CONSERVATIVE mode. This results in significantly less data being transferred from Cassandra to the client when sweeping tables with large values, such as stream store tables. (Pull Request) |
v0.69.0
23 November 2017
Type |
Change |
---|---|
FIXED |
Fixed the deletion of values from the StreamStore when configured to hash rowComponents. Previously, due to a deserialization bug, we wouldn’t delete the streamed data. If you think you’re affected by this bug, please contact the AtlasDB team to migrate away from this behavior. (Pull Request) |
FIXED |
We now avoid Cassandra timeouts caused by running unbounded CQL range scans during sweep. In order to assign a bound, we prefetch row keys using thrift, and use these bounds to page internally through rows. This issue affected tables configured to use THOROUGH sweep strategy — which could accumulate many rows entirely made up of tombstones — when Cassandra is configured as the backing store. (Pull Request) |
IMPROVED |
Applications can now easily determine whether their AtlasDB cluster is healthy by querying |
IMPROVED DEV BREAK |
AtlasDB will now consistently throw an |
FIXED |
|
IMPROVED |
Sweep now waits 1 day after generating a large number of tombstones before sweeping a table again. This behavior only applies when using Cassandra. (Pull Request) |
FIXED LOGS |
|
v0.68.0
16 November 2017
Type |
Change |
---|---|
FIXED |
HTTP clients for endpoints relating to the Paxos protocols ( |
v0.67.0
15 November 2017
Type |
Change |
---|---|
NEW |
AtlasDB clients are now able to live reload TimeLock URLs. This is required for internal work on running services in Kubernetes. We still require that clients are configured to use TimeLock (as opposed to a leader, remote timestamp/lock or embedded service) at install time. Note that this change does not affect TimeLock Server, which still requires knowledge of the entire cluster as well. Please consult the documentation for more detail regarding the config changes needed. (Pull Request 1 and Pull Request 2) |
DEPRECATED |
The |
IMPROVED |
AtlasDB clients using TimeLock can now start up with knowledge of zero TimeLock nodes.
Requests to TimeLock will throw |
IMPROVED DEV BREAK |
|
IMPROVED LOGS |
kvs-slow-log was added on all Cassandra calls. As with the original kvs-slow-log logs, the added logs have the |
NEW METRICS |
Metrics were added on all Cassandra calls. The
Note that the table calls mainly use the first three metrics of the above list. (Pull Request) |
NEW METRICS |
Metrics recording the number of Cassandra requests and the number of bytes read from and written to Cassandra were added.
The following metrics have been added, with the common prefix (package)
|
NEW METRICS |
Added metrics for cells read.
The read cells can be post-filtered at the CassandraKVS layer, when there are multiple versions of the same cell.
The filtered cells are recorded in the following metrics, with the common prefix (package)
The cells returned from the KVS layer are then recorded at the metric with the prefix (package)
Such cells can also be filtered out at the transaction layer, due to the Transaction Protocol. The filtered out cells are recorded in the metrics:
Finally, the metric that records the number of cells actually returned to the AtlasDB client is:
|
NEW METRICS |
Added metrics for written bytes at the Transaction layer:
|
NEW METRICS |
A metric was added for the cases where a large read was made:
Note that we also log a warning in these cases, with the message “A single get had quite a few bytes…”. (Pull Request) |
IMPROVED DEV BREAK |
AtlasDB will now consistently throw a |
IMPROVED LOGS |
|
FIXED METRICS |
|
FIXED |
|
IMPROVED |
AtlasDB now depends on Tritium 0.8.4, which depends on the same version of |
FIXED |
Check that immutable timestamp is locked on write transactions with no writes.
This could cause long-running readers to read an incorrect empty value when the table had the |
FIXED |
Paxos value information is now correctly being logged when applicable leader events are happening. (Pull Request) |
v0.66.0
7 November 2017
Type |
Change |
---|---|
IMPROVED |
AtlasDB now depends on Tritium 0.8.3, allowing products to upgrade Tritium without running into |
v0.66.0-rc2
6 November 2017
Type |
Change |
---|---|
IMPROVED |
AtlasDB now depends on Tritium 0.8.1. (Pull Request) |
IMPROVED |
AtlasDB can now tag RC releases. (Pull Request) |
v0.66.0-rc1
This version was skipped due to issues on release. No artifacts with this version were ever published.
v0.65.2
6 November 2017
Type |
Change |
---|---|
FIXED |
Reverted the Cassandra KVS executor PR (Pull Request) that caused a performance regression. (Pull Request) |
FIXED |
|
v0.65.1
4 November 2017
Type |
Change |
---|---|
IMPROVED |
AtlasDB now depends on Tritium 0.8.0, allowing products to upgrade Tritium without running into |
IMPROVED |
Sweep is now more efficient and less susceptible to OOMs on Cassandra.
Also, the default value for the sweep batch config parameter |
FIXED |
Fixed cursor leak when sweeping on oracle/postgres. (Pull Request) |
IMPROVED |
Sweep progress is now persisted as a blob and uses a KVS level table.
This allows us to use check and set to avoid versioning the entries in the sweep progress table.
As a result, loading the persisted SweepResult, which was previously linear in the size of the table being swept, can be done in constant time.
No migration is necessary as the data is persisted to a new table |
IMPROVED LOGS |
AtlasDB tables will now be logged as |
FIXED |
TracingKVS now has spans with safe names. (Pull Request) |
v0.65.0
This version was skipped due to issues on release. No artifacts with this version were ever published.
v0.64.0
1 November 2017
Type |
Change |
---|---|
FIXED |
UUIDs can now be used in schemas again.
Previously, schemas generated with UUIDs would reference the |
IMPROVED |
The executor used by the Cassandra KVS is now allowed to grow larger so that we can better delegate blocking to the underlying Cassandra client pools. Please note that for Cassandra users this may result in increased Atlas thread counts when receiving spikes in requests. The underlying throttling is the same, however, so Cassandra load shouldn’t be impacted. (Pull Request) |
IMPROVED METRICS |
|
IMPROVED LOGS |
Log host names in Cassandra* classes. (Pull Request) |
FIXED |
The executors used when async initializing objects are never shutdown anymore.
You should be affected by this bug only if you had |
v0.63.0
27 October 2017
Type |
Change |
---|---|
FIXED |
Fixed the deprecated |
IMPROVED METRICS |
Metrics are now recorded for put/get operations around commit timestamps. (Pull Request) |
v0.62.1
27 October 2017
Type |
Change |
---|---|
FIXED |
Updated our dependency on |
v0.62.0
26 October 2017
Improvements
Type |
Change |
---|---|
IMPROVED |
|
DEPRECATED IMPROVED |
|
DEV BREAK IMPROVED |
|
IMPROVED |
The duration between attempts to whitelist Cassandra nodes was reduced from 5 minutes to 2 minutes, and the minimum period a node is blacklisted for was reduced from 2 minutes to 30 seconds. This means we check the health of a blacklisted Cassandra node and whitelist it faster than before. (Pull Request) |
DEV BREAK IMPROVED |
The size of the transaction cache is now configurable. It is not anticipated end users will need to touch this; it is more likely that this will be configured via per-service overrides for the services for whom the current cache size is inadequate. If needed, configuring this parameter is available under the AtlasDbRuntimeConfig with the name timestampCacheSize. This is a small API change for users manually constructing a TransactionManager, which now requires a transaction cache size parameter. Please add it from the AtlasDbRuntimeConfig, or instead of manually creating a TransactionManager, utilize the builder in TransactionManagers to have this done for you. Note that even if the config is changed at runtime, the size of the cache doesn’t change dynamically until 2565 is resolved. (Pull Request 1) (Pull Request 2) |
IMPROVED |
Exposes another version of |
DEV BREAK IMPROVED |
Simplify and annotate the constructors for |
Logs and Metrics
Type |
Change |
---|---|
IMPROVED METRICS |
|
NEW METRICS |
AtlasDB clients now emit metrics that track the immutable timestamp, unreadable timestamp, and current timestamp. These metrics should help in performing diagnosis of issues concerning Sweep and/or the lock service. (Pull Request) |
FIXED METRICS |
Timelock server no longer appends client names to metrics. Instead, each metric is aggregated across all clients. (Pull Request) |
NEW METRICS |
We now report metrics for Transaction conflicts.
The metrics are a meter reported under the name |
IMPROVED LOGS |
Specified which logs from Cassandra* classes were Safe or Unsafe for collection, improving the data that we can collect for debugging purposes. (Pull Request) |
IMPROVED USER BREAK LOGS |
The |
FIXED LOGS |
TimeLock Server’s |
FIXED LOGS |
|
FIXED METRICS |
|
Bug fixes
Type |
Change |
---|---|
FIXED |
|
FIXED |
When AtlasDB thinks all Cassandra nodes are non-healthy, it logs a message containing “There are no known live hosts in the connection pool … We’re choosing one at random …”. The level of this log was reduced from ERROR to WARN, as it was spammy in periods of a Cassandra outage. (Pull Request) |
FIXED |
Timelock server will try to gain leadership synchronously the first time a new client namespace is requested. Previously, the first request would always return 503. (Pull Request) |
FIXED |
|
FIXED |
Async Initialization now works with TimeLock Server. Previously, for Cassandra we would attempt to immediately migrate the timestamp bound from Cassandra to TimeLock on startup, which would fail if either of them was unavailable. For DBKVS or other key-value services, we would attempt to ping TimeLock on startup, which would fail if TimeLock was unavailable (though the KVS need not be available). (Pull Request) |
FIXED |
|
FIXED |
Fixed an issue where a |
FIXED |
|
DEV BREAK FIXED |
Move |
v0.61.1
19 October 2017
Type |
Change |
---|---|
IMPROVED |
Reverted the Sweep rewrite for Cassandra as it would unnecessarily load values into memory which could cause Cassandra to OOM if the values are large enough. (Pull Request) |
v0.61.0
18 October 2017
Type |
Change |
---|---|
IMPROVED |
Sweep is now more efficient on Postgres and Oracle. (Pull Request) |
IMPROVED |
The |
FIXED LOGS |
Sweep candidate batches are now logged correctly.
Previously, we would log a |
v0.60.1
16 October 2017
Type |
Change |
---|---|
NEW IMPROVED |
AtlasDB now supports asynchronous initialization, where To enable asynchronous initialization, a new config option While waiting for AtlasDB to be ready, clients can poll The default value for the config is |
NEW |
Timelock server can now be configured to persist the timestamp bound in the database, specifically in Cassandra/Postgres/Oracle. We recommend configuring this only for cases where you absolutely need to persist all state in the database, for example, in special cases where backups are simply database dumps and do not have any mechanism for storing timestamps. This will help support large internal products' usage of the Timelock server. (Pull Request) |
DEV BREAK IMPROVED |
In order to limit the access to inner methods, and to make the implementation of asynchronous initialization feasible, we’ve extracted interfaces and renamed the following classes:
Now the factory methods for the above classes return the interfaces. The actual implementation of such classes was moved to their corresponding *Impl files. (Pull Request) |
DEV BREAK IMPROVED |
|
FIXED |
|
FIXED |
|
FIXED |
The Sweep endpoint and CLI now accept start rows regardless of the case these are presented in.
Previously, giving a start row with hex characters in lower case e.g. |
DEV BREAK |
Removed the following unnecessary classes related to wrapping KVSes:
|
FIXED |
Lock state logging will dump |
FIXED |
When using the TimeLock block and either the timestamp or the lock service threw an exception, we were throwing InvocationTargetException instead. We now throw the actual cause for the invocation exception. (Pull Request) |
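The Sweep endpoint fix in this release makes hex-encoded start rows case-insensitive. A minimal sketch of what case-insensitive hex decoding can look like follows; `HexStartRows` and `decode` are hypothetical names used for illustration, not the code AtlasDB uses.

```java
import java.util.Locale;

/** Illustrative only: decodes a hex-encoded start row without caring about letter case. */
class HexStartRows {
    static byte[] decode(String hex) {
        // Normalise the case up front so that "deadbeef" and "DEADBEEF" are treated identically.
        String normalized = hex.toUpperCase(Locale.ROOT);
        if (normalized.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even number of characters");
        }
        byte[] result = new byte[normalized.length() / 2];
        for (int i = 0; i < result.length; i++) {
            int high = Character.digit(normalized.charAt(2 * i), 16);
            int low = Character.digit(normalized.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("Not a hex string: " + hex);
            }
            result[i] = (byte) ((high << 4) | low);
        }
        return result;
    }
}
```

Under this sketch, `HexStartRows.decode("deadbeef")` and `HexStartRows.decode("DEADBEEF")` yield the same four bytes.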
v0.60.0
This version was skipped due to issues on release. No artifacts with this version were ever published.
v0.59.1
04 October 2017
Type |
Change |
---|---|
IMPROVED |
Allow passing a ProxyConfiguration to set a custom proxy on the TimeLock clients. (Pull Request) |
v0.59.0
04 October 2017
Type |
Change |
---|---|
IMPROVED |
Timestamp batching has now been enabled by default. Please see Timestamp Client Options for details. This should improve throughput and latency, especially if load is heavy and/or clients are communicating with a TimeLock cluster which is used by many services. Note that there may be an increase in latency under light load (e.g. 2-4 threads). (Pull Request) |
NEW |
AtlasDB now offers a simplified version of the schema API by setting the |
NEW |
AtlasDB now offers specifying |
NEW |
AtlasDB now offers specifying |
FIXED |
The |
FIXED |
Fixed migration from JDBC KeyValueService by adding a missing dependency to the CLI distribution. (Pull Request) |
FIXED |
Oracle auto-shrink is now disabled by default. This is an experimental feature allowing Oracle non-EE users to compact automatically. We decided to turn it off by default since we have observed timeouts for large amounts of data, until we find a better retry mechanism for shrink failures. (Pull Request) |
LOGS USER BREAK |
AtlasDB no longer tries to register Cassandra metrics for each pool with the same names. We now add poolN to the metric name in CassandraClientPoolingContainer, where N is the pool number. This will prevent spurious stacktraces in logs due to failure in registering metrics with the same name. (Pull Request) |
DEV BREAK FIXED |
Adjusted the remoting-api library version to match the version used by remoting3. Developers may need to check their dependencies, but no other actions should be required. (Pull Request) |
FIXED |
Adjusted optimizer hints for getRange() to prevent Oracle from picking a bad query plan. (Pull Request) |
v0.58.0
22 September 2017
Type |
Change |
---|---|
LOGS |
AtlasDB now logs slow CQL queries (via |
DEV BREAK FIXED |
AtlasDB now depends on okhttp 3.8.1. This is expected to fix an issue where connections would constantly throw “shutdown” exceptions, which was likely due to a documented bug in okhttp 3.4.1. (Pull Request) |
DEV BREAK IMPROVED |
Upgraded all uses of http-remoting from remoting2 to remoting3, except for serialization of errors (preserved for backwards wire compatibility).
Developers may need to check their dependencies, as well as update instantiation of their calls to |
FIXED |
KVS migration no longer fails when the old |
FIXED |
Path and query parameters for TimeLock endpoints have now been marked as safe.
Several logging parameters in TimeLock (e.g. in |
IMPROVED |
The |
FIXED |
Sweep log priority has been increased to INFO for logs emitted when a table (1) is starting to be swept, (2) will be swept with another batch, and (3) has just been completely swept. (Pull Request) |
v0.57.0
19 September 2017
Type |
Change |
---|---|
METRICS CHANGED |
From this version onwards, AtlasDB’s metrics no longer have unbounded multiplicity. This means that AtlasDB can be whitelisted in the internal metrics aggregator tool. |
METRICS USER BREAK |
AtlasDB no longer embeds Cassandra host names in its metrics. Aggregate metrics are retained in both CassandraClientPool and CassandraClientPoolingContainer. This was necessary for compatibility with an internal log-ingestion tool. (Pull Request) |
METRICS USER BREAK |
AtlasDB no longer embeds table names in Sweep metrics. Sweep aggregate metrics continue to be reported. This was necessary for compatibility with an internal log-ingestion tool. (Pull Request) |
DEV BREAK FIXED |
AtlasDB now depends on okhttp 3.8.1. This is expected to fix an issue where connections would constantly throw “shutdown” exceptions, which was likely due to a documented bug in okhttp 3.4.1. (Pull Request) |
DEV BREAK FIXED |
The |
DEV BREAK IMPROVED |
TimeLockAgent’s constructor now accepts a Supplier instead of an RxJava Observable. This reduces the size of the TimeLock Agent jar, and removes the need for a dependency on RxJava. To convert an RxJava Observable to a Supplier that always returns the most recent value, consider the method blockingMostRecent as implemented here. (Pull Request) |
IMPROVED |
|
IMPROVED |
AtlasDB table definitions now support specifying log safety without having to also specify value byte order for row components. (Pull Request) |
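The TimeLockAgent entry in this release mentions converting an RxJava Observable into a Supplier that always returns the most recent value. The idea can be sketched without RxJava as a small holder that is fed each new value and hands back the latest one. `MostRecentValue` is a hypothetical name, and this is not the `blockingMostRecent` method the entry refers to; wiring `accept` to an actual Observable subscription is omitted because it depends on the RxJava version in use.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Illustrative only: a Supplier that always returns the most recently accepted value. */
class MostRecentValue<T> implements Supplier<T>, Consumer<T> {
    private final AtomicReference<T> latest;

    MostRecentValue(T initialValue) {
        // An initial value means get() has something to return before the first update arrives.
        this.latest = new AtomicReference<>(initialValue);
    }

    /** Called for each new value, e.g. from a subscription callback. */
    @Override
    public void accept(T value) {
        latest.set(value);
    }

    /** Returns the latest value seen so far; never blocks. */
    @Override
    public T get() {
        return latest.get();
    }
}
```

A consumer that only needs the current configuration can then depend on `Supplier<T>` alone, which is the dependency reduction the entry describes.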
v0.56.1
14 September 2017
Type |
Change |
---|---|
IMPROVED |
The new concurrent version of Transaction#getRanges did not correctly guarantee ordering of the results returned in its stream. We now make sure the resulting ordering matches that of the input RangeRequests. (Pull Request) |
v0.56.0
12 September 2017
Type |
Change |
---|---|
NEW |
TimelockServer now exposes the LockService instead of the RemoteLockService if using the synchronous lock service. This will provide a more comprehensive API which is required by the large internal products. (Pull Request) |
USER BREAK NEW |
Timelock clients now report tritium metrics for the lock requests with the prefix |
DEV BREAK |
LockAwareTransactionManager now returns a LockService instead of a RemoteLockService in order to expose the new API. Any products that extend this class will have to change their class definition. (Pull Request) |
NEW |
Added two new methods to Transaction, getRangesLazy and a concurrent version of getRanges, which are also exposed in the Table API. If you expect to use only a small number of the rows in the provided ranges, it is often advisable to use the new getRangesLazy method and serially iterate over the results. Otherwise, you should use the new version of getRanges that allows explicitly operating on the resulting visitables in parallel. (Pull Request) |
DEPRECATED |
The existing getRanges method has been deprecated as it would eagerly load the first page of all ranges, potentially concurrently. This often caused more data to be fetched than necessary or higher concurrency than expected. The recommended alternative is to use getRanges with a specified concurrency level, or getRangesLazy. (Pull Request) |
USER BREAK FIXED |
AtlasDB no longer embeds user-agents in metric names.
This affects both AtlasDB clients as well as TimeLock Server.
All metrics are still available; however, metrics which previously included a user-agent component will no longer do so.
For example, the timer |
IMPROVED |
LockServerOptions now provides a builder, which means constructing one should not require overriding methods. (Pull Request) |
NEW |
Oracle will now validate connections by running the test query when getting a new connection from the HikariPool. (Pull Request) |
IMPROVED |
Cassandra range concurrency defaults lowered from 64x to 32x, to reflect default connection pool sizes that have shrunk over time, and to be more appropriate for fairly common smaller 3-node clusters. (Pull Request) |
v0.55.0
01 September 2017
Type |
Change |
---|---|
USER BREAK |
If AtlasDB is used with TimeLock, and the TimeLock client name is different than either the Cassandra |
NEW |
AtlasDB introduces a top-level |
DEPRECATED |
As a followup of the |
NEW |
Oracle SE will now automatically trigger table data shrinking to recover space after sweeping a table.
You can disable the compaction by setting |
FIXED |
Fixed an issue where sweep logs would get rolled over sooner than expected. The number of log files stored on disk was increased from 10 to 90 before rolling over. (Pull Request) |
v0.54.0
25 August 2017
Type |
Change |
---|---|
NEW |
Timelock clients now report tritium metrics for the |
FIXED |
AtlasDB clients now report tritium metrics for the |
FIXED |
|
FIXED |
The scrubber queue no longer grows without bound if the same cell is overwritten multiple times by hard delete transactions. (Pull Request) |
IMPROVED |
If |
FIXED |
Fixed a case where logging an exception suppressing itself would cause a stack overflow. See LOGBACK-1027. (Pull Request) |
NEW |
AtlasDB now produces a new artifact, |
IMPROVED |
TimeLock now creates client namespaces the first time they are requested, rather than requiring them to be specified in config.
This means that specifying a list of clients in Timelock configuration will no longer have any effect. Further, a new configuration property called |
DEPRECATED |
|
FIXED |
CharacterLimitType now has fields marked as final. (Pull Request) |
CHANGED |
The |
CHANGED |
Updated our dependency on |
v0.53.0
9 August 2017
Type |
Change |
---|---|
FIXED |
KVS migrations will no longer verify equality between the from and to KVSes for the sweep priority and progress tables. Note that these tables are still migrated across, as they provide heuristics for timely sweeping of tables. However, these tables may change during the migration, without affecting correctness (e.g. the from-kvs could be swept). Previously, we would attempt to check that the sweep tables were equal on both KVSes, leading to spurious validation failures. (Pull Request) |
NEW |
AtlasDB now supports specifying the safety of table names as well as row and column component names following the palantir/safe-logging library. Please consult the documentation for Tables and Indices for details on how to set this up. As AtlasDB regenerates its metadata on startup, changes will take effect after restarting your AtlasDB client (in particular, you do NOT need to rerender your schemas.) Previously, all table names, row component names and column names were always treated as unsafe. (Pull Request 1, Pull Request 2 and Pull Request 3) |
IMPROVED |
The |
DEV BREAK |
AtlasDB now throws an error during schema code generation stage if table length exceeds KVS limits.
To override this, please specify |
DEV BREAK |
|
DEV BREAK |
IteratorUtils.forEach removed; it’s not needed in a Java 8 codebase. (Pull Request) |
v0.52.0
1 August 2017
Type |
Change |
---|---|
FIXED |
Fixed a critical bug in Oracle that limits the number of writes with values greater than 2000 bytes to |
FIXED |
Changed schemas in the codebase to use Java 8 Optionals instead of Guava Optionals. (Pull Request) |
DEV BREAK |
Removed unused classes on AtlasDB.
If any issues arise from this change, please contact the development team. (Pull Request) |
v0.51.0
28 July 2017
Type |
Change |
---|---|
FIXED |
For DbKvs, the |
v0.50.0
27 July 2017
Type |
Change |
---|---|
FIXED USER BREAK |
TimeLock Server, if configured to use the async lock service, will now throw if a client attempts to start a transaction via the sync lock service. Previously, users who had clients (for the same namespace) running both pre- and post-0.49.0 versions of AtlasDB were able to run transactions against the sync and async lock services concurrently, thus breaking the guarantees of the lock service. AtlasDB does not support having clients (for the same namespace) running both pre- and post-0.49.0 versions. Note that TimeLock users who have clients (for different namespaces) running both pre- and post-0.49.0 versions will need to turn this feature off for clients on pre-0.49.0 versions to continue working with TimeLock, and should exercise caution in ensuring that, for each namespace, clients use only pre- or post-0.49.0 versions of AtlasDB. Please see Async Lock Service Configuration for documentation. (Pull Request) |
USER BREAK |
TimeLock Server has moved its parameter |
IMPROVED |
We reduced |
FIXED |
|
FIXED |
|
IMPROVED DEV BREAK |
OkHttp clients (created with |
IMPROVED USER BREAK |
AtlasConsole database mutation commands (namely |
FIXED |
Fixed a bug in AtlasConsole that caused valid table names not to be recognized. (Pull Request) |
NEW |
TimeLock Server now supports a |
FIXED |
Fixed a potential deadlock in |
NEW |
New metrics have been added for tracking Cassandra’s approximate pool size, number of idle connections, and number of active connections. (Docs) (Pull Request) |
v0.49.0
18 July 2017
Type |
Change |
---|---|
IMPROVED |
TimeLock Server can now process lock requests using async Jetty servlets, rather than blocking request threads. This leads to more stability and higher throughput during periods of heavy lock contention.
To enable this behavior, use the |
DEV BREAK IMPROVED |
The maximum time that a transaction will block while waiting for commit locks is now configurable, and defaults to 1 minute. This can be configured via the |
DEV BREAK |
|
USER BREAK |
This version of the AtlasDB client will require a version of Timelock server that exposes the new |
USER BREAK |
The timestamp batching functionality introduced in 0.48.0 is temporarily no longer supported when running with Timelock server. We will re-enable support for this in a future release. |
FIXED |
Fixed the broken |
FIXED |
Fixed an issue that could cause AtlasConsole to print unnecessary amounts of input when commands were run. (Pull Request) |
USER BREAK |
Remove Cassandra config option |
FIXED |
|
NEW |
|
v0.48.0
Type |
Change |
---|---|
FIXED |
If sweep configs are specified in the AtlasDbConfig block, they will be ignored, but AtlasDB will no longer fail to start.
This effectively fixes the Sweep-related user break change of version |
NEW |
AtlasDB now supports batching of timestamp requests on the client-side; see Timestamp Client Options for details. On internal benchmarks, the AtlasDB team has obtained an almost 2x improvement in timestamp throughput and latency under modest load (32 threads), and an over 10x improvement under heavy load (8,192 threads). There may be a very small increase in latency under extremely light load (e.g. 2-4 threads). Note that this is not enabled by default. (Pull Request) |
DEV BREAK |
The |
FIXED DEV BREAK |
|
FIXED |
|
FIXED |
|
v0.47.0
11 July 2017
Type |
Change |
---|---|
IMPROVED |
ErrorProne is enabled and not ignored on all AtlasDB projects. This means that AtlasDB can be whitelisted in the internal logging aggregator tool from this version onwards. (Pull Request) (Pull Request) (Pull Request) (Pull Request) |
NEW IMPROVED |
Background Sweep is enabled by default on AtlasDB. To understand what Background Sweep is, please check the sweep docs, in particular, the background sweep docs. (Pull Request) |
USER BREAK DEV BREAK IMPROVED |
Added support for live-reloading sweep configurations.
Specifying any of the above install options will result in AtlasDB failing to start. Check the full configuration docs here. (Pull Request) |
FIXED |
Fixed a bug that caused AtlasDB internal tables (e.g. the Transactions table or the Punch table) to be wiped when read from the AtlasDB Console. (Pull Request) |
USER BREAK |
The Atomix algorithm implementation for the TimeLock server and the corresponding configurations have been removed.
The default algorithm for |
USER BREAK |
The previously deprecated RocksDBKVS has been removed. Developers that relied on RocksDB for testing should move to H2 on JdbcKvs. (Pull Request) |
FIXED IMPROVED |
Sweep now dynamically adjusts the number of (cell, ts) pairs across runs:
This should fix the issue where we were unable to sweep cells with a high number of mutations. (Pull Request) |
IMPROVED |
Default configs which tune sweep runs were lowered, to ensure that sweep works in any situation. For more information, please check the sweep docs. Please delete any config overrides regarding sweep and use the default values, to ensure a sane run of sweep. (Pull Request) |
NEW |
AtlasDB now instruments embedded timestamp and lock services when no leader block is present in the config, to expose aggregate response time and service call metrics.
Note that this may cause a minor performance hit.
If that is a concern, the instrumentation can be turned off by setting the tritium system properties |
NEW |
AtlasDB now adds endpoints for sweeping a specific table, with options for startRow and batch config parameters. This should be used in place of the deprecated sweep CLIs. Check the endpoints documentation here. (Pull Request) |
IMPROVED |
Improved performance of timestamp and lock requests on clusters with a leader block and a single node. If a single leader is configured, timestamp and lock requests will no longer use HTTPS/Jetty. In addition to the minor perf improvement, this fixes an issue causing livelock/deadlock when the leader is under heavy load. We recommend HA clusters under heavy load switch to using a standalone timestamp service, as they may also be vulnerable to this failure mode. (Pull Request) |
IMPROVED DEV BREAK |
The dropwizard independent implementation of the TimeLock server has been separated into a new project, |
FIXED |
JDBC KVS now batches cells in put/delete operations via the config parameter |
FIXED |
AtlasDB clients now retry lock requests if the server loses leadership while the request is blocking. In the past, this scenario would cause the server to return 500 responses that were not retried by the client. Now the server returns 503 responses, which triggers the correct retry logic. (Pull Request) |
FIXED |
AtlasDB now generates Maven POM files for shadowed jars correctly. Previously, we would regenerate the XML for shadow dependencies by creating a node with corresponding groupId, artifactId, scope and version tags only, which is incorrect because it loses information about, for example, specific or transitive exclusions. We now respect these additional tags. (Pull Request) |
FIXED |
Fixed a bug where a timelock server instance could get stuck in an infinite loop if it was cut off from the other nodes and failed to achieve a quorum. (Pull Request) |
IMPROVED USER BREAK |
Improved the way rows and named columns are outputted in AtlasConsole to be more intuitive and easier to use. Note that this may break existing AtlasConsole scripts. (Pull Request) |
FIXED |
Added backwards compatibility for the changes introduced in #2067, in particular, for passing row values into AtlasConsole functions. (Pull Request) |
v0.46.0
This version was skipped due to issues on release. No artifacts with this version were ever published.
v0.45.0
19 June 2017
Type |
Change |
---|---|
DEV BREAK IMPROVED |
Upgraded all usages of http-remoting to remoting2.
Previously, depending on the use case, AtlasDB would use http-remoting, remoting1 and remoting2.
Developers may need to check their dependencies, as well as update instantiation of their calls to |
DEV BREAK IMPROVED |
AtlasDB has updated Feign to 8.17.0 and OkHttp to 3.4.1, following remoting2 in the palantir/http-remoting library. We previously used Feign 8.6.1 and OkHttp 2.5.0. Developers may need to check their dependencies, especially if they relied on AtlasDB to pull in Feign and OkHttp as there were package name changes. (Pull Request 1) and (Pull Request 2) |
DEV BREAK IMPROVED |
AtlasDB now shades Feign and Okio (same as palantir/http-remoting). This was done to enable us to synchronize with remoting2 while limiting breaks for users of older versions of Feign, especially given an API break in Feign 8.16. Users who previously relied on AtlasDB to pull in these libraries may experience a compile break, and should consider explicitly depending on them. (Pull Request) |
DEV BREAK IMPROVED |
Converted all compile time uses of Guava’s |
DEPRECATED IMPROVED |
AssertUtils logging methods will now ask for an SLF4J logger to log to, instead of using a default logger. This should make log events from AssertUtils easier to filter. (Pull Request) |
FIXED |
JDBC KVS now batches cells in get operations via the config parameter |
FIXED |
The CLI distribution can now be run against JDBC with hikari connection pools. In the past, it would fail to resolve the configuration due to a missing runtime dependency. Note: this is not a problem if running with the dropwizard bundle. (Pull Request) |
FIXED |
Fixed an issue where the lock service was not properly shut down after losing leadership, which could result in threads blocking unnecessarily. (Pull Request) |
FIXED |
Lock refresh requests are no longer restricted by lock service threadpool limiting. This allows transactions to make progress even when the threadpool is full. (Pull Request) |
FIXED |
Lock service now ensures that locks are reaped in a more timely manner. Previously the lock service could allow locks to be held past expiration, if they had a timeout shorter than the longest timeout in the expiration queue. (Pull Request) |
NEW |
Added a getRow() command to AtlasConsole for retrieving a single row. (Pull Request) |
NEW |
Added a rowComponents() function to the AtlasConsole table() command to allow you to easily view the fields that make up a row key. (Pull Request) |
NEW |
The default lock timeout is now configurable. Currently, the default lock timeout is 2 minutes. This can cause a large delay if a lock requester’s connection has died at the time it receives the lock. Since TransactionManagers#create provides an auto-refreshing lock service, it is safe to lower the default timeout to reduce the delay that happens in this case. (Pull Request) |
IMPROVED |
The priority of logging on background sweep was increased from debug to info or warn. (Pull Request) |
IMPROVED |
The lock service state logger now has a reduced memory footprint. It also now logs the locking mode for each lock. (Pull Request) |
IMPROVED |
Reduced the logging level of some messages relating to check-and-set operations in |
v0.44.0
8 June 2017
Type |
Change |
---|---|
IMPROVED |
|
IMPROVED |
Added new meter metrics for cells swept/deleted and failures to acquire persistent lock. (Pull Request) |
IMPROVED |
The Cassandra thrift driver has been bumped to version 3.10. This fixes a bug (#1654) in the underlying driver: when Atlas probed downed Cassandra nodes every few minutes to see whether they were up and working again, leaked connections steadily built up and could eventually take out the entire cluster. (Pull Request) |
IMPROVED |
Read-only transactions will no longer make a remote call to fetch a timestamp, if no work is done on the transaction. This will benefit services that execute read-only transactions around in-memory cache operations, and frequently never fall through to perform a read. (Pull Request) |
IMPROVED |
Timelock service now includes user agents for all inter-node requests. (Pull Request) |
NEW |
Timelock now tracks metrics for leadership elections, including leadership gains, losses, and proposals. (Pull Request) |
FIXED |
Fixed a severe performance regression in getRange() on Oracle caused by an inadequate query plan being chosen sometimes. (Pull Request) |
FIXED |
Fixed a potential out-of-memory issue by limiting the number of rows getRange() can request from Postgres at once. (Pull Request) |
FIXED |
KVS migration CLI will now clear the checkpoint tables that are required while the migration is in progress but not after the migration is complete. The tables were previously left hanging and the user had to delete/truncate them. (Pull Request) |
DEV BREAK |
Some downstream projects were using empty table metadata for dev-laziness reasons in their tests. This is no longer permitted, as it leads to many (unsolved) questions about how to deal with such a table. If this breaks your tests, you can fix it by creating a real schema for tests or by switching to AtlasDbConstants.GENERIC_TABLE_METADATA. (Pull Request) |
USER BREAK FIXED |
Fixed a bug that caused Cassandra to always use the minimum compression block size of 4KB instead of the requested compression block size. Users must explicitly rewrite table metadata for any tables that requested explicit compression, as any tables that were created previously will not respect the compression block size in the schema. This can have a very large performance impact (both positive and negative in different cases), so users may need to remove the explicit compression request from their schema if this causes a performance regression. Users that previously attempted to set a compression block size that was not a power of 2 will also need to update their schema because Cassandra only allows this value to be a power of 2. (Pull Request) |
v0.43.0
25 May 2017
Type |
Change |
---|---|
FIXED |
For requests that fail due to networking or other IOException, the AtlasDB client now backs off before retrying. (Pull Request) |
USER BREAK IMPROVED |
The |
FIXED |
|
DEPRECATED |
The FastForwardTimestamp and FetchTimestamp CLIs have been deprecated.
Please use the |
IMPROVED |
Sweep now batches delete calls before executing them.
This should improve performance on relatively clean tables by deleting more cells at a time, leading to fewer DB operations and taking out the backup lock less frequently.
The new configuration parameter |
CHANGED |
Sweep metrics now record counts of cell-timestamp pairs examined rather than the count of entire cells examined. This provides more accurate insight on the work done by the sweeper. (Pull Request) |
DEPRECATED |
The Sweep CLI configuration parameters |
DEPRECATED |
The background sweep configuration parameters |
FIXED |
After the Pull Request #1808 the TimeLock Server did not gate the lock service behind the |
FIXED |
|
DEV BREAK |
New |
v0.42.2
25 May 2017
Type |
Change |
---|---|
FIXED |
|
v0.42.1
24 May 2017
Type |
Change |
---|---|
FIXED |
|
v0.42.0
23 May 2017
Type |
Change |
---|---|
FIXED |
|
FIXED |
A 500 ms backoff has been added to our retry logic when the client has queried all the servers of a cluster and received a |
IMPROVED |
Timelock server can now start with an empty clients list. Note that you currently need to restart timelock when adding clients to the configuration. (Pull Request) |
IMPROVED |
Default This parameter is set at table creation time, and it will only apply for new tables.
Existing customers can update the |
IMPROVED |
|
v0.41.0
17 May 2017
Type |
Change |
---|---|
USER BREAK CHANGED |
Projects |
DEV BREAK IMPROVED |
The format of serialized exceptions occurring on a remote host has been brought in line with that of the palantir/http-remoting library.
This should generally improve readability and also allows for more meaningful messages to be sent; we would previously return message bodies with no content for some exceptions (such as |
NEW |
Timelock server now has jar publication in addition to dist publication. (Pull Request) |
NEW FIXED |
TimeLock clients may now receive an HTTP response with status code 503, encapsulating a |
FIXED |
|
FIXED |
AtlasDB Console no longer errors on range requests that used a column selection and had more than one batch of results. (Pull Request) |
IMPROVED |
The |
FIXED |
Import ordering and license generation in generated IntelliJ project files now respect Baseline conventions. (Pull Request) |
FIXED IMPROVED |
Cassandra thrift dependencies have been bumped to newer versions. This should fix a bug (#1654) in the underlying driver: when Atlas probed downed Cassandra nodes every few minutes to see whether they were up and working again, leaked connections steadily built up and could eventually take out the entire cluster. (Pull Request) |
v0.40.1
4 May 2017
This release contains (almost) exclusively baseline-related changes.
Type |
Change |
---|---|
DEV BREAK |
The Lock Descriptor classes ( |
DEV BREAK |
Removed package |
CHANGED |
Our dependency on immutables was bumped from 2.2.4 to 2.4.0, in order to fix an issue with static code analysis reporting errors in generated code. (Pull Request) |
DEV BREAK |
Renamed the following classes to match baseline rules. In each case, acronyms were lowercased, e.g.
|
DEV BREAK |
Relax the signature of KeyValueService.addGarbageCollectionSentinelValues() to take an Iterable instead of a Set. (Pull Request) |
v0.40.0
28 Apr 2017
Type |
Change |
---|---|
USER BREAK |
AtlasDB will refuse to start if backed by Postgres 9.5.0 or 9.5.1. These versions contain a known bug that causes incorrect results to be returned for certain queries. (Pull Request) |
USER BREAK IMPROVED |
The lock server now dumps all held locks and outstanding lock requests to a YAML file when a state log is requested, for easy readability and further processing. This will make debugging lock congestion easier. Lock descriptors are replaced with placeholders, which can be decoded using a descriptors file written to the same folder. Information such as requesting clients, requesting threads and other details can be found in the YAML. Note that this change modifies serialization of lock tokens by adding the name of the requesting thread to the lock token; thus, TimeLock Servers are no longer compatible with AtlasDB clients from preceding versions. (Pull Request) |
DEV BREAK FIXED |
Correct |
NEW |
The lock server can now limit the number of concurrent open lock requests from the same client.
This behavior can be enabled with the flag |
DEV BREAK NEW |
The Applications can now call |
IMPROVED |
On graceful shutdown, the background sweeper will now release the backup lock if it holds it.
This should reduce the need for users to manually reset the |
IMPROVED |
Improved performance of |
DEV BREAK |
|
DEPRECATED |
|
v0.39.0
19 Apr 2017
Type |
Change |
---|---|
IMPROVED |
Refactored |
NEW |
The lock server now has a Specifically, the timelock server has a configuration parameter |
DEPRECATED |
Deprecated |
FIXED |
Proxies created via |
NEW |
The |
v0.38.0
6 Apr 2017
Type |
Change |
---|---|
IMPROVED |
The default |
FIXED |
Reverted #1524, which caused dependency issues in upstream products. Once we have resolved these issues, we will reintroduce the change, which was originally part of AtlasDB 0.37.0. (Pull Request) |
FIXED |
Creating a postgres table with a long name now throws a |
FIXED |
Fixed a performance regression introduced in #582, which caused sub-optimal batching behaviour when getting large sets of rows in Cassandra. The benchmark, intentionally set up in #1770 to highlight the break, shows a 10x performance improvement. (Pull Request) |
FIXED |
Correctness issue fixed in the |
DEV BREAK |
The |
NEW |
|
IMPROVED |
Timelock users who start an embedded timestamp and lock service without reverse-migrating now encounter a more informative error message. (Pull Request) |
v0.37.0
Removed 6 Apr 2017 due to dependency issues. Please use 0.38.0 instead.
Released 29 Mar 2017
Type |
Change |
---|---|
FIXED |
Fixed an issue where a |
NEW |
Added Dropwizard metrics for sweep, exposing aggregate and table-specific counts of cells examined and stale values deleted. (Pull Request) |
NEW |
Added a benchmark |
FIXED |
KVS migrations now maintain the guarantee of the timestamp service to hand out monotonically increasing timestamps. Previously, we would reset the timestamp service to 0 after a migration, but now we use the correct logical timestamp. (Pull Request) |
IMPROVED |
Improved performance of paging over dynamic columns on Oracle DBKVS: the time required to page through a large wide row is now linear rather than quadratic in the length of the row. (Pull Request) |
DEPRECATED |
|
DEV BREAK |
|
FIXED |
RemoteLockService clients will no longer silently retry on connection failures to the Timelock server. This is used to mitigate issues with frequent leadership changes owing to #1680. Previously, because of Jetty’s idle timeout and OkHttp’s silent connection retrying, we would generate an endless stream of lock requests if using HTTP/2 and blocking for more than the Jetty idle timeout for a single lock. This would lead to starvation of other requests on the TimeLock server, since a lock request blocked on acquiring a lock consumes a server thread. (Pull Request) |
v0.36.0
15 Mar 2017
Type |
Change |
---|---|
FIXED |
Fixed DBKVS sweep OOM issue (#982) caused by very wide rows.
In case of a single row that is too wide, this may result in |
FIXED |
Actions run by the |
FIXED |
Fixed an unnecessarily long-held connection in Oracle table name mapping code. (Pull Request) |
FIXED |
Fixed an issue where we excessively log after successful transactions. (Pull Request) |
FIXED |
Fixed an issue where the |
NEW |
AtlasDB now instruments services to expose aggregate response time and service call metrics for keyvalue, timestamp, and lock services. (Pull Request) |
DEV BREAK IMPROVED |
|
NEW |
Added the following benchmarks for paging over columns of a very wide row:
|
DEPRECATED |
The public |
v0.35.0
3 Mar 2017
Type |
Change |
---|---|
IMPROVED |
Timelock server now specifies minimum and maximum heap size of 512 MB. This should improve GC performance per the comments in #1594. (Pull Request) |
FIXED |
The background sweeper now uses deleteRange instead of truncate when clearing the |
IMPROVED |
Cassandra now attempts to truncate when performing a |
NEW |
Users can now create a Docker image and run containers of the Timelock Server, by running |
FIXED |
AtlasDB CLIs run via the Dropwizard bundle can now work with a Timelock block, and will contact the relevant Timelock server for timestamps or locks in this case. Previously, these CLIs would throw an error that a leader block was not specified. Note that CLIs will not perform automated migrations. (Pull Request) |
IMPROVED |
Cassandra truncates that are going to fail will do so faster. (Pull Request) |
DEV BREAK |
The persistent lock endpoints now use |
FIXED |
The |
DEV BREAK |
The persistent lock release endpoint has now been renamed to |
v0.34.0
23 Feb 2017
Type |
Change |
---|---|
NEW |
Timelock server now supports HTTP/2, and the AtlasDB HTTP clients enable a required GCM cipher suite. This feature improves performance of the Timelock server. Any client that wishes to connect to the timelock server via HTTP/2 must add jetty_alpn_agent as a javaAgent JVM argument, otherwise connections will fall back to HTTP/1.1 and performance will be considerably slower. For an example of how to add this dependency, see our timelock-server/build.gradle file. (Pull Request) |
FIXED |
AtlasDB Perf CLI can now output KVS-agnostic benchmark data (such as |
v0.33.0
22 Feb 2017
Type |
Change |
---|---|
FIXED |
AtlasDB HTTP clients are now compatible with OkHttp 3.3.0+, and no longer assume that header names are specified in Train-Case. This fix enables the Timelock server and AtlasDB clients to use HTTP/2. (Pull Request) |
FIXED |
Canonicalised SQL strings will now have contiguous whitespace rendered as a single space as opposed to the first character of said whitespace. This is important for backwards compatibility with an internal product. (Pull Request) |
NEW |
Added the option to perform a dry run of sweep via the Sweep CLI.
When This feature was introduced to avoid accidentally generating more tombstones than the Cassandra tombstone threshold (default 100k) introduced in CASSANDRA-6117.
If you delete more than 100k cells and thus cross the Cassandra threshold, then Cassandra may reject read requests until the tombstones have been compacted away.
Customers wishing to run Sweep should first run with the |
FIXED |
Fixed atlasdb-commons Java 1.6 compatibility by removing tracing from |
FIXED |
Persisted locks table is now considered an Atomic Table.
|
FIXED |
Reverted PR #1577 in 0.32.0 because this change prevents AtlasDB clients from downgrading to earlier versions of AtlasDB. We will merge a fix for MRTSE once we have a solution that allows a seamless rollback process. This change is also reverted on 0.32.1. (Pull Request) |
IMPROVED |
Reduced contention on |
v0.32.1
21 Feb 2017
Type |
Change |
---|---|
FIXED |
Reverted PR #1577 in 0.32.0 because this change prevents AtlasDB clients from downgrading to earlier versions of AtlasDB. We will merge a fix for MRTSE once we have a solution that allows a seamless rollback process. This change is also reverted on develop. (Pull Request) |
v0.32.0
16 Feb 2017
Type |
Change |
---|---|
FIXED |
Fixed erroneous occurrence of |
IMPROVED |
AtlasDB HTTP clients will now have a user agent of |
NEW |
Sweep now takes out a lock to ensure data is not corrupted during online backups. Users performing live backups should grab this lock before performing a backup of the underlying KVS, and then release the lock once the backup is complete. This enables the backup to safely run alongside either the background sweeper or the sweep CLI. (Pull Request) |
NEW |
Initial support for tracing Key Value Services integrating with http-remoting tracing. (Pull Request) |
IMPROVED |
Improved heap usage during heavy DBKVS querying by reducing mallocs in |
IMPROVED |
Removed an unused hamcrest import from the timestamp-impl project. This should reduce the size of our transitive dependencies, and therefore the size of product binaries. (Pull Request) |
FIXED |
Fixed schema generation with Java 8 optionals.
To use Java8 optionals, supply |
DEV BREAK |
Modified the type signature of |
IMPROVED |
Reduced logging noise from large Cassandra gets and puts by removing ERROR messages and only providing stacktraces at DEBUG. (Pull Request) |
NEW |
Upon startup of an AtlasDB client with a The client will fast-forward the Timelock Server’s timestamp bound to that of the embedded service. The client will now also invalidate the embedded service’s bound, backing this up in a separate row in the timestamp table. Automated migration is only supported for Cassandra KVS at the moment. If using DBKVS or other key-value services, it remains the user’s responsibility to ensure that they have performed the migration detailed in Migration to External Timelock Services. (Pull Request 1, Pull Request 2, and Pull Request 3) |
FIXED |
Fixed multiple scenarios where DBKVS can run into deadlocks due to unnecessary connections. (Pull Request) |
v0.31.0
8 Feb 2017
Type |
Change |
---|---|
IMPROVED DEV BREAK |
Improved Oracle performance on DBKVS by preventing excessive reads from the _namespace table when initializing SweepStrategyManager.
Replaced |
DEV BREAK |
Removed the unused |
DEV BREAK |
Fast forwarding a persistent timestamp service to We are introducing this break to prevent accidental corruption by forgetting to submit the fast-forward timestamp. (Pull Request) |
FIXED |
Oracle queries now use the correct hints when generating the query plan. This will improve performance for Oracle on DB KVS. (Pull Request) |
USER BREAK |
Oracle table names can now have a maximum length of 27 characters instead of the previous limit of 30.
This is to ensure consistency in naming the primary key constraint which adds a prefix of Since Oracle support is still in beta, we are not providing an automatic migration path from older versions of AtlasDB. (Pull Request) |
FIXED |
Support for Oracle 12c batch responses. (Pull Request) |
v0.30.0
27 Jan 2017
Type |
Change |
---|---|
FIXED DEV BREAK |
Fixed schema generation with Java 8 optionals.
To use Java8 optionals, supply Additionally, this fix requires all AtlasDB clients to regenerate their schema, even if they do not use the Java 8 optionals. (Pull Request) |
FIXED |
Prevent deadlocks in an edge case where we perform parallel reads with a small connection pool on DB KVS. (Pull Request) |
NEW |
Added support for benchmarking custom Key Value Stores. In the future this will enable performance regression testing for Oracle. See our performance writing documentation for details. (Pull Request) |
IMPROVED |
Don’t retry interrupted remote calls. This should have the effect of shutting down faster in situations where we receive a |
IMPROVED |
Added request and exception rates metrics in CassandraClientPool. This will provide access to 1-, 5-, and 15-minute moving averages. (Pull Request) |
IMPROVED |
More informative logging around retrying of transactions. If a transaction succeeds after being retried, we log the success (at the INFO level). If a transaction failed, but will be retried, we now also log the number of failures so far (at INFO). (Pull Request) |
IMPROVED |
Updated our dependency on |
v0.29.0
17 Jan 2017
Type |
Change |
---|---|
NEW |
Returned |
FIXED |
Stream store compression, introduced in 0.27.0, no longer creates a transaction inside a transaction when streaming directly to a file.
Additionally, a check was added to enforce the condition imposed in 0.28.0, namely that the caller of |
IMPROVED |
AtlasDB timestamp and lock HTTPS communication now uses the JVM-optimized CBC cipher suite instead of the slower GCM. (Pull Request) |
NEW |
Added a new |
FIXED |
Reverted the |
v0.28.0
13 Jan 2017
Type |
Change |
---|---|
DEV BREAK |
The |
IMPROVED |
Increased the default Cassandra connection pool size from a minimum of 20 and a maximum of 5x the minimum (100 if the minimum was not modified) to a minimum of 30 and a maximum of 100 connections. This has empirically shown better handling of bursts of requests that would otherwise require creating many new connections to Cassandra from the clients. (Pull Request) |
NEW |
Added metrics to SnapshotTransaction to monitor durations of various operations such as |
NEW |
Added metrics in Cassandra clients to record connection pool statistics and exception rates.
These metrics use the global |
NEW |
There is now a This capability was added so we can automate the migration to an external Timelock service in a future release. (Pull Request) |
FIXED |
Allow tables declared with |
FIXED |
Fix an issue with stream store where pre-loading the first block of an input stream caused us to create a transaction inside another transaction.
To avoid this issue, it is now the caller’s responsibility to ensure that |
IMPROVED |
|
FIXED |
All SnapshotTransaction |
USER BREAK |
Users must not create a client named |
v0.27.2
10 Jan 2017
Type |
Change |
---|---|
FIXED |
Fixed an issue with |
v0.27.1
6 Jan 2017
Type |
Change |
---|---|
FIXED |
Fixed an edge case in stream stores where we throw an exception for using the exact maximum number of bytes in memory. This behavior was introduced in 0.27.0 and does not affect stream store usage pre-0.27.0. (Pull Request) |
IMPROVED |
Clients now back off when a request to Cassandra hits a socket timeout, to put back pressure on the client and to spread out the load incurred on the remaining servers when a failover occurs. (Pull Request) |
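The backoff on socket timeouts described for v0.27.1 can be illustrated with a minimal sketch. This is not the AtlasDB implementation: the class name `TimeoutBackoff`, its method names, and the choice of a capped exponential schedule are all assumptions made for illustration, since the changelog does not state which schedule is used.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of AtlasDB: retries a call after socket
// timeouts, sleeping for a capped exponential delay between attempts.
final class TimeoutBackoff {
    private TimeoutBackoff() {}

    // Delay before the retry that follows the given (1-based) failed attempt:
    // base, 2 * base, 4 * base, ... capped at maxMillis.
    public static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        int shift = Math.min(attempt - 1, 20); // bound the shift to avoid overflow
        return Math.min(maxMillis, baseMillis << shift);
    }

    // Runs the call, backing off and retrying only on SocketTimeoutException.
    // Any other exception, or a timeout on the last attempt, is rethrown.
    public static <T> T callWithBackoff(
            Callable<T> call, int maxAttempts, long baseMillis, long maxMillis) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (SocketTimeoutException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
            }
        }
    }
}
```

Sleeping on the calling thread is what produces back pressure in this sketch: a caller that keeps timing out issues fewer requests per second, which gives the remaining servers room to absorb the load after a failover.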
v0.27.0
6 Jan 2017
Type |
Change |
---|---|
NEW |
AtlasDB now supports stream store compression.
Streams can be compressed client-side by adding the For information on using the stream store, see Streams. (Pull Request) |
IMPROVED |
|
IMPROVED |
Increased Cassandra connection pool idle timeout to 10 minutes, and reduced eviction check frequency to 20-30 seconds at 1/10 of connections. This should reduce bursts of stress on idle Cassandra clusters. (Pull Request) |
NEW |
There is a new configuration called See Cassandra KVS Config for details on configuring AtlasDB with Cassandra. (Pull Request) |
IMPROVED |
Improved the performance of Oracle queries by making the table name cache global to the KVS level. Keeping the mapping in a cache saves one DB lookup per query, when the table has already been used. (Pull Request) |
FIXED |
Oracle value style caching is now scoped per KVS rather than per JVM. The previous per-JVM scope could, in extremely rare cases, have caused issues for users in non-standard configurations, such as users doing a KVS migration to move from one Oracle DB to another. (Pull Request) |
NEW |
We now publish a runnable distribution of AtlasCli that is available for download directly from Bintray. (Pull Request 1) and (Pull Request 2) |
IMPROVED |
Enabled garbage collection logging for CircleCI builds. This may be useful for investigating pre-merge build failures. (Pull Request) |
IMPROVED |
Updated our dependency on |
NEW |
Added KeyValueStore.deleteRange(), which makes deleting large swathes of rows faster, for example in transaction sweeping. It can also be used as a fallback by users whose backup solutions do not allow truncate() during a backup. (Pull Request) |
v0.26.0
5 Dec 2016
Type |
Change |
---|---|
IMPROVED |
Added Javadocs to |
IMPROVED |
Substantially improved performance of the DBKVS implementation of the single-iterator version of getRowsColumnRange. Two new performance benchmarks were added as part of this PR:
These benchmarks show a 2x improvement on Postgres, and an AtlasDB client has observed an order of magnitude improvement experimentally. (Pull Request) |
IMPROVED |
The OkHttpClient connection pool is now configured to have 100 idle connections with a 10 minute keep-alive, reducing the number of connections that need to be created when a large number of transactions begin. (Pull Request) |
IMPROVED |
Commit timestamp lookups are now cached across transactions. This provided a near 2x improvement in our performance benchmark testing. See comments on the pull request for details. (Pull Request) |
IMPROVED |
|
IMPROVED |
When we hit the |
v0.25.0
25 Nov 2016
Type |
Change |
---|---|
FIXED |
|
USER BREAK |
The migration |
NEW |
Dbkvs: ConnectionSupplier consumers can now choose to receive a brand new unshared connection. (Pull Request) |
NEW |
AtlasDB now supports Cassandra 3.7 as well as Cassandra 2.2.8. (Pull Request) |
IMPROVED |
Oracle perf improvement; table names now cached, resulting in fewer round trips to the database. (Pull Request) |
IMPROVED |
|
IMPROVED |
Better support for AtlasDB clients running behind load balancers. In particular, if an AtlasDB client falls down and its load balancer responds with “503: Service Unavailable”, the request will be attempted on other clients rather than aborting. (Pull Request) |
FIXED |
Oracle will not drop a table that already exists on |
FIXED |
Certain Oracle KVS calls no longer leak connections created internally. (Pull Request) |
FIXED |
OracleKVS: |
DEV BREAK |
Our Jackson version has been updated from 2.5.1 to 2.6.7 and Dropwizard version from 0.8.2 to 0.9.3. (Pull Request) |
IMPROVED |
Additional debugging available for those receiving ‘name must be no longer than 1500 bytes’ errors. (Pull Request) |
DEV BREAK |
|
v0.24.0
15 Nov 2016
Type |
Change |
---|---|
USER BREAK |
All Oracle table names will be truncated and be of the form: Oracle is in beta, and thus we have not built a migration path from old table names to new table names. (Pull Request) |
FIXED |
The fetch timestamp CLI correctly handles |
FIXED |
When using DBKVS with Oracle, |
FIXED |
The KVS migration CLI will now decrypt encrypted values in your KVS configuration. (Pull Request) |
IMPROVED |
If using the Dropwizard command to run a KVS migration, the Dropwizard config will be used as the See the documentation for details on how to use the KVS migration command. (Pull Request) |
FIXED |
The timestamp bound store now works with Oracle as a relational backing store. (Pull Request) |
IMPROVED |
CLIs now output to standard out, standard error, and the service logs, rather than only printing to the service logs. This should greatly improve usability for service admins using the CLIs. (Pull Request) |
IMPROVED |
Remove usage of |
IMPROVED |
|
NEW |
Add support for generating schemas with Java 8 Optionals instead of Guava Optionals.
To use Java 8 Optionals, supply |
v0.23.0
8 Nov 2016
Type |
Change |
---|---|
DEV BREAK |
All KVSs are now guaranteed to throw a RuntimeException on attempts to truncate a nonexistent table, so services should check that a table exists before attempting to truncate it. Previously, only the Cassandra KVS threw an exception in this case. (Pull Request) |
FIXED |
The KVS migration command now supports the |
DEPRECATED |
Schema generated code still contains use of |
NEW |
We now provide Oracle support (beta) for all valid schemas. Oracle table names exceeding 30 characters are now mapped to shorter names by truncating and appending a sequence number. Support for Oracle is currently in beta and services wishing to deploy against Oracle should contact the AtlasDB team. See Oracle Table Mapping for details on how table names are mapped. (Pull Request) |
CHANGED |
We now test against Cassandra 2.2.8, rather than Cassandra 2.2.7. (Pull Request) |
IMPROVED |
Added a significant amount of logging aimed at tracking down the |
IMPROVED |
Retrying a Cassandra operation now retries against distinct hosts. Previously, this would independently select hosts randomly, meaning that we might unintentionally try the same operation on the same servers. (Pull Request) |
FIXED |
AtlasDB clients can start when a single Cassandra node is unreachable. (Pull Request) |
IMPROVED |
Removed spurious error logging during first-time startup against a brand new Cassandra cluster. (Pull Request) |
IMPROVED |
Improved the reliability of starting up against a degraded Cassandra cluster. (Pull Request) |
FIXED |
No longer publish a spurious junit dependency in atlasdb-client compile. (Pull Request) |
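The v0.23.0 entry on retrying Cassandra operations against distinct hosts can be sketched as follows. This is a minimal illustration under assumptions: the class `DistinctHostRetry`, the string host names, and the boolean-returning operation are hypothetical and are not the Cassandra KVS client pool's real API. The point is only that each retry picks randomly among hosts that have not been tried yet, rather than picking independently each time.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical illustration of retrying an operation on distinct hosts. */
final class DistinctHostRetry {
    /**
     * Attempts the operation on up to maxAttempts hosts, never choosing the same host twice.
     * Returns the host on which the operation succeeded, or null if every attempt failed.
     */
    static String runOnDistinctHosts(
            List<String> hosts, int maxAttempts, Predicate<String> operation, Random random) {
        Set<String> tried = new HashSet<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<String> untried = new ArrayList<>(hosts);
            untried.removeAll(tried);
            if (untried.isEmpty()) {
                return null; // every host has already been tried
            }
            String host = untried.get(random.nextInt(untried.size()));
            tried.add(host);
            if (operation.test(host)) {
                return host;
            }
        }
        return null;
    }
}
```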
v0.22.0
28 Oct 2016
Type |
Change |
---|---|
IMPROVED |
The See Schema Mutation Lock (Cassandra only) for details on how the schema mutation lock works. (Pull Request) |
FIXED |
Fixed an issue where some locks were not being tracked for continuous refreshing due to one of the lock methods not being overridden by the |
IMPROVED |
Sweep no longer immediately falls back to a See sweep tuning documentation for more information on sweep tuning parameters. (Pull Request) |
v0.21.1
24 Oct 2016
Type |
Change |
---|---|
FIXED |
Fixed a regression with Cassandra KVS where you could no longer create a table if it had the same name as another table in a different namespace. To illustrate the issue, assume you have namespace Note that namespace is an application-level abstraction defined as part of an AtlasDB schema and is not the same as a Cassandra keyspace. (Pull Request) |
v0.21.0
21 Oct 2016
Type |
Change |
---|---|
NEW |
Sweep now supports batching on a per-cell level via the |
FIXED |
If |
v0.20.0
19 Oct 2016
Type |
Change |
---|---|
DEV BREAK |
Hotspotting warnings, previously logged at ERROR, will now throw See documentation on primitive value types and partitioners for information on how to address your schemas. (Pull Request) |
FIXED |
The AtlasDB Console included in the Dropwizard bundle can start up in an “online” mode, i.e. it can connect to a running cluster. See AtlasDB Console for information on how to use AtlasDB Console. (Pull Request) |
FIXED |
The |
NEW |
Oracle is supported via DBKVS if you have a runtime dependency on an Oracle driver that resolves the JsonType “jdbcHandler”. Due to an Oracle limitation, all table names in the schema must be less than 30 characters long. See Oracle KVS Configuration for details on how to configure your service to use Oracle. (Pull Request) |
FIXED |
The DBKVS config now enforces that the namespace must always be empty for |
FIXED |
We have changed the default |
FIXED |
The |
v0.19.0
11 Oct 2016
Type |
Change |
---|---|
DEV BREAK |
Removed KeyValueService |
FIXED |
In Cassandra KVS, we now no longer take out the schema mutation lock in calls to |
FIXED |
Added a wait period before declaring someone dead based on lack of heartbeat. This ensures that we handle delayed heartbeats in high-load situations (e.g. on CircleCI). (Pull Request) |
DEV BREAK |
Please reach out to us if you are adversely affected by these removals. (Pull Request 1 and Pull Request 2) |
CHANGED |
The SQL connection manager will no longer temporarily increase the pool size by eleven connections when the pool is exhausted. (Pull Request) |
v0.18.0
3 Oct 2016
Type |
Change |
---|---|
FIXED |
Fixed a bug introduced in 0.17.0, where products upgraded to 0.17.0 would see a “dead heartbeat” error on first start-up, requiring users to manually truncate the |
FIXED |
Dropping a table and then creating it again no longer adds an additional row to the |
IMPROVED |
Users of DBKVS can now set arbitrary connection parameters. This is useful if, for example, you wish to boost performance by adjusting the default batch size for fetching rows from the underlying database. See the documentation for how to set these parameters, and the JDBC docs for a full list. (Pull Request) |
v0.17.0
28 Sept 2016
Type |
Change |
---|---|
IMPROVED |
The schema mutation lock holder now writes a “heartbeat” to the database to indicate that it is still responsive. Other processes that are waiting for the schema mutation lock will now be able to see this heartbeat, infer that the lock holder is still working, and wait for longer. This should reduce the need to manually truncate the locks table. (Pull Request) |
NEW |
|
v0.16.0
26 Sept 2016
Type |
Change |
---|---|
DEV BREAK |
Removed |
IMPROVED |
|
FIXED |
Column paging Sweep (in beta) correctly handles cases where table names have both upper and lowercase characters and cases where sweep is run multiple times on the same table.
If you are using the regular implementation of Sweep (i.e. you do not specify |
v0.15.0
14 Sept 2016
Type |
Change |
---|---|
IMPROVED |
We have removed references to temp tables and no longer attempt to drop temp tables when aborting transactions. Temp tables are not currently being used by any KVSs, yet we were still calling |
DEV BREAK |
All TransactionManagers are now AutoCloseable and implement a close method that will free up the underlying resources. If your service implements a |
NEW |
AtlasDB Sweep now uses column paging via the By paging over historical versions of cells during sweeping, we can avoid out of memory exceptions in Cassandra when we have particularly large cells or many historical versions of cells. This feature is only implemented for Cassandra KVS and is disabled by default; please reach out to the AtlasDB dev team if you would like to enable it. (Pull Request) |
NEW |
Added a second implementation of Products or clients using wide rows should consider using |
NEW |
Added an offline CLI called This is useful on Cassandra KVS if an AtlasDB client goes down during a schema mutation and does not release the schema mutation lock, preventing other clients from continuing. Previously an error message would direct users to manually truncate this table with CQL, but now this error message references the CLI. (Pull Request) |
CHANGED |
Reverted our Dagger dependency from 2.4 to 2.0.2 and shadowed it so that it won’t conflict with internal products. (Pull Request) |
v0.14.0
8 Sept 2016
Type |
Change |
---|---|
USER BREAK |
To assist with backwards compatibility, we have introduced a helper method |
FIXED |
AtlasDB could start up with a leader configuration that is nonsensical, such as specifying both a |
FIXED |
Fixed and standardized serialization and deserialization of AtlasDBConfig. This prevented CLIs deployed via the Dropwizard bundle from loading configuration properly. (Pull Request) |
DEV BREAK |
Updated our Dagger dependency from 2.0.2 to 2.4, so that our generated code matches that of internal products. This also bumps our Guava dependency from 18.0 to 19.0 to accommodate a Dagger compile dependency. We plan on shading Dagger in the next release of AtlasDB, but products can force a Guava 18.0 runtime dependency to work around the issue in the meantime. (Pull Request) |
v0.13.0
30 Aug 2016
Type |
Change |
---|---|
DEV BREAK |
|
FIXED |
The method |
IMPROVED |
Improved logging for schema mutation lock timeouts and added logging for obtaining and releasing locks. Removed the advice to restart the client, as it will not help in this scenario. (Pull Request) |
FIXED |
Connections to Cassandra can be established over arbitrary ports. Previously AtlasDB clients would assume the default Cassandra port of 9160 despite what is specified in the Cassandra keyValueService configuration. (Pull Request) |
FIXED |
Fixed an issue when starting an AtlasDB client using the Cassandra KVS where we always grab the schema mutation lock, even if we are not making schema mutations. This reduces the likelihood of clients losing the schema mutation lock and having to manually truncate the _locks table. (Pull Request) |
IMPROVED |
Performance and reliability enhancements to the in-beta CQL KVS. (Pull Request) |
v0.12.0
22 Aug 2016
Type |
Change |
---|---|
USER BREAK |
AtlasDB will always try to register timestamp and lock endpoints for your application, whereas previously this only occurred if you specified a Leader Configuration. This ensures that CLIs will be able to run against your service even in the single-node case. For Dropwizard applications, this is only a breaking change if you try to initialize your KeyValueService after having initialized the Dropwizard application. Note: If you are initializing the KVS post-Dropwizard initialization, then your application will already fail when starting multiple AtlasDB clients. (Pull Request) |
NEW |
There is now a Dropwizard bundle which can be added to Dropwizard applications.
This will add startup commands to launch the AtlasDB console and CLIs such as |
FIXED |
DB passwords are no longer output as part of the connection configuration |
NEW |
All KVSs now come wrapped with ProfilingKeyValueService, which at the TRACE level provides timing information per KVS operation performed by AtlasDB. See Logging Configuration for more details. (Pull Request) |
v0.11.4
29 Jul 2016
Type |
Change |
---|---|
FIXED |
Correctly checks the Cassandra client version that determines if Cassandra supports Check And Set operations. This is a critical bug fix that ensures we actually use our implementation from #436, which prevents data loss due to the Cassandra concurrent table creation bug described in #431. (Pull Request) |
v0.11.2
29 Jul 2016
Type |
Change |
---|---|
USER BREAK |
Reverting behavior introduced in AtlasDB 0.11.0 so the This only affects end users who have deployed products with AtlasDB 0.11.0 or 0.11.1; users upgrading from earlier versions will not see changed behavior. See Communicating Over SSL for details on how to configure CassandraKVS with SSL. (Pull Request) |
v0.11.1
28 Jul 2016
Type |
Change |
---|---|
FIXED |
Removed a check enforcing a leader block config when one was not required. This check prevented AtlasDB 0.11.0 clients from starting if a leader configuration was not specified (i.e. single-node clusters). (Pull Request) |
IMPROVED |
Updated schema table generation to optimize reads with no ColumnSelection specified against tables with fixed columns. To benefit from this improvement you will need to re-generate your schemas. (Pull Request) |
v0.11.0
27 Jul 2016
Type |
Change |
---|---|
IMPROVED |
Clarified the logging when multiple timestamp servers are running to state that CLIs could be causing the issue. (Pull Request) |
CHANGED |
Updated the Cassandra client from 2.2.1 to 2.2.7 and the Cassandra Docker testing version from 2.2.6 to 2.2.7. (Pull Request) |
FIXED |
The leader config now contains a new If left blank, For full details on configuring the leader block, see Cassandra configuration. (Pull Request) |
FIXED |
A utility method was removed in the previous release, breaking an internal product that relied on it. This method has now been added back. (Pull Request) |
FIXED |
Removed unnecessary error message for missing _timestamp metadata table. _timestamp is a hidden table, and it is expected that _timestamp metadata should not be retrievable from public API. (Pull Request) |
IMPROVED |
Trace logging is more informative and will log all failed calls. To enable trace logging, see Enabling Cassandra Tracing. (Pull Request) |
NEW |
The Cassandra KVS now supports specifying SSL options via the new |
v0.10.0
13 Jul 2016
Type |
Change |
---|---|
CHANGED |
Updated HikariCP dependency from 2.4.3 to 2.4.7 to comply with updates in internal products. Details of the HikariCP changes can be found here. (Pull Request) |
NEW |
AtlasDB currently allows you to create dynamic columns (wide rows), but you can only retrieve entire rows or specific columns.
Typically with dynamic columns, you do not know all the columns you have in advance, and this feature allows you to page through dynamic columns per row, reducing pressure on the underlying KVS.
Products or clients (such as AtlasDB Sweep) making use of wide rows should consider using Note: This is considered a beta feature and is not yet being used by AtlasDB Sweep. |
FIXED |
We properly check that cells are not set to empty (zero-byte) or null. (Pull Request) |
IMPROVED |
Cassandra client connection pooling will now evict idle connections over a longer period of time and has improved logic for deciding whether or not a node should be blacklisted. This should result in less connection churn and therefore lower latency. (Pull Request) |
v0.9.0
11 Jul 2016
Type |
Change |
---|---|
DEV BREAK |
Inserting an empty (size = 0) value into a AtlasDB cannot currently distinguish between empty and deleted cells. In previous versions of AtlasDB, inserting
an empty value into a Transaction.put(table, ImmutableMap.of(myCell, new byte[0]))
Transaction.get(table, ImmutableSet.of(myCell)).get(myCell)
the second line will return To minimize confusion, we explicitly disallow inserting an empty value into a cell by throwing an
In particular, this change will break calls to If any code deletes cells by calling Note: Existing cells with empty values will be interpreted as deleted cells, and will not lead to Exceptions when read. (Pull Request) |
IMPROVED |
The warning emitted when an attempted leadership election fails is now more descriptive. (Pull Request) |
FIXED |
Code generation for the If you are using Indices we recommend you upgrade as a precaution and ensure you are not relying on logic related to the |
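The v0.9.0 DEV BREAK above disallows writing an empty (zero-byte) value, because an empty cell cannot be told apart from a deleted one. A minimal sketch of such a check follows. The class `EmptyValueCheck`, the generic cell key, and the choice of `IllegalArgumentException` are assumptions made for illustration; the exception type AtlasDB actually throws is not stated in the text above.

```java
import java.util.Map;

/** Hypothetical pre-write validation; not AtlasDB's real implementation. */
final class EmptyValueCheck {
    /**
     * Throws if any value is zero-length. Null values are also rejected here
     * as a defensive assumption. The exception type is an assumption too.
     */
    static <K> void rejectEmptyValues(Map<K, byte[]> values) {
        for (Map.Entry<K, byte[]> entry : values.entrySet()) {
            byte[] value = entry.getValue();
            if (value == null || value.length == 0) {
                throw new IllegalArgumentException("empty value for cell " + entry.getKey());
            }
        }
    }
}
```

Running a check like this before the write fails fast at the call site, instead of letting a later read silently treat the cell as deleted.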
v0.8.0
5 Jul 2016
Type |
Change |
---|---|
FIXED |
Some logging was missing important information due to use of the wrong substitution placeholder. This version should be taken in preference to 0.7.0 to ensure logging is correct. (Pull Request) |
v0.7.0
4 Jul 2016
Type |
Change |
---|---|
NEW |
AtlasDB can now be backed by Postgres via DB KVS. This is a very early release for this feature, so please contact us if you plan on using it. Please see the documentation for more details. |
FIXED |
The In Memory Key Value Service now makes defensive copies of any data stored or retrieved. This may lead to a slight performance degradation for users of In Memory Key Value Service. In Memory Key Value Service is recommended for testing environments only, and production instances should use DB KVS or Cassandra KVS for data that needs to be persisted. (Pull Request) |
FIXED |
AtlasDB will no longer log incorrect errors stating “Couldn’t grab new token ranges for token aware cassandra mapping” when running against a single node and single token Cassandra cluster. (Pull Request) |
IMPROVED |
Read heavy workflows with Cassandra KVS will now use substantially less heap. In worst-case testing this change resulted in a 10-100x reduction in client side heap size. However, this is very dependent on the particular scenario AtlasDB is being used in and most consumers should not expect a difference of this size. (Pull Request) |
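The defensive copying mentioned in the v0.7.0 In Memory Key Value Service entry above can be sketched as follows. The class `CopyingInMemoryStore` and its string keys are hypothetical simplifications, not the real KVS interface. The idea is that byte arrays are cloned on the way in and on the way out, so callers cannot mutate stored data through a shared reference.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store that never shares byte arrays with callers. */
final class CopyingInMemoryStore {
    private final Map<String, byte[]> data = new ConcurrentHashMap<>();

    void put(String key, byte[] value) {
        data.put(key, value.clone()); // copy on the way in
    }

    byte[] get(String key) {
        byte[] stored = data.get(key);
        return stored == null ? null : stored.clone(); // copy on the way out
    }
}
```

The extra clone on each call is the source of the slight performance cost noted in that entry.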
v0.6.0
26 May 2016
Type |
Change |
---|---|
FIXED |
A potential race condition could cause timestamp allocation to never complete on a particular node (#462). |
FIXED |
An innocuous error was logged once for each TransactionManager about not being able to allocate enough timestamps. The error has been downgraded to INFO and made less scary. |
FIXED |
Serializable Transactions that read a column selection could consistently report conflicts when there were none. |
FIXED |
An excessively long Cassandra related logline was sometimes printed (#501). |
v0.5.0
16 May 2016
Type |
Change |
---|---|
CHANGED |
Only bumping double minor version in artifacts for long-term stability fixes. |
v0.4.1
17 May 2016
Type |
Change |
---|---|
FIXED |
Prevent _metadata tables from triggering the Cassandra 2.x schema mutation bug #431 (#444 not yet fixed). |
FIXED |
Required projects are now Java 6 compliant. |