Description
A timeuuid is a version 1 UUID: it contains a timestamp and some random bits that together form a unique identifier.
The comparison operator for these values should order them primarily by this timestamp.
timeuuid values are currently represented using uuid.UUID, but the comparison operators of uuid.UUID don't order timeuuid values the way Cassandra/Scylla does.
For example, given the values 00000257-0efc-11ee-9547-00006490e9a6 and fed35080-0efb-11ee-a1ca-00006490e9a4, Cassandra considers fed35080-0efb-11ee-a1ca-00006490e9a4 the smaller one:
CREATE KEYSPACE ks WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1};
CREATE TABLE ks.tab (p int, c timeuuid, PRIMARY KEY (p, c));
INSERT INTO ks.tab (p, c) VALUES (0, 00000257-0efc-11ee-9547-00006490e9a6);
INSERT INTO ks.tab (p, c) VALUES (0, fed35080-0efb-11ee-a1ca-00006490e9a4);
SELECT p, c FROM ks.tab;
p | c
---+--------------------------------------
0 | fed35080-0efb-11ee-a1ca-00006490e9a4
0 | 00000257-0efc-11ee-9547-00006490e9a6
(2 rows)
Because it has a lower timestamp value:
timeuuid_col | system.totimestamp(timeuuid_col)
--------------------------------------+----------------------------------
fed35080-0efb-11ee-a1ca-00006490e9a4 | 2023-06-19 23:49:56.990000+0000
00000257-0efc-11ee-9547-00006490e9a6 | 2023-06-19 23:49:58.961000+0000
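But uuid.UUID comparison says the opposite, because it compares the bytes in lexicographical order, and the leading bytes of a version 1 UUID hold the least significant bits of the timestamp: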
>>> import uuid
>>> a = uuid.UUID('fed35080-0efb-11ee-a1ca-00006490e9a4')
>>> b = uuid.UUID('00000257-0efc-11ee-9547-00006490e9a6')
>>> a < b
False
>>> b < a
True
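To show where the two orderings diverge, here is a minimal sketch that compares the same two values by their embedded timestamp instead of by raw bytes. It relies only on the standard library's uuid.UUID.time attribute (the 60-bit version 1 timestamp, counted in 100 ns intervals since 1582-10-15); the helper name to_datetime and the constant GREGORIAN_OFFSET are my own, chosen for illustration.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
GREGORIAN_OFFSET = 0x01B21DD213814000


def to_datetime(u: uuid.UUID) -> datetime:
    """Convert the v1 timestamp to a UTC datetime (hypothetical helper)."""
    micros = (u.time - GREGORIAN_OFFSET) // 10
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)


a = uuid.UUID('fed35080-0efb-11ee-a1ca-00006490e9a4')
b = uuid.UUID('00000257-0efc-11ee-9547-00006490e9a6')

# Ordering by timestamp agrees with the database: a is older than b.
print(a.time < b.time)
# Ordering by uuid.UUID's own operators disagrees.
print(a < b)
print(to_datetime(a), to_datetime(b))
```

Comparing on .time reproduces the order returned by the SELECT above, while the default operators do not.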
It would be useful to have a class that represents timeuuid values and has the same semantics as the values in Cassandra.
Here's an example implementation that I wrote, based on the Cassandra implementation:
https://gist.github.com/cvybhu/ed5b64d8b62eff51dc46258157a92e41
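For readers who don't want to open the link, a rough sketch of what such a class could look like follows. It is not the code from the gist and not part of the driver: the class name TimeUUID is hypothetical, and the tiebreaker for equal timestamps (comparing the remaining eight bytes as unsigned) is an assumption that may not match what Cassandra/Scylla does exactly.

```python
import functools
import uuid


@functools.total_ordering
class TimeUUID:
    """Hypothetical wrapper that orders version 1 UUIDs by timestamp first."""

    def __init__(self, value):
        self.uuid = value if isinstance(value, uuid.UUID) else uuid.UUID(str(value))
        if self.uuid.version != 1:
            raise ValueError("a timeuuid must be a version 1 UUID")

    def _key(self):
        # Timestamp first. The tiebreaker on the remaining bytes (clock
        # sequence and node) is an assumption, not the server's exact rule.
        return (self.uuid.time, self.uuid.bytes[8:])

    def __eq__(self, other):
        if not isinstance(other, TimeUUID):
            return NotImplemented
        return self.uuid == other.uuid

    def __lt__(self, other):
        if not isinstance(other, TimeUUID):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash(self.uuid)

    def __repr__(self):
        return f"TimeUUID('{self.uuid}')"


older = TimeUUID('fed35080-0efb-11ee-a1ca-00006490e9a4')
newer = TimeUUID('00000257-0efc-11ee-9547-00006490e9a6')
print(sorted([newer, older]))
```

With this ordering, sorting the two example values puts fed35080-0efb-11ee-a1ca-00006490e9a4 first, matching the clustering order shown in the SELECT output.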
Ideally, the Python driver would return values of this type when a row is selected, but that would be a breaking change.