Add UUIDv8, change UUIDv7 to millisecond precision (#63)
* Add UUIDv8, change UUIDv7 to millisecond precision
oittaa authored Oct 25, 2022
1 parent bfc16ef commit f78ee83
Showing 5 changed files with 191 additions and 32 deletions.
57 changes: 50 additions & 7 deletions README.md
@@ -3,10 +3,12 @@

# uuid-php

A small PHP class for generating [RFC 4122][RFC 4122] version 3, 4, and 5 universally unique identifiers (UUID). Additionally supports [draft][draft 04] versions 6 and 7.
A small PHP class for generating [RFC 4122][RFC 4122] version 3, 4, and 5 universally unique identifiers (UUID). Additionally supports [draft][draft] versions 6, 7, and 8.

If all you want is a unique ID, you should call `uuid4()`.

> Implementations SHOULD utilize UUID version 7 over UUID version 1 and 6 if possible.

## Minimal UUID v4 implementation

Credits go to [this answer][stackoverflow uuid4] on Stack Overflow for this minimal RFC 4122 compliant solution.
@@ -25,7 +27,7 @@ echo uuid4();

## Installation

If you need comparison tools or sortable identifiers like in versions 6 and 7, you might find this small and fast package useful. It doesn't require any other dependencies.
If you need comparison tools or sortable identifiers like in versions 6, 7, and 8, you might find this small and fast package useful. It doesn't require any other dependencies.

```bash
composer require oittaa/uuid
@@ -64,6 +66,12 @@ echo $uuid7_first . "\n"; // e.g. 017f22e2-79b0-7cc3-98c4-dc0c0c07398f
$uuid7_second = UUID::uuid7();
var_dump($uuid7_first < $uuid7_second); // bool(true)

// Generate a version 8 (lexicographically sortable) UUID
$uuid8_first = UUID::uuid8();
echo $uuid8_first . "\n"; // e.g. 017f22e2-79b0-8cc3-98c4-dc0c0c07398f
$uuid8_second = UUID::uuid8();
var_dump($uuid8_first < $uuid8_second); // bool(true)

// Test if a given string is a valid UUID
$isvalid = UUID::isValid('11a38b9a-b3da-360f-9353-a5a725514269');
var_dump($isvalid); // bool(true)
@@ -111,19 +119,54 @@ $cmp3 = UUID::cmp(
);
var_dump($cmp3 === 0); // bool(true)

// Extract Unix time from versions 6 and 7 as a string.
// Extract Unix time from versions 6, 7, and 8 as a string.
$uuid6_time = UUID::getTime('1ec9414c-232a-6b00-b3c8-9e6bdeced846');
var_dump($uuid6_time); // string(18) "1645557742.0000000"
$uuid7_time = UUID::getTime('017f22e2-79b0-7cc3-98c4-dc0c0c07398f');
var_dump($uuid7_time); // string(18) "1645557742.0007977"
var_dump($uuid7_time); // string(18) "1645557742.0000000"
$uuid8_time = UUID::getTime('017f22e2-79b0-8cc3-98c4-dc0c0c07398f');
var_dump($uuid8_time); // string(18) "1645557742.0007977"

// Extract the UUID version.
$uuid_version = UUID::getVersion('2140a926-4a47-465c-b622-4571ad9bb378');
var_dump($uuid_version); // int(4)
```

## UUIDv6 Field and Bit Layout

```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_high |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_mid | time_low_and_version |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|clk_seq_hi_res | clk_seq_low | node (0-1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node (2-5) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

## UUIDv7 Field and Bit Layout

```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unix_ts_ms |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unix_ts_ms | ver | rand_a |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| rand_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| rand_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
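
The layout above can be sketched in a few lines of PHP. This is an illustrative sketch only, not the package's implementation: `sketchUuid7()` is a hypothetical helper name, it assumes a 64-bit PHP build, and unlike `UUID::uuid7()` it makes no attempt to keep timestamps monotonic.

```php
<?php

/**
 * Hypothetical helper (not part of oittaa/uuid): build a UUIDv7-shaped
 * string from a Unix timestamp in milliseconds, following the layout above.
 */
function sketchUuid7(int $unixTsMs): string
{
    // unix_ts_ms: 48 bits, written as 12 hex digits.
    $hex = substr(str_pad(dechex($unixTsMs), 12, '0', STR_PAD_LEFT), -12);
    // 80 random bits; 6 of them are overwritten by ver and var below.
    $hex .= bin2hex(random_bytes(10));
    // ver: the 4 bits after the timestamp (hex digit 12) become 0111.
    $hex[12] = '7';
    // var: the top 2 bits of hex digit 16 become 10.
    $hex[16] = dechex((hexdec($hex[16]) & 0x3) | 0x8);
    // Group the 32 hex digits as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split($hex, 4));
}

echo sketchUuid7(1645557742000), "\n"; // starts with 017f22e2-79b0-7
```

The timestamp `1645557742000` is `0x017f22e279b0`, which is why the output shares its first 12 hex digits with the `uuid7()` example earlier in this README.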

## UUIDv8 Field and Bit Layout

```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -139,14 +182,14 @@ var_dump($uuid_version); // int(4)
```

- `unix_ts_ms`: 48-bit big-endian unsigned Unix epoch timestamp with millisecond precision
- `ver`: The 4 bit UUIDv7 version (0111)
- `ver`: The 4 bit UUIDv8 version (1000)
- `subsec`: 12 bits allocated to sub-second precision values
- `var`: 2 bit UUID variant (10)
- `sub`: 2 bits allocated to sub-second precision values
- `rand`: The remaining 60 bits are filled with pseudo-random data

14 bits dedicated to sub-second precision provide 100 nanosecond resolution. The `unix_ts` and `subsec` fields guarantee the order of UUIDs generated within the same timestamp by monotonically incrementing the timer.
The 14 bits dedicated to sub-second precision (12 in `subsec` plus 2 in `sub`) provide 100-nanosecond resolution. The `unix_ts_ms` and `subsec` fields guarantee the order of UUIDs generated within the same timestamp by monotonically incrementing the timer.
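
As a worked example of the 14-bit sub-second encoding, the sketch below maps a count of 100 ns ticks within one millisecond (0 to 9999) onto 14 bits and back. The function and constant names are hypothetical; the arithmetic mirrors the `encodeSubsec()` and `decodeSubsec()` helpers in this commit, which round down when encoding and round up when decoding.

```php
<?php

const SKETCH_SUBSEC_BITS = 14;      // 12-bit `subsec` + 2-bit `sub`
const SKETCH_SUBSEC_RANGE = 10_000; // 100 ns ticks per millisecond

// Scale ticks (0..9999) up to the 14-bit range, rounding down.
function encodeSubsecSketch(int $ticks): int
{
    return intdiv($ticks << SKETCH_SUBSEC_BITS, SKETCH_SUBSEC_RANGE);
}

// Scale back down, rounding up: shifting a negative number right
// floors it, so negating twice yields a ceiling division.
function decodeSubsecSketch(int $encoded): int
{
    return -(-$encoded * SKETCH_SUBSEC_RANGE >> SKETCH_SUBSEC_BITS);
}

$encoded = encodeSubsecSketch(7977);
$subsec = $encoded >> 2;   // upper 12 bits, stored after `ver`
$sub = $encoded & 0x03;    // lower 2 bits, stored after `var`
printf("%03x %d\n", $subsec, $sub);                   // cc3 1
echo decodeSubsecSketch(($subsec << 2) | $sub), "\n"; // 7977
```

This matches the README example `017f22e2-79b0-8cc3-98c4-dc0c0c07398f`: the `cc3` after the version digit and the low two bits of the `9` that opens the next group decode to 7977 ticks, which is the `.0007977` fraction returned by `getTime()`.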

[RFC 4122]: http://tools.ietf.org/html/rfc4122
[draft 04]: https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04
[draft]: https://github.com/ietf-wg-uuidrev/rfc4122bis
[stackoverflow uuid4]: https://stackoverflow.com/a/15875555
95 changes: 76 additions & 19 deletions src/UUID.php
@@ -8,13 +8,13 @@
* Represents a universally unique identifier (UUID), according to RFC 4122.
*
* This class provides the static methods `uuid3()`, `uuid4()`, `uuid5()`,
* `uuid6()`, and `uuid7()` for generating version 3, 4, 5, 6 (draft), and
* 7 (draft) UUIDs.
* `uuid6()`, `uuid7()`, and `uuid8()` for generating version 3, 4, 5,
* 6 (draft), 7 (draft), and 8 (draft) UUIDs.
*
* If all you want is a unique ID, you should call `uuid4()`.
*
* @link http://tools.ietf.org/html/rfc4122
* @link https://github.com/uuid6/uuid6-ietf-draft
* @link https://github.com/ietf-wg-uuidrev/rfc4122bis
* @link http://en.wikipedia.org/wiki/Universally_unique_identifier
*/
class UUID
@@ -49,6 +49,12 @@ class UUID
* @link http://tools.ietf.org/html/rfc4122#section-4.1.7
*/
public const NIL = '00000000-0000-0000-0000-000000000000';
/**
* The Max UUID is a special form of UUID that is specified to have all 128 bits set to one.
* @var string
* @link https://www.ietf.org/archive/id/draft-ietf-uuidrev-rfc4122bis-00.html#name-max-uuid
*/
public const MAX = 'FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF';

/**
* 0x01b21dd213814000 is the number of 100-ns intervals between the
@@ -65,7 +71,10 @@ class UUID
private const V7_SUBSEC_RANGE = 10_000;

/** @internal */
private const V7_SUBSEC_BITS = 14;
private const V8_SUBSEC_RANGE = 10_000;

/** @internal */
private const V8_SUBSEC_BITS = 14;

/** @internal */
private const UUID_REGEX = '/^(?:urn:)?(?:uuid:)?(\{)?([0-9a-f]{8})\-?([0-9a-f]{4})'
@@ -78,7 +87,10 @@ class UUID
private static $subsec = 0;

/** @internal */
private static function getUnixTime(): array
private static $unixts_ms = 0;

/** @internal */
private static function getUnixTimeSubsec(): array
{
$timestamp = microtime(false);
$unixts = intval(substr($timestamp, 11), 10);
@@ -98,6 +110,19 @@ private static function getUnixTime(): array
return [$unixts, $subsec];
}

/** @internal */
private static function getUnixTimeMs(): int
{
$timestamp = microtime(false);
$unixts = intval(substr($timestamp, 11), 10);
$unixts_ms = $unixts * 1000 + intval(substr($timestamp, 2, 3), 10);
if (self::$unixts_ms >= $unixts_ms) {
$unixts_ms = self::$unixts_ms + 1;
}
self::$unixts_ms = $unixts_ms;
return $unixts_ms;
}

/** @internal */
private static function stripExtras(string $uuid): string
{
@@ -138,13 +163,13 @@ private static function uuidFromHex(string $uhex, int $version): string
/** @internal */
private static function encodeSubsec(int $value): int
{
return intdiv($value << self::V7_SUBSEC_BITS, self::V7_SUBSEC_RANGE);
return intdiv($value << self::V8_SUBSEC_BITS, self::V8_SUBSEC_RANGE);
}

/** @internal */
private static function decodeSubsec(int $value): int
{
return -(-$value * self::V7_SUBSEC_RANGE >> self::V7_SUBSEC_BITS);
return -(-$value * self::V8_SUBSEC_RANGE >> self::V8_SUBSEC_BITS);
}

/**
@@ -189,15 +214,16 @@ public static function uuid5(string $namespace, string $name): string
}

/**
* Generate a version 6 UUID. A v6 UUID is lexicographically sortable and contains
* a 60-bit timestamp and 62 extra unique bits. Unlike version 1 UUID, this
* implementation of version 6 UUID doesn't leak the MAC address of the host.
* UUID version 6 is a field-compatible version of UUIDv1, reordered for improved
* DB locality. It is expected that UUIDv6 will primarily be used in contexts
* where there are existing v1 UUIDs. Systems that do not involve legacy UUIDv1
* SHOULD consider using UUIDv7 instead.
*
* @return string The string standard representation of the UUID
*/
public static function uuid6(): string
{
[$unixts, $subsec] = self::getUnixTime();
[$unixts, $subsec] = self::getUnixTimeSubsec();
$timestamp = $unixts * self::SUBSEC_RANGE + $subsec;
$timehex = str_pad(dechex($timestamp + self::TIME_OFFSET_INT), 15, '0', \STR_PAD_LEFT);
$uhex = substr_replace(substr($timehex, -15), '6', -3, 0);
@@ -206,24 +232,43 @@ public static function uuid6(): string
}

/**
* Generate a version 7 UUID. A v7 UUID is lexicographically sortable and is
* designed to encode a Unix timestamp with arbitrary sub-second precision.
* UUID version 7 features a time-ordered value field derived from the widely
* implemented and well known Unix Epoch timestamp source, the number of
* milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded. It also
* has improved entropy characteristics over versions 1 or 6.
*
* Implementations SHOULD utilize UUID version 7 over UUID version 1 and 6 if
* possible.
*
* @return string The string standard representation of the UUID
*/
public static function uuid7(): string
{
[$unixts, $subsec] = self::getUnixTime();
$unixtsms = $unixts * 1000 + intdiv($subsec, self::V7_SUBSEC_RANGE);
$subsec = self::encodeSubsec($subsec % self::V7_SUBSEC_RANGE);
$unixtsms = self::getUnixTimeMs();
$uhex = substr(str_pad(dechex($unixtsms), 12, '0', \STR_PAD_LEFT), -12);
$uhex .= bin2hex(random_bytes(10));
return self::uuidFromHex($uhex, 7);
}

/**
* Generate a version 8 UUID. A v8 UUID is lexicographically sortable and is
* designed to encode a Unix timestamp with arbitrary sub-second precision.
*
* @return string The string standard representation of the UUID
*/
public static function uuid8(): string
{
[$unixts, $subsec] = self::getUnixTimeSubsec();
$unixtsms = $unixts * 1000 + intdiv($subsec, self::V8_SUBSEC_RANGE);
$subsec = self::encodeSubsec($subsec % self::V8_SUBSEC_RANGE);
$subsecA = $subsec >> 2;
$subsecB = $subsec & 0x03;
$randB = random_bytes(8);
$randB[0] = chr(ord($randB[0]) & 0x0f | $subsecB << 4);
$uhex = substr(str_pad(dechex($unixtsms), 12, '0', \STR_PAD_LEFT), -12);
$uhex .= '7' . str_pad(dechex($subsecA), 3, '0', \STR_PAD_LEFT);
$uhex .= '8' . str_pad(dechex($subsecA), 3, '0', \STR_PAD_LEFT);
$uhex .= bin2hex($randB);
return self::uuidFromHex($uhex, 7);
return self::uuidFromHex($uhex, 8);
}

/**
@@ -270,9 +315,13 @@ public static function getTime(string $uuid): ?string
}
$retval .= substr_replace(str_pad(strval($ts), 8, '0', \STR_PAD_LEFT), '.', -7, 0);
} elseif ($version === 7) {
$unixts = hexdec(substr($timehex, 0, 13));
$retval = strval($unixts * self::V7_SUBSEC_RANGE);
$retval = substr_replace(str_pad($retval, 8, '0', \STR_PAD_LEFT), '.', -7, 0);
} elseif ($version === 8) {
$unixts = hexdec(substr($timehex, 0, 13));
$subsec = self::decodeSubsec((hexdec(substr($timehex, 13)) << 2) + (hexdec(substr($uuid, 16, 1)) & 0x03));
$retval = strval($unixts * self::V7_SUBSEC_RANGE + $subsec);
$retval = strval($unixts * self::V8_SUBSEC_RANGE + $subsec);
$retval = substr_replace(str_pad($retval, 8, '0', \STR_PAD_LEFT), '.', -7, 0);
}
return $retval;
@@ -354,4 +403,12 @@ public static function v7(): string
{
return self::uuid7();
}
/**
* @see UUID::uuid8() Alias
* @return string
*/
public static function v8(): string
{
return self::uuid8();
}
}
4 changes: 4 additions & 0 deletions tests/Benchmark/UUIDGenerationBench.php
@@ -31,6 +31,10 @@ public function benchUUID7Generation(): void
{
UUID::uuid7();
}
public function benchUUID8Generation(): void
{
UUID::uuid8();
}
public function benchUUIDToString(): void
{
UUID::toString('{C4A760A8-DBCF-5254-A0D9-6A4474BD1B62}');
23 changes: 23 additions & 0 deletions tests/FutureTimeTest.php
@@ -22,6 +22,9 @@ protected function setUp(): void
$property = $reflection->getProperty('subsec');
$property->setAccessible(true);
$property->setValue($a, 9999990);
$property = $reflection->getProperty('unixts_ms');
$property->setAccessible(true);
$property->setValue($a, 9000000000090);
}

protected function tearDown(): void
@@ -34,6 +37,9 @@ protected function tearDown(): void
$property = $reflection->getProperty('subsec');
$property->setAccessible(true);
$property->setValue($a, 0);
$property = $reflection->getProperty('unixts_ms');
$property->setAccessible(true);
$property->setValue($a, 0);
}

public function testFutureTimeVersion6()
@@ -69,4 +75,21 @@ public function testFutureTimeVersion7()
$uuid1 = $uuid2;
}
}

public function testFutureTimeVersion8()
{
$uuid1 = UUID::uuid8();
for ($x = 0; $x < 1000; $x++) {
$uuid2 = UUID::uuid8();
$this->assertGreaterThan(
$uuid1,
$uuid2
);
$this->assertLessThan(
0,
strcmp(UUID::getTime($uuid1), UUID::getTime($uuid2))
);
$uuid1 = $uuid2;
}
}
}