Added Unicode support for custom dictionary. (#2)
* Add support for Unicode and increase computational performance.
mprimeaux authored Oct 26, 2024
1 parent f77b620 commit 3428dd3
Showing 5 changed files with 353 additions and 206 deletions.
19 changes: 16 additions & 3 deletions CHANGELOG/CHANGELOG-1.x.md
@@ -15,7 +15,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Security

---
## [1.2.0] - 2024-10-25
## [1.3.0] - 2024-OCT-26

### Added
- **FEATURE:** Added Unicode support for custom dictionaries.
### Changed
- **DEBT:** Modified implementation to be approximately 30% more efficient in terms of CPU complexity. See the `bench` make target.
### Deprecated
### Removed
### Fixed
### Security

---
## [1.2.0] - 2024-OCT-25

### Added
### Changed
@@ -27,7 +39,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Security

---
## [1.0.0] - 2024-10-24
## [1.0.0] - 2024-OCT-24

### Added
- **FEATURE:** Initial commit.
@@ -37,7 +49,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Fixed
### Security

[Unreleased]: https://github.com/scriptures-social/platform/compare/v0.2.0...HEAD
[Unreleased]: https://github.com/scriptures-social/platform/compare/v1.3.0...HEAD
[1.3.0]: https://github.com/sixafter/nanoid/compare/v1.2.0...v1.3.0
[1.2.0]: https://github.com/sixafter/nanoid/compare/v1.0.0...v1.2.0
[1.0.0]: https://github.com/sixafter/nanoid/compare/a6a1eb74b61e518fd0216a17dfe5c9b4c432e6e8...v1.0.0

171 changes: 100 additions & 71 deletions README.md
@@ -32,115 +32,144 @@ import "github.com/sixafter/nanoid"

## Usage

### Generating a Default NanoID
### Generate a Nano ID with Default Settings

Generate a NanoID using the default size (21 characters) and the default alphabet (numbers and uppercase/lowercase letters):
Generate a Nano ID using the default size (21 characters) and default alphabet:

```go
package main

import (
"fmt"
"log"

"github.com/sixafter/nanoid"
)

func main() {
id, err := nanoid.Generate()
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated NanoID:", id)
id, err := nanoid.New()
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated Nano ID:", id)
```

### Generating a NanoID with Custom Size

Generate a NanoID with a custom length:

```go
package main
id, err := nanoid.NewSize(32)
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated Nano ID of size 32:", id)
```

import (
"fmt"
"log"
### Generate a Nano ID with Custom Alphabet

"github.com/sixafter/nanoid"
)
Generate a Nano ID using a custom alphabet:

func main() {
id, err := nanoid.GenerateSize(10) // Generate a 10-character NanoID
if err != nil {
log.Fatal(err)
}
fmt.Println("NanoID with custom size:", id)
```go
customAlphabet := "abcdef123456"
id, err := nanoid.NewCustom(16, customAlphabet)
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated Nano ID with custom alphabet:", id)
```

### Generating a NanoID with Custom Alphabet
### Generate a Nano ID with Unicode Alphabet

Generate a NanoID with a custom length and a custom set of characters:
Generate a Nano ID using a Unicode alphabet:

```go
package main
unicodeAlphabet := "あいうえお漢字🙂🚀"
id, err := nanoid.NewCustom(10, unicodeAlphabet)
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated Nano ID with Unicode alphabet:", id)
```

import (
"fmt"
"log"
### Generate a Nano ID with Custom Random Source

"github.com/sixafter/nanoid"
)
Generate a Nano ID using a custom random source that implements `io.Reader`:

func main() {
alphabet := "0123456789abcdef" // Hexadecimal characters
id, err := nanoid.GenerateCustom(16, alphabet) // Generate a 16-character NanoID
if err != nil {
log.Fatal(err)
}
fmt.Println("NanoID with custom alphabet:", id)
```go
// Example custom random source (for demonstration purposes)
var myRandomSource io.Reader = myCustomRandomReader{}

id, err := nanoid.NewCustomReader(21, nanoid.DefaultAlphabet, myRandomSource)
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated Nano ID with custom random source:", id)
```

### Concurrency and Thread Safety
**Note:** Replace `myCustomRandomReader{}` with your actual implementation of `io.Reader`.
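
One possible shape for `myCustomRandomReader` is sketched below, assuming a deterministic source is wanted for reproducible tests. The type name `cycleReader` and its fields are hypothetical and not part of the package. A repeating byte pattern is predictable, so a reader like this should never back real identifiers.

```go
package main

import (
	"fmt"
	"io"
)

// cycleReader is a hypothetical io.Reader that repeats a fixed byte
// pattern forever. It is deterministic and NOT cryptographically secure.
type cycleReader struct {
	pattern []byte
	pos     int
}

// Read fills p with bytes from the pattern, wrapping around at the end.
func (r *cycleReader) Read(p []byte) (int, error) {
	if len(r.pattern) == 0 {
		return 0, io.ErrUnexpectedEOF
	}
	for i := range p {
		p[i] = r.pattern[r.pos]
		r.pos = (r.pos + 1) % len(r.pattern)
	}
	return len(p), nil
}

func main() {
	var src io.Reader = &cycleReader{pattern: []byte{0x01, 0x02, 0x03}}
	buf := make([]byte, 5)
	n, err := src.Read(buf)
	fmt.Println(n, err, buf)
}
```

Passing a value such as `&cycleReader{pattern: ...}` as the third argument to `nanoid.NewCustomReader` should then produce the same ID on every run, which can be convenient in unit tests.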

## Thread Safety

The NanoID functions are designed to be thread-safe. You can safely generate IDs from multiple goroutines concurrently without additional synchronization.
All functions provided by this package are safe for concurrent use by multiple goroutines. Here's an example of generating Nano IDs concurrently:

```go
package main

import (
"fmt"
"sync"
"fmt"
"log"
"sync"

"github.com/sixafter/nanoid"
"github.com/sixafter/nanoid"
)

func main() {
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
wg.Add(1)
go func() {
defer wg.Done()
id, err := nanoid.Generate()
if err != nil {
fmt.Println("Error generating NanoID:", err)
return
}
fmt.Println("Generated NanoID:", id)
}()
}
wg.Wait()
const numGoroutines = 10
const idSize = 21

var wg sync.WaitGroup
wg.Add(numGoroutines)

for i := 0; i < numGoroutines; i++ {
go func() {
defer wg.Done()
id, err := nanoid.New()
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated Nano ID:", id)
}()
}

wg.Wait()
}
```

### Error Handling
## Functions

All functions return an error as the second return value. Ensure you handle any potential errors:
* `func New() (string, error)`: Generates a Nano ID with the default size (21 characters) and default alphabet.
* `func NewSize(size int) (string, error)`: Generates a Nano ID with a specified size using the default alphabet.
* `func NewCustom(size int, alphabet string) (string, error)`: Generates a Nano ID with a specified size and custom alphabet.
* `func NewCustomReader(size int, alphabet string, rnd io.Reader) (string, error)`: Generates a Nano ID with a specified size, custom alphabet, and custom random source.

## Constants

* `DefaultAlphabet`: The default alphabet used for ID generation: `-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz`
* `DefaultSize`: The default size of the generated ID: `21`
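
The default alphabet contains 64 symbols, an exact power of two, so every 6-bit index maps to a symbol and no random bits are discarded. The sketch below estimates how often an index is accepted for other alphabet sizes, assuming each index is drawn from `bits.Len(n-1)` bits and out-of-range values are rejected, as the source in this commit does. The helper name `acceptanceRate` is hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// acceptanceRate returns the fraction of random indices that land inside
// an alphabet of n symbols when each index is drawn from bits.Len(n-1) bits.
func acceptanceRate(n int) float64 {
	if n < 2 {
		return 1 // a single-symbol alphabet needs no random bits
	}
	width := bits.Len(uint(n - 1))
	return float64(n) / float64(uint64(1)<<width)
}

func main() {
	// 64 is the default alphabet; 12 and 9 match the README's custom examples.
	for _, n := range []int{64, 12, 9} {
		fmt.Printf("alphabet size %d: %.4f of draws accepted\n", n, acceptanceRate(n))
	}
}
```

Alphabets whose size sits just above a power of two reject close to half of all draws, so they consume roughly twice as many random bits per symbol.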

## Unicode Support

This implementation fully supports custom alphabets containing Unicode characters, including emojis and characters from various languages. By using `[]rune` internally, it correctly handles multi-byte Unicode characters.
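
To see why the conversion to `[]rune` matters, the following minimal sketch compares byte indexing with rune indexing on the Unicode alphabet from the usage example. The helper `symbolAt` is hypothetical and exists only for illustration.

```go
package main

import "fmt"

// symbolAt returns the i-th symbol of alphabet, counting runes rather
// than bytes, so multi-byte characters are never split.
func symbolAt(alphabet string, i int) string {
	return string([]rune(alphabet)[i])
}

func main() {
	alphabet := "あいうえお漢字🙂🚀"

	// len on a string counts UTF-8 bytes; len on []rune counts code points.
	fmt.Println("bytes:", len(alphabet))
	fmt.Println("runes:", len([]rune(alphabet)))

	// Rune indexing yields a whole symbol.
	fmt.Println("rune-indexed symbol 7:", symbolAt(alphabet, 7))

	// Byte indexing yields a single byte from the middle of a character.
	fmt.Printf("byte-indexed value 7: %q\n", alphabet[7])
}
```

Indexing the string directly would select individual UTF-8 bytes, so an ID built that way could contain invalid sequences whenever the alphabet includes non-ASCII characters.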

## Performance

The package is optimized for performance and low memory consumption:
* **Efficient Random Byte Consumption**: Uses bitwise operations to extract random bits efficiently.
* **Avoids `math/big`**: Does not use `math/big`, relying on built-in integer types for calculations.
* **Minimized System Calls**: Reads random bytes in batches to reduce the number of system calls.

## Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

* Fork the repository.
* Create a new branch for your feature or bugfix.
* Write tests for your changes.
* Ensure all tests pass.
* Submit a pull request.

## License

This project is licensed under the [MIT License](https://choosealicense.com/licenses/mit/).

```go
id, err := nanoid.Generate()
if err != nil {
// Handle the error
}
```
104 changes: 69 additions & 35 deletions nanoid.go
@@ -2,73 +2,107 @@
//
// This source code is licensed under the MIT License found in the
// LICENSE file in the root directory of this source tree.

package nanoid

import (
"crypto/rand"
"errors"
"io"
"math/bits"
"strings"
)

const (
defaultAlphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
defaultSize = 21
bitsPerByte = 8 // Number of bits in a byte
DefaultAlphabet = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"
DefaultSize = 21
)

// Generate generates a NanoID with the default size and alphabet.
func Generate() (string, error) {
return GenerateSize(defaultSize)
// New generates a NanoID with the default size and alphabet using crypto/rand as the random source.
func New() (string, error) {
return NewSize(DefaultSize)
}

// NewSize generates a NanoID with a specified size and the default alphabet using crypto/rand as the random source.
func NewSize(size int) (string, error) {
return NewCustom(size, DefaultAlphabet)
}

// GenerateSize generates a NanoID with a specified size and the default alphabet.
func GenerateSize(size int) (string, error) {
return GenerateCustom(size, defaultAlphabet)
// NewCustom generates a NanoID with a specified size and custom alphabet using crypto/rand as the random source.
func NewCustom(size int, alphabet string) (string, error) {
return NewCustomReader(size, alphabet, cryptoRandReader)
}

// GenerateCustom generates a NanoID with a specified size and custom alphabet.
func GenerateCustom(size int, alphabet string) (string, error) {
// NewCustomReader generates a NanoID with a specified size, custom alphabet, and custom random source.
func NewCustomReader(size int, alphabet string, rnd io.Reader) (string, error) {
if rnd == nil {
return "", errors.New("random source cannot be nil")
}
if size <= 0 {
return "", errors.New("size must be greater than zero")
}
alphabetLen := len(alphabet)

// Convert alphabet to []rune to support Unicode characters
alphabetRunes := []rune(alphabet)
alphabetLen := len(alphabetRunes)
if alphabetLen == 0 {
return "", errors.New("alphabet must not be empty")
}

alphabetBytes := []byte(alphabet)

// Handle special case when alphabet length is 1
if alphabetLen == 1 {
return strings.Repeat(alphabet, size), nil
return strings.Repeat(string(alphabetRunes[0]), size), nil
}

mask := (1 << bits.Len(uint(alphabetLen-1))) - 1
// Calculate the number of bits needed to represent the alphabet indices
bitsPerChar := bits.Len(uint(alphabetLen - 1))
totalBits := size * bitsPerChar
step := (totalBits + bitsPerByte - 1) / bitsPerByte // Number of random bytes needed
if bitsPerChar == 0 {
bitsPerChar = 1
}

id := make([]byte, size)
bytes := make([]byte, step)
idRunes := make([]rune, size)
var bitBuffer uint64
var bitsInBuffer int
i := 0

for i := 0; i < size; {
_, err := rand.Read(bytes)
if err != nil {
return "", err
}
for _, b := range bytes {
idx := int(b) & mask
if idx < alphabetLen {
id[i] = alphabetBytes[idx]
i++
if i == size {
break
}
for i < size {
// Refill until there are enough bits for one index. Each read is capped
// so the new bytes fit in the 64-bit buffer without losing high bits.
for bitsInBuffer < bitsPerChar {
var b [8]byte // Read up to 8 bytes at once for efficiency
n, err := rnd.Read(b[:(64-bitsInBuffer)/8])
if err != nil {
return "", err
}
if n == 0 {
return "", errors.New("random source returned no data")
}
// Append the new random bytes to the bit buffer
for j := 0; j < n; j++ {
bitBuffer |= uint64(b[j]) << bitsInBuffer
bitsInBuffer += 8
}
}

// Extract bitsPerChar bits to get the index
idx := int(bitBuffer & ((1 << bitsPerChar) - 1))
bitBuffer >>= bitsPerChar
bitsInBuffer -= bitsPerChar

// Use the index if it's within the alphabet range
if idx < alphabetLen {
idRunes[i] = alphabetRunes[idx]
i++
}
// Else discard and continue
}

return string(id), nil
return string(idRunes), nil
}

// cryptoRandReader is a wrapper around crypto/rand.Reader to match io.Reader interface.
var cryptoRandReader io.Reader = cryptoRandReaderType{}

type cryptoRandReaderType struct{}

func (cryptoRandReaderType) Read(p []byte) (int, error) {
return rand.Read(p)
}