add String[X] types #87

Merged 2 commits on Sep 8, 2021
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
 name = "WeakRefStrings"
 uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5"
 authors = ["quinnj <quinn.jacobd@gmail.com>"]
-version = "1.2.2"
+version = "1.3.0"

 [deps]
 DataAPI = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
34 changes: 32 additions & 2 deletions README.md
@@ -33,6 +33,36 @@ Contributions are very welcome, as are feature requests and suggestions. Please

 ## Usage

-Usage of `WeakRefString`s is discouraged for general users. Currently, a `WeakRefString` purposely _does not_ implement many Base Julia String interface methods due to many recent changes to Julia's builtin String interface, as well as the complexity to do so correctly. As such, `WeakRefString`s are used primarily in the data ecosystem as an IO optimization and nothing more. Upon indexing a `WeakRefStringArray`, a proper Julia `String` type is materialized for safe, correct string processing. In the future, it may be possible to implement safe operations on `WeakRefString` itself, but for now, they must be converted to a `String` for any real work.
+### `InlineString`

-Additional documentation is available at the REPL for `?WeakRefStringArray` and `?WeakRefString`.
+A set of custom string types of various fixed sizes. Each inline string is a
+custom primitive type and can benefit from being stack friendly by avoiding
+allocations/heap tracking in the GC. When used in an array, the elements are
+able to be stored inline since each one has a fixed size. Currently support
+inline strings from 1 byte up to 255 bytes.
+
+The following types are supported: `String1`, `String3`, `String7`, `String15`,
+`String31`, `String63`, `String127`, `String255`.
+
+### `PosLenString`
+
+A custom string representation that takes a byte buffer (`buf`), `poslen`, and
+`e` escape character, and lazily allows treating a region of the `buf` as a
+string. Can be used most efficiently as part of a `PosLenStringVector` which
+only stores an array of `PosLen` (inline) along with a single `buf` and `e` and
+returns `PosLenString` when indexing individual elements.
+
+### `WeakRefString`
+
+Usage of `WeakRefString`s is discouraged for general users. Currently, a
+`WeakRefString` purposely _does not_ implement many Base Julia String interface
+methods due to many recent changes to Julia's builtin String interface, as well
+as the complexity to do so correctly. As such, `WeakRefString`s are used
+primarily in the data ecosystem as an IO optimization and nothing more. Upon
+indexing a `WeakRefStringArray`, a proper Julia `String` type is materialized
+for safe, correct string processing. In the future, it may be possible to
+implement safe operations on `WeakRefString` itself, but for now, they must be
+converted to a `String` for any real work.
+
+Additional documentation is available at the REPL for `?WeakRefStringArray` and
+`?WeakRefString`.
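
As a rough illustration of the `PosLenStringVector` layout described in the README text above (one shared `buf` plus an inline position/length per element), here is a Base-only sketch. `DemoPosLen` and `DemoPosLenVector` are hypothetical names, not the package's types. Unlike the real `PosLenString`, this sketch ignores the escape character `e` and materializes a `String` on indexing instead of returning a lazy view.

```julia
# Hypothetical stand-in for the package's inline `PosLen` value:
# a 1-based byte offset into the shared buffer and a byte length.
struct DemoPosLen
    pos::Int
    len::Int
end

# Hypothetical stand-in for `PosLenStringVector`: a single shared byte buffer
# plus one small isbits record per element, so no per-element string is stored.
struct DemoPosLenVector <: AbstractVector{String}
    buf::Vector{UInt8}
    poslens::Vector{DemoPosLen}
end

Base.size(v::DemoPosLenVector) = size(v.poslens)

# Indexing copies the referenced byte range out of `buf` and builds a String.
# (Assumption for brevity: the package returns a lazy `PosLenString` here.)
function Base.getindex(v::DemoPosLenVector, i::Int)
    pl = v.poslens[i]
    return String(v.buf[pl.pos:(pl.pos + pl.len - 1)])
end

buf = Vector{UInt8}("alpha,beta")
v = DemoPosLenVector(buf, [DemoPosLen(1, 5), DemoPosLen(7, 4)])

println(v[1])  # first field of the buffer
println(v[2])  # second field of the buffer
```

The point of the layout is that the element array holds only fixed-size records, while the string bytes live once in `buf`.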
5 changes: 4 additions & 1 deletion src/inlinestrings.jl
@@ -12,7 +12,8 @@ size. Currently support inline strings from 1 byte up to 255 bytes.
 abstract type InlineString <: AbstractString end

 for sz in (1, 4, 8, 16, 32, 64, 128, 256)
-    nm = Symbol(:InlineString, max(1, sz - 1))
+    nm = Symbol(:String, max(1, sz - 1))
+    nma = Symbol(:InlineString, max(1, sz - 1))
     @eval begin
         """
             $($nm)
@@ -32,7 +33,9 @@ for sz in (1, 4, 8, 16, 32, 64, 128, 256)
         with the new codeunit `b` appended.
         """
         primitive type $nm <: InlineString $(sz * 8) end
+        const $nma = $nm
         export $nm
+        export $nma
     end
 end
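
The naming scheme in this loop can be tried in isolation. The sketch below repeats the same pattern under hypothetical `Demo*` names, so it runs without WeakRefStrings and does not shadow the package's exports; docstrings and `export` statements are omitted.

```julia
# Hypothetical abstract supertype standing in for `InlineString`.
abstract type DemoInline <: AbstractString end

for sz in (1, 4, 8, 16, 32, 64, 128, 256)
    # Primary name: DemoString1, DemoString3, DemoString7, ..., DemoString255.
    nm = Symbol(:DemoString, max(1, sz - 1))
    # Alias name: DemoInlineString1, DemoInlineString3, ...
    nma = Symbol(:DemoInlineString, max(1, sz - 1))
    @eval begin
        # A primitive type occupying `sz` bytes, declared as `sz * 8` bits.
        primitive type $nm <: DemoInline $(sz * 8) end
        # The alias is a second constant bound to the very same type object.
        const $nma = $nm
    end
end

println(DemoString7 === DemoInlineString7)  # alias and primary name are one type
println(sizeof(DemoString15))               # size in bytes of the 16-byte type
println(isbitstype(DemoString255))          # fixed-size, storable inline in arrays
```

Because the alias is a plain `const` binding, no conversion or extra methods are needed: anything defined for one name applies to the other.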

13 changes: 12 additions & 1 deletion test/inlinestrings.jl
@@ -169,5 +169,16 @@ end # @testset

 @test typeof(i_str_copy) == typeof(i_str)
 @test i_str_copy == i_str
-end
+end
 end # @testset
+
+@testset "alias tests" begin
+    @test String1 == InlineString1
+    @test String3 == InlineString3
+    @test String7 == InlineString7
+    @test String15 == InlineString15
+    @test String31 == InlineString31
+    @test String63 == InlineString63
+    @test String127 == InlineString127
+    @test String255 == InlineString255
+end