Merge pull request #21 from dwyl/create-cid-with-ex-cid
Create cid
nelsonic authored Jan 30, 2019
2 parents cf65929 + 0325986 commit 9c03a85
Showing 6 changed files with 260 additions and 40 deletions.
16 changes: 16 additions & 0 deletions .travis.yml
@@ -0,0 +1,16 @@
language: elixir
elixir:
- 1.8
env:
- MIX_ENV=test
before_install:
- curl https://dist.ipfs.io/go-ipfs/v0.4.18/go-ipfs_v0.4.18_linux-386.tar.gz --output go-ipfs.tar.gz
- tar xvfz go-ipfs.tar.gz
- cd go-ipfs
- sudo ./install.sh
- cd ..
- ipfs init
script:
- mix all_tests
after_success:
- bash <(curl -s https://codecov.io/bash)
22 changes: 15 additions & 7 deletions README.md
@@ -237,9 +237,9 @@ We can then create a URLs table
in our URL shortening app/service
with the following entry:

| `inserted_at ` | **`URL`** (PK) | `cid` | `short` |
| ----------- | ----------- | ----------- | ----------- |
| 1541609554 | https://github.com/dwyl/phoenix-ecto-append-only-log-example | gVSTedHFGBetxyYib9mBQsjtZj4dJjQe | gV |
| `inserted_at ` | **`URL`** (PK) | `cid` | `short` |
| -------------- | ------------------------------------------------------------ | -------------------------------- | ------- |
| 1541609554 | https://github.com/dwyl/phoenix-ecto-append-only-log-example | gVSTedHFGBetxyYib9mBQsjtZj4dJjQe | gV |

So the "short" url would be
[dwyl.co/gV](https://github.com/dwyl/phoenix-ecto-append-only-log-example)
@@ -273,6 +273,14 @@ be found at [https://hexdocs.pm/rid](https://hexdocs.pm/rid)
-->

## Tests

The tests for this module are a combination of doctests, unit tests and property-based tests.

To run the property-based tests you will need an installation of [IPFS](https://ipfs.io/).
See https://github.com/dwyl/learn-ipfs#how for details.

Then you can run `mix all_tests`, which runs the `Cid.cid` function on 50 randomly generated strings and 50 randomly generated maps and compares each result to the CID generated by IPFS, to check that our implementation matches.

# Research, Background & Relevant Reading
+ Real World examples of services that use Strings as IDs instead of Integers. [Real World Examples](https://github.com/dwyl/cid/blob/master/read_world_examples.md)
@@ -355,10 +363,10 @@ be **familiar** to people_)

**`prev: previous_cid`** address _example_:

| `inserted ` | **`cid`**(PK)<sup>1</sup> | **`name`** | **`address`** | **`prev`** |
| ----------- | ----------- | ----------- | ----------- |-----|
| 1541609554 | **gVSTedHFGBetxy** | Bruce Wane | 1007 Mountain Drive, Gotham | null |
| 1541618643 | smnELuCmEaX42 | Bruce Wane | [Rua Goncalo Afonso, Vila Madalena, Sao Paulo, 05436-100, Brazil](https://www.tripadvisor.co.uk/ShowUserReviews-g303631-d2349935-r341872180-Batman_Alley-Sao_Paulo_State_of_Sao_Paulo.html "Batman Alley ;-)") | **gVSTedHFGBetxy** |
| `inserted ` | **`cid`**(PK)<sup>1</sup> | **`name`** | **`address`** | **`prev`** |
| ----------- | ------------------------- | ---------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ |
| 1541609554 | **gVSTedHFGBetxy** | Bruce Wane | 1007 Mountain Drive, Gotham | null |
| 1541618643 | smnELuCmEaX42 | Bruce Wane | [Rua Goncalo Afonso, Vila Madalena, Sao Paulo, 05436-100, Brazil](https://www.tripadvisor.co.uk/ShowUserReviews-g303631-d2349935-r341872180-Batman_Alley-Sao_Paulo_State_of_Sao_Paulo.html "Batman Alley ;-)") | **gVSTedHFGBetxy** |

When a row does _not_ have a **`prev`** value then we know it is the _first_
time that content has been inserted into the database. When a **`prev`** value
113 changes: 91 additions & 22 deletions lib/cid.ex
@@ -1,38 +1,107 @@
defmodule Cid do
@moduledoc """
Returns a SHA512 transformed to Base64, remove ambiguous chars then sub-string
Provides a way for a user to turn a String, Map or Struct into a CID that
is identical to one that will be returned from IPFS if the same data is
added.
Currently only produces a default v1 CID.
Currently only uses the "raw" codec
Data provided must be under 256Kb in order for the CID to match the one
returned by IPFS
For more info on CIDs and IPFS see the following...
https://ipfs.io/
https://pascalprecht.github.io/posts/content-identifiers-in-ipfs/
https://github.com/dwyl/learn-ipfs/issues
"""

@doc """
make/2 create a SHA512 hash from the given input and return the require length
note: we remove "ambiguous" characters so _humans_ can type the hash without
getting "confused" this might not be required, but is to match the original
"Hits" implementation.
Returns a CID that is identical to one returned by IPFS if given the same data.
Can take a String, Map or Struct as an argument.
## Examples
## Parameters
iex> Cid.cid("hello")
"zb2rhZfjRh2FHHB2RkHVEvL2vJnCTcu7kwRqgVsf9gpkLgteo"
- input: String the string to be hashed.
- length: Number the length of string required
iex> Cid.cid(%{key: "value"})
"zb2rhn1C6ZDoX6rdgiqkqsaeK7RPKTBgEi8scchkf3xdsi8Bj"
Returns String hash of desired length.
iex> Cid.cid(1234)
"invalid data type"
iex> Cid.cid([1,2,3,"four"])
"invalid data type"
"""
def make(input) when is_map(input) do
input |> stringify_map_values |> make
@spec cid(String.t | map() | struct()) :: String.t
def cid(value) do
value
|> create_multihash()
|> create_cid()
end

# create_multihash returns a multihash. A multihash is a self describing hash.
# for more info on multihashes see this blog post...
# https://pascalprecht.github.io/posts/future-proofed-hashes-with-multihash/
# if create_multihash is called with a struct, the struct is converted into a
# map and then create_multihash is called again
# The %_{} syntax works like regular pattern matching. The underscore, _,
# simply matches any Struct/Module name.
defp create_multihash(%_{} = struct) do
struct
|> Map.from_struct()
|> create_multihash()
end

def make(input, length \\ 32) do
hash1 = :crypto.hash(:sha512, input)
{:ok, <<_multihash_code, _length, hash2::binary>>} = Multihash.encode(:sha2_512, hash1)
# if create_multihash is called with a map the map is converted into a JSON
# string and then create_multihash is called again
defp create_multihash(map) when is_map(map) do
map
|> Jason.encode!()
|> create_multihash()
end

# if create_multihash is called with a string then the string is converted
# into a multihash. This uses the erlang crypto hash function. For more
# information on using erlang functions in elixir see...
# https://stackoverflow.com/questions/35283888/how-to-call-an-erlang-function-in-elixir
defp create_multihash(str) when is_binary(str) do
digest = :crypto.hash(:sha256, str)
{:ok, multihash} = Multihash.encode(:sha2_256, digest)

hash2
|> Base.encode64()
|> String.replace(~r/[Il0oO=\/\+]/, "", global: true)
|> String.slice(0..(length - 1))
multihash
end

def stringify_map_values(input_map) do
Enum.sort(Map.keys(input_map)) # sort map keys for consistent ordering
|> Enum.map(fn (x) -> Map.get(input_map, x) end)
|> Enum.join("")
# if create_multihash is called with something that is not a string, map or
# struct then it returns an error.
defp create_multihash(_), do: {:error, "invalid data type"}

# if an error is passed in return error message
defp create_cid({:error, msg}), do: msg

# takes a multihash and returns a CID
# B58.encode58 takes the binary returned from create_cid_suffix and converts
# it into a base58 string. For more info on base58 strings see
# https://en.wikipedia.org/wiki/Base58
defp create_cid(multihash) when is_binary(multihash) do
multihash
|> create_cid_suffix()
|> B58.encode58()
|> add_multibase_prefix()
end

# takes a multihash and returns the suffix
# currently version is hardcoded to 1
# (currently IPFS only has 2 versions, 0 or 1. 0 is deprecated)
# and multicodec-packed-content-type is hardcoded to "raw" ("U" == <<85>>)
# more info on multicodec can be found https://github.com/multiformats/multicodec
# <version><multicodec-packed-content-type><multihash>
# the syntax on this line is concatenating strings and binary values together.
# Strings in elixir are binaries and that is how this works. Learn more here...
# https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
defp create_cid_suffix(multihash), do: <<1>> <> "U" <> multihash

# adds the multibase prefix (multibase-prefix) to the suffix (<version><mc><mh>)
# for more info on multibase, see https://github.com/multiformats/multibase
defp add_multibase_prefix(suffix), do: "z" <> suffix
end
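
For reviewers who want to check the byte layout without pulling in the dependencies, the pipeline above can be sketched using only `:crypto` from the standard distribution. This is a minimal sketch, not the code in this commit: the `CidSketch` module name and the hand-rolled `base58/1` are assumptions for illustration, and it only handles binaries (no maps or structs). It assumes the sha2-256 multihash header is the two bytes `<<0x12, 32>>` (hash function code and digest length), which is what `Multihash.encode(:sha2_256, digest)` is expected to prepend.

```elixir
defmodule CidSketch do
  # Bitcoin-style Base58 alphabet (no 0, O, I or l), as used by base58btc.
  @alphabet ~c"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

  # Builds <multibase "z"> <> base58(<version 1><codec raw 0x55><multihash>).
  def cid(data) when is_binary(data) do
    digest = :crypto.hash(:sha256, data)
    # multihash = <function code 0x12 (sha2-256)><digest length 32><digest>
    multihash = <<0x12, 32>> <> digest
    "z" <> base58(<<1, 0x55>> <> multihash)
  end

  # Hypothetical stand-in for B58.encode58/1: treats the binary as one big
  # unsigned integer and writes it out in base 58.
  def base58(bin) when is_binary(bin) do
    zeros = count_leading_zeros(bin, 0)
    digits = bin |> :binary.decode_unsigned() |> encode_int([])
    # each leading zero byte is represented by a leading "1"
    String.duplicate("1", zeros) <> List.to_string(digits)
  end

  defp encode_int(0, acc), do: acc

  defp encode_int(n, acc) do
    encode_int(div(n, 58), [Enum.at(@alphabet, rem(n, 58)) | acc])
  end

  defp count_leading_zeros(<<0, rest::binary>>, n), do: count_leading_zeros(rest, n + 1)
  defp count_leading_zeros(_, n), do: n
end
```

If the sketch is right, `CidSketch.cid("")` should agree with the value asserted for `Cid.cid("")` in test/cid_test.exs, since both build the same 36 bytes before Base58 encoding.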
22 changes: 18 additions & 4 deletions mix.exs
@@ -1,13 +1,16 @@
defmodule Rid.MixProject do
defmodule Cid.MixProject do
use Mix.Project

def project do
[
app: :rid,
app: :cid,
version: "0.1.0",
elixir: "~> 1.7",
start_permanent: Mix.env() == :prod,
deps: deps()
deps: deps(),
aliases: aliases(),
test_coverage: [tool: ExCoveralls],
preferred_cli_env: [coveralls: :test, "coveralls.detail": :test, "coveralls.post": :test, "coveralls.html": :test, all_tests: :test]
]
end

@@ -21,7 +24,18 @@ defmodule Rid.MixProject do
# Run "mix help deps" to learn about dependencies.
defp deps do
[
{:ex_multihash, "~> 2.0"}
{:ex_multihash, "~> 2.0"},
{:jason, "~> 1.1"},
{:basefiftyeight, "~> 0.1.0"}, # Currently building our own version of this here https://git.io/fhPaK. Can replace when it is ready
{:excoveralls, "~> 0.10", only: :test},
{:stream_data, "~> 0.4.2", only: :test}
]
end

defp aliases do
[
test: ["coveralls --exclude ipfs"],
all_tests: ["coveralls.detail --include ipfs"]
]
end
end
12 changes: 12 additions & 0 deletions mix.lock
@@ -1,3 +1,15 @@
%{
"basefiftyeight": {:hex, :basefiftyeight, "0.1.0", "3d48544743bf9aab7ab02aed803ac42af77acf268c7d8c71d4f39e7fa85ee8d3", [:mix], [], "hexpm"},
"certifi": {:hex, :certifi, "2.4.2", "75424ff0f3baaccfd34b1214184b6ef616d89e420b258bb0a5ea7d7bc628f7f0", [:rebar3], [{:parse_trans, "~>3.3", [hex: :parse_trans, repo: "hexpm", optional: false]}], "hexpm"},
"ex_multihash": {:hex, :ex_multihash, "2.0.0", "7fb36f842a2ec1c6bbba550f28fcd16d3c62981781b9466c9c1975c43d7db43c", [:mix], [], "hexpm"},
"excoveralls": {:hex, :excoveralls, "0.10.4", "b86230f0978bbc630c139af5066af7cd74fd16536f71bc047d1037091f9f63a9", [:mix], [{:hackney, "~> 1.13", [hex: :hackney, repo: "hexpm", optional: false]}, {:jason, "~> 1.0", [hex: :jason, repo: "hexpm", optional: false]}], "hexpm"},
"hackney": {:hex, :hackney, "1.15.0", "287a5d2304d516f63e56c469511c42b016423bcb167e61b611f6bad47e3ca60e", [:rebar3], [{:certifi, "2.4.2", [hex: :certifi, repo: "hexpm", optional: false]}, {:idna, "6.0.0", [hex: :idna, repo: "hexpm", optional: false]}, {:metrics, "1.0.1", [hex: :metrics, repo: "hexpm", optional: false]}, {:mimerl, "1.0.2", [hex: :mimerl, repo: "hexpm", optional: false]}, {:ssl_verify_fun, "1.1.4", [hex: :ssl_verify_fun, repo: "hexpm", optional: false]}], "hexpm"},
"idna": {:hex, :idna, "6.0.0", "689c46cbcdf3524c44d5f3dde8001f364cd7608a99556d8fbd8239a5798d4c10", [:rebar3], [{:unicode_util_compat, "0.4.1", [hex: :unicode_util_compat, repo: "hexpm", optional: false]}], "hexpm"},
"jason": {:hex, :jason, "1.1.2", "b03dedea67a99223a2eaf9f1264ce37154564de899fd3d8b9a21b1a6fd64afe7", [:mix], [{:decimal, "~> 1.0", [hex: :decimal, repo: "hexpm", optional: true]}], "hexpm"},
"metrics": {:hex, :metrics, "1.0.1", "25f094dea2cda98213cecc3aeff09e940299d950904393b2a29d191c346a8486", [:rebar3], [], "hexpm"},
"mimerl": {:hex, :mimerl, "1.0.2", "993f9b0e084083405ed8252b99460c4f0563e41729ab42d9074fd5e52439be88", [:rebar3], [], "hexpm"},
"parse_trans": {:hex, :parse_trans, "3.3.0", "09765507a3c7590a784615cfd421d101aec25098d50b89d7aa1d66646bc571c1", [:rebar3], [], "hexpm"},
"ssl_verify_fun": {:hex, :ssl_verify_fun, "1.1.4", "f0eafff810d2041e93f915ef59899c923f4568f4585904d010387ed74988e77b", [:make, :mix, :rebar3], [], "hexpm"},
"stream_data": {:hex, :stream_data, "0.4.2", "fa86b78c88ec4eaa482c0891350fcc23f19a79059a687760ddcf8680aac2799b", [:mix], [], "hexpm"},
"unicode_util_compat": {:hex, :unicode_util_compat, "0.4.1", "d869e4c68901dd9531385bb0c8c40444ebf624e60b6962d95952775cac5e90cd", [:rebar3], [], "hexpm"},
}
115 changes: 108 additions & 7 deletions test/cid_test.exs
@@ -1,18 +1,119 @@
defmodule DummyStruct do
defstruct [:name, :username, :age]
end

defmodule CidTest do
use ExUnit.Case
use ExUnitProperties

doctest Cid

test "Creates a deterministic Content ID from Elixir String" do
assert Cid.make("Elixir") == "NSqJspBr2u1F6z1DhcR2cnQAxLdQZBLk"
defstruct [:a]
@filename "random.txt"
@ipfs_args ["add", @filename, "-n", "--cid-version=1"]
@dummy_map %{
name: "Batman",
username: "The Batman",
age: 80
}

describe "Testing ex_cid cid function" do
test "returns the same CID as IPFS when given a string" do
assert "zb2rhhnbH6zTaAj948YVsYxW4c5AY6TfJURC9EGhQum3Kq7b3" == Cid.cid("Hello World")
end

test "returns the same CID as IPFS when given a map" do
assert "zb2rhdeaHh2UHghBcwxeFP1GRUYETDH96DkV6oppiz5Gk1xGN" == Cid.cid(%{a: "a"})
end

test "returns the same CID as IPFS when given a struct" do
assert "zb2rhdeaHh2UHghBcwxeFP1GRUYETDH96DkV6oppiz5Gk1xGN" == Cid.cid(%__MODULE__{a: "a"})
end

test "returns an error if given invalid data type" do
assert Cid.cid(2) == "invalid data type"
end

test "returns the same CID regardless of order of items in map" do
map = %{
age: 80,
name: "Batman",
username: "The Batman"
}

assert Cid.cid(@dummy_map) == Cid.cid(map)
end

test "A struct with the same keys and values as a map creates the same CID" do
struct = %DummyStruct{
age: 80,
name: "Batman",
username: "The Batman"
}

assert Cid.cid(struct) == Cid.cid(@dummy_map)
end

test "returns a different CID when the value given differs (CIDs are all unique)" do
refute Cid.cid("") == Cid.cid(" ")
refute Cid.cid("\n") == Cid.cid("")
refute Cid.cid("Hello World") == Cid.cid("salve mundi")
refute Cid.cid("Hello World") == Cid.cid("hello world")
refute Cid.cid(%{a: "a"}) == Cid.cid(%{a: "b"})
refute Cid.cid(%__MODULE__{a: "a"}) == Cid.cid(%DummyStruct{})
end

test "empty values also work" do
assert Cid.cid("") == "zb2rhmy65F3REf8SZp7De11gxtECBGgUKaLdiDj7MCGCHxbDW"
assert Cid.cid(%{}) == "zb2rhbE2775XANjTsRTV9sxfFMWxrGuMWYgshDn9xvjG69fZ3"
end

# Property based tests that generate random strings and
# use them in our compare_ipfs_cid function
# Tagged to allow you to ignore these tests if you don't have ipfs installed
@tag :ipfs
property "test with 50 random strings" do
check all str <- StreamData.string(:ascii), max_runs: 50 do
compare_ipfs_cid(str)
end
end

# Property based tests that generate random maps and
# use them in our compare_ipfs_cid function
# Tagged to allow you to ignore these tests if you don't have ipfs installed
@tag :ipfs
property "test with 50 random maps" do
check all map <- random_map(), max_runs: 50 do
map
|> Jason.encode!()
|> compare_ipfs_cid()
end
end
end

test "Create a CID from a Map" do
map = %{cat: "Meow", dog: "Woof", fox: "What Does The Fox Say?"}
assert Cid.make(map) == "GdrVnsLSdxRphXgQgNsmq1FDyRXAySXT"
# Calls IPFS `add` function to generate cid
# then compares result to result of our `Cid.cid` function
# see: https://docs.ipfs.io/introduction/usage/
def compare_ipfs_cid(val) do
File.write(@filename, val)

{added_val, 0} = System.cmd("ipfs", @ipfs_args)

<<"added ", cid::bytes-size(49), _::binary>> = added_val

assert cid == Cid.cid(val)

File.rm!(@filename)
end

test "Cid.make(\"hello world\")" do
assert Cid.make("hello world") == "MJ7MSJwS1utMxA9QyQLytNDtd5RGnx6m"
def random_map do
keys = StreamData.atom(:alphanumeric)
values = StreamData.one_of([random_value(), StreamData.list_of(random_value())])

StreamData.map_of(keys, values)
end

def random_value do
StreamData.one_of([StreamData.string(:ascii), StreamData.integer()])
end
end
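
The binary pattern in `compare_ipfs_cid/1` relies on every CID being exactly 49 bytes long. A slightly more tolerant way to pull the CID out of the command output, assuming `ipfs add` prints a line of the form `added <cid> <filename>` (which is what the pattern above implies), might look like the sketch below. `IpfsOutputSketch` and `extract_cid/1` are hypothetical names and are not part of this commit.

```elixir
defmodule IpfsOutputSketch do
  # Extracts the CID from output assumed to look like "added <cid> <filename>".
  # Returns {:ok, cid} or :error instead of raising on unexpected output.
  def extract_cid("added " <> rest) do
    case String.split(rest, " ", parts: 2) do
      [cid, _filename] when cid != "" -> {:ok, cid}
      _ -> :error
    end
  end

  def extract_cid(_other), do: :error
end
```

Returning a tagged tuple rather than pattern matching on a fixed size means an unexpected message from the `ipfs` command would surface as a readable assertion failure rather than a `MatchError`.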
