Create cid #21

Merged: 29 commits, Jan 30, 2019
Showing changes from 18 commits
Commits (29)
1fb3965
require jason and @simonlabs version of ex_cid #11 #7
RobStallion Jan 28, 2019
1e4e2aa
create same CID values as IPFS. works with string, map and struct #7 #11
RobStallion Jan 28, 2019
74f3cf4
rename file, remove unused code, add docs and doc tests #7 #11
RobStallion Jan 28, 2019
b17d84b
adds tests #7 #11
RobStallion Jan 28, 2019
0e4f639
update name of application #22
RobStallion Jan 28, 2019
011b284
rename module to CID #22
RobStallion Jan 28, 2019
597b817
remove cid and and base 58 module #11 #7
RobStallion Jan 28, 2019
3b96a0b
update create cid function so it no longer uses ex_cid module
RobStallion Jan 28, 2019
2412717
rename files as per comment #22
RobStallion Jan 28, 2019
d2a7671
update mix lock
RobStallion Jan 28, 2019
1ae6128
add .travis.yml file to run tests on branch https://github.com/dwyl/c…
nelsonic Jan 29, 2019
8a21996
update documentation on functions #11
RobStallion Jan 29, 2019
ce987d2
adds examples for invalid data types #11
RobStallion Jan 29, 2019
3369a12
test more complex data and show that different (but similar) data ret…
RobStallion Jan 29, 2019
1c142e2
adds tests for empty strings and maps
RobStallion Jan 29, 2019
74c58d2
update documentation to include more info and relevant links
RobStallion Jan 29, 2019
ec620ec
adds code coverage
RobStallion Jan 29, 2019
aa4f2c6
adds a spec for the cid function #11
RobStallion Jan 29, 2019
d15e914
improve docs to explain what a multihash is
RobStallion Jan 29, 2019
3ec0d4d
fix poor grammar
RobStallion Jan 29, 2019
53ccfaf
update travis yml with ipfs install script
RobStallion Jan 29, 2019
58e1c09
add sudo to install command #24
RobStallion Jan 29, 2019
89f3806
cd back out over go-ipfs dir #24
RobStallion Jan 29, 2019
8f97c9a
update travis yml so it runs all tests 24
RobStallion Jan 29, 2019
89dbdc1
adds property tests #24
RobStallion Jan 29, 2019
217d684
remove new line addition. text editors were adding new line, not IPFS…
RobStallion Jan 29, 2019
fc2409c
runs ipfs init on travis, #24
Danwhy Jan 29, 2019
b287876
adds property tests for random maps, #24
Danwhy Jan 29, 2019
0325986
adds docs for property based tests, #24
Danwhy Jan 29, 2019
9 changes: 9 additions & 0 deletions .travis.yml
@@ -0,0 +1,9 @@
language: elixir
Member Author
👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍

Good addition @nelsonic

elixir:
- 1.8
env:
- MIX_ENV=test
script:
- mix test
after_success:
- bash <(curl -s https://codecov.io/bash)
121 changes: 99 additions & 22 deletions lib/cid.ex
@@ -1,38 +1,115 @@
defmodule Cid do
@moduledoc """
Returns a SHA512 transformed to Base64, remove ambiguous chars then sub-string
Provides a way for a user to turn a String, Map or Struct into a CID that
is identical to the one that will be returned from IPFS if the same data is
Member
A link to an explanation of what IPFS is might be good here

Member
Please link to it in the README.md in https://github.com/dwyl/learn-ipfs i.e. "What is IPFS?" and if we don't already have the section, we should add it ASAP see: dwyl/learn-ipfs#1 (comment)

added.

Currently only produces a default v1 CID.
Currently only uses the "raw" codec
Member
Maybe add a little more info on what a v1 CID or "raw" codec is. Just a link would probably be fine, even if it's to elsewhere in this repo.

Data provided must be under 256 KB in order for the CID to match the one
returned by IPFS (IPFS splits larger data into chunks, which changes the CID).

For more info on CIDs and IPFS see the following...
https://ipfs.io/
https://pascalprecht.github.io/posts/content-identifiers-in-ipfs/
https://github.com/dwyl/learn-ipfs/issues
"""

@doc """
make/2 create a SHA512 hash from the given input and return the require length
note: we remove "ambiguous" characters so _humans_ can type the hash without
getting "confused" this might not be required, but is to match the original
"Hits" implementation.
Returns a CID that is identical to the one returned by IPFS if given the same data.
Can take a String, Map or Struct as an argument.

## Examples

iex> Cid.cid("hello")
"zb2rhcc1wJn2GHDLT2YkmPq5b69cXc2xfRZZmyufbjFUfBkxr"

## Parameters
iex> Cid.cid(%{key: "value"})
"zb2rhkN6szWhAmBFjjP8RSczv2YVNLnG1tz1Q7FyfEp8LssNZ"

- input: String the string to be hashed.
- length: Number the length of string required
iex> Cid.cid(1234)
"invalid data type"

Returns String hash of desired length.
iex> Cid.cid([1,2,3,"four"])
"invalid data type"
"""
def make(input) when is_map(input) do
input |> stringify_map_values |> make
@spec cid(String.t | map() | struct()) :: String.t
Member
👍

def cid(value) do
value
|> create_multihash()
|> create_cid()
end

# if create_multihash is called with a struct, the struct is converted into a
# map and then create_multihash is called again
# for more info on multihashes see this blog post...
# https://pascalprecht.github.io/posts/future-proofed-hashes-with-multihash/
# The %_{} syntax works like regular pattern matching. The underscore, _,
# simply matches any Struct/Module name.
defp create_multihash(%_{} = struct) do
Member
A brief explanation of how this pattern matching works would be useful for beginners

struct
|> Map.from_struct()
|> create_multihash()
end

# if create_multihash is called with a map the map is converted into a JSON
# string and then create_multihash is called again
defp create_multihash(map) when is_map(map) do
map
|> Jason.encode!()
|> create_multihash()
end

def make(input, length \\ 32) do
hash1 = :crypto.hash(:sha512, input)
{:ok, <<_multihash_code, _length, hash2::binary>>} = Multihash.encode(:sha2_512, hash1)
# if create_multihash is called with a string, the string has a new line added
# to the end (as that's what IPFS appears to be doing based on tests), then
# the string is converted into a multihash
# This uses the erlang crypto hash function. For more information on using
# erlang functions in elixir see...
# https://stackoverflow.com/questions/35283888/how-to-call-an-erlang-function-in-elixir
defp create_multihash(str) when is_binary(str) do
str = add_new_line(str)
digest = :crypto.hash(:sha256, str)
Member
Explaining in the comments that this is an erlang function would be good for beginners

{:ok, multihash} = Multihash.encode(:sha2_256, digest)

hash2
|> Base.encode64()
|> String.replace(~r/[Il0oO=\/\+]/, "", global: true)
|> String.slice(0..(length - 1))
multihash
end

def stringify_map_values(input_map) do
Enum.sort(Map.keys(input_map)) # sort map keys for consistent ordering
|> Enum.map(fn (x) -> Map.get(input_map, x) end)
|> Enum.join("")
# if create_multihash is called with something that is not a string, map or struct
# then it returns an error.
defp create_multihash(_), do: {:error, "invalid data type"}

# if an error is passed in return error message
defp create_cid({:error, msg}), do: msg

# takes a multihash and returns a CID
# B58.encode58 takes the binary returned from create_cid_suffix and converts
# it into a base58 string. For more info on base58 strings see
# https://en.wikipedia.org/wiki/Base58
defp create_cid(multihash) when is_binary(multihash) do
multihash
|> create_cid_suffix()
|> B58.encode58()
Member
Could do with a brief explanation of B58

Member Author
Just added this in one of the latest commits. Thanks for pointing that out

|> add_multibase_prefix()
end

# takes a multihash and returns the suffix
# currently version is hardcoded to 1
# (currently IPFS only has 2 CID versions, 0 and 1; version 0 is deprecated)
# and multicodec-packed-content-type is hardcoded to "raw" ("U" == <<85>>)
# more info on multicodec can be found https://github.com/multiformats/multicodec
# <version><multicodec-packed-content-type><multihash>
# the syntax on this line is concatenating strings and binary values together.
# Strings in elixir are binaries and that is how this works. Learn more here...
# https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
defp create_cid_suffix(multihash), do: <<1>> <> "U" <> multihash
Member
A small comment on the binary syntax might be useful to people unfamiliar with it


# adds the multibase prefix (multibase-prefix) to the suffix (<version><mc><mh>)
# for more info on multibase, see https://github.com/multiformats/multibase
defp add_multibase_prefix(suffix), do: "z" <> suffix

# adds new line to the end of string. (exists because all tests with ipfs
# appeared to do the same thing.)
defp add_new_line(str) do
str <> "\n"
end
end
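As a rough, dependency-free sketch of the byte layout that the private functions in lib/cid.ex assemble (`<version><multicodec><multihash>`, base58-encoded, with a multibase prefix), the following may help. It is not the code from this PR, which relies on `Multihash.encode` and `B58.encode58`. The module name `CidSketch` and its functions are hypothetical, the multihash header and the base58 encoder are written by hand here, and unlike the diff above it does not append a new line to the input.

```elixir
defmodule CidSketch do
  # Base58btc alphabet: digits and letters without 0, O, I and l.
  @alphabet ~c"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

  # Builds a CIDv1 string for a binary, using sha2-256 and the "raw" codec.
  def cid(data) when is_binary(data) do
    digest = :crypto.hash(:sha256, data)

    # multihash = <hash function code><digest length><digest>
    # 0x12 is the multihash code for sha2-256; the digest is 32 bytes long.
    multihash = <<0x12, byte_size(digest)>> <> digest

    # CID body = <version><multicodec><multihash>; 0x55 ("U") is the raw codec.
    body = <<0x01, 0x55>> <> multihash

    # "z" is the multibase prefix for base58btc.
    "z" <> base58(body)
  end

  # Encodes a binary as base58btc. Each leading zero byte becomes a "1".
  def base58(bin) when is_binary(bin) do
    zeros = count_leading_zeros(bin)
    digits = bin |> :binary.decode_unsigned() |> to_digits([])
    String.duplicate("1", zeros) <> List.to_string(digits)
  end

  defp count_leading_zeros(<<0, rest::binary>>), do: 1 + count_leading_zeros(rest)
  defp count_leading_zeros(_), do: 0

  # Repeatedly divides by 58, prepending the character for each remainder.
  defp to_digits(0, acc), do: acc

  defp to_digits(n, acc) do
    to_digits(div(n, 58), [Enum.at(@alphabet, rem(n, 58)) | acc])
  end
end
```

Because the first four bytes (0x01, 0x55, 0x12, 0x20) never change, every CID built this way is 49 characters long and starts with "zb2rh", as the values in the doc tests above do.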
18 changes: 14 additions & 4 deletions mix.exs
@@ -1,13 +1,16 @@
defmodule Rid.MixProject do
defmodule Cid.MixProject do
use Mix.Project

def project do
[
app: :rid,
app: :cid,
version: "0.1.0",
elixir: "~> 1.7",
start_permanent: Mix.env() == :prod,
deps: deps()
deps: deps(),
aliases: aliases(),
test_coverage: [tool: ExCoveralls],
preferred_cli_env: [coveralls: :test, "coveralls.detail": :test, "coveralls.post": :test, "coveralls.html": :test]
]
end

@@ -21,7 +24,14 @@ defmodule Rid.MixProject do
# Run "mix help deps" to learn about dependencies.
defp deps do
[
{:ex_multihash, "~> 2.0"}
{:ex_multihash, "~> 2.0"},
{:jason, "~> 1.1"},
{:basefiftyeight, "~> 0.1.0"}, # Currently building our own version of this here https://git.io/fhPaK. Can replace when it is ready
{:excoveralls, "~> 0.10", only: :test}
]
end

defp aliases do
[test: ["coveralls"]]
end
end
11 changes: 11 additions & 0 deletions mix.lock
@@ -1,3 +1,14 @@
%{
"basefiftyeight": {:hex, :basefiftyeight, "0.1.0", "3d48544743bf9aab7ab02aed803ac42af77acf268c7d8c71d4f39e7fa85ee8d3", [:mix], [], "hexpm"},
"certifi": {:hex, :certifi, "2.4.2", "75424ff0f3baaccfd34b1214184b6ef616d89e420b258bb0a5ea7d7bc628f7f0", [:rebar3], [{:parse_trans, "~>3.3", [hex: :parse_trans, repo: "hexpm", optional: false]}], "hexpm"},
"ex_multihash": {:hex, :ex_multihash, "2.0.0", "7fb36f842a2ec1c6bbba550f28fcd16d3c62981781b9466c9c1975c43d7db43c", [:mix], [], "hexpm"},
"excoveralls": {:hex, :excoveralls, "0.10.4", "b86230f0978bbc630c139af5066af7cd74fd16536f71bc047d1037091f9f63a9", [:mix], [{:hackney, "~> 1.13", [hex: :hackney, repo: "hexpm", optional: false]}, {:jason, "~> 1.0", [hex: :jason, repo: "hexpm", optional: false]}], "hexpm"},
"hackney": {:hex, :hackney, "1.15.0", "287a5d2304d516f63e56c469511c42b016423bcb167e61b611f6bad47e3ca60e", [:rebar3], [{:certifi, "2.4.2", [hex: :certifi, repo: "hexpm", optional: false]}, {:idna, "6.0.0", [hex: :idna, repo: "hexpm", optional: false]}, {:metrics, "1.0.1", [hex: :metrics, repo: "hexpm", optional: false]}, {:mimerl, "1.0.2", [hex: :mimerl, repo: "hexpm", optional: false]}, {:ssl_verify_fun, "1.1.4", [hex: :ssl_verify_fun, repo: "hexpm", optional: false]}], "hexpm"},
"idna": {:hex, :idna, "6.0.0", "689c46cbcdf3524c44d5f3dde8001f364cd7608a99556d8fbd8239a5798d4c10", [:rebar3], [{:unicode_util_compat, "0.4.1", [hex: :unicode_util_compat, repo: "hexpm", optional: false]}], "hexpm"},
"jason": {:hex, :jason, "1.1.2", "b03dedea67a99223a2eaf9f1264ce37154564de899fd3d8b9a21b1a6fd64afe7", [:mix], [{:decimal, "~> 1.0", [hex: :decimal, repo: "hexpm", optional: true]}], "hexpm"},
"metrics": {:hex, :metrics, "1.0.1", "25f094dea2cda98213cecc3aeff09e940299d950904393b2a29d191c346a8486", [:rebar3], [], "hexpm"},
"mimerl": {:hex, :mimerl, "1.0.2", "993f9b0e084083405ed8252b99460c4f0563e41729ab42d9074fd5e52439be88", [:rebar3], [], "hexpm"},
"parse_trans": {:hex, :parse_trans, "3.3.0", "09765507a3c7590a784615cfd421d101aec25098d50b89d7aa1d66646bc571c1", [:rebar3], [], "hexpm"},
"ssl_verify_fun": {:hex, :ssl_verify_fun, "1.1.4", "f0eafff810d2041e93f915ef59899c923f4568f4585904d010387ed74988e77b", [:make, :mix, :rebar3], [], "hexpm"},
"unicode_util_compat": {:hex, :unicode_util_compat, "0.4.1", "d869e4c68901dd9531385bb0c8c40444ebf624e60b6962d95952775cac5e90cd", [:rebar3], [], "hexpm"},
}
71 changes: 61 additions & 10 deletions test/cid_test.exs
@@ -1,18 +1,69 @@
defmodule DummyStruct do
defstruct [:name, :username, :age]
end

defmodule CidTest do
use ExUnit.Case
doctest Cid

test "Creates a deterministic Content ID from Elixir String" do
assert Cid.make("Elixir") == "NSqJspBr2u1F6z1DhcR2cnQAxLdQZBLk"
end
defstruct [:a]

test "Create a CID from a Map" do
map = %{cat: "Meow", dog: "Woof", fox: "What Does The Fox Say?"}
assert Cid.make(map) == "GdrVnsLSdxRphXgQgNsmq1FDyRXAySXT"
end
@dummy_map %{
name: "Batman",
username: "The Batman",
age: 80
}

test "Cid.make(\"hello world\")" do
assert Cid.make("hello world") == "MJ7MSJwS1utMxA9QyQLytNDtd5RGnx6m"
end
describe "Testing ex_cid cid function" do
test "returns the same CID as IPFS when given a string" do
assert "zb2rhkpbfTBtUV1ESqSScrUre8Hh77fhCKDLmX21rCo5xp8J9" == Cid.cid("Hello World")
end

test "returns the same CID as IPFS when given a map" do
assert "zb2rhbYzyUJP6euwn89vAstfgG2Au9BSwkFGUJkbujWztZWjZ" == Cid.cid(%{a: "a"})
end

test "returns the same CID as IPFS when given a struct" do
assert "zb2rhbYzyUJP6euwn89vAstfgG2Au9BSwkFGUJkbujWztZWjZ" == Cid.cid(%__MODULE__{a: "a"})
end

test "returns an error if given invalid data type" do
assert Cid.cid(2) == "invalid data type"
end

test "returns the same CID regardless of order of items in map" do
map = %{
age: 80,
name: "Batman",
username: "The Batman"
}

assert Cid.cid(@dummy_map) == Cid.cid(map)
end

test "A struct with the same keys and values as a map creates the same CID" do
struct =
%DummyStruct{
age: 80,
name: "Batman",
username: "The Batman"
}

assert Cid.cid(struct) == Cid.cid(@dummy_map)
end

test "returns a different CID when the value given differs (CIDs are all unique)" do
refute Cid.cid("") == Cid.cid(" ")
refute Cid.cid("\n") == Cid.cid("")
refute Cid.cid("Hello World") == Cid.cid("salve mundi")
refute Cid.cid("Hello World") == Cid.cid("hello world")
refute Cid.cid(%{a: "a"}) == Cid.cid(%{a: "b"})
refute Cid.cid(%__MODULE__{a: "a"}) == Cid.cid(%DummyStruct{})
end

test "empty values also work" do
assert Cid.cid("") == "zb2rhWm2M1wXXqtqU6pHfovz3DZQ7D54ZD2xN3ynwankHCBCn"
assert Cid.cid(%{}) == "zb2rhkFjaEEfsGTTeHVAGXB3qq7RzLHJojqpucfYFzoL2gB9P"
end
end
end
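The struct tests above depend on the `%_{}` clause in lib/cid.ex running before the `is_map` clause. A minimal sketch of that dispatch, using only the standard library, might look like the following. The `Hero` and `MatchSketch` modules are hypothetical names made up for illustration and are not part of this PR.

```elixir
defmodule Hero do
  defstruct [:name, :age]
end

defmodule MatchSketch do
  # %_{} matches any struct, whatever its module name is.
  # Map.from_struct/1 drops the :__struct__ key and returns a plain map.
  def kind(%_{} = struct), do: {:struct, Map.from_struct(struct)}

  # Structs are maps as well, so this clause has to come after the struct
  # clause; otherwise it would match structs first.
  def kind(map) when is_map(map), do: {:map, map}

  # Anything else (integers, lists, ...) falls through to here.
  def kind(_other), do: :other
end
```

Once a struct has been converted with `Map.from_struct/1`, it is equal to a plain map with the same keys and values, which is why a struct and a map holding the same data can end up with the same CID.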