Implement database iterator #47

jjxtra · 2018-07-16T22:10:27Z

I would appreciate a code review on this. It passes all tests.

I would love to see this in a NuGet package soon.

As a side note, some of the threading tests are failing. I did not investigate that.

Enumeration time for cities database in release mode is around 12 seconds now. Debug is around 20.

jjxtra · 2018-07-16T23:29:36Z

I would suggest a squash and merge on this to avoid cluttering the commit history.

This allows any type to easily be iterated, at the cost of losing the convenience of a foreach loop.

oschwald

This looks great! I apologize for taking so long to get back to you on this. I had a few minor comments.

oschwald · 2018-08-03T15:29:37Z

MaxMind.Db/Reader.cs

+        /// <summary>
+        /// Get an enumerator that iterates all data nodes in the database
+        /// </summary>
+        /// <param name="cacheSize">The size of the data cache. This can greatly speed enumeration at the cost of memory usage.</param>


This should also probably warn about the dangers of mutating the returned object given that the object is in the cache.
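
To make the hazard concrete, here is a minimal sketch, assuming a hypothetical cache keyed by record offset (`CacheMutationDemo`, `Lookup`, and the `iso_code` value are illustrative names, not code from this PR). It shows that when a cache hands back the stored instance, a caller's edit is visible to every later lookup of the same record, and that copying before editing avoids this.

```csharp
using System;
using System.Collections.Generic;

public static class CacheMutationDemo
{
    // Hypothetical stand-in for a data cache keyed by record offset.
    private static readonly Dictionary<int, Dictionary<string, object>> Cache =
        new Dictionary<int, Dictionary<string, object>>();

    // Returns the cached instance itself, not a copy.
    public static Dictionary<string, object> Lookup(int offset)
    {
        if (!Cache.TryGetValue(offset, out var record))
        {
            record = new Dictionary<string, object> { ["iso_code"] = "US" };
            Cache[offset] = record;
        }
        return record;
    }

    public static void Run()
    {
        var first = Lookup(42);
        first["iso_code"] = "changed";            // mutates the shared cached object

        var second = Lookup(42);
        Console.WriteLine(second["iso_code"]);    // prints "changed"

        // Copy before editing to leave the cached object untouched.
        var copy = new Dictionary<string, object>(Lookup(7));
        copy["iso_code"] = "local edit";
        Console.WriteLine(Lookup(7)["iso_code"]); // prints "US"
    }
}
```

The copy constructor makes a shallow copy only, so nested dictionaries or lists inside a record would still be shared with the cached object.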

oschwald · 2018-08-03T15:47:00Z

MaxMind.Db.Test/ReaderTest.cs

+        }
+
+        [Fact]
+        public void TestEnumerateCitiesDatabaseSpeed()


This test is rather slow and would likely fail if run on a slow machine. Perhaps this could be refactored a bit and moved to be with the benchmarks in MaxMind.Db.Benchmark.

oschwald · 2018-08-03T15:51:11Z

MaxMind.Db.Test/ReaderTest.cs

+        {
+            Stopwatch timer = Stopwatch.StartNew();
+            int count = 0;
+            using (var reader = new Reader(Path.Combine(_testDataRoot, "../../GeoLite2-Country.mmdb"), FileAccessMode.Memory))


Perhaps this one could be moved to the benchmarks too. I noticed that you added this database to the repo. Although it is much smaller than the GeoLite2-City database that was added to the repo early on, I'd like to avoid adding new binary files to reduce repo bloat.

oschwald · 2018-08-03T15:54:29Z

MaxMind.Db/CachedDictionary.cs

+    /// A dictionary that caches up to N values in memory. Once the dictionary reaches N count, the last item in the internal list is removed.
+    /// New items are always added to the start of the internal list.
+    /// </summary>
+    internal class CachedDictionary<TKey, TValue> : IDictionary<TKey, TValue>, IDisposable


Nice! Was this based on another caching implementation or is it completely new code? I am just wondering for licensing reasons.

jjxtra · 2018-08-04

It is something I use in other projects; I can tag it with the MIT license. All of the code was written by me in my spare time, for free.

oschwald · 2018-08-06

Thanks! If you are the owner of the code, would you be willing to place it under the Apache 2 license to match the other code in the repo and for the benefit of the patent grant?

jjxtra · 2018-08-06

Sure, no problem.

jjxtra · 2018-08-06

It is done.

jjxtra · 2018-08-06

Commented-out code eradicated.
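
For readers following the thread, the behavior described in the `CachedDictionary` doc comment (hold up to N values, add new items at the start of an internal list, drop the last item once N is reached) can be sketched as below. This is an illustration only, not the class from this PR: `BoundedCache` and its members are hypothetical names, and the sketch assumes lookups do not reorder the list, which the doc comment does not specify.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not the CachedDictionary class from this PR.
public sealed class BoundedCache<TKey, TValue>
{
    private readonly int _capacity;

    // Insertion order: newest entry first, oldest entry last.
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();

    // Key -> list node, so removal from the list is O(1).
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _index =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();

    public BoundedCache(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count => _index.Count;

    public void Add(TKey key, TValue value)
    {
        if (_index.TryGetValue(key, out var existing))
        {
            // Replace an existing entry so the key is never listed twice.
            _order.Remove(existing);
            _index.Remove(key);
        }
        else if (_index.Count >= _capacity)
        {
            // Cache is full: drop the entry at the end of the list.
            var last = _order.Last;
            _order.RemoveLast();
            _index.Remove(last.Value.Key);
        }

        // New items always go to the start of the list.
        _index[key] = _order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        if (_index.TryGetValue(key, out var node))
        {
            value = node.Value.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }
}
```

With a capacity of 2, adding the keys `a`, `b`, and `c` in that order leaves `b` and `c` in the cache; `a` sits at the end of the list when `c` arrives and is dropped. Pairing the linked list with a dictionary of nodes keeps both insertion and eviction constant-time.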

oschwald · 2018-08-03T16:03:35Z

MaxMind.Db/CachedDictionary.cs

+        /// <summary>
+        /// The first node in the priority list
+        /// </summary>
+        protected LinkedListNode<KeyValuePair<TKey, TValue>> FirstNode


I don't think this is used.

oschwald · 2018-08-03T16:13:26Z

MaxMind.Db/CachedDictionary.cs

+        /// <summary>
+        /// Current comparer
+        /// </summary>
+        public IEqualityComparer<TKey> Comparer


This and the following three are also not used.

oschwald · 2018-08-03T16:27:18Z

MaxMind.Db/Reader.cs

+        /// </summary>
+        /// <param name="cacheSize">The size of the data cache. This can greatly speed enumeration at the cost of memory usage.</param>
+        /// <returns>Enumerator for all data nodes</returns>
+        public IEnumerable<Reader.ReaderIteratorNode<T>> FindAll<T>(int cacheSize = 16384) where T : class


A test that used something other than Dictionary<string, object> might be good. In particular, one using the TypeHolder class with the MaxMind-DB-test-decoder.mmdb database.

jjxtra · 2018-08-04T01:09:08Z

I've implemented most of these changes. I've moved the code to the benchmark project, but it is commented out for now. Not sure how useful it is at the moment, perhaps someone else can add it later if needed.

oschwald · 2018-08-06T22:04:10Z

Looks good. Perhaps the benchmark code could be moved to a separate PR or an issue so that we don't have commented-out code lying around in the codebase. Other than that and the license question, I am happy to merge this.

jjxtra · 2018-08-07T04:31:08Z

Ready to merge when you are.

oschwald · 2018-08-08T22:34:27Z

Looks great! Thanks. I am squashing it so that the binary database does not end up in the Git history.

jjxtra added 9 commits July 16, 2018 11:27

Add database enumerator

07ad6f9

Optimize database enumerator

912517f

Fix test logic

d3ceb65

Properly return ipv4 addresses

34d78e7

Remove redundant code

6a1c398

1000x speed up using data cache. Now to optimize memory usage.

7b67008

Massive enumeration speed boost

19b7e80

Enumeration time for cities database in release mode is around 12 seconds now. Debug is around 20.

Add country enumeration speed test

cecf09a

Fix ipv4 being passed as ipv6

4ebbc4f

Simplify ip logic

08aa9f8

jjxtra mentioned this pull request Jul 17, 2018

Iterate all records #46

Closed

jjxtra added 3 commits July 19, 2018 16:48

Require typed enumerator

b7fb691

This allows any type to easily be iterated, at the cost of losing the convenience of a foreach loop.

Use throwaway for out param

9f977c9

Switch to IEnumerable to allow foreach

55b7f4a

oschwald requested changes Aug 3, 2018

Pull request changes requested

6611489

jjxtra added 2 commits August 3, 2018 19:10

Call out license

af56478

Add city db back

ab72d9b

jjxtra added 2 commits August 6, 2018 16:05

Switch to Apache 2.0 license

a1d3488

Remove commented out code

1ed4202

oschwald merged commit db0f4c4 into maxmind:master Aug 8, 2018

oschwald pushed a commit that referenced this pull request Jul 11, 2019

Implement database iterator (#47)

87e7ce0
