2021-10-17 Theppitak Karoonboonyanan <theppitak@gmail.com>
Check fread() result in serialization test.
* tests/test_serialization.c (main):
- Check fread() return value when reading back the serialized
trie file. (Caught by -Wunused-result when building deb.)
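The kind of check this entry describes can be sketched as follows. This is a minimal illustration, not the code in tests/test_serialization.c; the helper name read_exact() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Read exactly `size` bytes from `fp` into `buf`.
 * Returns 0 on success, -1 on a short read or I/O error. */
int
read_exact (FILE *fp, void *buf, size_t size)
{
    assert (fp != NULL && buf != NULL);

    /* fread() returns the number of items read; with item size 1 that is
     * the byte count, so anything other than `size` is a failure. */
    if (fread (buf, 1, size, fp) != size)
        return -1;
    return 0;
}
```

Ignoring the fread() result is what -Wunused-result flags; comparing it with the expected count also catches truncated files.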
2021-08-31 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix PAPER_TYPE in Doxyfile.
* doc/Doxyfile.in:
- Fix invalid PAPER_TYPE 'a4wide' to 'a4'.
2021-08-31 Theppitak Karoonboonyanan <theppitak@gmail.com>
Update Doxyfile for doxygen 1.9.1.
* doc/Doxyfile.in:
- Apply 'doxygen -u'.
* configure.ac:
- Bump doxygen required version.
2021-08-31 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix documentation typo.
* datrie/trie.h (TrieEnumFunc):
- Fix 'data' param to 'key_data' in doc comment.
2021-08-31 Theppitak Karoonboonyanan <theppitak@gmail.com>
Apply 'autoupdate' for autoconf 2.71
* configure.ac:
- Quote m4 strings in AC_INIT() parameters.
- Replace AC_CONFIG_HEADER() with AC_CONFIG_HEADERS().
- Replace AC_PROG_LIBTOOL with LT_INIT. With this, drop
AC_LIBTOOL_WIN32_DLL, as we haven't really declared
dllexport anywhere yet.
- Replace obsolete AC_LIBTOOL_LINKER_OPTION() with
_LT_LINKER_OPTION(). Also quote an m4 string.
- Replace obsolete AC_TRY_COMPILE() with AC_COMPILE_IFELSE().
Use AC_LANG_PROGRAM() as 'autoupdate' suggests.
- Drop AC_HEADER_STDC as 'autoupdate' suggests.
- Replace deprecated AC_HELP_STRING() with AS_HELP_STRING().
- Update AC_PREREQ() from 2.59 to 2.71.
2021-02-08 Theppitak Karoonboonyanan <theppitak@gmail.com>
Provide our own version of INSTALL instruction.
* INSTALL:
- Explain simple installation steps and build requirements.
This is to address a frequently found issue when
'autoconf-archive' is missing, like in issue #17, issue #18.
2021-01-29 Theppitak Karoonboonyanan <theppitak@gmail.com>
* NEWS:
=== Version 0.2.13 ===
2021-01-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Update library versioning
* configure.ac: Bump library versioning to reflect API addition.
2021-01-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Rename trie_byte_strlen() to trie_char_strsize().
The object of the function is TrieChar string. Let's keep
that semantics in the name.
* datrie/trie-string.h, datrie/trie-string.c
(trie_byte_strlen -> trie_char_strsize):
- Rename the function.
* datrie/tail.c (tail_get_serialized_size, tail_serialize):
- Replace trie_byte_strlen() calls with the new name.
2021-01-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Revise test_serialization.
* tests/test_serialization.c (-trie_enum_mark_rec):
- Drop unused callback function.
* tests/test_serialization.c (main):
- Drop unused 'is_failed' variable.
- "%Ilu" -> "%lu" printf format.
- Rearrange error handling in stack unwinding style.
- Add '\n' to printf messages.
- Free 'trieSerializedData'.
We're not moving to C99 yet, but the declaration amid code is
too useful to remove. And it's just in the test code, not in
the main source. So we still allow it.
2021-01-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Get rid of <unistd.h> include in the new test.
* tests/test_serialization.c (main):
- Replace unlink() calls with remove() from <stdio.h> and drop
<unistd.h> include.
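A minimal sketch of the replacement, assuming a scratch file that the test itself creates. The function name create_and_remove() is hypothetical and does not appear in the test source.

```c
#include <assert.h>
#include <stdio.h>

/* Create an empty scratch file, then delete it with remove().
 * Returns 0 if both steps succeed, -1 otherwise. */
int
create_and_remove (const char *path)
{
    FILE *fp;

    assert (path != NULL);
    fp = fopen (path, "wb");
    if (!fp)
        return -1;
    fclose (fp);

    /* remove() is declared in <stdio.h> by ISO C, so no <unistd.h>
     * (and hence no POSIX-only unlink()) is needed. */
    return remove (path) == 0 ? 0 : -1;
}
```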
2021-01-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Adjust file-internal function declarations.
* datrie/fileutils.c
(parse_int16_be, serialize_int32_be, serialize_int16_be):
- Re-declare functions as static.
* datrie/fileutils.c (parse_int16_be):
- Make the 'buff' arg const pointer.
* datrie/fileutils.c:
- Remove some blank lines in source.
2021-01-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix documentation.
* datrie/tail.h (Tail typedef):
- Fix comment (Double-array -> Tail).
2021-01-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Cosmetic changes.
* datrie/fileutils.c:
* datrie/trie-string.c:
* datrie/alpha-map.c:
* datrie/darray.h:
* datrie/darray.c:
* datrie/tail.h:
* datrie/tail.c:
* datrie/trie.c:
* tools/trietool.c:
* tests/test_serialization.c:
- Use space before left parenthesis.
- Use old-style C comments.
- Remove trailing spaces.
- Re-wrap lines.
2021-01-23 KOLANICH <KOLANICH@users.noreply.github.com>
Added serialization of the trie into a memory buffer.
* datrie/fileutils.c
(file_write_int32, +serialize_int32,
file_write_int16, +serialize_int16,
file_read_int32, +parse_int32_be,
file_read_int16, +parse_int16_be):
- Split binary read/write operations into separate functions.
* datrie/fileutils.h, datrie/fileutils.c
(+serialize_int32_be_incr, +serialize_int16_be_incr):
- Add serialization utility functions with pointer advancement.
* datrie/trie-string.h, datrie/trie-string.c
(+trie_byte_strlen):
- Add utility method for calculating TrieChar string size in bytes.
* datrie/alpha-map-private.h, datrie/alpha-map.c
(+alpha_map_get_serialized_size, +alpha_map_serialize_bin):
- Add AlphaMap serialization methods.
* datrie/darray.h, datrie/darray.c
(+da_get_serialized_size, +da_serialize):
- Add DArray serialization methods.
* datrie/tail.h, datrie/tail.c
(+tail_get_serialized_size, +tail_serialize):
- Add Tail serialization methods.
* datrie/trie.h, datrie/trie.c
(+trie_get_serialized_size, +trie_serialize):
- Add Trie serialization methods.
* datrie/libdatrie.map, datrie/libdatrie.def:
- Add export symbols for Trie serialization.
* tests/Makefile.am, +tests/test_serialization.c:
- Add serialization test.
Pull Request #12.
2021-01-22 Theppitak Karoonboonyanan <theppitak@gmail.com>
Get rid of <unistd.h> include.
* tests/test_file.c (main):
- Replace unlink() calls with remove() from <stdio.h> and drop
<unistd.h> include, fixing build issue on Windows.
This addresses the Windows build issue differently from what was proposed
by @fanc999 in pull request #15. Thanks @fanc999 for first raising this.
2021-01-15 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use TRIE_CHAR_TERM in TAIL I/O methods.
* datrie/tail.c (tail_fwrite):
- Replace strlen() with trie_char_strlen() on suffix,
which is TrieChar string.
* datrie/tail.c (tail_fread):
- Append TRIE_CHAR_TERM, rather than literal zero,
as suffix terminator.
2021-01-15 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use TRIE_CHAR_TERM in TrieIterator methods.
* datrie/trie.c (trie_iterator_get_key):
- Replace strlen() with trie_char_strlen() on tail_str,
which is TrieChar string.
- Check tail_str termination against TRIE_CHAR_TERM, not zero.
2021-01-14 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix wrong TRIE_CHAR_TERM semantics.
* datrie/trie.h (trie_state_is_terminal):
- Test for terminal state using zero AlphaChar.
TRIE_CHAR_TERM is a TrieChar, although it's also accidentally zero.
2021-01-14 Theppitak Karoonboonyanan <theppitak@gmail.com>
Reduce loops in alpha_map_recalc_work_area().
* datrie/alpha-map.c (alpha_map_recalc_work_area):
- Instead of pre-filling trie-to-alpha map with errors
and setting valid cells afterward, just fill error cells
left after valid cells are done. Then, finally set
TRIE_CHAR_TERM cell.
2021-01-10 Theppitak Karoonboonyanan <theppitak@gmail.com>
Rewrite AlphaMap recalc.
* datrie/alpha-map.c (alpha_map_recalc_work_area):
Rewrite alpha-to-trie & trie-to-alpha maps recalculation
- For clearer relation between the two maps
- To allow other TRIE_CHAR_TERM values than zero
2021-01-09 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use TRIE_CHAR_TERM macro instead of zero.
* datrie/alpha-map.c
(alpha_map_char_to_trie, alpha_map_char_to_trie_str):
* datrie/trie.c (trie_branch_in_branch, trie_branch_in_tail):
* datrie/tail.c (tail_walk_str, tail_walk_char):
- Use TRIE_CHAR_TERM instead of hard-wired zero when working
with raw trie string termination.
* datrie/trie-string.c (trie_string_terminate):
- Append TRIE_CHAR_TERM instead of simply delegating to
dstring_terminate().
2021-01-09 Theppitak Karoonboonyanan <theppitak@gmail.com>
Share static trie string functions internally.
* datrie/trie-string.h, datrie/trie-string.c, datrie/tail.c
(tc_strlen -> trie_char_strlen, tc_strdup -> trie_char_strdup):
- Move private functions in tail.c to trie-string.[ch],
with new full names under new "static trie string" section.
- Check/assign string terminator using TRIE_CHAR_TERM
instead of zero.
* datrie/tail.c (tail_set_suffix):
- Call the new trie_char_strdup() instead of tc_strdup().
* datrie/alpha-map.c (alpha_map_trie_to_char_str):
- Call trie_char_strlen() instead of strlen().
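A rough sketch of what terminator-aware string helpers of this kind might look like. The typedef, the macro value, and the sketch_ function names are assumptions made for illustration; the bodies are not copied from datrie/trie-string.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-ins for the library's definitions. */
typedef unsigned char TrieChar;
#define TRIE_CHAR_TERM ((TrieChar) 0)

/* Count elements before the terminator, comparing against TRIE_CHAR_TERM
 * rather than a hard-wired zero. */
size_t
sketch_trie_char_strlen (const TrieChar *str)
{
    size_t len = 0;

    assert (str != NULL);
    while (str[len] != TRIE_CHAR_TERM)
        ++len;
    return len;
}

/* Duplicate a TrieChar string; returns NULL on allocation failure.
 * The size is computed in TrieChar units, so it keeps working if
 * TrieChar is ever made wider than char. */
TrieChar *
sketch_trie_char_strdup (const TrieChar *str)
{
    size_t    n = sketch_trie_char_strlen (str) + 1;  /* include terminator */
    TrieChar *dup = (TrieChar *) malloc (n * sizeof (TrieChar));
    size_t    i;

    if (!dup)
        return NULL;
    for (i = 0; i < n; i++)
        dup[i] = str[i];
    return dup;
}
```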
2021-01-06 Theppitak Karoonboonyanan <theppitak@gmail.com>
Get rid of char semantics from TrieChar
* datrie/trie.c (trie_branch_in_branch, trie_branch_in_tail):
- Check null TrieChar with int zero instead of char.
2021-01-05 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use proper #include form in installed header.
* datrie/alpha-map.h:
- Use angle-bracket form instead of double-quote form in #include.
2021-01-05 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix some documentation.
* datrie/trie.c (trie_is_dirty):
- Adjust wording to make clear the file is out of sync,
not the other way around.
* datrie/trie.c (trie_store, trie_store_if_absent):
- Fix typo in the description of 'key' parameter.
* datrie/trie.c (trie_store_if_absent):
- Minor wording adjustment (inserted, not appended).
2020-12-30 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix isspace() arg problem on NetBSD.
* tools/trietool.c (command_add_list, string_trim):
- Cast char to unsigned char before passing to isspace().
Thanks Sean <scole_mail@gmx.com> for the report via a personal mail.
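The cast matters because plain char may be signed, and passing a negative value other than EOF to isspace() is undefined behavior. A minimal sketch of a trimming helper written that way follows; trim_spaces() is a hypothetical name and not the string_trim() in trietool.c.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing whitespace in place; returns a pointer into `s`. */
char *
trim_spaces (char *s)
{
    char *end;

    assert (s != NULL);
    /* Cast to unsigned char so bytes >= 0x80 are never passed to
     * isspace() as negative values. */
    while (*s && isspace ((unsigned char) *s))
        ++s;

    end = s + strlen (s);
    while (end > s && isspace ((unsigned char) end[-1]))
        --end;
    *end = '\0';
    return s;
}
```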
2019-12-20 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use GitHub issue tracker as bug report address.
* configure.ac:
- Replace bug report e-mail address with GitHub issue tracker URL.
2019-08-05 Theppitak Karoonboonyanan <theppitak@gmail.com>
Stop installing README.migration
It's supposed to be an internal document now.
* Makefile.am:
- Remove README.migration from doc_DATA.
2019-01-21 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix cross-compiling issue caused by AC_FUNC_MALLOC
* configure.ac:
- Replace AC_FUNC_MALLOC with AC_CHECK_FUNCS([malloc]),
as we don't rely on GNU's malloc(0) behavior.
Thanks Vanessa McHale for the report. Closes: #11
2018-11-23 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix wrong key listing in byte trie
* tests/Makefile.am, +tests/test_byte_list.c:
- Add test case
* datrie/alpha-map.c (alpha_map_recalc_work_area):
- Index trie_to_alpha_map[] using TrieIndex, not TrieChar type,
to prevent overflow upon incrementing over 0xff.
- Drop tc variable and just reuse trie_last.
Thanks @legale for the report.
Closes: #9
https://github.com/tlwg/libdatrie/issues/9
2018-06-19 Theppitak Karoonboonyanan <theppitak@gmail.com>
* configure.ac:
- Bump library revision to reflect code changes.
* NEWS:
=== Version 0.2.12 ===
2018-06-19 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use HTTPS in URL
* README:
- Update document URL to HTTPS
2018-06-14 Theppitak Karoonboonyanan <theppitak@gmail.com>
Cast (wchar_t *) to fix warnings in tests
"%ls" printf() format requires (wchar_t *) [aka int *] arg.
So, let's cast (AlphaChar *) [aka unsigned int *] to satisfy it.
* tests/test_walk.c:
* tests/test_iterator.c:
* tests/test_store-retrieve.c:
* tests/test_file.c:
* tests/test_nonalpha.c:
* tests/test_null_trie.c:
- Add <wchar.h> include, for wchar_t type
- Cast "%ls" args from (AlphaChar *) to (wchar_t *)
2018-06-06 Theppitak Karoonboonyanan <theppitak@gmail.com>
Avoid non-ANSI C snprintf()
* tools/trietool.c (+full_path, prepare_trie, close_trie):
- Instead of preparing full path name with snprintf(), which is
non-ANSI, and still risks path name trimming, do it with
size-calculated malloc().
- free() it as needed.
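The size-calculated approach can be sketched as follows. The function name join_path() and the "<dir>/<name><ext>" layout are assumptions for illustration, not the full_path() helper itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build "<dir>/<name><ext>" in a buffer sized from its parts.
 * Returns a malloc()ed string the caller must free(), or NULL on failure. */
char *
join_path (const char *dir, const char *name, const char *ext)
{
    size_t len;
    char  *path;

    assert (dir && name && ext);
    /* +1 for the '/' separator, +1 for the terminating NUL. */
    len = strlen (dir) + 1 + strlen (name) + strlen (ext) + 1;
    path = (char *) malloc (len);
    if (!path)
        return NULL;

    strcpy (path, dir);
    strcat (path, "/");
    strcat (path, name);
    strcat (path, ext);
    return path;
}
```

Because the buffer is sized from the actual parts, the result can never be silently truncated the way a fixed-size snprintf() buffer can.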
2018-06-04 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix sscanf() string format
* tools/trietool.c (prepare_trie):
- Define b, e as unsigned int, as required by "%x" format.
Fixing warning from '-Wformat=' gcc option.
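A small sketch of the type requirement. The "[begin,end]" line format and the name parse_hex_range() are assumptions for illustration; only the "%x" rule itself comes from the C standard.

```c
#include <assert.h>
#include <stdio.h>

/* Parse a "[begin,end]" hexadecimal range line.
 * Returns 1 on success, 0 if the line does not match. */
int
parse_hex_range (const char *line, unsigned int *begin, unsigned int *end)
{
    assert (line && begin && end);
    /* "%x" stores through an unsigned int *, so the targets must be
     * unsigned int; passing int * is what -Wformat complains about. */
    return sscanf (line, " [ %x , %x ] ", begin, end) == 2;
}
```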
2018-06-04 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix compiler warnings in tests
* tests/test_byte_alpha.c (main):
* tests/test_file.c (main):
* tests/test_iterator.c (main):
* tests/test_nonalpha.c (main):
* tests/test_null_trie.c (main):
* tests/test_store-retrieve.c (main):
* tests/test_term_state.c (main):
* tests/test_walk.c (main):
- Declare main function with 'main (void)'.
Fixing warning from '-Wstrict-prototypes' gcc option.
* tests/test_walk.c (main):
- Split long string, which exceeded the length C90 compilers are
required to support.
Fixing warning from '-Woverlength-strings' gcc option.
2018-06-04 Theppitak Karoonboonyanan <theppitak@gmail.com>
Duplicate TrieChar string in more portable manner
* datrie/tail.c (tail_set_suffix, +tc_strdup, +tc_strlen):
- Replace cast strdup() with crafted implementation,
allowing TrieChar to be of larger size than char.
Fixing warning from '-Wint-to-pointer-cast' gcc option.
2018-06-04 Theppitak Karoonboonyanan <theppitak@gmail.com>
Split long string
* tools/trietool.c (usage):
- Split help message, which exceeded the string length C90 compilers
are required to support.
Caught by '-Woverlength-strings' gcc option.
2018-06-04 Theppitak Karoonboonyanan <theppitak@gmail.com>
Remove unused byte, word, dword typedefs
These are likely to conflict with other uses.
* datrie/typedefs.h (-byte, -word, -dword):
- Remove the unused typedefs
Thanks Peter Moulder for the patch.
2018-06-04 Theppitak Karoonboonyanan <theppitak@gmail.com>
Rename TRUE/FALSE in Bool enum to avoid clash
Some other header file may have already defined TRUE/FALSE.
* datrie/typedefs.h (Bool):
- Rename FALSE, TRUE to DA_FALSE, DA_TRUE respectively,
and define FALSE, TRUE macros only if they haven't been defined.
Thanks Peter Moulder for the patch.
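The pattern described here might look roughly like this. It is a sketch of the technique rather than a copy of datrie/typedefs.h, and is_even() is a hypothetical function added only so the type has a user.

```c
/* Prefixed enumerators cannot collide with macros from other headers. */
typedef enum {
    DA_FALSE = 0,
    DA_TRUE  = 1
} Bool;

/* Offer the short names only when nobody else has claimed them. */
#ifndef FALSE
# define FALSE DA_FALSE
#endif
#ifndef TRUE
# define TRUE DA_TRUE
#endif

/* Small user of the type, so the sketch has something to call. */
Bool
is_even (int n)
{
    return (n % 2 == 0) ? DA_TRUE : DA_FALSE;
}
```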
2018-06-04 Theppitak Karoonboonyanan <theppitak@gmail.com>
Declare argument-less functions with "(void)"
"f()" declaration form is K&R style, specifying that no information
about the number or types of parameters is supplied. This caused
warnings on '-Wstrict-prototypes' gcc option.
* datrie/alpha-map.h, datrie/alpha-map.c (alpha_map_new):
* datrie/darray.h, datrie/darray.c (da_new, symbols_new):
* datrie/tail.h, datrie/tail.c (tail_new):
* tests/utils.h, tests/utils.c (en_alpha_map_new, en_trie_new):
- Use "(void)" form in declaration
- Also use "(void)" form in definition, for consistency
Thanks Peter Moulder for the initial patch.
2018-05-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
Remove duplicate include
* tools/trietool.c:
- Remove duplicate include <config.h>
2018-04-23 Theppitak Karoonboonyanan <theppitak@gmail.com>
Add missing include in test
* tests/test_byte_alpha.c:
- Add missing include for utils.h
2018-04-23 Theppitak Karoonboonyanan <theppitak@gmail.com>
* configure.ac:
- Bump library revision to reflect code changes.
* NEWS:
=== Version 0.2.11 ===
2018-04-21 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix reported segfault on full-range alpha map
* tests/Makefile.am, +tests/test_byte_alpha.c:
- Add test case
* datrie/alpha-map.c (alpha_map_recalc_work_area()):
- Redeclare trie_last as TrieIndex, to prevent overflow.
Thanks Xiao Wang for the report, and @nevermatch for the analysis.
Closes: #6
https://github.com/tlwg/libdatrie/issues/6
2018-03-29 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix trie_state_get_data() at a prefix key
When getting data from a state which terminates a key that is
a prefix of another key, a terminator should be tried, so it
jumps from DA to TAIL, where we can get the data.
* tests/Makefile.am, +tests/test_term_state.c:
- Add a test case with {'ab', 'abc'} dictionary, on which the previous
code fails when retrieving data for key 'ab'.
* datrie/trie.c (trie_state_get_data()):
- Instead of simply checking for leaf state, which only caught a state
in TAIL, also try walking with a terminator when still in DA.
- Replace 'leaf state' with 'terminal state' in documentation,
for more clarity.
- Also return error on null state pointer.
Thanks Filip Pytloun from the pytries project for the initial patch.
2017-09-06 Theppitak Karoonboonyanan <theppitak@gmail.com>
Revise description about search time complexity
* README:
- Clarify that search time is O(m), where m is the key length,
instead of O(1), while still stating that it's independent of
database size.
This closes #4.
https://github.com/tlwg/libdatrie/issues/4
2016-12-14 Theppitak Karoonboonyanan <theppitak@gmail.com>
Include git-version-gen in tarball
* Makefile.am:
- Add build-aux/git-version-gen to EXTRA_DIST.
2016-09-21 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix iconv() return value checking.
* tools/trietool.c (conv_to_alpha):
- Check iconv() return value against (size_t) -1, rather than
for its negativity, as size_t is an unsigned type.
Thanks Daniel Macks for the report on Issue #3.
https://github.com/tlwg/libdatrie/issues/3
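The comparison this entry refers to can be sketched in isolation. The helper iconv_call_failed() is hypothetical; it only shows how the value returned by iconv() should be tested.

```c
#include <stddef.h>

/* iconv() reports failure by returning (size_t) -1.  Because size_t is
 * unsigned, a test such as `rc < 0` is always false and hides the error. */
int
iconv_call_failed (size_t rc)
{
    return rc == (size_t) -1;
}
```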
2016-09-21 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use versioning based on Git snapshot.
* Makefile.am:
- Add dist-hook to generate VERSION file on tarball generation.
* +build-aux/git-version-gen:
- Add script to generate version based on 'git describe'
if in git tree, or using VERSION file if in release tarball.
* configure.ac:
- Call git-version-gen to get package version.
2015-10-20 Theppitak Karoonboonyanan <theppitak@gmail.com>
* configure.ac:
- Bump library revision to reflect code changes.
* NEWS, configure.ac:
=== Version 0.2.10 ===
2015-10-13 Theppitak Karoonboonyanan <theppitak@gmail.com>
Optimize AlphaMap mapping.
alpha_map_char_to_trie() is called everywhere before trie state
transition. It's an important bottleneck.
We won't change the persistent AlphaMap structure, but will add
pre-calculated lookup tables for use at run-time.
* datrie/alpha-map.c (struct _AlphaMap):
- Add members for alpha-to-trie and trie-to-alpha lookup tables.
* datrie/alpha-map.c (alpha_map_new, alpha_map_free):
- Initialize & free the tables properly.
* datrie/alpha-map.c (alpha_map_add_range -> alpha_map_add_range_only
+ alpha_map_recalc_work_area):
- Split alpha_map_add_range() API into two parts: adding the range
as usual and recalculate the lookup tables.
* datrie/alpha-map.c (alpha_map_clone, alpha_map_fread_bin):
- Call alpha_map_add_range_only() repeatedly before calling
alpha_map_recalc_work_area() once.
* datrie/alpha-map.c (alpha_map_char_to_trie, alpha_map_trie_to_char):
- Look up the pre-calculated tables instead of calculating on
every call.
This appears to cut the total time spent in alpha_map_char_to_trie()
calls by 14.6% and lowers its bottleneck rank by one on a libthai
test case. Total run time of that libthai test case drops by 0.2%.
Note that the time saved would be even greater with multiple
non-contiguous alphabet ranges, at the expense of more memory used.
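The idea of trading memory for a precalculated lookup table can be sketched as follows. All type and function names below are hypothetical stand-ins, the error sentinel is an assumption, and the sketch assumes sorted, non-overlapping ranges whose end is below UINT_MAX. It is not the AlphaMap implementation.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int AlphaChar;   /* assumed stand-in */
typedef int          TrieIndex;   /* assumed stand-in, wider than a trie char */
#define SKETCH_INDEX_ERR (-1)     /* hypothetical out-of-range marker */

typedef struct {
    AlphaChar begin, end;          /* inclusive range */
} Range;

typedef struct {
    AlphaChar  base;               /* smallest mapped character */
    size_t     size;               /* number of table slots */
    TrieIndex *alpha_to_trie;      /* slot i holds the code for base + i */
} LookupTable;

/* Build the table once from sorted, non-overlapping ranges.
 * Code 0 is reserved for the string terminator, so codes start at 1. */
int
lookup_table_build (LookupTable *t, const Range *ranges, size_t n)
{
    size_t    i, r;
    TrieIndex code = 1;
    AlphaChar a;

    assert (t != NULL && ranges != NULL && n > 0);
    t->base = ranges[0].begin;
    t->size = (size_t) (ranges[n - 1].end - t->base) + 1;
    t->alpha_to_trie = (TrieIndex *) malloc (t->size * sizeof (TrieIndex));
    if (!t->alpha_to_trie)
        return -1;

    for (i = 0; i < t->size; i++)
        t->alpha_to_trie[i] = SKETCH_INDEX_ERR;   /* gaps between ranges */
    for (r = 0; r < n; r++)
        for (a = ranges[r].begin; a <= ranges[r].end; a++)
            t->alpha_to_trie[a - t->base] = code++;
    return 0;
}

/* Constant-time lookup: one bounds check and one array read. */
TrieIndex
lookup_table_map (const LookupTable *t, AlphaChar ac)
{
    if (ac == 0)
        return 0;                                  /* terminator */
    if (ac < t->base || ac - t->base >= t->size)
        return SKETCH_INDEX_ERR;
    return t->alpha_to_trie[ac - t->base];
}
```

Without the table, each lookup has to walk the range list and accumulate offsets; with it, the cost no longer depends on how many ranges the alphabet has, which is why the gain grows with non-contiguous ranges.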
2015-08-18 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix doxygen version checking.
* configure.ac:
- Correctly compare doxygen versions. Simple expr comparison
didn't work with version 1.8.10.
Thanks Petr Gajdos <pgajdos@suse.cz> for the patch.
2015-06-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
* datrie/tail.c (tail_set_suffix):
- Catch strdup() failure.
2015-06-24 Theppitak Karoonboonyanan <theppitak@gmail.com>
* configure.ac: Post-release version suffix added.
2015-05-03 Theppitak Karoonboonyanan <theppitak@gmail.com>
* NEWS, configure.ac:
=== Version 0.2.9 ===
2015-05-03 Theppitak Karoonboonyanan <theppitak@gmail.com>
Use relative paths for symlinks.
* tools/Makefile.am, man/Makefile.am:
- Use relative paths for symlinks to avoid confusion in
installation with DESTDIR.
2015-05-03 Theppitak Karoonboonyanan <theppitak@gmail.com>
Also install symlink for old trietool.
* man/Makefile.am:
- Add hooks to install/uninstall symlink for old man page.
2015-05-02 Theppitak Karoonboonyanan <theppitak@gmail.com>
* configure.ac:
- Bump library revision to reflect code changes.
2015-04-29 Theppitak Karoonboonyanan <theppitak@gmail.com>
Bump doxygen required version.
* configure.ac:
- Bump doxygen required version to 1.8.8, according to recent
Doxyfile update.
2015-04-21 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix infinite loop on empty trie iteration.
* tests/Makefile.am, +tests/test_null_trie.c:
- Add test case for empty trie iteration.
* datrie/darray.c (da_first_separate):
- Fix error condition after loop ending.
Thanks Sergei Lebedev <sergei.a.lebedev@gmail.com> for the report
via personal mail.
Original report: https://github.com/kmike/datrie/issues/17
2015-04-12 Theppitak Karoonboonyanan <theppitak@gmail.com>
Document about alphabet size.
* datrie/trie.h:
- Add to doc comment a description on the alphabet size limit
and the mapped raw codes.
Thanks edgehogapp for the suggestion.
https://groups.google.com/forum/#!topic/thai-linux-foss-devel/U-O__IfviQ0
2015-04-11 Theppitak Karoonboonyanan <theppitak@gmail.com>
Clarify Symbols' struct & methods.
* datrie/darray.c (struct _Symbols):
- Use TRIE_CHAR_MAX + 1 instead of hard-coded value for symbols[]
array size.
Thanks edgehogapp for the suggestion.
https://groups.google.com/forum/#!topic/thai-linux-foss-devel/U-O__IfviQ0
* datrie/darray.h, datrie/darray.c (symbols_new, symbols_add):
- Hide symbols_new() and symbols_add() for internal use.
2015-03-06 Theppitak Karoonboonyanan <theppitak@gmail.com>
Update Doxyfile.
* doc/Doxyfile.in:
- Updated for doxygen 1.8.8 with 'doxygen -u'.
2015-03-02 Theppitak Karoonboonyanan <theppitak@gmail.com>
Catch realloc failure.
* datrie/tail.c (tail_alloc_block):
- Check realloc() result on t->tails reallocation and return
failure code if failed.
* datrie/tail.c (tail_add_suffix):
- Check return value from tail_alloc_block() and return failure
code if failed.
- Update documentation.
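A generic sketch of the realloc() check described in this entry. IntPool and int_pool_grow() are hypothetical and only illustrate the pattern; they are not the Tail block allocator.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int    *items;
    size_t  count;
} IntPool;

/* Grow the pool to hold `new_count` items.
 * Returns 0 on success; on failure returns -1 and leaves the pool intact. */
int
int_pool_grow (IntPool *pool, size_t new_count)
{
    int *new_items;

    assert (pool != NULL && new_count > pool->count);
    /* Keep the old pointer until realloc() succeeds: on failure it returns
     * NULL and the original block is still valid and still owned by us. */
    new_items = (int *) realloc (pool->items, new_count * sizeof (int));
    if (!new_items)
        return -1;

    pool->items = new_items;
    pool->count = new_count;
    return 0;
}
```

Returning a failure code lets the caller propagate the error, which is the second half of the change noted above.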
2015-03-02 Theppitak Karoonboonyanan <theppitak@gmail.com>
Catch realloc failure.
* datrie/darray.c (da_extend_pool):
- Check realloc() result on d->cells reallocation and handle
failure properly.
2015-02-27 Theppitak Karoonboonyanan <theppitak@gmail.com>
Catch malloc failure.
* datrie/tail.c (tail_fread):
- Check malloc() result on suffix string and exit properly.
2015-02-26 Theppitak Karoonboonyanan <theppitak@gmail.com>
More micro-optimization with LIKELY/UNLIKELY.
* datrie/alpha-map.c (alpha_map_char_to_trie, alpha_map_trie_to_char):
- Use UNLIKELY() when checking for NUL character.
2015-02-10 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix 'make distcheck' failure.
* doc/Makefile.am:
- Remove doxygen db file on clean.
2015-02-10 Theppitak Karoonboonyanan <theppitak@gmail.com>
More update of my e-mail address.
* man/trietool.1:
- Update my e-mail address.
2015-02-10 Theppitak Karoonboonyanan <theppitak@gmail.com>
Rename trietool-0.2 utility to trietool.
* configure.ac:
- Check for ln -s
* tools/Makefile.am:
- Rename bin target from trietool-0.2 to trietool.
- Add hooks to install/uninstall symlink with old name.
* man/Makefile.am, man/trietool-0.2.1 -> man/trietool.1:
- Rename & update manpage accordingly.
2015-02-06 Theppitak Karoonboonyanan <theppitak@gmail.com>
Micro-optimize with likely/unlikely hints.
* datrie/trie-private.h:
- Add LIKELY() and UNLIKELY() macros based on compiler extension.
* datrie/alpha-map.c
(alpha_map_new, alpha_map_clone, alpha_map_fread_bin,
alpha_map_add_range, alpha_map_char_to_trie_str,
alpha_map_trie_to_char_str):
* datrie/darray.c
(symbols_new, da_new, da_fread, da_get_base, da_get_check,
da_set_base, da_set_check, da_insert_branch, da_find_free_base,
da_extend_pool):
* datrie/dstring.c (dstring_new, dstring_ensure_space):
* datrie/tail.c
(tail_new, tail_fread, tail_get_suffix, tail_set_suffix,
tail_get_data, tail_set_data, tail_walk_str, tail_walk_char):
* datrie/trie.c
(trie_new, trie_fread, trie_enumerate, trie_state_new,
trie_state_walk, trie_state_is_walkable, trie_iterator_new):
- Use LIKELY() and UNLIKELY() where it is known to be so, mostly
for one-time initialization and failure handling.
* datrie/alpha-map.c, datrie/tail.c, datrie/tail.c:
- These are the files that need to include trie-private.h
because of this.
Callgrind says it does help speed up a little bit.
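Macros of this kind are commonly built on the GCC/Clang __builtin_expect extension. The sketch below shows one plausible definition and a failure-path use; the exact guard in trie-private.h may differ, and sum_ints() is a hypothetical example function.

```c
#include <stddef.h>

/* Branch hints via __builtin_expect; on other compilers the macros
 * collapse to the bare expression. */
#if defined(__GNUC__) && (__GNUC__ >= 3)
# define LIKELY(expr)   (__builtin_expect (!!(expr), 1))
# define UNLIKELY(expr) (__builtin_expect (!!(expr), 0))
#else
# define LIKELY(expr)   (expr)
# define UNLIKELY(expr) (expr)
#endif

/* Sum an array; the NULL check is a failure path that should almost
 * never be taken, so it is marked UNLIKELY. */
long
sum_ints (const int *v, size_t n)
{
    long   total = 0;
    size_t i;

    if (UNLIKELY (v == NULL))
        return -1;
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```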
2015-02-05 Theppitak Karoonboonyanan <theppitak@gmail.com>
Disable timestamp in Doxygen-generated doc.
* doc/Doxyfile.in:
- Set HTML_TIMESTAMP to NO to make the document reproducible.
(reported by Debian Reproducible)
2015-02-01 Theppitak Karoonboonyanan <theppitak@gmail.com>
* configure.ac: [Belated] post-release version suffix added.
2015-02-01 Theppitak Karoonboonyanan <theppitak@gmail.com>
Update my e-mail address everywhere.
* AUTHORS, configure.ac, datrie/*.[ch], tests/*.[ch],
tools/trietool.c:
- Replace all mentions of my e-mail address with the gmail one.
2015-02-01 Theppitak Karoonboonyanan <theppitak@gmail.com>
Fix binary file opening on Windows.
* datrie/trie.c (trie_new_from_file, trie_save):
- Add "b" to fopen() modes, so the binary file is opened properly
on Windows.
Thanks phongphan.p for the report and initial patch.
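A minimal sketch of why the "b" flag matters, using a hypothetical round-trip helper (roundtrip_binary() is not part of the library). On POSIX systems the flag has no effect, so the sketch behaves the same there.

```c
#include <assert.h>
#include <stdio.h>

/* Write `len` bytes to `path` and read them back, both in binary mode.
 * Returns the number of bytes read back, or -1 if the file cannot be
 * opened or fully written. */
long
roundtrip_binary (const char *path, const unsigned char *data, size_t len,
                  unsigned char *out)
{
    FILE  *fp;
    size_t n;

    assert (path && data && out);
    /* "wb"/"rb": without the 'b', Windows text mode translates "\n" to
     * "\r\n" on write and treats 0x1A as end-of-file on read. */
    fp = fopen (path, "wb");
    if (!fp)
        return -1;
    n = fwrite (data, 1, len, fp);
    fclose (fp);
    if (n != len)
        return -1;

    fp = fopen (path, "rb");
    if (!fp)
        return -1;
    n = fread (out, 1, len, fp);
    fclose (fp);
    return (long) n;
}
```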
2014-01-10 Theppitak Karoonboonyanan <thep@linux.thai.net>
* configure.ac:
- Bump library revision to reflect code changes.
* NEWS, configure.ac:
=== Version 0.2.8 ===
2014-01-09 Theppitak Karoonboonyanan <thep@linux.thai.net>
Improve documentation.
* datrie/triedefs.h:
- Refine descriptions of data types.
* datrie/trie.c (trie_iterator_new):
- Fix typo in the mention of trie_root().
* datrie/trie.c (trie_store, trie_store_if_absent):
- Adjust wording.
* datrie/alpha-map.h, datrie/trie.h:
- Add detailed description of AlphaMap and Trie types.
2014-01-08 Theppitak Karoonboonyanan <thep@linux.thai.net>
Clarify message in test_nonalpha.
* tests/test_nonalpha.c (main):
- Clarify message on false key duplication.
2014-01-08 Theppitak Karoonboonyanan <thep@linux.thai.net>
Add test on keys with non-alphabet input chars.
* tests/Makefile.am, +tests/test_nonalpha.c:
- Add test to ensure that operations on keys with non-alphabet
input chars fail.
2014-01-08 Theppitak Karoonboonyanan <thep@linux.thai.net>
Fail trie operations on non-alphabet inputs.
alpha_map_char_to_trie() tried to return TRIE_CHAR_MAX to indicate
out-of-range error. But this value is indeed valid in trie operations.
Doing so could allow false key duplication when different non-alphabet
chars and TRIE_CHAR_MAX itself were all mapped to TRIE_CHAR_MAX.
So, let's fail all trie operations on non-alphabet input chars.
* datrie/alpha-map-private.h, datrie/alpha-map.c
(alpha_map_char_to_trie):
- Make alpha_map_char_to_trie return TrieIndex type, using
TRIE_INDEX_MAX to indicate out-of-range error.
This allows TRIE_CHAR_MAX to be returned as a valid output.
* datrie/alpha-map.c (alpha_map_char_to_trie_str):
- Fail if alpha_map_char_to_trie() returns error code.
* datrie/trie.c (trie_retrieve, trie_store_conditionally, trie_delete,
trie_state_walk, trie_state_is_walkable):
- Check return value from alpha_map_char_to_trie() and return
failure status on error.
- Also cast TrieIndex return values to TrieChar on function calls.
Thanks Naoki Youshinaga for the suggestion.
2014-01-07 Theppitak Karoonboonyanan <thep@linux.thai.net>
Check for NULL result from AlphaMap string funcs.
* datrie/trie.c (trie_store_conditionally):
- Return failure on NULL alpha_map_char_to_trie_str().
2014-01-07 Theppitak Karoonboonyanan <thep@linux.thai.net>
Return NULL on allocation errors in AlphaMap funcs.
* datrie/alpha-map.c
(alpha_map_char_to_trie_str, alpha_map_trie_to_char_str):
- Return NULL on malloc() error.
2014-01-03 Theppitak Karoonboonyanan <thep@linux.thai.net>
Fix edge case with TRIE_CHAR_MAX as TrieChar.
The trie input char with value TRIE_CHAR_MAX (255), was always
skipped by double-array algorithms. Let's include it.
* datrie/darray.c (da_has_children, da_output_symbols,
da_relocate_base, da_first_separate, da_next_separate):
- Include the last char in trie char iterations.
* datrie/darray.c (da_first_separate, da_next_separate):
- Declare characters as TrieIndex type instead of TrieChar,
to prevent infinite loop due to unsigned char overflow.
Thanks Naoki Youshinaga for the report, test case, and analysis.
2013-10-25 Theppitak Karoonboonyanan <thep@linux.thai.net>
Fix compiler warnings in tests.
* tests/test_walk.c (main):
- Remove unused var i;
- Remove extra printf() args.
* tests/test_iterator.c:
- Add missing #include for free().
* tests/test_walk.c (walk_dict), tests/utils.c (dict_src):
- Cast string literals to (AlphaChar *) to fix signedness
differences.
2013-10-25 Theppitak Karoonboonyanan <thep@linux.thai.net>
* configure.ac: Post-release version suffix added.
2013-10-22 Theppitak Karoonboonyanan <thep@linux.thai.net>
* NEWS, configure.ac:
=== Version 0.2.7.1 ===
2013-10-21 Theppitak Karoonboonyanan <thep@linux.thai.net>
* configure.ac: Bump library versioning to reflect API addition.
(Change missing in previous release)
2013-10-21 Theppitak Karoonboonyanan <thep@linux.thai.net>
* configure.ac: Post-release version suffix added.
2013-10-21 Theppitak Karoonboonyanan <thep@linux.thai.net>
* NEWS, configure.ac:
=== Version 0.2.7 ===
2013-10-21 Theppitak Karoonboonyanan <thep@linux.thai.net>
Add missing distributed file.
* tests/Makefile.am:
- Add utils.h to distribution.
2013-10-20 Theppitak Karoonboonyanan <thep@linux.thai.net>
Reorder tests from primitive to applied.
* tests/Makefile.am:
- Test walk & iterator before store-retrieve & file.
2013-10-20 Theppitak Karoonboonyanan <thep@linux.thai.net>
Write a test suite for trie walk.
* tests/test_walk.c:
- Write test code.
2013-10-18 Theppitak Karoonboonyanan <thep@linux.thai.net>
Write a test suite for trie store/retrieval.
* tests/utils.h, tests/utils.c (+dict_src_n_entries):
- Add function to get total entries in dict_src[].
* tests/test_store-retrieve.c (main):
- Write test code.
2013-10-18 Theppitak Karoonboonyanan <thep@linux.thai.net>
Fix messages in test_iterator.
* tests/test_iterator.c (main):
- s/file/trie/. No file is written or read in this test.
2013-10-18 Theppitak Karoonboonyanan <thep@linux.thai.net>
Skip further iteration tests if key is NULL.
* tests/test_iterator.c (main):
- Insert 'continue' if trie_iterator_get_key() returns NULL.