Version 0.45.1
--------------
This patch release addresses some regressions reported in the 0.45.0 release and
adds support for NumPy 1.17:
* PR #4325: accept scalar/0d-arrays
* PR #4338: Fix #4299. Parfors reduction vars not deleted.
* PR #4350: Use process level locks for fork() only.
* PR #4354: Try to fix #4352.
* PR #4357: Fix np1.17 isnan, isinf, isfinite ufuncs
* PR #4363: Fix np.interp for np1.17 nan handling
* PR #4371: Fix np1.17 random function non-aliasing
Contributors:
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
* Valentin Haenel (core dev)
Version 0.45.0
--------------
In this release, Numba gained an experimental :ref:`numba.typed.List
<feature-typed-list>` container as a future replacement of the :ref:`reflected
list <feature-reflected-list>`. In addition, functions decorated with
``parallel=True`` can now be cached to reduce compilation overhead associated
with the auto-parallelization.
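As a rough illustration of the typed list mentioned above, the sketch below builds a ``numba.typed.List`` in the interpreter and passes it to a jitted function. The function names are hypothetical, and the ``except ImportError`` branch is an assumption added here so that the snippet also runs as plain Python when Numba is not installed.

```python
try:
    from numba import njit
    from numba.typed import List
except ImportError:
    # Assumption: without Numba, use a plain list and a no-op decorator
    # so the sketch still runs as ordinary Python.
    List = list

    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


def make_squares(n):
    """Build a typed list; the element type is inferred from the first append."""
    squares = List()
    for i in range(n):
        squares.append(i * i)
    return squares


@njit
def total(values):
    """Sum the items of a (typed) list inside compiled code."""
    acc = 0
    for v in values:
        acc += v
    return acc


print(total(make_squares(5)))
```

Caching a parallel function would typically combine both options, as in ``@njit(parallel=True, cache=True)``. That form is left out of the runnable sketch because, as far as we are aware, the on-disk cache needs the function to be defined in a real source file.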
Enhancements from user contributed PRs (with thanks!):
* James Bourbeau added the Numba version to reportable error messages in #4227,
added the ``signature`` parameter to ``inspect_types`` in #4200, improved the
docstring of ``normalize_signature`` in #4205, and fixed #3658 by adding
reference counting to ``register_dispatcher`` in #4254.
* Guilherme Leobas implemented the dominator tree and dominance frontier
algorithms in #4216 and #4149, respectively.
* Nick White fixed the issue with ``round`` in the CUDA target in #4137.
* Joshua Adelman added support for determining if a value is in a ``range``
(i.e. ``x in range(...)``) in #4129, and added windowing functions
(``np.bartlett``, ``np.hamming``, ``np.blackman``, ``np.hanning``,
``np.kaiser``) from NumPy in #4076.
* Lucio Fernandez-Arjona added support for ``np.select`` in #4077.
* Rob Ennis added support for ``np.flatnonzero`` in #4157.
* Keith Kraus extended the ``__cuda_array_interface__`` with an optional mask
attribute in #4199.
* Gregory R. Lee replaced deprecated use of ``inspect.getargspec`` in #4311.
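For readers unfamiliar with the dominator tree and dominance frontier mentioned in the list above, the following is a minimal, generic sketch in plain Python. It is not Numba's implementation: the graph encoding (a dict mapping each node to its successors), the function names, and the simple iterative algorithm are all assumptions chosen for brevity, and every node is assumed to be reachable from the entry.

```python
def predecessors(cfg):
    """Invert a {node: [successors]} mapping; every node must be a key."""
    preds = {node: set() for node in cfg}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)
    return preds


def dominators(cfg, entry):
    """Return {node: set of nodes that dominate it} by fixed-point iteration."""
    nodes = set(cfg)
    preds = predecessors(cfg)
    dom = {node: set(nodes) for node in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for node in nodes - {entry}:
            incoming = [dom[p] for p in preds[node]]
            new = {node} | (set.intersection(*incoming) if incoming else set())
            if new != dom[node]:
                dom[node] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """Pick, for each node, its closest strict dominator."""
    idom = {}
    for node, doms in dom.items():
        if node == entry:
            continue
        strict = doms - {node}
        # Dominators of a node form a chain, so the closest one is the
        # strict dominator with the largest dominator set of its own.
        idom[node] = max(strict, key=lambda d: len(dom[d]))
    return idom


def dominance_frontier(cfg, idom):
    """Walk up from each predecessor of a join node until its idom is reached."""
    preds = predecessors(cfg)
    frontier = {node: set() for node in cfg}
    for node in cfg:
        if len(preds[node]) < 2:
            continue
        stop = idom.get(node)
        for pred in preds[node]:
            runner = pred
            while runner is not None and runner != stop:
                frontier[runner].add(node)
                runner = idom.get(runner)
    return frontier


# A diamond: A branches to B and C, which rejoin at D before reaching E.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
dom = dominators(cfg, "A")
idom = immediate_dominators(dom, "A")
df = dominance_frontier(cfg, idom)
```

In this example ``D`` is in the dominance frontier of both ``B`` and ``C``, which is where an SSA construction pass would place a phi node for variables assigned in either branch.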
General Enhancements:
* PR #4328: Replace GC macro with function call
* PR #4311: Avoid deprecated use of inspect.getargspec
* PR #4296: Slacken window function testing tol on ppc64le
* PR #4254: Add reference counting to register_dispatcher
* PR #4239: Support len() of multi-dim arrays in array analysis
* PR #4234: Raise informative error for np.kron array order
* PR #4232: Add unicodetype db, low level str functions and examples.
* PR #4229: Make hashing cacheable
* PR #4227: Include numba version in reportable error message
* PR #4216: Add dominator tree
* PR #4200: Add signature parameter to inspect_types
* PR #4196: Catch missing imports of internal functions.
* PR #4180: Update use of unlowerable global message.
* PR #4166: Add tests for PR #4149
* PR #4157: Support for np.flatnonzero
* PR #4149: Implement dominance frontier for SSA for the Numba IR
* PR #4148: Call branch pruning in inline_closure_call()
* PR #4132: Reduce usage of inttoptr
* PR #4129: Support contains for range
* PR #4112: better error messages for np.transpose and tuples
* PR #4110: Add range attrs, start, stop, step
* PR #4077: Add np select
* PR #4076: Add numpy windowing functions support (np.bartlett, np.hamming,
np.blackman, np.hanning, np.kaiser)
* PR #4095: Support ir.Global/FreeVar in find_const()
* PR #3691: Make TypingError abort compiling earlier
* PR #3646: Log internal errors encountered in typeinfer
Fixes:
* PR #4303: Work around scipy bug 10206
* PR #4302: Fix flake8 issue on master
* PR #4301: Fix integer literal bug in np.select impl
* PR #4291: Fix pickling of jitclass type
* PR #4262: Resolves #4251 - Fix bug in reshape analysis.
* PR #4233: Fixes issue revealed by #4215
* PR #4224: Fix #4223. Looplifting error due to StaticSetItem in objectmode
* PR #4222: Fix bad python path.
* PR #4178: Fix unary operator overload, check with unicode impl
* PR #4173: Fix return type in np.bincount with weights
* PR #4153: Fix slice shape assignment in array analysis
* PR #4152: fix status check in dict lookup
* PR #4145: Use callable instead of checking __module__
* PR #4118: Fix inline assembly support on CPU.
* PR #4088: Resolves #4075 - parfors array_analysis bug.
* PR #4085: Resolves #3314 - parfors array_analysis bug with reshape.
CUDA Enhancements/Fixes:
* PR #4199: Extend `__cuda_array_interface__` with optional mask attribute,
bump version to 1
* PR #4137: CUDA - Fix round Builtin
* PR #4114: Support 3rd party activated CUDA context
Documentation Updates:
* PR #4317: Add docs for ARMv8/AArch64
* PR #4318: Add supported platforms to the docs. Closes #4316
* PR #4295: Alter deprecation schedules
* PR #4253: fix typo in pysupported docs
* PR #4252: fix typo on repomap
* PR #4241: remove unused import
* PR #4240: fix typo in jitclass docs
* PR #4205: Update return value order in normalize_signature docstring
* PR #4237: Update doc links to point to latest not dev docs.
* PR #4197: hyperlink repomap
* PR #4170: Clarify docs on accumulating into arrays in prange
* PR #4147: fix docstring for DictType iterables
* PR #3951: A guide to overloading
CI Updates:
* PR #4300: AArch64 has no faulthandler package
* PR #4273: pin to MKL BLAS for testing to get consistent results
* PR #4209: Revert previous network tol patch and try with conda config
* PR #4138: Remove tbb before Azure test only on Python 3, since it was already
removed for Python 2
Contributors:
* Ehsan Totoni (core dev)
* Gregory R. Lee
* Guilherme Leobas
* James Bourbeau
* Joshua L. Adelman
* Keith Kraus
* Lucio Fernandez-Arjona
* Nick White
* Rob Ennis
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
Version 0.44.1
--------------
This patch release addresses some regressions reported in the 0.44.0 release:
- PR #4165: Fix #4164 issue with NUMBAPRO_NVVM.
- PR #4172: Abandon branch pruning if an arg name is redefined. (Fixes #4163)
- PR #4183: Fix #4156. Problem with defining in-loop variables.
Version 0.44.0
--------------
IMPORTANT: This release makes a few significant deprecations (and some less
significant ones). Users are encouraged to read the related documentation.
General enhancements in this release include:
- Numba is backed by LLVM 8 on all platforms apart from ppc64le, which, due to
bugs, remains on the LLVM 7.x series.
- Numba's dictionary support now includes type inference for keys and values.
- The .view() method now works for NumPy scalar types.
- Newly supported NumPy functions added: np.delete, np.nanquantile, np.quantile,
np.repeat, np.shape.
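The dictionary type inference mentioned above can be sketched as follows: an empty ``{}`` is created inside a jitted function and its key and value types are inferred from later assignments. The function name is hypothetical, and the ``except ImportError`` fallback is an assumption added so the snippet also runs without Numba.

```python
try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def most_common(values):
    """Return (value, count) for the most frequent item in a non-empty tuple."""
    counts = {}  # key and value types are inferred from the assignments below
    for v in values:
        if v in counts:
            counts[v] += 1
        else:
            counts[v] = 1
    best = values[0]
    for key in counts:
        if counts[key] > counts[best]:
            best = key
    return best, counts[best]


print(most_common((3, 1, 3, 2, 3, 1)))
```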
In addition, considerable effort has been made to fix some long-standing bugs
and a large number of other bugs, so the "Fixes" section is very large this time!
Enhancements from user contributed PRs (with thanks!):
- Max Bolingbroke added support for the selective use of ``fastmath`` flags in
#3847.
- Rob Ennis made min() and max() work on iterables in #3820 and added
np.quantile and np.nanquantile in #3899.
- Sergey Shalnov added numerous unicode string related features, zfill in #3978,
ljust in #4001, rjust and center in #4044 and strip, lstrip and rstrip in
#4048.
- Guilherme Leobas added support for np.delete in #3890
- Christoph Deil exposed the Numba CLI via ``python -m numba`` in #4066 and made
numerous documentation fixes.
- Leo Schwarz wrote the bulk of the code for jitclass default constructor
arguments in #3852.
- Nick White enhanced the CUDA backend to use min/max PTX instructions where
possible in #4054.
- Lucio Fernandez-Arjona implemented the unicode string ``__mul__`` function in
#3952.
- Dimitri Vorona wrote the bulk of the code to implement getitem and setitem for
jitclass in #3861.
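As an illustration of the selective ``fastmath`` flags from #3847, the sketch below passes a set of LLVM fast-math flag names instead of ``True``. The choice of ``reassoc`` and ``nsz`` and the function names are only examples, and the ``except ImportError`` fallback is an assumption so the snippet runs without Numba.

```python
try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(fastmath={"reassoc", "nsz"})
def sum_relaxed(values):
    """Sum with reassociation allowed, which can help vectorisation."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


@njit
def sum_strict(values):
    """Same loop with strict IEEE 754 semantics."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


data = (0.5, 1.5, 2.0)
print(sum_relaxed(data), sum_strict(data))
```

Enabling only some flags keeps, for example, NaN and infinity handling intact while still allowing the additions to be reordered.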
General Enhancements:
* PR #3820: Min max on iterables
* PR #3842: Unicode type iteration
* PR #3847: Allow fine-grained control of fastmath flags to partially address #2923
* PR #3852: Continuation of PR #2894
* PR #3861: Continuation of PR #3730
* PR #3890: Add support for np.delete
* PR #3899: Support for np.quantile and np.nanquantile
* PR #3900: Fix 3457 :: Implements np.repeat
* PR #3928: Add .view() method for NumPy scalars
* PR #3939: Update icc_rt clone recipe.
* PR #3952: __mul__ for strings, initial implementation and tests
* PR #3956: Type-inferred dictionary
* PR #3959: Create a view for string slicing to avoid extra allocations
* PR #3978: zfill operation implementation
* PR #4001: ljust operation implementation
* PR #4010: Support `dict()` and `{}`
* PR #4022: Support for llvm 8
* PR #4034: Make type.Optional str more representative
* PR #4041: Deprecation warnings
* PR #4044: rjust and center operations implementation
* PR #4048: strip, lstrip and rstrip operations implementation
* PR #4066: Expose numba CLI via python -m numba
* PR #4081: Impl `np.shape` and support function for `asarray`.
* PR #4091: Deprecate the use of iternext_impl without RefType
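The string operations listed above (``zfill``, ``ljust``, ``center``, ``strip``) are meant to mirror the behaviour of Python's ``str`` methods inside jitted code. A minimal sketch, with hypothetical function names and an assumed ``except ImportError`` fallback so it runs without Numba:

```python
try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def normalise_id(raw):
    """Strip surrounding whitespace, then left-pad with zeros to width 6."""
    return raw.strip().zfill(6)


@njit
def banner(text, width):
    """Centre the text in a field of the given width."""
    return text.center(width)


@njit
def dotted(text, width):
    """Left-justify the text and pad on the right with dots."""
    return text.ljust(width, ".")


print(normalise_id("  42 "), banner("ab", 6), dotted("ab", 4))
```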
CUDA Enhancements/Fixes:
* PR #3933: Adds `.nbytes` property to CUDA device array objects.
* PR #4011: Add .inspect_ptx() to cuda device function
* PR #4054: CUDA: Use min/max PTX Instructions
* PR #4096: Update env-vars for CUDA libraries lookup
Documentation Updates:
* PR #3867: Code repository map
* PR #3918: adding Joris' Fosdem 2019 presentation
* PR #3926: order talks on applications of Numba by date
* PR #3943: fix two small typos in vectorize docs
* PR #3944: Fixup jitclass docs
* PR #3990: mention preprint repo in FAQ. Fixes #3981
* PR #4012: Correct runtests command in contributing.rst
* PR #4043: fix typo
* PR #4047: Ambiguous Documentation fix for guvectorize.
* PR #4060: Remove remaining mentions of autojit in docs
* PR #4063: Fix annotate example in docstring
* PR #4065: Add FAQ entry explaining Numba project name
* PR #4079: Add Documentation for atomicity of typed.Dict
* PR #4105: Remove info about CUDA ENVVAR potential replacement
Fixes:
* PR #3719: Resolves issue #3528. Adds support for slices when not using parallel=True.
* PR #3727: Remove dels for known dead vars.
* PR #3845: Fix mutable flag transmission in .astype
* PR #3853: Fix some minor issues in the C source.
* PR #3862: Correct boolean reinterpretation of data
* PR #3863: Comments out the appveyor badge
* PR #3869: fixes flake8 after merge
* PR #3871: Add assert to ir.py to help enforce correct structuring
* PR #3881: fix preparfor dtype transform for datetime64
* PR #3884: Prevent mutation of objmode fallback IR.
* PR #3885: Updates for llvmlite 0.29
* PR #3886: Use `safe_load` from pyyaml.
* PR #3887: Add tolerance to network errors by permitting conda to retry
* PR #3893: Fix casting in namedtuple ctor.
* PR #3894: Fix array inliner for multiple array definition.
* PR #3905: Cherrypick #3903 to main
* PR #3920: Raise better error if unsupported jump opcode found.
* PR #3927: Apply flake8 to the numpy related files
* PR #3935: Silence DeprecationWarning
* PR #3938: Better error message for unknown opcode
* PR #3941: Fix typing of ufuncs in parfor conversion
* PR #3946: Return variable renaming dict from inline_closurecall
* PR #3962: Fix bug in alignment computation of `Record.make_c_struct`
* PR #3967: Fix error with pickling unicode
* PR #3964: Unicode split algo versioning
* PR #3975: Add handler for unknown locale to numba -s
* PR #3991: Permit Optionals in ufunc machinery
* PR #3995: Remove assert in type inference causing poor error message.
* PR #3996: add is_ascii flag to UnicodeType
* PR #4009: Prevent zero division error in np.linalg.cond
* PR #4014: Resolves #4007.
* PR #4021: Add a more specific error message for invalid write to a global.
* PR #4023: Fix handling of titles in record dtype
* PR #4024: Do a check if a call is const before saying that an object is multiply defined.
* PR #4027: Fix issue #4020. Turn off no_cpython_wrapper flag when compiling for…
* PR #4033: [WIP] Fixing wrong dtype of array inside reflected list #4028
* PR #4061: Change IPython cache dir name to numba_cache
* PR #4067: Delete examples/notebooks/LinearRegr.py
* PR #4070: Catch writes to global typed.Dict and raise.
* PR #4078: Check tuple length
* PR #4084: Fix missing incref on optional return None
* PR #4089: Make the warnings fixer flush work for warning comparing on type.
* PR #4094: Fix function definition finding logic for commented def
* PR #4100: Fix alignment check on 32-bit.
* PR #4104: Use PEP 508 compliant env markers for install deps
Contributors:
* Benjamin Zaitlen
* Christoph Deil
* David Hirschfeld
* Dimitri Vorona
* Ehsan Totoni (core dev)
* Guilherme Leobas
* Leo Schwarz
* Lucio Fernandez-Arjona
* Max Bolingbroke
* NanduTej
* Nick White
* Ravi Teja Gutta
* Rob Ennis
* Sergey Shalnov
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
Version 0.43.1
--------------
This is a bugfix release that provides minor changes to fix a bug in branch
pruning, fix bugs in `np.interp` functionality, and fully accommodate the
NumPy 1.16 release series.
* PR #3826: NumPy 1.16 support
* PR #3850: Refactor np.interp
* PR #3883: Rewrite pruned conditionals as their evaluated constants.
Contributors:
* Rob Ennis
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
Version 0.43.0
--------------
In this release, the major new features are:
- Initial support for statically typed dictionaries
- Improvements to `hash()` to match Python 3 behavior
- Support for the heapq module
- Ability to pass C structs to Numba
- More NumPy functions: asarray, trapz, roll, ptp, extract
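As one example of the features above, the ``heapq`` support can be sketched like this. The function name is hypothetical, ``k`` is assumed not to exceed the number of values, and the ``except ImportError`` fallback is an assumption so the snippet also runs without Numba.

```python
import heapq

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def k_smallest(values, k):
    """Return the k smallest items of a tuple of floats, in ascending order."""
    heap = []
    for v in values:
        heap.append(v)  # the list's element type is inferred from this append
    heapq.heapify(heap)
    out = []
    for _ in range(k):
        out.append(heapq.heappop(heap))
    return out


print(k_smallest((5.0, 1.0, 4.0, 2.0), 2))
```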
NOTE:
The vast majority of NumPy 1.16 behaviour is supported; however,
``datetime`` and ``timedelta`` use involving ``NaT`` matches the behaviour
present in earlier releases. The ufunc suite has not been extended to
accommodate the two new time computation related additions present in NumPy
1.16. In addition, the functions ``ediff1d`` and ``interp`` have known minor
issues in replicating outputs exactly when ``NaN`` values occur in certain input
patterns.
General Enhancements:
* PR #3563: Support for np.roll
* PR #3572: Support for np.ptp
* PR #3592: Add dead branch prune before type inference.
* PR #3598: Implement np.asarray()
* PR #3604: Support for np.interp
* PR #3607: Some simplification to lowering
* PR #3612: Exact match flag in dispatcher
* PR #3627: Support for np.trapz
* PR #3630: np.where with broadcasting
* PR #3633: Support for np.extract
* PR #3657: np.max, np.min, np.nanmax, np.nanmin - support for complex dtypes
* PR #3661: Access C Struct as Numpy Structured Array
* PR #3678: Support for str.split and str.join
* PR #3684: Support C array in C struct
* PR #3696: Add intrinsic to help debug refcount
* PR #3703: Implementations of type hashing.
* PR #3715: Port CPython3.7 dictionary for numba internal use
* PR #3716: Support inplace concat of strings
* PR #3718: Add location to ConstantInferenceError exceptions.
* PR #3720: improve error msg about invalid signature
* PR #3731: Support for heapq
* PR #3754: Updates for llvmlite 0.28
* PR #3760: Overloadable operator.setitem
* PR #3775: Support overloading operator.delitem
* PR #3777: Implement compiler support for dictionary
* PR #3791: Implement interpreter-side interface for numba dict
* PR #3799: Support refcount'ed types in numba dict
CUDA Enhancements/Fixes:
* PR #3713: Fix the NvvmSupportError message when CC too low
* PR #3722: Fix #3705: slicing error with negative strides
* PR #3755: Make cuda.to_device accept readonly host array
* PR #3773: Adapt library search to accommodate multiple locations
Documentation Updates:
* PR #3651: fix link to berryconda in docs
* PR #3668: Add Azure Pipelines build badge
* PR #3749: DOC: Clarify when prange is different from range
* PR #3771: fix a few typos
* PR #3785: Clarify use of range as function only.
* PR #3829: Add docs for typed-dict
Fixes:
* PR #3614: Resolve #3586
* PR #3618: Skip gdb tests on ARM.
* PR #3643: Remove support_literals usage
* PR #3645: Enforce and fix that AbstractTemplate.generic must be returning a Signature
* PR #3648: Fail on @overload signature mismatch.
* PR #3660: Added Ignore message to test numba.tests.test_lists.TestLists.test_mul_error
* PR #3662: Replace six with numba.six
* PR #3663: Removes coverage computation from travisci builds
* PR #3672: Avoid leaking memory when iterating over uniform tuple
* PR #3676: Fixes constant string lowering inside tuples
* PR #3677: Ensure all referenced compiled functions are linked properly
* PR #3692: Fix test failure due to overly strict test on floating point values.
* PR #3693: Intercept failed import to help users.
* PR #3694: Fix memory leak in enumerate iterator
* PR #3695: Convert return of None from intrinsic implementation to dummy value
* PR #3697: Fix for issue #3687
* PR #3701: Fix array.T analysis (fixes #3700)
* PR #3704: Fixes for overload_method
* PR #3706: Don't push call vars recursively into nested parfors. Resolves #3686.
* PR #3710: Set as non-hoistable if a mutable variable is passed to a function in a loop. Resolves #3699.
* PR #3712: parallel=True to use better builtin mechanism to resolve call types. Resolves issue #3671
* PR #3725: Fix invalid removal of dead empty list
* PR #3740: add uintp as a valid type to the tuple operator.getitem
* PR #3758: Fix target definition update in inlining
* PR #3782: Raise typing error on yield optional.
* PR #3792: Fix non-module object used as the module of a function.
* PR #3800: Bugfix for np.interp
* PR #3808: Bump macro to include VS2014 to fix py3.5 build
* PR #3809: Add debug guard to debug only C function.
* PR #3816: Fix array.sum(axis) 1d input return type.
* PR #3821: Replace PySys_WriteStdout with PySys_FormatStdout to ensure no truncation.
* PR #3830: Getitem should not return optional type
* PR #3832: Handle single string as path in find_file()
Contributors:
* Ehsan Totoni
* Gryllos Prokopis
* Jonathan J. Helmus
* Kayla Ngan
* lalitparate
* luk-f-a
* Matyt
* Max Bolingbroke
* Michael Seifert
* Rob Ennis
* Siu Kwan Lam
* Stan Seibert
* Stuart Archibald
* Todd A. Anderson
* Tao He
* Valentin Haenel
Version 0.42.1
--------------
Bugfix release to fix the incorrect hash in OSX wheel packages.
No change in source code.
Version 0.42.0
--------------
In this release the major features are:
- The capability to launch and attach the GDB debugger from within a jitted
function.
- The upgrading of LLVM to version 7.0.0.
We added a draft of the project roadmap to the developer manual. The roadmap is
for informational purposes only as priorities and resources may change.
Here are some enhancements from contributed PRs:
- #3532. Daniel Wennberg improved the ``cuda.{pinned, mapped}`` API so that
the associated memory is released immediately at the exit of the context
manager.
- #3531. Dimitri Vorona enabled the inlining of jitclass methods.
- #3516. Simon Perkins added the support for passing numpy dtypes (i.e.
``np.dtype("int32")``) and their type constructor (i.e. ``np.int32``) into
a jitted function.
- #3509. Rob Ennis added support for ``np.corrcoef``.
A regression issue (#3554, #3461) relating to making an empty slice in parallel
mode is resolved by #3558.
General Enhancements:
* PR #3392: Launch and attach gdb directly from Numba.
* PR #3437: Changes to accommodate LLVM 7.0.x
* PR #3509: Support for np.corrcoef
* PR #3516: Typeof dtype values
* PR #3520: Fix @stencil ignoring cval if out kwarg supplied.
* PR #3531: Fix jitclass method inlining and avoid unnecessary increfs
* PR #3538: Avoid future C-level assertion error due to invalid visibility
* PR #3543: Avoid implementation error being hidden by the try-except
* PR #3544: Add `long_running` test flag and feature to exclude tests.
* PR #3549: ParallelAccelerator caching improvements
* PR #3558: Fixes array analysis for inplace binary operators.
* PR #3566: Skip alignment tests on armv7l.
* PR #3567: Fix unifying literal types in namedtuple
* PR #3576: Add special copy routine for NumPy out arrays
* PR #3577: Fix example and docs typos for `objmode` context manager.
* PR #3580: Use alias information when determining whether it is safe to
reorder statements.
* PR #3583: Use `ir.unknown_loc` for unknown `Loc`, as #3390 with tests
* PR #3587: Fix llvm.memset usage changes in llvm7
* PR #3596: Fix Array Analysis for Global Namedtuples
* PR #3597: Warn users if threading backend init unsafe.
* PR #3605: Add guard for writing to read only arrays from ufunc calls
* PR #3606: Improve the accuracy of error message wording for undefined type.
* PR #3611: gdb test guard needs to ack ptrace permissions
* PR #3616: Skip gdb tests on ARM.
CUDA Enhancements:
* PR #3532: Unregister temporarily pinned host arrays at once
* PR #3552: Handle broadcast arrays correctly in host->device transfer.
* PR #3578: Align cuda and cuda simulator kwarg names.
Documentation Updates:
* PR #3545: Fix @njit description in 5 min guide
* PR #3570: Minor documentation fixes for numba.cuda
* PR #3581: Fixing minor typo in `reference/types.rst`
* PR #3594: Changing `@stencil` docs to correctly reflect `func_or_mode` param
* PR #3617: Draft roadmap as of Dec 2018
Contributors:
* Aaron Critchley
* Daniel Wennberg
* Dimitri Vorona
* Dominik Stańczak
* Ehsan Totoni (core dev)
* Iskander Sharipov
* Rob Ennis
* Simon Muller
* Simon Perkins
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
Version 0.41.0
--------------
This release adds the following major features:
* Diagnostics showing the optimizations done by ParallelAccelerator
* Support for profiling Numba-compiled functions in Intel VTune
* Additional NumPy functions: partition, nancumsum, nancumprod, ediff1d, cov,
conj, conjugate, tri, tril, triu
* Initial support for Python 3 Unicode strings
General Enhancements:
* PR #1968: armv7 support
* PR #2983: invert mapping b/w binop operators and the operator module #2297
* PR #3160: First attempt at parallel diagnostics
* PR #3307: Adding NUMBA_ENABLE_PROFILING envvar, enabling jit event
* PR #3320: Support for np.partition
* PR #3324: Support for np.nancumsum and np.nancumprod
* PR #3325: Add location information to exceptions.
* PR #3337: Support for np.ediff1d
* PR #3345: Support for np.cov
* PR #3348: Support user pipeline class in with lifting
* PR #3363: string support
* PR #3373: Improve error message for empty imprecise lists.
* PR #3375: Enable overload(operator.getitem)
* PR #3402: Support negative indexing in tuple.
* PR #3414: Refactor Const type
* PR #3416: Optimized usage of alloca out of the loop
* PR #3424: Updates for llvmlite 0.26
* PR #3462: Add support for `np.conj/np.conjugate`.
* PR #3480: np.tri, np.tril, np.triu - default optional args
* PR #3481: Permit dtype argument as sole kwarg in np.eye
CUDA Enhancements:
* PR #3399: Add max_registers Option to cuda.jit
Continuous Integration / Testing:
* PR #3303: CI with Azure Pipelines
* PR #3309: Workaround race condition with apt
* PR #3371: Fix issues with Azure Pipelines
* PR #3362: Fix #3360: `RuntimeWarning: 'numba.runtests' found in sys.modules`
* PR #3374: Disable openmp in wheel building
* PR #3404: Azure Pipelines templates
* PR #3419: Fix cuda tests and error reporting in test discovery
* PR #3491: Prevent faulthandler installation on armv7l
* PR #3493: Fix CUDA test that used negative indexing behaviour that's fixed.
* PR #3495: Start Flake8 checking of Numba source
Fixes:
* PR #2950: Fix dispatcher to only consider contiguous-ness.
* PR #3124: Fix 3119, raise for 0d arrays in reductions
* PR #3228: Reduce redundant module linking
* PR #3329: Fix AOT on windows.
* PR #3335: Fix memory management of __cuda_array_interface__ views.
* PR #3340: Fix typo in error name.
* PR #3365: Fix the default unboxing logic
* PR #3367: Allow non-global reference to objmode() context-manager
* PR #3381: Fix global reference in objmode for dynamically created function
* PR #3382: CUDA_ERROR_MISALIGNED_ADDRESS Using Multiple Const Arrays
* PR #3384: Correctly handle very old versions of colorama
* PR #3394: Add 32bit package guard for non-32bit installs
* PR #3397: Fix with-objmode warning
* PR #3403: Fix label offset in call inline after parfor pass
* PR #3429: Fixes raising of user defined exceptions for exec(<string>).
* PR #3432: Fix error due to function naming in CI in py2.7
* PR #3444: Fixed TBB's single thread execution and test added for #3440
* PR #3449: Allow matching non-array objects in find_callname()
* PR #3455: Change getiter and iternext to not be pure. Resolves #3425
* PR #3467: Make ir.UndefinedType singleton class.
* PR #3478: Fix np.random.shuffle sideeffect
* PR #3487: Raise unsupported for kwargs given to `print()`
* PR #3488: Remove dead script.
* PR #3498: Fix stencil support for boolean as return type
* PR #3511: Fix handling make_function literals (regression of #3414)
* PR #3514: Add missing unicode != unicode
* PR #3527: Fix complex math sqrt implementation for large -ve values
* PR #3530: This adds an arg check for the pattern supplied to Parfors.
* PR #3536: Sets list dtor linkage to `linkonce_odr` to fix visibility in AOT
Documentation Updates:
* PR #3316: Update 0.40 changelog with additional PRs
* PR #3318: Tweak spacing to avoid search box wrapping onto second line
* PR #3321: Add note about memory leaks with exceptions to docs. Fixes #3263
* PR #3322: Add FAQ on CUDA + fork issue. Fixes #3315.
* PR #3343: Update docs for argsort, kind kwarg partially supported.
* PR #3357: Added mention of njit in 5minguide.rst
* PR #3434: Fix parallel reduction example in docs.
* PR #3452: Fix broken link and mark up problem.
* PR #3484: Size Numba logo in docs in em units. Fixes #3313
* PR #3502: just two typos
* PR #3506: Document string support
* PR #3513: Documentation for parallel diagnostics.
* PR #3526: Fix 5 min guide with respect to @njit decl
Contributors:
* Alex Ford
* Andreas Sodeur
* Anton Malakhov
* Daniel Stender
* Ehsan Totoni (core dev)
* Henry Schreiner
* Marcel Bargull
* Matt Cooper
* Nick White
* Nicolas Hug
* rjenc29
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
Version 0.40.1
--------------
This is a PyPI-only patch release to ensure that PyPI wheels can enable the
TBB threading backend, and to disable the OpenMP backend in the wheels.
Limitations of manylinux1 and variation in user environments can cause
segfaults when OpenMP is enabled on wheel builds. Note that this release has
no functional changes for users who obtained Numba 0.40.0 via conda.
Patches:
* PR #3338: Accidentally left Anton off contributor list for 0.40.0
* PR #3374: Disable OpenMP in wheel building
* PR #3376: Update 0.40.1 changelog and docs on OpenMP backend
Version 0.40.0
--------------
This release adds a number of major features:
* A new GPU backend: kernels for AMD GPUs can now be compiled using the ROCm
driver on Linux.
* The thread pool implementation used by Numba for automatic multithreading
is configurable to use TBB, OpenMP, or the old "workqueue" implementation.
(TBB is likely to become the preferred default in a future release.)
* New documentation on thread and fork-safety with Numba, along with overall
improvements in thread-safety.
* Experimental support for executing a block of code inside a nopython mode
function in object mode.
* Parallel loops now allow arrays as reduction variables
* CUDA improvements: FMA, faster float64 atomics on supporting hardware,
records in const memory, and improved datetime dtype support
* More NumPy functions: vander, tri, triu, tril, fill_diagonal
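The configurable thread pool described above is usually selected through the ``NUMBA_THREADING_LAYER`` environment variable, set before Numba is imported. The sketch below requests the ``workqueue`` layer and runs a small parallel reduction. The function name is hypothetical, and the ``except ImportError`` fallback (plain ``range`` and a no-op decorator) is an assumption so the snippet runs without Numba.

```python
import os

# Request a threading layer before importing Numba; an existing setting wins.
os.environ.setdefault("NUMBA_THREADING_LAYER", "workqueue")

try:
    from numba import njit, prange
except ImportError:
    # Assumption: run serially as plain Python when Numba is unavailable.
    prange = range

    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def sum_of_squares(n):
    """Parallel reduction over 0..n-1; iterations may run on several threads."""
    total = 0
    for i in prange(n):
        total += i * i
    return total


print(sum_of_squares(10))
```

``tbb`` and ``omp`` are the other layer names; they only load when the corresponding runtime libraries are present, which is why this sketch sticks to ``workqueue``.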
General Enhancements:
* PR #3017: Add facility to support with-contexts
* PR #3033: Add support for multidimensional CFFI arrays
* PR #3122: Add inliner to object mode pipeline
* PR #3127: Support for reductions on arrays.
* PR #3145: Support for np.fill_diagonal
* PR #3151: Keep a queue of references to last N deserialized functions. Fixes #3026
* PR #3154: Support use of list() if typeable.
* PR #3166: Objmode with-block
* PR #3179: Updates for llvmlite 0.25
* PR #3181: Support function extension in alias analysis
* PR #3189: Support literal constants in typing of object methods
* PR #3190: Support passing closures as literal values in typing
* PR #3199: Support inferring stencil index as constant in simple unary expressions
* PR #3202: Threading layer backend refactor/rewrite/reinvention!
* PR #3209: Support for np.tri, np.tril and np.triu
* PR #3211: Handle unpacking in building tuple (BUILD_TUPLE_UNPACK opcode)
* PR #3212: Support for np.vander
* PR #3227: Add NumPy 1.15 support
* PR #3272: Add MemInfo_data to runtime._nrt_python.c_helpers
* PR #3273: Refactor. Removing thread-local-storage based context nesting.
* PR #3278: compiler threadsafety lockdown
* PR #3291: Add CPU count and CFS restrictions info to numba -s.
CUDA Enhancements:
* PR #3152: Use cuda driver api to get best blocksize for best occupancy
* PR #3165: Add FMA intrinsic support
* PR #3172: Use float64 add Atomics, Where Available
* PR #3186: Support Records in CUDA Const Memory
* PR #3191: CUDA: fix log size
* PR #3198: Fix GPU datetime timedelta types usage
* PR #3221: Support datetime/timedelta scalar argument to a CUDA kernel.
* PR #3259: Add DeviceNDArray.view method to reinterpret data as a different type.
* PR #3310: Fix IPC handling of sliced cuda array.
ROCm Enhancements:
* PR #3023: Support for AMDGCN/ROCm.
* PR #3108: Add ROC info to `numba -s` output.
* PR #3176: Move ROC vectorize init to npyufunc
* PR #3177: Add auto_synchronize support to ROC stream
* PR #3178: Update ROC target documentation.
* PR #3294: Add compiler lock to ROC compilation path.
* PR #3280: Add wavebits property to the HSA Agent.
* PR #3281: Fix ds_permute types and add tests
Continuous Integration / Testing:
* PR #3091: Remove old recipes, switch to test config based on env var.
* PR #3094: Add higher ULP tolerance for products in complex space.
* PR #3096: Set exit on error in incremental scripts
* PR #3109: Add skip to test needing jinja2 if no jinja2.
* PR #3125: Skip cudasim only tests
* PR #3126: add slack, drop flowdock
* PR #3147: Improve error message for arg type unsupported during typing.
* PR #3128: Fix recipe/build for jetson tx2/ARM
* PR #3167: In build script activate env before installing.
* PR #3180: Add skip to broken test.
* PR #3216: Fix libcuda.so loading in some container setup
* PR #3224: Switch to new Gitter notification webhook URL and encrypt it
* PR #3235: Add 32bit Travis CI jobs
* PR #3257: This adds scipy/ipython back into windows conda test phase.
Fixes:
* PR #3038: Fix random integer generation to match results from NumPy.
* PR #3045: Fix #3027 - Numba reassigns sys.stdout
* PR #3059: Handler for known LoweringErrors.
* PR #3060: Adjust attribute error for NumPy functions.
* PR #3067: Abort simulator threads on exception in thread block.
* PR #3079: Implement +/-(types.boolean) Fix #2624
* PR #3080: Compute np.var and np.std correctly for complex types.
* PR #3088: Fix #3066 (array.dtype.type in prange)
* PR #3089: Fix invalid ParallelAccelerator hoisting issue.
* PR #3136: Fix #3135 (lowering error)
* PR #3137: Fix for issue3103 (race condition detection)
* PR #3142: Fix Issue #3139 (parfors reuse of reduction variable across prange blocks)
* PR #3148: Remove dead array equal @infer code
* PR #3153: Fix canonicalize_array_math typing for calls with kw args
* PR #3156: Fixes issue with missing pygments in testing and adds guards.
* PR #3168: Py37 bytes output fix.
* PR #3171: Fix #3146. Fix CFUNCTYPE void* return-type handling
* PR #3193: Fix setitem/getitem resolvers
* PR #3222: Fix #3214. Mishandling of POP_BLOCK in while True loop.
* PR #3230: Fixes liveness analysis issue in looplifting
* PR #3233: Fix return type difference for 32bit ctypes.c_void_p
* PR #3234: Fix types and layout for `np.where`.
* PR #3237: Fix DeprecationWarning about imp module
* PR #3241: Fix #3225. Normalize 0nd array to scalar in typing of indexing code.
* PR #3256: Fix #3251: Move imports of ABCs to collections.abc for Python >= 3.3
* PR #3292: Fix issue3279.
* PR #3302: Fix error due to mismatching dtype
Documentation Updates:
* PR #3104: Workaround for #3098 (test_optional_unpack Heisenbug)
* PR #3132: Adds an ~5 minute guide to Numba.
* PR #3194: Fix docs RE: np.random generator fork/thread safety
* PR #3242: Page with Numba talks and tutorial links
* PR #3258: Allow users to choose the type of issue they are reporting.
* PR #3260: Fixed broken link
* PR #3266: Fix cuda pointer ownership problem with user/externally allocated pointer
* PR #3269: Tweak typography with CSS
* PR #3270: Update FAQ for functions passed as arguments
* PR #3274: Update installation instructions
* PR #3275: Note pyobject and voidptr are types in docs
* PR #3288: Do not need to call parallel optimizations "experimental" anymore
* PR #3318: Tweak spacing to avoid search box wrapping onto second line
Contributors:
* Anton Malakhov
* Alex Ford
* Anthony Bisulco
* Ehsan Totoni (core dev)
* Leonard Lausen
* Matthew Petroff
* Nick White
* Ray Donnelly
* rjenc29
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Stuart Reynolds
* Todd A. Anderson (core dev)
Version 0.39.0
--------------
Here are the highlights for the Numba 0.39.0 release.
* This is the first version that supports Python 3.7.
* With help from Intel, we have fixed the issues with SVML support (related
issues #2938, #2998, #3006).
* List has gained support for containing reference-counted types like NumPy
arrays and `list`. Note, list still cannot hold heterogeneous types.
* We have made a significant change to the internal calling-convention,
which should be transparent to most users, to allow for a future feature that
will permit jumping back into python-mode from a nopython-mode function.
This also fixes a limitation to `print` that disabled its use from nopython
functions that were deep in the call-stack.
* For CUDA GPU support, we added a `__cuda_array_interface__` following the
NumPy array interface specification to allow Numba to consume externally
defined device arrays. We have opened a corresponding pull request to CuPy to
test out the concept and be able to use a CuPy GPU array.
* The Numba dispatcher `inspect_types()` method now supports the kwarg `pretty`
which if set to `True` will produce ANSI/HTML output, showing the annotated
types, when invoked from ipython/jupyter-notebook respectively.
* The NumPy functions `ndarray.dot`, `np.percentile`, `np.nanpercentile` and
`np.unique` are now supported.
* Numba now supports the use of a per-project configuration file to permanently
set behaviours typically set via `NUMBA_*` family environment variables.
* Support for the `ppc64le` architecture has been added.
Enhancements:
* PR #2793: Simplify and remove javascript from html_annotate templates.
* PR #2840: Support list of refcounted types
* PR #2902: Support for np.unique
* PR #2926: Enable fence for all architecture and add developer notes
* PR #2928: Making error about untyped list more informative.
* PR #2930: Add configuration file and color schemes.
* PR #2932: Fix encoding to 'UTF-8' in `check_output` decode.
* PR #2938: Python 3.7 compat: _Py_Finalizing becomes _Py_IsFinalizing()
* PR #2939: Comprehensive SVML unit test
* PR #2946: Add support for `ndarray.dot` method and tests.
* PR #2953: percentile and nanpercentile
* PR #2957: Add new 3.7 opcode support.
* PR #2963: Improve alias analysis to be more comprehensive
* PR #2984: Support for namedtuples in array analysis
* PR #2986: Fix environment propagation
* PR #2990: Improve function call matching for intrinsics
* PR #3002: Second pass at error rewrites (interpreter errors).
* PR #3004: Add numpy.empty to the list of pure functions.
* PR #3008: Augment SVML detection with llvmlite SVML patch detection.
* PR #3012: Make use of the common spelling of heterogeneous/homogeneous.
* PR #3032: Fix pycc ctypes test due to mismatch in calling-convention
* PR #3039: Add SVML detection to Numba environment diagnostic tool.
* PR #3041: This adds @needs_blas to tests that use BLAS
* PR #3056: Require llvmlite>=0.24.0
CUDA Enhancements:
* PR #2860: __cuda_array_interface__
* PR #2910: More CUDA intrinsics
* PR #2929: Add Flag To Prevent Unnecessary D->H Copies
* PR #3037: Add CUDA IPC support on non-peer-accessible devices
CI Enhancements:
* PR #3021: Update appveyor config.
* PR #3040: Add fault handler to all builds
* PR #3042: Add catchsegv
* PR #3077: Adds optional number of processes for `-m` in testing
Fixes:
* PR #2897: Fix line position of delete statement in numba ir
* PR #2905: Fix for #2862
* PR #3009: Fix optional type returning in recursive call
* PR #3019: workaround and unittest for issue #3016
* PR #3035: [TESTING] Attempt delayed removal of Env
* PR #3048: [WIP] Fix cuda tests failure on buildfarm
* PR #3054: Make test work on 32-bit
* PR #3062: Fix cuda.In freeing devary before the kernel launch
* PR #3073: Workaround #3072
* PR #3076: Avoid ignored exception due to missing globals at interpreter teardown
Documentation Updates:
* PR #2966: Fix syntax in env var docs.
* PR #2967: Fix typo in CUDA kernel layout example.
* PR #2970: Fix docstring copy paste error.
Contributors:
The following people contributed to this release.
* Anton Malakhov
* Ehsan Totoni (core dev)
* Julia Tatz
* Matthias Bussonnier
* Nick White
* Ray Donnelly
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Rik-de-Kort
* rjenc29
Version 0.38.1
--------------
This is a critical bug fix release addressing:
https://github.com/numba/numba/issues/3006
The bug does not impact users using conda packages from Anaconda or Intel Python
Distribution (but it does impact conda-forge). It does not impact users of pip
using wheels from PyPI.
This only impacts a small number of users where:
* The ICC runtime (specifically libsvml) is present in the user's environment.
* The user is using an llvmlite statically linked against a version of LLVM
that has not been patched with SVML support.
* The platform is 64-bit.
The release fixes a code generation path that could lead to the production of
incorrect results under the above situation.
Fixes:
* PR #3007: Augment SVML detection with llvmlite SVML patch detection.
Contributors:
The following people contributed to this release.
* Stuart Archibald (core dev)
Version 0.38.0
--------------
Following on from the bug fix focus of the last release, this release swings
back towards the addition of new features and usability improvements based on
community feedback. This release is comparatively large! Three key features/
changes to note are:
* Numba (via llvmlite) is now backed by LLVM 6.0, and general vectorization is
improved as a result. A significant long-standing LLVM bug that was causing
corruption was also found and fixed.
* Further considerable improvements in vectorization are made available as
Numba now supports Intel's short vector math library (SVML).
Try it out with `conda install -c numba icc_rt`.
* CUDA 8.0 is now the minimum supported CUDA version.
Other highlights include:
* Bug fixes to `parallel=True` have enabled more vectorization opportunities