diff --git a/DESCRIPTION b/DESCRIPTION index 77e9132f0..e1df408df 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: keras3 Type: Package Title: R Interface to 'Keras' -Version: 1.1.0.9000 +Version: 1.2.0 Authors@R: c( person("Tomasz", "Kalinowski", role = c("aut", "cph", "cre"), email = "tomasz@posit.co"), diff --git a/NEWS.md b/NEWS.md index e1709a7ab..641359354 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,4 +1,4 @@ -# keras3 (development version) +# keras3 1.2.0 - Added compatibility with Keras v3.5.0. User facing changes: diff --git a/R/ops.R b/R/ops.R index 1246c0a19..781d6d183 100644 --- a/R/ops.R +++ b/R/ops.R @@ -3588,7 +3588,7 @@ function (x, dtype = NULL) #' #' @param f #' A callable implementing an associative binary operation with -#' signature `r = f(a, b)`. Function `f` must be associative, i.e., +#' signature ` r = f(a, b)`. Function `f` must be associative, i.e., #' it must satisfy the equation #' `f(a, f(b, c)) == f(f(a, b), c)`. #' The inputs and result are (possibly nested tree structures diff --git a/cran-comments.md b/cran-comments.md index 127d5a9f1..7167b2e18 100644 --- a/cran-comments.md +++ b/cran-comments.md @@ -2,8 +2,7 @@ 0 errors | 0 warnings | 1 note -* This is a new release. -* installed size is 12.3Mb +* installed size is 12.4Mb sub-directories of 1Mb or more: doc 3.3Mb - help 8.2Mb + help 8.4Mb diff --git a/docs/404.html b/docs/404.html index 3cc9a21bf..67ecabfc8 100644 --- a/docs/404.html +++ b/docs/404.html @@ -29,7 +29,7 @@ keras3 - 1.1.0 + 1.2.0 + + + + + +
+
+
+ +
+

This operation is similar to op_scan(), with the key difference that +op_associative_scan() is a parallel implementation with +potentially significant performance benefits, especially when jit compiled. +The catch is that it can only be used when f is a binary associative +operation (i.e. it must satisfy f(a, f(b, c)) == f(f(a, b), c)).

+

For an introduction to associative scans, refer to this paper: +Blelloch, Guy E. 1990. +Prefix Sums and Their Applications.

+
+ +
+

Usage

+
op_associative_scan(f, elems, reverse = FALSE, axis = 1L)
+
+ +
+

Arguments

+ + +
f
+

A callable implementing an associative binary operation with +signature r = f(a, b). Function f must be associative, i.e., +it must satisfy the equation +f(a, f(b, c)) == f(f(a, b), c). +The inputs and result are (possibly nested tree structures +of) array(s) matching elems. Each array has a dimension in place +of the axis dimension. f should be applied elementwise over +the axis dimension. +The result r has the same shape (and structure) as the +two inputs a and b.

+ + +
elems
+

A (possibly nested tree structure of) array(s), each with +an axis dimension of size num_elems.

+ + +
reverse
+

A boolean stating if the scan should be reversed with respect +to the axis dimension.

+ + +
axis
+

an integer identifying the axis over which the scan should occur.

+ +
+
+

Value

+

A (possibly nested tree structure of) array(s) of the same shape +and structure as elems, in which the k'th element of axis is +the result of recursively applying f to combine the first k +elements of elems along axis. For example, given +elems = list(a, b, c, ...), the result would be +list(a, f(a, b), f(f(a, b), c), ...).
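
The result described above can be reproduced sequentially in base R, which may help when checking a candidate f before handing it to op_associative_scan(). This is only a reference sketch under stated assumptions: it uses plain numeric values rather than tensors, it says nothing about how the parallel implementation works, and the names f, elems and prefix are illustrative.

```r
# Sequential reference for the scan result (base R only, no keras3 needed).
f <- function(a, b) a + b            # an associative binary operation
elems <- list(0, 1, 2, 3, 4)         # illustrative inputs, not tensors

# Associativity check on sample values: f(a, f(b, c)) == f(f(a, b), c)
stopifnot(identical(f(1, f(2, 3)), f(f(1, 2), 3)))

# Reduce(accumulate = TRUE) yields a, f(a, b), f(f(a, b), c), ...
prefix <- as.numeric(Reduce(f, elems, accumulate = TRUE))
prefix
# 0 1 3 6 10
```

The values match the cumulative sums shown in the first example of this page.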

+
+
+

Examples

+

sum_fn <- function(x, y) x + y
+xs <- op_arange(5L)
+op_associative_scan(sum_fn, xs)

+

## tf.Tensor([ 0  1  3  6 10], shape=(5), dtype=int32)
+

+

sum_fn <- function(x, y) {
+  str(list(x = x, y = y))
+  map2(x, y, \(.x, .y) .x + .y)
+}
+
+xs <- list(op_array(1:2),
+           op_array(1:2),
+           op_array(1:2))
+ys <- op_associative_scan(sum_fn, xs, axis = 1)

+

## List of 2
+##  $ x:List of 3
+##   ..$ :<tf.Tensor: shape=(1), dtype=int32, numpy=array([1], dtype=int32)>
+##   ..$ :<tf.Tensor: shape=(1), dtype=int32, numpy=array([1], dtype=int32)>
+##   ..$ :<tf.Tensor: shape=(1), dtype=int32, numpy=array([1], dtype=int32)>
+##  $ y:List of 3
+##   ..$ :<tf.Tensor: shape=(1), dtype=int32, numpy=array([2], dtype=int32)>
+##   ..$ :<tf.Tensor: shape=(1), dtype=int32, numpy=array([2], dtype=int32)>
+##   ..$ :<tf.Tensor: shape=(1), dtype=int32, numpy=array([2], dtype=int32)>
+

+

ys

+

## [[1]]
+## tf.Tensor([1 3], shape=(2), dtype=int32)
+##
+## [[2]]
+## tf.Tensor([1 3], shape=(2), dtype=int32)
+##
+## [[3]]
+## tf.Tensor([1 3], shape=(2), dtype=int32)
+

+
+
+

See also

+

Other core ops:
op_cast()
op_cond()
op_convert_to_numpy()
op_convert_to_tensor()
op_custom_gradient()
op_dtype()
op_fori_loop()
op_is_tensor()
op_map()
op_scan()
op_scatter()
op_scatter_update()
op_searchsorted()
op_shape()
op_slice()
op_slice_update()
op_stop_gradient()
op_switch()
op_unstack()
op_vectorized_map()
op_while_loop()

+

Other ops:
op_abs()
op_add()
op_all()
op_any()
op_append()
op_arange()
op_arccos()
op_arccosh()
op_arcsin()
op_arcsinh()
op_arctan()
op_arctan2()
op_arctanh()
op_argmax()
op_argmin()
op_argpartition()
op_argsort()
op_array()
op_average()
op_average_pool()
op_batch_normalization()
op_binary_crossentropy()
op_bincount()
op_broadcast_to()
op_cast()
op_categorical_crossentropy()
op_ceil()
op_cholesky()
op_clip()
op_concatenate()
op_cond()
op_conj()
op_conv()
op_conv_transpose()
op_convert_to_numpy()
op_convert_to_tensor()
op_copy()
op_correlate()
op_cos()
op_cosh()
op_count_nonzero()
op_cross()
op_ctc_decode()
op_ctc_loss()
op_cumprod()
op_cumsum()
op_custom_gradient()
op_depthwise_conv()
op_det()
op_diag()
op_diagonal()
op_diff()
op_digitize()
op_divide()
op_divide_no_nan()
op_dot()
op_dtype()
op_eig()
op_eigh()
op_einsum()
op_elu()
op_empty()
op_equal()
op_erf()
op_erfinv()
op_exp()
op_expand_dims()
op_expm1()
op_extract_sequences()
op_eye()
op_fft()
op_fft2()
op_flip()
op_floor()
op_floor_divide()
op_fori_loop()
op_full()
op_full_like()
op_gelu()
op_get_item()
op_greater()
op_greater_equal()
op_hard_sigmoid()
op_hard_silu()
op_hstack()
op_identity()
op_imag()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()
op_in_top_k()
op_inv()
op_irfft()
op_is_tensor()
op_isclose()
op_isfinite()
op_isinf()
op_isnan()
op_istft()
op_leaky_relu()
op_less()
op_less_equal()
op_linspace()
op_log()
op_log10()
op_log1p()
op_log2()
op_log_sigmoid()
op_log_softmax()
op_logaddexp()
op_logical_and()
op_logical_not()
op_logical_or()
op_logical_xor()
op_logspace()
op_logsumexp()
op_lstsq()
op_lu_factor()
op_map()
op_matmul()
op_max()
op_max_pool()
op_maximum()
op_mean()
op_median()
op_meshgrid()
op_min()
op_minimum()
op_mod()
op_moments()
op_moveaxis()
op_multi_hot()
op_multiply()
op_nan_to_num()
op_ndim()
op_negative()
op_nonzero()
op_norm()
op_normalize()
op_not_equal()
op_one_hot()
op_ones()
op_ones_like()
op_outer()
op_pad()
op_power()
op_prod()
op_psnr()
op_qr()
op_quantile()
op_ravel()
op_real()
op_reciprocal()
op_relu()
op_relu6()
op_repeat()
op_reshape()
op_rfft()
op_roll()
op_round()
op_rsqrt()
op_scan()
op_scatter()
op_scatter_update()
op_searchsorted()
op_segment_max()
op_segment_sum()
op_select()
op_selu()
op_separable_conv()
op_shape()
op_sigmoid()
op_sign()
op_silu()
op_sin()
op_sinh()
op_size()
op_slice()
op_slice_update()
op_slogdet()
op_softmax()
op_softplus()
op_softsign()
op_solve()
op_solve_triangular()
op_sort()
op_sparse_categorical_crossentropy()
op_split()
op_sqrt()
op_square()
op_squeeze()
op_stack()
op_std()
op_stft()
op_stop_gradient()
op_subtract()
op_sum()
op_svd()
op_swapaxes()
op_switch()
op_take()
op_take_along_axis()
op_tan()
op_tanh()
op_tensordot()
op_tile()
op_top_k()
op_trace()
op_transpose()
op_tri()
op_tril()
op_triu()
op_unstack()
op_var()
op_vdot()
op_vectorize()
op_vectorized_map()
op_vstack()
op_where()
op_while_loop()
op_zeros()
op_zeros_like()

+
+ +
+ + +
+ + + + + + + diff --git a/docs/reference/op_average.html b/docs/reference/op_average.html index 6842a26a5..d9be3f8be 100644 --- a/docs/reference/op_average.html +++ b/docs/reference/op_average.html @@ -8,7 +8,7 @@ keras3 - 1.1.0 + 1.2.0 + + + + + +
+
+
+ +
+

Perform a binary search, returning indices for insertion of values +into sorted_sequence that maintain the sorting order.

+
+ +
+

Usage

+
op_searchsorted(sorted_sequence, values, side = "left")
+
+ +
+

Arguments

+ + +
sorted_sequence
+

1-D input tensor, sorted along the innermost +dimension.

+ + +
values
+

N-D tensor of query insertion values.

+ + +
side
+

'left' or 'right', specifying the direction in which to insert +for the equality case (tie-breaker).

+ +
+
+

Value

+

Tensor of insertion indices of same shape as values.
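
The tie-breaking behaviour of side can be sketched in base R by counting how many sorted elements fall before each query. This is an illustration only, assuming plain numeric vectors: it does not call op_searchsorted(), the names left and right are hypothetical, and the counts are offsets from the start of the sequence (whether the function itself reports positions on the same scale is not stated here).

```r
# Illustrative data; base R only.
sorted_sequence <- c(1, 3, 5, 7)
values <- c(0, 3, 6, 8)

# side = "left": insert before equal elements, so count strictly smaller ones.
left <- vapply(values, function(v) sum(sorted_sequence < v), numeric(1))

# side = "right": insert after equal elements, so count smaller-or-equal ones.
right <- vapply(values, function(v) sum(sorted_sequence <= v), numeric(1))

left   # 0 1 3 4
right  # 0 2 3 4
```

The two results differ only for the query 3, which is the one value that ties with an element of the sorted sequence.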

+
+
+

See also

+

Other core ops:
op_associative_scan()
op_cast()
op_cond()
op_convert_to_numpy()
op_convert_to_tensor()
op_custom_gradient()
op_dtype()
op_fori_loop()
op_is_tensor()
op_map()
op_scan()
op_scatter()
op_scatter_update()
op_shape()
op_slice()
op_slice_update()
op_stop_gradient()
op_switch()
op_unstack()
op_vectorized_map()
op_while_loop()

+

Other ops:
op_abs()
op_add()
op_all()
op_any()
op_append()
op_arange()
op_arccos()
op_arccosh()
op_arcsin()
op_arcsinh()
op_arctan()
op_arctan2()
op_arctanh()
op_argmax()
op_argmin()
op_argpartition()
op_argsort()
op_array()
op_associative_scan()
op_average()
op_average_pool()
op_batch_normalization()
op_binary_crossentropy()
op_bincount()
op_broadcast_to()
op_cast()
op_categorical_crossentropy()
op_ceil()
op_cholesky()
op_clip()
op_concatenate()
op_cond()
op_conj()
op_conv()
op_conv_transpose()
op_convert_to_numpy()
op_convert_to_tensor()
op_copy()
op_correlate()
op_cos()
op_cosh()
op_count_nonzero()
op_cross()
op_ctc_decode()
op_ctc_loss()
op_cumprod()
op_cumsum()
op_custom_gradient()
op_depthwise_conv()
op_det()
op_diag()
op_diagonal()
op_diff()
op_digitize()
op_divide()
op_divide_no_nan()
op_dot()
op_dtype()
op_eig()
op_eigh()
op_einsum()
op_elu()
op_empty()
op_equal()
op_erf()
op_erfinv()
op_exp()
op_expand_dims()
op_expm1()
op_extract_sequences()
op_eye()
op_fft()
op_fft2()
op_flip()
op_floor()
op_floor_divide()
op_fori_loop()
op_full()
op_full_like()
op_gelu()
op_get_item()
op_greater()
op_greater_equal()
op_hard_sigmoid()
op_hard_silu()
op_hstack()
op_identity()
op_imag()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()
op_in_top_k()
op_inv()
op_irfft()
op_is_tensor()
op_isclose()
op_isfinite()
op_isinf()
op_isnan()
op_istft()
op_leaky_relu()
op_less()
op_less_equal()
op_linspace()
op_log()
op_log10()
op_log1p()
op_log2()
op_log_sigmoid()
op_log_softmax()
op_logaddexp()
op_logical_and()
op_logical_not()
op_logical_or()
op_logical_xor()
op_logspace()
op_logsumexp()
op_lstsq()
op_lu_factor()
op_map()
op_matmul()
op_max()
op_max_pool()
op_maximum()
op_mean()
op_median()
op_meshgrid()
op_min()
op_minimum()
op_mod()
op_moments()
op_moveaxis()
op_multi_hot()
op_multiply()
op_nan_to_num()
op_ndim()
op_negative()
op_nonzero()
op_norm()
op_normalize()
op_not_equal()
op_one_hot()
op_ones()
op_ones_like()
op_outer()
op_pad()
op_power()
op_prod()
op_psnr()
op_qr()
op_quantile()
op_ravel()
op_real()
op_reciprocal()
op_relu()
op_relu6()
op_repeat()
op_reshape()
op_rfft()
op_roll()
op_round()
op_rsqrt()
op_scan()
op_scatter()
op_scatter_update()
op_segment_max()
op_segment_sum()
op_select()
op_selu()
op_separable_conv()
op_shape()
op_sigmoid()
op_sign()
op_silu()
op_sin()
op_sinh()
op_size()
op_slice()
op_slice_update()
op_slogdet()
op_softmax()
op_softplus()
op_softsign()
op_solve()
op_solve_triangular()
op_sort()
op_sparse_categorical_crossentropy()
op_split()
op_sqrt()
op_square()
op_squeeze()
op_stack()
op_std()
op_stft()
op_stop_gradient()
op_subtract()
op_sum()
op_svd()
op_swapaxes()
op_switch()
op_take()
op_take_along_axis()
op_tan()
op_tanh()
op_tensordot()
op_tile()
op_top_k()
op_trace()
op_transpose()
op_tri()
op_tril()
op_triu()
op_unstack()
op_var()
op_vdot()
op_vectorize()
op_vectorized_map()
op_vstack()
op_where()
op_while_loop()
op_zeros()
op_zeros_like()

+
+ +
+ + +
+ + + +
+ + + + + + + diff --git a/docs/reference/op_segment_max.html b/docs/reference/op_segment_max.html index 9ba54b1ee..c379918f8 100644 --- a/docs/reference/op_segment_max.html +++ b/docs/reference/op_segment_max.html @@ -8,7 +8,7 @@ keras3 - 1.1.0 + 1.2.0 + + + + + +
+
+
+ +
+

Lamb is a stochastic gradient descent method that +uses layer-wise adaptive moments to adjust the +learning rate for each parameter based on the ratio of the +norm of the weight to the norm of the gradient. +This helps to stabilize the training process and improve convergence, +especially for large batch sizes.

+
+ +
+

Usage

+
optimizer_lamb(
+  learning_rate = 0.001,
+  beta_1 = 0.9,
+  beta_2 = 0.999,
+  epsilon = 1e-07,
+  weight_decay = NULL,
+  clipnorm = NULL,
+  clipvalue = NULL,
+  global_clipnorm = NULL,
+  use_ema = FALSE,
+  ema_momentum = 0.99,
+  ema_overwrite_frequency = NULL,
+  loss_scale_factor = NULL,
+  gradient_accumulation_steps = NULL,
+  name = "lamb",
+  ...
+)
+
+ +
+

Arguments

+ + +
learning_rate
+

A float, a +LearningRateSchedule() instance, or +a callable that takes no arguments and returns the actual value to +use. The learning rate. Defaults to 0.001.

+ + +
beta_1
+

A float value or a constant float tensor, or a callable +that takes no arguments and returns the actual value to use. The +exponential decay rate for the 1st moment estimates. Defaults to +0.9.

+ + +
beta_2
+

A float value or a constant float tensor, or a callable +that takes no arguments and returns the actual value to use. The +exponential decay rate for the 2nd moment estimates. Defaults to +0.999.

+ + +
epsilon
+

A small constant for numerical stability. +Defaults to 1e-7.

+ + +
weight_decay
+

Float. If set, weight decay is applied.

+ + +
clipnorm
+

Float. If set, the gradient of each weight is individually +clipped so that its norm is no higher than this value.

+ + +
clipvalue
+

Float. If set, the gradient of each weight is clipped to be +no higher than this value.

+ + +
global_clipnorm
+

Float. If set, the gradient of all weights is clipped +so that their global norm is no higher than this value.

+ + +
use_ema
+

Boolean, defaults to FALSE. +If TRUE, exponential moving average +(EMA) is applied. EMA consists of computing an exponential moving +average of the weights of the model (as the weight values change +after each training batch), and periodically overwriting the +weights with their moving average.

+ + +
ema_momentum
+

Float, defaults to 0.99. Only used if use_ema = TRUE. +This is the momentum to use when computing +the EMA of the model's weights: +new_average = ema_momentum * old_average + (1 - ema_momentum) * current_variable_value.

+ + +
ema_overwrite_frequency
+

Int or NULL, defaults to NULL. Only used if +use_ema = TRUE. Every ema_overwrite_frequency steps of iterations, +we overwrite the model variable by its moving average. +If NULL, the optimizer +does not overwrite model variables in the middle of training, +and you need to explicitly overwrite the variables +at the end of training by calling +optimizer$finalize_variable_values() (which updates the model +variables in-place). When using the built-in fit() training loop, +this happens automatically after the last epoch, +and you don't need to do anything.

+ + +
loss_scale_factor
+

Float or NULL. If a float, the loss will +be multiplied by the scale factor before computing gradients, and the inverse +of the scale factor will be multiplied by the gradients before +updating variables. Useful for preventing underflow during +mixed precision training. Alternatively, +optimizer_loss_scale() will +automatically set a loss scale factor.

+ + +
gradient_accumulation_steps
+

Int or NULL. If an int, model and optimizer +variables will not be updated at every step; instead they will be +updated every gradient_accumulation_steps steps, using the average +value of the gradients since the last update. This is known as +"gradient accumulation". This can be useful +when your batch size is very small, in order to reduce gradient +noise at each update step. EMA frequency will look at "accumulated" +iterations value (optimizer steps // gradient_accumulation_steps). +Learning rate schedules will look at "real" iterations value +(optimizer steps).

+ + +
name
+

String. The name to use +for momentum accumulator weights created by +the optimizer.

+ + +
...
+

For forward/backward compatibility.
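
The arithmetic behind two of the arguments above, the ema_momentum update rule and the "accumulated" iteration count used with gradient_accumulation_steps, can be worked through in plain R. All numbers here are hypothetical and chosen only for illustration; the snippet does not touch an optimizer object.

```r
# EMA update rule from the ema_momentum description (hypothetical values).
ema_momentum <- 0.99
old_average <- 1.0
current_variable_value <- 2.0

new_average <- ema_momentum * old_average +
  (1 - ema_momentum) * current_variable_value
new_average
# approximately 1.01

# "Accumulated" iterations: optimizer steps // gradient_accumulation_steps,
# written with R's integer division operator (hypothetical values).
optimizer_steps <- 10L
gradient_accumulation_steps <- 4L
accumulated_iterations <- optimizer_steps %/% gradient_accumulation_steps
accumulated_iterations
# 2
```

With a momentum close to 1, each step moves the average only slightly toward the current weights, which is why the averaged weights change more smoothly than the raw ones.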

+ +
+
+

Value

+

an Optimizer instance

+
+
+

References

+ +
+ + +
+ + +
+ + + +
+ + + + + + + diff --git a/docs/reference/optimizer_lion.html b/docs/reference/optimizer_lion.html index 8ae8b1339..c14fedb27 100644 --- a/docs/reference/optimizer_lion.html +++ b/docs/reference/optimizer_lion.html @@ -26,7 +26,7 @@ keras3 - 1.1.0 + 1.2.0