| id | sidebar_label | title |
|---|---|---|
| changelog | Rasa Open Source Change Log | Rasa Open Source Change Log |

All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning starting with version 1.0.
No significant changes.
No significant changes.
No significant changes.
No significant changes.
No significant changes.
- #5757: Removed previously deprecated packages `rasa_nlu` and `rasa_core`.

  Use imports from `rasa.core` and `rasa.nlu` instead.
- #5758: Removed previously deprecated classes:

  - event brokers (`EventChannel` and `FileProducer`, `KafkaProducer`, `PikaProducer`, `SQLProducer`)
  - intent classifier `EmbeddingIntentClassifier`
  - policy `KerasPolicy`

  Removed previously deprecated methods:

  - `Agent.handle_channels`
  - `TrackerStore.create_tracker_store`

  Removed support for pipeline templates in `config.yml`.

  Removed deprecated training data keys `entity_examples` and `intent_examples` from JSON training data format.
- #5834: Removed `restaurantbot` example as it was confusing and not a great way to build a bot.
- #6296: `LabelTokenizerSingleStateFeaturizer` is deprecated. To replicate `LabelTokenizerSingleStateFeaturizer` functionality, add a `Tokenizer` with `intent_tokenization_flag: True` and `CountVectorsFeaturizer` to the NLU pipeline. An example of elements to be added to the pipeline is shown in the improvement changelog entry #6296.

  `BinarySingleStateFeaturizer` is deprecated and will be removed in the future. We recommend switching to `SingleStateFeaturizer`.
- #6354: Specifying the parameters `force` and `save_to_default_model_directory` as part of the JSON payload when training a model using `POST /model/train` is now deprecated. Please use the query parameters `force_training` and `save_to_default_model_directory` instead. See the API documentation for more information.
- #6409: The conversation event `form` was renamed to `active_loop`. Rasa Open Source will continue to be able to read and process old `form` events. Note that serialized trackers will no longer have the `active_form` field. Instead, the `active_loop` field will contain the same information. Story representations in Markdown and YAML will use `active_loop` instead of `form` to represent the event.
- #6453: Removed support for `queue` argument in `PikaEventBroker` (use `queues` instead).

  Domain file:

  - Removed support for `templates` key (use `responses` instead).
  - Removed support for string `responses` (use dictionaries instead).

  NLU `Component`:

  - Removed support for `provides` attribute, it's not needed anymore.
  - Removed support for `requires` attribute (use `required_components()` instead).

  Removed `_guess_format()` utils method from `rasa.nlu.training_data.loading` (use `guess_format` instead).

  Removed several config options for TED Policy, DIETClassifier and ResponseSelector:

  - `hidden_layers_sizes_pre_dial`
  - `hidden_layers_sizes_bot`
  - `droprate`
  - `droprate_a`
  - `droprate_b`
  - `hidden_layers_sizes_a`
  - `hidden_layers_sizes_b`
  - `num_transformer_layers`
  - `num_heads`
  - `dense_dim`
  - `embed_dim`
  - `num_neg`
  - `mu_pos`
  - `mu_neg`
  - `use_max_sim_neg`
  - `C2`
  - `C_emb`
  - `evaluate_every_num_epochs`
  - `evaluate_on_num_examples`

  Please check the documentation for more information.
- #6658: `SklearnPolicy` was deprecated. `TEDPolicy` is the preferred machine-learning policy for dialogue models.
- #6952: Using the default action `action_deactivate_form` to deactivate the currently active loop / Form is deprecated. Please use `action_deactivate_loop` instead.
- #4745: Added template name to the metadata of bot utterance events.

  `BotUttered` event contains a `template_name` property in its metadata for any new bot message.
- #5086: Added a `--num-threads` CLI argument that can be passed to `rasa train` and will be used to train NLU components.
- #5510: You can now define what kind of features should be used by what component (see Choosing a Pipeline).

  You can set an alias via the option `alias` for every featurizer in your pipeline. The `alias` can be anything; by default it is set to the full featurizer class name. You can then specify, for example, on the DIETClassifier what features from which featurizers should be used. If you don't set the option `featurizers`, all available features will be used. This is also the default behavior. Check components to see what components have the option `featurizers` available.

  Here is an example pipeline that shows the new option. We define an alias for all featurizers in the pipeline. All features will be used in the `DIETClassifier`. However, the `ResponseSelector` only takes the features from the `ConveRTFeaturizer` and the `CountVectorsFeaturizer` (word level).

  ```yaml
  pipeline:
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
    alias: "convert"
  - name: CountVectorsFeaturizer
    alias: "cvf_word"
  - name: CountVectorsFeaturizer
    alias: "cvf_char"
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: RegexFeaturizer
    alias: "regex"
  - name: LexicalSyntacticFeaturizer
    alias: "lsf"
  - name: DIETClassifier
  - name: ResponseSelector
    epochs: 50
    featurizers: ["convert", "cvf_word"]
  - name: EntitySynonymMapper
  ```

  :::caution
  This change is model-breaking. Please retrain your models.

  :::
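
The alias-based selection described in #5510 can be pictured with a small sketch. This is not Rasa's implementation: `select_features` and the `(alias, vector)` pairs are hypothetical stand-ins, used only to show that a component without a `featurizers` option receives every feature, while a component that lists aliases receives only the matching ones.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical shape: (featurizer alias, feature vector).
Feature = Tuple[str, List[float]]


def select_features(
    features: Sequence[Feature], featurizers: Optional[List[str]] = None
) -> List[Feature]:
    """Keep only features whose alias is listed; keep everything if no list is given."""
    if not featurizers:
        return list(features)
    return [feature for feature in features if feature[0] in featurizers]


all_features = [
    ("convert", [0.1, 0.2]),
    ("cvf_word", [1.0, 0.0]),
    ("cvf_char", [0.0, 1.0]),
    ("regex", [1.0]),
]

# No `featurizers` option set: all features are used (the default behavior).
diet_input = select_features(all_features)
# Only the features produced under the two listed aliases are used.
selector_input = select_features(all_features, ["convert", "cvf_word"])
```
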
- #5837: Added `--port` command-line argument to the interactive learning mode to allow changing the port for the Rasa server running in the background.
- #5957: Add new entity extractor `RegexEntityExtractor`. The entity extractor extracts entities using the lookup tables and regexes defined in the training data. For more information see RegexEntityExtractor.
- #5996: Introduced a new `YAML` format for Core training data and implemented a parser for it. Rasa Open Source can now read stories in both `Markdown` and `YAML` format.
- #6020: You can now enable threaded message responses from Rasa through the Slack connector. This option is enabled using an optional configuration in the `credentials.yml` file:

  ```yaml
  slack:
    slack_token:
    slack_channel:
    use_threads: True
  ```

  Button support has also been added in the Slack connector.
- #6066: The NLU `interpreter` is now passed to the Policies during training and inference time. Note that this requires an additional parameter `interpreter` in the method `predict_action_probabilities` of the `Policy` interface. In case a custom `Policy` implementation doesn't provide this parameter, Rasa Open Source will print a warning and omit passing the `interpreter`.
- #6088: Added the new dialogue policy RulePolicy which will replace the old “rule-like” policies Mapping Policy, Fallback Policy, Two-Stage Fallback Policy, and Form Policy. These policies are now deprecated and will be removed in the future. Please see the rules documentation for more information.

  Added new NLU component FallbackClassifier which predicts an intent `nlu_fallback` in case the confidence was below a given threshold. The intent `nlu_fallback` may then be used to write stories / rules to handle the fallback in case of low NLU confidence.

  ```yaml
  pipeline:
  - # Other NLU components ...
  - name: FallbackClassifier
    # If the highest ranked intent has a confidence lower than the threshold then
    # the NLU pipeline predicts an intent `nlu_fallback` which can then be used in
    # stories / rules to implement an appropriate fallback.
    threshold: 0.5
  ```
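
The threshold rule for the FallbackClassifier in #6088 amounts to a single comparison. The following is a minimal sketch under stated assumptions: `apply_fallback` and the plain intent-to-confidence dictionary are hypothetical and do not reflect the component's actual interface.

```python
from typing import Dict

FALLBACK_INTENT = "nlu_fallback"


def apply_fallback(ranking: Dict[str, float], threshold: float = 0.5) -> str:
    """Return the top intent, or `nlu_fallback` if its confidence is below the threshold."""
    top_intent, top_confidence = max(ranking.items(), key=lambda item: item[1])
    return top_intent if top_confidence >= threshold else FALLBACK_INTENT


confident = apply_fallback({"greet": 0.92, "goodbye": 0.05})
uncertain = apply_fallback({"greet": 0.41, "goodbye": 0.38})
```

With the default threshold of 0.5, the first call keeps `greet`, while the second yields `nlu_fallback`, which stories or rules can then handle.
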
- #6132: Added possibility to split the domain into separate files. All YAML files under the path specified with `--domain` will be scanned for domain information (e.g. intents, actions, etc) and then combined into a single domain.

  The default value for `--domain` is still `domain.yml`.
- #6354: The Rasa Open Source API endpoint `POST /model/train` now supports training data in YAML format. Please specify the header `Content-Type: application/yaml` when training a model using YAML training data. See the API documentation for more information.
- #6374: Added a YAML schema and a writer for 2.0 Training Core data.
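
For the `POST /model/train` change in #6354, a request with a YAML body and the training options as query parameters could be assembled roughly as follows. This sketch only builds the request and does not send it. The base URL, the placeholder body and the `true`/`false` spelling of the query values are assumptions, and `build_train_request` is a hypothetical helper; check the API documentation for the exact contract.

```python
import urllib.parse
import urllib.request


def build_train_request(
    base_url: str,
    training_yaml: str,
    force_training: bool = False,
    save_to_default_model_directory: bool = True,
) -> urllib.request.Request:
    """Build (but do not send) a training request with a YAML payload."""
    # Training options go into the query string instead of the payload.
    query = urllib.parse.urlencode(
        {
            "force_training": str(force_training).lower(),
            "save_to_default_model_directory": str(
                save_to_default_model_directory
            ).lower(),
        }
    )
    return urllib.request.Request(
        url=f"{base_url}/model/train?{query}",
        data=training_yaml.encode("utf-8"),
        headers={"Content-Type": "application/yaml"},
        method="POST",
    )


# Assumed server address and placeholder payload.
request = build_train_request(
    "http://localhost:5005", "# YAML training data goes here\n", force_training=True
)
```
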
- #6404: Users can now use the `rasa data convert {nlu|core} -f yaml` command to convert training data from Markdown format to YAML format.
- #6536: Add option `use_lemma` to `CountVectorsFeaturizer`. By default it is set to `True`.

  `use_lemma` indicates whether the featurizer should use the lemma of a word for counting (if available) or not. If this option is set to `False`, it will use the word as it is.
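
The effect of `use_lemma` from #6536 can be illustrated with plain token counting. This is a simplified sketch, not the featurizer itself: the `(text, lemma)` token shape and `count_tokens` are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Hypothetical token shape: (text, lemma or None if no lemma is available).
Token = Tuple[str, Optional[str]]


def count_tokens(tokens: List[Token], use_lemma: bool = True) -> Counter:
    """Count lemmas when requested and available, otherwise the word as it is."""
    return Counter(
        lemma if use_lemma and lemma is not None else text
        for text, lemma in tokens
    )


tokens = [("running", "run"), ("runs", "run"), ("fast", None)]
with_lemma = count_tokens(tokens)
without_lemma = count_tokens(tokens, use_lemma=False)
```

With lemmas enabled, `running` and `runs` are counted together under `run`; with `use_lemma=False` they remain separate entries.
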
- #4536: Add support for Python 3.8.
- #5368: Changed the project structure for Rasa projects initialized with the CLI (using the `rasa init` command): `actions.py` -> `actions/actions.py`. `actions` is now a Python package (it contains a file `actions/__init__.py`). In addition, the `__init__.py` at the root of the project has been removed.
- #5481: `DIETClassifier` now also assigns a confidence value to entity predictions.
- #5637: Added behavior to the `rasa --version` command. It will now also list information about the operating system, Python version and `rasa-sdk`. This will make it easier for users to file bug reports.
- #5743: Support for additional training metadata.

  Training data messages now support kwargs and the Rasa JSON data reader includes all fields when instantiating a training data instance.
- #5748: Standardize testing output. The following test output can be produced for intents, responses, entities and stories:

  - report: a detailed report with testing metrics per label (e.g. precision, recall, accuracy, etc.)
  - errors: a file that contains incorrect predictions
  - successes: a file that contains correct predictions
  - confusion matrix: plot of confusion matrix
  - histogram: plot of confidence distribution (not available for stories)
- #5756: To avoid the problem of our entity extractors predicting entity labels for just a part of the words, we introduced a cleaning method after the prediction was done. We should avoid the incorrect prediction in the first place. To achieve this we will not tokenize words into sub-words anymore. We take the mean feature vectors of the sub-words as the feature vector of the word.

  :::caution
  This change is model-breaking. Please retrain your models.

  :::
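
The averaging step mentioned in #5756 can be sketched in a few lines. The list-of-lists representation and `mean_vector` are assumptions chosen for illustration; the actual featurizers operate on their own array types.

```python
from typing import List


def mean_vector(sub_word_vectors: List[List[float]]) -> List[float]:
    """Average the sub-word feature vectors element-wise into one word vector."""
    count = len(sub_word_vectors)
    return [sum(column) / count for column in zip(*sub_word_vectors)]


# Two sub-word vectors belonging to a single word.
word_vector = mean_vector([[1.0, 2.0], [3.0, 6.0]])
```
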
- #5759: Move option `case_sensitive` from the tokenizers to the featurizers.

  - Remove the option from the `WhitespaceTokenizer` and `ConveRTTokenizer`.
  - Add option `case_sensitive` to the `RegexFeaturizer`.
- #5766: If a user sends a voice message to the bot using Facebook, the user's message was set to the attachment's URL. The same is now also done for the other attachment types (image, video, and file).
- #5794: Creating a `Domain` using `Domain.fromDict` can no longer alter the input dictionary. Previously, there could be problems when the input dictionary was re-used for other things after creating the `Domain` from it.
- #5805: The debug-level logs when instantiating an SQLTrackerStore no longer show the password in plain text. Now, the URL is displayed with the password hidden, e.g. `postgresql://username:***@localhost:5432`.
- #5855: Shorten the information in tqdm during training ML algorithms based on the log level. If you train your model in debug mode, all available metrics will be shown during training; otherwise, the information is shortened.
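
The password masking described in #5805 can be approximated with the standard library. This is a sketch of the idea only, not the code used by the tracker store; `mask_password` is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_password(url: str) -> str:
    """Replace the password part of a URL with `***`, leaving the rest intact."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


masked = mask_password("postgresql://username:secret@localhost:5432")
```
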
- #5913: Ignore conversation test directory `tests/` when importing a project using `MultiProjectImporter` and `use_e2e` is `False`. Previously, any story data found in a project subdirectory would be imported as training data.
- #5985: Implemented model checkpointing for DIET (including the response selector) and TED. The best model during training will be stored instead of just the last model. The model is evaluated on the basis of `evaluate_every_number_of_epochs` and `evaluate_on_number_of_examples`.

  Checkpointing is enabled if and only if the following is set for the models in the `config.yml` file:

  - `checkpoint_model: True`
  - `evaluate_on_number_of_examples > 0`

  The model is stored to whatever location has been specified with the `--out` parameter when calling `rasa train nlu/core ...`.
- #6024: `rasa data split nlu` now makes sure that there is at least one example per intent and response in the test data.
- #6052: Add endpoint kwarg to `rasa.jupyter.chat` to enable using a custom action server while chatting with a model in a Jupyter notebook.
- #6055: Support for Rasa conversation IDs with special characters on the server side, which is necessary for some channels (e.g. Viber).
- #6134: Log the number of examples per intent during training. Logging can be enabled using `rasa train --debug`.
- #6237: Support for other remote storages can be achieved by using an external library.
- #6276: Allow Rasa to boot when a model loading exception occurs. Forward HTTP Error responses to standard log output.
- #6296:
  - Modified functionality of `SingleStateFeaturizer`.

    `SingleStateFeaturizer` uses trained NLU `Interpreter` to featurize intents and action names. This modified `SingleStateFeaturizer` can replicate `LabelTokenizerSingleStateFeaturizer` functionality, so `LabelTokenizerSingleStateFeaturizer` is deprecated from now on. To replicate `LabelTokenizerSingleStateFeaturizer` functionality, add a `Tokenizer` with `intent_tokenization_flag: True` and `CountVectorsFeaturizer` to the NLU pipeline. Please update your configuration file.

    For example:

    ```yaml
    language: en
    pipeline:
      - name: WhitespaceTokenizer
        intent_tokenization_flag: True
      - name: CountVectorsFeaturizer
    ```

    Please train both NLU and Core (using `rasa train`) to use a trained tokenizer and featurizer for core featurization.

    The new `SingleStateFeaturizer` stores slots, entities and forms in sparse features for more lightweight storage.

    `BinarySingleStateFeaturizer` is deprecated and will be removed in the future. We recommend switching to `SingleStateFeaturizer`.

  - Modified `TEDPolicy` to handle sparse features. As a result, `TEDPolicy` may require more epochs than before to converge.
  - Default TEDPolicy featurizer changed to `MaxHistoryTrackerFeaturizer` with infinite max history (takes all dialogue turns into account).
  - Default batch size for TED increased from [8, 32] to [64, 256].
- #6323: Response selector templates now support all features that domain utterances do. They use the YAML format instead of Markdown now. This means you can now use buttons, images, ... in your FAQ or chitchat responses (assuming they are using the response selector).

  As a consequence, training data in Markdown has to have the file suffix `.md` from now on to allow proper file type detection.
- #6457: Support for test stories written in YAML format.
- #6466: Response Selectors are now trained on retrieval intent labels by default instead of the actual response text. For most models, this should improve training time and accuracy of the `ResponseSelector`.

  If you want to revert to the pre-2.0 default behavior, add the `use_text_as_label=true` parameter to your `ResponseSelector` component.

  You can now also have multiple response templates for a single sub-intent of a retrieval intent. The first response template containing the text attribute is picked for training (if `use_text_as_label=True`) and a random template is picked for the bot's utterance, just as other `utter_` templates are picked.

  All response selector related evaluation artifacts (`report.json`, `successes.json`, `errors.json`, `confusion_matrix.png`) now use the sub-intent of the retrieval intent as the target and predicted labels instead of the actual response text.

  The output schema of `ResponseSelector` has changed: `full_retrieval_intent` and `name` have been deprecated in favour of `intent_response_key` and `response_templates` respectively. Additionally, a key `all_retrieval_intents` is added to the response selector output which will hold a list of all retrieval intents (faq, chitchat, etc.) that are present in the training data. An example output looks like this:

  ```json
  "response_selector": {
    "all_retrieval_intents": ["faq"],
    "default": {
      "response": {
        "id": 1388783286124361986,
        "confidence": 1.0,
        "intent_response_key": "faq/is_legit",
        "response_templates": [
          {
            "text": "absolutely",
            "image": "https://i.imgur.com/nGF1K8f.jpg"
          },
          {
            "text": "I think so."
          }
        ]
      },
      "ranking": [
        {
          "id": 1388783286124361986,
          "confidence": 1.0,
          "intent_response_key": "faq/is_legit"
        }
      ]
    }
  }
  ```

  An example bot demonstrating how to use the `ResponseSelector` is added to the `examples` folder.
- #6472: Do not modify conversation tracker's `latest_input_channel` property when using `POST /trigger_intent` or `ReminderScheduled`.
- #6555: Do not set the output dimension of the `sparse-to-dense` layers to the same dimension as the dense features.

  Update default value of `dense_dimension` and `concat_dimension` for `text` in `DIETClassifier` to 128.
- #6591: Retrieval actions with `respond_` prefix are now replaced with usual utterance actions with `utter_` prefix.

  If you were using retrieval actions before, rename all of them to start with `utter_` prefix. For example, `respond_chitchat` becomes `utter_chitchat`. Also, in order to keep the response templates more consistent, you should now add the `utter_` prefix to all response templates defined for retrieval intents. For example, a response template `chitchat/ask_name` becomes `utter_chitchat/ask_name`. Note that the NLU examples for this will still be under `chitchat/ask_name` intent. The example `responseselectorbot` should help clarify these changes further.
- #6613: Added telemetry reporting. Rasa uses telemetry to report anonymous usage information. This information is essential to help improve Rasa Open Source for all users. Reporting will be opt-out. More information can be found in our telemetry documentation.
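
The renaming required by #6591 is mechanical, so it can be scripted. The two helpers below are hypothetical and only illustrate the naming rule on plain strings; applying them to real domain and training files is left out.

```python
RETRIEVAL_PREFIX = "respond_"
UTTER_PREFIX = "utter_"


def rename_retrieval_action(action_name: str) -> str:
    """Turn a `respond_` retrieval action name into its `utter_` equivalent."""
    if action_name.startswith(RETRIEVAL_PREFIX):
        return UTTER_PREFIX + action_name[len(RETRIEVAL_PREFIX):]
    return action_name


def rename_response_template(template_key: str) -> str:
    """Add the `utter_` prefix to a retrieval-intent response template key."""
    if template_key.startswith(UTTER_PREFIX):
        return template_key  # already migrated
    return UTTER_PREFIX + template_key


new_action = rename_retrieval_action("respond_chitchat")
new_template = rename_response_template("chitchat/ask_name")
```

Only actions and response templates are renamed; the NLU examples stay under the original `chitchat/ask_name` intent.
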
- #5038: Fixed a bug in the `CountVectorsFeaturizer` which resulted in the very first message after loading a model to be processed incorrectly due to the vocabulary not being loaded yet.
- #5135: Fixed Rasa shell skipping button messages if buttons are attached to a message previous to the latest.
- #5385: Stack level for `FutureWarning` updated to level 2.
- #5453: Fixed a failure to return a custom utter message when it contained no value or an integer value. The template is now converted to type string.
- #5617: Don't create TensorBoard log files during prediction.
- #5638: Fixed DIET breaking with empty spaCy model.
- #5737: Pinned the library version for the Azure Cloud Storage to 2.1.0 since the persistor is currently not compatible with later versions of the azure-storage-blob library.
- #5755: Remove `clean_up_entities` from extractors that extract pre-defined entities. Just keep the clean up method for entity extractors that extract custom entities.
- #5792: Fixed issue where the `DucklingHTTPExtractor` component would not work if its `url` contained a trailing slash.
- #5808: Changed the variable `CERT_URI` in `hangouts.py` to a string type.
- #5850: Slots will be correctly interpolated for `button` responses.

  Previously this resulted in no interpolation due to a bug.
- #5905: Remove option `token_pattern` from `CountVectorsFeaturizer`. Instead all tokenizers now have the option `token_pattern`. If a regular expression is set, the tokenizer will apply the token pattern.
- #5964: Fixed a bug where custom metadata passed with the utterance always restarted the session.
- #5998: `WhitespaceTokenizer` does not remove vowel signs in Hindi anymore.
- #6042: Convert entity values coming from `DucklingHTTPExtractor` to string during evaluation to avoid mismatches due to different types.
- #6053: Update `FeatureSignature` to store just the feature dimension instead of the complete shape. This change fixes the usage of the option `share_hidden_layers` in the `DIETClassifier`.
- #6087: Unescape the `\n, \t, \r, \f, \b` tokens on reading NLU data from Markdown files.

  On converting JSON files into Markdown, the tokens mentioned above are escaped. These tokens need to be unescaped on loading the data from Markdown to ensure that the data is treated in the same way.
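
The escape/unescape round trip behind #6087 can be sketched as a pair of string replacements. This is an illustration under simplifying assumptions, not the reader and writer code: the `ESCAPES` table and both helpers are hypothetical, and a text that already contains a literal backslash followed by one of these letters would not survive the round trip.

```python
# Escaped spelling (two characters) mapped to the raw control character.
ESCAPES = {"\\n": "\n", "\\t": "\t", "\\r": "\r", "\\f": "\f", "\\b": "\b"}


def escape(text: str) -> str:
    """Replace control characters with their escaped spelling (when writing)."""
    for escaped, raw in ESCAPES.items():
        text = text.replace(raw, escaped)
    return text


def unescape(text: str) -> str:
    """Restore the control characters from their escaped spelling (when reading)."""
    for escaped, raw in ESCAPES.items():
        text = text.replace(escaped, raw)
    return text


original = "line one\nline two\tend"
written = escape(original)
round_trip = unescape(written)
```
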
- #6120: Fix the way training data is generated in `rasa test nlu` when using the `-P` flag. Each percentage of the training dataset used to be formed as a part of the last sampled training dataset and not as a sample from the original training dataset.
- #6143: Prevent `WhitespaceTokenizer` from outputting empty list of tokens.
- #6198: Add `EntityExtractor` as a required component for `EntitySynonymMapper` in a pipeline.
- #6222: Better handling of input sequences longer than the maximum sequence length that the `HFTransformersNLP` models can handle.

  During training, messages with longer sequence length should result in an error, whereas during inference they are gracefully handled but a debug message is logged. Ideally, passing messages longer than the acceptable maximum sequence lengths of each model should be avoided.
- #6231: Fixed an issue in the `DynamoTrackerStore`: if there were more than 100 DynamoDB tables, the tracker store could attempt to re-create an existing table if that table was not among the first 100 listed by the Dynamo API.
- #6282: Fixed a deprecation warning that pops up due to changes in numpy.
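
The situation behind #6231 is a paginated listing in which only the first page is inspected. The sketch below uses a fake in-memory listing function rather than a real DynamoDB client; the response keys mirror the `TableNames` / `LastEvaluatedTableName` shape of a paginated table listing, but `fake_list_tables` and `table_exists` are hypothetical and say nothing about how the tracker store is implemented.

```python
from typing import Dict, List, Optional

PAGE_SIZE = 100


def fake_list_tables(all_tables: List[str], start_after: Optional[str] = None) -> Dict:
    """Stand-in for a paginated listing call: at most PAGE_SIZE names per call."""
    start = all_tables.index(start_after) + 1 if start_after else 0
    page = all_tables[start:start + PAGE_SIZE]
    response: Dict = {"TableNames": page}
    if start + PAGE_SIZE < len(all_tables):
        # More pages follow; the caller continues after this name.
        response["LastEvaluatedTableName"] = page[-1]
    return response


def table_exists(all_tables: List[str], name: str) -> bool:
    """Walk through every page instead of trusting only the first one."""
    start_after = None
    while True:
        response = fake_list_tables(all_tables, start_after)
        if name in response["TableNames"]:
            return True
        start_after = response.get("LastEvaluatedTableName")
        if start_after is None:
            return False


tables = [f"table_{i:03d}" for i in range(150)]
first_page_only = "table_120" in fake_list_tables(tables)["TableNames"]
all_pages = table_exists(tables, "table_120")
```

Checking only the first page misses `table_120`, which is how an existing table could be mistaken for a missing one.
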
- #6291: Update `rasabaster` to fix an issue with syntax highlighting on "Prototype an Assistant" page.

  Update default stories and rules on "Prototype an Assistant" page.
- #6419: Fixed a bug in the `serialise` method of the `EvaluationStore` class which resulted in a wrong end-to-end evaluation of the predicted entities.
- #6535: Forms with slot mappings defined in `domain.yml` must now be a dictionary (with form names as keys). The previous syntax where `forms` was simply a list of form names is still supported.
- #6577: Remove BILOU tag prefix from role and group labels when creating entities.
- #6601: Fixed a bug in the featurization of the boolean slot type. Previously, to set a slot value to "true", you had to set it to "1", which is in conflict with the documentation. In older versions `true` (without quotes) was also possible, but this now raised an error during YAML validation.
- #4441: Added documentation on `ambiguity_threshold` parameter in Fallback Actions page.
- #4605: Remove outdated whitespace tokenizer warning in Testing Your Assistant documentation.
- #5640: Updated Facebook Messenger channel docs with supported attachment information.
- #5952: Update `rasa init` documentation to include `tests/conversation_tests.md` in the resulting directory tree.
Please train both NLU and Core (using
rasa train
) to use a trained tokenizer and featurizer for core featurization.The new
SingleStateFeaturizer
stores slots, entities and forms in sparse features for more lightweight storage.BinarySingleStateFeaturizer
is deprecated and will be removed in the future. We recommend to switch toSingleStateFeaturizer
.-
Modified
TEDPolicy
to handle sparse features. As a result,TEDPolicy
may require more epochs than before to converge. -
Default TEDPolicy featurizer changed to
MaxHistoryTrackerFeaturizer
with infinite max history (takes all dialogue turns into account). -
Default batch size for TED increased from [8,32] to [64, 256]
-
-
#6323: Response selector templates now support all features that domain utterances do. They use the yaml format instead of markdown now. This means you can now use buttons, images, ... in your FAQ or chitchat responses (assuming they are using the response selector).
As a consequence, training data in markdown format must have the file suffix
.md
from now on to allow proper file type detection- -
#6457: Support for test stories written in yaml format.
-
#6466: Response Selectors are now trained on retrieval intent labels by default instead of the actual response text. For most models, this should improve training time and accuracy of the
ResponseSelector
.If you want to revert to the pre-2.0 default behavior, add the
use_text_as_label=true
parameter to yourResponseSelector
component. You can now also have multiple response templates for a single sub-intent of a retrieval intent. The first response template containing the text attribute is picked for training (if
use_text_as_label=True
) and a random template is picked for bot's utterance just as how otherutter_
templates are picked.All response selector related evaluation artifacts -
report.json, successes.json, errors.json, confusion_matrix.png
now use the sub-intent of the retrieval intent as the target and predicted labels instead of the actual response text.The output schema of
ResponseSelector
has changed -full_retrieval_intent
andname
have been deprecated in favour ofintent_response_key
andresponse_templates
respectively. Additionally a keyall_retrieval_intents
is added to the response selector output which will hold a list of all retrieval intents(faq,chitchat, etc.) that are present in the training data.An example output looks like this -"response_selector": { "all_retrieval_intents": ["faq"], "default": { "response": { "id": 1388783286124361986, "confidence": 1.0, "intent_response_key": "faq/is_legit", "response_templates": [ { "text": "absolutely", "image": "https://i.imgur.com/nGF1K8f.jpg" }, { "text": "I think so." } ], }, "ranking": [ { "id": 1388783286124361986, "confidence": 1.0, "intent_response_key": "faq/is_legit" }, ]
An example bot demonstrating how to use the
ResponseSelector
is added to theexamples
folder. -
#6472: Do not modify conversation tracker's
latest_input_channel
property when usingPOST /trigger_intent
orReminderScheduled
. -
#6555: Do not set the output dimension of the
sparse-to-dense
layers to the same dimension as the dense features.Update default value of
dense_dimension
andconcat_dimension
fortext
inDIETClassifier
to 128. -
#6591: Retrieval actions with
respond_
prefix are now replaced with usual utterance actions withutter_
prefix.If you were using retrieval actions before, rename all of them to start with
utter_
prefix. For example,respond_chitchat
becomesutter_chitchat
. Also, in order to keep the response templates more consistent, you should now add theutter_
prefix to all response templates defined for retrieval intents. For example, a response templatechitchat/ask_name
becomesutter_chitchat/ask_name
. Note that the NLU examples for this will still be underchitchat/ask_name
intent. The exampleresponseselectorbot
should help clarify these changes further. -
#6613: Added telemetry reporting. Rasa uses telemetry to report anonymous usage information. This information is essential to help improve Rasa Open Source for all users. Reporting will be opt-out. More information can be found in our telemetry documentation.
-
#5038: Fixed a bug in the
CountVectorsFeaturizer
which resulted in the very first message after loading a model to be processed incorrectly due to the vocabulary not being loaded yet. -
#5135: Fixed Rasa shell skipping button messages if buttons are attached to a message previous to the latest.
-
#5385: Stack level for
FutureWarning
updated to level 2. -
#5453: Fixed a failure to return a custom utter message when it contained no value or an integer value. The template is now converted to a string.
-
#5617: Don't create TensorBoard log files during prediction.
-
#5638: Fixed DIET breaking with empty spaCy model.
-
#5737: Pinned the library version for the Azure Cloud Storage to 2.1.0 since the persistor is currently not compatible with later versions of the azure-storage-blob library.
-
#5755: Remove
clean_up_entities
from extractors that extract pre-defined entities. Just keep the clean up method for entity extractors that extract custom entities. -
#5792: Fixed issue where the
DucklingHTTPExtractor
component would not work if itsurl
contained a trailing slash. -
#5808: Changed the variable
CERT_URI
inhangouts.py
to a string type -
#5850: Slots will be correctly interpolated for
button
responses.Previously this resulted in no interpolation due to a bug.
-
#5905: Remove option
token_pattern
fromCountVectorsFeaturizer
. Instead all tokenizers now have the optiontoken_pattern
. If a regular expression is set, the tokenizer will apply the token pattern. -
#5964: Fixed a bug when custom metadata passed with the utterance always restarted the session.
-
#5998:
WhitespaceTokenizer
does not remove vowel signs in Hindi anymore. -
#6042: Convert entity values coming from
DucklingHTTPExtractor
to string during evaluation to avoid mismatches due to different types. -
#6053: Update
FeatureSignature
to store just the feature dimension instead of the complete shape. This change fixes the usage of the optionshare_hidden_layers
in theDIETClassifier
. -
#6087: Unescape the
\n, \t, \r, \f, \b
tokens on reading nlu data from markdown files. On converting json files into markdown, the tokens mentioned above are escaped. These tokens need to be unescaped on loading the data from markdown to ensure that the data is treated in the same way.
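The round trip described in #6087 can be sketched as follows. The helper names `escape` and `unescape` are hypothetical and this is not the code Rasa ships. It only shows why both directions are needed for the data to be treated the same way.

```python
import re

# Control characters and their escaped, two-character spellings.
ESCAPE_MAP = {"\n": "\\n", "\t": "\\t", "\r": "\\r", "\f": "\\f", "\b": "\\b"}
UNESCAPE_MAP = {escaped: raw for raw, escaped in ESCAPE_MAP.items()}

# Matches any one of the raw control characters.
ESCAPE_PATTERN = re.compile("[" + "".join(ESCAPE_MAP) + "]")
# Matches a backslash followed by one of the letters n, t, r, f, b.
UNESCAPE_PATTERN = re.compile(r"\\[ntrfb]")


def escape(text: str) -> str:
    """Replace control characters with their escaped spelling (JSON -> markdown)."""
    return ESCAPE_PATTERN.sub(lambda match: ESCAPE_MAP[match.group(0)], text)


def unescape(text: str) -> str:
    """Restore control characters from their escaped spelling (markdown -> memory)."""
    return UNESCAPE_PATTERN.sub(lambda match: UNESCAPE_MAP[match.group(0)], text)
```

Note that this simple sketch cannot tell an escaped newline apart from a literal backslash followed by `n` in the original text.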
-
#6120: Fix the way training data is generated in rasa test nlu when using the
-P
flag. Each percentage of the training dataset used to be formed as a part of the last sampled training dataset and not as a sample from the original training dataset. -
#6143: Prevent
WhitespaceTokenizer
from outputting empty list of tokens. -
#6198: Add
EntityExtractor
as a required component forEntitySynonymMapper
in a pipeline. -
#6222: Better handling of input sequences longer than the maximum sequence length that the
HFTransformersNLP
models can handle.During training, messages with longer sequence length should result in an error, whereas during inference they are gracefully handled but a debug message is logged. Ideally, passing messages longer than the acceptable maximum sequence lengths of each model should be avoided.
-
#6231: When using the
DynamoTrackerStore
, if there are more than 100 DynamoDB tables, the tracker could attempt to re-create an existing table if that table was not among the first 100 listed by the dynamo API. -
#6282: Fixed a deprecation warning that pops up due to changes in numpy
-
#6291: Update
rasabaster
to fix an issue with syntax highlighting on "Prototype an Assistant" page.Update default stories and rules on "Prototype an Assistant" page.
-
#6419: Fixed a bug in the
serialise
method of theEvaluationStore
class which resulted in a wrong end-to-end evaluation of the predicted entities. -
#6535: Forms with slot mappings defined in
domain.yml
must now be a dictionary (with form names as keys). The previous syntax whereforms
was simply a list of form names is still supported. -
#6577: Remove BILOU tag prefix from role and group labels when creating entities.
-
#6601: Fixed a bug in the featurization of the boolean slot type. Previously, to set a slot value to "true", you had to set it to "1", which is in conflict with the documentation. In older versions
true
(without quotes) was also possible, but now raised an error during yaml validation.
- #4441: Added documentation on
ambiguity_threshold
parameter in Fallback Actions page. - #4605: Remove outdated whitespace tokenizer warning in Testing Your Assistant documentation.
- #5640: Updated Facebook Messenger channel docs with supported attachment information
- #5952: Update
rasa init
documentation to includetests/conversation_tests.md
in the resulting directory tree.
- #6549: Fix slow training of
CRFEntityExtractor
when using Entity Roles and Groups.
-
#6044: Do not deepcopy slots when instantiating trackers. This leads to a significant speedup when training on domains with a large number of slots.
-
#6226: Added more debugging logs to the Lock Stores to simplify debugging in case of connection problems.
Added a new parameter
socket_timeout
to theRedisLockStore
. If Redis doesn't answer withinsocket_timeout
seconds to requests from Rasa Open Source, an error is raised. This avoids seemingly infinitely blocking connections and exposes connection problems early.
- #5182: Fixed a bug where domain fields such as
store_entities_as_slots
were overridden with defaults and therefore ignored. - #6191: If two entities are separated by a comma (or any other symbol), extract them as two separate entities.
- #6340: If two entities are separated by a single space and BILOU tagging is used, extract them as two separate entities based on their BILOU tags.
- #6280: Fixed
TypeError: expected string or bytes-like object
issue caused by integer, boolean, and null values in templates.
- #6255: Rasa Open Source will no longer add
responses
to theactions
section of the domain when persisting the domain as a file. This addresses related problems in Rasa X when Integrated Version Control introduced big diffs due to the added utterances in theactions
section.
- #6160: Consider entity roles/groups during interactive learning.
- #6075: Add 'Access-Control-Expose-Headers' for 'filename' header
- #6137: Fixed a bug where an invalid language variable prevents rasa from finding training examples when importing Dialogflow data.
-
#6150: Add
not_supported_language_list
to component to be able to define languages that a component can NOT handle.WhitespaceTokenizer
is not able to process languages which are not separated by whitespace.WhitespaceTokenizer
will throw an error if it is used with Chinese, Japanese, and Thai.
- #6150:
WhitespaceTokenizer
only removes emoji if complete token matches emoji regex.
- #6143: Prevent
WhitespaceTokenizer
from outputting empty list of tokens.
- #6119: Explicitly remove all emojis which appear as unicode characters from the output of
regex.sub
insideWhitespaceTokenizer
.
-
#5998:
WhitespaceTokenizer
does not remove vowel signs in Hindi anymore. -
#6031: Previously, specifying a lock store in the endpoint configuration with a type other than
redis
orin_memory
would lead to anAttributeError: 'str' object has no attribute 'type'
. This bug is fixed now. -
#6032: Fix
Interpreter parsed an intent ...
warning when using the/model/parse
endpoint with an NLU-only model. -
#6042: Convert entity values coming from any entity extractor to string during evaluation to avoid mismatches due to different types.
-
#6078: The assistant will respond through the webex channel to any user (room) communicating to it. Before the bot responded only to a fixed
roomId
set in thecredentials.yml
config file.
- #3900: Reduced duplicate logs and warnings when running
rasa train
.
-
#5972: Remove the
clean_up_entities
method from theDIETClassifier
andCRFEntityExtractor
as it led to incorrect entity predictions. -
#5976: Fix server crashes that occurred when Rasa Open Source pulls a model from a model server and an exception was thrown during model loading (such as a domain with invalid YAML).
-
#5521: Responses used in ResponseSelector now support new lines with explicitly adding
\\n
between them. -
#5758: Fixed a bug in rasa export (Export Conversations to an Event Broker) which caused Rasa Open Source to only migrate conversation events from the last Session configuration.
- #5794: Creating a
Domain
usingDomain.fromDict
can no longer alter the input dictionary. Previously, there could be problems when the input dictionary was re-used for other things after creating theDomain
from it.
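The general technique behind a fix like #5794 is to work on a copy of the caller's dictionary. The sketch below is an illustration only. `build_domain_data` and the `session_config` default it adds are hypothetical and do not mirror Rasa's internals.

```python
import copy
from typing import Any, Dict


def build_domain_data(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return normalized domain data without mutating the caller's dictionary."""
    # Deep copy so that nested lists and dicts are not shared with the input.
    working_copy = copy.deepcopy(data)
    # Hypothetical normalization step that would otherwise alter the input.
    working_copy.setdefault("session_config", {})["carry_over_slots"] = True
    working_copy.setdefault("intents", []).sort()
    return working_copy


original = {"intents": ["greet", "bye"]}
normalized = build_domain_data(original)
```

After the call, `original` still holds exactly what the caller passed in, so it can be safely re-used.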
-
#5617: Don't create TensorBoard log files during prediction.
-
#5638: Fix: DIET breaks with empty spaCy model
-
#5755: Remove
clean_up_entities
from extractors that extract pre-defined entities. Just keep the clean up method for entity extractors that extract custom entities. -
#5792: Fixed issue where the
DucklingHTTPExtractor
component would not work if its url contained a trailing slash. -
#5825: Fix list index out of range error in
ensure_consistent_bilou_tagging
.
- #5788
-
#3765: Add support for entities with roles and grouping of entities in Rasa NLU.
You can now define a role and/or group label in addition to the entity type for entities. Use the role label if an entity can play different roles in your assistant. For example, a city can be a destination or a departure city. The group label can be used to group multiple entities together. For example, you could group different pizza orders, so that you know which toppings go with which pizza and what size each pizza has. For more details see Entities Roles and Groups.
To fill slots from entities with a specific role/group, you need to either use forms or use a custom action. We updated the tracker method
get_latest_entity_values
to take an optional role/group label. If you want to use a form, you can add the specific role/group label of interest to the slot mapping functionfrom_entity
(see Forms).:::note Composite entities are currently just supported by the DIETClassifier and CRFEntityExtractor.
:::
-
#5465: Update training data format for NLU to support entities with a role or group label.
You can now specify synonyms, roles, and groups of entities using the following data format: Markdown:
[LA]{"entity": "location", "role": "city", "group": "CA", "value": "Los Angeles"}
JSON:
"entities": [ { "start": 10, "end": 12, "value": "Los Angeles", "entity": "location", "role": "city", "group": "CA", } ]
The markdown format
[LA](location:Los Angeles)
is deprecated. To update your training data file just execute the following command on the terminal of your choice:sed -i -E 's/\\[([^)]+)\\]\\(([^)]+):([^)]+)\\)/[\\1]{"entity": "\\2", "value": "\\3"}/g' nlu.md
For more information about the new data format see Training Data Format.
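If `sed` is not available, the same conversion can be approximated in Python. This is a sketch rather than an official migration tool: `convert_annotations` is a hypothetical helper, and its pattern is slightly stricter than the `sed` expression above because it excludes `]` from the annotated text and `:` from the entity name.

```python
import re

# [text](entity:synonym) -> three groups: text, entity, synonym
OLD_ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^:)]+):([^)]+)\)")


def convert_annotations(line: str) -> str:
    """Rewrite deprecated synonym annotations into the new dictionary format."""
    return OLD_ANNOTATION.sub(r'[\1]{"entity": "\2", "value": "\3"}', line)


converted = convert_annotations("I live in [LA](location:Los Angeles)")
unchanged = convert_annotations("I live in [Berlin](city)")
```

Annotations without a synonym, such as `[Berlin](city)`, contain no colon and are left untouched.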
-
#2224: Suppressed
pika
logs when establishing the connection. These log messages mostly happened when Rasa X and RabbitMQ were started at the same time. Since RabbitMQ can take a few seconds to initialize, Rasa X has to re-try until the connection is established. In case you suspect a different problem (such as failing authentication) you can re-enable thepika
logs by setting the log level toDEBUG
. To run Rasa Open Source in debug mode, use the--debug
flag. To run Rasa X in debug mode, set the environment variableDEBUG_MODE
totrue
. -
#3419: Include the source filename of a story in the failed stories
Include the source filename of a story in the failed stories to make it easier to identify the file which contains the failed story.
-
#5544: Add confusion matrix and “confused_with” to response selection evaluation
If you are using ResponseSelectors, they now produce similar outputs during NLU evaluation. Misclassified responses are listed in a “confused_with” attribute in the evaluation report. Similarly, a confusion matrix of all responses is plotted.
-
#5578: Added
socketio
to the compatible channels for Reminders and External Events. -
#5595: Update
POST /model/train
endpoint to accept retrieval action responses at theresponses
key of the JSON payload. -
#5627: All Rasa Open Source images are now using Python 3.7 instead of Python 3.6.
-
#5635: Update dependencies based on the
dependabot
check. -
#5636: Add dropout between
FFNN
andDenseForSparse
layers inDIETClassifier
,ResponseSelector
andEmbeddingIntentClassifier
controlled byuse_dense_input_dropout
config parameter. -
#5646:
DIETClassifier
only counts as extractor inrasa test
if it was actually trained for entity recognition. -
#5669: Remove regularization gradient for variables that don't have prediction gradient.
-
#5672: Raise a warning in
CRFEntityExtractor
andDIETClassifier
if entities are not correctly annotated in the training data, e.g. their start and end values do not match any start and end values of tokens. -
#5690: Add
full_retrieval_intent
property toResponseSelector
rankings -
#5717: Change default values for hyper-parameters in
EmbeddingIntentClassifier
andDIETClassifier
Use
scale_loss=False
inDIETClassifier
. Reduce the number of dense dimensions for sparse features of text from 512 to 256 inEmbeddingIntentClassifier
.
-
#5230: Fixed issue where posting to certain callback channel URLs would return a 500 error on successful posts due to invalid response format.
-
#5475: One word can just have one entity label.
If you are using, for example,
ConveRTTokenizer
words can be split into multiple tokens. Our entity extractors assign entity labels per token, so a word that was split into two tokens could be assigned two different entity labels. This is now fixed: one word can have only one entity label at a time. -
#5509: An entity label should always cover a complete word.
If you are using, for example,
ConveRTTokenizer
words can be split into multiple tokens. Our entity extractors assign entity labels per token, so it could happen that just a part of a word had an entity label. This is now fixed: an entity label always covers a complete word. -
#5574: Fixed an issue that happened when metadata is passed in a new session.
Now the metadata is correctly passed to the ActionSessionStart.
-
#5672: Updated Python dependency
ruamel.yaml
to>=0.16
. We recommend to use at least0.16.10
due to the security issue CVE-2019-20478 which is present in prior versions.
- #5556, #5587, #5614, #5631, #5633
- #4606: The stream reading timeout for `rasa shell` is now configurable by using the environment variable `RASA_SHELL_STREAM_READING_TIMEOUT_IN_SECONDS`. This can help to fix problems when using `rasa shell` with custom actions which run 10 seconds or longer.
-
#5709: Reverted changes in 1.9.6 that led to model incompatibility. Upgrade to 1.9.7 to fix
self.sequence_lengths_for(tf_batch_data[TEXT_SEQ_LENGTH][0]) IndexError: list index out of range
error without needing to retrain earlier 1.9 models.Therefore, all 1.9 models except for 1.9.6 will be compatible; a model trained on 1.9.6 will need to be retrained on 1.9.7.
-
#5426: Fix rasa test nlu plotting when using multiple runs.
-
#5489: Fixed issue where
max_number_of_predictions
was not considered when running end-to-end testing.
- #5626
-
#5533: Support for PostgreSQL schemas in SQLTrackerStore. The
SQLTrackerStore
accesses schemas defined by thePOSTGRESQL_SCHEMA
environment variable if connected to a PostgreSQL database.The schema is added to the connection string option's
-csearch_path
key, e.g.-options=-csearch_path=<SCHEMA_NAME>
(see https://www.postgresql.org/docs/11/contrib-dblink-connect.html for more details). As before, if noPOSTGRESQL_SCHEMA
is defined, Rasa uses the database's default schema (public
).The schema has to exist in the database before connecting, i.e. it needs to have been created with
CREATE SCHEMA schema_name;
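As a rough illustration of #5533, the snippet below derives the connection option from the environment variable. `schema_query_options` is a hypothetical helper, not part of Rasa's API, and it performs no database access.

```python
import os
from typing import Dict, Mapping, Optional


def schema_query_options(
    environ: Mapping[str, str] = os.environ,
) -> Optional[Dict[str, str]]:
    """Build the `options` entry that selects the PostgreSQL schema, if one is set."""
    schema = environ.get("POSTGRESQL_SCHEMA")
    if not schema:
        # No schema configured: fall back to the database's default schema.
        return None
    return {"options": f"-csearch_path={schema}"}


with_schema = schema_query_options({"POSTGRESQL_SCHEMA": "rasa"})
without_schema = schema_query_options({})
```

Passing the environment as a parameter keeps the sketch easy to test without touching the real process environment.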
- #5547: Fixed ambiguous logging in
DIETClassifier
by adding the name of the calling class to the log message.
- #5529: Fix memory leak problem on increasing number of calls to
/model/parse
endpoint.
- #5505: Set default value for
weight_sparsity
inResponseSelector
to0
. This fixes a bug in the default behavior ofResponseSelector
which was accidentally introduced inrasa==1.8.0
. Users should update to this version and re-train their models ifResponseSelector
was used in their pipeline.
- #5497: Fix documentation to bring back Sara.
- #5492: Fix an issue where the deprecated
queue
parameter for the Pika Event Broker was ignored and Rasa Open Source published the events to therasa_core_events
queue instead. Note that this does not change the fact that thequeue
argument is deprecated in favor of thequeues
argument.
-
#5006: Channel
hangouts
for Rasa integration with Google Hangouts Chat is now supported out-of-the-box. -
#5389: Add an optional path to a specific directory to download and cache the pre-trained model weights for HFTransformersNLP.
-
#5422: Add options
tensorboard_log_directory
andtensorboard_log_level
toEmbeddingIntentClassifier
,DIETClassifier
,ResponseSelector
,EmbeddingPolicy
andTEDPolicy
.By default
tensorboard_log_directory
isNone
. If a valid directory is provided, metrics are written during training. After the model is trained you can take a look at the training metrics in tensorboard. Executetensorboard --logdir <path-to-given-directory>
.Metrics can either be written after every epoch (default) or for every training step. You can specify when to write metrics using the variable
tensorboard_log_level
. Valid values are 'epoch' and 'minibatch'.We also write down a model summary, i.e. layers with inputs and types, to the given directory.
-
#4756: Make response timeout configurable.
rasa run
,rasa shell
andrasa x
can now be started with--response-timeout <int>
to configure a response timeout of<int>
seconds. -
#4826: Add full retrieval intent name to message data
ResponseSelector
will now add the full retrieval intent name e.g.faq/which_version
to the prediction, making it accessible from the tracker. -
#5258: Added
PikaEventBroker
(Pika Event Broker) support for publishing to multiple queues. Messages are now published to afanout
exchange with namerasa-exchange
(see exchange-fanout for more information onfanout
exchanges).The former
queue
key is deprecated. Queues should now be specified as a list in theendpoints.yml
event broker config under a new keyqueues
. Example config:event_broker: type: pika url: localhost username: username password: password queues: - queue-1 - queue-2 - queue-3
-
#5416: Change
rasa init
to includetests/conversation_tests.md
file by default. -
#5446: The endpoint
PUT /conversations/<conversation_id>/tracker/events
no longer adds session start events (to learn more about conversation sessions, please see Session configuration) in addition to the events which were sent in the request payload. To achieve the old behavior send aGET /conversations/<conversation_id>/tracker
request before appending events. -
#5482: Make
scale_loss
for intents behave the same way as in versions below1.8
, but only scale if some of the examples in a batch have a probability of the golden label of more than 0.5
. Introducescale_loss
for entities inDIETClassifier
.
-
#5205: Fixed the bug when FormPolicy was overwriting MappingPolicy prediction (e.g.
/restart
). Priorities for Mapping Policy and Form Policy are no longer linear:FormPolicy
priority is 5, but its prediction is ignored ifMappingPolicy
is used for prediction. -
#5215: Fixed issue related to storing Python
float
values asdecimal.Decimal
objects in DynamoDB tracker stores. Alldecimal.Decimal
objects are now converted tofloat
on tracker retrieval.Added a new docs section on DynamoTrackerStore.
-
#5356: Fixed bug where
FallbackPolicy
would always fall back if the fallback action isaction_listen
. -
#5361: Fixed bug where starting or ending a response with
\\n\\n
led to one of the responses returned being empty. -
#5405: Fixes issue where model always gets retrained if multiple NLU/story files are in a directory, by sorting the list of files.
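Why sorting matters for #5405 can be shown with a small fingerprinting sketch. The `fingerprint` function is hypothetical and takes in-memory `(file name, content)` pairs. It assumes, without the source confirming it, that the retraining decision is based on some hash over the files.

```python
import hashlib
from typing import Iterable, Tuple


def fingerprint(files: Iterable[Tuple[str, str]]) -> str:
    """Hash file names and contents in sorted order so listing order is irrelevant."""
    digest = hashlib.md5()
    for name, content in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(content.encode("utf-8"))
    return digest.hexdigest()


# The same two files, listed in a different order by the file system.
first_run = fingerprint([("b.md", "## story b"), ("a.md", "## story a")])
second_run = fingerprint([("a.md", "## story a"), ("b.md", "## story b")])
```

Without the `sorted` call, the two runs could produce different hashes for identical data and trigger an unnecessary retraining.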
-
#5444: Fixed ambiguous logging in DIETClassifier by adding the name of the calling class to the log message.
-
#2237: Restructure the “Evaluating models” documentation page and rename this page to Testing Your Assistant.
-
#5302: Improved documentation on how to build and deploy an action server image for use on other servers such as Rasa X deployments.
- #5340
-
#5405: Fixes issue where model always gets retrained if multiple NLU/story files are in a directory, by sorting the list of files.
-
#5444: Fixed ambiguous logging in DIETClassifier by adding the name of the calling class to the log message.
-
#5506: Set default value for
weight_sparsity
inResponseSelector
to0
. This fixes a bug in the default behavior ofResponseSelector
which was accidentally introduced inrasa==1.8.0
. Users should update to this version orrasa>=1.9.3
and re-train their models ifResponseSelector
was used in their pipeline.
- #5302: Improved documentation on how to build and deploy an action server image for use on other servers such as Rasa X deployments.
-
#5438: Fixed bug when installing rasa with
poetry
. -
#5413: Fixed bug with
EmbeddingIntentClassifier
, where results weren't the same as in 1.7.x. Fixed by setting weight sparsity to 0.
-
#5404: Explain how to run commands as
root
user in Rasa SDK Docker images since version1.8.0
. Since version1.8.0
the Rasa SDK Docker images no longer run as root
user by default. For commands which requireroot
user usage, you have to switch back to theroot
user in your Docker image as described in Building an Action Server Image. -
#5402: Made improvements to Building Assistants tutorial
- #5354: Fixed issue with using language models like
xlnet
along withentity_recognition
set toTrue
insideDIETClassifier
.
- #5330, #5348
-
#4991: Removed
Agent.continue_training
and thedump_flattened_stories
parameter fromAgent.persist
. -
#5266: Properties
Component.provides
andComponent.requires
are deprecated. UseComponent.required_components()
instead.
-
#2674: Add default value
__other__
tovalues
of aCategoricalSlot
.All values not mentioned in the list of values of a
CategoricalSlot
will be mapped to__other__
for featurization. -
#4088: Add story structure validation functionality (e.g. rasa data validate stories --max-history 5).
-
#5065: Add LexicalSyntacticFeaturizer to sparse featurizers.
LexicalSyntacticFeaturizer
does the same featurization as theCRFEntityExtractor
. We extracted the featurization into a separate component so that the features can be reused and featurization is independent from the entity extraction. -
#5187: Integrate language models from HuggingFace's Transformers Library.
Add a new NLP component HFTransformersNLP which tokenizes and featurizes incoming messages using a specified pre-trained model with the Transformers library as the backend. Add LanguageModelTokenizer and LanguageModelFeaturizer which use the information from HFTransformersNLP and set it correctly for the message object. Language models currently supported: BERT, OpenAIGPT, GPT-2, XLNet, DistilBert, RoBERTa.
-
#5225: Added a new CLI command
rasa export
to publish tracker events from a persistent tracker store using an event broker. See Export Conversations to an Event Broker, Tracker Stores and Event Brokers for more details. -
#5230: Refactor how GPU and CPU environments are configured for TensorFlow 2.0.
Please refer to the documentation to understand which environment variables to set in what scenarios. A couple of examples are shown below as well:
# This specifies to use 1024 MB of memory from GPU with logical ID 0 and 2048 MB of memory from GPU with logical ID 1 TF_GPU_MEMORY_ALLOC="0:1024, 1:2048" # Specifies that at most 3 CPU threads can be used to parallelize multiple non-blocking operations TF_INTER_OP_PARALLELISM_THREADS="3" # Specifies that at most 2 CPU threads can be used to parallelize a particular operation. TF_INTRA_OP_PARALLELISM_THREADS="2"
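To make the `TF_GPU_MEMORY_ALLOC` format above concrete, here is a sketch of how such a value could be parsed. `parse_gpu_memory_alloc` is a hypothetical helper and does not reproduce Rasa's own parsing code.

```python
from typing import Dict


def parse_gpu_memory_alloc(value: str) -> Dict[int, int]:
    """Parse "<gpu id>:<memory in MB>" pairs separated by commas."""
    allocation: Dict[int, int] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and empty values
        gpu_id, memory_mb = entry.split(":")
        allocation[int(gpu_id)] = int(memory_mb)
    return allocation


allocation = parse_gpu_memory_alloc("0:1024, 1:2048")
```

The result maps each logical GPU ID to the amount of memory, in MB, that may be used on it.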
-
#5266: Added a new NLU component DIETClassifier and a new policy TEDPolicy.
DIET (Dual Intent and Entity Transformer) is a multi-task architecture for intent classification and entity recognition. You can read more about this component in our documentation. The new component will replace the
EmbeddingIntentClassifier
and the CRFEntityExtractor in the future. Those two components are deprecated from now on. See migration guide for details on how to switch to the new component.TEDPolicy is the new name for EmbeddingPolicy.
EmbeddingPolicy
is deprecated from now on. The functionality ofTEDPolicy
andEmbeddingPolicy
is the same. Please update your configuration file to use the new name for the policy. -
#663: The sentence vector of the
SpacyFeaturizer
andMitieFeaturizer
can be calculated using max or mean pooling.To specify the pooling operation, set the option
pooling
for theSpacyFeaturizer
or theMitieFeaturizer
in your configuration file. The default pooling operation ismean
. The mean pooling operation also does not take into account words that do not have a word vector.
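The two pooling operations can be sketched in plain Python. This is an illustration under one loudly stated assumption: a word without a word vector is represented as an all-zero vector. The `pool` function is hypothetical and not the featurizers' actual code.

```python
from typing import List, Sequence


def pool(token_vectors: Sequence[Sequence[float]], pooling: str = "mean") -> List[float]:
    """Combine per-token vectors into one sentence vector."""
    if pooling == "max":
        return [max(column) for column in zip(*token_vectors)]
    if pooling == "mean":
        # Assumption: all-zero vectors mark words that have no word vector.
        known = [vector for vector in token_vectors if any(x != 0 for x in vector)]
        if not known:
            return [0.0] * len(token_vectors[0])
        return [sum(column) / len(known) for column in zip(*known)]
    raise ValueError(f"Unknown pooling operation: {pooling}")


vectors = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mean_pooled = pool(vectors, "mean")
max_pooled = pool(vectors, "max")
```

In the mean case the third token is ignored, so the average is taken over two vectors instead of three.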
-
#3975: Added command line argument
--conversation-id
torasa interactive
. If the argument is not given,conversation_id
defaults to a random uuid. -
#4653: Added a new command-line argument
--init-dir
to commandrasa init
to specify the directory in which the project is initialised. -
#4682: Added support to send images with the twilio output channel.
-
#4817: Part of Slack sanitization: Multiple garbled URLs in a string coming from Slack will be converted into actual strings.
Example: health check of <http://eemdb.net|eemdb.net> and <http://eemdb1.net|eemdb1.net> to health check of eemdb.net and eemdb1.net
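The transformation in the example above can be reproduced with a single regular expression. This is a minimal sketch: `sanitize_slack_links` is a hypothetical helper, not the Slack channel's actual code, and it only handles links of the `<url|label>` form.

```python
import re

# <http://eemdb.net|eemdb.net> -> keep only the label after the pipe
SLACK_LINK = re.compile(r"<https?://[^|>]+\|([^>]+)>")


def sanitize_slack_links(text: str) -> str:
    """Replace every Slack-formatted link with its plain-text label."""
    return SLACK_LINK.sub(r"\1", text)


cleaned = sanitize_slack_links(
    "health check of <http://eemdb.net|eemdb.net> and <http://eemdb1.net|eemdb1.net>"
)
```

Because `re.sub` replaces all non-overlapping matches, several links in one message are handled in a single pass.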
-
#5117: Added a new command-line argument --conversation-id that sets a specific conversation ID for each shell session. If it is not passed, a random ID is used.
-
#5211: Messages sent to the Pika Event Broker are now persisted. This guarantees that RabbitMQ will re-send previously received messages after a crash. Note that this does not help for the case where messages are sent to an unavailable RabbitMQ instance.
-
#5250: Added support for mattermost connector to use bot accounts.
-
#5266: We updated our code to TensorFlow 2.
-
#5317: Events exported using
rasa export
receive a message header if published through aPikaEventBroker
. The header is added to the message'sBasicProperties.headers
under therasa-export-process-id
key (rasa.core.constants.RASA_EXPORT_PROCESS_ID_HEADER_NAME
). The value is a UUID4 generated at each call ofrasa export
. The resulting header is a key-value pair that looks as follows:'rasa-export-process-id': 'd3b3d3ffe2bd4f379ccf21214ccfb261'
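The shape of such a header can be reproduced with the standard library. This sketch does not use pika, and the helper `build_export_headers` is hypothetical; it assumes the value is the dash-free hexadecimal form of a UUID4, which matches the example value above.

```python
import uuid
from typing import Dict

# Header key as given in the entry above.
RASA_EXPORT_PROCESS_ID_HEADER_NAME = "rasa-export-process-id"


def build_export_headers() -> Dict[str, str]:
    """Return a headers dictionary with a fresh process ID."""
    # uuid4().hex is the 32-character hexadecimal form without dashes.
    return {RASA_EXPORT_PROCESS_ID_HEADER_NAME: uuid.uuid4().hex}


headers = build_export_headers()
```

Because a new UUID4 is generated per call, consumers can use the header to tell apart events from different export runs.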
-
#5292: Added
followlinks=True
to os.walk calls, to allow the use of symlinks in training, NLU and domain data. -
#4811: Support invoking a
SlackBot
by direct messaging or@<app name>
mentions.
-
#4006: Fixed timestamp parsing warning when using DucklingHTTPExtractor
-
#4601: Fixed issue with
action_restart
getting overridden byaction_listen
when theMappingPolicy
and the TwoStageFallbackPolicy are used together. -
#5201: Fixed incorrectly raised Error encountered in pipelines with a
ResponseSelector
and NLG.When NLU training data is split before NLU pipeline comparison, NLG responses were not also persisted and therefore training for a pipeline including the
ResponseSelector
would fail.NLG responses are now persisted along with NLU data to a
/train
directory in therun_x/xx%_exclusion
folder. -
#5277: Fixed sending custom json with Twilio channel
-
#5174: Updated the documentation to properly suggest not to explicitly add utterance actions to the domain.
-
#5189: Added user guide for reminders and external events, including
reminderbot
demo.
- #3923, #4597, #4903, #5180, #5189, #5266, #699
-
#5068: Tracker stores supporting conversation sessions (
SQLTrackerStore
andMongoTrackerStore
) do not save the tracker state to the database immediately after starting a new conversation session. As a result, the number of new events to save in addition to the already-existing ones is calculated correctly. This fixes
action_listen
events being saved twice at the beginning of conversation sessions.
- #5231: Fix segmentation fault when running
rasa train
orrasa shell
.
- #5286: Fix doc links on “Deploying your Assistant” page
- #5197: Fixed incompatibility of Oracle with the SQLTrackerStore, by using a
Sequence
for the primary key columns. This does not change anything for SQL databases other than Oracle. If you are using Oracle, please create a sequence with the instructions in the SQLTrackerStore docs.
-
#5197: Added section on setting up the SQLTrackerStore with Oracle
-
#5210: Renamed “Running the Server” page to “Configuring the HTTP API”
-
#5106: Fixed loading of story files that are not valid UTF-8. Loading now fails properly when checking for story files.
-
#5162: Fix problem with multi-intents. Training with multi-intents using the
CountVectorsFeaturizer
together withEmbeddingIntentClassifier
is working again. -
#5171: Fix bug
ValueError: Cannot concatenate sparse features as sequence dimension does not match
.When training a Rasa model that contains responses for just some of the intents, training was failing. Fixed the featurizers to return a consistent feature vector in case no response was given for a specific message.
-
#5199: If no text features are present in
EmbeddingIntentClassifier
return the intentNone
. -
#5216: Resolve version conflicts: Pin version of cloudpickle to ~=1.2.0.
-
#4964: The endpoint
/conversations/<conversation_id>/execute
is now deprecated. Instead, users should use the/conversations/<conversation_id>/trigger_intent
endpoint and thus trigger intents instead of actions. -
#4978: Remove option
use_cls_token
from tokenizers and optionreturn_sequence
from featurizers. By default, all tokenizers add a special token (
__CLS__
to the end of the list of tokens. This token is used to capture the features of the whole utterance. By default, the featurizers return a matrix of size (number-of-tokens x feature-dimension), which makes it possible to train sequence models. However, the feature vector of the
__CLS__
token can be used to train non-sequence models. The corresponding classifier can decide what kind of features to use.
-
#400: Rename
templates
key in domain toresponses
.templates
key will still work for backwards compatibility but will raise a future warning. -
#4902: Added a new configuration parameter,
ranking_length
to theEmbeddingPolicy
,EmbeddingIntentClassifier
, andResponseSelector
classes. -
#4964: External events and reminders now trigger intents (and entities) instead of actions.
Add new endpoint
/conversations/<conversation_id>/trigger_intent
, which lets the user specify an intent and a list of entities that is injected into the conversation in place of a user message. The bot then predicts and executes a response action. -
#4978: Add
ConveRTTokenizer
.The tokenizer should be used whenever the
ConveRTFeaturizer
is used.Every tokenizer now supports the following configuration options:
intent_tokenization_flag
: Flag to check whether to split intents (defaultFalse
).intent_split_symbol
: Symbol on which intent should be split (default_
)
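The effect of the two options can be sketched as follows. The function `tokenize_intent` is hypothetical and only mirrors the option names and defaults listed above; it is not the tokenizer implementation.

```python
from typing import List


def tokenize_intent(
    intent: str,
    intent_tokenization_flag: bool = False,
    intent_split_symbol: str = "_",
) -> List[str]:
    """Split an intent label into tokens if the flag is set."""
    if not intent_tokenization_flag:
        # Default behavior: the intent label is kept as a single token.
        return [intent]
    return intent.split(intent_split_symbol)


unsplit = tokenize_intent("greet+ask_weather")
split = tokenize_intent(
    "greet+ask_weather", intent_tokenization_flag=True, intent_split_symbol="+"
)
```

With a split symbol of `+`, a multi-intent label such as `greet+ask_weather` becomes two tokens, while the default settings leave it untouched.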
-
#1988: Remove the need of specifying utter actions in the
actions
section explicitly if these actions are already listed in thetemplates
section. -
#4877: Entity examples that have been extracted using an external extractor are excluded from Markdown dumping in
MarkdownWriter.dumps()
. The excluded external extractors areDucklingHTTPExtractor
andSpacyEntityExtractor
. -
#4902: The
EmbeddingPolicy
,EmbeddingIntentClassifier
, andResponseSelector
now by default normalize confidence levels over the top 10 results. See Rasa 1.6 to Rasa 1.7 for more details. -
#4964:
ReminderCancelled
can now cancel multiple reminders if no name is given. It still cancels a single reminder if the reminder's name is specified.
-
#4774: Requests to
/model/train
no longer block other requests to the Rasa server. -
#4896: Fixed default behavior of
rasa test core --evaluate-model-directory
when called without--model
. Previously, the latest model file was used as--model
. Now the default model directory is used instead.New behavior of
rasa test core --evaluate-model-directory
when given an existing file as argument for--model
: Previously, this led to an error. Now a warning is displayed and the directory containing the given file is used as--model
. -
#5040: Updated the dependency
networkx
from 2.3.0 to 2.4.0. The old version created incompatibilities when using pip. There was an incompatibility between the Rasa dependency requests 2.22.0 and Rasa's own dependency on networkx, which raised errors upon pip install. This also corrects a bug in
requirements.txt
which used~=
instead of==
. All of these are fixed using networkx 2.4.0. -
#5057: Fixed compatibility issue with Microsoft Bot Framework Emulator if
service_url
lacked a trailing/
. -
#5092: DynamoDB tracker store decimal values will now be rounded on save. Previously values exceeding 38 digits caused an unhandled error.
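One way to round a value to a fixed number of significant digits is a `decimal.Context` with limited precision. The sketch below is not the tracker store's code; the helper `round_for_storage` is hypothetical, and it assumes that 38 significant digits is the limit meant by the entry.

```python
from decimal import Context, Decimal

# Assumption: the limit mentioned above is 38 significant digits.
MAX_DIGITS = 38
_CONTEXT = Context(prec=MAX_DIGITS)


def round_for_storage(value: Decimal) -> Decimal:
    """Round a decimal to at most MAX_DIGITS significant digits."""
    # create_decimal applies the context precision, rounding if necessary.
    return _CONTEXT.create_decimal(value)


long_value = Decimal("0." + "1" * 50)
rounded = round_for_storage(long_value)
```

Values that already fit within the limit pass through unchanged.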
- #4458, #4664, #4780, #5029
- #4994: Switching back to a TensorFlow release which only includes CPU support to reduce the
size of the dependencies. If you want to use the TensorFlow package with GPU support,
please run
pip install tensorflow-gpu==1.15.0
.
-
#5111: Fixes
Exception 'Loop' object has no attribute '_ready'
error when runningrasa init
. -
#5126: Updated the end-to-end ValueError you receive when you have an invalid story format to point to the updated doc link.
-
#4989: Use an empty domain in case a model is loaded which has no domain (avoids errors when accessing
agent.domain.<some attribute>
). -
#4995: Replace error message with warning in tokenizers and featurizers if default parameter not set.
-
#5019: Pin sanic patch version instead of minor version. Fixes sanic
_run_request_middleware()
error. -
#5032: Fix wrong calculation of additional conversation events when saving the conversation. This led to conversation events not being saved.
-
#5032: Fix wrong order of conversation events when pushing events to conversations via
POST /conversations/<conversation_id>/tracker/events
.
-
#4935: Removed
ner_features
as a feature name fromCRFEntityExtractor
, usetext_dense_features
instead.The following settings match the previous
NGramFeaturizer
:
pipeline:
- name: 'CountVectorsFeaturizer'
  analyzer: 'char_wb'
  min_ngram: 3
  max_ngram: 17
  max_features: 10
  min_df: 5
-
#4957: To use custom features in the
CRFEntityExtractor
usetext_dense_features
instead ofner_features
. Iftext_dense_features
are present in the feature set, theCRFEntityExtractor
will automatically make use of them. Just make sure to add a dense featurizer in front of theCRFEntityExtractor
in your pipeline and set the flagreturn_sequence
toTrue
for that featurizer. -
#4990: Deprecated
Agent.continue_training
. Instead, a model should be retrained. -
#684: Specifying lookup tables directly in the NLU file is now deprecated. Please specify them in an external file.
-
#4795: Replaced the warnings about missing templates, intents etc. in validator.py by debug messages.
-
#4830: Added conversation sessions to trackers.
A conversation session represents the dialog between the assistant and a user. Conversation sessions can begin in three ways: 1. the user begins the conversation with the assistant, 2. the user sends their first message after a configurable period of inactivity, or 3. a manual session start is triggered with the
/session_start
intent message. The period of inactivity after which a new conversation session is triggered is defined in the domain using thesession_expiration_time
key in thesession_config
section. The introduction of conversation sessions comprises the following changes:-
Added a new event
SessionStarted
that marks the beginning of a new conversation session. -
Added a new default action
ActionSessionStart
. This action takes allSlotSet
events from the previous session and applies it to the next session. -
Added a new default intent
session_start
which triggers the start of a new conversation session. -
SQLTrackerStore
andMongoTrackerStore
only retrieve events from the last session from the database.
:::note The session behavior is disabled for existing projects, i.e. existing domains without session config section.
:::
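The inactivity rule can be sketched as a small predicate. The function `needs_new_session` is hypothetical; it assumes that `session_expiration_time` is given in minutes and that a missing value stands for a domain without a session config section.

```python
from typing import Optional


def needs_new_session(
    last_event_timestamp: Optional[float],
    now: float,
    session_expiration_time: Optional[float],
) -> bool:
    """Decide whether an incoming user message should start a new session."""
    if session_expiration_time is None:
        # No session config in the domain: session behavior is disabled.
        return False
    if last_event_timestamp is None:
        # The user begins the conversation with the assistant.
        return True
    # Assumption: the expiration time is expressed in minutes,
    # timestamps in seconds.
    return now - last_event_timestamp > session_expiration_time * 60


first_message = needs_new_session(None, now=100.0, session_expiration_time=60)
after_long_pause = needs_new_session(0.0, now=3601.0, session_expiration_time=60)
after_short_pause = needs_new_session(0.0, now=3599.0, session_expiration_time=60)
```

A manual session start via the `/session_start` intent message is a separate trigger and is not covered by this predicate.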
-
-
#4935: Preparation for an upcoming change in the
EmbeddingIntentClassifier
:Add option
use_cls_token
to all tokenizers. If it is set toTrue
, the token__CLS__
will be added to the end of the list of tokens. Default is set toFalse
. No need to change the default value for now.Add option
return_sequence
to all featurizers. By default all featurizers return a matrix of size (1 x feature-dimension). If the optionreturn_sequence
is set toTrue
, the corresponding featurizer will return a matrix of size (token-length x feature-dimension). See Text Featurizers. Default value is set toFalse
. However, you might want to set it toTrue
if you want to use custom features in theCRFEntityExtractor
. See passing custom features to theCRFEntityExtractor
Changed some featurizers to use sparse features, which should reduce memory usage with large amounts of training data significantly. Read more: Text Featurizers .
:::caution These changes break model compatibility. You will need to retrain your old models!
:::
-
#3549: Added
--no-plot
option forrasa test
command, which disables rendering of confusion matrix and histogram. By default plots will be rendered. -
#4086: If matplotlib cannot set up a default backend, the backend is set automatically to TkAgg or Agg.
-
#4647: Add the option
random_seed
to the
rasa data split nlu
command to generate reproducible train/test splits. -
#4734: Changed
url
__init__()
arguments for custom tracker stores tohost
to reflect the__init__
arguments of currently supported tracker stores. Note that inendpoints.yml
, these are still declared asurl
. -
#4751: The
kafka-python
dependency is now an “extra” dependency. To use the KafkaEventConsumer
,rasa
has to be installed with the[kafka]
option, i.e.$ pip install rasa[kafka]
-
#4801: Allow creation of natural language interpreter and generator by classname reference in
endpoints.yml
. -
#4834: Made it explicit that interactive learning does not work with NLU-only models.
Interactive learning no longer trains NLU-only models if no model is provided and no core data is provided.
-
#4899: The
intent_report.json
created byrasa test
now creates an extra fieldconfused_with
for each intent. This is a dictionary containing the names of the most common false positives when this intent should be predicted, and the number of such false positives. -
#4976:
rasa test nlu --cross-validation
now also includes an evaluation of the response selector. As a result, the train and test F1-score, accuracy and precision are logged for the response selector. A report is also generated in the results
folder by the nameresponse_selection_report.json
-
#4635: If a
wait_time_between_pulls
is configured for the model server inendpoints.yml
, this will be used instead of the default one when running Rasa X. -
#4759: Training Luis data with
luis_schema_version
higher than 4.x.x will show a warning instead of throwing an exception. -
#4799: Running
rasa interactive
with no NLU data now works, with the functionality ofrasa interactive core
. -
#4917: When loading models from S3, namespaces (folders within a bucket) are now respected. Previously, this would result in an error upon loading the model.
-
#4925: “rasa init” will ask if user wants to train a model
-
#4942: Pin
multidict
dependency to 4.6.1 to prevent sanic from breaking, see sanic-org/sanic#1729 -
#4985: Fix errors during training and testing of
ResponseSelector
.
- #4933: Improved error message that appears when an incorrect parameter is passed to a policy.
-
#4914: Added
rasa/nlu/schemas/config.yml
to wheel package -
#4942: Pin
multidict
dependency to 4.6.1 to prevent sanic from breaking, see sanic-org/sanic#1729
-
#3684:
rasa interactive
will skip the story visualization of training stories in case there are more than 200 stories. Stories created during interactive learning will be visualized as before. -
#4792: The log level for SocketIO loggers, including
websockets.protocol
,engineio.server
, andsocketio.server
, is now handled by theLOG_LEVEL_LIBRARIES
environment variable, where the default log level isERROR
. -
#4873: Updated all example bots and documentation to use the updated
dispatcher.utter_message()
method from rasa-sdk==1.5.0.
-
#3684:
rasa interactive
will not load training stories in case the visualization is skipped. -
#4789: Fixed error where spaCy models were not found in the Docker images.
-
#4802: Fixed unnecessary
kwargs
unpacking inrasa.test.test_core
call inrasa.test.test
function. -
#4898: Training data files now get loaded in the same order (especially relevant to subdirectories) each time to ensure training consistency when using a random seed.
-
#4918: Locks for tickets in
LockStore
are immediately issued without a redundant check for their availability.
-
#4844: Added
towncrier
to automatically collect changelog entries. -
#4869: Document the pipeline for
pretrained_embeddings_convert
in the pre-configured pipelines section. -
#4894:
Proactively Reaching Out to the User Using Actions
now correctly links to the endpoint specification.
- When NLU training data is dumped as a Markdown file, the intents are no longer ordered alphabetically but kept in the original order of the given training data.
-
End to end stories now support literal payloads which specify entities, e.g.
greet: /greet{"name": "John"}
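To illustrate the payload format, the sketch below splits a literal payload into an intent name and an entity dictionary. It is not the parser used for stories; the names `PAYLOAD` and `parse_literal_payload` are hypothetical, and it assumes the entity part is valid JSON.

```python
import json
import re
from typing import Any, Dict, Tuple

# "/" followed by the intent name and an optional JSON object with entities.
PAYLOAD = re.compile(r"^/(?P<intent>[^{\s]+)(?P<entities>\{.*\})?\s*$")


def parse_literal_payload(text: str) -> Tuple[str, Dict[str, Any]]:
    """Split a literal payload into (intent, entities)."""
    match = PAYLOAD.match(text.strip())
    if match is None:
        raise ValueError(f"Not a literal payload: {text!r}")
    raw_entities = match.group("entities")
    entities = json.loads(raw_entities) if raw_entities else {}
    return match.group("intent"), entities


intent, entities = parse_literal_payload('/greet{"name": "John"}')
```

A payload without an entity part, such as `/greet`, yields an empty entity dictionary.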
-
Slots will be correctly interpolated if there are lists in custom response templates.
-
Fixed compatibility issues with
rasa-sdk
1.5
-
Updated
/status
endpoint to show correct path to model archive
-
Added data validator that checks if domain object returned is empty. If so, exit early from the command
rasa data validate
. -
Added the KeywordIntentClassifier.
-
Added documentation for
AugmentedMemoizationPolicy
. -
Fall back to
InMemoryTrackerStore
in case there is any problem with the current tracker store. -
Arbitrary metadata can now be attached to any
Event
subclass. The data must be stored under themetadata
key when reading the event from a JSON object or dictionary. -
Add command line argument
rasa x --config CONFIG
, to specify path to the policy and NLU pipeline configuration of your bot (default:config.yml
). -
Added a new NLU featurizer -
ConveRTFeaturizer
based on ConveRT model released by PolyAI. -
Added a new preconfigured pipeline -
pretrained_embeddings_convert
.
-
Do not retrain the entire Core model if only the
templates
section of the domain is changed. -
Upgraded
jsonschema
version.
- Remove duplicate messages when creating training data (issues/1446).
-
MultiProjectImporter
now imports files in the order of the import statements -
Fixed server hanging forever on leaving
rasa shell
before first message -
Fixed rasa init showing traceback error when user does Keyboard Interrupt before choosing a project path
-
CountVectorsFeaturizer
featurizes intents only if its analyzer is set toword
-
Fixed bug where Facebook's generic template was not rendered when buttons were
None
-
Fixed default intents unnecessarily raising undefined parsing error
-
Fixed Rasa X not working when any tracker store was configured for Rasa.
-
Use the matplotlib backend
agg
in case thetkinter
package is not installed.
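A rough sketch of such a fallback is shown below. The helper `pick_matplotlib_backend` is hypothetical and only returns a backend name; it assumes that TkAgg is preferred whenever `tkinter` can be imported.

```python
import importlib.util
from typing import Optional


def pick_matplotlib_backend(tkinter_available: Optional[bool] = None) -> str:
    """Return the name of a backend that should work in this environment."""
    if tkinter_available is None:
        # find_spec returns None when the module cannot be found.
        tkinter_available = importlib.util.find_spec("tkinter") is not None
    # "agg" renders to files only and needs no GUI toolkit.
    return "TkAgg" if tkinter_available else "agg"


backend_without_tk = pick_matplotlib_backend(tkinter_available=False)
backend_with_tk = pick_matplotlib_backend(tkinter_available=True)
```

The returned name could then be passed to `matplotlib.use()` before any plotting code runs.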
-
NLU-only models no longer throw warnings about parsing features not defined in the domain
-
Fixed bug that stopped Dockerfiles from building version 1.4.4.
-
Fixed format guessing for e2e stories with intent restated as
/intent
PikaEventProducer
adds the RabbitMQApp ID
message property to published messages with the value of theRASA_ENVIRONMENT
environment variable. The message property will not be assigned if this environment variable isn't set.
-
Updated Mattermost connector documentation to be more clear.
-
Updated format strings to f-strings where appropriate.
-
Updated tensorflow requirement to
1.15.0
-
Dump domain using UTF-8 (to avoid
\\UXXXX
sequences in the dumped files)
-
Fixed exporting NLU training data in
json
format fromrasa interactive
-
Fixed numpy deprecation warnings
- Fixed
Connection reset by peer
errors and bot response delays when using the RabbitMQ event broker.
- TensorFlow deprecation warnings are no longer shown when running
rasa x
-
Fixed
'Namespace' object has no attribute 'persist_nlu_data'
error during interactive learning -
Pinned networkx~=2.3.0 to fix visualization in rasa interactive and Rasa X
-
Fixed
No model found
error when usingrasa run actions
with “actions” as a directory.
Regression: changes from 1.2.12
were missing from 1.4.0
, readded them
-
add flag to CLI to persist NLU training data if needed
-
log a warning if the
Interpreter
picks up an intent or an entity that does not exist in the domain file. -
added
DynamoTrackerStore
to support persistence of agents running on AWS -
added docstrings for
TrackerStore
classes -
added buttons and images to mattermost.
-
CRFEntityExtractor
updated to accept arbitrary token-level features like word vectors (issues/4214) -
SpacyFeaturizer
updated to addner_features
forCRFEntityExtractor
-
Sanitizing incoming messages from slack to remove slack formatting like
<mailto:xyz@rasa.com|xyz@rasa.com>
or<http://url.com|url.com>
and substitute it with original content -
Added the ability to configure the number of Sanic worker processes in the HTTP server (
rasa.server
) and input channel server (rasa.core.agent.handle_channels()
). The number of workers can be set using the environment variableSANIC_WORKERS
(default: 1). A value of >1 is allowed only in combination withRedisLockStore
as the lock store. -
Botframework channel can handle uploaded files in
UserMessage
metadata. -
Added data validator that checks there is no duplicated example data across multiple intents
-
Unknown sections in markdown format (NLU data) are not ignored anymore, but instead an error is raised.
-
It is now easier to add metadata to a
UserMessage
in existing channels. You can do so by overwriting the methodget_metadata
. The return value of this method will be passed to theUserMessage
object. -
Tests can now be run in parallel
-
Serialise
DialogueStateTracker
as json instead of pickle. DEPRECATION warning: Deserialisation of pickled trackers will be deprecated in version 2.0. For now, trackers are still loaded from pickle but will be dumped as json in any subsequent save operations. -
Event brokers are now also passed to custom tracker stores (using the
event_broker
parameter) -
Don't run the Rasa Docker image as
root
. -
Use multi-stage builds to reduce the size of the Rasa Docker image.
-
Updated the
/status
api route to use the actual model file location instead of thetmp
location.
- Removed Python 3.5 support
-
fixed missing
tkinter
dependency for running tests on Ubuntu -
fixed issue with
conversation
JSON serialization -
fixed the hanging HTTP call with
ner_duckling_http
pipeline -
fixed Interactive Learning intent payload messages saving in nlu files
-
fixed DucklingHTTPExtractor dimensions by actually applying to the request
- Can now pass a package as an argument to the
--actions
parameter of therasa run actions
command.
- Fixed visualization of stories with entities which led to a failing visualization in Rasa X
-
Port of 1.2.10 (support for RabbitMQ TLS authentication and
port
key in event broker endpoint config). -
Port of 1.2.11 (support for passing a CA file for SSL certificate verification via the --ssl-ca-file flag).
-
Fixed the hanging HTTP call with
ner_duckling_http
pipeline. -
Fixed text processing of
intent
attribute inside CountVectorsFeaturizer
. -
Fixed
argument of type 'NoneType' is not iterable
when usingrasa shell
,rasa interactive
/rasa run
- Policies now only get imported if they are actually used. This removes TensorFlow warnings when starting Rasa X
-
Fixed error
Object of type 'MaxHistoryTrackerFeaturizer' is not JSON serializable
when runningrasa train core
-
Default channel
send_
methods no longer support kwargs as they caused issues in incompatible channels
-
re-added TLS, SRV dependencies for PyMongo
-
socketio can now be run without turning on the
--enable-api
flag -
MappingPolicy no longer fails when the latest action doesn't have a policy
- Added the ability for users to specify a conversation id to send a message to when
using the
RasaChat
input channel.
- Fixed issue where
rasa init
would fail without spaCy being installed
-
Added the ability to set the
backlog
parameter in Sanic's run()
method using theSANIC_BACKLOG
environment variable. This parameter sets the number of unaccepted connections the server allows before refusing new connections. A default value of 100 is used if the variable is not set. -
Status endpoint (
/status
) now also returns the number of training processes currently running
-
Added the ability to properly deal with spaCy
Doc
-objects created on empty strings as discussed here. Only training samples that actually bear content are sent toself.nlp.pipe
for every given attribute. Non-content-bearing samples are converted to emptyDoc
-objects. The resulting lists are merged with their preserved order and properly returned. -
asyncio warnings are now only printed if the callback takes more than 100ms (up from 1ms).
-
agent.load_model_from_server
no longer affects logging.
- The endpoint
POST /model/train
no longer supports specifying an output directory for the trained model using the fieldout
. Instead you can choose whether you want to save the trained model in the default model directory (models
) (default behavior) or in a temporary directory by specifying thesave_to_default_model_directory
field in the training request.
-
Added a check to avoid training
CountVectorizer
for a particular attribute of a message if no text is provided for that attribute across the training data. -
Default one-hot representation for label featurization inside
EmbeddingIntentClassifier
if label features don't exist. -
Policy ensemble no longer incorrectly warns about a “missing mapping policy” when a mapping policy is present.
-
“text” from
utter_custom_json
now correctly saved to tracker when using telegram channel
- Removed computation of
intent_spacy_doc
. As a result, none of the spacy components process intents now.
- SQL tracker events are retrieved ordered by timestamps. This fixes interactive learning events being shown in the wrong order.
- Pin gast to == 0.2.2
-
Added option to persist nlu training data (default: False)
-
option to save stories in e2e format for interactive learning
-
bot messages contain the
timestamp
of theBotUttered
event, which can be used in channels -
FallbackPolicy
can now be configured to trigger when the difference between confidences of two predicted intents is too narrow -
experimental training data importer which supports training with data of multiple sub bots. Please see the docs for more information.
-
throw error during training when triggers are defined in the domain without
MappingPolicy
being present in the policy ensemble -
The tracker is now available within the interpreter's
parse
method, giving the ability to create interpreter classes that use the tracker state (e.g. slot values) during the parsing of the message. For more details on the motivation for this change, see issues/3015. -
add example bot
knowledgebasebot
to showcase the usage ofActionQueryKnowledgeBase
-
softmax
starspace loss for bothEmbeddingPolicy
andEmbeddingIntentClassifier
-
balanced
batching strategy for bothEmbeddingPolicy
andEmbeddingIntentClassifier
-
max_history
parameter forEmbeddingPolicy
-
Successful predictions of the NER are written to a file if
--successes
is set when runningrasa test nlu
-
Incorrect predictions of the NER are written to a file by default. You can disable it via
--no-errors
. -
New NLU component
ResponseSelector
added for the task of response selection -
Message data attribute can contain two more keys -
response_key
,response
depending on the training data -
New action type implemented by
ActionRetrieveResponse
class and identified withresponse_
prefix -
Vocabulary sharing inside
CountVectorsFeaturizer
withuse_shared_vocab
flag. If set to True, vocabulary of corpus is shared between text, intent and response attributes of message -
Added an option to share the hidden layer weights of text input and label input inside
EmbeddingIntentClassifier
using the flagshare_hidden_layers
-
New type of training data file in NLU which stores response phrases for response selection task.
-
Add flag
intent_split_symbol
andintent_tokenization_flag
to allWhitespaceTokenizer
,JiebaTokenizer
andSpacyTokenizer
-
Added evaluation for response selector. Creates a report
response_selection_report.json
inside--out
directory. -
argument
--config-endpoint
to specify the URL from whichrasa x
pulls the runtime configuration (endpoints and credentials) -
LockStore
class storing instances ofTicketLock
for everyconversation_id
-
environment variables
SQL_POOL_SIZE
(default: 50) andSQL_MAX_OVERFLOW
(default: 100) can be set to control the pool size and maximum pool overflow forSQLTrackerStore
when used with thepostgresql
dialect -
Add a bot_challenge intent and a utter_iamabot action to all example projects and the rasa init bot.
-
Allow sending attachments when using the socketio channel
-
rasa data validate
will fail with a non-zero exit code if validation fails
-
added character-level
CountVectorsFeaturizer
with empirically found parameters into thesupervised_embeddings
NLU pipeline template -
NLU evaluations now also stores its output in the output directory like the core evaluation
-
show warning in case a default path is used instead of a provided, invalid path
-
compare mode of
rasa train core
allows the whole core config comparison, naming style of models trained for comparison is changed (this is a breaking change) -
pika keeps a single connection open, instead of open and closing on each incoming event
-
RasaChatInput
fetches the public key from the Rasa X API. The key is used to decode the bearer token containing the conversation ID. This requiresrasa-x>=0.20.2
. -
more specific exception message when loading custom components depending on whether component's path or class name is invalid or can't be found in the global namespace
-
change priorities so that the
MemoizationPolicy
has higher priority than theMappingPolicy
-
substitute LSTM with Transformer in
EmbeddingPolicy
-
EmbeddingPolicy
can now useMaxHistoryTrackerFeaturizer
-
non zero
evaluate_on_num_examples
inEmbeddingPolicy
andEmbeddingIntentClassifier
is the size of hold out validation set that is excluded from training data -
default parameters and architectures for both
EmbeddingPolicy
andEmbeddingIntentClassifier
are changed (this is a breaking change) -
evaluation of NER does not include 'no-entity' anymore
-
--successes
forrasa test nlu
is now a boolean value. If set, successful predictions are saved in a file. -
--errors
is renamed to--no-errors
and is now a boolean value. By default incorrect predictions are saved in a file. If--no-errors
is set predictions are not written to a file. -
Remove
label_tokenization_flag
andlabel_split_symbol
fromEmbeddingIntentClassifier
. Instead move these parameters toTokenizers
. -
Process the features of all attributes of a message (text, intent and response) inside the respective component itself. For example, the intent of a message is now tokenized inside the tokenizer itself.
-
Deprecate
as_markdown
andas_json
in favour ofnlu_as_markdown
andnlu_as_json
respectively. -
pin python-engineio >= 3.9.3
-
update python-socketio req to >= 4.3.1
-
rasa test nlu
with a folder of configuration files -
MappingPolicy
standard featurizer is set toNone
-
Removed
text
parameter from send_attachment function in slack.py to avoid duplication of text output to slackbot -
server
/status
endpoint reports status when an NLU-only model is loaded
- Removed
--report
argument fromrasa test nlu
. All output files are stored in the--out
directory.
- Support for transit encryption with Redis via
use_ssl: True
in the tracker store config in endpoints.yml
- Support for passing a CA file for SSL certificate verification via the --ssl-ca-file flag
-
Added support for RabbitMQ TLS authentication. The following environment variables need to be set:
RABBITMQ_SSL_CLIENT_CERTIFICATE
- path to the SSL client certificate (required)RABBITMQ_SSL_CLIENT_KEY
- path to the SSL client key (required)RABBITMQ_SSL_CA_FILE
- path to the SSL CA file (optional, for certificate verification)RABBITMQ_SSL_KEY_PASSWORD
- SSL private key password (optional) -
Added ability to define the RabbitMQ port using the
port
key in theevent_broker
endpoint config.
- Correctly pass SSL flag values to x CLI command (backport of
- SQL tracker events are retrieved ordered by timestamps. This fixes interactive
learning events being shown in the wrong order. Backport of
1.3.2
patch (PR #4427).
- Added
query
dictionary argument toSQLTrackerStore
which will be appended to the SQL connection URL as query parameters.
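Appending a dictionary as query parameters can be illustrated with the standard library. This sketch does not reproduce how the tracker store builds its URL; the helper `build_connection_url`, the credentials and the parameter names are all made up for the example.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_connection_url(base_url: str, query: Optional[Dict[str, str]] = None) -> str:
    """Append the query dictionary to the URL as query parameters."""
    if not query:
        return base_url
    # urlencode escapes keys and values and joins them with "&".
    return f"{base_url}?{urlencode(query)}"


url = build_connection_url(
    "postgresql://user:secret@localhost:5432/rasa",
    {"sslmode": "require", "connect_timeout": "10"},
)
```

Which parameters are accepted depends on the database driver in use.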
- fixed bug that occurred when sending template
elements
through a channel that doesn't support them
- SSL support for
rasa run
command. Certificate can be specified using--ssl-certificate
and--ssl-keyfile
.
-
made default augmentation value consistent across repo
-
'/restart'
will now also restart the bot if the tracker is paused
- the
SocketIO
input channel now allows accesses from other origins (fixesSocketIO
channel on Rasa X)
-
messages with multiple entities are now handled properly with e2e evaluation
-
data/test_evaluations/end_to_end_story.md
was re-written in the restaurantbot domain
- Free text input was not allowed in the Rasa shell when the response template contained buttons, which has now been fixed.
UserUttered
events always got the same timestamp
- Docs now have an
EDIT THIS PAGE
button
Flood control exceeded
error in Telegram connector which happened because the webhook was set twice
- add root route to server started without `--enable-api` parameter
- add `--evaluate-model-directory` to `rasa test core` to evaluate models from `rasa train core -c <config-1> <config-2>`
- option to send messages to the user by calling `POST /conversations/{conversation_id}/execute`
- `Agent.update_model()` and `Agent.handle_message()` now work without needing to set a domain or a policy ensemble
- Update pytype to `2019.7.11`
- new event broker class: `SQLProducer`. This event broker is now used when running locally with Rasa X
- API requests are no longer logged to `rasa_core.log` by default in order to avoid problems when running on OpenShift (use `--log-file rasa_core.log` to retain the old behavior)
- `metadata` attribute added to `UserMessage`
- `rasa test core` can handle compressed model files
- rasa can handle story files containing multi-line comments
- template will retain `{` if escaped with `{`, e.g. `{{"foo": {bar}}}` will result in `{"foo": "replaced value"}`
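
The brace-escaping rule in the template entry above matches the convention of Python's built-in `str.format`, so it can be tried out directly. The sketch assumes that template rendering behaves like `str.format` for this case; the variable names are chosen for illustration only.

```python
# Doubled braces are emitted as literal braces; {bar} is substituted.
template = '{{"foo": {bar}}}'
rendered = template.format(bar='"replaced value"')
print(rendered)  # {"foo": "replaced value"}
```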
- `TrainingFileImporter` interface to support customizing the process of loading training data
- fill slots for custom templates
- `Agent.update_model()` and `Agent.handle_message()` now work without needing to set a domain or a policy ensemble
- update pytype to `2019.7.11`
- interactive learning bug where reverted user utterances were dumped to training data
- added timeout to terminal input channel to avoid freezing input in case of server errors
- fill slots for image, buttons, quick_replies and attachments in templates
- `rasa train core` in comparison mode stores the model files compressed (`tar.gz` files)
- slot setting in interactive learning with the TwoStageFallbackPolicy
- added optional pymongo dependencies `[tls, srv]` to `requirements.txt` for better mongodb support
- `case_sensitive` option added to `WhitespaceTokenizer` with `true` as default.
- validation no longer throws an error during interactive learning
- fixed wrong cleaning of `use_entities` in case it was a list and not `True`
- updated the server endpoint `/model/parse` to also handle messages with the intent prefix
- fixed bug where “No model found” message appeared after successfully running the bot
- debug logs now print to `rasa_core.log` when running `rasa x -vv` or `rasa run -vv`
- rest channel supports setting a message's input_channel through a field `input_channel` in the request body
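
A request body that uses the `input_channel` field might be assembled as in the sketch below. The `sender` and `message` keys follow the REST channel's usual request format; the helper `build_rest_payload` and the channel name `my_custom_widget` are made up for this example.

```python
import json
from typing import Dict, Optional


def build_rest_payload(
    sender: str, message: str, input_channel: Optional[str] = None
) -> str:
    """Hypothetical helper: serialize a REST channel request body."""
    payload: Dict[str, str] = {"sender": sender, "message": message}
    # Only include the field when a channel name was given.
    if input_channel is not None:
        payload["input_channel"] = input_channel
    return json.dumps(payload)


body = build_rest_payload("user-1", "hello", input_channel="my_custom_widget")
print(body)
```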
- recommended syntax for empty `use_entities` and `ignore_entities` in the domain file has been updated from `False` or `None` to an empty list (`[]`)
- `rasa run` without `--enable-api` does not require a local model anymore
- using `rasa run` with `--enable-api` to run a server now prints “running Rasa server” instead of “running Rasa Core server”
- actions, intents, and utterances created in `rasa interactive` can no longer be empty
- debug logging now tells you which tracker store is connected
- the response of `/model/train` now includes a response header for the trained model filename
- `Validator` class to help with development by checking if the files have any errors
- project's code is now linted using flake8
- `info` log when credentials were provided for multiple channels and channel in `--connector` argument was specified at the same time
- validate export paths in interactive learning
- deprecate `rasa.core.agent.handle_channels(...)`. Please use `rasa.run(...)` or `rasa.core.run.configure_app` instead.
- `Agent.load()` also accepts `tar.gz` model file
- revert the stripping of trailing slashes in endpoint URLs since this can lead to problems in case the trailing slash is actually wanted
- starter packs were removed from GitHub and are therefore no longer tested by Travis script
- all temporary model files are now deleted after stopping the Rasa server
- `rasa shell nlu` now outputs unicode characters instead of `\\uxxxx` codes
- fixed `PUT /model` with `model_server` by deserializing the `model_server` to `EndpointConfig`.
- `x in AnySlotDict` is now `True` for any `x`, which fixes empty slot warnings in interactive learning
- `rasa train` now also includes NLU files in other formats than the Rasa format
- `rasa train core` no longer crashes without a `--domain` arg
- `rasa interactive` now looks for endpoints in `endpoints.yml` if no `--endpoints` arg is passed
- custom files, e.g. custom components and channels, load correctly when using the command line interface
- `MappingPolicy` now works correctly when used as part of a PolicyEnsemble
- unfeaturize single entities
- added agent readiness check to the `/status` resource
- removed leading underscore from name of '_create_initial_project' function.
- fixed bug where facebook quick replies were not rendering
- take FB quick reply payload rather than text as input
- fixed bug where training_data path in metadata.json was an absolute path
- fixed any inconsistent type annotations in code and some bugs revealed by type checker
- fixed duplicate events appearing in tracker when using a PostgreSQL tracker store
- fixed compatibility with Rasa SDK
- bot responses can contain `custom` messages besides other message types
- nlu configs can now be directly compared for performance on a dataset in `rasa test nlu`
- update the tracker in interactive learning through reverting and appending events instead of replacing the tracker
- `POST /conversations/{conversation_id}/tracker/events` supports a list of events
- fixed creation of `RasaNLUHttpInterpreter`
- form actions are included in domain warnings
- default actions that are overridden by custom actions and listed in the domain are excluded from domain warnings
- SQL `data` column type to `Text` for compatibility with MySQL
- non-featurizer training parameters don't break SklearnPolicy anymore
- revert PR #3739 (as this is a breaking change): set `PikaProducer` and `KafkaProducer` default queues back to `rasa_core_events`
- support for specifying full database urls in the `SQLTrackerStore` configuration
- maximum number of predictions can be set via the environment variable `MAX_NUMBER_OF_PREDICTIONS` (default is 10)
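
Reading such a limit from the environment with a fallback usually looks like the sketch below. The variable name and the default of 10 come from the entry above; the function `max_number_of_predictions` is a hypothetical stand-in and not Rasa's own code.

```python
import os
from typing import Mapping, Optional

DEFAULT_MAX_NUMBER_OF_PREDICTIONS = 10  # default stated in the changelog


def max_number_of_predictions(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical helper: read the limit from `env`, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("MAX_NUMBER_OF_PREDICTIONS")
    return DEFAULT_MAX_NUMBER_OF_PREDICTIONS if raw is None else int(raw)
```

For example, `max_number_of_predictions({"MAX_NUMBER_OF_PREDICTIONS": "20"})` returns `20`, while an empty mapping yields the default.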
- default `PikaProducer` and `KafkaProducer` queues to `rasa_production_events`
- exclude unfeaturized slots from domain warnings
- loading of additional training data with the `SkillSelector`
- strip trailing slashes in endpoint URLs
- added argument `--rasa-x-port` to specify the port of Rasa X when running Rasa X locally via `rasa x`
- slack notifications from bots correctly render text
- fixed usage of `--log-file` argument for `rasa run` and `rasa shell`
- check if correct tracker store is configured in local mode
- fixed backwards incompatible utils changes
- fixed spacy being a required dependency (regression)
- automatic creation of index on the `sender_id` column when using an SQL tracker store. If you have existing data and you are running into performance issues, please make sure to add an index manually using `CREATE INDEX event_idx_sender_id ON events (sender_id);`.
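
The manual index creation can be tried against a throwaway database first. The sketch below runs the statement from the entry above on an in-memory SQLite database; the `events` table defined here is a simplified stand-in with assumed columns, not the tracker store's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the tracker store's events table (hypothetical columns).
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, sender_id TEXT, data TEXT)"
)
# The statement quoted in the changelog entry.
conn.execute("CREATE INDEX event_idx_sender_id ON events (sender_id);")

# List the indexes SQLite now knows about for the table.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('events')")]
print(index_names)
```

On a production database the same `CREATE INDEX` statement would be run through that database's own client; the exact syntax for checking existing indexes differs between database engines.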
- NLU evaluation in cross-validation mode now also provides intent/entity reports, confusion matrix, etc.
- non-ascii characters render correctly in stories generated from interactive learning
- validate domain file before usage, e.g. print proper error messages if domain file is invalid instead of raising errors
- added `domain_warnings()` method to `Domain` which returns a dict containing the diff between supplied {actions, intents, entities, slots} and what's contained in the domain
- fixed lookup table files failing to load (issues/3622)
- buttons can now be properly selected during cmdline chat or when in interactive learning
- set slots correctly when events are added through the API
- mapping policy no longer ignores NLU threshold
- mapping policy priority is correctly persisted
- updated installation command in docs for Rasa X
- added arguments to set the file paths for interactive training
- added quick reply representation for command-line output
- added option to specify custom button type for Facebook buttons
- added tracker store persisting trackers into a SQL database (`SQLTrackerStore`)
- added rasa command line interface and API
- Rasa HTTP training endpoint at `POST /jobs`. This endpoint will train a combined Rasa Core and NLU model
- `ReminderCancelled(action_name)` event to cancel given action_name reminder for current user
- Rasa HTTP intent evaluation endpoint at `POST /intentEvaluation`. This endpoint performs an intent evaluation of a Rasa model
- option to create template for new utterance action in `interactive learning`
- you can now choose actions previously created in the same session in `interactive learning`
- add formatter 'black'
- channel-specific utterances via the `- "channel":` key in utterance templates
- arbitrary json messages via the `- "custom":` key in utterance templates and via `utter_custom_json()` method in custom actions
- support to load sub skills (domain, stories, nlu data)
- support to select which sub skills to load through `import` section in `config.yml`
- support for spaCy 2.1
- a model for an agent can now also be loaded from a remote storage
- log level can be set via environment variable `LOG_LEVEL`
- add `--store-uncompressed` to train command to not compress Rasa model
- log level of libraries, such as tensorflow, can be set via environment variable `LOG_LEVEL_LIBRARIES`
- if no spaCy model is linked upon building a spaCy pipeline, an appropriate error message is now raised with instructions for linking one
- renamed all CLI parameters containing any `_` to use dashes `-` instead (GNU standard)
- renamed `rasa_core` package to `rasa.core`
- for interactive learning only include manually annotated and ner_crf entities in nlu export
- made `message_id` an additional argument to `interpreter.parse`
- changed removing punctuation logic in `WhitespaceTokenizer`
- `training_processes` in the Rasa NLU data router have been renamed to `worker_processes`
- created a common utils package `rasa.utils` for nlu and core, common methods like `read_yaml` moved there
- removed `--num_threads` from run command (server will be asynchronous but running in a single thread)
- the `_check_token()` method in `RasaChat` now authenticates against `/auth/verify` instead of `/user`
- removed `--pre_load` from run command (Rasa NLU server will just have a maximum of one model and that model will be loaded by default)
- changed file format of a stored trained model from the Rasa NLU server to `tar.gz`
- train command uses fallback config if an invalid config is given
- test command now compares multiple models if a list of model files is provided for the argument `--model`
- Merged rasa.core and rasa.nlu server into a single server. See swagger file in `docs/_static/spec/server.yaml` for available endpoints.
- `utter_custom_message()` method in rasa_core_sdk has been renamed to `utter_elements()`
- updated dependencies. As part of this, models for spacy need to be reinstalled for 2.1 (from 2.0)
- make sure all command line arguments for `rasa test` and `rasa interactive` are actually used, removed arguments that were not used at all (e.g. `--core` for `rasa test`)
- removed possibility to execute `python -m rasa_core.train` etc. (e.g. scripts in `rasa.core` and `rasa.nlu`). Use the CLI for rasa instead, e.g. `rasa train core`.
- removed `_sklearn_numpy_warning_fix` from the `SklearnIntentClassifier`
- removed `Dispatcher` class from core
- removed projects: the Rasa NLU server now has a maximum of one model at a time loaded.
- evaluating core stories with two stage fallback gave an error, trying to handle None for a policy
- the `/evaluate` route for the Rasa NLU server now runs evaluation in a parallel process, which prevents the currently loaded model from unloading
- added missing implementation of the `keys()` function for the Redis Tracker Store
- in interactive learning: only updates entity values if user changes annotation
- log options from the command line interface are applied (they overwrite the environment variable)
- all message arguments (kwargs in dispatcher.utter methods, as well as template args) are now sent through to output channels
- utterance templates defined in actions are checked for existence upon training a new agent, and a warning is thrown before training if one is missing