Understanding the results and improving the model #72
Comments
Hi @kai-majerus, let's see if I can help at least a bit:

The code you used as an example comes from the getting started notebook; I should probably improve that one already, as we have some better features now. In order to print and plot the values the model converged to, I recommend using this notebook as a reference. You could, for instance, print each parameter's mean and standard deviation with something like:

```python
import tensorflow as tf

for param in ci.model.parameters:
    print("{}: {} +- {}".format(
        param.name,
        tf.reduce_mean(ci.model_samples[param.name], axis=0),
        tf.math.reduce_std(ci.model_samples[param.name], axis=0)))
```

(This is just an example, I don't have access to my workstation right now so I can't confirm the exact attribute names.)

Another interesting approach would be to decompose the model and plot each component as well:

```python
from tensorflow_probability import sts

component_dists = sts.decompose_by_component(
    ci.model,
    observed_time_series=your_Y_data,
    parameter_samples=ci.model_samples)

forecast_component_dists = sts.decompose_forecast_by_component(
    ci.model,
    forecast_dist=ci.model.forecast,
    parameter_samples=ci.model_samples)
```

And use those "dists" as input for plotting functions.
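For instance, a quick way to plot each component's contribution from those dists could look like this (just a sketch, double-check the shapes against your own data):

```python
import matplotlib.pyplot as plt
import numpy as np

# Sketch: plot the posterior mean +- 2 std of each decomposed component.
fig, axes = plt.subplots(len(component_dists), 1, sharex=True, squeeze=False,
                         figsize=(10, 3 * len(component_dists)))
for ax, (component, dist) in zip(axes[:, 0], component_dists.items()):
    mean = dist.mean().numpy()
    std = dist.stddev().numpy()
    ax.plot(mean, label=component.name)
    ax.fill_between(np.arange(len(mean)), mean - 2 * std, mean + 2 * std, alpha=0.3)
    ax.legend(loc='upper left')
plt.show()
```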
Funny thing (and quite coincidentally): the second example in this colab also finds a temperature effect (their "X") of 0.06, just like yours. Notice the temperature effect there has a std of just 0.004, so the 95% interval wouldn't cross the 0 (zero) threshold by any margin, which hints at the weight being statistically significant in the frequentist interpretation (a quick sketch of this check follows below). If you look at their component plots you'll also see that the temperature component is quite relevant compared to the others, so it's definitely helping out in the forecasting procedure.

So I'd recommend plotting and printing out all those values and comparing their overall impact. If you observe that your X variable is not adding much relative to the other components (like the "day_of_week_effect" in the example mentioned), it may be a sign that it's indeed not helping much. Notice this is not an entirely rigorous approach, but at least it uses the posterior samples to guide you and gives some ideas on what is working or not.
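As a rough sketch of that zero-crossing check (here `weight_samples` is a placeholder name for the posterior draws of your covariate's weight, e.g. the output of `params_to_weights` before averaging):

```python
import numpy as np

# weight_samples: posterior draws of the regression weight for X (placeholder name).
weight_samples = np.asarray(weight_samples).ravel()
lower, upper = np.percentile(weight_samples, [2.5, 97.5])
print("beta.X: mean={:.4f} std={:.4f} 95% CI=[{:.4f}, {:.4f}]".format(
    weight_samples.mean(), weight_samples.std(), lower, upper))
# If 0 lies outside [lower, upper], the weight is "significant" in the sense above.
```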
Further approaches you could take are to add seasonal components (maybe at the day, week, month, or year level), to test auto-regressive components and see if they help tighten the bounds of the residual variance, and to play with adding other models (such as local level or local linear trend models). I'd recommend using variational inference for that, as HMC will be quite slow.
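Just to illustrate what I mean (a sketch only; please double-check the argument names and how custom models are expected to be built against the library's README, and note that `pre_data` here is just a placeholder for your pre-period slice of the data):

```python
import tensorflow_probability as tfp
from causalimpact import CausalImpact  # tfcausalimpact

# pre_data: hypothetical pre-period slice of dated_data (response in the first column).
obs_series = pre_data.iloc[:, 0].values.astype('float32')
design_matrix = dated_data.iloc[:, 1:].values.astype('float32')

trend = tfp.sts.LocalLinearTrend(observed_time_series=obs_series, name='trend')
weekly = tfp.sts.Seasonal(num_seasons=7, observed_time_series=obs_series, name='weekly')
ar1 = tfp.sts.Autoregressive(order=1, observed_time_series=obs_series, name='ar1')
regression = tfp.sts.LinearRegression(design_matrix=design_matrix, name='regression')
model = tfp.sts.Sum([trend, weekly, ar1, regression], observed_time_series=obs_series)

# Variational inference instead of HMC to keep fitting times reasonable.
ci = CausalImpact(dated_data, pre_period, post_period,
                  model=model, model_args={'fit_method': 'vi'})
```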
Also, we can see moments in your data where there's some disconnect between impressions and keywords, such as around September and December. Maybe those are hot periods for this type of furniture; you could try adding a dummy regressor that is 0 everywhere except in those periods and see if it helps prediction quality as well (sketch in the P.S. below). Either way, finding good covariates will remain a challenge. This phase tends to be quite empirical, and it's recommended to test a lot of ideas to see which works best (the library automatically removes covariates that don't offer much value to the final predictions, so you can add more and see what happens).

As for changing the frequency of the data, I'd suggest, as usual, giving it a try and seeing how it goes. By aggregating the data the algorithm may find more signal amid the noise, but finer-grained seasonal components will be lost, so a trade-off is usually expected. You can use plots, printed converged values and back-testing techniques to guide you towards the best model for your case (other packages, such as statsmodels, offer goodness-of-fit metrics, but those are not available here).

Hope that helps. If you can, send me an example of your standardized data with white noise added and I'll play around with it as well to see what I find. I'll also try in the next week or two to improve the documentation in this repo, which may make it more helpful after all.

Let me know if this helps,

Best,

Will
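P.S. A rough sketch of the dummy-regressor and weekly-aggregation ideas above (the months, column name and the `weekly_*` period variables are just placeholders, adapt them to whatever hot periods and frequency you settle on):

```python
from causalimpact import CausalImpact  # tfcausalimpact

# 0/1 dummy regressor, active only in the suspected "hot" months (placeholder choice).
dated_data['hot_period'] = dated_data.index.month.isin([9, 12]).astype(float)

# Optionally aggregate to weekly frequency: less noise, but sub-weekly seasonality is lost.
weekly_data = dated_data.resample('W').mean()

ci_daily = CausalImpact(dated_data, pre_period, post_period)
# pre/post periods need to be re-expressed at the weekly frequency (placeholder names):
ci_weekly = CausalImpact(weekly_data, weekly_pre_period, weekly_post_period)
```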
@WillianFuks Thanks for the great reply - really helpful. I'm currently working on something else, but will return to this work in the next few months and try all of your suggestions. Here is the standardized data with some white noise added.

Another interesting idea, suggested in the original paper, is to use groups of google trends search terms as covariates to proxy industry verticals, rather than just searches for a single term like 'garden furniture'.

As another example, I have a standardized dataset for a retailer that sells cookware. Y is again their impressions on google ads, and I have 5 covariates, where each covariate is the sum of google trends searches for the terms in the group. Each group tries to capture a different industry vertical. These are the groups:
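Each group covariate is built roughly like this (illustrative sketch only; the terms and group names below are made-up placeholders, not my actual groups):

```python
import pandas as pd

# trends_df: hypothetical DataFrame with one standardized Google Trends series per
# search term, indexed by date; impressions: the retailer's y series on the same index.
groups = {
    'cookware_vertical': ['frying pan', 'saucepan', 'casserole dish'],
    'kitchen_vertical': ['kitchen utensils', 'chopping board'],
}
covariates = pd.DataFrame(
    {name: trends_df[terms].sum(axis=1) for name, terms in groups.items()})

dated_data = pd.concat([impressions, covariates], axis=1)  # y first, then the X columns
```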
Here is that dataset if you want to have a play around. This is the extract from the paper (section 4, analysis 2):

> An important characteristic of counterfactual-forecasting approaches is that they do not require a setting in which a set of controls, selected at random, was exempt from the campaign. We therefore repeated the preceding analysis in the following way: we discarded the data from all control regions and, instead, used searches for keywords related to the advertiser’s industry, grouped into a handful of verticals, as covariates. In the absence of a dedicated set of control regions, such industry related time series can be very powerful controls, as they capture not only seasonal variations but also market-specific trends and events (though not necessarily advertiser-specific trends). A major strength of the controls chosen here is that time series on web searches are publicly available through Google Trends (http://www.google.com/trends/). This makes the approach applicable to virtually any kind of intervention. At the same time, the industry as a whole is unlikely to be moved by a single actor’s activities. This precludes a positive bias in estimating the effect of the campaign that would arise if a covariate was negatively affected by the campaign.
This post isn't about a particular problem with the package, but rather how to understand the results and improve the model. I hope this is the right place to post.
I work for a company that helps online retailers group their inventory into google ad campaigns. I am using Causal Impact to determine whether the release of a new feature within our software had an impact on the total impressions that an online retailer received through google ads - an impression is counted each time the retailer's ad is shown on a google search results page.
To begin with, I just have one X variable.
y - impressions over the past 365 days.
X - daily searches for the term ‘garden furniture’ using google trends
I expected searches for 'garden furniture' to correlate well with this particular retailer's impressions (the correlation was +0.58). Importantly, google search terms won't be influenced by the change we made to our software, so they satisfy the key requirement that the X variables are not affected by the intervention.
After standardising, the data looks like this.
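(For reference, the standardisation step is roughly the following; just a sketch with placeholder column names:)

```python
# df: daily pandas DataFrame with a DatetimeIndex, an 'impressions' column (y)
# and a 'garden_furniture' column (X from Google Trends). Column names are placeholders.
standardized = (df - df.mean()) / df.std()

# Response first, covariate(s) after: this is the frame passed to CausalImpact below.
dated_data = standardized[['impressions', 'garden_furniture']]
```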
And running Causal Impact shows that the intervention did not quite have a significant effect (p=0.07).
```python
from causalimpact import CausalImpact  # tfcausalimpact

pre_period_start = '20220405'
pre_period_end = '20230207'
post_period_start = '20230208'
post_period_end = '20230329'

pre_period = [pre_period_start, pre_period_end]
post_period = [post_period_start, post_period_end]

ci = CausalImpact(dated_data, pre_period, post_period)
ci.plot()
```
Questions
To get the posterior mean of the regression weight for X, I ran:

```python
tf.reduce_mean(
    ci.model.components_by_name['SparseLinearRegression/'].params_to_weights(
        ci.model_samples['SparseLinearRegression/_global_scale_variance'],
        ci.model_samples['SparseLinearRegression/_global_scale_noncentered'],
        ci.model_samples['SparseLinearRegression/_local_scale_variances'],
        ci.model_samples['SparseLinearRegression/_local_scales_noncentered'],
        ci.model_samples['SparseLinearRegression/_weights_noncentered'],
    ),
    axis=0)
```

which returns:

```
<tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.06836722], dtype=float32)>
```
The value for beta.X = 0.06836722 seems quite low and suggests that the garden_furniture searches don't explain impressions very well. Is this the correct interpretation?
When adding another X variable to the model, how can I determine whether adding that variable was useful or not?
I’ve also attempted to backtest the model by selecting the first 90 data points and an imaginary intervention date. As shown below, we do not get a significant effect. However, I’m concerned that the predictions don’t seem to align that closely with the actual y. Does this look like a problem?
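The back-test I ran is roughly this (a simplified sketch; the 70/90 split and the date formatting are illustrative):

```python
from causalimpact import CausalImpact  # tfcausalimpact

# Take only the first 90 points and pretend the "intervention" happened at point 70.
placebo_data = dated_data.iloc[:90]
fmt = '%Y%m%d'
placebo_pre = [placebo_data.index[0].strftime(fmt), placebo_data.index[69].strftime(fmt)]
placebo_post = [placebo_data.index[70].strftime(fmt), placebo_data.index[89].strftime(fmt)]

ci_placebo = CausalImpact(placebo_data, placebo_pre, placebo_post)
print(ci_placebo.summary())
ci_placebo.plot()
# Since nothing actually changed here, the effect should not be significant and the
# predictions should track y reasonably well in the fake post-period.
```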