22
$\begingroup$

Could someone walk me through an example of how to use DLM Kalman filtering in R on a time series? Say I have these values (quarterly values with yearly seasonality); how would you use a DLM to predict the next values? And BTW, do I have enough historical data (what is the minimum)?

89  2009Q1  
82  2009Q2  
89  2009Q3  
131 2009Q4  
97  2010Q1  
94  2010Q2  
101 2010Q3  
151 2010Q4  
100 2011Q1  
?   2011Q2

I'm looking for a cookbook-style, step-by-step R code answer. Accuracy of the prediction is not my main goal; I just want to learn the sequence of code that gives me a number for 2011Q2, even if I don't have enough data.

$\endgroup$
5
  • 3
    $\begingroup$ This may get better answers on stats.stackexchange.com $\endgroup$ Commented Mar 8, 2011 at 20:11
  • $\begingroup$ Bump... I still cannot understand how to do this. Any takers on answering the original post? $\endgroup$
    – datayoda
    Commented Apr 19, 2011 at 23:31
  • 2
    $\begingroup$ With a DLM it's not as cookbook-style as you might like. I'd take RockScience's answer (the DLM vignette) and walk through it. A DLM is more like designing a program than other techniques that simply require plugging in some data and tweaking some parameters. Ultimately, you're designing a set of arrays that implement something like a Hidden Markov Model, and the dlm package makes this as easy as possible. $\endgroup$
    – Wayne
    Commented Sep 16, 2011 at 17:37
  • $\begingroup$ Have you found a solution to your problem? I am looking for a solution to a similar time series problem but have been unable to find one. $\endgroup$
    – user8219
    Commented Dec 27, 2011 at 20:50
  • $\begingroup$ Have you worked through the paper suggested by @RockScience? Have you looked at the dlm package? As I said in my answer, DLMs are much more like creating a program than plugging some variables into a function call. datayoda never accepted an answer, so I'm not sure that they got past this observation. $\endgroup$
    – Wayne
    Commented Dec 27, 2011 at 21:15

3 Answers

18
$\begingroup$

The paper in the Journal of Statistical Software (JSS 39-02) compares five Kalman filtering R packages and gives sample code.

$\endgroup$
17
$\begingroup$

DLMs are cool, but they are not as simple as, say, ARIMA or other methods. In other methods, you plug in your data and then tweak some parameters of the algorithm, perhaps referring to various diagnostics to guide your settings.

With a DLM, you are creating a state space machine, which consists of several matrices that basically implement something like a Hidden Markov Model. Some packages (sspir I think, among others) expect that you understand the concept and what the matrices do. I'd highly recommend that you start with the dlm package, and as @RockScience recommends, walk through the vignette.
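
For reference, the "several matrices" can be read off the usual state space form of a constant DLM, sketched below. The notation is my assumption, chosen to line up with the dlm package's argument names (FF, GG, V, W, m0, C0 correspond to $F$, $G$, $V$, $W$, $m_0$, $C_0$); $\theta_t$ is the unobserved state and $y_t$ the observation.

$$y_t = F\,\theta_t + v_t, \qquad v_t \sim N(0, V)$$

$$\theta_t = G\,\theta_{t-1} + w_t, \qquad w_t \sim N(0, W)$$

$$\theta_0 \sim N(m_0, C_0)$$

Functions like dlmModPoly fill in $F$ and $G$ for you, so in practice you mostly choose the components and the variances in $V$ and $W$.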

With dlm you're going to basically take several steps:

  1. What kinds of components describe my series? A trend? Seasonality? Exogenous variables? You will use dlm tools like dlmModPoly to implement these components, using the + operator to join them together into one model.

  2. Create an R subroutine that takes however many parameters are required by this model, creates the components with those parameters, then adds them together and returns the resulting model.

  3. Use dlmMLE to do a search/optimization to find the appropriate parameters (using MLE, which is basically optimization, with the pitfalls that can occur in optimization). dlmMLE repeatedly calls your R subroutine with candidate parameters to create models, then tests them.

  4. Create your final model, using the R subroutine you created plus the parameters you found in step 3.

  5. Filter your data with dlmFilter, then perhaps smooth with dlmSmooth.

  6. If you use dlmModReg or do anything else that makes the model time-varying, you can't use dlmForecast to forecast your series, because it only works with constant models. In that case, pad your input data with NAs and let dlmFilter fill them in for you (a poor man's forecast).

  7. If you want to examine the components individually (say the trend, separately from the seasonal), you'll need to understand the matrices and what's in each column, plus understand a bit of how dlm puts them together (order matters!).
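
The steps above can be sketched roughly as follows, using the nine values from the question. This is only a minimal sketch under my own assumptions, not a recommended model: the structure (a local level plus a fixed quarterly seasonal), the log-scale parameterization, the optimizer bounds, and all variable names (`build_model`, `fit`, `y_pad`, and so on) are choices made for illustration. With so few observations the variance estimates will be poorly determined, so treat the resulting number as a demonstration of the mechanics.

```r
library(dlm)

# Quarterly series from the question, 2009Q1 to 2011Q1
y <- ts(c(89, 82, 89, 131, 97, 94, 101, 151, 100),
        start = c(2009, 1), frequency = 4)

# Steps 1-2: a function that builds the model from a parameter vector.
# Assumed structure (illustrative choice): local level (dlmModPoly, order 1)
# plus a quarterly seasonal whose pattern is held fixed (dlmModSeas, dW = 0).
# Variances are passed on the log scale so that they stay positive.
build_model <- function(par) {
  dlmModPoly(order = 1, dV = exp(par[1]), dW = exp(par[2])) +
    dlmModSeas(frequency = 4, dV = 0, dW = c(0, 0, 0))
}

# Step 3: maximum likelihood over the two log-variances.
# The bounds are an arbitrary guard against extreme values, not a dlm default.
fit <- dlmMLE(y, parm = c(0, 0), build = build_model,
              lower = c(-10, -10), upper = c(15, 15))
fit$convergence  # 0 means the optimizer reported convergence

# Step 4: final model with the estimated parameters
mod <- build_model(fit$par)

# Step 5: filter, then smooth
filt <- dlmFilter(y, mod)
smth <- dlmSmooth(filt)

# Forecast the next four quarters; the first value is 2011Q2
fc <- dlmForecast(filt, nAhead = 4)
fc$f

# Step 6 alternative: pad with NAs and let dlmFilter produce the forecasts.
# For a constant model like this one, these should match fc$f.
y_pad <- ts(c(y, rep(NA, 4)), start = start(y), frequency = 4)
filt_pad <- dlmFilter(y_pad, mod)
tail(filt_pad$f, 4)

# Step 7: pull out components from the smoothed states.
# Column 1 is the level; column 2 is the current seasonal effect.
states <- dropFirst(smth$s)
level <- states[, 1]
seasonal <- states[, 2]
```

The column positions in the last step depend on the order in which the components were added, which is the "order matters" point in step 7: here the polynomial part comes first, so its single state occupies column 1 and the three seasonal states follow.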

There's another package, whose name escapes me, which tries to create a front end that can use several of these packages (including dlm as the back end). Unfortunately, I've never gotten it to work well, but that might just be me.

I'd really recommend getting a book on DLMs. I got a couple of them and played a lot with dlm to get to where I am, and I'm not an expert by any means.

$\endgroup$
3
  • $\begingroup$ Thanks Wayne. I think my case is quite simple, in that I didn't spot any clear trend or seasonality on visual inspection. (However, if you are aware of any tests in R, please let me know and I will try to run them.) My main problem is that I don't know how to fill in the arguments (FF, V, GG, W, m0, C0, dV, etc.) of the dlm functions for my data. If I have bivariate series data (y = X1 + X2), e.g. (price = demand + supply), how would I go about calculating these arguments? $\endgroup$
    – nclfinance
    Commented Sep 17, 2011 at 13:01
  • 1
    $\begingroup$ @nclfinance Please read the FAQ and don't treat this place as a forum. $\endgroup$
    – user88
    Commented Sep 17, 2011 at 14:18
  • $\begingroup$ @nclfinance: Work through the dlm package's vignette. You will learn what you need to know. That's why I recommend dlm, because you don't create FF, etc, yourself. $\endgroup$
    – Wayne
    Commented Sep 18, 2011 at 1:44
8
$\begingroup$

I suggest you read the dlm vignette (http://cran.r-project.org/web/packages/dlm/vignettes/dlm.pdf), especially section 3.3.

$\endgroup$
