Dynamic time warping (DTW) checks and compares sequences with different lengths for sim­i­lar­i­ties. The dynamic time warping algorithm generates warping paths. Warping paths recognize matches using back­track­ing and degree of dif­fer­ence. Time dis­tor­tion or different speeds do not influence their ability to recognize sim­i­lar­i­ties. DTW is mainly used for speech recog­ni­tion, digital signature recog­ni­tion, financial market analysis.

What does dynamic time warping mean?

Dynamic time warping may sound like a tech­nol­o­gy from Star Trek. However, it is just an algorithm for comparing time sequences. The algorithm maps them on top of each other and compares them. “Warping” the sequences allows sim­i­lar­i­ties and matching patterns between the sequences to be rec­og­nized, even when they have different lengths and speeds.

As an example, if two walking sequences were examined for sim­i­lar­i­ties, the algorithm would be able to recognize identical patterns if the walking speed or distance was different. The same goes for comparing sig­na­tures of the same person, sim­i­lar­i­ties can be iden­ti­fied even if writing size differs.

What is the purpose of dynamic time warping?

DTW may seem like an abstract concept without any practical benefits. However, dynamic time warping actually performs an important analysis function in a wide variety of in­dus­tries. It is most ben­e­fi­cial for sequences that can be mapped and compared linearly, such as video and audio sequences, graphics or sta­tis­ti­cal models. When DTW detects matches, it can trigger certain actions, signals pr functions.

Pattern mea­sure­ment and pattern recog­ni­tion in sequences can be used to study similar de­vel­op­ments over different time periods. Dynamic time warping is often used in forward-looking tech­nolo­gies such as machine learning, su­per­vised learning and neural networks to train the analysis and response ca­pa­bil­i­ties of self-learning systems and evaluate data sets more ef­fi­cient­ly.

How does dynamic time warping work?

DTW searches for optimal matches to identify patterns and sim­i­lar­i­ties in different sequences. The prin­ci­ples “one-to-many” or “many-to-one” are helpful in this case. Different rules and con­di­tions are applied:

  • Each value from the first sequence must be compared with one or more values from the second sequence (and vice versa).
  • The first value from a sequence must be compared with the first value from the second sequence.
  • The last value from a sequence must be compared with the last value from the second sequence.
  • Mapping values from the first sequence to the values from the second sequence must increase mo­not­o­n­i­cal­ly. Values at the beginning and end of the sequences must have matched positions, without omission or overlap.

An optimal match exists when all spec­i­fi­ca­tions and con­di­tions are fulfilled. The cost function measures the degree of dif­fer­ence between the sequences’ values. An optimal match requires the cost function to be as low as possible.

The algorithm also creates a warping path that can detect optimal matches in sequences with different lengths. The warping path is created using back­track­ing. The algorithm maps one or more values from one sequence to points from the second sequence. This can find matches in time sequences with different lengths.

Where is dynamic time warping used?

Dynamic time warping can be used wherever data sets can be mapped and compared in linear sequences. For example, DTW is often used in pre-ex­am­i­na­tion and post-ex­am­i­na­tion in data analysis. This may be audio data, video data, al­phanu­mer­ic data or Big Data-based data sets.

DTW is also used in:

  • Pattern recog­ni­tion in audio sequences: DTW is plays an important role in speech recog­ni­tion. Recorded and stored speech patterns are mapped on top of each other to compare patterns using DTW. DTW uses adaptive time nor­mal­iza­tion to create a warping path for audio sequences with different lengths and speeds. This can even detect matches in vowels and con­so­nants with different lengths and speeds.
  • Pattern recog­ni­tion video sequences: DTW can map video sequences on top of each other for gesture and motion recog­ni­tion. A com­par­i­son and pattern recog­ni­tion can be performed in motion sequences with different temporal lengths or different speeds.
  • Pattern recog­ni­tion in financial data: DTW is also used in financial markets and corporate finance. DTW is the best option when creating forecasts of financial cycles, sales curves or stock market trends. DTW can be used to show similar or identical cycles and trends in market and company data over different time periods. The analysis is based on forecasts as well as on business and financial data.

Which tools use dynamic time warping?

The DTW algorithm is found in various open source software. These include:

  • DTW Suite with pro­gram­ming packages for Python and R.
  • FastDTW provides a Java im­ple­men­ta­tion for DTW al­go­rithms.
  • MatchBox im­ple­ments DTW for audio signals.
  • mlpy and pydtw also provide DTW functions as Python libraries for machine learning.
  • Gesture Recog­ni­tion provides DTW al­go­rithms for real-time gesture recog­ni­tion.

The following fast com­pu­ta­tion tech­niques use DTW as ef­fi­cient­ly as possible:

  • PruneDTW
  • SparseDTW
  • FastDTW
  • Mul­ti­scaleDTW

Dynamic time warping in Python

Below is an example of a DTW algorithm in Python code which il­lus­trates its com­plex­i­ty:

def dtw(s, t, window):
    n, m = len(s), len(t)
    w = np.max([window, abs(n-m)])
    dtw_matrix = np.zeros((n+1, m+1))
    
    for i in range(n+1):
        for j in range(m+1):
            dtw_matrix[i, j] = np.inf
    dtw_matrix[0, 0] = 0
    
    for i in range(1, n+1):
        for j in range(np.max([1, i-w]), np.min([m, i+w])+1):
            dtw_matrix[i, j] = 0
    
    for i in range(1, n+1):
        for j in range(np.max([1, i-w]), np.min([m, i+w])+1):
            cost = abs(s[i-1] - t[j-1])
            # take last min from a square box
            last_min = np.min([dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_matrix[i-1, j-1]])
            dtw_matrix[i, j] = cost + last_min
    return dtw_matrix
Python

The function presented in the example is passed three pa­ra­me­ters. Firstly, the two signals that are being examined are passed in the pa­ra­me­ters named s and t. The signals are usually arrays or vectors. The parameter called window can be used to specify how many other elements can be matched.

A matrix is created in the function and then ini­tial­ized with an infinite value. The main DTW step takes place in the final two nested “For” loops. The value stored in the variable last_min is added to the previous costs that are stored in the variable costs. These result from the distance between the two input values at the re­spec­tive index.

This value results from pre­vi­ous­ly cal­cu­lat­ed values in dynamic pro­gram­ming. You can select the minimum from the pre­vi­ous­ly cal­cu­lat­ed sur­round­ing values and add it to the pre­vi­ous­ly cal­cu­lat­ed costs. This step de­ter­mines whether two elements are a direct match, and then adds an element or deletes an element.

After the function has been carried out, you will receive a distance matrix with the warping path.

Dynamic time warping al­ter­na­tives

Al­ter­na­tives to DTW pattern recog­ni­tion in sequences and timings are:

  • Cor­re­la­tion Optimized Warping (COW)
  • Func­tion­al data analysis
  • Hidden Markov Model
  • Viterbi algorithm
Go to Main Menu