Linear Regression

  • Sequence data and Time series

Sequence data refers to data where the order of data points is important. This means the data isn't just a collection of observations, but a sequence of observations where each element has some logical connection to those before and after it. Typical examples of sequence data are text (a sequence of words or characters), a time series of stock prices, the DNA sequence in genomics (a sequence of genes), or even a series of actions performed by a user on a website, with varying lengths.

# a series of characters in a text document:
['H', 'e', 'l', 'l', 'o']

A time series is a type of sequence data with some time-related feature. The observations are collected at successive (often equally spaced) points in time, and each data point is associated with a timestamp. This could be anything from daily weather measurements to second-by-second stock prices. So the order of data is important: there's a clear temporal dimension, and the future is predicted based on the past, so forecasting.

# daily temperatures:
[(June 1, 70°F), (June 2, 72°F), (June 3, 68°F),...]
  • Linear Regression