woodwork.table_accessor.WoodworkTableAccessor.infer_temporal_frequencies¶
- WoodworkTableAccessor.infer_temporal_frequencies(temporal_columns=None)[source]¶
- Infers the observation frequency (daily, biweekly, yearly, etc) of each temporal column
in the DataFrame. Temporal columns are ones with the logical type Datetime or Timedelta. Not supported for Dask and Koalas DataFrames.
- Parameters
temporal_columns (list[str], optional) – Columns for which frequencies should be inferred. Must be columns that are present in the DataFrame and are temporal in nature. Defaults to None. If not specified, all temporal columns will have their frequencies inferred.
- Returns
- A dictionary where each key is a temporal column from the DataFrame, and the
value is its observation frequency represented as a pandas offset alias string (D, M, Y, etc.) or None if no uniform frequency was present in the data.
- Return type
(dict)
Note
- The pandas util
pd.infer_freq
, which is used in this method, has the following behaviors: - If even one row in a column does not follow the frequency seen in the remaining rows,
no frequency will be inferred. Example of otherwise daily data that skips one day:
['2011-01-03', '2011-01-04', '2011-01-05', '2011-01-07']
.
If any NaNs are present in the data, no frequency will be inferred.
- Pandas will use the largest offset alias available to it, so
W
will be inferred for weekly data instead of7D
. The list of available offset aliases, which include aliases such as
B
for business day orN
for nanosecond, can be found at https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases
- Pandas will use the largest offset alias available to it, so
- Offset aliases can be combined to create something like
2d1H
, which could also be expressed as ‘49H’. Pandas’ frequency inference will return the lower common alias,
49H
, in situations when it’d otherwise need to combine aliases.
- Offset aliases can be combined to create something like
- Offset strings can contain more information than just the offset alias. For example, a date range
pd.date_range(start="2020-01-01", freq="w", periods=10)
will be inferred to have frequencyW-SUN
. That string is an offset alias with an anchoring suffix that indicates that the data is not only observed at a weekly frequency, but that all the dates are on Sundays. More anchored offsets can be seen here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#anchored-offsets
- Some frequencies that can be defined for a
pd.date_range
cannot then be re-inferred by pandas’pd.infer_freq
. One example of this can be seen when using the business day offset alias
B
pd.date_range(start="2020-01-01", freq="4b", periods=10)
, which is a validfreq
parameter when building the date range, but is not then inferrable.
- Some frequencies that can be defined for a