Global Configuration Options

Woodwork provides global configuration options that control certain aspects of its behavior. This guide gives an overview of working with those options, including viewing the current settings and updating config values.

Viewing Config Settings

After you’ve imported Woodwork, you can view the current configuration options with ww.config, as shown below.

[1]:
import woodwork as ww
ww.config
[1]:
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 10
numeric_categorical_threshold: -1

The output of ww.config lists each of the available config variables followed by its current setting. In the output above, the natural_language_threshold config variable is set to 10 and the numeric_categorical_threshold config variable is set to -1.

Updating Config Settings

To update a config variable, call the ww.config.set_option function. This function requires two arguments: the name of the config variable to update and the new value to set.

As an example, update the natural_language_threshold config variable to have a value of 25 instead of the default value of 10.

[2]:
ww.config.set_option('natural_language_threshold', 25)
ww.config
[2]:
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 25
numeric_categorical_threshold: -1

As you can see from the output above, the value for the natural_language_threshold config variable has been updated to 25.

Get Value for a Specific Config Variable

If you need the value of a specific config variable, call the ww.config.get_option function with the name of that variable.

[3]:
ww.config.get_option('natural_language_threshold')
[3]:
25
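
Because set_option and get_option read and write global state, it can be handy to change a value only for a short stretch of code and then put the earlier value back. The sketch below shows one way to do that with a context manager. The temporary_option helper and the StubConfig class are hypothetical names introduced for illustration and are not part of Woodwork; StubConfig only mimics the get_option and set_option calls described above so that the example runs on its own.

```python
from contextlib import contextmanager


class StubConfig:
    """Hypothetical stand-in that mimics get_option/set_option (not Woodwork code)."""

    def __init__(self, **options):
        self._options = dict(options)

    def get_option(self, name):
        return self._options[name]

    def set_option(self, name, value):
        self._options[name] = value


@contextmanager
def temporary_option(config, name, value):
    """Set a config option for the duration of a with-block, then restore it."""
    previous = config.get_option(name)
    config.set_option(name, value)
    try:
        yield config
    finally:
        # Restore the earlier value even if the block raised an exception.
        config.set_option(name, previous)


config = StubConfig(natural_language_threshold=10)
with temporary_option(config, "natural_language_threshold", 25):
    inside = config.get_option("natural_language_threshold")  # 25 inside the block
after = config.get_option("natural_language_threshold")  # back to 10 afterwards
```

With Woodwork installed, passing ww.config in place of the stand-in should behave the same way, assuming it exposes get_option and set_option as shown in this guide.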

Resetting to Default Values

To reset a config variable to its default value, call the ww.config.reset_option function with the name of the variable to reset.

As an example, reset the natural_language_threshold config variable to its default value.

[4]:
ww.config.reset_option('natural_language_threshold')
ww.config
[4]:
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 10
numeric_categorical_threshold: -1

Available Config Settings

This section provides an overview of the current config options that can be set within Woodwork.

Natural Language Threshold

The natural_language_threshold config variable helps control the distinction between Categorical and NaturalLanguage logical types during type inference. More specifically, this threshold represents the average string length that is used to distinguish between these two types. If the average string length in a column is greater than this threshold, the column is inferred as a NaturalLanguage column; otherwise, it is inferred as a Categorical column. The natural_language_threshold config variable defaults to 10.
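
The rule described above can be sketched in a few lines of plain Python. This is an illustration of the comparison only, not Woodwork's inference code: the infer_string_type function is a hypothetical name, it assumes a non-empty list of strings, and it returns the logical type names as plain strings.

```python
def infer_string_type(values, natural_language_threshold=10):
    """Illustrative only: pick a logical type name from the average string length."""
    # Assumes `values` is a non-empty list of strings.
    average_length = sum(len(value) for value in values) / len(values)
    if average_length > natural_language_threshold:
        return "NaturalLanguage"
    return "Categorical"


sizes = ["small", "medium", "large"]  # average length is about 5.3
species = ["sugar maple", "white oak", "black walnut"]  # average length is about 10.7

sizes_type = infer_string_type(sizes)
species_type_default = infer_string_type(species)
species_type_raised = infer_string_type(species, natural_language_threshold=25)
```

Under this sketch, sizes falls below the default threshold of 10 and is treated as Categorical, while species sits just above it and is treated as NaturalLanguage. Raising the threshold to 25 moves species back to Categorical.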

Numeric Categorical Threshold

Woodwork provides the option to infer numeric columns as the Categorical logical type if they have few enough unique values. The numeric_categorical_threshold config variable sets the number of unique values below which numeric columns are inferred as categorical. The default threshold is -1, which means numeric columns are never inferred as categorical by default: a column cannot have fewer than zero unique values, so no column can fall below the threshold.
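
A minimal sketch of this comparison follows. It is not Woodwork's implementation: the infer_numeric_type function is a hypothetical name, the strict "fewer than" comparison is an assumption based on the wording above, and "Numeric" is a placeholder for whichever numeric logical type would otherwise be inferred.

```python
def infer_numeric_type(values, numeric_categorical_threshold=-1):
    """Illustrative only: treat a numeric column as categorical if it has few unique values."""
    unique_count = len(set(values))
    # Assumption: "below the threshold" is read as a strict comparison.
    if unique_count < numeric_categorical_threshold:
        return "Categorical"
    # Placeholder label for the numeric logical type that would otherwise apply.
    return "Numeric"


ratings = [1, 2, 3, 2, 1, 3, 3]  # three unique values

ratings_type_default = infer_numeric_type(ratings)
ratings_type_raised = infer_numeric_type(ratings, numeric_categorical_threshold=5)
```

With the default of -1 the unique count can never be below the threshold, so ratings stays numeric. With a threshold of 5, its three unique values fall below the threshold and the sketch returns Categorical.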