Woodwork Global Configuration Options

Woodwork provides global configuration options that let users control certain aspects of its behavior. This guide gives an overview of working with these options, including viewing the current settings and updating config values.

Viewing Config Settings

First, we will demonstrate how to display the current configuration options. Once you have imported Woodwork, you can view the options with ww.config as shown below.

[1]:
import woodwork as ww
ww.config
[1]:
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 10
numeric_categorical_threshold: -1

The output of ww.config lists each of the available config variables followed by its current setting. In the output above, the natural_language_threshold config variable has been set to 10 and the numeric_categorical_threshold config variable has been set to -1.

Updating Config Settings

To update a config variable, call the ww.config.set_option function. This function requires two arguments: the name of the config variable to update and the new value to set.

To illustrate this, we will update the natural_language_threshold config variable to have a value of 25 instead of the default value of 10:

[2]:
ww.config.set_option('natural_language_threshold', 25)
ww.config
[2]:
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 25
numeric_categorical_threshold: -1

As you can see from the output above, the value for the natural_language_threshold config variable has now been updated to 25.

Get Value for a Specific Config Variable

If you need the value that is set for a specific config variable, you can retrieve it with the ww.config.get_option function, passing in the name of the config variable:

[3]:
ww.config.get_option('natural_language_threshold')
[3]:
25

Resetting to Default Values

Finally, config variables can be reset to their default values using the ww.config.reset_option function, passing in the name of the variable to reset. To demonstrate this, we will reset the natural_language_threshold config variable to its default value:

[4]:
ww.config.reset_option('natural_language_threshold')
ww.config
[4]:
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 10
numeric_categorical_threshold: -1
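
The set, get, and reset calls shown above can be combined into a small helper that changes an option only for the duration of a block of code. The following is a minimal sketch rather than part of Woodwork: temporary_option and StubConfig are hypothetical names introduced for illustration, and the helper assumes only that the object passed in offers get_option and set_option methods with the call shapes shown in this guide.

```python
from contextlib import contextmanager


@contextmanager
def temporary_option(config, name, value):
    """Set `name` to `value` on `config`, restoring the previous value on exit.

    Hypothetical helper: `config` is any object with get_option/set_option
    methods shaped like the ww.config calls shown in this guide.
    """
    previous = config.get_option(name)
    config.set_option(name, value)
    try:
        yield config
    finally:
        # Restore the earlier value even if the body raised an exception.
        config.set_option(name, previous)


class StubConfig:
    """Hypothetical stand-in for ww.config so the sketch runs without Woodwork."""

    def __init__(self):
        # Defaults copied from the settings listed in this guide.
        self._values = {
            "natural_language_threshold": 10,
            "numeric_categorical_threshold": -1,
        }

    def get_option(self, name):
        return self._values[name]

    def set_option(self, name, value):
        self._values[name] = value


cfg = StubConfig()
with temporary_option(cfg, "natural_language_threshold", 25):
    inside = cfg.get_option("natural_language_threshold")
after = cfg.get_option("natural_language_threshold")
```

Under the stated assumption, passing ww.config in place of the stub should behave the same way. Note that the helper restores the previous value rather than the default, so it differs from reset_option when the option had already been changed.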

Available Config Settings

This section provides an overview of the current config options that can be set within Woodwork.

Natural Language Threshold

The natural_language_threshold config variable helps control the distinction between Categorical and NaturalLanguage logical types during type inference. More specifically, this threshold is the average string length used to distinguish between the two types. If the average string length in a column is greater than this threshold, the column will be inferred as a NaturalLanguage column; otherwise, it will be inferred as a Categorical column. The natural_language_threshold config variable defaults to 10.
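
The rule described above can be sketched in plain Python. This is an illustration of the comparison only, not Woodwork's actual inference code: the function names are hypothetical, and details such as ignoring non-string values are assumptions made for the example.

```python
def average_string_length(values):
    """Return the mean length of the string values, or 0.0 if there are none."""
    # Assumption for this sketch: non-string entries (e.g. None) are ignored.
    strings = [v for v in values if isinstance(v, str)]
    if not strings:
        return 0.0
    return sum(len(s) for s in strings) / len(strings)


def infer_string_type(values, threshold=10):
    """Pick a logical type name by comparing average length to the threshold."""
    if average_string_length(values) > threshold:
        return "NaturalLanguage"
    return "Categorical"


colors = ["red", "green", "blue"]
sentences = ["The quick brown fox", "jumped over the lazy dog"]

colors_type = infer_string_type(colors)
sentences_type = infer_string_type(sentences)
sentences_type_high = infer_string_type(sentences, threshold=25)
```

With the default threshold of 10, the short color names average 4 characters and fall on the Categorical side, while the two sentences average 21.5 characters and fall on the NaturalLanguage side. Raising the threshold to 25 moves the sentences to Categorical as well.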

Numeric Categorical Threshold

Woodwork provides the option to infer numeric columns as the Categorical logical type if they have few enough unique values. The numeric_categorical_threshold config variable sets the number of unique values below which numeric columns will be inferred as Categorical. The default threshold is -1, meaning that numeric columns will not be inferred as Categorical by default (since the smallest number of unique values a column can have is zero).
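
To see why a threshold of -1 turns this behavior off, the check can be sketched as a comparison between a column's unique-value count and the threshold. This is a simplified illustration and not Woodwork's implementation: the function name is hypothetical, and the strict less-than comparison is an assumption based on the word "below" in the description above.

```python
def numeric_inferred_as_categorical(values, threshold=-1):
    """Return True if the number of unique values falls below the threshold."""
    n_unique = len(set(values))
    # A unique count is never negative, so a threshold of -1 can never be met.
    return n_unique < threshold


ratings = [1, 2, 3, 1, 2, 3, 3]  # three unique values

default_result = numeric_inferred_as_categorical(ratings)
raised_result = numeric_inferred_as_categorical(ratings, threshold=4)
boundary_result = numeric_inferred_as_categorical(ratings, threshold=3)
```

Here the default leaves the column numeric, a threshold of 4 would treat it as Categorical, and a threshold of 3 would not, because three unique values is not below three.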