Woodwork contains global configuration options that you can use to control the behavior of certain aspects of Woodwork. This guide provides an overview of working with those options, including viewing the current settings and updating the config values.
After importing Woodwork, you can view the current configuration options with ww.config, as shown below.
[1]:
import woodwork as ww
ww.config
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 10
numeric_categorical_threshold: -1
The output of ww.config lists each available config variable followed by its current setting. In the output above, the natural_language_threshold config variable is set to 10 and the numeric_categorical_threshold config variable is set to -1.
To update a config variable, call the ww.config.set_option function. This function requires two arguments: the name of the config variable to update and the new value to set.
As an example, update the natural_language_threshold config variable to have a value of 25 instead of the default value of 10.
[2]:
ww.config.set_option('natural_language_threshold', 25)
ww.config
Woodwork Global Config Settings
-------------------------------
natural_language_threshold: 25
numeric_categorical_threshold: -1
As you can see from the output above, the value for the natural_language_threshold config variable has been updated to 25.
If you need the value of a specific config variable, you can retrieve it with the ww.config.get_option function, passing in the name of the config variable.
[3]:
ww.config.get_option('natural_language_threshold')
Config variables can be reset to their default values using the ww.config.reset_option function, passing in the name of the variable to reset.
As an example, reset the natural_language_threshold config variable to its default value.
[4]:
ww.config.reset_option('natural_language_threshold')
ww.config
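The get and set calls described above can be combined to change an option only for a limited scope. The following is a minimal sketch, not part of Woodwork: temporary_option is a hypothetical helper, and StandInConfig is a hypothetical stand-in that mimics only the get_option, set_option, and reset_option calls from this guide so that the example runs without Woodwork installed.

```python
from contextlib import contextmanager


class StandInConfig:
    """Hypothetical stand-in mimicking the ww.config calls used in this guide."""

    _defaults = {
        "natural_language_threshold": 10,
        "numeric_categorical_threshold": -1,
    }

    def __init__(self):
        self._values = dict(self._defaults)

    def get_option(self, key):
        return self._values[key]

    def set_option(self, key, value):
        self._values[key] = value

    def reset_option(self, key):
        self._values[key] = self._defaults[key]


@contextmanager
def temporary_option(config, key, value):
    """Set an option inside a with-block, then restore the previous value."""
    previous = config.get_option(key)
    config.set_option(key, value)
    try:
        yield config
    finally:
        # Restore the earlier value even if the with-block raises.
        config.set_option(key, previous)


config = StandInConfig()  # assumption: with Woodwork installed, pass ww.config
with temporary_option(config, "natural_language_threshold", 25):
    inside = config.get_option("natural_language_threshold")
after = config.get_option("natural_language_threshold")
print(inside, after)
```

The helper restores the value that was in place before the block rather than calling reset_option, so a non-default setting made earlier in a session is preserved.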
This section provides an overview of the current config options that can be set within Woodwork.
The natural_language_threshold config variable helps control the distinction between Categorical and NaturalLanguage logical types during type inference. More specifically, this threshold represents the average string length that is used to distinguish between these two types. If the average string length in a column is greater than this threshold, the column is inferred as a NaturalLanguage column; otherwise, it is inferred as a Categorical column. The natural_language_threshold config variable defaults to 10.
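The rule described above can be illustrated with a short sketch in plain Python. This is not Woodwork's inference code: infer_string_type is a hypothetical function that applies only the average-length comparison, and the handling of non-string and empty input is an assumption.

```python
def infer_string_type(values, threshold=10):
    """Apply the average-string-length rule to a list of values.

    Hypothetical illustration only; Woodwork's real type inference
    considers more than this single comparison.
    """
    strings = [v for v in values if isinstance(v, str)]
    if not strings:
        # Assumption: with no strings to measure, fall back to Categorical.
        return "Categorical"
    average_length = sum(len(s) for s in strings) / len(strings)
    # Strictly greater than the threshold means NaturalLanguage.
    if average_length > threshold:
        return "NaturalLanguage"
    return "Categorical"


colors = ["red", "green", "blue"]
sentences = ["The quick brown fox", "jumped over the lazy dog"]

print(infer_string_type(colors))
print(infer_string_type(sentences))
print(infer_string_type(sentences, threshold=25))
```

Raising the threshold, as in the last call, makes the same column fall on the Categorical side of the comparison.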
Woodwork provides the option to infer numeric columns as the Categorical logical type if they have few enough unique values. The numeric_categorical_threshold config variable sets the threshold of unique values below which numeric columns are inferred as Categorical. The default threshold is -1, meaning that numeric columns are not inferred as Categorical by default (because a column cannot have fewer than zero unique values).
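The effect of this threshold can be sketched in plain Python. The function below is hypothetical and is not Woodwork's implementation; in particular, the strict "fewer than" comparison follows the wording above and is an assumption about the exact boundary behavior.

```python
def is_numeric_categorical(values, threshold=-1):
    """Return True if a numeric column has fewer unique values than threshold.

    Hypothetical illustration only; the strict comparison is an assumption.
    """
    unique_count = len(set(values))
    return unique_count < threshold


ratings = [1, 2, 3, 1, 2, 3, 3, 1]  # three unique values

print(is_numeric_categorical(ratings))               # default threshold of -1
print(is_numeric_categorical(ratings, threshold=5))  # 3 unique values, under 5
```

With the default of -1, the comparison can never succeed, which matches the statement that numeric columns are not inferred as Categorical unless the threshold is raised.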