API Reference

kim.pre_analysis Modules#

pairwise_analysis(xdata: Array, ydata: Array, metric_calculator: MetricBase, sst: bool = False, ntest: int = 100, alpha: float = 0.05, n_jobs: int = -1, seed_shuffle: int = 1234, verbose: int = 0)[source]#

Perform the pairwise analysis using either mutual information or correlation coefficient.

Parameters:

xdata (array-like) – the predictors with shape (Ns, Nx)
ydata (array-like) – the predictands with shape (Ns, Ny)
metric_calculator (class) – the metric calculator
sst (bool) – whether to perform statistical significance test. Defaults to False.
ntest (int) – number of shuffled samples in sst. Defaults to 100.
alpha (float) – the significance level. Defaults to 0.05.
n_jobs (int) – the number of processors/threads used by joblib. Defaults to -1.
seed_shuffle (int) – the random seed number for doing shuffle test. Defaults to 1234.
verbose (int) – the verbosity level (0: normal, 1: debug). Defaults to 0.

Returns:

the sensitivity, the sensitivity mask

Return type:

(array, array)
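The shuffle test behind sst, ntest, and alpha can be illustrated with a minimal, standard-library sketch. It is not the kim implementation: the helper names (pearson, pairwise_shuffle_sketch) are hypothetical, the absolute Pearson correlation stands in for the metric calculator, and the threshold rule (the (1 - alpha) quantile of the shuffled scores) is an assumption about how such a test is commonly done.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def pairwise_shuffle_sketch(xdata, ydata, ntest=100, alpha=0.05, seed_shuffle=1234):
    """Return (sensitivity, mask) as nested lists with shape (Nx, Ny).

    xdata and ydata are lists of rows with shapes (Ns, Nx) and (Ns, Ny).
    """
    rng = random.Random(seed_shuffle)
    x_cols = [list(col) for col in zip(*xdata)]
    y_cols = [list(col) for col in zip(*ydata)]
    sensitivity, mask = [], []
    for x in x_cols:
        sens_row, mask_row = [], []
        for y in y_cols:
            observed = abs(pearson(x, y))
            # Null distribution: shuffling x destroys any dependence on y.
            null = []
            for _ in range(ntest):
                shuffled = x[:]
                rng.shuffle(shuffled)
                null.append(abs(pearson(shuffled, y)))
            null.sort()
            # Assumed rule: compare against the (1 - alpha) quantile.
            idx = min(ntest - 1, int(math.ceil((1 - alpha) * ntest)) - 1)
            sens_row.append(observed)
            mask_row.append(observed > null[idx])
        sensitivity.append(sens_row)
        mask.append(mask_row)
    return sensitivity, mask


# Toy data: the single predictand depends on the first predictor only.
data_rng = random.Random(0)
xdata = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(300)]
ydata = [[2.0 * row[0] + 0.1 * data_rng.gauss(0, 1)] for row in xdata]
sensitivity, mask = pairwise_shuffle_sketch(xdata, ydata)
```

In this toy case the entry for the first predictor is large and passes the test, while the second predictor scores close to zero. Both returned lists follow the (Nx, Ny) layout that the Data class documents for sensitivity and sensitivity_mask.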

pc(xdata: Array, ydata: Array, metric_calculator: MetricBase, cond_metric_calculator: MetricBase, ntest: int = 100, alpha: float = 0.05, Ncond_max: int = 3, n_jobs: int = -1, seed_shuffle: int = 1234, verbose: int = 0)[source]#

The modified PC algorithm, adapted to the X → Y mapping problem.

Parameters:

xdata (array-like) – the predictors with shape (Ns, Nx)
ydata (array-like) – the predictands with shape (Ns, Ny)
metric_calculator (class) – the metric calculator for unconditional relation
cond_metric_calculator (class) – the metric calculator for conditional relation
ntest (int) – number of shuffled samples in sst. Defaults to 100.
alpha (float) – the significance level. Defaults to 0.05.
Ncond_max (int) – the maximum number of conditions used by cond_metric_calculator. Defaults to 3.
n_jobs (int) – the number of processors/threads used by joblib. Defaults to -1.
seed_shuffle (int) – the random seed number for doing shuffle test. Defaults to 1234.
verbose (int) – the verbosity level (0: normal, 1: debug). Defaults to 0.

Returns:

the sensitivity, the sensitivity mask, the conditional sensitivity mask

Return type:

(array, array, array)
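The idea of pruning a pairwise link with a conditional test can be sketched as follows. This is only an illustration under stated assumptions: partial correlation stands in for cond_metric_calculator, a fixed cutoff replaces the shuffle test, a single conditioning variable is used (rather than up to Ncond_max), and all names are hypothetical rather than part of kim.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom if denom > 0.0 else 0.0


rng = random.Random(1234)
n = 2000
driver = [rng.gauss(0, 1) for _ in range(n)]         # true cause of y
proxy = [d + 0.5 * rng.gauss(0, 1) for d in driver]  # related to y only via driver
y = [d + 0.5 * rng.gauss(0, 1) for d in driver]

raw = pearson(proxy, y)                        # strong pairwise relation
conditioned = partial_corr(proxy, y, driver)   # nearly vanishes given driver
driver_given_proxy = partial_corr(driver, y, proxy)

threshold = 0.1  # hypothetical cutoff used instead of a shuffle test
keep_proxy = abs(conditioned) > threshold
keep_driver = abs(driver_given_proxy) > threshold
```

Here proxy passes a pairwise test but is dropped once driver is conditioned on, whereas driver stays relevant even when proxy is conditioned on. That difference is what separates a conditional sensitivity mask from the plain sensitivity mask.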

kim.data.Data Class#

class Data(xdata: Array | None = None, ydata: Array | None = None, fdata: PosixPath | None = None, xscaler_type: str = '', yscaler_type: str = '')[source]#

The Data object.

xdata#

the copy of xdata

Type: array-like

ydata#

the copy of ydata

Type: array-like

Ns#

the number of samples

Type: int

Nx#

the number of predictors

Type: int

Ny#

the number of predictands

Type: int

xscaler_type#

the type of xdata scaler, either ‘minmax’, ‘normalize’, ‘standard’, or ‘log’

Type: str

yscaler_type#

the type of ydata scaler, either ‘minmax’, ‘normalize’, ‘standard’, or ‘log’

Type: str

xscaler#

the xdata scaler

Type: str

yscaler#

the ydata scaler

Type: str

sensitivity_config#

the sensitivity analysis configuration

Type: dict

sensitivity_done#

whether the sensitivity analysis is performed

Type: bool

sensitivity#

the calculated sensitivity with shape (Nx, Ny)

Type: array-like

sensitivity_mask#

the calculated sensitivity mask with shape (Nx, Ny)

Type: array-like

cond_sensitivity_mask#

the calculated conditional sensitivity mask with shape (Nx, Ny)

Type: array-like

__init__(xdata: Array | None = None, ydata: Array | None = None, fdata: PosixPath | None = None, xscaler_type: str = '', yscaler_type: str = '')[source]#

Initialization function.

Parameters:

xdata (array-like) – the predictors with shape (Ns, Nx)
ydata (array-like) – the predictands with shape (Ns, Ny)
fdata (PosixPath) – the root path from which an existing data instance will be loaded
xscaler_type (str) – the type of xdata scaler, either minmax, normalize, standard, log, or '' (the empty string, which is the default)
yscaler_type (str) – the type of ydata scaler, either minmax, normalize, standard, log, or '' (the empty string, which is the default)

calculate_sensitivity(method: str = 'gsa', metric: str = 'it-bins', sst: bool = False, ntest: int = 100, alpha: float = 0.05, bins: int = 10, k: int = 5, n_jobs=-1, seed_shuffle: int = 1234, verbose: int = 0)[source]#

Calculate the sensitivity between self.xdata and self.ydata using either the pairwise_analysis or the pc method. The results are stored in self.sensitivity_done, self.sensitivity, self.sensitivity_mask, and self.cond_sensitivity_mask.

Parameters:

method (str) – The preliminary analysis method, one of: gsa (the pairwise global sensitivity analysis) or pc (a modified PC algorithm that adds a conditional independence test after gsa). Defaults to gsa.
metric (str) – The metric used to calculate the sensitivity, one of: it-bins (the information-theoretic measures, MI and CMI, using a binning approach), it-knn (the information-theoretic measures, MI and CMI, using a knn approach), or corr (the correlation coefficient). Defaults to it-bins.
sst (bool) – Whether to perform the statistical significance test or the shuffle test. Defaults to False.
ntest (int) – The number of shuffled samples in sst. Defaults to 100.
alpha (float) – The significance level. Defaults to 0.05.
bins (int) – The number of bins for each dimension when metric == “it-bins”. Defaults to 10.
k (int) – The number of nearest neighbors when metric == “it-knn”. Defaults to 5.
n_jobs (int) – The number of processors/threads used by joblib.Parallel. Defaults to -1.
seed_shuffle (int) – The random seed number for doing shuffle test. Defaults to 1234.
verbose (int) – The verbosity level (0: normal, 1: debug). Defaults to 0.
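To make the it-bins metric and its bins parameter concrete, here is a minimal sketch of a binned mutual information estimate. It is an illustrative plug-in estimator written with the standard library, not the code kim uses: the function names, the equal-width binning, and the choice of natural-log units are all assumptions.

```python
import math
import random
from collections import Counter


def bin_index(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0.0:
        return [0] * len(values)
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information_bins(x, y, bins=10):
    """Plug-in estimate of I(X; Y) in nats from a 2-D histogram."""
    n = len(x)
    bx, by = bin_index(x, bins), bin_index(y, bins)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts
        mi += p_xy * math.log(count * n / (px[i] * py[j]))
    return mi


rng = random.Random(1234)
x = [rng.gauss(0, 1) for _ in range(1000)]
y_dependent = [v * v + 0.05 * rng.gauss(0, 1) for v in x]  # nonlinear in x
y_independent = [rng.gauss(0, 1) for _ in x]

mi_dependent = mutual_information_bins(x, y_dependent, bins=10)
mi_independent = mutual_information_bins(x, y_independent, bins=10)
```

The dependent pair yields a clearly larger estimate than the independent pair, even though y_dependent is a symmetric function of x that a linear correlation coefficient would largely miss. The small positive value for the independent pair is the usual finite-sample bias of histogram estimators, which is one reason a shuffle test is useful for deciding significance.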

load(rootpath: PosixPath = PosixPath('.'), check_xy: bool = True, overwrite: bool = False)[source]#

Load data and sensitivity analysis results from specified location, including:

data (x, y) and scaler
sensitivity analysis configuration
sensitivity analysis results

Parameters: rootpath (PosixPath) – the root path from which data will be loaded

save(rootpath: PosixPath = PosixPath('.'))[source]#

Save data and sensitivity analysis results to specified location, including:

data (x, y) and scaler
sensitivity analysis configuration
sensitivity analysis results

Parameters: rootpath (PosixPath) – the root path where data will be saved

property xdata_scaled#

Perform normalization on self.xdata based on the given normalization type self.xscaler_type.

Returns: the scaled self.xdata
Return type: array-like

property ydata_scaled#

Perform normalization on self.ydata based on the given normalization type self.yscaler_type.

Returns: the scaled self.ydata
Return type: array-like
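As a rough illustration of what per-column scaling does, the sketch below implements the conventional definitions of min-max and standard scaling with the standard library. It assumes those textbook definitions (including the population standard deviation) and does not claim to reproduce how kim implements the minmax or standard scaler types. The normalize and log types are left out because their exact definitions are not stated here.

```python
import math


def minmax_scale(column):
    """Rescale a column linearly so that its values lie in [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]


def standard_scale(column):
    """Shift and rescale a column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std if std else 0.0 for v in column]


column = [2.0, 4.0, 6.0, 8.0]
scaled_minmax = minmax_scale(column)
scaled_standard = standard_scale(column)
```

For data with shape (Ns, Nx), a transformation like this would typically be applied to each of the Nx columns independently.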

kim.map.KIM Class#

class KIM(data: Data, map_configs: dict, mask_option: str = 'cond_sensitivity', map_option: str = 'many2one', other_mask: Array | None = None, name: str = 'kim')[source]#

The class for knowledge-informed mapping training, prediction, saving and loading.

Attributes:#

data (Data): the copy of the __init__ argument
map_configs (dict): the copy of the __init__ argument
map_option (str): the copy of the __init__ argument
mask_option (str): the copy of the __init__ argument
trained (bool): whether KIM has been trained
loaded_from_other_sources (bool): whether KIM is loaded from other sources
Ns (int): the number of ensemble members (from data.Ns)
Nx (int): the number of input features (from data.Nx)
Ny (int): the number of output features (from data.Ny)
mask (Array): the masked array with shape (Nx, Ny)
_n_maps (int): the number of maps
_maps (int): the trained maps

__init__(data: Data, map_configs: dict, mask_option: str = 'cond_sensitivity', map_option: str = 'many2one', other_mask: Array | None = None, name: str = 'kim')[source]#

Initialization function.

Parameters:

data (Data) – the Data object containing the ensemble data and sensitivity analysis result.
map_configs (dict) – the mapping configuration, including all the arguments of Map class except x and y.
mask_option (str) – the masking option including “sensitivity” (using data.sensitivity_mask), and “cond_sensitivity” (using data.cond_sensitivity_mask).
map_option (str) – the map option including “many2one”: knowledge-informed mapping using sensitivity analysis result as filter, and “many2many”: normal mapping without being knowledge-informed
other_mask (Array) – an additional mask with size Nx to be assigned to self.mask. Defaults to None.
name (str) – the name of the KIM object
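One way to picture the many2one option is that each predictand gets its own map, fed only by the predictors that the (Nx, Ny) mask marks as relevant. The sketch below shows that column selection on plain nested lists. It rests on that interpretation, which is an assumption, and the helper names are hypothetical rather than part of kim.

```python
def select_inputs(mask, j):
    """Indices of predictors whose mask entry for predictand j is True."""
    return [i for i, row in enumerate(mask) if row[j]]


def many2one_inputs(x_rows, mask):
    """For each predictand, keep only the masked-in predictor columns."""
    ny = len(mask[0])
    subsets = []
    for j in range(ny):
        cols = select_inputs(mask, j)
        subsets.append([[row[i] for i in cols] for row in x_rows])
    return subsets


# Mask with shape (Nx, Ny) = (3, 2)
mask = [[True, False], [False, True], [True, True]]
# Predictors with shape (Ns, Nx) = (2, 3)
x_rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

subsets = many2one_inputs(x_rows, mask)
# subsets[0] keeps predictors 0 and 2; subsets[1] keeps predictors 1 and 2
```

Under the same reading, many2many would skip this filtering and pass all Nx predictors to a single map that outputs every predictand.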

evaluate_maps_on_givendata()[source]#

Perform predictions on the given dataset.

load(rootpath: PosixPath = PosixPath('.'))[source]#

Load the trained KIM from specified location.

Parameters: rootpath (PosixPath) – the root path from which the KIM will be loaded

property maps#

property n_maps#

predict(x: Array | None = None)[source]#

Prediction using the trained KIM.

Parameters: x (Array) – the predictors with shape (Ns, …, Nx)

save(rootpath: PosixPath = PosixPath('.'))[source]#

Save the KIM, including:

the data object
all the mappings
the remaining configurations

Parameters: rootpath (PosixPath) – the root path where the KIM will be saved

train(verbose: int = 0)[source]#

kim.map.Map Class#

class Map(x: ~jax.Array, y: ~jax.Array, model_type: type = <class 'kim.mapping_model.mlp.MLP'>, n_model: int = 0, ensemble_type: str = 'single', model_hp_choices: dict = {}, model_hp_fixed: dict = {}, optax_hp_choices: dict = {}, optax_hp_fixed: dict = {}, dl_hp_choices: dict = {}, dl_hp_fixed: dict = {}, training_parallel: bool = True, ens_seed: int | None = None, parallel_config: dict | None = None, device: ~jaxlib._jax.Device | None = None)[source]#

The class for training, predicting with, saving, and loading a single mapping. Ensemble training is supported either serially or in parallel, using joblib.

x#