API Reference#
kim.pre_analysis Modules#
- pairwise_analysis(xdata: Array, ydata: Array, metric_calculator: MetricBase, sst: bool = False, ntest: int = 100, alpha: float = 0.05, n_jobs: int = -1, seed_shuffle: int = 1234, verbose: int = 0)[source]#
Perform the pairwise analysis using either mutual information or correlation coefficient.
- Parameters:
xdata (array-like) – the predictors with shape (Ns, Nx)
ydata (array-like) – the predictands with shape (Ns, Ny)
metric_calculator (class) – the metric calculator
sst (bool) – whether to perform statistical significance test. Defaults to False.
ntest (int) – number of shuffled samples in sst. Defaults to 100.
alpha (float) – the significance level. Defaults to 0.05.
n_jobs (int) – the number of processors/threads used by joblib. Defaults to -1.
seed_shuffle (int) – the random seed number for doing shuffle test. Defaults to 1234.
verbose (int) – the verbosity level (0: normal, 1: debug). Defaults to 0.
- Returns:
the sensitivity, the sensitivity mask
- Return type:
(array, array)
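The shuffle-based significance test described by sst, ntest, alpha, and seed_shuffle can be illustrated with a minimal standard-library sketch. This is not the kim implementation: the helper names pearson and shuffle_test are hypothetical, the correlation coefficient stands in for the metric calculator, and the thresholding rule (the (1 - alpha) quantile of the shuffled values) is an assumption about how such a test is commonly done.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def shuffle_test(x, y, ntest=100, alpha=0.05, seed=1234):
    """Return (metric, is_significant) for one predictor/predictand pair."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    null = []
    for _ in range(ntest):
        shuffled = list(x)
        rng.shuffle(shuffled)  # shuffling x breaks any dependence on y
        null.append(abs(pearson(shuffled, y)))
    null.sort()
    # Assumed rule: compare against the (1 - alpha) quantile of the null values
    idx = min(ntest - 1, int((1 - alpha) * ntest))
    return observed, observed > null[idx]

# Synthetic data: y depends on x_dep but not on x_ind
rng = random.Random(0)
x_dep = [rng.gauss(0, 1) for _ in range(200)]
y = [2 * v + rng.gauss(0, 0.1) for v in x_dep]
x_ind = [rng.gauss(0, 1) for _ in range(200)]

value_dep, sig_dep = shuffle_test(x_dep, y)
value_ind, sig_ind = shuffle_test(x_ind, y)
print(value_dep, sig_dep)
print(value_ind, sig_ind)
```

Repeating the test over every (predictor, predictand) pair would fill a sensitivity array and a boolean mask, matching the two return values listed above.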
- pc(xdata: Array, ydata: Array, metric_calculator: MetricBase, cond_metric_calculator: MetricBase, ntest: int = 100, alpha: float = 0.05, Ncond_max: int = 3, n_jobs: int = -1, seed_shuffle: int = 1234, verbose: int = 0)[source]#
The modified PC algorithm adapted to the X –> Y mapping problem.
- Parameters:
xdata (array-like) – the predictors with shape (Ns, Nx)
ydata (array-like) – the predictands with shape (Ns, Ny)
metric_calculator (class) – the metric calculator for unconditional relation
cond_metric_calculator (class) – the metric calculator for conditional relation
ntest (int) – number of shuffled samples in sst. Defaults to 100.
alpha (float) – the significance level. Defaults to 0.05.
Ncond_max (int) – the maximum number of conditions used by cond_metric_calculator. Defaults to 3.
n_jobs (int) – the number of processors/threads used by joblib. Defaults to -1.
seed_shuffle (int) – the random seed number for doing shuffle test. Defaults to 1234.
verbose (int) – the verbosity level (0: normal, 1: debug). Defaults to 0.
- Returns:
the sensitivity, the sensitivity mask, the conditional sensitivity mask
- Return type:
(array, array, array)
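To illustrate why a conditional test can remove predictors that a pairwise test keeps, the sketch below uses partial correlation as a stand-in for a conditional metric. It is an illustration only, not the kim implementation: the helper names are hypothetical, and partial correlation is an assumed substitute for whatever cond_metric_calculator computes (for example conditional mutual information).

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def partial_corr(x, y, z):
    """Correlation between x and y after conditioning on a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5

rng = random.Random(1234)
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]       # true driver of y
y = [v + rng.gauss(0, 0.1) for v in x1]        # predictand
x2 = [v + rng.gauss(0, 0.5) for v in x1]       # noisy copy of x1, no direct link to y

unconditional = pearson(x2, y)                 # large: x2 looks relevant on its own
conditional = partial_corr(x2, y, x1)          # near zero once x1 is accounted for
print(unconditional, conditional)
```

Here x2 would pass a pairwise test but fail the conditional one, which is the kind of difference the conditional sensitivity mask is meant to capture.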
kim.data.Data Class#
- class Data(xdata: Array | None = None, ydata: Array | None = None, fdata: PosixPath | None = None, xscaler_type: str = '', yscaler_type: str = '')[source]#
The Data object.
- xdata#
the copy of xdata
- Type:
array-like
- ydata#
the copy of ydata
- Type:
array-like
- Ns#
the number of samples
- Type:
int
- Nx#
the number of predictors
- Type:
int
- Ny#
the number of predictands
- Type:
int
- xscaler_type#
the type of xdata scaler, either ‘minmax’, ‘normalize’, ‘standard’, or ‘log’
- Type:
str
- yscaler_type#
the type of ydata scaler, either ‘minmax’, ‘normalize’, ‘standard’, or ‘log’
- Type:
str
- xscaler#
the xdata scaler
- Type:
str
- yscaler#
the ydata scaler
- Type:
str
- sensitivity_config#
the sensitivity analysis configuration
- Type:
dict
- sensitivity_done#
whether the sensitivity analysis is performed
- Type:
bool
- sensitivity#
the calculated sensitivity with shape (Nx, Ny)
- Type:
array-like
- sensitivity_mask#
the calculated sensitivity mask with shape (Nx, Ny)
- Type:
array-like
- cond_sensitivity_mask#
the calculated conditional sensitivity mask with shape (Nx, Ny)
- Type:
array-like
- __init__(xdata: Array | None = None, ydata: Array | None = None, fdata: PosixPath | None = None, xscaler_type: str = '', yscaler_type: str = '')[source]#
Initialization function.
- Parameters:
xdata (array-like) – the predictors with shape (Ns, Nx)
fdata (PosixPath) – the root path where an existing data instance will be loaded
ydata (array-like) – the predictands with shape (Ns, Ny)
xscaler_type (str) – the type of xdata scaler, either ‘minmax’, ‘normalize’, ‘standard’, ‘log’, or ‘’ (the empty string)
yscaler_type (str) – the type of ydata scaler, either ‘minmax’, ‘normalize’, ‘standard’, ‘log’, or ‘’ (the empty string)
- calculate_sensitivity(method: str = 'gsa', metric: str = 'it-bins', sst: bool = False, ntest: int = 100, alpha: float = 0.05, bins: int = 10, k: int = 5, n_jobs=-1, seed_shuffle: int = 1234, verbose: int = 0)[source]#
- Calculate the sensitivity between self.xdata and self.ydata using either pairwise_analysis or pc method.
The results are updated in self.sensitivity_done, self.sensitivity, self.sensitivity_mask, and self.cond_sensitivity_mask.
- Parameters:
method (str) – The preliminary analysis method. Defaults to gsa. Options:
gsa: the pairwise global sensitivity analysis
pc: a modified PC algorithm that includes a conditional independence test after gsa
metric (str) – The metric used to calculate the sensitivity. Defaults to it-bins. Options:
it-bins: the information-theoretic measures (MI and CMI) using a binning approach
it-knn: the information-theoretic measures (MI and CMI) using a k-nearest-neighbor (knn) approach
corr: the correlation coefficient
sst (bool) – Whether to perform the statistical significance test or the shuffle test. Defaults to False.
ntest (int) – The number of shuffled samples in sst. Defaults to 100.
alpha (float) – The significance level. Defaults to 0.05.
bins (int) – The number of bins for each dimension when metric == “it-bins”. Defaults to 10.
k (int) – The number of nearest neighbors when metric == “it-knn”. Defaults to 5.
n_jobs (int) – The number of processors/threads used by joblib.Parallel. Defaults to -1.
seed_shuffle (int) – The random seed number for doing shuffle test. Defaults to 1234.
verbose (int) – The verbosity level (0: normal, 1: debug). Defaults to 0.
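The role of the bins parameter in a binning-based mutual information estimate can be sketched as follows. This is a generic plug-in estimator written with the standard library only; the function name binned_mutual_information and the use of equal-width bins are assumptions, not the estimator kim actually uses.

```python
import math
import random
from collections import Counter

def binned_mutual_information(x, y, bins=10):
    """Plug-in MI estimate (in nats) between two 1-D samples, equal-width bins."""
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(x), to_bins(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        # p(i, j) * log( p(i, j) / (p(i) * p(j)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[i] * py[j]))
    return mi

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
y_dep = [2 * v + 1 for v in x]                     # fully determined by x
y_ind = [rng.gauss(0, 1) for _ in range(2000)]     # independent of x

mi_dep = binned_mutual_information(x, y_dep, bins=10)
mi_ind = binned_mutual_information(x, y_ind, bins=10)
print(mi_dep, mi_ind)
```

The independent pair still yields a small positive value because of finite-sample bias, which is one reason a shuffle test (sst=True) is useful for deciding which sensitivities are significant.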
- load(rootpath: PosixPath = PosixPath('.'), check_xy: bool = True, overwrite: bool = False)[source]#
- Load data and sensitivity analysis results from specified location, including:
data (x, y) and scaler
sensitivity analysis configuration
sensitivity analysis results
- Parameters:
rootpath (PosixPath) – the root path where data will be loaded
- save(rootpath: PosixPath = PosixPath('.'))[source]#
- Save data and sensitivity analysis results to specified location, including:
data (x, y) and scaler
sensitivity analysis configuration
sensitivity analysis results
- Parameters:
rootpath (PosixPath) – the root path where data will be saved
- property xdata_scaled#
Perform normalization on self.xdata based on the given normalization type self.xscaler_type.
- Returns:
the scaled self.xdata
- Return type:
array-like
- property ydata_scaled#
Perform normalization on self.ydata based on the given normalization type self.yscaler_type.
- Returns:
the scaled self.ydata
- Return type:
array-like
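As a rough illustration of what column-wise scaling does, the sketch below applies the textbook min-max and standard (z-score) formulas to one column. The function names are hypothetical and the formulas are the common definitions, assumed here rather than taken from kim; the ‘normalize’ and ‘log’ options are not shown because their exact definitions are not given in this reference.

```python
def minmax_scale(column):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [(v - lo) / span for v in column]

def standard_scale(column):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5 or 1.0
    return [(v - mean) / std for v in column]

column = [2.0, 4.0, 6.0, 10.0]
scaled_minmax = minmax_scale(column)      # [0.0, 0.25, 0.5, 1.0]
scaled_standard = standard_scale(column)  # zero mean, unit variance
print(scaled_minmax)
print(scaled_standard)
```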
kim.map.KIM Class#
- class KIM(data: Data, map_configs: dict, mask_option: str = 'cond_sensitivity', map_option: str = 'many2one', other_mask: Array | None = None, name: str = 'kim')[source]#
The class for knowledge-informed mapping training, prediction, saving and loading.
Attributes:#
- data: Data
the copy of the __init__ argument
- map_configs: dict
the copy of the __init__ argument
- map_option: str
the copy of the __init__ argument
- mask_option: str
the copy of the __init__ argument
- trained: bool
whether KIM has been trained
- loaded_from_other_sources: bool
whether KIM is loaded from other sources.
- Ns: int
the number of ensemble members (from data.Ns)
- Nx: int
the number of input features (from data.Nx)
- Ny: int
the number of output features (from data.Ny)
- mask: Array
the masked array with shape (Nx, Ny)
- _n_maps: int
the number of maps
- _maps: int
the trained maps
- __init__(data: Data, map_configs: dict, mask_option: str = 'cond_sensitivity', map_option: str = 'many2one', other_mask: Array | None = None, name: str = 'kim')[source]#
Initialization function.
- Parameters:
data (Data) – the Data object containing the ensemble data and sensitivity analysis result.
map_configs (dict) – the mapping configuration, including all the arguments of Map class except x and y.
mask_option (str) – the masking option including “sensitivity” (using data.sensitivity_mask), and “cond_sensitivity” (using data.cond_sensitivity_mask).
map_option (str) – the map option including “many2one”: knowledge-informed mapping using sensitivity analysis result as filter, and “many2many”: normal mapping without being knowledge-informed
other_mask (List) – the additional mask to be assigned to self.mask with size Nx. Defaults to None.
name (str) – the name of the KIM object
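One way to read the “many2one” option is that each predictand gets its own map, trained only on the predictors that the (Nx, Ny) mask marks as relevant. The sketch below shows that filtering step on plain Python lists. It is an interpretation under that assumption, not the kim implementation; the mask values and the helper select_inputs are hypothetical.

```python
# Hypothetical sensitivity mask with shape (Nx, Ny) = (3, 2):
# mask[i][j] is True when predictor i is considered relevant to predictand j.
mask = [
    [True, False],
    [True, True],
    [False, True],
]

def select_inputs(x_rows, mask, j):
    """Return the kept column indices and the filtered predictors for output j."""
    keep = [i for i in range(len(mask)) if mask[i][j]]
    return keep, [[row[i] for i in keep] for row in x_rows]

# Two samples (Ns = 2) of three predictors (Nx = 3)
x_rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

cols0, x_for_y0 = select_inputs(x_rows, mask, 0)  # predictors 0 and 1
cols1, x_for_y1 = select_inputs(x_rows, mask, 1)  # predictors 1 and 2
print(cols0, x_for_y0)
print(cols1, x_for_y1)
```

Under “many2many”, by contrast, no such filtering would be applied and a single map would see all Nx predictors.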
- load(rootpath: PosixPath = PosixPath('.'))[source]#
Load the trained KIM from the specified location.
- Parameters:
rootpath (PosixPath) – the root path where KIM will be loaded
- property maps#
- property n_maps#
- predict(x: Array | None = None)[source]#
Prediction using the trained KIM.
- Parameters:
x (Array) – predictors with shape (Ns,…,Nx)
kim.map.Map Class#
- class Map(x: ~jax.Array, y: ~jax.Array, model_type: type = <class 'kim.mapping_model.mlp.MLP'>, n_model: int = 0, ensemble_type: str = 'single', model_hp_choices: dict = {}, model_hp_fixed: dict = {}, optax_hp_choices: dict = {}, optax_hp_fixed: dict = {}, dl_hp_choices: dict = {}, dl_hp_fixed: dict = {}, training_parallel: bool = True, ens_seed: int | None = None, parallel_config: dict | None = None, device: ~jaxlib._jax.Device | None = None)[source]#
The class for training, predicting with, saving, and loading a single mapping. Ensemble training is supported in either serial or parallel mode, using joblib.
- x#
the copy of the __init__ argument
- Type:
array_like
- y#
the copy of the __init__ argument
- Type:
array_like
- n_model#
the copy of the __init__ argument
- Type:
int
- training_parallel#
the copy of the __init__ argument
- Type:
bool
- model_type#
the copy of the __init__ argument
- Type:
type
- ensemble_type#
the copy of the __init__ argument
- Type:
str
- model_hp_choices#
the copy of the __init__ argument
- Type:
dict
- model_hp_fixed#
the copy of the __init__ argument
- Type:
dict
- optax_hp_choices#
the copy of the __init__ argument
- Type:
dict
- optax_hp_fixed#
the copy of the __init__ argument
- Type:
dict
- dl_hp_choices#
the copy of the __init__ argument
- Type:
dict
- dl_hp_fixed#
the copy of the __init__ argument
- Type:
dict
- ens_seed#
the copy of the __init__ argument
- Type:
Optional[int], optional
- parallel_config#
the copy of the __init__ argument
- Type:
Optional[dict], optional
- device#
the copy of the __init__ argument
- Type:
Optional[Device], optional
- trained#
whether the mapping has been trained
- Type:
bool
- loaded_from_other_sources#
whether the mapping is loaded from other sources.
- Type:
bool
- Ns#
number of samples
- Type:
int
- Nx#
number of input features
- Type:
int
- Ny#
number of output features
- Type:
int
- model_configs#
model hyperparameters for all ensemble models
- Type:
list
- optax_configs#
optimizer hyperparameters for all ensemble models
- Type:
list
- dl_configs#
dataloader hyperparameters for all ensemble models
- Type:
list
- model_ens#
list of trained model ensemble
- Type:
list
- loss_train_ens#
list of the training losses over steps
- Type:
list
- loss_val_ens#
list of the validation losses over steps
- Type:
list
- __init__(x: ~jax.Array, y: ~jax.Array, model_type: type = <class 'kim.mapping_model.mlp.MLP'>, n_model: int = 0, ensemble_type: str = 'single', model_hp_choices: dict = {}, model_hp_fixed: dict = {}, optax_hp_choices: dict = {}, optax_hp_fixed: dict = {}, dl_hp_choices: dict = {}, dl_hp_fixed: dict = {}, training_parallel: bool = True, ens_seed: int | None = None, parallel_config: dict | None = None, device: ~jaxlib._jax.Device | None = None)[source]#
Initialization function.
- Parameters:
x (array-like) – the predictors with shape (Ns, Nx)
y (array-like) – the predictands with shape (Ns, Ny)
model_type (type) – the equinox model class
n_model (int) – the number of ensemble models
ensemble_type (str) – the ensemble type, either ‘single’, ‘serial’ or ‘parallel’.
model_hp_choices (dict) – the tunable model hyperparameters, in dictionary format {key: [value1, value2,…]}. The model hyperparameters must follow the arguments of the specified model_type
model_hp_fixed (dict) – the fixed model hyperparameters, in dictionary format {key: value}. The model hyperparameters must follow the arguments of the specified model_type
optax_hp_choices (dict) – the tunable optimizer hyperparameters, in dictionary format {key: [value1, value2,…]}. The optimizer hyperparameters must follow the arguments of the specified optax optimizer. Key hyperparameters: ‘optimizer_type’ (str), ‘nsteps’ (int), and ‘loss_func’ (callable)
optax_hp_fixed (dict) – the fixed optimizer hyperparameters, in dictionary format {key: value}. The optimizer hyperparameters must follow the arguments of the specified optax optimizer. Key hyperparameters: ‘optimizer_type’ (str), ‘nsteps’ (int), and ‘loss_func’ (callable)
dl_hp_choices (dict) – the tunable dataloader hyperparameters, in dictionary format {key: [value1, value2,…]}. The dataloader hyperparameters must follow the arguments of make_big_data_loader. Key hyperparameters: ‘batch_size’ (int) and ‘num_train_sample’ (int)
dl_hp_fixed (dict) – the fixed dataloader hyperparameters, in dictionary format {key: value}. The dataloader hyperparameters must follow the arguments of make_big_data_loader. Key hyperparameters: ‘batch_size’ (int) and ‘num_train_sample’ (int)
training_parallel (bool) – whether to perform parallel training
ens_seed (Optional[int], optional) – the random seed for generating ensemble configurations.
parallel_config (Optional[dict], optional) – the parallel training configurations following the arguments of joblib.Parallel
device (Optional[Device], optional) – the computing device to be set
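The split between `*_hp_choices` ({key: [value1, value2,…]}) and `*_hp_fixed` ({key: value}) suggests that each ensemble member receives the fixed values plus one value picked per tunable key. The sketch below shows one plausible way to build such per-model configurations. It is an assumption about the behavior, not the kim implementation: sample_configs is a hypothetical helper, uniform random selection is assumed, and the ‘learning_rate’ key and ‘adam’ value are illustrative only.

```python
import random

def sample_configs(hp_choices, hp_fixed, n_model, seed=None):
    """Build n_model config dicts: fixed values plus one random pick per tunable key."""
    rng = random.Random(seed)
    configs = []
    for _ in range(n_model):
        cfg = dict(hp_fixed)                  # start from the fixed hyperparameters
        for key, values in hp_choices.items():
            cfg[key] = rng.choice(values)     # assumed: uniform pick per tunable key
        configs.append(cfg)
    return configs

# Illustrative inputs; 'learning_rate' and 'adam' are assumptions for this sketch
optax_hp_choices = {"learning_rate": [1e-2, 1e-3, 1e-4]}
optax_hp_fixed = {"optimizer_type": "adam", "nsteps": 300}

configs = sample_configs(optax_hp_choices, optax_hp_fixed, n_model=4, seed=0)
for cfg in configs:
    print(cfg)
```

Passing the same seed reproduces the same list of configurations, which is the role ens_seed appears to play for the ensemble.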
- property dl_configs#
- load(rootpath: PosixPath = PosixPath('.'))[source]#
Load the trained mapping from the specified location.
- Parameters:
rootpath (PosixPath) – the root path where mappings will be loaded
- property loss_train_ens#
- property loss_val_ens#
- property model_configs#
- property model_ens#
- property n_model#
- property optax_configs#
- predict(x: Array)[source]#
Prediction using the trained mapping.
- Parameters:
x (Array) – predictors with shape (Ns,…,Nx)