collasso package

Submodules

Module contents

Sparse linear multi-task regression.

class collasso.CoopLassoCV(*, cv: int = 5, n_alphas: int = 100, l1_ratio: float = 0.5, exp_y: float = 1, exp_x: float = 1, random_state: int | None = None)

Bases: RegressorMixin, BaseEstimator

Cross-Validated Cooperative Multi-Task Lasso Regression.

Implements cooperative multi-task lasso regression, optimising the regularisation parameters by cross-validation.

Parameters:
cv : int, default=5

Number of cross-validation folds.

n_alphas : int, default=100

Number of candidate values for the regularisation parameter in the final regressions.

l1_ratio : float, default=0.5

Elastic net mixing parameter for the initial regressions, with 0 <= l1_ratio <= 1, where l1_ratio=0 gives a pure L2 (ridge) penalty and l1_ratio=1 gives a pure L1 (lasso) penalty.

exp_y : float, default=1.0

Non-negative exponent applied to the target-target correlation coefficients.

exp_x : float, default=1.0

Non-negative exponent applied to the feature-feature correlation coefficients.

random_state : int or None, default=None

Random seed for generating reproducible cross-validation folds.
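
To make the role of l1_ratio concrete, the following sketch evaluates an elastic net penalty for a fixed coefficient vector. It assumes the parametrisation documented for scikit-learn's ElasticNet (alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2); whether collasso uses exactly this scaling in its initial regressions is an assumption, and the function name elastic_net_penalty is hypothetical.

```python
import numpy as np


def elastic_net_penalty(coef, alpha, l1_ratio):
    """Penalty term in scikit-learn's ElasticNet parametrisation (assumed here)."""
    if not 0.0 <= l1_ratio <= 1.0:
        raise ValueError("l1_ratio must lie in [0, 1]")
    coef = np.asarray(coef, dtype=float)
    l1_part = np.abs(coef).sum()           # lasso part: sum of absolute values
    l2_part = 0.5 * np.square(coef).sum()  # ridge part: half the sum of squares
    return alpha * (l1_ratio * l1_part + (1.0 - l1_ratio) * l2_part)


coef = np.array([1.0, -2.0, 0.0])
pure_ridge = elastic_net_penalty(coef, alpha=1.0, l1_ratio=0.0)   # 0.5 * (1 + 4) = 2.5
pure_lasso = elastic_net_penalty(coef, alpha=1.0, l1_ratio=1.0)   # 1 + 2 = 3.0
default_mix = elastic_net_penalty(coef, alpha=1.0, l1_ratio=0.5)  # 0.5 * 3.0 + 0.5 * 2.5 = 2.75
```

Intermediate values of l1_ratio blend the two penalties linearly, so the default of 0.5 weights the lasso and ridge parts equally.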

Attributes:
n_ : int

Number of training samples.

p_ : int

Number of features.

q_ : int

Number of targets.

model_ : list of length q_targets

Fitted models from _CoopLasso (one for each target).

alpha_ : list of length q_targets of ndarrays

Sequence of regularisation parameters for each target.

mse_ : list of length q_targets of ndarrays

Cross-validated mean squared errors for each value of alpha.

min_ : list of length q_targets of ndarrays

Indices of the regularisation parameters with the lowest cross-validated mean squared error.

coef_ : ndarray of shape (q_targets, p_features)

Estimated effects (of the feature in the column on the target in the row).
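
The relationship between the alpha, mean squared error and minimum-index attributes can be sketched with plain NumPy. The arrays below are made-up numbers shaped like the per-target lists described above, not output from the package, and the indices are stored as plain integers for readability.

```python
import numpy as np

# Hypothetical per-target results (two targets, three candidate alphas each).
alpha_ = [np.array([1.0, 0.5, 0.1]), np.array([2.0, 1.0, 0.2])]
mse_ = [np.array([3.2, 2.1, 2.6]), np.array([1.9, 1.4, 1.1])]

# For each target, the index of the alpha with the lowest cross-validated MSE.
min_ = [int(np.argmin(errors)) for errors in mse_]

# The selected regularisation parameter for each target.
best_alpha = [float(alphas[i]) for alphas, i in zip(alpha_, min_)]
```

Each target gets its own selected value, so the chosen regularisation strength can differ between targets.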

See also

IndepLassoCV

A convenience class using the same interface as CoopLassoCV (similarly formatted inputs and outputs) without sharing information among targets or features.

_CoopLasso

An internal class without cross-validation returning the lasso solution path. This is repeatedly called by CoopLassoCV (once in each cross-validation iteration and once for the full dataset).

Examples

>>> from sklearn.datasets import load_linnerud
>>> from collasso import CoopLassoCV
>>> x, y = load_linnerud(return_X_y=True)
>>> model = CoopLassoCV()
>>> model.fit(x, y)  # n_samples x p_features, n_samples x q_targets
>>> model.coef_  # q_targets x p_features
>>> y_pred = model.predict(x)  # n_samples x q_targets

fit(X: ndarray, y: ndarray, Z: ndarray | None = None) -> CoopLassoCV

Fit cross-validated model.

Fits cross-validated cooperative multi-task lasso regression.

Parameters:
X : ndarray of shape (n_samples, p_features) or (n_samples, p_features, q_targets)

Common feature matrix for all targets, or a specific feature matrix for each target.

y : ndarray of shape (n_samples, q_targets)

Target matrix.

Z : ndarray of shape (p_features,) or (p_features, q_targets), or None, default=None

Boolean vector (shared by all targets) or matrix (one column per target) indicating primary (1, True) and auxiliary (0, False) features.

Returns:
self : CoopLassoCV

Fitted model.
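
The accepted input shapes are easiest to see by constructing them. This sketch only builds arrays with NumPy; the sizes and the choice of which features are auxiliary are arbitrary illustrations, and no collasso function is called.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 20, 5, 3  # samples, features, targets (arbitrary sizes)

X_common = rng.normal(size=(n, p))       # one feature matrix shared by all targets
X_specific = rng.normal(size=(n, p, q))  # one feature matrix per target (last axis)
y = rng.normal(size=(n, q))              # target matrix

# Vector form: the same primary/auxiliary split for every target.
Z_vector = np.array([True, True, True, False, False])

# Matrix form: column j is the primary/auxiliary split for target j.
Z_matrix = np.ones((p, q), dtype=bool)
Z_matrix[3:, 0] = False  # features 3 and 4 are auxiliary for target 0 only
Z_matrix[0, 2] = False   # feature 0 is auxiliary for target 2 only
```

Either X_common or X_specific would be passed as X together with y, and either Z_vector or Z_matrix as Z.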

predict(X: ndarray) -> ndarray

Make predictions.

Make predictions from a cross-validated model obtained by cooperative multi-task lasso regression.

Parameters:
X : ndarray of shape (n_samples, p_features) or (n_samples, p_features, q_targets)

Common feature matrix for all targets, or a specific feature matrix for each target.

Returns:
y_hat : ndarray of shape (n_samples, q_targets)

Matrix of predicted values (of the target in the column for the sample in the row).
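
For a linear model, predictions with these shapes amount to a matrix product between the features and the transposed coefficient matrix. The sketch below is an illustration under stated assumptions, not the package's implementation: predict_linear is a hypothetical helper, and the intercept argument is an assumption because this page does not document an intercept attribute.

```python
import numpy as np


def predict_linear(X, coef, intercept=None):
    """Linear predictions for coef of shape (q_targets, p_features)."""
    X = np.asarray(X, dtype=float)
    coef = np.asarray(coef, dtype=float)
    if intercept is None:
        intercept = np.zeros(coef.shape[0])  # assumed: no intercept unless given
    if X.ndim == 2:
        # Common feature matrix: (n, p) @ (p, q) -> (n, q)
        y_hat = X @ coef.T
    elif X.ndim == 3:
        # Target-specific matrices: slice X[:, :, j] is paired with row coef[j]
        y_hat = np.einsum("npq,qp->nq", X, coef)
    else:
        raise ValueError("X must have 2 or 3 dimensions")
    return y_hat + intercept


coef = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])  # q=3 targets, p=2 features
X = np.array([[1.0, 2.0], [3.0, 4.0]])                 # n=2 samples
y_hat_common = predict_linear(X, coef)

# Repeating the same matrix for each target must give the same predictions.
X_repeated = np.repeat(X[:, :, None], 3, axis=2)
y_hat_specific = predict_linear(X_repeated, coef)
```

The result has one row per sample and one column per target in both cases.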

set_fit_request(*, Z: bool | None | str = '$UNCHANGED$') -> CoopLassoCV

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
Z : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for Z parameter in fit.

Returns:
self : object

The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> CoopLassoCV

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self : object

The updated object.

class collasso.IndepLassoCV(*, cv: int = 5, alphas: int = 100)

Bases: RegressorMixin, BaseEstimator

Single-Task Lasso Regression For Multiple Targets.

Fits single-task lasso regression separately to multiple targets, optimising the regularisation parameters by cross-validation.

This is a convenience class with the same interface as CoopLassoCV (but without sharing information among targets or features). Note that auxiliary features are simply excluded from the model (i.e., learning without privileged information).

Parameters:
cv : int, default=5

Number of cross-validation folds.

alphas : int, default=100

Number of candidate values for the regularisation parameter.

Attributes:
n_ : int

Number of training samples.

p_ : int

Number of features.

q_ : int

Number of targets.

model_ : list of length q_targets

Fitted models from LassoCV (one for each target).

coef_ : ndarray of shape (q_targets, p_features)

Estimated coefficients (of the feature in the column on the target in the row).

See also

CoopLassoCV

The main class of this package. It uses the same interface as IndepLassoCV (similarly formatted inputs and outputs) but shares information among targets and features to improve selection and prediction.

Examples

>>> from sklearn.datasets import load_linnerud
>>> from collasso import IndepLassoCV
>>> x, y = load_linnerud(return_X_y=True)
>>> model = IndepLassoCV()
>>> model.fit(x, y)  # n_samples x p_features, n_samples x q_targets
>>> model.coef_  # q_targets x p_features
>>> y_pred = model.predict(x)  # n_samples x q_targets

fit(X: ndarray, y: ndarray, Z: ndarray | None = None) -> IndepLassoCV

Fit IndepLassoCV.

Fit independent lasso regressions to multiple targets.

Parameters:
X : ndarray of shape (n_samples, p_features) or (n_samples, p_features, q_targets)

Common feature matrix for all targets, or a separate feature matrix for each target.

y : ndarray of shape (n_samples, q_targets)

Target matrix.

Z : ndarray of shape (p_features,) or (p_features, q_targets), or None, default=None

Boolean vector (shared by all targets) or matrix (one column per target) indicating primary (1/True) and auxiliary (0/False) features. Auxiliary features are simply excluded from the model.

Returns:
self : IndepLassoCV

Fitted model.
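
The idea of fitting one cross-validated lasso per target, with auxiliary features dropped, can be sketched directly with scikit-learn's LassoCV. This is an illustration under stated assumptions rather than the class's actual code: the data are synthetic, LassoCV is left at its default alpha grid, and storing zeros for the excluded features is an assumption about how the coefficient matrix is filled.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, q = 60, 6, 2

# Synthetic data: feature 0 drives target 0, feature 1 drives target 1.
X = rng.normal(size=(n, p))
beta = np.zeros((p, q))
beta[0, 0] = 2.0
beta[1, 1] = -3.0
y = X @ beta + 0.1 * rng.normal(size=(n, q))

# The last two features are auxiliary and are excluded from every model.
Z = np.array([True, True, True, True, False, False])

coef = np.zeros((q, p))  # rows: targets, columns: features
models = []
for j in range(q):
    # One independent cross-validated lasso per target, primary features only.
    model = LassoCV(cv=5).fit(X[:, Z], y[:, j])
    models.append(model)
    coef[j, Z] = model.coef_  # excluded features keep a zero coefficient (assumed)
```

No information flows between the loop iterations, which is what distinguishes this baseline from the cooperative estimator.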

predict(X: ndarray) -> ndarray

Make predictions.

Make predictions with models estimated by independent lasso regressions.

Parameters:
X : ndarray of shape (n_samples, p_features) or (n_samples, p_features, q_targets)

Common feature matrix for all targets, or a separate feature matrix for each target.

Returns:
y_hat : ndarray of shape (n_samples, q_targets)

Matrix of predicted values.

set_fit_request(*, Z: bool | None | str = '$UNCHANGED$') -> IndepLassoCV

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
Z : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for Z parameter in fit.

Returns:
self : object

The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> IndepLassoCV

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self : object

The updated object.

collasso.simulate(*, n0: int = 100, n1: int = 10000, p: int = 200, q: int = 3, rho: float = 0.9, kappa: float = 1.0, prob_com: float = 0.05, prob_sep: float = 0.05) -> tuple[ndarray, ndarray, ndarray, ndarray, ndarray]

Simulate Data for Linear Multi-Task Regression.

Simulates feature matrix and target matrix, with given probabilities of (i) common effects on all targets and (ii) specific effects on one target.

Parameters:
n0 : int, default=100

Number of training samples.

n1 : int, default=10000

Number of testing samples.

p : int, default=200

Number of features.

q : int, default=3

Number of targets.

rho : float, default=0.90

Correlation coefficient, 0 <= rho <= 1.

kappa : float, default=1.00

Correlation coefficient, 0 <= kappa <= 1. With kappa=1 the targets share a common feature matrix; with 0 <= kappa < 1 each target has a separate feature matrix.

prob_com : float, default=0.05

Probability of common effects for all targets, 0 <= prob_com <= 1.

prob_sep : float, default=0.05

Probability of separate effects for each target.

Returns:
x_train : ndarray of shape (n0_samples, p_features) or (n0_samples, p_features, q_targets)

Training feature matrix or matrices: a common matrix for all targets (if kappa=1) or a separate matrix for each target (if 0 <= kappa < 1).

y_train : ndarray of shape (n0_samples, q_targets)

Training target matrix.

x_test : ndarray of shape (n1_samples, p_features) or (n1_samples, p_features, q_targets)

Test feature matrix or matrices: a common matrix for all targets (if kappa=1) or a separate matrix for each target (if 0 <= kappa < 1).

y_test : ndarray of shape (n1_samples, q_targets)

Test target matrix.

beta : ndarray of shape (p_features, q_targets)

True effects in the training and the test data (of the feature in the row on the target in the column).

Raises:
ValueError

See also

_simulate_features

Internal function for simulating feature matrix or matrices.

_simulate_effects

Internal function for simulating effect matrix.

_simulate_targets

Internal function for simulating target matrix.

Examples

>>> from collasso import simulate
>>> x_train, y_train, x_test, y_test, beta = simulate()
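
The roles of prob_com and prob_sep can be illustrated with a small stand-alone generator. This is a simplified sketch, not the package's simulate function: toy_simulate is a hypothetical name, features are drawn independently (so rho and kappa are ignored), effect sizes and noise are standard normal by assumption, and no test set is produced.

```python
import numpy as np


def toy_simulate(n=100, p=200, q=3, prob_com=0.05, prob_sep=0.05, seed=0):
    """Toy multi-task data with common and target-specific effects."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, p))  # independent features (assumed)

    # A feature has a common effect on all targets with probability prob_com ...
    common = rng.random(p) < prob_com
    # ... and a separate effect on a single target with probability prob_sep.
    separate = rng.random((p, q)) < prob_sep
    active = common[:, None] | separate  # shape (p, q)

    # Non-zero effects where active, zero elsewhere.
    beta = np.where(active, rng.normal(size=(p, q)), 0.0)
    y = x @ beta + rng.normal(size=(n, q))
    return x, y, beta


x, y, beta = toy_simulate()
```

Note that beta has shape (p_features, q_targets) while the estimators' coef_ attribute has shape (q_targets, p_features), so estimated and true effects should be compared as coef_ against beta.T.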