Title: | Kernel_lasso Expansion |
---|---|
Description: | The Kernel_lasso package expands the features of existing data. It uses the Gaussian (RBF) kernel function to increase the dimensionality of a dataset and then reduces the number of features with lasso. |
Authors: | Zongrui Dai [aut, cre] |
Maintainer: | Zongrui Dai <[email protected]> |
License: | GPL-2 |
Version: | 0.0.0.9000 |
Built: | 2025-03-12 03:30:12 UTC |
Source: | https://github.com/zongrui-dai/kernel-lasso-feature-expansion |
Gauss function
gauss(d1, d2, sigma = 0.5)
d1 | The first vector. |
d2 | The second vector. |
sigma | The hyperparameter of the RBF kernel function, which controls its width. |
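To illustrate how sigma acts as a width, the sketch below uses the conventional Gaussian (RBF) form exp(-(d1 - d2)^2 / (2 * sigma^2)) applied element-wise. Both this formula and the helper name rbf_sketch are assumptions made for illustration; the package's own gauss() may compute the kernel differently, for example over whole vectors rather than element by element.

```r
# Hypothetical helper, not the package's gauss(): a conventional
# element-wise Gaussian (RBF) kernel between two numeric vectors.
rbf_sketch <- function(d1, d2, sigma = 0.5) {
  stopifnot(is.numeric(d1), is.numeric(d2),
            length(d1) == length(d2), sigma > 0)
  exp(-(d1 - d2)^2 / (2 * sigma^2))
}

a <- c(1.0, 2.0, 3.0)
b <- c(1.0, 2.5, 5.0)

# Identical entries give 1; the value decays toward 0 as entries move apart.
narrow <- rbf_sketch(a, b, sigma = 0.5)
# A larger sigma widens the kernel, so the same distances score higher.
wide <- rbf_sketch(a, b, sigma = 2)

print(rbind(narrow, wide))
```

Under this assumed form, a small sigma makes the kernel respond only to nearly equal values, while a large sigma makes most pairs look similar.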
data(iris, package = 'datasets')
w <- gauss(iris[,1], iris[,2])
print(w)
Kernel_lasso is a feature selection method that combines feature expansion with lasso regression. A kernel function first increases the dimensionality of the existing data, and lasso then reduces the number of features. The installed 'glmnet' package version should be higher than 4.1-2.
x | The input features, which should be a data.frame with at least two variables. |
y | The dependent variable. |
sigma | The hyperparameter of the RBF kernel function, which controls its width. |
dataframe | Whether the data is a data.frame. The default is TRUE. |
standard | Use the 'max_min_scale' or 'Z_score' method to standardize the data. NULL means no standardization. |
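The expand-then-select idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: the helper expand_pairs is hypothetical, it assumes one new feature per pair of columns, and it assumes an element-wise Gaussian kernel of the form exp(-(a - b)^2 / (2 * sigma^2)). The lasso step uses glmnet::cv.glmnet and runs only if glmnet is installed.

```r
# Hypothetical sketch of "expand, then select"; not the package's code.
expand_pairs <- function(x, sigma = 0.5) {
  x <- as.matrix(x)
  pairs <- utils::combn(ncol(x), 2)   # every pair of column indices
  # One new column per pair: an assumed element-wise Gaussian kernel.
  new_cols <- apply(pairs, 2, function(p) {
    exp(-(x[, p[1]] - x[, p[2]])^2 / (2 * sigma^2))
  })
  colnames(new_cols) <- apply(pairs, 2, function(p) {
    paste(colnames(x)[p[1]], colnames(x)[p[2]], sep = "_rbf_")
  })
  cbind(x, new_cols)
}

data(iris, package = "datasets")
x <- iris[, 2:4]   # three numeric predictors
y <- iris[, 1]     # numeric response

expanded <- expand_pairs(x, sigma = 0.5)
print(colnames(expanded))

# Lasso selection on the expanded feature space (skipped without glmnet).
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::cv.glmnet(expanded, y, alpha = 1,
                           nfolds = 3, type.measure = "mse")
  coefs <- as.matrix(stats::coef(fit, s = "lambda.min"))
  kept <- setdiff(rownames(coefs)[coefs[, 1] != 0], "(Intercept)")
  print(kept)
}
```

With three input columns the sketch produces three extra kernel features, and the cross-validated lasso then keeps only the columns with non-zero coefficients.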
##Regression (MSE)
data(attenu, package = 'datasets')
result <- kernel_lasso_expansion(x = attenu[,-c(3,5)], y = attenu[,5],
  standard = 'max_min', sigma = 0.01,
  control = lasso.control(nfolds = 3, type.measure = 'mse'))
summary(result)
#Plot the lasso
plot(result$lasso)
#Result
result$original ##The original feature space
result$expansion ##The feature space after expansion
result$final_feature ##The name of the final feature
result$final_data ##The dataframe of final feature
A control function that mirrors the corresponding options in glmnet and controls how the lasso is trained.
lasso.control(nfolds = 10, trace.it = 1, type.measure = "auc")
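nfolds | The number of folds used in cross-validation. |
trace.it | Whether to display the training progress (in glmnet, trace.it = 1 shows a progress bar). |
type.measure | The loss function used for cross-validation, for example "mse" or "auc". |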
Returns the lasso control settings.
##10-fold cross-validation with MSE as the loss function
c <- lasso.control(nfolds = 10, type.measure = 'mse')
max_min_scale rescales data with min-max normalization. The formula is (x-min(x))/(max(x)-min(x)), which maps the data into the interval [0,1].
data | The input data, which can be a numeric vector or a data.frame. |
dataframe | Whether the data is a data.frame. The default is FALSE (numeric). |
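As a worked illustration of the formula (x-min(x))/(max(x)-min(x)), the sketch below re-implements it in base R. The helper name max_min_sketch is hypothetical and stands in for the package function only to show the arithmetic; the column-wise use on a data.frame is likewise an assumption about what dataframe = TRUE does.

```r
# Hypothetical re-implementation of the stated formula, for illustration only.
max_min_sketch <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

x <- c(2, 4, 6, 10)
scaled <- max_min_sketch(x)
print(scaled)  # 0.00 0.25 0.50 1.00

# Assumed data.frame behaviour: apply the same formula to each column.
data(iris, package = "datasets")
scaled_df <- as.data.frame(lapply(iris[, -5], max_min_sketch))
print(sapply(scaled_df, range))
```

The minimum of each input maps to 0 and the maximum to 1. A constant column would divide by zero under this formula and yield NaN, so such columns need separate handling.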
##For the numeric data
data(iris, package = 'datasets')
w <- max_min_scale(iris[,1])
print(w)
##For the data.frame data
w1 <- max_min_scale(iris[,-5], dataframe = TRUE)
print(w1)
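The Z-score method standardizes data. The formula is given as (x-mean(x))/var(x); note that the conventional z-score divides by the standard deviation, sd(x), rather than the variance. The result is centered at 0 and, unlike max_min_scale, is not confined to [0,1].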
Z_score(data, dataframe = FALSE)
data | The input data, which can be a numeric vector or a data.frame. |
dataframe | Whether the data is a data.frame. The default is FALSE (numeric). |
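The difference between dividing by the variance, as the formula in this documentation is written, and dividing by the standard deviation, as in the conventional z-score, can be checked with base R alone. The sketch below does not call Z_score(); which of the two variants the package function actually applies is not confirmed here.

```r
x <- c(2, 4, 6, 10)

# Formula as written in this documentation: divide by the variance.
z_var <- (x - mean(x)) / var(x)

# Conventional z-score: divide by the standard deviation.
z_sd <- (x - mean(x)) / sd(x)

print(round(rbind(z_var, z_sd), 3))

# Both variants are centered at 0; only the sd version has unit spread.
print(c(mean_var = mean(z_var), mean_sd = mean(z_sd)))
print(c(sd_var = sd(z_var), sd_sd = sd(z_sd)))
```

Either variant yields negative as well as positive values, so neither confines the data to a fixed interval the way min-max scaling does.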
##For the numeric data
data(iris, package = 'datasets')
w <- Z_score(iris[,1])
print(w)
##For the data.frame data
w1 <- Z_score(iris[,-5], dataframe = TRUE)
print(w1)