Regularization: Lasso and Elastic Net

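Let's also add Lasso and Elastic Net models. Their regularization terms differ from the one used in Ridge regression.

L1 penalty: the sum of the absolute values of the parameters. Used by Lasso.
L1 and L2 penalty: a weighted sum of the L1 penalty and the L2 penalty. Used by Elastic Net.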
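The two penalties can be computed by hand as a quick check. The sketch below is illustrative only: the function names and the example vector `w` are hypothetical, and the weighting in `elastic_net_penalty` assumes the form given in scikit-learn's ElasticNet documentation, `alpha * l1_ratio * L1 + 0.5 * alpha * (1 - l1_ratio) * L2`.

```python
import numpy as np

def l1_penalty(w):
    # Sum of the absolute values of the parameters
    return np.sum(np.abs(w))

def l2_penalty(w):
    # Sum of the squared parameters
    return np.sum(w ** 2)

def elastic_net_penalty(w, alpha=1.0, l1_ratio=0.5):
    # Weighted sum of both penalties (weighting assumed from the
    # scikit-learn ElasticNet documentation)
    return (alpha * l1_ratio * l1_penalty(w)
            + 0.5 * alpha * (1 - l1_ratio) * l2_penalty(w))

w = np.array([1.0, -2.0, 0.0, 3.0])  # hypothetical parameter vector
print(l1_penalty(w))           # 6.0
print(l2_penalty(w))           # 14.0
print(elastic_net_penalty(w))  # 6.5
```

With `l1_ratio=1` the L2 part vanishes, and with `l1_ratio=0` the L1 part vanishes.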

# Load the previous notebook
%run 1.ipynb
X.shape

(506, 13)

from sklearn.linear_model import Lasso, ElasticNet
from sklearn.metrics import r2_score

# Lasso
lasso = Lasso(alpha=1.0)
lasso.fit(X_train, y_train)
# Elastic Net
enet = ElasticNet(alpha=1, l1_ratio=0.5)
enet.fit(X_train, y_train)

ElasticNet(alpha=1)

# Predict with Lasso
y_pred3 = lasso.predict(X_test)
# Evaluate Lasso
score3 = r2_score(y_test, y_pred3)
# Predict with Elastic Net
y_pred4 = enet.predict(X_test)
# Evaluate Elastic Net
score4 = r2_score(y_test, y_pred4)
score3, score4

(0.5516247059049908, 0.5603163143661134)

lasso.coef_

array([-0.05873776,  0.04999404, -0.00158882,  0.        , -0.        ,
        0.761785  ,  0.01304661, -0.71010927,  0.19551641, -0.01414771,
       -0.80524598,  0.00709763, -0.74214555])

# Compare the parameters
df = pd.DataFrame({'LR': lr.coef_,
                   'Ridge': ridge.coef_,
                   'Lasso': lasso.coef_,
                   'ElasticNet': enet.coef_})
df.plot.bar();

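We create Lasso (sklearn.linear_model.Lasso) and Elastic Net (sklearn.linear_model.ElasticNet) models and fit them.
Both accept alpha. ElasticNet additionally accepts l1_ratio.
l1_ratio is the proportion of L1 regularization and takes a value from 0 to 1.

l1_ratio == 0: only the L2 penalty is applied, so the model behaves like Ridge regression.
l1_ratio == 1: only the L1 penalty is applied, so the model is the same as Lasso.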

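The coefficients of determination (R²) of Lasso and Elastic Net are nearly the same (about 0.55 and 0.56).
Linear regression and Ridge regression scored around 0.63, so both models here do worse than those.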

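Looking at lasso.coef_, two of the coefficients are exactly 0. A characteristic of Lasso is that the parameters of explanatory variables tend to become 0, which effectively reduces the number of explanatory variables.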
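This tendency can be seen by counting zero coefficients as alpha grows. The following is a minimal sketch on synthetic data from `make_regression`, not the dataset used above. The names `X_demo`, `y_demo`, and `zero_counts` and the chosen alpha values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 10 features, only 3 of which carry signal (assumption for the demo)
X_demo, y_demo = make_regression(n_samples=200, n_features=10, n_informative=3,
                                 noise=5.0, random_state=0)

zero_counts = {}
for alpha in [0.1, 1.0, 10.0, 1e6]:
    model = Lasso(alpha=alpha).fit(X_demo, y_demo)
    # Count coefficients that Lasso set exactly to zero
    zero_counts[alpha] = int(np.sum(model.coef_ == 0))

print(zero_counts)
```

With a very large alpha every coefficient is driven to zero, while a small alpha keeps at least the strongest features.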

The bar chart shows how the parameters changed.
Compared with linear regression and Ridge regression, they are smaller overall.

Elastic Net parameters

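Next we change alpha and l1_ratio of Elastic Net and check how the coefficient of determination changes.

In Ridge regression, Lasso, and Elastic Net, α controls the strength of regularization. With α = 0 the model is the same as linear regression.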
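One way to explore both parameters together is a small grid sweep. The sketch below uses synthetic data from `make_regression` rather than the dataset loaded above. The grid values and the names `X_demo`, `Xtr`, `Xte`, and `scores` are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem (assumption for the demo)
X_demo, y_demo = make_regression(n_samples=200, n_features=10,
                                 noise=10.0, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, random_state=0)

scores = {}
for alpha in [0.01, 0.1, 1.0, 10.0]:
    for l1_ratio in [0.1, 0.5, 0.9]:
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio).fit(Xtr, ytr)
        # R^2 on the held-out split for this parameter pair
        scores[(alpha, l1_ratio)] = r2_score(yte, model.predict(Xte))

for (alpha, l1_ratio), score in scores.items():
    print(f"alpha={alpha:<5} l1_ratio={l1_ratio}: R^2={score:.3f}")
```

On data like this, where every feature is informative, stronger regularization tends to lower R². On other datasets the best setting may differ.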

# Elastic Net
enet0 = ElasticNet(alpha=0)  # alpha=0 triggers warnings
enet0.fit(X_train, y_train)
y_pred0 = enet0.predict(X_test)
score0 = r2_score(y_test, y_pred0)
score0, score1  # compare with linear regression

/tmp/ipykernel_2955/2489826072.py:3: UserWarning: With alpha=0, this algorithm does not converge well. You are advised to use the LinearRegression estimator
  enet0.fit(X_train, y_train)
/home/appuser/venv/lib/python3.9/site-packages/sklearn/linear_model/_coordinate_descent.py:647: UserWarning: Coordinate descent with no regularization may lead to unexpected results and is discouraged.
  model = cd_fast.enet_coordinate_descent(
/home/appuser/venv/lib/python3.9/site-packages/sklearn/linear_model/_coordinate_descent.py:647: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.722e+03, tolerance: 3.233e+00 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
  model = cd_fast.enet_coordinate_descent(

(0.6354638433202116, 0.6354638433202133)

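With ElasticNet(alpha=0), the coefficient of determination is almost the same as that of LinearRegression() from two exercises earlier.
However, a warning tells us to use LinearRegression when α = 0. In practice there is indeed no reason to run Elastic Net with α = 0.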

# Elastic Net
enet5 = ElasticNet(alpha=1, l1_ratio=0)  # l1_ratio=0 triggers a warning
enet5.fit(X_train, y_train)
y_pred5 = enet5.predict(X_test)
score5 = r2_score(y_test, y_pred5)
score5, score2  # compare with Ridge regression

/home/appuser/venv/lib/python3.9/site-packages/sklearn/linear_model/_coordinate_descent.py:647: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.001e+03, tolerance: 3.233e+00 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.
  model = cd_fast.enet_coordinate_descent(

(0.5661486095034544, 0.6266182204613859)

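One might expect ElasticNet(alpha=1, l1_ratio=0) to behave like Ridge(alpha=1).
However, the coefficients of determination are 0.57 and 0.63, a noticeable difference. One reason is that scikit-learn scales the two objectives differently (ElasticNet divides the squared error by 2 × n_samples, Ridge does not), so the same alpha does not give the same regularization strength. In addition, a warning reports that the objective did not converge and suggests increasing the number of iterations. When you want a small l1_ratio, it is better to use Ridge regression.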
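The effect of the different scaling can be checked without relying on the coordinate descent solver. The sketch below assumes the objectives stated in the scikit-learn documentation: ElasticNet with l1_ratio=0 minimizes `1/(2n) * ||y - Xw||² + 0.5 * alpha * ||w||²`, and Ridge minimizes `||y - Xw||² + alpha * ||w||²`. Under that assumption the exact L2-only Elastic Net solution should match Ridge with alpha multiplied by n_samples. The data and all variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X_demo, y_demo = make_regression(n_samples=200, n_features=5,
                                 noise=5.0, random_state=0)
n_samples = X_demo.shape[0]
a = 1.0  # the alpha that would be passed to ElasticNet

# Center the data, which is what fitting an intercept amounts to
Xc = X_demo - X_demo.mean(axis=0)
yc = y_demo - y_demo.mean()

# Closed-form minimizer of 1/(2n)*||yc - Xc w||^2 + 0.5*a*||w||^2
w_enet_l2 = np.linalg.solve(Xc.T @ Xc + n_samples * a * np.eye(Xc.shape[1]),
                            Xc.T @ yc)

ridge_same_alpha = Ridge(alpha=a).fit(X_demo, y_demo)
ridge_rescaled = Ridge(alpha=n_samples * a).fit(X_demo, y_demo)

matches_rescaled = np.allclose(w_enet_l2, ridge_rescaled.coef_)
matches_same_alpha = np.allclose(w_enet_l2, ridge_same_alpha.coef_)
print(matches_rescaled, matches_same_alpha)
```

If the assumption holds, the first comparison agrees and the second does not, which suggests comparing against Ridge(alpha=n_samples) rather than Ridge(alpha=1).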

# Elastic Net
enet6 = ElasticNet(alpha=1, l1_ratio=1)
enet6.fit(X_train, y_train)
y_pred6 = enet6.predict(X_test)
score6 = r2_score(y_test, y_pred6)
score6, score3

(0.5516247059049908, 0.5516247059049908)

With ElasticNet(alpha=1, l1_ratio=1), we can confirm that the coefficient of determination is the same as that of Lasso(alpha=1).
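The same equivalence can be checked on the fitted parameters themselves, not only on the score. This is a minimal sketch on synthetic data. `X_demo`, `y_demo`, `lasso_demo`, and `enet_demo` are illustrative names, not objects from the notebook above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, ElasticNet

X_demo, y_demo = make_regression(n_samples=200, n_features=10, n_informative=4,
                                 noise=5.0, random_state=0)

lasso_demo = Lasso(alpha=1.0).fit(X_demo, y_demo)
enet_demo = ElasticNet(alpha=1.0, l1_ratio=1.0).fit(X_demo, y_demo)

# Compare coefficients and intercepts of the two fitted models
same_coef = np.allclose(lasso_demo.coef_, enet_demo.coef_)
same_intercept = np.isclose(lasso_demo.intercept_, enet_demo.intercept_)
print(same_coef, same_intercept)
```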