LogisticRegression scaling in scikit-learn

In this example, we examine how the run time and memory usage of sklearn.linear_model.LogisticRegression scale with the number of samples.

from collections import OrderedDict

import numpy as np
from sklearn.linear_model import LogisticRegression
from neurtu import Benchmark, delayed


rng = np.random.RandomState(42)

n_samples, n_features = 50000, 100


X = rng.rand(n_samples, n_features)
y = rng.randint(2, size=(n_samples))


def benchmark_cases():
    # Five log-spaced sample sizes between 100 and n_samples.
    for N in np.logspace(np.log10(100), np.log10(n_samples), 5).astype('int'):
        for solver in ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']:
            # Tags label each measurement in the resulting DataFrame.
            tags = OrderedDict(N=N, solver=solver)
            model = delayed(LogisticRegression, tags=tags)(
                solver=solver, random_state=rng)

            yield model.fit(X[:N], y[:N])


bench = Benchmark(wall_time=True, peak_memory=True)
df = bench(benchmark_cases())

print(df.tail())

Out:

  0%|          | 0/50 [00:00<?, ?it/s]
...
 12%|#2        | 6/50 [00:01<00:10,  4.19it/s]
/home/docs/checkouts/readthedocs.org/user_builds/neurtu/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/sag.py:337: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
  "the coef_ did not converge", ConvergenceWarning)
...
100%|##########| 50/50 [00:30<00:00,  2.30s/it]

                 wall_time  peak_memory
N     solver
49999 newton-cg   0.909621    65.433594
      lbfgs       0.662035     0.000000
      liblinear   0.625137    79.863281
      sag         4.358365     0.000000
      saga        2.037092     0.007812

The section above runs in approximately 1 min and displays a progress bar.
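The largest sample size is listed as 49999 rather than 50000 in the table above. A likely explanation is that the last log-spaced value lands just below 50000 in floating point and `astype('int')` truncates it. The sketch below reproduces the grid from `benchmark_cases()` and compares it with a rounded variant; the rounding step is an assumption for illustration and is not part of the original script.

```python
import numpy as np

n_samples = 50000  # same value as in the benchmark script

# Log-spaced grid as built in benchmark_cases(); astype('int') truncates
# toward zero, so a value such as 49999.99... becomes 49999.
grid = np.logspace(np.log10(100), np.log10(n_samples), 5)
truncated = grid.astype('int')

# Assumed alternative (not in the original script): round before casting.
rounded = np.round(grid).astype('int')

print("truncated:", truncated)
print("rounded:  ", rounded)
```

Either grid is adequate for a scaling plot on logarithmic axes; rounding only makes the largest case use every row of X.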

We can use the pandas plotting API (which requires matplotlib) to visualize the results:

ax = df.wall_time.unstack().plot(marker='o')
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_ylabel('Wall time (s)')
ax.set_title('Run time scaling for LogisticRegression.fit')
../_images/sphx_glr_logistic_regression_scaling_001.png

The solver with the best scalability in this example is “lbfgs”.
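One way to put a number on scalability is the slope of the wall time curve on log-log axes, which approximates the exponent k in t ∝ N^k. The helper below is a minimal sketch and not part of neurtu or scikit-learn; the name `scaling_exponent` is hypothetical, and the timings in the demonstration are synthetic values generated from a known power law, not measurements from this benchmark.

```python
import numpy as np


def scaling_exponent(n_values, wall_times):
    """Least-squares slope of log(wall time) against log(N)."""
    slope, _intercept = np.polyfit(np.log(n_values), np.log(wall_times), 1)
    return slope


# Synthetic timings that follow t = 1e-5 * N**1.5 exactly. These are NOT
# benchmark results; they only check that the helper recovers the exponent.
n_values = np.array([100, 1000, 10000, 50000])
synthetic_times = 1e-5 * n_values ** 1.5

print(round(scaling_exponent(n_values, synthetic_times), 3))
```

With the benchmark results, each column of `df.wall_time.unstack()` holds the timings of one solver indexed by N, so the helper could be applied column by column. A smaller exponent then indicates better scaling, independently of the constant factor.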

Similarly, the memory scaling is shown below:

ax = df.peak_memory.unstack().plot(marker='o')
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_ylabel('Peak memory (MB)')
ax.set_title('Peak memory usage for LogisticRegression.fit')
../_images/sphx_glr_logistic_regression_scaling_002.png

Peak memory usage for “liblinear” and “newton-cg” appears to be significant above 10000 samples, while the other solvers stay below the detection threshold. Note that these benchmarks do not account for the memory used by the X and y arrays.
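For a sense of scale, the size of the input arrays that the benchmark leaves out can be read from their `nbytes` attribute. The sketch below recreates X and y as in the script above and reports their size; converting to MiB (2**20 bytes) is an assumption made here for readability and may not match the unit convention behind the "Peak memory (MB)" axis.

```python
import numpy as np

rng = np.random.RandomState(42)
n_samples, n_features = 50000, 100

X = rng.rand(n_samples, n_features)  # float64, 8 bytes per element
y = rng.randint(2, size=n_samples)   # platform default integer type

mib = 2 ** 20  # assumed unit for display only
print(f"X: {X.nbytes} bytes ({X.nbytes / mib:.1f} MiB)")
print(f"y: {y.nbytes} bytes ({y.nbytes / mib:.2f} MiB)")
```

X alone occupies 50000 × 100 × 8 = 40,000,000 bytes, which is of the same order as the peak memory reported for “liblinear” and “newton-cg” at the largest sample size.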

Total running time of the script: (0 minutes 32.213 seconds)
