LogisticRegression scaling in scikit-learn
In this example we look into the time and space complexity of sklearn.linear_model.LogisticRegression as the number of training samples grows, for each available solver.
from collections import OrderedDict
import numpy as np
from sklearn.linear_model import LogisticRegression
from neurtu import Benchmark, delayed
rng = np.random.RandomState(42)
n_samples, n_features = 50000, 100
X = rng.rand(n_samples, n_features)
y = rng.randint(2, size=(n_samples))
def benchmark_cases():
    for N in np.logspace(np.log10(100), np.log10(n_samples), 5).astype('int'):
        for solver in ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']:
            tags = OrderedDict(N=N, solver=solver)
            model = delayed(LogisticRegression, tags=tags)(
                solver=solver, random_state=rng)
            yield model.fit(X[:N], y[:N])
bench = Benchmark(wall_time=True, peak_memory=True)
df = bench(benchmark_cases())
print(df.tail())
Out:
/home/docs/checkouts/readthedocs.org/user_builds/neurtu/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/sag.py:337: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
  "the coef_ did not converge", ConvergenceWarning)
100%|##########| 50/50 [00:30<00:00, 2.30s/it]
                 wall_time  peak_memory
N     solver
49999 newton-cg   0.909621    65.433594
      lbfgs       0.662035     0.000000
      liblinear   0.625137    79.863281
      sag         4.358365     0.000000
      saga        2.037092     0.007812
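The largest N in the table is 49999 rather than 50000. A likely cause is that `astype('int')` truncates toward zero, so a log-spaced value that lands just below 50000 in floating point loses one sample. The sketch below rounds to the nearest integer before casting; the variable name `sizes` is ours and is not part of the original example.

```python
import numpy as np

n_samples = 50000

# Log-spaced sample sizes, as in the benchmark above, but rounded to the
# nearest integer before the cast so that floating-point error in the last
# point cannot drop it below n_samples.
raw = np.logspace(np.log10(100), np.log10(n_samples), 5)
sizes = np.rint(raw).astype(int)

print(sizes)
```

Replacing the expression in the `for N in ...` loop with `sizes` should leave the benchmark otherwise unchanged.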
The section above takes up to about a minute to run and displays a progress bar while it does.
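To give a rough idea of what the two reported metrics mean, here is a minimal standard-library sketch that times a single call and records its peak memory. It is not how neurtu performs its measurements: `tracemalloc` only sees allocations made through Python's allocators and slows the traced code down, and the helper name `measure` is hypothetical.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once; return wall time (s) and peak traced memory (MiB)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    try:
        func(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - t0
        # get_traced_memory() returns (current, peak) in bytes
        _current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return {'wall_time': elapsed, 'peak_memory': peak / 2**20}


# Illustrative workload: allocate a 10 MiB buffer
result = measure(lambda: bytearray(10 * 2**20))
print(result)
```

A dedicated benchmarking tool is still preferable for real measurements, since it can repeat runs and account for memory allocated outside Python.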
We can use the pandas plotting API (which requires matplotlib) to visualize the results:
ax = df.wall_time.unstack().plot(marker='o')
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_ylabel('Wall time (s)')
ax.set_title('Run time scaling for LogisticRegression.fit')
The solver with the best scalability in this example is “lbfgs”.
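One way to put a number on "scalability" is the slope of each curve on the log-log plot: if the run time behaves like c * N^p, the slope is the exponent p. The sketch below fits that slope with a least-squares line. The helper name `scaling_exponents` and the synthetic data are ours, purely for illustration; the function expects a Series indexed by (N, solver), like `df.wall_time` above, with strictly positive values.

```python
import numpy as np
import pandas as pd


def scaling_exponents(series):
    """Fit log10(value) = p * log10(N) + c per solver and return p."""
    table = series.unstack()  # rows: N, columns: solver
    log_n = np.log10(table.index.to_numpy(dtype=float))
    exponents = {}
    for solver in table.columns:
        log_t = np.log10(table[solver].to_numpy(dtype=float))
        slope, _intercept = np.polyfit(log_n, log_t, 1)
        exponents[solver] = slope
    return pd.Series(exponents, name='exponent')


# Synthetic timings (not measured): one linear and one quadratic "solver"
sizes = [100, 1000, 10000]
index = pd.MultiIndex.from_product(
    [sizes, ['linear', 'quadratic']], names=['N', 'solver'])
values = [1e-4 * n ** p for n in sizes for p in (1, 2)]
demo = pd.Series(values, index=index, name='wall_time')

exps = scaling_exponents(demo)
print(exps)
```

On the real benchmark this would be called as `scaling_exponents(df.wall_time)`. It is not suitable for `df.peak_memory` as is, because zero values have no logarithm.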
Similarly, the memory scaling is shown below:
ax = df.peak_memory.unstack().plot(marker='o')
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_ylabel('Peak memory (MB)')
ax.set_title('Peak memory usage for LogisticRegression.fit')
Peak memory usage for “liblinear” and “newton-cg” appears to be significant
above 10000 samples, while the other solvers
use less memory than the detection threshold.
Note that these benchmarks do not account for the memory used by the X and y arrays.
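For scale, the size of the input arrays themselves can be read from their `nbytes` attribute. The sketch below rebuilds X and y as in the benchmark and reports their size in MiB; whether the "MB" reported by the benchmark means 10^6 or 2^20 bytes is not stated here, so the comparison is only approximate.

```python
import numpy as np

rng = np.random.RandomState(42)
n_samples, n_features = 50000, 100
X = rng.rand(n_samples, n_features)   # float64, 8 bytes per element
y = rng.randint(2, size=n_samples)    # integer dtype is platform dependent

x_mib = X.nbytes / 2**20
y_mib = y.nbytes / 2**20
print(f"X: {x_mib:.1f} MiB, y: {y_mib:.2f} MiB")
```

X alone takes about 38 MiB, which is of the same order as the peak memory reported for “liblinear” and “newton-cg” at the largest N.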
Total running time of the script: (0 minutes 32.213 seconds)