\documentclass[SKL-MASTER.tex]{subfiles}
\subsection{Using Stochastic Gradient Descent for classification}
As was discussed in Chapter 2, Working with Linear Models, Stochastic Gradient Descent is a
fundamental technique for fitting regression models. The two tasks are naturally connected: as the
name \texttt{SGDClassifier} implies, the same optimization procedure carries over to classification.
\subsection*{Getting ready}
In regression, we minimized a cost function that penalized bad predictions on a continuous
scale; for classification, we will minimize a cost function that penalizes predicting the wrong one
of two (or more) discrete classes.
\subsection*{Implementation} % How to do it…
First, let's create some very basic data:
\begin{framed}
\begin{verbatim}
>>> from sklearn import datasets
>>> X, y = datasets.make_classification()
\end{verbatim}
\end{framed}
Next, we'll create a SGDClassifier instance:
\begin{framed}
\begin{verbatim}
>>> from sklearn import linear_model
>>> sgd_clf = linear_model.SGDClassifier()
\end{verbatim}
\end{framed}
As usual, we'll fit the model:
\begin{framed}
\begin{verbatim}
>>> sgd_clf.fit(X, y)
SGDClassifier(alpha=0.0001, class_weight=None, epsilon=0.1, eta0=0.0,
fit_intercept=True, l1_ratio=0.15,
learning_rate='optimal', loss='hinge', n_iter=5,
n_jobs=1, penalty='l2', power_t=0.5, random_state=None,
shuffle=False, verbose=0, warm_start=False)
\end{verbatim}
\end{framed}
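Once fitted, the classifier predicts labels in the usual way. The lines below are a quick
sketch of making predictions and checking the mean training accuracy; the exact number will
vary from run to run, since the data are randomly generated:
\begin{framed}
\begin{verbatim}
>>> preds = sgd_clf.predict(X)   # predicted class labels
>>> preds[:5]                    # inspect the first five predictions
>>> sgd_clf.score(X, y)          # mean accuracy on the training data
\end{verbatim}
\end{framed}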
We can set the \texttt{class\_weight} parameter to account for class imbalance
in a dataset.
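For example, passing \texttt{class\_weight='balanced'} reweights each class inversely to its
frequency, while a dictionary assigns weights explicitly. The following sketch is illustrative
only; the roughly 90/10 split and the weight of 10 are assumptions, not values from the recipe:
\begin{framed}
\begin{verbatim}
>>> # build a deliberately imbalanced problem (roughly 90/10)
>>> X_imb, y_imb = datasets.make_classification(weights=[0.9, 0.1])
>>> # upweight the rare class; class_weight='balanced' would infer
>>> # the weights from the class frequencies instead
>>> weighted_clf = linear_model.SGDClassifier(class_weight={0: 1, 1: 10})
>>> weighted_clf.fit(X_imb, y_imb)
\end{verbatim}
\end{framed}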
%===================================================================%
\subsubsection{The Hinge Loss Function}
The hinge loss function is defined as follows:
\[ \ell(y) = \max(0,\, 1 - t \cdot y) \]
Here, $t$ is the true class label, coded as $+1$ for one class and $-1$ for the other, and $y$ is
the decision value produced by the fitted model: the vector of coefficients $w$ applied to the
point of interest $x$, plus an intercept $b$ for good measure. To put it another way:
\[ y = w \cdot x + b \]
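As an illustrative check (not part of the original recipe), the hinge loss can be computed
directly with NumPy from the fitted coefficients; \texttt{coef\_} and \texttt{intercept\_} are
standard attributes of the fitted classifier:
\begin{framed}
\begin{verbatim}
>>> import numpy as np
>>> t = np.where(y == 1, 1, -1)    # recode the labels as +1 / -1
>>> # decision values y = w . x + b for every sample
>>> scores = X.dot(sgd_clf.coef_.ravel()) + sgd_clf.intercept_
>>> hinge = np.maximum(0, 1 - t * scores)  # max(0, 1 - t*y) per sample
>>> hinge.mean()                   # average hinge loss on the data
\end{verbatim}
\end{framed}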
%=================================================================================%
\end{document}