Commit d52d502

Merge pull request #1284 from AmrutaJayanti/MD
Create Logistic-Regression.md
2 parents c5ae48e + 1eba1ef commit d52d502

1 file changed: 100 additions, 0 deletions

# Logistic Regression

``` python
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, n_clusters_per_class=1, random_state=42)
```

Above, a synthetic dataset is created using `make_classification` from `sklearn.datasets`.
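
As a quick sanity check (a small illustrative addition, not part of the original file), we can inspect the shapes and class balance of the generated data:

``` python
import numpy as np

# Shape of the feature matrix and the label vector
print(X.shape)   # (100, 2): 100 samples, 2 features
print(y.shape)   # (100,)

# Number of samples in each of the two classes
print(np.bincount(y))
```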

``` python
import matplotlib.pyplot as plt

plt.scatter(X[:, 0], X[:, 1])
plt.show()
```

![55558f59d1b98e9a3cc68d08daae54b9b065d057](https://github.com/AmrutaJayanti/codeharborhub/assets/142327526/84578011-0887-43da-b972-9e6f04ae505e)

Logistic Regression is a statistical method used for binary classification problems. It models the probability that a given input belongs to a particular category.

Logistic Function (Sigmoid Function): The core of logistic regression is the logistic function, an S-shaped curve that can take any real-valued number and map it into a value between 0 and 1. The function is defined as:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

where $x$ is the input to the function.
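
To make the sigmoid concrete, here is a minimal NumPy sketch (added for illustration, not from the original file). Large positive inputs map close to 1, large negative inputs close to 0, and 0 maps to exactly 0.5:

``` python
import numpy as np

def sigmoid(x):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))
# approximately [0.0067, 0.5, 0.9933]
```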

Logistic Regression is generally used for linearly separable data.

Logistic Regression cost function:

$$J(\beta) = - \frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log(h_\beta(x_i)) + (1 - y_i) \log(1 - h_\beta(x_i)) \right]$$
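
As an illustration (a sketch added here, using hypothetical probabilities rather than anything from the original page), the same binary cross-entropy cost can be computed directly from predicted probabilities $h_\beta(x_i)$ and true labels $y_i$; `sklearn.metrics.log_loss` implements the identical formula:

``` python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical predicted probabilities h_beta(x_i) and true labels y_i
h = np.array([0.9, 0.2, 0.8, 0.4])
y_true = np.array([1, 0, 1, 0])

# Binary cross-entropy, exactly the J(beta) formula above
J = -np.mean(y_true * np.log(h) + (1 - y_true) * np.log(1 - h))
print(J)                    # manual computation
print(log_loss(y_true, h))  # should match
```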

### Applications

- **Medical Diagnosis**: Predicting whether a patient has a certain disease (e.g., diabetes, cancer) based on diagnostic features.
- **Spam Detection**: Classifying emails as spam or not spam.
- **Customer Churn**: Predicting whether a customer will leave a service.
- **Credit Scoring**: Assessing whether a loan applicant is likely to default on a loan.

``` python
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

`X` and `y` are split into training and testing sets using `train_test_split`.
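
With `test_size=0.2`, the 100 samples should split into 80 for training and 20 for testing; a quick check (added here as an illustration):

``` python
print(x_train.shape, x_test.shape)   # expected: (80, 2) (20, 2)
print(y_train.shape, y_test.shape)   # expected: (80,) (20,)
```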

``` python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)

from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)
```

Output:

1.0

Our model classifies the test data perfectly, so the accuracy score is 1.0.
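
A perfect score is plausible here because the synthetic data is small and cleanly separable; on real data it would usually warrant a closer look. As a follow-up sketch (not part of the original notebook), a confusion matrix and the predicted class probabilities give a fuller picture than accuracy alone:

``` python
from sklearn.metrics import confusion_matrix, classification_report

# Rows are true classes, columns are predicted classes
print(confusion_matrix(y_test, y_pred))

# Precision, recall and F1 per class
print(classification_report(y_test, y_pred))

# Predicted probability of class 1 for the first few test samples
print(model.predict_proba(x_test)[:5, 1])
```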

``` python
import numpy as np

# Build a grid covering the feature space
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                     np.arange(y_min, y_max, 0.01))

# Predict the class for each grid point
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Plot decision boundary and data points
plt.figure(figsize=(8, 6))
plt.contourf(xx, yy, Z, alpha=0.8, cmap='viridis')
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis', marker='o', edgecolors='k')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Logistic Regression Decision Boundary')
plt.show()
```

![3709358d7ef950353a7f26d9dfbb2f5f16fc962e](https://github.com/AmrutaJayanti/codeharborhub/assets/142327526/bd7361ac-b710-4975-8fb2-1ad4bf0ebe99)
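
Because logistic regression is a linear model, the boundary in the plot is the line where $\beta_0 + \beta_1 x_1 + \beta_2 x_2 = 0$, i.e. where the predicted probability is 0.5. The fitted coefficients can be read off the model directly (a short illustrative addition, not in the original file; the new point below is hypothetical):

``` python
# Intercept (beta_0) and feature weights (beta_1, beta_2) of the fitted model
print(model.intercept_)   # shape (1,)
print(model.coef_)        # shape (1, 2)

# Probability estimates for a hypothetical new point
new_point = np.array([[0.0, 0.0]])
print(model.predict_proba(new_point))  # [[P(class 0), P(class 1)]]
print(model.predict(new_point))        # predicted class label
```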