138 lines
3.5 KiB
Plaintext
138 lines
3.5 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Ungraded Lab: Linear Regression using Scikit-Learn"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Now that you've implemented linear regression from scratch, let's see you can train a linear regression model using scikit-learn.\n",
|
|
"\n",
|
|
"## Dataset \n",
|
|
"Let's start with the same dataset as the first labs."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import numpy as np\n",
|
|
"\n",
|
|
"# X is the input variable (size in square feet)\n",
|
|
"# y in the output variable (price in 1000s of dollars)\n",
|
|
"X = np.array([1000, 2000])\n",
|
|
"y = np.array([200, 400])"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Fit the model\n",
|
|
"\n",
|
|
"The code below imports the [linear regression model](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression) from scikit-learn. You can fit this model on the training data by calling `fit` function."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from sklearn.linear_model import LinearRegression\n",
|
|
"\n",
|
|
"linear_model = LinearRegression()\n",
|
|
"# We must reshape X using .reshape(-1, 1) because our data has a single feature\n",
|
|
"# If X has multiple features, you don't need to reshape\n",
|
|
"linear_model.fit(X.reshape(-1, 1), y) "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Make Predictions\n",
|
|
"\n",
|
|
"You can see the predictions made by this model by calling the `predict` function."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"y_pred = linear_model.predict(X.reshape(-1,1))\n",
|
|
"\n",
|
|
"print(\"Prediction on training set:\", y_pred)\n",
|
|
"\n",
|
|
"X_test = np.array([[1200]])\n",
|
|
"print(f\"Prediction for 1200 sqft house: {linear_model.predict(X_test)}\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Calculate score\n",
|
|
"\n",
|
|
"You can calculate how well this model is doing by calling the `score` function. Specifically, it, returns the coefficient of determination $R^2$ of the prediction. 1 is the best score."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"print(\"Accuracy on training set:\", linear_model.score(X.reshape(-1,1), y))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## View Parameters \n",
|
|
"Our $\\mathbf{w}$ parameters from our earlier labs are referred to as 'intercept' and 'coefficients' in sklearn."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"print(f\"w = {linear_model.intercept_},{linear_model.coef_}\")"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.8.6"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|