{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Ungraded Lab: Linear Regression using Scikit-Learn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that you've implemented linear regression from scratch, let's see how you can train a linear regression model using scikit-learn.\n", "\n", "## Dataset\n", "Let's start with the same dataset as in the first labs." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "# X is the input variable (size in square feet)\n", "# y is the output variable (price in 1000s of dollars)\n", "X = np.array([1000, 2000])\n", "y = np.array([200, 400])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fit the model\n", "\n", "The code below imports the [linear regression model](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression) from scikit-learn. You can fit this model on the training data by calling its `fit` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import LinearRegression\n", "\n", "linear_model = LinearRegression()\n", "# scikit-learn expects X to be 2-D, with shape (n_samples, n_features).\n", "# Our X is 1-D (a single feature), so we reshape it with .reshape(-1, 1).\n", "# If X already has one column per feature, no reshape is needed.\n", "linear_model.fit(X.reshape(-1, 1), y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make Predictions\n", "\n", "You can see the predictions made by this model by calling its `predict` method."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y_pred = linear_model.predict(X.reshape(-1, 1))\n", "\n", "print(\"Prediction on training set:\", y_pred)\n", "\n", "X_test = np.array([[1200]])\n", "print(f\"Prediction for 1200 sqft house: {linear_model.predict(X_test)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculate score\n", "\n", "You can measure how well this model fits the data by calling its `score` method. Specifically, it returns the coefficient of determination $R^2$ of the prediction. The best possible score is 1." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"R^2 score on training set:\", linear_model.score(X.reshape(-1, 1), y))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## View Parameters\n", "Our $\\mathbf{w}$ parameters from our earlier labs are referred to as 'intercept' and 'coefficients' in sklearn, and are stored in the `intercept_` and `coef_` attributes of the fitted model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"w = {linear_model.intercept_},{linear_model.coef_}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 5 }