{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Ungraded Lab: Model Representation\n",
    "\n",
    "In this ungraded lab, you will implement the model $f_w$ for linear regression with one variable.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem Statement\n",
    "\n",
    "You will use the motivating example of housing price prediction. There are two data points - a house with 1000 square feet sold for \\\\$200,000 and a house with 2000 square feet sold for \\\\$400,000.\n",
    "\n",
    "Therefore, your dataset contains the following two points - \n",
    "\n",
    "| Size (feet$^2$)     | Price (1000s of dollars) |\n",
    "| -------------------| ------------------------ |\n",
    "| 1000               | 200                      |\n",
    "| 2000               | 400                      |\n",
    "\n",
    "You would like to fit a linear regression model (represented with a straight line) through these two points, so you can then predict price for other houses - say, a house with 1200 feet$^2$.\n",
    "\n",
    "### Notation: `X` and `y`\n",
    "\n",
    "For the next few labs, you will use lists in python to represent your dataset. As shown in the video:\n",
    "- `X` represents input variables, also called input features (in this case - Size (feet$^2$)) and \n",
    "- `y` represents output variables, also known as target variables (in this case - Price (1000s of dollars)). \n",
    "\n",
    "Please run the following code cell to create your `X` and `y` variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# X is the input variable (size in square feet)\n",
    "# y in the output variable (price in 1000s of dollars)\n",
    "X = [1000, 2000] \n",
    "y = [200, 400]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Number of training examples `m`\n",
    "You will use `m` to denote the number of training examples. In Python, use the `len()` function to get the number of examples in a list.  You can get `m` by running the next code cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# m is the number of training examples\n",
    "m = len(X)\n",
    "print(f\"Number of training examples is: {m}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training example `x_i, y_i`\n",
    "\n",
    "You will use (x$^i$, y$^i$) to denote the $i^{th}$ training example. Since Python is zero indexed, (x$^0$, y$^0$) is (1000, 200) and (x$^1$, y$^1$) is (2000, 400). \n",
    "\n",
    "Run the next code block below to get the $i^{th}$ training example in a Python list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "i = 0 # Change this to 1 to see (x^1, y^1)\n",
    "\n",
    "x_i = X[i]\n",
    "y_i = y[i]\n",
    "print(f\"(x^({i}), y^({i})) = ({x_i}, {y_i})\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plotting the data\n",
    "First, let's run the cell below to import [matplotlib](http://matplotlib.org), which is a famous library to plot graphs in Python. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can plot these two points using the `scatter()` function in the `matplotlib` library, as shown in the cell below. \n",
    "- The function arguments `marker` and `c` show the points as red crosses (the default is blue dots).\n",
    "\n",
    "You can also use other functions in the `matplotlib` library to display the title and labels for the axes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot the data points\n",
    "plt.scatter(X, y, marker='x', c='r')\n",
    "\n",
    "# Set the title\n",
    "plt.title(\"Housing Prices\")\n",
    "# Set the y-axis label\n",
    "plt.ylabel('Price (in 1000s of dollars)')\n",
    "# Set the x-axis label\n",
    "plt.xlabel('Size (feet^2)')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model function\n",
    "\n",
    "The model function for linear regression (which is a function that maps from `X` to `y`) is represented as \n",
    "\n",
    "$f(x) = w_0 + w_1x$\n",
    "\n",
    "The formula above is how you can represent straight lines - different values of $w_0$ and $w_1$ give you different straight lines on the plot. Let's try to get a better intuition for this through the code blocks below.\n",
    "\n",
    "Let's represent $w$ as a list in python, with $w_0$ as the first item in the list and $w_1$ as the second. \n",
    "\n",
    "Let's start with $w_0 = 3$ and $w_1 = 1$ \n",
    "\n",
    "### Note: You can come back to this cell to adjust the model's w0 and w1 parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# You can come back here later to adjust w0 and w1\n",
    "w = [3, 1] \n",
    "print(\"w_0:\", w[0])\n",
    "print(\"w_1:\", w[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's calculate the value of $f(x)$ for your two data points. You can explicitly write this out for each data point as - \n",
    "\n",
    "for $x^0$, `f = w[0]+w[1]*X[0]`\n",
    "\n",
    "for $x^1$, `f = w[0]+w[1]*X[1]`\n",
    "\n",
    "For a large number of data points, this can get unwieldy and repetitive. So instead, you can calculate the function output in a `for` loop as follows - \n",
    "\n",
    "```\n",
    "f = []\n",
    "for i in range(len(X)):\n",
    "    f_x = w[0] + w[1]*X[i]\n",
    "    f.append(f_x)\n",
    "```\n",
    "\n",
    "Paste the code shown above in the `calculate_model_output` function below.\n",
    "Please recall that in Python, indentation is significant. Incorrect indentation may result in a Python error message."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def calculate_model_output(w, X):\n",
    "    ### START CODE HERE ### \n",
    "\n",
    "    ### END CODE HERE ###\n",
    "    return f"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's call the `calculate_model_output` function and plot the output using the `plot` method from `matplotlib` library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "f = calculate_model_output(w, X)\n",
    "\n",
    "# Plot our model prediction\n",
    "plt.plot(X, f, c='b',label='Our Prediction')\n",
    "\n",
    "# Plot the data points\n",
    "plt.scatter(X, y, marker='x', c='r',label='Actual Values')\n",
    "\n",
    "# Set the title\n",
    "plt.title(\"Housing Prices\")\n",
    "# Set the y-axis label\n",
    "plt.ylabel('Price (in 1000s of dollars)')\n",
    "# Set the x-axis label\n",
    "plt.xlabel('Size (feet^2)')\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, setting $w_0 = 3$ and $w_1 = 1$ does not result in a line that fits our data. \n",
    "\n",
    "### Challenge\n",
    "Try experimenting with different values of $w_0$ and $w_1$. What should the values be for getting a line that fits our data?\n",
    "\n",
    "#### Tip:\n",
    "You can use your mouse to click on the triangle to the left of the green \"Hints\" below to reveal some hints for choosing w0 and w1."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<details>\n",
    "<summary>\n",
    "    <font size='3', color='darkgreen'><b>Hints</b></font>\n",
    "</summary>\n",
    "    <p>\n",
    "    <ul>\n",
    "        <li>Try w0 = 1 and w1 = 0.5,  w = [1, 0.5] </li>\n",
    "        <li>Try w0 = 0 and w1 = 0.2,  w = [0, 0.2] </li>\n",
    "    </ul>\n",
    "    </p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prediction\n",
    "Now that we have a model, we can use it to make our original prediction. Write the expression to predict the price of a house with 1200 feet^2. You can check your answer below.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "print(f\"{cost_1200sqft:.0f} thousand dollars\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<details>\n",
    "<summary>\n",
    "    <font size='3', color='darkgreen'><b>Answer</b></font>  \n",
    "</summary>    \n",
    "\n",
    "```\n",
    "    w = [0, 0.2] \n",
    "    cost_1200sqft = w[0] + w[1]*1200\n",
    " ```\n",
    "\n",
    "240 thousand dollars"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}