{
|
||
|
"cells": [
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Optional Lab: Gradient Descent for Linear Regression\n",
|
||
|
"\n",
|
||
|
"<figure>\n",
|
||
|
" <center> <img src=\"./images/C1_W1_L4_S1_Lecture_GD.png\" style=\"width:800px;height:200px;\" ></center>\n",
|
||
|
"</figure>"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Goals\n",
|
||
|
"In this lab, you will:\n",
|
||
|
"- automate the process of optimizing $w$ and $b$ using gradient descent."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Tools\n",
|
||
|
"In this lab, we will make use of: \n",
|
||
|
"- NumPy, a popular library for scientific computing\n",
|
||
|
"- Matplotlib, a popular library for plotting data\n",
|
||
|
"- plotting routines in the lab_utils.py file in the local directory"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 1,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import math, copy\n",
|
||
|
"import numpy as np\n",
|
||
|
"import matplotlib.pyplot as plt\n",
|
||
|
"plt.style.use('./deeplearning.mplstyle')\n",
|
||
|
"from lab_utils_uni import plt_house_x, plt_contour_wgrad, plt_divergence, plt_gradients"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2\"></a>\n",
|
||
|
"# Problem Statement\n",
|
||
|
"\n",
|
||
|
"Let's use the same two data points as before - a house with 1000 square feet sold for \\\\$300,000 and a house with 2000 square feet sold for \\\\$500,000.\n",
|
||
|
"\n",
|
||
|
"| Size (1000 sqft) | Price (1000s of dollars) |\n",
|
||
|
"| ----------------| ------------------------ |\n",
|
||
|
"| 1 | 300 |\n",
|
||
|
"| 2 | 500 |\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 2,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"# Load our data set\n",
|
||
|
"x_train = np.array([1.0, 2.0]) #features\n",
|
||
|
"y_train = np.array([300.0, 500.0]) #target value"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2.0.1\"></a>\n",
|
||
|
"### Compute_Cost\n",
|
||
|
"This was developed in the last lab. We'll need it again here."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 3,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"#Function to calculate the cost\n",
|
||
|
"def compute_cost(x, y, w, b):\n",
|
||
|
" \n",
|
||
|
" m = x.shape[0] \n",
|
||
|
" cost = 0\n",
|
||
|
" \n",
|
||
|
" for i in range(m):\n",
|
||
|
" f_wb = w * x[i] + b\n",
|
||
|
" cost = cost + (f_wb - y[i])**2\n",
|
||
|
" total_cost = 1 / (2 * m) * cost\n",
|
||
|
"\n",
|
||
|
" return total_cost"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2.1\"></a>\n",
|
||
|
"## Gradient descent summary\n",
|
||
|
"So far in this course, you have developed a linear model that predicts $f_{w,b}(x^{(i)})$:\n",
|
||
|
"$$f_{w,b}(x^{(i)}) = wx^{(i)} + b \\tag{1}$$\n",
|
||
|
"In linear regression, you utilize input training data to fit the parameters $w$,$b$ by minimizing a measure of the error between our predictions $f_{w,b}(x^{(i)})$ and the actual data $y^{(i)}$. The measure is called the $cost$, $J(w,b)$. In training you measure the cost over all of our training samples $x^{(i)},y^{(i)}$\n",
|
||
|
"$$J(w,b) = \\frac{1}{2m} \\sum\\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2\\tag{2}$$ "
|
||
|
]
|
||
|
},
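{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick, optional sanity check, the cost in equation (2) is zero when the line passes exactly through both training points, for example $w=200$ and $b=100$. The cell below (an added sketch, not part of the original lab) verifies this with the `compute_cost` function defined above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Added sanity-check sketch: w=200, b=100 fits both training points exactly, so J(w,b) = 0\n",
"compute_cost(x_train, y_train, w=200, b=100)   # expect 0.0"
]
},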
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"\n",
|
||
|
"In lecture, *gradient descent* was described as:\n",
|
||
|
"\n",
|
||
|
"$$\\begin{align*} \\text{repeat}&\\text{ until convergence:} \\; \\lbrace \\newline\n",
|
||
|
"\\; w &= w - \\alpha \\frac{\\partial J(w,b)}{\\partial w} \\tag{3} \\; \\newline \n",
|
||
|
" b &= b - \\alpha \\frac{\\partial J(w,b)}{\\partial b} \\newline \\rbrace\n",
|
||
|
"\\end{align*}$$\n",
|
||
|
"where, parameters $w$, $b$ are updated simultaneously. \n",
|
||
|
"The gradient is defined as:\n",
|
||
|
"$$\n",
|
||
|
"\\begin{align}\n",
|
||
|
"\\frac{\\partial J(w,b)}{\\partial w} &= \\frac{1}{m} \\sum\\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})x^{(i)} \\tag{4}\\\\\n",
|
||
|
" \\frac{\\partial J(w,b)}{\\partial b} &= \\frac{1}{m} \\sum\\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)}) \\tag{5}\\\\\n",
|
||
|
"\\end{align}\n",
|
||
|
"$$\n",
|
||
|
"\n",
|
||
|
"Here *simultaniously* means that you calculate the partial derivatives for all the parameters before updating any of the parameters."
|
||
|
]
|
||
|
},
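{
"cell_type": "markdown",
"metadata": {},
"source": [
"To illustrate, the next cell (an added sketch with placeholder values, not part of the original lab) performs one simultaneous update on a single example: both partial derivatives are evaluated at the current $(w,b)$, and only then are the parameters changed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Added sketch only: single example and placeholder values chosen for illustration\n",
"w, b, alpha = 0.0, 0.0, 0.01\n",
"x_i, y_i = 1.0, 300.0                 # one training example\n",
"\n",
"# evaluate both partial derivatives at the same, current (w, b) ...\n",
"dj_dw = (w * x_i + b - y_i) * x_i\n",
"dj_db = (w * x_i + b - y_i)\n",
"\n",
"# ... and only then update the parameters (the simultaneous update of equation (3))\n",
"w = w - alpha * dj_dw\n",
"b = b - alpha * dj_db\n",
"print(w, b)"
]
},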
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2.2\"></a>\n",
|
||
|
"## Implement Gradient Descent\n",
|
||
|
"You will implement gradient descent algorithm for one feature. You will need three functions. \n",
|
||
|
"- `compute_gradient` implementing equation (4) and (5) above\n",
|
||
|
"- `compute_cost` implementing equation (2) above (code from previous lab)\n",
|
||
|
"- `gradient_descent`, utilizing compute_gradient and compute_cost\n",
|
||
|
"\n",
|
||
|
"Conventions:\n",
|
||
|
"- The naming of python variables containing partial derivatives follows this pattern,$\\frac{\\partial J(w,b)}{\\partial b}$ will be `dj_db`.\n",
|
||
|
"- w.r.t is With Respect To, as in partial derivative of $J(wb)$ With Respect To $b$.\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2.3\"></a>\n",
|
||
|
"### compute_gradient\n",
|
||
|
"<a name='ex-01'></a>\n",
|
||
|
"`compute_gradient` implements (4) and (5) above and returns $\\frac{\\partial J(w,b)}{\\partial w}$,$\\frac{\\partial J(w,b)}{\\partial b}$. The embedded comments describe the operations."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 4,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def compute_gradient(x, y, w, b): \n",
|
||
|
" \"\"\"\n",
|
||
|
" Computes the gradient for linear regression \n",
|
||
|
" Args:\n",
|
||
|
" x (ndarray (m,)): Data, m examples \n",
|
||
|
" y (ndarray (m,)): target values\n",
|
||
|
" w,b (scalar) : model parameters \n",
|
||
|
" Returns\n",
|
||
|
" dj_dw (scalar): The gradient of the cost w.r.t. the parameters w\n",
|
||
|
" dj_db (scalar): The gradient of the cost w.r.t. the parameter b \n",
|
||
|
" \"\"\"\n",
|
||
|
" \n",
|
||
|
" # Number of training examples\n",
|
||
|
" m = x.shape[0] \n",
|
||
|
" dj_dw = 0\n",
|
||
|
" dj_db = 0\n",
|
||
|
" \n",
|
||
|
" for i in range(m): \n",
|
||
|
" f_wb = w * x[i] + b \n",
|
||
|
" dj_dw_i = (f_wb - y[i]) * x[i] \n",
|
||
|
" dj_db_i = f_wb - y[i] \n",
|
||
|
" dj_db += dj_db_i\n",
|
||
|
" dj_dw += dj_dw_i \n",
|
||
|
" dj_dw = dj_dw / m \n",
|
||
|
" dj_db = dj_db / m \n",
|
||
|
" \n",
|
||
|
" return dj_dw, dj_db"
|
||
|
]
|
||
|
},
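{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the cell below (an added sketch, not part of the original lab) computes the same gradients without a loop, using NumPy broadcasting; it should return the same values as `compute_gradient` above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Added vectorized sketch of equations (4) and (5); an alternative, not the lab's implementation\n",
"def compute_gradient_vectorized(x, y, w, b):\n",
"    err = (w * x + b) - y         # prediction error for all m examples, shape (m,)\n",
"    dj_dw = np.mean(err * x)      # equation (4)\n",
"    dj_db = np.mean(err)          # equation (5)\n",
"    return dj_dw, dj_db\n",
"\n",
"# should match compute_gradient(x_train, y_train, 0, 0)\n",
"compute_gradient_vectorized(x_train, y_train, 0, 0)"
]
},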
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<br/>"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<img align=\"left\" src=\"./images/C1_W1_Lab03_lecture_slopes.PNG\" style=\"width:340px;\" > The lectures described how gradient descent utilizes the partial derivative of the cost with respect to a parameter at a point to update that parameter. \n",
|
||
|
"Let's use our `compute_gradient` function to find and plot some partial derivatives of our cost function relative to one of the parameters, $w_0$.\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 5,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtcAAAERCAYAAACw14tpAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOzdZ2AU1drA8f+W9N4bKQQILQGllwRCEwSuUhUEBAtIUa+8iqIIiKiIiBevomABUQGVcgWVovQO0gIhdEhCGgnpPVvO+yESE0wDMruU8/sCOzM7zzPJ7pknM2fOUQkhBJIkSZIkSZIk3Ta1uROQJEmSJEmSpHuFLK4lSZIkSZIkqY7I4lqSJEmSJEmS6ogsriVJkiRJkiSpjsjiWpIkSZIkSZLqiCyuJUmSJEmSJKmOyOLazL744gtCQ0Oxs7MjMDCQZ599lqSkpFveX1BQEAcOHKjDDM0jPj4ed3f3stc3Htc333xDnz59zJFaBSkpKahUqrLXzZs3588//7ytfcbGxmJtba3Y9rdCpVKRkpJS5frIyEh++OGHW9r3rl276NKlC3Z2dpX+Tg8ePEhoaCi2trb07t2b1NTUsnUFBQUMHz4ce3t7goOD+fXXX28pB0m6E3z11Ve0bNkSW1tbfH196dOnD1u3bq2Tfd8JbVV5NbUppnLjucYUdu/eTevWrU0a80a1PYea4vxyL5LFtRm98847zJo1i/nz55ORkUF0dDQtW7Zkz5495k7N7AICArh27ZrJ4+p0utt6/6lTp2jbtm0dZXN/sLW1Zfz48bz++uv/WFdUVMSgQYOYOnUq6enp1K9fnwkTJpStf/PNNykoKCAlJYXFixczcuRIrl69asr0JalOzJ49m7feeov333+f9PR04uLiePnll9m0aVOl28u2qm6Y41wTERHBkSNH6nSft/t5kOqYkMwiMzNT2NjYiHXr1lW5TVRUlOjYsaNwcnISHTt2FFFRUUIIIQwGg5g4caJwc3MTjo6Oon379qKkpEQ888wzQqVSCRsbG2FnZydWr15dYX+7du0SjRo1qrDslVdeES+//LIQQohZs2YJLy8v4eDgIFq2bCmSkpL+kdOwYcPEkiVLhBBCbN68WQDixIkTQggh3n//ffHiiy9We9x5eXnCzs5OFBYWCiGEGD58uGjVqlXZ+iZNmoijR4+Ky5cvCysrKyGEqPS4li5dKnr16iXGjx8vHBwcRLNmzcSxY8eqjDtz5kzh7u4uGjVqJObNmycCAwOFEKIszqeffip8fHzEs88+K65duyZ69+4t3NzchIeHh5g4caIoKSkp29fixYuFj4+P8PPzEx9//LEo/zUKDAwU+/fvF0IIUVBQICZNmlS27fvvv1+23ejRo8XkyZNF9+7dhb29vejdu7fIyMgQQgjRqFEjAQg7OzthZ2cnrl69Knbv3i3CwsIqPbbrx/Dxxx8LNzc3ERQUVOXnKiUlRfTq1Us4OjoKV1dXMWnSpLJ1O3bsEK1atRJOTk6ia9eu4sKFC0IIIXr16iUAYWtrK+zs7MShQ4cq7PPtt98WarVaWFlZCTs7OzF//nwhhBA//vijaNSokXB1dRVDhw4V6enpVf5+hBBi6dKlonfv3hWWbdy4UTRt2rTsdWJiorC0tBS5ublCCCG8vLzEwYMHy9b37t1bLFq0qNo4knSnycjIENbW1mL9+vVVbnO3tFXlVXWuEkIIQCxevFgEBgYKNzc3MXfu3LL3FRQUiHHjxgkPDw8RFBQkFixYIISo/flj+/btonHjxuKtt94SLi4uIigoSPzxxx/V/lyvqy6vG+3fv180a9ZMODo6ildffVU0btxYbN++XQghRNeuXcXKlSvLtp05c6Z47rnnhBCiLD8hhHj66afFu+++W+Fn5uXlJU6ePCmEqLpdruzzcKOuXbuK6dOni9DQUOHi4iImTJggdDqdEOKf7W1V7XVNv2OpcrK4NpMNGzYIrVYr9Hp9peuLi4tFYGCgWLx4sSgpKRGfffaZCAwMFMXFxWLjxo2iTZs2IicnR+j1erF///6y/ZRvMG9kNBqFr69vhSI0KChIHDx4UJw+fVr4+/uLq1evCoPBIKKiokRWVtY/9vHpp5+Kp556SgghxPTp00VQUJBYuHChEEKI/v37i1WrVtV47G3atBE7d+4UQghRv359ERQUJHJyckRaWppwcnISer3+Hw3ejce1dOlSodVqxapVq4RerxfTpk0T3bt3rzTe+vXrRVBQkLh8+bJIS0sT7du3r1Bcq1QqMX78eFFYWCgKCgpEamqq+OWXX0RRUZFISEgQoaGh4vPPPxdCCHHixAnh5OQkjhw5InJzc0Xfvn2rPGFNnDhRjBw5UuTm5oqkpCTRrFmzspPn6NGjha+vrzh58qQoLCwU3bt3F2+//XZZTuWPvSbXj2HcuHGisLBQbNy4Udjb24vU1NR/bPvaa6+JiRMnCp1OJwoKCsSBAweEEELEx8cLDw8PsXv3bqHX68Unn3wiWrduLYxGoxCi9ISTnJxcZQ43nkhiYmKEs7OzOHjwoCgoKBAjR44Uw4cPr/Y4Kiuu58+fL4YOHVphmbu7uzh69KhIT08XgMjPzy9b98orr9T4B54k3WlqOh8IcXe2VdWdqwAxfPhwkZeXJ06ePClsbGxEbGysEEKIqVOnil69eons7Gxx9uxZ4evrKzZv3iyEqN35Y/v27UKj0Yj58+cLnU4nvvjiCxEcHFzlz/XG4rqqvMorLi4Wvr6+YsmSJaK4uFjMmDFDaDSamy6uN23aJFq2bFm23bZt28ouKFTXLlf2ebhR165dRXBwsLh48aJITU0VLVu2FJ9++qkQomJ7W117fbPnI6mU7BZiJunp6bi7u6PRaCpdf+DAAbRaLePGjcPCwoIJEyag0Wg4cOAAFhYW5Obmcu7cOdRqNR06dKhyP+WpVCqGDBnCTz/9BJT2ZQVo164dWq2W4uJiTp8+jRCCFi1a4OTk9I99hIeHs3v3bgD27NnDlClT2L17N0II9u7dS3h4eI15XN/HlStXcHV1pVu3buzbt489e/bU+lgAwsLCGDJkCBqNhieeeIKoqKhKt1uzZg1jx44lKCgId3d3XnzxxQrrhRDMmjULa2trbGxs8PDwoH///lhZWeHn58ezzz5b1lVnzZo1DB06lFatWmFvb8/UqVMrjSmEYOnSpcyfPx97e3t8fHyYMGECq1evLtvm8ccfJzQ0FGtrawYPHlxl/rUhhGDGjBlYW1vTp08f2rVrx4YNG/6xnYWFBcnJySQmJmJjY0P79u0BWL58OUOGDCE8PByNRsPzzz9PXFwcsbGxt5TPqlWrGDx4MO3atcPGxoZ3332X1atXYzAYbmo/+fn5ODo6Vljm6OhIXl4e+fn5aDQabG1t/7FOku4mlZ0PvL29cXJywtnZuWzZ3dZW1XSumjp1KnZ2doSGhhIaGsrJkycB+PHHH5kxYwaOjo6EhITw3HPPlT3TUdvzh5OTE5MnT0ar1TJixAguXbpU67ahqrzK27dvH
7a2tjz11FNYWlryxhtvYGVlVav9l9ejRw8SEhI4d+4cAD/99BOPPfYYUHO7fOPnoTJPP/00wcHBeHh48PLLL1f4vV5XV+219DdZXJuJm5sb165dq/LDm5SUhL+/f4VlAQEBJCcn06NHD8aOHcvo0aPx8/PjrbfeqnXcxx9/nFWrVgEVv8QNGzbkgw8+YMqUKXh5efHiiy9SXFz8j/eHhYWRlpbGlStXOH/+PGPGjOHQoUNER0fj5uaGt7d3jTlcbxx3795NeHh4hdcRERG1PhZPT8+y/9va2lbZcKakpFCvXr2y135+fhXWW1paVthXbm4uTz75JH5+fjg6OvL666+Tnp5etq/yv5eAgIBKY6alpVFYWEhISAjOzs44OzvzxhtvVHgYr7b514ZarcbX17dCXsnJyf/YbsqUKfj5+dGpUydCQ0NZs2YNUPpQz9KlS8tydXZ2Jj8//5Yfrr3x8+vv74/BYLjpvo12dnbk5uZWWJaTk4O9vT12dnYYDAYKCwv/sU6S7iaVnQ9SUlKIioq
|
||
|
"text/plain": [
|
||
|
"<Figure size 864x288 with 2 Axes>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"plt_gradients(x_train,y_train, compute_cost, compute_gradient)\n",
|
||
|
"plt.show()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Above, the left plot shows $\\frac{\\partial J(w,b)}{\\partial w}$ or the slope of the cost curve relative to $w$ at three points. On the right side of the plot, the derivative is positive, while on the left it is negative. Due to the 'bowl shape', the derivatives will always lead gradient descent toward the bottom where the gradient is zero.\n",
|
||
|
" \n",
|
||
|
"The left plot has fixed $b=100$. Gradient descent will utilize both $\\frac{\\partial J(w,b)}{\\partial w}$ and $\\frac{\\partial J(w,b)}{\\partial b}$ to update parameters. The 'quiver plot' on the right provides a means of viewing the gradient of both parameters. The arrow sizes reflect the magnitude of the gradient at that point. The direction and slope of the arrow reflects the ratio of $\\frac{\\partial J(w,b)}{\\partial w}$ and $\\frac{\\partial J(w,b)}{\\partial b}$ at that point.\n",
|
||
|
"Note that the gradient points *away* from the minimum. Review equation (3) above. The scaled gradient is *subtracted* from the current value of $w$ or $b$. This moves the parameter in a direction that will reduce cost."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2.5\"></a>\n",
|
||
|
"### Gradient Descent\n",
|
||
|
"Now that gradients can be computed, gradient descent, described in equation (3) above can be implemented below in `gradient_descent`. The details of the implementation are described in the comments. Below, you will utilize this function to find optimal values of $w$ and $b$ on the training data."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 6,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def gradient_descent(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function): \n",
|
||
|
" \"\"\"\n",
|
||
|
" Performs gradient descent to fit w,b. Updates w,b by taking \n",
|
||
|
" num_iters gradient steps with learning rate alpha\n",
|
||
|
" \n",
|
||
|
" Args:\n",
|
||
|
" x (ndarray (m,)) : Data, m examples \n",
|
||
|
" y (ndarray (m,)) : target values\n",
|
||
|
" w_in,b_in (scalar): initial values of model parameters \n",
|
||
|
" alpha (float): Learning rate\n",
|
||
|
" num_iters (int): number of iterations to run gradient descent\n",
|
||
|
" cost_function: function to call to produce cost\n",
|
||
|
" gradient_function: function to call to produce gradient\n",
|
||
|
" \n",
|
||
|
" Returns:\n",
|
||
|
" w (scalar): Updated value of parameter after running gradient descent\n",
|
||
|
" b (scalar): Updated value of parameter after running gradient descent\n",
|
||
|
" J_history (List): History of cost values\n",
|
||
|
" p_history (list): History of parameters [w,b] \n",
|
||
|
" \"\"\"\n",
|
||
|
" \n",
|
||
|
" # An array to store cost J and w's at each iteration primarily for graphing later\n",
|
||
|
" J_history = []\n",
|
||
|
" p_history = []\n",
|
||
|
" b = b_in\n",
|
||
|
" w = w_in\n",
|
||
|
" \n",
|
||
|
" for i in range(num_iters):\n",
|
||
|
" # Calculate the gradient and update the parameters using gradient_function\n",
|
||
|
" dj_dw, dj_db = gradient_function(x, y, w , b) \n",
|
||
|
"\n",
|
||
|
" # Update Parameters using equation (3) above\n",
|
||
|
" b = b - alpha * dj_db \n",
|
||
|
" w = w - alpha * dj_dw \n",
|
||
|
"\n",
|
||
|
" # Save cost J at each iteration\n",
|
||
|
" if i<100000: # prevent resource exhaustion \n",
|
||
|
" J_history.append( cost_function(x, y, w , b))\n",
|
||
|
" p_history.append([w,b])\n",
|
||
|
" # Print cost every at intervals 10 times or as many iterations if < 10\n",
|
||
|
" if i% math.ceil(num_iters/10) == 0:\n",
|
||
|
" print(f\"Iteration {i:4}: Cost {J_history[-1]:0.2e} \",\n",
|
||
|
" f\"dj_dw: {dj_dw: 0.3e}, dj_db: {dj_db: 0.3e} \",\n",
|
||
|
" f\"w: {w: 0.3e}, b:{b: 0.5e}\")\n",
|
||
|
" \n",
|
||
|
" return w, b, J_history, p_history #return w and J,w history for graphing"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 7,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"Iteration 0: Cost 7.93e+04 dj_dw: -6.500e+02, dj_db: -4.000e+02 w: 6.500e+00, b: 4.00000e+00\n",
|
||
|
"Iteration 1000: Cost 3.41e+00 dj_dw: -3.712e-01, dj_db: 6.007e-01 w: 1.949e+02, b: 1.08228e+02\n",
|
||
|
"Iteration 2000: Cost 7.93e-01 dj_dw: -1.789e-01, dj_db: 2.895e-01 w: 1.975e+02, b: 1.03966e+02\n",
|
||
|
"Iteration 3000: Cost 1.84e-01 dj_dw: -8.625e-02, dj_db: 1.396e-01 w: 1.988e+02, b: 1.01912e+02\n",
|
||
|
"Iteration 4000: Cost 4.28e-02 dj_dw: -4.158e-02, dj_db: 6.727e-02 w: 1.994e+02, b: 1.00922e+02\n",
|
||
|
"Iteration 5000: Cost 9.95e-03 dj_dw: -2.004e-02, dj_db: 3.243e-02 w: 1.997e+02, b: 1.00444e+02\n",
|
||
|
"Iteration 6000: Cost 2.31e-03 dj_dw: -9.660e-03, dj_db: 1.563e-02 w: 1.999e+02, b: 1.00214e+02\n",
|
||
|
"Iteration 7000: Cost 5.37e-04 dj_dw: -4.657e-03, dj_db: 7.535e-03 w: 1.999e+02, b: 1.00103e+02\n",
|
||
|
"Iteration 8000: Cost 1.25e-04 dj_dw: -2.245e-03, dj_db: 3.632e-03 w: 2.000e+02, b: 1.00050e+02\n",
|
||
|
"Iteration 9000: Cost 2.90e-05 dj_dw: -1.082e-03, dj_db: 1.751e-03 w: 2.000e+02, b: 1.00024e+02\n",
|
||
|
"(w,b) found by gradient descent: (199.9929,100.0116)\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"# initialize parameters\n",
|
||
|
"w_init = 0\n",
|
||
|
"b_init = 0\n",
|
||
|
"# some gradient descent settings\n",
|
||
|
"iterations = 10000\n",
|
||
|
"tmp_alpha = 1.0e-2\n",
|
||
|
"# run gradient descent\n",
|
||
|
"w_final, b_final, J_hist, p_hist = gradient_descent(x_train ,y_train, w_init, b_init, tmp_alpha, \n",
|
||
|
" iterations, compute_cost, compute_gradient)\n",
|
||
|
"print(f\"(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})\")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<img align=\"left\" src=\"./images/C1_W1_Lab03_lecture_learningrate.PNG\" style=\"width:340px; padding: 15px; \" > \n",
|
||
|
"Take a moment and note some characteristics of the gradient descent process printed above. \n",
|
||
|
"\n",
|
||
|
"- The cost starts large and rapidly declines as described in the slide from the lecture.\n",
|
||
|
"- The partial derivatives, `dj_dw`, and `dj_db` also get smaller, rapidly at first and then more slowly. As shown in the diagram from the lecture, as the process nears the 'bottom of the bowl' progress is slower due to the smaller value of the derivative at that point.\n",
|
||
|
"- progress slows though the learning rate, alpha, remains fixed"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"### Cost versus iterations of gradient descent \n",
|
||
|
"A plot of cost versus iterations is a useful measure of progress in gradient descent. Cost should always decrease in successful runs. The change in cost is so rapid initially, it is useful to plot the initial decent on a different scale than the final descent. In the plots below, note the scale of cost on the axes and the iteration step."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 8,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA2gAAAEoCAYAAAAt0dJ4AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOzdeXhU5d3/8Xc2spFkEnZiggsgIglWxZBNAibiwlIVKzUi9an4qBVFLfJ7irEgtKRS7WIpYLWKFlDcQKxFDDtKCCAQIyoqIsYgJCYhAUKWyfn9MWWSMwl7Jmcm+byuqxdzf88snwQv7n7n3Oc+PoZhGIiIiIiIiIjlfK0OICIiIiIiIg5q0ERERERERDyEGjQREREREREPoQZNRERERETEQ6hBExERERER8RBq0ERERERERDyEGjTxes899xwDBgwgNDSUXr16cffdd1NUVHTW73f++eeTm5vbgglPbuHChdx0000A7N27l6CgILd91rRp07j33ntNtUsvvZQtW7ac83sfPnyY/v37c/To0ZM+Ly0tjVdfffWsP2ft2rX069fPVPvFL37B8uXLz/o9RURam+au0+fOucvVq6++yv3339/i73vvvfcybdo0AF566SUeffTRFv8MaTvUoIlXmzlzJtOnT+fpp5+mtLSUgoICBg4cyMaNG62OdtoyMzN5++23W+S96urqzvg1n376KYMGDTrnz54zZw633HILISEh5/xeJ1JbW9ts/dFHH+XJJ5902+eKiLQkzV1mVs5drmbMmMEjjzzS4u/bWGZmJm+99RYlJSVu/RzxYoaIlyorKzOCg4ONZcuWnfA5O3fuNBITE42IiAgjMTHR2Llzp2EYhmG3243777/f6NSpkxEeHm4kJCQYNTU1xi9/+UvDx8fHCA4ONkJDQ4033njD9H7r1683+vTpY6r9+te/Nh599FHDMAxj+vTpRrdu3YywsDBj4MCBRlFR0Sl/jhdffNEYPny4YRiG0adPHwMwQkNDjdDQUOPAgQNGXV2d8cQTTxixsbFGt27djEceecSora01DMMwfvvb3xq333678dOf/tQICQkxNmzYYCxbtswYMGCA0bFjR6Nv377G22+/bRiGYaxZs8YICAgw/P39jdDQUOOnP/2pYRiG0atXL2PTpk2GYRjGjz/+aNx6661GVFSU0adPH+O1115z5hwyZIgxffp044orrjDCw8ONsWPHGtXV1c7jffr0MT799FPnuLnfxZNPPmn4+voagYGBRmhoqPH0008bdrvduOmmm4wuXboYkZGRxtixY43y8nLDMAzjm2++MQIDA42//e1vRo8ePYy7777bCAoKMnx8fIzQ0FCjU6dOzs+75JJLnH+/IiKeSnOXZ81djX344YfGoEGDTLW1a9cal19+uREREWEMGTLE+OqrrwzDaJifnnvuOaN79+5G9+7djVdeecX5uh9++MHIyMgwOnbsaNx4443G7bffbvz2t791Hr/vvvuMv/zlL6f8PUv7pAZNvNZ7771n+Pv7G3V1dc0er66uNnr16mXMnz/fqKmpMf7+978bvXr1Mqqrq43//Oc/xpVXXmlUVFQYdXV1xqZNm5zv0/gffVf19fVGz549je3btztr559/vrF582bjs88+M2JiYowDBw4Ydrvd2Llzp7PROJnGk9zxf/Abe+qpp4z09HSjuLjYKC8vN9LS0oy//vWvhmE4JrnAwEBj5cqVht1uN44dO2asWbPG+Pzzzw273W4sXbrU6Nixo1FcXOx8/v/+7/+a3r/xzzt27Fhj/PjxRlVVlZGbm2uEh4cbu3btMgzDMckNGDDA+Pbbb42ysjKjf//+xssvv2wYhmHs2bPHiIqKcr7nyX4XQ4YMMRYvXux8rt1uN1555RXjyJEjRmlpqTF06FBjypQpzt+Hj4+Pce+99xpVVVXG0aNHjTVr1hgXX3xxk9/jPffcYzz11FOn/H2LiFhJc5fnzF2unnjiCeORRx5xjvft22d06dLF2LBhg1FXV2c8++yzxhVXXGHU19c756dHH33UqK6uNt5//30jLCzMOHLkiGEYhnHLLbcYd999t1FVVWX85z//MTp06GBq0BYtWmTccMMNp/w9S/ukJY7itX788Uc6d+6Mn59fs8dzc3Px9/fnnnvuISAggPvuuw8/Pz9yc3MJCAigsrKS3bt34+vry+DBg0/4Po35+PgwZswYlixZAsDmzZsBuOqqq/D396e6uprPPvsMwzCIj48nIiLinH/OF154gd/97nd07tyZiIgIHn30Ud544w3n8WHDhpGRkYGvry+BgYGkpaVx8cUX4+vry+jRo+nduzc7duw45efY7XbefPNNZs6cSVBQEAkJCdxyyy28/vrrzudMmDCB2NhYbDYbN954Izt37gQcS0369OnjfN6Z/C58fX254447CAkJITIykoceesi0zMcwDKZPn05QUBDBwcEnzN+vXz/y8/NP+XOKiFhJc5eDJ8xdrlznsoULFzJmzBhSUlLw8/PjgQce4Ntvv2Xv3r2AY37KysqiQ4cOXHvttQQFBbFnzx7q6upYtmwZ06ZNIygoiOuuu46UlBTTZ2nOkpNRgyZeq1OnTpSUlGC325s9XlRURExMjKkWGxvL/v37ueaaa5gwYQLjx48nOjraeeHu6bjtttuc//AvWbKEn/3sZwD07t2bp556ismTJ9OtWzcefPBBqqurz+6Ha2Tfvn1kZGRgs9mw2WxkZmZSXFzsPO76M27cuJGkpCSioqKw2Wx88skn/Pjjj6f8nOLiYux2O9HR0c5ar1692L9/v3PctWtX5+OQkBAOHz4MwKFDh+jYsaPz2Jn8Lurq6pg0aRK9evUiPDyczMxMU94OHTqYPvdEwsLCOHTo0CmfJyJiJc1dDp4wd7lyncv27dvHiy++6PwZbDYbR44ccW7mEhgYaGpmj793SUkJ9fX19OzZ03ksNjbW9Fmas+Rk1KCJ10pMTMTf35/33nuv2eM9e/aksLDQVNu3bx89evQAHBtLFBQUsH79el544QU++OADwPFN46k+t7q6mo8//pg33njDOckBjB8/nry8PD755BM2btzIyy+/fEY/U3OfHR0dzfr16ykvL6e8vJxDhw6xa9euE75m3Lhx/PKXv+TAgQOUl5cTFxeHYRin/Nm6dOmCr68v33//vbPW+Pd1MhEREU0mvBP9LlwzLFy4kA0bNrBp0yYqKipYuHChM29zzz/Rz1BZWdki3/qKiLiT5q7mX2PF3OXKdS6Ljo7m3nvvdf4M5eXlHD16lOTk5JO+T+fOnfHx8TE1id99953pOZqz5GTUoInXstlsTJ06lfvvv58PPviAmpoajhw5wty5c1myZAmDBw+mpqaG559/nrq6OubPn09dXR2DBw9m69atbNu2DbvdTnh4OP7+/s5lIl27dmXPnj0n/NzjS0UmT55MQEAAV1xxBQBffPEF69evp6amho4dOxIYGHhaS08a69y5MzU1Naatlu+66y7+7//+j/3792MYBnv37mXdu
nUnfI/Kykrn8pnFixdTUFDgPNa1a1e++eYbUwN0nJ+fHzfffDNPPPEEx44dIy8vjzfffJMxY8acMndcXBxfffWVc3yy34Xr77eyspKgoCAiIyM5ePAgzzzzzEk/q2vXrhw4cKDJdv5ffPEFcXFxp8wqImIlzV3Ns2LuchUXF8eXX37pHP/85z/ntddeY/369dTX11NZWWlapnki/v7+jBo1iunTp1NdXc0HH3zAhg0bTM/RnCUnowZNvNrjjz/O448/zqRJk4iMjOSSSy5h+/btpKam0qFDB5YtW8bzzz9Pp06dePHFF1m2bBkdOnTg0KFDjB8/noiICOLj4/n5z3/OsGHDAPj1r3/N5MmTsdlsvPXWW81+7m233cbq1au59dZbnbXq6moeeeQROnXqxAUXXMCAAQMYN24c4Lj/ies9XJoTGhrK5MmTiY+Px2azcfDgQR577DESExNJTk4mIiKCkSNHNvkmrrFnn32W++67j06dOvHhhx+SmJjoPHbzzTdTXl5OZGRks5PXnDlzqKi
|
||
|
"text/plain": [
|
||
|
"<Figure size 864x288 with 2 Axes>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"# plot cost versus iteration \n",
|
||
|
"fig, (ax1, ax2) = plt.subplots(1, 2, constrained_layout=True, figsize=(12,4))\n",
|
||
|
"ax1.plot(J_hist[:100])\n",
|
||
|
"ax2.plot(1000 + np.arange(len(J_hist[1000:])), J_hist[1000:])\n",
|
||
|
"ax1.set_title(\"Cost vs. iteration(start)\"); ax2.set_title(\"Cost vs. iteration (end)\")\n",
|
||
|
"ax1.set_ylabel('Cost') ; ax2.set_ylabel('Cost') \n",
|
||
|
"ax1.set_xlabel('iteration step') ; ax2.set_xlabel('iteration step') \n",
|
||
|
"plt.show()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"### Predictions\n",
|
||
|
"Now that you have discovered the optimal values for the parameters $w$ and $b$, you can now use the model to predict housing values based on our learned parameters. As expected, the predicted values are nearly the same as the training values for the same housing. Further, the value not in the prediction is in line with the expected value."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 9,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"1000 sqft house prediction 300.0 Thousand dollars\n",
|
||
|
"1200 sqft house prediction 340.0 Thousand dollars\n",
|
||
|
"2000 sqft house prediction 500.0 Thousand dollars\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"print(f\"1000 sqft house prediction {w_final*1.0 + b_final:0.1f} Thousand dollars\")\n",
|
||
|
"print(f\"1200 sqft house prediction {w_final*1.2 + b_final:0.1f} Thousand dollars\")\n",
|
||
|
"print(f\"2000 sqft house prediction {w_final*2.0 + b_final:0.1f} Thousand dollars\")"
|
||
|
]
|
||
|
},
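{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional cross-check (an added sketch, not part of the original lab), this tiny data set can also be fit directly; `np.polyfit` with degree 1 returns the least-squares slope and intercept, which gradient descent should approach:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Added cross-check sketch using np.polyfit (not used elsewhere in this lab)\n",
"w_exact, b_exact = np.polyfit(x_train, y_train, deg=1)\n",
"print(f\"exact least-squares fit: w = {w_exact:0.1f}, b = {b_exact:0.1f}\")   # expect w = 200.0, b = 100.0"
]
},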
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2.6\"></a>\n",
|
||
|
"## Plotting\n",
|
||
|
"You can show the progress of gradient descent during its execution by plotting the cost over iterations on a contour plot of the cost(w,b). "
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 10,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtMAAAF+CAYAAABAnpacAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOydd3gUVduH7/SQnpAEkpBeCJAQQi9BQJQuSBcbfIqCWLCXV0V8xYIFCyICKvpiQYqCgAhYQHpvIZBGGgkkIb2X3fP9cZZNQq/ZJHvu69oL5szuzDOzu9lnnvmd32MihBAoFAqFQqFQKBSKa8bU0AEoFAqFQqFQKBSNFZVMKxQKhUKhUCgU14lKphUKhUKhUCgUiutEJdMKhUKhUCgUCsV1opJphUKhUCgUCoXiOlHJtEKhUCgUCoVCcZ2oZFqhuAQLFy4kLCwMW1tbfH19mTx5MhkZGTe0zZkzZzJ16tSbFOGN4efnx65du674vEmTJvHee+9d1z4qKysZMWIEjo6OdO3a9bq2cbVs3ryZ0NDQyz7HxMSEM2fO6JeXLl3KtGnTbsk+NRoN7du3JzMz87q3f6307duXpUuX1tv+LsYPP/zAyJEjAUhOTsba2rrO+hv5PF0tt3K/eXl59O7dGwcHB8aOHXvD27sWpk6dysyZM4G65/lGuNbzsnTpUvr27XvD+72ZXOz9VijqE5VMKxQXYdasWbz55pt89NFH5ObmEh0dTUREBNu2bTN0aBdFq9Wi1WoNHcYFHDx4kF27dpGdnc2ePXsMHc4FvPXWWzz77LO3ZNtmZmY88sgjfPjhh7dk+w2V++67j19//dXQYdwyNm7cSElJCfn5+Sxfvvy6t3Oj39mmfp4VisaESqYVivPIz8/nnXfeYf78+QwcOBArKyvs7e158sknGTduHABHjhyhZ8+eODk50bNnT44cOaJ/vYmJCQsXLsTPzw9XV1fef/99QFYx33nnHb7++mvs7Oz0VaUrbat2JbV25XHSpElMnz6dvn37Ymtre0HV/FzV9OWXX8bR0ZF27dpdshKdm5vLuHHjaN68OSEhISxbtgyA7777jh9++IE33ngDOzs7nn766Qteq9FoePXVV2nVqhVeXl68+uqraDQadu7cSd++fcnOzsbFxYVXX331gteWlJQwbdo0vLy8cHZ25oEHHtCv+/TTT/H398fd3Z0pU6ZQVlYGQGxsLD179sTBwQF3d3feffddNBoNgwcPJi4uDjs7O1xdXS/19urZsWMHtra2BAUFATBhwgQWL14MyITJxMSEo0ePAjB79mymT59+yW1d6hyPHTuWJUuWUF1dfcV4+vfvz48//qhfzs/Px87Ojry8vIse86WIiYkhPDwcFxcXpk2bdtF9b9iwgT59+uiXPT09ee655wDIysrC2dmZ8/t5Xe1rvv32WwYNGgTAgAEDqKiowM7ODjs7O7KysvTP79+/P/b29gwaNIi8vLyLHsukSZN46qmn6NWrF46OjowbN47i4mIAcnJyGDRoEK6urri7u/P4449TVVV1U/ZbVlbGlClTcHd3x9/fn08//RSAn3/+mYkTJ3LkyBEcHBxYuHDhBa+Ni4uja9euODg48PDDD3PnnXfy7bff6o/n/O/sokWLCA4Oxt7eng4dOtS5YM/MzGTAgAHY29szbNgwioqK9Otqn2eALVu20KlTJ5ycnOjbty+JiYlATdV20aJFeHh44OHhwffffw9c3fe7urqaxx9/HBcXF8LDwzl27Fid9Zfab2lpKePGjcPZ2RlnZ2fuvvtu/WsOHTpEnz59cHJywsfHR39RUlZWxhNPPIGnpyetWrVi9uzZdT4Lzz777EXfv0u93wpFvSEUCkUdfv/9d2Fubi6qq6svur6iokL4+vqKBQsWiMrKSvHFF18IX19fUVFRIYQQAhATJkwQxcXF4ujRo6JZs2YiOTlZCCHEG2+8IaZMmXJN2zp9+rT++X369BE//fSTEEKIiRMnCldXV7F//35RWVkpqqqq6sT5zz//CDMzMzFz5kxRUVEhFi1aJDw9PfXb9vX1FTt37hRCCHHPPfeIiRMnirKyMrFr1y7h4OAgYmJi9Pt59913L3m+vvzySxERESFOnz4tMjIyRHh4uFiwYIE+htatW1/ytZMnTxZDhw4V2dnZorKyUmzdulUIIcQff/whWrVqJeLi4kR+fr7o37+/eOWVV4QQQowfP168++67QqvVioKCArF///6r2tf553PGjBni2Wef1a/7/PPPxf/93/8JIYR4/fXXhZ+fn5g3b54QQohhw4aJ5cuXX7C9K51jIYQICQkRe/bsuWxcQsjzOGLECP3yN998IwYPHnzZYz6fPn36iICAAJGYmCiysrJERESE+Pzzzy94XmFhobC1tRXl5eUiMTFR+Pn5iS5dugghhFixYoUYOnTodb9m8eLFYuDAgUIIIZKSkoSVlVWd7UycOFF4enqKo0ePirKyMnH77beL//73vxc9nokTJwoXFxexf/9+UVRUJAYMGCCef/55IYQQWVlZYs2aNaK8vFycOnVKhIWFifnz59+U/b788svizjvvFAUFBSI2NlZ4enqKDRs2XHB8F6NTp07irbfeEpWVleK7774T5ubmYvHixfoYzv/Orl27VqSmporq6moxd+5c0apVK/13efTo0WLy5MmirKxMrF+/XlhaWoo33njjgjhSU1OFm5ub2Lp1q347nTp1ElqtViQlJQkTExPx3HPPiYqKCrFhwwZhb28vSkpK9DFd7vs9d+5cERkZKbKysvTve58+fa643/nz54vhw4eLsrIyUVFRIbZv3y6EECI/P1+4ubmJr776SlRWVoqsrCxx9OhRIYQQ06ZNE/fff78oKioSGRkZom3btuK333674vt3sfdboahPVGVaoTiPnJwcXF1dMTMzu+j6Xbt2YW5uzqOPPoqFhQWPPfYYZmZmdSqSL7/8Mra2toSFhREWFqavcF7Pti7H2LFj6dixIxYWFpibm1+w3tramldeeQVLS0smT55Ms2bN2LlzZ53naDQaVq5cyaxZs7C2tqZbt26MHj36qm9h//zzzzz//PO0bNkSDw8Pnn/++avS7Wq1WpYsWcJnn32Gq6srFhYWREVF6bc5ZcoUgoODcXR0ZMaMGfptWlhYkJqaSlZWFg4ODnTs2PGq4jyfY8eOERwcrF+Oiopi69atAGzbto0XXniBrVu3IoRg+/bt+tjO50rnODQ0tM7dhksxevRo/vzzTwoLCwFYtmyZ/k7ItRzzQw89REBAAG5ubjz33HOsWLHigufY29sTHBzMvn372LZtG6NHj6agoICSkhK2bt160WO9ntdcivHjxxMWFoa1tTWjR4/m8OHDl3zuqFGj6NixI3Z2drz22mv643Fzc2PYsGFYWVnh5eXF5MmTryjDutr9/vzzz8yYMQMHBwdCQkKYMmXKVX2mk5OTOXHiBC+99BIWFhY8+OCD+Pv713nO+d/ZoUOH4u3tjZmZGU888QQlJSUkJydTXV3N6tWrm
TlzJtbW1gwaNOiS5/iHH35gzJgxREVF6beTkpJCcnIyAEIIXn/9dSwtLRkwYADW1tacPHnyiscDsGLFCp577jnc3NwICAjgoYceuqr9WlhYkJOTQ3JyMpaWlvTs2ROAtWvXEhoaysMPP4yFhQVubm6EhYUhhGDx4sV89NFH2NnZ4eHhwWOPPVbn83stnxuFoj5RybRCcR7Nmzfn7NmzaDSai67PyMjA29u7zpiPjw+nT5/WL7u7u+v/b2Njo781fT3buhznv/Z83N3dsbS0rPP887ednZ2NRqPBy8tLP+br63vVMZx/DFf72uzsbCorK/Hz87umbc6ePZuSkhLCwsLo1q0bW7Zsuao4z6egoAA7Ozv9cnh4ONnZ2aSlpREfH8+kSZPYs2cP0dHRNG/enJYtW150O1c6x/b29hQUFFwxHldXV3r27Mlvv/1GXl4eW7du1d8av5Zjrn3eLvdZOnfxsG3bNqKioujWrRs7d+5k69at9O7d+6a95mJc7ffjcsdTVFTEgw8+iJeXFw4ODrz
|
||
|
"text/plain": [
|
||
|
"<Figure size 864x432 with 1 Axes>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"fig, ax = plt.subplots(1,1, figsize=(12, 6))\n",
|
||
|
"plt_contour_wgrad(x_train, y_train, p_hist, ax)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Above, the contour plot shows the $cost(w,b)$ over a range of $w$ and $b$. Cost levels are represented by the rings. Overlayed, using red arrows, is the path of gradient descent. Here are some things to note:\n",
|
||
|
"- The path makes steady (monotonic) progress toward its goal.\n",
|
||
|
"- initial steps are much larger than the steps near the goal."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"**Zooming in**, we can see that final steps of gradient descent. Note the distance between steps shrinks as the gradient approaches zero."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 11,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAswAAAERCAYAAABvmfF2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOzdd1yV5fvA8Q8bZSOggIp74kJzopKWO/fMHJUrR5qVDSvtl1lq2nCv0K+jXKlZmiMnmlFmubc4QAHZIPs8vz9uOOwN4rjer9d5yXnOc+7nPoy6zn2u+7oMNE3TEEIIIYQQQmTLsLQnIIQQQgghxONMAmYhhBBCCCFyIQGzEEIIIYQQuZCAWQghhBBCiFxIwCyEEEIIIUQuJGAWQgghhBAiFxIwi2faihUrcHd3x8LCAjc3N0aNGkVAQECRxpw5cybjxo0rphkWTZUqVTh58mSe540cOZIvv/yyUNdISEigV69e2NjY0Lx580KNkV+HDx+mTp06uZ5jYGDA/fv39fd//PFHxo8fXyLXTE5OpmHDhgQGBhZ6/ILy8vLixx9/fGTXy86GDRvo06cPAH5+fpibm2d4vCi/T/lVktcNCwujbdu2WFtbM2DAgCKPVxDjxo1j5syZQMbvc1EU9Pvy448/4uXlVeTrFqfsft5CPEoSMItn1qxZs/j000+ZP38+oaGhnDt3jkaNGuHj41PaU8uWTqdDp9OV9jSyOH36NCdPniQ4OBhfX9/Snk4Wn332GVOnTi2RsY2MjBg9ejRfffVViYz/uBo6dCjbt28v7WmUmH379hETE0N4eDhbtmwp9DhF/Zt92r/PQjxJJGAWz6Tw8HBmz57N0qVL6dy5M2ZmZlhZWTFp0iQGDhwIwJkzZ2jdujW2tra0bt2aM2fO6J9vYGDAihUrqFKlCg4ODsydOxdQq5GzZ89m9erVWFpa6leH8hor/Ypo+hXEkSNHMnnyZLy8vLCwsMiy+p26+vn+++9jY2ND/fr1c1xRDg0NZeDAgZQrV45atWqxefNmANauXcuGDRuYMWMGlpaWTJkyJctzk5OTmT59OhUrVsTV1ZXp06eTnJzMH3/8gZeXF8HBwdjb2zN9+vQsz42JiWH8+PG4urpiZ2fHsGHD9I99++23VK1aFScnJ8aOHUtsbCwAly9fpnXr1lhbW+Pk5MQXX3xBcnIyXbt25cqVK1haWuLg4JDTj1fvxIkTWFhYUKNGDQCGDBmCt7c3oIIiAwMDzp49C8CcOXOYPHlyjmPl9D0eMGAA69atIykpKc/5dOzYkY0bN+rvh4eHY2lpSVhYWLavOScXLlygQYMG2NvbM378+GyvvXfvXtq3b6+/7+Liwttvvw1AUFAQdnZ2ZO5bld/nrFmzhi5dugDQqVMn4uPjsbS0xNLSkqCgIP35HTt2xMrKii5duhAWFpbtaxk5ciRvvvkmbdq0wcbGhoEDBxIdHQ1ASEgIXbp0wcHBAScnJyZMmEBiYmKxXDc2NpaxY8fi5ORE1apV+fbbbwHYtGkTI0aM4MyZM1hbW7NixYosz71y5QrNmzfH2tqa119/nRdffJE1a9boX0/mv9mVK1dSs2ZNrKysaNy4cYY35YGBgXTq1AkrKyt69OhBVFSU/rH032eAI0eO0LRpU2xtbfHy8uL69etA2urrypUrcXZ2xtnZmfXr1wP5+/tOSkpiwoQJ2Nvb06BBA86fP5/h8Zyu+/DhQwYOHIidnR12dnb07t1b/5x///2X9u3bY2trS+XKlfVvPGJjY5k4cSIuLi5UrFiROXPmZPhdmDp1arY/v5x+3kI8MpoQz6Ddu3drxsbGWlJSUraPx8fHa25ubtry5cu1hIQEbcmSJZqbm5sWHx+vaZqmAdqQIUO06Oho7ezZs1qZMmU0Pz8/TdM0bcaMGdrYsWMLNNa9e/f057dv31774YcfNE3TtBEjRmgODg7aqVOntISEBC0xMTHDPA8dOqQZGRlpM2fO1OLj47WVK1dqLi4u+rHd3Ny0P/74Q9M0TRs8eLA2YsQILTY2Vjt58qRmbW2tXbhwQX+dL774Isfv17Jly7RGjRpp9+7d0wICArQGDRpoy5cv18+hdu3aOT531KhRWvfu3bXg4GAtISFBO3bsmKZpmvbbb79pFStW1K5cuaKFh4drHTt21D744ANN0zRt0KBB2hdffKHpdDotIiJCO3XqVL6ulfn7+cknn2hTp07VP7Zo0SLt1Vdf1TRN0z7++GOtSpUq2uLFizVN07QePXpoW7ZsyTJeXt9jTdO0WrVqab6+vrnOS9PU97FXr176+99//73WtWvXXF9zZu3bt9eqVaumXb9+XQsKCtIaNWqkLVq0KMt5kZGRmoWFhRYXF6ddv35dq1Klivbcc89pmqZpW7du1bp3717o53h7e2udO3fWNE3Tbt68qZmZmWUYZ8SIEZqLi4t29uxZLTY2VuvQoYP2f//3f9m+nhEjRmj29vbaqVOntKioKK1Tp07aO++8o2mapgUFBWm7du3S4uLitLt372ru7u7a0qVLi+W677//vvbiiy9qERER2uXLlzUXFxdt7969WV5fdpo2bap99tlnWkJCgrZ27VrN2NhY8/b21s8h89/sL7/8ot2+fVtLSkrSFi5cqFWsWFH/t9yvXz9t1KhRWmxsrLZnzx7N1NRUmzFjRpZ53L59W3N0dNSOHTumH6dp06aaTqfTbt68qRkYGGhvv/22Fh8fr+3du1ezsrLSYmJi9HPK7e974cKFWpMmTbSgoCD9z719+/Z5Xnfp0qVaz549tdjYWC0+Pl47fvy4pmmaFh4erjk6OmqrVq3SEhIStKCgIO3s2bOapmna+PHjtVdeeUWLiorSAgICtHr16mk///xznj+/7H7eQjxKssIsnkkhISE4ODhgZGSU7eMnT57E2NiYMWPGYGJiwhtvvIGRkVGGlcX3338fCwsL3N3dcXd3169UFmas3AwYMAAPDw9MTEwwNjbO8ri5uTkffPABpqamjBo1ijJlyvDHH39kOCc5OZlt27Yxa9YszM3NadGiBf369cv3x82bNm3inXfeoUKFCjg7O/POO+/kK49Wp9Oxbt06vvvuOxwcHDAxMcHT01M/5tixY6lZsyY2NjZ88skn+jFNTEy4ffs2QUFBWFtb4+Hhka95Znb+/Hlq1qypv+/p6cmxY8cA8PHx4d133+XYsWNomsbx48f1c8ssr+9xnTp1MnxqkJN+/fpx4MABIiMjAdi8ebP+E42CvObXXnuNatWq4ejoyNtvv83WrVuznGNlZUXNmjX5+++/8fHxoV+/fkRERBATE8OxY8eyfa2FeU5OBg0ahLu7O+bm5vTr14///vsvx3P79u2Lh4cHlpaWfPTRR/rX4+joSI8ePTAzM8PV1ZVRo0blmTKV3+tu2rSJTz75BGtra2rVqsXYsWPz9Tvt5+fHpUuXeO+99zAxMWH48OFUrVo1wzmZ/2a7d+9OpUqVMDIyYuLEicTExODn50dSUhI7d+5k5syZmJub06VLlxy/x
xs2bKB///54enrqx7l16xZ+fn4AaJrGxx9/jKmpKZ06dcLc3JwbN27k+XoAtm7dyttvv42joyPVqlXjtddey9d1TUxMCAkJwc/PD1NTU1q3bg3AL7/8Qp06dXj99dcxMTHB0dERd3d3NE3D29ub+fPnY2lpibOzM2+88UaG39+C/N4I8ShJwCyeSeXKlePBgwckJydn+3hAQACVKlXKcKxy5crcu3dPf9/JyUn/ddmyZfUfIxdmrNxkfm5mTk5OmJqaZjg/89jBwcEkJyfj6uqqP+bm5pbvOWR+Dfl9bnBwMAkJCVSpUqVAY86ZM4eYmBjc3d1p0aIFR44cydc8M4uIiMDS0lJ/v0GDBgQHB3Pnzh2uXr3KyJEj8fX15dy5c5QrV44KFSpkO05e32MrKysiIiLynI+DgwOtW7fm559/JiwsjGPHjuk/xi7Ia07/fcvtdyn1DYKPjw+enp60aNGCP/74g2PHjtG2bdtie0528vv3kdvriYqKYvjw4bi6umJtbc0HH3xASEhIsVy3sL/
|
||
|
"text/plain": [
|
||
|
"<Figure size 864x288 with 1 Axes>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"fig, ax = plt.subplots(1,1, figsize=(12, 4))\n",
|
||
|
"plt_contour_wgrad(x_train, y_train, p_hist, ax, w_range=[180, 220, 0.5], b_range=[80, 120, 0.5],\n",
|
||
|
" contours=[1,5,10,20],resolution=0.5)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"<a name=\"toc_40291_2.7.1\"></a>\n",
|
||
|
"### Increased Learning Rate\n",
|
||
|
"\n",
|
||
|
"<figure>\n",
|
||
|
" <img align=\"left\", src=\"./images/C1_W1_Lab03_alpha_too_big.PNG\" style=\"width:340px;height:240px;\" >\n",
|
||
|
"</figure>\n",
|
||
|
"In the lecture, there was a discussion related to the proper value of the learning rate, $\\alpha$ in equation(3). The larger $\\alpha$ is, the faster gradient descent will converge to a solution. But, if it is too large, gradient descent will diverge. Above you have an example of a solution which converges nicely.\n",
|
||
|
"\n",
|
||
|
"Let's try increasing the value of $\\alpha$ and see what happens:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 12,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"Iteration 0: Cost 2.58e+05 dj_dw: -6.500e+02, dj_db: -4.000e+02 w: 5.200e+02, b: 3.20000e+02\n",
|
||
|
"Iteration 1: Cost 7.82e+05 dj_dw: 1.130e+03, dj_db: 7.000e+02 w: -3.840e+02, b:-2.40000e+02\n",
|
||
|
"Iteration 2: Cost 2.37e+06 dj_dw: -1.970e+03, dj_db: -1.216e+03 w: 1.192e+03, b: 7.32800e+02\n",
|
||
|
"Iteration 3: Cost 7.19e+06 dj_dw: 3.429e+03, dj_db: 2.121e+03 w: -1.551e+03, b:-9.63840e+02\n",
|
||
|
"Iteration 4: Cost 2.18e+07 dj_dw: -5.974e+03, dj_db: -3.691e+03 w: 3.228e+03, b: 1.98886e+03\n",
|
||
|
"Iteration 5: Cost 6.62e+07 dj_dw: 1.040e+04, dj_db: 6.431e+03 w: -5.095e+03, b:-3.15579e+03\n",
|
||
|
"Iteration 6: Cost 2.01e+08 dj_dw: -1.812e+04, dj_db: -1.120e+04 w: 9.402e+03, b: 5.80237e+03\n",
|
||
|
"Iteration 7: Cost 6.09e+08 dj_dw: 3.156e+04, dj_db: 1.950e+04 w: -1.584e+04, b:-9.80139e+03\n",
|
||
|
"Iteration 8: Cost 1.85e+09 dj_dw: -5.496e+04, dj_db: -3.397e+04 w: 2.813e+04, b: 1.73730e+04\n",
|
||
|
"Iteration 9: Cost 5.60e+09 dj_dw: 9.572e+04, dj_db: 5.916e+04 w: -4.845e+04, b:-2.99567e+04\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"# initialize parameters\n",
|
||
|
"w_init = 0\n",
|
||
|
"b_init = 0\n",
|
||
|
"# set alpha to a large value\n",
|
||
|
"iterations = 10\n",
|
||
|
"tmp_alpha = 8.0e-1\n",
|
||
|
"# run gradient descent\n",
|
||
|
"w_final, b_final, J_hist, p_hist = gradient_descent(x_train ,y_train, w_init, b_init, tmp_alpha, \n",
|
||
|
" iterations, compute_cost, compute_gradient)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Above, $w$ and $b$ are bouncing back and forth between positive and negative with the absolute value increasing with each iteration. Further, each iteration $\\frac{\\partial J(w,b)}{\\partial w}$ changes sign and cost is increasing rather than decreasing. This is a clear sign that the *learning rate is too large* and the solution is diverging. \n",
|
||
|
"Let's visualize this with a plot."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 13,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAsIAAAFcCAYAAADClth3AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOzdd5yTVd7//1fK9F5gBhhmAAGpQ1MpDisiLrYFFcQuuIu6yOKuq+C96iJyq6uu+nNv1q9lsdyoeyuWXcRGE5QqKtIRcKnD0CaZXlLP74+LZJJMy5RMkpnP8/HgAdeV5LpOrsyQd04+5xydUkohhBBCCCFEB6MPdgOEEEIIIYQIBgnCQgghhBCiQ5IgLIQQQgghOiQJwkIIIYQQokOSICyEEEIIITokCcJCCCGEEKJDkiAsRDtz7NgxJk2aRO/evRkwYAC//vWvqaysbNIx3nrrLQoLCwPUwrr16NGD6urqem8/cuQIH374YcDbMW7cOH766adWPeaMGTP48ssvW/WYngoKCrjjjjsCdnxfzXkt5s+fz6ZNm/y673PPPdecZtUSiNdSCNG+SBAWoh1RSjF58mRuvPFGfv75Z/bu3cvkyZMpKytr0nGCEYQb01ZBOFTZ7fZ6b+vatStLlixps/M157VYuHAhY8aM8eu+rRWEm6Kh5yuEaL8kCAvRjqxatYqUlBRuvfVW977JkyeTkZHB2bNnueqqq8jNzWXixImcPn0agHnz5tGvXz+GDBnCX//6Vz755BO+//57rrvuOsaOHVvrHMuXL2fkyJEMHTqUOXPmAHDy5Ekuvvhihg4dSm5urrsX7rXXXmPw4MEMGTKEhQsXAvDYY49x4YUXMmjQIB588MFaxy8tLWX8+PEMHz6coUOHsnbtWkDrUVyxYgVDhw7l3XffpaysjFtvvZULL7yQkSNH8sMPPwDw4osvup/PH/7wB69j22w2BgwYAMC2bdswGAyUlJRQXl7O8OHD3fd78803GTFiBMOHD+f48ePu5zhp0iQuuOACLrnkEg4dOgRovY4PPfRQrfvXp67rB3DNNdcwYsQIBg0axPvvvw9ogXP48OHceeedjBgxgnXr1jFx4kR+9atf0adPH5588kn3/UaNGgXAggULuOuuu8jLy+O8885j5cqVAFRUVHDttdcycOBAZs2aRWZmZq22rVu3jiuvvJIpU6Zw9dVXt/i18OTqFa+oqOCKK64gNzeXwYMHs27dOq/7zZ8/H5PJxNChQ7n//vtxOp387ne/Y9CgQYwYMYItW7YA1PvzXB9/rq/D4WDmzJkMGDCAG264gWHDhnHkyBEA/vGPf3DhhReSm5vLX/7ylwbPJYQII0oI0W68+OKL6v7776/ztlmzZqkXX3xRKaXUSy+9pGbOnKlMJpPq0aOHcjgcSimliouLlVJKXXLJJWrfvn21jnHmzBk1YcIEVV1drZRS6s4771TLly9Xzz33nJo/f75SSimr1aoqKyvVjh071JAhQ1RJSYlSSimTyeT1t8PhUJMnT1ZbtmxRSimVk5OjqqqqlNVqVaWlpUoppfLz89WwYcOUUkqtXbtW3Xjjje62zJ07Vy1btkwppdR//vMfddFFFymllEpPT1eVlZVez8fTJZdcovLz89WiRYvUsGHD1JdffqlWrVql7rnnHvftCxcuVEop9dxzz6mHH35YKaXUjTfeqH788UellFJff/21mjJlSoP39zR9+nT1xRdf1Hv9PK9LSUmJGjBggLJarerw4cPKYDCo3bt3u69B586dVWFhoaqoqFBZWVmqvLxcHT58WI0cOVIppdRjjz2mLr/8cmW329X333+vxowZo5RS6plnnlHz5s1TSim1fPlyVdd//2vXrlXJycnq1KlT7teyJa9FXdfgww8/VHfccYdSSvsZcB3fU0ZGhvvf77//vrruuuuU0+lUO3fuVOeff75Squ6fZ1+eP8f+XN/333/f/bx27Nih9Hq9Onz4sNq1a5eaNm2acjgcym63q8svv1zt3Lmz1vmEEOHHGOwgLoRoG5s2bWLBggUA3HbbbSxatIjExETi4+O5++67mTx5MldddVWDx9iyZQs7d+5k5MiRAFRVVTFkyBAuuOAC7rzzToxGI1OnTqV///6sW7eOm266icTERABSU1MBWLNmDX/961+prq7mzJkz7N271308l4ceeogNGzag1+vZv38/DoejVlvWrFnDihUrmD9/PgDFxcUADB8+nDvuuIOpU6cyefLkWo8bM2YMGzduZOPGjcybN4+NGzdiNBq9vrafNGkSAMOGDeONN94AYO3atV71prGxsQ3evynXD7Se7GXLlgFw9OhR8vPz0el09O/fn4EDB3q1Py0tDYCePXty8uRJjEbv/8qvvvpqDAYDw4YN4+jRo4D2+j/yyCOA1jvq2X5PeXl5ZGRkuLdb8lrUZfDgwfzxj3/k4Ycf5rrrruPCCy+s976udt90003odDoGDx5MbGwsp0+frvPnuSH+XN9NmzYxZcoUAHJzc+nXrx+gvfabNm1yf2tQXl7OwYMHGTx4cIPnFEKEPgnCQrQj/fv355NPPqnzNqUUOp3Oa5/RaOT7779nxYoVLFmyhI8++oi33nqr3uOrczXIr732Wq3bvvnmG5YvX851113HSy+9VOf5qqur+eMf/8h3331HZmYmDzzwABaLxes+77zzDtXV1Wzbtg2j0UhaWho2m63OtnzxxRd07drVa//nn3/O2rVr+fDDD3n55ZdrffV+8cUXs3LlSk6ePMnkyZPdz/eWW25x3ycqKgoAvV7vrh3V6/Vs27YNvb52RVld969Lfddv7dq1bNmyhW+//Zbo6GhGjBiBxWIhOjqauLi4Os/leT7fIFxXe+p6Perieb6WvhZ16du3Lz/88AOffvops2bNYvbs2dx555313t+33UqpJj0f8P/6NnSuWbNm8fDDD/t1PiFE+JAaYSHakQkTJmAymXjvvffc+5YuXcqpU6e4+OKL3fv/+c9/kpeXR3l5OSUlJUyaNIlnnnmG7du3AxAfH1/nALtRo0axZs0aTpw4AWh1midPnuTo0aN06dKFWbNmMW3aNHbt2sWll17Ke++95z6O2WymuroavV5PWloaxcXF7h46T6WlpWRkZGA0Glm+fDlms7nONk2YMIGXXnrJvb1z506cTif5+flMmDCBF154gX379tU6/pgxY/jXv/5F9+7diYmJwWq1cujQIXr37t3gtR07diyvv/46AE6nkz179jR4/7rUd/1KS0tJTU0lOjqa7du3s3PnziYfuzFjxozho48+ArQPC/7MJNKS16I+BQUFxMfHM2PGDO6991527NhR6z46nQ6n0wloH1zef/99lFLs2bOH6upqMjIy6vx5buh5+HN9x4wZw8cffwzArl272L9/PwDjx4/nvffec/d0Hz16lJKSknrPJ4QIHxKEhWhH9Ho9//73v3nnnXfo3bs3AwcOZOXKlSQmJrJgwQI+//xzcnNz+fjjj3niiScoKyvj6quvZsiQIVx//fX89
3//NwB33HEHt912W63Bcp07d+all15i0qRJ5ObmctVVV2E2m1m3bh25ubkMGzaMTZs2cdttt5Gbm8s999zDqFGjGDJkCC+99BLJycncdtttDBw4kGnTprkHeHm69dZbWbt2LRdeeCFffvkl2dnZgPZVdUVFhXuA1vz58zlx4gS5ubkMGDCAd999F4fDwS233EJubi4jR450DybzlJKSQkJCgrsUYujQoe4BdA1ZtGgRn332GUOGDGHQoEGsXr26ya9PfdfviiuuoLi4mKFDh/Lss88yYsSIJh+7MbNnz2bPnj0MGzaMNWvWuK9rQ1ryWtRn165dXHjhhQwdOpTXX3+d3/3ud7Xuc8sttzBo0CDuv/9+pk6dSmZmJoMHD+b222/nzTffBKjz57k+/l7fqVOnEhUVxeDBg3n++ecZMGAAiYmJDBo0iAceeIBf/OIXDB48mFtvvbXBqf6EEOFDp1zf/QghhGi37HY7DoeDqKgoNmzYwKOPPlqrbERos2vExcVx8OBBJk2aVOe3CkKI9kN
|
||
|
"text/plain": [
|
||
|
"<Figure size 864x360 with 2 Axes>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"plt_divergence(p_hist, J_hist,x_train, y_train)\n",
|
||
|
"plt.show()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Above, the left graph shows $w$'s progression over the first few steps of gradient descent. $w$ oscillates from positive to negative and cost grows rapidly. Gradient Descent is operating on both $w$ and $b$ simultaneously, so one needs the 3-D plot on the right for the complete picture."
|
||
|
]
|
||
|
},
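{
"cell_type": "markdown",
"metadata": {},
"source": [
"One simple safeguard is to monitor the cost during training and stop when it increases, which signals that $\\alpha$ is too large. The cell below (an added sketch, not part of the original lab; the function name is illustrative) reuses `compute_cost` and `compute_gradient` from above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Added divergence-check sketch; the function name and structure are illustrative, not lab code\n",
"def gradient_descent_with_check(x, y, w, b, alpha, num_iters):\n",
"    prev_cost = compute_cost(x, y, w, b)\n",
"    for i in range(num_iters):\n",
"        dj_dw, dj_db = compute_gradient(x, y, w, b)\n",
"        w = w - alpha * dj_dw\n",
"        b = b - alpha * dj_db\n",
"        cost = compute_cost(x, y, w, b)\n",
"        if cost > prev_cost:   # cost rising is a sign the learning rate is too large\n",
"            print(f\"Cost rose at iteration {i}; alpha = {alpha} is likely too large.\")\n",
"            break\n",
"        prev_cost = cost\n",
"    return w, b\n",
"\n",
"# with alpha = 0.8 this should stop almost immediately, as in the diverging run above\n",
"gradient_descent_with_check(x_train, y_train, 0, 0, 8.0e-1, 10)"
]
},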
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"\n",
|
||
|
"## Congratulations!\n",
|
||
|
"In this lab you:\n",
|
||
|
"- delved into the details of gradient descent for a single variable.\n",
|
||
|
"- developed a routine to compute the gradient\n",
|
||
|
"- visualized what the gradient is\n",
|
||
|
"- completed a gradient descent routine\n",
|
||
|
"- utilized gradient descent to find parameters\n",
|
||
|
"- examined the impact of sizing the learning rate"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": []
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": []
|
||
|
}
|
||
|
],
|
||
|
"metadata": {
|
||
|
"dl_toc_settings": {
|
||
|
"rndtag": "40291"
|
||
|
},
|
||
|
"kernelspec": {
|
||
|
"display_name": "Python 3",
|
||
|
"language": "python",
|
||
|
"name": "python3"
|
||
|
},
|
||
|
"language_info": {
|
||
|
"codemirror_mode": {
|
||
|
"name": "ipython",
|
||
|
"version": 3
|
||
|
},
|
||
|
"file_extension": ".py",
|
||
|
"mimetype": "text/x-python",
|
||
|
"name": "python",
|
||
|
"nbconvert_exporter": "python",
|
||
|
"pygments_lexer": "ipython3",
|
||
|
"version": "3.7.6"
|
||
|
},
|
||
|
"toc-autonumbering": false
|
||
|
},
|
||
|
"nbformat": 4,
|
||
|
"nbformat_minor": 5
|
||
|
}
|