| 172 | 172 |
"y_predict = a * x_predict + b\n",
|
| 173 | 173 |
"\n",
|
| 174 | 174 |
"# 绘图\n",
|
| | 175 |
"plt.rcParams['figure.dpi'] = 150\n",
|
| 175 | 176 |
"fig = plt.figure()\n",
|
| 176 | 177 |
"plt.xlabel(\"Year\")\n",
|
| 177 | 178 |
"plt.ylabel(\"Co2\")\n",
|
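
For reference on the hunk above: `y_predict = a * x_predict + b` evaluates a fitted straight line at new x values, and the following lines set up the Year-vs-CO2 figure. The old cell output removed in the next hunk (`[[1.53438095]]`, `[-2698.87714286]`) looks like the corresponding slope and intercept. Below is a minimal, self-contained sketch of this step under stated assumptions: the `year`/`co2` arrays are made-up placeholders, and `np.polyfit` merely stands in for whatever fitting step the notebook actually uses to produce `a` and `b`.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data, NOT the notebook's dataset.
year = np.arange(1960, 2011, 10, dtype=float)
co2 = np.array([310.0, 325.0, 338.0, 354.0, 369.0, 389.0])

# Fit a straight line y = a*x + b (np.polyfit is an assumption here).
a, b = np.polyfit(year, co2, deg=1)

# Predict at new x values, mirroring the line shown in the diff.
x_predict = np.array([2020.0, 2030.0])
y_predict = a * x_predict + b

# Plot, matching the labels used in the notebook cell.
plt.rcParams['figure.dpi'] = 150
fig = plt.figure()
plt.xlabel("Year")
plt.ylabel("Co2")
plt.scatter(year, co2, label="observed")
plt.plot(x_predict, y_predict, "r--", label="prediction")
plt.legend()
plt.show()
```
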
| 243 | 244 |
},
|
| 244 | 245 |
{
|
| 245 | 246 |
"cell_type": "code",
|
| 246 | |
"execution_count": 1,
|
| 247 | |
"metadata": {},
|
| 248 | |
"outputs": [
|
| 249 | |
{
|
| 250 | |
"name": "stdout",
|
| 251 | |
"output_type": "stream",
|
| 252 | |
"text": [
|
| 253 | |
"[[1.53438095]]\n",
|
| 254 | |
"[-2698.87714286]\n"
|
| 255 | |
]
|
| 256 | |
}
|
| 257 | |
],
|
| | 247 |
"execution_count": null,
|
| | 248 |
"metadata": {},
|
| | 249 |
"outputs": [],
|
| 258 | 250 |
"source": [
|
| 259 | 251 |
"# 导入工具包\n",
|
| 260 | 252 |
"import numpy as np\n",
|
| 289 | 281 |
"\n",
|
| 290 | 282 |
"梯度下降背后的思想是:开始时我们随机选择一个参数的组合$(\\theta_{0},\\theta_{1},......,\\theta_{n})$ ,计算代价函数,然后我们寻找下一个能让代价函数值下降最多的参数组合。我们持续这么做直到抵达一个局部最小值,因为我们并没有尝试完所有的参数组合,所以不能确定我们得到的局部最小值是否便是全局最小值,选择不同的初始参数组合,可能会找到不同的局部最小值。 \n",
|
| 291 | 283 |
" \n",
|
| 292 | |
" <img src='https://i.loli.net/2018/11/30/5c00c262c5885.png' width=500 >"
|
| | 284 |
" <img src=\"http://imgbed.momodel.cn//20200115014102.png\" width=500>"
|
| 293 | 285 |
]
|
| 294 | 286 |
},
|
| 295 | 287 |
{
|
| 298 | 290 |
"source": [
|
| 299 | 291 |
"梯度下降算法的公式为:\n",
|
| 300 | 292 |
"\n",
|
| 301 | |
"<img src='https://i.loli.net/2018/11/30/5c00c5fe7ce53.png' width=350 >\n",
|
| | 293 |
"<img src=\"http://imgbed.momodel.cn//20200115014016.png\" width=350>\n",
|
| 302 | 294 |
" \n",
|
| 303 | 295 |
"其中 J 是代价函数,$\\theta_{0},\\theta_{1}$ 是待求参数, α 是学习率,它决定了我们沿着能让代价函数下降程度最大的方向向下迈出的步子有多大。 "
|
| 304 | 296 |
]
|
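
The two markdown cells in the last hunks describe gradient descent but give the update rule only as a linked image. For reference, the standard form of that rule for two parameters, consistent with the surrounding text (J the cost function, $\theta_{0},\theta_{1}$ the parameters, α the learning rate) and presumably what the replaced image shows, is:

$$\theta_{j} := \theta_{j} - \alpha\,\frac{\partial}{\partial \theta_{j}} J(\theta_{0},\theta_{1}), \qquad j = 0, 1 \quad \text{(both updated simultaneously, repeated until convergence)}$$
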
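And a minimal runnable sketch of that update applied to simple linear regression, assuming a mean-squared-error cost; the toy `x`/`y` data and the hyperparameters are illustrative, not taken from the notebook:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.5, n_iters=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x with an MSE-style cost."""
    theta0, theta1 = 0.0, 0.0                 # initial guess
    m = len(x)
    for _ in range(n_iters):
        h = theta0 + theta1 * x               # current predictions
        # Partial derivatives of J(theta0, theta1) = (1/(2m)) * sum((h - y)**2)
        grad0 = (h - y).sum() / m
        grad1 = ((h - y) * x).sum() / m
        # Simultaneous update: step against the gradient, scaled by the learning rate alpha
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Toy data (placeholder, not the notebook's CO2 data): y ≈ 1.0 + 3.0 * x plus noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 3.0 * x + rng.normal(scale=0.05, size=x.shape)

theta0, theta1 = gradient_descent(x, y)
print(theta0, theta1)   # should land close to 1.0 and 3.0
```

Note that for linear regression with a mean-squared-error cost the problem is convex, so any reasonable starting point converges to the same minimum; the caveat in the cell above about different initializations reaching different local minima matters for non-convex cost functions.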