{
 "cells": [
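  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Statistics and Data Analysis with NumPy\n",
    "\n",
    "Descriptive statistics are methods for summarizing the overall state of a data set, as well as the relationships and groupings among its elements. For numeric features, they mainly cover the completeness of the data, the minimum, maximum, mean, median, quartiles, range, standard deviation, variance, covariance, and coefficient of variation. NumPy provides many statistical functions; some of them are listed in the table below.\n",
    "\n",
    "| Function | Description | Function | Description |\n",
    "| :------: | :---------- | :------: | :---------- |\n",
    "| min() | Minimum | max() | Maximum |\n",
    "| argmax() | Index of the maximum | argmin() | Index of the minimum |\n",
    "| cumsum() | Cumulative sum of the elements | cumprod() | Cumulative product of the elements |\n",
    "| sum() | Sum of the array elements | median() | Median |\n",
    "| mean() | Mean | average() | Weighted average |\n",
    "| ptp() | Range (maximum minus minimum) | var() | Variance |\n",
    "| std() | Standard deviation | cov() | Covariance |\n",
    "\n",
    "The commonly used statistical functions are introduced briefly below. For functions or parameters not covered here, consult the NumPy documentation.\n",
    "\n",
    "+ ### Maximum and minimum\n",
    "\n",
    "The max() and min() methods return the maximum and minimum value of an array, respectively. Their signatures are:\n",
    "```python\n",
    "ndarray.max(axis=None, out=None, keepdims=False, initial=<no value>, where=True)\n",
    "ndarray.min(axis=None, out=None, keepdims=False, initial=<no value>, where=True)\n",
    "```"
   ]
  },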
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[10 16 65]\n",
      " [29 90 94]\n",
      " [30  9 74]]\n",
      "Maximum of the array: 94 Minimum of the array: 9\n",
      "Maximum of each row: [65 94 74] Minimum of each row: [10 29  9]\n",
      "Maximum of each column: [30 90 94] Minimum of each column: [10  9 65]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "np.random.seed(10)   # Set the random seed so the result is reproducible\n",
    "a = np.random.randint(1, 100, size=(3, 3))  # 3 x 3 array of random integers in [1, 100)\n",
    "print(a)\n",
    "print('Maximum of the array:', a.max(), 'Minimum of the array:', a.min())\n",
    "print('Maximum of each row:', a.max(axis=1), 'Minimum of each row:', a.min(axis=1))\n",
    "print('Maximum of each column:', a.max(axis=0), 'Minimum of each column:', a.min(axis=0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Index of the maximum and minimum\n",
    "\n",
    "The argmax() and argmin() methods return the index of the maximum and minimum value of an array, respectively. Their signatures are:\n",
    "```python\n",
    "ndarray.argmax(axis=None, out=None, *, keepdims=False)\n",
    "ndarray.argmin(axis=None, out=None, *, keepdims=False)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[10 16 65]\n",
      " [29 90 94]\n",
      " [30  9 74]]\n",
      "Index of the maximum: 5 Index of the minimum: 7\n",
      "Column index of the maximum in each row: [2 2 2] Column index of the minimum in each row: [0 0 1]\n",
      "Row index of the maximum in each column: [2 1 1] Row index of the minimum in each column: [0 2 0]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "np.random.seed(10)   # Set the random seed so the result is reproducible\n",
    "a = np.random.randint(1, 100, size=(3, 3))  # 3 x 3 array of random integers in [1, 100)\n",
    "print(a)\n",
    "print('Index of the maximum:', a.argmax(), 'Index of the minimum:', a.argmin())    # Indices into the flattened array\n",
    "print('Column index of the maximum in each row:', a.argmax(axis=1), 'Column index of the minimum in each row:', a.argmin(axis=1))\n",
    "print('Row index of the maximum in each column:', a.argmax(axis=0), 'Row index of the minimum in each column:', a.argmin(axis=0))"
   ]
  },
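  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Cumulative sum and cumulative product\n",
    "\n",
    "The cumsum() and cumprod() methods return the cumulative sum and the cumulative product of the array elements along a given axis, respectively. Their signatures are:\n",
    "```python\n",
    "ndarray.cumsum(axis=None, dtype=None, out=None)\n",
    "ndarray.cumprod(axis=None, dtype=None, out=None)\n",
    "```"
   ]
  },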
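  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 2 3]\n",
      " [4 5 6]\n",
      " [7 8 9]]\n",
      "Cumulative sum of the flattened array: [ 1  3  6 10 15 21 28 36 45]\n",
      "Cumulative product of the flattened array: [     1      2      6     24    120    720   5040  40320 362880]\n",
      "Cumulative sum along axis 0 (row by row, down each column):\n",
      "[[ 1  2  3]\n",
      " [ 5  7  9]\n",
      " [12 15 18]]\n",
      "Cumulative sum along axis 1 (column by column, across each row):\n",
      "[[ 1  3  6]\n",
      " [ 4  9 15]\n",
      " [ 7 15 24]]\n",
      "Cumulative product along axis 0 (row by row, down each column):\n",
      "[[  1   2   3]\n",
      " [  4  10  18]\n",
      " [ 28  80 162]]\n",
      "Cumulative product along axis 1 (column by column, across each row):\n",
      "[[  1   2   6]\n",
      " [  4  20 120]\n",
      " [  7  56 504]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "a = np.arange(1, 10).reshape(3, 3)  # 3 x 3 array holding 1 to 9\n",
    "print(a)\n",
    "print('Cumulative sum of the flattened array:', a.cumsum())          # With no axis, the array is flattened first\n",
    "print('Cumulative product of the flattened array:', a.cumprod())     # With no axis, the array is flattened first\n",
    "print('Cumulative sum along axis 0 (row by row, down each column):', a.cumsum(axis=0), 'Cumulative sum along axis 1 (column by column, across each row):', a.cumsum(axis=1), sep='\\n')\n",
    "print('Cumulative product along axis 0 (row by row, down each column):', a.cumprod(axis=0), 'Cumulative product along axis 1 (column by column, across each row):', a.cumprod(axis=1), sep='\\n')"
   ]
  },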
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Sum\n",
    "\n",
    "The sum() method returns the sum of the array elements along a given axis. Its signature is:\n",
    "```python\n",
    "ndarray.sum(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 2 3]\n",
      " [4 5 6]\n",
      " [7 8 9]]\n",
      "Sum of all elements: 45\n",
      "Sum of each row: [ 6 15 24]\n",
      "Sum of each column: [12 15 18]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "a = np.arange(1, 10).reshape(3, 3)  # 3 x 3 array holding 1 to 9\n",
    "print(a)\n",
    "print('Sum of all elements:', a.sum())       # With no axis, all elements are summed\n",
    "print('Sum of each row:', a.sum(axis=1))     # axis=1 sums across the columns, giving one value per row\n",
    "print('Sum of each column:', a.sum(axis=0))  # axis=0 sums down the rows, giving one value per column"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Median\n",
    "\n",
    "The median() function computes and returns the median of the array elements along the specified axis. Its signature is:\n",
    "```python\n",
    "numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 1  2  3  4]\n",
      " [ 5  6  7  8]\n",
      " [ 9 10 11 12]]\n",
      "Median of all elements: 6.5\n",
      "Median of each row: [ 2.5  6.5 10.5]\n",
      "Median of each column: [5. 6. 7. 8.]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "a = np.arange(1, 13).reshape(3, 4)  # 3 x 4 array holding 1 to 12\n",
    "print(a)\n",
    "print('Median of all elements:', np.median(a))          # With no axis, the array is flattened first\n",
    "print('Median of each row:', np.median(a, axis=1))      # Even count (4 per row): mean of the two middle values after sorting\n",
    "print('Median of each column:', np.median(a, axis=0))   # Odd count (3 per column): the middle value after sorting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Mean and weighted average\n",
    "\n",
    "The mean() method returns the mean of the array elements along a given axis; the average() function computes and returns the weighted average along the specified axis. Their signatures are:\n",
    "```python\n",
    "ndarray.mean(axis=None, dtype=None, out=None, keepdims=False, *, where=True)\n",
    "numpy.average(a, axis=None, weights=None, returned=False, *, keepdims=<no value>)\n",
    "```\n",
    "$$\\text{mean}=\\frac{a_1+a_2+a_3+\\cdots+a_n}{n} \\qquad \\text{weighted mean}=\\frac{a_1\\times b_1+a_2\\times b_2+a_3\\times b_3+\\cdots+a_n\\times b_n}{b_1+b_2+b_3+\\cdots+b_n}$$\n",
    "\n",
    "Here $b_i$ is the weight assigned to the element $a_i$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "a = np.arange(1, 10).reshape(3, 3)  # 3 x 3 array holding 1 to 9\n",
    "np.random.seed(10)   # Set the random seed so the result is reproducible\n",
    "b = np.random.randint(1, 10, size=(3, 3))  # 3 x 3 weight array of random integers in [1, 10)\n",
    "print(a)\n",
    "print('Mean of all elements:', round(a.mean(), 2))        # With no axis, the mean is taken over all elements\n",
    "print('Mean of each row:', a.mean(axis=1).round(2))       # axis=1 gives one mean per row\n",
    "print('Mean of each column:', a.mean(axis=0).round(2))    # axis=0 gives one mean per column\n",
    "print('Weighted average with all weights equal to 1:', round(np.average(a), 2))  # Without weights, every weight is 1 and the result equals mean()\n",
    "print('Weight array b:\\n', b)\n",
    "print('Weighted average of all elements:', round(np.average(a, weights=b), 2))   # Multiply a and b elementwise, sum, then divide by the sum of b: (a*b).sum()/b.sum()\n",
    "# print((a*b).sum()/b.sum())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Range\n",
    "\n",
    "The ptp() function computes and returns the range (maximum minus minimum) along the specified axis. Its signature is:\n",
    "```python\n",
    "numpy.ptp(a, axis=None, out=None, keepdims=<no value>)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "np.random.seed(10)   # Set the random seed so the result is reproducible\n",
    "a = np.random.randint(1, 10, size=(3, 3))  # 3 x 3 array of random integers in [1, 10)\n",
    "print(a)\n",
    "print('Range of all elements:', np.ptp(a))             # With no axis, the array is flattened first\n",
    "print('Range of each row:', np.ptp(a, axis=1))\n",
    "print('Range of each column:', np.ptp(a, axis=0))"
   ]
  },
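  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Variance\n",
    "\n",
    "The var() function computes and returns the variance along the specified axis. Its signature is:\n",
    "```python\n",
    "numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)\n",
    "```\n",
    "In probability theory and statistics, variance measures how dispersed a random variable or a data set is. For a data set, it is the mean of the squared differences between each value and the mean of the data, which is what var() computes with the default ddof=0."
   ]
  },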
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "a = np.arange(1, 10).reshape(3, 3)  # 3 x 3 array holding 1 to 9\n",
    "print(a)\n",
    "print('Biased variance of all elements:', np.var(a))            # ddof defaults to 0, giving the biased variance (divide by n)\n",
    "print('Unbiased variance of all elements:', np.var(a, ddof=1))  # Set ddof=1 for the unbiased variance (divide by n-1)\n",
    "print('Biased variance of each row:', np.var(a, axis=1))\n",
    "print('Unbiased variance of each column:', np.var(a, ddof=1, axis=0))"
   ]
  },
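  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Standard deviation\n",
    "\n",
    "The std() function computes and returns the standard deviation along the specified axis. Its signature is:\n",
    "```python\n",
    "numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)\n",
    "```\n",
    "The standard deviation is the square root of the variance and likewise reflects how dispersed a data set is. A large standard deviation means that most values differ considerably from the mean; a small one means that the values lie close to the mean. In investing, for example, the standard deviation can serve as a measure of how stable returns are: the larger it is, the further returns stray from their historical average, so returns are less stable and risk is higher; the smaller it is, the more stable the returns and the lower the risk."
   ]
  },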
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "a = np.arange(1, 10).reshape(3, 3)  # 3 x 3 array holding 1 to 9\n",
    "print(a)\n",
    "print('Standard deviation of all elements (ddof=0):', np.std(a))           # ddof defaults to 0: square root of the variance computed by dividing by n\n",
    "print('Standard deviation of all elements (ddof=1):', np.std(a, ddof=1))   # ddof=1: square root of the variance computed by dividing by n-1\n",
    "print('Standard deviation of each row (ddof=0):', np.std(a, axis=1))\n",
    "print('Standard deviation of each column (ddof=1):', np.std(a, ddof=1, axis=0))"
   ]
  },
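  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ ### Creating a universal function from a Python function\n",
    "\n",
    "The frompyfunc() function takes an arbitrary Python function and returns a NumPy ufunc. It can be used to add broadcasting to built-in or user-defined Python functions. Its signature is:\n",
    "```python\n",
    "numpy.frompyfunc(func, nin, nout)\n",
    "```\n",
    "+ func: a Python function object\n",
    "+ nin: the number of input arguments\n",
    "+ nout: the number of objects returned by func\n",
    "\n",
    "For example, the code below solves several quadratic equations whose coefficients are stored in a list."
   ]
  },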
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "\n",
    "def solve_equations(a, b, c):\n",
    "    # Return both roots of a*x**2 + b*x + c = 0, assuming a != 0\n",
    "    delta = b ** 2 - 4 * a * c  # Discriminant\n",
    "    if delta >= 0:  # Two real roots (equal when delta == 0)\n",
    "        return (-b + math.sqrt(delta)) / (2 * a), (-b - math.sqrt(delta)) / (2 * a)\n",
    "    else:  # A pair of complex-conjugate roots\n",
    "        return complex(-b / (2 * a), math.sqrt(math.fabs(delta)) / (2 * a)), \\\n",
    "               complex(-b / (2 * a), -math.sqrt(math.fabs(delta)) / (2 * a))\n",
    "\n",
    "\n",
    "factor_list = [[1, -2, 1], [1, 3, 10], [1, 3, -10]]\n",
    "for i in factor_list:  # Loop over the coefficient list and solve each equation in turn\n",
    "    print(solve_equations(*i))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Converting solve_equations into a NumPy ufunc simplifies the solving process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "solve_equations_array = np.frompyfunc(solve_equations, 3, 1)  # Wrap the function as a ufunc with 3 inputs and 1 output; such a ufunc always returns an object array\n",
    "factor_array = np.array(factor_list)  # Convert the coefficient list to an array\n",
    "print(factor_array)\n",
    "print(solve_equations_array(factor_array[:, 0], factor_array[:, 1], factor_array[:, 2]))  # A single call solves all three equations"
   ]
  },
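  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: analyzing exam scores with NumPy\n",
    "The file “<a href=\"images/ch8/8.6 score.csv\" target=\"_blank\">8.6 score.csv</a>” stores student score data. It is UTF-8 encoded and uses a comma “,” as the delimiter. Its contents are shown in the figure below. Complete the following tasks:\n",
    "\n",
    "1. Compute and print the mean Python score.\n",
    "2. Compute and print the median Python score.\n",
    "3. Compute and print the standard deviation of the Python scores.\n",
    "4. Compute and print the mean score of the student 罗明 (Luo Ming).\n",
    "\n",
    "<img src=\"images/ch8/8.png\" style=\"zoom:60%;\">"
   ]
  },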
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['姓名' '学号' '高数' '英语' 'python' '物理' 'java' 'C语言']\n",
      " ['罗明' '1217106' '95' '85' '96' '88' '78' '90']\n",
      " ['金川' '1217116' '85' '86' '90' '70' '88' '85']\n",
      " ['戈扬' '1217117' '80' '90' '75' '85' '98' '95']\n",
      " ['罗旋' '1217119' '78' '92' '85' '72' '95' '75']\n",
      " ['蒋维' '1217127' '99' '88' '65' '80' '85' '75']]\n",
      "[[95 85 96 88 78 90]\n",
      " [85 86 90 70 88 85]\n",
      " [80 90 75 85 98 95]\n",
      " [78 92 85 72 95 75]\n",
      " [99 88 65 80 85 75]]\n",
      "Python scores: [96 90 75 85 65]\n",
      "Mean Python score: 82.2\n",
      "Median Python score: 85.0\n",
      "Standard deviation of Python scores: 11.02\n",
      "Scores of 罗明: [95 85 96 88 78 90]\n",
      "Mean score of 罗明: 88.67\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "scoreAll = np.loadtxt('images/ch8/8.6 score.csv', str, delimiter=',', encoding='utf-8')   # Read the file into an array of strings\n",
    "print(scoreAll)  # Print the full array, including the header row and the name and ID columns\n",
    "scoreNum = scoreAll[1:, 2:].astype(int)  # Slice out the score columns only and convert them to integers\n",
    "print(scoreNum)  # Print the numeric score array\n",
    "scorePython = scoreNum[:, 2]  # Slice the Python column out of the score array\n",
    "scoreStu = scoreNum[0]        # First data row: the scores of 罗明\n",
    "nameStu = scoreAll[1, 0]      # Look up the student name in the full array\n",
    "print('Python scores:', scorePython)                      # Print the Python scores\n",
    "print('Mean Python score:', np.average(scorePython))      # Print the mean Python score\n",
    "print('Median Python score:', np.median(scorePython))     # Print the median Python score\n",
    "print('Standard deviation of Python scores:', round(np.std(scorePython), 2))  # Print the standard deviation (ddof=0)\n",
    "print(f'Scores of {nameStu}: {scoreStu}')  # Print the scores of 罗明\n",
    "print(f'Mean score of {nameStu}: {np.average(scoreStu):.2f}')  # Print the mean score of 罗明"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font face='楷体' color='red' size=5> Practice </font>\n",
    "### Analyzing sepal data with NumPy\n",
    "The file “<a href=\"images/ch8/iris_sepal_length.csv\" target=\"_blank\">iris_sepal_length.csv</a>” contains a single column of sepal lengths and is UTF-8 encoded. Write a program that does the following:\n",
    "\n",
    "1. Read the sepal data from iris_sepal_length.csv into an array.\n",
    "2. Remove duplicate sepal values.\n",
    "3. Compute the maximum, minimum, mean, standard deviation, and variance of the sepal lengths, and print them in the output format shown below.\n",
    "\n",
    "**Hints:**\n",
    "1. Print every value with two decimal places.\n",
    "2. Use the biased variance (ddof=0) when computing the variance.\n",
    "3. Use ddof=1 when computing the standard deviation.\n",
    "4. Duplicates can be removed with the numpy.unique() function, for example:\n",
    "```python\n",
    "a = np.array([1, 1, 2, 3, 2])\n",
    "print(np.unique(a)) # Prints the deduplicated array [1 2 3]\n",
    "```\n",
    "\n",
    "**Expected output:**\n",
    "\n",
    "The maximum sepal length is: 7.90<br>\n",
    "The minimum sepal length is: 4.30<br>\n",
    "The mean sepal length is: 6.01<br>\n",
    "The variance of the sepal length is: 1.06<br>\n",
    "The standard deviation of the sepal length is: 1.04<br>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here to complete the exercise\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}