updated_10.30

archersama · web-flow · commit 6544e31ecd2e · 2019-10-30T14:19:57.000+08:00
diff --git a/ch3_DL_basics/3.9_mlp-scratch.ipynb b/ch3_DL_basics/3.9_mlp-scratch.ipynb
@@ -0,0 +1,262 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "3.9 多层感知机的从零开始实现\n",
+    "我们已经从上一节里了解了多层感知机的原理。下面，我们一起来动手实现一个多层感知机。首先导入实现所需的包或模块。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 41,
+   "metadata": {
+    "pycharm": {
+     "is_executing": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2.0.0\n"
+     ]
+    }
+   ],
+   "source": [
+    "import tensorflow as tf\n",
+    "import numpy as np\n",
+    "import sys\n",
+    "print(tf.__version__)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "3.9.1 获取和读取数据\n",
+    "这里继续使用Fashion-MNIST数据集。我们将使用多层感知机对图像进行分类"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 42,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "from tensorflow.keras.datasets import fashion_mnist\n",
+    "(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()\n",
+    "batch_size = 256\n",
+    "x_train = tf.cast(x_train, tf.float32)\n",
+    "x_test = tf.cast(x_test, tf.float32)\n",
+    "x_train = x_train/255.0\n",
+    "x_test = x_test/255.0\n",
+    "train_iter = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(batch_size)\n",
+    "test_iter = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "3.9.2 定义模型参数\n",
+    "我们在3.6节（softmax回归的从零开始实现）里已经介绍了，Fashion-MNIST数据集中图像形状为 28×28，类别数为10。本节中我们依然使用长度为 28×28=784 的向量表示每一张图像。因此，输入个数为784，输出个数为10。实验中，我们设超参数隐藏单元个数为256。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 43,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "num_inputs, num_outputs, num_hiddens = 784, 10, 256\n",
+    "\n",
+    "w1 = tf.Variable(tf.random.truncated_normal([num_inputs, num_hiddens], stddev=0.1))\n",
+    "b1 = tf.Variable(tf.random.truncated_normal([num_hiddens], stddev=0.1))\n",
+    "w2 = tf.Variable(tf.random.truncated_normal([num_hiddens, num_outputs], stddev=0.1))\n",
+    "b2=tf.Variable(tf.random.truncated_normal([num_outputs], stddev=0.1))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "3.9.3 定义激活函数\n",
+    "这里我们使用基础的max函数来实现ReLU，而非直接调用relu函数。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 44,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def relu(x):\n",
+    "    return tf.math.maximum(x,0)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 45,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def net(x,w1,b1,w2,b2):\n",
+    "    x = tf.reshape(x,shape=[-1,num_inputs])\n",
+    "    h = relu(tf.matmul(x,w1) + b1 )\n",
+    "    y = tf.math.softmax( tf.matmul(h,w2) + b2 )\n",
+    "    return y"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "3.9.5. 定义损失函数¶\n",
+    "为了得到更好的数值稳定性，我们直接使用Tensorflow提供的包括softmax运算和交叉熵损失计算的函数。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 46,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def loss(y_hat,y_true):\n",
+    "    return tf.losses.sparse_categorical_crossentropy(y_true,y_hat)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "3.9.6. 训练模型"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 47,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def acc(y_hat,y):\n",
+    "    return np.mean((tf.argmax(y_hat,axis=1) == y))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 48,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "num_epochs, lr = 5, 0.5"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 49,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "0 loss: 0.7799275\n",
+      "0 test_acc: 0.875\n",
+      "1 loss: 0.72887945\n",
+      "1 test_acc: 0.9375\n",
+      "2 loss: 0.72454\n",
+      "2 test_acc: 0.8125\n",
+      "3 loss: 0.5607478\n",
+      "3 test_acc: 0.875\n",
+      "4 loss: 0.5008962\n",
+      "4 test_acc: 0.9375\n"
+     ]
+    }
+   ],
+   "source": [
+    "for epoch in range(num_epochs):\n",
+    "    loss_all = 0\n",
+    "    for x,y in train_iter:\n",
+    "        with tf.GradientTape() as tape:\n",
+    "            y_hat = net(x,w1,b1,w2,b2)\n",
+    "            l = tf.reduce_mean(loss(y_hat,y))\n",
+    "            loss_all += l.numpy()\n",
+    "            grads = tape.gradient(l, [w1, b1, w2, b2])\n",
+    "            w1.assign_sub(grads[0])\n",
+    "            b1.assign_sub(grads[1])\n",
+    "            w2.assign_sub(grads[2])\n",
+    "            b2.assign_sub(grads[3])\n",
+    "    print(epoch, 'loss:', l.numpy())\n",
+    "    total_correct, total_number = 0, 0\n",
+    "\n",
+    "    for x,y in test_iter:\n",
+    "        with tf.GradientTape() as tape:\n",
+    "            y_hat = net(x,w1,b1,w2,b2)\n",
+    "            y=tf.cast(y,'int64')\n",
+    "            correct=acc(y_hat,y)\n",
+    "    print(epoch,\"test_acc:\", correct)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}