Commit 6544e31

updated_10.30
1 parent 922703f commit 6544e31

1 file changed

+262
-0
lines changed

@@ -0,0 +1,262 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
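7+
"3.9 Implementing a Multilayer Perceptron from Scratch\n",
8+
"The previous section covered how a multilayer perceptron works. Now let's implement one ourselves. First, import the packages and modules the implementation needs."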
9+
]
10+
},
11+
{
12+
"cell_type": "code",
13+
"execution_count": 41,
14+
"metadata": {
15+
"pycharm": {
16+
"is_executing": false
17+
}
18+
},
19+
"outputs": [
20+
{
21+
"name": "stdout",
22+
"output_type": "stream",
23+
"text": [
24+
"2.0.0\n"
25+
]
26+
}
27+
],
28+
"source": [
29+
"import tensorflow as tf\n",
30+
"import numpy as np\n",
31+
"import sys\n",
32+
"print(tf.__version__)"
33+
]
34+
},
35+
{
36+
"cell_type": "markdown",
37+
"metadata": {},
38+
"source": [
39+
"3.9.1 Obtaining and reading the data\n",
40+
"We continue to use the Fashion-MNIST dataset and classify its images with a multilayer perceptron."
41+
]
42+
},
43+
{
44+
"cell_type": "code",
45+
"execution_count": 42,
46+
"metadata": {
47+
"collapsed": true
48+
},
49+
"outputs": [],
50+
"source": [
51+
"from tensorflow.keras.datasets import fashion_mnist\n",
52+
"(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()\n",
53+
"batch_size = 256\n",
54+
"x_train = tf.cast(x_train, tf.float32)\n",
55+
"x_test = tf.cast(x_test, tf.float32)\n",
56+
"x_train = x_train/255.0\n",
57+
"x_test = x_test/255.0\n",
58+
"train_iter = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(batch_size)\n",
59+
"test_iter = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size)"
60+
]
61+
},
62+
{
63+
"cell_type": "markdown",
64+
"metadata": {},
65+
"source": [
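66+
"3.9.2 Defining the model parameters\n",
67+
"As described in Section 3.6 (softmax regression from scratch), images in the Fashion-MNIST dataset have shape 28×28 and there are 10 classes. In this section we again represent each image as a vector of length 28×28=784, so the number of inputs is 784 and the number of outputs is 10. In this experiment we set the number of hidden units, a hyperparameter, to 256."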
68+
]
69+
},
70+
{
71+
"cell_type": "code",
72+
"execution_count": 43,
73+
"metadata": {},
74+
"outputs": [],
75+
"source": [
76+
"num_inputs, num_outputs, num_hiddens = 784, 10, 256\n",
77+
"\n",
78+
"w1 = tf.Variable(tf.random.truncated_normal([num_inputs, num_hiddens], stddev=0.1))\n",
79+
"b1 = tf.Variable(tf.random.truncated_normal([num_hiddens], stddev=0.1))\n",
80+
"w2 = tf.Variable(tf.random.truncated_normal([num_hiddens, num_outputs], stddev=0.1))\n",
81+
"b2 = tf.Variable(tf.random.truncated_normal([num_outputs], stddev=0.1))\n"
82+
]
83+
},
84+
{
85+
"cell_type": "markdown",
86+
"metadata": {},
87+
"source": [
88+
"3.9.3 Defining the activation function\n",
89+
"Here we implement ReLU with the basic max function rather than calling the relu function directly."
90+
]
91+
},
92+
{
93+
"cell_type": "code",
94+
"execution_count": 44,
95+
"metadata": {
96+
"collapsed": true
97+
},
98+
"outputs": [],
99+
"source": [
100+
"def relu(x):\n",
101+
"    return tf.math.maximum(x, 0)"
102+
]
103+
},
104+
{
105+
"cell_type": "code",
106+
"execution_count": 45,
107+
"metadata": {
108+
"collapsed": true
109+
},
110+
"outputs": [],
111+
"source": [
112+
"def net(x, w1, b1, w2, b2):\n",
113+
"    x = tf.reshape(x, shape=[-1, num_inputs])  # flatten each 28x28 image to 784 values\n",
114+
"    h = relu(tf.matmul(x, w1) + b1)  # hidden layer\n",
115+
"    y = tf.math.softmax(tf.matmul(h, w2) + b2)  # class probabilities\n",
116+
"    return y"
117+
]
118+
},
119+
{
120+
"cell_type": "markdown",
121+
"metadata": {},
122+
"source": [
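123+
"3.9.5 Defining the loss function\n",
124+
"Fusing the softmax operation with the cross-entropy computation generally gives better numerical stability. In this notebook, however, `net` already applies softmax, so the loss below passes the resulting probabilities to TensorFlow's `sparse_categorical_crossentropy`."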
125+
]
126+
},
127+
{
128+
"cell_type": "code",
129+
"execution_count": 46,
130+
"metadata": {
131+
"collapsed": true
132+
},
133+
"outputs": [],
134+
"source": [
135+
"def loss(y_hat, y_true):\n",
136+
"    return tf.losses.sparse_categorical_crossentropy(y_true, y_hat)  # y_hat: probabilities, y_true: integer labels"
137+
]
138+
},
139+
{
140+
"cell_type": "markdown",
141+
"metadata": {},
142+
"source": [
143+
"3.9.6 Training the model"
144+
]
145+
},
146+
{
147+
"cell_type": "code",
148+
"execution_count": 47,
149+
"metadata": {
150+
"collapsed": true
151+
},
152+
"outputs": [],
153+
"source": [
154+
"def acc(y_hat, y):\n",
155+
"    return np.mean(tf.argmax(y_hat, axis=1) == y)  # fraction of correct predictions"
156+
]
157+
},
158+
{
159+
"cell_type": "code",
160+
"execution_count": 48,
161+
"metadata": {
162+
"collapsed": true
163+
},
164+
"outputs": [],
165+
"source": [
166+
"num_epochs, lr = 5, 0.5"
167+
]
168+
},
169+
{
170+
"cell_type": "code",
171+
"execution_count": 49,
172+
"metadata": {},
173+
"outputs": [
174+
{
175+
"name": "stdout",
176+
"output_type": "stream",
177+
"text": [
178+
"0 loss: 0.7799275\n",
179+
"0 test_acc: 0.875\n",
180+
"1 loss: 0.72887945\n",
181+
"1 test_acc: 0.9375\n",
182+
"2 loss: 0.72454\n",
183+
"2 test_acc: 0.8125\n",
184+
"3 loss: 0.5607478\n",
185+
"3 test_acc: 0.875\n",
186+
"4 loss: 0.5008962\n",
187+
"4 test_acc: 0.9375\n"
188+
]
189+
}
190+
],
191+
"source": [
192+
"for epoch in range(num_epochs):\n",
193+
"    loss_all = 0\n",
194+
"    for x, y in train_iter:\n",
195+
"        with tf.GradientTape() as tape:\n",
196+
"            y_hat = net(x, w1, b1, w2, b2)\n",
197+
"            l = tf.reduce_mean(loss(y_hat, y))\n",
198+
"            loss_all += l.numpy()\n",
199+
"        grads = tape.gradient(l, [w1, b1, w2, b2])\n",
200+
"        w1.assign_sub(grads[0])  # lr is not applied: the step size is effectively 1\n",
201+
"        b1.assign_sub(grads[1])\n",
202+
"        w2.assign_sub(grads[2])\n",
203+
"        b2.assign_sub(grads[3])\n",
204+
"    print(epoch, 'loss:', l.numpy())  # loss of the last training batch\n",
205+
"    total_correct, total_number = 0, 0\n",
206+
"\n",
207+
"    for x, y in test_iter:\n",
208+
"        # Evaluation needs no gradients, so no GradientTape is used here.\n",
209+
"        y_hat = net(x, w1, b1, w2, b2)\n",
210+
"        y = tf.cast(y, 'int64')\n",
211+
"        correct = acc(y_hat, y)\n",
212+
"    print(epoch, \"test_acc:\", correct)  # accuracy of the last test batch only"
213+
]
214+
},
215+
{
216+
"cell_type": "code",
217+
"execution_count": null,
218+
"metadata": {},
219+
"outputs": [],
220+
"source": []
221+
},
222+
{
223+
"cell_type": "code",
224+
"execution_count": null,
225+
"metadata": {
226+
"collapsed": true
227+
},
228+
"outputs": [],
229+
"source": []
230+
},
231+
{
232+
"cell_type": "code",
233+
"execution_count": null,
234+
"metadata": {
235+
"collapsed": true
236+
},
237+
"outputs": [],
238+
"source": []
239+
}
240+
],
241+
"metadata": {
242+
"kernelspec": {
243+
"display_name": "Python 3",
244+
"language": "python",
245+
"name": "python3"
246+
},
247+
"language_info": {
248+
"codemirror_mode": {
249+
"name": "ipython",
250+
"version": 3
251+
},
252+
"file_extension": ".py",
253+
"mimetype": "text/x-python",
254+
"name": "python",
255+
"nbconvert_exporter": "python",
256+
"pygments_lexer": "ipython3",
257+
"version": "3.6.1"
258+
}
259+
},
260+
"nbformat": 4,
261+
"nbformat_minor": 2
262+
}
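
Note on the training cell: the value it prints as test_acc comes from whichever test batch the evaluation loop saw last, not from the whole test set. A minimal sketch of accumulating correct predictions over every batch follows. It uses NumPy only, and `evaluate`, `identity_scores`, and the toy batches are hypothetical names and data chosen for illustration; none of them appear in the notebook.

```python
import numpy as np


def evaluate(predict_fn, batches):
    """Return accuracy over all batches, counting every example once."""
    total_correct, total_number = 0, 0
    for x, y in batches:
        y_hat = predict_fn(x)                  # shape (batch, num_classes)
        predicted = np.argmax(y_hat, axis=1)   # most likely class per example
        total_correct += int(np.sum(predicted == y))
        total_number += len(y)
    return total_correct / total_number


def identity_scores(x):
    """Hypothetical stand-in for net(x, w1, b1, w2, b2): inputs are the scores."""
    return x


# Two toy batches of different sizes (3 examples and 1 example, 2 classes).
toy_batches = [
    (np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]), np.array([0, 1, 1])),
    (np.array([[0.3, 0.7]]), np.array([1])),
]

overall_accuracy = evaluate(identity_scores, toy_batches)
print(overall_accuracy)
```

In the notebook, the same pattern could reuse the already declared `total_correct` and `total_number` counters inside the `test_iter` loop; dividing by the total count rather than averaging per-batch accuracies keeps a small final batch from being over-weighted.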
