Two Ways to Implement a Multilayer Perceptron

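The code was written and run with Python 3.8.3, torch 1.5, Anaconda3 (64-bit), and PyCharm 2020.1. It consists of exercises and assignments from 《動手學深度學習》 (Dive into Deep Learning, PyTorch edition), with a few pieces of code modified, and is shared for study and discussion only.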

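# 3.9 Implementing a multilayer perceptron from scratch
import torch
import numpy as np
import d2lzh as d2l

# (1) Load and read the data
# Use the Fashion-MNIST dataset with a batch size of 256.
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

# (2) Initialize the model parameters
# Each sample is represented as a vector. Every input is an image 28 pixels high and
# 28 pixels wide, so the input vector has length 28 x 28 = 784, one element per pixel.
# The dataset has 10 classes, so the output layer has 10 outputs.
# The number of hidden units is a hyperparameter, set to 256 here.
# All weights and biases need gradients.
num_inputs = 784
num_outputs = 10
num_hiddens = 256
W1 = torch.tensor(np.random.normal(0, 0.01, (num_inputs, num_hiddens)), dtype=torch.float)
# Biases start at zero. torch.tensor(num_hiddens) would create a single scalar 256.0
# rather than a vector of length num_hiddens, so torch.zeros is used instead.
b1 = torch.zeros(num_hiddens, dtype=torch.float)
W2 = torch.tensor(np.random.normal(0, 0.01, (num_hiddens, num_outputs)), dtype=torch.float)
b2 = torch.zeros(num_outputs, dtype=torch.float)
params = [W1, b1, W2, b2]
for param in params:
    param.requires_grad_(requires_grad=True)

# (3) Define the activation function
# Implement ReLU with the basic max function instead of calling relu directly.
def relu(X):
    return torch.max(input=X, other=torch.tensor(0.0))

# (4) Define the model
# Use view to reshape each raw image into a vector of length num_inputs,
# then apply the multilayer perceptron formula.
def net(X):
    X = X.view((-1, num_inputs))
    H = relu(torch.matmul(X, W1) + b1)
    return torch.matmul(H, W2) + b2

# (5) Define the loss function
# Use the PyTorch loss that combines the softmax operation and the cross-entropy computation.
loss = torch.nn.CrossEntropyLoss()

# (6) Train the model
# num_epochs is the number of epochs and lr is the learning rate; other values
# may give a more accurate classifier.
# lr looks very large. In the book's version of train_ch3 the update goes through
# d2l.sgd, which divides the gradient by batch_size, while CrossEntropyLoss already
# averages over the batch. This assumes the author's d2lzh copy behaves the same way.
num_epochs, lr = 5, 100.0
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, params, lr)

# (7) Predict
# For a set of images (third row of the output), compare the true labels
# (first line of text) with the model's predictions (second line of text).
# next(iter(...)) replaces iter(...).next(), which newer PyTorch releases no longer support.
X, y = next(iter(test_iter))
# The helper name is kept as in the author's module; the book calls it get_fashion_mnist_labels.
true_labels = d2l.get_fashion_mnist_label(y.numpy())
pred_labels = d2l.get_fashion_mnist_label(net(X).argmax(dim=1).detach().numpy())
titles = [true + '\n' + pred for true, pred in zip(true_labels, pred_labels)]
d2l.show_fashion_mnist(X[10:19], titles[10:19])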
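To make the computation in steps (3) and (4) easier to follow, here is a minimal sketch of the same forward pass, O = relu(X W1 + b1) W2 + b2, written in plain Python for a single sample. It is only an illustration and not the code from the book: the helper names `affine` and `mlp_forward`, the tiny layer sizes, and all weight values are made up so the arithmetic can be checked by hand.

```python
# Plain-Python sketch of the forward pass O = relu(X W1 + b1) W2 + b2.
# Helper names, layer sizes and weight values are hypothetical, for illustration only.

def relu(row):
    # Element-wise max(x, 0), the same rule as torch.max(X, 0) in the script.
    return [max(x, 0.0) for x in row]

def affine(row, W, b):
    # Computes row @ W + b for one sample; W is stored as a list of rows.
    return [sum(x * W[i][j] for i, x in enumerate(row)) + b[j]
            for j in range(len(b))]

def mlp_forward(x, W1, b1, W2, b2):
    h = relu(affine(x, W1, b1))   # hidden layer with ReLU activation
    return affine(h, W2, b2)      # output layer: raw scores, no softmax yet

# 3 inputs -> 2 hidden units -> 2 outputs
W1 = [[1.0, -1.0], [0.0, 2.0], [-1.0, 0.5]]
b1 = [0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0]]
b2 = [0.5, -0.5]
x = [1.0, 2.0, 3.0]

out = mlp_forward(x, W1, b1, W2, b2)
print(out)  # hidden pre-activations are [-2.0, 4.5]; ReLU zeroes the first one
```

The output layer returns raw scores rather than probabilities, which is why the script can pass them straight to a loss that applies softmax itself.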

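# 3.10 Concise implementation of a multilayer perceptron in PyTorch
import torch
from torch import nn
from torch.nn import init
import d2lzh as d2l

# (1) Load and read the data
# Use the Fashion-MNIST dataset with a batch size of 256.
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

# (2) Define the model
# One extra fully connected layer is added as the hidden layer. It has 256 hidden
# units and uses ReLU as its activation function.
num_inputs = 784
num_outputs = 10
num_hiddens = 256
net = nn.Sequential(
    d2l.FlattenLayer(),
    nn.Linear(num_inputs, num_hiddens),
    nn.ReLU(),
    nn.Linear(num_hiddens, num_outputs),
)
# Initialize every parameter (weights and biases) from a normal distribution.
for param in net.parameters():
    init.normal_(param, mean=0, std=0.01)

# (3) Define the loss function
# Use the PyTorch loss that combines the softmax operation and the cross-entropy computation.
loss = torch.nn.CrossEntropyLoss()

# (4) Define the optimization algorithm
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)

# (5) Train the model
# num_epochs is the number of epochs; other values may give a more accurate classifier.
# params and lr are passed as None because the optimizer performs the updates.
num_epochs = 5
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)

# (6) Predict
# For a set of images (third row of the output), compare the true labels
# (first line of text) with the model's predictions (second line of text).
# next(iter(...)) replaces iter(...).next(), which newer PyTorch releases no longer support.
X, y = next(iter(test_iter))
# The helper name is kept as in the author's module; the book calls it get_fashion_mnist_labels.
true_labels = d2l.get_fashion_mnist_label(y.numpy())
pred_labels = d2l.get_fashion_mnist_label(net(X).argmax(dim=1).detach().numpy())
titles = [true + '\n' + pred for true, pred in zip(true_labels, pred_labels)]
d2l.show_fashion_mnist(X[10:19], titles[10:19])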
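Both scripts rely on a loss that bundles the softmax operation with the cross-entropy computation. The sketch below shows, in plain Python, roughly what that combination works out to for raw output scores: a log-sum-exp over the scores minus the score of the true class, averaged over the batch. The function names `cross_entropy` and `batch_loss` and the example scores are assumptions chosen for illustration, not part of PyTorch or of the book's code.

```python
import math

# Plain-Python sketch of softmax followed by cross-entropy on raw scores.
# Function names and example values are hypothetical, for illustration only.

def cross_entropy(logits, label):
    # -log(softmax(logits)[label]) = logsumexp(logits) - logits[label].
    # Subtracting the max score first keeps exp() from overflowing.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def batch_loss(batch_logits, labels):
    # Average the per-sample losses over the batch (mean reduction).
    losses = [cross_entropy(z, y) for z, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

scores = [[2.0, 0.0, 0.0],   # model favours class 0
          [0.0, 0.0, 0.0]]   # model is undecided among 3 classes
labels = [0, 2]
loss = batch_loss(scores, labels)
print(loss)
```

For the undecided sample the loss equals log(3), the value any three-class model gets when it assigns equal probability to every class. A confident and correct prediction drives the per-sample loss toward zero.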
