# ud185

## Tips

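* Mean squared loss is often used for regression and binary classification problems
* Forward pass
* Compute the loss
* Backward pass to obtain the gradients
* Update the weights (using the optimizer)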
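
The four steps listed above can be sketched as a single training step. This is a minimal illustration, not the course's code: the toy data, the tiny model, and the choice of `optim.SGD` are assumptions made only to keep the example self-contained.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Hypothetical toy data: 8 samples, 4 features, 3 classes
inputs = torch.randn(8, 4)
labels = torch.randint(0, 3, (8,))

# Hypothetical tiny model that outputs log-probabilities
model = nn.Sequential(nn.Linear(4, 3), nn.LogSoftmax(dim=1))
criterion = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

weights_before = model[0].weight.detach().clone()

optimizer.zero_grad()              # clear gradients left over from a previous step
log_ps = model(inputs)             # 1. forward pass
loss = criterion(log_ps, labels)   # 2. compute the loss
loss.backward()                    # 3. backward pass fills .grad on each parameter
optimizer.step()                   # 4. the optimizer updates the weights

weights_changed = not torch.equal(weights_before, model[0].weight)
```

After `optimizer.step()`, `weights_changed` should be true, which is a quick way to confirm that the update actually happened.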

## Building a classifier

```python
import torch
from torch import nn, optim
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = F.log_softmax(self.fc4(x), dim=1)

        return x

#model = nn.Sequential(nn.Linear(784, 128),
#                      nn.ReLU(),
#                      nn.Linear(128, 64),
#                      nn.ReLU(),
#                      nn.Linear(64, 10),
#                      nn.LogSoftmax(dim=1))
```

## Validating the classifier

```python
# trainloader and testloader are assumed to be DataLoader objects defined elsewhere
model = Classifier()
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.003)

epochs = 30
steps = 0

train_losses, test_losses = [], []
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:

        optimizer.zero_grad()

        log_ps = model(images)
        loss = criterion(log_ps, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    else:
        test_loss = 0
        accuracy = 0

        # Turn off gradients for validation, saves memory and computations
        with torch.no_grad():
            for images, labels in testloader:
                log_ps = model(images)
                # .item() keeps plain Python floats in the running totals
                test_loss += criterion(log_ps, labels).item()

                ps = torch.exp(log_ps)
                top_p, top_class = ps.topk(1, dim=1)
                equals = top_class == labels.view(*top_class.shape)
                accuracy += torch.mean(equals.type(torch.FloatTensor)).item()

        train_losses.append(running_loss/len(trainloader))
        test_losses.append(test_loss/len(testloader))

        print("Epoch: {}/{}.. ".format(e+1, epochs),
              "Training Loss: {:.3f}.. ".format(running_loss/len(trainloader)),
              "Test Loss: {:.3f}.. ".format(test_loss/len(testloader)),
              "Test Accuracy: {:.3f}".format(accuracy/len(testloader)))
```

## Overfitting

* *Early stopping*: use the model with the lowest validation loss, which requires saving the model frequently.
* *Dropout*: randomly drop units during training
  * `self.dropout = nn.Dropout(p=0.2)`

    ```python
    class Classifier(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(784, 256)
            self.fc2 = nn.Linear(256, 128)
            self.fc3 = nn.Linear(128, 64)
            self.fc4 = nn.Linear(64, 10)

            # Dropout module with 0.2 drop probability
            self.dropout = nn.Dropout(p=0.2)

        def forward(self, x):
            # make sure input tensor is flattened
            x = x.view(x.shape[0], -1)

            # Now with dropout
            x = self.dropout(F.relu(self.fc1(x)))
            x = self.dropout(F.relu(self.fc2(x)))
            x = self.dropout(F.relu(self.fc3(x)))

            # output so no dropout here
            x = F.log_softmax(self.fc4(x), dim=1)

            return x
    ```

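    During validation, turn dropout off: call `model.eval()` to put the model in inference mode, compute the test loss and accuracy, then call `model.train()` to switch back to training mode and re-enable dropout.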

    ```python
            # Turn off gradients for validation, saves memory and computations
            with torch.no_grad():
                model.eval() # evaluation mode: dropout is turned off
                for images, labels in testloader:
                    log_ps = model(images)
                    ...

            model.train() # back to training mode: dropout is active again
    ```
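
The early stopping idea from the list above can be sketched in plain Python. This is only an illustration: the `EarlyStopping` helper, its `patience` parameter, and the validation losses are hypothetical and do not come from the course. In a real run, the saved state would be something like a copy of `model.state_dict()`.

```python
import copy

class EarlyStopping:
    """Hypothetical helper: keep the best state seen so far and
    stop after `patience` epochs without improvement."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_state = None
        self.bad_epochs = 0

    def step(self, val_loss, state):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_state = copy.deepcopy(state)  # snapshot of the best model
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up validation losses standing in for a real training run
val_losses = [0.90, 0.70, 0.60, 0.65, 0.66, 0.70, 0.50]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(val_losses):
    # In practice `state` would be model.state_dict()
    if stopper.step(val_loss, state={"epoch": epoch}):
        stopped_at = epoch
        break
```

Keeping the snapshot in memory is one design choice. Writing it to disk with `torch.save` each time the validation loss improves is another, and it survives interrupted runs.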

## Saving and loading models

* `torch.save` and `torch.load`
* The parameters of a `PyTorch` network are stored in the model's `state_dict`. The state dictionary holds the weight and bias tensors of every layer

```python
torch.save(model.state_dict(), 'checkpoint.pth')
state_dict = torch.load('checkpoint.pth')
model.load_state_dict(state_dict)
```

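* Loading the state into a network requires calling `model.load_state_dict(state_dict)`
  * The state dictionary loads successfully only when **the model architecture exactly matches the architecture stored in the checkpoint**
* You can store both the architecture information and the state dictionary in the checkpoint by building a dictionary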

  ```python
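  # Assumes `model` is an fc_model.Network instance (course helper module) that exposes `hidden_layers`
  checkpoint = {'input_size': 784,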
                'output_size': 10,
                'hidden_layers': [each.out_features for each in model.hidden_layers],
                'state_dict': model.state_dict()}

  torch.save(checkpoint, 'checkpoint.pth')

  def load_checkpoint(filepath):
      checkpoint = torch.load(filepath)
      model = fc_model.Network(checkpoint['input_size'],
                               checkpoint['output_size'],
                               checkpoint['hidden_layers'])
      model.load_state_dict(checkpoint['state_dict'])

      return model

  model = load_checkpoint('checkpoint.pth')
  ```
* To load the model again later, just call the `load_checkpoint` function
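
To make the "architecture must match" rule concrete, the sketch below inspects a state dictionary and then tries to load it into a network with a different hidden size. The two small `nn.Sequential` models are hypothetical stand-ins chosen for brevity, not the course's network.

```python
import torch
from torch import nn

# Hypothetical small network; the keys of its state dict are parameter names
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

# Map each parameter name to the shape of its tensor
shapes = {name: tuple(t.shape) for name, t in model.state_dict().items()}

# A network with a different hidden size has differently shaped tensors,
# so loading the same state dict raises a RuntimeError
other = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
try:
    other.load_state_dict(model.state_dict())
    load_failed = False
except RuntimeError:
    load_failed = True
```

Here `shapes` contains entries such as `'0.weight'` with shape `(128, 784)`, and `load_failed` ends up true because the hidden layer sizes differ.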

## Loading image data

```python
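import torch
from torchvision import datasets, transforms
import helper # plotting helper module from the course materials (assumed to be importable)

data_dir = 'Cat_Dog_data/train' # directory containing the dataset
# Transforms: resize, crop, then convert to a tensor
transform = transforms.Compose([transforms.Resize(255),
                                 transforms.CenterCrop(224),
                                 transforms.ToTensor()])
# Load the dataset
dataset = datasets.ImageFolder(data_dir, transform=transform)
# use the ImageFolder dataset to create the DataLoader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

# Check the data loader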
images, labels = next(iter(dataloader))
helper.imshow(images[0], normalize=False)
```

### Data augmentation

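> A common strategy when training neural networks is to introduce randomness into the input data itself. For example, you can randomly rotate, flip, scale, and/or crop images during training. This helps the network generalize, because it sees the same images at different positions, sizes, and orientations.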

```python
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.5, 0.5, 0.5], 
                                                            [0.5, 0.5, 0.5])])
```

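You also need to normalize the images with `transforms.Normalize`. Pass a list of per-channel means and a list of per-channel standard deviations; each color channel is then normalized as `(input - mean) / std`.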
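
As a worked example of that arithmetic, the sketch below applies the same per-channel formula by hand with plain tensor operations. The tiny 3-channel "image" is made up for illustration; with a mean and standard deviation of 0.5, values in `[0, 1]` are mapped to `[-1, 1]`.

```python
import torch

# Hypothetical 3-channel 2x2 "image" with values in [0, 1], as ToTensor() would produce
image = torch.tensor([[[0.0, 0.5], [1.0, 0.25]]]).repeat(3, 1, 1)

# Per-channel statistics, reshaped so they broadcast over height and width
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# The same formula that Normalize applies to each channel
normalized = (image - mean) / std
```

A pixel value of 0.0 becomes -1.0, 0.5 becomes 0.0, and 1.0 becomes 1.0.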

## Transfer learning

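* Load a pretrained model such as [DenseNet](http://pytorch.org/docs/0.3.0/torchvision/models.html#id5): `model = models.densenet121(pretrained=True)` (recent torchvision releases deprecate `pretrained=True` in favor of the `weights` argument)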

```python
# Freeze the feature-layer parameters so we don't backprop through them
for param in model.parameters():
    param.requires_grad = False
```
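
The effect of freezing can be checked without downloading a pretrained network. The sketch below uses a hypothetical `TinyNet` with a `features` part and a `classifier` part as a stand-in for DenseNet. It freezes everything, replaces the classifier, and lists which parameters remain trainable.

```python
import torch
from torch import nn, optim

class TinyNet(nn.Module):
    """Hypothetical stand-in for a pretrained model."""

    def __init__(self):
        super().__init__()
        self.features = nn.Linear(16, 8)
        self.classifier = nn.Linear(8, 2)

    def forward(self, x):
        return self.classifier(torch.relu(self.features(x)))

model = TinyNet()

# Freeze every existing parameter
for param in model.parameters():
    param.requires_grad = False

# A newly created layer has requires_grad=True by default
model.classifier = nn.Linear(8, 2)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]

# Hand the optimizer only the parameters that should be updated
optimizer = optim.Adam(model.classifier.parameters(), lr=0.003)
```

Only the new classifier's weight and bias appear in `trainable`, so backpropagation leaves the frozen feature layers untouched.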

* Check whether a GPU is available

  ```python
  device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
  ```

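  > Like other deep learning frameworks, PyTorch uses [CUDA](https://developer.nvidia.com/cuda-zone) to run the forward and backward passes efficiently on a GPU. In PyTorch, you move the model parameters and other tensors into GPU memory with `model.to('cuda')`. You can move them back from the GPU to the CPU with `model.to('cpu')`, for example when you need to operate on the network's output outside of PyTorch.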

  ```python
  model.to(device)
  for epoch in range(epochs):
      for ii, (inputs, labels) in enumerate(trainloader):
          # Move input and label tensors to the default device
          inputs, labels = inputs.to(device), labels.to(device)
  ```
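
The round trip described in the quote above (to the device for computation, back to the CPU for work outside PyTorch) can be sketched as follows. The small linear model and random inputs are hypothetical, and the code falls back to the CPU when no GPU is present.

```python
import torch
from torch import nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical small model and batch, both moved to the chosen device
model = nn.Linear(4, 2).to(device)
inputs = torch.randn(3, 4).to(device)

with torch.no_grad():
    outputs = model(inputs)

# Move the result back to the CPU before converting it to a NumPy array
outputs_np = outputs.to("cpu").numpy()
```

The model and its inputs must live on the same device; mixing a GPU model with CPU inputs raises an error.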

## Review

```python
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt

import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models
from collections import OrderedDict

data_dir = 'Cat_Dog_data'

# Define transforms for the training data and testing data
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406],
                                                            [0.229, 0.224, 0.225])])

test_transforms = transforms.Compose([transforms.Resize(255),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor(),
                                      transforms.Normalize([0.485, 0.456, 0.406],
                                                           [0.229, 0.224, 0.225])])

# Pass transforms in here, then run the next cell to see how the transforms look
train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)
test_data = datasets.ImageFolder(data_dir + '/test', transform=test_transforms)

trainloader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(test_data, batch_size=64)


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = models.densenet121(pretrained=True)

for param in model.parameters():
    param.requires_grad = False

model.classifier = nn.Sequential(OrderedDict([
                          ('fc1', nn.Linear(1024, 500)),
                          ('relu', nn.ReLU()),
                          ('fc2', nn.Linear(500, 2)),
                          ('output', nn.LogSoftmax(dim=1))
                          ]))

criterion = nn.NLLLoss()
# Only the new classifier is trained; the feature layers are frozen
optimizer = optim.Adam(model.classifier.parameters(), lr=0.003)
model.to(device);

epochs = 30
steps = 0
running_loss = 0
print_every = 5
train_losses, test_losses = [], []

for epoch in range(epochs):

    for images, labels in trainloader:
        steps += 1

        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad() # clearing the gradients

        log_ps = model(images)
        loss = criterion(log_ps, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        if steps % print_every == 0:
            model.eval() # set the network to evaluation mode
            test_loss = 0
            accuracy = 0
            with torch.no_grad(): # gradients are not needed for evaluation
                for images, labels in testloader:

                    images, labels = images.to(device), labels.to(device)

                    logps = model(images)
                    test_loss += criterion(logps, labels).item()

                    # calculate our accuracy
                    ps = torch.exp(logps)
                    top_ps, top_class = ps.topk(1, dim=1)
                    equality = top_class == labels.view(*top_class.shape)
                    accuracy += torch.mean(equality.type(torch.FloatTensor)).item()

            # running_loss covers the last print_every training batches
            print("Epoch: {}/{}.. ".format(epoch+1, epochs),
                  "Training Loss: {:.3f}.. ".format(running_loss/print_every),
                  "Test Loss: {:.3f}.. ".format(test_loss/len(testloader)),
                  "Test Accuracy: {:.3f}".format(accuracy/len(testloader)))

            running_loss = 0 # reset only after reporting
            model.train() # back to training mode
```

