Dive into Deep Learning 3.1: Linear Regression


Author: har, 2023-06-24 16:30:49, visible to everyone, 130 views

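I. Implementing Linear Regression from Scratch

1. Generating the data and the model

Our model is 𝐲=𝐗𝐰+𝑏+𝜖. Each sample x and the weight vector w are 2-dimensional. Each row of features holds one sample [x1, x2], and labels holds the matching targets produced by the model, noise included.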

import random  # used by data_iter to shuffle sample indices

import torch

def synthetic_data(w, b, num_examples):  #@save
    """Generate y = Xw + b + noise."""
    X = torch.normal(0, 1, (num_examples, len(w)))
    y = torch.matmul(X, w) + b
    y += torch.normal(0, 0.01, y.shape)
    return X, y.reshape((-1, 1))

true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = synthetic_data(true_w, true_b, 1000)
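As a quick sanity check on the generated data, the sketch below prints the shapes and confirms that the gap between labels and the noise-free model is just the added noise. The fixed seed and the name residual are my own additions for illustration; they are not part of the original notes.

```python
import torch

def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise (same recipe as above)."""
    X = torch.normal(0, 1, (num_examples, len(w)))
    y = torch.matmul(X, w) + b
    y += torch.normal(0, 0.01, y.shape)
    return X, y.reshape((-1, 1))

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility
true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = synthetic_data(true_w, true_b, 1000)

print(features.shape)  # torch.Size([1000, 2])
print(labels.shape)    # torch.Size([1000, 1])

# Subtracting the noise-free prediction leaves only the noise term,
# so the standard deviation of the residual should be close to 0.01.
residual = labels - (features @ true_w + true_b).reshape((-1, 1))
print(float(residual.std()))
```

Note that labels is reshaped to a column, so it lines up with the (batch_size, 1) output of the model later on.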

2. Reading the dataset

On each iteration we read batch_size samples from features and the matching batch_size targets from labels.

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    # Samples are read in random order, with no particular sequence
    random.shuffle(indices)
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(
            indices[i: min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]
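To see how the iterator behaves, here is a small sketch on a toy dataset of 7 samples with a batch size of 3. The toy tensors are made up for illustration: row k of features is [2k, 2k+1] and its label is k, which makes it easy to confirm that features and labels stay paired after shuffling.

```python
import random
import torch

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # visit samples in random order
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(
            indices[i: min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]

# Hypothetical toy data: 7 samples, 2 features each
features = torch.arange(14, dtype=torch.float32).reshape(7, 2)
labels = torch.arange(7, dtype=torch.float32).reshape(7, 1)

batches = list(data_iter(3, features, labels))
batch_sizes = [len(X) for X, y in batches]
print(batch_sizes)  # [3, 3, 1]

# Each feature row still matches its own label: first column == 2 * label
paired = all(bool((X[:, 0] == 2 * y[:, 0]).all()) for X, y in batches)
print(paired)  # True
```

The last batch is smaller because 7 is not divisible by 3; the min(...) in the slice is what keeps the index range inside the dataset.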

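3. Initializing the model parameters

We initialize the weights by sampling random numbers from a normal distribution with mean 0 and standard deviation 0.01, and we initialize the bias to 0.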

w = torch.normal(0, 0.01, size=(2,1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

4. Defining the model and the loss function

def linreg(X, w, b):  #@save
    """The linear regression model."""
    return torch.matmul(X, w) + b

def squared_loss(y_hat, y):  #@save
    """Squared loss, one value per sample."""
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2
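A small worked example may help show why squared_loss reshapes y. The numbers below are made up: with predictions 2 and 4 and targets 1 and 1, the per-sample losses are 1²/2 = 0.5 and 3²/2 = 4.5. Without the reshape, a (2, 1) tensor minus a (2,) tensor would broadcast to (2, 2), which is not what we want.

```python
import torch

def squared_loss(y_hat, y):
    """Squared loss, one value per sample."""
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2

y_hat = torch.tensor([[2.0], [4.0]])  # predictions, shape (2, 1)
y = torch.tensor([1.0, 1.0])          # targets, shape (2,)

loss = squared_loss(y_hat, y)
print(loss.shape)  # torch.Size([2, 1])
print(loss)        # values 0.5 and 4.5

# Without the reshape, broadcasting silently produces a 2x2 matrix
broadcast_shape = (y_hat - y).shape
print(broadcast_shape)  # torch.Size([2, 2])
```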

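5. Defining the optimization algorithm

Each step runs gradient descent on a minibatch of batch_size samples. Because the training loop sums the loss over the minibatch before calling backward, the update divides the gradient by batch_size, so the step size does not depend on the batch size.

def sgd(params, lr, batch_size):  #@save
    """Minibatch stochastic gradient descent."""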
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
            param.grad.zero_()
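To make the update rule concrete, the following sketch applies one sgd step to a hand-picked parameter. The toy loss sum(w²) and the values of lr and batch_size are assumptions chosen so the arithmetic is easy to follow: the gradient is 2w = [2, 4], so the step is 0.1 × [2, 4] / 2 = [0.1, 0.2].

```python
import torch

def sgd(params, lr, batch_size):
    """Minibatch stochastic gradient descent."""
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
            param.grad.zero_()  # reset so gradients do not accumulate

w = torch.tensor([1.0, 2.0], requires_grad=True)
(w ** 2).sum().backward()  # toy loss; gradient is 2 * w = [2, 4]

sgd([w], lr=0.1, batch_size=2)
print(w)       # values 0.9 and 1.8
print(w.grad)  # zeros after the step
```

The torch.no_grad() context is what allows the in-place update of a tensor that requires gradients, and it keeps the update itself out of the computation graph.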

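6. Training

We call the optimization algorithm sgd to update the model parameters.

batch_size = 10  # assumed value; the original snippet leaves batch_size undefined
lr = 0.03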
num_epochs = 3
net = linreg
loss = squared_loss

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
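        l = loss(net(X, w, b), y)  # minibatch loss on X and y
        # l has shape (batch_size, 1) rather than being a scalar, so its elements
        # are summed and the gradient with respect to [w, b] is taken from that sum
        l.sum().backward()
        sgd([w, b], lr, batch_size)  # update the parameters using their gradients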
    with torch.no_grad():
        train_l = loss(net(features, w, b), labels)
        print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')
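Since the data is synthetic, we know the true parameters and can check how close training gets. The sketch below is a compact, self-contained rerun of the steps above followed by that comparison. The seeds, batch_size = 10, and the names X_all, y_all, w_err and b_err are my own assumptions for illustration.

```python
import random
import torch

torch.manual_seed(0)  # assumption: fixed seeds, only for reproducibility
random.seed(0)

true_w, true_b = torch.tensor([2, -3.4]), 4.2
X_all = torch.normal(0, 1, (1000, 2))
noise = torch.normal(0, 0.01, (1000,))
y_all = (X_all @ true_w + true_b + noise).reshape(-1, 1)

w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr, num_epochs, batch_size = 0.03, 3, 10  # batch_size is an assumed value

for epoch in range(num_epochs):
    indices = list(range(len(X_all)))
    random.shuffle(indices)
    for i in range(0, len(indices), batch_size):
        idx = torch.tensor(indices[i:i + batch_size])
        l = (X_all[idx] @ w + b - y_all[idx]) ** 2 / 2  # per-sample loss
        l.sum().backward()
        with torch.no_grad():  # same update as sgd above
            for p in (w, b):
                p -= lr * p.grad / batch_size
                p.grad.zero_()

# Compare the learned parameters with the ones used to generate the data
w_err = true_w - w.detach().reshape(true_w.shape)
b_err = true_b - b.detach()
print(f'error in w: {w_err}')
print(f'error in b: {b_err}')
```

Both errors should come out small relative to the parameters themselves, which is what we expect for a problem this simple.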

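Tips:

Q1: Why do we sum the loss before calling backward to compute the gradient?
A: loss is a tensor of shape [batch_size, 1] whose elements are the per-sample losses. Called without arguments, PyTorch's backward only works on a scalar, so before backpropagation we add the elements up into a single scalar and then call backward.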
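The point in Q1 can be checked directly. In this sketch (toy values, not from the original notes), calling backward on a non-scalar tensor without arguments raises a RuntimeError, while summing first works. Passing a tensor of ones as the gradient argument gives the same result as summing.

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Per-element "loss" of shape (3,): backward() without arguments fails
try:
    (w ** 2).backward()
    raised = False
except RuntimeError:
    raised = True
print(raised)  # True

# Summing first gives a scalar, so backward() works; gradient is 2 * w
(w ** 2).sum().backward()
grad_from_sum = w.grad.clone()
print(grad_from_sum)  # values 2, 4, 6

# Equivalent: supply a ones tensor as the gradient of the non-scalar output
w.grad.zero_()
(w ** 2).backward(torch.ones(3))
grad_from_ones = w.grad.clone()
print(torch.equal(grad_from_sum, grad_from_ones))  # True
```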

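Q2: What does detach() do?
A: It returns a tensor with the same values that is cut off from the computation graph, so no gradient propagates back through it and the graph is not extended. Use it when you only care about a tensor's values and do not need its gradient information.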
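A short sketch of what detach() changes, using made-up values. Here y = x * x, and u = y.detach() is treated as a constant, so the gradient of z = u * x with respect to x is just u. Without detach, the same product is x³ and its gradient is 3x².

```python
import torch

x = torch.arange(4.0, requires_grad=True)  # [0, 1, 2, 3]
y = x * x
u = y.detach()          # same values as y, but no link back to x
print(u.requires_grad)  # False

z = u * x               # u acts as a constant here
z.sum().backward()
grad_detached = x.grad.clone()
print(grad_detached == u)  # tensor([True, True, True, True])

# For comparison, without detach the gradient flows through y as well
x.grad.zero_()
(y * x).sum().backward()
grad_full = x.grad.clone()
print(grad_full)        # 3 * x**2, i.e. values 0, 3, 12, 27
```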
