Loss decreases when using semi-hard triplets

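Here is a short review of triplet learning. I'm using three convolutional neural networks with shared weights to generate face embeddings (anchor, positive, negative), with the loss described here.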

Triplet loss:

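import tensorflow as tf

anchor_output = ...  # shape [None, 128]
positive_output = ...  # shape [None, 128]
negative_output = ...  # shape [None, 128]

# Squared Euclidean distance per triplet, summed over the embedding axis
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), 1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), 1)

# Hinge on the margin (a hyperparameter defined elsewhere), then average over the batch
loss = tf.maximum(0., margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)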

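When I select only the hard triplets (distance(anchor, positive) < distance(anchor, negative)), the loss is very small: 0.08. When I select all triplets, the loss becomes larger: 0.17855. These are just test values for 10 000 triplets, but I get similar results on the actual set (600 000 triplets).

Why does this happen? Is it correct?

I'm using SGD, starting with a learning rate of 0.001.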

2 Answers

3
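Here is a quick recap of the terminology on triplet hardness:
- easy triplets: triplets that have a loss of 0, because d(a,p) + margin < d(a,n)
- hard triplets: triplets where the negative is closer to the anchor than the positive, i.e. d(a,n) < d(a,p)
- semi-hard triplets: triplets where the negative is not closer to the anchor than the positive, but which still have a positive loss: d(a,p) < d(a,n) < d(a,p) + margin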
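The three categories above can be sketched as a small classifier. This is a minimal pure-Python illustration, not code from the question: the function name `triplet_kind`, the margin of 0.2, and the sample distances are all hypothetical, and `d_pos` / `d_neg` stand for d(a,p) and d(a,n). Exact ties are assigned to the semi-hard bucket here, which is an assumption rather than part of the definitions.

```python
def triplet_kind(d_pos, d_neg, margin):
    """Classify one triplet from its anchor-positive and anchor-negative distances."""
    if d_neg < d_pos:
        return "hard"       # negative is closer to the anchor than the positive
    if d_pos + margin < d_neg:
        return "easy"       # hinge loss max(0, margin + d_pos - d_neg) is 0
    return "semi-hard"      # negative is not closer, but still inside the margin


# Hypothetical (d_pos, d_neg) pairs, chosen only for illustration
margin = 0.2
samples = [(0.5, 0.3), (0.5, 0.6), (0.5, 1.0)]
kinds = [triplet_kind(d_pos, d_neg, margin) for d_pos, d_neg in samples]
print(kinds)
```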
What you describe here:

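When I select only the hard triplets (distance(anchor, positive) < distance(anchor, negative))

is actually selecting the semi-hard triplets and the easy triplets. You are removing the hard triplets, so your loss is smaller.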
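To see the effect of that filter on the mean loss, here is a small sketch with made-up distances. The values, the margin of 0.2, and the helper `mean_triplet_loss` are assumptions for illustration, not data from the question. Keeping only triplets with d(a,p) < d(a,n) drops the hard ones, and every hard triplet has a loss above the margin while every remaining triplet has a loss of at most the margin.

```python
def mean_triplet_loss(pairs, margin):
    """Mean of max(0, margin + d_pos - d_neg) over (d_pos, d_neg) pairs."""
    losses = [max(0.0, margin + d_pos - d_neg) for d_pos, d_neg in pairs]
    return sum(losses) / len(losses)


margin = 0.2  # hypothetical value
triplets = [
    (0.5, 0.3),  # hard: negative closer than positive
    (0.5, 0.6),  # semi-hard: inside the margin
    (0.5, 1.0),  # easy: zero loss
    (0.9, 0.4),  # hard
]

# The filter from the question: keep only d(a,p) < d(a,n)
kept = [(d_pos, d_neg) for d_pos, d_neg in triplets if d_pos < d_neg]

loss_all = mean_triplet_loss(triplets, margin)
loss_kept = mean_triplet_loss(kept, margin)
print(loss_all, loss_kept)
```

With these toy numbers the mean over the kept triplets is lower than the mean over all of them, which matches the direction of the gap reported in the question.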
Answered on 2024-05-07T02:30:26+08:00
1

What do you get after taking tf.sqrt(d_pos) and tf.sqrt(d_neg)?
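For reference, the difference between the squared and the plain Euclidean variant can be sketched as below. This is a pure-Python illustration with hypothetical 2-D embeddings and a hypothetical margin of 1.0; it only shows how the two variants are computed and makes no claim about which one suits the model in the question. Because the square root is monotonic, it never changes whether d(a,n) < d(a,p), but it does change how the gap compares to the margin.

```python
import math


def squared_distance(u, v):
    """Sum of squared coordinate differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin, use_sqrt=False):
    """Hinge triplet loss on squared distances, or on Euclidean ones if use_sqrt."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    if use_sqrt:
        d_pos, d_neg = math.sqrt(d_pos), math.sqrt(d_neg)
    return max(0.0, margin + d_pos - d_neg)


# Hypothetical embeddings: d(a,p) = 5 and d(a,n) = 5.5 before squaring
anchor, positive, negative = [0.0, 0.0], [3.0, 4.0], [0.0, 5.5]
margin = 1.0

loss_squared = triplet_loss(anchor, positive, negative, margin)
loss_sqrt = triplet_loss(anchor, positive, negative, margin, use_sqrt=True)
print(loss_squared, loss_sqrt)
```

In this toy case the same triplet has zero loss with squared distances but a positive loss with Euclidean distances, so the same margin value selects a different set of semi-hard triplets in each variant.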

Answered on 2024-05-07T02:30:26+08:00
