# hadamard

## HadamardCode

Bases: `Module`
A layer for applying an error correcting code to your outputs.
This class is intentionally not @traceable (models and layers are handled by a different process).
See https://papers.nips.cc/paper/9070-error-correcting-output-codes-improve-probability-estimation-and-adversarial-robustness-of-deep-neural-networks. Note that for best effectiveness, the model leading into this layer should be split into multiple independent chunks, whose outputs this layer can combine together in order to perform the code lookup.
```python
# Use as a drop-in replacement for your softmax layer:
def __init__(self, classes):
    self.fc1 = nn.Linear(1024, 64)
    self.fc2 = nn.Linear(64, classes)
def forward(self, x):
    x = fn.relu(self.fc1(x))
    x = fn.softmax(self.fc2(x), dim=-1)
    return x
# ----- vs ------
def __init__(self, classes):
    self.fc1 = nn.Linear(1024, 64)
    self.fc2 = HadamardCode(64, classes)
def forward(self, x):
    x = fn.relu(self.fc1(x))
    x = self.fc2(x)
    return x
```
```python
# Use to combine multiple feature heads for a final output (biggest adversarial hardening benefit):
def __init__(self, classes):
    self.fc1 = nn.ModuleList([nn.Linear(1024, 16) for _ in range(4)])
    self.fc2 = HadamardCode([16]*4, classes)
def forward(self, x):
    x = [fn.relu(fc(x)) for fc in self.fc1]
    x = self.fc2(x)
    return x
```
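The snippets above are abbreviated. For context, here is a minimal end-to-end sketch (not part of the original docstring) of the multi-head pattern inside a complete `nn.Module`. It assumes standard PyTorch, that `fn` in the snippets is `torch.nn.functional`, and that `HadamardCode` is importable from the module path implied by the source file listed at the bottom of this page; the class name `HadamardClassifier` is made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as fn

# Assumed import path, derived from fastestimator/layers/pytorch/hadamard.py
from fastestimator.layers.pytorch.hadamard import HadamardCode


class HadamardClassifier(nn.Module):
    """Toy network whose four independent 16-dim heads feed one HadamardCode layer."""
    def __init__(self, classes: int) -> None:
        super().__init__()
        self.fc1 = nn.ModuleList([nn.Linear(1024, 16) for _ in range(4)])
        self.fc2 = HadamardCode([16] * 4, classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        heads = [fn.relu(fc(x)) for fc in self.fc1]  # list of (Batch, 16) chunks
        return self.fc2(heads)                       # (Batch, classes) class probabilities


model = HadamardClassifier(classes=10)
probs = model(torch.rand(8, 1024))  # shape (8, 10)
```

Since the layer's output is already a probability distribution rather than logits (see `max_prob` and `power` below), a logits-based loss like `nn.CrossEntropyLoss` is presumably not a drop-in; a probability-aware pairing such as `nn.NLLLoss` on `probs.log()` would be the natural choice.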
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `in_features` | `Union[int, List[int]]` | How many input features there are (inputs should be of shape (Batch, N) or [(Batch, N), ...]). | required |
| `n_classes` | `int` | How many output classes to map onto. | required |
| `code_length` | `Optional[int]` | How long of an error correcting code to use. Should be a positive multiple of 2. If not provided, the smallest power of 2 which is >= the number of classes will be used. | `None` |
| `max_prob` | `float` | The maximum probability that can be assigned to a class. For numeric stability this must be less than 1.0. Intuitively it makes sense to keep this close to 1, but to get adversarial training benefits it should be noticeably less than 1, for example 0.95 or even 0.8. | `0.95` |
| `power` | `float` | The power parameter used by Inverse Distance Weighting when transforming Hadamard class distances into a class probability distribution. A value of 1.0 gives an intuitive mapping to probabilities, but small values such as 0.25 appear to give slightly better adversarial benefits. Large values like 2 or 3 give slightly faster convergence at the expense of adversarial performance. Must be greater than zero. | `1.0` |
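To make these arguments concrete, here is a hedged construction sketch. The values are illustrative, chosen only to satisfy the constraints in the table; the Inverse Distance Weighting snippet shows the general idea behind `power` and deliberately ignores the layer's `max_prob` clamp and any internal numerical-stability handling.

```python
import torch

# Assumed import path, derived from fastestimator/layers/pytorch/hadamard.py
from fastestimator.layers.pytorch.hadamard import HadamardCode

# Explicit hyperparameters: a 32-bit code (positive multiple of 2), a softer
# probability ceiling, and the small IDW power the table suggests for
# adversarial robustness. Omitting code_length lets the layer pick one.
layer = HadamardCode(in_features=64, n_classes=10, code_length=32, max_prob=0.8, power=0.25)

# Conceptually, Inverse Distance Weighting turns per-class code distances d
# into probabilities proportional to 1 / d**power:
d = torch.tensor([0.4, 1.7, 2.9])  # hypothetical code distances for a 3-class toy example
w = 1.0 / d.pow(0.25)              # power = 0.25
p = w / w.sum()                    # closest code gets the highest probability
```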
Raises:

| Type | Description |
|---|---|
| `ValueError` | If one of the provided arguments is outside of its allowed range. |
Source code in fastestimator/fastestimator/layers/pytorch/hadamard.py