data

Data

Bases: ChainMap[str, Any]

A class which contains prediction and batch data.

This class is intentionally not @traceable.

Data objects can be interacted with as if they were regular dictionaries. They are, however, actually a combination of two dictionaries: a dictionary for trace communication and a dictionary of prediction+batch data. In general, data written into the trace dictionary will be logged by the system, whereas data in the pred+batch dictionary will not. We therefore provide helper methods to write entries into Data which are either intended or not intended for logging.

```python
d = fe.util.Data({"a":0, "b":1, "c":2})
a = d["a"]  # 0
d.write_with_log("d", 3)
d.write_without_log("e", 5)
d.write_with_log("a", 4)
a = d["a"]  # 4
r = d.read_logs()  # {"d":3, "a":4}
```

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `batch_data` | `Optional[MutableMapping[str, Any]]` | The batch data dictionary. In practice this is itself often a ChainMap containing separate prediction and batch dictionaries. | `None` |

Source code in fastestimator/fastestimator/util/data.py
class Data(ChainMap[str, Any]):
    """A class which contains prediction and batch data.

    This class is intentionally not @traceable.

    Data objects can be interacted with as if they were regular dictionaries. They are, however, actually a
    combination of two dictionaries: a dictionary for trace communication and a dictionary of prediction+batch data.
    In general, data written into the trace dictionary will be logged by the system, whereas data in the pred+batch
    dictionary will not. We therefore provide helper methods to write entries into `Data` which are either intended
    or not intended for logging.

    ```python
    d = fe.util.Data({"a":0, "b":1, "c":2})
    a = d["a"]  # 0
    d.write_with_log("d", 3)
    d.write_without_log("e", 5)
    d.write_with_log("a", 4)
    a = d["a"]  # 4
    r = d.read_logs()  # {"d":3, "a":4}
    ```

    Args:
        batch_data: The batch data dictionary. In practice this is itself often a ChainMap containing separate
            prediction and batch dictionaries.
    """
    maps: List[MutableMapping[str, Any]]

    def __init__(self, batch_data: Optional[MutableMapping[str, Any]] = None) -> None:
        super().__init__({}, batch_data or {}, {})
        self.per_instance_enabled = True  # Can be toggled if you need to block traces from recording detailed info

    def write_with_log(self, key: str, value: Any) -> None:
        """Write a given `value` into the `Data` dictionary with the intent that it be logged.

        Args:
            key: The key to associate with the new entry.
            value: The new entry to be written.
        """
        self.__setitem__(key, value)

    def write_without_log(self, key: str, value: Any) -> None:
        """Write a given `value` into the `Data` dictionary with the intent that it not be logged.

        Args:
            key: The key to associate with the new entry.
            value: The new entry to be written.
        """
        self.maps[1][key] = value

    def write_per_instance_log(self, key: str, value: Any) -> None:
        """Write a given per-instance `value` into the `Data` dictionary for use with detailed loggers (ex. CSVLogger).

        Args:
            key: The key to associate with the new entry.
            value: The new per-instance entry to be written.
        """
        if self.per_instance_enabled:
            self.maps[2][key] = value

    def read_logs(self) -> MutableMapping[str, Any]:
        """Read all values from the `Data` dictionary which were intended to be logged.

        Returns:
            A dictionary of all of the keys and values to be logged.
        """
        return self.maps[0]

    def read_per_instance_logs(self) -> MutableMapping[str, Any]:
        """Read all per-instance values from the `Data` dictionary for detailed logging.

        Returns:
             A dictionary of the keys and values to be logged.
        """
        return self.maps[2]
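
As the Parameters table notes, `batch_data` is often itself a ChainMap of separate prediction and batch dictionaries. A minimal construction sketch (the keys below are invented for illustration):

```python
from collections import ChainMap

import fastestimator as fe

prediction = {"y_pred": 0.7}  # hypothetical prediction dictionary
batch = {"x": 1, "y": 1}      # hypothetical batch dictionary
d = fe.util.Data(batch_data=ChainMap(prediction, batch))
print(d["y_pred"], d["x"])  # 0.7 1
```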

read_logs

Read all values from the Data dictionary which were intended to be logged.

Returns:

| Type | Description |
| --- | --- |
| `MutableMapping[str, Any]` | A dictionary of all of the keys and values to be logged. |

Source code in fastestimator/fastestimator/util/data.py
def read_logs(self) -> MutableMapping[str, Any]:
    """Read all values from the `Data` dictionary which were intended to be logged.

    Returns:
        A dictionary of all of the keys and values to be logged.
    """
    return self.maps[0]
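
A brief usage sketch (key names are invented): only entries written via `write_with_log` end up in the dictionary returned by `read_logs`.

```python
import fastestimator as fe

d = fe.util.Data({"x": 1})
d.write_with_log("accuracy", 0.9)   # stored in the trace/logging dict (maps[0])
d.write_without_log("y_pred", 0.7)  # stored in the pred+batch dict (maps[1])
print(d.read_logs())  # {'accuracy': 0.9}
```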

read_per_instance_logs

Read all per-instance values from the Data dictionary for detailed logging.

Returns:

| Type | Description |
| --- | --- |
| `MutableMapping[str, Any]` | A dictionary of the keys and values to be logged. |

Source code in fastestimator/fastestimator/util/data.py
def read_per_instance_logs(self) -> MutableMapping[str, Any]:
    """Read all per-instance values from the `Data` dictionary for detailed logging.

    Returns:
         A dictionary of the keys and values to be logged.
    """
    return self.maps[2]
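
A short sketch (key name invented): per-instance entries live in their own dictionary (maps[2]), so they do not appear in the regular logs.

```python
import fastestimator as fe

d = fe.util.Data()
d.write_per_instance_log("per_sample_loss", [0.1, 0.4, 0.2])
print(d.read_per_instance_logs())  # {'per_sample_loss': [0.1, 0.4, 0.2]}
print(d.read_logs())               # {} - per-instance data stays separate
```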

write_per_instance_log

Write a given per-instance value into the Data dictionary for use with detailed loggers (ex. CSVLogger).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `key` | `str` | The key to associate with the new entry. | *required* |
| `value` | `Any` | The new per-instance entry to be written. | *required* |

Source code in fastestimator/fastestimator/util/data.py
def write_per_instance_log(self, key: str, value: Any) -> None:
    """Write a given per-instance `value` into the `Data` dictionary for use with detailed loggers (ex. CSVLogger).

    Args:
        key: The key to associate with the new entry.
        value: The new per-instance entry to be written.
    """
    if self.per_instance_enabled:
        self.maps[2][key] = value
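
Per the source above, writes are silently skipped when `per_instance_enabled` is False, which lets traces block detailed per-instance recording. A sketch (key name invented):

```python
import fastestimator as fe

d = fe.util.Data()
d.per_instance_enabled = False  # block detailed recording
d.write_per_instance_log("per_sample_loss", [0.1, 0.4])
print(d.read_per_instance_logs())  # {} - the write was skipped
```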

write_with_log

Write a given value into the Data dictionary with the intent that it be logged.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `key` | `str` | The key to associate with the new entry. | *required* |
| `value` | `Any` | The new entry to be written. | *required* |

Source code in fastestimator/fastestimator/util/data.py
def write_with_log(self, key: str, value: Any) -> None:
    """Write a given `value` into the `Data` dictionary with the intent that it be logged.

    Args:
        key: The key to associate with the new entry.
        value: The new entry to be written.
    """
    self.__setitem__(key, value)
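
Because the logging dictionary sits first in the ChainMap, a logged value shadows any same-named key in the underlying batch data. A sketch:

```python
import fastestimator as fe

d = fe.util.Data({"a": 0})  # "a" lives in the pred+batch dict
d.write_with_log("a", 4)    # "a" now also exists in the logging dict
print(d["a"])               # 4 - the logged value shadows the batch value
print(d.read_logs())        # {'a': 4}
```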

write_without_log

Write a given value into the Data dictionary with the intent that it not be logged.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `key` | `str` | The key to associate with the new entry. | *required* |
| `value` | `Any` | The new entry to be written. | *required* |

Source code in fastestimator/fastestimator/util/data.py
def write_without_log(self, key: str, value: Any) -> None:
    """Write a given `value` into the `Data` dictionary with the intent that it not be logged.

    Args:
        key: The key to associate with the new entry.
        value: The new entry to be written.
    """
    self.maps[1][key] = value
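
A sketch (key name invented): the value is written directly into the pred+batch dictionary, so it remains readable through the Data object but is never returned by `read_logs`.

```python
import fastestimator as fe

d = fe.util.Data()
d.write_without_log("embedding", [0.1, 0.2, 0.3])
print(d["embedding"])  # [0.1, 0.2, 0.3]
print(d.read_logs())   # {} - nothing was marked for logging
```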

FilteredData

A placeholder to indicate that this data instance should not be used.

This class is intentionally not @traceable.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `replacement` | `bool` | Whether to replace the filtered element with another (thus maintaining the number of steps in an epoch but potentially increasing data repetition) or else to shorten the epoch by the number of filtered data points (fewer steps per epoch than expected, but no extra data repetition). Either way, the number of data points within an individual batch will remain the same. Even if `replacement` is True, data will not be repeated until all of the given epoch's data has been traversed (except for at most 1 batch of data which might not appear until after the re-shuffle has occurred). | `True` |

Source code in fastestimator/fastestimator/util/data.py
class FilteredData:
    """A placeholder to indicate that this data instance should not be used.

    This class is intentionally not @traceable.

    Args:
        replacement: Whether to replace the filtered element with another (thus maintaining the number of steps in an
            epoch but potentially increasing data repetition) or else to shorten the epoch by the number of filtered
            data points (fewer steps per epoch than expected, but no extra data repetition). Either way, the number of
            data points within an individual batch will remain the same. Even if `replacement` is True, data will not
            be repeated until all of the given epoch's data has been traversed (except for at most 1 batch of data
            which might not appear until after the re-shuffle has occurred).
    """
    def __init__(self, replacement: bool = True):
        self.replacement = replacement

    def __repr__(self):
        return "FilteredData"