mbodied.data package
Submodules
mbodied.data.recording module
Module for recording data to an h5 file.
- class mbodied.data.recording.Recorder(name: str, observation_space: Dict | str | None = None, action_space: Dict | str | None = None, supervision_space: Dict | str | None = None, out_dir: str = 'saved_datasets', image_keys_to_save: list = None)
Bases: object
Records a dataset to an h5 file. Images for the specified keys are saved to a folder named after the name stem with _frames appended.
Example
```
import os

import numpy as np
from gymnasium import spaces  # assumed source of `spaces`; the original example omits its imports

from mbodied.data.recording import Recorder

# Define the observation and action spaces
observation_space = spaces.Dict({
    'image': spaces.Box(low=0, high=255, shape=(224, 224, 3), dtype=np.uint8),
    'instruction': spaces.Discrete(10),
})
action_space = spaces.Dict({
    'gripper_position': spaces.Box(low=-1, high=1, shape=(3,), dtype=np.float32),
    'gripper_action': spaces.Discrete(2),
})

# Create a recorder instance
recorder = Recorder(name='test_recorder', observation_space=observation_space, action_space=action_space)

# Generate some sample data
num_steps = 10
for i in range(num_steps):
    observation = {
        'image': np.ones((224, 224, 3), dtype=np.uint8),
        'instruction': i,
    }
    action = {
        'gripper_position': np.zeros((3,), dtype=np.float32),
        'gripper_action': 1,
    }
    recorder.record(observation, action)

# Save the statistics
recorder.save_stats()

# Close the recorder
recorder.close()

# Assert that the HDF5 file and directories are created
assert os.path.exists('test_recorder.h5')
assert os.path.exists('test_recorder_frames')
```
- configure_root_spaces(**spaces: Dict)
Configure the root spaces.
- Parameters:
observation_space (spaces.Dict) – Observation space.
action_space (spaces.Dict) – Action space.
supervision_space (spaces.Dict) – Supervision space.
mbodied.data.replaying module
- class mbodied.data.replaying.Replayer(path: str, file_keys: List[str] = None, image_keys_to_save: List[str] = None)
Bases: object
Replays datasets recorded by Recorder.
This class provides methods to read, process, and analyze HDF5 files recorded by the Recorder class.
Example
    replayer = Replayer("data.h5")
    for sample in replayer:
        observation, action = sample
        ...
- get_stats(key='', prefix='') -> dict
Get statistics for a given key in the HDF5 file.
- Parameters:
key (str, optional) – Key in the HDF5 file. Defaults to ''.
prefix (str, optional) – Prefix for the key. Defaults to ''.
- Returns:
Statistics for the given key.
- Return type:
dict
- get_structure(key='', prefix='') -> dict
Get the structure of the HDF5 file.
- Parameters:
key (str, optional) – Key in the HDF5 file. Defaults to ''.
prefix (str, optional) – Prefix for the key. Defaults to ''.
- Returns:
Structure of the HDF5 file.
- Return type:
dict
- get_unique_items(key: str) -> List[str]
Get unique items for a given key.
- Parameters:
key (str) – Key in the HDF5 file.
- Returns:
List of unique items.
- Return type:
List[str]
- pack() -> Sample
Pack all samples into a Sample object with attributes being lists of samples.
- Returns:
Sample object containing all samples.
- Return type:
Sample
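To illustrate what "attributes being lists of samples" means, the following is a minimal sketch that uses plain dictionaries as a stand-in for Sample objects. The helper `pack_sketch` is hypothetical and is not the library's implementation; it only shows the list-of-records to record-of-lists transformation that the description suggests.

```python
from collections import defaultdict


def pack_sketch(samples):
    """Hypothetical helper: turn a list of dicts into a dict of lists.

    Plain dicts stand in for Sample objects here; this is an assumption
    made for illustration, not how Replayer.pack() is implemented.
    """
    packed = defaultdict(list)
    for sample in samples:
        for key, value in sample.items():
            # Each key accumulates its values in step order.
            packed[key].append(value)
    return dict(packed)


steps = [
    {"instruction": 0, "gripper_action": 1},
    {"instruction": 1, "gripper_action": 0},
]
packed = pack_sketch(steps)
print(packed)
```

Under this reading, each attribute of the packed object holds one entry per recorded step, in recording order.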
- pack_one(index: int) -> Sample
Pack a single sample into a Sample object.
- Parameters:
index (int) – Index of the sample.
- Returns:
Sample object.
- Return type:
Sample
- read_sample(index: int) -> Tuple[dict, ...]
Read a sample from the HDF5 file at a given index.
- Parameters:
index (int) – Index of the sample.
- Returns:
Tuple of dictionaries containing the sample data.
- Return type:
Tuple[dict, ...]
- recursive_do(do: Callable, key='', prefix='', **kwargs) -> Any
Recursively perform a function on each key in the HDF5 file.
- Parameters:
do (Callable) – Function to perform.
key (str, optional) – Key in the HDF5 file. Defaults to ''.
prefix (str, optional) – Prefix for the key. Defaults to ''.
**kwargs – Additional arguments to pass to the function.
- Returns:
Result of the function.
- Return type:
Any
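The traversal pattern described here can be sketched with nested dictionaries standing in for HDF5 groups. The function `walk`, the `/` path separator, and the callback signature are all assumptions made for illustration; the actual behavior of recursive_do may differ.

```python
from typing import Any, Callable


def walk(tree: dict, do: Callable[[str, Any], Any], prefix: str = "") -> dict:
    """Hypothetical sketch: apply `do` to every leaf of a nested dict.

    Nested dicts play the role of HDF5 groups and non-dict values play
    the role of datasets. The "/" separator is an assumption.
    """
    results = {}
    for key, value in tree.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the sub-group, carrying the path so far.
            results.update(walk(value, do, path))
        else:
            results[path] = do(path, value)
    return results


tree = {
    "observation": {"image": [1, 2, 3], "instruction": [0, 1]},
    "action": {"gripper_action": [1, 1]},
}
lengths = walk(tree, lambda path, leaf: len(leaf))
print(lengths)
```

Methods such as get_stats and get_structure share the same key and prefix parameters, which is consistent with a recursive walk of this kind, though the source does not state how they are implemented.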
- sample(index: int | slice | None = None, n: int = 1) -> Sample
Get a sample from the HDF5 file.
- Parameters:
index (Optional[Union[int, slice]], optional) – Index or slice of the sample. Defaults to None.
n (int, optional) – Number of samples to get. Defaults to 1.
- Returns:
Sample object.
- Return type:
Sample
- mbodied.data.replaying.clean_folder(folder: str, image_keys_to_save: List[str]) -> None
Clean the folder by iterating through the files and asking for deletion.
- Parameters:
folder (str) – Path to the folder.
image_keys_to_save (List[str]) – List of image keys to save.
- mbodied.data.replaying.parse_slice(s: str) -> int | slice
Parse a string to an integer or slice.
- Parameters:
s (str) – String to parse.
- Returns:
Integer or slice.
- Return type:
Union[int, slice]
Example
    >>> lst = [0, 1, 2, 3, 4, 5]
    >>> lst[parse_slice("1")]
    1
    >>> lst[parse_slice("1:5:2")]
    [1, 3]
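For readers who want to see how a string might be turned into an index or slice, here is a minimal sketch that reproduces the behavior shown in the example. The function `parse_slice_sketch` is a hypothetical stand-in, not the library's implementation, and its handling of empty fields and error cases is an assumption.

```python
def parse_slice_sketch(s: str):
    """Hypothetical stand-in for parse_slice: "1" -> 1, "1:5:2" -> slice(1, 5, 2)."""
    if ":" not in s:
        return int(s)
    # Empty fields (as in ":2") become None, matching Python slice syntax.
    parts = [int(p) if p.strip() else None for p in s.split(":")]
    if len(parts) > 3:
        raise ValueError(f"too many ':' separators in {s!r}")
    return slice(*parts)


lst = [0, 1, 2, 3, 4, 5]
print(lst[parse_slice_sketch("1")])
print(lst[parse_slice_sketch("1:5:2")])
```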
- mbodied.data.replaying.to_dataset(folder: str, name: str, description: str = None, **kwargs) -> None
Convert the folder of HDF5 files to a Hugging Face dataset.
- Parameters:
folder (str) – Path to the folder containing HDF5 files.
name (str) – Name of the dataset.
description (str, optional) – Description of the dataset. Defaults to None.
**kwargs – Additional arguments to pass to the Dataset.push_to_hub method.
mbodied.data.utils module
- mbodied.data.utils.infer_features(example) -> Features
Infer Hugging Face Datasets Features from an example.
- mbodied.data.utils.to_features(indict, image_keys=None, exclude_keys=None, prefix='') -> Features
Convert a dictionary to a Datasets Features object.
- Parameters:
indict (dict) – The dictionary to convert.
image_keys (dict) – A dictionary of keys that should be treated as images.
exclude_keys (set) – A set of full-path-keys to exclude.
prefix (str) – A prefix to add to the keys.