mbodied.data package

Submodules

mbodied.data.recording module

Module for recording data to an h5 file.

class mbodied.data.recording.Recorder(name: str, observation_space: Dict | str | None = None, action_space: Dict | str | None = None, supervision_space: Dict | str | None = None, out_dir: str = 'saved_datasets', image_keys_to_save: list = None)

Bases: object

Records a dataset to an h5 file. Images for the keys listed in image_keys_to_save are saved to a folder whose name is the file's name stem with _frames appended.

Example

```
# Assumption: `spaces` comes from gymnasium; adjust the import if your install differs.
import os

import numpy as np
from gymnasium import spaces

from mbodied.data.recording import Recorder

# Define the observation and action spaces
observation_space = spaces.Dict({
    'image': spaces.Box(low=0, high=255, shape=(224, 224, 3), dtype=np.uint8),
    'instruction': spaces.Discrete(10)
})
action_space = spaces.Dict({
    'gripper_position': spaces.Box(low=-1, high=1, shape=(3,), dtype=np.float32),
    'gripper_action': spaces.Discrete(2)
})

# Create a recorder instance
recorder = Recorder(name='test_recorder', observation_space=observation_space, action_space=action_space)

# Generate some sample data
num_steps = 10
for i in range(num_steps):
    observation = {
        'image': np.ones((224, 224, 3), dtype=np.uint8),
        'instruction': i
    }
    action = {
        'gripper_position': np.zeros((3,), dtype=np.float32),
        'gripper_action': 1
    }
    recorder.record(observation, action)

# Save the statistics
recorder.save_stats()

# Close the recorder
recorder.close()

# Assert that the HDF5 file and directories are created
assert os.path.exists('test_recorder.h5')
assert os.path.exists('test_recorder_frames')
```

close() → None

Closes the Recorder and sends the data if train_config is set.

configure_root_spaces(**spaces: Dict)

Configure the root spaces.

Parameters:
  • observation_space (spaces.Dict) – Observation space.

  • action_space (spaces.Dict) – Action space.

  • supervision_space (spaces.Dict) – Supervision space.

record(observation: Any | None = None, action: Any | None = None, supervision: Any | None = None) → None

Record a timestep.

Parameters:
  • observation (Any) – Observation to record.

  • action (Any) – Action to record.

  • supervision (Any) – Supervision to record.

record_timestep(group: Group, sample: Any, index: int) → None

Record a single sample into the given group at the given index.

Parameters:
  • group (h5py.Group) – Group to record to.

  • sample (Any) – Sample to record.

  • index (int) – Index to record at.

mbodied.data.recording.add_space_metadata(space, group) → None

mbodied.data.recording.copy_and_delete_old(filename) → None

mbodied.data.recording.create_dataset_for_space_dict(space_dict: Dict, group: Group) → None

mbodied.data.replaying module

class mbodied.data.replaying.FolderReplayer(path: str)

Bases: object

class mbodied.data.replaying.Replayer(path: str, file_keys: List[str] = None, image_keys_to_save: List[str] = None)

Bases: object

Replays datasets recorded by Recorder.

This class provides methods to read, process, and analyze HDF5 files recorded by the Recorder class.

Example

    replayer = Replayer("data.h5")
    for sample in replayer:
        observation, action = sample
        ...

close() → None

Close the HDF5 file.

get_frames_path() → str | None

Get the path to the frames directory.

get_stats(key='', prefix='') → dict

Get statistics for a given key in the HDF5 file.

Parameters:
  • key (str, optional) – Key in the HDF5 file. Defaults to ''.

  • prefix (str, optional) – Prefix for the key. Defaults to ''.

Returns:

Statistics for the given key.

Return type:

dict

get_structure(key='', prefix='') → dict

Get the structure of the HDF5 file.

Parameters:
  • key (str, optional) – Key in the HDF5 file. Defaults to ''.

  • prefix (str, optional) – Prefix for the key. Defaults to ''.

Returns:

Structure of the HDF5 file.

Return type:

dict

get_unique_items(key: str) → List[str]

Get unique items for a given key.

Parameters:

key (str) – Key in the HDF5 file.

Returns:

List of unique items.

Return type:

List[str]

pack() → Sample

Pack all samples into a Sample object with attributes being lists of samples.

Returns:

Sample object containing all samples.

Return type:

Sample
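The packing described above (one attribute per field, each holding a list with one entry per timestep) can be sketched without the library. This is only an illustration under stated assumptions: `pack_sketch` is a hypothetical helper, `SimpleNamespace` stands in for the real Sample class, and the actual implementation may treat nested fields differently.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def pack_sketch(samples: List[Dict[str, Any]]) -> SimpleNamespace:
    """Hypothetical stand-in for pack(): turn a list of per-step dicts into lists per key."""
    packed: Dict[str, List[Any]] = {}
    for sample in samples:
        for key, value in sample.items():
            # Append this step's value to the list collected for its key.
            packed.setdefault(key, []).append(value)
    # SimpleNamespace is an assumption standing in for the library's Sample type.
    return SimpleNamespace(**packed)


steps = [
    {"observation": {"instruction": 0}, "action": {"gripper_action": 1}},
    {"observation": {"instruction": 1}, "action": {"gripper_action": 0}},
]
packed = pack_sketch(steps)
print(packed.action)
```

Each attribute of the result is a list with one element per recorded step, in recording order.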

pack_one(index: int) → Sample

Pack a single sample into a Sample object.

Parameters:

index (int) – Index of the sample.

Returns:

Sample object.

Return type:

Sample

read_sample(index: int) → Tuple[dict, ...]

Read a sample from the HDF5 file at a given index.

Parameters:

index (int) – Index of the sample.

Returns:

Tuple of dictionaries containing the sample data.

Return type:

Tuple[dict, ...]

recursive_do(do: Callable, key='', prefix='', **kwargs) → Any

Recursively perform a function on each key in the HDF5 file.

Parameters:
  • do (Callable) – Function to perform.

  • key (str, optional) – Key in the HDF5 file. Defaults to ''.

  • prefix (str, optional) – Prefix for the key. Defaults to ''.

  • **kwargs – Additional arguments to pass to the function.

Returns:

Result of the function.

Return type:

Any
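To illustrate the traversal pattern, here is a minimal sketch that applies a function to every leaf of a nested dict. It is not the library's implementation: `recursive_do_sketch` is a hypothetical name, a plain dict stands in for the HDF5 file, and the callback signature `do(path, leaf)` and the slash-joined paths are assumptions modeled on HDF5 path conventions.

```python
from typing import Any, Callable, Dict


def recursive_do_sketch(tree: Dict[str, Any], do: Callable[[str, Any], Any], prefix: str = "") -> Dict[str, Any]:
    """Apply `do` to every leaf of a nested dict, keyed by its slash-joined path."""
    results: Dict[str, Any] = {}
    for key, value in tree.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # A nested dict plays the role of an HDF5 group: recurse into it.
            results.update(recursive_do_sketch(value, do, prefix=path))
        else:
            # Anything else plays the role of a dataset: apply the function.
            results[path] = do(path, value)
    return results


# Toy stand-in for a recorded file (hypothetical contents).
toy_file = {
    "observation": {"image": [[0, 1], [2, 3]], "instruction": [0, 1, 2]},
    "action": {"gripper_action": [1, 1, 0]},
}
lengths = recursive_do_sketch(toy_file, lambda path, leaf: len(leaf))
print(lengths)
```

Methods such as get_structure and get_stats share the same key and prefix parameters, so a traversal of this shape is a plausible common basis for them, though the source does not confirm it.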

sample(index: int | slice | None = None, n: int = 1) → Sample

Get a sample from the HDF5 file.

Parameters:
  • index (Optional[Union[int, slice]], optional) – Index or slice of the sample. Defaults to None.

  • n (int, optional) – Number of samples to get. Defaults to 1.

Returns:

Sample object.

Return type:

Sample

mbodied.data.replaying.clean_folder(folder: str, image_keys_to_save: List[str]) → None

Clean the folder by iterating through its files and asking whether each one should be deleted.

Parameters:
  • folder (str) – Path to the folder.

  • image_keys_to_save (List[str]) – List of image keys to save.

mbodied.data.replaying.parse_slice(s: str) → int | slice

Parse a string to an integer or slice.

Parameters:

s (str) – String to parse.

Returns:

Integer or slice.

Return type:

Union[int, slice]

Example

>>> lst = [0, 1, 2, 3, 4, 5]
>>> lst[parse_slice("1")]
1
>>> lst[parse_slice("1:5:2")]
[1, 3]
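
The behavior in the example above can be reproduced with a few lines of standard Python. The following is a sketch under assumptions, not the library's code: `parse_slice_sketch` is a hypothetical name, and treating empty fields (as in "::2") as None is an assumption borrowed from Python's own slice syntax.

```python
from typing import Union


def parse_slice_sketch(s: str) -> Union[int, slice]:
    """Hypothetical stand-in for parse_slice: '3' -> 3, '1:5:2' -> slice(1, 5, 2)."""
    if ":" not in s:
        # No colon: the string is a plain integer index.
        return int(s)
    # Empty fields become None, mirroring Python's slice syntax.
    parts = [int(p) if p.strip() else None for p in s.split(":")]
    if len(parts) > 3:
        raise ValueError(f"too many ':' separators in {s!r}")
    return slice(*parts)


lst = [0, 1, 2, 3, 4, 5]
print(lst[parse_slice_sketch("1")])
print(lst[parse_slice_sketch("1:5:2")])
```

Returning either an int or a slice lets the result be passed straight to `lst[...]` or to an index parameter such as the one accepted by Replayer.sample.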

mbodied.data.replaying.to_dataset(folder: str, name: str, description: str = None, **kwargs) → None

Convert the folder of HDF5 files to a Hugging Face dataset.

Parameters:
  • folder (str) – Path to the folder containing HDF5 files.

  • name (str) – Name of the dataset.

  • description (str, optional) – Description of the dataset. Defaults to None.

  • **kwargs – Additional arguments to pass to the Dataset.push_to_hub method.

mbodied.data.utils module

mbodied.data.utils.infer_features(example) → Features

Infer Hugging Face Datasets Features from an example.

mbodied.data.utils.to_features(indict, image_keys=None, exclude_keys=None, prefix='') → Features

Convert a dictionary to a Datasets Features object.

Parameters:
  • indict (dict) – The dictionary to convert.

  • image_keys (dict) – A dictionary of keys that should be treated as images.

  • exclude_keys (set) – A set of full-path-keys to exclude.

  • prefix (str) – A prefix to add to the keys.
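
The roles of prefix and exclude_keys are easier to see on a toy walk over a nested dictionary. This sketch does not build a Features object and is not the library's code: `describe_keys` is a hypothetical helper, the "." separator for full-path keys is an assumption, and the flat output of Python type names is only a stand-in for real feature types.

```python
from typing import Any, Dict, Optional, Set


def describe_keys(
    indict: Dict[str, Any],
    exclude_keys: Optional[Set[str]] = None,
    prefix: str = "",
    sep: str = ".",  # Assumption: the real separator for full-path keys may differ.
) -> Dict[str, str]:
    """Map each non-excluded leaf's full-path key to the name of its Python type."""
    exclude_keys = exclude_keys or set()
    described: Dict[str, str] = {}
    for key, value in indict.items():
        # The prefix accumulates parent keys, giving the "full-path key".
        full_key = f"{prefix}{sep}{key}" if prefix else key
        if full_key in exclude_keys:
            continue
        if isinstance(value, dict):
            described.update(describe_keys(value, exclude_keys, full_key, sep))
        else:
            described[full_key] = type(value).__name__
    return described


example = {
    "observation": {"image": b"\x00", "instruction": "pick up the cup"},
    "action": {"gripper_action": 1},
}
described = describe_keys(example, exclude_keys={"observation.image"})
print(described)
```

Matching on the full path rather than the bare key means a key can be excluded in one branch while a key of the same name elsewhere in the dictionary is kept.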

Module contents