mbodied.agents.sense package

Submodules

mbodied.agents.sense.object_pose_estimator_3d module

class mbodied.agents.sense.object_pose_estimator_3d.ObjectPoseEstimator3D(server_url: str = 'https://api.mbodi.ai/3d-object-pose-detection')[source]

Bases: SensoryAgent

3D object pose estimation agent that interacts with a Gradio server for image processing.

server_url

URL of the Gradio server.

Type:

str

client

Gradio client to interact with the server.

Type:

Client

act(rgb_image_path: str, depth_image_path: str, camera_intrinsics: List[float] | ndarray, distortion_coeffs: List[float] | None = None, aruco_pose_world_frame: Pose6D | None = None, object_classes: List[str] | None = None, confidence_threshold: float | None = None, using_realsense: bool = False) Dict[source]

Process the given RGB and depth images (optionally captured with a RealSense camera) and send a request to the server to estimate object poses.

Parameters:
  • rgb_image_path (str) – Path to the RGB image.

  • depth_image_path (str) – Path to the depth image.

  • camera_intrinsics (List[float] | np.ndarray) – Camera intrinsic parameters as a list, or the full intrinsic matrix.

  • distortion_coeffs (Optional[List[float]]) – List of distortion coefficients.

  • aruco_pose_world_frame (Optional[Pose6D]) – Pose of the ArUco marker in the world frame.

  • object_classes (Optional[List[str]]) – List of object classes.

  • confidence_threshold (Optional[float]) – Confidence threshold for object detection.

  • using_realsense (bool) – Whether to use the RealSense camera.

Returns:

Result from the Gradio server.

Return type:

Dict

Example

>>> estimator = ObjectPoseEstimator3D()
>>> result = estimator.act(
...     "resources/color_image.png",
...     "resources/depth_image.png",
...     [911, 911, 653, 371],
...     [0.0, 0.0, 0.0, 0.0, 0.0],
...     [0.0, 0.2032, 0.0, -90, 0, -90],
...     ["Remote Control", "Basket", "Fork", "Spoon", "Red Marker"],
...     0.5,
...     False,
... )
static save_data(color_image_array: ndarray, depth_image_array: ndarray, color_image_path: str, depth_image_path: str, intrinsic_matrix: ndarray) None[source]

Save color and depth images as PNG files.

Parameters:
  • color_image_array (np.ndarray) – The color image array.

  • depth_image_array (np.ndarray) – The depth image array.

  • color_image_path (str) – The path to save the color image.

  • depth_image_path (str) – The path to save the depth image.

  • intrinsic_matrix (np.ndarray) – The intrinsic matrix.

Example

>>> color_image = np.zeros((480, 640, 3), dtype=np.uint8)
>>> depth_image = np.zeros((480, 640), dtype=np.uint16)
>>> intrinsic_matrix = np.eye(3)
>>> ObjectPoseEstimator3D.save_data(color_image, depth_image, "color.png", "depth.png", intrinsic_matrix)

mbodied.agents.sense.sensory_agent module

class mbodied.agents.sense.sensory_agent.SensoryAgent(**kwargs)[source]

Bases: Agent

Abstract base class for sensory agents.

This class provides a template for creating agents that can sense the environment.

kwargs

Additional arguments to pass to the recorder.

Type:

dict

act(**kwargs) SensorReading[source]

Abstract method to define the sensing mechanism of the agent.

Parameters:

**kwargs – Additional arguments to pass to the sense method.

Returns:

The sensory sample created by the agent.

Return type:

SensorReading

sense(**kwargs) SensorReading[source]

Generate a SensorReading based on given parameters.

Parameters:

**kwargs – Arbitrary keyword arguments for the agent to sense with.

Returns:

A SensorReading object based on the provided arguments.

Return type:

SensorReading
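
Example

A minimal sketch of the subclassing pattern the interface above implies. The stand-in SensorReading and SensoryAgent classes below are hypothetical simplifications so the example runs without the mbodied package, and the assumption that sense() delegates to act() is illustrative, not confirmed by the source.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SensorReading:
    """Stand-in for mbodied's SensorReading sample type."""

    value: float
    unit: str


class SensoryAgent(ABC):
    """Stand-in abstract base mirroring the documented interface."""

    @abstractmethod
    def act(self, **kwargs) -> SensorReading:
        """Define the sensing mechanism of the agent."""

    def sense(self, **kwargs) -> SensorReading:
        # Assumption: sense() produces its reading via the subclass's act().
        return self.act(**kwargs)


class TemperatureAgent(SensoryAgent):
    """Example subclass that 'senses' a fixed temperature."""

    def act(self, **kwargs) -> SensorReading:
        return SensorReading(value=21.5, unit="celsius")


agent = TemperatureAgent()
reading = agent.sense()
print(reading.value, reading.unit)  # -> 21.5 celsius
```

A concrete agent such as ObjectPoseEstimator3D follows the same shape: it overrides act() with its server call and inherits the rest of the sensing machinery from the base class.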

Module contents