mbodied.agents.sense package

Submodules

mbodied.agents.sense.object_pose_estimator_3d module

class mbodied.agents.sense.object_pose_estimator_3d.ObjectPoseEstimator3D(server_url: str = 'https://api.mbodi.ai/3d-object-pose-detection')[source]

Bases: SensoryAgent

3D object pose estimation agent that interacts with a Gradio server for image processing.

server_url

URL of the Gradio server.

Type:

str

client

Gradio client to interact with the server.

Type:

Client

act(rgb_image_path: str, depth_image_path: str, camera_intrinsics: List[float] | ndarray, distortion_coeffs: List[float] | None = None, aruco_pose_world_frame: Pose6D | None = None, object_classes: List[str] | None = None, confidence_threshold: float | None = None, using_realsense: bool = False) Dict[source]

Process the given RGB and depth images (optionally captured with a RealSense camera) and send a request to the server to estimate object poses.

Parameters:
  • rgb_image_path (str) – Path to the RGB image.

  • depth_image_path (str) – Path to the depth image.

  • camera_intrinsics (List[float] | np.ndarray) – Camera intrinsic parameters as a list, or the full intrinsic matrix.

  • distortion_coeffs (Optional[List[float]]) – List of distortion coefficients.

  • aruco_pose_world_frame (Optional[Pose6D]) – Pose of the ArUco marker in the world frame.

  • object_classes (Optional[List[str]]) – List of object classes.

  • confidence_threshold (Optional[float]) – Confidence threshold for object detection.

  • using_realsense (bool) – Whether to use the RealSense camera.

Returns:

Result from the Gradio server.

Return type:

Dict

Example

>>> estimator = ObjectPoseEstimator3D()
>>> result = estimator.act(
...     "resources/color_image.png",
...     "resources/depth_image.png",
...     [911, 911, 653, 371],
...     [0.0, 0.0, 0.0, 0.0, 0.0],
...     [0.0, 0.2032, 0.0, -90, 0, -90],
...     ["Remote Control", "Basket", "Fork", "Spoon", "Red Marker"],
...     0.5,
...     False,
... )
static save_data(color_image_array: ndarray, depth_image_array: ndarray, color_image_path: str, depth_image_path: str, intrinsic_matrix: ndarray) None[source]

Save color and depth images as PNG files.

Parameters:
  • color_image_array (np.ndarray) – The color image array.

  • depth_image_array (np.ndarray) – The depth image array.

  • color_image_path (str) – The path to save the color image.

  • depth_image_path (str) – The path to save the depth image.

  • intrinsic_matrix (np.ndarray) – The intrinsic matrix.

Example

>>> color_image = np.zeros((480, 640, 3), dtype=np.uint8)
>>> depth_image = np.zeros((480, 640), dtype=np.uint16)
>>> intrinsic_matrix = np.eye(3)
>>> ObjectPoseEstimator3D.save_data(color_image, depth_image, "color.png", "depth.png", intrinsic_matrix)

mbodied.agents.sense.sensory_agent module

class mbodied.agents.sense.sensory_agent.SensoryAgent(**kwargs)[source]

Bases: Agent

Abstract base class for sensory agents.

This class provides a template for creating agents that can sense the environment.

kwargs

Additional arguments to pass to the recorder.

Type:

dict

act(**kwargs) SensorReading[source]

Abstract method to define the sensing mechanism of the agent.

Parameters:

**kwargs – Additional arguments to pass to the sense method.

Returns:

The sensory sample created by the agent.

Return type:

SensorReading

sense(**kwargs) SensorReading[source]

Generate a SensorReading based on given parameters.

Parameters:

**kwargs – Arbitrary keyword arguments for the agent to sense with.

Returns:

A SensorReading object based on the provided arguments.

Return type:

SensorReading
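
Example

A minimal sketch of the subclassing pattern the interface above implies. The stand-in SensorReading and SensoryAgent classes below are hypothetical simplifications so the example runs without the mbodied package, and the assumption that sense() delegates to act() is illustrative, not confirmed by the source.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SensorReading:
    """Stand-in for mbodied's SensorReading sample type."""

    value: float
    unit: str


class SensoryAgent(ABC):
    """Stand-in abstract base mirroring the documented interface."""

    @abstractmethod
    def act(self, **kwargs) -> SensorReading:
        """Define the sensing mechanism of the agent."""

    def sense(self, **kwargs) -> SensorReading:
        # Assumption: sense() produces its reading via the subclass's act().
        return self.act(**kwargs)


class TemperatureAgent(SensoryAgent):
    """Example subclass that 'senses' a fixed temperature."""

    def act(self, **kwargs) -> SensorReading:
        return SensorReading(value=21.5, unit="celsius")


agent = TemperatureAgent()
reading = agent.sense()
print(reading.value, reading.unit)  # -> 21.5 celsius
```

A concrete agent such as ObjectPoseEstimator3D follows the same shape: it overrides act() with its server call and inherits the rest of the sensing machinery from the base class.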

Module contents