mbodied.agents.sense package
Subpackages
Submodules
mbodied.agents.sense.object_pose_estimator_3d module
- class mbodied.agents.sense.object_pose_estimator_3d.ObjectPoseEstimator3D(server_url: str = 'https://api.mbodi.ai/3d-object-pose-detection')[source]
Bases:
SensoryAgent
3D object pose estimation agent that interacts with a Gradio server for image processing.
- server_url
URL of the Gradio server.
- Type:
str
- client
Gradio client to interact with the server.
- Type:
Client
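For illustration only, a minimal instantiation sketch: with no arguments the agent targets the default endpoint shown above, while the local URL below is just a placeholder for a self-hosted Gradio server.
>>> from mbodied.agents.sense.object_pose_estimator_3d import ObjectPoseEstimator3D
>>> estimator = ObjectPoseEstimator3D()  # hosted default endpoint
>>> local_estimator = ObjectPoseEstimator3D(server_url="http://localhost:7860")  # placeholder URL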
- act(rgb_image_path: str, depth_image_path: str, camera_intrinsics: List[float] | ndarray, distortion_coeffs: List[float] | None = None, aruco_pose_world_frame: Pose6D | None = None, object_classes: List[str] | None = None, confidence_threshold: float | None = None, using_realsense: bool = False) Dict [source]
Process the given RGB and depth images (optionally captured with a RealSense camera) and send a request to the server to estimate object poses.
- Parameters:
rgb_image_path (str) – Path to the RGB image.
depth_image_path (str) – Path to the depth image.
camera_intrinsics (List[float] | np.ndarray) – Camera intrinsic parameters, given either as a list of values or as an intrinsic matrix.
distortion_coeffs (Optional[List[float]]) – List of distortion coefficients.
aruco_pose_world_frame (Optional[Pose6D]) – Pose of the ArUco marker in the world frame.
object_classes (Optional[List[str]]) – List of object classes.
confidence_threshold (Optional[float]) – Confidence threshold for object detection.
using_realsense (bool) – Whether to use the RealSense camera.
- Returns:
Result from the Gradio server.
- Return type:
Dict
Example
>>> estimator = ObjectPoseEstimator3D()
>>> result = estimator.act(
...     "resources/color_image.png",
...     "resources/depth_image.png",
...     [911, 911, 653, 371],
...     [0.0, 0.0, 0.0, 0.0, 0.0],
...     [0.0, 0.2032, 0.0, -90, 0, -90],
...     ["Remote Control", "Basket", "Fork", "Spoon", "Red Marker"],
...     0.5,
...     False,
... )
- static save_data(color_image_array: ndarray, depth_image_array: ndarray, color_image_path: str, depth_image_path: str, intrinsic_matrix: ndarray) None [source]
Save color and depth images as PNG files.
- Parameters:
color_image_array (np.ndarray) – The color image array.
depth_image_array (np.ndarray) – The depth image array.
color_image_path (str) – The path to save the color image.
depth_image_path (str) – The path to save the depth image.
intrinsic_matrix (np.ndarray) – The intrinsic matrix.
Example
>>> import numpy as np
>>> color_image = np.zeros((480, 640, 3), dtype=np.uint8)
>>> depth_image = np.zeros((480, 640), dtype=np.uint16)
>>> intrinsic_matrix = np.eye(3)
>>> ObjectPoseEstimator3D.save_data(color_image, depth_image, "color.png", "depth.png", intrinsic_matrix)
mbodied.agents.sense.sensory_agent module
- class mbodied.agents.sense.sensory_agent.SensoryAgent(**kwargs)[source]
Bases:
Agent
Abstract base class for sensory agents.
This class provides a template for creating agents that can sense the environment.
- kwargs
Additional arguments to pass to the recorder.
- Type:
dict
- act(**kwargs) SensorReading [source]
Abstract method to define the sensing mechanism of the agent.
- Parameters:
**kwargs – Additional arguments to pass to the sense method.
- Returns:
The sensory sample created by the agent.
- Return type:
SensorReading
- sense(**kwargs) SensorReading [source]
Generate a SensorReading based on given parameters.
- Parameters:
**kwargs – Arbitrary keyword arguments for the sensory agent to sense with.
- Returns:
A SensorReading object based on the provided arguments.
- Return type:
SensorReading
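As a rough sketch (not taken from the library's docs), a concrete sensory agent overrides act() to produce its reading; the class name and body below are hypothetical placeholders, and a real subclass would return a SensorReading built from its sensor data.
>>> from mbodied.agents.sense.sensory_agent import SensoryAgent
>>> class EchoSensorAgent(SensoryAgent):  # hypothetical subclass for illustration
...     def act(self, **kwargs):
...         # Stand-in: echo the inputs instead of building a real SensorReading.
...         return kwargs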