mbodied.agents package

Subpackages

Submodules

mbodied.agents.agent module

class mbodied.agents.agent.Agent(recorder: Literal['omit', 'auto'] | str = 'omit', recorder_kwargs=None, api_key: str = None, model_src=None, model_kwargs=None)[source]

Bases: object

Abstract base class for agents.

This class provides a template for creating agents that can optionally record their actions and observations.

recorder

The recorder to record observations and actions.

Type:

Recorder

actor

The backend actor to perform actions.

Type:

Backend

kwargs

Additional arguments to pass to the recorder.

Type:

dict

ACTOR_MAP = {'anthropic': <class 'mbodied.agents.backends.anthropic_backend.AnthropicBackend'>, 'gradio': <class 'mbodied.agents.backends.gradio_backend.GradioBackend'>, 'http': <class 'mbodied.agents.backends.httpx_backend.HttpxBackend'>, 'ollama': <class 'mbodied.agents.backends.ollama_backend.OllamaBackend'>, 'openai': <class 'mbodied.agents.backends.openai_backend.OpenAIBackendMixin'>}
act(*args, **kwargs) Sample[source]

Act based on the observation.

Subclasses should implement this method.

For remote actors, this method should call actor.act() correctly to perform the actions.
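The subclassing contract for act() can be sketched as follows. The Sample stub and the EchoAgent class are illustrative stand-ins, not library code; a real subclass would return an mbodied Sample and, for remote actors, delegate to self.actor.act():

```python
class Sample:
    """Minimal stand-in for mbodied.types.sample.Sample."""
    def __init__(self, value):
        self.value = value

class Agent:
    """Stand-in base class: act() must be overridden by subclasses."""
    def act(self, *args, **kwargs) -> Sample:
        raise NotImplementedError("Subclasses should implement act().")

class EchoAgent(Agent):
    """Hypothetical subclass that wraps its observation in a Sample."""
    def act(self, observation: str, **kwargs) -> Sample:
        # A remote actor would instead delegate to self.actor.act(...).
        return Sample(f"observed: {observation}")

agent = EchoAgent()
print(agent.act("red cube on table").value)  # observed: red cube on table
```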

act_and_record(*args, **kwargs) Sample[source]

Perform an action based on the observation and record the action, if applicable.

Parameters:
  • *args – Additional arguments to customize the action.

  • **kwargs – Additional arguments to customize the action.

Returns:

The action sample created by the agent.

Return type:

Sample

async async_act(*args, **kwargs) Sample[source]

Act asynchronously based on the observation.

Subclasses should implement this method.

For remote actors, this method should call actor.async_act() correctly to perform the actions.

async async_act_and_record(*args, **kwargs) Sample[source]

Act asynchronously based on the observation and record the action, if applicable.

Subclasses should implement this method.

For remote actors, this method should call actor.async_act() correctly to perform the actions.

static create_observation_from_args(observation_space, function, args, kwargs) dict[source]

Helper method to create an observation from the arguments of a function.
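One plausible way such a helper can work is to bind a call's positional and keyword arguments to the function's parameter names and keep only the keys in the observation space. This is an illustrative re-implementation, not the library's actual code:

```python
import inspect

def create_observation_from_args(observation_space, function, args, kwargs) -> dict:
    """Hypothetical sketch: map a call's arguments onto parameter names,
    then keep only the names present in observation_space."""
    bound = inspect.signature(function).bind(*args, **kwargs)
    bound.apply_defaults()
    return {k: v for k, v in bound.arguments.items() if k in observation_space}

def act(instruction, image=None, temperature=0.7):
    """Example function whose call we turn into an observation."""

obs = create_observation_from_args(
    {"instruction", "image"}, act, ("pick up the cube",), {"image": "scene.jpeg"}
)
print(obs)  # {'instruction': 'pick up the cube', 'image': 'scene.jpeg'}
```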

static handle_default(model_src: str, model_kwargs: dict) None[source]

Fall back to the Gradio backend, then the HTTPX backend, if the model source is not recognized.

Parameters:
  • model_src – The model source to use.

  • model_kwargs – The additional arguments to pass to the model.

static init_backend(model_src: str, model_kwargs: dict, api_key: str) type[source]

Initialize the backend based on the model source.

Parameters:
  • model_src – The model source to use.

  • model_kwargs – The additional arguments to pass to the model.

  • api_key – The API key to use for the remote actor.

Returns:

The backend class to use.

Return type:

type
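The dispatch through ACTOR_MAP can be sketched as a registry lookup keyed by model source. The backend classes here are empty stand-ins, and the Gradio fallback is an assumption based on handle_default's docstring, not a verified code path:

```python
# Stand-in backend classes; the real ones live in mbodied.agents.backends.
class OpenAIBackend: ...
class AnthropicBackend: ...
class GradioBackend: ...

ACTOR_MAP = {
    "openai": OpenAIBackend,
    "anthropic": AnthropicBackend,
    "gradio": GradioBackend,
}

def init_backend(model_src: str) -> type:
    """Sketch: return the backend class for a model source, falling back
    to Gradio for unrecognized sources (assumed from handle_default)."""
    return ACTOR_MAP.get(model_src, GradioBackend)

print(init_backend("anthropic").__name__)  # AnthropicBackend
print(init_backend("some-unrecognized-src").__name__)  # GradioBackend
```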

load_model(model: str) None[source]

Load a model from a file or path. Required if the model is a weights path.

Parameters:

model – The path to the model file.

Module contents

class mbodied.agents.Agent(recorder: Literal['omit', 'auto'] | str = 'omit', recorder_kwargs=None, api_key: str = None, model_src=None, model_kwargs=None)[source]

Bases: object

Abstract base class for agents.

This class provides a template for creating agents that can optionally record their actions and observations.

recorder

The recorder to record observations and actions.

Type:

Recorder

actor

The backend actor to perform actions.

Type:

Backend

kwargs

Additional arguments to pass to the recorder.

Type:

dict

ACTOR_MAP = {'anthropic': <class 'mbodied.agents.backends.anthropic_backend.AnthropicBackend'>, 'gradio': <class 'mbodied.agents.backends.gradio_backend.GradioBackend'>, 'http': <class 'mbodied.agents.backends.httpx_backend.HttpxBackend'>, 'ollama': <class 'mbodied.agents.backends.ollama_backend.OllamaBackend'>, 'openai': <class 'mbodied.agents.backends.openai_backend.OpenAIBackendMixin'>}
act(*args, **kwargs) Sample[source]

Act based on the observation.

Subclasses should implement this method.

For remote actors, this method should call actor.act() correctly to perform the actions.

act_and_record(*args, **kwargs) Sample[source]

Perform an action based on the observation and record the action, if applicable.

Parameters:
  • *args – Additional arguments to customize the action.

  • **kwargs – Additional arguments to customize the action.

Returns:

The action sample created by the agent.

Return type:

Sample

actor: AnthropicBackend | GradioBackend | OpenAIBackendMixin | HttpxBackend | OllamaBackend
async async_act(*args, **kwargs) Sample[source]

Act asynchronously based on the observation.

Subclasses should implement this method.

For remote actors, this method should call actor.async_act() correctly to perform the actions.

async async_act_and_record(*args, **kwargs) Sample[source]

Act asynchronously based on the observation and record the action, if applicable.

Subclasses should implement this method.

For remote actors, this method should call actor.async_act() correctly to perform the actions.

static create_observation_from_args(observation_space, function, args, kwargs) dict[source]

Helper method to create an observation from the arguments of a function.

static handle_default(model_src: str, model_kwargs: dict) None[source]

Fall back to the Gradio backend, then the HTTPX backend, if the model source is not recognized.

Parameters:
  • model_src – The model source to use.

  • model_kwargs – The additional arguments to pass to the model.

static init_backend(model_src: str, model_kwargs: dict, api_key: str) type[source]

Initialize the backend based on the model source.

Parameters:
  • model_src – The model source to use.

  • model_kwargs – The additional arguments to pass to the model.

  • api_key – The API key to use for the remote actor.

Returns:

The backend class to use.

Return type:

type

load_model(model: str) None[source]

Load a model from a file or path. Required if the model is a weights path.

Parameters:

model – The path to the model file.

class mbodied.agents.LanguageAgent(model_src: Literal['openai', 'anthropic'] | OpenAIBackendMixin | Url | Path = 'openai', context: list | Image | str | Message = None, api_key: str | None = None, model_kwargs: dict = None, recorder: Literal['default', 'omit'] | str = 'omit', recorder_kwargs: dict = None)[source]

Bases: Agent

An agent that can interact with users using natural language.

This class extends the functionality of a base Agent to handle natural language interactions. It manages memory, dataset-recording, and asynchronous remote inference, supporting multiple platforms including OpenAI, Anthropic, and Gradio.

reminders

A list of reminders that prompt the agent every n messages.

Type:

List[Reminder]

context

The current context of the conversation.

Type:

List[Message]

Inherits all attributes from the parent class `Agent`.

Examples

Basic usage with OpenAI:

>>> cognitive_agent = LanguageAgent(api_key="...", model_src="openai", recorder="default")
>>> cognitive_agent.act("your instruction", image)

Automatically act and record to dataset:

>>> cognitive_agent.act_and_record("your instruction", image)
act(instruction: str, image: Image = None, context: list | str | Image | Message = None, model=None, **kwargs) str[source]

Responds to the given instruction, image, and context.

Uses the given instruction and image to perform an action.

Parameters:
  • instruction – The instruction to be processed.

  • image – The image to be processed.

  • context – Additional context to include in the response. If context is a list of messages, it will be interpreted as new memory.

  • model – The model to use for the response.

  • **kwargs – Additional keyword arguments.

Returns:

The response to the instruction.

Return type:

str

Example

>>> agent.act("Hello, world!", Image("scene.jpeg"))
"Hello! What can I do for you today?"
>>> agent.act("Return a plan to pickup the object as a python list.", Image("scene.jpeg"))
"['Move left arm to the object', 'Move right arm to the object']"
act_and_parse(instruction: str, image: ~mbodied.types.sense.vision.Image = None, parse_target: ~mbodied.types.sample.Sample = <class 'mbodied.types.sample.Sample'>, context: list | str | ~mbodied.types.sense.vision.Image | ~mbodied.types.message.Message = None, model=None, **kwargs) Sample[source]

Responds to the given instruction, image, and context and parses the response into a Sample object.
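The role of parse_target can be sketched as follows. The PickPlan dataclass and the assumption that the model's reply is JSON are illustrative only; the library's actual parsing goes through its Sample type:

```python
import json
from dataclasses import dataclass

@dataclass
class PickPlan:
    """Hypothetical parse target standing in for a Sample subclass."""
    steps: list

def act_and_parse(response_text: str, parse_target=PickPlan):
    """Sketch: parse a (assumed JSON) model response into parse_target."""
    data = json.loads(response_text)
    return parse_target(**data)

raw = '{"steps": ["move arm to object", "close gripper"]}'
plan = act_and_parse(raw)
print(plan.steps)  # ['move arm to object', 'close gripper']
```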

async async_act_and_parse(instruction: str, image: ~mbodied.types.sense.vision.Image = None, parse_target: ~mbodied.types.sample.Sample = <class 'mbodied.types.sample.Sample'>, context: list | str | ~mbodied.types.sense.vision.Image | ~mbodied.types.message.Message = None, model=None, **kwargs) Sample[source]

Responds to the given instruction, image, and context asynchronously and parses the response into a Sample object.

forget(everything=False, last_n: int = -1) None[source]

Forget the last n messages in the context.

forget_last() Message[source]

Forget the last message in the context.

history() List[Message][source]

Return the conversation history.
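The memory helpers above can be sketched over a plain message list. The Conversation class below is a stand-in illustrating the documented behavior, not the library's implementation:

```python
class Conversation:
    """Sketch of forget(), forget_last(), and history() over a list."""
    def __init__(self):
        self.context = []

    def forget_last(self):
        # Remove and return the most recent message.
        return self.context.pop()

    def forget(self, everything=False, last_n=-1):
        # Drop everything, or just the last n messages.
        if everything:
            self.context.clear()
        elif last_n > 0:
            del self.context[-last_n:]

    def history(self):
        return list(self.context)

c = Conversation()
c.context = ["hi", "plan A", "plan B"]
c.forget(last_n=2)
print(c.history())  # ['hi']
```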

remind_every(prompt: str | Image | Message, n: int) None[source]

Remind the agent of the prompt every n messages.
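One way the reminder mechanism might work is to inject the prompt into the context whenever the message count is a multiple of n. This sketch uses stand-in Reminder and ChatContext classes and, as a simplification, counts injected reminders as messages too:

```python
from dataclasses import dataclass, field

@dataclass
class Reminder:
    """Stand-in mirroring the documented reminders attribute."""
    prompt: str
    n: int

@dataclass
class ChatContext:
    """Illustrative sketch of reminder injection every n messages."""
    reminders: list = field(default_factory=list)
    context: list = field(default_factory=list)

    def remind_every(self, prompt: str, n: int) -> None:
        self.reminders.append(Reminder(prompt, n))

    def add_message(self, message: str) -> None:
        self.context.append(message)
        for r in self.reminders:
            if len(self.context) % r.n == 0:
                self.context.append(f"[reminder] {r.prompt}")

chat = ChatContext()
chat.remind_every("Respond in JSON.", 2)
chat.add_message("hello")
chat.add_message("pick up the cube")
print(chat.context[-1])  # [reminder] Respond in JSON.
```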

class mbodied.agents.MotorAgent(recorder: Literal['omit', 'auto'] | str = 'omit', recorder_kwargs=None, api_key: str = None, model_src=None, model_kwargs=None)[source]

Bases: Agent

Abstract base class for motor agents.

Subclasses Agent, and therefore inherits its capabilities such as making remote calls.

abstract act(**kwargs) Motion[source]

Generate a Motion based on given parameters.

Parameters:

**kwargs – Arbitrary keyword arguments for motor agent to act on.

Returns:

A Motion object based on the provided arguments.

Return type:

Motion
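A concrete subclass might look like the following sketch. The Motion dataclass and ReachAgent are illustrative stand-ins for mbodied's actual Motion type and a real motor agent:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Motion:
    """Stand-in for mbodied's Motion type."""
    x: float
    y: float
    z: float

class MotorAgent(ABC):
    """Stand-in abstract base mirroring the documented interface."""
    @abstractmethod
    def act(self, **kwargs) -> Motion: ...

class ReachAgent(MotorAgent):
    """Hypothetical motor agent that emits a motion toward a target."""
    def act(self, target=(0.0, 0.0, 0.0), **kwargs) -> Motion:
        return Motion(*target)

motion = ReachAgent().act(target=(0.1, 0.2, 0.3))
print(motion)  # Motion(x=0.1, y=0.2, z=0.3)
```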