Multimodality support
RAI implements MultimodalMessage, which allows image and audio* information to be used in LangChain.
* Audio is not fully supported yet. It is currently included as a placeholder for future implementation.
Class Definition
LangChain supports multimodal data by default. It does so by expanding the message content from a plain string to dictionary-based content containing specific keys.
To make this easier to use, RAI implements a MultimodalMessage class, which is a wrapper around the BaseMessage class.
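To illustrate what that dictionary-based content can look like, here is a minimal sketch that assembles it by hand in plain Python. The helper name `build_multimodal_content` is hypothetical, and the `image_url` data-URL layout is an assumption based on the OpenAI-style format that many LangChain chat models accept; other vendors may expect different keys.

```python
from typing import Any, Dict, List


def build_multimodal_content(text: str, images_b64: List[str]) -> List[Dict[str, Any]]:
    """Build OpenAI-style content blocks (hypothetical helper, not part of RAI)."""
    # The text part goes first.
    content: List[Dict[str, Any]] = [{"type": "text", "text": text}]
    for image in images_b64:
        # Assumption: each image is PNG data embedded as a base64 data URL.
        content.append(
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{image}"},
            }
        )
    return content


content = build_multimodal_content("Describe the image", ["aGVsbG8="])
print(content)
```

A list like this could be passed as the content of a LangChain message. MultimodalMessage is meant to spare you from assembling it manually.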
rai.messages.multimodal.MultimodalMessage
Bases: BaseMessage
Base class for multimodal messages.
Attributes:
Name | Type | Description |
---|---|---|
images | Optional[List[str]] | List of base64 encoded images. |
audios | Optional[Any] | List of base64 encoded audios. |
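Since images are expected as base64 strings, the following minimal sketch shows one way to produce such a string from raw image bytes using only the standard library. The function name `encode_image_bytes` is hypothetical and is not part of RAI; the `preprocess_image` helper shown in the Usage section may be more convenient in practice.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Return the base64 text form of raw image bytes (hypothetical helper)."""
    return base64.b64encode(raw).decode("utf-8")


# The 8-byte signature that starts every PNG file, used here as stand-in image data.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = encode_image_bytes(png_signature)
print(encoded)
```

Decoding the result with `base64.b64decode` returns the original bytes, which is a quick way to check that an image survived the round trip.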
Source code in rai/messages/multimodal.py
Subclasses
rai.messages.multimodal.HumanMultimodalMessage
Bases: HumanMessage, MultimodalMessage
Source code in rai/messages/multimodal.py
rai.messages.multimodal.AIMultimodalMessage
Bases: AIMessage, MultimodalMessage
Source code in rai/messages/multimodal.py
rai.messages.multimodal.SystemMultimodalMessage
Bases: SystemMessage, MultimodalMessage
Source code in rai/messages/multimodal.py
rai.messages.multimodal.ToolMultimodalMessage
Bases: ToolMessage, MultimodalMessage
Note
When any subclass of this class is used with LangGraph agents, use ToolRunner (from rai.agents.langchain.core import ToolRunner) as the tool runner. It automatically handles multimodal ToolMessages and converts them to a format that is compatible with the vendor.
rai.agents.langchain.core.ToolRunner
Bases: RunnableCallable
Source code in rai/agents/langchain/core/tool_runner.py
get_messages(input)
Get the fields from input that will be processed.
Source code in rai/agents/langchain/core/tool_runner.py
update_input_with_outputs(input, outputs)
Update input with tool outputs.
Source code in rai/agents/langchain/core/tool_runner.py
Source code in rai/messages/multimodal.py
Usage
Example:
from rai import get_llm_model
from rai.messages import HumanMultimodalMessage, preprocess_image

# Fetch the image from the URL; the result is passed on as a base64 encoded image
base64_image = preprocess_image('https://raw.githubusercontent.com/RobotecAI/RobotecGPULidar/develop/docs/image/rgl-logo.png')
# Initialize the model of your chosen vendor, as defined in config.toml
llm = get_llm_model(model_type='complex_model')
msg = [HumanMultimodalMessage(content='Describe the image', images=[base64_image])]
llm.invoke(msg).pretty_print()
# ================================== Ai Message ==================================
#
# The image features the words "Robotec," "GPU," and "Lidar" displayed in a stylized,
# multicolored font against a black background. The text has a wavy, striped pattern,
# incorporating red, green, and blue colors that give it a vibrantly layered appearance.
HumanMultimodalMessage, SystemMultimodalMessage, and AIMultimodalMessage are implemented identically.
ToolMultimodalMessage usage
Most vendors do not support multimodal tool messages. ToolMultimodalMessage therefore adds a postprocess method, which converts the message into a format that is compatible with the chosen vendor.
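The general idea behind such a conversion can be sketched in plain Python. This is an illustrative assumption about the technique (moving images out of a tool message into a follow-up message), not RAI's actual postprocess implementation; the names `ToolResult`, `PlainMessage`, and `split_tool_result` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlainMessage:
    role: str
    content: str
    images: List[str] = field(default_factory=list)


@dataclass
class ToolResult:
    content: str
    tool_call_id: str
    images: List[str] = field(default_factory=list)


def split_tool_result(result: ToolResult) -> List[PlainMessage]:
    """Split a multimodal tool result into text-only and image-carrying messages."""
    # The tool message keeps only the text part.
    messages = [PlainMessage(role="tool", content=result.content)]
    if result.images:
        # Images travel in a separate human message that refers to the tool call.
        messages.append(
            PlainMessage(
                role="human",
                content=f"Image returned by tool call {result.tool_call_id}",
                images=list(result.images),
            )
        )
    return messages


messages = split_tool_result(ToolResult("Camera frame captured", "call_1", ["aGVsbG8="]))
for message in messages:
    print(message.role, message.content, len(message.images))
```

Keeping the tool message text-only while attaching images to a follow-up message is one common workaround; the format a given vendor actually requires may differ.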
See Also
- Agents: For more information on the different types of agents in RAI
- Aggregators: For more information on the different types of aggregators in RAI
- Connectors: For more information on the different types of connectors in RAI
- Langchain Integration: For more information on the LangChain integration within RAI
- Runners: For more information on the different types of runners in RAI