ChatCerebras
This notebook provides a quick overview for getting started with Cerebras chat models. For detailed documentation of all ChatCerebras features and configurations, head to the API reference.
At Cerebras, we've developed the world's largest and fastest AI processor, the Wafer-Scale Engine-3 (WSE-3). The Cerebras CS-3 system, powered by the WSE-3, represents a new class of AI supercomputer that sets the standard for generative AI training and inference with unparalleled performance and scalability.
With Cerebras as your inference provider, you can:
- Achieve unprecedented speed for AI inference workloads
- Build commercially with high throughput
- Effortlessly scale your AI workloads with our seamless clustering technology
Our CS-3 systems can be quickly and easily clustered to create the largest AI supercomputers in the world, making it simple to place and run the largest models. Leading corporations, research institutions, and governments are already using Cerebras solutions to develop proprietary models and train popular open-source models.
Want to experience the power of Cerebras? Check out our website for more resources and explore options for accessing our technology through the Cerebras Cloud or on-premises deployments!
For more information about Cerebras Cloud, visit cloud.cerebras.ai. Our API reference is available at inference-docs.cerebras.ai.
Overview
Integration details
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
ChatCerebras | langchain-cerebras | ❌ | beta | ❌ |
Model features
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|---|
✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ |
Setup
pip install langchain-cerebras
Credentials
Get an API Key from cloud.cerebras.ai and add it to your environment variables:
export CEREBRAS_API_KEY="your-api-key-here"
import getpass
import os

# Prompt for the key only if it is not already set in the environment.
if "CEREBRAS_API_KEY" not in os.environ:
    os.environ["CEREBRAS_API_KEY"] = getpass.getpass("Enter your Cerebras API key: ")
Enter your Cerebras API key: ········
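For scripts that run non-interactively (CI jobs, servers), prompting with getpass is not an option. The following is a minimal sketch of a fail-fast check that uses only the standard library. The helper name `require_api_key` is hypothetical and is not part of langchain-cerebras.

```python
import os


def require_api_key(env=None, name="CEREBRAS_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    `require_api_key` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app.")
    return key


# Example with an explicit mapping, so nothing depends on the real environment.
print(require_api_key({"CEREBRAS_API_KEY": "your-api-key-here"}))
```

Calling `require_api_key()` with no arguments reads from `os.environ`, so it can run once at startup before the model is instantiated.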
To enable automated tracing of your model calls, set your LangSmith API key:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
Installation
The LangChain Cerebras integration lives in the langchain-cerebras
package:
%pip install -qU langchain-cerebras
Instantiation
Now we can instantiate our model object and generate chat completions:
from langchain_cerebras import ChatCerebras
llm = ChatCerebras(
    model="llama-3.3-70b",
    # other params...
)
Invocation
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content='Je adore le programmation.', response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 35, 'total_tokens': 42}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_be27ec77ff', 'finish_reason': 'stop'}, id='run-e5d66faf-019c-4ac6-9265-71093b13202d-0', usage_metadata={'input_tokens': 35, 'output_tokens': 7, 'total_tokens': 42})
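The returned message carries token counts in its `usage_metadata` field, as the output above shows. The sketch below formats such a dictionary for logging. `summarize_usage` is a hypothetical helper, and the numbers are copied from the example output rather than produced by a live call.

```python
def summarize_usage(usage):
    """Format a usage dict holding input/output/total token counts."""
    inp = usage.get("input_tokens", 0)
    out = usage.get("output_tokens", 0)
    # Fall back to the sum when the total is not reported.
    total = usage.get("total_tokens", inp + out)
    return f"{inp} prompt + {out} completion = {total} tokens"


# Values copied from the example output above; in a live session,
# pass ai_msg.usage_metadata instead.
usage = {"input_tokens": 35, "output_tokens": 7, "total_tokens": 42}
print(summarize_usage(usage))  # 35 prompt + 7 completion = 42 tokens
```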
Chaining
We can chain our model with a prompt template like so:
from langchain_cerebras import ChatCerebras
from langchain_core.prompts import ChatPromptTemplate
llm = ChatCerebras(
    model="llama-3.3-70b",
    # other params...
)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

# Pipe the prompt into the model: the filled-in prompt becomes the model input.
chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)
AIMessage(content='Ich liebe Programmieren!\n\n(Literally: I love programming!)', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 30, 'total_tokens': 44}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_be27ec77ff', 'finish_reason': 'stop'}, id='run-e1d2ebb8-76d1-471b-9368-3b68d431f16a-0', usage_metadata={'input_tokens': 30, 'output_tokens': 14, 'total_tokens': 44})
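To build intuition for what `prompt | llm` does, here is a toy sketch of pipe-style composition in plain Python. It is an illustration under simplifying assumptions, not LangChain's implementation: the `Step` class, `toy_prompt`, and `toy_model` are hypothetical stand-ins, and no API call is made.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy prompt: fills the same placeholders as the template above.
template = (
    "You are a helpful assistant that translates "
    "{input_language} to {output_language}.\n{input}"
)
toy_prompt = Step(lambda variables: template.format(**variables))

# Toy model: reports the prompt length instead of calling an API.
toy_model = Step(lambda text: f"received {len(text)} characters")

toy_chain = toy_prompt | toy_model
print(
    toy_chain.invoke(
        {
            "input_language": "English",
            "output_language": "German",
            "input": "I love programming.",
        }
    )
)
```

The point of the sketch is the data flow: the dictionary passed to `invoke` first fills the template, and the resulting text is then handed to the next step.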
Streaming
from langchain_cerebras import ChatCerebras
from langchain_core.prompts import ChatPromptTemplate
llm = ChatCerebras(
    model="llama-3.3-70b",
    # other params...
)

system = "You are an expert on animals who must answer questions in a manner that a 5 year old can understand."
human = "I want to learn more about this animal: {animal}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | llm

# Print each chunk as soon as it arrives instead of waiting for the full reply.
for chunk in chain.stream({"animal": "Lion"}):
    print(chunk.content, end="", flush=True)
OH BOY! Let me tell you all about LIONS!
Lions are the kings of the jungle! They're really big and have beautiful, fluffy manes around their necks. The mane is like a big, golden crown!
Lions live in groups called prides. A pride is like a big family, and the lionesses (that's what we call the female lions) take care of the babies. The lionesses are like the mommies, and they teach the babies how to hunt and play.
Lions are very good at hunting. They work together to catch their food, like zebras and antelopes. They're super fast and can run really, really fast!
But lions are also very sleepy. They like to take long naps in the sun, and they can sleep for up to 20 hours a day! Can you imagine sleeping that much?
Lions are also very loud. They roar really loudly to talk to each other. It's like they're saying, "ROAR! I'm the king of the jungle!"
And guess what? Lions are very social. They like to play and cuddle with each other. They're like big, furry teddy bears!
So, that's lions! Aren't they just the coolest?
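The loop above prints chunks as they arrive but does not keep them. If the complete reply is also needed afterwards, the pieces can be accumulated while streaming. The sketch below assumes a stand-in generator, `fake_stream`, in place of a live `chain.stream(...)` call; both `fake_stream` and `collect` are hypothetical names.

```python
def fake_stream():
    """Stand-in for chain.stream(...): yields text pieces one at a time."""
    for piece in ["Lions ", "live ", "in ", "prides."]:
        yield piece


def collect(stream):
    """Print each piece as it arrives and return the full text."""
    parts = []
    for piece in stream:
        print(piece, end="", flush=True)
        parts.append(piece)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = collect(fake_stream())
```

With a real chain, the same helper could be called as `collect(chunk.content for chunk in chain.stream({"animal": "Lion"}))`.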