seed-oss

30.2K Downloads

Advanced reasoning model from ByteDance with flexible "thinking budget" control and ability to reflect on the length of its own reasoning

Models
Updated 2 days ago
21.00 GB

Memory Requirements

To run the smallest seed-oss, you need at least 21 GB of RAM.
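As a rough sanity check on that figure, the 21 GB size can be related to the 36B parameter count listed in the Model Summary. The sketch below is only a back-of-the-envelope estimate: it assumes 1 GB = 1e9 bytes and that the file is almost entirely weights, and the function names are hypothetical helpers, not part of any tool. Context (KV cache) memory and runtime overhead come on top of the weights.

```python
def bits_per_weight(file_size_gb: float, n_params_billion: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 1e9 bytes."""
    total_bits = file_size_gb * 1e9 * 8
    return total_bits / (n_params_billion * 1e9)


def weights_size_gb(n_params_billion: float, bits: float) -> float:
    """Approximate weight storage in GB (1e9 bytes) at a given precision."""
    return n_params_billion * 1e9 * bits / 8 / 1e9


if __name__ == "__main__":
    # 21 GB spread over 36B parameters is well below 16 bits per weight,
    # which suggests the smallest variant is a quantized build.
    print(f"{bits_per_weight(21.0, 36.0):.2f} bits per weight")
    # For comparison: the same parameter count stored at 16-bit precision.
    print(f"{weights_size_gb(36.0, 16):.0f} GB at 16-bit precision")
```

Under these assumptions the smallest variant averages under 5 bits per weight, while an unquantized 16-bit copy of the same weights would need roughly 72 GB.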

Capabilities

seed-oss models support tool use and reasoning. They are available in the GGUF and MLX formats.

About seed-oss

Seed-OSS is a series of open-source large language models developed by ByteDance's Seed Team, designed for strong long-context, reasoning, agentic, and general capabilities, with versatile developer-friendly features. Although trained on only 12T tokens, Seed-OSS achieves excellent performance on several popular open benchmarks.

You can configure the model's thinking budget within LM Studio's chat interface.

Key Features

  • Flexible Control of Thinking Budget: Users can adjust the reasoning length as needed. Dynamically controlling the reasoning length improves inference efficiency in practical applications.
  • Enhanced Reasoning Capability: Specifically optimized for reasoning tasks while maintaining balanced and excellent general capabilities.
  • Agentic Intelligence: Performs exceptionally well in agentic tasks such as tool use and issue resolution.
  • Research-Friendly: Because including synthetic instruction data in pre-training may affect post-training research, we released pre-trained models both with and without instruction data, giving the research community more options.
  • Native Long Context: Trained natively with a context length of up to 512K.

Model Summary

Seed-OSS adopts the popular causal language model architecture with RoPE, GQA attention, RMSNorm and SwiGLU activation.

Parameters: 36B
Attention: GQA
Activation Function: SwiGLU
Number of Layers: 64
Number of QKV Heads: 80 / 8 / 8
Head Size: 128
Hidden Size: 5120
Vocabulary Size: 155K
Context Length: 512K
RoPE Base Frequency: 1e7
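The figures above are enough to estimate how much memory the attention cache needs at long context, which is where GQA pays off. The following is a minimal sketch, assuming 16-bit cache values, that "512K" means 512 × 1024 tokens, and no cache quantization or other runtime optimizations. The constant and function names are illustrative and do not belong to any library.

```python
# Values taken from the Model Summary table.
N_LAYERS = 64
N_Q_HEADS = 80
N_KV_HEADS = 8
HEAD_SIZE = 128


def kv_cache_bytes(n_tokens: int, n_kv_heads: int = N_KV_HEADS,
                   bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for n_tokens.

    The leading factor 2 accounts for one key and one value vector
    per KV head, per layer, per token. bytes_per_value=2 assumes a
    16-bit cache (an assumption, not stated in the table).
    """
    return 2 * N_LAYERS * n_kv_heads * HEAD_SIZE * bytes_per_value * n_tokens


def gqa_group_size() -> int:
    """Number of query heads that share each key/value head."""
    return N_Q_HEADS // N_KV_HEADS


if __name__ == "__main__":
    full_context = 512 * 1024  # assumed meaning of "512K"
    gib = 1024 ** 3
    print(f"Query heads per KV head: {gqa_group_size()}")
    print(f"KV cache per token: {kv_cache_bytes(1) // 1024} KiB")
    print(f"KV cache at full context: {kv_cache_bytes(full_context) / gib:.0f} GiB")
    # Hypothetical comparison: one KV head per query head (no GQA).
    no_gqa = kv_cache_bytes(full_context, n_kv_heads=N_Q_HEADS)
    print(f"Same context without GQA: {no_gqa / gib:.0f} GiB")
```

Under these assumptions each token costs 256 KiB of cache, so filling the full context window would take far more memory than the weights themselves. In practice a shorter context length or a quantized cache keeps this manageable.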

License

Seed-OSS is Apache-2.0 licensed.