Install

Start from the same requirements and quickstart path exposed upstream.

The original repo is documentation-heavy, so the install surface is mostly about environment constraints, package baselines, and the shortest path to a first chat call.


Runtime baseline

The upstream README calls out Python 3.8+, PyTorch 1.12+, Transformers 4.32+, and CUDA 11.4+ as the baseline environment.

Flash Attention is optional, but the README recommends it for supported fp16 or bf16 devices to improve efficiency and reduce memory usage.

  • Python 3.8 or newer
  • PyTorch 1.12 or newer, with 2.0+ recommended
  • Transformers 4.32 or newer
  • CUDA 11.4+ for GPU-oriented paths
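The package floors above can be verified before installing anything heavier. The sketch below is an illustrative helper, not part of the upstream repo; it reads installed versions with the standard library and compares them against the README minimums (Python itself can be checked via `sys.version_info` the same way).

```python
from importlib import metadata

# Minimum package versions from the upstream README baseline.
MINIMUMS = {"torch": "1.12", "transformers": "4.32"}

def version_tuple(v: str) -> tuple:
    """Parse a dotted version string into ints, ignoring local suffixes like +cu118."""
    parts = []
    for piece in v.replace("+", ".").split("."):
        num = ""
        for ch in piece:
            if not ch.isdigit():
                break
            num += ch
        if not num:
            break
        parts.append(int(num))
    return tuple(parts)

def meets_minimum(installed: str, minimum: str) -> bool:
    """True when the installed version is at or above the required floor."""
    return version_tuple(installed) >= version_tuple(minimum)

for pkg, floor in MINIMUMS.items():
    try:
        installed = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        print(f"{pkg}: not installed (need >= {floor})")
        continue
    status = "ok" if meets_minimum(installed, floor) else f"too old (need >= {floor})"
    print(f"{pkg} {installed}: {status}")
```

Tuple comparison makes the check robust to versions like `2.0.1+cu118`, which a plain string comparison would mishandle.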

Quickstart flow

  1. Install baseline dependencies

    Start with `pip install -r requirements.txt` if you want the simplest source-aligned local environment.

  2. Add Flash Attention only when the hardware supports it

    Treat flash-attention as an optimization layer, not a prerequisite, because the upstream README explicitly says the project still runs without it.

  3. Load the chat checkpoint with `trust_remote_code=True`

    The official quickstart shows `AutoTokenizer` and `AutoModelForCausalLM` loading the chat model directly from the public model hub.
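Steps 2 and 3 above can be combined with a soft probe: detect whether the optional flash-attention package is importable and only then opt in. The gate below is a sketch; the `use_flash_attn` keyword is an assumption to illustrate the pattern, so check the model card for the exact argument name your checkpoint's remote code accepts.

```python
import importlib.util

def flash_attention_available() -> bool:
    """True when the optional flash_attn package is importable.

    Flash-attention is an optimization layer, not a prerequisite:
    the model still runs without it, so absence is a soft fallback.
    """
    return importlib.util.find_spec("flash_attn") is not None

# Build from_pretrained kwargs, enabling the optimization only when present.
load_kwargs = {"device_map": "auto", "trust_remote_code": True}
if flash_attention_available():
    # Hypothetical flag name for illustration; verify against the model card.
    load_kwargs["use_flash_attn"] = True
print(load_kwargs)
```

Keeping the optimization behind a capability check means the same script works on machines without supported fp16/bf16 hardware.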

Minimal Transformers example

The upstream quickstart centers the local experience on a direct `model.chat()` flow.

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat",
    device_map="auto",
    trust_remote_code=True
).eval()

response, history = model.chat(tokenizer, "你好", history=None)  # "你好" means "Hello"
print(response)
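The `history` value returned above is what makes the flow multi-turn: you pass it back into the next call so the model sees prior turns. The stand-in below mimics that threading with a fake chat function (no model required) purely to show the shape of the loop; upstream, `history` is a list of (query, response) pairs.

```python
# fake_chat stands in for model.chat() to illustrate how history threads
# through successive turns: pass the returned history into the next call.
def fake_chat(query, history=None):
    history = list(history or [])
    response = f"echo: {query}"   # placeholder for the model's reply
    history.append((query, response))
    return response, history

response, history = fake_chat("Hello", history=None)
response, history = fake_chat("Tell me more", history=history)
print(len(history))  # two accumulated (query, response) turns
```

With the real model, the loop is identical: `response, history = model.chat(tokenizer, next_query, history=history)`.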

Where builders usually branch next

  • Hugging Face (checkpoint hub): use the public Qwen organization when you want the standard open-source model-card and checkpoint flow.
  • ModelScope (checkpoint hub): mirror the same model line in the China-friendly distribution hub used throughout the original docs.
  • Docker images (runtime shortcut): the README also points to prebuilt Docker images for faster environment setup when you do not want to build from scratch.

Source anchors

Install and Quickstart | Qwen Code