Benchmark of Qwen2.5 Model Performance

Test Conditions

  • Test Board: S100P.

  • Performance Data Acquisition: Run a single prompt and record two metrics: TTFT (Time to First Token, in milliseconds) and TPS (Average Tokens Per Second).

  • Python Version: Python 3.10.

  • Runtime Environment: Linux.
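
The two metrics above can be illustrated with a minimal timing sketch. This is not the harness used for these measurements. The helper name measure_stream and the demo generator fake_token_stream are hypothetical, and the sketch assumes that the model exposes generated tokens as a Python iterable and that TPS is the token count divided by the total wall-clock time from request to last token. A harness that excludes the prefill phase from TPS would report different numbers.

```python
import time
from typing import Callable, Dict, Iterable, Iterator


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Consume a token stream and report TTFT (ms) and average TPS.

    Assumption: TPS = tokens / (time of last token - request start).
    """
    start = clock()
    first = None  # timestamp of the first token
    last = start  # timestamp of the most recent token
    count = 0
    for _ in tokens:
        last = clock()
        if first is None:
            first = last
        count += 1
    if first is None:
        raise ValueError("the stream produced no tokens")
    total_s = last - start
    return {
        "tokens": count,
        "ttft_ms": (first - start) * 1000.0,
        "tps": count / total_s if total_s > 0 else float("inf"),
    }


def fake_token_stream(n: int, prefill_s: float, step_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a model: sleeps instead of running inference."""
    time.sleep(prefill_s)  # simulated prompt processing before the first token
    for i in range(n):
        if i > 0:
            time.sleep(step_s)  # simulated per-token decode time
        yield f"tok{i}"


if __name__ == "__main__":
    result = measure_stream(fake_token_stream(n=8, prefill_s=0.02, step_s=0.01))
    print(f"TTFT: {result['ttft_ms']:.1f} ms, TPS: {result['tps']:.2f}")
```

The clock is passed in as a parameter so that the timing logic can be checked with a deterministic fake clock. To time a real model, replace fake_token_stream with whatever streaming interface the runtime provides.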

Measured Data

model                  platform  dtype  seqlen  max context  TTFT (ms)  TPS    memory (GB)
Qwen2.5-1.5B           S100P     q8     256     1024         130        24.04  1.8
Qwen2.5-1.5B-Instruct  S100P     q8     256     1024         130        24.40  1.8
Qwen2.5-7B             S100P     q8     256     1024         535        6.67   7.4
Qwen2.5-7B-Instruct    S100P     q8     256     1024         534        6.75   7.4