AIGC Performance Analysis Tool: One-Click Analysis, Results Presented in Full with Charts and Text

Written by
Jasper Cole
Updated on: June 30, 2025
Recommendation

Analyze large-model performance data with one click and present the analysis results fully in charts and text.

Core content:
1. Background and core goals of performance analysis tools
2. Tool architecture design and data processing flow
3. Performance indicator analysis and visualization chart implementation


Performance Analysis Tool: Code Analysis Report

 

Background

  • Last time we implemented a script for testing large models; this tool builds on that script, reading the Excel data it produces to generate analysis charts
  • The tool implements core functions such as performance data loading, analysis, visualization, and report generation
  • The core goal is to help evaluate and analyze the performance of large language models across different test scenarios
 

Part 1: Tool Architecture Design

Core class design

Adopts an object-oriented design: all functionality is encapsulated in a performance-analysis class (a skeleton sketch follows)

Implements modular functions for data loading, metric calculation, visualization, and report generation

Uses the logging module for log management, providing complete run-status tracking
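The original post does not show the class definition itself, so here is a minimal sketch of what the skeleton and logging setup might look like. The class name, attribute names, and the run() orchestrator are illustrative assumptions, not taken from the original script; only the method names _load_data, _calculate_test_type_metrics, and create_visualizations appear in the excerpts below.

import logging

import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.figure import Figure


class PerformanceAnalyzer:
    """Illustrative skeleton; the actual class name in the script may differ."""

    def __init__(self, data_file: str) -> None:
        self.data_file = data_file
        self.df: pd.DataFrame | None = None
        self.test_type_metrics: pd.DataFrame | None = None
        # The logging module provides run-status tracking throughout
        logging.basicConfig(
            level=logging.INFO,
            format='%(asctime)s %(levelname)s %(message)s',
        )
        self.logger = logging.getLogger(self.__class__.__name__)
        self.logger.info('Analyzer initialized for %s', data_file)

    def run(self) -> None:
        # Orchestrate the modular steps: load, analyze, visualize
        self._load_data()
        self._calculate_test_type_metrics()
        self.create_visualizations()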

Data processing flow

Support test data input in Excel format

Implement data preprocessing and validation, including timestamp conversion, test type mapping, etc.

Automatically calculate key performance indicators such as throughput and response time

def _load_data(self) -> None:
    # Read the Excel test data into a DataFrame
    self.df = pd.read_excel(self.data_file)

    # Convert the timestamp column to datetime objects
    self.df['timestamp'] = pd.to_datetime(self.df['timestamp'])

    # Map raw test IDs to display names via test_type_map
    self.df['test_type'] = self.df['test_id'].apply(
        lambda x: test_type_map['basic_test'] if x == 'basic_test'
        else (test_type_map['long_text_test'] if x == 'long_text_test'
              else test_type_map['concurrency_test'])
    )

    # Throughput = tokens generated per second of total elapsed time
    self.df['throughput'] = self.df['total_tokens_generated'] / self.df['total_time']

Part 2: Performance Indicator Analysis Functions

Core performance indicators

Response time analysis: total response time, minimum/maximum response time

Token latency analysis: first-token latency, average token latency

Throughput analysis: tokens generated per second

Concurrency performance: request success rate, requests per second, etc. (see the sketch below)
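The concurrency metrics are not shown in the excerpted code. Here is a hedged sketch of how success rate and requests per second could be derived from the same DataFrame; the 'success' column and the windowing logic are assumptions, not confirmed by the original.

def _calculate_concurrency_metrics(self) -> None:
    # Assumed columns: 'success' (bool per request) and 'timestamp' (datetime)
    conc = self.df[self.df['test_id'] == 'concurrency_test']
    success_rate = conc['success'].mean()  # fraction of successful requests
    window = (conc['timestamp'].max() - conc['timestamp'].min()).total_seconds()
    requests_per_second = len(conc) / window if window > 0 else float('nan')
    self.logger.info('Success rate: %.1f%%, requests/s: %.2f',
                     success_rate * 100, requests_per_second)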

Visual analysis

Implement four core chart types:

  1. Response time comparison by test type

  2. Token latency comparison

  3. Throughput comparison

  4. Concurrency test response time trend

def _calculate_test_type_metrics(self) -> None:
    # Aggregate multiple metrics per test type
    self.test_type_metrics = self.df.groupby('test_type').agg({
        'total_time': ['mean', 'min', 'max'],
        'first_token_latency': 'mean',
        'avg_token_latency': 'mean',
        'total_tokens_generated': 'mean',
        'throughput': 'mean'
    }).reset_index()
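One practical detail worth noting: passing a list of aggregations to .agg produces MultiIndex columns such as ('total_time', 'mean'). A flattening step like the one below (an assumption, not shown in the original) keeps the later CSV export and plotting code simple:

# Flatten ('total_time', 'mean') -> 'total_time_mean' so CSV headers stay simple
self.test_type_metrics.columns = [
    '_'.join(filter(None, col)) if isinstance(col, tuple) else col
    for col in self.test_type_metrics.columns
]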
 

Part 3: Output and Reporting Functions

Data export capability

Supports exporting performance indicators to CSV format

Exports visualization charts in PNG format

Provides structured printouts of performance indicators, as sketched below
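A hedged sketch of what the export step might look like. The method name, default paths, and the utf-8-sig encoding (which keeps Chinese labels readable when the CSV is opened in Excel) are illustrative assumptions:

def export_results(self, csv_path: str = 'performance_metrics.csv',
                   png_path: str = 'performance_analysis.png') -> None:
    # Write the aggregated metrics table to CSV
    self.test_type_metrics.to_csv(csv_path, index=False, encoding='utf-8-sig')
    # Render and save the 2x2 chart grid as PNG
    self.create_visualizations(output_path=png_path)
    self.logger.info('Exported metrics to %s and charts to %s', csv_path, png_path)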
 

Test type support

Basic Test

Long Text Test

Concurrency Test

Supports bilingual (Chinese/English) display of test type names; a sketch of such a mapping follows
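The test_type_map referenced in _load_data is not shown in the excerpt; a bilingual version might look like this (the exact labels are assumptions):

# Illustrative mapping; the actual keys/labels in the script may differ
test_type_map = {
    'basic_test': '基础测试 (Basic Test)',
    'long_text_test': '长文本测试 (Long Text Test)',
    'concurrency_test': '并发测试 (Concurrency Test)',
}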

def create_visualizations(self, output_path: str = 'performance_analysis.png') -> Figure:
    # Create a 2x2 subplot layout
    fig, axes = plt.subplots(2, 2, figsize=(16, 12))

    # Draw the four chart types
    self._plot_response_time_comparison(axes[0, 0])
    self._plot_token_latency_comparison(axes[0, 1])
    self._plot_throughput_comparison(axes[1, 0])
    self._plot_concurrency_response_time(axes[1, 1])

    # Save and return the figure (this ending is assumed from the signature
    # and the PNG export described above; it is not in the original excerpt)
    fig.tight_layout()
    fig.savefig(output_path)
    return fig
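The four _plot_* helpers are not included in the excerpt. As a minimal sketch, one of them could draw a bar chart from the per-test-type means computed in Part 2, assuming the flattened column names from the earlier sketch:

def _plot_response_time_comparison(self, ax) -> None:
    # Bar chart of mean total response time per test type
    m = self.test_type_metrics
    ax.bar(m['test_type'], m['total_time_mean'])
    ax.set_title('Response Time by Test Type')
    ax.set_xlabel('Test type')
    ax.set_ylabel('Mean total time (s)')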