Invoice recognition based on the QwenVL2.5 model

Written by
Clara Bennett
Updated on: June 16, 2025
Recommendation

A new breakthrough in invoice recognition: accurate information extraction based on the QwenVL2.5 model.

Core content:
1. Extracting key information from invoice images using the QwenVL2.5 model
2. Analysis of the relationship between model performance and parameter size
3. Recommendations for model selection in application scenarios with high real-time requirements

 
Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

Test data

Accurate, real invoice image data

 

Prompt

"Please extract the following information from the invoice image:\n" +
"- Invoice code\n" +
"- Invoice number\n" +
"- Invoice date\n" +
"- Verification code\n" +
"- Machine number\n" +
"- Name of the buyer\n" +
"- Buyer's taxpayer identification number\n" +
"- Buyer's address and phone number\n" +
"- Buyer's bank account and account number\n" +
"- Name of the seller\n" +
"- Seller's taxpayer identification number\n" +
"- Seller's address and phone number\n" +
"- Seller's bank account and account number\n" +
"- Item name\n" +
"- Specification model\n" +
"- Unit\n" +
"- Quantity\n" +
"- Unit price\n" +
"- Amount\n" +
"- Tax rate\n" +
"- Tax amount\n" +
"- Total price and tax (in capital letters)\n" +
"- Total price and tax (lowercase)\n" +
"- Payee\n" +
"- Review\n" +
"- Invoice issuer\n" +
"Extract the above key information accurately. Do not omit or fabricate information. A single character that is blurred or obscured by glare can be replaced with an English question mark (?). Return the data in JSON format, as follows: {\n" +
"'Invoice code':'xxx', 'Invoice number':'xxx', 'Invoice date':'xxx', 'Verification code':'xxx', 'Machine number':'xxx', 'Purchaser name':'xxx', 'Purchaser taxpayer identification number':'xxx', 'Purchaser address and phone number':'xxx', 'Purchaser bank and account number':'xxx', 'Seller name':'xxx', 'Seller taxpayer identification number':'xxx', 'Seller address and phone number':'xxx', 'Seller's bank account and account number':'xxx', 'Project name':'xxx', 'Specification model':'xxx', 'Unit':'xxx', 'Quantity':'xxx', 'Unit price':'xxx', 'Amount':'xxx', 'Tax rate':'xxx', 'Tax amount':'xxx', 'Total price and tax (uppercase)':'xxx', 'Total price and tax (lowercase)':'xxx', 'Payee':'xxx', 'Review':'xxx', 'Invoice issuer':'xxx'}"

 

 

• The timing data shows that response time tends to grow with model size, but not strictly: the 3B model is the fastest (7372 ms) and the 32B model is the slowest (23942 ms), slower even than the 72B model.

• The 7B model (10181 ms) is faster than the OCR-specific model (12434 ms), although the parameter count of the OCR model is not stated.

• The 72B model (15573 ms) is faster than the 32B model (23942 ms), which may be due to model architecture optimizations or differences in the test environment.

• For applications with strict real-time requirements, prefer the 3B or 7B version; where accuracy matters more, the 72B version is a reasonable trade-off.

• The 32B version behaved abnormally in this test, so its model configuration and test environment are worth checking.
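
To make the comparison above easier to read, the reported call times can be ranked and expressed relative to the fastest model. This is only a small sketch over the five numbers quoted in this article; the function name `rank_by_latency` is made up for illustration.

```python
# Interface call times reported in this test, in milliseconds.
latency_ms = {
    "qwen2.5-vl-3b-instruct": 7372,
    "qwen2.5-vl-7b-instruct": 10181,
    "qwen-vl-ocr-latest": 12434,
    "qwen2.5-vl-72b-instruct": 15573,
    "qwen2.5-vl-32b-instruct": 23942,
}


def rank_by_latency(latencies):
    """Return (model, ms, slowdown vs. fastest) tuples, fastest first."""
    fastest = min(latencies.values())
    ranked = sorted(latencies.items(), key=lambda item: item[1])
    return [(name, ms, round(ms / fastest, 2)) for name, ms in ranked]


ranking = rank_by_latency(latency_ms)
for name, ms, ratio in ranking:
    print(f"{name:<26} {ms:>6} ms  {ratio:.2f}x")
```

The ranking places the 32B model last at roughly 3.25 times the 3B call time, which is the anomaly noted above.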

 

qwen2.5-vl-72b-instruct model

Interface call time: 15573 milliseconds

{
  "Invoice code": "053002100113",
  "Invoice number": "00195236",
  "Invoice date": "December 5, 2022",
  "Verification code": "02954951630544033053",
  "Machine number": "497003211467",
  "Purchaser name": "Yunnan Aituo Information Technology Co., Ltd.",
  "Purchaser taxpayer identification number": "91530100792866047D",
  "Purchaser address and telephone number": "Room 1503, 15th Floor, Unit 1, Building C1, Classic Shuangcheng, High-tech Zone, Kunming, Yunnan Province 0871-65386218",
  "Purchaser bank and account number": "Fudian Bank, Kunming Wuhua Branch 242021010000495122",
  "Seller name": "Yunnan Yacai Technology Co., Ltd.",
  "Seller taxpayer identification number": "91530103MA6Q798R69",
  "Address and phone number of seller": "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province 13629663610",
  "Bank and account number of seller": "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644",
  "Project name": "*Computer network equipment*Fiber optic box\n*Computer network equipment*Fiber optic box\n*Computer network equipment*Fiber optic adapter",
  "Specification model": "DS-ZGHK-24/E\nDS-ZGHK-12/E\nDS-ZMFGHLS-2/E",
  "Unit": "Piece\nPiece\nPiece",
  "Quantity": "3\n8\n168",
  "Unit price": "143.362831858407\n106.194690265487\n4.42477876106195",
  "Amount": "430.09\n849.56\n743.36\n¥2023.01",
  "Tax rate": "13%\n13%\n13%",
  "Tax amount": "55.91\n110.44\n96.64\n¥262.99",
  "Total price and tax (in capital letters)": "2,286 yuan",
  "Total price and tax (in lower case)": "¥2286.00",
  "Payee": "Jiang Yunfang",
  "Review": "Zhao Jianbai",
  "Invoice issuer": "Dong Feixia"
}

 

qwen2.5-vl-32b-instruct model

Interface call time: 23942 milliseconds

{
  "Invoice Code": "053002100113",
  "Invoice Number": "00195236",
  "Invoicing Date": "December 5, 2022",
  "Verification code": "02954951630544033053",
  "Machine Number": "497003211467",
  "Purchasing Party Name": "Yunnan Aituo Information Technology Co., Ltd.",
  "Purchaser Taxpayer Identification Number": "91530100792866047D",
  "Purchasing party's address and phone number": "Room 1503, 15th Floor, Unit 1, Building C1, Classic Twin City, High-tech Zone, Kunming City, Yunnan Province, 0871-65386218",
  "Buyer's bank account and account number": "Fudian Bank, Kunming Wuhua Branch, 242021010000495122",
  "Seller Name": "Yunnan Yacai Technology Co., Ltd.",
  "Seller Taxpayer Identification Number": "91530103MA6Q798R69",
  "Seller's address and phone number": "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province, 13629663610",
  "Seller's bank account and account number": "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644",
  "Project Name": "*Computer Network Equipment*Fiber Optic Box\n*Computer Network Equipment*Fiber Optic Box\n*Computer Network Equipment*Fiber Optic Adapter\nTotal",
  "Specification Model": "DS-ZGHK-24/E\nDS-ZGHK-12/E\nDS-ZMFG HLS-2/E",
  "Unit": "pcs\npcs\npcs",
  "Quantity": "3\n8\n168",
  "Unit Price": "143.362831858407\n106.194690265487\n4.42477876106195",
  "Amount": "430.09\n849.56\n743.36\n¥2023.01",
  "Tax rate": "13%\n13%\n13%",
  "Tax amount": "55.91\n110.44\n96.64\n¥262.99",
  "Price and Tax Total (in capital letters)": "2,286 Yuan",
  "Price and tax total (lowercase)": "¥2286.00",
  "Payee": "Jiang Yunfang",
  "Review": "Zhao Jianbai",
  "Bill Issuer": "Dong Feixia"
}

 

qwen2.5-vl-7b-instruct model:

Interface call time: 10181 milliseconds

 

{
  "Invoice code": "?53002100113",
  "Invoice number": "00195236",
  "Invoice date": "December 5, 2022",
  "Verification code": "02954951630544033053",
  "Machine number": "497003211467",
  "Purchaser name": "Yunnan Aituo Information Technology Co., Ltd.",
  "Purchaser taxpayer identification number": "91530100792866047D",
  "Purchaser address and phone number": "Room 1503, 15th Floor, Unit 1, Building C1, Classic Shuangcheng, High-tech Zone, Kunming, Yunnan Province 0871-65386218",
  "Purchaser bank and account number": "Fudian Bank, Kunming Wuhua Branch 242021010000495122",
  "Seller name": "Yunnan Yacai Technology Co., Ltd.",
  "Seller taxpayer identification number": "91530103MA6Q798R69",
  "Address and phone number of seller": "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province 13629663610",
  "Bank and account number of seller": "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644",
  "Project name": "*Computer network equipment*Fiber optic box *Computer network equipment*Fiber optic box *Computer network equipment*Fiber optic adapter",
  "Specification model": "DS-ZGHK-24/E DS-ZGHK-12/E DS-ZMFGHLS-2/E",
  "Unit": "Yuan/set Set",
  "Quantity": "3 ? 168",
  "Unit price": "143,362831858407;106.194690265487 ;4 .42477876106195",
  "Amount": "¥:430.09; 849.56; 743.36",
  "Tax rate": "13%",
  "Tax amount": "55.91 yuan; 110.44 yuan; 96.64 yuan",
  "Total price and tax (in capital letters)": "(2,286 yuan)",
  "Total price and tax (in lower case)": "¥?:2286.00?",
  "Payee": "Jiang Yunfang",
  "Review": "Zhao Jianbai",
  "Invoice issuer": "Dong Feixia"
}

Note: Since some prices and numbers are omitted or incomplete, they are simplified during parsing and `?` is used to indicate the uncertain parts. In actual operation, adjustments and improvements should be made according to specific circumstances.
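
Because the prompt asks the model to write `?` for characters it cannot read, a downstream step can route such fields to manual review. The following is a minimal sketch, assuming the model reply is valid JSON with string values; the helper name `find_uncertain_fields` is hypothetical, and the sample values are copied from the 7B output above.

```python
import json


def find_uncertain_fields(model_output: str) -> dict:
    """Return the fields whose value contains the '?' placeholder."""
    record = json.loads(model_output)
    return {
        key: value
        for key, value in record.items()
        if isinstance(value, str) and "?" in value
    }


# A few fields taken from the 7B output shown above.
sample = json.dumps({
    "Invoice code": "?53002100113",
    "Invoice number": "00195236",
    "Quantity": "3 ? 168",
    "Total price and tax (in lower case)": "¥?:2286.00?",
})

flagged = find_uncertain_fields(sample)
for field, value in flagged.items():
    print(f"needs review: {field} = {value!r}")
```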

qwen2.5-vl-3b-instruct model:

Interface call time: 7372 milliseconds

 

{
  "Invoice code": "053002100113",
  "Invoice number": "00195236",
  "Invoice date": "December 5, 2022",
  "Verification code": "02954 95163 05440 33053",
  "Machine number": "497003211467",
  "Purchaser name": "Yunnan Aituo Information Technology Co., Ltd.",
  "Purchaser taxpayer identification number": "91530100792866047D",
  "Purchaser address and telephone number": "Room 1503, 15th Floor, Unit 1, Building C1, Shuangcheng, High-tech Zone, Kunming, Yunnan Province 0871-65386218",
  "Purchaser bank and account number": "Fudian Bank, Kunming Wuhua Branch 242021010000495122",
  "Seller name": "Yunnan Yacai Technology Co., Ltd.",
  "Seller taxpayer identification number": "91530103MA6Q798R69",
  "Address and phone number of seller": "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province 13629663610",
  "Bank and account number of seller": "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644",
  "Project name": "*Computer network equipment*Fiber optic box*Computer network equipment*Fiber optic box*Computer network equipment*Fiber optic adapter total",
  "Specification model": "DS-ZGHK-24/E?DS-ZGHK-12/E?DS-ZMFG HLS-2/E",
  "Unit": "Piece",
  "Quantity": "38168",
  "Unit price": "143.362831858407106.1946902654874.42477876106195",
  "Amount": "430.09849.56743.36",
  "Tax rate": "13%",
  "Tax amount": "55.91110.4496.64",
  "Total price and tax (in capital letters)": "2,286 yuan",
  "Total price and tax (in lower case)": "2,286.00",
  "Payee": "Jiang Yunfang",
  "Reviewer": "Zhao Jianbai",
  "Invoice issuer": "Dong Feixia"
}

 

qwen-vl-ocr-latest model:

Interface call time: 12434 milliseconds

 

{
  "Invoice code": "053002100113",
  "Invoice number": "00195236",
  "Invoice date": "December 5, 2022",
  "Verification code": "02954 95163 05440 33053",
  "Machine number": "497003211467",
  "Purchaser's name": "Yunnan Aituo Information Technology Co., Ltd.",
  "Purchaser's taxpayer identification number": "91530100792866047D",
  "Purchaser's address and telephone number": "Room 1503, 15th Floor, Unit 1, Building C1, Classic Twin Cities, High-tech Zone, Kunming, Yunnan Province 0871-65386218",
  "Purchaser's bank account and account number": "Fudian Bank, Kunming Wuhua Branch 242021010000495122",
  "Seller's name": "Yunnan Yacai Technology Co., Ltd.",
  "Seller's taxpayer identification number": "91530103MA6Q798R69",
  "Address and phone number of seller": "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province 13629663610",
  "Bank and account number of seller": "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644",
  "Project name": "*Computer network equipment*Fiber optic box*Computer network equipment*Fiber optic box*Computer network equipment*Fiber optic adapter total",
  "Specification model": "DS-ZGHK-24/EDS-ZGHK-12/EDS-ZMFGHLS-2/E",
  "Unit": "Piece by piece",
  "Quantity": "38168",
  "Unit price": "143.362831858407 106.194690265487 4.42477876106195",
  "Amount": "430.09849.56743.36",
  "Tax rate": "13%13%13%",
  "Tax amount": "55.91110.4496.64",
  "Total price and tax (in capital letters)": "2,286 yuan",
  "Total price and tax (in lower case)": "2,286.00",
  "Payee": "Jiang Yunfang",
  "Reviewer": "Zhao Jianbai",
  "Invoice issuer": "Dong Feixia"
}

 

 

Second round of testing

After adjusting the prompt

Please extract the content of this photo, where the content format is 'machine number', 'invoice code', 'invoice number', 'invoice date', 'verification code', 'purchaser name', 'purchaser taxpayer identification number', 'purchaser address, telephone number', 'opening bank and account number', 'goods or taxable services, service name', 'specification model', 'unit', 'quantity', 'unit price', 'amount', 'tax rate', 'tax amount', 'price and tax total (uppercase)', 'price and tax total (lowercase)', 'seller name', 'seller taxpayer identification number', 'seller address, telephone number', 'seller address, telephone number', 'opening bank and account number', 'remarks', 'payee', 'review', 'invoice issuer'. The returned result information is returned in json format.
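
A prompt of this shape is easy to get out of sync when written by hand; the version above, for instance, lists 'seller address, telephone number' twice. One option is to generate the prompt from a single field list. The sketch below is illustrative only: `build_prompt` is a made-up helper, and the field list is shortened.

```python
def build_prompt(fields):
    """Build the extraction prompt from a field list, dropping duplicates."""
    unique_fields = list(dict.fromkeys(fields))  # keeps first occurrence, in order
    quoted = ", ".join(f"'{name}'" for name in unique_fields)
    return (
        "Please extract the content of this photo, where the content format is "
        f"{quoted}. The returned result information is returned in json format."
    )


# Shortened field list for illustration; the repeated entry is intentional.
fields = [
    "machine number",
    "invoice code",
    "seller address, telephone number",
    "seller address, telephone number",
    "payee",
]

prompt = build_prompt(fields)
print(prompt)
```

Keeping the field list in one place also makes it possible to check a model reply against the same list later.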

 

72B-VL

{
"Machine Number" : "497003211467" ,
"Invoice Code" : "053002100113" ,
"Invoice Number" : "00195236" ,
"Invoicing Date" : "December 5, 2022" ,
"Verification code" : "02954 95163 05440 33053" ,
"Purchasing Party Name" : "Yunnan Aituo Information Technology Co., Ltd." ,
"Purchaser Taxpayer Identification Number" : "91530100792866047D" ,
"Purchasing party's address and phone number" : "Room 1503, 15th Floor, Unit 1, Building C, Classic Twin City, High-tech Zone, Kunming City, Yunnan Province, 0871-65386218" ,
"Bank and account number" : "Fudian Bank, Kunming Wuhua Branch 242021010000495122" ,
"Name of goods or taxable labor or service" : [
{
"Project Name" : "*Computer Network Equipment*Fiber Optic Box" ,
"Specification Model" : "DS-ZGHK-24/E" ,
"Unit" : "piece" ,
"Quantity" : "3" ,
"Unit Price" : "143.302831858407" ,
"Amount" : "430.09" ,
"Tax rate" : "13%" ,
"Tax amount" : "55.91"
} ,
{
"Project Name" : "*Computer Network Equipment*Fiber Optic Box" ,
"Specification Model" : "DS-ZGHK-12/E" ,
"Unit" : "piece" ,
"Quantity" : "8" ,
"Unit Price" : "106.194690265487" ,
"Amount" : "849.56" ,
"Tax rate" : "13%" ,
"Tax amount" : "110.44"
} ,
{
"Project Name" : "*Computer Network Equipment*Fiber Optic Adapter" ,
"Specification Model" : "DS-ZMFGHLS-2/E" ,
"Unit" : "piece" ,
"Quantity" : "168" ,
"Unit Price" : "4.42477876106195" ,
"Amount" : "743.36" ,
"Tax rate" : "13%" ,
"Tax amount" : "96.64"
}
] ,
"Price and Tax Total (in capital letters)" : "2,286 Yuan" ,
"Price and tax total (lowercase)" : "¥2286.00" ,
"Seller Name" : "Yunnan Yacai Technology Co., Ltd." ,
"Seller Taxpayer Identification Number" : "91530103MA6Q798R69" ,
"Seller's address and phone number" : "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province, 13629663610" ,
"Seller's bank account and account number" : "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644" ,
"Remark" : "" ,
"Payee" : "Jiang Yunfang" ,
"Review" : "Zhao Jianbai" ,
"Bill Issuer" : "Dong Feixia"
}
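
Structured line items make arithmetic cross-checks possible. In the 72B-VL output above, the first unit price reads 143.302831858407, while the first-round outputs read 143.362831858407; comparing quantity × unit price with the amount exposes this kind of misread digit. The sketch below assumes amounts are rounded half-up to the cent and must match exactly, which may be too strict for some invoices; `check_line_item` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def round_to_cent(value: Decimal) -> Decimal:
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def check_line_item(item: dict) -> list:
    """Return a list of arithmetic inconsistencies found in one line item."""
    quantity = Decimal(item["Quantity"])
    unit_price = Decimal(item["Unit Price"])
    amount = Decimal(item["Amount"])
    tax_rate = Decimal(item["Tax rate"].rstrip("%")) / Decimal(100)
    tax_amount = Decimal(item["Tax amount"])

    problems = []
    expected_amount = round_to_cent(quantity * unit_price)
    if expected_amount != amount:
        problems.append(
            f"Amount {amount} != Quantity x Unit Price = {expected_amount}"
        )
    expected_tax = round_to_cent(amount * tax_rate)
    if expected_tax != tax_amount:
        problems.append(
            f"Tax amount {tax_amount} != Amount x Tax rate = {expected_tax}"
        )
    return problems


# Numeric fields of the three line items from the 72B-VL output above.
items = [
    {"Quantity": "3", "Unit Price": "143.302831858407", "Amount": "430.09",
     "Tax rate": "13%", "Tax amount": "55.91"},
    {"Quantity": "8", "Unit Price": "106.194690265487", "Amount": "849.56",
     "Tax rate": "13%", "Tax amount": "110.44"},
    {"Quantity": "168", "Unit Price": "4.42477876106195", "Amount": "743.36",
     "Tax rate": "13%", "Tax amount": "96.64"},
]

report = [check_line_item(item) for item in items]
for index, problems in enumerate(report, start=1):
    print(index, problems or "consistent")
```

Only the first item is flagged, which matches the digit that differs from the first-round outputs.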
 

32B-VL

{
"Machine Number" : "497003211467" ,
"Invoice Code" : "053002100113" ,
"Invoice Number" : "00195236" ,
"Invoicing Date" : "December 5, 2022" ,
"Verification code" : "02954951630544033053" ,
"Purchasing Party Name" : "Yunnan Aituo Information Technology Co., Ltd." ,
"Purchaser Taxpayer Identification Number" : "91530100792866047D" ,
"Purchasing party's address and phone number" : "Room 1503, 15th Floor, Unit 1, Building C1, Classic Twin City, High-tech Zone, Kunming City, Yunnan Province, 0871-65386218" ,
"Bank and account number" : "Fudian Bank, Kunming Wuhua Branch 242021010000495122" ,
"Name of goods or taxable labor or service" : [
"*Computer Network Equipment*Fiber Optic Box" ,
"*Computer Network Equipment*Fiber Optic Box" ,
"*Computer Network Equipment*Fiber Optic Adapter"
] ,
"Specification Model" : [
"DS-ZGHK-24/E" ,
"DS-ZGHK-12/E" ,
"DS-ZMFG HLS-2/E"
] ,
"Unit" : [
"piece" ,
"piece" ,
"piece"
] ,
"quantity" : [
"3" ,
"8" ,
"168"
] ,
"unit price" : [
"143.362831858407" ,
"106.194690265487" ,
"4.42477876106195"
] ,
"Amount" : [
"430.09" ,
"849.56" ,
"743.36"
] ,
"Tax Rate" : [
"13%" ,
"13%" ,
"13%"
] ,
"Tax amount" : [
"55.91" ,
"110.44" ,
"96.64"
] ,
"Price and Tax Total (in capital letters)" : "2,286 Yuan" ,
"Price and tax total (lowercase)" : "¥2286.00" ,
"Seller Name" : "Yunnan Yacai Technology Co., Ltd." ,
"Seller Taxpayer Identification Number" : "91530103MA6Q798R69" ,
"Seller's address and phone number" : "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province, 13629663610" ,
"Seller's bank account and account number" : "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644" ,
"Remark" : "" ,
"Payee" : "Jiang Yunfang" ,
"Review" : "Zhao Jianbai" ,
"Bill Issuer" : "Dong Feixia"
}

3B-VL

{
"Machine Number" : "497003211467" ,
"Invoice Code" : "053002100113" ,
"Invoice Number" : "00195236" ,
"Invoicing Date" : "December 5, 2022" ,
"Verification code" : "02954 95163 05440 33053" ,
"Purchasing Party Name" : "Yunnan Aituo Information Technology Co., Ltd." ,
"Purchaser Taxpayer Identification Number" : "91530100792866047D" ,
"Purchaser's address and phone number" : "Room 1503, 15th Floor, Unit 1, Building C1, Shuangcheng, High-tech Zone, Kunming, Yunnan Province, 0871-65386218" ,
"Bank and account number" : "Fudian Bank, Kunming Wuhua Branch 242021010000495122" ,
"Name of goods or taxable labor or service" : [
"*Computer Network Equipment*Fiber Optic Box" ,
"*Computer Network Equipment*Fiber Optic Box" ,
"*Computer Network Equipment*Fiber Optic Adapter"
] ,
"Specification Model" : [
"DS-ZGHK-24/E" ,
"DS-ZGHK-12/E" ,
"DS-ZMFG HLS-2/E"
] ,
"Unit" : "piece" ,
"quantity" : [
"3" ,
"8" ,
"168"
] ,
"unit price" : [
"143.362831858407" ,
"106.194690265487" ,
"4.42477876106195"
] ,
"Amount" : [
"430.09" ,
"849.56" ,
"743.36"
] ,
"Tax rate" : "13%" ,
"Tax amount" : [
"55.91" ,
"110.44" ,
"96.64"
] ,
"Price and Tax Total (in capital letters)" : "2,286 Yuan" ,
"Price and tax total (lowercase)" : "¥2286.00" ,
"Seller Name" : "Yunnan Yacai Technology Co., Ltd." ,
"Seller Taxpayer Identification Number" : "91530103MA6Q798R69" ,
"Seller's address and phone number" : "No. 503-2, Building 10, Beijing Road Garden, Panlong District, Kunming City, Yunnan Province, 13629663610" ,
"Seller's bank account and account number" : "China Construction Bank, Kunming Baiyun Road Branch 53050188863700000644" ,
"Remark" : "" ,
"Payee" : "Jiang Yunfang" ,
"Review" : "Zhao Jianbai" ,
"Bill Issuer" : "Dong Feixia"
}
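
The three models return line items in different shapes: the 72B output uses one object per row, while the 32B and 3B outputs use one list per column, and the 3B output collapses "Unit" and "Tax rate" into a single string. A small normalization step can bring the column form into rows. This is a sketch under the assumption that a single string applies to every row; `columns_to_rows` and `LINE_ITEM_KEYS` are names introduced here for illustration.

```python
LINE_ITEM_KEYS = [
    "Name of goods or taxable labor or service",
    "Specification Model",
    "Unit",
    "quantity",
    "unit price",
    "Amount",
    "Tax rate",
    "Tax amount",
]


def columns_to_rows(record: dict, keys=LINE_ITEM_KEYS) -> list:
    """Convert column-oriented line-item fields into a list of row dicts."""
    columns = {key: record[key] for key in keys if key in record}
    lengths = {len(v) for v in columns.values() if isinstance(v, list)}
    if len(lengths) > 1:
        raise ValueError(f"line-item columns differ in length: {sorted(lengths)}")
    row_count = lengths.pop() if lengths else 1

    rows = []
    for index in range(row_count):
        row = {}
        for key, value in columns.items():
            # Assumption: a single string (e.g. "13%") applies to every row.
            row[key] = value[index] if isinstance(value, list) else value
        rows.append(row)
    return rows


# A subset of the 3B output shown above.
record_3b = {
    "Specification Model": ["DS-ZGHK-24/E", "DS-ZGHK-12/E", "DS-ZMFG HLS-2/E"],
    "Unit": "piece",
    "quantity": ["3", "8", "168"],
    "Amount": ["430.09", "849.56", "743.36"],
    "Tax rate": "13%",
}

rows = columns_to_rows(record_3b)
for row in rows:
    print(row)
```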
 

Traditional OCR and multimodal integration

 

import re

import cv2
import numpy as np
import pytesseract
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Initialize the QwenVL2.5 model.
# The model ID is kept as written; substitute the checkpoint you actually deploy.
processor = AutoProcessor.from_pretrained("Qwen/Qwen-VL-2-5-Int4")
model = AutoModelForVision2Seq.from_pretrained("Qwen/Qwen-VL-2-5-Int4")


def preprocess_image(image_path):
    """Image preprocessing flow"""
    img = Image.open(image_path).convert("RGB")
    # Basic preprocessing
    img = img.resize((1024, 1024))  # Uniform size
    img_array = np.array(img)
    # Enhance contrast (optional, CLAHE); parameter values are illustrative
    # lab = cv2.cvtColor(img_array, cv2.COLOR_RGB2LAB)
    # l, a, b = cv2.split(lab)
    # clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    # lab = cv2.merge((clahe.apply(l), a, b))
    # img_array = cv2.cvtColor(lab, cv2.COLOR_LAB2RGB)
    return img_array


def ocr_processing(image_array):
    """Traditional OCR text extraction with Tesseract"""
    text = pytesseract.image_to_string(
        Image.fromarray(image_array), lang="chi_sim+eng"
    )
    return text


def invoice_parser(raw_text):
    """Structured parsing of invoice information"""
    # The labels must match the text printed on the invoice;
    # [::] accepts both the ASCII and the full-width colon.
    patterns = {
        "invoice_code": r"invoice code[::]\s*(\d+)",
        "invoice_number": r"invoice number[::]\s*(\d+)",
        "check_code": r"check code[::]\s*(\d+)",
        # Chinese date format, e.g. 2022年12月5日
        "date": r"invoice date[::]\s*(\d{4}年\d{1,2}月\d{1,2}日)",
        "amount": r"total[::]\s*(¥?\d+\.\d{2})",
        "seller": r"seller[::]\s*(.*?)(?=\n|buyer)",
        "buyer": r"buyer[::]\s*(.*?)(?=\n|seller)",
    }
    results = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, raw_text, re.IGNORECASE)
        if match:
            results[key] = match.group(1).strip()
    # Convert the amount to a number
    if "amount" in results:
        try:
            results["amount"] = float(re.sub(r"[^\d.]", "", results["amount"]))
        except ValueError:
            pass  # keep the raw string if it cannot be converted
    return results


def qwenvl_processing(image_array, prompt):
    """QwenVL2.5 multimodal processing"""
    inputs = processor(
        images=image_array, text=[prompt], return_tensors="pt", padding=True
    )
    outputs = model.generate(**inputs)
    return processor.decode(outputs[0], skip_special_tokens=True)


def main_process(image_path):
    # 1. Image preprocessing
    processed_img = preprocess_image(image_path)
    # 2. Hybrid recognition process
    # Solution 1: pure OCR
    ocr_result = ocr_processing(processed_img)
    # Solution 2: QwenVL2.5 multimodal
    vl_result = qwenvl_processing(
        processed_img,
        "Please extract all the key information on this invoice, including "
        "invoice code, number, date, amount, seller and buyer information",
    )
    # 3. Result fusion
    combined_text = ocr_result + "\n" + vl_result
    # 4. Structured parsing
    structured_data = invoice_parser(combined_text)
    return structured_data


# Example
if __name__ == "__main__":
    sample_image = "sample_invoice.jpg"
    result = main_process(sample_image)
    print("Recognition result:")
    for k, v in result.items():
        print(f"{k.ljust(12)}: {v}")

 

Conclusion

Multimodal fusion architecture:
  • Combines traditional OCR (Tesseract) with QwenVL2.5's multimodal capabilities
  • OCR provides basic text recognition; QwenVL2.5 handles semantic understanding and links related pieces of information
Intelligent preprocessing:
  • Image size normalization (1024x1024)
  • Optional contrast enhancement module (CLAHE algorithm)
  • RGB color space conversion
Structured parsing engine:
  • Regular-expression-based pattern matching
  • Support for Chinese and English invoice formats
  • Numeric conversion of the amount
  • Key fields: invoice code, number, date, amount, buyer, and seller information
Optimized use of QwenVL2.5:
  • Dedicated prompt engineering: "Please extract all key information on this invoice..."
  • Joint image-text encoding
  • Generative output
Scalability design:
  • Configurable regular expression pattern library
  • Support for adding new invoice type templates
  • Extensible result fusion strategy (weight distribution, conflict resolution)
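
As one possible direction for the conflict-resolution point, fusion can be done per field instead of on concatenated text. The sketch below is not the fusion used in the code above: it merges two already-parsed field dictionaries, treats values that differ only in whitespace as equal, and records real disagreements. `fuse_fields` and the `prefer` parameter are hypothetical, and the conflicting invoice number is invented for the example.

```python
def normalize(value) -> str:
    """Remove all whitespace so '02954 95163' and '0295495163' compare equal."""
    return "".join(str(value).split())


def fuse_fields(ocr_fields: dict, vl_fields: dict, prefer: str = "vl"):
    """Merge two field dicts; return (merged, conflicts)."""
    merged, conflicts = {}, {}
    for key in sorted(set(ocr_fields) | set(vl_fields)):
        ocr_value = ocr_fields.get(key)
        vl_value = vl_fields.get(key)
        if ocr_value is None or vl_value is None:
            # Only one source has the field: take whichever is present.
            merged[key] = vl_value if ocr_value is None else ocr_value
        elif normalize(ocr_value) == normalize(vl_value):
            merged[key] = vl_value
        else:
            # Real disagreement: record it and fall back to the preferred source.
            conflicts[key] = (ocr_value, vl_value)
            merged[key] = vl_value if prefer == "vl" else ocr_value
    return merged, conflicts


ocr_fields = {
    "check_code": "02954 95163 05440 33053",
    "invoice_number": "00195236",
    "amount": "2286.00",
}
vl_fields = {
    "check_code": "02954951630544033053",
    "invoice_number": "00195286",  # invented misread, for illustration only
    "seller": "Yunnan Yacai Technology Co., Ltd.",
}

merged, conflicts = fuse_fields(ocr_fields, vl_fields)
print(merged)
print(conflicts)
```

Conflicting fields can then be sent to manual review or resolved with per-source weights.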

Performance optimization suggestions:

For batch processing scenarios:
  • Add an image caching mechanism
  • Implement asynchronous processing queues
  • Deploy the quantized version of the model (INT4)
To improve accuracy:
  • Collect invoice data from specific domains and fine-tune QwenVL2.5
  • Add a layout analysis module (to detect table areas)
  • Implement field-level confidence assessment
Enterprise-level deployment considerations:
  • Add a data desensitization module
  • Implement a distributed processing architecture
  • Integrate with an approval workflow system
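
The caching and asynchronous-processing suggestions for batch scenarios could look roughly like the following. This is a minimal sketch, assuming the recognizer is available as an async callable that takes image bytes; `recognize_batch`, `max_concurrency`, and the fake recognizer are all illustrative names. Identical images are detected by their SHA-256 hash and recognized only once.

```python
import asyncio
import hashlib


async def recognize_batch(images: dict, recognize, max_concurrency: int = 4) -> dict:
    """Recognize a batch of images with bounded concurrency and de-duplication.

    images: mapping of file name to image bytes.
    recognize: async callable taking image bytes and returning a result.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = {}  # content hash -> task, shared by identical images

    async def limited(data: bytes):
        async with semaphore:
            return await recognize(data)

    async def one(data: bytes):
        key = hashlib.sha256(data).hexdigest()
        if key not in in_flight:
            in_flight[key] = asyncio.create_task(limited(data))
        return await in_flight[key]

    results = await asyncio.gather(*(one(data) for data in images.values()))
    return dict(zip(images.keys(), results))


calls = []


async def fake_recognize(data: bytes) -> dict:
    """Stand-in for a real model call; records how often it is invoked."""
    calls.append(data)
    await asyncio.sleep(0)
    return {"size": len(data)}


batch = {"a.jpg": b"abc", "b.jpg": b"abc", "c.jpg": b"xy"}
results = asyncio.run(recognize_batch(batch, fake_recognize))
print(results, "model calls:", len(calls))
```

In a real deployment the in-memory dictionary would likely be replaced by a persistent cache, and the semaphore limit tuned to the serving capacity.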

This solution integrates traditional OCR technology with an advanced multimodal large model. While ensuring basic recognition capability, it leverages the semantic understanding of QwenVL2.5 to better handle invoices with complex layouts and the extraction of unconventional fields.