Prompt engineering, RAG, fine-tuning, or training from scratch? A complete guide to generative AI best practices for enterprise large models

Written by: Silas Grey
Updated on: June 13, 2025

Recommendation

A technology-selection guide for enterprise generative AI, with best practices for production applications.

Core content:
1. A comparison of the four main implementation paths for enterprise generative AI
2. Trade-offs across five dimensions, including accuracy and implementation complexity
3. Applicable scenarios, advantages, and disadvantages of each approach

Recommended by Yang Fangxian, founder of 53A and Tencent Cloud Most Valuable Expert (TVP).
This article is translated and edited from Vikesh Pandey's Medium post and is intended for technical practitioners.


Preface

Generative AI technology is developing rapidly, and enterprises applying it to business problems face many options. The current mainstream implementation methods are:

  • Prompt Engineering
  • Retrieval Augmented Generation (RAG)
  • Fine-tuning
  • Train Foundation Model from Scratch

However, there is little systematic guidance on choosing the right method for a given business scenario. This article weighs the four approaches across five dimensions: accuracy, implementation complexity, effort, total cost of ownership (TCO), and ease of updating.

Note: This guide assumes you are building for a serious business scenario. Using a foundation model as-is, with none of the techniques above, is usually inadvisable and suits only generic search-style use cases.


Comparative analysis dimensions

| Dimension | Meaning |
| --- | --- |
| Accuracy | Accuracy of the output results |
| Implementation complexity | Difficulty of implementing the solution (programming, architecture, and skill requirements) |
| Effort | Work and time required to deliver the project |
| Total cost of ownership (TCO) | Overall cost across the full lifecycle, from build to maintenance |
| Changeability | Flexibility of the solution's structure; ease of updating and swapping components |

1. Accuracy

How each approach fares on output accuracy:

  • Prompt engineering: Context setting and few-shot examples improve a large model's fit to a specific task. Results can look good in isolation, but accuracy is fundamentally the weakest of the four.
  • RAG: Task-relevant content is vectorized, retrieved from an external datastore, and dynamically injected into the context, which substantially improves accuracy and markedly reduces hallucination.
  • Fine-tuning: Updating model parameters on proprietary domain data further improves the contextual relevance of the output. Accuracy is often slightly higher than RAG, making it well suited to scenarios that demand tightly controlled output.
  • Training from scratch: The highest accuracy; the model is fully customized to the scenario and hallucination risk is minimal, but cost and difficulty are also the greatest.
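To make the prompt-engineering option concrete, here is a minimal sketch of few-shot prompt assembly: an instruction plus worked examples that steer the model toward the desired output format. The sentiment-classification task and the field labels are illustrative choices, not taken from the article.

```python
# Minimal few-shot prompt builder: instruction + (input, output) example
# pairs + the new query. The task shown is a stand-in example.

def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt from an instruction, example pairs,
    and the query the model should answer next."""
    parts = [instruction, ""]
    for inp, out in examples:
        parts.append(f"Input: {inp}")
        parts.append(f"Output: {out}")
        parts.append("")
    parts.append(f"Input: {query}")
    parts.append("Output:")  # the model continues from here
    return "\n".join(parts)

prompt = build_few_shot_prompt(
    instruction="Classify the sentiment of each review as positive or negative.",
    examples=[
        ("The battery lasts all day, love it.", "positive"),
        ("Stopped working after a week.", "negative"),
    ],
    query="Setup was painless and it runs quietly.",
)
print(prompt)
```

As the article notes, such templates are highly sensitive: changing a single word in the instruction or examples can shift the output, so treat the template as an artifact to iterate on.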


2. Implementation complexity

Skill and process difficulty:

  • Prompt engineering: The barrier to entry is very low and essentially no programming is required. The key skills are language and domain knowledge, and template writing is highly flexible.
  • RAG: Requires some programming and systems-integration ability, plus the design of components such as embedding, retrieval, and storage; tool choice affects complexity.
  • Fine-tuning: Requires deep knowledge of machine learning, data processing, and model parameter tuning, and is significantly harder than the previous two.
  • Training from scratch: The most complex, involving data collection and cleaning, model-architecture exploration, and extensive experimentation; the team needs an advanced ML/CV background.
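The embed-retrieve-augment loop that gives RAG its complexity can be sketched in a few lines. This toy version uses bag-of-words term-frequency vectors and cosine similarity purely for illustration; a production system would use a learned embedding model and a vector store instead.

```python
# Toy RAG retrieval step: embed the corpus and the query, pick the most
# similar chunk, and splice it into the prompt. Bag-of-words vectors are
# a stand-in for real embeddings.
import math
from collections import Counter

def embed(text):
    """Term-frequency 'embedding' (illustrative only)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus):
    """Return the corpus chunk most similar to the query."""
    q = embed(query)
    return max(corpus, key=lambda doc: cosine(q, embed(doc)))

corpus = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is open Monday through Friday.",
]
context = retrieve("How long do refunds take?", corpus)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: How long do refunds take?"
print(prompt)
```

Even in this stripped-down form, the three components the article names (embedding, retrieval, storage) are visible as separate pieces, which is why tool selection drives so much of RAG's real-world complexity.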



3. Implementation workload

Time and effort analysis:

  • Prompt engineering: Templates are extremely sensitive; changing a single word can completely change the output. Continuous iterative experimentation is required, but the cycle time and overall investment are generally low.
  • RAG: Requires building embeddings, a vector store, and related plumbing, so effort is somewhat higher than prompt engineering, though managed cloud services such as Amazon Bedrock can reduce the complexity.
  • Fine-tuning: Even when only a small dataset is needed to get started, parameter selection and tuning are extremely time-consuming, and the workload is significantly higher than RAG.
  • Training from scratch: Data processing, modeling, and tuning can take weeks or even months; the effort is unmatched by the other methods.

4. Total Cost of Ownership (TCO)

Evaluating the full lifecycle from an investment perspective:

  • Prompt engineering: The lowest TCO. Only the templates and the FM-interface logic need maintenance, and the foundation model itself can be consumed via a cloud API, which reduces maintenance pressure.
  • RAG: Because multiple components are involved (vectorization, retrieval store, FM), overall cost is higher than prompt engineering, but the modular architecture can be adjusted quickly.
  • Fine-tuning: Requires high-end compute hardware and specialists, plus repeated re-optimization whenever the base model is upgraded or new data is introduced, which creates significant maintenance pressure.
  • Training from scratch: The highest TCO. It consumes enormous resources, requires a dedicated team and hardware foundation, and every model update means retraining and iteration.
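A back-of-envelope calculation can make the TCO comparison tangible. The sketch below compares calling a hosted model API (typical for prompt engineering and RAG) against self-hosting a fine-tuned model. Every number is an assumption chosen for illustration, not a quoted price from any provider.

```python
# Illustrative monthly TCO comparison. All rates and volumes below are
# hypothetical assumptions, not real pricing.

def api_monthly_cost(requests, tokens_per_request, price_per_1k_tokens):
    """Pay-per-token cost of a managed model API."""
    return requests * tokens_per_request / 1000 * price_per_1k_tokens

def self_hosted_monthly_cost(gpu_hourly_rate, hours=730, engineer_fraction=0.5,
                             engineer_monthly=10_000):
    """GPU rental plus a fraction of an ML engineer's time for upkeep."""
    return gpu_hourly_rate * hours + engineer_fraction * engineer_monthly

api = api_monthly_cost(requests=100_000, tokens_per_request=1_500,
                       price_per_1k_tokens=0.01)
hosted = self_hosted_monthly_cost(gpu_hourly_rate=2.0)
print(api, hosted)
```

Under these assumed numbers the managed API is the cheaper path at this request volume; the crossover point shifts as volume grows, which is exactly the trade-off the TCO dimension asks you to evaluate for your own workload.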

5. Flexibility in architectural changes

| Method | Flexibility / changeability | Applicable scenarios |
| --- | --- | --- |
| Prompt engineering | Very high; FM or template can be swapped quickly | Fast-changing scenarios and general needs |
| RAG | Highest; components are loosely coupled and replaceable | Integrating data retrieval with high adaptability |
| Fine-tuning | Low; changing data or scenario requires full retraining | Vertical domains requiring high controllability |
| Training from scratch | Lowest; any update means full retraining | Extreme special needs or giant-scale R&D |



[Core Reference Table 1] High-level comparison of the four methods across all dimensions

| Dimension \ Method | Prompt engineering | RAG | Fine-tuning | Training from scratch |
| --- | --- | --- | --- | --- |
| Accuracy | Lower | High | Very high | Highest |
| Complexity | Low | Medium | Higher | Highest |
| Effort | Lower (but iterative) | Medium | High | Highest |
| Total cost | Low | Medium | High | Highest |
| Changeability | High | Highest | Low | Lowest |

Suggestions for selecting a solution

1. When to choose prompt engineering?

  • When you want rapid experimentation or frequent trial and error, need good compatibility across contexts and foundation models, and your application scenarios change quickly.

2. When to choose RAG?

  • When you need to integrate external knowledge bases or proprietary data, require flexible replacement of components such as data sources, retrievers, and FMs, and have high expectations for output quality.

3. When to choose fine-tuning?

  • When model parameters and their versions must be strictly controlled, or the data and terminology are highly industry-specific (such as law or biomedicine), and interpretability and reusability matter.

4. When is it necessary to start training from scratch?

  • When you need a fully custom model architecture, or the other three methods cannot cover extremely complex or sensitive scenarios (such as government enterprises or hyperscale AI infrastructure), and you have a large budget and an experienced team.



Practical Tips and Experience Supplement

  • RAG complexity and cloud services: Fully managed RAG capabilities, such as those in Amazon Bedrock, directly lower the development and maintenance threshold and accelerate enterprise adoption.
  • RAG and fine-tuning are easily confused: The core of RAG is dynamically adjusting the retrieval and prompting pipeline, whereas fine-tuning optimizes the underlying model parameters themselves. The two can be combined, but their implementation focuses differ.
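The point that RAG and fine-tuning are orthogonal, and combinable, can be shown structurally: retrieval augments the prompt, while the model (base or fine-tuned) is a swappable component behind the same interface. The `retriever`, `base_model`, and `fine_tuned_model` functions below are stand-in stubs for illustration.

```python
# Sketch of RAG and fine-tuning as independent layers: the same
# retrieval step feeds either a base or a fine-tuned model. The
# retriever and both models here are hypothetical stubs.

def rag_pipeline(question, retriever, model):
    """Retrieve context, build an augmented prompt, call the model."""
    context = retriever(question)
    prompt = f"Context: {context}\nQuestion: {question}"
    return model(prompt)

def retriever(question):
    return "Refunds are issued within 14 days."  # stand-in retrieval

def base_model(prompt):
    return "base answer grounded in: " + prompt.splitlines()[0]

def fine_tuned_model(prompt):
    return "domain-tuned answer grounded in: " + prompt.splitlines()[0]

# Swapping the model changes neither the retrieval step nor the prompt
# construction, which is why the two techniques combine cleanly.
answer_base = rag_pipeline("How long do refunds take?", retriever, base_model)
answer_tuned = rag_pipeline("How long do refunds take?", retriever, fine_tuned_model)
print(answer_base)
print(answer_tuned)
```

This also illustrates the changeability rankings above: the RAG layer stays loosely coupled and replaceable even when the model beneath it has been fine-tuned.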

Practical advice: When selecting a solution, first identify your organization's non-negotiable requirements and acceptable trade-offs, and fully understand each approach's balance of cost, benefit, flexibility, and control. There is no absolute "best solution," only the route that best suits your team at this stage.


Conclusion

Building a generative AI application is an engineering art of trade-offs. This article offers only a high-level decision framework; concrete deployments must be refined further against industry characteristics, compliance requirements, budget, and organizational capability.