How I efficiently translated the 65-page Google Prompt Engineering white paper PDF

Master efficient PDF translation skills to improve work efficiency.
Core content:
1. Why choose Markdown format for translation
2. Comparing two methods of converting PDF to Markdown
3. How to use a large language model to translate Markdown files
A few days ago, while translating Google's official Prompt Engineering white paper PDF, I tried some automated methods to improve efficiency. Here are some of my experiences and insights on translating PDFs.
First of all, I prefer not to use translation methods that preserve the original layout. Translated text rarely matches the length of the original, so the layout ends up ugly, with text that is sometimes large and sometimes small. In addition, the layout forces the text to be split into fragments during translation, so each fragment lacks its full context, which hurts translation quality.
When I translate a PDF, I first convert it to Markdown, then translate the Markdown, and finally regenerate a PDF from the translated Markdown. Text, tables, and images are preserved well. The main disadvantage is that the original layout is not preserved. However, what I translate is mostly text and charts, so this has little impact.
How to convert PDF to Markdown?
I mainly use two methods to convert PDF to Markdown:
The first is to use a multimodal large language model to generate the Markdown directly.
Among these models, Gemini works best, with strong OCR capabilities and a large context window. The latest Gemini 2.5 Pro in particular gives very good results. If you can access AI Studio (aistudio.google.com), it offers a generous free quota every day, so it is almost free. If you are already a Gemini subscriber, using Gemini 2.5 Pro in the Gemini app is also very convenient.
Usage is simple: upload the PDF file and use a prompt like this:
Help me convert this PDF to Markdown, keeping all the content without deletion
The advantage of this method is that it is simple and convenient, and tables are preserved well. The disadvantage is that the PDF cannot be too large: beyond a few dozen pages, extraction may fail. In addition, images in the PDF are not extracted, so you need to take screenshots manually or use a tool to extract them.
The second is to use a third-party API. I have tried two that work well:
LlamaParse from LlamaIndex: https://www.llamaindex.ai/llamaparse The advantage is that it has a UI: you can upload a PDF directly to generate Markdown, and images can be downloaded separately. The disadvantage is that billing is inflexible: there is only a monthly subscription, with no pay-as-you-go option. Fortunately, the free quota is large enough to parse many pages.
Mistral's Mistral OCR: https://mistral.ai/news/mistral-ocr The advantage is flexible billing: you pay according to usage. It can also generate Markdown and extract images (although I have never managed to get image extraction to work). The disadvantage is that there is no UI, so you need to write your own code or rely on open source projects.
The advantage of this method is that it can parse a PDF no matter how large the file is, and images embedded in the PDF can also be extracted (although this does not work for some PDFs).
How to translate Markdown?
Translating Markdown is simple. Give the Markdown to your favorite language model and put a prompt at the beginning or end:
Please rewrite the input content in simplified Chinese, keep the original Markdown format unchanged and without deletion, and make the content easy to understand
However, if the Markdown is very long, you need to split it into chunks manually, translate one chunk at a time, and then merge the results manually. How much text a model can translate in one pass depends on the model. Gemini 2.5 Pro handles the longest input, and GPT-4.5 the shortest. However, I think GPT-4.5 produces the best translations, so I often prefer to split the text manually and translate it piece by piece with GPT-4.5.
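The manual split-and-merge step described above can be partly automated. The following is only a minimal sketch, not the workflow used for the white paper: the function name `split_markdown` and the 6000-character default are illustrative assumptions, and the right chunk size depends on the model. It breaks the text only at blank lines outside fenced code blocks, so paragraphs, tables, and code blocks stay whole.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def split_markdown(text: str, max_chars: int = 6000) -> list[str]:
    """Split Markdown into chunks of roughly max_chars characters.

    Breaks happen only at blank lines outside fenced code blocks.
    A single block longer than max_chars is kept whole as an oversized chunk.
    Runs of several blank lines are collapsed into one.
    """
    # Step 1: cut the text into blocks separated by blank lines.
    blocks, buf, in_fence = [], [], False
    for line in text.split("\n"):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        if line.strip() == "" and not in_fence:
            if buf:
                blocks.append("\n".join(buf))
                buf = []
        else:
            buf.append(line)
    if buf:
        blocks.append("\n".join(buf))

    # Step 2: greedily pack consecutive blocks into chunks.
    chunks, current = [], ""
    for block in blocks:
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def merge_chunks(chunks: list[str]) -> str:
    """Join translated chunks back together with blank lines between them."""
    return "\n\n".join(chunks)
```

Each chunk can then be sent to the model with the same prompt, and the translated chunks passed to `merge_chunks` in their original order.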
To keep terminology consistent, you can add a glossary to the translation prompt, for example:
Please rewrite the input content in Chinese, respect the original meaning, make it easy to understand for ordinary people, no deletions, no translation of names, vocabulary:
AI Agent -> AI Intelligent Body
LLM -> Large Language Model
Or replace the terms manually after translation.
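Both options, putting a glossary in the prompt and replacing terms after translation, are easy to script. The sketch below is illustrative only: `build_translation_prompt` and `apply_glossary` are hypothetical helper names, the prompt wording is an assumption modeled on the example prompt, and the glossary entries are the two from that example.

```python
import re


def build_translation_prompt(target_language: str, glossary: dict[str, str]) -> str:
    """Build a translation prompt that ends with one glossary entry per line."""
    lines = [
        f"Please rewrite the input content in {target_language}, respect the "
        "original meaning, keep the Markdown format unchanged, and do not "
        "delete anything.",
    ]
    if glossary:
        lines.append("Vocabulary:")
        lines.extend(f"{source} -> {target}" for source, target in glossary.items())
    return "\n".join(lines)


def apply_glossary(text: str, glossary: dict[str, str]) -> str:
    """Replace glossary terms in already-translated text in a single pass.

    Longer terms are matched first, and replaced text is never scanned
    again, so one replacement cannot trigger another.
    """
    if not glossary:
        return text
    keys = sorted(glossary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: glossary[match.group(0)], text)


glossary = {"AI Agent": "AI Intelligent Body", "LLM": "Large Language Model"}
prompt = build_translation_prompt("Chinese", glossary)
```

The replacement is a plain substring match, so a short term such as "LLM" would also match inside a longer word. For real documents it is worth reviewing the result by eye.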
How to translate a PDF in one click
Parsing to Markdown and then translating the Markdown, as described above, gives a more accurate translation, but it is more cumbersome. If your PDF is not very large, you can also have a large language model translate it in one step.
If the PDF is short, for example under 10 pages (the limit varies by model, so you may need to experiment), you can ask the model to translate it directly and output Markdown.
If the PDF is long but not extremely long, such as the 65-page Google Prompt Engineering white paper I translated, there is a trick: use Deep Research to translate the long PDF.
Most people only use Deep Research to write research reports and do not realize that it can also handle other tasks, such as translation and writing code. Because Deep Research has temporary local storage and its context window is usually very long, it is well suited to translating long content. For example, a 65-page PDF cannot be translated in a normal conversation, but Deep Research handles it easily.
However, you cannot upload attachments in Deep Research. You have to host the PDF at a publicly accessible address, such as GitHub Pages or S3, and then give it the URL to translate. The prompt is simple:
Please help me translate this PDF into Chinese and export it in Markdown format
PDF address: {pdf url}
Deep Research can read and translate the PDF content with the help of a browser.
Both OpenAI's Deep Research and Google Gemini's Deep Research can translate this long PDF, but Gemini's translation is better. In addition, Gemini's result can be exported directly to Google Docs and then downloaded as a PDF, whereas OpenAI's result has to be copied out as Markdown and stripped of some unnecessary reference links before exporting, which is more troublesome.
Here are links to the translations produced by OpenAI's Deep Research and Google's Deep Research, for comparison:
• Google Deep Research: https://g.co/gemini/share/7537a1fecca8
• OpenAI Deep Research: https://chatgpt.com/share/67fd2597-843c-800f-811c-eb0d9047f71c
Note that Deep Research cannot translate documents of unlimited length. It is still bound by the product's length limit, and 65 pages is close to that limit. For anything longer, I recommend splitting the file into several smaller PDFs and translating them separately.
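To plan such a split, it helps to work out the page ranges first. This is a small sketch under stated assumptions: `page_ranges` is a hypothetical helper, and the 60-page cap is only an illustrative value chosen to stay under the roughly 65 pages mentioned above. The actual cutting would then be done with whatever PDF tool you prefer.

```python
def page_ranges(total_pages: int, max_pages: int) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (first, last) page ranges of at most max_pages each."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages must be >= 1")
    return [
        (first, min(first + max_pages - 1, total_pages))
        for first in range(1, total_pages + 1, max_pages)
    ]


# A 130-page PDF split into parts of at most 60 pages (assumed cap):
print(page_ranges(130, 60))  # [(1, 60), (61, 120), (121, 130)]
```

Each range becomes one smaller PDF, which is hosted at its own URL and translated with the same prompt.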