Midjourney V7 is here! Better-looking images, better understanding of natural language, and half-price rendering in Draft Mode

Written by Clara Bennett

Updated on: July 8, 2025

After such a long wait, V7 is finally here.

Ever since GPT-4o swept in at the leading edge of the "image age", almost every move Midjourney makes has been put under the microscope:

How will it respond? Will it catch up? Or will it be left far behind?

V7 didn't debut with much fanfare, nor did it announce anything groundbreaking, but I think what it delivered is worth a closer look.

The main changes in this update are as follows:

Its images have more "feel" to them.

This is not simply a gain in sharpness. It is an improvement in texture, structure, lighting, and character detail that is visible to the naked eye.

In portraits and complex scenes especially, the image is more coherent, with less of a "collage" feel and more room to breathe.

Language comprehension has improved too.

In the past, you had to craft prompts carefully to get something close to what you had in mind. Now you can describe the image in more natural language and V7 will still grasp the main idea.

Of course, it hasn't reached the point where "it understands everything you say", but it is noticeably more tolerant of loosely worded prompts.

There is also a small detail behind this that is worth noting: personalization is now turned on by default, but users first have to "unlock" it manually, a process that takes about 5 minutes.

In other words, V7 is trying to be "everyone's own Midjourney" rather than a one-style-fits-all assembly line.

But I have to be honest:

It still hasn't made any breakthrough in rendering text inside images.

Since V6, Midjourney has struggled with making the text in images accurate and readable.

With V7, despite the gains in language comprehension, once you try to put sentences, brand names, or slogans into the image, the familiar errors come back: missing letters, garbled spellings, and text that makes no sense are almost the norm.

So I was not surprised to see some users complain that “text generation still failed”.

This wall was to be expected.

It is not just a bug, but a limitation of the path Midjourney has taken.

Midjourney is trained on images and has not gone through the kind of language-modeling training GPT-4o has. Naturally, it cannot control the text it outputs as accurately as GPT-4o does.

Midjourney is not a language model. Its path makes it better at "painting a mood" than at "reading language".

It's not that it didn't try its best; text was simply never its first language.

Midjourney can paint poetry but cannot write a clear sentence.

This wall has been there since V6, and this time everyone expected Midjourney to climb over it.

In the end, it went around it instead.

Against this technical backdrop, though, it launched a new feature that I am paying close attention to:

Draft Mode.

Although I haven't tested it yet, from what users have described, it sounds like a lightweight revolution in the creative experience:

Half the price, ten times the speed, and the ability to generate images directly from speech.

You no longer need to form complete sentences. Just speak and it starts drawing.

As one user put it:

“You just need to say a word to AI, and the dream will unfold before your eyes.”

I don't think this is an exaggeration. It reflects a new "creative attitude":

You are no longer held back by the prompt, nor do you need a perfect starting point. You just draw something and see what comes out. Draft Mode becomes the place where you can "just start".

I'm looking forward to trying this lightness myself.

It is not perfect yet, but in terms of direction it is indeed moving toward "understanding people".

Over the next 60 days, Midjourney plans to ship updates every one or two weeks, including character and object references, style personalization, moodboard adjustments, and SREF control, gradually filling in the outline of V7.

Midjourney isn't racing ahead, but it is carefully polishing the bricks under its feet.

In the race to define the "future form of imaging tools", the route it has chosen may not be the fastest, but it may well have a flavor of its own.