How Does Midjourney Work and Function? A Beginner’s Overview

Surely you’ve heard whispers and echoes of this little phenomenon known as Midjourney. I’m sure you’ve seen its name illuminate chatroom discussions, light up social media feeds, and maybe you’ve even caught a snippet of conversation at the water cooler.

“Midjourney,” they say, with a twinkle in their eyes, “It’s a game-changer.”

But what exactly is Midjourney? Is it a mystical spaceship cruising through the cosmos? Or perhaps a secret society of explorers, unfurling the many layers of life’s mysteries? Hold onto your hats, because we’re about to take a trip down the rabbit hole and reveal just what this awesome AI software is all about.

What is Midjourney?

Midjourney is a generative artificial intelligence program created by a San Francisco-based independent research lab, Midjourney, Inc. It generates images based on natural language descriptions, called “prompts”, a feature that makes it similar to AI programs like OpenAI’s DALL-E and Stability AI’s Stable Diffusion.

Awesome images like this:

[Image: a cat eating snacks on a couch watching TV]

The brainchild of David Holz, co-founder of Leap Motion, Midjourney first entered open beta on July 12, 2022.

The core technology of Midjourney lies in its ever-evolving algorithm. Since its inception, the team has launched several versions of its algorithm, each improving upon its predecessor.

Version 1 was released in February 2022, followed by V2 in April 2022, V3 in July 2022, and the alpha iteration of V4 on November 5, 2022. The most recent version, 5.1, was launched on May 3, 2023.

Notably, the 5.1 model is more ‘opinionated’ than version 5, applying more of its own stylization to images, while the 5.1 RAW model is better suited to more literal prompts.

As of now, Midjourney is accessible solely via a Discord bot on their official Discord server. Users generate images by direct messaging the bot or by inviting the bot to a third-party server. To initiate the image generation process, users enter the /imagine command followed by a prompt. The bot then generates a set of four images, and users can choose which images they want to upscale.
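To make the command format concrete, here is a minimal sketch in Python that only assembles the text a user would type into Discord. It does not talk to Midjourney at all. The function name `build_imagine_command` is made up for illustration, and the `--v` and `--style raw` suffixes are assumptions based on how Midjourney's documentation describes version and style parameters, so check the current docs before relying on them.

```python
from typing import Optional


def build_imagine_command(prompt: str, version: Optional[str] = None, raw: bool = False) -> str:
    """Assemble the text of an /imagine command (hypothetical helper, no network calls)."""
    prompt = " ".join(prompt.split())  # collapse stray whitespace
    if not prompt:
        raise ValueError("prompt must not be empty")
    parts = [f"/imagine prompt: {prompt}"]
    if version:
        parts.append(f"--v {version}")  # assumed version parameter
    if raw:
        parts.append("--style raw")  # assumed flag for more literal results
    return " ".join(parts)


command = build_imagine_command(
    "a cat eating snacks on a couch watching TV", version="5.1", raw=True
)
print(command)
```

The bot's reply (the grid of four images and the upscale buttons) happens inside Discord, so it is deliberately left out of the sketch.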

Image Creating AI and Machine Learning

The magic behind Midjourney lies in machine learning and artificial neural networks. Just like teaching a child to recognize objects, machine learning works by exposing the AI to millions of images until it learns to recognize shapes, colors, textures, and more.

Once you provide a description, the AI does not look up stored pictures. Instead, it draws on the patterns it learned during training and constructs a brand-new picture in stages.

Here’s an example of an image with the prompt “a robot tripping and falling, sci-fi, wild, vivid colors” at only 15% constructed:

[Image: example of a Midjourney image only 15% constructed]

And then 46%:

[Image: example of a Midjourney image only 46% constructed]

And then 93%:

[Image: example of a Midjourney image 93% constructed]

And then one of the finished products (covered in weird graffiti for some reason):

[Image: example of a finished robot image in Midjourney]

You can kind of see how this intricate process mirrors the assembly of a jigsaw puzzle, with each stage bringing the picture closer to your description, or at least it will after enough attempts.
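The idea of a picture coming together in stages can be shown with a toy sketch. This is not Midjourney's algorithm. It cheats by assuming the finished picture (`target`) is known in advance, whereas a real generator relies on a trained neural network to decide each refinement step. The names `refine_in_stages` and `mean_error`, and the 16-pixel striped "picture", are all made up for illustration.

```python
import random


def refine_in_stages(target, steps=100, checkpoints=(15, 46, 93, 100), seed=0):
    """Move a noisy 'image' toward `target` a little on every step.

    Toy assumption: the finished picture is known ahead of time.
    """
    rng = random.Random(seed)
    image = [rng.random() for _ in target]  # start from pure noise
    snapshots = {}
    for step in range(1, steps + 1):
        remaining = steps - step + 1
        # Close 1/remaining of the gap so the gap shrinks evenly to zero.
        image = [px + (t - px) / remaining for px, t in zip(image, target)]
        if step in checkpoints:
            snapshots[step] = list(image)
    return snapshots


def mean_error(image, target):
    """Average per-pixel distance between the current image and the target."""
    return sum(abs(px - t) for px, t in zip(image, target)) / len(target)


target = [0.0, 1.0] * 8  # hypothetical 16-pixel striped "picture"
snapshots = refine_in_stages(target)
for step, image in snapshots.items():
    print(f"{step:3d}% constructed, mean error {mean_error(image, target):.3f}")
```

The printed error shrinks at each checkpoint, which is the same effect you see in the 15%, 46%, and 93% previews: early stages are mostly noise, and detail settles in toward the end.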

Another pivotal element in Midjourney’s technology is ‘reinforcement learning from human feedback’ (RLHF). This is like training a new puppy with rewards, but here, the rewards are rankings from real people for the AI’s output. These rankings shape a “reward model,” enabling the AI to gauge if the output aligns with human values, understanding, and preferences.
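As a rough illustration of how rankings can turn into a reward signal, here is a toy sketch that averages each image's position across several people's rankings. Real reward models are neural networks trained on this kind of feedback, so treat this as a stand-in for the idea only. The function name `reward_from_rankings` and the image IDs are hypothetical.

```python
from collections import defaultdict


def reward_from_rankings(rankings):
    """Turn best-to-worst rankings into an average score per image (1 = best, 0 = worst)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, image_id in enumerate(ranking):
            # First place scores 1.0, last place scores 0.0.
            score = 1.0 if n == 1 else (n - 1 - position) / (n - 1)
            totals[image_id] += score
            counts[image_id] += 1
    return {image_id: totals[image_id] / counts[image_id] for image_id in totals}


# Three hypothetical people each rank the same four outputs, best first.
rankings = [
    ["img_b", "img_a", "img_c", "img_d"],
    ["img_b", "img_c", "img_a", "img_d"],
    ["img_a", "img_b", "img_c", "img_d"],
]
rewards = reward_from_rankings(rankings)
best = max(rewards, key=rewards.get)
print(best, rewards)
```

An image that people consistently rank near the top ends up with a reward close to 1, and that is the kind of signal a system can be nudged toward.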

The Power of GANs

Lastly, Midjourney is thought to use one more technique to help create all of these amazing pictures, although the company has not published the details of its architecture, so treat this part as informed speculation. The technique is called a ‘generative adversarial network’ (GAN), which comprises two neural networks: the creator and the critic.

Simply put, the creator generates the image, and the critic evaluates it.

The two continue their dance until the resulting image fits the provided description, ensuring a smooth journey through Midjourney and giving something like the example images above.
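The tug-of-war between creator and critic can be written down as two scores, sketched below. This is the standard textbook GAN objective (with the common "non-saturating" creator loss), not a confirmed description of Midjourney's internals, and the critic scores in the example are made-up numbers rather than outputs of a trained network.

```python
import math


def critic_loss(score_real, score_fake):
    """Low when the critic rates real images near 1 and fakes near 0."""
    return -(math.log(score_real) + math.log(1.0 - score_fake))


def creator_loss(score_fake):
    """Low when the critic is fooled into rating a fake near 1."""
    return -math.log(score_fake)


# Hypothetical critic scores (probability "this image is real") at two stages.
early = {"real": 0.9, "fake": 0.2}  # critic easily spots the fakes
later = {"real": 0.6, "fake": 0.5}  # creator has improved, critic is unsure

for name, scores in (("early", early), ("later", later)):
    c_loss = critic_loss(scores["real"], scores["fake"])
    g_loss = creator_loss(scores["fake"])
    print(f"{name}: critic loss {c_loss:.3f}, creator loss {g_loss:.3f}")
```

As the creator gets better, its loss drops while the critic's rises, which is why the two networks keep pushing each other during training.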

How Does the Midjourney CEO Feel About AI Generated Art?

David Holz, the CEO of Midjourney, envisions artists as customers of Midjourney, not competitors. Many artists use Midjourney for rapid prototyping of artistic concepts to present to clients before beginning their work.

However, the company has faced some backlash from artists who claim that Midjourney devalues original creative work by using it in the training set. In response, Midjourney’s terms of service include a DMCA takedown policy, allowing artists to request removal of their work from the set if they perceive a copyright infringement.

This step hasn’t stopped a large class action lawsuit from artists against the company, though. The case is still ongoing, and its outcome has yet to be decided.

Midjourney’s capabilities have not gone unnoticed by the advertising industry, which is increasingly employing AI tools to create original content and brainstorm ideas swiftly. Midjourney and similar tools are creating new opportunities for advertisers, such as custom ads created for individuals, a new way to create special effects, or making e-commerce advertising more efficient.

While Midjourney was not particularly impressive at launch, its continuous improvements have led to it being regarded as one of the easiest AI art generation tools to use, capable of delivering stunning results.

Users have the flexibility to manually switch between any of Midjourney’s model versions using /settings. There are also models trained specifically for generating images in the Japanese Anime and Manga style. The quality of the generated images depends on the model version and the complexity of the prompt.

So, How Does Midjourney Really Work?

The whole process can be a bit confusing, so let’s recap how exactly AI image generators like Midjourney go from start to finish.

  1. Data Collection & Training: AI image generators first need a large dataset of images to learn from. These images are analyzed by the system to understand patterns, colors, shapes, and structures.
  2. Deep Learning & Neural Networks: These AI systems utilize a form of machine learning known as deep learning, which involves artificial neural networks. These networks mimic the human brain’s structure and allow the AI to learn from the input data.
  3. Generation of New Images: Once trained, the AI can generate new images based on its learning. One well-known approach uses a form of neural network called a Generative Adversarial Network (GAN). Here, two parts of the system (the generator and the discriminator) compete with each other. The generator creates new images, and the discriminator evaluates them against the real images from the training data. Over time, the generator gets better at creating images that the discriminator cannot distinguish from real ones.
  4. Refinement & User Interaction: The final images can then be refined based on user inputs or additional criteria. Users can change various aspects of the image, and the AI will adapt accordingly, generating a final image that meets the user’s requirements. This interactive process allows for the creation of incredibly detailed and realistic images.

And there you have it, the simplified version of how Midjourney works behind the scenes to create amazing art. The reality is far more technical and complex, and involves millions of dollars’ worth of GPU time on massive cloud server farms.

But the above outline should be enough to give you the knowledge to go talk about this in real life.

As AI continues to evolve and improve, Midjourney promises a fascinating journey through the world of generative art. Its unique blend of neural networks, machine learning, and human feedback systems makes it an exciting tool for artists, advertisers, and AI enthusiasts alike.

Happy imagining!
