Photons Make for More Energy-Efficient AI Image Generation

Rina Diane Cab…
2025-10-06

[Image: Schematic of the iteration inputs, encoded phase pattern, and diffractive decoder of an iterative optical generative model.]



AI generators can create weird, wacky, or wonderful images—all while emitting hefty amounts of carbon. Energy-hungry electronic computations drive the generative AI process, with the underlying diffusion models trained to produce novel images out of random noise.

Researchers at the University of California, Los Angeles, aim to reduce this carbon footprint by employing photons instead of electrons to power AI image generation. Their optical generative models pair digital processors with analog diffractive processors that compute with photons. The group described their technology 27 August in the journal Nature.

Optical Generative Models Explained

Here’s how the process works. The first step is knowledge distillation, in which a “teacher” diffusion model trains a “student” optical generative model to digitally process random noise. Next, the student model encodes random noise inputs into optical generative seeds: phase patterns that carry the generated content in the phase of light. Think of each seed as something like a slide for an overhead projector. Each seed is displayed on a spatial light modulator (SLM), which can control the phase of light passing through it. (The specific SLMs used by the researchers are liquid crystal devices.) When laser light shines through the seed, the modulated wavefront propagates to a second SLM. This second SLM, the diffractive processor, decodes the phase pattern to create a new image, which is captured by an image sensor.
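
To make that dataflow concrete, here is a minimal numerical sketch in Python with NumPy. It is not the UCLA team's code: the grid size, wavelength, pixel pitch, propagation distance, and the stand-in encoder and decoder are all illustrative assumptions, and free-space propagation is simulated with the standard angular spectrum method rather than real optics.

    import numpy as np

    # Illustrative parameters; the real system's values are assumptions here.
    N = 128            # SLM grid size in pixels (assumed)
    WAVELEN = 520e-9   # laser wavelength in meters (assumed)
    PITCH = 8e-6       # SLM pixel pitch in meters (assumed)
    DIST = 0.05        # free-space propagation distance in meters (assumed)

    def angular_spectrum(field, dist):
        """Propagate a complex optical field by `dist` meters (angular spectrum method)."""
        fx = np.fft.fftfreq(field.shape[0], d=PITCH)
        FX, FY = np.meshgrid(fx, fx)
        arg = np.maximum(0.0, 1.0 / WAVELEN**2 - FX**2 - FY**2)  # clamp evanescent waves
        transfer = np.exp(2j * np.pi * dist * np.sqrt(arg))
        return np.fft.ifft2(np.fft.fft2(field) * transfer)

    def encoder(x):
        """Stand-in for the trained digital encoder, which in the real system is a
        network distilled from a teacher diffusion model. Here it merely rescales
        its input into a [0, 2*pi) phase pattern: the optical generative seed."""
        x = x - x.min()
        return 2 * np.pi * x / (x.max() + 1e-12)

    rng = np.random.default_rng(0)
    noise = rng.normal(size=(N, N))

    seed_phase = encoder(noise)                     # digital step: noise -> phase seed
    field = np.exp(1j * seed_phase)                 # SLM 1 imprints the seed on the laser
    field = angular_spectrum(field, DIST)           # light travels to the second SLM
    decoder_phase = 2 * np.pi * rng.random((N, N))  # SLM 2: stand-in for the trained decoder
    field = angular_spectrum(field * np.exp(1j * decoder_phase), DIST)
    image = np.abs(field) ** 2                      # the sensor records intensity: the output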

“There’s a digital encoder, which gives you the seed rapidly, and then the analog processor is the key that decodes that representation for the human eye to visualize,” says Aydogan Ozcan, a professor of electrical and computer engineering at UCLA. “The generation happens in the optical analog domain, with the seed coming from a digital network. All in all, it’s replicating or distilling the information-generation capabilities of a diffusion model.”

Generation happens at the speed of light: “The system runs end-to-end in a single snapshot,” Ozcan says. By harnessing the physics of optics, these systems can run more swiftly and potentially consume less energy than diffusion models that iterate through thousands of steps.

The team devised two versions of their model: the aforementioned snapshot model, which generates an image in a single optical pass, and an iterative model that refines its outputs over successive passes. The iterative model creates images with higher quality and clearer backgrounds than its snapshot counterpart. Both models produced monochrome and multicolor images, including butterflies, fashion products, handwritten digits, and even Van Gogh-style art, whose quality the researchers found closely matched that of digital diffusion models.
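
One way to picture the iterative variant, reusing the functions and parameters from the sketch above, is a loop in which each sensor readout is digitally re-encoded into a new seed and sent back through the optics. This is a hypothetical reading of the idea, not the authors' actual update rule, and the pass count is an assumption:

    def optical_pass(phase_seed, dec_phase):
        """One snapshot pass: SLM 1 -> free space -> SLM 2 -> free space -> sensor."""
        f = angular_spectrum(np.exp(1j * phase_seed), DIST)
        f = angular_spectrum(f * np.exp(1j * dec_phase), DIST)
        return np.abs(f) ** 2

    image = optical_pass(encoder(noise), decoder_phase)  # snapshot model: a single pass
    for _ in range(4):                                   # iterative model: refine (count assumed)
        image = optical_pass(encoder(image), decoder_phase)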

Privacy Benefits of Optical Models

Optical generative models offer an added benefit of privacy, mimicking encryption capabilities. “If you look at the phase information of the digital encoder, you won’t understand much from it. It’s not for the human eye to directly visualize,” says Ozcan. “That means if somebody intercepts the image of the digital encoder and looks at it or tries to decode it without the decoder, [they] won’t be able to do that. I can then encrypt the information that is generated so that only you can decode it and nobody else can know what it represents.”
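
As a toy illustration of that point, again reusing angular_spectrum, seed_phase, rng, and the parameters from the first sketch and not the paper's actual scheme, one can hand-craft a decoder phase matched to a particular seed and target, then check what an eavesdropper applying a different decoder would see:

    # Hypothetical target "image": a bright square on a dark background.
    target = np.zeros((N, N))
    target[48:80, 48:80] = 1.0

    F = angular_spectrum(np.exp(1j * seed_phase), DIST)  # field arriving at SLM 2
    B = angular_spectrum(np.sqrt(target) + 0j, -DIST)    # target back-propagated to SLM 2
    matched = np.angle(B) - np.angle(F)                  # decoder phase matched to this seed
    wrong = 2 * np.pi * rng.random((N, N))               # an eavesdropper's guess

    def decode(dec_phase):
        return np.abs(angular_spectrum(F * np.exp(1j * dec_phase), DIST)) ** 2

    def corr(a, b):
        a, b = a - a.mean(), b - b.mean()
        return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

    print("matched decoder:", corr(decode(matched), target))  # noticeably positive
    print("wrong decoder:  ", corr(decode(wrong), target))    # near zero

The matched decoder yields a speckled but measurably target-correlated image, while the wrong decoder yields speckle with near-zero correlation; that gap is the sense in which the seed behaves like ciphertext.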

An experimental setup for a “snapshot” optical generative model, comprising a collimator, polarizer, beam splitter, SLM, decoding layer, and image sensor, creates monochrome images of handwritten digits and fashion items. Shiqi Chen, Yuhang Li, et al.

Ozcan is quick to point out that the architecture the team has developed may not suit content generation for digital use. “If you want to compute in the digital world as part of a digital computer ecosystem, maybe going from digital to analog and then back to digital might not be very ideal,” he says. “That’s why we are thinking of them as visual computers. They compute for the human eye in the analog world. And that’s where this fits better, as opposed to calling it as an alternative to a digital generative model—it’s not.”

This makes optical generative models apt for art, entertainment, and media applications, especially augmented reality and virtual reality.

“We can make this system work as part of AR and VR wearable systems, where the device must communicate with the human eye and project into the human eye. During this projection, we can use the decoder as not just a projection system but also as a processing system, so that you can communicate from the cloud with optical generative seeds and do the last part of the computing with just light-matter interactions as you are communicating with the human eye,” says Ozcan.

As a next step, the researchers are exploring potential avenues for commercialization, as well as miniaturizing their prototype. “That way, the system can be significantly more compact, and it can even further reduce the power consumption,” Ozcan says. For now, the team has reimagined a brighter, more sustainable generative AI future with the help of light.

Source: IEEE Spectrum