Spring AI Image Processing

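Spring AI is not limited to text generation. It also offers image capabilities, covering image generation, image analysis, and image editing. This article shows how to handle image-related tasks with Spring AI.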

Image Generation

Spring AI can generate images from a text description, primarily through the ImageClient interface:

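java
import java.util.Base64;

import org.springframework.ai.image.ImageClient;
import org.springframework.ai.image.ImageOptions;
import org.springframework.ai.image.ImageOptionsBuilder;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.stereotype.Service;

@Service
public class ImageGenerationService {

    private final ImageClient imageClient;

    public ImageGenerationService(ImageClient imageClient) {
        this.imageClient = imageClient;
    }

    public byte[] generateImage(String description, int width, int height) {
        // Ask for Base64 output so the image bytes arrive inside the response
        ImageOptions options = ImageOptionsBuilder.builder()
                .withWidth(width)
                .withHeight(height)
                .withResponseFormat("b64_json")
                .build();

        // In Spring AI 0.8.0 the options travel inside the prompt; call() takes one argument
        ImageResponse response = imageClient.call(new ImagePrompt(description, options));
        return Base64.getDecoder().decode(response.getResult().getOutput().getB64Json());
    }
}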
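
OpenAI-style image APIs typically return a picture either as a URL or as a Base64 string (the b64_json response format). The following is a minimal sketch of turning such a string into raw bytes and checking that they look like a PNG. ImagePayloads and its methods are hypothetical helper names, not part of Spring AI.

```java
import java.util.Arrays;
import java.util.Base64;

/** Hypothetical helper, not part of Spring AI: converts a Base64 image payload to bytes. */
final class ImagePayloads {

    // Every PNG file starts with these eight bytes.
    private static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'
    };

    private ImagePayloads() {
    }

    /** Decodes a Base64 string (such as a b64_json field) into image bytes. */
    static byte[] decode(String base64) {
        if (base64 == null || base64.isBlank()) {
            throw new IllegalArgumentException("image payload is empty");
        }
        return Base64.getDecoder().decode(base64);
    }

    /** Returns true when the bytes begin with the PNG file signature. */
    static boolean isPng(byte[] data) {
        return data != null
                && data.length >= PNG_SIGNATURE.length
                && Arrays.equals(Arrays.copyOf(data, PNG_SIGNATURE.length), PNG_SIGNATURE);
    }
}
```

Checking the signature before returning bytes to a caller is a cheap way to catch an error payload that was stored or served as if it were an image.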

Integrating OpenAI DALL-E

Spring AI can integrate with OpenAI's DALL-E models to generate images:

Dependency Configuration

xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>0.8.0</version>
</dependency>

Configuring OpenAI Image Generation

yaml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      image:
        options:
          model: dall-e-3

Generating Images with DALL-E

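java
import java.util.Base64;

import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.ai.openai.OpenAiImageClient;
import org.springframework.ai.openai.OpenAiImageOptions;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import lombok.Data;

@RestController
@RequestMapping("/api/images")
public class ImageController {

    private final OpenAiImageClient imageClient;

    public ImageController(OpenAiImageClient imageClient) {
        this.imageClient = imageClient;
    }

    @PostMapping("/generate")
    public ResponseEntity<byte[]> generateImage(@RequestBody ImageRequest request) {
        // OpenAI-specific options; this assumes the OpenAiImageOptions builder from
        // Spring AI 0.8.x, where quality and style are plain strings
        OpenAiImageOptions options = OpenAiImageOptions.builder()
                .withWidth(1024)
                .withHeight(1024)
                .withQuality("standard")
                .withStyle("vivid")
                .withResponseFormat("b64_json")
                .build();

        ImageResponse response = imageClient.call(new ImagePrompt(request.getDescription(), options));
        byte[] imageData = Base64.getDecoder().decode(response.getResult().getOutput().getB64Json());

        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.IMAGE_PNG);

        return new ResponseEntity<>(imageData, headers, HttpStatus.OK);
    }

    // Request model (@Data is Lombok and generates the getters and setters)
    @Data
    public static class ImageRequest {
        private String description;
    }
}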

Generating Image Variations

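Besides generating images from text, the OpenAI image API can also produce variations of an existing image. Spring AI 0.8.0 does not appear to expose this operation through ImageClient, so the ImageVariationPrompt type and the createVariation method below should be read as placeholders for a custom wrapper: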

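java
@Service
public class ImageVariationService {

    private final OpenAiImageClient imageClient;

    // Constructor injection; without it the final field is never initialized
    public ImageVariationService(OpenAiImageClient imageClient) {
        this.imageClient = imageClient;
    }

    public byte[] createImageVariation(byte[] originalImage, int numberOfVariations) {
        // NOTE: ImageVariationPrompt, createVariation(...) and getImageData() are placeholders.
        // They do not appear in the Spring AI 0.8.0 API, so this method only compiles against
        // your own wrapper around OpenAI's image variations endpoint.
        ImageVariationPrompt prompt = new ImageVariationPrompt(originalImage);

        OpenAiImageOptions options = OpenAiImageOptions.builder()
                .withN(numberOfVariations)
                .withWidth(1024)
                .withHeight(1024)
                .build();

        ImageResponse response = imageClient.createVariation(prompt, options);
        // Only the first variation is returned, even when several were requested
        return response.getResult().getOutput().getImageData();
    }
}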

Image Editing

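Image editing, such as removing part of an image or adding new elements, is another operation offered by the OpenAI image API. Spring AI 0.8.0 does not appear to expose it through ImageClient, so the ImageEditPrompt type and the edit method below should be read as placeholders for a custom wrapper: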

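java
@Service
public class ImageEditingService {

    private final OpenAiImageClient imageClient;

    // Constructor injection; without it the final field is never initialized
    public ImageEditingService(OpenAiImageClient imageClient) {
        this.imageClient = imageClient;
    }

    public byte[] editImage(byte[] originalImage, byte[] mask, String prompt) {
        // NOTE: ImageEditPrompt, edit(...) and getImageData() are placeholders. They do not
        // appear in the Spring AI 0.8.0 API, so this method only compiles against your own
        // wrapper around OpenAI's image edits endpoint.
        ImageEditPrompt editPrompt = new ImageEditPrompt(originalImage, mask, prompt);

        OpenAiImageOptions options = OpenAiImageOptions.builder()
                .withWidth(1024)
                .withHeight(1024)
                .build();

        ImageResponse response = imageClient.edit(editPrompt, options);
        return response.getResult().getOutput().getImageData();
    }
}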
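
Before an image and a mask are sent for editing, it can help to reject inputs the remote API would refuse anyway. The sketch below assumes the constraints OpenAI has documented for DALL-E 2 edits (a square image under 4 MB, and a mask of the same dimensions whose transparent areas mark what to change); check the current documentation before relying on them. EditInputValidator is a hypothetical helper, not part of Spring AI.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical pre-flight check, not part of Spring AI. */
final class EditInputValidator {

    // Assumption: uploads must stay under 4 MB, as documented for DALL-E 2 edits.
    static final int MAX_BYTES = 4 * 1024 * 1024;

    private EditInputValidator() {
    }

    /** Throws IllegalArgumentException when the image and mask cannot be used together. */
    static void validate(byte[] image, byte[] mask) {
        BufferedImage source = decode(image, "image");
        BufferedImage overlay = decode(mask, "mask");
        if (source.getWidth() != source.getHeight()) {
            throw new IllegalArgumentException("image must be square");
        }
        if (overlay.getWidth() != source.getWidth() || overlay.getHeight() != source.getHeight()) {
            throw new IllegalArgumentException("mask must match the image dimensions");
        }
        if (!overlay.getColorModel().hasAlpha()) {
            throw new IllegalArgumentException("mask needs an alpha channel to mark editable areas");
        }
    }

    /** Checks the size limit and decodes the bytes with ImageIO. */
    private static BufferedImage decode(byte[] data, String name) {
        if (data == null || data.length == 0 || data.length >= MAX_BYTES) {
            throw new IllegalArgumentException(name + " must be non-empty and smaller than 4 MB");
        }
        try {
            BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(data));
            if (decoded == null) {
                throw new IllegalArgumentException(name + " is not a readable image");
            }
            return decoded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The check does not verify that the files are PNG specifically, since ImageIO also reads other formats; combine it with a signature check if the format matters.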

Image-to-Text Processing

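Images can also be analyzed and described, turning an image into a text description. This goes through a multimodal chat model rather than the image client: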

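java
import java.util.List;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.messages.Media;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.stereotype.Service;
import org.springframework.util.MimeTypeUtils;

@Service
public class ImageAnalysisService {

    // Assumes a vision-capable chat model is configured and a Spring AI release with
    // multimodal messages (Media); this appears to require 0.8.1 or later
    private final ChatClient chatClient;

    public ImageAnalysisService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String analyzeImage(byte[] imageData) {
        // Build a user message that carries the image alongside the question
        Message userMessage = new UserMessage("Describe the content of this image", List.of(
                new Media(MimeTypeUtils.IMAGE_PNG, imageData)
        ));

        Prompt prompt = new Prompt(List.of(
                new SystemMessage("You are an image analysis expert. Describe the image in detail."),
                userMessage
        ));

        // Send the prompt and return the model's description
        ChatResponse response = chatClient.call(prompt);
        return response.getResult().getOutput().getContent();
    }
}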

Saving Generated Images

In a real application you will often need to persist the generated images:

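java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Paths;
import java.text.SimpleDateFormat;
import java.util.Base64;
import java.util.Date;

import org.springframework.ai.image.ImageClient;
import org.springframework.ai.image.ImageOptions;
import org.springframework.ai.image.ImageOptionsBuilder;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

@Service
public class ImageStorageService {

    private final ImageClient imageClient;
    private final String storageDirectory;

    public ImageStorageService(ImageClient imageClient,
                               @Value("${app.image.storage-dir:./images}") String storageDirectory) {
        this.imageClient = imageClient;
        this.storageDirectory = storageDirectory;

        // Make sure the storage directory exists
        new File(storageDirectory).mkdirs();
    }

    public String generateAndSaveImage(String description) throws IOException {
        // Generate the image, asking for Base64 output instead of a URL
        ImageOptions options = ImageOptionsBuilder.builder()
                .withResponseFormat("b64_json")
                .build();
        ImageResponse response = imageClient.call(new ImagePrompt(description, options));
        byte[] imageData = Base64.getDecoder().decode(response.getResult().getOutput().getB64Json());

        // Build a file name and write the bytes to disk
        String filename = generateFilename();
        String filePath = Paths.get(storageDirectory, filename).toString();

        try (FileOutputStream outputStream = new FileOutputStream(filePath)) {
            outputStream.write(imageData);
        }

        return filename;
    }

    private String generateFilename() {
        // Second-level precision: two images generated within the same second share a name
        String timestamp = new SimpleDateFormat("yyyyMMdd_HHmmss").format(new Date());
        return "image_" + timestamp + ".png";
    }
}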
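
A file name built only from a timestamp with second precision can collide when two images are generated within the same second. One possible way around this, sketched below with java.nio, is to append a random suffix and to write through a temporary file so that readers never see a half-written image. ImageFileWriter and its methods are hypothetical names used for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.UUID;

/** Hypothetical helper, not part of Spring AI: writes image bytes under a unique name. */
final class ImageFileWriter {

    // DateTimeFormatter is immutable and thread-safe, so one shared instance is enough.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

    private ImageFileWriter() {
    }

    /** Builds a name made of a timestamp plus eight random hex characters. */
    static String uniqueName(LocalDateTime now) {
        String suffix = UUID.randomUUID().toString().substring(0, 8);
        return "image_" + STAMP.format(now) + "_" + suffix + ".png";
    }

    /** Writes to a temporary file first, then moves it into place in one step. */
    static Path save(Path directory, byte[] imageData) throws IOException {
        Files.createDirectories(directory);
        Path target = directory.resolve(uniqueName(LocalDateTime.now()));
        Path temp = Files.createTempFile(directory, "image_", ".tmp");
        Files.write(temp, imageData);
        return Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The temporary file lives in the target directory on purpose: an atomic move is generally only possible within a single file system.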

Building an Image Generation Web Application

Spring AI fits naturally into a web application that offers image generation:

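java
import java.util.Base64;

import org.springframework.ai.image.ImageClient;
import org.springframework.ai.image.ImageOptions;
import org.springframework.ai.image.ImageOptionsBuilder;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.ModelAttribute;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;

import lombok.Data;

@Controller
@RequestMapping("/images")
public class ImageGenerationController {

    private final ImageClient imageClient;

    // Constructor injection; without it the final field is never initialized
    public ImageGenerationController(ImageClient imageClient) {
        this.imageClient = imageClient;
    }

    @GetMapping
    public String showImageForm(Model model) {
        model.addAttribute("imageRequest", new ImageRequestForm());
        return "image-form";
    }

    // The explicit attribute name keeps "imageRequest" in the model when the form is re-rendered
    @PostMapping
    public String generateImage(@ModelAttribute("imageRequest") ImageRequestForm imageRequest, Model model) {
        try {
            ImageOptions options = ImageOptionsBuilder.builder()
                    .withWidth(imageRequest.getWidth())
                    .withHeight(imageRequest.getHeight())
                    .withResponseFormat("b64_json")
                    .build();

            ImageResponse response = imageClient.call(
                    new ImagePrompt(imageRequest.getDescription(), options));

            // The payload is already Base64, so it can go straight into a data URI
            String base64Image = response.getResult().getOutput().getB64Json();
            model.addAttribute("generatedImage", base64Image);

            return "image-result";
        } catch (Exception e) {
            model.addAttribute("error", "Image generation failed: " + e.getMessage());
            return "image-form";
        }
    }

    @Data
    public static class ImageRequestForm {
        private String description;
        private int width = 512;
        private int height = 512;
    }
}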

The corresponding Thymeleaf templates:

html
<!-- image-form.html -->
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
    <title>Image Generation</title>
</head>
<body>
    <h1>AI Image Generation</h1>
    
    <form th:action="@{/images}" th:object="${imageRequest}" method="post">
        <div>
            <label>Describe the image you want:</label>
            <textarea th:field="*{description}" rows="4" cols="50"></textarea>
        </div>
        <div>
            <label>Width:</label>
            <input type="number" th:field="*{width}" min="256" max="1024" step="256"/>
        </div>
        <div>
            <label>Height:</label>
            <input type="number" th:field="*{height}" min="256" max="1024" step="256"/>
        </div>
        <button type="submit">Generate image</button>
    </form>
    
    <div th:if="${error}" th:text="${error}" style="color: red;"></div>
</body>
</html>

<!-- image-result.html -->
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
    <title>Generated Image</title>
</head>
<body>
    <h1>Result</h1>
    
    <div>
        <img th:src="'data:image/png;base64,' + ${generatedImage}" alt="Generated image"/>
    </div>
    
    <a th:href="@{/images}">Generate another image</a>
</body>
</html>
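
The form lets users choose width and height independently, but OpenAI's image models accept only a fixed set of sizes. A minimal sketch of a server-side check follows. The size table reflects what OpenAI has documented for dall-e-2 and dall-e-3 and should be verified against the current documentation; ImageSizes is a hypothetical helper, not part of Spring AI.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of Spring AI: checks a requested size before calling the API. */
final class ImageSizes {

    // Assumption: sizes as documented by OpenAI per model; verify before relying on them.
    private static final Map<String, Set<String>> SUPPORTED = Map.of(
            "dall-e-2", Set.of("256x256", "512x512", "1024x1024"),
            "dall-e-3", Set.of("1024x1024", "1792x1024", "1024x1792"));

    private ImageSizes() {
    }

    /** Returns true when the model is known and accepts the given width and height. */
    static boolean isSupported(String model, int width, int height) {
        if (model == null) {
            return false;
        }
        Set<String> sizes = SUPPORTED.get(model);
        return sizes != null && sizes.contains(width + "x" + height);
    }

    /** Returns the size as "WIDTHxHEIGHT" or throws when the combination is not supported. */
    static String requireSupported(String model, int width, int height) {
        if (!isSupported(model, width, height)) {
            throw new IllegalArgumentException(
                    "Unsupported size " + width + "x" + height + " for model " + model);
        }
        return width + "x" + height;
    }
}
```

Calling requireSupported at the top of the controller method turns an invalid combination such as 768x768 into a clear form error instead of a failed API call.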

Best Practices for Image Processing

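  1. Detailed prompts: give specific, detailed descriptions to get more accurate images
  2. Parameter tuning: adjust image size, quality, and style to fit the use case
  3. Error handling: handle API rate limits and failures robustly
  4. Caching: cache frequently requested images to reduce API calls
  5. Content filtering: moderate content so that generated images comply with your application's content policy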
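
The caching advice above can be sketched as a small in-memory LRU cache keyed by prompt and size. This is one possible approach under simple assumptions (a single JVM, images small enough to keep in memory). ImageCache and the generator function are hypothetical; in a real service the generator would wrap the call to the image client.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper, not part of Spring AI: in-memory LRU cache for generated images. */
final class ImageCache {

    private final Map<String, byte[]> entries;

    ImageCache(int maxEntries) {
        // accessOrder = true keeps the least recently used entry first in iteration order.
        this.entries = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached image, or calls the generator once and stores its result. */
    synchronized byte[] getOrGenerate(String prompt, String size, Function<String, byte[]> generator) {
        String key = size + "|" + prompt.strip();
        byte[] cached = entries.get(key);
        if (cached == null) {
            cached = generator.apply(prompt);
            entries.put(key, cached);
        }
        return cached;
    }

    /** Number of images currently held. */
    synchronized int entryCount() {
        return entries.size();
    }
}
```

Two trade-offs are worth noting: identical prompts will always return the same picture, and the lock is held while the generator runs, so concurrent requests wait for each other. A production setup would more likely use an existing cache library with per-key loading.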

Summary

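Spring AI provides a solid foundation for image work, covering image generation, variation creation, image editing, and image analysis. Through a unified abstraction, developers can integrate these capabilities and build feature-rich AI image applications. The image features complement Spring AI's text features, and together they support building complete AI applications.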