Delete engineers

Browse files

- engineers/LICENSE +0 -23
- engineers/NOTICE.md +0 -76
- engineers/README.md +0 -211
- engineers/__init__.py +0 -0
- engineers/deformes2D_thinker.py +0 -171
- engineers/deformes3D.py +0 -194
- engineers/deformes3D_thinker.py +0 -136
- engineers/deformes4D.py +0 -338
- engineers/deformes7D.py +0 -316
engineers/LICENSE
DELETED

@@ -1,23 +0,0 @@
-# AducSdr: An open and functional implementation of the ADUC-SDR architecture for coherent video generation.
-# Copyright (C) August 4, 2025 Carlos Rodrigues dos Santos
-#
-# Contact:
-# Carlos Rodrigues dos Santos
-# carlex22@gmail.com
-# Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
-#
-# Related Repositories and Projects:
-# GitHub: https://github.com/carlex22/Aduc-sdr
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU Affero General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU Affero General Public License for more details.
-#
-# You should have received a copy of the GNU Affero General Public License
-# along with this program. If not, see <https://www.gnu.org/licenses/>.
engineers/NOTICE.md
DELETED

@@ -1,76 +0,0 @@
# NOTICE

Copyright (C) 2025 Carlos Rodrigues dos Santos. All rights reserved.

---

## Intellectual Property and Licensing Notice

### **Patent Pending:**

The method and system for prompt orchestration named **ADUC (Automated Discovery and Orchestration of Complex tasks)**, as described herein and implemented in this software, are currently in the process of being patented.

The rights holder, Carlos Rodrigues dos Santos, is seeking legal protection for the key innovations of the ADUC architecture, including, but not limited to:

* Fragmentation and scaling of requests exceeding AI model context limits.
* Intelligent distribution of sub-tasks to heterogeneous specialists.
* Persistent state management with iterative evaluation and feedback for planning subsequent steps.
* Cost-, latency-, and quality-aware planning and routing.
* The use of "universal tokens" for model-agnostic communication.

### **Acknowledgement and Implications:**

By accessing or using this software and the ADUC architecture implemented herein, you acknowledge:

1. The innovative nature and significance of the ADUC architecture in the field of AI prompt orchestration.
2. That the essence of this architecture, or its derivative implementations, may be subject to intellectual property rights, including patents.
3. That commercial use, reproduction of ADUC's core logic in independent systems, or direct exploitation of the invention without proper licensing may infringe upon pending patent rights.

---

## AGPLv3 License

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.

---

**Contact for Inquiries:**

For more information about the ADUC architecture, the status of the patent process, or to discuss licensing for commercial uses or uses not compliant with the AGPLv3, please contact:

Carlos Rodrigues dos Santos
carlex22@gmail.com
Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
engineers/README.md
DELETED

@@ -1,211 +0,0 @@
---
title: Euia-AducSdr
emoji: 🎥
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
pinned: true
license: agpl-3.0
short_description: An open, functional implementation of ADUC-SDR
---

An open and functional implementation of the ADUC-SDR (Architecture for Compositive Unification - Dynamic and Resilient Scaling) architecture, designed for long-form coherent video generation. This project materializes the principles of fragmentation, geometric navigation, and a "causal echo" 4-bit memory mechanism to ensure physical and narrative continuity in video sequences generated by multiple AI models.

**License:** This project is licensed under the terms of the **GNU Affero General Public License v3.0**. This means that if you use this software (or any derivative work) to provide a service over a network, you are **required to make the complete source code** of your version available to the users of that service.

- **Copyright (C) August 4, 2025, Carlos Rodrigues dos Santos**
- A full copy of the license can be found in the [LICENSE](LICENSE) file.

---

## **Intellectual Property and Patent Notice**

### **Patent Pending:**

The **ADUC (Automated Discovery and Orchestration of Complex tasks)** architecture and method, as described in this project and its associated claims, are **currently in the process of being patented.**

The rights holder, Carlos Rodrigues dos Santos, is seeking legal protection for the key innovations of the ADUC architecture, including, but not limited to:

* Fragmentation and scaling of requests exceeding AI model context limits.
* Intelligent distribution of sub-tasks to heterogeneous specialists.
* Persistent state management with iterative evaluation and feedback for planning subsequent steps.
* Cost-, latency-, and quality-aware planning and routing.
* The use of "universal tokens" for model-agnostic communication.

By using this software and the ADUC architecture implemented herein, you acknowledge the innovative nature of this architecture and that **the reproduction or exploitation of ADUC's core logic in independent systems may infringe upon pending patent rights.**

---

### ADUC Technical Details and Claims

#### Short Definition (for Thesis and Patent)

**ADUC** is a *pre-input* and *intermediate* **prompt management framework** that:

1. **fragments** requests exceeding any model's context limit,
2. **scales linearly** (a sequential process with persisted memory),
3. **distributes** sub-tasks to **specialists** (heterogeneous models/tools), and
4. **feeds back** into the next step an evaluation of what was done versus what was expected (director LLM).

It is not a model; it is a pluggable **orchestration layer** placed before the input of existing models (text, image, audio, video), using *universal tokens* and current technology.

---

#### Essential Elements (Telegraphic)

* **Model-agnostic:** operates with any LLM/diffuser/API.
* **Pre-input manager:** receives the user request, **divides** it into blocks ≤ the token limit, **prioritizes**, **schedules**, and **routes**.
* **Persisted memory:** results/latents/"echo" become **shared state** for the next block (nothing is ignored).
* **Specialists:** *routers* decide who does what (e.g., "description → LLM-A", "keyframe → Img-B", "video → Vid-C").
* **Quality control:** a director LLM compares *what was done* × *what should have been done* × *what is missing* and **regenerates objectives** for the next fragment.
* **Cost/latency-aware:** plans by **VRAM/time/cost**; it does not try to "embrace everything at once". A minimal sketch of this loop follows.
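The four numbered capabilities above form a control loop, not a model. The following is a minimal, illustrative Python sketch of such a loop; every name in it (`SubTask`, `StateBank`, `orchestrate`, and the callables) is a hypothetical stand-in for the concepts above, not an API of this repository.

```python
# A minimal, illustrative ADUC-style loop. All names are hypothetical
# stand-ins for the numbered capabilities above, not APIs of this repo.
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class SubTask:
    prompt: str      # instruction in "universal tokens" for one fragment
    specialist: str  # which model/tool should execute it

@dataclass
class StateBank:
    artifacts: List[str] = field(default_factory=list)  # persisted memory ("echo")

def orchestrate(request: str, token_limit: int,
                fragment: Callable[[str, int], List[SubTask]],
                execute: Callable[[SubTask, StateBank], str],
                evaluate: Callable[[str, str], Optional[str]]) -> StateBank:
    """Fragment -> route -> execute -> persist -> feed back, until done."""
    state = StateBank()
    pending = fragment(request, token_limit)      # 1. fragment above the context limit
    while pending:                                # 2. sequential, linearly scaling pass
        task = pending.pop(0)
        output = execute(task, state)             # 3. routed to a specialist
        state.artifacts.append(output)            # persisted memory: nothing is ignored
        revision = evaluate(output, task.prompt)  # 4. director compares done vs. expected
        if revision is not None:                  # regenerate objectives for the next step
            pending.insert(0, SubTask(revision, task.specialist))
    return state
```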
---

#### Independent Claims (Method and System)

**Independent Claim (Method) — Concise Version:**

1. A **method** for **prompt orchestration** for executing tasks exceeding AI model context limits, comprising:
(a) **receiving** a request that exceeds a token limit;
(b) **analyzing** the request with a **director LLM** and **fragmenting it** into sub-tasks ≤ the limit;
(c) **selecting** execution specialists for each sub-task based on declared capabilities;
(d) **generating** specific prompts per sub-task in **universal tokens**, including references to the **persisted state** of previous executions;
(e) **sequentially executing** the sub-tasks and **persisting** their outputs as memory (including latents/echo/artifacts);
(f) **automatically evaluating** the output against declared goals and **regenerating objectives** for the next fragment;
(g) **iterating** (b)–(f) until completion criteria are met, producing the aggregated result;
wherein the framework **scales linearly** in time and physical storage, **independent** of the context window of the underlying models.

**Independent Claim (System):**

2. A prompt orchestration **system**, comprising: a **director LLM planner**; a **specialist router**; a **persisted state bank** (incl. kinetic memory for video); a **universal prompt generator**; and an **evaluation/feedback module**, coupled via a **pre-input API** to heterogeneous models. A hypothetical sketch of this composition follows.
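Read as software, the system claim enumerates five cooperating components. The sketch below is a hypothetical composition only; the `Specialist` interface, the dict-shaped plan steps, and the `pre_input` method are assumptions made for illustration, not this repository's APIs.

```python
# Hypothetical composition of the five components named in the system claim.
from dataclasses import dataclass
from typing import Any, Dict, List, Protocol

class Specialist(Protocol):
    def run(self, request: Any, state: Dict[str, Any]) -> Any: ...

@dataclass
class AducSystem:
    planner: Specialist            # director LLM planner
    router: Dict[str, Specialist]  # specialist router: capability -> model/tool
    state_bank: Dict[str, Any]     # persisted state bank (incl. kinetic echo for video)
    prompt_generator: Specialist   # universal prompt generator
    evaluator: Specialist          # evaluation/feedback module

    def pre_input(self, request: str) -> List[Any]:
        """The pre-input API: plan, generate prompts, route, persist, evaluate."""
        plan = self.planner.run(request, self.state_bank)  # assumed: list of step dicts
        outputs: List[Any] = []
        for step in plan:
            prompt = self.prompt_generator.run(step, self.state_bank)
            result = self.router[step["capability"]].run(prompt, self.state_bank)
            self.state_bank[step["id"]] = result         # persist for later steps
            self.evaluator.run(result, self.state_bank)  # feedback into planning state
            outputs.append(result)
        return outputs
```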
---

#### Useful Dependents

* (3) Wherein routing considers **cost/latency/VRAM** and quality goals.
* (4) Wherein the state bank includes a **kinetic echo** for video (the last *n* frames/latents/flow).
* (5) Wherein evaluation uses domain-specific metrics (Lflow, semantic consistency, etc.).
* (6) Wherein *universal tokens* standardize instructions between specialists.
* (7) Wherein orchestration decides **cut vs. continuous** and applies a **regenerative cut** (Déjà-Vu) when editing video.
* (8) Wherein the system **never discards** excess content: it **reschedules** it in new fragments.

---

#### How this Connects to SDR (Video)

* **Kinetic Echo**: a **type of persisted state** consumed by the next step.
* **Déjà-Vu (Regenerative Cut)**: an **orchestration policy** applied during editing; ADUC decides, crafts the right prompts, and calls the video specialist.
* **Cut vs. Continuous**: a decision made by the **director** based on state plus goals; ADUC routes it and guarantees the final overlap/removal.

---

#### Clear User Message (Experience)

> "Your request exceeds model Y's limit X. Instead of silently truncating, **ADUC** will divide the work and **deliver 100%** of the content through coordinated steps."

This is a practical and legal differentiator: **non-obviousness** through transforming a context limit into a **controlled pipeline**, with **state persistence** and **iterative evaluation**.

---

### Contact

- **Author:** Carlos Rodrigues dos Santos
- **Email:** carlex22@gmail.com
- **GitHub:** [https://github.com/carlex22/Aduc-sdr](https://github.com/carlex22/Aduc-sdr)
- **Hugging Face Spaces:**
  - [Ltx-SuperTime-60Secondos](https://huggingface.co/spaces/Carlexx/Ltx-SuperTime-60Secondos/)
  - [Novinho](https://huggingface.co/spaces/Carlexxx/Novinho/)

---
engineers/__init__.py
DELETED

File without changes
engineers/deformes2D_thinker.py
DELETED

@@ -1,171 +0,0 @@
# engineers/deformes2D_thinker.py
# AducSdr: An open and functional implementation of the ADUC-SDR architecture
# Copyright (C) August 4, 2025 Carlos Rodrigues dos Santos
#
# Contact:
# Carlos Rodrigues dos Santos
# carlex22@gmail.com
# Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
#
# Related Repositories and Projects:
# GitHub: https://github.com/carlex22/Aduc-sdr
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
# PENDING PATENT NOTICE: Please see NOTICE.md.
#
# Version 1.0.1

import logging
from pathlib import Path
from PIL import Image
import gradio as gr
from typing import List

# It imports the communication layer, not the API directly
from managers.gemini_manager import gemini_manager_singleton

logger = logging.getLogger(__name__)

class Deformes2DThinker:
    """
    The cognitive specialist that handles prompt engineering and creative logic.
    """
    def _read_prompt_template(self, filename: str) -> str:
        """Reads a prompt template file from the 'prompts' directory."""
        try:
            prompts_dir = Path(__file__).resolve().parent.parent / "prompts"
            with open(prompts_dir / filename, "r", encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            raise gr.Error(f"Prompt template file not found: prompts/{filename}")

    def generate_storyboard(self, prompt: str, num_keyframes: int, ref_image_paths: List[str]) -> List[str]:
        """Acts as a Scriptwriter to generate a storyboard."""
        try:
            template = self._read_prompt_template("unified_storyboard_prompt.txt")
            storyboard_prompt = template.format(user_prompt=prompt, num_fragments=num_keyframes)
            images = [Image.open(p) for p in ref_image_paths]

            # Assemble all parts into a single list for the manager
            prompt_parts = [storyboard_prompt] + images
            storyboard_data = gemini_manager_singleton.get_json_object(prompt_parts)

            storyboard = storyboard_data.get("scene_storyboard", [])
            if not storyboard or len(storyboard) != num_keyframes:
                raise ValueError(f"Incorrect number of scenes generated. Expected {num_keyframes}, got {len(storyboard)}.")
            return storyboard
        except Exception as e:
            raise gr.Error(f"The Scriptwriter (Deformes2D Thinker) failed: {e}")

    def select_keyframes_from_pool(self, storyboard: list, base_image_paths: list[str], pool_image_paths: list[str]) -> list[str]:
        """Acts as a Photographer/Editor to select keyframes."""
        if not pool_image_paths:
            raise gr.Error("The 'image pool' (Additional Images) is empty.")

        try:
            template = self._read_prompt_template("keyframe_selection_prompt.txt")

            image_map = {f"IMG-{i+1}": path for i, path in enumerate(pool_image_paths)}

            prompt_parts = ["# Reference Images (Story Base)"]
            prompt_parts.extend([Image.open(p) for p in base_image_paths])
            prompt_parts.append("\n# Image Pool (Scene Bank)")
            prompt_parts.extend([Image.open(p) for p in pool_image_paths])

            storyboard_str = "\n".join([f"- Scene {i+1}: {s}" for i, s in enumerate(storyboard)])
            selection_prompt = template.format(storyboard_str=storyboard_str, image_identifiers=list(image_map.keys()))
            prompt_parts.append(selection_prompt)

            selection_data = gemini_manager_singleton.get_json_object(prompt_parts)

            selected_identifiers = selection_data.get("selected_image_identifiers", [])

            if len(selected_identifiers) != len(storyboard):
                raise ValueError("The AI did not select the correct number of images for the scenes.")

            selected_paths = [image_map[identifier] for identifier in selected_identifiers]
            return selected_paths

        except Exception as e:
            raise gr.Error(f"The Photographer (Deformes2D Thinker) failed to select images: {e}")

    def get_anticipatory_keyframe_prompt(self, global_prompt: str, scene_history: str, current_scene_desc: str, future_scene_desc: str, last_image_path: str, fixed_ref_paths: list[str]) -> str:
        """Acts as an Art Director to generate an image prompt."""
        try:
            template = self._read_prompt_template("anticipatory_keyframe_prompt.txt")

            director_prompt = template.format(
                historico_prompt=scene_history,
                cena_atual=current_scene_desc,
                cena_futura=future_scene_desc
            )

            prompt_parts = [
                f"# CONTEXT:\n- Global Story Goal: {global_prompt}\n# VISUAL ASSETS:",
                "Current Base Image [IMG-BASE]:",
                Image.open(last_image_path)
            ]

            ref_counter = 1
            for path in fixed_ref_paths:
                if path != last_image_path:
                    prompt_parts.extend([f"General Reference Image [IMG-REF-{ref_counter}]:", Image.open(path)])
                    ref_counter += 1

            prompt_parts.append(director_prompt)

            final_flux_prompt = gemini_manager_singleton.get_raw_text(prompt_parts)

            return final_flux_prompt.strip().replace("`", "").replace("\"", "")
        except Exception as e:
            raise gr.Error(f"The Art Director (Deformes2D Thinker) failed: {e}")

    def get_cinematic_decision(self, global_prompt: str, story_history: str,
                               past_keyframe_path: str, present_keyframe_path: str, future_keyframe_path: str,
                               past_scene_desc: str, present_scene_desc: str, future_scene_desc: str) -> dict:
        """Acts as a Film Director to make editing decisions and generate motion prompts."""
        try:
            template = self._read_prompt_template("cinematic_director_prompt.txt")
            prompt_text = template.format(
                global_prompt=global_prompt,
                story_history=story_history,
                past_scene_desc=past_scene_desc,
                present_scene_desc=present_scene_desc,
                future_scene_desc=future_scene_desc
            )

            prompt_parts = [
                prompt_text,
                "[PAST_IMAGE]:", Image.open(past_keyframe_path),
                "[PRESENT_IMAGE]:", Image.open(present_keyframe_path),
                "[FUTURE_IMAGE]:", Image.open(future_keyframe_path)
            ]

            decision_data = gemini_manager_singleton.get_json_object(prompt_parts)

            if "transition_type" not in decision_data or "motion_prompt" not in decision_data:
                raise ValueError("AI response (Cinematographer) is malformed. Missing 'transition_type' or 'motion_prompt'.")
            return decision_data
        except Exception as e:
            logger.error(f"The Film Director (Deformes2D Thinker) failed: {e}. Falling back to 'continuous'.", exc_info=True)
            return {
                "transition_type": "continuous",
                "motion_prompt": f"A smooth, continuous cinematic transition from '{present_scene_desc}' to '{future_scene_desc}'."
            }

# --- Singleton Instance ---
deformes2d_thinker_singleton = Deformes2DThinker()
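For context, a hypothetical call-site sketch of the deleted specialist above. It is illustrative only: the paths and story text are placeholders, and it assumes the `managers` package and prompt templates were still present in the repository.

```python
# Hypothetical usage of the deleted specialist (illustrative only).
from engineers.deformes2D_thinker import deformes2d_thinker_singleton

refs = ["workspace/ref_01.png", "workspace/ref_02.png"]  # placeholder paths
scenes = deformes2d_thinker_singleton.generate_storyboard(
    prompt="A lighthouse keeper rides out a storm",  # placeholder story
    num_keyframes=4,
    ref_image_paths=refs,
)
decision = deformes2d_thinker_singleton.get_cinematic_decision(
    global_prompt="A lighthouse keeper rides out a storm",
    story_history="",
    past_keyframe_path=refs[0], present_keyframe_path=refs[0], future_keyframe_path=refs[1],
    past_scene_desc=scenes[0], present_scene_desc=scenes[1], future_scene_desc=scenes[2],
)
print(decision["transition_type"], decision["motion_prompt"])
```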
engineers/deformes3D.py
DELETED

@@ -1,194 +0,0 @@
# engineers/deformes3D.py
#
# AducSdr: An open and functional implementation of the ADUC-SDR architecture
# Copyright (C) August 4, 2025 Carlos Rodrigues dos Santos
#
# Contact:
# Carlos Rodrigues dos Santos
# carlex22@gmail.com
# Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
#
# Related Repositories and Projects:
# GitHub: https://github.com/carlex22/Aduc-sdr
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
# PENDING PATENT NOTICE: Please see NOTICE.md.
#
# Version 2.0.1

from PIL import Image, ImageOps
import os
import time
import logging
import gradio as gr
import yaml
import torch
import numpy as np

from managers.flux_kontext_manager import flux_kontext_singleton
from engineers.deformes2D_thinker import deformes2d_thinker_singleton
from aduc_types import LatentConditioningItem
from managers.ltx_manager import ltx_manager_singleton
from managers.vae_manager import vae_manager_singleton
from managers.latent_enhancer_manager import latent_enhancer_specialist_singleton

logger = logging.getLogger(__name__)

class Deformes3DEngine:
    """
    ADUC specialist for static image (keyframe) generation.
    """
    def __init__(self, workspace_dir):
        self.workspace_dir = workspace_dir
        self.image_generation_helper = flux_kontext_singleton
        logger.info("3D Engine (Image Specialist) ready to receive orders from the Maestro.")

    def _generate_single_keyframe(self, prompt: str, reference_images: list[Image.Image], output_filename: str, width: int, height: int, callback: callable = None) -> str:
        """
        Low-level function that generates a single image using the FLUX helper.
        """
        logger.info(f"Generating keyframe '{output_filename}' with prompt: '{prompt}'")
        generated_image = self.image_generation_helper.generate_image(
            reference_images=reference_images, prompt=prompt, width=width,
            height=height, seed=int(time.time()), callback=callback
        )
        final_path = os.path.join(self.workspace_dir, output_filename)
        generated_image.save(final_path)
        logger.info(f"Keyframe successfully saved to: {final_path}")
        return final_path

    def generate_keyframes_from_storyboard(self, storyboard: list, initial_ref_path: str, global_prompt: str, keyframe_resolution: int, general_ref_paths: list, progress_callback_factory: callable = None):
        """
        Orchestrates the generation of all keyframes.
        """
        current_base_image_path = initial_ref_path
        previous_prompt = "N/A (initial reference image)"
        final_keyframes_gallery = []  # [current_base_image_path]
        width, height = keyframe_resolution, keyframe_resolution
        target_resolution_tuple = (width, height)

        num_keyframes_to_generate = len(storyboard) - 1
        logger.info(f"IMAGE SPECIALIST: Received order to generate {num_keyframes_to_generate} keyframes (LTX versions).")

        for i in range(num_keyframes_to_generate):
            scene_index = i + 1
            current_scene = storyboard[i]
            future_scene = storyboard[i+1]
            progress_callback_flux = progress_callback_factory(scene_index, num_keyframes_to_generate) if progress_callback_factory else None

            logger.info(f"--> Generating Keyframe {scene_index}/{num_keyframes_to_generate}...")

            # --- STEP A: Generate the anticipatory keyframe prompt ---
            logger.info("  - Step A: Generating keyframe prompt...")

            img_prompt = deformes2d_thinker_singleton.get_anticipatory_keyframe_prompt(
                global_prompt=global_prompt, scene_history=previous_prompt,
                current_scene_desc=current_scene, future_scene_desc=future_scene,
                last_image_path=current_base_image_path, fixed_ref_paths=general_ref_paths
            )

            #flux_ref_paths = list(set([current_base_image_path] + general_ref_paths))
            #flux_ref_images = [Image.open(p) for p in flux_ref_paths]

            #flux_keyframe_path = self._generate_single_keyframe(
            #    prompt=img_prompt, reference_images=flux_ref_images,
            #    output_filename=f"keyframe_{scene_index}_flux.png", width=width, height=height,
            #    callback=progress_callback_flux
            #)
            #final_keyframes_gallery.append(flux_keyframe_path)

            # --- STEP B: LTX Enrichment Experiment ---
            #logger.info("  - Step B: Generating enrichment with LTX...")

            context_paths = [current_base_image_path] + [p for p in general_ref_paths if p != current_base_image_path][:3]

            ltx_context_paths = list(reversed(context_paths))
            logger.info(f"  - LTX Context Order (Reversed): {[os.path.basename(p) for p in ltx_context_paths]}")

            ltx_conditioning_items = []

            # Conditioning weight starts at 0.6 and decays by 0.1 per context image.
            weight = 0.6
            for idx, path in enumerate(ltx_context_paths):
                img_pil = Image.open(path).convert("RGB")
                img_processed = self._preprocess_image_for_latent_conversion(img_pil, target_resolution_tuple)
                pixel_tensor = self._pil_to_pixel_tensor(img_processed)
                latent_tensor = vae_manager_singleton.encode(pixel_tensor)

                ltx_conditioning_items.append(LatentConditioningItem(latent_tensor, 0, weight))
                weight -= 0.1

            ltx_base_params = {"guidance_scale": 1.0, "stg_scale": 0.001, "num_inference_steps": 25}
            generated_latents, _ = ltx_manager_singleton.generate_latent_fragment(
                callback_on_step_end=progress_callback_flux,
                height=height, width=width,
                conditioning_items_data=ltx_conditioning_items,
                motion_prompt=img_prompt,
                video_total_frames=48,
                video_fps=24,
                **ltx_base_params
            )

            final_latent = generated_latents[:, :, -1:, :, :]
            upscaled_latent = latent_enhancer_specialist_singleton.upscale(final_latent)
            enriched_pixel_tensor = vae_manager_singleton.decode(upscaled_latent)

            ltx_keyframe_path = os.path.join(self.workspace_dir, f"keyframe_{scene_index}_ltx.png")
            self.save_image_from_tensor(enriched_pixel_tensor, ltx_keyframe_path)
            final_keyframes_gallery.append(ltx_keyframe_path)

            # Use the LTX keyframe as the base for the next iteration to maintain the primary narrative path
            current_base_image_path = ltx_keyframe_path  # flux_keyframe_path
            previous_prompt = img_prompt

        logger.info("IMAGE SPECIALIST: Generation of all keyframe versions (LTX) complete.")
        return final_keyframes_gallery

    # --- HELPER FUNCTIONS ---

    def _preprocess_image_for_latent_conversion(self, image: Image.Image, target_resolution: tuple) -> Image.Image:
        """Resizes and fits an image to the target resolution for VAE encoding."""
        if image.size != target_resolution:
            return ImageOps.fit(image, target_resolution, Image.Resampling.LANCZOS)
        return image

    def _pil_to_pixel_tensor(self, pil_image: Image.Image) -> torch.Tensor:
        """Helper to convert PIL to the 5D pixel tensor the VAE expects."""
        image_np = np.array(pil_image).astype(np.float32) / 255.0
        tensor = torch.from_numpy(image_np).permute(2, 0, 1).unsqueeze(0).unsqueeze(2)
        return (tensor * 2.0) - 1.0

    def save_image_from_tensor(self, pixel_tensor: torch.Tensor, path: str):
        """Helper to save a 1-frame pixel tensor as an image."""
        tensor_chw = pixel_tensor.squeeze(0).squeeze(1)
        tensor_hwc = tensor_chw.permute(1, 2, 0)
        tensor_hwc = (tensor_hwc.clamp(-1, 1) + 1) / 2.0
        image_np = (tensor_hwc.cpu().float().numpy() * 255).astype(np.uint8)
        Image.fromarray(image_np).save(path)

# --- Singleton Instantiation ---
try:
    with open("config.yaml", 'r') as f:
        config = yaml.safe_load(f)
    WORKSPACE_DIR = config['application']['workspace_dir']
    deformes3d_engine_singleton = Deformes3DEngine(workspace_dir=WORKSPACE_DIR)
except Exception as e:
    logger.error(f"Could not initialize Deformes3DEngine: {e}", exc_info=True)
    deformes3d_engine_singleton = None
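A hypothetical invocation sketch of the engine above, for context. It is illustrative only: the storyboard and paths are placeholders, and it assumes a valid `config.yaml` so the singleton initialized rather than falling back to `None`.

```python
# Hypothetical invocation of the deleted engine (illustrative only).
from engineers.deformes3D import deformes3d_engine_singleton

def progress_factory(scene_index: int, total: int):
    # Returns a per-scene callback compatible with callback_on_step_end.
    def on_step(*args, **kwargs):
        print(f"scene {scene_index}/{total}: step done")
    return on_step

gallery = deformes3d_engine_singleton.generate_keyframes_from_storyboard(
    storyboard=["harbor at dawn", "storm rolling in", "keeper climbs the tower"],
    initial_ref_path="workspace/ref_01.png",  # placeholder
    global_prompt="A lighthouse keeper rides out a storm",
    keyframe_resolution=512,
    general_ref_paths=["workspace/ref_01.png", "workspace/ref_02.png"],
    progress_callback_factory=progress_factory,
)
print(gallery)  # e.g. ["deformes_workspace/keyframe_1_ltx.png", ...]
```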
engineers/deformes3D_thinker.py
DELETED

@@ -1,136 +0,0 @@
# engineers/deformes3D_thinker.py
#
# Copyright (C) 2025 Carlos Rodrigues dos Santos
#
# Version: 4.0.0 (Definitive)
#
# This is the definitive, robust implementation. It directly contains the prompt
# enhancement logic copied from the LTX pipeline's utils. It accesses the
# enhancement models loaded by the LTX Manager and performs the captioning
# and LLM generation steps locally, ensuring full control and compatibility.

import logging
from PIL import Image
import torch

# Import the LTX singleton for access to its pipeline and the models inside it
from managers.ltx_manager import ltx_manager_singleton

# Import the LTX system prompt to guarantee consistency
from ltx_video.utils.prompt_enhance_utils import I2V_CINEMATIC_PROMPT

logger = logging.getLogger(__name__)

class Deformes3DThinker:
    """
    The tactical specialist that now directly implements the prompt enhancement
    logic, using the models provided by the LTX pipeline.
    """

    def __init__(self):
        # Access the exposed pipeline to obtain the required models
        pipeline = ltx_manager_singleton.prompt_enhancement_pipeline
        if not pipeline:
            raise RuntimeError("Deformes3DThinker could not access the LTX pipeline.")

        # Store the models and processors as direct attributes
        self.caption_model = pipeline.prompt_enhancer_image_caption_model
        self.caption_processor = pipeline.prompt_enhancer_image_caption_processor
        self.llm_model = pipeline.prompt_enhancer_llm_model
        self.llm_tokenizer = pipeline.prompt_enhancer_llm_tokenizer

        # Verify that the models were actually loaded
        if not all([self.caption_model, self.caption_processor, self.llm_model, self.llm_tokenizer]):
            logger.warning("Deformes3DThinker initialized, but one or more enhancement models were not loaded by the LTX pipeline. Fallback will be used.")
        else:
            logger.info("Deformes3DThinker initialized and successfully linked to LTX enhancement models.")

    @torch.no_grad()
    def get_enhanced_motion_prompt(self, global_prompt: str, story_history: str,
                                   past_keyframe_path: str, present_keyframe_path: str, future_keyframe_path: str,
                                   past_scene_desc: str, present_scene_desc: str, future_scene_desc: str) -> str:
        """
        Generates a refined motion prompt by directly executing the enhancement pipeline logic.
        """
        # Check that the models are available before trying to use them
        if not all([self.caption_model, self.caption_processor, self.llm_model, self.llm_tokenizer]):
            logger.warning("Enhancement models not available. Using fallback prompt.")
            return f"A cinematic transition from '{present_scene_desc}' to '{future_scene_desc}'."

        try:
            present_image = Image.open(present_keyframe_path).convert("RGB")

            # --- BEGIN LOGIC COPIED AND ADAPTED FROM LTX ---

            # 1. Generate the caption of the reference (present) image
            image_captions = self._generate_image_captions([present_image])

            # 2. Build the prompt for the LLM.
            #    The future scene is used as the "user prompt".
            messages = [
                {"role": "system", "content": I2V_CINEMATIC_PROMPT},
                {"role": "user", "content": f"user_prompt: {future_scene_desc}\nimage_caption: {image_captions[0]}"},
            ]

            # 3. Generate and decode the final prompt with the LLM
            enhanced_prompt = self._generate_and_decode_prompts(messages)

            # --- END OF COPIED AND ADAPTED LOGIC ---

            logger.info(f"Deformes3DThinker received enhanced prompt: '{enhanced_prompt}'")
            return enhanced_prompt

        except Exception as e:
            logger.error(f"The Film Director (Deformes3D Thinker) failed during enhancement: {e}. Using fallback.", exc_info=True)
            return f"A smooth, continuous cinematic transition from '{present_scene_desc}' to '{future_scene_desc}'."

    def _generate_image_captions(self, images: list[Image.Image]) -> list[str]:
        """
        Internal caption-generation logic, copied from the LTX utils.
        """
        # The LTX Florence-2 model does not take a system_prompt here, but a task_prompt
        task_prompt = "<MORE_DETAILED_CAPTION>"
        inputs = self.caption_processor(
            text=[task_prompt] * len(images), images=images, return_tensors="pt"
        ).to(self.caption_model.device)

        generated_ids = self.caption_model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            num_beams=3,
        )

        # Use post_process_generation to extract the clean answer
        generated_text = self.caption_processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
        processed_result = self.caption_processor.post_process_generation(
            generated_text,
            task=task_prompt,
            image_size=(images[0].width, images[0].height)
        )
        return [processed_result[task_prompt]]

    def _generate_and_decode_prompts(self, messages: list[dict]) -> str:
        """
        Internal LLM prompt-generation logic, copied from the LTX utils.
        """
        text = self.llm_tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
        model_inputs = self.llm_tokenizer([text], return_tensors="pt").to(self.llm_model.device)

        output_ids = self.llm_model.generate(**model_inputs, max_new_tokens=256)

        input_ids_len = model_inputs.input_ids.shape[1]
        decoded_prompts = self.llm_tokenizer.batch_decode(
            output_ids[:, input_ids_len:], skip_special_tokens=True
        )
        return decoded_prompts[0].strip()

# --- Singleton Instantiation ---
try:
    deformes3d_thinker_singleton = Deformes3DThinker()
except Exception as e:
    # The failure will already have been logged inside __init__
    deformes3d_thinker_singleton = None
    raise e
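The enhancement flow above is caption, then chat template, then LLM decode. A hypothetical call-site sketch, for context only (placeholder paths and scene text; it assumes the LTX enhancement models loaded so no fallback is taken):

```python
# Hypothetical usage of the deleted thinker (illustrative only).
from engineers.deformes3D_thinker import deformes3d_thinker_singleton

motion_prompt = deformes3d_thinker_singleton.get_enhanced_motion_prompt(
    global_prompt="A lighthouse keeper rides out a storm",
    story_history="",
    past_keyframe_path="workspace/keyframe_1_ltx.png",     # placeholder
    present_keyframe_path="workspace/keyframe_2_ltx.png",  # placeholder
    future_keyframe_path="workspace/keyframe_3_ltx.png",   # placeholder
    past_scene_desc="harbor at dawn",
    present_scene_desc="storm rolling in",
    future_scene_desc="keeper climbs the tower",
)
print(motion_prompt)
```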
engineers/deformes4D.py
DELETED
|
@@ -1,338 +0,0 @@
|
|
| 1 |
-
# engineers/deformes4D.py
|
| 2 |
-
#
|
| 3 |
-
# AducSdr: Uma implementação aberta e funcional da arquitetura ADUC-SDR
|
| 4 |
-
# Copyright (C) 4 de Agosto de 2025 Carlos Rodrigues dos Santos
|
| 5 |
-
#
|
| 6 |
-
# Contato:
|
| 7 |
-
# Carlos Rodrigues dos Santos
|
| 8 |
-
# carlex22@gmail.com
|
| 9 |
-
# Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
|
| 10 |
-
#
|
| 11 |
-
# Repositórios e Projetos Relacionados:
|
| 12 |
-
# GitHub: https://github.com/carlex22/Aduc-sdr
|
| 13 |
-
#
|
| 14 |
-
# This program is free software: you can redistribute it and/or modify
|
| 15 |
-
# it under the terms of the GNU Affero General Public License as published by
|
| 16 |
-
# the Free Software Foundation, either version 3 of the License, or
|
| 17 |
-
# (at your option) any later version.
|
| 18 |
-
#
|
| 19 |
-
# This program is distributed in the hope that it will be useful,
|
| 20 |
-
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
| 21 |
-
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
| 22 |
-
# GNU Affero General Public License for more details.
|
| 23 |
-
#
|
| 24 |
-
# You should have received a copy of the GNU Affero General Public License
|
| 25 |
-
# along with this program. If not, see <https://www.gnu.org/licenses/>.
|
| 26 |
-
#
|
| 27 |
-
# This program is free software: you can redistribute it and/or modify
|
| 28 |
-
# it under the terms of the GNU Affero General Public License...
|
| 29 |
-
# PENDING PATENT NOTICE: Please see NOTICE.md.
|
| 30 |
-
#
|
| 31 |
-
# Version 2.0.1
|
| 32 |
-
|
| 33 |
-
import os
|
| 34 |
-
import time
|
| 35 |
-
import imageio
|
| 36 |
-
import numpy as np
|
| 37 |
-
import torch
|
| 38 |
-
import logging
|
| 39 |
-
from PIL import Image, ImageOps
|
| 40 |
-
from dataclasses import dataclass
|
| 41 |
-
import gradio as gr
|
| 42 |
-
import subprocess
|
| 43 |
-
import gc
|
| 44 |
-
import shutil
|
| 45 |
-
from pathlib import Path
|
| 46 |
-
from typing import List, Tuple, Generator, Dict, Any
|
| 47 |
-
|
| 48 |
-
from aduc_types import LatentConditioningItem
|
| 49 |
-
from managers.ltx_manager import ltx_manager_singleton
|
| 50 |
-
from managers.latent_enhancer_manager import latent_enhancer_specialist_singleton
|
| 51 |
-
from managers.vae_manager import vae_manager_singleton
|
| 52 |
-
from engineers.deformes2D_thinker import deformes2d_thinker_singleton
|
| 53 |
-
from managers.seedvr_manager import seedvr_manager_singleton
|
| 54 |
-
from managers.mmaudio_manager import mmaudio_manager_singleton
|
| 55 |
-
from tools.video_encode_tool import video_encode_tool_singleton
|
| 56 |
-
|
| 57 |
-
logger = logging.getLogger(__name__)
|
| 58 |
-
|
| 59 |
-
class Deformes4DEngine:
|
| 60 |
-
"""
|
| 61 |
-
Implements the Camera (Ψ) and Distiller (Δ) of the ADUC-SDR architecture.
|
| 62 |
-
Orchestrates the generation, latent post-production, and final rendering of video fragments.
|
| 63 |
-
"""
|
| 64 |
-
def __init__(self, workspace_dir="deformes_workspace"):
|
| 65 |
-
self.workspace_dir = workspace_dir
|
| 66 |
-
self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
|
| 67 |
-
logger.info("Deformes4D Specialist (ADUC-SDR Executor) initialized.")
|
| 68 |
-
os.makedirs(self.workspace_dir, exist_ok=True)
|
| 69 |
-
|
| 70 |
-
# --- HELPER METHODS ---
|
| 71 |
-
|
    def save_video_from_tensor(self, video_tensor: torch.Tensor, path: str, fps: int = 24):
        """Saves a pixel-space tensor as an MP4 video file."""
        if video_tensor is None or video_tensor.ndim != 5 or video_tensor.shape[2] == 0:
            return
        video_tensor = video_tensor.squeeze(0).permute(1, 2, 3, 0)  # (B,C,F,H,W) -> (F,H,W,C)
        video_tensor = (video_tensor.clamp(-1, 1) + 1) / 2.0  # [-1, 1] -> [0, 1]
        video_np = (video_tensor.detach().cpu().float().numpy() * 255).astype(np.uint8)
        with imageio.get_writer(path, fps=fps, codec='libx264', quality=8, output_params=['-pix_fmt', 'yuv420p']) as writer:
            for frame in video_np:
                writer.append_data(frame)

    def read_video_to_tensor(self, video_path: str) -> torch.Tensor:
        """Reads a video file and converts it into a pixel-space tensor in [-1, 1]."""
        with imageio.get_reader(video_path, 'ffmpeg') as reader:
            frames = [frame for frame in reader]

        frames_np = np.stack(frames, axis=0).astype(np.float32) / 255.0
        tensor = torch.from_numpy(frames_np).permute(3, 0, 1, 2)  # (F, H, W, C) -> (C, F, H, W)
        tensor = tensor.unsqueeze(0)  # (B, C, F, H, W)
        tensor = (tensor * 2.0) - 1.0  # normalize to [-1, 1]
        return tensor.to(self.device)

    def _preprocess_image_for_latent_conversion(self, image: Image.Image, target_resolution: tuple) -> Image.Image:
        """Resizes and center-crops an image to the target resolution for VAE encoding."""
        if image.size != target_resolution:
            return ImageOps.fit(image, target_resolution, Image.Resampling.LANCZOS)
        return image

    def pil_to_latent(self, pil_image: Image.Image) -> torch.Tensor:
        """Converts a PIL Image to a latent tensor by calling the VaeManager."""
        image_np = np.array(pil_image).astype(np.float32) / 255.0
        tensor = torch.from_numpy(image_np).permute(2, 0, 1).unsqueeze(0).unsqueeze(2)  # (B, C, F=1, H, W)
        tensor = (tensor * 2.0) - 1.0
        return vae_manager_singleton.encode(tensor)

    # --- CORE ADUC-SDR LOGIC ---

    def generate_original_movie(self, keyframes: list, global_prompt: str, storyboard: list,
                                seconds_per_fragment: float, trim_percent: int,
                                handler_strength: float, destination_convergence_strength: float,
                                video_resolution: int, use_continuity_director: bool,
                                guidance_scale: float, stg_scale: float, num_inference_steps: int,
                                progress: gr.Progress = gr.Progress()):
        FPS = 24
        FRAMES_PER_LATENT_CHUNK = 8
        LATENT_PROCESSING_CHUNK_SIZE = 4

        run_timestamp = int(time.time())
        temp_latent_dir = os.path.join(self.workspace_dir, f"temp_latents_{run_timestamp}")
        temp_video_clips_dir = os.path.join(self.workspace_dir, f"temp_clips_{run_timestamp}")
        os.makedirs(temp_latent_dir, exist_ok=True)
        os.makedirs(temp_video_clips_dir, exist_ok=True)

        # Quantize the fragment length and the trim window to whole latent chunks.
        total_frames_brutos = self._quantize_to_multiple(int(seconds_per_fragment * FPS), FRAMES_PER_LATENT_CHUNK)
        frames_a_podar = self._quantize_to_multiple(int(total_frames_brutos * (trim_percent / 100)), FRAMES_PER_LATENT_CHUNK)
        latents_a_podar = frames_a_podar // FRAMES_PER_LATENT_CHUNK

        # if frames_a_podar % 2 == 0:
        #     frames_a_podar = frames_a_podar - 1

        total_latent_frames = total_frames_brutos // FRAMES_PER_LATENT_CHUNK

        DEJAVU_FRAME_TARGET = frames_a_podar - 1 if frames_a_podar > 0 else 0
        DESTINATION_FRAME_TARGET = total_frames_brutos - 1

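        # Worked example (illustrative values): with seconds_per_fragment=6.0 and
        # trim_percent=50, total_frames_brutos = round(144 / 8) * 8 = 144,
        # frames_a_podar = round(72 / 8) * 8 = 72, latents_a_podar = 9,
        # DEJAVU_FRAME_TARGET = 71 and DESTINATION_FRAME_TARGET = 143.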
        base_ltx_params = {"guidance_scale": guidance_scale, "stg_scale": stg_scale,
                           "num_inference_steps": num_inference_steps,
                           "rescaling_scale": 0.15, "image_cond_noise_scale": 0.00}
        keyframe_paths = [item[0] if isinstance(item, tuple) else item for item in keyframes]
        story_history = ""
        target_resolution_tuple = (video_resolution, video_resolution)
        eco_latent_for_next_loop, dejavu_latent_for_next_loop = None, None
        latent_fragment_paths = []

        if len(keyframe_paths) < 2:
            raise gr.Error(f"Generation requires at least 2 keyframes. You provided {len(keyframe_paths)}.")
        num_transitions_to_generate = len(keyframe_paths) - 1

logger.info("--- STARTING STAGE 1: Latent Fragment Generation ---")
|
| 147 |
-
for i in range(num_transitions_to_generate):
|
| 148 |
-
fragment_index = i + 1
|
| 149 |
-
progress(i / num_transitions_to_generate, desc=f"Generating Latent {fragment_index}/{num_transitions_to_generate}")
|
| 150 |
-
past_keyframe_path = keyframe_paths[i - 1] if i > 0 else keyframe_paths[i]
|
| 151 |
-
start_keyframe_path = keyframe_paths[i]
|
| 152 |
-
destination_keyframe_path = keyframe_paths[i + 1]
|
| 153 |
-
future_story_prompt = storyboard[i + 1] if (i + 1) < len(storyboard) else "The final scene."
|
| 154 |
-
logger.info(f"Calling deformes2D_thinker to generate cinematic decision for fragment {fragment_index}...")
|
| 155 |
-
decision = deformes2d_thinker_singleton.get_cinematic_decision(global_prompt, story_history, past_keyframe_path, start_keyframe_path, destination_keyframe_path, storyboard[i - 1] if i > 0 else "The beginning.", storyboard[i], future_story_prompt)
|
| 156 |
-
transition_type, motion_prompt = decision["transition_type"], decision["motion_prompt"]
|
| 157 |
-
story_history += f"\n- Act {fragment_index}: {motion_prompt}"
|
| 158 |
-
|
            conditioning_items = []
            if eco_latent_for_next_loop is None:
                # First fragment: anchor only on the starting keyframe.
                img_start = self._preprocess_image_for_latent_conversion(Image.open(start_keyframe_path).convert("RGB"), target_resolution_tuple)
                conditioning_items.append(LatentConditioningItem(self.pil_to_latent(img_start), 0, 1.0))
            else:
                # Subsequent fragments: anchor on the "eco" (tail) of the previous
                # fragment at frame 0 and on its "dejavu" frame at the trim boundary.
                conditioning_items.append(LatentConditioningItem(eco_latent_for_next_loop, 0, 1.0))
                conditioning_items.append(LatentConditioningItem(dejavu_latent_for_next_loop, DEJAVU_FRAME_TARGET, handler_strength))

            if transition_type == "cutx":  # NOTE: the thinker emits "cut", so this branch appears intentionally disabled.
                logger.info("Cinematic Director chose a 'cut'. Creating FFmpeg transition bridge...")
                bridge_duration_seconds = FRAMES_PER_LATENT_CHUNK / FPS
                bridge_video_path = video_encode_tool_singleton.create_transition_bridge(
                    start_image_path=start_keyframe_path, end_image_path=destination_keyframe_path,
                    duration=bridge_duration_seconds, fps=FPS, target_resolution=target_resolution_tuple,
                    workspace_dir=self.workspace_dir
                )
                bridge_pixel_tensor = self.read_video_to_tensor(bridge_video_path)
                bridge_latent_tensor = vae_manager_singleton.encode(bridge_pixel_tensor)
                final_fade_latent = bridge_latent_tensor[:, :, -2:, :, :]
                conditioning_items.append(LatentConditioningItem(final_fade_latent, total_latent_frames - 16, 0.95))
                # img_dest = self._preprocess_image_for_latent_conversion(Image.open(destination_keyframe_path).convert("RGB"), target_resolution_tuple)
                # conditioning_items.append(LatentConditioningItem(self.pil_to_latent(img_dest), DESTINATION_FRAME_TARGET, destination_convergence_strength * 0.5))
                del bridge_pixel_tensor, bridge_latent_tensor, final_fade_latent
                if os.path.exists(bridge_video_path):
                    os.remove(bridge_video_path)
            else:
                img_dest = self._preprocess_image_for_latent_conversion(Image.open(destination_keyframe_path).convert("RGB"), target_resolution_tuple)
                conditioning_items.append(LatentConditioningItem(self.pil_to_latent(img_dest), DESTINATION_FRAME_TARGET, destination_convergence_strength))

            current_ltx_params = {**base_ltx_params, "motion_prompt": motion_prompt}
            logger.info(f"Calling LTX to generate video latents for fragment {fragment_index} ({total_frames_brutos} frames)...")
            latents_brutos, _ = self._generate_latent_tensor_internal(conditioning_items, current_ltx_params, target_resolution_tuple, total_frames_brutos)
            num_latent_frames = latents_brutos.shape[2]
            logger.info(f"LTX responded with a latent tensor of shape {latents_brutos.shape}, representing ~{num_latent_frames * 8 + 1} video frames at {FPS} FPS.")

            # Carve the handoff context out of the raw fragment: the "eco" is the
            # first two latent frames of the trimmed tail, the "dejavu" is its last one.
            last_trim = latents_brutos[:, :, -(latents_a_podar + 1):, :, :].clone()
            eco_latent_for_next_loop = last_trim[:, :, :2, :, :].clone()
            dejavu_latent_for_next_loop = last_trim[:, :, -1:, :, :].clone()
            # Keep everything except the trimmed tail and the first (conditioning) latent frame.
            # Caveat: this slice assumes latents_a_podar > 1; with latents_a_podar == 1 the
            # expression :-(latents_a_podar - 1) degenerates to :0 and yields an empty tensor.
            latents_video = latents_brutos[:, :, :-(latents_a_podar - 1), :, :].clone()
            latents_video = latents_video[:, :, 1:, :, :]
            del last_trim, latents_brutos; gc.collect(); torch.cuda.empty_cache()

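            # Illustrative slice arithmetic (assuming an 18-frame latent tensor and
            # latents_a_podar == 9): last_trim covers latent frames 8..17, eco is
            # frames 8..9, dejavu is frame 17, and latents_video keeps frames 1..9.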
            if transition_type == "cutx":
                # A hard cut carries no memory into the next fragment.
                eco_latent_for_next_loop, dejavu_latent_for_next_loop = None, None

            cpu_latent = latents_video.cpu()
            latent_path = os.path.join(temp_latent_dir, f"latent_fragment_{i:04d}.pt")
            torch.save(cpu_latent, latent_path)
            latent_fragment_paths.append(latent_path)
            del latents_video, cpu_latent; gc.collect()
        del eco_latent_for_next_loop, dejavu_latent_for_next_loop; gc.collect(); torch.cuda.empty_cache()

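        # Stage 2 below re-loads the saved fragments in small batches to bound
        # VRAM use during decoding. Example (illustrative): 10 fragments with
        # LATENT_PROCESSING_CHUNK_SIZE = 4 give -(-10 // 4) = 3 batches,
        # holding 4, 4 and 2 fragments respectively.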
        logger.info(f"--- STARTING STAGE 2: Processing {len(latent_fragment_paths)} latents in chunks of {LATENT_PROCESSING_CHUNK_SIZE} ---")
        final_video_clip_paths = []
        num_chunks = -(-len(latent_fragment_paths) // LATENT_PROCESSING_CHUNK_SIZE)  # ceiling division
        for i in range(num_chunks):
            chunk_start_index = i * LATENT_PROCESSING_CHUNK_SIZE
            chunk_end_index = chunk_start_index + LATENT_PROCESSING_CHUNK_SIZE
            chunk_paths = latent_fragment_paths[chunk_start_index:chunk_end_index]
            progress(i / num_chunks, desc=f"Processing & Decoding Batch {i+1}/{num_chunks}")
            tensors_in_chunk = [torch.load(p, map_location=self.device) for p in chunk_paths]
            # Trim the last latent frame of all but the final fragment, which would
            # otherwise duplicate the seam frame at fragment boundaries.
            tensors_para_concatenar = [frag[:, :, :-1, :, :] if j < len(tensors_in_chunk) - 1 else frag for j, frag in enumerate(tensors_in_chunk)]
            sub_group_latent = torch.cat(tensors_para_concatenar, dim=2)
            del tensors_in_chunk, tensors_para_concatenar; gc.collect(); torch.cuda.empty_cache()
            logger.info(f"Batch {i+1} concatenated. Latent shape: {sub_group_latent.shape}")
            base_name = f"clip_{i:04d}_{run_timestamp}"
            current_clip_path = os.path.join(temp_video_clips_dir, f"{base_name}.mp4")
            pixel_tensor = vae_manager_singleton.decode(sub_group_latent)
            self.save_video_from_tensor(pixel_tensor, current_clip_path, fps=FPS)
            del pixel_tensor, sub_group_latent; gc.collect(); torch.cuda.empty_cache()
            final_video_clip_paths.append(current_clip_path)

        progress(0.98, desc="Final assembly of clips...")
        final_video_path = os.path.join(self.workspace_dir, f"original_movie_{run_timestamp}.mp4")
        video_encode_tool_singleton.concatenate_videos(video_paths=final_video_clip_paths, output_path=final_video_path, workspace_dir=self.workspace_dir)
        logger.info("Cleaning up temporary clip files...")
        try:
            shutil.rmtree(temp_video_clips_dir)
        except OSError as e:
            logger.warning(f"Could not remove temporary clip directory: {e}")
        logger.info(f"Process complete! Original video saved to: {final_video_path}")
        return {"final_path": final_video_path, "latent_paths": latent_fragment_paths}

    def upscale_latents_and_create_video(self, latent_paths: list, chunk_size: int, progress: gr.Progress):
        if not latent_paths:
            raise gr.Error("Cannot perform upscaling: no latent paths were provided.")
        logger.info("--- STARTING POST-PRODUCTION: Latent Upscaling ---")
        run_timestamp = int(time.time())
        temp_upscaled_clips_dir = os.path.join(self.workspace_dir, f"temp_upscaled_clips_{run_timestamp}")
        os.makedirs(temp_upscaled_clips_dir, exist_ok=True)
        final_upscaled_clip_paths = []
        num_chunks = -(-len(latent_paths) // chunk_size)  # ceiling division
        for i in range(num_chunks):
            chunk_start_index = i * chunk_size
            chunk_end_index = chunk_start_index + chunk_size
            chunk_paths = latent_paths[chunk_start_index:chunk_end_index]
            progress(i / num_chunks, desc=f"Upscaling & Decoding Batch {i+1}/{num_chunks}")
            tensors_in_chunk = [torch.load(p, map_location=self.device) for p in chunk_paths]
            tensors_para_concatenar = [frag[:, :, :-1, :, :] if j < len(tensors_in_chunk) - 1 else frag for j, frag in enumerate(tensors_in_chunk)]
            sub_group_latent = torch.cat(tensors_para_concatenar, dim=2)
            del tensors_in_chunk, tensors_para_concatenar; gc.collect(); torch.cuda.empty_cache()
            logger.info(f"Batch {i+1} loaded. Original latent shape: {sub_group_latent.shape}")
            upscaled_latent_chunk = latent_enhancer_specialist_singleton.upscale(sub_group_latent)
            del sub_group_latent; gc.collect(); torch.cuda.empty_cache()
            logger.info(f"Batch {i+1} upscaled. New latent shape: {upscaled_latent_chunk.shape}")
            pixel_tensor = vae_manager_singleton.decode(upscaled_latent_chunk)
            del upscaled_latent_chunk; gc.collect(); torch.cuda.empty_cache()
            base_name = f"upscaled_clip_{i:04d}_{run_timestamp}"
            current_clip_path = os.path.join(temp_upscaled_clips_dir, f"{base_name}.mp4")
            self.save_video_from_tensor(pixel_tensor, current_clip_path, fps=24)
            final_upscaled_clip_paths.append(current_clip_path)
            del pixel_tensor; gc.collect(); torch.cuda.empty_cache()
            logger.info(f"Saved upscaled clip: {Path(current_clip_path).name}")
        progress(0.98, desc="Assembling upscaled clips...")
        final_video_path = os.path.join(self.workspace_dir, f"upscaled_movie_{run_timestamp}.mp4")
        video_encode_tool_singleton.concatenate_videos(video_paths=final_upscaled_clip_paths, output_path=final_video_path, workspace_dir=self.workspace_dir)
        logger.info("Cleaning up temporary upscaled clip files...")
        try:
            shutil.rmtree(temp_upscaled_clips_dir)
        except OSError as e:
            logger.warning(f"Could not remove temporary upscaled clip directory: {e}")
        logger.info(f"Latent upscaling complete! Final video at: {final_video_path}")
        yield {"final_path": final_video_path}

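    # Note: the post-production methods below yield a single result dict instead of
    # returning it; a plausible reading (not confirmed by this module alone) is that
    # the Gradio layer consumes them as generators so progress updates and the final
    # payload can share one code path.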
    def master_video_hd(self, source_video_path: str, model_version: str, steps: int, prompt: str, progress: gr.Progress):
        logger.info(f"--- STARTING POST-PRODUCTION: HD Mastering with SeedVR {model_version} ---")
        progress(0.1, desc=f"Preparing for HD Mastering with SeedVR {model_version}...")
        run_timestamp = int(time.time())
        output_path = os.path.join(self.workspace_dir, f"hd_mastered_movie_{model_version}_{run_timestamp}.mp4")
        try:
            final_path = seedvr_manager_singleton.process_video(
                input_video_path=source_video_path,
                output_video_path=output_path,
                prompt=prompt,
                model_version=model_version,
                steps=steps,
                progress=progress
            )
            logger.info(f"HD Mastering complete! Final video at: {final_path}")
            yield {"final_path": final_path}
        except Exception as e:
            logger.error(f"HD Mastering failed: {e}", exc_info=True)
            raise gr.Error(f"HD Mastering failed. Details: {e}")

    def generate_audio_for_final_video(self, source_video_path: str, audio_prompt: str, progress: gr.Progress):
        logger.info("--- STARTING POST-PRODUCTION: Audio Generation ---")
        progress(0.1, desc="Preparing for audio generation...")
        run_timestamp = int(time.time())
        source_name = Path(source_video_path).stem
        output_path = os.path.join(self.workspace_dir, f"{source_name}_with_audio_{run_timestamp}.mp4")
        try:
            # Probe the source video's duration so the generated track matches it.
            result = subprocess.run(
                ["ffprobe", "-v", "error", "-show_entries", "format=duration", "-of", "default=noprint_wrappers=1:nokey=1", source_video_path],
                capture_output=True, text=True, check=True)
            duration = float(result.stdout.strip())
            logger.info(f"Source video duration: {duration:.2f} seconds.")
            progress(0.5, desc="Generating audio track...")
            final_path = mmaudio_manager_singleton.generate_audio_for_video(
                video_path=source_video_path,
                prompt=audio_prompt,
                duration_seconds=duration,
                output_path_override=output_path
            )
            logger.info(f"Audio generation complete! Final video with audio at: {final_path}")
            progress(1.0, desc="Audio generation complete!")
            yield {"final_path": final_path}
        except Exception as e:
            logger.error(f"Audio generation failed: {e}", exc_info=True)
            raise gr.Error(f"Audio generation failed. Details: {e}")

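    # For reference, the duration probe above is equivalent to the CLI call:
    #   ffprobe -v error -show_entries format=duration \
    #           -of default=noprint_wrappers=1:nokey=1 input.mp4
    # which prints just the container duration in seconds on stdout.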
    def _generate_latent_tensor_internal(self, conditioning_items, ltx_params, target_resolution, total_frames_to_generate):
        """Internal helper to call the LTX manager."""
        final_ltx_params = {**ltx_params,
                            'width': target_resolution[0], 'height': target_resolution[1],
                            'video_total_frames': total_frames_to_generate, 'video_fps': 24,
                            'current_fragment_index': int(time.time()),
                            'conditioning_items_data': conditioning_items}
        return ltx_manager_singleton.generate_latent_fragment(**final_ltx_params)

    def _quantize_to_multiple(self, n, m):
        """Rounds n to the nearest multiple of m (never returning 0 for positive n)."""
        if m == 0:
            return n
        quantized = int(round(n / m) * m)
        return m if n > 0 and quantized == 0 else quantized
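
# A minimal usage sketch (illustrative only, not part of the original module; the
# file names, prompts, and parameter values below are placeholders):
#
#     engine = Deformes4DEngine(workspace_dir="deformes_workspace")
#     result = engine.generate_original_movie(
#         keyframes=["k0.png", "k1.png"], global_prompt="A quiet harbor at dawn",
#         storyboard=["Boats at rest", "The first ferry departs"],
#         seconds_per_fragment=6.0, trim_percent=50,
#         handler_strength=0.8, destination_convergence_strength=0.9,
#         video_resolution=512, use_continuity_director=True,
#         guidance_scale=3.0, stg_scale=0.1, num_inference_steps=25)
#     print(result["final_path"])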
engineers/deformes7D.py
DELETED
|
@@ -1,316 +0,0 @@
# engineers/deformes7D.py
#
# AducSdr: An open, functional implementation of the ADUC-SDR architecture
# Copyright (C) August 4, 2025 Carlos Rodrigues dos Santos
#
# Contact:
# Carlos Rodrigues dos Santos
# carlex22@gmail.com
# Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
#
# Related Repositories and Projects:
# GitHub: https://github.com/carlex22/Aduc-sdr
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
# PENDING PATENT NOTICE: Please see NOTICE.md.
#
# Version 3.2.1

import os
import time
import imageio
import numpy as np
import torch
import logging
from PIL import Image, ImageOps
import gradio as gr
import subprocess
import gc
import yaml
import shutil
from pathlib import Path
from typing import List, Tuple, Dict, Generator, Any

from aduc_types import LatentConditioningItem
from managers.ltx_manager import ltx_manager_singleton
from managers.latent_enhancer_manager import latent_enhancer_specialist_singleton
from managers.vae_manager import vae_manager_singleton
from engineers.deformes2D_thinker import deformes2d_thinker_singleton
from engineers.deformes3D_thinker import deformes3d_thinker_singleton
from managers.seedvr_manager import seedvr_manager_singleton
from managers.mmaudio_manager import mmaudio_manager_singleton
from tools.video_encode_tool import video_encode_tool_singleton

logger = logging.getLogger(__name__)

class Deformes7DEngine:
    """
    Unified 3D/4D engine for continuous, interleaved generation of keyframes and video fragments.
    """
    def __init__(self, workspace_dir="deformes_workspace"):
        self.workspace_dir = workspace_dir
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        logger.info("Deformes7D Unified Engine initialized.")
        os.makedirs(self.workspace_dir, exist_ok=True)

    # --- HELPER METHODS ---
    def save_video_from_tensor(self, video_tensor: torch.Tensor, path: str, fps: int = 24):
        """Saves a pixel-space tensor as an MP4 video file."""
        if video_tensor is None or video_tensor.ndim != 5 or video_tensor.shape[2] == 0:
            return
        video_tensor = video_tensor.squeeze(0).permute(1, 2, 3, 0)  # (B,C,F,H,W) -> (F,H,W,C)
        video_tensor = (video_tensor.clamp(-1, 1) + 1) / 2.0
        video_np = (video_tensor.detach().cpu().float().numpy() * 255).astype(np.uint8)
        with imageio.get_writer(path, fps=fps, codec='libx264', quality=8, output_params=['-pix_fmt', 'yuv420p']) as writer:
            for frame in video_np:
                writer.append_data(frame)

    def read_video_to_tensor(self, video_path: str) -> torch.Tensor:
        """Reads a video file and converts it into a pixel-space tensor in [-1, 1]."""
        with imageio.get_reader(video_path, 'ffmpeg') as reader:
            frames = [frame for frame in reader]
        frames_np = np.stack(frames, axis=0).astype(np.float32) / 255.0
        tensor = torch.from_numpy(frames_np).permute(3, 0, 1, 2)  # (F,H,W,C) -> (C,F,H,W)
        tensor = tensor.unsqueeze(0)  # (B,C,F,H,W)
        tensor = (tensor * 2.0) - 1.0
        return tensor.to(self.device)

    def _preprocess_image(self, image: Image.Image, target_resolution: tuple) -> Image.Image:
        if image.size != target_resolution:
            return ImageOps.fit(image, target_resolution, Image.Resampling.LANCZOS)
        return image

    def _pil_to_pixel_tensor(self, pil_image: Image.Image) -> torch.Tensor:
        image_np = np.array(pil_image).astype(np.float32) / 255.0
        tensor = torch.from_numpy(image_np).permute(2, 0, 1).unsqueeze(0).unsqueeze(2)  # (B,C,F=1,H,W)
        return (tensor * 2.0) - 1.0

    def _save_image_from_tensor(self, pixel_tensor: torch.Tensor, path: str):
        tensor_chw = pixel_tensor.squeeze(0).squeeze(1)  # (B,C,1,H,W) -> (C,H,W)
        tensor_hwc = tensor_chw.permute(1, 2, 0)
        tensor_hwc = (tensor_hwc.clamp(-1, 1) + 1) / 2.0
        image_np = (tensor_hwc.cpu().float().numpy() * 255).astype(np.uint8)
        Image.fromarray(image_np).save(path)

    def _quantize_to_multiple(self, n, m):
        """Rounds n to the nearest multiple of m (never returning 0 for positive n)."""
        if m == 0:
            return n
        quantized = int(round(n / m) * m)
        return m if n > 0 and quantized == 0 else quantized

    # --- CORE GENERATION LOGIC ---
    def _generate_next_causal_keyframe(self, base_keyframe_path: str, all_ref_paths: list,
                                       prompt: str, resolution_tuple: tuple) -> Tuple[str, torch.Tensor]:
        # Condition on the base keyframe plus up to three other reference images,
        # with decreasing (then negative) weights: 1.0, -0.2, -0.4, -0.6.
        ltx_context_paths = [base_keyframe_path] + [p for p in all_ref_paths if p != base_keyframe_path][:3]
        ltx_conditioning_items = []
        weight = 1.0
        for path in ltx_context_paths:
            img_pil = Image.open(path).convert("RGB")
            img_processed = self._preprocess_image(img_pil, resolution_tuple)
            pixel_tensor = self._pil_to_pixel_tensor(img_processed)
            latent_tensor = vae_manager_singleton.encode(pixel_tensor)
            ltx_conditioning_items.append(LatentConditioningItem(latent_tensor, 0, weight))
            if weight == 1.0:
                weight = -0.2
            else:
                weight -= 0.2
        ltx_base_params = {"guidance_scale": 3.0, "stg_scale": 0.1, "num_inference_steps": 25}
        generated_latents, _ = ltx_manager_singleton.generate_latent_fragment(
            height=resolution_tuple[0], width=resolution_tuple[1],
            conditioning_items_data=ltx_conditioning_items, motion_prompt=prompt,
            video_total_frames=48, video_fps=24, **ltx_base_params
        )
        # Keep only the last latent frame as the new keyframe, then upscale and decode it.
        final_latent = generated_latents[:, :, -1:, :, :]
        upscaled_latent = latent_enhancer_specialist_singleton.upscale(final_latent)
        pixel_tensor_out = vae_manager_singleton.decode(upscaled_latent)
        timestamp = int(time.time() * 1000)
        output_path = os.path.join(self.workspace_dir, f"keyframe_{timestamp}.png")
        self._save_image_from_tensor(pixel_tensor_out, output_path)
        return output_path, final_latent

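    # Illustrative call (hypothetical paths and prompt): given a keyframe on disk
    # and a pool of reference images, the 3D step above yields the next keyframe
    # image plus its single-frame latent:
    #
    #     ky_path, ky_latent = engine._generate_next_causal_keyframe(
    #         "keyframes/k0.png", ["refs/a.png", "refs/b.png"],
    #         "The camera pushes toward the lighthouse door.", (512, 512))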
    def generate_full_movie_interleaved(self, initial_ref_paths: list, storyboard: list, global_prompt: str,
                                        video_resolution: int, seconds_per_fragment: float, trim_percent: int,
                                        handler_strength: float, dest_strength: float, ltx_params: dict,
                                        progress=gr.Progress()):
        logger.info("--- DEFORMES 7D: INITIATING INTERLEAVED RENDERING PIPELINE ---")
        run_timestamp = int(time.time())
        temp_video_clips_dir = os.path.join(self.workspace_dir, f"temp_clips_{run_timestamp}")
        os.makedirs(temp_video_clips_dir, exist_ok=True)
        FPS = 24
        FRAMES_PER_LATENT_CHUNK = 8
        resolution_tuple = (video_resolution, video_resolution)
        generated_keyframe_paths, generated_keyframe_latents, generated_video_fragment_paths = [], [], []

        # Bootstrap: encode the user-supplied K0 and synthesize K1 before the main loop.
        progress(0, desc="Bootstrap: Processing K0...")
        k0_path = initial_ref_paths[0]
        k0_pil = Image.open(k0_path).convert("RGB")
        k0_processed_pil = self._preprocess_image(k0_pil, resolution_tuple)
        k0_pixel_tensor = self._pil_to_pixel_tensor(k0_processed_pil)
        k0_latent = vae_manager_singleton.encode(k0_pixel_tensor)
        generated_keyframe_paths.append(k0_path)
        generated_keyframe_latents.append(k0_latent)
        progress(0.01, desc="Bootstrap: Generating K1...")
        prompt_k1 = deformes2d_thinker_singleton.get_anticipatory_keyframe_prompt(
            global_prompt, "Initial scene.", storyboard[0], storyboard[1], k0_path, initial_ref_paths
        )
        k1_path, k1_latent = self._generate_next_causal_keyframe(k0_path, initial_ref_paths, prompt_k1, resolution_tuple)
        generated_keyframe_paths.append(k1_path)
        generated_keyframe_latents.append(k1_latent)
        story_history = ""
        eco_latent_for_next_loop, dejavu_latent_for_next_loop = None, None
        num_transitions = len(storyboard) - 1
        base_4d_ltx_params = {"rescaling_scale": 0.15, "image_cond_noise_scale": 0.00, **ltx_params}

        for i in range(1, num_transitions):
            act_progress = i / num_transitions
            progress(act_progress, desc=f"Processing Act {i+1}/{num_transitions} (Keyframe Gen)...")
            logger.info(f"--> Step 3D: Generating Keyframe K{i+1}")
            kx_path = generated_keyframe_paths[i]
            prompt_ky = deformes2d_thinker_singleton.get_anticipatory_keyframe_prompt(
                global_prompt, story_history, storyboard[i], storyboard[i+1], kx_path, initial_ref_paths
            )
            ky_path, ky_latent = self._generate_next_causal_keyframe(kx_path, initial_ref_paths, prompt_ky, resolution_tuple)
            generated_keyframe_paths.append(ky_path)
            generated_keyframe_latents.append(ky_latent)
            progress(act_progress, desc=f"Processing Act {i+1}/{num_transitions} (Video Gen)...")
            logger.info(f"--> Step 4D: Generating Video Fragment V{i-1}")
            kb_path, kx_path, ky_path = generated_keyframe_paths[i-1], generated_keyframe_paths[i], generated_keyframe_paths[i+1]
            motion_prompt = deformes3d_thinker_singleton.get_enhanced_motion_prompt(
                global_prompt, story_history, kb_path, kx_path, ky_path,
                storyboard[i-1], storyboard[i], storyboard[i+1]
            )
            transition_type = "continuous"
            story_history += f"\n- Act {i}: {motion_prompt}"
            total_frames_brutos = self._quantize_to_multiple(int(seconds_per_fragment * FPS), FRAMES_PER_LATENT_CHUNK)
            frames_a_podar = self._quantize_to_multiple(int(total_frames_brutos * (trim_percent / 100)), FRAMES_PER_LATENT_CHUNK)
            latents_a_podar = frames_a_podar // FRAMES_PER_LATENT_CHUNK
            DEJAVU_FRAME_TARGET = frames_a_podar - 1 if frames_a_podar > 0 else 0
            DESTINATION_FRAME_TARGET = total_frames_brutos - 1
            conditioning_items = []
            if eco_latent_for_next_loop is None:
                conditioning_items.append(LatentConditioningItem(generated_keyframe_latents[i], 0, 1.0))
            else:
                conditioning_items.append(LatentConditioningItem(eco_latent_for_next_loop, 0, 1.0))
                conditioning_items.append(LatentConditioningItem(dejavu_latent_for_next_loop, DEJAVU_FRAME_TARGET, handler_strength))
            if transition_type != "cut":
                conditioning_items.append(LatentConditioningItem(ky_latent, DESTINATION_FRAME_TARGET, dest_strength))
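            # A hedged reading of the conditioning above: each fragment is pinned at
            # three points — the previous fragment's "eco" at latent frame 0, its
            # "dejavu" frame at the trim boundary (DEJAVU_FRAME_TARGET), and the
            # upcoming keyframe K(i+1) at the final frame (DESTINATION_FRAME_TARGET).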
            fragment_latents_brutos, _ = ltx_manager_singleton.generate_latent_fragment(
                height=video_resolution, width=video_resolution,
                conditioning_items_data=conditioning_items, motion_prompt=motion_prompt,
                video_total_frames=total_frames_brutos, video_fps=FPS, **base_4d_ltx_params
            )
            # Same tail-trim handoff as the 4D engine (with the same caveat: the
            # slice below assumes latents_a_podar > 1).
            last_trim = fragment_latents_brutos[:, :, -(latents_a_podar + 1):, :, :].clone()
            eco_latent_for_next_loop = last_trim[:, :, :2, :, :].clone()
            dejavu_latent_for_next_loop = last_trim[:, :, -1:, :, :].clone()
            final_fragment_latents = fragment_latents_brutos[:, :, :-(latents_a_podar - 1), :, :].clone()
            final_fragment_latents = final_fragment_latents[:, :, 1:, :, :]
            pixel_tensor = vae_manager_singleton.decode(final_fragment_latents)
            fragment_path = os.path.join(temp_video_clips_dir, f"fragment_{i-1}.mp4")
            self.save_video_from_tensor(pixel_tensor, fragment_path, fps=FPS)
            generated_video_fragment_paths.append(fragment_path)
            logger.info(f"Video Fragment V{i-1} saved to {fragment_path}")

        logger.info("--- Final Assembly of Video Fragments ---")
        final_video_path = os.path.join(self.workspace_dir, f"movie_7D_{run_timestamp}.mp4")
        video_encode_tool_singleton.concatenate_videos(generated_video_fragment_paths, final_video_path, self.workspace_dir)
        shutil.rmtree(temp_video_clips_dir)
        logger.info(f"Full movie generated at: {final_video_path}")
        # NOTE: unlike the 4D engine, per-fragment latents are not persisted here yet,
        # so downstream latent upscaling has nothing to consume.
        return {"final_path": final_video_path, "all_keyframes": generated_keyframe_paths, "latent_paths": "NOT_IMPLEMENTED_YET"}

    # --- POST-PRODUCTION METHODS ---
    def task_run_latent_upscaling(self, latent_paths: list, chunk_size: int, progress: gr.Progress) -> Generator[Dict[str, Any], None, None]:
        if not latent_paths:
            raise gr.Error("Cannot perform upscaling: no latent paths were provided from the main generation.")
        logger.info("--- POST-PRODUCTION: Latent Upscaling ---")
        run_timestamp = int(time.time())
        temp_upscaled_clips_dir = os.path.join(self.workspace_dir, f"temp_upscaled_clips_{run_timestamp}")
        os.makedirs(temp_upscaled_clips_dir, exist_ok=True)
        final_upscaled_clip_paths = []
        num_chunks = -(-len(latent_paths) // chunk_size)  # ceiling division
        for i in range(num_chunks):
            chunk_start_index = i * chunk_size
            chunk_end_index = chunk_start_index + chunk_size
            chunk_paths = latent_paths[chunk_start_index:chunk_end_index]
            progress(i / num_chunks, desc=f"Upscaling & Decoding Batch {i+1}/{num_chunks}")
            tensors_in_chunk = [torch.load(p, map_location=self.device) for p in chunk_paths]
            tensors_para_concatenar = [frag[:, :, :-1, :, :] if j < len(tensors_in_chunk) - 1 else frag for j, frag in enumerate(tensors_in_chunk)]
            sub_group_latent = torch.cat(tensors_para_concatenar, dim=2)
            del tensors_in_chunk, tensors_para_concatenar; gc.collect(); torch.cuda.empty_cache()
            upscaled_latent_chunk = latent_enhancer_specialist_singleton.upscale(sub_group_latent)
            del sub_group_latent; gc.collect(); torch.cuda.empty_cache()
            pixel_tensor = vae_manager_singleton.decode(upscaled_latent_chunk)
            del upscaled_latent_chunk; gc.collect(); torch.cuda.empty_cache()
            base_name = f"upscaled_clip_{i:04d}_{run_timestamp}"
            current_clip_path = os.path.join(temp_upscaled_clips_dir, f"{base_name}.mp4")
            self.save_video_from_tensor(pixel_tensor, current_clip_path, fps=24)
            final_upscaled_clip_paths.append(current_clip_path)
            del pixel_tensor; gc.collect(); torch.cuda.empty_cache()
        progress(0.98, desc="Assembling upscaled clips...")
        final_video_path = os.path.join(self.workspace_dir, f"upscaled_movie_{run_timestamp}.mp4")
        video_encode_tool_singleton.concatenate_videos(video_paths=final_upscaled_clip_paths, output_path=final_video_path, workspace_dir=self.workspace_dir)
        shutil.rmtree(temp_upscaled_clips_dir)
        logger.info(f"Latent upscaling complete! Final video at: {final_video_path}")
        yield {"final_path": final_video_path}

    def master_video_hd(self, source_video_path: str, model_version: str, steps: int, prompt: str, progress: gr.Progress):
        logger.info(f"--- POST-PRODUCTION: HD Mastering with SeedVR {model_version} ---")
        output_path = os.path.join(self.workspace_dir, f"{Path(source_video_path).stem}_hd.mp4")
        try:
            final_path = seedvr_manager_singleton.process_video(
                input_video_path=source_video_path, output_video_path=output_path,
                prompt=prompt, model_version=model_version, steps=steps, progress=progress
            )
            yield {"final_path": final_path}
        except Exception as e:
            logger.error(f"HD Mastering failed: {e}", exc_info=True)
            raise gr.Error(f"HD Mastering failed. Details: {e}")

    def generate_audio(self, source_video_path: str, audio_prompt: str, progress: gr.Progress):
        logger.info("--- POST-PRODUCTION: Audio Generation ---")
        output_path = os.path.join(self.workspace_dir, f"{Path(source_video_path).stem}_audio.mp4")
        try:
            # Probe the source video's duration so the generated track matches it.
            result = subprocess.run(
                ["ffprobe", "-v", "error", "-show_entries", "format=duration", "-of", "default=noprint_wrappers=1:nokey=1", source_video_path],
                capture_output=True, text=True, check=True)
            duration = float(result.stdout.strip())
            progress(0.5, desc="Generating audio track...")
            final_path = mmaudio_manager_singleton.generate_audio_for_video(
                video_path=source_video_path, prompt=audio_prompt,
                duration_seconds=duration, output_path_override=output_path
            )
            yield {"final_path": final_path}
        except Exception as e:
            logger.error(f"Audio generation failed: {e}", exc_info=True)
            raise gr.Error(f"Audio generation failed. Details: {e}")

# --- Singleton Instantiation ---
config_path = Path(__file__).resolve().parent.parent / "config.yaml"
try:
    with open(config_path, 'r') as f:
        config = yaml.safe_load(f)
    WORKSPACE_DIR = config['application']['workspace_dir']
    deformes7d_engine_singleton = Deformes7DEngine(workspace_dir=WORKSPACE_DIR)
except Exception as e:
    # Log as CRITICAL, since the application cannot function without this engine,
    # then re-raise to fail fast instead of surfacing a 'NoneType' error later.
    logger.critical(f"CRITICAL: Failed to initialize the Deformes7DEngine singleton from {config_path}: {e}", exc_info=True)
    raise
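
# For reference, a minimal config.yaml compatible with the loader above would be
# (the key names come from the lookup in the try block; the value is illustrative):
#
#     application:
#       workspace_dir: deformes_workspace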