Transcribe audio and video with speaker diarization.
Detect whether audio is AI-generated or real, based on a score.
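
The two features above can be illustrated with a minimal sketch. It is not this project's implementation: the `Segment` and `Turn` types, the function names, the `"UNKNOWN"` label, and the 0.5 threshold are all hypothetical. It assumes the transcriber returns timed text segments, the diarizer returns timed speaker turns, and a higher score means the audio is more likely AI-generated.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed span of audio (hypothetical type)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarized speaker turn (hypothetical type)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Return the length in seconds of the overlap between two intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text))
    return labeled


def label_from_score(score, threshold=0.5):
    """Map a score to a label; assumes higher means more likely AI-generated."""
    return "ai-generated" if score >= threshold else "real"


if __name__ == "__main__":
    segments = [Segment(0.0, 2.0, "hello"), Segment(2.0, 4.5, "hi there")]
    turns = [Turn(0.0, 2.1, "SPEAKER_00"), Turn(2.1, 5.0, "SPEAKER_01")]
    for speaker, text in assign_speakers(segments, turns):
        print(f"{speaker}: {text}")
    print(label_from_score(0.83))
```

Choosing the speaker by maximum overlap keeps the merge simple when segment and turn boundaries do not line up exactly. The threshold would need tuning against whatever score the detector actually produces.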