Transcript Overlay
Real-time speech transcript display component with smooth animations
The TranscriptOverlay component displays real-time speech transcripts as an animated overlay. It automatically listens to Pipecat Client events to show speech transcripts from either the local user or remote bot, with smooth fade-in and fade-out animations.
import { TranscriptOverlayComponent } from "@pipecat-ai/voice-ui-kit";

function Demo() {
  const [transcript, setTranscript] = React.useState([]);
  const [turnEnd, setTurnEnd] = React.useState(false);
  const [isSpeaking, setIsSpeaking] = React.useState(false);
  const [fadeInDuration, setFadeInDuration] = React.useState(300);
  const [fadeOutDuration, setFadeOutDuration] = React.useState(1000);
  // Holds the active word-streaming timer so it can be cancelled.
  const intervalRef = React.useRef(null);

  const phrases = [
    "Hello, welcome to our voice application!",
    "I can help you with various tasks today.",
    "What would you like to know about our services?",
    "Let me search for that information for you.",
    "I found some relevant details that might help.",
  ];

  // Stop streaming words if a timer is still running.
  const stopInterval = () => {
    if (intervalRef.current !== null) {
      clearInterval(intervalRef.current);
      intervalRef.current = null;
    }
  };

  // Cancel any running timer when the demo unmounts.
  React.useEffect(() => stopInterval, []);

  // Pick a random phrase and append one word to the transcript every 200ms.
  const startSpeech = () => {
    stopInterval();
    setIsSpeaking(true);
    setTurnEnd(false);
    setTranscript([]);
    const selectedPhrase = phrases[Math.floor(Math.random() * phrases.length)];
    const words = selectedPhrase.split(" ").filter((word) => word.trim().length > 0);
    let wordIndex = 0;
    intervalRef.current = setInterval(() => {
      if (wordIndex < words.length) {
        const word = words[wordIndex];
        wordIndex++;
        setTranscript((prev) => [...prev, word]);
      } else {
        stopInterval();
      }
    }, 200);
  };

  // End the turn; stop streaming so no words arrive during the fade-out.
  const endSpeech = () => {
    stopInterval();
    setIsSpeaking(false);
    setTurnEnd(true);
  };

  const clearTranscript = () => {
    stopInterval();
    setTranscript([]);
    setTurnEnd(false);
    setIsSpeaking(false);
  };

  return (
    <div className="space-y-4 w-full">
      <div className="flex gap-2 flex-wrap">
        <button
          onClick={startSpeech}
          disabled={isSpeaking}
          className="px-4 py-2 bg-blue-500 text-white rounded hover:bg-blue-600 disabled:opacity-50"
        >
          Simulate Speech
        </button>
        <button
          onClick={endSpeech}
          disabled={!isSpeaking}
          className="px-4 py-2 bg-orange-500 text-white rounded hover:bg-orange-600 disabled:opacity-50"
        >
          Simulate Turn End
        </button>
        <button
          onClick={clearTranscript}
          className="px-4 py-2 bg-gray-500 text-white rounded hover:bg-gray-600"
        >
          Clear
        </button>
      </div>
      <div className="flex gap-4 text-sm">
        <div>
          <label className="block text-xs font-medium mb-1">Fade In (ms)</label>
          <input
            type="number"
            value={fadeInDuration}
            onChange={(e) => setFadeInDuration(Number(e.target.value))}
            className="w-20 px-2 py-1 border rounded text-xs"
            min="100"
            max="2000"
            step="100"
          />
        </div>
        <div>
          <label className="block text-xs font-medium mb-1">Fade Out (ms)</label>
          <input
            type="number"
            value={fadeOutDuration}
            onChange={(e) => setFadeOutDuration(Number(e.target.value))}
            className="w-20 px-2 py-1 border rounded text-xs"
            min="100"
            max="3000"
            step="100"
          />
        </div>
      </div>
      <div className="bg-gray-50 rounded-lg p-4 min-h-[80px] w-full flex items-center justify-center">
        {transcript.length > 0 ? (
          <TranscriptOverlayComponent
            words={transcript}
            turnEnd={turnEnd}
            className="w-full text-center"
            fadeInDuration={fadeInDuration}
            fadeOutDuration={fadeOutDuration}
          />
        ) : (
          <p className="text-gray-500 text-sm">
            Click 'Simulate Speech' to see the transcript build up word by word
          </p>
        )}
      </div>
      <div className="text-sm text-gray-600">
        <p><strong>Status:</strong> {isSpeaking ? "Speaking..." : turnEnd ? "Speech ended" : "Ready"}</p>
        <p><strong>Transcript:</strong> "{transcript.join(" ")}"</p>
      </div>
    </div>
  );
}

render(<Demo />);

TranscriptOverlay
| Prop | Type | Default |
|---|---|---|
| participant? | "local" \| "remote" | "remote" |
| className? | string | undefined |
| size? | "sm" \| "md" \| "lg" | "md" |
| fadeInDuration? | number | 300 |
| fadeOutDuration? | number | 1000 |
TranscriptOverlayComponent
The TranscriptOverlayComponent is the headless variant that accepts an array of words and animation state as props. This allows you to use it with any framework or state management solution.
| Prop | Type | Default |
|---|---|---|
| words | string[] | undefined |
| className? | string | undefined |
| size? | "sm" \| "md" \| "lg" | "md" |
| turnEnd? | boolean | false |
| fadeInDuration? | number | 300 |
| fadeOutDuration? | number | 1000 |
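Because the headless component takes words as a string array, text that arrives in chunks has to be split before it is passed in. The sketch below shows one way to do that. The appendChunk helper is hypothetical and not part of @pipecat-ai/voice-ui-kit, and it assumes chunks break on word boundaries.

```typescript
// Hypothetical helper, not part of @pipecat-ai/voice-ui-kit.
// Splits an incoming text chunk on whitespace and appends the non-empty
// pieces to the existing word list, returning a new array.
function appendChunk(words: readonly string[], chunk: string): string[] {
  const pieces = chunk.split(/\s+/).filter((piece) => piece.length > 0);
  return [...words, ...pieces];
}

// Simulate three chunks arriving from some transcript source.
let overlayWords: string[] = [];
for (const chunk of ["Hello, ", "welcome to", "  our app! "]) {
  overlayWords = appendChunk(overlayWords, chunk);
}

console.log(overlayWords); // ["Hello,", "welcome", "to", "our", "app!"]
```

The resulting array can be passed as the words prop. If a source can split a chunk in the middle of a word, the two halves would show up as separate words, so such a source would need buffering first.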
Usage Examples
Connected Component Usage
The TranscriptOverlay component automatically integrates with Pipecat Client events:
import { TranscriptOverlay } from "@pipecat-ai/voice-ui-kit";

// Display bot speech transcripts
<TranscriptOverlay participant="remote" />

// Display user speech transcripts
<TranscriptOverlay participant="local" />

// With custom styling and size
<TranscriptOverlay
  participant="remote"
  size="lg"
  className="bg-blue-500/20 border border-blue-300 rounded-lg p-4"
/>

// With custom animation durations
<TranscriptOverlay
  participant="remote"
  fadeInDuration={500}
  fadeOutDuration={1500}
/>

Headless Component Usage
The TranscriptOverlayComponent allows manual control over transcript display:
import { TranscriptOverlayComponent } from "@pipecat-ai/voice-ui-kit";

// Basic usage with word array
<TranscriptOverlayComponent
  words={["Hello", "world", "this", "is", "a", "test"]}
/>

// With turn end animation
<TranscriptOverlayComponent
  words={["Speech", "has", "ended"]}
  turnEnd={true}
/>

// With custom styling and animations
<TranscriptOverlayComponent
  words={["Custom", "styling", "example"]}
  size="lg"
  fadeInDuration={200}
  fadeOutDuration={800}
  className="max-w-md"
/>

Multiple Transcript Overlays
You can display both user and bot transcripts simultaneously:
import { TranscriptOverlay } from "@pipecat-ai/voice-ui-kit";

<div className="space-y-4">
  <div>
    <h4 className="text-sm font-medium mb-2">Bot Speech</h4>
    <TranscriptOverlay participant="remote" />
  </div>
  <div>
    <h4 className="text-sm font-medium mb-2">User Speech</h4>
    <TranscriptOverlay participant="local" />
  </div>
</div>

Integration
The TranscriptOverlay component uses several hooks from the Pipecat Client React SDK:
- usePipecatClientTransportState for connection state monitoring
- useRTVIClientEvent for listening to speech events
This means it must be used within a PipecatClientProvider context to function properly.
The component listens to these events:
- RTVIEvent.BotTtsText: Receives text chunks as the bot speaks
- RTVIEvent.BotStoppedSpeaking: Triggers when the bot stops speaking
- RTVIEvent.BotTtsStopped: Triggers when TTS stops
The component automatically:
- Accumulates transcript text as speech progresses
- Clears the transcript when a new speech turn begins
- Triggers fade-out animations when speech ends
- Only displays when the transport state is "ready"
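The accumulate, clear, and fade-out behaviour listed above can be modelled as a small state reducer. This is a hypothetical sketch rather than the kit's actual implementation: the OverlayState and OverlayAction types are invented for illustration, and the two action kinds are stand-ins for the Pipecat events named earlier.

```typescript
// Hypothetical model of the listed behaviour; the kit's internals may differ.
type OverlayState = { words: string[]; turnEnd: boolean };

// Stand-ins for the events the component listens to.
type OverlayAction =
  | { kind: "text"; text: string } // a text chunk arrived while speaking
  | { kind: "stopped" }; // speech (or TTS) stopped

const initialOverlayState: OverlayState = { words: [], turnEnd: false };

function overlayReducer(state: OverlayState, action: OverlayAction): OverlayState {
  if (action.kind === "text") {
    // Text arriving after a finished turn starts a new turn,
    // so the previous transcript is dropped first.
    return {
      words: state.turnEnd ? [action.text] : [...state.words, action.text],
      turnEnd: false,
    };
  }
  // Keep the words so they can fade out, but mark the turn as ended.
  return { ...state, turnEnd: true };
}

// Replay a short sequence: two chunks, end of turn, then a new turn.
const overlayActions: OverlayAction[] = [
  { kind: "text", text: "Hello" },
  { kind: "text", text: "there" },
  { kind: "stopped" },
  { kind: "text", text: "Next" },
];
const finalOverlayState = overlayActions.reduce(overlayReducer, initialOverlayState);

console.log(finalOverlayState); // { words: ["Next"], turnEnd: false }
```

Keeping the words in state after the turn ends, and clearing them only when new text arrives, is what lets the fade-out animate existing text instead of an empty element.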
Visual States
The component displays different visual states based on the speech status:
- Hidden: Component is not rendered when no transcript is available or transport is not ready
- Active: Shows transcript text with fade-in animation as speech progresses
- Fading: Shows fade-out animation when speech turn ends
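The three states above can be derived from the transport state, the transcript length, and whether the turn has ended. The function below is an illustrative sketch under that assumption. The name getVisualState and its parameters are hypothetical, and the only transport state value it relies on is "ready", as described in the Integration section.

```typescript
type OverlayVisualState = "hidden" | "active" | "fading";

// Hypothetical helper: maps the inputs described above to a visual state.
function getVisualState(
  transportState: string,
  wordCount: number,
  turnEnd: boolean,
): OverlayVisualState {
  // Nothing is rendered without a ready transport or without any transcript.
  if (transportState !== "ready" || wordCount === 0) {
    return "hidden";
  }
  // With text on screen, the turn flag decides between showing and fading.
  return turnEnd ? "fading" : "active";
}

console.log(getVisualState("ready", 3, false)); // "active"
console.log(getVisualState("ready", 3, true)); // "fading"
console.log(getVisualState("ready", 0, false)); // "hidden"
```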
Animation Behavior
The component handles animation as follows:
- Word-by-Word Fade-in: Each word appears with a smooth fade-in animation (300ms duration by default)
- Fade-out: Text disappears with a fade-out animation (1000ms duration by default) when speech ends
- Line-Wrapped Backgrounds: Background wraps around each line of text, creating separate blocks for multi-line transcripts
- Text Balance: Uses CSS text-balance for optimal text wrapping
- Box Decoration: Applies background styling to text content for better readability
- Customizable Timing: Both fade-in and fade-out durations can be customized
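To make the timing concrete, the sketch below computes the opacity a word or transcript would have at a given moment for given fadeInDuration and fadeOutDuration values. It assumes linear easing, which the component may not use, and the function names are hypothetical. Only the default durations (300ms and 1000ms) come from the props tables above.

```typescript
// Clamp a value into the range [0, 1].
const clamp01 = (value: number): number => Math.min(1, Math.max(0, value));

// Opacity of a word `elapsedMs` after it was added (linear easing assumed).
function fadeInOpacity(elapsedMs: number, fadeInDuration: number = 300): number {
  if (fadeInDuration <= 0) return 1; // no animation: fully visible at once
  return clamp01(elapsedMs / fadeInDuration);
}

// Opacity of the transcript `elapsedMs` after the speech turn ended.
function fadeOutOpacity(elapsedMs: number, fadeOutDuration: number = 1000): number {
  if (fadeOutDuration <= 0) return 0; // no animation: hidden at once
  return 1 - clamp01(elapsedMs / fadeOutDuration);
}

console.log(fadeInOpacity(150)); // 0.5, halfway through the default fade-in
console.log(fadeOutOpacity(250)); // 0.75, a quarter of the way through the fade-out
```

With the defaults, a word is fully visible 300ms after it appears, and the transcript is gone 1000ms after the turn ends. Because each word fades in separately and words keep arriving, long fade-in values make several words appear partially faded at once.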
How It Works
The TranscriptOverlay component handles real-time speech transcripts as follows:
- Event Listening: The component listens for Pipecat Client events related to speech
- Text Accumulation: As speech progresses, text chunks are received and accumulated
- Real-time Display: The growing transcript is displayed with smooth animations
- Turn Management: When speech ends, the component triggers fade-out animations
- Cleanup: The transcript is cleared when new speech begins
This creates a natural, real-time experience where users can see speech being transcribed as it happens, with smooth visual feedback for the start and end of each speech turn.