Welcome to this comprehensive guide on customizing your ScrAIbe-WebUI! Whether you’re just getting started or looking to refine an existing setup, this guide will show you how to tailor your WebUI to meet your unique requirements. We’ll explore how to effectively use the config.yaml file, command-line arguments, and Python-based configurations to enhance your ScrAIbe-WebUI experience.
- Customize Your WebUI
- How to Use Custom Settings
- Customizing Your config.yaml
- 1. Interface Type
- 2. Gradio Launch Configuration
- 3. Gradio Queue Configuration (Simple Interface Only)
- 4. Layout Configuration
- Configuration Options in Detail
- Updated Important Notes
- 5. SCRAIBE Parameters
- Parameter Details
- 6. Setting Up the Email Backend for Async Interface
- Parameter Details
- Best Practices for Email Configuration
- 7. Advanced Configuration
- Summary
- Need Help or Have Questions?
To illustrate how customization works, let’s start with a simple example: changing the server port to 8080 and setting the whisper model to large-v3.
Before making any changes, ensure that ScrAIbe-WebUI is correctly installed and running. If it’s not set up yet, follow our Installation Guide.
You can configure ScrAIbe-WebUI using several approaches:
- Command-Line Interface (CLI)
- Python Interface
Within these approaches, you can provide configuration values in various formats:
- YAML File: Ideal for multiple or frequently changing parameters.
- Structured Dictionaries: Useful for Python-based configuration in code.
- Direct Keyword Arguments: Perfect for quick tests or small tweaks.
Tip:
- If you’re using docker, the CLI approach is typically more convenient for quick tests or small tweaks. However, for long-term configurations and easier retrieval of changes, using the config.yaml file is recommended. The config.yaml file is the heart of your ScrAIbe-WebUI customization and is the preferred method throughout this guide.
- If you’re using docker-compose, or prefer a more programmatic setup, consider using the Python interface. For more details on Docker-specific usage, see our Getting Started with Docker guide.
- Create a custom.yaml file:

      launch:
        server_port: 8080
      scraibe_params:
        whisper_model: 'large-v3'

- Run the WebUI using your custom configuration:

      scraibe-webui start -c custom.yaml
This approach cleanly separates your configuration into a YAML file, making it easier to maintain and share.
If you prefer a more programmatic approach or want to integrate ScrAIbe-WebUI into a larger Python application, you can load configurations directly from Python:
from scraibe_webui import App
app = App.load_config("custom.yaml")
app.start()

Sometimes, you might just want to test a setting quickly without creating or editing a YAML file. In that case, you can pass parameters directly via the CLI or Python interface.
CLI Example Without YAML:
scraibe-webui start server_port=8080 whisper_model=large-v3

Python Example Without YAML:
from scraibe_webui import App
# Using a dictionary:
settings = {
"launch": {"server_port": 8080},
"scraibe_params": {"whisper_model": "large-v3"}
}
App(**settings).start()
# Or directly as keyword arguments:
App(server_port=8080, whisper_model="large-v3").start()

- YAML File (CLI or Python): Best for organized, long-term configurations.
- Direct CLI Arguments: Good for quick tests, especially in containerized environments.
- Python Keyword Arguments or Dictionaries: Ideal for programmatic integration or when automating tasks.
With these methods, you have flexibility and control over how you configure your WebUI. So far, we’ve covered basic changes to the port and whisper model through both CLI and Python interfaces, ensuring even beginners can get started with customization.
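The examples above give the same setting either nested under its section (launch, scraibe_params) or as a flat keyword such as server_port=8080. As a rough illustration of how flat keys could be routed to their sections, here is a minimal sketch. The SECTION_KEYS table and the nest_settings helper are hypothetical and are not part of ScrAIbe-WebUI; the package may resolve keys differently.

```python
# Hypothetical lookup table: which config.yaml section each flat key belongs to.
# Only keys mentioned in this guide are listed; this is not ScrAIbe-WebUI's own table.
SECTION_KEYS = {
    "launch": {"server_port", "server_name", "auth"},
    "scraibe_params": {"whisper_model", "whisper_type", "device", "num_threads"},
}


def nest_settings(**flat):
    """Group flat keyword arguments under their config section."""
    nested = {}
    for key, value in flat.items():
        for section, keys in SECTION_KEYS.items():
            if key in keys:
                nested.setdefault(section, {})[key] = value
                break
        else:
            # No section claims this key, so fail loudly instead of guessing.
            raise KeyError(f"unknown setting: {key}")
    return nested


settings = nest_settings(server_port=8080, whisper_model="large-v3")
print(settings)
```

The result has the same shape as the dictionary form shown earlier, which is one reason a YAML file and keyword arguments can coexist without two separate code paths.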
At this point, you’ve learned the fundamentals of customizing your ScrAIbe-WebUI using different configuration methods. Next, we’ll explore how to dive deeper into customizing your config.yaml file for a more structured and comprehensive setup.
The config.yaml file is the heart of your ScrAIbe-WebUI customization. This file allows you to define various settings that control how your WebUI behaves and appears. Below, we walk through the key sections of the config.yaml file and explain how to customize each part. You can find the original config.yaml file in the repository under scraibe_webui/misc/config.yaml.
The interface type determines how ScrAIbe-WebUI processes your transcription tasks. There are two interface types available: simple and async. Each serves different use cases depending on your needs, resources, and preferences. You can configure the interface type in your config.yaml like this:
interface_type: simple

- interface_type: Choose between simple or async.
The simple interface is ideal for real-time transcription or smaller tasks. This option allows you to upload your file, process it on the page, and get the results with a short waiting time.
- Best For:
  - Live transcriptions.
  - Users with a GPU setup to speed up processing.
  - Smaller audio/video files.
- Advantages:
  - Quick and straightforward.
  - Doesn’t require additional configurations like email setup.
  - More robust for immediate use cases.
- Example UI: Below is a screenshot of the simple interface layout:
  ![Simple Interface Screenshot](/JSchmie/ScrAIbe-WebUI/raw/develop/img/gradio_app_simple.png)
The async interface is designed for scenarios where you do not want to keep the browser open while the transcription is being processed. Files are added to a queue and processed asynchronously, with the results delivered to your email once ready.
- Best For:
  - Saving resources like GPU usage since you are no longer dependent on fast processing. You can run tasks on the CPU and receive the results via email.
  - Transcribing larger or longer files.
  - Users who prefer not to wait actively for the transcription to complete.
- How It Works:
  - You upload your files to the system.
  - The files are queued for processing.
  - Once processing is complete, the transcript is sent to your email.
- Requirements:
  - You must configure the email settings in the config.yaml file to enable this feature. (Covered in the Email Backend section.)
- Advantages:
  - Allows background processing without requiring the browser to remain open.
  - Ideal for larger tasks where immediate results are not necessary.
- Example UI: Below is a screenshot of the async interface layout:
  ![Async Interface Screenshot](/JSchmie/ScrAIbe-WebUI/raw/develop/img/gradio_app_async.png)
| Feature | Simple | Async |
|---|---|---|
| Setup Complexity | Minimal | Requires email configuration |
| Use Case | Live transcription, small files | Asynchronous processing, large files |
| Speed | Faster results | Background processing |
| Resource Efficiency | More demanding (CPU/GPU) | Saves resources |
| Robustness | More reliable | Depends on email setup |
By understanding your specific use case, you can select the interface type that best suits your needs. For example, if you’re working on smaller files with GPU acceleration, the simple type is the way to go. On the other hand, if you have longer recordings or prefer to process files without waiting actively, the async type is more appropriate.
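To make the async flow (upload, queue, process, notify by email) more concrete, here is a minimal sketch using only the Python standard library. It is not ScrAIbe-WebUI's implementation: the transcribe function is a placeholder, and the notify callback stands in for the email step.

```python
import queue
import threading


def transcribe(path):
    # Placeholder for the real transcription step (assumption for this sketch).
    return f"transcript of {path}"


def worker(jobs, notify):
    """Process queued jobs one by one and report each result."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            break
        email, path = job
        notify(email, transcribe(path))


sent = []  # stands in for an outgoing mailbox
jobs = queue.Queue()
thread = threading.Thread(
    target=worker, args=(jobs, lambda email, text: sent.append((email, text)))
)
thread.start()

jobs.put(("user@example.com", "meeting.wav"))  # the upload returns immediately
jobs.put(None)
thread.join()
print(sent)
```

The point of the design is that the upload only enqueues a job, so the person uploading does not need to keep a browser session open while the worker runs.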
The launch configuration section in config.yaml determines how your ScrAIbe-WebUI instance is served by Gradio. By providing parameters to Gradio’s launch function, you influence where and how the interface is hosted, accessed, and secured.
What You Can Control:
- Server Port (server_port): Choose a specific port to ensure predictable access points, such as http://localhost:8080.
- Server Name (server_name): Define the network interface that your WebUI binds to, allowing external access (e.g., "0.0.0.0") or restricting it to the local machine only.
- Authentication (auth): Set credentials to protect the interface from unauthorized access, making it suitable for private or internal deployments.
- Additional Parameters: Options like inbrowser, share, or SSL configurations can be passed along to Gradio, tailoring the user’s initial experience when the WebUI launches.
Where to Learn More:
- Consult the default config.yaml for examples.
- Detailed parameter explanations are available in the Gradio Launch Documentation.
- Always verify that the Gradio documentation matches the version you’ve installed.
Example config.yaml Snippet:
launch:
  server_port: 8080
  server_name: "0.0.0.0"
  auth: [my_username, my_passwd]

In this example:
- The WebUI is accessible at http://<your_machine_ip>:8080/.
- It listens on all network interfaces, allowing LAN-based users to connect.
- Users must log in with the specified credentials.
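Because the launch section is handed on to Gradio, it can help to sanity-check the values before they get there. The sketch below is an assumption about what such a check could look like; normalize_launch is a hypothetical helper, not a ScrAIbe-WebUI function. It relies on one documented detail of Gradio: launch() accepts auth as a (username, password) tuple, whereas a YAML file yields a list.

```python
def normalize_launch(launch):
    """Validate a launch mapping and return a copy suitable for keyword expansion."""
    cfg = dict(launch)
    port = cfg.get("server_port")
    if port is not None and not (1 <= int(port) <= 65535):
        raise ValueError(f"server_port out of range: {port}")
    auth = cfg.get("auth")
    if isinstance(auth, list) and len(auth) == 2 and all(isinstance(x, str) for x in auth):
        # YAML gives a two-item list; convert it to the tuple form Gradio documents.
        cfg["auth"] = tuple(auth)
    return cfg


launch = normalize_launch(
    {"server_port": 8080, "server_name": "0.0.0.0", "auth": ["my_username", "my_passwd"]}
)
print(launch["auth"])
```

Whether ScrAIbe-WebUI performs a similar conversion internally is not stated in this guide, so treat this as a pre-flight check you could run yourself.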
The queue section configures how incoming requests are queued and processed. Note that this queueing system is only relevant for the simple (synchronous) interface. In the simple interface, requests are handled directly through Gradio’s built-in queue functionality, allowing you to manage how many tasks are processed at once and how they line up when the system is busy.
What You Can Control:
- Maximum Queue Size (max_size): Set how many requests can wait in line. Once the queue is full, new requests might be delayed or turned away, depending on your logic.
- Ensuring Responsiveness: By tuning the queue size, you can balance resource usage and user experience. A larger queue can handle more users but may slow down processing; a smaller queue maintains responsiveness but might turn some requests away.
Important Note:
- The queue configuration does not apply to the async interface, as the async interface handles job scheduling and parallelization differently.
- For additional details on configuring the queue and other Gradio functionalities, refer to the Gradio Queue Documentation.
- Always ensure that your Gradio documentation version matches the version you have installed.
Example config.yaml Snippet:
queue:
  max_size: 10

In this example:
- Up to 10 requests can wait to be processed in the simple interface.
- Adjusting this value allows you to scale the WebUI’s capacity based on your hardware resources and expected user load.
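The effect of a bounded queue can be illustrated with Python's standard library. This is only a stand-in for the idea of max_size, not Gradio's queue implementation: twelve requests arrive while nothing is being processed, and the queue holds at most ten.

```python
import queue

MAX_SIZE = 10  # mirrors queue.max_size in the snippet above
waiting = queue.Queue(maxsize=MAX_SIZE)

accepted, rejected = 0, 0
for request_id in range(12):
    try:
        waiting.put_nowait(request_id)  # raises queue.Full once the limit is reached
        accepted += 1
    except queue.Full:
        rejected += 1

print(accepted, rejected)  # 10 2
```

Raising the limit lets more users wait in line at the cost of longer waits; lowering it turns users away sooner but keeps waiting times short.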
When tuning the launch and queue configurations, remember that these settings primarily pass directly to Gradio. For any advanced configurations or deeper dives into parameters, consult the Gradio Documentation to ensure proper implementation and compatibility.
The layout section in config.yaml focuses on customizing your ScrAIbe-WebUI’s appearance. It allows you to define separate HTML files for the header and footer and inject dynamic content into those templates using header_format_options and footer_format_options. Importantly, Gradio no longer supports importing external CSS or SVG files, so all styling must be fully contained within the HTML files themselves, and images must be in supported formats like .png, .jpg, or .jpeg.
What You Can Control:
- Header and Footer HTML Files: Specify your own HTML files for these sections. Each file can include its own inline CSS styles.
- Dynamic Variables: Use placeholders in the HTML (e.g., {header_logo_url}) that Gradio replaces at runtime with values from header_format_options or footer_format_options.
- No External CSS Files or SVGs: All styling must be inline in the HTML. CSS files cannot be imported, and .svg images are no longer supported. Choose standard image formats like .png, .jpg, or .jpeg for your logos and icons.
Where to Learn More:
- Check our default config.yaml for baseline examples.
- If you have advanced layout questions, refer to the Gradio Documentation. However, note that support for external CSS and .svg files is no longer provided by Gradio, so you must rely on inline HTML styling and compatible image formats.
Example config.yaml Snippet:
layout:
  header: path/to/my/header.html
  header_format_options:
    header_logo_url: https://www.example.com/
    header_logo_src: path/to/my/logo/logo.png
  footer: path/to/my/footer.html
  footer_format_options: {}
  show_settings: true

In this example:
- header.html and footer.html define your layout, with inline styling applied directly in the HTML.
- header_logo_url and header_logo_src are dynamically inserted into the HTML.
- The show_settings option displays a settings panel in the interface if set to true (experimental).
- header: Points to an HTML file that defines your header’s structure. All CSS must be inline.
- header_format_options: A dictionary that maps placeholders in the header HTML to their actual values. For example, {header_logo_url} in header.html could be replaced by https://www.example.com/.

Example HTML:
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>ScrAIbe</title>
  <link href="https://fonts.googleapis.com/css2?family=Cormorant+Garamond:wght@400;700&display=swap" rel="stylesheet">
  <style>
    .header-container { display: flex; align-items: center; justify-content: center; padding: 30px; }
    .logo-container { position: absolute; top: 50%; right: 20px; transform: translateY(-50%); width: 150px; }
    .logo { width: 100%; height: auto; }
    .header-title { font-family: 'Cormorant Garamond', serif; font-size: 40px; font-weight: bold; color: #50AF31; }
  </style>
</head>
<body>
  <div class="header-container">
    <h1 class="header-title">ScrAIbe</h1>
    <div class="logo-container">
      <a href="{header_logo_url}">
        <img src="/gradio_api/file={header_logo_src}" alt="Logo" class="logo">
      </a>
    </div>
  </div>
</body>
</html>
Here:
- The logo and styles are fully defined inline.
- The placeholder {header_logo_url} is replaced by the URL defined in header_format_options.
- The image path is specified with /gradio_api/file=, which Gradio uses to serve the image. Only images can be referenced this way; external CSS files are not supported.

- footer: Similarly to the header, points to an HTML file for the footer, which must also contain all of its styling inline.
- footer_format_options: Works the same way as header_format_options, allowing dynamic insertion of values into the footer template.
- show_settings: A Boolean to toggle the settings panel in the interface. This feature remains experimental and may not be suitable for all scenarios.
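The placeholder substitution described for header_format_options and footer_format_options can be sketched in a few lines. This is an illustration under assumptions, not the code ScrAIbe-WebUI runs: fill_placeholders is a hypothetical helper. It replaces only the keys you supply, because a plain str.format call would trip over the braces of the inline CSS.

```python
def fill_placeholders(template, options):
    """Replace each {key} from options; leave all other braces untouched."""
    for key, value in options.items():
        template = template.replace("{" + key + "}", str(value))
    return template


header = (
    "<style>.logo { width: 100%; }</style>"
    '<a href="{header_logo_url}"><img src="/gradio_api/file={header_logo_src}"></a>'
)
options = {
    "header_logo_url": "https://www.example.com/",
    "header_logo_src": "path/to/my/logo/logo.png",
}
rendered = fill_placeholders(header, options)
print(rendered)
```

Running a template through a helper like this before deploying is a quick way to spot misspelled placeholder names, since they remain visible in the output.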
- No External CSS Files: Gradio no longer supports importing external CSS files. All CSS must be defined inline within your HTML files. This ensures that the entire layout configuration is self-contained.
- Only Image Files Are Served: Images can be referenced with /gradio_api/file=. No other file types, including .css or .svg, are supported. Use common image formats like .png, .jpg, or .jpeg.
- SVG Files Not Supported: If you previously relied on SVGs for icons or logos, switch to .png, .jpg, or .jpeg. Ensure images are appropriately sized and optimized for performance.
- Inline All Styles: Place all necessary CSS in <style> tags within the HTML. Since external CSS files are not supported, this keeps your interface portable and predictable.
- Use Common Image Formats: Stick to .png, .jpg, or .jpeg for logos and icons.
- Descriptive Variable Names: Choose clear keys in header_format_options and footer_format_options for easier maintenance.
- Validate Your HTML: Well-formed HTML helps prevent layout issues.
- Test Your Layout: Check the final appearance in different browsers and devices. If issues arise, consult the Gradio Documentation or community forums for guidance.
By adhering to these guidelines and limitations, you can create a visually appealing and fully functional layout. Inline styles and compatible image formats help ensure that your ScrAIbe-WebUI loads smoothly and consistently across various environments.
The scraibe_params section defines the core transcription and processing characteristics of ScrAIbe-WebUI. Here, you specify which Whisper model to use, how to handle diarization, which hardware to run on, and other performance-related settings. Adjusting these parameters allows you to optimize for speed, accuracy, or resource constraints.
Key Considerations:
- Model Selection: Choose a Whisper model that matches your accuracy and latency needs. Smaller models (like tiny or base) run faster on limited hardware, while larger models (large-v3, large-v3-turbo) offer better accuracy but require more compute.
- Backend and Device Management: Decide whether to use the standard Whisper backend or the faster-whisper alternative, and choose between CPU or GPU (cuda) processing depending on your hardware capabilities.
- Diarization and Authentication: Enable speaker diarization with pyannote models if needed, and supply authentication tokens for protected model access.
Example config.yaml Snippet:
scraibe_params:
  whisper_model: medium
  whisper_type: whisper
  dia_model: null
  use_auth_token: null
  device: null
  num_threads: 0

In this example:
- The medium Whisper model is selected, striking a balance between speed and accuracy.
- The standard whisper backend is chosen.
- The default diarization model is set (dia_model: null).
- use_auth_token: null means no special authentication is currently required.
- device: null and num_threads: 0 let ScrAIbe-WebUI auto-select the best available resources.
- whisper_model: Defines the exact Whisper model used. Options include: tiny, base, small, medium, large, large-v2, large-v3, large-v3-turbo.

  Another option is a compatible Hugging Face model specified as repo/model. This allows you to leverage models hosted on Hugging Face's platform, providing flexibility and access to a wide range of pre-trained models.

  Example config.yaml Snippet:

  scraibe_params:
    whisper_model: repo/model

  In this example:
  - Replace repo/model with the actual repository and model name from Hugging Face, such as openai/whisper-large.
  - Ensure you have the necessary authentication token if the model requires it.

  This approach enables you to utilize the latest models available on Hugging Face, potentially improving transcription accuracy and performance.

  Trade-Offs:
  - Smaller models (tiny, base) process audio more quickly but may yield less accurate results.
- whisper_type: Choose whisper for the original backend or faster-whisper for a potentially more efficient implementation. The faster-whisper backend, found at faster-whisper, may offer speed or optimization benefits, but always test to ensure it meets your quality and performance needs.
- num_threads: This parameter controls how many CPU threads are allocated to transcription when running on cpu. Increasing the number of threads can improve performance on multi-core systems. However, avoid setting it too high, as excessive parallelization can lead to diminishing returns or increased contention for system resources. Only values greater than 0 are allowed, and it's capped at the number of CPU cores. A value of 4 or 8 is generally a good starting point.
- dia_model: Selects the pyannote-based model used for speaker diarization. By default, it’s null, meaning diarization is performed using the default pyannote/speaker-diarization-3.1 model. If you need a different model for identifying and separating speakers within an audio track, specify a pyannote model path or name. This is a powerful feature for transcribing interviews, meetings, or multi-speaker podcasts.
- use_auth_token: Some advanced models, particularly those hosted on Hugging Face, require authentication. If you need access to original pyannote models or other restricted resources, provide your Hugging Face token here.
- device:
  - cpu: Ideal for systems without GPUs or when GPU resources are limited.
  - cuda: Utilize your GPU for faster processing, assuming you have CUDA-compatible hardware. GPU acceleration can significantly speed up transcription times for larger models or longer audio files.
Additional Options:
- verbose or other supported keyword arguments can be passed to the underlying ScrAIbe class to provide more detailed logging, debugging information, or fine-grained control over processing behavior.
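Before starting a long transcription run, it can be useful to check scraibe_params values for obvious mistakes. The following sketch is hypothetical and not part of ScrAIbe-WebUI: looks_like_model_name accepts the built-in names listed above or a repo/model style identifier, and clamp_threads applies the "greater than 0, capped at the number of CPU cores" rule described for num_threads. How the application itself treats num_threads: 0 is not modelled here.

```python
import os
import re

# Built-in model names as listed in this guide.
BUILTIN_MODELS = {
    "tiny", "base", "small", "medium",
    "large", "large-v2", "large-v3", "large-v3-turbo",
}


def looks_like_model_name(name):
    """Accept a built-in model name or a Hugging Face style 'repo/model' id."""
    return name in BUILTIN_MODELS or re.fullmatch(r"[\w.-]+/[\w.-]+", name) is not None


def clamp_threads(requested):
    """Keep a thread count between 1 and the number of CPU cores."""
    cores = os.cpu_count() or 1  # cpu_count() may return None on some platforms
    return max(1, min(int(requested), cores))


print(looks_like_model_name("large-v3"), looks_like_model_name("openai/whisper-large"))
print(clamp_threads(4))
```

The name check is purely syntactic: it does not confirm that a Hugging Face repository exists or that it is compatible with the chosen backend.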
Where to Learn More:
- Refer to the default config.yaml for baseline settings and additional parameters.
- For details on Whisper models and performance characteristics, consult the official Whisper documentation and community resources.
- The pyannote documentation can provide guidance on selecting and using diarization models.
- For questions on Hugging Face authentication tokens or model access, visit the Hugging Face documentation and platform guidelines.
When using the asynchronous (async) interface, completed transcripts are not directly displayed in your browser. Instead, they’re processed in the background and delivered to you via email once ready. Configuring the email backend ensures that transcripts, error notifications, and upload confirmations reach you or your users reliably.
Key Considerations:
- SMTP Credentials & Security: Provide your email service’s SMTP details, including the server address, port, and authentication credentials. Ensure that these credentials are kept secure and private.
- Encryption & Authentication Methods: Choose between SSL, TLS, or PLAIN depending on your email provider’s requirements. TLS or SSL is typically recommended for secure email transmission.
- Templating & Customization: Customize HTML templates for various notification types: success messages, error reports, and upload confirmations. Insert dynamic fields (like contact_email or exception) to make messages more informative.
- Layout & Styling: The mail section provides a mail_css_path option for styling your emails. Ensure your CSS is compatible and properly referenced so that recipients see a well-formatted message.
Note:
These settings are only applicable if you’re using the async interface. The simple interface processes requests immediately and does not send emails.
Example config.yaml Snippet:
mail:
  sender_email: null
  smtp_server: null
  smtp_port: 0
  sender_password: null
  connection_type: TLS
  context: default
  default_subject: "SCRAIBE"
  error_template: scraibe_webui/misc/error_notification_template.html
  error_subject: "An error occurred during processing."
  error_format_options:
    contact_email: support@mail.com
  success_template: scraibe_webui/misc/success_template.html
  success_subject: "Your transcript is ready."
  success_format_options:
    contact_email: support@mail.com
  upload_notification_template: scraibe_webui/misc/upload_notification_template.html
  upload_subject: "Upload Successful"
  upload_notification_format_options:
    queue_position: null
    contact_email: support@mail.com
  mail_css_path: scraibe_webui/misc/mail_style.css
- sender_email: The email address from which all notifications are sent. Use an address you control and ensure it’s correctly authenticated with your email provider.
- smtp_server & smtp_port: Provide your email provider’s SMTP details. Common servers include smtp.gmail.com for Gmail, and ports often are 587 (TLS) or 465 (SSL).
- sender_password: The account’s password or app-specific password. Handle this securely and avoid committing sensitive credentials to public repositories.
- connection_type (SSL, TLS, or PLAIN): Select the encryption/authentication method your SMTP server requires. Most providers recommend TLS or SSL for secure connections.
- context: Controls the SSL context for secure email transmission. When set to default, it uses ssl.create_default_context(). If needed, you can supply a custom ssl.SSLContext or pass a dictionary of arguments to configure security further.
- default_subject: The fallback subject line used if no other specific subject is provided.
- error_template & error_subject: Define an HTML template and subject line for error notifications. The exception placeholder is automatically populated with the error details, making it easier to debug issues.
- error_format_options: Insert dynamic content like contact_email or other fields into the error notification template, allowing you to provide support information or troubleshooting steps.
- success_template & success_subject: Specify the HTML template and subject for success notifications, sent when transcripts are ready. Dynamic placeholders (e.g., company_name, contact_email) personalize these messages.
- success_format_options: Similar to error notifications, these key-value pairs populate placeholders in the success template.
- upload_notification_template & upload_subject: Configure a template and subject line for upload confirmations, optionally including a queue_position to indicate the user’s place in the processing line.
- upload_notification_format_options: Customize placeholders for upload notifications. For instance, queue_position can reassure users their request is queued and not lost.
- mail_css_path: Points to a CSS file for styling email templates. Ensure the CSS is inline-friendly and that your email provider/client supports the styles used.
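To show how connection_type and context relate to an actual SMTP session, here is a minimal sketch built on Python's standard smtplib, ssl, and email modules. It is not ScrAIbe-WebUI's mail backend. The build_message and send helpers are hypothetical, and treating PLAIN as an unencrypted connection is an assumption of this sketch.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender, recipient, subject, html):
    """Create an HTML email with a plain-text fallback."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("Please view this message in an HTML-capable client.")
    msg.add_alternative(html, subtype="html")
    return msg


def send(msg, server, port, password, connection_type="TLS"):
    """Send msg over SSL, TLS (STARTTLS), or PLAIN, as named in connection_type."""
    context = ssl.create_default_context()  # what `context: default` is described as using
    kind = connection_type.upper()
    if kind == "SSL":
        smtp = smtplib.SMTP_SSL(server, port, context=context)
    elif kind in ("TLS", "PLAIN"):
        smtp = smtplib.SMTP(server, port)
    else:
        raise ValueError(f"unsupported connection_type: {connection_type}")
    with smtp:
        if kind == "TLS":
            smtp.starttls(context=context)  # upgrade the plain connection to TLS
        smtp.login(msg["From"], password)
        smtp.send_message(msg)


msg = build_message(
    "sender@example.com", "user@example.com", "SCRAIBE", "<h1>Transcript Ready</h1>"
)
print(msg["Subject"], msg.get_content_type())
```

Only build_message runs here; send is defined but not called, since it needs a reachable SMTP server and real credentials. This pairing also explains the usual port choices: 465 expects SSL from the first byte, while 587 starts unencrypted and upgrades via STARTTLS.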
Demo Template Example:
<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" type="text/css" href="{{ mail_css_path }}">
</head>
<body>
<h1>Transcript Ready</h1>
<p>Dear User,</p>
<p>Your transcript is now ready.</p>
<p>Thank you for using our service!</p>
<p>Best regards,</p>
<p>{{ company_name }}</p>
<footer>
<p>Contact us at <a href="mailto:{{ contact_email }}">{{ contact_email }}</a></p>
</footer>
</body>
</html>

YAML Integration:
success_template: scraibe_webui/misc/success_template.html
success_format_options:
  company_name: "My Awesome Company"
  contact_email: support@mail.com

In this example:
- The template uses {{ company_name }} and {{ contact_email }} placeholders.
- The success_format_options in YAML provides the values inserted at runtime.
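The way success_format_options feeds the {{ ... }} placeholders can be previewed without sending any mail. The render helper below is a hypothetical stand-in written with the standard re module. ScrAIbe-WebUI may use a real template engine, so treat this only as a way to check that your option names match your template.

```python
import re


def render(template, options):
    """Fill {{ name }} placeholders; unknown names are left as they are."""
    def substitute(match):
        name = match.group(1)
        return str(options.get(name, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


template = '<p>{{ company_name }}</p><a href="mailto:{{ contact_email }}">{{ contact_email }}</a>'
options = {"company_name": "My Awesome Company", "contact_email": "support@mail.com"}
html = render(template, options)
print(html)
```

Leaving unknown placeholders untouched makes a misspelled key easy to spot in a test email, because the raw {{ name }} shows up in the message body.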
- Use Secure Credentials: Consider using app-specific passwords or OAuth methods supported by your email provider.
- Test Templates Thoroughly: Send test emails to ensure formatting, placeholders, and styling appear as intended in common email clients.
- Monitor Deliverability: Some SMTP servers or email providers may require additional authentication steps (like App Passwords or 2FA) to ensure emails are delivered reliably and not marked as spam.
- Consult Documentation: For advanced customization or troubleshooting, refer to the Gradio Documentation and your email provider’s SMTP configuration guides.
By correctly setting up the email backend, the async interface can notify you (or your users) automatically when transcripts are ready, when uploads are completed, or if errors occur. This streamlined communication ensures a smoother, more efficient workflow without requiring the browser to remain open or the user to wait actively for processing to finish.
The advanced section in your config.yaml gives you direct control over resource usage and performance tuning for ScrAIbe-WebUI. Adjusting these parameters can help you strike the right balance between responsiveness, throughput, and memory consumption.
Example config.yaml Snippet:
advanced:
  keep_model_alive: false
  concurrent_workers_async: 1

Key Parameters:
- keep_model_alive (Applies to the Simple Interface Only):
  - When true: The Whisper model remains in memory continuously.
    - What This Means: Faster subsequent transcriptions since you don’t have to reload the model each time.
    - Trade-Off: Higher ongoing memory usage.
    - Concrete Guidance: Start with false. If you find the initial loading delay bothersome, set it to true and monitor memory usage. If memory usage becomes an issue, revert to false.
  - When false: The model unloads after each transcription.
    - Result: Lower memory consumption, but a short model loading delay before each task.
    - Concrete Guidance: This is the safest default. Only switch to true if you frequently run many short tasks and need to eliminate loading delays.
- concurrent_workers_async (Applies to the Async Interface Only):
  - What It Does: Determines how many transcription tasks the async interface can process at once.
  - Trade-Off: More concurrent workers can boost throughput, but also increase CPU/GPU usage.
  - Concrete Guidance:
    - Start with concurrent_workers_async = 1.
    - If you find that tasks are backing up and you have sufficient hardware resources, increment this by 1 and test again.
    - Continue increasing gradually until you reach an acceptable balance between speed and resource usage. If system performance degrades or resources become strained, dial the number back down.
With these advanced parameters, you have precise control over ScrAIbe-WebUI’s performance characteristics. By starting with conservative values and incrementally adjusting based on observed behavior, you can tailor the WebUI to your environment without guesswork. Over time, fine-tuning these settings ensures that your transcription tasks run efficiently and meet your productivity goals.
If you run into issues, have suggestions, or need further assistance, we’re here to help! Don’t hesitate to open an issue on our GitHub repository. Your input helps us continually improve ScrAIbe-WebUI for everyone.
Happy customizing! 🎉