WhisperWeb: Web-Based Audio Transcription with OpenAI Whisper

Looking for an easy way to transcribe audio directly from your browser without installing complex software? WhisperWeb lets you record audio and get accurate transcriptions using OpenAI’s powerful Whisper model — all through a simple web interface.

ℹ️ Info

WhisperWeb is a web-based service that allows you to record audio from your browser and transcribe it using OpenAI’s Whisper model. No external software installation required on the client side.

What You’ll Learn

How to install and set up the WhisperWeb server
How to configure Whisper models (tiny, base, small, medium)
How to record and transcribe audio from your browser
How to choose the right model for your needs

Why Choose WhisperWeb?

📖 1. No External Software Needed

Record audio and transcribe directly from your browser. The client only needs an HTML file — no installation required on client machines.

📖 2. High Accuracy

Transcription is handled by OpenAI’s Whisper model, and accuracy improves as you move to the larger models listed in the table below.

📖 3. Multiple Model Options

Choose from different Whisper models based on your needs:

Model	Parameters	Approx. download	Speed	Accuracy	Best For
tiny	39 M	~75 MB	Fast	Lower	Quick transcription
base	74 M	~140 MB	Medium	Good	General use
small	244 M	~460 MB	Slow	Better	Detailed work
medium	769 M	~1.5 GB	Slowest	Best	Professional transcription

💡 Tip

Start with “tiny” or “base” if you have limited disk space. Use “medium” for the most accurate transcriptions.
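
The tip above can be turned into a small helper that picks the most accurate model fitting a given disk budget. This is only a sketch: the function name `pick_model` is made up for illustration, and the sizes are approximate checkpoint sizes that should be treated as assumptions rather than exact figures.

```python
# Approximate checkpoint download sizes in MB. These are assumptions for
# illustration; check the size of the files you actually download.
APPROX_DOWNLOAD_MB = {"tiny": 75, "base": 140, "small": 460, "medium": 1500}

# Models ordered from least to most accurate.
MODELS_BY_ACCURACY = ["tiny", "base", "small", "medium"]


def pick_model(disk_budget_mb):
    """Return the most accurate model whose download fits the budget."""
    fitting = [
        name for name in MODELS_BY_ACCURACY
        if APPROX_DOWNLOAD_MB[name] <= disk_budget_mb
    ]
    if not fitting:
        raise ValueError(f"No model fits in {disk_budget_mb} MB")
    return fitting[-1]


print(pick_model(200))   # base
print(pick_model(2000))  # medium
```

Disk space is only one constraint. On a slow CPU you may still prefer a smaller model than the one this returns.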

📖 4. Multi-Language Support

Whisper supports roughly 99 languages and can also translate speech from those languages into English.
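
For readers who want to see what language selection and translation look like at the Whisper level, here is a minimal sketch that calls the `openai-whisper` package directly. It does not go through WhisperWeb, and whether the WhisperWeb interface exposes these options is not covered here. The helper names `build_decode_options` and `transcribe_file` are hypothetical.

```python
def build_decode_options(language=None, translate=False):
    """Build keyword arguments for Whisper's transcribe() call."""
    options = {"task": "translate" if translate else "transcribe"}
    if language is not None:
        options["language"] = language  # e.g. "de"; omit to auto-detect
    return options


def transcribe_file(path, model_name="base", language=None, translate=False):
    """Transcribe one audio file, or translate its speech into English."""
    import whisper  # provided by the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(path, **build_decode_options(language, translate))
    return result["text"]


print(build_decode_options(language="de", translate=True))
# {'task': 'translate', 'language': 'de'}
```

Calling `transcribe_file("interview.wav", language="de", translate=True)` would then return English text for a German recording. The file name is a placeholder.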

My Journey

I used Google Translate for transcription for a long time, but I found it inaccurate and poor at understanding audio. When OpenAI released Whisper, I realized it was accurate enough for both transcription and translation. Since I wanted a web-based solution, I built WhisperWeb, a simple interface for using Whisper from any browser.

Installation

Step 1: Install Python Dependencies

pip install flask flask-cors openai-whisper

⚠️ Warning

The first time you run WhisperWeb, it automatically downloads the Whisper model files, from under 100 MB for tiny up to about 1.5 GB for medium. Make sure you have a stable internet connection and enough free disk space.
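
Before starting the server, it can help to confirm that the three packages actually landed in the active Python environment. The sketch below checks only whether each module can be found, without importing it. The import names `flask`, `flask_cors` and `whisper` correspond to the pip packages above, and the function name `missing_packages` is made up for this example.

```python
import importlib.util

# Import name -> pip package name for the dependencies installed above.
REQUIRED_MODULES = {
    "flask": "flask",
    "flask_cors": "flask-cors",
    "whisper": "openai-whisper",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names of modules that cannot be found."""
    return [
        pip_name
        for module, pip_name in modules.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All WhisperWeb dependencies are installed.")
```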

Step 2: Clone the Repository

git clone https://gitlab.com/krafi/whisperweb.git
cd whisperweb

Step 3: Run the Server

python server.py

The server will start and display something like:

Server running on http://0.0.0.0:5000
Public IP: xx.xx.xx.xx

📝 Note

Note down your public IP address — you’ll need it to connect from other devices on your network.
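
To confirm from another machine that the server is reachable, you can call the `/ping` endpoint listed in the API section. The sketch below uses only the standard library and checks only for an HTTP 200 status, since the format of the response body is not documented here. The function name `server_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def server_is_up(base_url, timeout=3.0):
    """Return True if GET <base_url>/ping answers with HTTP 200."""
    url = base_url.rstrip("/") + "/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False
```

For example, `server_is_up("http://YOUR_SERVER_IP:5000")` should return `True` once the server is running and reachable through any firewall in between.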

How to Use

Step 1: Access the Web Interface

Open the index.html file from the repository in your web browser. You can either:

Open it directly in your browser: file:///path/to/whisperweb/index.html
Or serve it locally: run python -m http.server 8000 from the whisperweb directory, then open http://localhost:8000/index.html

Step 2: Configure Server Address

In the web interface, enter your server’s IP address and port:

http://YOUR_SERVER_IP:5000
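
If the client is on the same local network as the server, the address to enter is usually the server machine's LAN address rather than its public one. The following sketch, meant to be run on the server machine, makes a best-effort guess at that address. The helper names are made up for illustration, and the result can be wrong on machines with several network interfaces or a VPN.

```python
import socket


def local_ip():
    """Best-effort guess at this machine's LAN address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only makes the OS
        # choose an outgoing interface, whose address we then read back.
        sock.connect(("192.0.2.1", 80))  # documentation-only address
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable route; fall back to loopback
    finally:
        sock.close()


def server_url(port=5000):
    """Build the address to type into the web interface."""
    return f"http://{local_ip()}:{port}"


print(server_url())
```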

Step 3: Select Whisper Model

Choose a model from the dropdown:

tiny - Fastest, uses least resources
base - Good balance
small - Better accuracy
medium - Best accuracy (1.5GB download)

Click “Set Model” to load it. On first run, the model will be downloaded.

Step 4: Record and Transcribe

Click the record button to start recording
Speak into your microphone
Click stop when finished
Click “Submit” to transcribe
View and copy the transcription

API Endpoints

If you want to integrate WhisperWeb with your own applications:

Endpoint	Method	Description
`/ping`	GET	Check if server is running
`/upload`	POST	Upload audio file for transcription
`/set_model`	POST	Set the Whisper model

Example API Call

# Set model
curl -X POST http://localhost:5000/set_model -H "Content-Type: application/json" -d '{"model": "base"}'

# Upload audio for transcription
curl -X POST -F "file=@audio.wav" http://localhost:5000/upload
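
The same two calls can be made from Python. The sketch below mirrors the curl commands using only the standard library: a JSON body with a `model` key for `/set_model`, and a multipart form with a `file` field for `/upload`. The function names are hypothetical, and because the response format is not documented here, both functions simply return the raw response text.

```python
import json
import os
import urllib.request
import uuid


def _post(url, body, content_type):
    """Send a POST request and return the response body as text."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def set_model(base_url, model):
    """POST {"model": <name>} as JSON to /set_model."""
    body = json.dumps({"model": model}).encode("utf-8")
    return _post(base_url.rstrip("/") + "/set_model", body, "application/json")


def encode_multipart(field, filename, payload):
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def upload_audio(base_url, path):
    """POST an audio file to /upload under the form field "file"."""
    with open(path, "rb") as handle:
        body, content_type = encode_multipart(
            "file", os.path.basename(path), handle.read()
        )
    return _post(base_url.rstrip("/") + "/upload", body, content_type)
```

A typical session would call `set_model("http://localhost:5000", "base")` once, then `upload_audio("http://localhost:5000", "audio.wav")` for each recording.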

Troubleshooting

📖 Microphone not working?

Check browser permissions — ensure microphone access is allowed
Try using a different browser (Chrome or Firefox recommended)
On Android, try Soul Browser if local HTML files have microphone issues

📖 Server won't start?

Check if port 5000 is already in use:
```
lsof -i :5000
```
Try running with a different port by modifying server.py
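
On systems without `lsof`, a short Python check can answer the same question by trying to bind the port. This is a sketch with a made-up function name. It reports the port as busy whenever binding fails, which also covers permission errors on low-numbered ports.

```python
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding to (host, port) fails, i.e. the port is taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
    return False


print("Port 5000 in use:", port_in_use(5000))
```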

📖 Model download fails?

Ensure you have a stable internet connection
Check disk space (medium model needs 1.5GB)
Try a smaller model first (tiny or base)
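
The disk space check can be scripted with the standard library. The sketch below reports free space on the filesystem holding a given directory. The function names are hypothetical, and the 1500 MB figure is the approximate size of the medium model. By default the `openai-whisper` package caches models under `~/.cache/whisper`, so that is the location worth checking if it exists on your machine.

```python
import os
import shutil


def free_megabytes(path="."):
    """Free space, in MB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(os.path.expanduser(path))
    return usage.free / 1_000_000


def has_room_for(required_mb, path="."):
    """Return True if at least `required_mb` MB are free at `path`."""
    return free_megabytes(path) >= required_mb


print(f"{free_megabytes():.0f} MB free")
print("Room for medium (about 1500 MB):", has_room_for(1500))
```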

📖 Transcription is slow?

Try a smaller model (tiny or base)
Reduce audio length
Close other resource-heavy applications
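
One way to reduce audio length for files sent through the API is to split a long recording into shorter pieces and upload them one by one. The sketch below does this for uncompressed WAV files only, using the standard `wave` module. The function name and the 30-second default are arbitrary choices for illustration, and cutting at fixed intervals can split a word in half, which may cost some accuracy at the boundaries.

```python
import os
import wave


def split_wav(path, chunk_seconds=30):
    """Split a WAV file into consecutive chunks; return the new file names."""
    stem, _ = os.path.splitext(path)
    outputs = []
    with wave.open(path, "rb") as source:
        params = source.getparams()
        frames_per_chunk = int(source.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = source.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = f"{stem}_part{index:03d}.wav"
            with wave.open(out_path, "wb") as chunk:
                # Copy channels, sample width and rate from the source; the
                # frame count in the header is corrected when the file closes.
                chunk.setparams(params)
                chunk.writeframes(frames)
            outputs.append(out_path)
            index += 1
    return outputs
```

Calling `split_wav("lecture.wav", chunk_seconds=60)` would write `lecture_part000.wav`, `lecture_part001.wav`, and so on next to the original file. The file name is a placeholder.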

Use Cases

Content Creators: Transcribe video and audio content
Students: Record lectures and transcribe notes
Journalists: Transcribe interviews quickly
Developers: Add speech-to-text to applications
Researchers: Transcribe interviews and focus groups

Source Code

View the full project and contribute: WhisperWeb on GitLab (https://gitlab.com/krafi/whisperweb)

💡 Tip

WhisperWeb transforms how you convert spoken words into written text. With its simple setup, powerful Whisper models, and browser-based interface, it’s perfect for anyone needing reliable audio transcription.

Try WhisperWeb today and experience accurate audio transcription!
