# Project README

## Setting Up the Project Environment

Follow the steps below to set up your project environment:

### 1. Create a Virtual Environment

To isolate the project dependencies and prevent conflicts, create a virtual environment using the following commands:

```bash
# Create the virtual environment
python3 -m venv venv

# Activate the virtual environment (on Windows: venv\Scripts\activate)
source venv/bin/activate
```

### 2. Install Project Dependencies

Once the virtual environment is activated, install the required dependencies using `pip`:

```bash
pip install -r requirements.txt
```

This step assumes that the project directory contains a `requirements.txt` file listing all required dependencies.

## Running the Script

To run the Python script `transcribe.py`, pass the path to an audio file as an argument:

```bash
python transcribe.py <audio_file>
```

- Replace `<audio_file>` with the path to your audio file.
- Example:

```bash
python transcribe.py sample_audio.wav
```

## Where is the Whisper model downloaded?

When using the `openai-whisper` package, the Whisper model is downloaded to a local cache directory. By default, this
directory is located under the user's home directory:

```plaintext
~/.cache/whisper/
```

Here:

- `~` refers to the user's home directory.
- `.cache/whisper/` is the folder where the models are cached.

The cache directory contains the downloaded model files. Whisper downloads a model the first time it is requested,
based on the requested model size (e.g., `base`, `medium`, or `large`), and reuses the cached file in subsequent runs
to avoid re-downloading it.

If you need to relocate the cache directory, set the `XDG_CACHE_HOME` environment variable: the models are then stored
in `$XDG_CACHE_HOME/whisper/`. Alternatively, pass a custom path through the `download_root` argument of
`whisper.load_model()`.

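To check where the models end up on a given machine, a small helper can resolve the cache path and list what has already been downloaded. This is a minimal sketch, assuming the lookup used by `openai-whisper` (`$XDG_CACHE_HOME/whisper`, falling back to `~/.cache/whisper`); `whisper_cache_dir` is a hypothetical helper name, not part of the package.

```python
import os
from pathlib import Path


def whisper_cache_dir(env=None):
    """Return the directory where Whisper model files are expected to be cached."""
    env = os.environ if env is None else env
    # Assumption: $XDG_CACHE_HOME, when set, replaces the default ~/.cache base.
    default_base = Path.home() / ".cache"
    base = Path(env.get("XDG_CACHE_HOME", default_base))
    return base / "whisper"


cache_dir = whisper_cache_dir()
print(f"Whisper cache directory: {cache_dir}")
if cache_dir.is_dir():
    # List the model checkpoints that have already been downloaded.
    for model_file in sorted(cache_dir.glob("*.pt")):
        size_mb = model_file.stat().st_size / 1e6
        print(f"  {model_file.name} ({size_mb:.0f} MB)")
```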
## Which Whisper model for French?

| Model     | Multilingual? | Accuracy in French | Speed | Best For |
|-----------|--------------|--------------------|-------|----------|
| **tiny**  | ✅ Yes       | ❌ Poor           | 🚀 Fastest | Basic transcription, very simple audio |
| **base**  | ✅ Yes       | ⚠️ Okay          | 🔥 Fast | Short/simple French audio |
| **small** | ✅ Yes       | ✅ Good           | ⚡ Fast | General French transcription |
| **medium**| ✅ Yes       | ✅✅ Very Good   | ⏳ Slower | Better accuracy in noisy audio |
| **large** | ✅ Yes       | ✅✅✅ Best      | 🐢 Slowest | High accuracy, complex speech |

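Once a model size is chosen from the table, it can be loaded by name and told to decode French directly. The sketch below is only an illustration using the `openai-whisper` package; `transcribe_french` and `MODEL_SIZES` are hypothetical names, the `small` default is an assumption, and `transcribe.py` may be organized differently.

```python
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def transcribe_french(audio_path, model_name="small"):
    """Transcribe a French audio file with the chosen Whisper model size."""
    if model_name not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {model_name!r}")
    # Imported here so the size check above works without the package installed.
    import whisper

    # The model is downloaded on first use, then read from the local cache.
    model = whisper.load_model(model_name)
    # language="fr" skips automatic language detection and decodes as French.
    result = model.transcribe(audio_path, language="fr")
    return result["text"]
```

For example, `print(transcribe_french("sample_audio.wav", model_name="medium"))` trades speed for accuracy on noisy recordings.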

## Notes

- Ensure that your virtual environment is activated before running the script.
- If you encounter any missing dependencies, double-check your `requirements.txt` file and re-run the installation
  command.

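A script can also check by itself that a virtual environment is active. This is a minimal sketch that relies on the standard `sys.prefix` and `sys.base_prefix` attributes, which differ inside an environment created with `python3 -m venv`; `in_virtual_environment` is a hypothetical helper name.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if not in_virtual_environment():
    print("Warning: no virtual environment is active.", file=sys.stderr)
```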
Happy coding!