Speech Note

Submitted by mkiol on Sun, 2026/06/28 - 21:10

Rating: 4.9 average (25 votes)

Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine Translator.

Speech Note lets you take, read and translate notes in multiple languages. It uses Speech to Text, Text to Speech and Machine Translation to do so. Text and voice processing take place entirely offline, locally on your phone, without using a network connection.

Your privacy is always respected. No data is sent to the Internet!

Speech Note uses many different processing engines to do its job. Currently these are used:

Speech to Text (STT)
- Coqui STT
- Vosk
- whisper.cpp
- April-ASR
Text to Speech (TTS)
- espeak-ng
- MBROLA
- Piper
- RHVoice
- S.A.M.
Machine Translation (MT)
- Bergamot Translator (the same engine which is behind Firefox Translations)

Speech Note supports a large number of language models. Some of them give very good accuracy, but some are not perfect. All models can be downloaded directly from the app.

A detailed list of supported languages is here.

If you are looking for a similar app for the Linux desktop, check out Speech Note on Flathub (video demo).

Limitations:

- The app does not work on the i486 architecture (e.g. Jolla Tablet).
- Models for the Whisper engine are disabled on phones with an ARMv7 CPU (e.g. Jolla C).
- Models for the Whisper and April-ASR engines are extremely slow on ARM32. In practice, they are usable only on ARM64.
- Speech to Text for languages other than English is not very accurate in general.
- Machine translation is slow on ARM32, especially on ARMv7 phones.

Comments, ideas, translations and issue reports are all highly appreciated.

Translations (both Speech Note and Speech Keyboard):
All translations are very welcome. There are three ways to contribute:
- [preferred] Transifex project
- Direct GitHub pull request or GitLab merge request
- Translation file sent to me via e-mail: dsnote@mkiol.net

Source code: https://github.com/mkiol/dsnote or https://gitlab.com/mkiol/dsnote
Bugs, Feature requests: https://github.com/mkiol/dsnote/issues or https://gitlab.com/mkiol/dsnote/-/issues or just email: dsnote@mkiol.net

Application versions:

Attachment	Size	Date
harbour-dsnote-1.5.1-1.armv7hl.rpm	1.27 MB	17/11/2021 - 10:00
harbour-dsnote-1.5.1-1.aarch64.rpm	1.34 MB	17/11/2021 - 19:28
harbour-dsnote-1.6.1-1.armv7hl.rpm	1.31 MB	10/12/2021 - 20:52
harbour-dsnote-1.6.1-1.aarch64.rpm	1.39 MB	10/12/2021 - 20:52
harbour-dsnote-1.8.0-1.aarch64.rpm	1.44 MB	02/04/2022 - 19:40
harbour-dsnote-1.8.0-1.armv7hl.rpm	1.36 MB	02/04/2022 - 19:40
harbour-dsnote-2.0.1-1.armv7hl.rpm	6.7 MB	15/04/2023 - 16:58
harbour-dsnote-2.0.1-1.aarch64.rpm	7.86 MB	15/04/2023 - 16:58
harbour-dsnote-3.1.6-1.aarch64.rpm	92.14 MB	13/07/2023 - 16:41
harbour-dsnote-3.1.6-1.armv7hl.rpm	21.78 MB	13/07/2023 - 16:41
harbour-dsnote-4.5.0-1.aarch64.rpm	28.63 MB	18/05/2024 - 19:55
harbour-dsnote-4.5.0-1.armv7hl.rpm	27.38 MB	18/05/2024 - 19:55
harbour-dsnote-4.6.0-1.armv7hl.rpm	28.21 MB	03/08/2024 - 18:05
harbour-dsnote-4.6.0-1.aarch64.rpm	29.41 MB	03/08/2024 - 18:05
harbour-dsnote-4.6.1-1.aarch64.rpm	29.41 MB	17/08/2024 - 11:58
harbour-dsnote-4.6.1-1.armv7hl.rpm	28.21 MB	17/08/2024 - 11:58
harbour-dsnote-4.7.0-1.aarch64.rpm	29.77 MB	29/12/2024 - 12:55
harbour-dsnote-4.7.0-1.armv7hl.rpm	28.55 MB	29/12/2024 - 12:55
harbour-dsnote-4.8.4-1.armv7hl.rpm	31.73 MB	15/04/2026 - 22:43
harbour-dsnote-4.8.4-1.aarch64.rpm	33.86 MB	15/04/2026 - 22:43
harbour-dsnote-4.9.0-1.aarch64.rpm	37.01 MB	28/06/2026 - 21:10
harbour-dsnote-4.9.0-1.armv7hl.rpm	34.72 MB	28/06/2026 - 21:10
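
The attachment names above encode the target architecture in the file suffix (armv7hl for 32-bit ARM, aarch64 for 64-bit ARM). As a rough illustration only, the sketch below builds the expected file name for a given version. The helper name `rpm_name` and the mapping from the kernel machine string `armv7l` to the `armv7hl` suffix are assumptions made for this example; they are not part of Speech Note.

```python
import platform

# Machine strings (as returned by platform.machine()) mapped to the RPM
# architecture suffixes seen in the attachment list. The armv7l -> armv7hl
# pairing is an assumption for illustration.
RPM_ARCH_BY_MACHINE = {
    "aarch64": "aarch64",
    "armv7l": "armv7hl",
}


def rpm_name(version, machine=None):
    """Build the package file name for a version, e.g. '4.9.0'.

    Hypothetical helper: raises ValueError for machines that have no
    package in the list (for example i486).
    """
    machine = machine or platform.machine()
    arch = RPM_ARCH_BY_MACHINE.get(machine)
    if arch is None:
        raise ValueError(f"no package listed for machine type {machine!r}")
    return f"harbour-dsnote-{version}-1.{arch}.rpm"
```

Calling `rpm_name("4.9.0", "aarch64")` yields the name of the 64-bit package from the last table row pair; an unsupported machine type raises an error instead of guessing.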

Changelog:

4.9.0

General
- Fix: Python dependency blocked the installation on SFOS 5.1
Speech to Text
- Inline timestamps in text output. A new output format is now available, displaying timestamps that show when each segment of text was recognized in the audio. To enable the new format, set Text format to Inline timestamps. You can also customize the timestamp interval and format in the settings. The Inline timestamps option is available only when Subtitles and inline timestamps support is enabled in the settings.
Text to Speech
- New RHVoice models for: Belarusian, Croatian, English, Romanian, Russian
- Updated RHVoice models for: Czech, Serbian, Spanish
- New Piper models for: Albanian, Bulgarian, Dutch, Greek, Hindi, Indonesian, Kurdish, Latvian, Polish, Swedish, Telugu, Ukrainian, Urdu
User Interface
- Import a note from a URL. Use the Import a note from a URL option in the main app toolbar to import text content from a link. The URL must use HTTP or HTTPS. For HTML pages, you can choose to extract only the readable text or import the entire page.
- Settings option to force the English language in the user interface
- Switch between Notepad and Translator with a tab interface
- Speech Note has been translated into Brazilian Portuguese.
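
The URL import entry above states that the URL must use HTTP or HTTPS. A minimal sketch of such a check is shown here, using only the Python standard library. The function name `is_importable_url` is hypothetical, and this is not the app's actual implementation, only one way the stated rule could be expressed.

```python
from urllib.parse import urlsplit


def is_importable_url(url):
    """Return True if the URL uses the http or https scheme and names a host.

    Illustrative only: the rule (HTTP or HTTPS) comes from the changelog,
    the host check is an extra assumption of this sketch.
    """
    parts = urlsplit(url.strip())
    return parts.scheme.lower() in ("http", "https") and bool(parts.netloc)
```

With this sketch, `https://example.org/article` is accepted, while `ftp://example.org/file.txt`, `file:///etc/passwd` and a bare `example.org` are rejected.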

4.8.4

Translator:
- Updated models for: Catalan, Czech, German, Danish, Spanish, Persian, French, Icelandic, Italian, Korean, Dutch, Polish, Portuguese, Tamil, Ukrainian
- Fix: Translation models cannot be downloaded.

4.8.2

Text to Speech
- New Piper voices for Argentine Spanish, Hindi, Malayalam and Nepali
- Fix: Crash when the TTS engine generates a corrupted audio file
Speech to Text
- New languages enabled in Whisper: Azerbaijani, Belarusian, Kannada, Malayalam, Tamil

4.8.1

User Interface
- Speech Note has been translated into German.
Text to Speech
- Fix: RHVoice voices not available due to broken lib on ARM64
Translator
- Fix: Model download error for Portuguese, Dutch, Persian, Norwegian and Icelandic languages.
- Updated models with improved accuracy: German to English, Dutch to English, English to Ukrainian, English to Hungarian, English to Catalan, Catalan to English, English to Lithuanian, English to Latvian, English to Slovenian, Slovenian to English, English to Slovak, English to Russian
- New models: Azerbaijani to English, Belarusian to English, Bengali to English, Gujarati to English, Hebrew to English, Hindi to English, Kannada to English, Malayalam to English, Malay to English, Albanian to English, Tamil to English

4.8.0

User Interface
- Speech Note has been translated into Arabic, Catalan, Spanish, Turkish and Canadian French.
Speech to Text
- New KBLab Whisper models for Swedish. The National Library of Sweden has released fine-tuned STT models trained on its library collections. The models have significantly improved accuracy compared to regular Whisper models.
- FUTO Whisper models. New models used in the FUTO mobile keyboard app.
- Using an existing note as the initial context in decoding. This has the potential to improve transcription quality and reduce the "hallucination" problem. If you observe a degradation in quality, turn off the Use note as context option.
- Option to pause listening while processing. This option can be useful when Listening mode is Always on. By default, listening continues even when a piece of audio data is being processed. Using this option, you can temporarily pause listening for the duration of processing.
- Option to play an audible tone when starting and stopping listening
Text to Speech
- S.A.M. TTS engine. S.A.M. is a small speech synthesizer designed for the Commodore 64. It features a robotic voice that evokes a strong sense of nostalgia. The S.A.M. voice is available in English only.
- Normalize audio setting option. Use this option to enable/disable audio volume normalization. The volume is normalized independently for each sentence, which can lead to unstable volume levels in different sentences. Disable this option if you observe this problem.
- New Piper voices for Dutch, Finnish, German and Luxembourgish
- New RHVoice voice for Spanish
- Updated RHVoice voice for Czech
Translator
- New models: English to Chinese, English to Arabic, Arabic to English, English to Korean, English to Japanese
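
The Normalize audio entry above notes that the volume is normalized independently for each sentence, which can make loudness vary between sentences. The sketch below illustrates why, assuming simple peak normalization on lists of float samples. The actual normalization method used by Speech Note is not stated in this changelog, and the function names are hypothetical.

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale samples so that the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silence (or empty input) is returned unchanged.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def normalize_per_sentence(sentences, target_peak=0.9):
    """Normalize each sentence on its own, as the changelog describes."""
    return [peak_normalize(sentence, target_peak) for sentence in sentences]
```

Because each sentence receives its own gain, a quiet sentence and a loud sentence both end up with the same peak level. Their original loudness relationship is lost, and a sentence with one short loud spike is scaled less than its neighbors, which is the kind of instability the option lets you turn off.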

4.7.0

General
- New mode for replacing the current note instead of appending new text to it. When the Replace an existing note option is set, whenever new text is added, it will replace the existing note.
User Interface
- Speech Note has been translated into Slovenian.
Speech to Text
- Settings option Profile which allows you to change WhisperCpp processing parameters. There are two profiles to choose from: Best Performance, Best Quality.
- Echo mode. After processing, the decoded text will be immediately read out using the currently set Text to Speech model.
- Update the whisper.cpp library. This provides a 10% increase in STT speed with WhisperCpp models.
Text to Speech
- New Piper voice for Latvian
Translator
- New models: English to Finnish, English to Turkish, English to Swedish, Swedish to English, English to Slovak, English to Indonesian, English to Romanian, English to Greek, Chinese to English
- Updated models: English to Catalan, English to Russian, English to Ukrainian, English to Czech

4.6.1

User Interface
- Swedish translation has been updated.
Translator
- New models: English to Latvian, English to Danish, English to Croatian, English to Slovenian, Indonesian to English, Romanian to English
- Updated models: English to Hungarian, Czech to English, Greek to English

4.6.0

User Interface
- Speech Note has been translated into Norwegian.
- Grouped models. Models that provide multiple sub-models (for example, TTS models that provide different voices) are shown in groups.
- Option to enable/disable support for subtitles. Subtitle support is a niche functionality. To simplify the user interface, the subtitle options are not visible by default.
Speech to Text
- All Whisper models have been renamed to WhisperCpp to better reflect the engine behind them.
- Automatic language detection in STT. To automatically detect the language during STT, select one of the models that is in the Auto detected category in the language list.
- Quicker decoding with WhisperCpp. Optimization for short sentences has been added to WhisperCpp. With it, the speed of STT has doubled!
- Translate to English option for WhisperCpp models. When enabled, speech is automatically translated into English.
- Option for inserting processing statistics. A new settings option allows inserting processing-related information, such as processing time and audio length, into the text after decoding. This can be useful for comparing the performance of different models, engines and their parameters.
Text to Speech
- Welsh language. The new language is enabled with a Piper voice.
- New Piper voices for Spanish, Italian and English
- New RHVoice voices for Slovak and Croatian
Translator
- New button for switching languages.
- New models: English to Lithuanian, Croatian to English, Latvian to English, Danish to English, Serbian to English, Slovak to English, Bosnian to English, Vietnamese to English
- Updated models: Lithuanian to English, Slovenian to English, Russian to English, Ukrainian to English

4.5.0

User Interface
- Import subtitles in many formats, including subtitles embedded in a video file. You can import and export subtitles in SRT, WebVTT and ASS formats. If your video file contains one or more subtitle streams, you can import the selected subtitles into the notepad.
- Unified file importing and exporting. Text, subtitles, audio and video files can be imported or exported using a unified pull-down menu option.
- Settings option to enable/disable remembering the last note. If the option is disabled, the last note will not be available after restarting the app.
- Settings option for the default action when importing a note from a file.
- New text appending style: After empty line
- Speech Note has been translated into Ukrainian and Russian.
- Fix: Cancellation was blocking the user interface.
Speech to Text
- Subtitles support in STT. To generate timestamped text in SRT format, change the text format to SRT Subtitles using the button at the bottom of the text area. Check the settings to find more subtitle options.
Text to Speech
- Speech synchronized with subtitle timestamps in TTS. When the text format is set to SRT Subtitles, the generated speech will be synchronized with the subtitle timestamps.
- New Piper voices for English, Persian, Slovenian, Turkish, French and Spanish
- New RHVoice voice for Czech
- Settings option to enable/disable speech synchronization with subtitle timestamps.
- Speech audio is always normalized after TTS processing.
Translator
- New models: Greek to English, Maltese to English, Slovenian to English, Turkish to English, English to Catalan
- Updated models: Czech and Lithuanian
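
The 4.5.0 entries above refer to timestamped text in SRT format. For readers unfamiliar with it, the sketch below shows how recognized segments could be rendered as SRT cues: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, then the text. The `(start, end, text)` segment tuples and the function names are assumptions for this example; they do not describe Speech Note's internal data model.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as SRT cues.

    Cues are numbered from 1 and separated by a blank line.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_line = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_line}\n{text}\n")
    return "\n".join(blocks)
```

For example, `to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "world")])` produces two numbered cues, the first spanning `00:00:00,000 --> 00:00:01,500`.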

4.4.0

Translator
- New model: Lithuanian to English
- Progress indicator
Speech to Text:
- New language: Marathi
- Support for Speex audio codec in 'Transcribe a file'
- Support for multiple audio streams in a video file
Text to Speech:
- New voices for Serbian and Uzbek languages (RHVoice models)

4.3.0

Translator
- New model: English to Hungarian
Speech to Text:
- New languages: Afrikaans, Gujarati, Hausa, Telugu, Tswana, Javanese, Hebrew
- New engine: April-ASR. Models for: English, French and Polish.
- Stop listening button
- Support for Opus audio codec in Transcribe a file
Text to Speech:
- New Piper voices: Arabic, English, Hungarian, Polish, Czech, German, Ukrainian, Vietnamese, Serbian, French, Spanish, Nepali
- More steps in Speech speed option
- Diacritical marks restoration before speech synthesis for Arabic
- Fix: Exporting to audio file was not possible when text was very long
Other:
- Settings option Clear cache on close
- Cache compression (Opus format instead of raw audio)

4.2.0

Translator
- New models: Hungarian to English, Finnish to English
Speech to Text:
- Support for video file transcription. With the 'Transcribe a file' menu option you can convert an audio file, or the audio from a video file, to text.
- Whisper engine update and increase in performance. Processing time has been reduced by an average of 15% (Xperia 10 III).
Text to Speech:
- Save audio in compressed formats (MP3 or Ogg Vorbis). You can also save metadata tags to the audio file, such as track number, title, artist or album.
- Pause option. You can pause or resume speech reading.
- Update of RHVoice voice for Uzbek
- Fix: Piper models could not be downloaded
User Interface:
- Share to Speech Note. You can push text, audio or video content to Speech Note using the share button in other apps (e.g. Notes, Gallery, Audio recorder, Browser).

4.1.0

Speech to Text:
- Removed the experimental 'Restore punctuation' option
- Fix: Whisper wasn't able to decode short speech sentences
Text to Speech:
- Option 'Speech speed' to make synthesized speech slower or faster.
- New Piper voices: Czech, German, Hungarian, Portuguese, Slovak, English
- Update of RHVoice voices for Slovak and Czech
- Fix: Splitting text into sentences was incorrect for: Georgian, Japanese, Bengali, Nepali, Hindi

4.0.0

Translator (new feature - watch this video)
- Support for offline translations for the following languages: Catalan, Bulgarian, Czech, Danish, English, Spanish, German, Estonian, French, Italian, Polish, Portuguese, Norwegian, Persian, Dutch, Russian, Ukrainian, Icelandic
- Translator uses models that were created as part of the Bergamot project and Firefox Translations.
User Interface:
- The user interface has been redesigned. It is handier and better supports landscape view.
- The application has been translated into new languages: Dutch and Italian. Many thanks to all translators <3 <3
Text to Speech:
- All existing Piper models have been updated.
- New voices for: English, Swedish, Turkish, Polish, German, Spanish, Finnish, French, Ukrainian, Russian, Swahili, Serbian, Romanian, Luxembourgish, Georgian and Slovak

For more details, check About->Changes in the app.
