Speech Note

Rating: 4.9 (22 votes)

Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine Translator.

Speech Note lets you take, read and translate notes in multiple languages. It uses Speech to Text, Text to Speech and Machine Translation to do so. Text and voice processing take place entirely offline, locally on your phone, without using a network connection.

Your privacy is always respected. No data is sent to the Internet!

Speech Note uses many different processing engines to do its job. Currently these are used:

Speech Note supports an extensive number of language models. Some of them give very good accuracy, but some are not perfect. All models can be downloaded directly from the app.

A detailed list of supported languages is here.

If you are looking for a similar app for the Linux desktop, check out Speech Note, available on Flathub (video demo).

Limitations:

  • The app does not work on the i486 architecture (e.g. Jolla Tablet).
  • Models for the Whisper engine are disabled on phones with an ARMv7 CPU (e.g. Jolla C).
  • Models for the Whisper and April-ASR engines are extremely slow on ARM32. In practice, they are usable only on ARM64.
  • Speech to Text for languages other than English is not very accurate in general.
  • Machine translation is slow on ARM32, especially on ARMv7 phones.

Any comments, ideas, translations and issue reports are highly appreciated.

Translations (both Speech Note and Speech Keyboard):
All translations are very welcome. There are three ways to contribute:
- [preferred] Transifex project
- Direct GitHub pull request or GitLab merge request
- Translation file sent to me via e-mail: dsnote@mkiol.net

Source code: https://github.com/mkiol/dsnote or https://gitlab.com/mkiol/dsnote
Bugs, Feature requests: https://github.com/mkiol/dsnote/issues or https://gitlab.com/mkiol/dsnote/-/issues or just email: dsnote@mkiol.net

Application versions: 
Attachment                          Size      Date
harbour-dsnote-1.5.1-1.armv7hl.rpm  1.27 MB   17/11/2021 - 10:00
harbour-dsnote-1.5.1-1.aarch64.rpm  1.34 MB   17/11/2021 - 19:28
harbour-dsnote-1.6.1-1.armv7hl.rpm  1.31 MB   10/12/2021 - 20:52
harbour-dsnote-1.6.1-1.aarch64.rpm  1.39 MB   10/12/2021 - 20:52
harbour-dsnote-1.8.0-1.aarch64.rpm  1.44 MB   02/04/2022 - 19:40
harbour-dsnote-1.8.0-1.armv7hl.rpm  1.36 MB   02/04/2022 - 19:40
harbour-dsnote-2.0.1-1.armv7hl.rpm  6.7 MB    15/04/2023 - 16:58
harbour-dsnote-2.0.1-1.aarch64.rpm  7.86 MB   15/04/2023 - 16:58
harbour-dsnote-3.1.6-1.aarch64.rpm  92.14 MB  13/07/2023 - 16:41
harbour-dsnote-3.1.6-1.armv7hl.rpm  21.78 MB  13/07/2023 - 16:41
harbour-dsnote-4.5.0-1.aarch64.rpm  28.63 MB  18/05/2024 - 19:55
harbour-dsnote-4.5.0-1.armv7hl.rpm  27.38 MB  18/05/2024 - 19:55
harbour-dsnote-4.6.0-1.armv7hl.rpm  28.21 MB  03/08/2024 - 18:05
harbour-dsnote-4.6.0-1.aarch64.rpm  29.41 MB  03/08/2024 - 18:05
harbour-dsnote-4.6.1-1.aarch64.rpm  29.41 MB  17/08/2024 - 11:58
harbour-dsnote-4.6.1-1.armv7hl.rpm  28.21 MB  17/08/2024 - 11:58
harbour-dsnote-4.7.0-1.aarch64.rpm  29.77 MB  29/12/2024 - 12:55
harbour-dsnote-4.7.0-1.armv7hl.rpm  28.55 MB  29/12/2024 - 12:55
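
Each package name in the list above follows the pattern harbour-dsnote-<version>-<release>.<arch>.rpm, so the newest package for a given architecture can be picked mechanically. The following is only a minimal sketch: the helper names parse_package and newest_for_arch are hypothetical and not part of the app or any packaging tool, and the pattern is inferred solely from the file names shown here.

```python
import re

# Pattern inferred from the attachment names listed on this page (an assumption):
# harbour-dsnote-<version>-<release>.<arch>.rpm
PACKAGE_RE = re.compile(r"^harbour-dsnote-(\d+(?:\.\d+)*)-(\d+)\.(\w+)\.rpm$")


def parse_package(filename):
    """Return (version_tuple, release, arch), or None if the name does not match."""
    match = PACKAGE_RE.match(filename)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2)), match.group(3)


def newest_for_arch(filenames, arch):
    """Pick the file with the highest (version, release) for the given arch."""
    candidates = []
    for name in filenames:
        parsed = parse_package(name)
        if parsed is not None and parsed[2] == arch:
            candidates.append((parsed[0], parsed[1], name))
    return max(candidates)[2] if candidates else None


packages = [
    "harbour-dsnote-4.6.1-1.aarch64.rpm",
    "harbour-dsnote-4.6.1-1.armv7hl.rpm",
    "harbour-dsnote-4.7.0-1.aarch64.rpm",
    "harbour-dsnote-4.7.0-1.armv7hl.rpm",
]
print(newest_for_arch(packages, "aarch64"))
```

Versions are compared as tuples of integers rather than as strings, so that a version such as 4.10.0 would sort after 4.7.0.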
Changelog: 

4.7.0

  • General
    • New mode for replacing the current note instead of appending new text to it. When the Replace an existing note option is set, whenever new text is added, it will replace the existing note.
  • User Interface
    • Speech Note has been translated into Slovenian.
  • Speech to Text
    • Settings option Profile which allows you to change WhisperCpp processing parameters. There are two profiles to choose from: Best Performance, Best Quality.
    • Echo mode. After processing, the decoded text will be immediately read out using the currently set Text to Speech model.
    • Update the whisper.cpp library. This provides a 10% increase in STT speed with WhisperCpp models.
  • Text to Speech
    • New Piper voice for Latvian
  • Translator
    • New models: English to Finnish, English to Turkish, English to Swedish, Swedish to English, English to Slovak, English to Indonesian, English to Romanian, English to Greek, Chinese to English
    • Updated models: English to Catalan, English to Russian, English to Ukrainian, English to Czech

4.6.1

  • User Interface
    • Swedish translation has been updated.
  • Translator
    • New models: English to Latvian, English to Danish, English to Croatian, English to Slovenian, Indonesian to English, Romanian to English
    • Updated models: English to Hungarian, Czech to English, Greek to English

4.6.0

  • User Interface
    • Speech Note has been translated into Norwegian.
    • Grouped models. Models that provide multiple sub-models (for example, TTS models that provide different voices) are shown in groups.
    • Option to enable/disable support for subtitles. Subtitle support is a niche functionality. To simplify the user interface, the subtitle options are not visible by default.
  • Speech to Text
    • All Whisper models have been renamed to WhisperCpp to better reflect the engine behind them.
    • Automatic language detection in STT. To automatically detect the language during STT, select one of the models that is in the Auto detected category in the language list.
    • Quicker decoding with WhisperCpp. Optimization for short sentences has been added to WhisperCpp. With it, the speed of STT has doubled!
    • Translate to English option for WhisperCpp models. When enabled, speech is automatically translated into English.
    • Option for inserting processing statistics. A new settings option allows inserting processing-related information, such as processing time and audio length, into the text after decoding. This can be useful for comparing the performance of different models, engines and their parameters.
  • Text to Speech
    • Welsh language. New language is enabled with Piper voice.
    • New Piper voices for Spanish, Italian and English
    • New RHVoice voices for Slovak and Croatian
  • Translator
    • New button for switching languages.
    • New models: English to Lithuanian, Croatian to English, Latvian to English, Danish to English, Serbian to English, Slovak to English, Bosnian to English, Vietnamese to English
    • Updated models: Lithuanian to English, Slovenian to English, Russian to English, Ukrainian to English
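
One common way to compare models using the processing time and audio length mentioned under "Option for inserting processing statistics" is the real-time factor (processing time divided by audio length). The sketch below assumes you copy the two numbers by hand; it does not parse the app's output, whose exact format is not described here, and the measurements in it are made-up placeholders, not benchmark results.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio length; below 1.0 means faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio length must be positive")
    return processing_seconds / audio_seconds


# Hypothetical measurements (processing time, audio length), in seconds.
runs = {
    "model A": (12.0, 30.0),
    "model B": (45.0, 30.0),
}

# Print models from fastest to slowest relative to the audio length.
for name, (processing, audio) in sorted(runs.items(), key=lambda item: real_time_factor(*item[1])):
    print(f"{name}: RTF = {real_time_factor(processing, audio):.2f}")
```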

4.5.0

  • User Interface
    • Import subtitles in many formats, including subtitles embedded in a video file. You can import and export subtitles in SRT, WebVTT and ASS formats. If your video file contains one or more subtitle streams, you can import the selected subtitles into the notepad.
    • Unified file importing and exporting. Text, subtitles, audio and video files can be imported or exported using a unified pull-down menu option.
    • Settings option to enable/disable remembering the last note. If the option is disabled, the last note will not be available after restarting the app.
    • Settings option for the default action when importing a note from a file.
    • New text appending style: After empty line
    • Speech Note has been translated into Ukrainian and Russian.
    • Fix: Cancellation was blocking the user interface.
  • Speech to Text
    • Subtitles support in STT. To generate timestamped text in SRT format, change the text format to SRT Subtitles using the button at the bottom of the text area. Check the settings to find more subtitle options.
  • Text to Speech
    • Speech synchronized with subtitle timestamps in TTS. When the text format is set to SRT Subtitles, the generated speech will be synchronized with the subtitle timestamps.
    • New Piper voices for English, Persian, Slovenian, Turkish, French and Spanish
    • New RHVoice voice for Czech
    • Settings option to enable/disable speech synchronization with subtitle timestamps.
    • Speech audio is always normalized after TTS processing.
  • Translator
    • New models: Greek to English, Maltese to English, Slovenian to English, Turkish to English, English to Catalan
    • Updated models: Czech and Lithuanian
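
For readers unfamiliar with the SRT format mentioned above: each subtitle cue consists of an index line, a time range written as HH:MM:SS,mmm --> HH:MM:SS,mmm, and the text. The sketch below only illustrates that layout; the function names are hypothetical, and it is not how Speech Note itself generates subtitles.

```python
def srt_timestamp(seconds):
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_cue(index, start, end, text):
    """Build one SRT cue: index line, time range line, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


# Cues are separated by a blank line in an SRT file.
cues = [
    srt_cue(1, 0.0, 2.5, "Hello"),
    srt_cue(2, 2.5, 5.0, "This is a dictated note."),
]
print("\n".join(cues))
```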

4.4.0

  • Translator
    • New model: Lithuanian to English
    • Progress indicator
  • Speech to Text:
    • New language: Marathi
    • Support for Speex audio codec in 'Transcribe a file'
    • Support for multiple audio streams in a video file
  • Text to Speech:
    • New voices for Serbian and Uzbek languages (RHVoice models)

4.3.0

  • Translator
    • New model: English to Hungarian
  • Speech to Text:
    • New languages: Afrikaans, Gujarati, Hausa, Telugu, Tswana, Javanese, Hebrew
    • New engine: April-ASR. Models for: English, French and Polish.
    • Stop listening button
    • Support for Opus audio codec in Transcribe a file
  • Text to Speech:
    • New Piper voices: Arabic, English, Hungarian, Polish, Czech, German, Ukrainian, Vietnamese, Serbian, French, Spanish, Nepali
    • More steps in Speech speed option
    • Diacritical marks restoration before speech synthesis for Arabic
    • Fix: Exporting to audio file was not possible when text was very long
  • Other:
    • Setting option Clear cache on close
    • Cache compression (Opus format instead of raw audio)

4.2.0

  • Translator
    • New models: Hungarian to English, Finnish to English
  • Speech to Text:
    • Support for video file transcription. With the 'Transcribe a file' menu option you can convert an audio file, or the audio from a video file, to text.
    • Whisper engine update and increase in performance. Processing time has been reduced by an average of 15% (Xperia 10 III).
  • Text to Speech:
    • Save audio in compressed formats (MP3 or Ogg Vorbis). You can also save metadata tags to the audio file, such as track number, title, artist or album.
    • Pause option. You can pause or resume speech reading.
    • Update of RHVoice voice for Uzbek
    • Fix: Piper models could not be downloaded
  • User Interface:
    • Share to Speech Note. You can push text, audio or video content to Speech Note using the share button in other apps (e.g. Notes, Gallery, Audio recorder, Browser).

4.1.0

  • Speech to Text:
    • Removal of the experimental 'Restore punctuation' option
    • Fix: Whisper wasn't able to decode short speech sentences
  • Text to Speech:
    • Option 'Speech speed' to make synthesized speech slower or faster.
    • New Piper voices: Czech, German, Hungarian, Portuguese, Slovak, English
    • Update of RHVoice voices for Slovak and Czech
    • Fix: Splitting text into sentences was incorrect for: Georgian, Japanese, Bengali, Nepali, Hindi

4.0.0

  • Translator (new feature - watch this video)
    • Support for offline translations for the following languages: Catalan, Bulgarian, Czech, Danish, English, Spanish, German, Estonian, French, Italian, Polish, Portuguese, Norwegian, Iranian, Dutch, Russian, Ukrainian, Icelandic
    • Translator uses models that were created as part of the Bergamot project and Firefox Translations.
  • User Interface:
    • The user interface has been redesigned. It is handier and supports landscape view better.
    • Application has been translated to new languages: Dutch and Italian. Many thanks to all translators <3 <3
  • Text to Speech:
    • All existing Piper models have been updated.
    • New voices for: English, Swedish, Turkish, Polish, German, Spanish, Finnish, French, Ukrainian, Russian, Swahili, Serbian, Romanian, Luxembourgish, Georgian and Slovak

To read more details check About->Changes in the app.

Comments

mkiol:

Unfortunately, the app does not work on Jolla C (and most likely on other older devices). Sorry :(

sashikknox:

Cool, starting to test... Downloading the speech model takes too long

mkiol:

Indeed, download time might be long. Model size for English is almost 1 GB.

Pelzlurch:

For a first version indeed quite polished. Recognition quality is not brilliant but quite OK - and very cool - offline!
The only negative thing I noticed is that there is no automatic line break.

defactofactotum:

Thanks for this! I haven't tried it much except for simple phrases but it seems to work well. Maybe in the future you could make it easier to copy lines of text to other apps.

Oops, just noticed now that you can copy text from the pulley menu!
I dictated this with Speech Note....

mkiol:

nice :D

ziellos:

Thanks a lot! Had no chance to really test speech recognition, but your app already looks very polished.