Speech Note

Rating: 4.9 (21 votes)

Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine Translation.

Speech Note lets you take, read and translate notes in multiple languages. It uses Speech to Text, Text to Speech and Machine Translation to do so. Text and voice processing take place entirely offline, locally on your phone, without using a network connection.

Your privacy is always respected. No data is sent to the Internet!

Speech Note uses many different processing engines to do its job. Currently these are used:

Speech Note supports an extensive number of language models. Some of them give very good accuracy, but some are not perfect. All models can be downloaded directly from the app.

A detailed list of supported languages is here.

If you are looking for a similar app for the Linux desktop, check out Speech Note, available on Flathub (video demo).

Limitations:

  • The app does not work on the i486 architecture (e.g. Jolla Tablet).
  • Models for the Whisper engine are disabled on phones with an ARMv7 CPU (e.g. Jolla C).
  • Models for the Whisper and April-ASR engines are extremely slow on ARM32. In practice, they are usable only on ARM64.
  • Speech to Text for languages other than English is not very accurate in general.
  • Machine translation is slow on ARM32, especially on ARMv7 phones.
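
The architecture-specific limitations above can be summarized as a small lookup. The following is only an illustrative Python sketch, not part of the app: the names `package_arch` and `limitations_for` are hypothetical, and it assumes that the `armv7hl` packages correspond to ARM32 devices and the `aarch64` packages to ARM64.

```python
# Architecture-specific limitations, copied from the list above.
# Assumption: "armv7hl" packages target ARM32 and "aarch64" packages target ARM64.
LIMITATIONS = {
    "i486": ["The app does not work on this architecture."],
    "armv7hl": [
        "Whisper models are disabled on phones with an ARMv7 CPU.",
        "Whisper and April-ASR models are extremely slow.",
        "Machine translation is slow.",
    ],
    "aarch64": [],
}


def package_arch(filename: str) -> str:
    """Extract the architecture from a name-version-release.arch.rpm file name."""
    if not filename.endswith(".rpm"):
        raise ValueError(f"not an RPM file name: {filename}")
    stem = filename[: -len(".rpm")]
    if "." not in stem:
        raise ValueError(f"no architecture suffix in: {filename}")
    # The architecture is the last dot-separated field before ".rpm".
    return stem.rsplit(".", 1)[1]


def limitations_for(filename: str) -> list[str]:
    """Return the known architecture-specific limitations for a package file."""
    return LIMITATIONS.get(package_arch(filename), [])


if __name__ == "__main__":
    for name in (
        "harbour-dsnote-4.4.0-1.armv7hl.rpm",
        "harbour-dsnote-4.4.0-1.aarch64.rpm",
    ):
        print(name, limitations_for(name))
```

The lookup covers only the limitations tied to a CPU architecture. The general note about Speech to Text accuracy for languages other than English applies to every package.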

Any comments, ideas, translations and issue reports are highly appreciated.

Translations (both Speech Note and Speech Keyboard):
All translations are very welcome. There are three ways to contribute:
- [preferred] Transifex project
- Direct GitHub pull request or GitLab merge request
- Translation file sent to me via e-mail: dsnote@mkiol.net

Source code: https://github.com/mkiol/dsnote or https://gitlab.com/mkiol/dsnote
Bugs, Feature requests: https://github.com/mkiol/dsnote/issues or https://gitlab.com/mkiol/dsnote/-/issues or just email: dsnote@mkiol.net

Application versions: 
Attachment                          Size      Date
harbour-dsnote-1.5.1-1.armv7hl.rpm  1.27 MB   17/11/2021 - 10:00
harbour-dsnote-1.5.1-1.aarch64.rpm  1.34 MB   17/11/2021 - 19:28
harbour-dsnote-1.6.1-1.armv7hl.rpm  1.31 MB   10/12/2021 - 20:52
harbour-dsnote-1.6.1-1.aarch64.rpm  1.39 MB   10/12/2021 - 20:52
harbour-dsnote-1.8.0-1.aarch64.rpm  1.44 MB   02/04/2022 - 19:40
harbour-dsnote-1.8.0-1.armv7hl.rpm  1.36 MB   02/04/2022 - 19:40
harbour-dsnote-2.0.1-1.armv7hl.rpm  6.7 MB    15/04/2023 - 16:58
harbour-dsnote-2.0.1-1.aarch64.rpm  7.86 MB   15/04/2023 - 16:58
harbour-dsnote-3.1.6-1.aarch64.rpm  92.14 MB  13/07/2023 - 16:41
harbour-dsnote-3.1.6-1.armv7hl.rpm  21.78 MB  13/07/2023 - 16:41
harbour-dsnote-4.3.0-1.armv7hl.rpm  26.42 MB  13/11/2023 - 11:59
harbour-dsnote-4.3.0-1.aarch64.rpm  27.87 MB  13/11/2023 - 11:59
harbour-dsnote-4.4.0-1.aarch64.rpm  28.47 MB  25/01/2024 - 17:11
harbour-dsnote-4.4.0-1.armv7hl.rpm  27.06 MB  25/01/2024 - 17:11
Changelog: 

4.4.0

  • Translator
    • New model: Lithuanian to English
    • Progress indicator
  • Speech to Text:
    • New language: Marathi
    • Support for Speex audio codec in 'Transcribe a file'
    • Support for multiple audio streams in a video file
  • Text to Speech:
    • New voices for Serbian and Uzbek languages (RHVoice models)

4.3.0

  • Translator
    • New model: English to Hungarian
  • Speech to Text:
    • New languages: Afrikaans, Gujarati, Hausa, Telugu, Tswana, Javanese, Hebrew
    • New engine: April-ASR. Models for: English, French and Polish.
    • Stop listening button
    • Support for Opus audio codec in 'Transcribe a file'
  • Text to Speech:
    • New Piper voices: Arabic, English, Hungarian, Polish, Czech, German, Ukrainian, Vietnamese, Serbian, French, Spanish, Nepali
    • More steps in 'Speech speed' option
    • Diacritical marks restoration before speech synthesis for Arabic
    • Fix: Exporting to audio file was not possible when text was very long
  • Other:
    • New settings option: 'Clear cache on close'
    • Cache compression (Opus format instead of raw audio)

4.2.0

  • Translator
    • New models: Hungarian to English, Finnish to English
  • Speech to Text:
    • Support for video file transcription. With the 'Transcribe a file' menu option, you can convert an audio file, or the audio from a video file, to text.
    • Whisper engine update and increase in performance. Processing time has been reduced by an average of 15% (Xperia 10 III).
  • Text to Speech:
    • Save audio in compressed formats (MP3 or Ogg Vorbis). You can also save metadata tags to the audio file, such as track number, title, artist or album.
    • Pause option. You can pause or resume speech reading.
    • Update of RHVoice voice for Uzbek
    • Fix: Piper models could not be downloaded
  • User Interface:
    • Share to Speech Note. You can push text, audio or video content to Speech Note using the share button in other apps (e.g. Notes, Gallery, Audio recorder, Browser).

4.1.0

  • Speech to Text:
    • Removal of the experimental 'Restore punctuation' option
    • Fix: Whisper wasn't able to decode short speech sentences
  • Text to Speech:
    • Option 'Speech speed' to make synthesized speech slower or faster.
    • New Piper voices: Czech, German, Hungarian, Portuguese, Slovak, English
    • Update of RHVoice voices for Slovak and Czech
    • Fix: Splitting text into sentences was incorrect for: Georgian, Japanese, Bengali, Nepali, Hindi

4.0.0

  • Translator (new feature - watch this video)
    • Support for offline translations for the following languages: Catalan, Bulgarian, Czech, Danish, English, Spanish, German, Estonian, French, Italian, Polish, Portuguese, Norwegian, Iranian, Dutch, Russian, Ukrainian, Icelandic
    • Translator uses models that were created as part of the Bergamot project and Firefox Translations.
  • User Interface:
    • The user interface has been redesigned. It is handier and supports landscape view better.
    • The application has been translated into new languages: Dutch and Italian. Many thanks to all translators <3 <3
  • Text to Speech:
    • All existing Piper models have been updated.
    • New voices for: English, Swedish, Turkish, Polish, German, Spanish, Finnish, French, Ukrainian, Russian, Swahili, Serbian, Romanian, Luxembourgish, Georgian and Slovak

For more details, check About -> Changes in the app.

Comments

sashikknox:

Cool, starting to test... Downloading the speech model takes too long.

mkiol:

Indeed, the download time might be long. The model size for English is almost 1 GB.

Pelzlurch:

For a first version, indeed quite polished. Recognition quality is not brilliant but quite OK - and very cool - offline!
The only thing I noticed negatively is that there is no automatic line break.

defactofactotum:

Thanks for this! I haven't tried it much except for simple phrases but seems to work well. Maybe in the future you could make it easier to copy lines of text to other apps.

oops just noticed now you can copy text on the pulley menu!
 I dictated this with speech note....

mkiol:

nice :D

ziellos:

Thanks a lot! Had no chance to really test speech recognition, but your app looks already very polished.
