Speech Note

Rating: 4.9 (19 votes)

Note taking and reading with speech to text and text to speech.

Speech Note lets you create and read notes using your voice. It converts speech to text and text to speech entirely offline. It supports many languages thanks to integration with the following STT/TTS engines: Coqui STT (Mozilla DeepSpeech), Vosk, Whisper, Piper, RHVoice, eSpeak-NG, MBROLA.

All voice processing is done entirely on the device. An Internet connection is required only to download models during the app's initial configuration. Speech Note respects your privacy and does not send any data to the Internet.

    Speech Note supports a large number of language models. Some of them give very good accuracy, but others are far from perfect. All models can be downloaded directly from the app.

    A detailed list of supported languages is here.

    Limitations:

    • The app does not work on the i486 architecture (e.g. Jolla Tablet).
    • Models for the Whisper engine are disabled on devices with an ARMv7 CPU (e.g. Jolla C).
    • Models for the Whisper engine are extremely slow on ARM32. In practice, they are usable only on ARM64.
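
The limitations above can be summarized as a small lookup from CPU architecture to Whisper availability. This is only an illustrative sketch, not code from the app: the function name `whisper_status` is hypothetical, and the architecture strings ("aarch64", "armv7l", "armv7hl", "i486") are assumptions about what a device reports.

```python
import platform


def whisper_status(machine: str) -> str:
    """Map a CPU architecture string to the Whisper status listed above.

    The strings checked here are assumptions about what a device reports;
    they are not taken from the app's source code.
    """
    m = machine.lower()
    if m in ("aarch64", "arm64"):
        return "usable"            # Whisper is practical only on ARM64
    if m.startswith("armv7"):
        return "disabled"          # Whisper models are disabled on ARMv7
    if m == "i486":
        return "app unsupported"   # the app itself does not run on i486
    return "unknown"


if __name__ == "__main__":
    # Report the status for the machine this script runs on.
    print(platform.machine(), "->", whisper_status(platform.machine()))
```

Running the script on a device prints its architecture and the matching status, which helps explain why a Whisper-only language does not appear in the model list on an ARMv7 device.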

    Comments, ideas, translations and issue reports are highly appreciated.

    Translations (both Speech Note and Speech Keyboard):
    All translations are very welcome. There are three ways to contribute:
    - [preferred] Transifex project
    - Direct GitHub pull request
    - Translation file sent to me via e-mail: dsnote@mkiol.net

    Source code: https://github.com/mkiol/dsnote
    Bugs, feature requests: https://github.com/mkiol/dsnote/issues or just e-mail: dsnote@mkiol.net

    Application versions: 
    Attachment                          Size      Date
    harbour-dsnote-1.5.1-1.armv7hl.rpm  1.27 MB   17/11/2021 - 10:00
    harbour-dsnote-1.5.1-1.aarch64.rpm  1.34 MB   17/11/2021 - 19:28
    harbour-dsnote-1.6.0-1.aarch64.rpm  1.39 MB   09/12/2021 - 21:32
    harbour-dsnote-1.6.0-1.armv7hl.rpm  1.31 MB   09/12/2021 - 21:32
    harbour-dsnote-1.6.1-1.armv7hl.rpm  1.31 MB   10/12/2021 - 20:52
    harbour-dsnote-1.6.1-1.aarch64.rpm  1.39 MB   10/12/2021 - 20:52
    harbour-dsnote-1.8.0-1.aarch64.rpm  1.44 MB   02/04/2022 - 19:40
    harbour-dsnote-1.8.0-1.armv7hl.rpm  1.36 MB   02/04/2022 - 19:40
    harbour-dsnote-2.0.0-1.armv7hl.rpm  6.19 MB   07/04/2023 - 18:03
    harbour-dsnote-2.0.0-1.aarch64.rpm  7.31 MB   07/04/2023 - 18:03
    harbour-dsnote-2.0.1-1.armv7hl.rpm  6.7 MB    15/04/2023 - 16:58
    harbour-dsnote-2.0.1-1.aarch64.rpm  7.86 MB   15/04/2023 - 16:58
    harbour-dsnote-3.0.0-1.aarch64.rpm  92.81 MB  22/05/2023 - 16:43
    harbour-dsnote-3.0.0-1.armv7hl.rpm  22.16 MB  22/05/2023 - 16:43
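
The package names listed above appear to follow the pattern name-version-release.arch.rpm. Assuming that pattern holds, the sketch below shows one way to pick the newest package for a given architecture. The helper names `parse_package` and `newest_for_arch` are hypothetical and exist only for this illustration.

```python
import re

# Assumed naming pattern: <name>-<version>-<release>.<arch>.rpm
PKG_RE = re.compile(
    r"^(?P<name>.+)-(?P<version>\d+(?:\.\d+)*)-(?P<release>\d+)"
    r"\.(?P<arch>[^.]+)\.rpm$"
)


def parse_package(filename):
    """Split a package file name into (name, version tuple, release, arch)."""
    match = PKG_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected package name: {filename}")
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("name"), version, int(match.group("release")), match.group("arch")


def newest_for_arch(filenames, arch):
    """Return the file with the highest (version, release) for arch, or None."""
    candidates = [f for f in filenames if parse_package(f)[3] == arch]
    if not candidates:
        return None
    # Compare versions numerically, so that e.g. 1.10.0 sorts after 1.8.0.
    return max(candidates, key=lambda f: parse_package(f)[1:3])
```

For example, calling `newest_for_arch` with the file names from the list and `"armv7hl"` selects the 3.0.0 package for 32-bit ARM devices.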
    Changelog: 

    3.0.0

    • New feature: Note reading with Text to Speech
    • Experimental option: Restore punctuation (enabled only on ARM64)

    For more details, see About -> Changes in the app.

    2.0.1

    • Translations update: Dutch, Swedish
    • Improved decoding accuracy thanks to a noise-cancelling module
    • Minor UI fixes

    2.0.0

    • New languages: Arabic, Bulgarian, Bosnian, Esperanto, Persian, Hindi, Japanese, Kazakh, Korean, Macedonian, Malay, Norwegian, Portuguese, Slovak, Serbian, Swedish, Swahili, Tagalog, Uzbek, Vietnamese
    • Support for Vosk engine and models
    • Support for Whisper engine and model (works decently only on ARM64)
    • New DeepSpeech models and update of existing ones
    • Voice Activity Detection
    • Option for text appending style
    • Option for setting default model (model which is used in Speech Keyboard)

    1.8.0

    • New languages: Finnish, Mongolian (experimental), Estonian (experimental)
    • Improved model for Polish language: Polski (mkiol)
    • Experimental German medical model: Deutsch (med)
    • New models for English: English (Coqui Huge Vocabulary), English (Coqui Large Vocabulary)
    • Improved languages browser
    • Support for SFOS 4.4 (sandboxing disabled)

    => I would be very grateful for any feedback on how good the speech transcription is for individual models.

    1.6.1

    • New German language model "Deutsch (Aashish Agarwal)" (experimental). This model might be even better than the currently configured default. I would be grateful for feedback.

    1.6.0

    • New and default listening mode: One sentence (Clicking on the bottom panel starts listening, which ends when the first sentence is recognized)
    • Cover action (When 'One sentence' mode is set, cover displays action to enable/cancel listening.)
    • Improved language viewer
    • Coqui STT lib update (v1.1.0)
    • Bug fixes and performance improvements (e.g. App starts much quicker with multiple languages enabled)

    1.5.1

    • Fix: Languages configuration wasn't loaded when app was installed for the first time

    1.5.0

    • Fix for ARM64: the app should now work
    • Model for Catalan language
    • Many "experimental" models for various languages: Dutch, Yoruba, Amharic, Basque, Turkish, Thai, Slovenian, Romanian, Portuguese, Latvian, Indonesian, Greek, Hungarian. Most of these models provide very bad accuracy :(

    1.4.0

    • Russian and Ukrainian models
    • D-Bus API and service for 3rd-party app integration (e.g. Speech Keyboard)

    1.3.0

    • Czech language model and translation (many thanks to Lukáš Karas for the contribution)
    • New additional models: French (Common Voice), Italian (Mozilla Italia)

    1.2.0

    • Option to transcribe audio file
    • Minor UI fixes and improvements

    1.0.1

    • Support for Jolla 1, Jolla C and PinePhone (alpha)
    • Speech recognition accuracy is much improved thanks to the DeepSpeech library update to version '0.10.0-alpha.3'
    • Minor UI fixes

    Comments

    PamNor's picture

    Can't find Norwegian download in settings.
    Jolla C.

    mkiol's picture

    Unfortunately, Norwegian is provided only by Whisper models, and all Whisper models are disabled on ARMv7 devices (like Jolla C). Whisper requires a lot of computing power and this old CPU can't handle it. Sorry.

    eson's picture

    Great upgrade! Thanks for the Swedish speech models. Much appreciated.

    articice's picture

    Fatal error: the to be installed harbour-dsnote-1.8.0-1.armv7hl requires 'qt5-qtmultimedia-plugin-mediaservice-gstaudiodecoder'

    Looks like there's no gstaudiodecoder for qt5-qtmultimedia-5.6.2+git31-1.12.1 in Vanha Rauma

    mkiol's picture

    On which device are you installing? This package should be available on SFOS 4.4 as well.

    At least it is available on Jolla C:

    [root@Sailfish nemo]# cat /etc/sailfish-release  
    NAME="Sailfish OS"
    ID=sailfishos
    VERSION="4.4.0.58 (Vanha Rauma)"
    VERSION_ID=4.4.0.58
    PRETTY_NAME="Sailfish OS 4.4.0.58 (Vanha Rauma)"
    SAILFISH_BUILD=58
    SAILFISH_FLAVOUR=release
    HOME_URL="https://sailfishos.org/"
    [root@Sailfish nemo]# zypper info qt5-qtmultimedia-plugin-mediaservice-gstaudiodecoder
    Loading repository data...
    Reading installed packages...
    
    
    Information for package qt5-qtmultimedia-plugin-mediaservice-gstaudiodecoder:
    -----------------------------------------------------------------------------
    Repository     : jolla
    Name           : qt5-qtmultimedia-plugin-mediaservice-gstaudiodecoder
    Version        : 5.6.2+git31-1.12.1.jolla
    Arch           : armv7hl
    Vendor         : meego
    Installed Size : 23.8 KiB
    Installed      : Yes (automatically)
    Status         : up-to-date
    Source package : qt5-qtmultimedia-5.6.2+git31-1.12.1.jolla.src
    Summary        : Qt Multimedia - GStreamer audio decoder media service
    Description    :  
       This package contains the GStreamer audio decoder plugin for QtMultimedia
    
    articice's picture

    It's Xperia 10 Plus.

    Perhaps this issue only applies to aarch64.

    pkcon install harbour-dsnote
    Fatal error: the to be installed harbour-dsnote-1.8.0-1.armv7hl requires 'qt5-qtmultimedia-plugin-mediaservice-gstaudiodecoder', but this requirement cannot be provided
    
    pkcon search qt5-qtmultimedia-plugin-mediaservice-gstaudiodecoder
    Available       qt5-qtmultimedia-plugin-mediaservice-gstaudiodecoder-5.6.2+git29-1.11.1.jolla.armv7hl (jolla)   Qt Multimedia - GStreamer audio decoder media service

     

    unsocialcortex's picture

    Re 1.6.1 patchnotes:

    just tested this wonderful app out for a while and "Deutsch (Aashish Agarwal)" seems very inferior to "Deutsch (Jaco)". tried some normal conversation as well as nicely read out sentences using my xa2 for both, and a lot more words just got completely garbled or left out with "Aashish Agarwal".

    mkiol's picture

    Thank you so much for the feedback. Would you be able to evaluate "Deutsch (med)" as well? This model is available in version 1.8.0.

    unsocialcortex's picture

    so i'm no doctor or anything, but i tested "med" a bit using some medical vocabulary and excerpts from german medical journals. "jaco" always gets more in general from sentences. for the medical terms they both miss words or get them wrong regularly, but "jaco" gets closer in my experience by doing *something* instead of nothing in some cases.

    all in all german deepspeech is obviously nowhere near english, but it's not bad for normal people conversation

    JayJay's picture

    Real nice work! The app is really cool. Is there any option to customize the vocabulary? (I would need German medical language with drug recognition and medical vocabulary... is there maybe a file I can download or buy?) If not... that would be an awesome new feature if I could add new vocabulary myself :-)

    rdomschk's picture

    Perfect Work!  A big Thank You from me...

    inta's picture

    Thanks for the great work, now it runs on arm64 and it works really well. :)

    inta's picture

    Languages still do not load here. Is there anything I have to clean up? I removed the settings folder from .config and the models dir inside Downloads.

    mkiol's picture

    Sorry, silly me. I forgot to upload 1.5.1 package for aarch64. It should be available in a moment.

    inta's picture

    The app does not "hang" anymore on startup and uninstall works, but the language list in the settings is empty (Xperia 10 II), so I can not choose a model to get started with.

    mkiol's picture

    Fixed in 1.5.1. I would be grateful if you could check whether the problem is resolved. Thanks.

    mkiol's picture

    Oh dear. I know what is wrong. I will fix it tomorrow.

    inta's picture

    @robthebold 10 II, so @mkiol could be right that this is an arm64 issue. Never mind, force uninstall worked and I'll try it again if you need someone to test it.

    dubliner's picture

    While version 1.3 worked flawlessly under SFOS 3.4, it seems the new version 1.4 runs into a problem. All I get is "Language is not configured". When I open the settings, there are "no languages", nothing is displayed.

    Curiously, the old "Downloads/DeepSpeech models" directory was still there, populated with "de.scorer  de.tflite  en.scorer  en.tflite". Pointing the "Location on language files" to that directory does not make any difference.

    I also tried deleting "Downloads/DeepSpeech models" as well as ".config/harbour-dsnote" to get a fresh start. Unexpectedly, that ".config/harbour-dsnote" is not re-created after starting DeepSpeech Note.

    Starting from the CLI I receive this output:

    $ harbour-dsnote
    [D] unknown:0 - cannot load translation: "C" "/usr/share/harbour-dsnote/translations"
    [D] unknown:0 - cannot load default translation
    [D] unknown:0 - starting configuration
    [D] unknown:0 - Using Wayland-EGL
    [W] unknown:0 - cannot open models file
    [W] unknown:0 - cannot open lang models file
    [D] unknown:0 - [app => dbus] call KeepAliveService
    [W] unknown:247 - file:///usr/lib/qt5/qml/Sailfish/Silica/private/TextBase.qml:247: TypeError: Cannot call method 'createObject' of null
    [W] unknown:0 - cannot reload service because is's not running
    [D] unknown:0 - [dbus => app] signal ModelsPropertyChanged
    [D] unknown:0 - [dbus => app] signal StatusPropertyChanged: 2
    [D] unknown:0 - [dbus => app] signal ModelsPropertyChanged
    [D] unknown:0 - [dbus => app] signal StatusPropertyChanged: 1
    [D] unknown:0 - [app => dbus] get DefaultModel
    [D] unknown:0 - [app => dbus] get CurrentTask
    [W] unknown:0 - ignore update speech
    [D] unknown:0 - [app => dbus] call KeepAliveService

    Any help would be appreciated, especially since I really love this application!

    dubliner's picture

    Update: When I copied ".config/harbour-dsnote" and ".local/share/harbour-dsnote" as well as "Downloads/DeepSpeech models" from another phone running SFOS 4.2 it works!!! Yay!

    Not sure, though, why the ".local/share/harbour-dsnote" directory was not created and populated on the first try?!

    P.S. Now Speech keyboard is not working on the SFOS 3.4 phone. I get the logo (three vertical lines) with strikethrough symbol.

    robthebold's picture

    I installed this on my Xperia 10 II, can't seem to make it work . . . When I start the app, I see an error "Unable to start service" pop up. As I'd expect for this error, speech recognition doesn't work, and when I go to Settings, there are no languages to choose from.

    I was going to uninstall and reinstall the app, but Storeman can't uninstall it and when I try to uninstall from terminal a "scriptlet" fails, saying it can't stop the service because it isn't running and uninstalling fails.

    I've also tried starting the service manually from the terminal, but that didn't work. I'm not totally sure I did that right, though: as root I tried "systemctl start harbour-dsnote.service" and "systemctl start --user harbour-dsnote.service", and both fail with the message "Unit harbour-dsnote.service not found."

    "rpm -rl harbour-dsnote"  led me to check to make sure /usr/lib64/systemd/user/harbour-dsnote/ exists, and it does.

    Any ideas on how I can fix this or debug further? If more details are needed I can find my glasses and copy/paste stuff from terminal

    mkiol's picture

    I'm sorry for this mess. Most likely something is wrong with the arm64 package. To be honest, I did not test it because I don't have any arm64 device yet.

    To force uninstall, run the following in a terminal:

    devel-su rpm --erase --allmatches --noscripts harbour-dsnote harbour-dskeyboard
    

    I will investigate what went wrong tomorrow. Sorrrry.

    inta's picture

    I tried to install this app and the keyboard app, but the list of languages in the settings is empty. I cannot remove this app, it fails with the message that the service is not running. Any idea how to fix that?

    robthebold's picture

    I didn't realize you posted this issue before me -- I'm getting the same problem. What device are you using?

    PamNor's picture

    @mkiol, I'll continue searching for a Norwegian *.tflite file. Keep up your good work.

    PamNor's picture

    Is there a possibility to get a speech model for the Norwegian language?
    https://www.google.com/url?sa=t&source=web&cd=&ved=2ahUKEwi53uqAnpj0AhVC...

    mkiol's picture

    I would really like to add such support, but unfortunately I wasn't able to find any DeepSpeech model for Norwegian (usually a file with the *.tflite extension) :(

    lispy's picture

    A big thank you for the Transcribe Audio File feature. Made my day!!!

    eson's picture

    How about more language models, Swedish in particular? ;)

    mkiol's picture

    I've tried but unfortunately I didn't find any available DeepSpeech model for Swedish. If you find one I will be pleased to add it.
