Vesper-assistant

What is this?

This is Vesper, a voice-based personal assistant that is meant to get to know you and help you on a daily basis, based on the information he learns about you.

For now, this is only his shell: the AI is missing, but almost all the automation and principles are in place. He can hear you out, answer based on the topic or your emotions, and store the conversations. I plan to grow this project step by step as my understanding of AI and ML deepens.

How does it work?

Vesper has "ears" and a "voice" of his own. Your computer's mic must be turned on for him to listen to you. Because he was created to hold conversations, short or long, you need to say one of the special wake-up words to make him receptive; otherwise, he won't listen. His voice shifts with the emotion he senses in your voice, so he will sound neutral, happy or sad alongside you.
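As a rough illustration of the three tones, here is a minimal sketch that maps a sentiment polarity score to a tone. It assumes the score is a float between -1.0 and 1.0, which is the range textblob reports for polarity. The function name tone_from_polarity and the 0.2 threshold are assumptions made for this example, not the values used in Vesper's code.

```python
def tone_from_polarity(polarity: float, threshold: float = 0.2) -> str:
    """Map a polarity score in [-1.0, 1.0] to one of three tones.

    The threshold is a hypothetical value chosen for illustration;
    scores between -threshold and +threshold count as neutral.
    """
    if polarity >= threshold:
        return "happy"
    if polarity <= -threshold:
        return "sad"
    return "neutral"


if __name__ == "__main__":
    # Print the tone chosen for a few sample scores
    for score in (0.8, 0.05, -0.6):
        print(score, tone_from_polarity(score))
```

A wider threshold keeps Vesper neutral more often, which may help when the transcription is noisy.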

To keep his memory tidy, I split it in two: an audio memory saved in folders on your computer, which must follow the format described in the How to use it (so far) section, and a written memory that uses a JSON file for faster "remembering" of things about the user. The folder-based memory is a little tricky because it is segmented by topics, which are saved as "tags" inside the program (you can change the tags as you like; the code is not set in stone), and Vesper responds according to the specific tag he recognizes.
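The written memory could look roughly like the sketch below. This is only an illustration: the helper names load_memory and remember, and the layout of the JSON file (one list of facts per tag), are assumptions for the example and may not match what memory_manager.py actually does.

```python
import json
from pathlib import Path


def load_memory(path):
    """Return the stored memory as a dict, or an empty dict if the file is missing."""
    file_path = Path(path)
    if not file_path.exists():
        return {}
    with file_path.open("r", encoding="utf-8") as handle:
        return json.load(handle)


def remember(path, tag, fact):
    """Append a fact under a tag and write the whole memory back to disk."""
    memory = load_memory(path)
    # Hypothetical layout: {"tag": ["fact 1", "fact 2", ...]}
    memory.setdefault(tag, []).append(fact)
    with Path(path).open("w", encoding="utf-8") as handle:
        json.dump(memory, handle, indent=2, ensure_ascii=False)
    return memory
```

Reading and rewriting the whole file on every call is simple and fine for a small memory; a larger one would probably need something smarter.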

Reasons behind the project

Have you ever looked at something and asked yourself: "Can I build that?" Well, Vesper is a modest take on JARVIS, Siri, Bixby and all the chatbots that have a voice. I was curious about how they work and how I could build my own version.
Along the way, I learned how to apply some logic and how to combine different human-like functions into one big project. This is my starting point for learning about the debates around AI.

Languages & Tools

Python
Libraries/Packages:
- textblob (for the sentiment polarity of a text reply);
- datetime (for timestamps and for calculating the duration in seconds of a reply or a pause in speech);
- pydub -> AudioSegment (for the audio path);
- os (for audio and storing);
- json (for the written memory);
- sounddevice (for the mic to record the voice of the user);
- vosk + wave (for transcribing the audio);
- soundfile (for writing Vesper's answer to an audio file);
- pyttsx3 (for Vesper's voice, of course);
- subprocess (for playing Vesper's recorded audio reply).

How to use it (so far)

First of all, make sure you have downloaded and installed all the packages mentioned above so that everything runs smoothly.
ATTENTION: the memory is structured by rules, using tags and interests, so make sure you follow these memory management rules:
- in the folder with the source code, add a folder named voice_cache
- your path for a known topic should be: voice_cache/memory/{tag}
- if you are talking about something that is not so important, your path should be: voice_cache/temp
- each folder, temp or tagged, will hold the recordings of Vesper talking to you
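The folder rules above can be sketched as a small helper that picks the right folder for a reply. The tag names in KNOWN_TAGS and the function name folder_for are made up for this example; only the voice_cache/memory/{tag} and voice_cache/temp layout comes from the rules above.

```python
from pathlib import Path

# Hypothetical tags for illustration; the real ones live inside the program
KNOWN_TAGS = {"work", "music"}


def folder_for(tag, base="voice_cache"):
    """Return the folder where a reply about this tag should be stored.

    Known tags go to voice_cache/memory/{tag}; anything else goes to voice_cache/temp.
    """
    base_path = Path(base)
    if tag in KNOWN_TAGS:
        return base_path / "memory" / tag
    return base_path / "temp"


if __name__ == "__main__":
    # Create the folder for a known topic if it does not exist yet
    target = folder_for("music")
    target.mkdir(parents=True, exist_ok=True)
    print(target)
```

Calling mkdir with parents=True and exist_ok=True means the folders are created on first use and repeated runs do not fail.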
After you clone or download the repo, run the core.py file (as the name suggests, this is Vesper's heart).
As soon as you run the program, small hints about its progress will be displayed, such as "I'm listening..." when Vesper is waiting for your reply.
To start a conversation, please use one of the wake-up words:
- vesper
- hey assistant
- help
- what should I do
- I'm stuck
When you are done talking, stop Vesper with one of the stop words:
- that's all
- stop listening
- bye
- we're done
- we are done
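To show how the wake-up and stop words might drive the conversation, here is a minimal sketch that checks a transcribed sentence against the two lists above. The function names contains_phrase and next_state are assumptions for this example, and the simple substring check is my simplification; the matching in core.py may work differently.

```python
WAKE_WORDS = ("vesper", "hey assistant", "help", "what should i do", "i'm stuck")
STOP_WORDS = ("that's all", "stop listening", "bye", "we're done", "we are done")


def contains_phrase(transcript, phrases):
    """Return True if any of the phrases appears in the lowercased transcript."""
    text = transcript.lower()
    return any(phrase in text for phrase in phrases)


def next_state(listening, transcript):
    """Return whether Vesper should be listening after hearing this transcript."""
    if not listening:
        # Asleep: only a wake-up word makes him receptive
        return contains_phrase(transcript, WAKE_WORDS)
    # Awake: keep listening until a stop word is heard
    return not contains_phrase(transcript, STOP_WORDS)


if __name__ == "__main__":
    listening = False
    for line in ("nice weather today", "hey assistant", "tell me a story", "ok, that's all"):
        listening = next_state(listening, line)
        print(f"{line!r} -> listening={listening}")
```

Note that a plain substring check also fires on longer words, so "goodbye" would count as "bye" and "helpful" as "help". Matching on whole words would be stricter.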

Lessons learnt

Working with libraries and packages; linking the files so they respond as one
Breaking a big, dreamy project idea into something doable; structuring plans and files into modules
Creating a voice is harder than almost anything; the pitch that carries the emotion of a plain text is powerful
Paying attention when working with physical memory, even in folder format; things get messy fast

Challenges

At some point, because I didn't link the files properly, I created a digital parrot. I know it must sound funny, but a single line of code gave me hours of struggle.
The sensitivity of the mic and the permissions are important. Transcription accuracy was already a challenge because of my accent, but when the sensitivity of my own mic created new obstacles, I realised how important some minor aspects can actually be.
Balancing emotional tone with technical limitations (I am still not 100% content with this). Working with free libraries and packages is sometimes not the most professional way to do things. The assistant's voice sounds pretty good in neutral or sad tones, but... well, let's just say that when he is happy, he is another person.
Errors where some audio packages clashed with my computer's settings and permissions: the program would crash or even freeze, and I wasn't able to store Vesper's audio responses in folders. That's how the audio_config.py file was born.
