rohanricky - Tumblr blog

rohanricky · 6 years ago

Jarvis - My Personal Assistant

Did you watch the movie Iron Man? Here’s my love for the movie in the form of my own personal assistant, named Jarvis.

Basic function of Jarvis: do what is commanded, that is, execute scripts for specific tasks.

I designed a server that can process and authenticate requests from any number of sources. By sources, I mean a mobile app, a desktop app, a browser extension and many more. I can even send commands remotely.
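
As a rough illustration of what "authenticate requests from any number of sources" can look like, here is a minimal sketch. It assumes each source holds its own shared secret and signs every command with HMAC-SHA256; the post does not say how Jarvis actually authenticates, so the scheme, the `CLIENT_KEYS` table and the function names are all hypothetical.

```python
import hashlib
import hmac

# Assumption: each client (mobile app, desktop app, browser extension) holds
# its own secret key. The source names and keys below are hypothetical.
CLIENT_KEYS = {
    "telegram": b"telegram-secret",
    "desktop": b"desktop-secret",
}


def sign(source: str, command: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the source and the command."""
    message = f"{source}:{command}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def authenticate(source: str, command: str, signature: str) -> bool:
    """Accept a request only if the source is known and the signature matches."""
    key = CLIENT_KEYS.get(source)
    if key is None:
        return False
    expected = sign(source, command, key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

A client would call `sign(...)` with its own key and send the result along with the command; the server only runs the command if `authenticate(...)` returns `True`.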

For my desktop app, I used Tkinter (Python’s standard GUI toolkit) to build the interface.

This is the GUI I was talking about. It accepts input in two ways: type a command and press Enter, or click the voice icon on the right and speak.

For my mobile app, I used Telegram’s bot interface. I also created a Chrome extension to operate inside the browser.

The above is the Telegram interface, which sends commands to the server running on my computer, where they are executed. The first command, “download how deep is your love”, downloads the song. The “Wifi” command toggles the Wi-Fi hotspot on my computer, and the “Shut Down” command shuts the computer down. There are many such commands that perform tasks, even remotely, to make my life easier. I kept Jarvis’s response the same for every command because writing individual replies was a really boring task.

However, I was not at all satisfied with Jarvis, because it only understood commands in a fixed format. To open any website, I had to say “open youtube”. The script needed the first word (open) to know that it should open a website. I couldn’t even use “youtube open”. Imagine the pain!
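
The limitation is easy to see in a small sketch of first-word dispatching. This is an illustration only, not the project’s actual code: the `SITES` table, `open_site` and `dispatch` are hypothetical names, and the handler just returns a URL instead of opening a browser.

```python
from typing import Callable, Dict, Optional

# Hypothetical site table; the real project may resolve sites differently.
SITES = {
    "youtube": "https://www.youtube.com",
    "github": "https://github.com",
}


def open_site(argument: str) -> Optional[str]:
    """Return the URL for a known site name (a real script would open it)."""
    return SITES.get(argument.strip().lower())


# The first word of the command selects the handler.
HANDLERS: Dict[str, Callable[[str], Optional[str]]] = {"open": open_site}


def dispatch(command: str) -> Optional[str]:
    """Route a command by its first word; return None if nothing matches."""
    keyword, _, argument = command.strip().partition(" ")
    handler = HANDLERS.get(keyword.lower())
    if handler is None:
        return None
    return handler(argument)
```

Here `dispatch("open youtube")` finds the `open` handler, while `dispatch("youtube open")` returns `None` because `youtube` is not a known keyword.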

I wanted to say something like, “Hey Jarvis! Can you open youtube for me?”. For this flexibility I had to use NLP. I didn’t know NLP initially, so I used Facebook’s Wit.ai (not a framework!) and trained it on specific intents and entities. It was working!

I wrote code to read the confidence score from Wit.ai’s response. If the score is greater than 70%, the matching script is executed.
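
A minimal sketch of that threshold check is shown here. It assumes the Wit.ai response has already been parsed into a dict with an `intents` list of `name` and `confidence` entries, as newer Wit.ai API versions return; the original code may have read a different response shape, and `pick_intent` is a hypothetical name.

```python
from typing import Any, Dict, Optional

CONFIDENCE_THRESHOLD = 0.7  # the 70% cut-off mentioned in the post


def pick_intent(response: Dict[str, Any],
                threshold: float = CONFIDENCE_THRESHOLD) -> Optional[str]:
    """Return the most confident intent name, or None if it is not above threshold."""
    # Assumed shape: {"intents": [{"name": "...", "confidence": 0.93}, ...]}
    intents = response.get("intents") or []
    if not intents:
        return None
    best = max(intents, key=lambda intent: intent.get("confidence", 0.0))
    if best.get("confidence", 0.0) > threshold:
        return best.get("name")
    return None
```

The returned name can then be used to look up the script to run; a `None` result means Jarvis should not execute anything.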

Finally! I had what I wanted, but it was really slow. Just look at the process:
1. Take the audio recording and turn it into text (send a request to Google).
2. Send the text to Wit.ai’s endpoint.
3. Pray that the internet connection is strong and that their servers are running.
4. Get the confidence score.
5. Go to the matching script and execute it.

The above process takes an unpredictable amount of time, because it depends on two network round trips. As expected, it caused a noticeable lag. Designing my own NLP system sounded promising!

learning phase...

I designed a model that might work and started collecting data specific to the application. Like most people, I used an LSTM-based RNN on top of pretrained word embeddings available on the internet. The output of the RNN passes through a softmax layer, which picks the script to execute and gives a confidence score. I am still working on this model; it gives correct results sometimes but is not reliable yet. This may be due to the intent-specific training data, which I have to provide myself.
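
To make the last step concrete, here is a small sketch of how a softmax layer turns raw network scores into a script choice with a confidence. It is plain Python rather than a trained model, and the `SCRIPTS` labels are hypothetical stand-ins for whatever scripts Jarvis actually has.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical script labels, one per output of the network.
SCRIPTS = ["open_site", "download_song", "toggle_wifi", "shut_down"]


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def choose_script(scores: Sequence[float],
                  labels: Sequence[str] = SCRIPTS) -> Tuple[str, float]:
    """Return the most likely script and its softmax probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]
```

The second value returned by `choose_script` plays the same role as the Wit.ai confidence score, so the same kind of threshold can decide whether to run the script.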

Jarvis in the movie executes commands based on voice. So, I used Python’s SpeechRecognition module to capture audio input and Google Cloud’s APIs to convert it into text (speech-to-text, STT).

I will keep posting regular updates on what I have been working on for Jarvis.

Things that need to be done:
1. Write more scripts and generalise them.
2. Combine Jarvis with a home automation module to control appliances at home.
3. Sending audio data to Google Cloud and waiting for the response creates an undesired lag, so I have to build my own audio-to-text neural net.
4. Package everything with Docker to resolve dependency issues. Most of you will face these issues while trying out the project.

GitHub: https://github.com/rohanricky/Jarvis

rohanricky · 6 years ago

Home Security System using Python and OpenCV

Home surveillance systems start at $300 for a basic setup. I wanted to make one of my own. As an electronics student, I got a chance to work with embedded systems along with OpenCV, Python and Keras.

The minimum features we need in a home security system are:
1. Movement detection when we are not at home (detecting an intruder through movement).
2. Face recognition to identify the residents (the code should not record us all the time, which saves data).

I used Python and OpenCV to work with the frames generated by the camera. Generally, products sold by home surveillance companies detect faces in every frame and raise a red signal if a face is unknown. The computation power needed to do this is huge, and it creates a large lag between the live video and the processed frames. This is why most of these products need a dedicated CPU setup for monitoring. I wanted minimalistic and reliable software, so I implemented a two-stage approach. The lag is generated because all the images are sent to the face recognition algorithm (30 frames/second). What if we sample and send only the important frames?

My first layer checks if there is movement in the video. To do this, I computed the difference between consecutive frames (cv2.absdiff(frame1, frame2)) and measured the area that changed. If there is movement, the changed area will be higher than a threshold value (so the movement of a curtain due to wind won’t be detected). If 10 frames pass this threshold, only 2 of them reach the face recognition algorithm, which checks whether the face is known. If there is someone unknown, the recorded video is sent to Google Drive (Google Drive Developer API) in 10-second bursts and I receive an SMS alert.
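
The first layer can be sketched roughly as follows. To stay dependency-free, this sketch uses plain Python lists as grayscale frames in place of OpenCV images, where `cv2.absdiff(frame1, frame2)` would compute the per-pixel difference; the threshold values and the function names are hypothetical, not the ones used in the project.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale image as rows of 0-255 values


def changed_area(previous: Frame, current: Frame,
                 pixel_threshold: int = 25) -> int:
    """Count pixels whose brightness changed by more than pixel_threshold."""
    return sum(
        1
        for row_prev, row_cur in zip(previous, current)
        for p, c in zip(row_prev, row_cur)
        if abs(p - c) > pixel_threshold
    )


def has_motion(previous: Frame, current: Frame,
               area_threshold: int = 4, pixel_threshold: int = 25) -> bool:
    """Report motion only when the changed area exceeds area_threshold."""
    return changed_area(previous, current, pixel_threshold) > area_threshold


def sample_frames(moving_frame_indices: Sequence[int],
                  every: int = 5) -> List[int]:
    """Keep one in `every` moving frames, e.g. 2 out of 10 when every=5."""
    return list(moving_frame_indices[::every])
```

Only the frames returned by `sample_frames` would be passed on to the face recognition step, which is what keeps the second layer cheap.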

What about the face recognition algorithm?

Initially I used OpenCV even for face detection (length of the nose, distance between the eyebrows, etc.). I modified the face_recognition module in Python to detect the faces of my family members. I soon found out that OpenCV does not fare well with side-angle detection or with the changing contrast throughout the day. I then shifted to the VGG-16 model and added a softmax layer with 5 outputs as the 17th layer. The 5 outputs label the people: 4 for family members and 1 for unknown. I trained the model on the known and unknown images and its performance was excellent.

I used a webcam while developing this project, but soon bought a Raspberry Pi with a Pi Camera and installed my code on it with minor changes. Now, it works like a charm.

GitHub link: https://github.com/rohanricky/home-security

Visit the link and try it. I will soon release a Docker image so that you don’t have to struggle to install the dependencies.

Cheers.
