Neural style transfer

Among the many applications of neural networks, there is an interesting field called neural style transfer. This term refers to an algorithm that takes as input a content image (e.g. a turtle) and a style image (e.g. artistic waves) and returns the content of the content image as if it were ‘painted’ using the artistic style of the style image. This technique was initially proposed by Gatys et al. in 2015 and the good thing is that it does not require any new foundation: it just uses well-known loss functions. In short, we define two loss functions, one for the content (DC) and one for the style (DS). DC measures how different the content is between two images, while DS measures how different the style is between two images. Then we take a third image, the input (e.g. white noise), and we transform it in order to minimize both its content-distance from the content image and its style-distance from the style image.
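
To make the idea a bit more concrete, here is a toy sketch of the combined objective (my own simplification, not the actual Gatys et al. code): in the real algorithm the feature maps come from a pretrained CNN and the input image is updated by gradient descent until the weighted sum of the two distances is small.

import numpy as np

# Toy illustration of the objective: in practice the "feature maps" below are
# extracted from a pretrained CNN, not random arrays.
def gram(feat):
    # Gram matrix, the usual way style statistics are summarized
    f = feat.reshape(feat.shape[0], -1)
    return f @ f.T

def content_distance(x_feat, c_feat):               # DC
    return np.mean((x_feat - c_feat) ** 2)

def style_distance(x_feat, s_feat):                 # DS
    return np.mean((gram(x_feat) - gram(s_feat)) ** 2)

def total_loss(x_feat, c_feat, s_feat, alpha=1.0, beta=1e3):
    # The input image is optimized to make this weighted sum small
    return alpha * content_distance(x_feat, c_feat) + beta * style_distance(x_feat, s_feat)

# Toy demo with random "feature maps" for the input, content and style images
rng = np.random.default_rng(0)
x, c, s = rng.random((3, 8, 16, 16))
print(total_loss(x, c, s))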


Johnson et al. (2016) built on the work of Gatys et al., proposing a neural style transfer algorithm that is up to three orders of magnitude faster. The Johnson et al. method frames neural style transfer as a super-resolution-like problem based on perceptual loss functions. While the Johnson et al. method is certainly fast, the biggest downside is that you cannot arbitrarily select your style images like you could in the Gatys et al. method. Instead, you first need to explicitly train a network to reproduce the style of your desired image. Once the network is trained, you can then apply it to any content image you wish. You should see the Johnson et al. method as more of an “investment” in your style image: you had better like your style image, as you’ll be training your own network to reproduce its style on content images.

Johnson et al. provide documentation on how to train your own neural style transfer models on their official GitHub page.

Finally, it’s also worth noting that in Ulyanov et al.’s 2017 publication, Instance Normalization: The Missing Ingredient for Fast Stylization, it was found that swapping batch normalization for instance normalization (and applying instance normalization at both training and testing) leads to even faster real-time performance and arguably more aesthetically pleasing results as well.
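
For reference, in PyTorch terms the swap boils down to something like the following (a minimal illustration of the idea, not Ulyanov et al.’s actual network):

import torch.nn as nn

# A normalization layer from the image transformation network, before and after the swap
bn = nn.BatchNorm2d(128)                       # normalizes statistics across the whole batch
inorm = nn.InstanceNorm2d(128, affine=True)    # normalizes each image independently

# Instance normalization is applied at both training and test time,
# normalizing every image in the batch on its own.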

 

Python implementation

There are many implementations of this algorithm on the web. I started from the code I extracted from this tutorial because it only depends on python3 and opencv-python. There are other feature-rich and fast implementations that use pytorch and cuda, but I experienced some installation issues and, for this reason, I temporarily gave up.

To create the environment, I installed the opencv-python package

 

pip3 install opencv-python

 

Then I simply executed the code extracted from the tutorial on a sample image. The first thing to do is to download the models and norms from Johnson's web pages

 

python3 init.py --download

 

The style transfer can now be tested on a sample image

 

python3 init.py --image ./image.jpg

 

To run a real-time style transfer on the video captured by the camera, simply run

 

python3 init.py

 

 

Emotion settings

The next part of the project is to create the controls that the user will adjust according to his or her feelings at the moment the picture is taken.

I implemented a control panel in PyQt. The control panel includes a push button to save the picture and six sliders, one for each of the following basic feelings

  • fear
  • sadness
  • surprise
  • happiness
  • disgust
  • anger

Each basic feeling will be associated with a certain style, and the percentage associated with each feeling will be used to make a weighted composition of styles to apply to the original image.

The control panel updates a file on the Raspberry Pi's file system whenever one of the slider values changes. The file is then read by the style transfer application.
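
Below is a simplified sketch of such a panel in PyQt5. The file name values.txt and its one-value-per-line format are assumptions for illustration, and the actual "save picture" action is omitted:

# Simplified sketch of the control panel (not the exact code I used).
import sys
from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import (QApplication, QWidget, QVBoxLayout,
                             QLabel, QSlider, QPushButton)

FEELINGS = ["fear", "sadness", "surprise", "happiness", "disgust", "anger"]

class ControlPanel(QWidget):
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout(self)
        self.sliders = []
        for name in FEELINGS:
            layout.addWidget(QLabel(name))
            slider = QSlider(Qt.Horizontal)
            slider.setRange(0, 100)                        # percentage for this feeling
            slider.valueChanged.connect(self.save_values)  # update the file on every change
            layout.addWidget(slider)
            self.sliders.append(slider)
        layout.addWidget(QPushButton("Save picture"))      # the save action is omitted here

    def save_values(self):
        # Write the six percentages, one per line, for the style transfer app to read
        with open("./values.txt", "w") as f:
            for slider in self.sliders:
                f.write(str(slider.value()) + "\n")

app = QApplication(sys.argv)
panel = ControlPanel()
panel.show()
sys.exit(app.exec_())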

 

The algorithm to mix styles is implemented in the function predict_all. Here we re-scale the percentages of each style into an array of weights and apply each weight to the image generated by the corresponding style transfer network

 

def predict_all(img, values, h, w):
    # Pre-process the frame into a blob, as required by OpenCV's dnn module
    blob = cv.dnn.blobFromImage(img, 1.0, (w, h),
        (meanX, meanY, meanZ), swapRB=False, crop=False)

    # Normalize the slider values so that the weights sum to 1
    total = sum(values)
    if total == 0:
        return None

    weights = [value / total for value in values]

    # Run every style network with a non-zero weight and accumulate
    # the weighted outputs into a single image
    out = None
    for i, net in enumerate(nets):
        if weights[i] != 0:
            print("[INFO] Applying model " + str(i) + ", weight: " + str(weights[i]))
            if out is None:
                out = predict(blob, net, h, w) * weights[i]
            else:
                out += predict(blob, net, h, w) * weights[i]

    return out

 

The output of the predict_all function is then visualized by means of OpenCV's imshow function.
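
Putting the pieces together, the display loop looks roughly like the sketch below. It reuses predict_all from the listing above (together with its globals), while the read_values helper and the values.txt file are assumptions that mirror the control panel sketch:

import cv2 as cv

def read_values(path="./values.txt"):
    # Hypothetical helper: read the six slider percentages written by the control panel
    try:
        with open(path) as f:
            return [float(line) for line in f if line.strip()]
    except OSError:
        return [0] * 6

cap = cv.VideoCapture(0)                 # the Raspberry Pi camera
while True:
    ret, frame = cap.read()
    if not ret:
        break
    h, w = frame.shape[:2]
    out = predict_all(frame, read_values(), h, w)   # weighted blend of the styled frames
    if out is not None:
        cv.imshow("Emoticam", out)
    if cv.waitKey(1) == 27:              # press ESC to quit
        break

cap.release()
cv.destroyAllWindows()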

 

Sharing pictures

My initial plan was to install either a Pimoroni Enviro or a Pimoroni Automation HAT to get some feedback from the environment that could have been "merged" into the picture, but there were some mechanical issues with the supports of the plexiglass case. So for the moment I put this feature on standby and instead implemented the ability to send pictures to your Telegram account. To make your Emoticam talk with your Telegram account, follow these steps

  • Search for a Telegram contact named "BotFather"
  • Type "/start" to start chatting with the bot
  • Type "/newbot" to create a new bot. You will be asked to enter a name and a username for the new bot. BotFather will reply with the token you need to insert in your Python script

  • Because we need to send notifications to the Telegram account, we need the unique identifier of the user. There is no easy way to find the user id other than invoking a specific API to get the latest messages received by the bot and reading the user id there. So, from your Telegram account, search for the newly-created bot and send it a message. Then, in your browser, open the URL https://api.telegram.org/bot<YOUR_BOT_TOKEN>/getUpdates. This will show the list of pending messages. The id we are looking for is the value of the "id" field inside the "chat" object (see also the short snippet after the bot skeleton below)

   

  • Install the Python Telegram library

          pip3 install telepot

 

  • We have everything we need to create our bot. The skeleton of the bot is in the listing below

 

import os        # Needed to check for and remove the shared picture
import datetime  # Importing the datetime library
import telepot   # Importing the telepot library
from telepot.loop import MessageLoop    # Library function to communicate with the telegram bot
from time import sleep      # Importing the time library to provide the delays in the program

def handle(msg):
    chat_id = msg['chat']['id'] # Receiving the message from telegram
    command = msg['text']   # Getting text from the message

    print('Received:')
    print(command)

    # code to handle incoming commands

# Insert your telegram token below
bot = telepot.Bot('<YOUR TOKEN>')
print(bot.getMe())

# Start listening to the telegram bot; whenever a message is received, the handle function will be called.
MessageLoop(bot, handle).run_as_thread()
print('Listening....')

while 1:
    sleep(10)
    if os.path.exists('./share') and os.path.exists('./picture.jpg'):
        # Insert the chat id you retrieved with getUpdates below
        bot.sendPhoto(chat_id='<CHAT ID>', photo=open('./picture.jpg', 'rb'))
        os.system('rm -f ./picture.jpg')
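
As a side note, the chat id can also be retrieved programmatically instead of reading the JSON in the browser. The snippet below is just a sketch of that step; it assumes the requests package is installed and that at least one message has already been sent to the bot:

import requests

TOKEN = '<YOUR TOKEN>'   # the token received from BotFather

# Ask Telegram for the pending updates and print the chat id of each message
updates = requests.get('https://api.telegram.org/bot' + TOKEN + '/getUpdates').json()
for update in updates.get('result', []):
    chat = update.get('message', {}).get('chat', {})
    print(chat.get('id'))   # this is the value to use as the chat id in sendPhoto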

 

 

Building the Emoticam

The components of the Emoticam are

  • a Raspberry Pi 4
  • a 5" 800x480 DSI display
  • a Raspberry Pi camera
  • 2 mm Plexiglass

 

Assemble the Raspberry Pi and the display by means of four bolts, as shown in the picture below.

 

Then, you need 5 pieces of Plexiglass:

  • 12.5 x 8.6 cm
  • 12.1 x 4 cm (2 pieces)
  • 8.6 x 4 cm (2 pieces)

Glue the pieces together to make a box, then drill 4 holes to fix the box to the Raspberry Pi and display assembly and another 4 holes to fix the Raspberry Pi camera.

 

Use screws and bolts to keep everything in place

 

 

Final touches

Now some final touches

 

Hide desktop taskbar

The procedure to hide the desktop taskbar is quite simple

  • Open an SSH terminal
  • Go to the /etc/xdg/lxsession/LXDE-pi folder

          cd /etc/xdg/lxsession/LXDE-pi

  • Edit the file autostart

          sudo nano autostart

  • Comment out the line "@lxpanel --profile LXDE-pi" by inserting a "#" character at the start of the line

    

  • Press CTRL-X, then confirm with Y and ENTER, to save the changes and exit the editor

 

Create the script

To run the three Python scripts that make up the Emoticam, I created a small script and saved it into the folder /etc/init.d, where all the initialization scripts are stored.

The script invokes the three Python applications and makes them run in the background (see the "&" at the end of each line)

 

#!/bin/sh
cd /home/pi/emoticam
python3 init.py &
python3 controls.py &
python3 tg.py &

 

 

Be sure to give the file you just created execution permission

     sudo chmod +x /etc/init.d/emoticam.sh

 

Run the script at boot

To start the Emoticam every time the Raspberry Pi boots:

  • create a directory autostart in /home/pi/.config

          mkdir /home/pi/.config/autostart

  • move into this directory

          cd /home/pi/.config/autostart

  • create a file Emoticam.desktop with the following content

 

[Desktop Entry]
Name=Emoticam
Type=Application
Comment=
Exec=/home/pi/emoticam/emoticam.sh

 

  • make sure the file has execute permissions

          chmod +x Emoticam.desktop

 

Demo

The video below shows the Emoticam in action.

 

 

Source code for this project is available at

https://github.com/ambrogio-galbusera/emoticam

 

 

I hope you enjoyed this project. Thanks for reading!