I (kwatters) recently attended some training classes on a deep learning framework called Deeplearning4j (dl4j). The classes were provided by SkyMind.IO, the company that created, maintains, and supports the open source dl4j project.
The class covered many topics around neural networks and how to train them. As some background, a neural network is a data structure that tries to model how a brain works. It models individual neurons and their connections. The idea is that each neuron has a bias, and each of its connections to other neurons has a weight. With training data and some fancy math, you can adjust the biases and weights of the network so that it gets really good at modeling the training data. This means that you can use the trained network to classify new data that it hasn’t seen yet. There are many types of neural networks, such as feed forward networks and convolutional networks.
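To make the "bias and weights" idea concrete, here is a minimal sketch of a single neuron in Python. The input values, weights, bias, and the choice of a sigmoid activation are all made up for illustration; this is not how dl4j represents a neuron internally.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, squashed by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical values: one weight per incoming connection, one bias per neuron.
inputs = [0.5, -1.0, 2.0]
weights = [0.8, 0.2, -0.5]
bias = 0.1

out = neuron_output(inputs, weights, bias)
print(out)  # a value between 0 and 1
```

Training amounts to nudging `weights` and `bias` so that outputs like this one move closer to what the training data says they should be.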
One network that I found particularly interesting was a type of recurrent neural network (RNN) called a Long Short Term Memory (LSTM) network. LSTM networks change the model of the neuron slightly so that each neuron has a little bit of short term memory associated with it, such as a previous input value. Because of this memory, these networks are good at dealing with data that has a definite direction in time, such as temperature measurements taken over time. In the same way, textual data has a temporal nature, in that sentences are read in order from left to right (except in a few languages).
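The "memory" idea can be sketched with a much simpler recurrent unit than a real LSTM. The snippet below is a plain recurrent step, not an LSTM (it has no gates or cell state), and the weights are arbitrary values picked for illustration. It only shows that the same current input can give a different result depending on what came before.

```python
import math

def recurrent_step(x, h_prev, w_x=1.0, w_h=0.5, b=0.0):
    """Mix the current input with the state carried over from the previous step."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_sequence(xs):
    """Feed the inputs in order and record the state after each one."""
    h = 0.0  # nothing remembered yet
    states = []
    for x in xs:
        h = recurrent_step(x, h)
        states.append(h)
    return states

# Both sequences end with the same input (0.0) but have different histories.
after_one = run_sequence([1.0, 0.0])
after_zero = run_sequence([0.0, 0.0])
print(after_one[-1], after_zero[-1])
```

The final states differ even though the last input is identical, because the first sequence still "remembers" the earlier 1.0.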
So, I began looking for training data that I could use to train a chatbot with one of these LSTM networks as its brain. Luckily, captain grog has about 300k messages from the shoutbox over the past few years!
I made a few changes to one of the examples so that it could read its training data from the shoutbox history. The way it works is that we give the model some text and ask it what it thinks the next letter in the sequence is. We append that letter to the text and repeat until the model has generated a few hundred characters.
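The generate-one-letter-at-a-time loop can be sketched in Python as follows. To keep it self-contained, a tiny character-pair frequency table stands in for the trained LSTM; that substitution, the function names, and the two-line corpus are assumptions for illustration and are not the dl4j example itself.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Stand-in 'model': count which character follows which."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        counts[cur][nxt] += 1
    return counts

def next_char(counts, context, rng):
    """Ask the model for the next character given the text so far."""
    options = counts.get(context[-1])
    if not options:
        return " "  # character with no known successor: fall back to a space
    chars = list(options.keys())
    weights = list(options.values())
    return rng.choices(chars, weights=weights, k=1)[0]

def generate(counts, seed, length, rng):
    """Repeatedly append the predicted character to the running text."""
    text = seed  # seed must be non-empty
    for _ in range(length):
        text += next_char(counts, text, rng)
    return text

corpus = '"msg": "I am GroG",\n"msg": "GroG I am",\n'
model = train_bigram(corpus)
sample = generate(model, '"msg": "', 40, random.Random(0))
print(sample)
```

A real LSTM looks at much more than the single previous character, but the outer loop (predict, append, repeat) has the same shape.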
At first the model just generates garbage, but after a few iterations of training it begins generating things that look like words. A few more iterations and the words start being spelled correctly; a few more and some loose grammar begins appearing in its responses.
One other thing that’s very interesting about this is that the input training data was in a JSON format that looked like the following:
…
"msg": "I am GroG",
"msg": "GroG I am",
…
Interestingly, very early on in the training iterations, the model was able to generate sequences of text that contained valid JSON syntax. Imagine that: not only did the model learn how to generate words and sentences, but it did so while generating valid JSON, including the ending quote mark and comma.
So, we started training, and along the way we would pass it the following string
"msg": "ahoy!",
"msg": "
and ask it what comes next. Early on in the training (after iteration 39) it spit out its first words:
----- Sample 39 -----
"msg": "ahoy!",
"msg": "nig e"oevt uaol0uoto asoFdssd s naletn i eo m"ao"to"eii tnir
nlh i,im sMomy Oi arilttae o ta?e"ta nn,sktjgscydv irim ! geo"aeoe hsis:.lb"ydyns!t dnh
"nggsdoe"dtoss libiastgm ostcsyr hm iol itd "D e"hd" gumde vh"haoi"i gItoe""w, :" ke"L !ui umemosrl "ettfhnd,: oat uw"-a)dks "iZg noytannte," t
"ge eig
bf"; ee,,n hdmiotl3, iha i:b .1ths
wT""s "Fueelgs eattgsnobo ""efr mn"nnb " l ta suo .lo
-------------------
Here you can see that everything after the first quote mark on the second line was generated by the model, and it’s basically gibberish.
After iteration 359, it has learned that each line should start with "msg" and should end with a quote mark and comma (",).
----- Sample 359 -----
"msg": "ahoy!",
"msg": "Soudyatthe lig",
"msg": "it !. gowert bag Erid smovar lon G",
"msg": "macune pausshien.ber seorempg"s "te"ts andesthai't"nu ...",
"msg": "wout, move I's ald is tou pa farting far.s. gutny mich the noruoly fouler ffuEZG.nac",
"msg": "the Sy en ",
"msg": "os u deun",
"msg": "Uhit meredeore?1",
"msg": "A -) grot a meging the Gtuy innimeis",
"msg": "Ahamy gawa sorn sORYJn'c wo I gwha
-------------------
But in deep learning, assuming you’re not overfitting the model, the model gets better as you train it more, so we fast forward. Here’s iteration 759, where we see the model say InMoov for the first time.
----- Sample 759 -----
"msg": "ahoy!",
"msg": "I onle think instarded with in servict. AzayC modeh the degand in the enMoov is nick",
"msg": "NaKy gail too it was a good works..",
"msg": "i have about be but lestart experamese",
"msg": "a yug morning thenk and leg ), quesies enMoov night of device buy and InMoov like got and there.. doal and aps.. the uld fieging & csaye befoution out what you expel i can resound API us finentt servacD.i
-------------------
Around iteration 1599 it seems to say MyRobotLab (almost) for the first time.
----- Sample 1599 -----
"msg": "ahoy!",
"msg": "even no need it of erfors",
"msg": "I'm a connerter. Rack from the ElMost printer rome, org.myrobotlab.duffioulter(maxper is python, but it simple tomoopher ;-)" a booth year was home ouch of it",
"msg"
",
"msg": "ambet tri frem the leart the wrrit - I coat cbrogucuitillulfiot Atipror prifrtrot pro ercoro .. morrl com. lle morrech ull combit troig trgrt cocrere ;",c .citroplero "u)alft.",c
-------------------
At iteration 1759 it says Gael for the first time.
----- Sample 1759 -----
"msg": "ahoy!",
"msg": "hi vy... How gael was done I've cectallel partial",
"msg": "at the logictone and if we can have starter 1.4... And ransis has soldering the robbares can wanted",
"msg": "Rabyvanderehered",
"msg": "msgathaith ite whatedee Wiave oneydrarestialaraththt )'t weehay suwerwwey they sphai ante mitaG"t way whyky'ataywytitwiowe ta-yphaate tahat i't?kathaydyttyrayymak phaanthquantwvewray'sakarader waut
-------------------
At iteration 5239 it seems to associate Grog with being a bike rider?! What, how?!
----- Sample 5239 -----
"msg": "ahoy!",
"msg": "it works :-) ",
"msg": "ok flexer bike GroG?!",
"msg": "oh .. yes to put a picture of get there",
"msg": "comy so doing a hoy great - do wear does program als on tas you do to Op to row in the controller.. Wolls som WHOk ang Movoo us",
"msg": "so kovo you just boul just atto conoroloo,, 9! -o!!! You knov! ....ooooutobboteex jow! Wooobbooullo botlooob. Io booto joo! Wooo!!!!!!!!!!7!Alvoub
-------------------
After 20000 iterations, it almost seemed like it was talking.
----- Sample 19999 -----
"msg": "ahoy!",
"msg": "Hi kevin i hope that took the x year - why no your post hem push",
"msg": "I have no one keep it pleased into my lead in everyone with the arduino and the other single in the webgui",
"msg": "one shoutbox worked in I've got the MRL join' for InMoov...",
"msg": "You're hope I'm storing more of InMoov hand :) How'ran'on I'm I have'' ") )')') ) ) )))))", y e) ) )", - oK8050000000000000
-------------------
So, keep in mind that the first message, “ahoy!”, is the seed that generates the rest of the text. In the examples above you’ll see that it usually generated about 4 or 5 additional messages. As you ask it to generate more text, it starts losing its mind a bit, which is why the last message in each of these outputs looks like the bot got drunk somewhere between the first thing it said and the last thing it said.
I just wanted to share some of these responses. I’ll be playing with this technology a bit more and seeing how we can make it more useful. Another thing that might be interesting is to train it on the blog posts here, so that it would generate a blog post rather than a shoutbox message. I’m still blown away by how the network figured out the JSON syntax.
This LSTM network is largely modeled after the work documented here :
I for one…