A simple recurrent neural network language model with Keras

by Melissa Roemmele, 4/28/17, roemmele @ ict.usc.edu

Overview

Keras is a Python deep learning framework that runs on top of a backend library such as Theano or TensorFlow, which handles the fast math. It lets you put together most common types of neural networks with a minimal amount of code, and it makes it easy to run models on a GPU. The easiest way to install it is with pip ("pip install Keras").

Keras' documentation isn't bad, but I learned Theano before using Keras. If you have the time, I highly recommend doing a tutorial on Theano just to get a basic sense of how it works. I felt like I understood neural networks better just from doing the logistic regression and multilayer perceptron tutorials.

I am going to show how to use Keras to create a language model using a recurrent neural network (RNN). The input to the network is a set of word sequences and the output is a set of probabilities, which can then be used to predict new sequences.

Dataset

Here are some text sequences that we'll use as input to the RNN. If you're curious, this is a sample from a new dataset, ROCStories. The dataset consists of thousands of five-sentence stories, where the task is to predict the final sentence in each story given two choices. Here we'll instead use the model to generate the final sentence in each of these stories. There are only 100 stories here, which is tiny, but this is for demo purposes. Each story is already divided into a list of sentences.

In [51]:
stories = [["Dan's parents were overweight.",
  'Dan was overweight as well.',
  'The doctors told his parents it was unhealthy.',
  'His parents understood and decided to make a change.',
  'They got themselves and Dan on a diet.'],
 ['Carrie had just learned how to ride a bike.',
  "She didn't have a bike of her own.",
  "Carrie would sneak rides on her sister's bike.",
  'She got nervous on a hill and crashed into a wall.',
  'The bike frame bent and Carrie got a deep gash on her leg.'],
 ['Morgan enjoyed long walks on the beach.',
  'She and her boyfriend decided to go for a long walk.',
  'After walking for over a mile, something happened.',
  'Morgan decided to propose to her boyfriend.',
  "Her boyfriend was upset he didn't propose to her first."],
 ['Jane was working at a diner.',
  'Suddenly, a customer barged up to the counter.',
  'He began yelling about how long his food was taking.',
  "Jane didn't know how to react.",
  'Luckily, her coworker intervened and calmed the man down.'],
 ['I was talking to my crush today.',
  'She continued to complain about guys flirting with her.',
  'I decided to agree with what she says and listened to her patiently.',
  'After I got home, I got a text from her.',
  'She asked if we can hang out tomorrow.'],
 ['Frank had been drinking beer.',
  'He got a call from his girlfriend, asking where he was.',
  'Frank suddenly realized he had a date that night.',
  'Since Frank was already a bit drunk, he could not drive.',
  'Frank spent the rest of the night drinking more beers.'],
 ['Dave was in the Bahamas on vacation.',
  'He decided to go snorkeling on his second day.',
  'While snorkeling, he saw a cave up ahead.',
  'He went into the cave, and he was terrified when he found a shark!',
  'Dave swam away as fast as he could, but the shark caught and ate Dave.'],
 ['Sunny enjoyed going to the beach.',
  'As she stepped out of her car, she realized she forgot something.',
  'It was quite sunny and she forgot her sunglasses.',
  'Sunny got back into her car and heading towards the mall.',
  'Sunny found some sunglasses and headed back to the beach.'],
 ['Sally was happy when her widowed mom found a new man.',
  "She discovered her siblings didn't feel the same.",
  "Sally flew to visit her mom and her mom's new husband.",
  'Although her mom was obviously in love, he was nothing like her dad.',
  "Sally went home and wondered about her parents' marriage."],
 ['Dan hit his golf ball and watched it go.',
  'The ball bounced on the grass and into the sand trap.',
  'Dan pretended that his ball actually landed on the green.',
  'His friends were not paying attention so they believed him.',
  'Dan snuck a ball on the green and made his putt from 10 feet.'],
 ['Josh had a parrot that talked.',
  'He brought his parrot to school.',
  "During show and tell, Josh's parrot said a bad word.",
  'The teacher told Joshua not to bring his bird again.',
  'When Josh got home, he was grounded.'],
 ['Hal was walking his dog one morning.',
  'A cat ran across their path.',
  "Hal's dog strained so hard, the leash broke!",
  'He chased the cat for several minutes.',
  'Finally Hal lured him back to his side.'],
 ['Brenda was in love with her boyfriend Maxwell.',
  'He was a successful artist with a promising future.',
  'Maxwell told Brenda he needed to talk to her.',
  "She thought he'd propose but he wanted to break up.",
  'Brenda walked away and now she is the saddest girl out of everyone.'],
 ['Yanice opened the fridge and found nothing to eat.',
  'However, there were leftovers.',
  'She mixed it up in an attempt to make lunch.',
  'Since the place needed meat, she also fried and eggs.',
  'She ended up enjoying the meal.'],
 ['I saw my friend Joe sitting in lobby today.',
  'I kept him company, as he is a lonely old man.',
  "He told me he had just listened to Beethoven's Ninth.",
  'I talked to him for an hour.',
  'I left him in the lobby and told him I would see him soon.'],
 ['Twas the night after the first day of junior high.',
  'Amy and her friend Beth were on the phone.',
  'They had a lot to catch up on.',
  'Amy listened patiently as Beth told her about her day.',
  'She wanted to go 2nd because she knew hers was the better day.'],
 ['I knew of a young man who won the lottery.',
  'He used to ride lawn mowers.',
  'After he won, he went on to using drugs.',
  'He blew a lot of money.',
  'Eventually his winnings were revoked after a dui.'],
 ['A die hard shopper was waiting in the long line outside.',
  'It was miserably cold.',
  'The shopper saw a homeless man shivering in the alleyway.',
  'He gave up his place in the line and brought a gift back from his car.',
  'The shopper gave the homeless man a nice warm blanket.'],
 ['Jeff invited his friends over to play board games on Saturday night.',
  'They arrived at his house early that evening.',
  'The six of them sat around a big table.',
  'They took turns deciding which game to play.',
  'They spent six hours playing different board games.'],
 ['Chuck reclined on the back porch as he sipped his morning coffee.',
  'Today he would finally screen in this back porch.',
  'He gathered his tools and material supplies.',
  'He labored all day to finish the job.',
  'That night he snuggled with his wife on the bug free porch.'],
 ['Jessica decided she wanted to go to the beach.',
  'She invited all her friends to go along.',
  'They had a great time, but covered in a lot of sticky sand.',
  'They searched for a shower for what felt like ages.',
  'Finally they found one and decided it was the best trip ever.'],
 ['Dan wanted a pet for Christmas.',
  'He told his dad.',
  "His dad listened, but didn't say anything.",
  'So on Christmas morning, And got a wonderful surprise.',
  "He received a puppy with a shiny bow on it's head!"],
 ['Kelly and her friends decided to have a hot dog contest.',
  'The girls competed against each other.',
  'They had to make the best tasting one.',
  'When it was over, Kelly won.',
  'She won a medal.'],
 ['Dan was watching a Youtube video.',
  'His mom was in the kitchen, doing dishes.',
  'Suddenly, Dan ran into the kitchen and started crying.',
  'He had just seen a video of a man trampled by an elephant.',
  'His mom made Youtube off limits to Dan.'],
 ['Jeff wanted to move out of his house.',
  'He had no money to pay for a new one.',
  'One day he bought a scratching ticket.',
  'He won enough money for a down payment.',
  'Jeff ended up moving to a new house.'],
 ['I was walking to school.',
  "Since I wasn't looking at my feet, I stepped on a rock.",
  'I landed on the ground in pain.',
  'Thankfully, a stranger rushed to pick me up.',
  'He took me to the hospital to seek treatment.'],
 ['Lily drove into town for some errands.',
  'While she was there, she bought a large iced coffee to go.',
  "It was delicious and refreshing and she couldn't wait to finish it.",
  'She put it on her car roof while fishing for her keys.',
  'She drove home and the coffee fell off and spilled.'],
 ['Todd was hungry.',
  'He did not have anything to cook at his house.',
  'He decided he need to go buy something to cook.',
  'On the way to the store Todd decided to make hamburgers.',
  'Todd buys everything he needed and goes home and cooks.'],
 ['Virgil brought home a bright blue recliner he had found online.',
  'His wife thought it was hideous and clashed with the decor.',
  'Virgil bought some fabric and had it reupholstered.',
  'His wife complained again, saying bright green still clashed.',
  'Virgil gave up and threw the recliner away.'],
 ['Jenna was at the community pool with her family.',
  'She thought she could go out to the deeper end by herself.',
  'Without telling anyone, she swam out farther, and lost her footing.',
  'The lifeguard had to help her out of the water.',
  'Jenna had learned her lesson.'],
 ['Joan entered the confessional and kneeled.',
  'She thought she was confessing to the old parish priest.',
  'Joan confessed she had fantasized about the young visiting priest.',
  'Joan felt relief as she left the confessional.',
  'Then she saw the old parish priest pull up in his car.'],
 ['Homer decided to go watch a movie.',
  'But when he entered the movie theater, there was no where to sit.',
  'He found one spot by a bunch of kids.',
  'And during the movie, they made lots of noise.',
  'Homer became so annoyed, he decided to sit in the aisle.'],
 ['Chip loved dip.',
  'At one party, he put his chip into the dip and double dipped.',
  'Skip saw Chip double dip.',
  'Skip had to flip at seeing the double dip.',
  'Chip then had to punch Skip on the lip.'],
 ['Anna wanted to invite her crush Peter to the Sadie Hawkins dance.',
  'But Peter was very cute and popular.',
  'Anna feared her was far out of her league.',
  'She summoned her courage and asked him, expecting a rejection.',
  'But to her joy, Peter happily agreed to be her date!'],
 ['Gina had been being mean to the new boy in her class.',
  'Then a bully began picking on Gina.',
  'She now knew how the boy felt.',
  'Gina realized she should stop being mean.',
  'She realized she should also apologize to the new boy.'],
 ['Jerry was making toast.',
  'He set it to medium.',
  'When the toast came out it was completely burnt.',
  'He tried other settings with no better results.',
  'Eventually Jerry bought a new toaster.'],
 ['Last week I accidentally overdrafted my account.',
  'A restaurant charged me too much by mistake.',
  'Afterward I made five more purchases.',
  'I had hundreds of dollars in overdraft fees.',
  'My bank refused to reverse more than half.'],
 ['Frank was laughing so hard he started crying.',
  "His dog fell into the trash and couldn't get out.",
  'He took a video of it and posted it online.',
  'After a week Frank had over 10000 views.',
  'Frank knew this was as famous as he would ever be.'],
 ['Ty had been deaf all her life, but now she was hoping to hear.',
  'Her doctor had offered her a new kind of super-powered hearing aid.',
  'She had it implanted and then waited eagerly for her first sound.',
  'And she heard hundreds, voices and music and more!',
  'She loved those sounds so much that she became a musician.'],
 ['Bob walked into the ship elevator and heard a voice on the speaker.',
  'He told his wife the voice sounded the same as his audio book.',
  'They were surprised to hear the voice in the hallway to their cabin.',
  'The voice could still be in their cabin, so they called the steward.',
  "The steward found that the voice was coming from Bob's pocket phone."],
 ['John is sleepy.',
  'He starts a pot of coffee.',
  'John puts cream and sugar in his cup and thermos.',
  'He then adds coffee to both.',
  'After finishing the cup, he takes the thermos to work with him.'],
 ['Susan was excited to plan her first egg hunt in the South.',
  'She found it fun to hide the candy filled eggs in green grass.',
  'The sun was high in the sky when the egg hunt started.',
  'Soon, kids were opening eggs and bursting into tears.',
  'Susan saw that all the candy inside had melted in the sun.'],
 ['Bob met Ann and they started dating.',
  'They got along very well.',
  'But Bob was Lutheran and Ann was Catholic.',
  "Ann's mom disapproved of Bob's religion.",
  "Bob wants to marry Ann someday but is worried he can't."],
 ["Jen's in laws frustrated her to no end.",
  'They all went out to have lunch.',
  'At lunch, the in laws made it a point to ask Jen about her weight.',
  'Jen told them that she had gained a few pounds in the last two weeks.',
  'They still worried that Jen looked too thin and offered her some cake.'],
 ['The woodworker was not satisfied with the cuts from a bit.',
  'He took the bit from the machine and looked at it.',
  'The bit had been worn away by a lot of use.',
  'He took it to a sharpener and began to grind it.',
  'After a while the old bit was as good as new.'],
 ['I came out of my class and walked to my locker.',
  'I pressed my foot on a broken tile and fell on the ground.',
  'At first, no one was willing to help me up.',
  'However, one of the teachers around the area helped me get up.',
  "She took me to the school's nurse."],
 ['Timmy was always obsessed with airplanes.',
  'His dream was to be a pilot.',
  'For his 16th birthday his parents surprised him with flying lessons.',
  'He loved every minute of it.',
  "He was sure that's what he wanted to do after that."],
 ['Last night my wife and I went to the spa.',
  'We both got relaxing massages.',
  'We had facials.',
  'We soaked in warm water and stayed in the sauna.',
  'It was very nice.'],
 ['Greg decided to join marching band.',
  'He practiced for weeks to make sure he made it past auditions.',
  'He even got his own instrument.',
  'The day of tryouts came and all his work was for nothing.',
  "It wasn't that he didn't made it, it's that everyone got accepted."],
 ['Lisa has a beautiful sapphire ring.',
  'She always takes it off to wash her hands.',
  'One afternoon, she noticed it was missing from her finger!',
  'Lisa searched everywhere she had been that day.',
  'She was elated when she found it on the bathroom floor!'],
 ['It was usually hot were Kim lived.',
  'But everyone was surprised when it was cold one day.',
  'Kim decided to drink coffee and eat oatmeal.',
  'So she was glad that the weather was cold.',
  'But the next day, it was too hot to enjoy hot food and drinks.'],
 ["A bus driver wanted to save gas so he didn't come to a full stop.",
  'He slowed down just enough for people to hop on the bus.',
  'A man jumped at the doorway but missed.',
  "The bus driver felt that if you fell you didn't deserve a ride.",
  "He didn't stop to help the man back on the bus."],
 ['Gary was looking through his fridge for snacks.',
  'While looking at his food, he noticed everything had small bite marks.',
  'After looking through his kitchen, he determined he had mice.',
  "Gary called the local exterminator, who went to Gary's house quickly.",
  'After the exterminator killed all of the rats, Gary felt peace.'],
 ['Natalie had auditioned for the lead in the school play.',
  'She won the part and was super excited.',
  'She rehearsed for weeks and weeks.',
  'On opening night, she acted her little heart out.',
  'The play was a huge success!'],
 ['Mr Egg was presenting a volcanic eruption to the science class.',
  'He had a diagram of a volcano that looked like it was made of tinfoil.',
  'He then took out a huge thing of vinegar, and started to pour it in!',
  'The class had no clue what was going on and looked on in astonishment.',
  'The volcano then exploded with substance that looked like lava!'],
 ["Samantha's dad always taught her how to be self-sufficient.",
  'He even taught her how to change a tire on a car.',
  "One day Samantha's tire blew while she was driving.",
  'She was able to properly change her tire.',
  'Samantha was very grateful to be able to get home safely.'],
 ['One day, a pig wandered onto my parents farm.',
  'I always wanted one for a pet, so I did my best to keep it a secret.',
  'I kept him in the small shed on the edge of the farm.',
  'I snuck him food and water and played with him everyday.',
  'One day, he was gone, but I hope he found his way back home.'],
 ['Jane passed a small park-like zoo she remembered visiting as a child.',
  'She turned her car into the park, feeling nostalgic.',
  'Jane went to see the deer, like the ones she once fed by hand.',
  'She saw these deer were scrawny, mangy and had terrified eyes.',
  'Jane wished she had never stopped the car.'],
 ['It had been a long day.',
  'Mary was ready to sit back and relax.',
  'She put on a movie and made some popcorn.',
  'The movie was much better than she anticipated.',
  'She was glad she took this time to unwind.'],
 ['Jill was excited to ski for the first time.',
  'Her dad took her to the bunny slope.',
  'She caught on very quickly.',
  'After about an hour she looked sad.',
  "When her dad asked why she said because she thought she'd see bunnies."],
 ['Mark likes to play guitar.',
  'Mark booked a gig at a local coffeeshop.',
  'Mark played guitar for 2 hours.',
  'The 50 people who showed up applauded him.',
  'Mark packed up his equipment and went home.'],
 ["Billy's car broke down on the highway.",
  'He looked under the hood and realized his starter was broken.',
  'The nearest mechanic quoted Billy 300 dollars, which was far too much.',
  'He instead called a friend who came and fixed the starter for $100.',
  'Billy drove away happily with a functioning engine.'],
 ['Frankie had Christmas shopping to do.',
  'She went to the store.',
  'Inside, she walked around looking for gifts.',
  'Soon her cart was full.',
  'She paid and took her things home.'],
 ['Rex had given up on any dreams of becoming a father.',
  "He was never a good looking man and he didn't have any money.",
  'One day Rex met a nice woman who liked him despite his shortcomings.',
  'They became married and eventually had a son.',
  'Rex is very proud that he is now a father.'],
 ['Laura had just graduated college.',
  'She was planning on moving on California.',
  'She packed all her belongings in her car and drove 18 hours.',
  'When she arrived at her new apartment she unpacked all her things.',
  'Laura loved the new change of scenery at her new place.'],
 ['Mia sat at home in her living room watching sports.',
  'Her favorite soccer team was playing their rival.',
  'To encourage her team,  she began chanting positive phrases.',
  'During her chant, her favorite team scored a goal.',
  'Mia cheered loudly and thought that she helped score that goal.'],
 ['Shannon was driving in the highway.',
  'She then sees a car heading right towards her.',
  'She has no way of escape.',
  'The car hits her and both cars are wrecked.',
  'She is alright though'],
 ["Nate couldn't stop calling Diana.",
  'When she arrived in school, she looked in all directions.',
  'When she saw nate walking, she tried to run to the cafeteria.',
  "He didn't see her for the whole day.",
  'She was able to get on the school bus to go home without seeing him.'],
 ['After her divorce, Sandy spent a lot of time alone.',
  'Her friends asked her to socialize with them, but she demurred.',
  'Her friends descended upon her with food and movies to watch.',
  'They had a very fun evening and Sandy realized she had missed them.',
  'Sandy was soon back to normal, regularly going out and enjoying life.'],
 ['Bogart lived on a farm.',
  'He loved bacon.',
  'He decided to buy a pig.',
  'Shortly after, he grew fond of the pig.',
  'Bogart stopped eating bacon.'],
 ['The preschoolers were going on a field trip.',
  'Their teachers took them to the fire station.',
  'They talked to the firefighters.',
  'They saw the fire truck.',
  'It was a wonderful field trip.'],
 ["Gina's crush sat behind her in class.",
  'He was rude.',
  'And way more obnoxious than she had realized.',
  'She began to dread seeing him.',
  "Gina realized he wasn't her type after all."],
 ['Olivia was a ballerina.',
  'Her dream was to dance internationally.',
  'One day, the opportunity for an audition came up.',
  'She went for it, and tried out.',
  'Amazingly, she won the part.'],
 ['Tim was a salesman.',
  'He worked at an electronic store.',
  'One day he had customers who were unsure.',
  'He convinced them to buy.',
  'Tim even convinced them to get an extended warranty.'],
 ['Today was April Fools Day and everyone played pranks on each other.',
  'Jeff was sneaking towards Dan, who was sitting down.',
  'When Jeff turned around, And shouted.',
  'The chair ended up breaking when Jeff called.',
  "Dan couldn't help but laugh."],
 ['Betty had a craving for mint ice cream.',
  "She went to a local ice cream parlor, but they didn't have mint.",
  "She went to a grocery store, but they didn't have mint either.",
  'Betty ended up buying some ice, some cream, and some mint.',
  'Betty went home and made delicious ice cream herself.'],
 ['Cade is short, and gets picked on at school.',
  'His mother tells him he will soon grow very tall like his dad.',
  'Within a few months Cade had grow a whole inch.',
  'And within one year Cade was the tallest boy in his class.',
  'Cade was no longer picked on for being short!'],
 ['Amber drove home from work one night',
  'It was really bad weather outside',
  'She went down a dark road that was covered in water',
  'she could not tell how deep the water was and drove into a flood',
  'she messed up her car bad, and it had to be towed'],
 ['Erica wanted to help her mom this Thanksgiving.',
  'She wanted to make chicken pot pie for her family.',
  'She bought all the ingredients at the store.',
  "When she came home she remembered her oven wasn't working.",
  "She was able to bake her chicken pot pie at her neighbor's house."],
 ['Ben came home one day and found a huge mess.',
  'His plants were knocked over and newspaper was everywhere, shredded.',
  'Ben called for his dog sternly.',
  'But his dog was hiding and did not come to him.',
  'It knew it had done something bad!'],
 ['Sarah was on a bus to her work.',
  'She had to pee very badly.',
  "She couldn't hold it much longer.",
  'She got off the bus early to pee.',
  'She caught the next bus to work.'],
 ['James needed a new pick up truck for work.',
  'He hauled wooden logs for a living.',
  'He searched all over on craigslist.',
  'Finally he found a red truck that he liked.',
  'James was able to buy the car off the seller for a discount.'],
 ['Susie sells the 31 products for extra money.',
  "They are totes that are sold to mother's typically.",
  'They like them to help bring all their stuff with them.',
  'It makes for easier travel and they are stylish.',
  'So far she has done a great job selling them.'],
 ['Mike was making dinner.',
  'He was making a pasta and started with the sauce.',
  'He used tomatoes and fresh vegetables.',
  'And got fresh herbs from his garden.',
  'But forgot he had no pasta to heat up!'],
 ['Today on the view there was a lot of fighting.',
  'When the women were talking about hot topics they disagreed.',
  'The war topic made them polarized.',
  'Rosie and Elizabeth went at it fighting about the war.',
  'The fight was so bad the producers had to go to splitscreen.'],
 ['Elliott and Tim were on a high school tennis team.',
  'They had to play each other for the number 1 ranking spot.',
  'The match needed to be played, it was pouring rain, and Elliott won.',
  'Tim complained the next day to the coach about the rain being unfair.',
  'The coach replied "Was it raining on both sides of the court?"'],
 ['Today I saw a woman with a baby.',
  'She was helping the baby eat lunch.',
  'I was thinking about what it would be like to be feeding a baby.',
  'I decided that I wanted a baby someday.',
  'I decided to find a wife so I could start a family.'],
 ['Louisa and her family took a trip to Epcot.',
  'The family was super excited.',
  "They couldn't contain their excitement.",
  'The moment they got to the park they took pictures.',
  'At the end of the day they spent ten hours at the park.'],
 ["Joanie's mom signed her up for swimming lessons at a lake.",
  'Each day she rode a bus to the lake to take lessons.',
  'Joanie learned the final test was swimming from a boat to shore.',
  'She was petrified and prayed to get out of the test.',
  'On the last day of lessons, the bus broke down and she was spared.'],
 ['I tried to start jogging last week.',
  'I got my running shoes on and went out.',
  'I was excited and ready to go when it started to rainy.',
  'I turned around and went home instead.',
  "It hasn't rained since but I haven't wanted to risk it."],
 ['John woke up sick today.',
  'He washed his face in the bathroom.',
  'John went into the kitchen to make some soup.',
  'He put a bowl of soup into the microwave.',
  'John dropped the soup when he grabbed it from the microwave.'],
 ['Tyrese joined a new gym.',
  'The membership allows him to work out for a year.',
  'Tyrese got very distracted during the year.',
  'He lost his job and his grandfather died.',
  'He lost motivation to go to the gym.'],
 ['Ryder needed to go outside.',
  'His owner opened the door for him.',
  'Ryder played outside.',
  'He came back in smelling like a dead animal.',
  'Ryder had to get a bath.'],
 ['Jimmy just became a police officer in Chicago.',
  'He is only two weeks into his job and he is nervous.',
  'Every time he responds to calls he gets very worried.',
  'His partner told him that the nerves go away in time.',
  'That news made Jimmy feel a little better.'],
 ['Tony needed to get gas on his way home.',
  'He only had enough money to fill half of his tank.',
  "When he went to pay for it, he was didn't owe anything.",
  'Someone else had already paid for a full tank of gas for him.',
  'Tony felt double blessed after getting gas on his way home.'],
 ['Taylor started working for a man named Mark.',
  'She wanted to make a good impression on him.',
  'She began asking questions about him in hopes of finding an affinity.',
  'She learned that he was a great guitar player.',
  'Taylor and Mark ended up forming a band together.'],
 ['We moved from our condo in 2013.',
  'We had been in our condo since 1987.',
  'My wife went to the store to buy moving boxes.',
  'She bought boxes that were too large.',
  'We had trouble lifting them on moving day, but we managed.'],
 ['The girl was scared to go outside.',
  'Her mom encouraged her to go.',
  'She ended up going.',
  'She met other kids.',
  'They were all nice and played with her.'],
 ['Trevor and his wife were at home on his day off.',
  'They decided to spend their day watching a movie together.',
  'Right as they sat down to start the movie, they heard a doorbell.',
  'Trevor got the door and saw his parents had come to visit.',
  'They all sat and watched the movie together on the couch.'],
 ['Sally decided to get a haircut.',
  'She went to the stylist and got her cut.',
  'She was upset when she realized the stylist cut it way too short.',
  'The stylist did not charge her but that did not fix her hair.',
  'She waited for months for her hair to grow back out.']]
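Since the goal is to generate the final sentence of each story, it can help to check the structure of the data first. Here is a minimal sketch on a two-story excerpt (so it runs on its own); the names sample_stories and split_context_and_ending are mine, not part of the dataset or Keras.

```python
# Two-story excerpt of the list above, so this snippet runs on its own.
sample_stories = [
    ["Dan's parents were overweight.",
     'Dan was overweight as well.',
     'The doctors told his parents it was unhealthy.',
     'His parents understood and decided to make a change.',
     'They got themselves and Dan on a diet.'],
    ['Hal was walking his dog one morning.',
     'A cat ran across their path.',
     "Hal's dog strained so hard, the leash broke!",
     'He chased the cat for several minutes.',
     'Finally Hal lured him back to his side.'],
]

def split_context_and_ending(story):
    """Return (all sentences but the last, the last sentence) for one story."""
    return story[:-1], story[-1]

n_stories = len(sample_stories)
# Every story should contain exactly five sentences.
sentences_per_story = {len(story) for story in sample_stories}
context, ending = split_context_and_ending(sample_stories[0])

print(n_stories, sentences_per_story)
print(ending)
```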

Preparing the data

The model we'll create is a word-based language model, which means each input unit is a single word (as opposed to a character, for instance). So first we need to tokenize each of the stories into individual lowercased words. I'll use Keras' built-in tokenizer here for convenience, but typically I like to use the spaCy library. It does a whole lot more than tokenization (e.g. POS tagging, parsing, semantic indexing, etc.), and it's fast and has a really clean API. Unlike spaCy, Keras' tokenizer does no linguistic processing: as configured here it just splits on whitespace, so punctuation marks, which should be their own tokens, stay attached to the neighboring word. You can see this below in words that end in punctuation like "." or ",".
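To make the punctuation issue concrete, here is a rough sketch comparing plain whitespace splitting (roughly what the Keras tokenizer does when filters='') with a hypothetical regex tokenizer that separates punctuation. The pattern and the simple_tokenize name are my own illustration and are not how spaCy works internally.

```python
import re

sentence = "Hal's dog strained so hard, the leash broke!"

# Whitespace splitting after lowercasing: punctuation stays attached to words.
whitespace_tokens = sentence.lower().split()

# Hypothetical regex tokenizer: a word (optionally with internal apostrophes)
# or a single punctuation mark becomes its own token.
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)*|[^\w\s]")

def simple_tokenize(text):
    return TOKEN_PATTERN.findall(text.lower())

regex_tokens = simple_tokenize(sentence)

print(whitespace_tokens)
print(regex_tokens)
```

With whitespace splitting, "hard," and "hard" would end up as two different lexicon entries; the regex version maps both to "hard" plus a separate "," token.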

Each tokenized word in the data is added to the lexicon of words the RNN will learn. The fit_on_texts() function maps each word in the stories to a numerical index. When working with large datasets it's common to filter out all words occurring fewer than a certain number of times and replace them with some "UNKNOWN" token. Here, because this dataset is small, every word encountered in the stories is added to the lexicon.
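The frequency filtering described above is not used in this notebook, but a minimal sketch of it might look like the following. The build_lexicon and encode helpers, the min_count threshold, and the "<UNKNOWN>" string are all assumptions of mine for illustration; indices start at 1 on the assumption that 0 is reserved (for padding, say).

```python
from collections import Counter

def build_lexicon(tokenized_sents, min_count=2, unknown_token="<UNKNOWN>"):
    """Map each word seen at least min_count times to an index >= 1."""
    counts = Counter(tok for sent in tokenized_sents for tok in sent)
    # Most frequent words get the lowest indices; index 0 is left unused.
    kept = [word for word, count in counts.most_common() if count >= min_count]
    lexicon = {word: index for index, word in enumerate(kept, start=1)}
    # All rare or unseen words share a single index.
    lexicon[unknown_token] = len(lexicon) + 1
    return lexicon

def encode(tokens, lexicon, unknown_token="<UNKNOWN>"):
    """Convert tokens to indices, falling back to the unknown index."""
    return [lexicon.get(tok, lexicon[unknown_token]) for tok in tokens]

sents = [["dan", "was", "hungry"],
         ["dan", "was", "tired"],
         ["todd", "was", "hungry"]]
lexicon = build_lexicon(sents, min_count=2)
encoded = encode(["todd", "was", "tired"], lexicon)

print(lexicon)
print(encoded)
```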

In [62]:
from keras.preprocessing.text import Tokenizer

tokenizer = Tokenizer(lower=True, filters='')
#tokenizer.fit_on_texts() takes a single list of string sequences as input
tokenizer.fit_on_texts([sent for sents in stories for sent in sents])

#print a sample of the dictionary
print(list(tokenizer.word_index.items())[:100])
[('raining', 495), ('ever.', 496), ('shouted.', 497), ('better.', 498), ('party,', 499), ('marching', 500), ('up.', 103), ('feeding', 501), ('swam', 274), ('up!', 502), ('bike', 275), ('under', 503), ('ticket.', 504), ('risk', 505), ('tears.', 506), ('every', 276), ('today.', 182), ('school', 183), ('parrot', 184), ('wooden', 507), ('porch.', 277), ('heading', 278), ('enjoy', 508), ('her.', 104), ('second', 509), ('snuggled', 510), ('even', 193), ('change.', 512), ('hide', 513), ('deer,', 514), ('epcot.', 798), ('new', 39), ('inch.', 516), ('ever', 517), ('told', 51), ("dan's", 518), ('never', 279), ('astonishment.', 519), ('tire.', 520), ('hundreds', 521), ('socialize', 522), ('met', 185), ('phone.', 280), ('hers', 523), ('cooks.', 524), ('counter.', 525), ('mint.', 337), ('jogging', 527), ("jen's", 528), ('finger!', 529), ('joshua', 530), ('company,', 531), ('brought', 186), ('popular.', 532), ('sarah', 533), ('would', 105), ('hospital', 535), ('movie.', 536), ('movie,', 281), ('afterward', 537), ('call', 538), ('type', 539), ('tell', 540), ('successful', 541), ('meat,', 542), ('warm', 282), ('hold', 543), ('off.', 544), ('me', 75), ('join', 545), ('room', 546), ('work', 133), ('roof', 768), ('movies', 548), ('mechanic', 549), ('mr', 550), ('my', 40), ('shore.', 551), ('hamburgers.', 552), ('around,', 553), ('end', 283), ('travel', 554), ('machine', 555), ('how', 76), ('hot', 106), ('hop', 556), ('both.', 557), ('elizabeth', 558), ('after', 29), ('food,', 559), ('diagram', 560), ('laugh.', 561), ('attempt', 562), ('cold.', 284), ('green', 187), ('things', 563), ('school.', 188), ('school,', 564), ('over', 107), ('satisfied', 565)]

Then we use the lexicon to convert the stories from text to numerical indices so they can be processed by the RNN. Here's an example of this transformation.

In [54]:
#example of encoded story
print(stories[0])
print("")
encoded_story = tokenizer.texts_to_sequences(stories[0])
print(encoded_story)
["Dan's parents were overweight.", 'Dan was overweight as well.', 'The doctors told his parents it was unhealthy.', 'His parents understood and decided to make a change.', 'They got themselves and Dan on a diet.'] 

[[518, 97, 28, 1318], [62, 6, 1196, 30, 295], [1, 1119, 51, 9, 97, 12, 6, 1140], [9, 97, 1217, 5, 26, 2, 66, 3, 512], [16, 24, 1300, 5, 62, 10, 3, 1448]]
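Going the other direction (from indices back to words) is useful later for reading what the model generates. Here is a small sketch using a four-word excerpt of the lexicon, read off the first encoded sentence above; the index_word and decode names are mine.

```python
# Excerpt of the lexicon, taken from the first encoded sentence above.
word_index = {"dan's": 518, "parents": 97, "were": 28, "overweight.": 1318}

# Invert the mapping so that words can be looked up by index.
index_word = {index: word for word, index in word_index.items()}

def decode(indices, index_word):
    """Turn a list of lexicon indices back into a string of words."""
    return " ".join(index_word[i] for i in indices)

decoded = decode([518, 97, 28, 1318], index_word)
print(decoded)
```

Note that the decoded text is lowercased, since the original casing was discarded during tokenization.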

Creating the model

Now we build an RNN model with four layers: an embedding layer that converts words to distributed vector representations (embeddings), two recurrent layers (I use the GRU variant; Keras also provides LSTM and a plain vanilla RNN), and a prediction layer that uses the softmax function to output a probability for each word in the lexicon. Each probability indicates the chance of that word being the next word in the sequence.

The "stateful=True" parameter for the GRU indicates that the RNN should "remember" the sequence it previously observed, meaning that it will use the existing hidden state as its initial state when it processes the next sequence. This means the model has a chance of learning long-term dependencies between words across sentences in a story.
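
A rough numpy sketch of what carrying the state over means. This is a toy vanilla RNN with random weights, not the GRU used in this notebook, and all of the names (run_rnn, W_in, W_rec) are made up for illustration. The point is only that starting the second sentence from the state left by the first gives the same result as reading both sentences as one long sequence.

```python
import numpy

rng = numpy.random.RandomState(0)
n_inputs, n_hidden = 4, 3
#random weights for a toy vanilla RNN (illustrative only, not a trained model)
W_in = rng.normal(size=(n_inputs, n_hidden))
W_rec = rng.normal(size=(n_hidden, n_hidden))

def run_rnn(inputs, init_state):
    '''read a (timesteps, n_inputs) array and return the final hidden state'''
    state = init_state
    for x in inputs:
        state = numpy.tanh(x.dot(W_in) + state.dot(W_rec))
    return state

sent1 = rng.normal(size=(5, n_inputs))
sent2 = rng.normal(size=(3, n_inputs))
zero_state = numpy.zeros(n_hidden)

#stateful: the second sentence starts from the state left by the first
state_after_sent1 = run_rnn(sent1, zero_state)
stateful_final = run_rnn(sent2, state_after_sent1)

#equivalent to reading both sentences as one long sequence
joined_final = run_rnn(numpy.concatenate([sent1, sent2]), zero_state)

#stateless: the second sentence starts from zeros and forgets the first
stateless_final = run_rnn(sent2, zero_state)
```

Here stateful_final matches joined_final, while stateless_final differs because the first sentence has been forgotten.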

Keras lets you train in batches of more than one sequence, by having you specify the batch size when you create the model. Batch training significantly speeds up the training process. If you do batch training your sequences will all have to be the same length, which is not the case with our data. People deal with this by padding the sequences with zeros where there are no more words left. To keep it simple here I instead set the batch size to 1.

If you use batches, you have to specify the number of timesteps in the batch (equal to the number of words in the longest sequence in the batch; all shorter sequences are padded with zeros). The n_timesteps parameter below specifies this. Because I am only reading one sequence at a time and each sequence has a different length, I can set this parameter to None and the model will accommodate different numbers of timesteps.
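
For reference, zero-padding a batch could look something like the sketch below. The helper pad_with_zeros is hypothetical and pads on the right; Keras has its own utility for this (keras.preprocessing.sequence.pad_sequences), which pads at the start of each sequence by default.

```python
def pad_with_zeros(sequences):
    '''right-pad each list of word indices with 0 up to the length of the longest one
    (hypothetical helper for illustration)'''
    max_length = max(len(seq) for seq in sequences)
    return [seq + [0] * (max_length - len(seq)) for seq in sequences]

#first two encoded sentences from the example story above
batch = pad_with_zeros([[518, 97, 28, 1318], [62, 6, 1196, 30, 295]])
n_timesteps_in_batch = len(batch[0]) #what n_timesteps would be for this batch
```

This gives [[518, 97, 28, 1318, 0], [62, 6, 1196, 30, 295]] with five timesteps. Index 0 is never assigned to a real word, which is why the lexicon indices start at 1.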

The Dense layer predicts probabilities at a particular timestep (i.e. word). If the input is a sequence, there should be a probability distribution at each timestep, so the TimeDistributed() class around the Dense layer takes care of this. The output will be a sequence of probability distributions.
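
In numpy terms, a time-distributed softmax layer amounts to roughly the following. This is a sketch with toy sizes and random weights; the variable names are my own and not part of Keras.

```python
import numpy

def softmax(scores):
    '''convert the last axis of an array of scores into probabilities'''
    shifted = scores - scores.max(axis=-1, keepdims=True) #subtract max for numerical stability
    exp_scores = numpy.exp(shifted)
    return exp_scores / exp_scores.sum(axis=-1, keepdims=True)

rng = numpy.random.RandomState(0)
n_steps, n_hidden, n_words = 4, 5, 7 #toy sizes, much smaller than a real model
hidden_states = rng.normal(size=(n_steps, n_hidden)) #one hidden vector per word
W = rng.normal(size=(n_hidden, n_words))
b = numpy.zeros(n_words)

#the same weights are applied independently at every timestep
probs = softmax(hidden_states.dot(W) + b)
```

The result has one row per timestep, and each row is a probability distribution over the (toy) lexicon that sums to 1.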

One huge benefit of Keras is that it has several optimization algorithms already implemented. I use Adam here; several others are available, including SGD, RMSprop, and Adagrad. You can change parameters like learning rate and gradient clipping as well.
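
If you are curious what the learning rate and gradient clipping actually control, here is a rough numpy sketch of a single Adam update combined with clipping by gradient norm. It is not Keras code: the function names and the toy loss are made up for illustration, and the default values are just the commonly quoted ones for Adam.

```python
import numpy

def clip_by_norm(grad, max_norm):
    '''rescale the gradient if its L2 norm exceeds max_norm'''
    norm = numpy.sqrt((grad ** 2).sum())
    return grad * (max_norm / norm) if norm > max_norm else grad

def adam_step(param, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-8):
    '''one Adam update; m and v are running averages of the gradient and its square'''
    m = beta_1 * m + (1 - beta_1) * grad
    v = beta_2 * v + (1 - beta_2) * grad ** 2
    m_hat = m / (1 - beta_1 ** t) #correct the bias toward zero in early steps
    v_hat = v / (1 - beta_2 ** t)
    param = param - lr * m_hat / (numpy.sqrt(v_hat) + eps)
    return param, m, v

#minimize the toy loss sum(param ** 2), whose gradient is 2 * param
param = numpy.array([3.0, -4.0])
m, v = numpy.zeros(2), numpy.zeros(2)
initial_loss = (param ** 2).sum()
for t in range(1, 201):
    grad = clip_by_norm(2 * param, max_norm=1.0)
    param, m, v = adam_step(param, grad, m, v, t, lr=0.05)
final_loss = (param ** 2).sum()
```

The learning rate scales the size of every step, while clipping caps the gradient before the update so that one unusually large gradient cannot throw the weights far off.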

In [55]:
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import GRU

rnn = Sequential()

lexicon_size = len(tokenizer.word_index)
n_embedding_nodes = 300
n_hidden_nodes = 500
batch_size = 1
n_timesteps = None

#word embedding layer
embedding_layer = Embedding(batch_input_shape=(batch_size, n_timesteps),
                            input_dim=lexicon_size + 1, #add 1 because word indices start at 1, not 0
                            output_dim=n_embedding_nodes, 
                            mask_zero=True)
rnn.add(embedding_layer)

#recurrent layers (GRU)
recurrent_layer1 = GRU(output_dim=n_hidden_nodes,
                       return_sequences=True, 
                       stateful=True)
rnn.add(recurrent_layer1)

recurrent_layer2 = GRU(output_dim=n_hidden_nodes,
                       return_sequences=True, 
                       stateful=True)
rnn.add(recurrent_layer2)

#prediction (softmax) layer
pred_layer = TimeDistributed(Dense(lexicon_size + 1, #add 1 because word indices start at 1, not 0
                                   activation="softmax"))
rnn.add(pred_layer)

#select optimizer and compile
rnn.compile(loss="sparse_categorical_crossentropy", 
            optimizer='adam')

Training the model

To train, we iterate through the stories and feed each story to the model sentence by sentence. Essentially the data is set up so that each word in the sentence is mapped to the word that follows it, i.e. for each input word x[index], its output class y[index] is x[index+1]. Because the RNN is stateful, it will remember the previous sentence when it reads the next one, until reset_states() is called after the final sentence in the story. We can track the cross-entropy loss for each epoch to determine how well it is learning - the loss should go down with each epoch.
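
Here is the input/output shifting on its own, as a small pure-Python sketch without the model. The helper make_training_pairs is hypothetical; it only shows how each sentence is split into x and y and how the last token of the previous sentence is carried over.

```python
def make_training_pairs(encoded_story):
    '''for each sentence return (x, y) where y is x shifted one word ahead;
    the last token of the previous sentence is prepended so that the first
    word of each later sentence also gets predicted (hypothetical helper)'''
    pairs = []
    prev_eos = None
    for sent in encoded_story:
        if prev_eos is not None:
            sent = [prev_eos] + sent
        pairs.append((sent[:-1], sent[1:]))
        prev_eos = sent[-1]
    return pairs

#first two encoded sentences of the example story shown earlier
pairs = make_training_pairs([[518, 97, 28, 1318], [62, 6, 1196, 30, 295]])
```

For the second sentence this gives x = [1318, 62, 6, 1196, 30] and y = [62, 6, 1196, 30, 295], so the first word of the sentence (62) is predicted from the final token of the previous sentence (1318).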

In [56]:
import numpy

def train_epoch(stories):  
    losses = []  #track cross-entropy loss during training
    for story in stories:
        prev_eos = None
        encoded_story = tokenizer.texts_to_sequences(story) #encode story into word indices
        for sent in encoded_story:
            sent = numpy.array(sent)
            if prev_eos is not None:
                #prepend the last token of the previous sentence so that the
                #first word of this sentence is conditioned on it
                sent = numpy.insert(sent, 0, prev_eos)
            #x is the sentence up to the last word, y is the sentence starting from the second word through the end
            sent_x = sent[None, :-1]
            sent_y = sent[None, 1:, None]
            loss = rnn.train_on_batch(x=sent_x, y=sent_y)
            losses.append(loss)
            prev_eos = sent[-1]
        #finished story, now clear hidden layer states to read a new story
        rnn.reset_states()
    loss = numpy.mean(losses)
    return loss


n_epochs = 10
print "Training RNN on", len(stories), "stories for", n_epochs, "epochs..."
for epoch in range(n_epochs):
    loss = train_epoch(stories)
    print "epoch {} loss: {:.3f}".format(epoch + 1, loss)
Training RNN on 100 stories for 10 epochs...
epoch 1 loss: 6.581
epoch 2 loss: 5.404
epoch 3 loss: 4.678
epoch 4 loss: 3.961
epoch 5 loss: 3.056
epoch 6 loss: 2.217
epoch 7 loss: 1.448
epoch 8 loss: 0.842
epoch 9 loss: 0.490
epoch 10 loss: 0.297
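
One way to read these numbers is to convert them to perplexity. This is a rough sketch: it assumes the reported loss is a natural-log cross-entropy averaged per word, which is only approximately true here because train_epoch averages the per-sentence losses.

```python
import math

#epoch losses copied from the training output above
epoch_losses = [6.581, 5.404, 4.678, 3.961, 3.056, 2.217, 1.448, 0.842, 0.490, 0.297]

#perplexity: the effective number of equally likely choices for each next word
perplexities = [math.exp(loss) for loss in epoch_losses]
#geometric mean of the probability assigned to the correct next word
avg_probs = [math.exp(-loss) for loss in epoch_losses]

for epoch, (ppl, prob) in enumerate(zip(perplexities, avg_probs), start=1):
    print("epoch {} perplexity: {:.1f}, avg. prob. of correct word: {:.3f}".format(epoch, ppl, prob))
```

Keep in mind that this is the loss on the training stories themselves, so a low value mostly indicates that the model has memorized these 100 stories, not that it would do well on new ones.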

Generating sentences

Now that the model is trained, it can be used to predict sentences. We will take some stories, give the model their first four sentences, and have it output the probability distribution over the first word of the fifth sentence. Then we select the word with the highest probability and add it to the sequence (mode='max'). Alternatively, the next word can be generated by random sampling from the probability distribution (mode='random', which is what the call below uses). In either case, we repeat this process, each time predicting the next word based on the sequence so far. We stop generating words either when a maximum number of words is reached (here, 20) or when a token ending in an end-of-sentence marker is generated (e.g. ".").
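
The two ways of choosing the next word can be isolated in a small numpy sketch. The helper pick_next_word and the toy distribution are made up for illustration. The cast to float64 and the renormalization are a precaution I am adding here, since numpy.random.choice can occasionally reject float32 probabilities whose sum is not close enough to 1.

```python
import numpy

def pick_next_word(p_next_word, mode='max', rng=numpy.random):
    '''choose a word index from a probability distribution over the lexicon'''
    if mode == 'max':
        return int(numpy.argmax(p_next_word))
    #cast to float64 and renormalize so the probabilities sum to 1 within numpy's tolerance
    p = numpy.asarray(p_next_word, dtype='float64')
    p = p / p.sum()
    return int(rng.choice(len(p), p=p))

#toy distribution over a five-word lexicon (made-up numbers)
p_toy = numpy.array([0.05, 0.1, 0.6, 0.2, 0.05], dtype='float32')
greedy_word = pick_next_word(p_toy, mode='max')
sampled_words = [pick_next_word(p_toy, mode='random', rng=numpy.random.RandomState(seed))
                 for seed in range(5)]
```

With mode='max' the result is always the same word (index 2 here), so the generated ending is deterministic. With mode='random' less likely words are chosen some of the time, which gives more varied endings but also more errors.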

In [64]:
import random

def predict(init_story, max_words, mode='max'):
    '''generate the endings of stories word by word based on word probabilities predicted by rnn'''
    
    pred_ending = []
    
    '''read initial sentences of story into model'''
    encoded_init_story = tokenizer.texts_to_sequences(init_story)
    for sent in encoded_init_story:
        sent = numpy.array(sent)[None, :]
        p_next_word = rnn.predict_on_batch(sent)[0][-1]
       
    '''now start predicting new words'''
    for idx in range(max_words):
        if mode == 'max':
            #generate word with highest probability of being next in this sequence
            next_word = numpy.argmax(p_next_word)
        elif mode == 'random':
            #sample from probability distribution to get next word
            next_word = numpy.random.choice(a=p_next_word.shape[-1], p=p_next_word)
        pred_ending.append(next_word)
        if lexicon_lookup[next_word][-1] in eos_tokens:
            #an end-of-sentence marker (e.g. punctuation) was generated, so stop generating
            break
        p_next_word = rnn.predict_on_batch(numpy.array(next_word)[None, None])[0][-1]
    
    rnn.reset_states()
    #decode predicted sentence from numerical indices back into words
    pred_ending = [lexicon_lookup[word] for word in pred_ending]
    return pred_ending

'''create lookup table to get words from their indices'''
lexicon_lookup = {index: word for word, index in tokenizer.word_index.items()}
#specify which characters should indicate the end of a sentence and halt generation
eos_tokens = [".", "?", "!"]

for story in random.sample(stories, 15):
    init_story = story[:-1]
    print "INIT STORY:", " ".join(init_story)
    print "GOLD ENDING:", story[-1]
    pred_ending = predict(init_story, max_words=20, mode='random')
    print "PREDICTED ENDING:", " ".join(pred_ending)
    print "\n"
INIT STORY: Jessica decided she wanted to go to the beach. She invited all her friends to go along. They had a great time, but covered in a lot of sticky sand. They searched for a shower for what felt like ages.
GOLD ENDING: Finally they found one and decided it was the best trip ever.
PREDICTED ENDING: for a few trip.


INIT STORY: Gina had been being mean to the new boy in her class. Then a bully began picking on Gina. She now knew how the boy felt. Gina realized she should stop being mean.
GOLD ENDING: She realized she should also apologize to the new boy.
PREDICTED ENDING: she realized she should should other kids.


INIT STORY: Billy's car broke down on the highway. He looked under the hood and realized his starter was broken. The nearest mechanic quoted Billy 300 dollars, which was far too much. He instead called a friend who came and fixed the starter for $100.
GOLD ENDING: Billy drove away happily with a functioning engine.
PREDICTED ENDING: ryder played outside.


INIT STORY: Joanie's mom signed her up for swimming lessons at a lake. Each day she rode a bus to the lake to take lessons. Joanie learned the final test was swimming from a boat to shore. She was petrified and prayed to get out of the test.
GOLD ENDING: On the last day of lessons, the bus broke down and she was spared.
PREDICTED ENDING: for the last day of lessons, the bus down down and she played outside.


INIT STORY: Samantha's dad always taught her how to be self-sufficient. He even taught her how to change a tire on a car. One day Samantha's tire blew while she was driving. She was able to properly change her tire.
GOLD ENDING: Samantha was very grateful to be able to get home safely.
PREDICTED ENDING: samantha was to be guys warranty.


INIT STORY: Betty had a craving for mint ice cream. She went to a local ice cream parlor, but they didn't have mint. She went to a grocery store, but they didn't have mint either. Betty ended up buying some ice, some cream, and some mint.
GOLD ENDING: Betty went home and made delicious ice cream herself.
PREDICTED ENDING: betty went home to work they had a great time, she off a field named membership allows into and to


INIT STORY: Ryder needed to go outside. His owner opened the door for him. Ryder played outside. He came back in smelling like a dead animal.
GOLD ENDING: Ryder had to get a bath.
PREDICTED ENDING: ryder had to get a bath.


INIT STORY: Hal was walking his dog one morning. A cat ran across their path. Hal's dog strained so hard, the leash broke! He chased the cat for several minutes.
GOLD ENDING: Finally Hal lured him back to his side.
PREDICTED ENDING: finally hal lured him back to work out of the year.


INIT STORY: Jerry was making toast. He set it to medium. When the toast came out it was completely burnt. He tried other settings with no better results.
GOLD ENDING: Eventually Jerry bought a new toaster.
PREDICTED ENDING: eventually jerry bought a family.


INIT STORY: The woodworker was not satisfied with the cuts from a bit. He took the bit from the machine and looked at it. The bit had been worn away by a lot of use. He took it to a sharpener and began to grind it.
GOLD ENDING: After a while the old bit was as good as new.
PREDICTED ENDING: after the year.


INIT STORY: Jeff wanted to move out of his house. He had no money to pay for a new one. One day he bought a scratching ticket. He won enough money for a down payment.
GOLD ENDING: Jeff ended up moving to a new house.
PREDICTED ENDING: jeff ended up moving to a living gym.


INIT STORY: Homer decided to go watch a movie. But when he entered the movie theater, there was no where to sit. He found one spot by a bunch of kids. And during the movie, they made lots of noise.
GOLD ENDING: Homer became so annoyed, he decided to sit in the aisle.
PREDICTED ENDING: homer became so annoyed, he decided to sit in the aisle.


INIT STORY: Josh had a parrot that talked. He brought his parrot to school. During show and tell, Josh's parrot said a bad word. The teacher told Joshua not to bring his bird again.
GOLD ENDING: When Josh got home, he was grounded.
PREDICTED ENDING: so far it would be excited.


INIT STORY: Twas the night after the first day of junior high. Amy and her friend Beth were on the phone. They had a lot to catch up on. Amy listened patiently as Beth told her about her day.
GOLD ENDING: She wanted to go 2nd because she knew hers was the better day.
PREDICTED ENDING: she wanted to go out of the test.


INIT STORY: Jane was working at a diner. Suddenly, a customer barged up to the counter. He began yelling about how long his food was taking. Jane didn't know how to react.
GOLD ENDING: Luckily, her coworker intervened and calmed the man down.
PREDICTED ENDING: luckily, her someday she bought all their tim and went to the park they took pictures.


Issues with Keras

If you like scikit-learn, Keras has a scikit-learn wrapper. This is convenient if you want to set up pipelines with more than one model or do cross-validation over several parameters.

In my opinion, the worst thing about Keras is debugging. Because Theano is compiled, you can't easily step through the network during training and view the values of intermediate layers in the model. Theano lets you run in test mode before the model is compiled, so you can see the results of each function. But as far as I can tell Keras doesn't provide anything special in the way of debugging.

Alternatives to Keras

Lasagne: also uses Theano, but possibly less well-maintained than Keras? http://lasagne.readthedocs.io/en/latest/user/installation.html

Chainer: seems fast but I've never developed with it. http://chainer.org/

Stuff I've Found Helpful for Understanding/Implementing RNNs

(this is biased to learning RNNs specifically for text data)

Among the Theano tutorials mentioned above, there are two specifically on RNNs for NLP: semantic parsing and sentiment analysis

The Unreasonable Effectiveness of Recurrent Neural Networks (same model as shown here, with raw Python code)

TensorFlow also has an RNN language model tutorial using the Penn Treebank dataset

This explanation of how LSTMs work and why they are better than plain RNNs (this explanation also applies to the GRU used here)

Another tutorial that documents well both the theory of RNNs and their implementation in Python (and if you care to implement the details of the stochastic gradient descent and backpropagation through time algorithms, this is very helpful)