I'm currently taking the Udacity Deep Learning course (I really like it :) and I came across an article: Deep Learning for Chatbots, Part 1 – Introduction. It's exactly the topic I'm really interested in.
It's a nice introduction, and it sheds some light on the capabilities of state-of-the-art systems. E.g.: "However, we’re still at the early stages of building generative models that work reasonably well. Production systems are more likely to be retrieval-based for now."
I also found interesting references about incorporating context into generative models: "Experiments in Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models and Attention with Intention for a Neural Network Conversation Model both go into that direction."
Tuesday, April 19, 2016
Sunday, April 17, 2016
Question Answering datasets
To extend the list of conversational datasets, here is a collection of Question Answering (QA) datasets. A question-answer pair is a very short conversation, so it can also be used to train chatbots. If you want to use a chatbot to give information to customers, for example as automated customer support or an automated sales agent on your website, this type of dataset can be particularly useful.
The WikiQA corpus is a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering.
TREC (Text REtrieval Conference) usually has a QA task with datasets associated with it. Most of the datasets focus on the factoid QA task, but the one in 2015 is a kind of live QA: the task was to answer questions on Yahoo Answers.
Manually-generated factoid question/answer pairs with difficulty ratings from Wikipedia articles. Dataset includes articles, questions, and answers.
There are also some manually curated QA datasets from Yahoo Answers, provided by Yahoo.
You can also download the Stack Overflow questions and answers. It's a domain-specific but huge dataset.
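To make the idea of training a chatbot from question-answer pairs a bit more concrete, here is a minimal sketch of a retrieval-based responder. It is only an illustration under several assumptions: the `QA_PAIRS` list is made-up toy data (not taken from any of the datasets above), the function names are hypothetical, and bag-of-words cosine similarity is just a simple stand-in for whatever matching model a real system would use.

```python
import math
import re
from collections import Counter

# Toy question-answer pairs; purely illustrative, not from any dataset listed above.
QA_PAIRS = [
    ("What are your opening hours?", "We are open from 9am to 5pm, Monday to Friday."),
    ("How can I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("Do you ship internationally?", "Yes, we ship to most countries."),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def answer(query, pairs=QA_PAIRS):
    """Return the answer whose stored question is most similar to the query."""
    query_vec = Counter(tokenize(query))
    best = max(pairs, key=lambda p: cosine(query_vec, Counter(tokenize(p[0]))))
    return best[1]


if __name__ == "__main__":
    print(answer("I forgot my password, how do I reset it?"))
```

With a real QA dataset you would replace `QA_PAIRS` with the loaded question-answer pairs; the lookup logic stays the same, although a linear scan like this would probably be too slow for something as large as the Stack Overflow dump.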
Saturday, April 9, 2016
Conversational datasets to train a chatbot

Data collected from Twitter (by Chenhao Tan):
Argument trees, "successful persuasion" metadata, and related data from the subreddit ChangeMyView. First release 2016.
Multi-community engagement (users posting, or not posting, in different subreddits since Reddit's inception). Data includes the texts of posts made and associated metadata, such as the subreddit, the number of upvotes, and the time stamp. First release 2015.
Cornell natural-experiment tweet pairs: data for investigating whether phrasing affects message propagation, controlling for user and topic. The zip file can be retrieved from the given URL. First release 2014.
EDIT: you can also check the collection of QA datasets.
Also check out this more comprehensive list of dialogue datasets.
Best in March
This month is still about AI, more specifically about chatbots. There was a lot of news about this:
Techcrunch wrote that Facebook’s Messenger Bot Store could be the most important launch since the App Store.
Tay, Microsoft's AI Twitter chatbot got racist.
PocketConfidant developed an AI for coaching through a chat interface.
Robot At SXSW Says She Wants To Destroy Humans ...
There were also more companies at Y Combinator's demo day dealing with this topic:
- Nova uses artificial intelligence to write sales emails automatically. It can search the web and social media for facts about the recipient that it can include in the email, like that they were recently the subject of a news article, or enjoy a specific hobby. Nova’s emails perform better than those written by humans. They get a 67% open rate and an 11% click-through rate.
- MSG.ai helps with operating chatbots on multiple platforms. It offers a centralized dashboard to detect trends and sentiments, and integrates with Salesforce Desk and Zendesk. With Msg.ai’s intelligence and A/B testing, businesses can maximize the benefit of their chatbots.
- Sendbird provides a UI, SDK and backend to easily add chat functionality for websites and apps.
- With Chatfuel, those looking to build and engage an audience can use the native interface to create bots that help facilitate conversations. More than 130,000 bots have been created on the platform. Publishers, like TechCrunch and Forbes, can build on Chatfuel and deploy to any messenger.
- Prompt is a chatbot building platform that lets businesses build a chatbot in 15 minutes with 15 lines of code. It can then be deployed instantly to Slack, Line, WeChat, SMS, and soon Facebook Messenger.