SCAI workshop @ EMNLP 2020
The top 7 solutions from the current leaderboard are eligible for the ‘wild’ evaluation. To qualify, a solution must score better than the ParlAI Team baselines on at least one of the metrics F1, PPL, or hits. The solution submitted to the ‘wild’ evaluation must be the same as the one tested with the automated evaluation metrics.
The wild evaluation itself involves human conversations with bots. The bots are exposed through a so-called proxy-bot, which randomly connects a person with a bot. The person talks to the bot and scores it on two measures: how much the person likes the conversation and how well the bot plays its part.
To submit a bot for the wild evaluation you should:
You can use the existing ParlAI integration from the first ConvAI competition. To use it, run your bot in a ConvAI world, like this:
from parlai.projects.convai.convai_world import ConvAIWorld
and run your bot with these command line options:
python3 bot.py -bi <BOT_TOKEN> -rbu <SERVER_URL>
where <BOT_TOKEN> is provided by the organizers and <SERVER_URL> is listed below (it differs for the test and production servers).
https://2258.lnsigo.mipt.ru/bot
You can test your bot via Telegram:
The production server’s proxy-bot is also available through Facebook Messenger.
Because the proxy-bot connects users to bots at random, you may not be connected to your own bot. To overcome this, you can bind a Telegram account to a specific bot by sending the /setbot command to the proxy-bot. Once you have set a bot for your account, that account will always be connected to the specified bot. You can unset the preferred bot with the /unset command.
This functionality will be available only on the test server.
Your bot should be able to handle multiple dialogs at once, where each dialog is a private chat with a specific user.
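One way to keep concurrent dialogs apart is to store state per chat. The following is a minimal sketch, not part of the official integration: the names DialogState and DialogManager are hypothetical, and it assumes each incoming message carries a chat identifier that is unique per private chat.

```python
from dataclasses import dataclass, field


@dataclass
class DialogState:
    """State for one private chat (hypothetical structure)."""
    profile: list = field(default_factory=list)   # persona sentences
    history: list = field(default_factory=list)   # messages seen so far


class DialogManager:
    """Routes each message to the state of its own chat."""

    def __init__(self):
        self._dialogs = {}  # chat_id -> DialogState

    def handle(self, chat_id, text):
        # Create the state on first contact, reuse it afterwards.
        state = self._dialogs.setdefault(chat_id, DialogState())
        state.history.append(text)
        return state
```

Keying everything by chat identifier means that messages from different users can arrive interleaved without their histories or profiles getting mixed.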
The very first message your bot receives should look like:
/start
Profile description
Profile description
Profile description
Profile description
That is, one line containing “/start”, followed by four lines with sentences describing the profile.
If you are using the provided ParlAI integration, the first message will be:
your persona: profile description
your persona: profile description
your persona: profile description
your persona: profile description
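The two first-message formats above can be handled by one small parser. This is an illustrative sketch only: the function name parse_first_message is hypothetical, and it assumes the profile lines arrive newline-separated exactly as shown.

```python
PERSONA_PREFIX = "your persona:"


def parse_first_message(text):
    """Return the list of profile sentences, or None if this is not a first message."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return None
    # Raw format: a "/start" line followed by the profile sentences.
    if lines[0] == "/start":
        return lines[1:]
    # ParlAI integration format: every line starts with "your persona:".
    if all(line.startswith(PERSONA_PREFIX) for line in lines):
        return [line[len(PERSONA_PREFIX):].strip() for line in lines]
    return None
```

Returning None for anything else lets the caller treat the message as an ordinary utterance in an ongoing dialog.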
Maximum inactivity time (s): 600
Maximum utterance number: 1000
Maximum inactivity time is the longest period allowed between two successive messages of a bot or a person.
A bot must not copy the input profile description. We test this by checking the bot’s output for 5-grams from the profile.
If an utterance contains such a 5-gram, the following error message will occur:
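A bot can track these limits locally to decide when to drop a dialog's state. The sketch below is an assumption about how one might do that, not a description of how the server enforces the limits; the function name and the exact boundary behavior (strictly more than 600 seconds, 1000 utterances reached) are hypothetical.

```python
MAX_INACTIVITY_SECONDS = 600   # from the limits listed above
MAX_UTTERANCES = 1000          # from the limits listed above


def dialog_expired(last_message_time, now, utterance_count):
    """Return True if a dialog should be treated as finished.

    Times are in seconds (for example, values from time.time()).
    """
    too_quiet = (now - last_message_time) > MAX_INACTIVITY_SECONDS
    too_long = utterance_count >= MAX_UTTERANCES
    return too_quiet or too_long
```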
Error: <class 'convai.exceptions.ProfileTrigramDetectedInMessageError'>:
Example error message received by a bot from the test server:
Send response from agent to chat #183344811: {'id': 'agent', 'text': 'i am a dragon .', 'episode_done': False}
{"ok": false, "error_code": 400, "description": "Error: <class 'convai.exceptions.ProfileTrigramDetectedInMessageError'>: "}
Exception: 400 Client Error: Bad Request for url: https://2258.lnsigo.mipt.ru/bot<bot_id>/sendMessage
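To avoid this error, a bot can check its own reply for 5-gram overlap with the profile before sending it. The following sketch assumes whitespace tokenization and case-insensitive matching; the server's actual tokenization is not documented here, so treat both as assumptions. The function names are hypothetical.

```python
def ngrams(tokens, n):
    """Return the set of all n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def copies_profile(utterance, profile_sentences, n=5):
    """Return True if the utterance shares any n-gram with a profile sentence."""
    utterance_ngrams = ngrams(utterance.lower().split(), n)
    for sentence in profile_sentences:
        if utterance_ngrams & ngrams(sentence.lower().split(), n):
            return True
    return False
```

For example, with the profile sentence 'i am a dragon .', the reply 'i am a dragon .' from the log above shares one 5-gram with the profile and would be flagged, so the bot could rephrase before sending.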
We use two metrics:
Additional details on the API are available in the corresponding document.