Open-Domain Dialog

32 papers with code • 1 benchmark • 13 datasets

This task has no description! Would you like to contribute one?

Most implemented papers

KILT: a Benchmark for Knowledge Intensive Language Tasks

facebookresearch/KILT NAACL 2021

We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance.
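
KILT-style evaluation couples each downstream prediction with the Wikipedia pages it was derived from. As a rough illustration (not the official KILT scorer), the sketch below assumes predictions and gold annotations each carry an answer string plus a list of provenance page IDs, and reports answer accuracy together with page-level provenance recall.

```python
# Hypothetical sketch of provenance-aware scoring; the field names ("answer",
# "provenance") are illustrative, not the official KILT format or scorer.
from typing import Dict, List


def evaluate(predictions: List[Dict], gold: List[Dict]) -> Dict[str, float]:
    answer_hits, provenance_hits = 0, 0
    for pred, ref in zip(predictions, gold):
        # Downstream performance: exact match on the answer string.
        if pred["answer"].strip().lower() == ref["answer"].strip().lower():
            answer_hits += 1
        # Provenance: did the model cite at least one gold Wikipedia page?
        if set(pred["provenance"]) & set(ref["provenance"]):
            provenance_hits += 1
    n = len(gold)
    return {"answer_accuracy": answer_hits / n,
            "provenance_recall": provenance_hits / n}


print(evaluate(
    [{"answer": "Paris", "provenance": ["Q90"]}],
    [{"answer": "Paris", "provenance": ["Q90", "Q142"]}],
))
```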

Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems

natashamjaques/neural_chat NeurIPS 2019

To investigate the strengths of this novel metric and interactive evaluation in comparison to state-of-the-art metrics and human evaluation of static conversations, we perform extended experiments with a set of models, including several that make novel improvements to recent hierarchical dialog generation architectures through sentiment and semantic knowledge distillation on the utterance level.

We propose LLM-Eval, a unified multi-dimensional automatic evaluation method for open-domain conversations with large language models (LLMs). Existing evaluation methods often rely on human annotations, ground-truth responses, or multiple LLM prompts, which can be expensive and time-consuming. To address these issues, we design a single prompt-based evaluation method that leverages a unified evaluation schema to cover multiple dimensions of conversation quality in a single model call. We extensively evaluate the performance of LLM-Eval on various benchmark datasets, demonstrating its effectiveness, efficiency, and adaptability compared to state-of-the-art evaluation methods. Our analysis also highlights the importance of choosing suitable LLMs and decoding strategies for accurate evaluation results. LLM-Eval offers a versatile and robust solution for evaluating open-domain conversation systems, streamlining the evaluation process and providing consistent performance across diverse scenarios.
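
The core idea behind LLM-Eval is a single prompt that asks one LLM call to score several quality dimensions at once. The sketch below is only an illustrative approximation, not the paper's exact prompt or schema: `call_llm` stands in for whatever chat-completion client you use, and the dimension names and 0-5 scale are assumptions.

```python
# Illustrative single-prompt, multi-dimensional evaluation in the spirit of
# LLM-Eval. Prompt wording, dimensions, and score range are assumptions.
import json
from typing import Callable, Dict

DIMENSIONS = ["appropriateness", "content", "grammar", "relevance"]

PROMPT_TEMPLATE = """Score the response to the dialogue context on each dimension
from 0 (worst) to 5 (best). Reply with JSON only, e.g. {{"appropriateness": 4, ...}}.

Context:
{context}

Response:
{response}
"""


def llm_eval(context: str, response: str,
             call_llm: Callable[[str], str]) -> Dict[str, int]:
    """Run one LLM call and parse the multi-dimensional scores."""
    raw = call_llm(PROMPT_TEMPLATE.format(context=context, response=response))
    scores = json.loads(raw)
    return {dim: int(scores[dim]) for dim in DIMENSIONS}
```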

Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References

prakharguptaz/multirefeval WS 2019

The aim of this paper is to mitigate the shortcomings of automatic evaluation of open-domain dialog systems through multi-reference evaluation.
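
One simple way to use human-generated multiple references is to score a system response against every reference and keep the best match, so a valid but non-canonical reply is not penalised. The sketch below does this with NLTK's sentence-level BLEU; the max-over-references aggregation is an illustrative choice, not necessarily the paper's exact protocol.

```python
# Max-over-references BLEU: an illustrative multi-reference aggregation,
# not necessarily the evaluation protocol used in the paper.
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu


def multi_ref_bleu(hypothesis: str, references: list) -> float:
    hyp_tokens = hypothesis.split()
    smooth = SmoothingFunction().method1
    # Score against each human reference separately and keep the best match.
    return max(
        sentence_bleu([ref.split()], hyp_tokens, smoothing_function=smooth)
        for ref in references
    )


print(multi_ref_bleu(
    "i am doing great , thanks",
    ["i am fine , thank you", "doing great , thanks for asking"],
))
```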

Unsupervised Evaluation of Interactive Dialog with DialoGPT

shikib/fed SIGDIAL (ACL) 2020

It is important to define meaningful and interpretable automatic evaluation metrics for open-domain dialog research.

Dialogue Response Ranking Training with Large-Scale Human Feedback Data

golsun/dialogrpt EMNLP 2020

Particularly, our ranker outperforms the conventional dialog perplexity baseline by a large margin on predicting Reddit feedback.
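
DialogRPT scores a (context, response) pair directly instead of relying on generator perplexity. The snippet below follows the usage shown on the publicly released Hugging Face checkpoints (e.g. microsoft/DialogRPT-updown); treat the model name and the separator token as assumptions to verify against the repository.

```python
# Rank candidate responses with a DialogRPT checkpoint from the HF Hub.
# The model name and the "<|endoftext|>" context/response separator follow the
# released model cards; verify against golsun/dialogrpt before relying on them.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "microsoft/DialogRPT-updown"  # predicts how much feedback a reply gets
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)


def score(context: str, response: str) -> float:
    inputs = tokenizer.encode(context + "<|endoftext|>" + response,
                              return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs).logits
    return torch.sigmoid(logits).item()


context = "I love NLP!"
candidates = ["Me too!", "Here is why it will change everything ..."]
print(sorted(candidates, key=lambda r: score(context, r), reverse=True))
```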

Hurdles to Progress in Long-form Question Answering

martiansideofthemoon/hurdles-longform-qa NAACL 2021

The task of long-form question answering (LFQA) involves retrieving documents relevant to a given question and using them to generate a paragraph-length answer.
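
A minimal retrieve-then-generate pipeline for LFQA looks like the sketch below: rank documents against the question, then condition a generator on the top hits. The TF-IDF retriever and the `generate_answer` placeholder are illustrative stand-ins, not the retrievers or models studied in the paper.

```python
# Illustrative retrieve-then-generate LFQA pipeline. The TF-IDF retriever and
# the generator placeholder are stand-ins, not the paper's actual components.
from typing import Callable, List

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def retrieve(question: str, documents: List[str], k: int = 3) -> List[str]:
    """Return the k documents most similar to the question."""
    vectorizer = TfidfVectorizer().fit(documents + [question])
    doc_vecs = vectorizer.transform(documents)
    q_vec = vectorizer.transform([question])
    sims = cosine_similarity(q_vec, doc_vecs)[0]
    top = sims.argsort()[::-1][:k]
    return [documents[i] for i in top]


def answer(question: str, documents: List[str],
           generate_answer: Callable[[str], str]) -> str:
    """Condition a paragraph-length generator on the retrieved evidence."""
    evidence = "\n".join(retrieve(question, documents))
    prompt = f"Question: {question}\nEvidence:\n{evidence}\nAnswer:"
    return generate_answer(prompt)
```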

RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems

thu-coai/OpenMEVA 11 Jan 2017

Open-domain human-computer conversation has been attracting increasing attention over the past few years.

Augmenting Neural Response Generation with Context-Aware Topical Attention

nouhadziri/THRED WS 2019

Our model is built upon the basic Seq2Seq model by augmenting it with a hierarchical joint attention mechanism that incorporates topical concepts and previous interactions into the response generation.
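
The joint attention idea can be pictured as attending separately over the dialogue context states and a bag of topic-word embeddings, then feeding both summaries to the decoder. The PyTorch sketch below is a simplified single-level illustration of that idea, not the hierarchical THRED architecture itself.

```python
# Simplified sketch of joint attention over context states and topic embeddings.
# A single-level illustration, not the hierarchical THRED model itself.
import torch
import torch.nn as nn


class JointAttention(nn.Module):
    def __init__(self, hidden: int):
        super().__init__()
        self.ctx_attn = nn.Linear(hidden * 2, 1)    # scores decoder state vs. context
        self.topic_attn = nn.Linear(hidden * 2, 1)  # scores decoder state vs. topic words

    @staticmethod
    def _attend(query, keys, scorer):
        # keys: (batch, len, hidden); query: (batch, hidden)
        q = query.unsqueeze(1).expand(-1, keys.size(1), -1)
        weights = torch.softmax(scorer(torch.cat([q, keys], dim=-1)).squeeze(-1), dim=-1)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)

    def forward(self, dec_state, ctx_states, topic_embs):
        ctx_vec = self._attend(dec_state, ctx_states, self.ctx_attn)
        topic_vec = self._attend(dec_state, topic_embs, self.topic_attn)
        # The decoder consumes both summaries at each generation step.
        return torch.cat([ctx_vec, topic_vec], dim=-1)


attn = JointAttention(hidden=8)
out = attn(torch.randn(2, 8), torch.randn(2, 5, 8), torch.randn(2, 4, 8))
print(out.shape)  # torch.Size([2, 16])
```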

Evaluating Coherence in Dialogue Systems using Entailment

nouhadziri/DialogEntailment NAACL 2019

Evaluating open-domain dialogue systems is difficult due to the diversity of possible correct answers.
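
Entailment-based coherence evaluation treats the dialogue history as the premise and the generated response as the hypothesis, and uses an NLI model's entailment probability as the coherence score. A minimal sketch, assuming an off-the-shelf MNLI checkpoint such as roberta-large-mnli rather than the classifiers trained in the paper:

```python
# Score response coherence as NLI entailment probability.
# roberta-large-mnli is an assumed off-the-shelf checkpoint, not the
# classifier trained in the DialogEntailment paper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
# Look up the entailment label index instead of hard-coding it.
ENTAIL = next(i for i, lbl in model.config.id2label.items() if "entail" in lbl.lower())


def coherence(history: str, response: str) -> float:
    inputs = tokenizer(history, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)
    return probs[0, ENTAIL].item()


print(coherence("A: I just adopted a puppy. B: That's great!",
                "What breed is your new dog?"))
```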

Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog

natashamjaques/neural_chat 30 Jun 2019

Most deep reinforcement learning (RL) systems are not able to learn effectively from off-policy data, especially if they cannot explore online in the environment.
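
A central trick in this line of work is to keep the learned policy close to a pretrained language-model prior by subtracting a KL-style penalty from the reward before doing batch RL on the logged dialogues. The sketch below shows only that reward-shaping step, under assumed per-token log-probabilities; it is not the paper's full KL-control and Q-learning setup.

```python
# Reward shaping for KL-controlled batch RL: penalize tokens where the policy
# strays from the pretrained prior. Illustrative only; not the full algorithm.
import numpy as np


def kl_shaped_rewards(env_rewards: np.ndarray,
                      policy_logp: np.ndarray,
                      prior_logp: np.ndarray,
                      kl_weight: float = 0.1) -> np.ndarray:
    """Per-token rewards minus a weighted log-ratio penalty against the prior."""
    kl_penalty = policy_logp - prior_logp  # > 0 where the policy over-weights a token
    return env_rewards - kl_weight * kl_penalty


rewards = np.array([0.0, 0.0, 1.0])        # e.g. implicit human feedback at turn end
policy_logp = np.array([-1.2, -0.3, -0.8])
prior_logp = np.array([-1.0, -2.0, -0.9])
print(kl_shaped_rewards(rewards, policy_logp, prior_logp))
```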