NLPExplorer

Daniel Deutsch

Number of Papers:- 30

Number of Citations:- 1

First ACL Paper:- 2018

Latest ACL Paper:- 2024

Venues:-

NAACL

CoNLL

COLING

EMNLP

Findings

Eval4NLP

TACL

ACL

EACL

WMT

NLPOSS

On the Role of Summary Content Units in Text Summarization Evaluation NAACL

LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback Findings-NAACL

Finding Replicable Human Evaluations via Stable Ranking Probability NAACL

Mitigating Metric Bias in Minimum Bayes Risk Decoding WMT

Geza Kovacs | Daniel Deutsch | Markus Freitag |

Improving Statistical Significance in Human Evaluation of Automatic Metrics via Soft Pairwise Accuracy WMT

Brian Thompson | Nitika Mathur | Daniel Deutsch | Huda Khayrallah |

MetricX-24: The Google Submission to the WMT 2024 Metrics Shared Task WMT

Juraj Juraska | Daniel Deutsch | Mara Finkelstein | Markus Freitag |

Are LLMs Breaking MT Metrics? Results of the WMT24 Metrics Shared Task WMT

Beyond Human-Only: Evaluating Human-Machine Collaboration for Collecting High-Quality Translation Data WMT

Incorporating Question Answering-Based Signals into Abstractive Summarization via Salient Span Selection EACL

Daniel Deutsch | Dan Roth |

Training and Meta-Evaluating Machine Translation Evaluation Metrics at the Paragraph Level WMT

Daniel Deutsch | Juraj Juraska | Mara Finkelstein | Markus Freitag |

MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task WMT

Quality Estimation Using Minimum Bayes Risk WMT

Subhajit Naskar | Daniel Deutsch | Markus Freitag |

There’s No Data like Better Data: Using QE Metrics for MT Data Filtering WMT

Results of WMT23 Metrics Shared Task: Metrics Might Be Guilty but References Are Not Innocent WMT

The Devil Is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation WMT

Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration EMNLP

Daniel Deutsch | George Foster | Markus Freitag |

Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems Eval4NLP WS

The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics Eval4NLP WS

A Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization ACL

Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics ACL Findings

Daniel Deutsch | Dan Roth |

Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics NAACL

Daniel Deutsch | Rotem Dror | Dan Roth |

On the Limitations of Reference-Free Evaluations of Generated Text EMNLP

Daniel Deutsch | Rotem Dror | Dan Roth |

A Statistical Analysis of Summarization Evaluation Metrics Using Resampling Methods TACL

Daniel Deutsch | Rotem Dror | Dan Roth |

Understanding the Extent to which Content Quality Metrics Measure the Information Quality of Summaries CoNLL EMNLP

Daniel Deutsch | Dan Roth |

Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary TACL

Daniel Deutsch | Tania Bedrax-Weiss | Dan Roth |

Is Killed More Significant than Fled? A Contextual Model for Salient Event Detection COLING

Disha Jindal | Daniel Deutsch | Dan Roth |

SacreROUGE: An Open-Source Library for Using and Developing Summarization Evaluation Metrics EMNLP NLPOSS

Daniel Deutsch | Dan Roth |

Summary Cloze: A New Task for Content Selection in Topic-Focused Summarization EMNLP

Daniel Deutsch | Dan Roth |

A General-Purpose Algorithm for Constrained Sequential Inference CoNLL

Daniel Deutsch | Shyam Upadhyay | Dan Roth |

A Distributional and Orthographic Aggregation Model for English Derivational Morphology ACL

Daniel Deutsch | John Hewitt | Dan Roth |