CMU

Carnegie Mellon University (CMU) is a private research university in Pittsburgh, Pennsylvania. The institution was established in 1900 by Andrew Carnegie as the Carnegie Technical Schools. In 1912, it became the Carnegie Institute of Technology and began granting four-year degrees. In 1967, it became Carnegie Mellon University through its merger with the Mellon Institute of Industrial Research, founded in 1913 by Andrew Mellon and Richard B. Mellon and formerly a part of the University of Pittsb...

Total links: 361
Total paper mentions: 794

First ACL paper: 1998
Latest ACL paper: 2019

Links

Along with Their Literature Mentions


http://childes.psy.cmu.edu/data/Romance/French/
[mdr] Une analyse préliminaire du rire chez des enfants de 18 à 36 mois ([lol]: a preliminary study of laughter in 18- to 36- month old children) [in French]

http://www.cs.cmu.edu/%7Enbach/papers/
Relation Classification via Multi-Level Attention CNNs

http://childes.psy.cmu.edu/data/EastAsian/Chi
The Role of Qualia Structure in Mandarin Children Acquiring Noun-modifying Constructions

http://www.cs.cmu.edu/~jhclark/
Improving the neural network-based machine transliteration for low-resourced language pair

http://www.cs.cmu.edu/~ark/TweetNLP/
Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)

http://rtw.ml.cmu.edu/rtw
Affective Common Sense Knowledge Acquisition for Sentiment Analysis

http://www.naclo.cs.cmu.edu/problems2011/E.pdf
Introducing Computational Concepts in a Linguistics Olympiad

http://articulab.hcii.cs.cmu.edu/sigdial2016/
Special Session - The Future Directions of Dialogue-Based Intelligent Personal Assistants

http://www.cs.cmu.edu/~max
Building Practical Spoken Dialog Systems

http://www.is.cs.cmu.edu/
Advances in meeting recognition

http://www.cs.cmu.edu/mbilotti/resources
Evaluation for Scenario Question Answering Systems

http://kitchen.cs.cmu.edu/
Discovering Causal Relations in Textual Instructions

http://www.cs.cmu.edu/~cprose/SIDE.html
SIDE: The Summarization Integrated Development Environment

http://cs.cmu.edu/
Retrofitting Word Vectors to Semantic Lexicons
Community Evaluation and Exchange of Word Vectors at wordvectors.org
Sparse Overcomplete Word Vector Representations
Early Gains Matter: A Case for Preferring Generative over Discriminative Crowdsourcing Models
Semi-supervised Learning of Naive Bayes Classifier with feature constraints
A Virtual Manipulative for Learning Log-Linear Models
Multilingual Open Relation Extraction Using Cross-lingual Projection
Examining the Relationship between Preordering and Word Order Freedom in Machine Translation

http://www.ark.cs.cmu.edu/cdyer/ru-600/
The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References

http://tts.speech.cs.cmu.edu/apappu/navagati/
The Structure and Generality of Spoken Route Instructions

http://www.cs.cmu.edu/~dougb/ident.html
Mining the Web for Bilingual Text

http://www.is.cs.cmu.edu/papers/
Domain Portability in Speech-to-Speech Translation

http://www.cs.cmu.edu/~shomir/um_corpus.html
The Creation of a Corpus of English Metalanguage
Toward Automatic Processing of English Metalanguage

http://www.cs.cmu.edu/~mdenkows/meteor-
Meteor Universal: Language Specific Translation Evaluation for Any Target Language

http://childes.psy.cmu.edu/manuals/CHAT.pdf
Automatic Measurement of Syntactic Development in Child Language
The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous Speech for Romance Languages

http://www.cs.cmu.edu/~WebKB/
Web Mining for Unsupervised Classification

http://www.cs.cmu.edu/~maheshj/datasets/acl09short.html
Exploring the Use of Word Relation Features for Sentiment Classification

http://www.cs.cmu.edu/yww/data/petpeeves.zip
That’s So Annoying!!!: A Lexical and Frame-Semantic Embedding Based Data Augmentation Approach to Automatic Categorization of Annoying Behaviors using #petpeeve Tweets

http://www.link.cs.cmu.edu/link/index.html
Integrating Linguistic Resources: The American National Corpus Model
Exploiting Semantic Web Technologies for Intelligent Access to Historical Documents

https://www.ark.cs.cmu.edu/TweetNLP
Dynamic Language Models for Streaming Text

http://www.cs.cmu.edu/~TextLearning/
Parametric Models of Linguistic Count Data

http://projectile.is.cs.cmu.edu/research/public/tal
Speech to Speech Translation for Medical Triage in Korean

http://www.cs.cmu.edu/~
Language Model Adaptation for Statistical Machine Translation via Structured Query Models

http://rtw.ml.cmu.edu/sslnlp09
Coupling Semi-Supervised Learning of Categories and Relations

http://www.ark.cs.cmu.edu/LexSem
Comprehensive Annotation of Multiword Expressions in a Social Web Corpus

http://www.cs.cmu.edu/tom/science2008
Concept Classification with Bayesian Multi-task Learning

http://www.speech.cs.cmu.edu/cgi-bin/
Echoes of Persuasion: The Effect of Euphony in Persuasive Communication
KU Leuven at HOO-2012: A Hybrid Approach to Detection and Correction of Determiner and Preposition Errors in Non-native English Text
G2P Conversion of Proper Names Using Word Origin Information
Pronunciation Modeling in Spelling Correction for Writers of English as a Foreign Language
Idiom Savant at Semeval-2017 Task 7: Detection and Interpretation of English Puns
Robust Dictionary Lookup in Multiple Noisy Orthographies
X575: Writing rengas with web services
A Text Normalisation System for Non-Standard English Words
Native Language Identification using Phonetic Algorithms
An Unsupervised Model for Text Message Normalization
Deep-speare: A joint neural model of poetic language, meter and rhyme
Optimal Data Set Selection: An Application to Grapheme-to-Phoneme Conversion
Free English and Czech telephone speech corpus shared under the CC-BY-SA 3.0 license
Automation and Evaluation of the Keyword Method for Second Language Learning
Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Classifying Recognition Results for Spoken Dialog Systems
Construction and Analysis of Word-level Time-aligned Simultaneous Interpretation Corpus

http://www.cs.cmu.edu/~dmortens/
Grapheme-to-Phoneme Models for (Almost) Any Language

http://www-2.cs.cmu.edu/~lemur
Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora

http://www.cs.cmu.edu/People/ref/mlim/
When is an Embedded MT System “Good Enough” for Filtering?

http://www.cs.cmu.edu/afs/cs/project/
Regular Expression Guided Entity Mention Mining from Noisy Web Data
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity

http://www.cs.cmu.edu/alavie/METEOR/
An Empirical Comparison Between N-gram and Syntactic Language Models for Word Ordering
Fluent Translations from Disfluent Speech in End-to-End Speech Translation

http://www.cs.cmu.edu/~alavie/papers/BanerjeeLavie2005-
Learning Translation Rules for a Bidirectional English-Filipino Machine Translator

http://childes.psy.cmu.edu/data-
Computational simulations of second language construction learning

http://www.ark.cs.cmu.edu/SEMAFOR/
SXUCFN-Core: STS Models Integrating FrameNet Parsing Information
Any-language frame-semantic parsing
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogue Systems
Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken Language Understanding
Unsupervised extractive summarization via coverage maximization with syntactic and semantic concepts
Jointly Modeling Inter-Slot Relations by Random Walk on Knowledge Graphs for Unsupervised Spoken Language Understanding

http://boston.lti.cs.cmu.edu/clueweb09/wiki/
Combinaison de ressources générales pour une contextualisation implicite de requêtes (Query Contextualization and Reformulation by Combining External Corpora) [in French]

http://www.cs.cmu.edu/~ref/mlim/chapter3.html
Research on a Model of Extracting Persons’ Information Based on Statistic Method and Conceptual Knowledge

http://www.naclo.cs.cmu
Introducing Computational Concepts in a Linguistics Olympiad

http://www.cs.cmu.edu/~enron/
Towards the Orwellian Nightmare: Separation of Business and Personal Emails
Distractorless Authorship Verification
Degrees of Orality in Speech-like Corpora: Comparative Annotation of Chat and E-mail Corpora
Extracting Social Power Relationships from Natural Language
Annotating Large Email Datasets for Named Entity Recognition with Mechanical Turk
Evaluating the Ontology underlying sMail - the Conceptual Framework for Semantic Email Communication

http://childes.psy.cmu.edu/topics/
Plural Problems in the Nominal Morphology of Marathi

http://boston.lti.cs.cmu.edu/Data/clueweb09/
Nonlinear Evidence Fusion and Propagation for Hyponymy Relation Mining
Random Walk Inference and Learning in A Large Scale Knowledge Base
Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition
Corpus-based Semantic Class Mining: Distributional vs. Pattern-Based Approaches
Bootstrapping Biomedical Ontologies for Scientific Text using NELL
Discovering Relations between Noun Categories

http://www.cs.cmu.edu/zollmann/samt/
N-Gram-Based Statistical Machine Translation versus Syntax Augmented Machine Translation: Comparison and System Combination

http://www.speech.cs.cmu.edu/cgi-
A Beam-Search Decoder for Normalization of Social Media Text with Application to Machine Translation
Extending Pronunciation Lexicons via Non-phonemic Respellings
Ensemble Methods for Native Language Identification
What Makes Writing Great? First Experiments on Article Quality Prediction in the Science Journalism Domain
ProPOSEL: a human-oriented prosody and PoS English lexicon for machine-learning and NLP
Transliteration Alignment
Report of NEWS 2011 Machine Transliteration Shared Task
Priming vs. Inhibition of Optional Infinitival “to”

http://www.speech.cs.cmu.edu/haitian/
The Gulf of Guinea Creole Corpora
Monolingual Distributional Profiles for Word Substitution in Machine Translation

http://www.cs.cmu.edu/~aberger/software
Toward a Scoring Function for Quality-Driven Machine Translation

http://www.cs.cmu.edu/afs/cs/project/theo-
Selecting Corpus-Semantic Models for Neurolinguistic Decoding
A Modified Cosine-Similarity based Log Kernel for Support Vector Machines in the Domain of Text Classification

http://www.cs.cmu.edu/~ark/TurboParser/
Automatic Selection of Context Configurations for Improved Class-Specific Word Representations

http://www.cs.cmu.edu/afs/cs/project/theo-73/www/science2008/data.html
Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding

http://www.ark.cs.cmu.edu/MT/
Benchmarking SMT Performance for Farsi Using the TEP++ Corpus
Hindi-to-Urdu Machine Translation through Transliteration
That’s Not What I Meant! Using Parsers to Avoid Structural Ambiguities in Generated Text
Quality Estimation for Synthetic Parallel Data Generation
The Operation Sequence Model—Combining N-Gram-Based and Phrase-Based Statistical Machine Translation

http://rtw.ml.cmu.edu/rtw/
Automatic Evaluation of Commonsense Knowledge for Refining Japanese ConceptNet
Never-Ending Multiword Expressions Learning
Jointly Embedding Relations and Mentions for Knowledge Population
KGEval: Accuracy Estimation of Automatically Constructed Knowledge Graphs
Collectively Representing Semi-Structured Data from the Web
Construction of the Literature Graph in Semantic Scholar
Towards Never Ending Language Learning for Morphologically Rich Languages

http://www-cgi.cs.cmu.edu/~dr/TGrep2/tgrep2.pdf
Netgraph – Making Searching in Treebanks Easy

http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-
Learning Field Compatibilities to Extract Database Records from Unstructured Text
Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification?

http://www.phil.cmu.edu/projects/tetrad/tetrad4.html
Comparing Triggering Policies for Social Behaviors

http://www.speech.cs.cmu.edu/SLM/toolkit.html
A Challenge Set for Advancing Language Modeling
Combining Neural and Non-Neural Methods for Low-Resource Morphological Reinflection
CIEMPIESS: A New Open-Sourced Mexican Spanish Radio Corpus
Cross-lingual Transfer of Correlations between Parts of Speech and Gaze Features
Weakly Supervised Part-of-speech Tagging Using Eye-tracking Data
Entropy-based Training Data Selection for Domain Adaptation
Computational Approaches to Sentence Completion
Development of Speech corpora for different Speech Recognition tasks in Malayalam language
If you can’t beat them, join them: the University of Alberta system description

http://rtw.ml.cmu.edu/
Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World
Unsupervised Relation Extraction of In-Domain Data from Focused Crawls
A Generative Entity-Mention Model for Linking Entities with Knowledge Base
Relating Simple Sentence Representations in Deep Neural Networks and the Brain

http://www.ark.cs.cmu.edu/ArabicSST/
Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study

http://projectile.is.cs.cmu.edu/research/public/isa/index.htm
Competitive Grouping in Integrated Phrase Segmentation and Alignment Model

http://rtw.ml.cmu.edu/readtheWeb.html
Minimização do Impacto do Problema de Desvio de Conceito por Meio de Acoplamento em Ambiente de Aprendizado Sem Fim (Minimizing the Impact of the Concept Drift Problem by Using a Framework of Endless Learning) [in Portuguese]

http://www.cgi.sc.cmu.edu/
Learning Translation Rules for a Bidirectional English-Filipino Machine Translator

http://www.cs.cmu.edu/chongw/slda/
Generative Topic Embedding: a Continuous Representation of Documents

http://www.naclo.cs.cmu.edu/
Introducing Computational Concepts in a Linguistics Olympiad
The North American Computational Linguistics Olympiad (NACLO)

http://reports-archive.adm.cs.cmu.edu/
Correction Annotation for Non-Native Arabic Texts: Guidelines and Corpus

http://www.cs.cmu.edu/~awb
Building Practical Spoken Dialog Systems

http://rtw.ml.cmu.edu/resources/
Mapping Verbs in Different Languages to Knowledge Base Relations using Web Text as Interlingua

http://www.ark.cs.cmu.edu/bills
Textual Predictors of Bill Survival in Congressional Committees
Linguistic Structured Sparsity in Text Categorization

https://cs.cmu.edu/
RtGender: A Corpus for Studying Differential Responses to Gender

http://projectile.is.cs.cmu.edu/research/public/tools/s
A Walk on the Other Side: Using SMT Components in a Transfer-Based Translation System

http://www.cs.cmu.edu/~bmurphy/NNSE/
Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding

http://www.speech.cs.cmu.edu/festival
Transliteration of Proper Names in Cross-Lingual Information Retrieval

http://www.ark.cs.cmu.edu/TweetNLP/clusters/50mpaths2
Automatic Keyword Extraction on Twitter
Splusplus: A Feature-Rich Two-stage Classifier for Sentiment Analysis of Tweets

http://www.cnbc.cmu.edu/Resources/
Named Entity Recognition with Long Short-Term Memory

http://www.speech.cs.cmu.edu/pocketsphinx
Evaluating a Spoken Dialogue System that Detects and Adapts to User Affective States

http://www.speech.cs.cmu.edu/
Syllable weight encodes mostly the same information for English word segmentation as dictionary stress
Can Chinese Phonemes Improve Machine Transliteration?: A Comparative Study of English-to-Chinese Transliteration Models
Factors Influencing the Surprising Instability of Word Embeddings
Correcting General Purpose ASR Errors using Posteriors
Practical Evaluation of Speech Recognizers for Virtual Human Dialogue Systems
“Let Everything Turn Well in Your Wife”: Generation of Adult Humor Using Lexical Constraints

https://learnlab.web.cmu.edu/datashop/index.jsp
Uncertainty Corpus: Resource to Study User Affect in Complex Spoken Dialogue Systems

http://www.ark.cs.cmu.edu/SEMAFOR
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
Frame-Semantic Parsing
Integrating lexicographic examples in a lexical network (Intégration relationnelle des exemples lexicographiques dans un réseau lexical) [in French]
Probabilistic Frame-Semantic Parsing
Semantic Frames to Predict Stock Price Movement
An Exact Dual Decomposition Algorithm for Shallow Semantic Parsing with Constraints
Statistical Models for Frame-Semantic Parsing

http://www.cs.cmu.edu/~mccallum/bow
Using a Recurrent Neural Network Model for Classification of Tweets Conveyed Influenza-related Information
MMR-based Feature Selection for Text Categorization
學術會議資訊之擷取及其應用 (Information Extraction for Academic Conference and It’s Application) [In Chinese]
Domain Specific Speech Acts for Spoken Language Translation
Predicting Morphological Types of Chinese Bi-Character Words by Machine Learning Approaches

http://www.cs.cmu.edu/~roseh/Papers/wordnet
Incorporation of WordNet Features to n-gram Features in a Language Modeler

http://cairo.lti.cs.cmu.edu/kbp/2015/
Event Coreference Resolution with Multi-Pass Sieves

http://www.ark.cs.cmu.edu/TweetNLP
GU-MLT-LT: Sentiment Analysis of Short Messages using Linguistic Features and Stochastic Gradient Descent
RTRGO: Enhancing the GU-MLT-LT System for Sentiment Analysis of Short Messages
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
A Simple Bayesian Modelling Approach to Event Extraction from Twitter
Using Skipgrams, Bigrams, and Part of Speech Features for Sentiment Classification of Twitter Messages
A Dependency Parser for Tweets

http://artigas.lti.cs.cmu.edu/rite/Main_Page
基於單語言機器翻譯技術改進中文文字蘊涵 (Improving Chinese Textural Entailment by Monolingual Machine Translation Technology) [In Chinese]
中文文字蘊涵系統之特徵分析 (Feature Analysis of Chinese Textual Entailment System) [In Chinese]
JU_CSE_NLP: Language Independent Cross-lingual Textual Entailment System

http://www.cs.cmu.edu/qing/giza/
Combining fast_align with Hierarchical Sub-sentential Alignment for Better Word Alignments

http://boston.lti.cs.cmu.edu/clueweb12/
Semantic Search in Documents Enriched by LOD-based Annotations

http://www.speech.cs.cmu.edu/cgibin/cmudict
PronouncUR: An Urdu Pronunciation Lexicon Generator
Phonological Pun-derstanding

http://aclia.lti.cs.cmu.edu/wiki/TaskDefinition#Format
Document Re-ranking via Wikipedia Articles for Definition/Biography Type Questions

http://childes.psy.cmu.edu/manuals
Morphosyntactic Analysis of the CHILDES and TalkBank Corpora

http://rtw.ml.cmu.edu/papers/mitchell-iswc09.pdf
Empirical Studies in Learning to Read

http://www.cs.cmu.edu/~ark/
Any-language frame-semantic parsing
SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)

http://www.cs.cmu.edu/afs/cs/project/theo-11/
MT and Topic-Based Techniques to Enhance Speech Recognition Systems for Professional Translators

http://boston.lti.cs.cmu.edu/Data/clueweb09
Crowdsourcing Document Relevance Assessment with Mechanical Turk

http://www.speech.cs.cmu.edu/cgi-bin/cmudict
Substring-based Transliteration with Conditional Random Fields
An Implementation of a Flexible Author-Reviewer Model of Generation using Genetic Algorithms
Tweet Normalization with Syllables
Humor Recognition and Humor Anchor Extraction
Evaluation and collection of proper name pronunciations online
Modeling Language Proficiency Using Implicit Feedback
Creative language explorations through a high-expressivity N-grams query language
Exploiting Syntactic Structures for Humor Recognition
Entity Linking for Spoken Language
Factored Language Model based on Recurrent Neural Network
A Web Application for Automated Dialect Analysis
Bekli: A Simple Approach to Twitter Text Normalization
NgramQuery - Smart Information Extraction from Google N-gram using External Resources
Model Invertibility Regularization: Sequence Alignment With or Without Parallel Data
A Computational Approach to the Automation of Creative Naming
An MDL-based approach to extracting subword units for grapheme-to-phoneme conversion
Exploration of the Impact of Maximum Entropy in Recurrent Neural Network Language Models for Code-Switching Speech
A Comparison of Entity Matching Methods between English and Japanese Katakana
Using English Acoustic Models for Hindi Automatic Speech Recognition
Modeling Sentiment Association in Discourse for Humor Recognition
Inducing Search Keys for Name Filtering
Homonym Detection For Humor Recognition In Short Text
An Ensemble of Grapheme and Phoneme for Machine Transliteration
Recognizing Humour using Word Associations and Humour Anchor Extraction
Name Matching between Roman and Chinese Scripts: Machine Complements Human
Incorporating Pronunciation Variation into Different Strategies of Term Transliteration
A Real-life, French-accented Corpus of Air Traffic Control Communications
Ambient Search: A Document Retrieval System for Speech Streams
Pair Language Models for Deriving Alternative Pronunciations and Spellings from Pronunciation Dictionaries
Making Computers Laugh: Investigations in Automatic Humor Recognition
Predicting the Difficulty of Language Proficiency Tests
BRAINSUP: Brainstorming Support for Creative Sentence Generation
How to Memorize a Random 60-Bit String
Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and A Network-based ASR System
Semi-Supervised Lexicon Mining from Parenthetical Expressions in Monolingual Web Pages
以語文特徵為基之中學閱讀測驗短文分級 (Using Linguistic Features to Classify Texts for Reading Comprehension Tests at the High School Levels) [In Chinese]
Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation
Computerized Analysis of a Verbal Fluency Test
A Hybrid Approach to English-Korean Name Transliteration
ProPOSEL: A Prosody and POS English Lexicon for Language Engineering
LDC Forced Aligner
Why is “SXSW” trending? Exploring Multiple Text Sources for Twitter Topic Summarization
Beyond Normalization: Pragmatics of Word Form in Text Messages
Automatic Recognition of Cantonese-English Code-Mixing Speech
Generating Topical Poetry
Readability Assessment of Translated Texts
Predicting the Spelling Difficulty of Words for Language Learners
Détection de transcriptions incorrectes de parole non-native dans le cadre de l’apprentissage de langues étrangères (Detection of incorrect transcriptions of non-native speech in the context of foreign language learning) [in French]
A Broad-Coverage Normalization System for Social Media Language
A Sequence Alignment Model Based on the Averaged Perceptron

http://infinitive.lti.cs.cmu.edu:9090
Exploiting Machine-Transcribed Dialog Corpus to Improve Multiple Dialog States Tracking Methods

http://www.is.cs.cmu.edu/trl
Multi-Tier Annotations in the Verbmobil Corpus
Bikers Accessing the Web: The SmartWeb Motorbike Corpus
SmartWeb UMTS Speech Data Collection: The SmartWeb Handheld Corpus

http://childes.psy.cmu.edu
Annotating Multi-media/Multi-modal Resources with ELAN
ELAN: a Professional Framework for Multimodality Research
Parsing the CHILDES Database: Methodology and Lessons Learned
I will shoot your shopping down and you can shoot all my tins—Automatic Lexical Acquisition from the CHILDES Database
Morphosyntactic Analysis of the CHILDES and TalkBank Corpora
Talkbank: Building an Open Unified Multimodal Database of Communicative Interaction
Vulnerability in Acquisition, Language Impairments in Dutch: Creating a VALID Data Archive
POSCAT: A Morpheme-based Speech Corpus Annotation Tool

http://www.naclo.cs.cmu.edu/assets/problems/
Introducing Computational Concepts in a Linguistics Olympiad

http://nlp.qatar.cmu.edu/resources/
A Human Judgement Corpus and a Metric for Arabic MT Evaluation

http://www.radar.cs.cmu.edu/external.asp
Understanding Temporal Expressions in Emails

http://nlp.qatar.cmu.edu/resources/SuMT
SuMT: A Framework of Summarization and MT

http://www.cs.cmu.edu/f)dupont/m197p/
Toward General-Purpose Learning for Information Extraction

http://www.cs.cmu.edu/~bbd/ExploreEM_package.zip
Breaking the Closed World Assumption in Text Classification

http://www.speech.cs.cmu.edu/sphinx/
Multi-Human Dialogue Understanding for Assisting Artifact-Producing Meetings
Domain Adaptation of Maximum Entropy Language Models

http://www.cs.cmu.edu/~cprose/Graffiti.html
Modeling the Use of Graffiti Style Features to Signal Social Relations within a Multi-Domain Learning Paradigm

http://www.ark.cs.cmu.edu/mheilman/questions
Supersense Tagging for Arabic: the MT-in-the-Middle Attack

http://childes.psy.cmu.edu/morgrams
Morphosyntactic Analysis of the CHILDES and TalkBank Corpora

http://mocap.cs.cmu.edu
Heterogeneous Data Sources for Signed Language Analysis and Synthesis: The SignCom Project

http://uima.lti.cs.cmu.edu
Promoting Interoperability of Resources in META-SHARE

http://www.cs.cmu.edu/~blangner
Building Practical Spoken Dialog Systems

https://www.ark.cs.cmu.edu/TurboParser
An Out-of-Domain Test Suite for Dependency Parsing of German

http://wiki.cnbc.cmu.edu/Objects
G-TUNA: a corpus of referring expressions in German, including duration information

http://www.cs.cmu.edu/listen
Towards Using EEG to Improve ASR Accuracy

http://demo.clab.cs.cmu.edu/ethical_nlp/
Socially Responsible NLP

http://childes.psy.cmu.edu/grasp/
Syntactic annotation of spoken utterances: A case study on the Czech Academic Corpus

http://www.cs.cmu.edu/ralf/langid.html
Non-linear Mapping for Improved Identification of 1300+ Languages

http://www.ark.cs.cmu.edu/SEMAFOR/
Event Extraction as Frame-Semantic Parsing

http://www.cs.cmu.edu/afs/cs/user/aberger/
Modeling Consensus: Classifier Combination for Word Sense Disambiguation

http://www.ark.cs.cmu.edu/AQMAR
A Survey of Arabic Named Entity Recognition and Classification
Recall-Oriented Learning of Named Entities in Arabic Wikipedia

http://cairo.lti.cs.cmu.edu/kbp/2015/event/annotation
Joint Inference for Event Coreference Resolution

http://www.cs.cmu.edu/ralf/papers.html
Adapting an Example-Based Translation System to Chinese

http://www-2.cs.cmu.edu/~enron/
Hands-On NLP for an Interdisciplinary Audience

http://www.ark.cs.cmu.edu/TweetNLP/clusters/
A Dependency Parser for Tweets

http://www.naclo.cs.cmu.edu
Introducing Computational Concepts in a Linguistics Olympiad
The North American Computational Linguistics Olympiad (NACLO)

http://speech.sv.cmu.edu/HRItk
HRItk: The Human-Robot Interaction ToolKit Rapid Development of Speech-Centric Interactive Systems in ROS

https://www.cs.cmu.edu/afs/cs/
Speaking, Seeing, Understanding: Correlating semantic models with conceptual representation in the brain

http://www-2.cs.cmu.edu/
Wordform- and Class-based Prediction of the Components of German Nominal Compounds in an AAC System
Teaching Applied Natural Language Processing: Triumphs and Tribulations
Towards Conversational QA: Automatic Identification of Problematic Situations and User Intent
Grouping business news stories based on salience of named entities
An ontology-based approach in the literary research: two case-studies

http://www.cs.cmu.edu/lemur
Matching Inconsistently Spelled Names in Automatic Speech Recognizer Output for Information Retrieval

http://www.cs.cmu.edu/~ytsvetko/definiteness
Automatic Classification of Communicative Functions of Definiteness

http://www.ark.cs.cmu
Stacking or Supertagging for Dependency Parsing – What’s the Difference?
A Bayesian Mixed Effects Model of Literary Character
Dynamic Language Models for Streaming Text
Frame-Semantic Role Labeling with Heterogeneous Annotations
UNIBA: Sentiment Analysis of English Tweets Combining Micro-blogging, Lexicon and Semantic Features
Unsupervised Parsing for Generating Surface-Based Relation Extraction Patterns

http://www.cs.cmu.edu/apparikh/plre.html
Language Modeling with Power Low Rank Ensembles

http://childes.psy.cmu.edu/
LEXUS, a web-based tool for manipulating lexical resources lexicon
Studying the Effect of Input Size for Bayesian Word Segmentation on the Providence Corpus
Construction and Automatization of a Minnan Child Speech Corpus with some Research Findings
Metadata Collection Records for Language Resources
Active Learning for Building a Corpus of Questions for Parsing
Challenges in modality annotation in a Brazilian Portuguese Spontaneous Speech Corpus
High-accuracy Annotation and Parsing of CHILDES Transcripts
FOLKER: An Annotation Tool for Efficient Transcription of Natural, Multi-party Interaction
Morphosyntactic Analysis of the CHILDES and TalkBank Corpora
Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project
An annotated English child language database
A corpus of European Portuguese child and child-directed speech
Lower and higher estimates of the number of “true analogies” between sentences contained in a large multilingual corpus
The ACQDIV Database: Min(d)ing the Ambient Language
Representing and Rendering Linguistic Paradigms
Multimedia Language Resources
A large scale annotated child language construction database
The AnnCor CHILDES Treebank

http://www.cs.cmu.edu/~mccallum
Text Classification by Bootstrapping with Keywords, EM and Shrinkage

http://www.cs.cmu.edu/~antoine
Building Practical Spoken Dialog Systems

http://childes.psy.cmu.edu/derived/sesotho.zip
A Phonemic Corpus of Polish Child-Directed Speech

http://www.cs.cmu.edu/~ark/LexSem/
SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)

http://www.cs.cmu.edu/enron/
Multi-Class Confidence Weighted Algorithms
Domain Adaptation to Summarize Human Conversations
Learning User Embeddings from Emails
Summarizing Spoken and Written Conversations
Semi-supervised Speech Act Recognition in Emails and Forums

http://cs.cmu.edu/~dmortens/uriel.html
Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning

http://www.cs.cmu.edu/
CrystalNest at SemEval-2017 Task 4: Using Sarcasm Detection for Enhancing Sentiment Classification and Quantification
Authorship Attribution of E-Mail: Comparing Classifiers over a New Corpus for Evaluation
The Role of Roles in Classifying Annotated Biomedical Text
Inconsistency Detection in Semantic Annotation
Supersense Embeddings: A Unified Model for Supersense Interpretation, Prediction, and Utilization
Extreme Adaptation for Personalized Neural Machine Translation
Metaphor Detection with Cross-Lingual Model Transfer
“A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce”: Learning State Changing Verbs from Wikipedia Revision History
Identifying Semantic Edit Intentions from Revisions in Wikipedia
Pushing the Limits of Translation Quality Estimation
Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment
Discrepancy Between Automatic and Manual Evaluation of Summaries
A Corpus of Preposition Supersenses
DART: a Dataset of Arguments and their Relations on Twitter
Large-scale Cloze Test Dataset Created by Teachers
Embracing Non-Traditional Linguistic Resources for Low-resource Language Name Tagging
A Continuously Growing Dataset of Sentential Paraphrases
The WebNLG Challenge: Generating Text from RDF Data
Quality Estimation of English-French Machine Translation: A Detailed Study of the Role of Syntax
Local Histograms of Character N-grams for Authorship Attribution
Prepositional Phrase Attachment over Word Embedding Products
Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora
Seernet at EmoInt-2017: Tweet Emotion Intensity Estimator
Can Chinese Phonemes Improve Machine Transliteration?: A Comparative Study of English-to-Chinese Transliteration Models
Edit Categories and Editor Role Identification in Wikipedia
Universal Dependencies for Portuguese
Measuring corpus homogeneity using a range of measures for inter-document distance
To Memorize or to Predict: Prominence labeling in Conversational Speech
Impact of MWE Resources on Multiword Recognition
New Experiments in Distributional Representations of Synonymy
A Compositional and Interpretable Semantic Space
Learning to Follow Navigational Directions
IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text
Illegal is not a Noun: Linguistic Form for Detection of Pejorative Nominalizations
Encoding Conversation Context for Neural Keyphrase Extraction from Microblog Posts
Annotating similes in literary texts
Named Entity Recognition and Hashtag Decomposition to Improve the Classification of Tweets
Elucidating Conceptual Properties from Word Embeddings
Learning to Identify Definitions using Syntactic Features
Event Embeddings for Semantic Script Modeling
Learning to Search for Recognizing Named Entities in Twitter
BLANC: Learning Evaluation Metrics for MT
A Combination of Topic Models with Max-margin Learning for Relation Detection
Joint Information Extraction and Reasoning: A Scalable Statistical Relational Learning Approach
Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Relational Knowledge for Commonsense Machine Comprehension
THU_NGN at SemEval-2018 Task 3: Tweet Irony Detection with Densely connected LSTM and Multi-task Learning
Crowdsourcing High-Quality Parallel Data Extraction from Twitter
GradAscent at EmoInt-2017: Character and Word Level Recurrent Neural Network Models for Tweet Emotion Intensity Detection
Language Identification: The Long and the Short of the Matter
JU_NLP at SemEval-2016 Task 6: Detecting Stance in Tweets using Support Vector Machines
Parallel Implementations of Word Alignment Tool
Resolving Task Specification and Path Inconsistency in Taxonomy Construction
Real Time Adaptive Machine Translation for Post-Editing with cdec and TransCenter
Entity Annotation based on Inverse Index Operations
Context Sensitive Lemmatization Using Two Successive Bidirectional Gated Recurrent Networks
HLP@UPenn at SemEval-2017 Task 4A: A simple, self-optimizing text classification system combining dense and sparse vectors
Unsupervised Learning of Prototypical Fillers for Implicit Semantic Role Labeling
Interpretable Semantic Vectors from a Joint Model of Brain- and Text- Based Meaning
Microblog Conversation Recommendation via Joint Modeling of Topics and Discourse
Of Words, Eyes and Brains: Correlating Image-Based Distributional Semantic Models with Neural Representations of Concepts
Representation Based Translation Evaluation Metrics
Recursive Top-down Fuzzy Match : New Perspectives on Memory-based Parsing
Microblogs as Parallel Corpora
Detecting Nastiness in Social Media
“i have a feeling trump will win..................”: Forecasting Winners and Losers from User Predictions on Twitter
Feature-Rich Twitter Named Entity Recognition and Classification
Frame Semantics across Languages: Towards a Multilingual FrameNet
Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
Symmetric Pattern Based Word Embeddings for Improved Word Similarity Prediction
SystemT: An Algebraic Approach to Declarative Information Extraction
Tweety at SemEval-2018 Task 2: Predicting Emojis using Hierarchical Attention Neural Networks and Support Vector Machine
Integrating Optical Character Recognition and Machine Translation of Historical Documents
Acquisition of Syntactic Simplification Rules for French
Towards a General Rule for Identifying Deceptive Opinion Spam
Language Identification and Analysis of Code-Switched Social Media Text
Assessing linguistically aware fuzzy matching in translation memories
Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules
THU_NGN at SemEval-2018 Task 2: Residual CNN-LSTM Network with Attention for English Emoji Prediction
MPST: A Corpus of Movie Plot Synopses with Tags
Neural Activation Semantic Models: Computational lexical semantic models of localized neural activations
Classification from Full Text: A Comparison of Canonical Sections of Scientific Papers
SimiHawk at SemEval-2016 Task 1: A Deep Ensemble System for Semantic Textual Similarity
Twitter Named Entity Extraction and Linking Using Differential Evolution
UW-CSE at SemEval-2016 Task 10: Detecting Multiword Expressions and Supersenses using Double-Chained Conditional Random Fields
Praat on the Web: An Upgrade of Praat for Semi-Automatic Speech Annotation
Predicting Native Language from Gaze
A Cascade Method for Detecting Hedges and their Scope in Natural Language Text
Scalable Construction and Reasoning of Massive Knowledge Bases
Cutting the Long Tail: Hybrid Language Models for Translation Style Adaptation
A Sense-Based Translation Model for Statistical Machine Translation
Exploring Semantic Representation in Brain Activity Using Word Embeddings
Constructing Task-Specific Taxonomies for Document Collection Browsing
The U.S. Policy Agenda Legislation Corpus Volume 1 - a Language Resource from 1947 - 1998
Learning Paraphrasing for Multiword Expressions
Generalizing Dependency Features for Opinion Mining
Sprinkling Topics for Weakly Supervised Text Classification
A Hybrid Text Classification Approach for Analysis of Student Essays
Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation
MT Tuning on RED: A Dependency-Based Evaluation Metric
Is this a wampimuk? Cross-modal mapping between distributional semantics and the visual world
Novelty Goes Deep. A Deep Neural Solution To Document Level Novelty Detection
A Quantitative Analysis of Lexical Differences Between Genders in Telephone Conversations
The First Surface Realisation Shared Task: Overview and Evaluation Results
Learning a POS tagger for AAVE-like language
Demographic Dialectal Variation in Social Media: A Case Study of African-American English
Document-Level Automatic MT Evaluation based on Discourse Representations
MTNT: A Testbed for Machine Translation of Noisy Text
LIUM’s SMT Machine Translation Systems for WMT 2011
Non-distributional Word Vector Representations
A Comparative Study of Syntactic Parsers for Event Extraction
Understanding Mental States in Natural Language
Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations
CASICT-DCU Participation in WMT2015 Metrics Task
THU_NGN at SemEval-2018 Task 1: Fine-grained Tweet Sentiment Intensity Analysis with Attention CNN-LSTM
Paraphrase Identification and Semantic Similarity in Twitter with Simple Features
Towards Automatically Classifying Depressive Symptoms from Twitter Data for Population Health
Which Tumblr Post Should I Read Next?
IITP at SemEval-2017 Task 8 : A Supervised Approach for Rumour Evaluation
Mining Parallel Corpora from Sina Weibo and Twitter
The DCU Dependency-Based Metric in WMT-MetricsMATR 2010
Learning when to trust distant supervision: An application to low-resource POS tagging using cross-lingual projection
Machine Learning Disambiguation of Quechua Verb Morphology
RED: A Reference Dependency Based MT Evaluation Metric
FBK-HLT: An Effective System for Paraphrase Identification and Semantic Similarity in Twitter
Unbabel’s Participation in the WMT17 Translation Quality Estimation Shared Task
Detecting Context Dependent Messages in a Conversational Environment
Simple or Complex? Classifying Questions by Answering Complexity
The Karlsruhe Institute of Technology Translation Systems for the WMT 2011
ASU: An Experimental Study on Applying Deep Learning in Twitter Named Entity Recognition.
Identifying Real or Fake Articles: Towards better Language Modeling
IITP at EmoInt-2017: Measuring Intensity of Emotions using Sentence Embeddings and Optimized Features
Identifying Effective Translations for Cross-lingual Arabic-to-English User-generated Speech Search
Identifying Experimental Techniques in Biomedical Literature
IMS at EmoInt-2017: Emotion Intensity Prediction with Affective Norms, Automatically Extended Resources and Deep Learning
Automatic Extraction of News Values from Headline Text
Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction
The Karlsruhe Institute for Technology Translation System for the ACL-WMT 2010
LIUM’s SMT Machine Translation Systems for WMT 2012
Labeling Unlabeled Data using Cross-Language Guided Clustering
Unbabel’s Participation in the WMT16 Word-Level Translation Quality Estimation Shared Task
RACE: Large-scale ReAding Comprehension Dataset From Examinations
Testing Semantic Similarity Measures for Extracting Synonyms from a Corpus
The Web Library of Babel: evaluating genre collections
TwiSe at SemEval-2017 Task 4: Five-point Twitter Sentiment Classification and Quantification
Story Assembly in a Dyslexia Fluency Tutor
Recognizing Counterfactual Thinking in Social Media Texts
Activity detection for information access to oral communication
Neural Models for Key Phrase Extraction and Question Generation
Is “Universal Syntax” Universally Useful for Learning Distributed Word Representations?
A Joint Sequential and Relational Model for Frame-Semantic Parsing
Grammatical Relations in Chinese: GB-Ground Extraction and Data-Driven Parsing
Typed Tensor Decomposition of Knowledge Bases for Relation Extraction
Parsing for Grammatical Relations via Graph Merging
Language Model-Based Document Clustering Using Random Walks
A Joint Model of Conversational Discourse Latent Topics on Microblogs
UWB at SemEval-2018 Task 1: Emotion Intensity Detection in Tweets
TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
Textual Entailment based Question Generation
Character Sequence Models for Colorful Words
Making Dependency Labeling Simple, Fast and Accurate
Learning from Post-Editing: Online Model Adaptation for Statistical Machine Translation
Factoring Adjunction in Hierarchical Phrase-Based SMT
The Grammar of English Deverbal Compounds and their Meaning
Gating Mechanisms for Combining Character and Word-level Word Representations: an Empirical Study
On the Feasibility of Automated Detection of Allusive Text Reuse
The binary trio at SemEval-2019 Task 5: Multitarget Hate Speech Detection in Tweets
Using Human Attention to Extract Keyphrase from Microblog Post
Handling Divergent Reference Texts when Evaluating Table-to-Text Generation
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
Blackbox Meets Blackbox: Representational Similarity & Stability Analysis of Neural Language Models and Brains
Proceedings of the 27th International Conference on Computational Linguistics: Tutorial Abstracts
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
MIDAS at SemEval-2019 Task 6: Identifying Offensive Posts and Targeted Offense from Twitter
Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections
JHU 2019 Robustness Task System Description

http://rtw.ml.cmu.edu/rtw/kbbrowser/
Major Life Event Extraction from Twitter based on Congratulations/Condolences Speech Acts
Weakly Supervised User Profile Extraction from Twitter
Leveraging Knowledge Bases in LSTMs for Improving Machine Reading
Learning Verbs on the Fly

http://www.cs.cmu.edu/sagae/parser/
Task-oriented Evaluation of Syntactic Parsers and Their Representations

http://www.cs.cmu.edu/yww/data/earningscalls.zip
A Semiparametric Gaussian Copula Regression Model for Predicting Financial Risks from Earnings Calls

http://cairo.lti.cs.cmu.edu/kbp/2017/event/documents
Graph Based Decoding for Event Sequencing and Coreference Resolution

http://www.cs.cmu.edu/~alavie/papers/Mapudungun-LREC-04.pdf
AUTOLEX: An Automatic Lexicon Builder for Minority Languages Using an Open Corpus

http://www-2.cs.cmu.edu/~mccallum/bow/rainbow/
Categorizing Web Pages as a Preprocessing Step for Information Extraction

http://www.cs.cmu.edu/Groups/AI/util/
Connotation Frames of Power and Agency in Modern Films

http://www.ark.cs.cmu.edu
The CMU-ARK German-English Translation System

http://www.speech.cs.cmu.edu/tools/
Role-specific Language Models for Processing Recorded Neuropsychological Exams
NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches
Native Language Identification using Phonetic Algorithms
English-Korean Named Entity Transliteration Using Statistical Substring-based and Rule-based Approaches

http://projectile.sv.cmu.edu/research/
Designing Agreement Features for Realization Ranking
Perceptron Reranking for CCG Realization
Facilitating Translation Using Source Language Paraphrase Lattices

http://projectile.is.cs.cmu.edu/research/public/tools/bootStrap/tut
Interpreting BLEU/NIST Scores: How Much Improvement do We Need to Have a Better System?

http://www.ark.cs.cmu.edu/cdyer/en-600/
Identifying the L1 of non-native writers: the CMU-Haifa system

http://childes.psy.cmu.edu/
CEPLEXicon ― A Lexicon of Child European Portuguese

http://www.speech.cs.cmu.edu
Leveraging Inflection Tables for Stemming and Lemmatization.
NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches
Morphological Analysis without Expert Annotation
English-Korean Named Entity Transliteration Using Statistical Substring-based and Rule-based Approaches

http://www.cs.cmu.edu/afs/cs/project/theo-20/www/data/
Learning a Deep Hybrid Model for Semi-Supervised Text Classification

http://nlp.qatar.cmu.edu/qalb/
The Second QALB Shared Task on Automatic Text Correction for Arabic
Large Scale Arabic Error Annotation: Guidelines and Framework
Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation
Correction Annotation for Non-Native Arabic Texts: Guidelines and Corpus
The First QALB Shared Task on Automatic Text Correction for Arabic

http://www.naclo.cs.cmu.edu/problems2012/
Introducing Computational Concepts in a Linguistics Olympiad

http://www2.cs.cmu.edu/
Building a Dataset for Summarization and Keyword Extraction from Emails

http://cairo.lti.cs.cmu.edu/kbp/2015/event/Event_Me
Event Nugget and Event Coreference Annotation

http://www.ark.cs.cmu.edu/TweetNLP/
The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
UT-DB: An Experimental Study on Sentiment Analysis in Twitter
Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data
Transferring from Formal Newswire Domain with Hypernet for Twitter POS Tagging
TUGAS: Exploiting unlabelled data for Twitter sentiment analysis
What I’ve learned about annotating informal text (and why you shouldn’t take my word for it)
Simultaneous Feature Selection and Parameter Optimization Using Multi-objective Optimization for Sentiment Analysis
Experiments with crowdsourced re-annotation of a POS tagging data set
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
SU-FMI: System Description for SemEval-2014 Task 9 on Sentiment Analysis in Twitter
Tune Your Brown Clustering, Please
A Unified Model for Topics, Events and Users on Twitter
Sarcastic or Not: Word Embeddings to Predict the Literal or Sarcastic Meaning of Words
ECNU: Expression- and Message-level Sentiment Orientation Classification in Twitter Using Multiple Effective Features
Sentiment Lexicon Interpolation and Polarity Estimation of Objective and Out-Of-Vocabulary Words to Improve Sentiment Classification on Microblogging
QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification
Shallow Parsing Pipeline - Hindi-English Code-Mixed Social Media Text
IITP: Hybrid Approach for Text Normalization in Twitter
CodeX: Combining an SVM Classifier and Character N-gram Language Models for Sentiment Analysis on Twitter Text
IITP: Multiobjective Differential Evolution based Twitter Named Entity Recognition
Semi-Supervised Learning of Sequence Models with Method of Moments
UNIBA: Sentiment Analysis of English Tweets Combining Micro-blogging, Lexicon and Semantic Features
Indian Institute of Technology-Patna: Sentiment Analysis in Twitter
TeamX: A Sentiment Analyzer with Enhanced Lexicon Mapping and Weighting Scheme for Unbalanced Data
Negation Scope Detection for Twitter Sentiment Analysis
Learning part-of-speech taggers with inter-annotator agreement loss
The Unreasonable Effectiveness of Word Representations for Twitter Named Entity Recognition
KLUE: Simple and robust methods for polarity classification
What does this Emoji Mean? A Vector Space Skip-Gram Model for Twitter Emojis
Adapting taggers to Twitter with not-so-distant supervision

http://www.cs.cmu.edu/ehn/JAVELIN/
Question Answering in Restricted Domains: An Overview

http://www.phil.cmu.edu/projects/
Mining Arguments From 19th Century Philosophical Texts Using Topic Based Modelling

http://www.cs.cmu.edu/lenzo/t2p/
Transliteration Alignment

http://www.speech.cs.cmu.edu/cgi-bin/cmudict/
Evaluation of Pronunciation Variants in the ASR Lexicon for Different Speaking Styles
Korean Children’s Spoken English Corpus and an Analysis of its Pronunciation Variability

http://ankara.lti.cs.cmu.edu/side/video.swf
SIDE: The Summarization Integrated Development Environment

http://www.cs.cmu.edu/afs/cs
Using a Wikipedia-based Semantic Relatedness Measure for Document Clustering
Morphological Segmentation for Keyword Spotting

http://www.cs.cmu.edu/pbennett/action-item-dataset.html
Combining Probability-Based Rankers for Action-Item Detection

http://www.speech.cs.cmu.edu/tools/lextool.html
PronouncUR: An Urdu Pronunciation Lexicon Generator
Phonological Pun-derstanding
A case study on using speech-to-translation alignments for language documentation

http://childes.psy.cmu.edu/clan/
Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project
Lekbot: A talking and playing robot for children with disabilities
Querying Both Time-aligned and Hierarchical Corpora with NXT Search

http://demo.ark.cs.cmu.edu/parse
Towards Broad-coverage Meaning Representation: The Case of Comparison Structures
Learning to Jointly Predict Ellipsis and Comparison Structures

https://www.cs.cmu.edu/~biglou/resources/bad-
Telling Apart Tweets Associated with Controversial versus Non-Controversial Topics
Hope at SemEval-2019 Task 6: Mining social media language to discover offensive language

http://www.ark.cs.cmu.edu/ArabicSST/corpus/
A Corpus and Model Integrating Multiword Expressions and Supersenses

http://www.ark.cs.cmu.edu/bio/
Unsupervised Discovery of Biographical Structure from Text

http://rtw.ml.cmu.edu
Random Walk Inference and Learning in A Large Scale Knowledge Base
Which Noun Phrases Denote Which Concepts?
Discovering Relations between Noun Categories

http://cairo.lti.cs.cmu.edu/kbp/2015/event/scoring
Joint Inference for Event Coreference Resolution

http://www.ark.cs.cmu.edu/personas
Learning Latent Personas of Film Characters

http://www.ark.cs.cmu.edu/FUDG/
Simplified Dependency Annotations with GFL-Web

http://www.is.cs.cmu.edu/http://werner.ira.uka.de
Activity detection for information access to oral communication

http://multicomp.cs.cmu.edu/
Sentiment Analysis using Imperfect Views from Spoken Language and Acoustic Modalities

http://www.is.cs.cmu.edu/iwslt2005/
A Web-based Demonstrator of a Multi-lingual Phrase-based Translation System
SYNGRAPH: A Flexible Matching Method based on Synonymous Expression Extraction from an Ordinary Dictionary and a Web Corpus

http://www.cs.cmu.edu/afs/cs/project/theo-3/www/
Semi-Supervised SimHash for Efficient Document Similarity Search

http://curtis.ml.cmu.edu/gnat/software/
Learning to Define Terms in the Software Domain

http://www.ark.cs.cmu.edu/bio
Unsupervised Discovery of Biographical Structure from Text

http://www.cs.cmu.edu/%7Ealavie/METEOR/index.html
FBK-HLT: An Application of Semantic Textual Similarity for Answer Selection in Community Question Answering

http://www.speech.cs.cmu.edu/SLM/toolkit_docu
A Linguistic Knowledge Discovery Tool: Very Large Ngram Database Search with Arbitrary Wildcards

http://www.speech.cs.cmu.edu/sphinx/tutorial.html
N-Best Rescoring Based on Pitch-accent Patterns

http://www.ark.cs.cmu.edu/global-voices/
Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features
What Can We Get From 1000 Tokens? A Case Study of Multilingual POS Tagging For Resource-Poor Languages

http://www.speech.cs.cmu.edu/sphinx/doc/phoneset_s2.html
以語音辨識與評分輔助口說英文學習 (Spoken English Learning Based on Speech Recognition and Assessment) [In Chinese]

http://childes.psy.cmu.edu/data/Slavic/Polish/Weist.zip
A Phonemic Corpus of Polish Child-Directed Speech

http://www-2.cs.cmu.edu/~lemur/
Language Model Adaptation for Statistical Machine Translation Based on Information Retrieval

http://childes.psy.cmu.edu/manuals/CHAT
The ACQDIV Database: Min(d)ing the Ambient Language

http://childes.psy.cmu.edu/derived/brent
A Phonemic Corpus of Polish Child-Directed Speech

http://www.cs.cmu.edu/ark/personas/
Prediction of a Movie’s Success From Plot Summaries Using Deep Learning Models

https://www.cmu.edu/teaching/technology/whitepap
Multilingual Short Text Responses Clustering for Mobile Educational Activities: a Preliminary Exploration

http://www.ark.cs.cmu.edu/LexSem/
Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut
A Corpus and Model Integrating Multiword Expressions and Supersenses
SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)

http://www.qatar.cmu.edu/
Dudley North visits North London: Learning When to Transliterate to Arabic

http://childes.psy.cmu.edu/manuals/chat.pdf
The Dutch LESLLA Corpus

http://www.cs.cmu.edu/scohen/parser.html
Neutralizing Linguistically Problematic Annotations in Unsupervised Dependency Parsing Evaluation

http://lib.stat.cmu.edu/
Word-Sense Disambiguation for Machine Translation

http://www.cs.cmu.edu/afs/cs/project/theo-20/www/data
Book Reviews: Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms by Thorsten Joachims; Anaphora Resolution by Ruslan Mitkov

http://www.cs.cmu.edu/cprose/SIDE.html
An Interactive Tool for Supporting Error Analysis for Text Mining

http://www.cs.cmu
SMT and SPE Machine Translation Systems for WMT‘09
Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter
Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition
Toward General-Purpose Learning for Information Extraction
Interpretable Semantic Vectors from a Joint Model of Brain- and Text- Based Meaning
LIUM SMT Machine Translation System for WMT 2010
Spectral Clustering for Example Based Machine Translation
Improving “Email Speech Acts” Analysis via N-gram Selection

http://www.cs.cmu.edu/~shomir/wb_cd_study/
Determiner-Established Deixis to Communicative Artifacts in Pedagogical Text

http://trac.speech.cs.cmu.edu/repos/
Conversational Strategies for Robustly Managing Dialog in Public Spaces

http://cs.cmu.edu/~sef/scone
Two Approaches to Metaphor Detection

http://www.informedia.cs.cmu.edu/
Development of Resources for a Bilingual Automatic Index System of Broadcast News in Basque and Spanish
Recognition of Polish Temporal Expressions
PodCastle: A Spoken Document Retrieval Service Improved by Anonymous User Contributions
Multilingual and cross-lingual news topic tracking
The Automatic Generation of Formal Annotations in a Multimedia Indexing and Searching Environment
Improved Recognition and Normalisation of Polish Temporal Expressions

http://www.ark.cs.cmu.edu/TurboParser
Parsing as Reduction
Aligning Opinions: Cross-Lingual Opinion Mining with Dependencies
Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers
Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning

http://www.cs.cmu.edu/%7Edwijaya/mapping.html
Mapping Verbs in Different Languages to Knowledge Base Relations using Web Text as Interlingua

http://www.cs.cmu.edu/~ytsvetko/
A Unified Annotation Scheme for the Semantic/Pragmatic Components of Definiteness
Socially Responsible NLP
Augmenting English Adjective Senses with Supersenses

http://www.is.cs.cmu.edu/nespole/db/
The Italian NESPOLE! Corpus: a Multilingual Database with Interlingua Annotation in Tourism and Medical Domains

http://reap.cs.cmu.edu/demo/readability2012/
Lexical Level Distribution of Metadiscourse in Spoken Language

https://www.cs.cmu.edu/alavie/METEOR/
Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation

http://www.ark.cs.cmu.edu/GeoText
#Emotional Tweets

http://www.ark.cs.cmu.edu/TweetNLP/cluster
Crowdsourcing and annotating NER for Twitter #drift

http://www.is.cs.cmu.edu
Signatures, Typed Feature Structures and RDFS

http://projectile.is.cs.cmu.edu/research/public/tools/bootStrap
Boosting Statistical Machine Translation by Lemmatization and Linear Interpolation

http://www.cs.cmu.edu/~wcohen/repository.tgz
Annotating Large Email Datasets for Named Entity Recognition with Mechanical Turk

http://www.cs.cmu.edu/ark/LexSem/
An Efficient Annotation for Phrasal Verbs using Dependency Information

http://demo.clab.cs.cmu.edu/cdyer/form-tg-m3.html
Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik

http://cs.cmu
Improving Vector Space Word Representations Using Multilingual Correlation

http://www.cs.cmu.edu/sagae/parser/gdep/
From Protein-Protein Interaction to Molecular Event Extraction
Overview of BioNLP’09 Shared Task on Event Extraction

http://ankara.lti.cs.cmu.edu/side/download.html
Improving Peer Feedback Prediction: The Sentence Level is Right

http://link.cs.cmu.edu/link/papers/
Automated Rating of ESL Essays

http://rtw.ml.cmu.edu/rtw/resources
Leveraging Knowledge Bases in LSTMs for Improving Machine Reading
Predicting Tasks in Goal-Oriented Spoken Dialog Systems using Semantic Knowledge Bases

http://arap.qatar.cmu.edu
Arap-Tweet: A Large Multi-Dialect Twitter Corpus for Gender, Age and Language Variety Identification

http://www.speech.cs.cmu.edu/apappu/kacq
Knowledge Acquisition Strategies for Goal-Oriented Dialog Systems

http://rtw.ml.cmu.edu/resources/ppa
A Knowledge-Intensive Model for Prepositional Phrase Attachment

http://www.cs.cmu.edu/~alavie/METEOR
Cross-Lingual Information Retrieval and Semantic Interoperability for Cultural Heritage Repositories

http://www.cs.cmu.edu/~alavie/METEOR/
Odds of Successful Transfer of Low-Level Concepts: a Key Metric for Bidirectional Speech-to-Speech Machine Translation in DARPA’s TRANSTAC Program
METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments
Edit Distance: A Metric for Machine Translation Evaluation
Meteor, M-BLEU and M-TER: Evaluation Metrics for High-Correlation with Human Rankings of Machine Translation Output
Meteor Universal: Language Specific Translation Evaluation for Any Target Language

http://cbdr.cmu.edu/
AppDialogue: Multi-App Dialogues for Intelligent Assistants

http://kettle.ubiq.cs.cmu.edu/
Predicting the Evocation Relation between Lexicalized Concepts

http://www.cs.cmu.edu/~sef/scone
Knowledge-Based Labeling of Semantic Relationships in English

http://boston.lti.cs.cmu.edu/
langid.py for better language modelling
Tempo-Lexical Context Driven Word Embedding for Cross-Session Search Task Extraction
Cross-domain Feature Selection for Language Identification

http://moscow.mt.cs.cmu.edu:
A Text Categorization Based on a Summarization Extraction

http://www.cs.cmu.edu/~ark/SEMAFOR/
A Proposal for combining “general” and specialized frames

http://childes.psy.cmu
On Grammaticality in the Syntactic Annotation of Learner Language
High-accuracy Annotation and Parsing of CHILDES Transcripts
The AnnCor CHILDES Treebank

http://www-2.cs.cmu.edu/enron/
Statistical Modality Tagging from Rule-based Annotations and Crowdsourcing

http://www.ark.cs.cmu.edu/global-voices
Supervised Phrase Table Triangulation with Neural Word Embeddings for Low-Resource Languages

http://www.cs.cmu.edu/~lemur/
Improving Statistical Machine Translation Performance by Training Data Selection and Optimization

http://www.cs.cmu.edu/dhuggins/touchcorrect.ogg
Interactive ASR Error Correction for Touchscreen Devices

https://skylar.speech.cs.cmu.edu
DialPort: A General Framework for Aggregating Dialog Systems

http://nlp.qatar.cmu.edu/madar/
The MADAR Arabic Dialect Corpus and Lexicon

http://www.cs.cmu.edu/webkb
HTM: A Topic Model for Hypertexts

http://www.speech.cs.cmu.edu/comp.speech/
From Pipedreams to Products, and Promise!

http://www.is.cs.cmu.edu/trl_conventions
The Italian NESPOLE! Corpus: a Multilingual Database with Interlingua Annotation in Tourism and Medical Domains

http://www.ark.cs.cmu.edu/SEMAFOR/eval/
Context-aware Frame-Semantic Role Labeling

http://childes.psy.cmu.edu/data/Romance/Portugue
An Investigation on the Influence of Frequency on the Lexical Organization of Verbs

http://www.ark.cs.cmu.edu/AD3/
A Graph-based Lattice Dependency Parser for Joint Morphological Segmentation and Syntactic Analysis

http://www.cs.cmu.edu/ref/mlim/index.html
The New Edition of the Natural Language Software Registry (an Initiative of ACL hosted at DFKI)

http://www.cs.cmu.edu/afs/cs.cmu.edu/project/cmt-
Embracing Non-Traditional Linguistic Resources for Low-resource Language Name Tagging

http://www.cs.cmu.edu/~ark/dyogatam/wordvecs/
Why does PairDiff work? - A Mathematical Analysis of Bilinear Relational Compositional Operators for Analogy Detection

http://bobo.link.cs.cmu.edu/link/dict/summarize-links.html
Generating Typed Dependency Parses from Phrase Structure Parses

http://tera-3.ul.cs.cmu.edu/
Script Independent Word Spotting in Multilingual Documents

http://www.cs.cmu.edu/~ashishv/mer.html
A Walk on the Other Side: Using SMT Components in a Transfer-Based Translation System

http://www-2.cs.cmu.edu/~lemur
Discretization Based Learning for Information Retrieval

http://www.cs.cmu.edu/~sef/scone/
SconeEdit: A Text-guided Domain Knowledge Editor

http://www.cs.cmu.edu/mccallum/bow
Distributional Identification of Non-Referential Pronouns
The Influence of Data Homogeneity on NLP System Performance
Linguistic Miner: An Italian Linguistic Knowledge System

http://www.cs.cmu.edu/afs/cs/project/face/www/facs.htm
Utterance-Level Multimodal Sentiment Analysis

http://wiki.speech.cs.cmu.edu/olympus/index.php/Phoenix
Jointly Learning Grounded Task Structures from Language Instruction and Visual Demonstration

http://www.cs.cmu.edu/~yww/
Scalable Statistical Relational Learning for NLP

http://www.naclo.cs.cmu.edu/problems2011/
Introducing Computational Concepts in a Linguistics Olympiad

http://privacy.cs.cmu.edu/dataprivacy/
Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text

http://www.ark.cs.cmu.edu/ARKref/
Computational Analysis of Referring Expressions in Narratives of Picture Books
Learning to Order Natural Language Texts

http://www.cs.cmu.edu/alavie/METEOR
Stochastic Iterative Alignment for Machine Translation Evaluation
Boosting Statistical Machine Translation by Lemmatization and Linear Interpolation

http://speech.sv.cmu.edu/SimInteraction
A Simulation-based Framework for Spoken Language Understanding and Action Selection in Situated Interaction

http://multicomp.cs.cmu.edu
Unsupervised Text Recap Extraction for TV Series

http://www.ark.cs.cmu.edu/AD3
Frame-Semantic Parsing
Priberam Compressive Summarization Corpus: A New Multi-Document Summarization Corpus for European Portuguese

http://cairo.lti.cs.cmu.edu/kbp/2016/after/
Graph Based Decoding for Event Sequencing and Coreference Resolution

http://www.cs.cmu.edu/hideki/software/jawjaw/
Java Libraries for Accessing the Princeton Wordnet: Comparison and Evaluation

http://aclia.lti.cs.cmu.edu/ntcir8
Exploiting a Multilingual Web-based Encyclopedia for Bilingual Terminology Extraction

http://www.ark.cs.cmu.edu/GeoText/
Simple supervised document geolocation with geodesic grids

http://www.cs.cmu.edu/~gparent/amt/wsi/
Clustering dictionary definitions using Amazon Mechanical Turk

http://rtw.ml.cmu.edu/tacl2015_csf
Learning a Compositional Semantics for Freebase with an Open Predicate Vocabulary

http://www.cs.cmu.edu/~sjauhar/Software_files/LR-SDSM.tar
Inducing Latent Semantic Relations for Structured Distributional Semantics

http://www.cs.cmu.edu/jhclark/loonybin
LoonyBin: Keeping Language Technologists Sane through Automated Management of Experimental (Hyper)Workflows

http://www.cgi.sc.cmu.edu/People/kathrin/Research/SummaryOfProposal.pdf
Learning Translation Rules from Bilingual English - Filipino Corpus

http://www.cs.cmu.edu/~911/
Speech Translation for Triage of Emergency Phonecalls in Minority Languages

http://www.cs.cmu.edu/~textlearning
Learning a Stopping Criterion for Active Learning for Word Sense Disambiguation and Text Classification

http://www.naclo.cs.cmu.edu/assets/
The Swedish Model of Public Outreach of Linguistics to secondary school Students through Olympiads
Introducing Computational Concepts in a Linguistics Olympiad

http://www.casos.cs.cmu.edu/projects/automap/
MIKE: An Interactive Microblogging Keyword Extractor using Contextual Semantic Smoothing

http://oli.cmu.edu
LAPPS/Galaxy: Current State and Next Steps

http://www.cs.cmu.edu/yww/data/WeiboTreebank.zip
Dependency Parsing for Weibo: An Efficient Probabilistic Logic Programming Approach

http://www.link.cs.cmu.edu/link/
Predicting Grammaticality on an Ordinal Scale
Error Detection for Statistical Machine Translation Using Linguistic Features
Reranking Translation Hypotheses Using Structural Properties
An algorithm for open text semantic parsing
Lycos Retriever: An Information Fusion Engine
Integrating lexical, syntactic and system-based features to improve Word Confidence Estimation in SMT
Analysis of Link Grammar on Biomedical Dependency Corpus Targeted at Protein-Protein Interactions
Error Detection Using Linguistic Features

http://childes.psy.cmu.edu/manuals/
Identifying and Avoiding Confusion in Dialogue with People with Alzheimer’s Disease
Extensions to the GrETEL Treebank Query Application
The ACQDIV Database: Min(d)ing the Ambient Language
Towards a Model of Prediction-based Syntactic Category Acquisition: First Steps with Word Embeddings

http://www.ark.cs.cmu.edu/
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
Turbo Parsers: Dependency Parsing by Approximate Variational Inference
Learning finite state word representations for unsupervised Twitter adaptation of POS taggers
Minimal Dependency Length in Realization Ranking
LYSGROUP: Adapting a Spanish microtext normalization system to English.
Classifying Tweet Level Judgements of Rumours in Social Media
Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data
Frame-Semantic Role Labeling with Heterogeneous Annotations
Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model
Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions
Challenges of studying and processing dialects in social media

http://www-2.cs.cmu.edu/afs/cs.cmu.edu/
A Language Model Approach to Keyphrase Extraction

http://avenue.lti.cs.cmu.edu/aria/spanish/error-
The Translation Correction Tool: English-Spanish User Studies

http://www.cs.cmu.edu/ark/
Towards Normalising Konkani-English Code-Mixed Social Media Text

https://www.cs.cmu.edu/Groups/AI/
Conceptor Debiasing of Word Representations Evaluated on WEAT
The Role of Protected Class Word Lists in Bias Identification of Contextualized Word Representations

http://www.ark.cs.cmu.edu/TurboParser/
Branch and Bound Algorithm for Dependency Parsing with Non-local Features
(Re)ranking Meets Morphosyntax: State-of-the-art Results from the SPMRL 2013 Shared Task
A Graph-based Lattice Dependency Parser for Joint Morphological Segmentation and Syntactic Analysis
Introducing the IMS-Wrocław-Szeged-CIS entry at the SPMRL 2014 Shared Task: Reranking and Morpho-syntax meet Unlabeled Data

http://penance.is.cs.cmu.edu/iwslt2005
Exploiting Variant Corpora for Machine Translation

http://www.ark.cs.cmu.edu/TweetNLP/model
Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut

http://www.cs.cmu.edu/avrim/ML09/lect0126.pdf
Artificial IntelliDance: Teaching Machine Learning through a Choreography

http://www.cs.cmu.edu/~ref/mlim
Multilingual Terminology Extraction and Validation

http://projectile.sv.cmu.edu/research/public/
A Discriminative Latent Variable-Based “DE” Classifier for Chinese-English SMT

http://www.edvisees.cs.cmu.edu/
Identifying Metaphorical Word Use with Tree Kernels

http://www.speech.cs.cmu
To Sing like a Mockingbird
Lexical Discovery with an Enriched Semantic Network
Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems

http://sailing.cs.cmu.edu/
Social Links from Latent Topics in Microblogs

http://www.speech.cs.cmu.edu/SLM
Compressing Trigram Language Models With Golomb Coding
String Transduction with Target Language Models and Insertion Handling

http://www.cs.cmu.edu/yww/data/emnlp2013.zip
This Text Has the Scent of Starbucks: A Laplacian Structured Sparsity Model for Computational Branding Analytics

http://rtw.ml.cmu.edu/acl2014_asp/
Joint Syntactic and Semantic Parsing with Combinatory Categorial Grammar

http://www.speech.cs.cmu.edu/hltnaacl2003/
Proceedings of the HLT-NAACL 2003 Workshop on Research Directions in Dialogue Processing

http://uima.lti.cs.cmu.edu:8080/UCR/Welcome.do
A Development Environment for Configurable Meta-Annotators in a Pipelined NLP Architecture
Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench

http://www.cs.cmu.edu/~nschneid/
A Corpus and Model Integrating Multiword Expressions and Supersenses

http://www.qatar.cmu.edu/~emohamed/
Annotating and Learning Morphological Segmentation of Egyptian Colloquial Arabic

http://www.cnbc.cmu.edu/~gobbel/clarity/ma
Standardisation Efforts on the Level of Dialogue Act in the MATE Project

http://www.cs.cmu.edu/yww/data/meme
I Can Has Cheezburger? A Nonparanormal Approach to Combining Textual and Visual Information for Predicting and Generating Popular Meme Descriptions

http://reap.cs.cmu.edu
A VIEW of Russian: Visual Input Enhancement and Adaptive Feedback
Exploring Measures of “Readability” for Spoken Language: Analyzing linguistic features of subtitles to identify age-specific TV programs
Insights from Russian second language readability classification: complexity-dependent training requirements, and feature evaluation of multiple categories
On The Applicability of Readability Models to Web Texts
On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition
Enhancing Authentic Web Pages for Language Learners
Assessing the relative reading level of sentence pairs for text simplification
Readability Classification for German using Lexical, Syntactic, and Morphological Features

http://www.link.cs.cmu.edu/link/papers/index.html
Entailment due to Syntactically Encoded Semantic Relationships

https://www.cs.cmu.edu/
Two-Stage Stochastic Natural Language Generation for Email Synthesis by Modeling Sender Style and Topic Structure
Two-Stage Stochastic Email Synthesizer
Continuous fluency tracking and the challenges of varying text complexity
Discourse Coherence in the Wild: A Dataset, Evaluation and Methods
Interpretable Word Embedding Contextualization
Combining Shallow and Deep Learning for Aggressive Text Detection
Using Morphemes from Agglutinative Languages like Quechua and Finnish to Aid in Low-Resource Translation
SSMT: A Machine Translation Evaluation View To Paragraph-to-Sentence Semantic Similarity
A Joint Model of Conversational Discourse Latent Topics on Microblogs
Reusable workflows for gender prediction
Simple and Effective Paraphrastic Similarity from Parallel Translations
LTL-UDE at SemEval-2019 Task 6: BERT and Two-Vote Classification for Categorizing Offensiveness
Beyond BLEU: Training Neural Machine Translation with Semantic Similarity

http://lagos.lti.cs.cmu.edu:8002/
STCP: Simplified-Traditional Chinese Conversion and Proofreading

http://avenue.lti.cs.cmu.edu/aria/spanish/tutorial.html
The Translation Correction Tool: English-Spanish User Studies

http://tts.speech.cs.cmu.edu/webshodh/cmqa.php
Transliteration Better than Translation? Answering Code-mixed Questions over a Knowledge Base

http://www.is.cs.cmu.edu/mie
Archivus: A Multimodal System for Multimedia Meeting Browsing and Retrieval

http://projectile.is.cs.cmu.edu/research/public/tools/bootStrap/tutorial.htm
Phrase-Based Statistical Machine Translation: A Level of Detail Approach

http://www.cs.cmu.edu/~mccallum/bow/
Text Mining Techniques for Leveraging Positively Labeled Data

http://www.cs.cmu.edu/~ark/TweetNLP/#pos
Exploring Word Embeddings for Unsupervised Textual User-Generated Content Normalization

http://www.2.cs.cmu.edu/jrs/jrspapers.html/#cg
Modeling Latent-Dynamic in Shallow Parsing: A Latent Conditional Model with Improved Inference

http://www.cs.cmu.edu/~wcohen/
Scalable Statistical Relational Learning for NLP
Degrees of Orality in Speech-like Corpora: Comparative Annotation of Chat and E-mail Corpora

http://www.speech.cs.cmu.edu/sphinxman/FAQ.html
CIEMPIESS: A New Open-Sourced Mexican Spanish Radio Corpus

http://www.speech.cs.cmu.edu/air/papers/speechwear.ps
A procedure assistant for astronauts in a functional programming architecture, with step previewing and spoken correction of dialogue moves

http://kimi.ml.cmu.edu/transfer/data.tar.gz
Transfer Learning for Entity Recognition of Novel Classes

http://rtw.ml.cmu.edu/wk/WebSets/
Collectively Representing Semi-Structured Data from the Web

http://projectile.is.cs.cmu
Rich Source-Side Context for Statistical Machine Translation

http://childes.psy.cmu.edu/data-xml/Eng-USA/Providence.zip
Studying the Effect of Input Size for Bayesian Word Segmentation on the Providence Corpus

http://www.cs.cmu.edu/~zechner/publications.html
Minimizing Word Error Rate in Textual Summaries of Spoken Language

http://www.ark.cs.cmu.edu/mheilman/
A Corpus and Model Integrating Multiword Expressions and Supersenses
More or less supervised supersense tagging of Twitter
Towards Automatic Topical Question Generation

http://www.cs.cmu.edu/~mfaruqui/soft
Augmenting English Adjective Senses with Supersenses

http://www.cs.cmu.edu/robotnavcps/
The Structure and Generality of Spoken Route Instructions

http://avenue.lti.cs.cmu.edu/aria/spanish/
The Translation Correction Tool: English-Spanish User Studies

http://rtw.ml.cmu.edu/emnlp2014
Incorporating Vector Space Similarity in Random Walk Inference over Knowledge Bases

http://rtw.ml.cmu.edu/emnlp2015
Efficient and Expressive Knowledge Base Completion Using Subgraph Feature Extraction

http://www-2.cs.cmu.edu/webkb/
Mining Web Sites Using Unsupervised Adaptive Information Extraction

http://rtw.ml.cmu.edu/emnlp2013
Improving Learning and Inference in a Large Knowledge-Base using Latent Syntactic Cues

http://www.cs.cmu.edu/~mccallum
Text Classification by Bootstrapping with Keywords, EM and Shrinkage

http://www.speech.cs.cmu.edu/Communicator/papers/-
Using Domain Knowledge about Medications to Correct Recognition Errors in Medical Report Creation

http://lti.cs.cmu.edu/sites/
CMU: Arc-Factored, Discriminative Semantic Dependency Parsing

https://pslcdatashop.web.cmu.edu/DatasetInfo?datasetId=827
Towards Automatic Description of Knowledge Components

http://www.cs.cmu.edu/~mccallum/bow/rainbow/
學術會議資訊之擷取及其應用 (Information Extraction for Academic Conference and It’s Application) [In Chinese]

http://articulab.hcii.cs.cmu.edu/projects/rapt/
Automatic Recognition of Conversational Strategies in the Service of a Socially-Aware Dialog System

http://www.link.cs.cmu.edu/lexfn
Untangling Text Data Mining

http://speech.sv.cmu.edu/software.html
Joint Online Spoken Language Understanding and Language Modeling With Recurrent Neural Networks

http://uima.lti.cs.cmu.edu/
Towards Data and Goal Oriented Analysis: Tool Inter-operability and Combinatorial Comparison

http://www.ark.cs.cmu.edu/GeoTwitter
A Latent Variable Model for Geographic Lexical Variation
Estimating User Location in Social Media with Stacked Denoising Auto-encoders

http://www.speech.cs.cmu.edu/SLM_info.html
Using Log-linear Models for Tuning Machine Translation Output
Towards Domain Adaptation for Parsing Web Data

http://www.ark.cs.cmu.edu/mheilman/questions/SupersenseTagger-10-01-12.tar.gz
Improving Translation Selection with Supersenses

http://www-2.cs.cmu.edu/mccallum/bow/
SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining
Determining Term Subjectivity and Term Orientation for Opinion Mining

http://www.ark.cs.cmu.edu/ArabicSST
Supersense Tagging for Arabic: the MT-in-the-Middle Attack

http://childes.psy.cmu.edu/data/EastAsian/Japanese/Miyata/
Data-driven Measurement of Child Language Development with Simple Syntactic Templates

http://nlp.qatar.cmu.edu/qalb
A Web-based Annotation Framework For Large-Scale Text Correction

http://www.ark.cs.cmu.edu/MT
How to Produce Unseen Teddy Bears: Improved Morphological Processing of Compounds in SMT
Cache-based Document-level Statistical Machine Translation
Rich Source-Side Context for Statistical Machine Translation
Instance Weighting for Neural Machine Translation Domain Adaptation
Classifier-Based Tense Model for SMT

http://www.cs.cmu.edu/~mccallum/bow/
Parametric Models of Linguistic Count Data

http://www.cs.cmu.edu/~mccallum/bow/rainbow/
Web Mining for Unsupervised Classification

http://www.cs.cmu.edu/Groups/AI/
Structure-based Clustering of Novels

http://www.ark.cs.cmu.edu/mheilman/questions/
Linguistic Considerations in Automatic Question Generation
Towards Automatic Topical Question Generation
Towards Topic-to-Question Generation

http://www.cgi.cs.cmu.edu/~kathrin/amta02CarbonellEtAl.pdf
Learning Translation Rules from Bilingual English - Filipino Corpus

http://www-2.cs.cmu.edu/lemur
Improving Machine Translation Performance by Exploiting Non-Parallel Corpora

http://www.cs.cmu.edu/~einat/datasets.html
Annotating Large Email Datasets for Named Entity Recognition with Mechanical Turk

http://multicomp.cs.cmu.edu/acl2018multimodalchallenge/
Recognizing Emotions in Video Using Multimodal DNN Feature Fusion

http://www.speech.cs.cmu.edu/sigdial2003/
Proceedings of the Fourth SIGdial Workshop of Discourse and Dialogue