Selected Publications
2013
- Suttles, J., Ide, N. (forthcoming).
Distant Supervision for Emotion Classification with Discrete Binary Values.
Computational Linguistics and Intelligent Text, Lecture Notes in Computer Science, Springer.
- Ide, N., Suderman, K. (forthcoming).
The Linguistic Annotation Framework: A Standard for Annotation Interchange and Merging.
Language Resources and Evaluation.
- Ide, N. (2013).
An Open Linguistic Infrastructure for Annotated Corpora. In Gurevych, I., Kim, J. (eds.)
The People’s Web Meets NLP: Collaboratively Constructed Language Resources, Springer, 265-286.
- Passonneau, Rebecca J., Bhardwaj, Vikas, Salleb-Aouissi, Ansaf, Ide, Nancy (forthcoming).
Multiplicity and Word Sense: Evaluating and Learning from Multiply Labeled Word Sense Annotations.
Language Resources and Evaluation.
2012
- Ide, N., Suderman, K. (2012).
A Model for Linguistic Resource Description.
Proceedings of the Sixth Linguistic Annotation Workshop, held in
conjunction with ACL 2012, Jeju, Korea.
- Ide, N. (2012).
MultiMASC: An Open Linguistic Infrastructure
for Language Research.
Proceedings of the Fifth Workshop on Building and Using Comparable Corpora,
held in conjunction with LREC 2012, Istanbul.
- de Melo, G., Baker, C.F., Ide, N., Passonneau, R., Fellbaum, C. (2012).
Empirical Comparisons of MASC Word Sense Annotations.
Proceedings of the Eighth Language Resources and Evaluation
Conference, Istanbul.
- Passonneau, R., Baker, C., Fellbaum,
C., Ide, N. (2012).
The MASC Word Sense Sentence Corpus.
Proceedings of the Eighth Language Resources and Evaluation
Conference, Istanbul.
2011
- Ide, N., Prasad, R., Joshi, A. (2011).
Towards Interoperability for the Penn Discourse Treebank.
Proceedings of the Sixth Joint ISO - ACL SIGSEM Workshop on
Interoperable Semantic Annotation (ISA-6), Oxford, England, 49-55.
- Passonneau, R., Bhardwaj, V., Salleb-Aouissi, A., Ide, N. (2011).
Multiplicity and Word Sense: Evaluating and Learning from Multiply
Labeled Word Sense Annotations.
Language Resources and Evaluation.
- Ide, N., Suderman, K. (2011).
Bridging the Gaps: Interoperability for
Language Engineering Architectures Using GrAF.
Language Resources and Evaluation, Selected Papers from the Third
Linguistic Annotation Workshop, Stede, M., and Huang, C.-R. (eds.)
2010
- Ide, N., Baker, C., Fellbaum, C., Passonneau, R. (2010).
The Manually Annotated Sub-Corpus: A Community Resource For and By the People.
Proceedings of the 48th Annual Meeting of the Association for
Computational Linguistics, Uppsala, Sweden.
- Ide, N., Bunt, H. (2010).
Anatomy of Annotation Schemes: Mapping to GrAF.
Proceedings of the Fourth Linguistic Annotation Workshop (LAW IV),
held in conjunction with the 48th Annual Meeting of the Association for
Computational Linguistics, Uppsala, Sweden.
- Bhardwaj, V., Passonneau, R., Salleb-Aouissi, A., Ide, N. (2010).
Anveshan: A Framework for Analysis of Multiple Annotators' Labeling Behavior.
Proceedings of the Fourth Linguistic Annotation Workshop (LAW IV),
held in conjunction with the 48th Annual Meeting of the Association for
Computational Linguistics, Uppsala, Sweden.
- Ide, N., Suderman, K., Simms, B. (2010).
ANC2Go:
A Web Application for Customized Corpus Creation.
Proceedings of the Seventh Language Resources and Evaluation
Conference (LREC 2010), Valletta, Malta.
- Passonneau, R., Salleb-Aouissi, A., Bhardwaj, V., Ide, N. (2010).
Word Sense Annotation of Polysemous Words by Multiple Annotators.
Proceedings of the Seventh Language Resources and Evaluation
Conference (LREC 2010), Valletta, Malta.
- Ide, N., Pustejovsky, J. (2010).
What Does Interoperability Mean, Anyway? Toward an Operational Definition of
Interoperability. Proceedings of the Second International Conference
on Global Interoperability for Language Resources (ICGL 2010), Hong
Kong, China.
2009
- Ide, N., Suderman, K. (2009).
Bridging the Gaps: Interoperability for GrAF, GATE, and UIMA.
Proceedings of the Third Linguistic Annotation Workshop, held in
conjunction with ACL 2009, Singapore.
- Ide, N., Pustejovsky, J., Calzolari, N., Soria, C. (2009).
The SILT and FlaReNet International Collaboration for
Interoperability.
Proceedings of the Third Linguistic Annotation Workshop, held in
conjunction with ACL 2009, Singapore.
- Passonneau, R., Salleb-Aouissi, A., Ide, N. (2009).
Making Sense of Word Sense Variation.
Semantic Evaluations: Recent Achievements and Future Directions.
NAACL-HLT 2009 Workshop, Boulder, Colorado, USA.
2008
- Ide, N. (2008).
The American National Corpus: Then, Now, and Tomorrow.
In Michael Haugh, Kate Burridge, Jean Mulder and Pam Peters (eds.),
Selected Proceedings of the 2008 HCSNet Workshop on Designing the
Australian National Corpus: Mustering Languages, Cascadilla
Proceedings Project, Somerville, MA.
- Ide, N., Baker, C., Fellbaum, C., Fillmore, C.,
Passonneau, R. (2008).
MASC: The Manually Annotated Sub-Corpus of American English.
Proceedings of the Sixth Language Resources and Evaluation
Conference (LREC), Marrakech, Morocco.
- Caselli, T., Ide, N., Bartolini, R. (2008).
A Bilingual Corpus of Inter-linked Events.
Proceedings of the Sixth Language Resources and Evaluation
Conference (LREC), Marrakech, Morocco.
- Ide, N., Passonneau, R., Baker, C., Fellbaum, C. (2008).
Semantics Isn't Easy:
Thoughts on the Way Forward. Position paper presented at the
NSF Symposium on Semantic Knowledge Discovery, Organization and Use,
New York, New York, November 14-15, 2008.
2007
- Ide, N. and Suderman, K. (2007).
GrAF: A Graph-based Format for Linguistic Annotations.
Proceedings of the Linguistic Annotation Workshop, held in
conjunction with ACL 2007, Prague, June 28-29, 1-8.
- Ide, N. (2007).
Annotation Science: From Theory to Practice and Use. (Invited Talk) Data
Structures for Linguistics Resources and Applications. Proceedings of
the Biennial GLDV Conference, April 11-13, 2007, Tübingen, Germany.
- Ide, N. and Woolner, D. (2007).
Historical Ontologies.
In Ahmad, K., Brewster, C., and Stevenson, M. (eds.), Words and
Intelligence II: Essays in Honor of Yorick Wilks,
Springer, 137-152.
- Ide, N., Romary, L. (2007).
Towards International Standards for Language Resources.
In Dybkjaer, L., Hemsen, H., Minker, W. (Eds.),
Evaluation of Text and Speech Systems,
Springer, 263-84.
2006
- Ide, N., Wilks, Y. (2006). Making Sense About Sense.
In Agirre, E., Edmonds, P. (Eds.), Word Sense Disambiguation:
Algorithms and Applications, Springer, 47-74.
- Ide, N. (2006).
Making Senses: Bootstrapping Sense-tagged Lists of
Semantically-Related Words. In Gelbukh, Alexander (Ed.),
Computational Linguistics and
Intelligent Text, Lecture Notes in Computer Science, Springer.
- Ide, N., Suderman, K. (2006).
Integrating Linguistic Resources:
The American National Corpus Model.
Proceedings of the Fifth Language
Resources and Evaluation Conference (LREC),
Genoa, Italy.
- Ide, N., Romary, L. (2006).
Representing Linguistic Corpora and Their Annotations.
Proceedings of the Fifth Language
Resources and Evaluation Conference (LREC),
Genoa, Italy.
- Ide, N., Suderman, K. (2006).
Merging Layered Annotations.
Proceedings of Merging and Layering Linguistic Information,
Workshop held in conjunction with LREC 2006, Genoa, Italy.
2004
- Ide, N., Woolner, D. (2004).
Exploiting Semantic Web Technologies for Intelligent Access to
Historical Documents.
Proceedings of the Fourth Language
Resources and Evaluation Conference (LREC),
Lisbon, 2177-80.
- Ide, N., Suderman, K. (2004). The
American National Corpus First Release.
Proceedings of the Fourth Language
Resources and Evaluation Conference (LREC),
Lisbon, 1681-84.
- Ide, N., Romary, L. (2004). International
standard for a linguistic annotation framework. Journal of Natural
Language Engineering, 10:3-4, 211-225.
- Tufis, D., Ion, R., Ide, N. (2004). Fine-Grained Word Sense
Disambiguation Based on Parallel Corpora, Word Alignment, Word
Clustering, and Aligned WordNets. Proceedings of COLING'04. Geneva.
- Ide, N., Romary, L. (2004). A Registry of Standard Data Categories
for Linguistic Annotation. Proceedings of the Fourth Language
Resources and Evaluation Conference (LREC),
Lisbon, 135-39.
- Reppen, R., Ide, N. (2004).
The American National Corpus: An
overview of the First Release. Journal of English Linguistics.
- Ide, N. (2004). Preparation and Analysis of Linguistic Corpora.
In Schreibman, S., Siemens, R., Unsworth, J. (Eds.), A Companion to
Digital Humanities. Blackwell.
2003
- Ide, N., Romary, L. (2003).
Encoding Syntactic Annotation. In Abeillé, Anne (ed.)
Treebanks: Building and Using Parsed Corpora, Kluwer,
Dordrecht, 281-96.
- Ide, N., Romary, L. (2003).
Outline of the International Standard
Linguistic Annotation Framework.
Proceedings of ACL'03 Workshop on Linguistic Annotation: Getting
the Model Right, Sapporo, 1-5.
- Ide, N., Lenci, A., Calzolari, N. (2003).
RDF Instantiation of ISLE/MILE Lexical Entries.
Proceedings of ACL'03 Workshop on Linguistic Annotation: Getting
the Model Right, Sapporo, 30-37.
- Ide, N., Romary, L., de la Clergerie, E. (2003).
International Standard for a Linguistic Annotation Framework.
Proceedings of HLT-NAACL'03
Workshop on The Software Engineering and Architecture of Language
Technology, Edmonton.
2002
- Ide, N., Erjavec, T., Tufis, D. (2002).
Sense Discrimination with
Parallel Corpora.
Proceedings of ACL'02 Workshop on Word Sense Disambiguation: Recent
Successes and Future Directions, Philadelphia, 54-60.
- Ide, N., Reppen, R., Suderman, K. (2002).
The American National Corpus: More Than the Web Can Provide.
Proceedings of the Third Language Resources and Evaluation
Conference (LREC), Las Palmas, Canary Islands, Spain, 839-44.
- Calzolari, Nicoletta, Charles J. Fillmore, Ralph Grishman, Nancy
Ide, Alessandro Lenci, Catherine MacLeod, Antonio Zampolli (2002).
Towards Best Practice for Multiword Expressions in Computational
Lexicons.
Proceedings of the Third Language Resources and Evaluation Conference (LREC), Las Palmas, Canary Islands, Spain,
1934-40.
- Ide, N., Romary, L. (2002).
Standards for Language Resources.
Proceedings of the Third Language Resources and Evaluation
Conference (LREC), Las Palmas, Canary Islands, Spain, 839-44.
2001
- Ide, N., Erjavec, T., Tufis, D. (2001).
Automatic Sense Tagging Using Parallel Corpora.
Proceedings of the Sixth Natural Language Processing Pacific Rim
Symposium, Tokyo, 83-9.
- Ide, N., Romary, L. (2001).
Standards for Language Resources.
Proceedings of the IRCS Workshop on Linguistic Databases,
Philadelphia, 141-9.
- Ide, N., Romary, L. (2001). A Common
Framework for Syntactic Annotation. Proceedings of
ACL'2001, Toulouse, 298-305.
- Ide, N., Macleod, C. (2001). The American
National Corpus: A Standardized Resource of American English. Proceedings of
Corpus Linguistics 2001, Lancaster UK.
2000
- Ide, N., Cristea, D. (2000). A Hierarchical
Account of Referential Accessibility. Proceedings of
ACL'2000, Hong Kong, 416-24.
- Ide, N., Romary, L. (2000). XML Support
for Annotated Language Resources.
Proceedings of the Workshop
on Web-based Language Documentation and Description,
Philadelphia, 148-153.
- Ide, N., Kilgarriff, A., Romary, L. (2000).
A Formal Model of Dictionary Structure and Content.
Proceedings of Euralex 2000,
Stuttgart, 113-126.
- Cristea, D., Ide, N., Marcu, D., Tablan, V. (2000).
An Empirical
Study of the Relation Between Discourse Structure and Reference.
Proceedings of COLING'00, Saarbrucken, Germany, 208-214.
- Ide, N. (2000). Cross-lingual
sense determination: Can it work? Computers and the
Humanities, 34: 1-2, Special Issue on the Proceedings of the
SIGLEX/SENSEVAL Workshop, A. Kilgarriff and M. Palmer, eds., 223-34.
- Ide, N. (2000). The XML Framework and Its Implications for the
Development of Natural Language Processing Tools. Proceedings of the COLING Workshop on
Using Toolsets and Architectures to Build NLP Systems, Luxembourg, 5
August 2000.
- Ide, N., Brew, C. (2000).
Requirements, Tools, and Architectures
for Annotated Corpora.
Proceedings of
Data Architectures and Software Support for
Large Corpora. Paris: European Language Resources Association, 1-5.
- Ide, N. (2000).
The XML Framework and Its Implications for Corpus
Access and Use. Proceedings of
Data Architectures and Software Support for
Large Corpora. Paris: European Language Resources Association,
28-32.
- Ide, N., Bonhomme, P., Romary, L. (2000).
XCES: An XML-based Standard for Linguistic Corpora.
Proceedings of the Second Language Resources and Evaluation
Conference (LREC), Athens, Greece, 825-30.
- Erjavec, T., Evans, R., Ide, N., Kilgarriff, A. (2000).
The CONCEDE Model for Lexical Databases.
Proceedings of the Second Language Resources and Evaluation
Conference (LREC), Athens, Greece, 355-62.
- Macleod, C., Ide, N., Grishman, R. (2000).
The American National Corpus: Standardized Resources for American English.
Proceedings of the Second Language Resources and Evaluation
Conference (LREC), Athens, Greece, 831-36.
1999
- Cristea, D., Ide, N., Marcu, D., Tablan, V. (1999). Discourse structure and co-reference:
An empirical study. Proceedings of the ACL99 Workshop on the Relation between Discourse/Dialogue Structure and Reference, College Park, Maryland, 46-53.
- Ide, N. (1999). Parallel translations as sense discriminators. SIGLEX99: Standardizing Lexical Resources, ACL99 Workshop, College Park, Maryland, 52-61.
- Ide, N. (ed.) (1999). Methods and Techniques of Processing (html document). In Frederking, R., Hovy, E., Ide, N. (eds.) Multilingual Information Management: Current Levels and Future Abilities. National Science Foundation report. Available at http://www.cs.cmu.edu/~ref/mlim/.
- Erjavec, T. and Ide, N. (1999). Markup
enhancement: Converting CEE dictionaries into TEI, and beyond.
In Kiefer, F., Kiss, G. and Pajzs, J. (eds.)
Papers in Computational Lexicography COMPLEX'99,
Linguistics Institute, Hungarian Academy of Sciences, Budapest, 211-217.
- Welty, C. and Ide, N. (1999). Using the right tools:
Enhancing retrieval from marked-up documents. Computers and the
Humanities 33:1-2, Special Issue on the Tenth Anniversary of the Text Encoding Initiative, E. Mylonas and A. Renear, eds.
1998
- Ide, N. and Véronis, J. (1998). Word Sense
Disambiguation: The State of the Art. Computational
Linguistics, 24:1, 1-40.
- Cristea, D., Ide, N., Romary, L. (1998). Marking Up Multiple Views of a Text:
Discourse and Reference. Proceedings of the First International
Language Resources and Evaluation Conference, Granada, Spain,
483-88.
- Cristea, D., Ide, N., Romary, L. (1998). Veins Theory: A Model of Global Discourse
Cohesion and Coherence. Proceedings of ACL/COLING98,
Montreal, 281-85.
- Dimitrova, L., Erjavec, T., Ide, N., Kaalep, H., Petkevic, V., Tufis,
D. (1998). Multext-East: Parallel and Comparable Corpora for Six Central
and Eastern European Languages. Proceedings of ACL/COLING98,
Montreal, 315-19.
- Fillmore, C., Ide, N., Jurafsky, D., and Macleod, C. (1998). An American National Corpus: A Proposal.
Proceedings of the First International Language Resources and
Evaluation Conference, Granada, Spain, 965-70.
- Ide, N. (1998). Encoding Linguistic
Corpora. Proceedings of the Sixth Workshop on Very Large
Corpora, Montreal, 9-17.
- Ide, N. (1998). Corpus Encoding Standard: SGML Guidelines for
Encoding Linguistic Corpora. Proceedings of the First International
Language Resources and Evaluation Conference, Granada, Spain,
463-70.
- Tufis, D., Ide, N., Erjavec, T. (1998). Standardized
Specifications, Development and
Assessment of Large Morpho-syntactic Resources for Six Central and
Eastern European Languages. Proceedings of the First
International Language Resources and Evaluation
Conference, Granada, Spain, 233-40.
1997
- Ide, N., McGraw, T., Welty, C. (1997). Representing TEI Documents in
the CLASSIC Knowledge Representation System. Proceedings of the Tenth
Anniversary Conference of the Text Encoding Initiative, Providence,
RI.
- Ide, N. (1997).
Final Report on Integration, Testing, and Refinement.
Multext Deliverable D1.2F. 32 pp.
- Ide, N., Sperberg-McQueen, C.M. (1997). Toward a Unified Docuverse: Standardizing Document
Markup and Access Without Procrustean Bargains. Proceedings of
the American Society for Information Science Conference, Washington,
D.C., 347-60.
- Barnard, D. and Ide, N. (1997).
The Text Encoding Initiative:
Flexible and Extensible Document Encoding. Journal of the American
Society for Information Science.
1996
- Erjavec, T., Ide, N., Petkevic, V., Véronis, J. (1996). Multext-East: Multilingual Text, Tools and
Corpora for Central and Eastern European Languages.
Proceedings of the First TELRI European Seminar, 87-98.
- Ide, N. (1996). Encoding Standards for Linguistic Corpora.
Proceedings of the First TELRI European Seminar, 65-78.
- Erjavec, T. and Ide, N. (1996). Corpus Collection and Preparation.
COP Project 106 MULTEXT-EAST Deliverable D2.1 F Final Report.
- Priest-Dorman, G., Erjavec, T., Ide, N. and Petkevic, V. (1996).
Corpus Markup. COP Project 106 MULTEXT-EAST Deliverable D2.3F.
- Ide, N. (Ed.) (1996).
Multext-East Language-specific Resources.
COP Project 106 MULTEXT-EAST Deliverable D1.2.
- Véronis, J., Ide, N. (1996).
Considerations for the Reusability of Linguistic
Software. EAGLES-MULTEXT Report.
- Véronis, J., Ide, N. (1996).
Guidelines for Linguistic Software Development.
EAGLES-MULTEXT Report.
1995
- Durand, D., Ide, N., Le Maitre, J., Véronis, J. (1995).
Internal Standard Formats.
Multext Deliverable 1.3.1B. 27 pp.
- Ide, N., Le Maitre, J., Véronis, J. (1995). Outline of a Model for
Lexical Databases. Current Issues in Computational Linguistics: In
Honour of Don Walker. Linguistica Computazionale IX, X (Pisa,
1995), 283-320. [reprinted from Information Processing and Management,
29, 2, 159-186]
- Ide, N., Véronis, J. (1995). Large
Neural Networks for the Resolution of Lexical Ambiguity. In
Saint-Dizier, P. and Viegas, E. (Eds.) Computational Lexical
Semantics. Natural Language Processing Series, Cambridge University
Press, 251-270.
- Ide, N., Sperberg-McQueen, C.M. (1995). The Text Encoding
Initiative: History and Context.
In Ide, N., Véronis, J. (Eds.) The Text Encoding Initiative:
Background and Context. Dordrecht: Kluwer Academic Publishers,
5-15.
- Ide, N., Véronis, J. (1995). Encoding
dictionaries. In Ide, N.,
Véronis, J. (Eds.) The Text Encoding Initiative: Background and
Context. Dordrecht: Kluwer Academic Publishers, 167-80.
1994
- Ide, N., Véronis, J. (1994). Knowledge
Extraction from Machine-Readable Dictionaries: An Evaluation. In
Steffens, P. (Ed.) Machine Translation and the Lexicon.
Springer-Verlag, 19-34.
- Ide, N., Véronis, J. (1994). A feature-based
data model for lexical databases. In Hockey, S., Ide, N. Research
in Humanities Computing IV, Oxford University Press, 193-206.
- Le Maitre, J., Ide, N., Véronis, J. (1994). Modélisation
et interrogation de bases de données
lexicales. Ingénierie des systèmes
d'informations, 2:1, 57-82.
- Ide, N. (1994). Encoding standards for large text
resources: The Text Encoding Initiative. Proceedings of the 15th
International Conference on Computational Linguistics, COLING'94,
Kyoto, Japan, August 1994, 574-78.
- Ide, N., Véronis, J. (1994). MULTEXT:
Multilingual Text Tools and Corpora. Proceedings of the 15th
International Conference on Computational Linguistics, COLING'94,
Kyoto, Japan, 588-92.
- Ide, N., Véronis, J. (1994). Machine
Readable Dictionaries: What have we learned, where do we go?
Proceedings of the International Workshop on the Future of Lexical
Research, Beijing, China, 137-46.
- Véronis, J., Hirst, D., Espesser, R., Ide, N. (1994). NL and Speech in the MULTEXT Project.
Proceedings of the AAAI-94 Workshop on Integration of Natural
Language and Speech, Seattle, July 1994, 72-8.
- Hirst, D., Ide, N., Véronis, J. (1994). Coding fundamental frequency
patterns for multi-lingual synthesis with INTSINT in the MULTEXT
project. Conference Proceedings of the 2nd ESCA/IEEE Workshop on
Speech Synthesis, New Paltz, NY, September 1994, 77-81.
1993
- Ide, N., Véronis, J. (1993). Extracting
knowledge bases from machine-readable dictionaries: Have we wasted our
time? KB&KS'93 Workshop, Tokyo, 257-266.
- Ide, N., Véronis, J. (1993). Refining taxonomies extracted from
machine-readable dictionaries. In Hockey, S., Ide, N. Research in
Humanities Computing II, Oxford University Press, 145-59.
- Ide, N., Walker, D. (1993). Common methodologies in humanities
computing and computational linguistics. Computers and the
Humanities, Special issue on Common Methodologies in Humanities
Computing and Computational Linguistics, N. Ide and D. Walker, Eds.,
327-330.
1992
- Ide, N., Véronis, J., Warwick-Armstrong, S., Calzolari, N.
(1992). Principles for encoding machine
readable dictionaries. EURALEX'92 Proceedings, H. Tommola, K.
Varantola, T. Salmi-Tolonen, Y. Schopp, Eds., in Studia
Translatologica, Ser. a, 2, Tampere, Finland, 239-246.
- Véronis, J., Ide, N. (1992). A feature-based model for lexical
databases. Proceedings of the 14th International Conference on
Computational Linguistics, COLING'92, Nantes, France, 588-594.
1991
1990
- Ide, N.M., Véronis, J. (1990). Mapping Dictionaries: A Spreading
Activation Approach. Proceedings of the 6th Annual Conference of the
Centre for the New Oxford English Dictionary, Waterloo, Ontario,
52-64.
- Ide, N.M., Véronis, J. (1990). Very large neural networks for word
sense disambiguation. Proceedings of the 9th European Conference on
Artificial Intelligence, ECAI'90, Stockholm, 366-68.
- Ide, N.M., Véronis, J. (1990). Word Sense Disambiguation with Very Large Neural Networks
Extracted from Machine Readable Dictionaries. Proceedings of the 13th International
Conference on Computational Linguistics, COLING'90, Helsinki, vol.
2, 389-94.