
Fall 2015

Date	Event	Speaker	Abstract/Details
08/26/2015	Incorporating World Knowledge to Heterogeneous Information Networks	Ming Zhang	Location: KOELBEL 203 The key challenges of applying world knowledge are how to adapt the world knowledge to domains and how to represent it for learning. In this talk, we provide an example of using world knowledge for domain-dependent document clustering. We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network. Then we propose a clustering algorithm that can cluster multiple types and incorporate the sub-type information as constraints. Experimental results using Freebase and YAGO2 on two text benchmark datasets (20newsgroups and RCV1) show that incorporating world knowledge as indirect supervision can significantly outperform the state-of-the-art clustering algorithms as well as clustering algorithms enhanced with world knowledge features.
09/09/2015	Cultural Heritage Linked Data on the Semantic Web	Eero Hyvönen	Location: KOELBEL 355 Cultural Heritage (CH) (meta)data is often heterogeneous, multilingual, distributed, semantically interlinked, and produced independently by organizations and individuals using different schemas, tools, and practices. As a result, a fundamental problem in dealing with CH data is making the content mutually interoperable, so that it can be searched, linked, and presented in a harmonized way across the boundaries of datasets and data silos. The Semantic Web and Linked Data standards and practices of the W3C are a promising approach to addressing these issues [1]. However, this is not enough: we also need a content infrastructure, i.e., the actual domain ontologies, metadata models, and data shared by the CH community, and web services that make their integration and use in CH data systems easy and cost-efficient. This talk describes our experiences in building a national-level Linked Data content infrastructure in Finland.
09/16/2015	Matrix Completion and Robust PCA: new data analysis tools	Stephen Becker	Location: KOELBEL 203 Matrix completion is a generalization of compressed sensing that seeks to determine missing matrix entries under some (non-Bayesian) assumptions about the matrix. The technique has generated a lot of excitement due to rigorous guarantees in some cases, and also due to applications to machine learning (e.g., the Netflix prize problem). This talk discusses basic matrix completion, including efficient algorithms suitable for big data, as well as an extension of matrix completion known as robust PCA, which can handle large outliers in the data. We continue with several applications: inferring the structure of chromosomes, functional imaging of the brain, removing clouds from multi-spectral satellite image data, and verifying the properties of a quantum state or a quantum gate.
09/23/2015	N-minute madness		Location: ENG Clark Conference Room
09/30/2015	AMR and AMR Parsing	Martha, Wei-Te, Wayne Ward	Location: Fleming 279 • Broad-coverage CCG Semantic Parsing with AMR • A Transition-based Algorithm for AMR Parsing • Parsing English into Abstract Meaning Representation Using Syntax-Based Machine Translation
10/07/2015	NN for SRL	Bill Foland, Jim Martin	Location: Fleming 279
10/14/2015	Topic modeling for sentence annotation - brainstorming		
10/21/2015	Aligning perspectives to scientific literature	Jin-Dong Kim	Location: Fleming 279 Scientific literature holds the accumulation of our scientific discoveries. Access to this accumulated knowledge can make the development of new knowledge more efficient. Because the size of the scientific literature is increasing exponentially, semantic indexing of literature is important to allow instant and fine-grained access to the sources of scientific assertions. Many ongoing projects produce semantic indexing of scientific literature, a.k.a. literature annotation. Literature annotation projects are particularly active in the life sciences, partly due to the existence of public literature databases, e.g. PubMed. Although many of those annotation projects are conducted individually, they fundamentally share the same target, i.e. PubMed articles. Since it is impossible for a single group to annotate the whole PubMed collection for every important aspect, individual projects annotate different parts of PubMed for different aspects of the life sciences. It is like many blind men annotating a giant elephant from their individual perspectives. The annotations produced by an individual project may be limited, but if all the annotations are collected and aligned, the chances of figuring out the whole picture will be maximized. The PubAnnotation system is developed to provide a platform for collecting and aligning various annotations made to a collection of literature, currently a collection of life science literature represented by PubMed articles. The Biomedical Linked Annotation Hackathon (BLAH) community is backing the developments around PubAnnotation, working towards publicly shared resources of linked literature annotation.
10/28/2015	Verb semantics	Bill Croft	
11/04/2015	Document Classification by Topic Using Neural Networks	Scott Denning	Presented is a method for classifying patent documents by technology type. This method is enabled by the creation of document indexes using latent semantic indexing. The indexes are input into an artificial neural network and based on learned patterns of categories and corresponding indexes, the neural network determines the most appropriate topic category. Testing has shown that this system achieves 99.5% accuracy in correctly classifying documents of a particular technology category if there are at least fifty patents in that category’s training set.
11/11/2015		James Gung, James Pustejovsky, Annie Zaenen	
11/18/2015	Topic modeling for sentence annotation - brainstorming		
12/02/2015	AMR parsing	Wei-Te Chen	
12/09/2015	NAACL Paper Clinic		