36 research outputs found
Robust Unsupervised Topic-Based Language Model Adaptation
We present a novel topic mixture-based language model adaptation approach that uses Latent Dirichlet Allocation (LDA). We use Probabilistic Latent Semantic Analysis (PLSA) to automatically cluster a heterogeneous training corpus, and then train an LDA model using the resultant topic-document assignments. Using this LDA model, we construct fine-grained topic-specific corpora at the utterance level, which we use to train topic language models. These topic LMs are interpolated with a background language model during language model adaptation under an N-best rescoring framework. We describe several techniques for hardening LDA topic inference against first-pass recognition errors, and demonstrate the effectiveness of metadata-based segmentation when combined with show-level language model adaptation. Good improvements over state-of-the-art schemes were obtained in experiments on multi-genre GALE Project data in Mandarin Chinese.
1 Introduction
1.1 An Automatic Speech Recognition System
1.2 Training an ASR System
1.3 This Thesis
2 Language Model Adaptation
2.1 Introduction
2.1.1 Automatic Speech Recognition
2.1.2 n-gram Language Models
2.1.3 Language Model Adaptation
2.2 Related Work
2.3 Adaptation Approach
2.3.1 Topic Corpus Construction and LM Training
2.3.2 Language Model Adaptation
2.4 Experiments
2.4.1 Topic LM Mixture Size Experiments
2.4.2 Interpolation Weight Experiments
2.4.3 Speech Recognition Experiments
2.5 Discussion
3 Robust Language Model Adaptation
3.1 Introduction
3.2 LDA Topic Inference Algorithm
3.2.1 Custom Prior Mixtures
3.2.2 Custom Word Weights
3.3 Adaptation Approach
3.3.1 Topic LM Training
3.3.2 Segmentation for Topic Inference
3.3.3 Language Model Adaptation
3.4 Experiments
3.4.1 Experimental Setup
3.4.2 Training Data
3.4.3 Topic Inference: tmtw
3.4.4 Adaptation Windows
3.4.5 Show Adaptation: Metadata timegap
3.4.6 Weighted N-Best Topic Inference
3.4.7 Soft vs. Hard Topic Language Models
3.4.8 Topic LMs and Topic Inference
3.4.9 Optimal lB
3.5 Discussion
4 Conclusion
4.1 Future Work
4.1.1 Topic Modeling
4.1.2 Topic Language Model Training
4.1.3 Adaptation Framework
4.1.4 Topical Context Windows: Content-Based Segmentation
4.1.5 Interpolation Weights: Dynamic lB
4.1.6 Scalability
4.2 Acknowledgments
A Latent Dirichlet Allocation
A.1 LDA Topic Inference
Language Model Adaptation Using Latent Dirichlet Allocation and an Efficient Topic Inference Algorithm
Multi-level Cues for Guest Language Recovery in Code-Mixed Speech Recognition
The rise of mobile devices and online learning brings into sharp focus the importance of speech recognition not only for the many languages of the world but also for code-mixed speech, especially where English is the second language. The recognition of code-mixed speech, where the speaker mixes languages within a single utterance, is a challenge for both computers and humans, not least because of the limited training data. We conduct research on a Mandarin-English code-mixed lecture corpus, where Mandarin is the host language and English the guest language, and attempt to find complex features for the recovery of English segments that were misrecognized in the initial recognition pass. We propose a multi-level framework wherein both low-level and high-level cues are jointly considered; we use phonotactic, prosodic, and linguistic cues in addition to acoustic-phonetic cues to discriminate at the frame level between English- and Chinese-language segments. We develop a simple and exact method for CRF feature induction, and improved methods for using cascaded features derived from the training corpus. By additionally tuning the data imbalance ratio between English and Chinese, we demonstrate highly significant improvements over previous work in the recovery of English-language segments, and demonstrate performance superior to DNN-based methods. We demonstrate considerable performance improvements not only with the traditional GMM-HMM recognition paradigm but also with a state-of-the-art hybrid CD-HMM-DNN recognition framework.
Aaron Perrine's It Has to Be Beautiful: Concerto for Alto Saxophone and Wind Ensemble – An Analysis, Conductor's Guide, and Soloist’s Guide
Aaron Perrine’s music has grown in popularity and critical acclaim since his 2005 composition April became a finalist in the first Frank Ticheli Composition Contest. Perrine is only the ninth composer to earn two Sousa/American Bandmasters Association/Ostwald Awards since the creation of that honor in 1956. However, there is a lack of published scholarly material on Perrine and his music. The subsequent void between his impact on the repertoire of the wind band and the lack of available resources precipitates the need for this document. The goal of this study is to examine Perrine and his 2018 composition, It Has to Be Beautiful: Concerto for Alto Saxophone and Wind Ensemble. The resultant document includes a biography of Perrine; an analysis; a conductor’s guide to addressing technical issues, interpretation, and gestures; and a soloist’s guide to address the technical and interpretive challenges within It Has to Be Beautiful. This study was prepared with the help of multiple interview subjects: Dr. Kenneth Tse, for whom the concerto was written; Dr. Richard Mark Heidel, who conducted the North American premiere; Dr. Timothy Diem, whose personal story is intertwined in the programmatic nature of the concerto; and the composer of the concerto, Dr. Aaron Perrine, who was interviewed multiple times. Lastly, this study presents information on Perrine’s compositional style and aesthetic, which may also be germane to other compositions in his catalogue, for wind band or otherwise. The many interviews undertaken for this study resulted in the sharing of previously unpublished information which will prove useful to composers, conductors, and others interested in Perrine, as well as those composers who identify with his compositional processes or product, regardless of genre or the ensembles with which they work
Robust topic inference for latent semantic language model adaptation
We perform topic-based, unsupervised language model adaptation under an N-best rescoring framework by using previous-pass system hypotheses to infer a topic mixture which is used to select topic-dependent LMs for interpolation with a topic-independent LM. Our primary focus is on techniques for improving the robustness of topic inference for a given utterance with respect to recognition errors, including the use of ASR confidence and contextual information from surrounding utterances. We describe a novel application of metadata-based pseudo-story segmentation to language model adaptation, and present good improvements to character error rate on multi-genre GALE Project data in Mandarin Chinese. Index Terms: language model adaptation, topic modeling, unsupervised adaptation, speech recognition, story segmentation
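The interpolation step described in this abstract can be illustrated with a minimal sketch. It assumes unigram probability tables and a simple linear interpolation; the function name `adapted_prob`, the parameter `lambda_b` (the weight on the topic-independent LM), and the toy probabilities are hypothetical and not taken from the paper, which works with full n-gram LMs under N-best rescoring.

```python
from typing import Dict, Sequence


def adapted_prob(
    word: str,
    background: Dict[str, float],
    topic_lms: Sequence[Dict[str, float]],
    topic_mixture: Sequence[float],
    lambda_b: float,
) -> float:
    """Linearly interpolate topic LMs with a background LM (illustrative only).

    P(w) = lambda_b * P_bg(w) + (1 - lambda_b) * sum_k theta_k * P_k(w),
    where theta is the inferred topic mixture, normalised to sum to 1.
    """
    if not 0.0 <= lambda_b <= 1.0:
        raise ValueError("lambda_b must lie in [0, 1]")
    if len(topic_lms) != len(topic_mixture):
        raise ValueError("need exactly one mixture weight per topic LM")
    total = sum(topic_mixture)
    if total <= 0.0:
        raise ValueError("topic mixture must have positive mass")

    # Weighted sum of the topic-dependent probabilities.
    topic_part = sum(
        (weight / total) * lm.get(word, 0.0)
        for weight, lm in zip(topic_mixture, topic_lms)
    )
    return lambda_b * background.get(word, 0.0) + (1.0 - lambda_b) * topic_part


# Toy two-word vocabulary with two topics (all numbers are made up).
background_lm = {"a": 0.5, "b": 0.5}
topics = [{"a": 0.9, "b": 0.1}, {"a": 0.1, "b": 0.9}]
mixture = [0.75, 0.25]
p_a = adapted_prob("a", background_lm, topics, mixture, lambda_b=0.5)
p_b = adapted_prob("b", background_lm, topics, mixture, lambda_b=0.5)
```

Because each component table sums to one over the vocabulary and the weights form a convex combination, the interpolated values also sum to one, which is why linear interpolation is a common choice for combining LMs.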
Acanthocaudus tissoti Smith
Acanthocaudus tissoti (Smith) (Figs 3–4, 6–7) Trioxys (Acanthocaudus) tissoti Smith, 1944: 96 [USNM, examined]. Acanthocaudus tissoti: Mackauer 1960: 138 [revised combination]. Trioxys (Acanthocaudus) schlingeri Muesebeck, 1958: 144 [USNM, examined]. New synonym. Acanthocaudus schlingeri: Mackauer 1960: 138 [revised combination]. Diagnosis. The mesosoma is mottled yellow and brown or entirely brown in A. tissoti; it is entirely yellow in A. caudacanthus. The head is entirely brown or brown dorsally and gradually transitioning to yellow ventrally in A. tissoti; it is yellow with ocellar triangle entirely brown to black or yellow with brown to black markings around periphery of each ocellus in A. bicolor. Distribution. CANADA: British Columbia (Schlinger & Hall 1960); CUBA: Guantánamo (as Oriente), Pinar del Río (Starý 1981); USA: Florida (Smith 1944), Indiana *, South Dakota (Assefa et al. 2015). Hosts. Uroleucon (Uroleucon) ambrosiae (Muesebeck 1958) ex Baccharis sp. (Schlinger & Hall 1960) and Parthenium hysterophorus (Starý 1981), Uroleucon (Uroleucon) rudbeckiae (Smith 1944) ex Silphium perfoliatum, Uroleucon (Uroleucon) russellae (Marsh 1979). Specimens reared. All USA. INDIANA: 1 ♀ Tippecanoe Co., Lily Wildlife Area, 40°23'15.26"N 86°56'11.30"W, 2.vii.2007, T.T. Heidel, ex undet. aphids on Silphium perfoliatum, 07-254; 1 ♀ same data as previous except 07-296; SOUTH DAKOTA: 3 ♀ 21 ♂ Brookings Co., Brookings, South Dakota State University, Campus Agronomy Farm, 1.viii.2001, P. Loewe & A. Boe, ex aphids on Silphium perfoliatum; 15 ♀ 12 ♂ 5 indet. same data as previous except Felt Farm, 4 mi N of Brookings, 44°22'08"N 96°47'39"W, 1693' elevation, coll. 28.vii.2013, A. Boe, em. 28.vii.–2.viii.2013, ex Uroleucon cf. rudbeckiae on Silphium perfoliatum; 20 ♀ 20 ♂ 1 indet. same data as previous except coll. 3.viii.2013, P. J. Johnson, em. 3–4.viii.2013; 1 ♀ same data as previous except coll. 15.viii.2013, em. 17–20.viii.2013 (1 ♀ PURC, 10 ♀ 10 ♂ SDSU, 30 ♀ 43 ♂ 6 indet. USNM). Discussion. Muesebeck (1958) differentiated A. schlingeri (Figs 4, 7) from A. tissoti (Figs 3, 6) based on the absence of a distinct carina mediobasally on the propodeum and the eyes more convergent ventrally in the former compared to the latter. Analysis of 31 female specimens from South Dakota regarded as A. tissoti by the first author revealed that the carina mediobasally on the propodeum varies from present to absent within this species. Also, FW was 1.40–1.67X FH for 26 of the female specimens from South Dakota regarded as A. tissoti. The FW:FH ratios for the holotypes of A. tissoti and A. schlingeri are 1.42 and 1.43, respectively, and thus, both fall within that range. Therefore, Acanthocaudus schlingeri Muesebeck, 1958 is synonymized with Acanthocaudus tissoti (Smith, 1944) given the intraspecific variation observed for the features used to distinguish those species. Published as part of Kula, Robert R., Johnson, Paul J., Heidel-Baker, Thelma T. & Boe, Arvid, 2017, A new species of Acanthocaudus Smith (Braconidae: Aphidiinae), with a key to species and new host and distribution records for aphidiines associated with Silphium perfoliatum L. (Asterales: Asteraceae), pp. 543-552 in Zootaxa 4236 (3) on pages 548-550, DOI: 10.11646/zootaxa.4236.3.8, http://zenodo.org/record/32230
Automatic topic detection strategy for information retrieval in spoken document
This paper suggests an alternative solution for the task of spoken document retrieval (SDR). The proposed system runs retrieval on multi-level transcriptions (word and phone) produced by word and phone recognizers respectively, and their outputs are combined. We propose to use a latent Dirichlet allocation (LDA) model for capturing the semantic information in the word transcription. The LDA model is employed for estimating the topic distribution in queries and word-transcribed spoken documents, and the matching is performed at the topic level. Acoustic matching between query words and phonetically transcribed spoken documents is performed using a phone-based matching algorithm. The results of the acoustic and topic-level matching methods are compared and shown to be complementary.
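Topic-level matching of the kind described in this abstract can be sketched as follows. The abstract does not say which similarity measure is used, so cosine similarity between topic distributions is an assumption here, and the function names `cosine_similarity` and `rank_documents` and the toy distributions are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine similarity between two topic distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_p * norm_q)


def rank_documents(
    query_topics: Sequence[float],
    doc_topics: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every document against the query and sort best-first."""
    scored = [
        (doc_id, cosine_similarity(query_topics, topics))
        for doc_id, topics in doc_topics.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy example: the query is dominated by topic 0 (all numbers are made up).
query = [0.8, 0.2]
documents = {"doc_a": [0.1, 0.9], "doc_b": [0.9, 0.1]}
ranking = rank_documents(query, documents)
```

In a combined system such topic-level scores would be merged with the phone-based acoustic scores; how the two are weighted is not stated in the abstract and is left out of this sketch.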
Photosynthesis-inspired device architectures for organic photovoltaics
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 153-166). Organic semiconductor photovoltaics offer a promising route to low-cost, scalable, emissions-free electricity generation. However, achieving higher power conversion efficiencies is critical before these devices can play a larger role in our future energy generation landscape. Organic photovoltaic devices are currently limited by two primary challenges: (1) a trade-off between light absorption and exciton diffusion and (2) low open-circuit voltage due to charge recombination at the donor-acceptor interface. In this work, we demonstrate two new device architectures inspired by photosynthesis that aim to overcome these two challenges. First, we overcome the trade-off between light absorption and exciton diffusion by introducing an external light absorbing antenna layer. We model energy transfer from the antenna to the charge generating layers via surface plasmon polariton modes in the interfacial thin silver contact and via radiation into waveguide modes. We experimentally demonstrate devices with both single layer antennas and strongly absorbing resonant cavity antennas. We measure energy transfer efficiency from the antenna layer to the PV active layers as high as 51±10%. We discuss structural design criteria and describe ideal antenna material characteristics. Second, we reduce charge transfer state recombination in organic photovoltaics by inserting a thin interfacial layer at the donor-acceptor interface. The thin interfacial layer creates a cascade energy structure that destabilizes the Coulombically bound charge transfer state formed immediately following exciton dissociation. We find the optimal interfacial layer thickness to be approximately 1.5 nm. In CuPc/C₆₀ devices, under simulated solar illumination the short-circuit current increased 34%, the open-circuit voltage increased 33%, and the power conversion efficiency increased 49%. Thin interfacial layers can also be used to study the physics of exciton separation. by Timothy David Heidel. Ph.D.
Improving Word Vector with Prior Knowledge in Semantic Dictionary
Using a low-dimensional vector space to represent words has been very effective in many NLP tasks. However, it does not work well when faced with the problem of rare and unseen words. In this paper, we propose to leverage the knowledge in a semantic dictionary in combination with some morphological information to build an enhanced vector space. We get an improvement of 2.3% over the state-of-the-art HeidelTime system in temporal expression recognition, and obtain a large gain in other named entity recognition (NER) tasks. The semantic dictionary HowNet alone also shows promising results in computing lexical similarity.
