    End-to-End Generation of Written-style Transcript of Speech from Parliamentary Meetings

    Because conventional automatic speech recognition (ASR) systems are designed to faithfully reproduce utterances word by word, their outputs are not necessarily easy to read even when they contain few recognition errors. To address this issue, we propose a novel ASR approach that outputs readable, clean text directly from speech by removing fillers and disfluent regions, substituting colloquial expressions with formal ones, inserting punctuation, recovering omitted particles, and performing other appropriate corrections. We formalize this approach as end-to-end generation of written-style text from speech using a single neural network. We also propose a method to guide the training of this end-to-end model using automatically generated faithful transcripts, as well as a novel speech segmentation strategy based on online punctuation detection. An evaluation using 700 hours of speech from deliberations of the Japanese House of Representatives demonstrates that the proposed direct approach generates clean transcripts suitable for human consumption more accurately and with faster decoding than the conventional cascade approach, which combines ASR with text-based spoken-style conversion. We also provide an in-depth analysis of the types of edits performed by professional human editors to create the official written records of Japanese Parliamentary meetings, and evaluate the level of achievement and the error tendencies of the proposed system for each edit type.

    The Topological Space of Finitely Generated Groups (Developments in General Topology and Connections with Other Fields)

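    Fix k ∈ Z≥1. This determines a space Q(k) whose elements are pairs (G;S) = (G; s1, ... , sk), where G is a k-generated group and S = (s1, s2, ... , sk) is an ordered generating set of k elements. As considered by Grigorchuk, Q(k) carries a natural topology that is compact and metrizable. This article discusses what is known about the topological space Q(k) and its relation to group properties. Finally, it also states results of the author that were proved using LEF approximations on Q(k).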

    AN ALTERNATIVE PROOF OF KAZHDAN PROPERTY FOR ELEMENTARY GROUPS (Topology and Analysis of Discrete Groups and Hyperbolic Spaces)

    In 2010 (Invent. Math.), Ershov and Jaikin-Zapirain proved Kazhdan's property (T) for elementary groups. This expository article focuses on presenting an alternative, simpler proof. Unlike the original one, our proof supplies no estimate of Kazhdan constants. It may be regarded as a specific example of the results in the paper "Upgrading fixed points without bounded generation" (arXiv: 1505.06728, forthcoming version) by the author.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, when the same number of top-ranked authors is selected and analyzed, this picture represents fewer specialties in the research field being studied than the picture produced through traditional first-author co-citation counting. Reasons for these effects are discussed.
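
    The two counting schemes compared above can be sketched as follows. This is a minimal illustration, not the study's procedure: the function name `cocitation_counts`, the parameter `max_authors`, and the toy data are hypothetical, and the sketch assumes that every pair of credited authors in one citing paper counts once, including co-authors of the same cited work.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations over a set of citing papers.

    citing_papers: list of reference lists; each reference is an
                   ordered list of author names of one cited work.
    max_authors:   number of leading authors credited per cited work
                   (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Authors credited by this citing paper under the chosen scheme.
        credited = set()
        for authors in references:
            credited.update(authors[:max_authors])
        # Each unordered pair of credited authors is co-cited once
        # per citing paper (assumption of this sketch).
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers and their reference lists.
papers = [
    [["Smith", "Lee"], ["Jones", "Kim", "Park"]],
    [["Smith", "Lee"], ["Kim"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

    In this toy data, Lee never appears under first-author counting but is paired with Smith in both citing papers once the first five authors are credited, which is the kind of additional linkage the non-traditional scheme introduces.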

    Fixed Point Property For Universal Lattice On Schatten Classes

    The special linear group G = SL_n(Z[x_1, ... , x_k]) (n at least 3 and k finite) is called the universal lattice. Let n be at least 4, and let p be any real number in (1, infinity). The main result is the following: any finite index subgroup of G has the fixed point property with respect to every affine isometric action on the space of p-Schatten class operators. It is also shown that higher rank lattices have the same property. These results generalize previous theorems of the author and of Bader-Furman-Gelander-Monod, respectively, which treated the commutative L^p setting.

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who propels discourses rather than expressing opinions, an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Automatic Speech Recognition for the Archive of Ainu Folklores

    This article describes our work on speech recognition of Ainu folklore (Uwepeker). First, we constructed an Ainu speech corpus for the Saru dialect based on Ainu archive data provided by two museums. Next, we built an automatic speech recognition (ASR) system based on an attention-based encoder-decoder model and compared four recognition units: phones, syllables, word pieces, and words. The syllable unit gave the highest accuracy, with a phone recognition accuracy of 93.7% and 86.2% and a word recognition accuracy of 78.3% and 61.4% for the speaker-closed and speaker-open conditions, respectively. To address the significant degradation in the speaker-open condition, we propose an unsupervised speaker adaptation method using a CycleGAN. In this method, a CycleGAN learns a mapping from the voices of the speakers in the training data to the target speaker’s voice, and all speech in the training data is then converted into speech resembling the target speaker’s. This method reduced the phone error rate by up to 60.6% relative. In addition, we investigated language identification in mixed Japanese and Ainu speech and achieved reasonable performance by cascading phone and word recognition modules.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
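
    To make the Pearson-versus-cosine contrast concrete, here is a small sketch on hypothetical cocitation profiles. It illustrates one commonly raised concern (Pearson's sensitivity to added zero entries) and is not necessarily the argument the paper itself makes. The helper names and the data are assumptions, and the two previously unused measures mentioned in the abstract are not reproduced here.

```python
import math


def cosine(x, y):
    """Cosine similarity between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors with three other authors.
x = [3, 1, 2]
y = [1, 3, 2]

# The same profiles after adding two authors cocited with neither of them.
x_padded = x + [0, 0]
y_padded = y + [0, 0]

print(cosine(x, y), cosine(x_padded, y_padded))    # cosine is unchanged
print(pearson(x, y), pearson(x_padded, y_padded))  # Pearson changes sign
```

    Adding the two zero entries leaves the cosine untouched, while the Pearson correlation moves from strongly negative to positive, even though nothing about the relationship between the two authors has changed.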

    Dispelling the Myths Behind First-author Citation Counts

    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore how much author rankings resulting from different citation counting methods actually differ from each other, and to demonstrate the capability of emerging data and tools on the Web to support more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and by other citation counting methods, and the high costs of using more realistic citation counting methods that are not well supported by the ISI databases. It is argued that increasingly available digital full-text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.
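
    As a rough illustration of how author rankings can diverge between counting methods, the sketch below compares first-author counting with all-author counting on made-up data. The function names, the full-credit scheme for all authors, and the numbers are assumptions for illustration only, not the study's data or method.

```python
from collections import Counter


def citation_counts(cited_works, first_author_only):
    """Sum citations per author.

    cited_works: list of (ordered author list, citation count) pairs.
    first_author_only: credit only the first author if True, otherwise
                       give full credit to every author (an assumption).
    """
    counts = Counter()
    for authors, n_citations in cited_works:
        credited = authors[:1] if first_author_only else authors
        for author in credited:
            counts[author] += n_citations
    return counts


def ranking(counts):
    """Authors sorted by descending count, ties broken alphabetically."""
    return [a for a, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


# Hypothetical cited works: (authors in byline order, citations received).
works = [(["Ann", "Bob"], 10), (["Cat", "Bob"], 8), (["Bob"], 3)]

first_rank = ranking(citation_counts(works, first_author_only=True))
all_rank = ranking(citation_counts(works, first_author_only=False))
print(first_rank)  # ['Ann', 'Cat', 'Bob']
print(all_rank)    # ['Bob', 'Ann', 'Cat']
```

    Here Bob ranks last under first-author counting but first once co-authored works are credited, which is the kind of divergence the study sets out to measure.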