Generating Expressive Summaries for Speech and Musical Audio using Self-Similarity Clues

doi:10.1109/ICME.2006.262675

Open Access•Proceedings Article•10.1109/ICME.2006.262675

Mustafa Sert, +2 more

- 09 Jul 2006

- pp 941-944

8

TL;DR: Proposes a novel algorithm for the structural analysis of audio that detects repetitive patterns. Such patterns suit content-based audio information retrieval systems because they carry valuable information about the content of the audio, such as a chorus or a concept.

Citations

Proceedings Article•10.1109/ISCE.2006.1689505

A Robust and Time-Efficient Fingerprinting Model for Musical Audio

Mustafa Sert, +2 more

- 11 Sep 2006

TL;DR: This work makes use of the audio spectrum flatness (ASF) and audio signature (AS) features of the MPEG-7 standard, which are new to the audio feature family and have received less attention than other feature types.

13

Patent

A system and method for streaming music repair and error concealment

Jonathan Doherty, +2 more

- 20 May 2010

TL;DR: A method for analyzing the self-similarity of audio data is presented. It involves obtaining the audio spectrum envelope data of the audio file to be analyzed; performing a clustering operation on the spectrum envelope data to produce a clustered set of data; for a first portion of the clustered data, performing a string-matching operation against at least one other portion of the clustered data; and, based on the results of the string-matching operation, determining the portion of the clustered data most similar to the first portion.
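The string-matching step described above can be illustrated with a minimal sketch. It assumes the clustering has already been done, so each audio frame is represented by a single cluster label, and it uses a plain Hamming distance over a sliding window. The function name `most_similar_portion`, the distance measure, and the toy label string are all assumptions made for illustration; the summary does not say which matching algorithm the patent uses.

```python
def most_similar_portion(labels, start, length):
    """Find the window most similar to labels[start:start+length].

    Returns (best_start, mismatches). Windows that overlap the query
    are skipped. Hamming distance is an illustrative choice only.
    """
    query = labels[start:start + length]
    best = None
    for s in range(len(labels) - length + 1):
        if abs(s - start) < length:  # overlaps the query window
            continue
        candidate = labels[s:s + length]
        mismatches = sum(x != y for x, y in zip(query, candidate))
        if best is None or mismatches < best[1]:
            best = (s, mismatches)
    return best


# Hypothetical cluster labels, one character per audio frame.
labels = "AABBCCDDAABBCCEE"
result = most_similar_portion(labels, start=0, length=6)
print(result)
```

In this toy input the first six labels recur exactly at offset 8, so that window is the one returned.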

10

Proceedings Article•10.1109/ISM.2008.90

Combining Structural Analysis and Computer Vision Techniques for Automatic Speech Summarization

Mustafa Sert, +2 more

- 15 Dec 2008

TL;DR: This work transforms a 1-D time-domain speech signal into a 2-D image representation, namely a (dis)similarity matrix, and detects possible repetitions within the matrix using computer vision techniques. It can be generalized as a speech-to-speech summarization method, in which the summarization results are presented as speech instead of text.
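Repetitions show up in a similarity matrix as stripes parallel to the main diagonal. The sketch below is a deliberately simple stand-in for the computer vision step, not the paper's method: it averages each off-diagonal and reports the lag with the highest mean similarity. The function name `strongest_lag` and the toy symbol sequence are assumptions made for illustration.

```python
import numpy as np


def strongest_lag(S, min_lag=1):
    """Return (lag, mean similarity) for the strongest off-diagonal of S.

    Ties are resolved in favor of the smallest lag.
    """
    n = S.shape[0]
    means = [np.diagonal(S, offset=lag).mean() for lag in range(min_lag, n)]
    best_index = int(np.argmax(means))
    return best_index + min_lag, float(means[best_index])


# Toy sequence that repeats every 4 frames (hypothetical data).
seq = np.array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3])
# Binary similarity matrix: 1 where two frames carry the same symbol.
S = (seq[:, None] == seq[None, :]).astype(float)
lag, score = strongest_lag(S)
print(lag, score)
```

Real similarity matrices are noisy, so a practical detector would need smoothing or thresholding before stripe detection; that is left out here.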

8

Proceedings Article•10.1109/ICPR.2014.140

Audiotory Movie Summarization by Detecting Scene Changes and Sound Events

Tong Lu, +2 more

- 24 Aug 2014

TL;DR: A novel movie audio summarization framework is presented, which consists of three processing levels: low-level audio feature extraction, mid-level audio event detection, and high-level auditory movie summarization.

3

Book Chapter•10.1007/11766254_27

Structural and semantic modeling of audio for content-based querying and browsing

Mustafa Sert, +2 more

- 07 Jun 2006

TL;DR: In this article, the authors integrate the three aspects of content-based audio management into a single framework and propose an efficient method for flexible querying and browsing of auditory data. Clients can express their queries as point, range, and k-nearest-neighbor queries, which are particularly significant in the multimedia domain.

2

References

Journal Article

ISO/IEC JTC 1/SC 29

Fumitaka Ono

- 25 Nov 2006

- The Journal of the Institute of Image El...

TL;DR: Information technology: international string ordering and comparison, and description of the common, tailorable ordering template. Amendment 1.

Proceedings Article•10.1109/ASPAA.2003.1285836

Summarizing popular music via structural similarity analysis

Matthew Cooper, +1 more

- 19 Oct 2003

TL;DR: Presents a framework for summarizing digital media based on structural analysis, focusing on characterizing the repetitive structure in popular music. Summaries are built by combining segments that represent the clusters most frequently repeated throughout the piece.

Proceedings Article•10.1109/ICASSP.2003.1200000

A chorus-section detecting method for musical audio signals

Masataka Goto

- 06 Apr 2003

TL;DR: This method, called RefraiD, can detect all the chorus sections in a song and estimate both ends of each section and can also detect modulated chorus sections by introducing a similarity that enables modulated repetition to be judged correctly.

Proceedings Article•10.1145/319463.319472

Visualizing music and audio using self-similarity

Jonathan Foote

- 30 Oct 1999

TL;DR: The acoustic similarity between any two instants of an audio recording is displayed in a 2D representation, allowing identification of structural and rhythmic characteristics, as well as tempo and structure extraction.
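A self-similarity matrix of this kind can be sketched in a few lines of NumPy. The example below is a minimal illustration under stated assumptions, not the cited paper's implementation: it uses Hann-windowed magnitude spectra as frame features and cosine similarity as the measure, and the function names, frame length, hop size, and toy signal are all chosen here for illustration.

```python
import numpy as np


def frame_features(signal, frame_len=256, hop=128):
    """Magnitude spectra of overlapping Hann-windowed frames, one row per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def self_similarity(features, eps=1e-12):
    """Cosine similarity between every pair of feature frames."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)  # guard against silent frames
    return unit @ unit.T


# Toy signal (assumed): half a second each of tone A, tone B, tone A again.
sr = 8000
t = np.arange(sr // 2) / sr
tone_a = np.sin(2 * np.pi * 440 * t)
tone_b = np.sin(2 * np.pi * 880 * t)
signal = np.concatenate([tone_a, tone_b, tone_a])

S = self_similarity(frame_features(signal))
print(S.shape)
```

Entry `S[i, j]` compares frame `i` with frame `j`, so the repeated tone A produces bright blocks away from the main diagonal when the matrix is displayed as an image.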

Related Papers (5)

MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval

Classification of audio signals using SVM and RBFNN

Audio Keywords Discovery for Text-Like Audio Content Analysis and Retrieval

Improve audio representation by using feature structure patterns

A hierarchical approach for speech-instrumental-song classification