From 9a53dffe7a796d9be4604e360f96c8a57354632a Mon Sep 17 00:00:00 2001
From: Sander Wood <88025106+sander-wood@users.noreply.github.com>
Date: Sun, 23 Apr 2023 00:56:11 +0800
Subject: [PATCH] Update README.md
---
clamp/README.md | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/clamp/README.md b/clamp/README.md
index db5efe6..88118e5 100644
--- a/clamp/README.md
+++ b/clamp/README.md
@@ -3,17 +3,17 @@ The intellectual property of the CLaMP project is owned by the Central Conservat
## Model description
In [CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval](https://ai-muzic.github.io/clamp/), we introduce a solution for cross-modal symbolic MIR that utilizes contrastive learning and pre-training. The proposed approach, CLaMP (Contrastive Language-Music Pre-training), learns cross-modal representations between natural language and symbolic music using a music encoder and a text encoder trained jointly with a contrastive loss. To pre-train CLaMP, we collected a large dataset of 1.4 million music-text pairs. Training employs text dropout as a data augmentation technique and bar patching to efficiently represent music data, reducing sequence length to less than 10% of its original size. In addition, we developed a masked music model pre-training objective to enhance the music encoder's comprehension of musical context and structure. CLaMP integrates textual information to enable semantic search and zero-shot classification for symbolic music, surpassing the capabilities of previous models. To support the evaluation of semantic search and music classification, we publicly release [WikiMusicText](https://huggingface.co/datasets/sander-wood/wikimusictext) (WikiMT), a dataset of 1,010 lead sheets in ABC notation, each accompanied by a title, artist, genre, and description. In comparison to state-of-the-art models that require fine-tuning, zero-shot CLaMP demonstrates comparable or superior performance on score-oriented datasets.
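
To make the training objective concrete, here is a minimal sketch of a CLIP-style symmetric contrastive loss over a batch of paired music and text embeddings. This is an illustration, not the repository's actual implementation; the function name, embedding dimension, and temperature value are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(music_emb, text_emb, temperature=0.07):
    # Illustrative sketch: temperature and shapes are assumed, not CLaMP's exact values.
    # Normalize both modalities to unit length so logits are cosine similarities.
    music_emb = F.normalize(music_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # Pairwise similarity matrix, scaled by temperature: (batch, batch).
    logits = music_emb @ text_emb.t() / temperature
    # Matching music-text pairs lie on the diagonal.
    labels = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy over music-to-text and text-to-music directions.
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2

# Example usage with random stand-in embeddings (dimension 512 is an assumption):
music = torch.randn(8, 512)
text = torch.randn(8, 512)
loss = contrastive_loss(music, text)
```

Each music-text pair in the batch serves as a positive example while every other combination acts as an in-batch negative, which is what pushes the two encoders toward a shared embedding space.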
-
-
The architecture of CLaMP, including two encoders - one for music and one for text - trained jointly with a contrastive loss to learn cross-modal representations.
The processes of CLaMP performing cross-modal symbolic MIR tasks, including semantic search and zero-shot classification for symbolic music, without requiring task-specific training data.