
⚠️ IMPORTANT UPDATE (2024/10/8): We recommend checking out JMTEB, a new leaderboard that evaluates embedding models on a more diverse set of tasks.

# JapaneseEmbeddingEval

| Model | #dims | #params | JSTS valid-v1.1 | JSICK test | MIRACL dev | Average |
|:--|--:|--:|--:|--:|--:|--:|
| BAAI/bge-m3 (dense_vecs) | 1024 | 567M | 0.802 | 0.798 | 0.910[^1] | 0.837 |
| jinaai/jina-embeddings-v3 | 1024 | 572M | 0.819 | 0.782 | 0.862 | 0.821 |
| MU-Kindai/SBERT-JSNLI-base | 768 | 110M | 0.766 | 0.652 | 0.326 | 0.581 |
| MU-Kindai/SBERT-JSNLI-large | 1024 | 337M | 0.774 | 0.677 | 0.278 | 0.576 |
| bclavie/fio-base-japanese-v0.1[^2] | 768 | 111M | 0.863 | 0.894 | 0.718 | 0.825 |
| cl-nagoya/ruri-small | 768 | 67M | 0.821 | 0.833 | 0.791[^1] | 0.815 |
| cl-nagoya/ruri-base | 768 | 111M | 0.833 | 0.823 | 0.846[^1] | 0.834 |
| cl-nagoya/ruri-large | 1024 | 337M | 0.842 | 0.819 | 0.864[^1] | 0.842 |
| cl-nagoya/sup-simcse-ja-base | 768 | 111M | 0.809 | 0.827 | 0.527 | 0.721 |
| cl-nagoya/sup-simcse-ja-large | 1024 | 337M | 0.831 | 0.831 | 0.507 | 0.723 |
| cl-nagoya/unsup-simcse-ja-base | 768 | 111M | 0.789 | 0.790 | 0.487 | 0.689 |
| cl-nagoya/unsup-simcse-ja-large | 1024 | 337M | 0.814 | 0.796 | 0.485 | 0.699 |
| colorfulscoop/sbert-base-ja | 768 | 110M | 0.742 | 0.657 | 0.254 | 0.551 |
| intfloat/multilingual-e5-small | 384 | 117M | 0.789 | 0.814 | 0.847[^1] | 0.817 |
| intfloat/multilingual-e5-base | 768 | 278M | 0.796 | 0.806 | 0.845[^1] | 0.816 |
| intfloat/multilingual-e5-large | 1024 | 559M | 0.819 | 0.794 | 0.883[^1] | 0.832 |
| intfloat/multilingual-e5-large-instruct | 1024 | 559M | 0.832 | 0.822 | 0.876[^1] | 0.844 |
| oshizo/sbert-jsnli-luke-japanese-base-lite | 768 | 133M | 0.811 | 0.726 | 0.497 | 0.678 |
| pkshatech/GLuCoSE-base-ja-v2 | 768 | 133M | 0.809 | 0.849 | 0.879[^1] | 0.846 |
| pkshatech/RoSEtta-base-ja | 768 | 190M | 0.790 | 0.835 | 0.845[^1] | 0.823 |
| pkshatech/GLuCoSE-base-ja | 768 | 133M | 0.818 | 0.757 | 0.692 | 0.755 |
| pkshatech/simcse-ja-bert-base-clcmlp | 768 | 111M | 0.801 | 0.735 | 0.544 | 0.693 |
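
The JSTS valid-v1.1 and JSICK test columns are Spearman rank correlations between each model's cosine similarities and the human-annotated similarity scores. The sketch below shows that style of evaluation with sentence-transformers; the model choice and the toy sentence pairs are illustrative, not the actual evaluation harness.

```python
# Minimal STS-style evaluation sketch: Spearman correlation between
# embedding cosine similarities and gold similarity scores.
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("oshizo/sbert-jsnli-luke-japanese-base-lite")

# Toy stand-ins for JSTS sentence pairs (JSTS rates similarity on a 0-5 scale).
pairs = [
    ("川べりでサーフボードを持った人たちがいます。", "サーファーたちが川べりに立っています。"),
    ("男性がギターを弾いています。", "女性がピアノの前に座っています。"),
]
gold = [3.8, 0.4]

emb1 = model.encode([a for a, _ in pairs], convert_to_tensor=True)
emb2 = model.encode([b for _, b in pairs], convert_to_tensor=True)
cosine = util.cos_sim(emb1, emb2).diag().cpu().numpy()

print(spearmanr(cosine, gold).correlation)
```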
### API

| Model | #dims | #params | JSTS valid-v1.1 | JSICK test | MIRACL dev | Average |
|:--|--:|--:|--:|--:|--:|--:|
| text-embedding-3-large | 3072 | - | 0.838 | 0.812 | 0.841[^3] | 0.830 |
| text-embedding-3-small | 1536 | - | 0.781 | 0.804 | 0.795[^3] | 0.793 |
| text-embedding-ada-002 | 1536 | - | 0.790 | 0.790 | 0.728[^3] | 0.769 |
| textembedding-gecko-multilingual@001 | 768 | - | 0.801 | 0.804 | 0.800[^3] | 0.801 |
### LLM

| Model | #dims | #params | JSTS valid-v1.1 | JSICK test | MIRACL dev | Average |
|:--|--:|--:|--:|--:|--:|--:|
| intfloat/e5-mistral-7b-instruct | 4096 | 7.3B | 0.836 | 0.836 | 0.885 | 0.852 |
| oshizo/japanese-e5-mistral-7b_slerp | 4096 | 7.3B | 0.846 | 0.842 | 0.886 | 0.858 |
| oshizo/japanese-e5-mistral-1.9b | 4096 | 1.9B | 0.826 | 0.833 | 0.797 | 0.819 |
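
These rows use decoder-only LLMs as embedders: e5-mistral-style models prepend a task instruction to the query and take the hidden state of the last token as the embedding. The sketch below follows that protocol with plain transformers; the instruction string and padding handling are simplified assumptions, not the exact evaluation code.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "intfloat/e5-mistral-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModel.from_pretrained(name, torch_dtype=torch.float16)

def last_token_pool(hidden, attention_mask):
    # Hidden state of each sequence's final non-padding token (right padding).
    last = attention_mask.sum(dim=1) - 1
    return hidden[torch.arange(hidden.size(0)), last]

texts = [
    # Queries carry an instruction prefix; passages are embedded as-is.
    "Instruct: Given a web search query, retrieve relevant passages.\nQuery: 日本で一番高い山は？",
    "富士山は日本で最も高い山である。",
]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state
emb = F.normalize(last_token_pool(hidden, batch["attention_mask"]), p=2, dim=-1)
print((emb[0] @ emb[1]).item())  # cosine similarity between query and passage
```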
### ColBERT

| Model | #dims | #params | JSTS valid-v1.1 | JSICK test | MIRACL dev | Average |
|:--|--:|--:|--:|--:|--:|--:|
| bclavie/jacolbert_first_100[^4] | 128/token | 111M | - | - | 0.872[^3] | - |
| bclavie/JaColBERTv2[^4] | 128/token | 111M | - | - | 0.918[^3] | - |
| BAAI/bge-m3 (colbert_vecs) | 1024/token | 567M | 0.799 | 0.798 | 0.917[^1] | 0.838 |
| BAAI/bge-m3 (colbert+sparse+dense) | 1024/token[^5] | 567M | 0.800 | 0.805 | 0.926[^1] | 0.844 |
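
Per-token models score a query-document pair with ColBERT-style late interaction (MaxSim) rather than a single cosine similarity. A self-contained sketch of MaxSim, with random vectors standing in for real token embeddings:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """For each query token embedding, take the maximum similarity over all
    document token embeddings, then sum over the query tokens.
    Rows are assumed to be L2-normalized."""
    sim = query_vecs @ doc_vecs.T        # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())  # max over doc tokens, sum over query

# Toy example: 3 query tokens, 5 document tokens, 128 dims per token.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 128)); q /= np.linalg.norm(q, axis=1, keepdims=True)
d = rng.normal(size=(5, 128)); d /= np.linalg.norm(d, axis=1, keepdims=True)
print(maxsim_score(q, d))
```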
### Reranker

| Model | #dims | #params | JSTS valid-v1.1 | JSICK test | MIRACL dev | Average |
|:--|--:|--:|--:|--:|--:|--:|
| hotchpotch/japanese-bge-reranker-v2-m3-v1 | - | 567M | - | - | 0.947[^1] | - |
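
A reranker scores each (query, passage) pair jointly rather than producing standalone embeddings, which is why it has no #dims and only the retrieval column applies. A usage sketch, assuming the model loads as a standard sentence-transformers CrossEncoder; the example texts are illustrative.

```python
from sentence_transformers import CrossEncoder

# Assumption: the model works as a (query, passage) -> relevance cross-encoder.
model = CrossEncoder("hotchpotch/japanese-bge-reranker-v2-m3-v1", max_length=512)

query = "日本で一番高い山は？"
passages = ["富士山は日本で最も高い山である。", "琵琶湖は日本最大の湖である。"]
scores = model.predict([(query, p) for p in passages])

# Rank passages by descending relevance.
for score, passage in sorted(zip(scores, passages), reverse=True):
    print(f"{score:.3f}\t{passage}")
```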
### Sparse Retrieval

| Model | #dims | #params | JSTS valid-v1.1 | JSICK test | MIRACL dev | Average |
|:--|--:|--:|--:|--:|--:|--:|
| hotchpotch/japanese-splade-base-v1 | - | 111M | - | - | 0.925[^1] | - |
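
SPLADE-style models expand a text into a sparse weight vector over the tokenizer vocabulary and score by sparse dot product. The sketch below uses the standard SPLADE activation, log(1 + ReLU(logits)) with max pooling over positions; it illustrates the technique and is not necessarily the exact pipeline behind the score above.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "hotchpotch/japanese-splade-base-v1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

def splade_vector(text: str) -> torch.Tensor:
    batch = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits            # (1, seq_len, vocab_size)
    weights = torch.log1p(torch.relu(logits))     # SPLADE activation
    weights = weights * batch["attention_mask"].unsqueeze(-1)
    return weights.max(dim=1).values.squeeze(0)   # one weight per vocab term

q = splade_vector("日本で一番高い山は？")
d = splade_vector("富士山は日本で最も高い山である。")
print(float(q @ d))  # sparse dot-product relevance score
```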

## Datasets

Scores are computed on JSTS (valid-v1.1), JSICK (test), and the Japanese subset of MIRACL (dev).

## Footnotes

[^1]: These models have been fine-tuned on the MIRACL dataset, so the MIRACL task is not an unseen task for them. For details on each model, see: multilingual-e5, BGE-M3, hotchpotch/japanese-bge-reranker-v2-m3-v1, hotchpotch/japanese-splade-base-v1, Ruri, pkshatech/GLuCoSE-base-ja-v2, pkshatech/RoSEtta-base-ja.

[^2]: According to the blog post about fio-base-japanese-v0.1, the evaluation tasks are not unseen by this model, which makes it hard to compare directly with the other models.

[^3]: Evaluated on only the first 100 of the 860 MIRACL dev queries.

[^4]: JaColBERT is a retrieval model, optimised only for document retrieval tasks and not for semantic similarity/entailment tasks such as JSTS or JSICK.

[^5]: For BAAI/bge-m3, the dense vector has 1024 dimensions, the sparse representation has one float value per unique token, and the ColBERT representation has 1024 dimensions per token.