How much content should be contained within an embedding?

October 12, 2023

Embedding Size:

Model Capacity: Higher-dimensional embeddings can capture more information, but they cost more memory and compute to store, index, and compare.
Downstream Task: The accuracy your task requires could influence embedding size. Tasks that depend on fine distinctions, such as text classification across many closely related classes, might need higher dimensionality, while simpler tasks might not.
Vocabulary Size: Larger vocabularies might require larger embeddings to capture the semantic differences between words or phrases.
Dataset Size: A larger dataset might justify a higher-dimensional embedding, since it has more information to encapsulate and provides enough examples to make use of the extra dimensions.
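
To make the resource side of this trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes dense vectors stored as 32-bit floats with no compression or index overhead; the function name and the example dimensions are illustrative, not taken from any particular model.

```python
def embedding_storage_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw storage for `num_vectors` dense vectors of `dim` values each.

    Assumes float32 (4 bytes per value) by default and ignores index overhead.
    """
    return num_vectors * dim * bytes_per_value


if __name__ == "__main__":
    # Compare a few illustrative dimensions for one million stored vectors.
    for dim in (128, 384, 768, 1536):
        size_mb = embedding_storage_bytes(1_000_000, dim) / 1e6
        print(f"dim={dim:>5}: {size_mb:,.0f} MB for 1M vectors")
```

Raw storage grows linearly with dimension, so doubling the embedding size doubles the footprint. The cost of a brute-force similarity comparison grows the same way.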

One Embedding Per Video vs Multiple Embeddings:

Homogeneity of Content: If the video discusses multiple unrelated topics, multiple embeddings may be better.
Length of Videos: Longer videos could contain several different themes or points, making it more sensible to generate multiple embeddings.
Downstream Applications: If you aim to capture the essence of the entire video for high-level classification, one embedding might be sufficient. For fine-grained analysis, such as finding the segment where a particular topic is discussed, you might want multiple embeddings.
Computational Resources: Generating and storing multiple embeddings per video requires more compute and storage, growing roughly in proportion to the number of segments per video.
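
The two options can be sketched side by side, assuming each video is represented by its transcript text. Everything here is hypothetical scaffolding: `toy_embed` is a stand-in that hashes words into a small vector so the example runs without a model, and `chunk_words`, `embed_video`, and `mean_pool` are illustrative helper names. In practice you would swap in a real embedding model and probably split on time or topic boundaries rather than a fixed word count.

```python
import hashlib
import math
from typing import Callable, List, Optional

Vector = List[float]


def toy_embed(text: str, dim: int = 8) -> Vector:
    """Placeholder embedder (an assumption, not a real model).

    Hashes each word into one of `dim` buckets and L2-normalises the counts.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def chunk_words(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed_video(
    transcript: str,
    embed: Callable[[str], Vector],
    max_words: Optional[int] = None,
) -> List[Vector]:
    """Return one embedding for the whole transcript, or one per chunk.

    With max_words=None the whole transcript becomes a single vector.
    """
    if max_words is None:
        return [embed(transcript)]
    return [embed(chunk) for chunk in chunk_words(transcript, max_words)]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average per-chunk vectors into one video-level vector (needs at least one)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


if __name__ == "__main__":
    transcript = "intro to cooking pasta then a long digression about bicycle repair"
    single = embed_video(transcript, toy_embed)               # one vector per video
    multi = embed_video(transcript, toy_embed, max_words=4)   # one vector per chunk
    print(len(single), len(multi))
    print(mean_pool(multi))  # pooled summary if a single vector is also needed
```

Keeping per-chunk vectors and pooling them only when needed serves both the high-level and the fine-grained use case, at the cost of storing several vectors per video.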