Monday, August 31, 2020

How Milvus Realizes the Delete Function

This article explains how Milvus implements the delete function. A much-anticipated feature for many users, the delete function was introduced in Milvus v0.7.0. Instead of calling remove_ids in FAISS directly, we came up with a brand new design that makes deletion more efficient and supports more index types.

In FAISS, deleting an ID and its corresponding vector requires going through the whole dataset to determine which vectors to remove (facebookresearch/faiss). Frequent calls to remove_ids therefore degrade system performance significantly and make it impossible to delete and search at the same time. Furthermore, to delete data that has already been flushed to disk, you need to load it into memory, apply the deletion, and then flush it back to disk. This is prohibitively expensive in terms of system resources and is not viable for a production environment. In addition, FAISS supports deletion only for the FLAT, IVF_FLAT, and IDMAP index types. Our goal for Milvus is to support deletion not only for all CPU and GPU indexes in FAISS, but also for other approximate nearest neighbor search (ANNS) libraries going forward. Therefore, we had to design a new delete function for Milvus.
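To make the cost of a scan-based delete concrete, here is a minimal pure-Python sketch. It is not FAISS code and does not use the FAISS API: ToyFlatIndex, its rows_scanned counter, and the sample data are hypothetical names made up for illustration. It only mimics the behavior described above, where removing even a single ID means examining every stored vector and compacting the rest.

```python
from typing import List, Set


class ToyFlatIndex:
    """Hypothetical stand-in for a flat vector store (not the FAISS API)."""

    def __init__(self) -> None:
        self.ids: List[int] = []
        self.vectors: List[List[float]] = []
        self.rows_scanned = 0  # total rows examined by all removals so far

    def add(self, vec_id: int, vector: List[float]) -> None:
        self.ids.append(vec_id)
        self.vectors.append(vector)

    def remove_ids(self, to_remove: Set[int]) -> int:
        """Scan every stored row, keep the survivors, return the removed count."""
        kept_ids: List[int] = []
        kept_vectors: List[List[float]] = []
        for vec_id, vector in zip(self.ids, self.vectors):
            self.rows_scanned += 1
            if vec_id not in to_remove:
                kept_ids.append(vec_id)
                kept_vectors.append(vector)
        removed = len(self.ids) - len(kept_ids)
        # Compaction: the storage is rebuilt without the removed rows.
        self.ids, self.vectors = kept_ids, kept_vectors
        return removed


index = ToyFlatIndex()
for i in range(1000):
    index.add(i, [float(i), float(i) + 0.5])

# Deleting one ID still walks all 1000 rows.
removed = index.remove_ids({42})
```

In this toy model the work per call grows with the size of the dataset rather than with the number of IDs being deleted, which is why frequent small deletions become expensive.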



from DZone.com Feed https://ift.tt/31JWqky
