Monday, February 7, 2022

OpenAI Embeddings (and Controversy?!)


#mlnews #openai #embeddings OpenAI launches an embeddings endpoint in their API, providing high-dimensional vector embeddings for use in text similarity, text search, and code search. While embeddings are universally recognized as a standard tool to process natural language, people have raised doubts about the quality of OpenAI's embeddings, as one blog post found they are often outperformed by open-source models, which are much smaller and with which embedding would cost a fraction of what OpenAI charges. In this video, we examine the claims made and determine what it all means. OUTLINE: 0:00 - Intro 0:30 - Sponsor: Weights & Biases 2:20 - What embeddings are available? 3:55 - OpenAI shows promising results 5:25 - How good are the results really? 6:55 - Criticism: Open models might be cheaper and smaller 10:05 - Discrepancies in the results 11:00 - The author's response 11:50 - Putting things into perspective 13:35 - What about real world data? 14:40 - OpenAI's pricing strategy: Why so expensive? Sponsor: Weights & Biases https://wandb.me/yannic Merch: store.ykilcher.com ERRATA: At 13:20 I say "better", it should be "worse" References: https://ift.tt/Kja0ten https://ift.tt/wC43HmO https://ift.tt/J6ib7SI https://ift.tt/E8eFt7y https://twitter.com/Nils_Reimers/status/1487014195568775173?s=20&t=NBF7D2DYi41346cGM-PQjQ https://ift.tt/9trWCdY https://mobile.twitter.com/arvind_io/status/1487188996774002688 https://twitter.com/gwern/status/1487096484545847299 https://twitter.com/gwern/status/1487156204979855366 https://twitter.com/Nils_Reimers/status/1487216073409716224 https://twitter.com/gwern/status/1470203876209012736 https://ift.tt/dtNxoIW https://mobile.twitter.com/arvind_io/status/1488257004783112192 https://mobile.twitter.com/arvind_io/status/1488569644726177796 Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ift.tt/ZHMtbVa BitChute: https://ift.tt/5H0Uacw LinkedIn: https://ift.tt/bADK3vk BiliBili: https://ift.tt/xMr78Xe If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://ift.tt/VeZlaRO Patreon: https://ift.tt/iegP1FO Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

No comments:

Post a Comment