#Vector Search
govindhtech · 2 months ago
Text
LeanVec Improves Out-of-Distribution Vector Search Accuracy
Intel LeanVec: Conquer Vector Search with Smart Dimensionality Reduction
The previous post in this series showed that vector search is essential to many applications that need fast, accurate responses. Vector search systems often perform poorly because high vector dimensionality strains memory and compute. Cross-modal retrieval tasks are also common, such as a user providing a text query to find the most relevant photographs.
The queries in these tasks often have statistical distributions that differ from those of the database embeddings, which makes high accuracy hard to achieve. Intel's LeanVec combines dimensionality reduction and vector quantisation to speed up vector search on high-dimensional vectors while retaining accuracy on out-of-distribution queries.
Introduction
In recent years, deep learning models have become much better at producing high-dimensional embedding vectors whose spatial similarity reflects the semantic similarity of their inputs, which include pictures, music, video, text, genomics data, and computer code. This lets applications search massive vector collections for semantically meaningful results by finding the nearest neighbours of a query vector. Although similarity search has improved, modern vector indices perform poorly as dimensionality increases.
The most common indices are graph-based: directed graphs whose vertices represent dataset vectors and whose edges indicate neighbour relationships between vectors. Traversing the graph finds nearest neighbours in sub-linear time.
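The graph traversal described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used by any particular library: the function name `greedy_search`, the toy vectors, and the adjacency lists are all made up for the example, and production indices typically keep a queue of candidates rather than a single current vertex.

```python
import math

def greedy_search(vectors, neighbors, query, entry=0):
    """Walk the graph from `entry`, always stepping to the neighbour
    closest to `query`, and stop when no neighbour is closer."""
    def dist(i):
        return math.dist(vectors[i], query)

    current = entry
    while True:
        # Closest neighbour of the current vertex (or the vertex itself if it has none).
        best = min(neighbors[current], key=dist, default=current)
        if dist(best) >= dist(current):
            return current  # local minimum: no neighbour improves on the current vertex
        current = best

# Hypothetical toy index: five 2-D vectors, each linked to its adjacent vectors.
vectors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

result = greedy_search(vectors, neighbors, query=(3.2, 0.1))
print(result)
```

Each step only touches the vectors of the current vertex's neighbours, which is why the memory access pattern of graph search is random rather than sequential.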
Graph-based indices excel at moderate dimensionalities (D ≈ 100) but struggle with the dimensionalities typical of deep learning models (D ≈ 512, 768, 1536). If vectors derived from deep learning models dominate similarity search deployments, closing this performance gap is crucial.
This drop in graph search speed is caused by the system's memory latency and bandwidth, which are consumed largely by fetching database vectors from memory in a random access pattern. Vector compression is a natural way to relieve the memory pressure, but existing techniques such as PQ and ScaNN either do not compress enough or perform poorly because of irregular memory access patterns.
The Out-of-Distribution Queries Challenge
Queries are out-of-distribution (OOD) when the statistical distributions of the database vectors and the query vectors differ, which makes vector compression harder. Unfortunately, this happens often in two modern applications. The first is cross-modal search, in which a user queries in one modality to retrieve relevant items from another. In text2image, for example, a text query retrieves thematically similar pictures. The second arises when queries and database vectors are produced by different models, as in question answering.
A two-dimensional example shows why query-aware dimensionality reduction matters for maximum inner product search. A query-agnostic method such as PCA would project the database (𝒳) and query (Q) vectors onto the first principal axis of 𝒳 (large green arrow). This choice lowers the resolution of the inner products, because that direction is orthogonal to Q's principal axis (orange arrow). Furthermore, the useful direction (the second principal axis of 𝒳) is discarded.
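The two-dimensional argument can be checked numerically with a toy example. The data below is invented for illustration, and the two axes are hand-picked: `pca_axis` is roughly the direction PCA would select for this database, and `query_axis` is simply the direction along which the toy queries vary. Neither is the projection LeanVec actually learns.

```python
def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(v, axis):
    """Keep only the component of v along a unit-length axis (2-D to 1-D reduction)."""
    c = inner(v, axis)
    return (c * axis[0], c * axis[1])

def mean_abs_error(axis, queries, database):
    """Average error of the inner product after projecting both sides onto `axis`."""
    errors = [
        abs(inner(q, x) - inner(project(q, axis), project(x, axis)))
        for q in queries
        for x in database
    ]
    return sum(errors) / len(errors)

# Hypothetical data: the database varies mostly along x, the queries mostly along y.
database = [(5.0, 0.3), (-4.0, -0.5), (3.0, 0.8), (-6.0, 0.1)]
queries = [(0.1, 2.0), (-0.2, 1.5), (0.0, -1.8)]

pca_axis = (1.0, 0.0)    # direction of largest database variance (query-agnostic choice)
query_axis = (0.0, 1.0)  # direction the queries actually use (query-aware choice)

err_pca = mean_abs_error(pca_axis, queries, database)
err_query_aware = mean_abs_error(query_axis, queries, database)
print(err_pca, err_query_aware)
```

With this data the query-aware axis gives the smaller inner-product error, even though it captures far less of the database's variance.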
A Lightweight Dimensionality Reduction Method
To speed up similarity search for deep learning embedding vectors, LeanVec approximates the inner product of a database vector x and a query q.
How projection works
As shown in the figure, LeanVec down-projects query and database vectors using the linear functions DRquery and DRDB. These functions reduce vector dimensionality, whereas LVQ reduces the number of bits per entry.
LeanVec compresses each database vector x twice:
Primary vector: LVQ(DRDB(x)). Its inner-product approximation is semi-accurate.
Secondary vector: LVQ(x). Its inner-product approximation is accurate.
The graph is built and searched using the primary vectors. Intel's experiments show that graph construction is robust to LVQ quantisation and dimensionality reduction. The secondary vectors are used only during search, for reranking.
Searching the graph index with the primary vectors has three benefits. The smaller memory footprint reduces the time spent fetching vectors. The reduced dimensionality requires fewer fused multiply-add operations, which lowers the compute cost. And the approximation suits graph search's random memory-access pattern, because inner products can be computed against individual database vectors without batch processing.
To compensate for errors in the inner-product approximation, LeanVec retrieves more candidates than needed and reranks them using the secondary vectors to return the top-k. Query dimensionality reduction (i.e., computing DRquery(q)) is done only once per search, so it adds little runtime overhead.
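The two-stage search described above (approximate scoring with primary vectors, then reranking with secondary vectors) can be sketched as follows. Everything here is a simplified stand-in and should be read as an assumption: `reduce_dim` truncates coordinates instead of applying a learned linear map, `quantise` rounds to a uniform grid instead of performing LVQ, and an exhaustive scan replaces the graph traversal.

```python
def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

def reduce_dim(v, d):
    # Stand-in for DRquery / DRDB (assumption): keep the first d coordinates.
    return v[:d]

def quantise(v, step):
    # Stand-in for LVQ (assumption): snap each entry to a grid of width `step`.
    return [round(x / step) * step for x in v]

def build_index(database, d):
    primary = [quantise(reduce_dim(x, d), step=0.25) for x in database]  # LVQ(DRDB(x))
    secondary = [quantise(x, step=0.05) for x in database]               # LVQ(x)
    return primary, secondary

def search(query, primary, secondary, k, n_candidates, d):
    q_low = reduce_dim(query, d)  # query reduction, computed once per search
    # Stage 1: cheap approximate scores on the primary vectors
    # (an exhaustive scan stands in for the graph traversal).
    order = sorted(range(len(primary)), key=lambda i: -inner(q_low, primary[i]))
    candidates = order[:n_candidates]
    # Stage 2: rerank the candidates with the more accurate secondary vectors.
    reranked = sorted(candidates, key=lambda i: -inner(query, secondary[i]))
    return reranked[:k]

# Hypothetical 4-D database, reduced to d=2 for the primary vectors.
database = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.9, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.9],
]
primary, secondary = build_index(database, d=2)
top = search([1.0, 0.0, 1.0, 0.0], primary, secondary, k=2, n_candidates=3, d=2)
print(top)
```

In this toy data the primary vectors give vectors 0 and 1 the same score, so stage 1 alone cannot tell them apart. Collecting three candidates and reranking with the secondary vectors puts vector 1, the true best match, first.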
Search is an essential step in graph construction, so LeanVec's search acceleration directly speeds up graph construction as well.
LeanVec learns DRquery and DRDB from data using novel mathematical optimisation algorithms. These algorithms are computationally efficient: their execution time depends on the number of dimensions, not the number of vectors. They also take into account the statistical distributions of the database vectors and of a small sample of representative query vectors.
Findings
The results are clear. LeanVec improves the performance of SVS (Intel's Scalable Vector Search library), which outperforms the leading open-source implementation of a top-performing algorithm (HNSWlib). By reducing the memory bandwidth needed per query, LeanVec increases query throughput approximately 4-fold at the same recall (95% 10-recall@10).
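For readers unfamiliar with the recall figure quoted above, k-recall@k is the fraction of the true k nearest neighbours that appear among the k results returned. A minimal sketch, where the function name `recall_at_k` and the example ids are made up for illustration:

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k ids that appear in the first k retrieved ids."""
    return len(set(retrieved[:k]) & set(ground_truth[:k])) / k

# Hypothetical ids: the search returned nine of the ten true neighbours.
ground_truth = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
retrieved = [0, 1, 2, 3, 4, 5, 6, 7, 8, 42]

recall = recall_at_k(retrieved, ground_truth, k=10)
print(recall)
```

A 95% 10-recall@10 therefore means that, averaged over queries, 9.5 of the 10 returned vectors are true top-10 neighbours.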
Conclusion
LeanVec uses linear dimensionality reduction and vector quantisation to speed up similarity search on the high-dimensional vectors produced by modern embedding models. LeanVec excels when queries are out of distribution, as in text2image and question-answering systems.
0 notes
mikelogan · 6 months ago
Text
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
TOP 5 TUMBLR POP CULTURE MOMENTS OF 2024
@lgbtqcreators creator bingo
1K notes · View notes
valdevia · 11 months ago
Text
Shot in the dark, but I have this problem with Google where it keeps wanting to show these information panels when you search my name and getting the completely wrong person. I think Google doesn't have a trustworthy "about" page it can pull info from, so it just guesses a random person with a similar name?
First it was a biologist:
Tumblr media
Then a congressman from Argentina:
Tumblr media
And now it's a singer?
Tumblr media
I've tried giving feedback, claiming the knowledge panel, and nothing seems to work, they just switch to a new identity... I've tried linking them to my own about page and socials but they don't take any information from there either. I guess they need some kind of external "authoritative source" for something like this?
So I'm thinking the only viable solution might be to have a Wikipedia article with a name, photo and basic info so that Google stops making me steal people's identities? I don't think this is fun to any of the people with similar names to me with their own careers who keep getting their search pages invaded by my links... It's getting pretty frustrating.
Does anyone here have any experience with editing Wikipedia and can help me through this? Thank you! (if you can help me, shoot me a DM or message on Discord @ Valdevia)
439 notes · View notes
jupitercl0uds-art · 1 year ago
Text
to celebrate me using a 4k canvas for the first time, heres espio and silver caught in 4k (haha that was such an intentional joke. i meant to do that. i didnt just realise that as i was typing the file name)
Tumblr media
anyway this made me realise 1) i have never drawn espio outside of the context of espilver and 2) because ive drawn bowser before vector isnt as hard to draw as i thought
aftermath:
Tumblr media
(and a version without speech bubbles because they cover a lot and were NOT well placed at all)
Tumblr media
192 notes · View notes
pocoslip · 3 months ago
Text
Tumblr media
If Alpha Trion is just a Retool of any Old Figures, I can always get his Untransformable Figure
(I still can't think of a New Name with Prime for Trion)
24 notes · View notes
berisims · 2 years ago
Text
Tumblr media Tumblr media Tumblr media Tumblr media
Team Chaotix
189 notes · View notes
yourfathersmustache · 6 months ago
Text
Tumblr media
Neopet Advent - Activity Page
It's Christmas time over on Neopets, so I did some of the daily event images. Here's a fun activity page, harking back to kids' menus at a diner. Complete the maze, connect the ginger-dots and do a word search!
15 notes · View notes
kandicon · 1 year ago
Text
*writes the same exact headcannons in slightly different scenarios over and over again*
#it all comes back to my unicron-spawn Starscream and my quintesson-built Jazz#today I worked a little on us Starscream and qb Jazz becoming friends and getting a absurdly similar dynamic to how I write Prowl and Jazz#but I stopped that to work on a memory loss fic w that Jazz fighting his way from autobots to Starscream bc he was the only one who he#trusted with a complete memory back up as another not-cybertronian#and I stopped THAT to work on a qb Jazz/Prowl fic where it's non-essential no pain killer surgery that Prowl has to do on Hazx bc he refuses#to go to medics. partially bc the surgery is completely unsafe in any firm and partly bc qb Jazz doesn't want anyone else to know what he is#(and Prowl barely knows either)#but I only got a few sentences into that b4 I went to do an Autobot!DJD (AJD?) torture scene w qb Jazz where the nameless character to die#manages to tear open his chest while fighting back and finds nothing inside#BUT that's rlly similar 2 a fic where I've done the same thing w Starscream (the chest discovery in a scuffle bit) so I reread that before#I got distracted thinking abt my Starop fic that's all Starscream doesn't have a spark because he's a ghost Optimus Prime doesn't have a#spark because he's a lab experiment gone rogue. Misunderstandings ensue. which I adore but have no idea how to fit a plot into#so bc I couldn't think of anything more than a few sentences for that I went to my fic where ALL of the command trine formed from Unicron#but Skywarp and Thundercracker died early and Starscream spends millions of years searching all of cybertron and hoping Vector Sigma#reincarnation works for unicronians too. biiiig depression angst fic. 
I can't decide if I want it to end in Starscream self-inducing stasis#in one of Vector Sigma's chambers or whether I want it to end w Starscream brutally murdering the new trine member the reincarnated versions#of Skywarp and Thundercracker were made with (who ftr would be Sun Storm)#n that fic reminded me of that one rewritting of the Starscream's Ghost ep where Starscream catches a glimpse of Scourge and immediately#attacks. it's barely a fight because in seconds SS is ripping through layers of armor desperately searching for Thundercracker beneath the#shell Unicron gave him. He needs Thundercracker to be there (he isn't). Only when his claws have gone completely thru Scourge's back does he#round on the armada- only to completely ignore Cyclonus and go for one of his clones (Skywarp)#and that reminded me of- *gunshots*#do u see why I only ever manage to post ponies?? I have less ideas w them so I actually finish.#I'm worried of hitting tag limit but I have plenty more of even less fleshed out fics for us Starscream and qb Jazz#(I barely said half of what's in my writing docs)
50 notes · View notes
supercantaloupe · 6 months ago
Note
Bovine TB is 100% a thing that can and will kill you if you get it from drinking unpasteurized milk that a cow that contracted TB made. Yes, farmers can get it from being around infected cows and breathing it in but it is also super easily transmittable through cow milk.
oh, i see the confusion. bovine tuberculosis is a different disease caused by a different pathogen from regular tuberculosis. the latter is not transmitted through cattle or raw milk but the former (which you can probably guess by the name) is. they're similar but not the same.
3 notes · View notes
rodhunt · 10 months ago
Text
Christmas Search and Find Illustration by Rod Hunt
Tumblr media Tumblr media
Rod Hunt was commissioned to create a Where's Waldo / Wally? style Christmas search and find illustration for Britannica's What on Earth? magazine. Can you find all the hidden objects and the festive word in this detailed seasonal scene?
This illustration is available to licence; please contact Rod for details.
© Rod Hunt 2024
rodhunt.com
6 notes · View notes
themetallicnemesis · 2 years ago
Text
Before i actually watched a Forces gameplay, i used to think that the Chaotix in the game would be like the in-game merchant characters
Like yeah the world just got taken over and here they are making a quick buck on the side selling you used wispons and upgrades.
Charmy every once in a while lets slip something along the lines of "Wow! You're buying THAT one?? REALLY? Cause the last guy that used it-" while Vector desperately tries to shush him and Espio shakes his head in disappointment in the background
Yeah i was so wrong 😔
15 notes · View notes
wemlygust · 2 months ago
Text
@strangerinthesecretforest, who commented about trying to find a library book they saw this image in. Reblogging and tagging you instead of replying in the comments because this is long, there are links others might be interested in, and I also wanted better formatting options. This image was uploaded to Wikimedia Commons as "own work" by Wikimedia user McGeddon here: https://commons.wikimedia.org/wiki/File:Survivorship-bias.png, and the current version of the file - after other Wikimedia Commons users edited it a bit more and converted it to a vector image - is here
The description on the older version of the file says, "Illustration of hypothetical damage pattern on a WW2 bomber, dot pattern roughly based on that given at http://www.motherjones.com/kevin-drum/2010/09/counterintuitive-world which gives credit to Cameron Moll. This file was derived from: Lockheed PV-1 Ventura BuAer 3 side view.jpg".
Lockheed_PV-1_Ventura_BuAer_3_side_view.jpg is the airplane diagram without the dots the Wikimedia Commons user apparently added themself.
The description of the newer version of the file says, "Illustration of hypothetical damage pattern on a WW2 bomber. Loosely based on data from an unillustrated report by Abraham Wald (1943), showing that a similar plane survived a single hit to the engine 60% of the time, but a hit to the fuselage or fuel system closer to 95% of the time. Picture is based on US Air Force "hit plots", such as this F-4 hit plot published in 1991. New version by McGeddon based on a Lockheed PV-1 Ventura drawing (2016), vector file by Martin Grandjean (2021)."
Here is the F-4 hit plot that the description mentions. It looks cool: https://apps.dtic.mil/sti/pdfs/ADA245827.pdf.
Anyway, tldr, since this is an image uploaded to Wikimedia Commons in 2016 by its creator (presumably for use in the survivorship bias Wikipedia article) under a Creative Commons Attribution-Share Alike 4.0 International license, this image appears to have been used by a loooot of different people, all over the place. (Many of whom appear to be totally ignoring the "Attribution-Share Alike" part of that license. Maybe because some people assume Wikipedia images are always public domain, though CC-BY-SA licenses actually seem to be more common, at least in my anecdotal experience.)
So I think it probably appears in multiple books that you could have seen it in. But, it was uploaded in 2016, so you can at least rule out anything published before that date (unless one was re-printed with additions after that date).
I’m so glad that things like survivorship bias and statistical outliers became memes. I wish more critical thinking skills would become widely understood this way. I’m not kidding, let’s get on this
26K notes · View notes
mobmaxime · 1 month ago
Text
0 notes
bumblevoid · 5 months ago
Text
someone stop me before i write whump to procrastinate my calculus homework
0 notes
azadarch · 7 months ago
Text
Tumblr media
What kinds of work can you do after learning web development?
0 notes
successivetech22 · 1 year ago
Text
How Vector Search Transforms Information Retrieval?
Tumblr media
Vector search revolutionizes information retrieval by representing data as high-dimensional vectors, allowing for more nuanced and accurate searches. Unlike traditional keyword searches, vector search captures semantic relationships, enabling the retrieval of contextually relevant information even when exact keywords are absent. This enhances the effectiveness of searches across various applications, including natural language processing and recommendation systems.
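A toy example makes the contrast with keyword search concrete. The three-dimensional "embeddings" below are invented by hand for illustration; a real system would obtain them from an embedding model. The query shares no exact keyword with the bicycle documents, yet cosine similarity still ranks them above the unrelated one.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical hand-made embeddings (a real system would use a trained model).
documents = {
    "How to fix a flat bicycle tyre": (0.9, 0.1, 0.0),
    "Best pasta recipes": (0.0, 0.2, 0.9),
    "Repairing a punctured bike wheel": (0.8, 0.3, 0.1),
}
query_text = "mend cycle puncture"
query_vec = (0.85, 0.2, 0.05)  # assumed embedding of query_text

# Rank documents from most to least similar to the query.
ranked = sorted(documents, key=lambda title: cosine(query_vec, documents[title]), reverse=True)
for title in ranked:
    print(title)
```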
0 notes