#multimodal
Explore tagged Tumblr posts
Text
Sotrigender Pride Flag
Sotrigender or tritrisogender/trisotrigender: trimodal trigender in which someone is iso, trans, and cis; being trisogender as a result of being trigender; or being trigender as a result of being trisomodal.
#ap#sotrigender#trigender#trimodal#trisomodal#multimodal#gender modality#multimodality#trimodality#genders#neogender#gender umbrella#mogai coining#liom coin#pride flag#isogender#transgender#trans#cis#iso#cisgender#trisgender#isotrans#isocis#tris#trismodal#trisogender
38 notes
Text
Pegasus 1.2: High-Performance Video Language Model

Pegasus 1.2 brings high accuracy and low latency to long-form video AI. This commercial tool supports scalable video querying.
TwelveLabs and Amazon Web Services (AWS) announced that Amazon Bedrock will soon offer Marengo and Pegasus, TwelveLabs' cutting-edge multimodal foundation models. Amazon Bedrock is a managed service that lets developers access top AI models from leading organisations through a single API. With seamless access to TwelveLabs' video-understanding capabilities, developers and companies can transform how they search, analyse, and derive insights from video content, backed by AWS's security, privacy, and performance. AWS will be the first cloud provider to offer TwelveLabs' models.
Introducing Pegasus 1.2
Unlike many academic settings, real-world video applications face two challenges:
Real-world videos can range from seconds to hours in length.
Proper temporal understanding is required.
To meet these commercial demands, TwelveLabs is announcing Pegasus 1.2, a substantial upgrade to its industry-grade video language model. Pegasus 1.2 interprets long videos at state-of-the-art levels. With low latency, low cost, and best-in-class accuracy, the model can handle hour-long videos. Its embedded storage caches indexed videos, making it faster and cheaper to query the same video repeatedly.
Pegasus 1.2 delivers business value through an intelligent, focused system architecture and excels in production-grade video processing pipelines.
Superior video language model for extended videos
Businesses need to handle long videos, yet processing time and time-to-value are important concerns. As input videos grow longer, a standard video processing/inference system cannot cope with the orders-of-magnitude increase in frames, making it unsuitable for general adoption and commercial use. A commercial system must also answer prompts and queries accurately across longer time spans.
Latency
To evaluate Pegasus 1.2's speed, TwelveLabs compared time-to-first-token (TTFT) on 3–60-minute videos against the frontier model APIs GPT-4o and Gemini 1.5 Pro. Pegasus 1.2 shows consistently low time-to-first-token latency for videos up to 15 minutes and responds faster on longer material, thanks to its video-focused model design and optimised inference engine.
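Time-to-first-token can be measured generically as the delay between sending a request and receiving the first streamed token. This is an illustrative helper for any token stream, not TwelveLabs' benchmarking code.

```python
import time

def time_to_first_token(stream):
    """Return seconds elapsed until the first token arrives from a streaming
    response, or None if the stream produces nothing."""
    start = time.perf_counter()
    for _ in stream:
        # The first iteration corresponds to the first token arriving.
        return time.perf_counter() - start
    return None
```

It works with any iterator of streamed tokens, so the same helper can time different model APIs for a like-for-like TTFT comparison.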
Performance
Pegasus 1.2 is compared to frontier model APIs on VideoMME-Long, a subset of Video-MME containing videos longer than 30 minutes. Pegasus 1.2 outperforms all flagship APIs, demonstrating state-of-the-art performance.
Pricing
Pegasus 1.2 provides best-in-class commercial video processing at low cost. Rather than trying to do everything, TwelveLabs focuses on long videos and accurate temporal information. This focused approach lets its highly optimised system perform well at a competitive price.
Better still, the system can run many video-to-text generations at little extra cost. Pegasus 1.2 produces rich video embeddings from indexed videos and stores them in its database for future API queries, allowing clients to build on them continually at low cost. Google Gemini 1.5 Pro's context cache costs $4.50 per hour of storage for 1 million tokens, roughly the token count of an hour of video; over a month that comes to about $3,240 per video-hour. TwelveLabs' integrated storage costs $0.09 per video-hour per month, about 36,000× less. This particularly benefits customers with large video archives who need to understand everything they hold at low cost.
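The 36,000× figure follows from converting Gemini's per-hour cache price to a monthly cost. A back-of-envelope check, using the figures from the post (the 24 h × 30 day monthly conversion is an assumption):

```python
# Back-of-envelope version of the pricing comparison quoted above.
gemini_cache_per_hour = 4.50         # $ per storage-hour for ~1M cached tokens (~1 h of video)
hours_per_month = 24 * 30            # ~720 storage-hours in a month
gemini_per_video_hour_month = gemini_cache_per_hour * hours_per_month  # about $3,240
pegasus_per_video_hour_month = 0.09  # $ per video-hour per month (embedded storage)
ratio = gemini_per_video_hour_month / pegasus_per_video_hour_month     # about 36,000x
print(round(ratio))
```

The gap comes almost entirely from the billing unit: per storage-hour versus per month.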
Model Overview & Limitations
Architecture
Pegasus 1.2's encoder-decoder architecture for video understanding comprises a video encoder, a tokeniser, and a large language model. Though efficient, its design allows full analysis of textual and visual data.
These components form a cohesive system that can understand both long-term context and fine-grained detail. The architecture shows that small models can interpret video well through careful design decisions and creative solutions to fundamental multimodal processing challenges.
Restrictions
Safety and bias
Pegasus 1.2 includes safety protections, but like any AI model it can produce objectionable or harmful material without adequate oversight and control. Safety and ethics for video foundation models are still being studied; TwelveLabs will publish a full assessment and ethics report after further testing and feedback.
Hallucinations
Pegasus 1.2 may occasionally produce incorrect findings. Despite improvements over Pegasus 1.1 in reducing hallucinations, users should be aware of this limitation, especially for tasks requiring precision and factual accuracy.
#technology#technews#govindhtech#news#technologynews#AI#artificial intelligence#Pegasus 1.2#TwelveLabs#Amazon Bedrock#Gemini 1.5 Pro#multimodal#API
2 notes
Text
any experienced multimodal analysts have any Thoughts on ELAN? i'm on a hunt for a mac-compatible software for annotating vids that allows for a customizable coding scheme. lots of the ones i've seen are for conversation analysis -- which is awesome, but not aligned with my needs
#michelle's thesis#yes a new tag lol#gradblr#ELAN#multimodal#LOL i have no idea what to tag this so ppl can see it#conversation analysis#studyblr#research#phdblr#graduate school#grad student#grad school#grad studies#救命#for context i'm analyzing long-form video essays -- a descriptive sort of component descriptive analysis?#so the often crazy and chaotic multimodal/semiotic entanglements ... warrant a software#personal
2 notes
Text
Content of Multimodality

The image attached above is the graphic I created as a multimodal resource. It displays the eight concepts of rhetoric, serving as a guide to the complexities of writing: specifically, how multiple variables influence the writer's literary technique and the viewer's receptive perception. Arranged in a well-orchestrated diagram, the graphic shows the viewer how each concept relates to the others, illustrating that rhetoric isn't effective if one piece is missing from the "symmetric" image. The definitions were added as "mini notes" for each concept of rhetoric, for people like me who may be unfamiliar with one or two terms. As someone who never really knew what a discourse community was, I found the graphic helpful for remembering its premise through a memorable layout.
5 notes
Text
Shape: E (Multimodal, Roughly Symmetrical)
25K notes
Text
SciTech Chronicles. . . . . . . . .Mar 24th, 2025
#genes#chromosome#hippocampus#myelin#Aardvark#multimodal#GFS#democratising#cognitive#assessment#Human-AI#Collaboration#E2A#192Tbit/s#Alcatel#2028#genomics#bioinformatics#antimicrobial
0 notes
Text
oops I missed this the first time 😭
Shape: Multimodal (?), Skewed Left
and did you have to look it up
24K notes
Text
Nanoplatform strategy for multimodal cancer therapy and tumor visualization
Announcing a new publication for Acta Materia Medica journal. Treatment of cancer can be challenging, because of the disease’s intricate and varied nature. Consequently, developing nanomedicines with multimodal therapeutic capabilities for precise tumor therapy holds substantial promise in advancing cancer treatment. Herein, a nanoplatform strategy involving supramolecular photosensitizers…

View On WordPress
0 notes
Text
INTERFACES AND THE MULTIMODAL APPROACH: ONE LIKELY FUTURE FOR AI
Reading time: 6 minutes (+ videos). Keywords: AI, multimodal, Proof of Concept, architecture, teaching, learning, failure. Number of "pages": 2. In brief: In December 2024, Google reclaimed its place in AI research while OpenAI struggled with its entropy problems. Google not only delivered a more capable Gemini, but above all a multimodal one. Probably…
0 notes
Text
Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL! | Qwen
0 notes
Text
Shape: Multimodal, Roughly Symmetrical (?)
Target audience
402 notes
Text
TRASMVTATO Magazine
#Aztec mythology#biobots#colonial-era texts#creative imagination#creativity#cultural philosophy#digital consciousness#dystopias#editorial#exoskeletons#experimental work#fragmented narratives#futuristic narratives#gender#genetic advancements#human evolution#human transmutation#identity#interdisciplinary#Jungian archetypes#Kabbalah#lunar voyage#Manuel Antonio Rivas#memory#meteorite impacts#Mortality#multimodal#mystical elements#nanorobotics#Oumuamua
0 notes
Text
Discover Project Astra, powered by Gemini 2.0: a revolutionary AI assistant with cross-device intelligence, multimodal understanding, and seamless adaptability.
#ortmoragency#deliveringdigitalhappiness#artificialintelligence#AIRevolution#FutureOfAI#AIInnovation#smarttechnology#AIFuture#MachineLearning#IntelligentSystems#AIForGood#TechEvolution#ProjectAstra#Gemini#revolutionary#nextgenassistant#multimodal#smartadaptation#seamlesstechnology
0 notes
Text
Shape: Multimodal, Skewed Right
No im not putting a fucking graduated cylinder contraptions only
#s#poll#multimodal#skewed right#I have a therory about why the distribution is the way it is but I shan't say
181 notes
Text




Week 9, October 23rd, 2024
We’ve arrived at the ninth week of the course, past the half-way point in the semester. I was feeling comfortable in my understanding of improvisational performing, particularly with my small number of group mates. We were experimenting with combining visual and audio elements in this performance. One group mate built a MAX patch that made for some cool audio outputs, and my other group mate brought in a hand-made circuit board that also made for some cool audio!! In our first go, I was on a MIDI controller that handled the delay and the X, Y, Z planes of the visuals. I also controlled the volume of my other group mates who were producing audio. This performance was, by far, my favourite of the course. It was here that I began to develop a taste for the style and elements I enjoy in an improvisation performance.
I’ve provided some stills from the taped performance. Link to the videos: https://drive.google.com/file/d/1--fQ-ZLiJezhbJldqH9OtdidCRyhiEdc/view?usp=drivesdk
0 notes