#image-to-text ai
accessibleaesthetics · 1 year ago
Text
Image-to-Text AI
I wanted to discuss image-to-text AI, what it's good at, what limitations it has, and how you can use it to help make accessibility easier.
How It Works
To demonstrate how this works, I'm going to use the image from this post.
Tumblr media
This photo shows a sleeping kitten lying on a desk beside a computer, in between the keyboard and the mouse. There is also a corner of a frame of some sort in the upper right corner of the image. Text displays in the center of the image and reads: my coworker got her new kitten to work and the little nugget was just too tuckered out from being adorable all day.
Image-To-Text AI
Image-to-text AI is basically the exact reverse of the famous (or infamous, depending on who you ask) text-to-image AI that has taken the world by storm since early 2021. There are a ton of websites for this, some free, many not. For simplicity, I chose to use the image-to-text feature built into Microsoft Word.
When I paste an image into a Word document, the program automatically generates alt text for it using Microsoft's AI. You can view this alt text in the Alt Text panel when editing the document. It will add "Description automatically generated" to the end of the alt text for transparency though, so if you want to keep the alt text it made, make sure to delete that. You can also edit the alt text directly to make it more accurate.
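If you copy a lot of generated alt text out of Word, deleting that note by hand gets tedious. Here is a minimal Python sketch of how it could be stripped automatically. The function name and the sample string are made up for illustration, and the sketch assumes the note sits at the very end of the alt text, possibly after a blank line.

```python
# The transparency note Word appends to auto-generated alt text.
SUFFIX = "Description automatically generated"


def clean_generated_alt_text(alt_text: str) -> str:
    """Remove the trailing transparency note, if present (hypothetical helper)."""
    text = alt_text.strip()
    if text.endswith(SUFFIX):
        # Cut the note off and drop any whitespace or blank lines left before it.
        text = text[: -len(SUFFIX)].rstrip()
    return text


# Hypothetical sample input, not real output from Word.
sample = "A cat sitting on a chair.\n\nDescription automatically generated"
print(clean_generated_alt_text(sample))
```

Alt text that does not end with the note is returned unchanged, so it should be safe to run on descriptions you have already edited by hand.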
Tumblr media
Microsoft's AI came up with "A kitten sleeping on a desk next to a computer mouse." Honestly, not a bad description at all, except it's missing one important thing: the text overlaying the image. This is because Microsoft's image-to-text AI, like many AIs of this kind, does not have the ability to transcribe text directly from the image. However, there is a technology that can.
Optical Character Recognition (OCR)
Optical character recognition, or OCR, is a technology that dates back to the 1970s, possibly earlier depending on how you define it. While its applications and accuracy have grown extensively since then, the core function remains the same: recognizing text in an image and transcribing it into a true text format.
I took the photo from the previous section and put it into a Free Online OCR Image To Text Converter.
Tumblr media
It recognized there was text on the image and transcribed it exactly. Very useful, but it doesn't give us any info about the actual image outside of that.
Limitations
Now, the examples I used above were kind of an ideal situation. AI is not as good with more complex images. For example, I tried putting in a screenshot of a tweet from nym™ (@aretteepls) with a photo of The Sphere at the Venetian Resort in Las Vegas. It is currently displaying an image of SpongeBob's face that fills the entire globe and glows very brightly, turning the night sky's clouds a tinge of yellow. Above the photo, the actual tweet says: The sky is turning yellow because of Spunch Bob.
Tumblr media
Microsoft's image-to-text AI came up with "A screenshot of a phone." Definitely much less impressive than our first example, but AI is only as good as the data it's trained on. Things like "screenshot of a phone" or "screenshot of a computer" are not uncommon when AI recognizes that you're giving it a screenshot of something on a screen, but can't make heads or tails of what's in it beyond that. And once again, it has no OCR capabilities, so none of the text on the image is transcribed.
But even OCR isn't infallible. The output for this image from that same website I used earlier would be:
nym ,M @aretteepls The sky is turning yellow because of Spunch Bob
The trademark symbol is kind of faint on the screenshot, so the OCR struggled with making that out, transcribing it as "comma M" instead. The less clear the text is visually, the less accurate the OCR output is going to be.
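If you want a rough number for how far an OCR result drifted from the real text, Python's built-in `difflib` can compare the two strings. This is only a sketch: the "expected" string below is my own typed-out version of the tweet's text (its exact layout is an assumption), and `ocr_similarity` is a made-up helper name.

```python
from difflib import SequenceMatcher


def ocr_similarity(expected: str, ocr_output: str) -> float:
    """Return a similarity ratio between 0 and 1 for the two strings."""
    return SequenceMatcher(None, expected, ocr_output).ratio()


# What a human would transcribe (assumed layout) versus what the OCR returned.
expected = "nym™ @aretteepls The sky is turning yellow because of Spunch Bob"
ocr_output = "nym ,M @aretteepls The sky is turning yellow because of Spunch Bob"

score = ocr_similarity(expected, ocr_output)
print(f"similarity: {score:.2f}")
```

A ratio of 1.0 means the strings are identical. Anything lower is a hint that the transcription needs a human pass, which it pretty much always does anyway.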
What Do We Do With This?
AI is best when used in conjunction with human aid, and image-to-text AI is no exception. I think the best way forward with this technology is to use generated descriptions as a starting point, not a replacement for human-written ones. And of course, we need to be careful about which programs we use to generate the descriptions, especially with art. Programs like ChatGPT have image-to-text functions, but there is no guarantee that an image you upload to it for that purpose will not be used to train its text-to-image AI as well.
Unfortunately, the more ethically sourced a training dataset for AI is, the more limited it will be compared to its less ethically sourced counterparts.
But there are legal precedents being put in place around this, and many text-to-image AI programs now have explicit and detailed terms of service for what you can and can't do with their output, as well as what you should be uploading as input.
So, for the time being, be very cautious with how you use this technology especially when describing others' art. And even with your own art, read through terms and conditions before uploading your work to a website. I think the Microsoft Word one is fairly safe though.
I also think it would be great if someone developed an image-to-text AI that could incorporate OCR to make the end result more informative.
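Until a tool like that exists, the two outputs can be stitched together by hand or with a few lines of code. Below is a rough Python sketch of the idea. It assumes you already have a generated caption and an OCR transcription as plain strings; the function name and the "Text on the image reads:" wording are my own choices, not part of any existing program.

```python
def build_description(caption: str, ocr_text: str = "") -> str:
    """Join a generated caption and OCR text into one draft description."""
    caption = caption.strip()
    # OCR output often contains line breaks; collapse them into single spaces.
    ocr_text = " ".join(ocr_text.split())
    if not ocr_text:
        return caption
    # Make sure the caption ends as a sentence before appending the transcription.
    if caption and not caption.endswith((".", "!", "?")):
        caption += "."
    return f"{caption} Text on the image reads: {ocr_text}".strip()


draft = build_description(
    "A screenshot of a phone",
    "The sky is turning yellow\nbecause of Spunch Bob",
)
print(draft)
```

The result is still only a starting point. A person should read it over, fix whatever the caption or the OCR got wrong, and add the details neither of them caught.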
23 notes
cowsabungus · 2 months ago
Text
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
If you disagree just block me :)
1K notes
amalasdraws · 2 years ago
Note
https://www.tumblr.com/bigmammallama5/732632789726478336?source=share do you have any tips on how to detect ai and deepfakes?
Good question and I'm gonna be honest, it's not always easy and it will only get harder and harder. I'm just an artist who has spent their personal time to dive into this topic and study images. I'm still learning and there is a lot I don't know. But let me show what I know. This will be long, but I will make a summary at the end! So far, even with ai having become better and better there are still almost always some things wrong with an image, and they all have a very specific look to them. So let me try to show you some and point out some of them.
As we all know, one of the biggest struggles ai had was hands. And even though we still see messed up hands here and there, I say "had" because hands are actually a good example of how ai is improving and will only get better. Still, looking at pictures that show several hands is always worth it, because somewhere in the back there will most likely be at least one messed up hand.
Another issue a lot of ai still has is hair though!
Tumblr media Tumblr media Tumblr media
It's very obvious still in many ai "drawings" and in those otherwise well rendered portraits. Hair starts to blend with the ears a lot, or with the clothes.
There is also often this very odd look between something too sharp and way too blurry
Tumblr media Tumblr media Tumblr media
There is often a very specific texture to the hair. I actually do not know the artistic or specific name for it. I can only describe it as this weird sharp feeling that makes it look oddly pixely, and then you have areas where it's very blurry. And the kind of loops and almost flame like looking hair we see in the last pic out of the three here is also something very common with ai.
As an artist I know we make mistakes too! The way I draw hair is flawed too! But it's not only that it's flawed here. It always follows the same pattern and falls into the same issues over and over again, no matter who is "creating" the image. Those flame-like loops are a common one, next to the odd blends and weird sharp and blurry textures.
But ai is getting better, and we not only have "art" and something that tries to be a drawing/painting, but photos too.
Tumblr media Tumblr media Tumblr media Tumblr media
A lot of those "photos" have a very specific texture and look to them! Again, it's not always the mistakes, but the very specific look too. A lot of the images are oddly smooth, too rendered, with always blurry backgrounds. And when you look closer at the background you will see the mistakes! The crowd behind Jesus is a hot mess once you look closer. Bob Marley's hair has the same issue I described before. Lincoln is surrounded by people with messed up hands and don't even get me started on the faces behind Caesar.
So a lot of ai images look alright at a first, quick glance, but the more time you spend with them, the more mistakes you will notice. The Where's Waldo of ai horror.
And those "photos" shared here are still very obvious. Not just the mistakes and messed up details but the very specific aesthetic too.
Those images get better and better, and the fewer details there are, the fewer mistakes there are!
Tumblr media Tumblr media Tumblr media
With photos like this it becomes harder and harder. There are not many details and no hands. Not many mistakes can be made. Also the very obvious plastic-looking smoothness isn't so much here anymore. It kinda still is...but differently. And always the blurry background!! Sometimes the hair is still a giveaway. Collars and clothing straps are also often still a giveaway upon close look. As is jewelry. Earrings will be different and necklaces often don't go all the way around, just end, or blend with the hair or clothes.
Tumblr media Tumblr media
Often details on jewelry are also blurry and not shown properly. This is a trick with many details: jewelry, badges, hair, ears, text. They are often blurred out and not shown properly because ai doesn't really know what to show there.
Tumblr media Tumblr media
It's often really just the small details and when we scroll down quickly we will miss them. Like the wedding ring on the middle finger, the pens on top of a closed pocket, the badges that are always blurry, messed up faces that blend with a blurry background.
And sometimes it's so subtle that I could only really tell the one on the right is the ai image by comparing it to the real photo on the left. The real photo shows hands clearly and even when things are blurred out it doesn't feel like it's done to hide things. The ai image on the right hides the hands. There is also a very dead look in the eyes :D
Tumblr media
And here I could only tell because the text in the back doesn't make sense. Even blurred out we should be able to make out something here
Tumblr media
And after seeing a lot of ai images I recognize the kind of blurred out bg in combination with a very smooth and well rendered foreground/characters.
And here the only giveaway is a closer look at the backgrounds as well
Tumblr media
To summarize it:
Ai and fake news rely on a fast living world. We are being bombarded with tons of information and messages daily and we scroll past quickly. But the best tool, for now, in detecting ai is taking our time! Those images get better and better but so far there are still always some things off!! Especially in the background!
Hair. Often weirdly smoothed out and oddly sharp at the same time
Hair often blends with the ears or the clothes
Details are blurred out.
Jewelry doesn't match (example earrings). Details on metal often blurred out and never shown. Necklaces blend with hair or the clothes, and don't go around the neck.
Background is always blurred out.
In this blurred mess there are often hidden very messed up faces and/or hands.
A very specific smooth and yet too sharp/too rendered aesthetic combined with an always blurry bg.
Text, especially in the background, is not legible and doesn't make sense.
Backgrounds are often (so far) the dead giveaway. Somewhere in the back things become muddled and messed up. This also shows very well in ai decor/architecture. There will be odd lines that don't align or align too well. Curtain poles that end in the furniture, a plant that is behind a lamp suddenly having leaves in front of the lamp. The longer you look, the more you will notice.
Tumblr media Tumblr media
Conclusion:
Take your time with images! Sit with them! Especially when it's framed as important and political news. Is it ai and propaganda, or did it really happen? Don't fall for the quick buzz and outrage! Some things are obvious right away but with others you have to take your time. And it's time you have! If you are still unsure if a pic is real or not, do some research on top. Reverse image search. Can you find it anywhere else? Are other news outlets sharing it? Does the image/message make sense? For example there is now a deepfake of Bella Hadid voicing support for Israel. Ask yourself, does this make sense? If it feels out of line compared to previous behavior, do some research! Media literacy is not just being able to recognize a fake or a real image right away, but being able to do research. To question things! Don't just take every post online at face value. Even when shared by a mutual you trust. They might have been tricked!
There is so much information online and it's great to have access to so much information, but it's also difficult to wade through all of it. Media and truth are a weapon, and they are being twisted and bent to manipulate. They always have been! But with ai and so many people being able to post and share things, it becomes bigger and bigger and more dangerous. So don't just take everything that is handed to you and share it further, no questions asked. Media literacy and being able to think for ourselves and do the research is important!! And as research becomes harder and harder, as sources are being messed up with ai and other fake news, it's even more important to sit with the images and study them. See the flaws, the mistakes. Compare them to other news and images.
This got long, and I started to ramble at the end. Sorry! But I hope this helped
6K notes
keepcalmandwritefiction · 1 year ago
Text
Tumblr media
The allure of AI entices those people who fetishize ideas but dismiss the work. They're the people who tell writers, "I'll give you the idea, then you write it, and we'll split the profits." For them, the vision is everything, and the work is just an annoying obstacle. But the WORK is everything. The work is how a thing happens, where it's made, where skill is put to work. AI in creativity is for the people who have no skill, no work, no effort, no ethic. They just want to push a button.
– Chuck Wendig
5K notes
kittybroker · 11 months ago
Text
btw if a cat picture looks AI generated (and there is consensus in the notes that it's AI) I probably won't reblog them. There is no life in those things. I only sell the real deal here. Highest quality kitties around.
1K notes
roboute-guilliman · 2 months ago
Text
Official statement
Due to recent events, I feel I should make something clear. Abominable Intelligence (AI) is forbidden in the Imperium. While I do not intend to unleash the Inquisition onto anyone, I want to reiterate to anyone following me that I will not knowingly reblog AI art or chat logs, or respond to such things being sent into my inbox. These things are not sanctioned by Terra.
393 notes
bendedspoon · 6 months ago
Text
Tumblr media
I fucked up the ivantill drawing DRAMATICALLY so y'all can have the zestiest drawing I've ever made instead (IVANTILL IS COMING I JUST NEED TO REDO IT)
219 notes
shadowshavecolor · 3 months ago
Text
Tumblr media
Skinky bow-leggy girl (affectionate)
89 notes
good-advice-ganondorf · 1 year ago
Note
mr ganondorf dragmire sir, how do i tell the difference between a regular anime styled image and an ai image?
Tumblr media
258 notes
lanaucita · 4 months ago
Text
PAPA V PERPETUA C.AI BOT WHEN???????
Tumblr media
︶︶୨୧︶︶⊹︶︶⊹︶︶୨୧︶︶
Tumblr media
77 notes
person918x · 1 year ago
Text
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
331 notes
gen-z-superheroes · 2 years ago
Text
Tumblr media
"Stopncii.org is a free tool designed to support victims of Non-Consensual Intimate Image (NCII) abuse."
"Revenge Porn Helpline is a UK service supporting adults (aged 18+) who are experiencing intimate image abuse, also known as, revenge porn."
"Take It Down (ncmec.org) is for people who have images or videos of themselves nude, partially nude, or in sexually explicit situations taken when they were under the age of 18 that they believe have been or will be shared online."
726 notes
ultrabeast01symbiont · 1 year ago
Text
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
please be nice to ryuki when he comes out
237 notes
thisisallai · 3 months ago
Text
Tumblr media
The Sistine Chapel if it was painted by Vincent Van Gogh
41 notes
nekomiras · 1 year ago
Text
Tumblr media
Alhaitham in an Art Nouveau inspired style. Here's a thread I wrote about this concept on Twitter. Below the cut will be a copy of the text, sorry if it takes a weird format on tumblr since it was initially written as a twt thread
This might not make a lot of sense to some of you but before i talk about Alhaitham and Art Nouveau i'd like to talk about Kaveh and Romanticism. The connection between Kaveh and Romanticism is easier to make, especially with characters such as Faruzan calling him a romantic
Tumblr media
The Romantic movement, as the name suggests, is very emotionally driven. It's a movement that values individualism and subjectivism, and its objective is to evoke an emotional response, most commonly feelings of sympathy, awe, fear, dread and wonder in relation to the world
Basically the artistic view of the Romantic is to represent the world while trying to say "we are hopeless in the grand scheme of things, little can we do to change the world yet the world is always changing us"
In Romantic pieces the human figure is always small compared to the setting it finds itself in. See the painting Wanderer Above the Sea of Fog by Caspar David Friedrich as an example: the human figure is central but relatively insignificant compared to the world
Tumblr media
Another thing about Romanticism is the importance of beauty. It's through it that the Romantic seeks to get in touch with their emotions and intuition, and it's through these lenses that they see the world. The Kaveh comparison should be easy to make with these descriptions
Kaveh's idle chat "The ability to appreciate beauty is an important virtue" just cements to me the idea that his romanticism is closely connected to the artistic movement. There is an argument against this connection but I'll bring it up later in the thread
Now that I used the opportunity to talk about my favorite character in a thread that wasn't supposed to be about him let's go back to Alhaitham and how to connect him to the Art Nouveau movement
But seriously, I brought up Kaveh's more obvious connection to Romanticism because the Nouveau movement was created as a direct mirrored response to the Romantic movement, and we all know how we feel about mirrored themes between these two characters
Art Nouveau is about rationality and logic. The movement was used more commonly in mass produced interior design pieces or architectural buildings, and it's a movement much more focused on functionality than on art appreciation
They also had a big focus on the natural world but in a very different way: while Romantics saw nature as a power they couldn't contend with, artists from the Nouveau used the natural as a universal symbolic theme for broad mass appeal
Flowers, leaves, branches, and complex, organic shapes are the basis of this style, the logical side of it coming from the mathematics needed to create these shapes and themes in ways that were appealing and also structurally sound
To appreciate the Art Nouveau style is to understand it is a calculated artistic movement (another reason to be salty about an AI generated image trying to emulate it) In short, this style is less about the art and more about the rationality in the mathematics to make it
Another note I'd like to point out is that I love how both Alhaitham and Kaveh have dendro visions while both movements are so nature centric in different ways, Romanticism seeing it as a subjective power and Art Nouveau seeing it as recognizable symbols
I mentioned an argument against the Kaveh comparison before: the one thing that bothers me about Romanticism is how negative it is in relation to humanity's position in the world and how that relates back to Kaveh
In the Parade of Providence it was explicitly shown how much Kaveh dislikes the idea of people seeing themselves as helpless in relation to the problems of the world
People may suffer but there is something he can do to help them and he will do it
It doesn't feel right for me to say that Kaveh fits the Romantic themes because of his suffering, in a similar sense it also doesn't feel right to me to say Alhaitham fits Art Nouveau because of his rational behaviour while he as a character is a lot more complex than that
This thread was done all in fun and love for an artistic discussion, it's not a perfect argument to connect these characters and movements
+ I haven't studied art history in a year, if anyone knows more about these movements please tell me I love learning new things
++ Really sorry if my english is bad or I sound repetitive, it's not my first language and im trying my best here
Thanks for reading
I love you, have a nice day/evening/night
116 notes
deathgamegirl · 3 months ago
Text
Tumblr media
THEY PUT MY GIRL ON THE MOON?????
25 notes