Text
The 5 Free Dataset Sources for Data Analytics Projects
In this video, I'm sharing five free dataset sources that are perfect for data analytics projects. Good data is essential for any analytics project, and these five sources will help you get started quickly.
With them, you'll be able to pull data on a variety of topics and crunch the numbers with ease. So be sure to check out the video to learn about all five!
#data science project#data analyst project#data science#free datasets#free dataset sources#free dataset sources for data analytics#free datasets for data science#free datasets for data analytics#free datasets for data analysis#free dataset sources for data science#datasets for data science beginners#dataset sources for data analytics#free dataset websites#where to get free datasets#how to get free datasets#datasets to use#datasets for projects#datasets#kaggle
2 notes
Text
Tonight I am hunting down Creative Commons-licensed pictures of specific venomous and nonvenomous snake species in order to create one of the most advanced, in-depth datasets of venomous and nonvenomous snakes, as well as a test set that will include snakes from both groups across all the species. I love snakes a lot and really, all reptiles. It is definitely tedious work, as I have to make sure each picture is cleared before I can use it (ethically), but I am making a lot of progress! I have species such as the King Cobra, Inland Taipan, and Eyelash Pit Viper among just a few! Wikimedia Commons has been a huge help!
I'm super excited.
Hope your nights are going good. I am still not feeling good but jamming + virtual snake hunting is keeping me busy!
#programming#data science#data scientist#data analysis#neural networks#image processing#artificial intelligence#machine learning#snakes#snake#reptiles#reptile#herpetology#animals#biology#science#programming project#dataset#kaggle#coding
41 notes
Text
The issue with being a computer scientist is that sometimes you'll be doing a project and halfway through it you're gonna sit down and think to yourself that maybe what you're doing is unethical
#this is about web scraping lmao#i'm making a sentiment analysis project and have arrived at an impasse#i'll start looking for open source datasets since they'd make me arrive at more or less a similar result#but my instructor just has a real clear idea of how the project can be about this year's elections so that's why i've been looking into tha#hhhhhhhh why is this area of work so full of ethical dilemmas i just wanted a degree
5 notes
Text
the way my term project is making me write an outline that includes my "approach/plan" for the project like what kinda question is that lol my plan is to follow the detailed instructions that are provided for the project? this project is very clearly and structurally defined already i may as well just copy/paste the instructions
#for reference this is for my data preparation class. all i'm doing for this project is preparing and cleaning datasets#my approach will be... get this... to follow the rubric#anyway data science masters degree going great#win rambles
3 notes
Text
This capstone project is going to be the death of me
#I’m behind on my project plan but I made some decent progress today#I think I’m going to have to refine my dataset a bit tho#I feel like I’ll have an easier time training my models if I throw out some of the images that are too ambiguous#if I can just get my models squared away before April I should be good#then I can spend those last two weeks gathering data and prepping for my presentation#💚
6 notes
Text
after nearly three years... over two hundred cases... unknown dozens of meetings... more than a thousand videos... and one last short-notice late-night data entry triage session...
i have just submitted the final dataset for my parent-child relationship video coding project for analysis
i feel like i need to smash a gameshow buzzer or something
#'project' makes this sound like it was a class assignment or an arts and crafts activity#this thing was a BEAST of a research endeavor#basically we have hundreds of hours of video footage#of parents engaging in various lab tasks with their toddlers and having different kinds of interactions#and i had to sit down and watch them and rate/quantify various aspects of their relationship/interactions#so that we can use the data for our broader research aims#yesterday i drove nearly 800 miles and then stayed up til 1:30 AM fixing the dataset#the issues were partly someone else's fault and then tbh partly the RA in charge of data entry just not doing a great job#i am a shell of a human
16 notes
Text
more people gotta learn the difference between liking someone, trusting someone, and relying on someone.
I dislike and distrust my coworker, but i know that snake can be relied on and will deliver the chart by monday if they say they will deliver it by monday. And i don't have to like them, just rely on them and they gotta be able to do the same with me. My friend who i very much like and trust is absolutely unreliable in comparison, they will cancel on me last minute or totally forget about smth we had agreed on. My neighbour who i like i don't trust bc they like to gossip and my personal stuff would not remain personal.
I mean english ain't my first or even my second language but i feel there is a very big difference and people who say 'why don't you trust them' should maybe instead ask 'why don't you rely on them'? and instead of requesting i should stop disliking someone should maybe check first if i do treat them respectfully or not.
#dunno lots and lots of feels#i've had bosses who i would figuratively like to gut like a fish but i KNOW they will forward my project to the person that needs to see it#i've had coworkers i could trust with my newborn if i had one but not with the dataset i gotta work on#so it pisses me off when i am accused of disliking someone as if that is an issue bc i know i am still polite and respectful#even though yes i don't wanna go grab coffee with them#i still very much know they are good and consistent at what they do#the other way too btw i dont need everyone to be my friend i need them to be able to work with me tho#and while this post is mostly work related it does translate to other things too#my neighbour and i can hate each other but I will be a good neighbour and help put out the flames if their place catches fire etc#bc i take it as my duty#and duty needs to be done not liked#even tho liking it also helps#etc etc#funny thing is that x and i who obviously do not like one another are a power team when it comes to getting work done#bc they feel the same about me#and others are like 'why cant you get along?'#bc we dont need to. we dislike each other and thats ok this isn't nearly as dramatic as people seem to think it is#just ranting sorry
11 notes
Text
Google Imagen 3 vs. The Competition: A New Benchmark in Text-to-Image Models
New Post has been published on https://thedigitalinsider.com/google-imagen-3-vs-the-competition-a-new-benchmark-in-text-to-image-models/
Artificial Intelligence (AI) is transforming the way we create visuals. Text-to-image models make it incredibly easy to generate high-quality images from simple text descriptions. Industries like advertising, entertainment, art, and design already employ these models to explore new creative possibilities. As technology continues to evolve, the opportunities for content creation become even more vast, making the process faster and more imaginative.
These text-to-image models use generative AI and deep learning to interpret text and transform it into visuals, effectively bridging the gap between language and vision. The field saw a breakthrough with OpenAI’s DALL-E in 2021, which introduced the ability to generate creative and detailed images from text prompts. This led to further advancements with models like MidJourney and Stable Diffusion, which have since improved image quality, processing speed, and the ability to interpret prompts. Today, these models are reshaping content creation across various sectors.
One of the latest and most exciting developments in this space is Google Imagen 3. It sets a new benchmark for what text-to-image models can achieve, delivering impressive visuals based on simple text prompts. As AI-driven content creation evolves, it is essential to understand how Imagen 3 measures up against other major players like OpenAI’s DALL-E 3, Stable Diffusion, and MidJourney. By comparing their features and capabilities, we can better understand the strengths of each model and their potential to transform industries. This comparison provides valuable insights into the future of generative AI tools.
Key Features and Strengths of Google Imagen 3
Google Imagen 3 is one of the most significant advancements in text-to-image AI, developed by Google’s AI team. It addresses several limitations in earlier models, improving image quality, prompt accuracy, and flexibility in image modification. This makes it a leading contender in the world of generative AI.
One of Google Imagen 3’s primary strengths is its exceptional image quality. It consistently produces high-resolution images that capture complex details and textures, making them appear almost natural. Whether the task involves generating a close-up portrait or a vast landscape, the level of detail is remarkable. This achievement is due to its transformer-based architecture, which allows the model to process complex data while maintaining fidelity to the input prompt.
What truly sets Imagen 3 apart is its ability to follow even the most complex prompts accurately. Many earlier models struggled with prompt adherence, often misinterpreting detailed or multi-faceted descriptions. However, Imagen 3 exhibits a solid capability to interpret nuanced inputs. For example, when given a prompt that describes several elements, the model does not simply combine them at random; it integrates the details into a coherent and visually compelling image, reflecting a high level of understanding of the prompt.
Additionally, Imagen 3 introduces advanced inpainting and outpainting features. Inpainting is especially useful for restoring or filling in missing parts of an image, such as in photo restoration tasks. On the other hand, outpainting allows users to expand the image beyond its original borders, smoothly adding new elements without creating awkward transitions. These features provide flexibility for designers and artists who need to refine or extend their work without starting from scratch.
Technically, Imagen 3 is built on the same transformer-based architecture as other top-tier models like DALL-E. However, it stands out due to its access to Google’s extensive computing resources. The model is trained on a massive, diverse dataset of images and text, enabling it to generate realistic visuals. Furthermore, the model benefits from distributed computing techniques, allowing it to process large datasets efficiently and deliver high-quality images faster than many other models.
The Competition: DALL-E 3, MidJourney, and Stable Diffusion
While Google Imagen 3 performs excellently in AI-driven text-to-image generation, it competes with other strong contenders like OpenAI’s DALL-E 3, MidJourney, and Stable Diffusion XL 1.0, each offering unique strengths.
DALL-E 3 builds on OpenAI’s previous models, which generate imaginative and creative visuals from text descriptions. It excels at blending unrelated concepts into coherent, often weird images, like a “cat riding a bicycle in space.” DALL-E 3 also features inpainting, allowing users to modify sections of an image by simply providing new text inputs. This feature makes it particularly valuable for design and creative projects. DALL-E 3’s large and active user base, including artists and content creators, has also contributed to its widespread popularity.
MidJourney takes a more artistic approach compared to other models. Instead of strictly adhering to prompts, it focuses on producing aesthetic and visually striking images. Although it may not always generate images that perfectly match the text input, MidJourney’s real strength lies in its ability to evoke emotion and wonder through its creations. With a community-driven platform, MidJourney encourages collaboration among its users, making it a favorite among digital artists who want to explore creative possibilities.
Stable Diffusion XL 1.0, developed by Stability AI, adopts a more technical and precise approach. It uses a diffusion-based model that refines a noisy image into a highly detailed and accurate final output. This makes it especially suitable for medical imaging and scientific visualization industries, where precision and realism are essential. Furthermore, the open-source nature of Stable Diffusion makes it highly customizable, attracting developers and researchers who want more control over the model.
Benchmarking: Google Imagen 3 vs. the Competition
To better understand how they compare, it is essential to evaluate Google Imagen 3 against DALL-E 3, MidJourney, and Stable Diffusion. Key parameters like image quality, prompt adherence, and compute efficiency should be considered.
Image Quality
In terms of image quality, Google Imagen 3 consistently outperforms its competitors. Benchmarks like GenAI-Bench and DrawBench have shown that Imagen 3 excels at producing detailed and realistic images. While Stable Diffusion XL 1.0 excels in realism, especially in professional and scientific applications, it often prioritizes precision over creativity, giving Google Imagen 3 the edge in more imaginative tasks.
Prompt Adherence
Google Imagen 3 also leads when it comes to following complex prompts. It can easily handle detailed, multi-faceted instructions, creating cohesive and accurate visuals. DALL-E 3 and Stable Diffusion XL 1.0 also perform well in this area, but MidJourney often prioritizes its artistic style over strictly adhering to the prompt. Imagen 3’s ability to integrate multiple elements effectively into a single, visually appealing image makes it especially effective for applications where precise visual representation is critical.
Speed and Compute Efficiency
In terms of compute efficiency, Stable Diffusion XL 1.0 stands out. Unlike Google Imagen 3 and DALL-E 3, which require substantial computational resources, Stable Diffusion can run on standard consumer hardware, making it more accessible to a broader range of users. However, Imagen 3 benefits from Google’s robust AI infrastructure, allowing it to process large-scale image generation tasks quickly and efficiently, even though it requires more advanced hardware.
The Bottom Line
In conclusion, Google Imagen 3 sets a new standard for text-to-image models, offering superior image quality, prompt accuracy, and advanced features like inpainting and outpainting. While competing models like DALL-E 3, MidJourney, and Stable Diffusion have their strengths in creativity, artistic flair, or technical precision, Imagen 3 maintains a balance between these elements.
Its ability to generate highly realistic and visually compelling images and its robust technical infrastructure make it a powerful tool in AI-driven content creation. As AI continues to evolve, models like Imagen 3 will play a key role in transforming industries and creative fields.
#advertising#ai#AI Infrastructure#ai tools#applications#approach#architecture#Art#artificial#Artificial Intelligence#artists#benchmark#benchmarking#benchmarks#Capture#Collaboration#Community#comparison#competition#computing#content#content creation#creative projects#creativity#creators#dall-e#DALL-E 3#data#datasets#Deep Learning
0 notes
Text
The biggest problem with this project is that it needs a lot of memory and we don't have any PC powerful enough for it.
So while we can train with even less data, it is less than ideal... 😞
I'm not sure if we could change to PyTorch instead at this point (and I'm not sure if it is installed on that PC), but I'll try to have an alternative with YOLOv8 just in case.
We've tried YOLOv8 in the past and I remember it working well enough and even my PC could handle it (with even less data, so idk). Sadly, part of this project was to create our own neural network with Tensorflow but there isn't enough time left and it's still dying... 😭
#talking about the firerisk project#this dataset is in github and in huggingface :3 if anyone want to know#we have to train that data and apply it to predict fire risk in our city
0 notes
Text
If you're feeling anxious or depressed about the climate and want to do something to help right now, from your bed, for free...
Start helping with citizen science projects
What's a citizen science project? Basically, it's crowdsourced science. In this case, crowdsourced climate science, that you can help with!
You don't need qualifications or any training besides the slideshow at the start of a project. There are a lot of things that humans can do way better than machines can, even with only minimal training, that are vital to science - especially digitizing records and building searchable databases
Like labeling trees in aerial photos so that scientists have better datasets to use for restoration.
Or counting cells in fossilized plants to track the impacts of climate change.
Or digitizing old atmospheric data to help scientists track the warming effects of El Niño.
Or counting penguins to help scientists better protect them.
Those are all on one of the most prominent citizen science platforms, called Zooniverse, but there are a ton of others, too.
Oh, and btw, you don't have to worry about messing up, because several people see each image. Studies show that if you pool the opinions of enough regular people (how many varies by field), the result matches the accuracy rate of a trained scientist in the field.
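If you're curious what "pooling opinions" can look like, here is a minimal sketch that assumes a simple majority vote per image. That is only one possible way to combine answers; real projects may aggregate differently, and the function name and image IDs below are made up for illustration.

```python
from collections import Counter


def pool_labels(votes_per_image):
    """Return the most common label for each image and the share of volunteers who chose it.

    Assumes plain majority voting; `votes_per_image` maps a (hypothetical)
    image ID to the list of labels volunteers gave it.
    """
    pooled = {}
    for image_id, votes in votes_per_image.items():
        label, count = Counter(votes).most_common(1)[0]
        pooled[image_id] = (label, count / len(votes))
    return pooled


if __name__ == "__main__":
    example = {
        "img_001": ["penguin", "penguin", "rock", "penguin"],
        "img_002": ["rock", "rock", "penguin"],
    }
    for image_id, (label, agreement) in pool_labels(example).items():
        print(image_id, label, f"{agreement:.0%} agreed")
```

One wrong click gets outvoted by everyone else who saw the same image, which is why a single mistake matters so little.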
--
I spent a lot of time doing this when I was really badly injured and housebound, and it was so good for me to be able to HELP and DO SOMETHING, even when I was in too much pain to leave my bed. So if you are chronically ill/disabled/for whatever reason can't participate or volunteer for things in person, I highly highly recommend.
Next time you wish you could do something - anything - to help
Remember that actually, you can. And help with some science.
#honestly I've been meaning to make a big fancy thorough post about this for literally over a year now#finally just accepted that's not going to happen#so have this!#there's also a ton of projects in other fields as well btw#including humanities#and participating can be a great way to get experience/build your resume esp if you want to go into the sciences#actual data handling! yay#science#citizen science#climate change#climate crisis#climate action#environment#climate solutions#meterology#global warming#biology#ecology#plants#hope#volunteer#volunteering#disability#actually disabled#data science#archives#digital archives#digitization#ways to help#hopepunk
13K notes
Text
Publication in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
https://ieeexplore.ieee.org/document/10356628
A New Framework for Evaluating Image Quality Including Deep Learning Task Performances as A Proxy
iquaflow is a framework that provides a set of tools to assess image quality. The user can add custom metrics that can be easily integrated, and a set of unsupervised methods is offered by default. Furthermore, iquaflow measures quality by using the performance of AI models trained on the images as a proxy. This also makes it easy to study how performance degrades under several modifications of the original dataset, for instance, with images reconstructed after different levels of lossy compression; satellite images would be a use case example, since they are commonly compressed before downloading to the ground. In this situation, the optimization problem involves finding images that, while being compressed to their smallest possible file size, still maintain sufficient quality to meet the required performance of the deep learning algorithms. Thus, a study with iquaflow is suitable for such a case. All this development is wrapped in MLflow, an interactive tool used to visualize and summarize the results. This document describes different use cases and provides links to their respective repositories. To ease the creation of new studies, we include a cookiecutter repository. The source code, issue tracker and aforementioned repositories are all hosted on GitHub.
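The compression trade-off described above can be sketched in a few lines. This is a generic illustration and not iquaflow's API: `smallest_acceptable_quality`, the `score_at` callable, and the numbers in the usage example are all hypothetical placeholders, not measurements.

```python
def smallest_acceptable_quality(qualities, score_at, required_score):
    """Return the lowest compression quality whose task score still meets the target.

    qualities: iterable of quality settings (assumed: lower means a smaller file).
    score_at: callable mapping a quality setting to the performance of a model
        trained on images compressed at that setting (hypothetical stand-in
        for a full train-and-evaluate run).
    required_score: minimum performance the downstream task needs.
    Returns None if no setting is good enough.
    """
    for quality in sorted(qualities):
        if score_at(quality) >= required_score:
            return quality
    return None


if __name__ == "__main__":
    # Placeholder scores for illustration only.
    toy_scores = {10: 0.41, 30: 0.62, 50: 0.71, 70: 0.74, 90: 0.75}
    print(smallest_acceptable_quality(toy_scores, toy_scores.get, 0.70))
```

A linear scan is used instead of a binary search because the score is not guaranteed to increase monotonically with quality.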
https://github.com/satellogic/iquaflow
1 note
Text
There is no such thing as AI.
How to help the non technical and less online people in your life navigate the latest techbro grift.
I've seen other people say stuff to this effect but it's worth reiterating. Today in class, my professor was talking about a news article where a celebrity's likeness was used in an ai image without their permission. Then she mentioned a guest lecture about how AI is going to help finance professionals. Then I pointed out, those two things aren't really related.
The term AI is being used to obfuscate details about multiple semi-related technologies.
Traditionally in sci-fi, AI means artificial general intelligence like Data from Star Trek, or the Terminator. This, I shouldn't need to say, doesn't exist. Techbros use the term AI to trick investors into funding their projects. It's largely a grift.
What is the term AI being used to obfuscate?
If you want to help the less online and less tech literate people in your life navigate the hype around AI, the best way to do it is to encourage them to change their language around AI topics.
By calling these technologies what they really are, and encouraging the people around us to know the real names, we can help lift the veil, kill the hype, and keep people safe from scams. Here are some starting points, which I am just pulling from Wikipedia. I'd highly encourage you to do your own research.
Machine learning (ML): is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines "discover" their "own" algorithms, without needing to be explicitly told what to do by any human-developed algorithms. (This is the basis of most technology people call AI.)
Language model: (LM or LLM) is a probabilistic model of a natural language that can generate probabilities of a series of words, based on text corpora in one or multiple languages it was trained on. (This would be your ChatGPT.)
Generative adversarial network (GAN): is a class of machine learning framework and a prominent framework for approaching generative AI. In a GAN, two neural networks contest with each other in the form of a zero-sum game, where one agent's gain is another agent's loss. (This is the source of some AI images and deepfakes.)
Diffusion Models: Models that learn the probability distribution of a given dataset. In image generation, a neural network is trained to denoise images with added Gaussian noise by learning to remove the noise. After the training is complete, it can then be used for image generation by starting with a random noise image and denoising it. (This is the more common technology behind AI images, including Dall-E and Stable Diffusion. I added this one to the post after as it was brought to my attention it is now more common than GANs.)
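To show how un-magical "a probabilistic model of a natural language" is, here is a toy sketch of the language model idea from the list above. It is a bigram counter, which is far simpler than anything behind ChatGPT and is only meant to illustrate "probabilities of the next word, based on the text it was trained on". The function names and the three-sentence corpus are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the model never saw this word followed by anything
    return {w: c / total for w, c in followers.items()}


if __name__ == "__main__":
    model = train_bigram_model(["the cat sat", "the cat ran", "the dog sat"])
    print(next_word_probabilities(model, "the"))
```

The model can only ever echo patterns in its training text: ask it about a word it never saw and it returns nothing at all.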
I know these terms are more technical, but they are also more accurate, and they can easily be explained in a way non-technical people can understand. The grifters are using language to give this technology its power, so we can use language to take its power away and let people see it for what it really is.
12K notes
Text
📚 A List Of Useful Websites When Making An RPG 📚
My timeloop RPG In Stars and Time is done! Which means I can clear all my ISAT gamedev related bookmarks. But I figured I would show them here, in case they can be useful to someone. These range from "useful to write a story/characters/world" to "these are SUPER rpgmaker focused and will help with the terrible math that comes with making a game".
This is what I used to make my RPG game, but it could be useful for writers, game devs of all genres, DMs, artists, what have you. YIPPEE
Writing (Names)
Behind The Name - Why don't you have this bookmarked already. Search for names and their meanings from all over the world!
Medieval Names Archive - Medieval names. Useful. For ME
City and Town Name Generator - Create "fake" names for cities, generated from datasets from any country you desire! I used those for the couple city names in ISAT. I say "fake" in quotes because some of them do end up being actual city names, especially for french generated ones. Don't forget to double check you're not 1. just taking a real city name or 2. using a word that's like, Very Bad, especially if you don't know the country you're taking inspiration from! Don't want to end up with Poopaville, USA
Writing (Words)
Onym - A website full of websites that are full of words. And by that I mean dictionaries, thesauruses, translators, glossaries, ways to mix up words, and way more. HIGHLY recommend checking this website out!!!
Moby Thesaurus - My thesaurus of choice!
Rhyme Zone - Find words that rhyme with others. Perfect for poets, lyricists, punmasters.
In Different Languages - Search for a word, have it translated in MANY different languages in one page.
ASSETS
In general, I will say: just look up what you want on itch.io. There are SO MANY assets for you to buy on itch.io. You want a font? You want a background? You want a sound effect? You want a plugin? A pixel base? An attack animation? A cool UI?!?!?! JUST GO ON ITCH.IO!!!!!!
Visual Assets (General)
Creative Market - Shop for all kinds of assets, from fonts to mockups to templates to brushes to WHATEVER YOU WANT
Velvetyne - Cool and weird fonts
Chevy Ray's Pixel Fonts - They're good fonts.
Contrast Checker - Stop making your text white when your background is lime green no one can read that shit babe!!!!!!
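For the curious, the math a contrast checker runs is short. This is a sketch of the WCAG 2 contrast-ratio formula as I understand it (relative luminance of each color, then a ratio), not the code of any particular website; the function names are made up, and "lime green" is assumed to mean pure `#00FF00`.

```python
def _linearize(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    white, lime, black = (255, 255, 255), (0, 255, 0), (0, 0, 0)
    print(f"white on lime:  {contrast_ratio(white, lime):.2f}")
    print(f"black on white: {contrast_ratio(black, white):.2f}")
```

WCAG AA asks for at least 4.5:1 for normal-sized text, and white on pure lime lands far below that, which is why nobody can read it.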
Visual Assets (Game Focused)
Interface In Game - Screenshots of UI (User Interfaces) from SO MANY GAMES. Shows you everything and you can just look at what every single menu in a game looks like. You can also sort them by game genre! GREAT reference!
Game UI Database - Same as above!
Sound Assets
Zapsplat, Freesound - There are many sound effect websites out there but those are the ones I saved. Royalty free!
Shapeforms - Paid packs for music and sounds and stuff.
Other
CloudConvert - Convert files into other files. MAKE THAT .AVI A .MOV
EZGifs - Make those gifs bigger. Smaller. Optimize them. Take a video and make it a gif. The Sky Is The Limit
Marketing
Press Kitty - Did not end up needing this- this will help with creating a press kit! Useful for ANY indie dev. Yes, even if you're making a tiny game, you should have a press kit. You never know!!!
presskit() - Same as above, but a different one.
Itch.io Page Image Guide and Templates - Make your project pages on itch.io look nice.
MOOMANiBE's IGF post - If you're making indie games, you might wanna try and submit your game to the Independent Game Festival at some point. Here are some tips on how, and why you should.
Game Design (General)
An insightful thread where game developers discuss hidden mechanics designed to make games feel more interesting - Title says it all. Check those comments too.
Game Design (RPGs)
Yanfly "Let's Make a Game" Comics - INCREDIBLY useful tips on how to make RPGs, going from dungeons to towns to enemy stats!!!!
Attack Patterns - A nice post on enemy attack patterns, and what attacks you should give your enemies to make them challenging (but not TOO challenging!) A very good starting point.
How To Balance An RPG - Twitter thread on how to balance player stats VS enemy stats.
Nobody Cares About It But It’s The Only Thing That Matters: Pacing And Level Design In JRPGs - a Good Post.
Game Design (Visual Novels)
Feniks Renpy Tutorials - They're good tutorials.
I played over 100 visual novels in one month and here’s my advice to devs. - General VN advice. Also highly recommend this whole blog for help on marketing your games.
I hope that was useful! If it was. Maybe. You'd like to buy me a coffee. Or maybe you could check out my comics and games. Or just my new critically acclaimed game In Stars and Time. If you want. Ok bye
#reference#tutorial#writing#rpgmaker#renpy#video games#game design#i had this in my drafts for a while so you get it now. sorry its so long#long post
7K notes
Text
what do u do when u tell ur parent in no uncertain terms Thank You For The Offer But I Do Not Want A Tutor For This Course It Will Not Help And I Am Deeply Uncomfortable With It Do Not Get Me One
and then they go and book u with an online tutor. without asking. what the fuck.
#25 but being treated like im fucking 12#didnt even WANT help with courseworki went out there just looking for some goddamn emotional support#and i didnt even get it!!!!!!#theres 'problem solving instead of listening & supporting' and then theres THIS#i hate college#but rn i hate this family more#ANYWAYS#if any1 knows how 2 use python 2 'use file i/o on startup to open and read the dataset; initializing a few record objects with data parsed#from first few records in the csv file. the record objects should be stored in a simple data structure (array or list). use exception#handling in case the file is missing or not found'#i know how to open the file but idk how 2 'initialize a few record objects using data parsed etc. etc.'#like. i have a class so thats the record object. and ig i could have a list of instances of that class object#but idk how 2 like. combine the data from the csv file with instances of the class.#without having to individually list.append(()) 7000 rows bc eventually in this project u gotta use the whole dataset.
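For anyone stuck on the same assignment step, here is a minimal sketch of one way to do it with Python's built-in `csv` module. The `Record` fields (`name`, `year`, `value`) and the filename `dataset.csv` are hypothetical placeholders, so swap in the real column names from the dataset. The trick is that a loop creates one object per row, so nothing has to be appended by hand.

```python
import csv
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical fields; replace with the columns in your own CSV.
    name: str
    year: int
    value: float


def load_records(path, limit=None):
    """Read up to `limit` rows from the CSV at `path` into a list of Record objects.

    Pass limit=None to read the whole file. Returns an empty list if the
    file is missing.
    """
    records = []
    try:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)  # uses the header row as dict keys
            for i, row in enumerate(reader):
                if limit is not None and i >= limit:
                    break
                # One Record per CSV row, converting text to the right types.
                records.append(
                    Record(row["name"], int(row["year"]), float(row["value"]))
                )
    except FileNotFoundError:
        print(f"Could not find {path!r}; starting with an empty list.")
    return records


if __name__ == "__main__":
    for record in load_records("dataset.csv", limit=5):
        print(record)
```

Later in the project, calling `load_records("dataset.csv")` with no limit reads every row through the same loop.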
0 notes
Text
Why is there not a complete and accurate database of deaths per year around the world from before 1950 :(((
#for heaven's sake this data is going to be so meaningless#if this keeps irritating me I'll join some project to put together a dataset of this sort#at least to have a mostly valid graph for the 20th century#my post#why is everyone always analyzing fertility rates and children born but never how many people died 🙄😭
0 notes