#image webscraper | Explore Tumblr posts and blogs

jamingbenn · 6 months ago

Text

year in review - hockey rpf on ao3

hello!! the annual ao3 year in review had some friends and i thinking - wouldn't it be cool if we had a hockey rpf specific version of that. so i went ahead and collated the data below!!

i start with a broad overview, then dive deeper into the 3 most popular ships this year (with one bonus!)

if any images appear blurry, click on them to expand and they should become clear!

₊˚⊹♡ . ݁₊ ⊹ . ݁˖ . ݁𐙚 ‧₊˚ ⋅. ݁

before we jump in, some key things to highlight:

- CREDIT TO: the webscraping part of my code heavily utilized the ao3 wrapped google colab code, as lovingly created by @kyucultures on twitter, as the main skeleton. i tweaked a couple of things but having it as a reference saved me a LOT of time and effort as a first time web scraper!!! thank you stranger <3
- please do NOT, under ANY circumstances, share any part of this collation on any other website. please do not screenshot or repost to twitter, tiktok, or any other public social platform. thank u!!! T_T
- but do feel free to send requests to my inbox! if you want more info on a specific ship, tag, or you have a cool idea or wanna see a correlation between two variables, reach out and i should be able to take a look. if you want to take a deeper dive into a specific trope not mentioned here/chapter count/word counts/fic tags/ship tags/ratings/etc, shoot me an ask!

˚　　. 　 ˚　.　　　　　 . ✦　　　　˚　　　　 . ★⋆. ࿐࿔

with that all said and done... let's dive into hockey_rpf_2024_wrapped_insanity.ipynb

BIG PICTURE OVERVIEW

i scraped a total of 4266 fanfics that dated themselves as published or finished in the year 2024. of these 4000 odd fanfics, the most popular ships were:

Note: "Minor or Background Relationship(s)" clocked in at #9 with 91 fics, but I removed it as it was always a secondary tag and added no information to the chart. I did not discern between primary ship and secondary ship(s) either!

breaking down the 5 most popular ships over the course of the year, we see:

super interesting to see that HUGE jump for mattdrai in june/july for the stanley cup final. the general lull in the offseason is cool to see as well.

as for the most popular tags in all 2024 hockey rpf fic...

weee like our fluff. and our established relationships. and a little H/C never hurt no one.

i got curious here about which AUs were the most popular, so i filtered down for that. note that i only regex'd for tags that specifically start with "Alternate Universe - ", so A/B/O and some other stuff won't appear here!
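for anyone curious, that kind of filter could look roughly like this. it's a minimal sketch, assuming each work's tags come as a list of strings; the sample tags and the count_aus name are made up for illustration and are not the notebook's actual code:

```python
import re
from collections import Counter

# Only tags that literally start with "Alternate Universe - " are counted,
# so a bare "Alternate Universe" tag or an A/B/O tag is skipped.
AU_PREFIX = re.compile(r"^Alternate Universe - (.+)$")


def count_aus(works_tags):
    """Count AU types across works; works_tags is a list of tag lists."""
    counts = Counter()
    for tags in works_tags:
        for tag in tags:
            match = AU_PREFIX.match(tag)
            if match:
                # Keep only the part after the prefix, e.g. "Soulmates".
                counts[match.group(1)] += 1
    return counts


# Hypothetical sample data standing in for the scraped works.
sample_works = [
    ["Fluff", "Alternate Universe - College/University"],
    ["Alpha/Beta/Omega Dynamics", "Alternate Universe - Soulmates"],
    ["Alternate Universe - College/University", "Alternate Universe"],
]

au_counts = count_aus(sample_works)
print(au_counts.most_common())
```

anchoring the pattern at the start of the tag is what keeps A/B/O and similar tags out of the chart.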

idk it was cool to me.

also, here's a quick breakdown of the ratings % for works this year:

and as for the word counts, i pulled up a box plot of the top 20 most popular ships to see how the fic length distribution differed amongst ships:

mattdrai-ers you have some DEDICATION omg. respect

now for the ship by ship break down!!

₊ . ݁ ݁ . ⊹ ࣪ ˖͙͘͡★ ⊹ .

#1 MATTDRAI

most popular ship this year. peaked in june/july with the scf. so what do u people like to write about?

fun fun fun. i love that the scf is tagged there like yes actually she is also a main character

₊ . ݁ ݁ . ⊹ ࣪ ˖͙͘͡★ ⊹ .

#2 SIDGENO

(my babies) top tags for this ship are:

folks, we are a/b/o fiends and we cannot lie. thank you to all the selfless authors for feeding us good a/b/o fic this year. i hope to join your ranks soon.

(also: MPREG. omega sidney crosby. alpha geno. listen, the people have spoken, and like, i am listening.)

₊ . ݁ ݁ . ⊹ ࣪ ˖͙͘͡★ ⊹ .

#3 NICOJACK

top tags!!

it seems nice and cozy over there... room for one more?

₊ . ݁ ݁ . ⊹ ࣪ ˖͙͘͡★ ⊹ .

BONUS: JDTZ.

i wasnt gonna plot this but @marcandreyuri asked me if i could take a look and the results are so compelling i must include it. are yall ok. do u need a hug

top tags being h/c, angst, angst, TRADES, pining, open endings... T_T katie said its a "torture vortex" and i must concur

₊ . ݁ ݁ . ⊹ ࣪ ˖͙͘͡★ ⊹ .

BONUS BONUS: ALPHA/BETA/OMEGA

as an a/b/o enthusiast myself i got curious as to what the most popular ships were within that tag. if you want me to take a look about this for any other tag lmk, but for a/b/o, as expected, SID GENO ON TOP BABY!:

thats all for now!!! if you have anything else you are interested in seeing the data for, send me an ask and i'll see if i can get it to ya!

471 notes · View notes

pmdfanfiction · 1 year ago

Text

The official Tumblr of PMDFanfiction.com!

We're working hard to provide a good and stable home for any and all PMD-based stories. Self-hosted images, no ads, strict anti-AI policy (both posting and blocking AI webscrapers), clean UI, welcoming to all skill levels, and it's built, run, maintained and paid for out of pocket by your community members! We hope to read YOUR story there soon! <3

This blog will be reblogging PMD Stuff that catches our eye + posting about new fics on our site, Weekly Fic Spotlights, and whatever else~!

We also have a discord! Join our community where Readers can connect with the Authors that write the stories we do so adore!

Discord Link: https://discord.gg/M7qRzjgW

We look forward to seeing you there!

Our partner servers/websites!

Pokemon Mystery Diner:

"Take a seat, order a drink, and enjoy! Diner is a multipurpose server with a focus on writing, art, roleplay, and casual conversation.

But most importantly, we run regular events with a focus on community engagement and involvement.

Users not only have the opportunity to join in, but they can help out as well.

Pokémon Mystery Diner is a welcoming place for all who enter its doors!"

Link: https://discord.gg/PMDiner

PMD: Writers United

"Welcome to the PMD: Writers United server, where fanfiction of all things PMD come to take you into another world!"

Link: https://discord.gg/pmdunited

Thousand Roads

"Welcome to Thousand Roads! We’re a server for all things pokémon fanfiction, affiliated with the Thousand Roads forums.

A community for all Pokémon fanfiction, from PMD to Anipoke to the mainline games — TR covers the whole fandom!" Links: https://thousandroads.net/ (main site) https://forums.thousandroads.net/ (TR forums) https://discord.gg/BeWCYxk (TR Discord)

#pokemon #pokemon mystery dungeon #fanfic #fanfiction #writing

21 notes · View notes

ao3-anonymous · 2 years ago

Note

Hey, can I ask how you pull the data into your dashboard from ao3? I’m just another curious data nerd - thanks!

Sure! I use a webscraper (Beautiful Soup) to scrape all the fandom category pages (e.g. https://archiveofourown.org/media/TV%20Shows/fandoms) and basically use a bunch of string functions to parse through a giant piece of HTML to pull every Fandom Name and its associated fic count.
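A rough sketch of that parsing step, for the curious. It uses only the standard library (re) rather than Beautiful Soup so it runs with no installs, and it works on a hard-coded HTML snippet instead of fetching the live page. The markup in SAMPLE_HTML is an assumption about how a category page lists fandoms, and parse_fandom_counts is a hypothetical name, so treat this as an illustration rather than the actual scraper:

```python
import re
from html import unescape

# Assumed markup: each fandom is a link with class "tag", followed by the
# work count in parentheses. This is an illustrative guess, not verified
# against the real page.
SAMPLE_HTML = """
<ul class="tags index group">
  <li><a class="tag" href="/tags/example-show/works">Example Show</a> (1,234)</li>
  <li><a class="tag" href="/tags/tom-and-jerry/works">Tom &amp; Jerry</a> (56)</li>
</ul>
"""

FANDOM_PATTERN = re.compile(
    r'<a class="tag"[^>]*>(?P<name>.*?)</a>\s*\((?P<count>[\d,]+)\)',
    re.DOTALL,
)


def parse_fandom_counts(html: str) -> dict:
    """Return {fandom name: work count} for every match in the HTML."""
    counts = {}
    for match in FANDOM_PATTERN.finditer(html):
        # Decode entities such as &amp; and drop thousands separators.
        name = unescape(match.group("name").strip())
        counts[name] = int(match.group("count").replace(",", ""))
    return counts


fandom_counts = parse_fandom_counts(SAMPLE_HTML)
print(fandom_counts)
```

A real HTML parser such as Beautiful Soup is usually more forgiving than a regular expression when the markup changes.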

(Note: some people have pointed out that occasionally the number on the Category page does not match the number at the top when you click through and I have no idea what causes this discrepancy. Take it up with AO3 lol. I just use whatever the Category page says.)

I have the scraper set to run once a week on Monday, so I just match up the new count with previously collected data and calculate fandom size change from there! I store the raw data in a bunch of nested dictionaries in a JSON locally, which then powers a Google Sheet where the calculations are run. The dashboard is then connected to the Google Sheet as a data source!
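The week-over-week comparison could be sketched like this. In the workflow described above that calculation lives in a Google Sheet, so the Python below is purely illustrative; the nested layout (date, then category, then fandom) and the weekly_change name are assumptions, not the real file format:

```python
import json

# Hypothetical layout: {scrape_date: {category: {fandom: work_count}}}
raw = json.loads("""
{
  "2023-05-01": {"TV Shows": {"Example Show": 1200, "Other Show": 300}},
  "2023-05-08": {"TV Shows": {"Example Show": 1234, "Other Show": 300,
                              "New Show": 12}}
}
""")


def weekly_change(snapshots, previous, current):
    """Return {fandom: change in work count} between two scrape dates."""
    changes = {}
    for category, fandoms in snapshots[current].items():
        old = snapshots.get(previous, {}).get(category, {})
        for fandom, count in fandoms.items():
            # A fandom that was absent last week is treated as starting at 0.
            changes[fandom] = count - old.get(fandom, 0)
    return changes


changes = weekly_change(raw, "2023-05-01", "2023-05-08")
print(changes)
```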

I pretty much taught myself Python just for this project initially (although now I use it at work too and am getting a pay raise for it - thanks ADHD!) so it's definitely not the most elegant solution, but it works!

The only part of the workflow that isn't fully automated is the weekly Tumblr post - I have a script that creates the post, but if I set it to run automatically, the images never come out right. So I have to remember to click the button once a week, which I'm not great at (thanks ADHD), so that's why they don't always come out on Mondays.

Hope that helps! LMK if you have any questions!

#ao3 #ao3 stats #ao3-anonymous #ao3-anon-ask #crabbyapplesauce

14 notes · View notes

xerxestexastoast · 2 years ago

Text

So apparently the company that made HaveIBeenTrained.com is actively working on tools and APIs for machine learning companies to better tell who is and isn't okay with their websites being scraped for training data, and they have partnerships with ArtStation and Stability. I might just have to settle for their ai.txt in concert with OpenAI's new robots.txt declaration.
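For reference, here is a small sketch of what a robots.txt opt-out looks like and how a well-behaved crawler would read it, using Python's standard urllib.robotparser. GPTBot is the user-agent token OpenAI documents for its crawler; the example.com URLs and the rule set are illustrative assumptions. This only shows what a compliant crawler would conclude, since nothing forces a scraper to check the file at all:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block GPTBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() takes the file's lines, so no network request is made here.
parser.parse(ROBOTS_TXT.splitlines())

gptbot_allowed = parser.can_fetch("GPTBot", "https://example.com/gallery/piece.png")
browser_allowed = parser.can_fetch("Mozilla/5.0", "https://example.com/gallery/piece.png")

print("GPTBot allowed:", gptbot_allowed)
print("Other agents allowed:", browser_allowed)
```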

I still don't trust AI corpos, and I don't believe justice can be served unless the current models are completely destroyed and retrained from scratch on legitimately-obtained data, but if I keep wringing my hands about how to block webscrapers from grabbing images off a static webpage, I will never be able to get my website off the ground. I WANT to POST to the INTERNET!

#artificial intelligence #machine learning #Instagram is unrightclickable and I still managed to get to the direct image link with inspect element #but at that level of obfuscation you'd think large-scale image scrapers just give up #I will be adding a gallery pass tier to my Ko-Fi for fullres images and only post in watermarked low-res from now on though

2 notes · View notes

i-think-im-asleep · 2 years ago

Text

unironically a tool like that would be a great, non-socially-destructive use of language+image comprehension models and webscraping. the difference between building things with capital in mind vs building them with humans in mind.

"I want," the man said to the art robot, and then described an image in some detail. "Certainly," said the art robot. A printout came out of its chest. "Thank y- Hey! What's this?" "A list of artists who make images of the kind you describe, and who are accepting commissions."

59K notes · View notes

shreyash-hexa · 7 months ago

Text

Unlocking the Power of Data: A Comprehensive Guide to Web Scraping

🌐 What is Web Scraping? Web scraping is the automated process of extracting data from websites, allowing businesses and individuals to gather valuable insights quickly and efficiently. Whether you're conducting market research, optimizing SEO, or analyzing real estate trends, web scraping can transform how you access and utilize data.

🔧 Tools of the Trade From user-friendly options like Octoparse and ParseHub to powerful frameworks like Scrapy and Beautiful Soup, there’s a tool for everyone—regardless of your technical skill level. Discover which tools best suit your needs!

⚖️ Ethical Considerations As you dive into web scraping, remember to respect website guidelines and data privacy laws. Ethical scraping practices ensure that you can gather information responsibly without overloading servers or infringing on privacy.

💡 Best Practices Maximize your scraping efficiency by implementing strategies like throttling requests, using proxies, and handling dynamic content effectively. Planning your approach can save you time and headaches!
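As one illustration of throttling, here is a minimal sketch of a helper that enforces a minimum gap between requests. The Throttle class, the interval, and the example.com URLs are all hypothetical; the actual fetch is left as a comment so the sketch runs without network access:

```python
import time


class Throttle:
    """Block until at least min_interval seconds have passed since the last call."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous wait(), None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                # Too soon since the last request: pause for the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Demo with a tiny interval; a polite real-world value is usually much larger.
throttle = Throttle(min_interval=0.01)
for url in ["https://example.com/page/1", "https://example.com/page/2"]:
    throttle.wait()
    # A real scraper would fetch the URL here.
    print("would fetch", url)
```

Passing the clock and sleep functions in as parameters is a design choice that makes the helper easy to test without real delays.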

🚀 Future Trends Stay ahead of the curve with AI integration in scraping tools and the rise of no-code solutions that make data extraction accessible to everyone.

For expert software development services tailored to your needs, check out Hexadecimal Software. And if you're looking for a seamless real estate experience, explore HexaHome for commission-free property management!

👉 Read the full blog for an in-depth look at web scraping: [Your Blog Link Here]

#WebScraping #DataExtraction #TechTrends #SoftwareDevelopment #HexadecimalSoftware #HexaHome #MarketResearch #SEO #DataDriven


#hexadecimal software #software development

0 notes

dykesbat · 4 months ago

Text

ok wait setting the scene. currently making an archive for this chinese game that's been around for. almost a decade now. bc it's been around for almost a decade, there aren't a lot of good in depth english archives for it. now note that my chinese literacy? incredibly basic.

so i:

went to the official eng site bc i know they have a wallpaper section. but then i looked closer and noticed that they missed some that i knew existed and were in low quality..

decided to go to the official eng forum site the company has for its games. they ALSO have wallpapers, cg's, etc, on this site. BUT the image quality wasn't good.

had to resort to the chinese version of the last site. the thing is. A. the search feature has no way to discern player accounts from official accounts and B. the account section is infinite scroll and there's no gallery nor any anchor points. which means ill have to scroll all the way down to like 2019-ish when this site first started in order to see the first posts.

decided to webscrape following in the footsteps of this other person who set out to do an archive for this same horrible game. they stopped the archive like two years ago and since doing this, i realized they didnt even put all the arts in it. thing is. im either horrible at following directions or they didnt give me the proper directions bc i can not figure out how to webscrape past 2021.

trying for days to figure out how to scrape past 2021 even to the point of trying different webscrapers and tutorials and failing

deciding to yell fuck it and get an autoscroller to scroll all the way down to 2019. this fails me bc firefox crashes multiple times during this bc im scrolling back more than five years like no shit its gonna crash.

realizing theres still early wallpapers that weren't published on this site???????????? and deciding to go on weibo instead. weibo on desktop does not allow you to look at a full account without a profile. trying and failing to make a weibo account bc i dont live in china and dont have a chinese phone number.

just deciding to yell fuck it and scroll on my phone bc weibos app allows me to use the gallery w/o an account.

archiving not for the weak im losing my mind

4 notes · View notes

pyshambles-blog · 6 years ago

Text

Python Code Snippets #25

Python Code Snippets #25. Five more simple but really useful Python code snippets mostly for beginners. All work on Linux and Windows and most likely Mac.


Hi Py-Snippers, welcome to Python Snippets volume 25. Yes, somehow I have yet again managed to cobble together another five more deliciously easy to use, and potentially useful, Python code snippets for your grubby little fingers to play about with.

So let’s just get right on it.

121-Language Detection And Translation

Tested on : Windows 7, Linux Mint 19.1 Installs…


#face detection #google translate #image webscraper #text processing #tkinter

0 notes

full-metal-furies · 2 years ago

Text

i webscraped my twitters a while ago and theres some really good images in here

15 notes · View notes

bocher-daniel · 5 years ago

Text

What can social media tell us about screen tourism?

What is left to discover after a century of intensive study in the field of film tourism? Today’s technologies can give us a new look into a previously hidden side of film tourism. For example, as film tourists casually use social media, one can gain intimate insights into emotions, behaviors on site and their perception of the destination image.

A city becomes a fantasy playground

With the use of CrowdTangle and webscraping methods, I collected over 1100 examples of film tourism communication from Facebook, Twitter and Instagram. From the communication of film tourists in the case study "Game of Thrones", one can see that they regard the city of Dubrovnik as the fictional capital King’s Landing. They place fictional geotags from the series narrative within the real city. The film locations within the old town of Dubrovnik emanate a sense of fantasy that extends over the whole city, so the whole downtown becomes a playground for film tourists.

Looking at the post tourist through the window of social media

I argue that the film tourist is part of a bigger pattern that is evolving around postmodern theories and the post-truth realities of the globalized world. The film tourist is a post-tourist who enjoys blurring the lines between reality and fiction. This behavior is a problem when narratives become more important than facts, and our societies become divided, without a shared sense of space and reality. Social media is part of this: as a symptom, as a catalyst, and at the same time as a window into the development of film tourism and our understanding of space.

#filmtourism #screentourism #humangeography #posttourist #crowdtangle #socialmedia #socialnetworks #bergdoktorarbeit #berdoktorarbeit #phd

1 note · View note

tangibletechnomancy · 1 year ago

Text

See, unfortunately, this ties back to EXACTLY what I was talking about with "fair use for me but not for thee".

It makes sense as an argument - somewhat - if you believe the popular myth that all a machine learning algorithm does is save images off the internet and rearrange bits of them into something else, which it very much does not - but even if you DO believe that myth, you're discrediting collage and other assemblage, or for that matter, many if not all fan works. A lot of that isn't done with permission, and despite that, it's generally considered Art(TM). A dick move, sometimes, depending on the piece and its purpose, but not Theft(TM) most of the time, let alone Fake Art(TM).

But when you get into the reality of it - that it not only doesn't contain a single image, but physically, mathematically CAN'T, and in fact is just STUDYING literal billions of images to know what patterns of pixels tend to represent certain keywords, then it becomes even harder to argue that anyone is being "stolen" from - at that point you are inherently arguing that REFERENCING is theft by several orders of magnitude more. Any given image that contributed to the weights that tell a model what "a horse" or "watercolors" look like is contributing less to the model's "understanding" of what these things are than any, yes, ANY image you've ever seen contributes to yours; unless you're over 150 years old and have spent your entire life just looking at images, you've gleaned your understanding of what a painting of a cat looks like from FAR fewer images than any AI model has. If studying hundreds of other people's art to make your own interpretation of what a stylized cat looks like is fair use because you're not just directly copying one of them in particular, then studying hundreds of millions to the same end must also be, because again, you are taking FAR less influence from any of those pieces and you (or your tool and the HUMAN PEOPLE who made it) are doing much more original work to make the end result coherent-

Unless your goal in making a distinction is just to define an ingroup - Real Artists, Victims Of The System, vs. an outgroup - the devious Technical Brother Art Thieving Lazy Jealous Jerk. It works VERY well for that.

That said, there IS a meaningful critique of how the datasets are compiled, and I feel it's exemplified very well by this:

Remember how this got a whole bunch of people really angry at tumblr over it when that phrasing first changed? Well, the thing is, it's not actually tumblr's fault; it's just how the internet works. It is, in fact, impossible to block webscrapers completely without making a page's contents unavailable to the general public (e.g., by a login wall), and there is no law saying that webcrawlers have to honor privacy settings. This on top of the fact that Facebook et al have been slowly boiling the frog to encourage us to overshare, and only now is this here to give people the shock reminder that the public internet is, in fact, PUBLIC, and the things we choose to post long-term IN PUBLIC are in fact readily available, preserved, there to study, IN PUBLIC...

There is a dataset ethics issue, and it is a PRIVACY issue, not one of plagiarism/copyright. What we need to be doing is focusing on privacy. The right to disappear. The right to opt out of automated studies and reasonably expect it to be honored. The right to be able to do the digital equivalent of putting up a "no photos" sign at a real world show and expect it to, if nothing else, at LEAST be honored by people who call themselves professionals.

This, incidentally, is why I...somewhat support the use of tools like Glaze or Nightshade in conjunction with NOAI/disallow all flags - those tools are not as effective as they claim to be and I take umbrage with that, BUT if adopted in conjunction with such flags on a wide scale, they could potentially enforce respect for flags on a cultural and practical level by making the end product of any company that looks at a "do not datamine" flag and says "how about I do anyway?" meaningfully worse!

In other words, fight KOSA and the like and push for privacy protections; the problem isn't MOSTLY, but ENTIRELY capitalism.

Economic anxiety has a way of bringing out reactionary sentiment in anyone if they're not careful.

It is deeply, deeply frustrating to watch it play out in front of me in leftist spaces such that self-proclaimed leftists are using actual, literal fascist arguments about Real Art vs. Fake Art and Real Labor vs. Lazy Button-Pushing.

These things don't become any less bad when you SAY your enemy is "some rich techbro" while calling broke disabled hobbyists "evil soulless automatons".

The central logic doesn't become true when you SAY you're targeting an inhuman machine while you screech obscenities about a great replacement at its operator.

When you say one minute "there is no unskilled labor, only undervalued skills", it doesn't magically absolve you of saying "nooo, you were supposed to automate away the BAD and DEMEANING jobs with no financial safety net for the workers, not THIS one I consider RESPECTABLE" in the next breath; it only makes you a fucking hypocrite.

"Fair use for me but not for thee" is not a rational position to prevent plagiarism and forgery; it's just a means to codify an ingroup and an outgroup.

"Degenerate art" is always, ALWAYS a reactionary and proto-fascist thing to believe in, even if you wrap it up in other fancy words because you know "degenerate" is a Bad Word. "There is Good Art that makes society better and Bad Art, if you can even CALL it Art at all, that will rot our brains and turn us all into mindless drones if it's allowed to survive" cannot be made into anything but a reactionary position! Period! End of!

"Lazy button-pushers" are EXACTLY what corporations want you to think ANY automation operator is, so they can take credit away from those employees and criminally underpay them. They said the same damned thing about digital artists back in the early days of Photoshop. They say the same thing about overworked VFX artists today. You are DIRECTLY helping them make it worse with this argument.

The same old fucking trick of making you uncertain of your financial future so you lash out at other victims of the system because you "can't take the risk" of coming together to fight the actual enemy? Is working a FUCKING treat on way too many people who pride themselves on Not Being Like That - and it's even worse because a lot of the time pointing this out will get nothing but denial because maintaining pride in a leftist, progressive, pro-labor, pro-human Identity is more important to way too many people than ACTUALLY identifying the root of reactionary sentiment and the strategies used to spread it.

It makes me genuinely feel like I've fallen into a Fox News convention, hearing all these blatantly reactionary arguments and actively self-defeating strategies to Protect Labor.

#not art

670 notes · View notes

iamcodegeek · 5 years ago

Photo

Image crawler in Python - Web Scraping ☞ https://morioh.com/p/f152ae9a91b5 #Python #WebScraping #Morioh

#python #python tutorial #python language #python full course #python course #learn python #learn python programming #python tutorial for beginners

2 notes · View notes

iamprogrammerz · 5 years ago

Photo

Image crawler in Python - Web Scraping ☞ https://morioh.com/p/f152ae9a91b5 #Python #WebScraping #Morioh

#python #python tutorial #python language #python full course #python course #learn python #learn python programming #python tutorial for beginners

1 note · View note

fffmpreg · 2 years ago

Text

This post has gotten more traction than i was expecting, so I'd like to add stuff

first of all, to clarify, i completely supported the protests(as long as they lasted anyway)

without the mod tools the site will become full of spam and unusable anyway.

its important that people are inconvenienced, because thats the only way the general population will realize how important data preservation is and that every third google search ends up in reddit

i think privating for only 2 days and announcing that in advance is a weak af move, but ive been hearing that the admins forcefully took over and unprivated subs and the mods might get kicked out

so what can we do

not an expert but imho:

1. we can move all our stuff elsewhere,

just copy everything you've personally posted, stuff you (or others) might find helpful or any thing that might need to be preserved,

and paste anywhere else, multiple places, kbin, lemmy, mastodon, your neocities page, ao3, whatever seems appropriate

i hope someone builds a webscraper that can scrape atleast all the text, if not the images etc

we can have a breezewiki type of thing with a browser extension that automatically redirects from reddit

2. web archive got our back, somewhat

but those threads cant be commented on, nor the conversation continued, its essentially only preserving the data, not the community, which is why

3. talk to mods about hosting elsewhere and promoting that site on the sub

after all this if a time comes when we need to, we can simply delete our stuff from reddit

since we generated the content, i think we should get to decide who profits from it lol

people (mostly on twitter) are pissing me off so much with the "its reddit, who cares"

like, its not a social media, its a collection of forums, if you hate certain subs for their politics or opinions, dont visit those (you control the buttons you press or whatever)

meanwhile were about to lose so much information about niche hobbies and interests,

and these are the same people who were complaining last week that you cant find anything on google without adding "reddit" at the end,

are you fucking stupid, do you want to have to look through unrelated blogs and ai generated/pay walled quora answers everytime you need technical assistance or wanna talk about a hobby? is that what you want?

im this close to losing it

31K notes · View notes

datascience4you · 6 years ago

Photo

Image crawler in Python - Web Scraping ☞ https://morioh.com/p/f152ae9a91b5 #Python #WebScraping #Morioh

#python #python tutorial #python language #python full course #python course #learn python #learn python programming #python tutorial for beginners

1 note · View note

emergingindiaanalyticsnoida · 3 years ago

Text

WEB SCRAPING WITH SELENIUM

WHAT IS WEB SCRAPING & WHY IS IT NEEDED

In the current world, data is the new fuel, and it is available in abundance. Whether you are starting a new business or building a strategy to take an existing business in a new direction, data is the most sought-after resource nowadays. It can take any form: an image, a video, a voice recording, a spreadsheet, and so on. These processes, however, need a vast amount of data to be collected and analyzed, and no one can sit all day clicking to download the required files to a local machine by hand and then analyze them. That is not feasible, because it consumes too much time and labor. Web scraping comes to the rescue by automating the process.

It has been a sigh of relief for entrepreneurs as well as for the big players in the market, for whom data is everything: it is the core of market research and business strategy. It plays a vital role in perceiving the behavioral patterns of a target audience in real time. Apart from capturing real-time behavior, it also satisfies the main feasibility factors: it is technically robust, highly accurate, and cost-efficient. Because of these qualities, it is heavily used not only by business tycoons but also by professionals in other domains, such as academic researchers, scientists, and doctors. It has proved its strength in market analysis and financial analysis, and it has helped track the real-time behavior of the global pandemic in order to fight it.

Web scraping, also known as data scraping or web harvesting, is a method of automating the process of accessing data on a website and importing it into a local file on your device without much effort. The saved data can then be used for analysis and research. It can access and import almost anything from the target website.

Three main steps are the foundation of a successful web scrape. First, a GET request is sent to the server, which returns a response. Second, the HTML code of the page in that response is parsed. Third, a Python library is used to access the parsed content.
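The three steps can be sketched with nothing but the Python standard library. This is a minimal illustration under stated assumptions, not the method used later in this article: fetch_html, ImageCollector, extract_image_sources, and the example-scraper user agent are hypothetical names, and the demo parses a hard-coded HTML string so it runs without network access.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10.0):
    """Step 1: send a GET request and return the response body as text."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class ImageCollector(HTMLParser):
    """Step 2: parse the HTML, remembering the src of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> tags that have no src attribute
                self.sources.append(src)


def extract_image_sources(html):
    """Step 3: access the parsed content, here as a list of image URLs."""
    parser = ImageCollector()
    parser.feed(html)
    return parser.sources


# Offline demo on a hard-coded page instead of a live request.
sample = '<html><body><img src="/a.png"><p>hi</p><img alt="x"><img src="/b.jpg"></body></html>'
sources = extract_image_sources(sample)
print(sources)
```

On a live site, the same flow would be extract_image_sources(fetch_html(url)), subject to the site's terms and robots.txt.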

WHAT IS SELENIUM

Web scraping is one of the most important parts of the data collection process. To keep the scraping process neat and clean, Python offers libraries and frameworks such as BeautifulSoup, Selenium, and Scrapy. In this article, we will discuss Selenium.

Selenium is one of the most powerful open-source automation tools available. It is used to control a web browser and automate operations in it. Selenium was originally developed in 2004 by Jason Huggins and was later merged with another test framework called WebDriver, with the combined project shipping as Selenium 2.0 in 2011. WebDriver is now a W3C standard supported by almost all browsers, which is why Selenium became the most popular framework in the field. Selenium tests can be written in multiple languages, such as C#, Java, JavaScript, Python, and Ruby.

Apart from multi-language support and easy implementation, Selenium has many other advanced and useful properties. It supports cross-device testing, which means tests can be run on iPhone, BlackBerry, and Android devices. One of its strongest points is that it is user friendly and can simulate the keyboard and mouse input of a real user in real time. It supports advanced user interactions such as clicking radio buttons and check boxes, selecting from drop-down lists, drag and drop, click and hold, selecting multiple items, and moving to the next page or back to the previous one with the browser's forward and back buttons. Because it is open source, it has a large supporting community and receives continuous upgrades and updates.

Selenium requires a web driver, which enables it to run cross-browser tests. The web driver is the life force of Selenium: it carries out all the methods and classes used in automation.
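To make the driver's role concrete, here is a minimal sketch of a scraping helper written against the WebDriver interface. It relies only on driver.get, driver.find_elements, and the text attribute of elements, which are part of Selenium's Python API. The function names, the h2 selector, and the example.com URL are assumptions for illustration. A small stand-in driver is included so the sketch can run without a browser or Selenium installed.

```python
def make_chrome_driver():
    """Create a real Chrome driver (requires Selenium and Chrome installed)."""
    # Imported lazily so the rest of this sketch runs without Selenium.
    from selenium import webdriver
    return webdriver.Chrome()


def collect_heading_text(driver, url):
    """Open url and return the non-empty text of every <h2> element."""
    driver.get(url)
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    headings = driver.find_elements("css selector", "h2")
    return [h.text for h in headings if h.text.strip()]


class FakeElement:
    """Stand-in for a Selenium WebElement, exposing only .text."""

    def __init__(self, text):
        self.text = text


class FakeDriver:
    """Stand-in driver that returns canned elements instead of a live page."""

    def __init__(self, texts):
        self.visited = []
        self._texts = texts

    def get(self, url):
        self.visited.append(url)

    def find_elements(self, by, value):
        return [FakeElement(t) for t in self._texts]


fake = FakeDriver(["First post", "  ", "Second post"])
titles = collect_heading_text(fake, "https://example.com/blog")
print(titles)
```

With a real browser, the call would be collect_heading_text(make_chrome_driver(), url), followed by driver.quit() to close the browser when finished.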

INSTALLATION

Installing Selenium is very easy. The steps below can be used to install Selenium for any Python IDE without any fuss. After that, we will install the web driver for Chrome, as I'll use the Chrome browser for automation.

INSTALLATION USING PIP

This assumes you have an IDE such as PyCharm or Jupyter Notebook. I'll be using Jupyter Notebook here. Open the notebook and type the following; Selenium will be installed using the Python package manager.
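The original command is not reproduced in this text. The usual way to install the package is "pip install selenium"; the sketch below runs that same command from Python against the interpreter the notebook is using. The helper names are hypothetical, and the install itself is only triggered when install_package is called.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command for the interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", package]


def install_package(package):
    """Run pip; raises CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(package))


# Show the command without running it; call install_package("selenium")
# to actually perform the install.
command = pip_install_command("selenium")
print(" ".join(command))
```

Using sys.executable makes sure the package lands in the same environment the notebook kernel runs in.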

INSTALLATION USING CONDA

Open the Anaconda command prompt and type the following to install Selenium from the command prompt.

DOWNLOADING CHROME WEBDRIVER

After Selenium is installed, I'll download the webdriver for Chrome. Before downloading the Chrome driver, we have to check the version of the Chrome browser installed on the local machine.

CHECK THE VERSION OF CHROME

Before downloading the driver, check the Chrome version using the steps below.

DOWNLOAD CHROME DRIVER

Once the Chrome version has been checked, the link below can be used to download the Chrome driver.

After it opens, click the driver version that matches the installed version of Chrome. This opens another tab, as shown in the image below, listing drivers for different platforms. Click the one you need to start downloading the driver.

CONCLUSION

As we all know, the world is changing rapidly and data has become the new definition of power. It is clear that those who can harvest data with scraping tools and use it properly to make decisions for their industry will be far ahead of their competitors. Advanced knowledge of web scraping tools is therefore a must to survive and compete in this changing landscape.

0 notes