Tumgik
#sklearn
subair9 · 6 days
Text
ML Zoomcamp
Just completed the first week of Machine Learning Zoomcamp.
The lessons covered include:
What is Machine Learning
Machine Learning vs Rules Based Systems
Supervised Machine Learning
CRISP-DM
Model Selection Process
Setting up the Environment
Introduction to NumPy
Linear Algebra Refresher
Introduction to Pandas
Summary
The link to the course is below: https://github.com/DataTalksClub/machine-learning-zoomcamp
1 note · View note
teguhteja · 28 days
Text
Unlocking the Power of Data Preprocessing: Mastering Normalization and Standardization
Dive into data preprocessing! Master normalization & standardization. Python & Pandas examples included. Boost your #MachineLearning models. Perfect for #DataScience enthusiasts. Learn how to transform your data for better results!
Data scientists and machine learning enthusiasts often grapple with raw data that needs refining, so understanding data preprocessing techniques is crucial for success. In this blog post, we’ll dive deep into two essential methods: normalization and standardization. We’ll explore how these techniques can transform your data, particularly focusing on…
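As a rough illustration of the two methods the post names, here is a minimal pandas sketch. The toy data, the column names, and the helper functions `min_max_normalize` and `standardize` are assumptions made up for this example, not taken from the linked post. Normalization here means min-max scaling to [0, 1], and standardization means rescaling to mean 0 and standard deviation 1.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for illustration.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "income": [20000.0, 35000.0, 50000.0, 65000.0, 80000.0],
})


def min_max_normalize(frame):
    """Rescale every column to the range [0, 1] (min-max normalization)."""
    return (frame - frame.min()) / (frame.max() - frame.min())


def standardize(frame):
    """Rescale every column to mean 0 and population standard deviation 1."""
    return (frame - frame.mean()) / frame.std(ddof=0)


normalized = min_max_normalize(df)
standardized = standardize(df)

print(normalized)
print(standardized)
```

Min-max scaling keeps the shape of the distribution but is sensitive to outliers, while standardization is the usual choice when a model assumes roughly centered features. Which one fits depends on the data and the model.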
0 notes
spec-vp · 1 year
Text
downsides of the grad school thing: HAHAHHFHFAHSGAHHHAAAAAAAAHHHHHHHHHHHHAGGSDHGAHDsAADSHGAHHHHHHHHHHHHKAHSGLLLGAGJKAFDL
upsides: preddy latex tables :3c
2 notes · View notes
scumtrout · 2 years
Text
Why are we still here? Just to suffer? Every day, I need to learn about machine learning for sentiment analysis during my free time.
2 notes · View notes
juliebowie · 2 months
Text
Sklearn Cheat Sheet for Quick Machine Learning
A handy cheat sheet for using Sklearn, the popular machine learning library in Python. Find quick references to essential functions, algorithms, and workflows to accelerate your machine learning projects.
0 notes
firefox-enthusiast · 2 years
Text
Why is no one talking about my birthday???
0 notes
bayesic-bitch · 4 months
Text
I will say that Python libraries have some extreme variation in quality. "Data science" libraries like seaborn and sklearn are absolute dog shit nightmares that assume you are too stupid to understand anything. I did not have good experiences with PIL or Pillow. Matplotlib is too convoluted with the more complicated features, but at least the core stuff is clean and accessible. And numpy, pytorch, and Gym are just absolute masterpieces of clean and elegant design.
110 notes · View notes
sabakos · 28 days
Text
okay sklearn is actually pretty great
5 notes · View notes
Text
Import Cygnus Oscuro
Summary: Creative Writing Final. It's a fedex humans are space orcs au. They're forced to be in the proximity of one another and it's fun for everyone except for those directly involved.
Word count : 5244
TW: one (1) swear word, auton (robot) racism including an in-universe slur (thanks, Fitz), absolutely incomprehensible worldbuilding (thanks, Squish)
Taglist (lmk if you want to be added/removed!): @stellar-lune @faggot-friday @kamikothe1and0nly @nyxpixels @florida-preposterously @poppinspop @uni-seahorse-572 @solreefs @i-loved-while-i-lied @rusted-phone-calls @when-wax-wings-melt @good-old-fashioned-lover-boy7 @dexter-dizzknees @abubble125 @hi-imgrapes @callum-hunt-is-bisexual @callas-pancake-tree @hi-my-name-is-awesome @katniss-elizabeth-chase @arson-anarchy-death @dizzeners @thefoxysnake @olivedumdum @loveution
On Ao3 or below the cut!
Bonus worldbuilding / q&a / suffering because I doubt any of this makes sense
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import datasets, model_selection, metrics
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.preprocessing import *
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import *
    “Once again, what do you mean by Eifelia? The planet itself or the system as a whole, including its moons?” Sophie asks, staring out the window at the receding planetary surface as their spaceship, affectionately called the ‘Cygnus Oscuro’, lifts off the ground.
    “Eifelia has only one confirmed inhabited moon, Batyrbai. Your home planet of Datson is the only satellite in the Telychian system to have more than one moon that is suitable for habitation. Supplies were acquired at the port of Darriwilian, located at 25.78, -80.21, on the planet of Eifelia itself,” Dex replies, reading off the coordinates from the corner of xor vision. 
    It’s very easy to read off coordinates when xor neural network is constantly searching for information that it thinks will be helpful to xem. It, more often than not, is entirely extraneous information, but it is difficult to discern when, exactly, it will be of assistance. 
    Dex continues, “Five crew members departed in preparation for Eifelia’s cyclical festival of Batyrbai appearing full in the sky from the dark side of the planet. In turn, three crew members embarked.”
    Dex’s fan freezes up. “One of these crew members is a human, a species which hails from Earth, most accurately described as a ‘Death Planet’. It is located in the system of Sol, 40.3 light years away. Take care to avoid any and all possible conflicts.”
    Sophie fixes Dex an unbelieving look. “They can’t be that bad.”
    Article after article scrolls across Dex’s field of vision. “They’ve earned their infamous reputation and most are highly unaware of it. Did you know they have contests to see which one can suffer through the most capsaicin-induced pain? Then, to cool the pain, they consume a drink full of near-impossible-to-digest lactose sugar.”
    “Yeah, and you can bend titanium without even a second thought.”
    “I’m sure a dedicated enough one would figure out how to do that.” 
    Sophie rolls his eyes. “I’ll make sure to tell Keefe not to be an intergalactic space wyrm this week but I don’t think that’s going to be happening any time soon.” 
    Dex’s processor runs the numbers, and Sophie is correct for once. In any other situation, a correct prediction from him would be a thing to praise, but in this particular case, it’s more worrisome for Keefe’s safety.
    stars_df = pd.DataFrame(data=stars.data, columns=stars.feature_names)
    stars_df.iloc[39060]
    name                “Beta Pictoris c”
    distance_ly         60         # light years, 3*10^8 m/s
    yerkes_stellar_class    “A6V”
    mass                4658.44    # Eifelia masses, 4.13*10^24 kg
    orbital_period         197.55     # Eifelia years
    grav_accel         182.470    # m/s^2
    surface_temp        1250       # kelvin
    “Greetings,” Dex’s assigned partner says as Dex slides into the chair next to him. His voice is blanketed with a thick accent Dex’s processor is unable to place, though the circling loading sign in the top corner is certainly trying. Such is the curse of exploring new planets faster than xor updates are able to keep track of them. 
    Today’s mission is expected to make that problem worse, although only slightly.     
    “I’m Fitz,” he says, holding out a hand.    
    “I’m D3x+3r,” Dex replies, not actually pronouncing the numbers like numbers even if they should be pronounced like numbers because they are numbers. The loading wheel is still circling around itself. “Although most people call me Dex because apparently two syllables is too many. I don’t understand it either.” 
    Fitz’s hand falls into his lap. “Nice to meet you, Dex.” He pauses. “Unless you have anything else I’ve forgotten, I think we can probably get going down to the surface so that we can get back sooner than later.” 
    Dex pushes away the loading circle in favor of the small transport ship’s inventory list. “I believe we have everything. If that is a false presumption, the communication link with the Cygnus Oscuro is up and running.” 
    Fitz gently undocks from the Cygnus Oscuro and that’s when Dex’s processor finally decides to provide xem with any information. It’s odd how it’s so proficient with useless information and finally now that it’s relevant, it takes a suspiciously long time. 
    It apparently doesn’t think it’s a major priority to know that xe’s just been sealed into a very small shuttle with a human. No big deal. This is both fine and normal. It’s not like they’re documented to have very short tempers. 
    Now the accent makes sense. Humans have hundreds of different languages, owing to their incredibly diverse geographic distribution. Most other species, including the Eifelians, only exist in small pockets in the corners of their worlds. Humans looked at that and went ‘no, I don’t think I will.’ Every species but humans is almost immediately recognizable by its accent. They live to be difficult.
    Even if the accent hadn’t been atrociously obvious in hindsight, the lines streaking across his skin—Blaschko lines, Dex’s processor claims—should have given his heritage away. The even more entertaining part is that most humans don’t even know they have them. 
    Dex’s processor is able to pull up Fitz’s official file without too much difficulty, and that seems like a mostly safe conversation to have instead of stilted silence. “So, how long have you been part of Parallax?” 
    “Well, my parents have worked here since before I was born, so the answer I usually give is, ‘Yes.’ How about you?” 
    “I was built on Gzhelia roughly 250 Eifelia years ago.” Dex pauses, converting this to a unit hopefully a little more familiar to Fitz. “That’s a little more than 4 Earth years.” 
    Fitz’s brows draw together. “Built?”
    Dex’s fan pauses in such a way that it sounds like a sigh as xe pulls back the artificial skin away from xor wrist, revealing the wires twisting underneath. A green fiber optic cable shimmers in the artificial light of the shuttle. 
    “I am aware that I am running on slightly older hardware, but I promise that my software is as up to date with the most current Parallax Dataframes as an update cycle half an Eifelia year ago could provide. Again, for ease of conversion, that is about three Earth days.”
    “You can stop with that. The conversions. I’ve grown up around more Eifelia time than Earth time.” 
    “I apologize. I was simply trying to prevent any incidental miscommunication before there was an issue. I will refrain from it in the future.” 
    The table of conversions still floats in front of Dex’s vision like a temporary burn-in.    
    Dex and Fitz sit in a silence that even Dex’s emotion identifier that was deprecated two years ago can identify as uncomfortable. Xe really should get around to installing a new one. 
    Fitz is the one to break the silence. “How’d you know I was human? Your little CPU tell you?” 
    Dex nods slowly. “Yes, it did, along with installing several files explaining your species’ customs. I can feel one of them slowing down my SSD flash memory with its sheer size.” 
    “Yeah, yeah, we all get it. Humans are big and loud and dumb and there’s so many of us that you can’t be bothered to learn all of it.” 
    Fitz flicks a half-dozen switches, initiating the landing sequence of the shuttle now that it is within the last thousand kilometers of altitude. The reason that it has to be activated so early is due to Beta Pictoris C’s incredibly high gravitational acceleration, causing the shuttle to have a much higher velocity than if it were under the gravitational influence of most other planets. 
    In other, more numerical terms, gravitational acceleration on Beta Pictoris C at the surface is about 182.970 m/s², while, for reference, Eifelia’s is 8.011 m/s². Of course, they are still up in the air, meaning that their orbital radius is slightly larger than the planet’s radius, but that really is not that much of a difference due to the sheer scale of the planet.
    It’s no wonder Parallax has chosen the two of them for this mission—they’re the most likely to not be crushed under the sheer weight of the surface gravity. Or, more accurately, their own weight due to the increased surface gravity. 
    Fitz touches down gently, one of the very few landings Dex has experienced without involving a significant amount of screaming. 
    “Are you ready to go find one amino acid and then leave?” he asks, standing up. 
    Searching for life on planets like these is, for lack of a better descriptor, a neural-network-numbing process involving taking a few dirt samples while trying to make sure that Dex’s zinc components don’t get instantaneously vaporized, among other problems. 
    A-type stars aren’t even the hottest ones out there, but they’re on the very edge of what is believed to be habitable due to their instability. Their scarcity in the universe also makes it much more unlikely for life to have the opportunity to form around one. 
    It’s nearly inhospitable to every life form currently described, leaving a few carbon-fiber autons to figure out how to sample things on stars-forsaken planets that are literally half the surface temperature of Eifelia’s home star, Telychia. 
    “It would probably be beneficial to don some protective clothing before doing that, even if Beta Pictoris C is nearing aphelion and we have landed on the night side. Do you happen to know if it is tidally locked?”
    “That’s not in your file system?”
    “I regrettably am unable to locate it if it is.” 
    Fitz rolls his eyes, muttering, “Turing incomplete,” under his breath. 
    It takes a few milliseconds for Dex’s processor to provide the context to that statement, and that context is not a flattering one. Its origins lie with the first human theoretical computer scientist, Alan Turing, and it became popularized due to Earth’s history with artificial intelligence.
    It’s…not a pleasant history. 
    “Do you believe that infinite memory is possible? Because everything is technically only Turing complete when it is assumed to have infinite amounts of memory, which is impossible to create in the real world. Thus, every device, including this shuttle and your knee replacement, is Turing incomplete.”
    “Yeah, but at least I can feel emotions.”
    Fitz slides the heat suit’s helmet over his head, obscuring his face.
    “Most of your emotions are induced by shifts in hormonal signals. The Floians don’t have hormones. Does that mean that they too are artificial because they do not experience emotions in the same way that you do?” 
    Fitz opens the shuttle door, pressing himself against the wall to avoid being blown away by both the swirling, windy atmosphere blowing dust into all of the delicate machinery of the shuttle and the zeroth law of thermodynamics. 
    Dex’s fan immediately kicks into its highest gear, and it will stay there as long as the door remains open, barring some catastrophic, friction-related disaster. 
    “The Floians had to figure out how to evolve on their own. That should be a reasonable enough distinction for you.”
    “That implies that genetically modified organisms don’t count as organisms. And then, most autons learn via a reinforcement algorithm that mimics how evolution works in order to train a neural network. That’s the thing that I have making decisions in my ‘little CPU’ and its trillion transistors. How many neurons do you have again?”
    Fitz steps out into the outside, his suit making him look like a large orange nebula. Hopefully the door doesn’t decide to close with its own artificial consciousness like last time. That was not a fun time. 
    “Why do you ask when you could just search through your files? I’m sure it’s in there.” 
    “The answer was 135 billion,” Dex says flatly. That would be a more relevant description if xe was able to inflect xor speech more, but xe has found the setting to make xor voice a specific frequency and uses it a touch more than xe probably should. 
    Fitz turns back to Dex. “What are you doing? The sooner we get these samples into your file system, the sooner I stop looking like the stay puft marshmallow man.”
    Dex smiles as the image flashes across xor vision. Xe follows Fitz down the ramp, revealing the expected vast desertlike landscape of Beta Pictoris C.
    It’s significantly too hot for water to remain liquid but—there’s something odd about the erosion patterns. Those might not just be wind erosion. Xe downloaded a whole library of algorithms a couple of months ago. 
    Ignoring Fitz’s demands to know where xe’s going, Dex approaches one of the striated, gray rock formations. 
    url = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vTCZgoegOH a49SFXYU-ZZTdCkgTp0sn&single=true&output=csv'
    rocks_df = pd.read_csv(url)
    features = rocks_df[["depth", "width", "mohs_hardness"]]
    label = rocks_df["class"]
    X_train, X_test, y_train, y_test = model_selection.train_test_split(features, label, test_size = 0.2, random_state = 42)
    model = KNeighborsClassifier(n_neighbors = 53)
    model.fit(X_train, y_train)
    new_rock = pd.DataFrame([[7, 4, 6.5]], columns=features.columns)
    pred = model.predict(new_rock)
    A smile blossoms across Dex’s face. “We’ve got liquid erosion. It’s slightly less viscous than water, but liquid erosion nonetheless.” 
    Fitz stares at xem, waiting for an explanation that takes a long time to get there. 
    “I’m going to have to run some simulations on the ship because I don’t have enough RAM for the kind of resolution I want, but there’s potential that there used to be water here, and I’m sure you’re aware of how water and life are synonymous. Most of the time.” 
    Dex carefully scrapes off a corner of the ashy sandstone column for further study because xe, quite unfortunately, doesn’t have a built-in mass spectrometer. It’s also generally good practice to collect samples. 
    Another aspect of good practice is to look at more than one rock before drawing conclusions about an entire planet. 
    Dex traces into the dirt a simple sketch of Fitz in his marshmallow suit. He’s lucky to have all of his appendages attached, let alone proportional. Dex then takes a sample of the dirt. The mixing helps to paint a better picture of what the sand is like, rather than just the solar-radiation-exposed topsoil. 
    Suddenly, Fitz swears, pointing at something in the vial. That something is a little creature wiggling its way around the glass. 
    Dex nearly drops it, which would have been a less than ideal decision, as xe tries to find the little guy who is desperately trying to not be seen. 
    The little guy has a fairly standard arthropod-style body plan, with an exoskeleton and a number of legs that is larger than 2 and smaller than the number required for ‘burn it alive’ algorithms to kick in. So somewhere in the 6, 8, 10 range is probably pretty reasonable.
    Although, to be fair, even numbers are more of a guideline than anything else. Once again, Earth is an exception to the rule with a three legged fish down in some of the deepest parts of its oceans. Also echinoderms with their five-fold radial symmetry. 
    “You, uh, might want to put him down,” Fitz suggests. “You don’t want to be charged with kidnapping should that little bug guy who I’m now going to be naming Fred turn out to have a consciousness.”
    Humans’ inclination to name creatures that have no way of communicating with them is a fairly large section in their file overview. It seems as though this can even occur with inanimate objects, which just links to a page advertising a pet rock, whatever that’s supposed to mean. 
    Dex pours the vial back onto the ground and attempts to take another sample without kidnapping another Fred. 
    Is that how human naming goes? Does it really matter? 
    The only reason this is a question is probably because it feels like all of Dex’s wires are currently being poached in the water designed to cool them.
    There’s another one in the next vial. And the next. It’s almost like spontaneous generation but, like, not yet disproven by putting meat in a jar and covering it so maggots don’t get laid on it. 
    Yeah, that’s literally what the humans decided to do. Specifically one named Francesco Redi. Seems like a waste of calories for a species who needs to eat a lot of them to support their endothermic metabolisms. At least they figured it out in the end. 
    The fourth attempt seems to be safe as Dex only fills the vial halfway and shakes it extensively to avoid accidental kidnapping. Now the only possible complication could be microscopic creatures, but that’s past the point of reasonable care. 
    Fitz spends another few minutes gallivanting around, likely wandering around for more interesting samples, even if the entire report is already writing itself in the back of Dex’s processor. 
    He returns with a half dozen more samples of varying mineral compositions which get stored in his marshmallow suit’s pockets. “I saw another guy. Sorry I couldn’t get a picture, but he kind of looked like a scorpion. If you know what those are.”
    Dex nods, projecting a picture of one onto the first rock ledge just to prove that xe has image files stored in xor drive. 
    “Yeah, he looked kind of like that.”
    Dex switches the picture to a different one, one that isn’t necessarily a true scorpion. That doesn’t stop Eurypterus from colloquially being called a sea scorpion. It also doesn’t stop them from being extinct on Earth for around 252 million of its own years. 
    Fitz repeats, louder this time, “Yeah, he looked kind of like that.” 
    Fitz’s new best friend the Beta Pictoris C scorpion, who notably has yet to be blessed with a name, hops up onto the rock ledge, and it’s remarkable how similar they look, albeit the hologram being significantly larger. Blue swirls across its hardened exterior, and its pincers look like they’re very ready to reduce the number of fingers Dex has. 
    A warning light flicks on in the corner of Dex’s vision, cutting off access to xor files. 
    “We should probably be getting back to the ship. I have the coordinates of our landing point so that a larger, more prepared team can conduct a more detailed study. And before you begin to state that we are that team, if I am to stay out here for much longer, I will probably end up shutting down, and that is a burden I would rather not impose upon you.”
    It’s kind of odd how Dex’s vision is able to start flickering as xor processor threatens to have enough for the day. One would think it would work the same as when it gets too cold, but no. One second, xe’s completely fine and the next, xe’s restarting after eighteen hours trapped in an avalanche. 
    This is a normal experience. It’s not Dex’s first time, and most other autons xe has communicated with have had similar ones. It’s a risk associated with the job, and xor data won’t be lost in anaerobic environments the same way that data in a biologically designed brain will.
    Unless that brain belongs to an obligate or facultative anaerobe, but the vast majority of intelligent species do require some form of a gas to function. Many use oxygen, but carbon dioxide, methane, hydrogen, and carbon monoxide are fairly common as well. 
    Dex and Fitz make their way back into the spaceship and make absolutely certain that the hatch is sealed before peeling off their marshmallow suits. Dex’s blinking temperature warning sign disappears, but xor fan still remains running at full speed. 
    Fitz collapses into the pilot’s chair, sweat streaking down his brow, and barely waits for Dex to sit down beside him before lifting off. 
    They once again sit in an uncomfortable silence, punctuated only by the sounds of Fitz flipping various switches on the shuttle’s control panels. 
    Dex makes half a note that xe should learn how to fly a ship at some point, although Sophie would rapidly abuse that particular ability. 
    Once xe’s back aboard the Cygnus Oscuro, xe locates the mass spectrometer in order to analyze the samples before Fitz starts telling everyone about the larger portion of their discovery, because then xe’s going to have to answer other people’s questions instead of xor own.
    url = 'https://docs.google.com/spreadsheets/d/16lsnIQaP37r682gKuz CZp-YqLgCis-Ln4PSaDEpiAjw/edit#gid=0'
    mass_spec = pd.read_csv(url)
    compounds = []
    for i in range(len(mass_spec)):
        compound = identify(mass_spec.iloc[i])  # one reading (row) at a time
        compounds.append(compound)
    It turns out, to absolutely no one’s surprise, that liquid water doesn’t exist inside of the rock samples, but tricobalt tetraoxide, Co₃O₄, is in there, and it is a liquid at the planet’s surface temperature. It’s certainly a choice for an electron donor, and it’s kind of a wonder the entire planet isn’t bright blue with the cobalt(II) ions.
    Dex isn’t surprised to find out that, by the time xe’s had enough time with the samples, the entire ship knows about the little arthropod that was found, even if it isn’t formally related to the Earthen phylum Arthropoda that Fitz is comparing it to.
    They look similar. It’s close enough. 
    What Dex is surprised to find is that everyone wants a tour to see them despite the fact that the vast majority of the crew would acquire heat stroke almost instantaneously. This is xor thirty-sixth mission to actually go down onto a planet for the first time—autons are cheaper to replace than biological organisms—and this is by far the biggest response to a new species. 
    It’s odd. Xe doesn’t like it. 
    Dex’s neural network wants to blame it on Fitz, and there really isn’t any data to contradict that particular hypothesis. It also makes it a very difficult hypothesis to test, which makes it significantly less useful as a hypothesis. 
    On the other hand, a useful hypothesis would be one relating to the actual little alien creatures that for some reason are able to live on a planet that’s more similar to a furnace than a habitable landscape. 
    And so, against all logical reasons surrounding the temperature of a planet known to be at least twice the temperature of the hottest previously confirmed life forms. Of course it’s on Earth. Hydrothermal vents don’t look like a place where organisms could live, and then they’re just down there chilling. That’s probably not the best choice of a descriptor. 
    When in doubt, the answer is more often than not ‘Earth is a weird planet.’ 
    The journey back down to the surface with Fitz passes with significantly less fanfare than the first, the beeping of the ship being obnoxiously loud in the deafening silence. 
    They touch down, Fitz not taking as much care as last time with making sure the landing has as little of a change in momentum as possible, which is to say that it’s nowhere near the gentle landing of the first trip. 
    Fitz leans back and sighs. “Do you have any commentary you’d like to provide or are you ready to go and collect data so we can finish our reports on this planet?”
    “I mean, I’m always collecting data, even if it's only a live feed of my precise coordinates getting thrown into a plaintext file never to be seen again, so the answer is closest to both of the above.” 
    That does not seem to be the answer Fitz wants as he takes one of his bags of human snacks—potato chips, according to what’s printed on the yellow label—and throws it into the garbage can in the corner. 
    “Wow.” Dex’s visual apertures widen. “I didn’t realize that throwing projectiles with accuracy was a human skill. I’ll make sure to add that to my files, as well as to the main system.”
    Fitz’s eyes flash, his features drawing into hard lines. “Are you physically incapable of not being condescending? I get it. I’m a human. I’m from a death planet. Humans are weirder than fucking dark energy. It doesn’t require that many comments about it to get your point across!”
    Dex pauses, letting xor neural network fully process Fitz’s statements before replying, “I don’t understand where I was being condescending.”
    “You just did it two sentences ago!”
    “I did not do anything two sentences ago. It was genuinely quite interesting how your species has evolved to throw objects with accuracy, even ones with high surface area to volume ratios such as that bag of chips, because it is not something that has been documented in any other intelligent species.” 
    “Oh, please. It’s a basic skill.”
    “Do remember that your species evolved in part to bring down large prey such as Mammuthus primigenius. Throwing spears at a wooly mammoth directly led to that ability being rewarded with a higher rate of nutrients, and thus resulted in the following generations being more able to throw spears as well.”
    “You know all of that but you didn’t figure out that throwing things is pathetically easy? Your little auton brain isn’t very good at drawing conclusions from data you have, is it?”
    “It is simply something I did not have cause to consider before now, though I do recognize that it would have been quite easy to identify without the inciting event.”
    “And you’ve also said that you have a very large file on humans. Most of our games are based around the concept of throwing a ball. Was that not enough information to extrapolate that maybe we’re good at it?”
    “Games of chance are common in many species. It follows that this could simply be a manifestation of that desire in humans, so games like your ‘basketball’ or ‘baseball’ do not provide sufficient evidence to draw conclusions such as the ones you’re suggesting.”
    Fitz rolls his eyes. “Why do I even bother? It’s not like you’re going to change your mind. You don’t have a mind to change.”
    Dex wants to explain that xor neural network is actually changing its dependence on its individual nodes on a regular basis, but that doesn’t seem to be advantageous in this particular context.
    Fitz rolls his eyes, muttering in what is likely his native tongue—one which Dex has not downloaded the translation file of—as he gets into the marshmallow suit once again. 
    They go out, describe a half dozen new arthropod-esque species, each with more legs than the last, and return with more samples with as few words as possible. But nothing is ever allowed to be simple. 
    The hatch on the shuttle has decided today that leaving itself open in the blistering heat is not something it likes to do, and while Fitz and Dex are distracted, it shuts its doors. 
    In turn, it opens the floodgates for Dex to learn some new fun human swear words when Fitz notices what’s happened. 
    “No reason to worry,” Dex says, making xor way through the sand to open up the back emergency panel that exists for exactly this reason. 
    “Uh, I left the keys in there. There’s very much a reason to worry.”
    “And I’ve got admin privileges. It’s fine. Go back to looking for the next beetle you’re going to call your son.”
    “Don’t be rude to Benny like that. He’s not that replaceable.”
home@Cygnus-shuttle-3:~$sudo su
home@Cygnus-shuttle-3:~$******
root@Cygnus-shuttle-3:~$ufw disable 
    There’s no particular reason why the firewall sometimes decides to make the hatch close, and this is enough of a solution for Dex to not go searching for an answer. 
    As the door begins to open again, Fitz asks, “So, what’s the password?”
    “I’m the password.”
    “Yes, yes, I understand that you’re helpful. Now, what’s the password if this were to happen again and you aren’t around?”
    “I’m the password. It’s literally just my name. D3x+3r. It’s got an uppercase character, lowercase character, number, and a special character. My friend Sophie thought he was hilarious when he heard it, so now it’s my password for everything. Don’t tell anyone.”
    “I won’t. I don’t even know where the special characters are even if I wanted to.”
    “The ‘t’ is replaced with a plus. The ‘e’s are fairly obviously transliterated to ‘3’s. There’s nothing fancy going on here.”
    Fitz turns to walk away but stops himself. “The name Sophie feels a little familiar. Does he by any chance know a Keefe?”
    “Yes, actually. The two of them dated for a while. Although I’m not sure if that should be in the past tense. I stopped asking for updates a while ago.” 
    Fitz laughs. “Stars, I wish I could figure out how to do that. I’ve never escaped from them.”
    “Just kind of stare blankly into the distance and people will stop wanting to tell you things. They’re usually doing it because they want compliments on whatever it is they’re telling you, and by depriving them of that, they stop wanting to do that.”
    “Are you sure you’re an auton?”
    Now it’s Dex’s turn to laugh, a sound xe was very much not designed to make, so it sounds more like an out of tune record skipping. “Yeah, I think so. I’ve walked into too many door frames to have gone this long without getting a contusion, which is another thing your species doesn’t particularly care about getting.”
    “Case in point: I found one on my leg yesterday and I have no idea how I got it. It’s already green and I’m not sure how I hadn’t noticed it before. I guess that’s what I get for being from a death world.”
    Dex gestures widely to the rolling desert around xem. “I think Earth’s death world status may be a bit outdated. If this isn’t a death world, I don’t know what is, and, by comparison, I’m pretty sure Earth is an absolute paradise. You didn’t have to evolve to use tricobalt tetraoxide as an electron donor.”
    “We’ve also had five mass extinctions,” Fitz interjects. 
    “So has everybody else, including the Datsonians, even if their government would rather not admit that out loud. You’re not special.”
    Fitz snaps his fingers inside of the marshmallow suit, which does not work well with the thick padding of the gloves. “And that’s exactly what I wanted you to admit.”
    “Is that why you volunteered to come back down here?”
    “That was mostly a decision based on Parallax’s inability to find another poor sap that would be willing and able to come down here.”
    “Wouldn’t it be really funny if they send a Gzhelian in your place?”
    Fitz smiles, the sound of the air conditioners they use onboard the Cygnus Oscuro at a nice, toasty 200 kelvin having kept him from sleeping for nearly as many hours as Dex has wanted to disconnect xor audio input. 
    A beat of silence stretches in the space between them, but for the first time it isn’t immensely uncomfortable. 
    “We should probably be getting back inside the shuttle before it decides to close again,” Dex says, even if it would be very entertaining if they stood outside long enough for it to grow its own intelligence again. 
    After all, that’s kind of how xe got here. Xe’s going to get replaced by a shuttle door within the next couple of Eifelia years. 
    Xe’ll probably get assigned to, like, repairing the Cygnus Oscuro in all of the places the non-auton mechanics are unable to go, but at least xe’ll have discovered a wondrous new world before that happens. 
while True:
        # avoid getting hit by Fitz’s projectiles
        # no, seriously, they’re dangerous
        update_coordinates()
        data_status = upload_data()
        if data_status:
            break
11 notes · View notes
spikybanana · 1 year
Text
get to know me tag
thank you for the tags @lynxindisguise @wanderingdonut @twostarscolliding !!!<3<3
relationship status: necessarily single :)
favourite colour: the kind of purple on the underbellies of clouds
stuck in my head: temptation by heaven 17, not even the whole song, just the bit where it goes "[something something] temptation!!"
last song i listened to: high and dry by radiohead
3 favourite foods: man. anything that isn't the all in a pot stew I've been cobbling together. chocolate. rice. and uhhhh idk fried rice!! which is very unspecific haha I know, but like. there's no bad fried rice
last thing i googled: I'm sure it's something unhinged about the star wars universe but it was in incognito, and the first thing in my actual history says "naive bayes with sklearn" (what I asked my brain for: just a few hundred lines of code, with maybe a bit of divine intervention; what I got: 4k words on a new wip about space monks)
dream trip: a small-ish dream but I wanna walk the west highland way in scotland! some time before I graduate for sure
absolutely zero pressure tags, sorry if it's a repeat!!: @shipsgaysfordays @deadgayfurrywizardsinthe70s @pinklume @whywcd @everythingbutcoldfire @lilyflxwers @achilleslikespeas @coldnerdnacho
24 notes · View notes
0x4468c7a6a728 · 7 months
Text
sklearn (as in scikit-learn) should be pronounced /sklɝn/
3 notes · View notes
subair9 · 2 months
Text
LLM Zoomcamp 2024
Just completed the fourth week of LLM Zoomcamp.
The lessons covered include:
Introduction to monitoring answer quality
Evaluation and Monitoring in LLMs
Offline RAG Evaluation
Offline RAG Evaluation: Cosine Similarity
Offline RAG Evaluation: LLM as a Judge
Capturing User Feedback
Monitoring the System
The link to the course is below: https://github.com/DataTalksClub/llm-zoomcamp
0 notes
teguhteja · 30 days
Text
Unlocking the Power of Machine Learning with Sklearn: A Beginner's Guide
Unlock the power of Sklearn for machine learning. This beginner's guide covers data handling, preparation, and model structures. Start your AI journey now!
Demystifying Machine Learning and Sklearn
Sklearn machine learning basics. First and foremost, let’s unpack what machine learning really means. Essentially, it’s a branch of artificial intelligence that enables computers to learn and improve without explicit programming. This technology powers many aspects of our daily lives, from the voice assistants we use to the recommendations we receive on…
0 notes
antiquery · 9 months
Text
Tumblr media
baby's first sklearn model! (predicting fanfic popularity on AO3)
3 notes · View notes
sunscreenstudies · 2 years
Text
Iconic Things My Coding Professors Have Said (Part 8)
student "but how does this not cause problems?"   prof "ohhhhhhh no, it causes SO many problems"
"which is completely impossible to do in five weeks but we're going to do it anyway to showcase our awesomeness"
"if you recall, we previously examined this model using the examples of ice cream and murder"
"So, say you had two tweets you wanted to examine, let’s say... ‘joe biden bit donald trump’ and ‘donald trump bit joe biden’. you wouldn’t say that these are the same thing, right? right, of course not, because as much as trump would deserve it, biden could get rabies or something and nobody wants that"
"most of these things are related to food or sex which is ironic because those should never be combined in real life. never. do you hear me? n e v e r"
"so in the early 1960s, this guy revolutionized - or destroyed, if you like - the way that corpora were examined. and i do like. he seriously fucked things up"
"i don't have anything against philosophers. i was a philosopher myself in a past life"
"Chomsky did a lot of big important things for computational linguistics but a lot of his basic underlying assumptions are just... wrong"
"he tried to rewrite all of his old texts later in his lifetime, where he tried to remove all stranded prepositions... i don't know about you, but I can think of better uses of my time"
prof "what do you do in sklearn for this?" students look confused. prof looks nervous "i actually can't think of it myself, that's why I'm asking"
“there was actually a study done a few years ago about how often advertising companies commit fraud. so, let’s give a show of hands. who thinks that ads only lie 25% of the time? hands up... nobody? oh, thank god. you may all be cynics, but at least you're a promising bunch"
"There are only two ways of displaying data in statistics. there are lies and then there are BIG lies”
prof "have you heard anything that was used with this word in the past?"   student "yes, but i don't want to repeat it"   prof "yea, there's a ton of sexual stuff going around"
"when examining how many covid-affected people had symptoms of long covid, the scientists used a facebook group to find people to ask. but they used a facebook group created specifically for people suffering from long covid, and asking the long covid group how many of them have long covid is just... mind blowingly stupid"
"In the LGBT...P...Q...C...H? I don’t know anymore. In the alphabet soup of queers-"
prof "i'm just rushing through these things because that’s all i can do here given that we only have 5 minutes left"   student "actually we have 35 minutes, we finish at half past, not on the hour"   prof "... let me clarify. we only have 5 minutes left where my sanity will remain intact. what happens after that, i have no control over"
"but it's better than HTML in any case, which is full of problems... quite like my marriage, actually, but at least I know that HTML won't cheat on me. ANYWAY"
Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6 | Part 7 | Part 8
Part 9  | Part 10 | Part 11 | Part 12 | Part 13 | Part 14
28 notes · View notes
monuonrise · 2 years
Text
Running a k-means Cluster Analysis:
Machine Learning for Data Analysis
Week 4: Running a k-means Cluster Analysis
A k-means cluster analysis was conducted to identify underlying subgroups of countries based on their similarity of responses on 7 variables that represent characteristics that could have an impact on internet use rates. Clustering variables included quantitative variables measuring income per person, employment rate, female employment rate, polity score, alcohol consumption, life expectancy, and urban rate. All clustering variables were standardized to have a mean of 0 and a standard deviation of 1.
Although the GapMinder dataset which I am using is relatively small (N < 250), the code below randomly splits it into a training set (70% of the observations) and a test set (30%). A series of k-means cluster analyses was conducted on the training data, specifying k=1-9 clusters and using Euclidean distance. The average distance of the observations from their cluster centroids was plotted for each of the nine cluster solutions in an elbow curve to provide guidance for choosing the number of clusters to interpret.
Load the data, set the variables to numeric, and clean the data of NA values
In [1]:
'''
Code for Peer-graded Assignments: Running a k-means Cluster Analysis
Course: Data Management and Visualization
Specialization: Data Analysis and Interpretation
'''
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.cluster import KMeans

data = pd.read_csv('c:/users/greg/desktop/gapminder.csv', low_memory=False)

# Coerce every variable of interest to numeric; unparseable values become NaN
data['internetuserate'] = pd.to_numeric(data['internetuserate'], errors='coerce')
data['incomeperperson'] = pd.to_numeric(data['incomeperperson'], errors='coerce')
data['employrate'] = pd.to_numeric(data['employrate'], errors='coerce')
data['femaleemployrate'] = pd.to_numeric(data['femaleemployrate'], errors='coerce')
data['polityscore'] = pd.to_numeric(data['polityscore'], errors='coerce')
data['alcconsumption'] = pd.to_numeric(data['alcconsumption'], errors='coerce')
data['lifeexpectancy'] = pd.to_numeric(data['lifeexpectancy'], errors='coerce')
data['urbanrate'] = pd.to_numeric(data['urbanrate'], errors='coerce')

# Drop every row that has at least one missing value
sub1 = data.copy()
data_clean = sub1.dropna()
Subset the clustering variables
In [2]:
cluster = data_clean[['incomeperperson', 'employrate', 'femaleemployrate', 'polityscore',
                      'alcconsumption', 'lifeexpectancy', 'urbanrate']]
cluster.describe()
Out[2]:
       incomeperperson  employrate  femaleemployrate  polityscore  alcconsumption  lifeexpectancy   urbanrate
count       150.000000  150.000000        150.000000   150.000000      150.000000      150.000000  150.000000
mean       6790.695858   59.261333         48.100667     3.893333        6.821733       68.981987   55.073200
std        9861.868327   10.380465         14.780999     6.248916        5.121911        9.908796   22.558074
min         103.775857   34.900002         12.400000   -10.000000        0.050000       48.132000   10.400000
25%         592.269592   52.199999         39.599998    -1.750000        2.562500       62.467500   36.415000
50%        2231.334855   58.900002         48.549999     7.000000        6.000000       72.558500   57.230000
75%        7222.637721   65.000000         55.725000     9.000000       10.057500       76.069750   71.565000
max       39972.352768   83.199997         83.300003    10.000000       23.010000       83.394000  100.000000
Standardize the clustering variables to have mean = 0 and standard deviation = 1
In [3]:
clustervar = cluster.copy()
clustervar['incomeperperson'] = preprocessing.scale(clustervar['incomeperperson'].astype('float64'))
clustervar['employrate'] = preprocessing.scale(clustervar['employrate'].astype('float64'))
clustervar['femaleemployrate'] = preprocessing.scale(clustervar['femaleemployrate'].astype('float64'))
clustervar['polityscore'] = preprocessing.scale(clustervar['polityscore'].astype('float64'))
clustervar['alcconsumption'] = preprocessing.scale(clustervar['alcconsumption'].astype('float64'))
clustervar['lifeexpectancy'] = preprocessing.scale(clustervar['lifeexpectancy'].astype('float64'))
clustervar['urbanrate'] = preprocessing.scale(clustervar['urbanrate'].astype('float64'))
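The column-by-column scaling above could probably be collapsed into a single call. The sketch below is not the notebook's own code: it uses scikit-learn's StandardScaler on a small made-up frame (the toy values and the names toy, toy_std, col_means and col_stds are all hypothetical) to show that every column ends up with mean 0 and standard deviation 1.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy values standing in for two of the GapMinder clustering variables
toy = pd.DataFrame({
    "incomeperperson": [500.0, 2000.0, 8000.0, 30000.0],
    "urbanrate": [20.0, 45.0, 60.0, 90.0],
})

# Standardize all columns in one pass, keeping the column names and the row index
scaler = StandardScaler()
toy_std = pd.DataFrame(scaler.fit_transform(toy), columns=toy.columns, index=toy.index)

# StandardScaler divides by the population standard deviation (ddof=0)
col_means = toy_std.mean()
col_stds = toy_std.std(ddof=0)
print(col_means.round(6))
print(col_stds.round(6))
```

Keeping the fitted scaler around also makes it possible to apply the same transformation to new rows later with scaler.transform.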
Split the data into train and test sets
In [4]:clus_train, clus_test = train_test_split(clustervar, test_size=.3, random_state=123)
Perform k-means cluster analysis for 1-9 clusters
In [5]:
from scipy.spatial.distance import cdist
clusters = range(1, 10)
meandist = []

for k in clusters:
    model = KMeans(n_clusters=k)
    model.fit(clus_train)
    clusassign = model.predict(clus_train)
    # average Euclidean distance from each observation to its nearest centroid
    meandist.append(sum(np.min(cdist(clus_train, model.cluster_centers_, 'euclidean'), axis=1))
                    / clus_train.shape[0])
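One thing the loop above leaves open is reproducibility: KMeans starts from random centroids, so cluster numbering and the curve itself can shift between runs. A minimal sketch of the same average-distance calculation with a fixed seed, run on synthetic blobs rather than the GapMinder data (the helper mean_distance_curve, the seed and the blob settings are assumptions made for illustration):

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the standardized training data: 150 rows, 7 variables
X, _ = make_blobs(n_samples=150, centers=3, n_features=7, random_state=123)

def mean_distance_curve(X, ks, seed=123):
    """Average distance from each row to its nearest centroid, for each k."""
    meandist = []
    for k in ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        nearest = cdist(X, model.cluster_centers_, "euclidean").min(axis=1)
        meandist.append(nearest.mean())
    return meandist

ks = range(1, 10)
curve = mean_distance_curve(X, ks)
for k, d in zip(ks, curve):
    print(k, round(d, 3))
```

With random_state fixed, rerunning the cell should give the same curve and the same cluster labels, which matters here because the later interpretation refers to clusters by number.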
Plot the average distance of observations from their cluster centroid, to use the Elbow Method to identify the number of clusters to choose
In [6]:
plt.plot(clusters, meandist)
plt.xlabel('Number of clusters')
plt.ylabel('Average distance')
plt.title('Selecting k with the Elbow Method')
plt.show()
Tumblr media
Interpret 4 cluster solution
In [7]:
model3 = KMeans(n_clusters=4)
model3.fit(clus_train)
clusassign = model3.predict(clus_train)
Plot the clusters
In [8]:
from sklearn.decomposition import PCA
pca_2 = PCA(2)
plt.figure()
plot_columns = pca_2.fit_transform(clus_train)
plt.scatter(x=plot_columns[:,0], y=plot_columns[:,1], c=model3.labels_,)
plt.xlabel('Canonical variable 1')
plt.ylabel('Canonical variable 2')
plt.title('Scatterplot of Canonical Variables for 4 Clusters')
plt.show()
Tumblr media
Begin multiple steps to merge cluster assignment with clustering variables to examine cluster variable means by cluster.
Create a unique identifier variable from the index for the cluster training data to merge with the cluster assignment variable.
In [9]:clus_train.reset_index(level=0, inplace=True)
Create a list that has the new index variable
In [10]:cluslist = list(clus_train['index'])
Create a list of cluster assignments
In [11]:labels = list(model3.labels_)
Combine index variable list with cluster assignment list into a dictionary
In [12]:
newlist = dict(zip(cluslist, labels))
print (newlist)

{2: 1, 4: 2, 6: 0, 10: 0, 11: 3, 14: 2, 16: 3, 17: 0, 19: 2, 22: 2, 24: 3, 27: 3, 28: 2, 29: 2, 31: 2, 32: 0, 35: 2, 37: 3, 38: 2, 39: 3, 42: 2, 45: 2, 47: 1, 53: 3, 54: 3, 55: 1, 56: 3, 58: 2, 59: 3, 63: 0, 64: 0, 66: 3, 67: 2, 68: 3, 69: 0, 70: 2, 72: 3, 77: 3, 78: 2, 79: 2, 80: 3, 84: 3, 88: 1, 89: 1, 90: 0, 91: 0, 92: 0, 93: 3, 94: 0, 95: 1, 97: 2, 100: 0, 102: 2, 103: 2, 104: 3, 105: 1, 106: 2, 107: 2, 108: 1, 113: 3, 114: 2, 115: 2, 116: 3, 123: 3, 126: 3, 128: 3, 131: 2, 133: 3, 135: 2, 136: 0, 139: 0, 140: 3, 141: 2, 142: 3, 144: 0, 145: 1, 148: 3, 149: 2, 150: 3, 151: 3, 152: 3, 153: 3, 154: 3, 158: 3, 159: 3, 160: 2, 173: 0, 175: 3, 178: 3, 179: 0, 180: 3, 183: 2, 184: 0, 186: 1, 188: 2, 194: 3, 196: 1, 197: 2, 200: 3, 201: 1, 205: 2, 208: 2, 210: 1, 211: 2, 212: 2}
Convert newlist dictionary to a dataframe
In [13]:
newclus = pd.DataFrame.from_dict(newlist, orient='index')
newclus
Out[13]:
       0
2      1
4      2
6      0
10     0
11     3
...  ...
205    2
208    2
210    1
211    2
212    2
105 rows × 1 columns
Rename the cluster assignment column
In [14]:newclus.columns = ['cluster']
Repeat previous steps for the cluster assignment variable
Create a unique identifier variable from the index for the cluster assignment dataframe to merge with cluster training data
In [15]:newclus.reset_index(level=0, inplace=True)
Merge the cluster assignment dataframe with the cluster training variable dataframe by the index variable
In [16]:
merged_train = pd.merge(clus_train, newclus, on='index')
merged_train.head(n=100)
Out[16]:
    index  incomeperperson  employrate  femaleemployrate  polityscore  alcconsumption  lifeexpectancy  urbanrate  cluster
0     159        -0.393486   -0.044591          0.386877     0.017127        1.843020       -0.016099   0.790241        3
1     196        -0.146720   -1.591112         -1.778529     0.498818       -0.744736        0.505990   0.605211        1
2      70        -0.654365    0.564351          1.086052     0.659382       -0.727105       -0.481382  -0.224759        2
3      29        -0.679157    2.313852          2.389369     0.338255        0.554040       -1.880471  -1.986999        2
4      53        -0.278924   -0.634202         -0.515941     0.659382       -0.106122        0.446957   0.620333        3
..    ...              ...         ...               ...          ...             ...             ...        ...      ...
99    113        -0.464904   -2.354706         -1.445912     0.819946        0.414955        0.593883   0.526039        3
100 rows × 9 columns
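The list, dictionary and merge steps above all serve to attach one label per training row. Since KMeans returns labels_ in the same row order as the data it was fitted on, a shorter route may be possible. The sketch below uses made-up data (the frame train, its columns a, b, c and the seed are hypothetical) and checks that direct assignment matches the dictionary route.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical training frame with a shuffled, non-contiguous index,
# mimicking what train_test_split leaves behind
rng = np.random.default_rng(123)
train = pd.DataFrame(rng.normal(size=(20, 3)), columns=["a", "b", "c"],
                     index=rng.permutation(100)[:20])

model = KMeans(n_clusters=4, n_init=10, random_state=123).fit(train)

# Direct route: labels_ is ordered like the rows passed to fit()
merged = train.assign(cluster=model.labels_)

# Dictionary route, as in the notebook: map original index -> label, then look up
lookup = dict(zip(train.index, model.labels_))
via_dict = pd.Series(lookup).reindex(train.index)

print(merged.head())
print((merged["cluster"].to_numpy() == via_dict.to_numpy()).all())
```

The direct route also leaves the original index in place, so no reset_index call is needed before grouping by cluster.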
Cluster frequencies
In [17]:merged_train.cluster.value_counts()
Out[17]:
3    39
2    35
0    18
1    13
Name: cluster, dtype: int64
Calculate clustering variable means by cluster
In [18]:
clustergrp = merged_train.groupby('cluster').mean()
print ("Clustering variable means by cluster")
clustergrp

Clustering variable means by cluster
Out[18]:
              index  incomeperperson  employrate  femaleemployrate  polityscore  alcconsumption  lifeexpectancy  urbanrate
cluster
0         93.500000         1.846611   -0.196021          0.101022     0.811026        0.678541        1.195696   1.078462
1        117.461538        -0.154556   -1.117490         -1.645378    -1.069767       -1.082728        0.439557   0.508658
2        100.657143        -0.628227    0.855152          0.873487    -0.583841       -0.506473       -1.034933  -0.896385
3        107.512821        -0.284648   -0.424778         -0.200033     0.531755        0.614616        0.230201   0.164805
Validate clusters in training data by examining cluster differences in internetuserate using ANOVA. First, merge internetuserate with clustering variables and cluster assignment data
In [19]:internetuserate_data = data_clean['internetuserate']
Split internetuserate data into train and test sets
In [20]:
internetuserate_train, internetuserate_test = train_test_split(internetuserate_data, test_size=.3, random_state=123)
internetuserate_train1 = pd.DataFrame(internetuserate_train)
internetuserate_train1.reset_index(level=0, inplace=True)
merged_train_all = pd.merge(internetuserate_train1, merged_train, on='index')
sub5 = merged_train_all[['internetuserate', 'cluster']].dropna()
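The merge above only lines up because both calls to train_test_split use the same test_size and random_state on objects with the same rows, so they select the same observations. A small sketch on synthetic data (features, target and their values are hypothetical) that checks this, and shows the alternative of splitting both objects in one call:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins: 150 rows of clustering variables and a validation variable
rng = np.random.default_rng(123)
features = pd.DataFrame(rng.normal(size=(150, 2)), columns=["x1", "x2"])
target = pd.Series(rng.uniform(0, 100, size=150), name="internetuserate")

# One call: both objects are split with the same row selection
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.3, random_state=123)

# Two calls with the same seed, as in the notebook
X_train_b, _ = train_test_split(features, test_size=0.3, random_state=123)
y_train_b, _ = train_test_split(target, test_size=0.3, random_state=123)

print(len(X_train), len(X_test))
print(X_train.index.equals(y_train_b.index))
```

Splitting in one call removes the need to remember matching seeds, at the cost of deciding on the validation variable before the split.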
In [21]:
internetuserate_mod = smf.ols(formula='internetuserate ~ C(cluster)', data=sub5).fit()
internetuserate_mod.summary()
Out[21]:
OLS Regression Results
Dep. Variable:     internetuserate    R-squared:            0.679
Model:             OLS                Adj. R-squared:       0.669
Method:            Least Squares      F-statistic:          71.17
Date:              Thu, 12 Jan 2017   Prob (F-statistic):   8.18e-25
Time:              20:59:17           Log-Likelihood:       -436.84
No. Observations:  105                AIC:                  881.7
Df Residuals:      101                BIC:                  892.3
Df Model:          3
Covariance Type:   nonrobust

                      coef   std err        t    P>|t|   [95.0% Conf. Int.]
Intercept          75.2068     3.727   20.177    0.000     67.813    82.601
C(cluster)[T.1]   -46.9517     5.756   -8.157    0.000    -58.370   -35.534
C(cluster)[T.2]   -66.5668     4.587  -14.513    0.000    -75.666   -57.468
C(cluster)[T.3]   -39.4860     4.506   -8.763    0.000    -48.425   -30.547

Omnibus:         5.290    Durbin-Watson:      1.727
Prob(Omnibus):   0.071    Jarque-Bera (JB):   4.908
Skew:            0.387    Prob(JB):           0.0859
Kurtosis:        3.722    Cond. No.           5.90
Means for internetuserate by cluster
In [22]:
m1 = sub5.groupby('cluster').mean()
m1
Out[22]:
         internetuserate
cluster
0              75.206753
1              28.255018
2               8.639961
3              35.720760
Standard deviations for internetuserate by cluster
In [23]:
m2 = sub5.groupby('cluster').std()
m2
Out[23]:
         internetuserate
cluster
0              14.093018
1              21.757752
2               8.399554
3              19.057835
In [24]:
mc1 = multi.MultiComparison(sub5['internetuserate'], sub5['cluster'])
res1 = mc1.tukeyhsd()
res1.summary()
Out[24]:
Multiple Comparison of Means - Tukey HSD, FWER=0.05
group1  group2  meandiff     lower     upper  reject
0       1       -46.9517  -61.9887  -31.9148  True
0       2       -66.5668  -78.5495  -54.5841  True
0       3       -39.486   -51.2581  -27.7139  True
1       2       -19.6151  -33.0335   -6.1966  True
1       3         7.4657   -5.765    20.6965  False
2       3        27.0808   17.4617   36.6999  True
The elbow curve was inconclusive, suggesting that the 2, 4, 6, and 8-cluster solutions might be interpreted. The results above are for an interpretation of the 4-cluster solution.
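When the elbow curve is inconclusive, a second criterion can help narrow the candidates. The sketch below is not part of the original analysis: it computes scikit-learn's silhouette score for k = 2 to 9 on synthetic blobs (the data, the seed and the names scores and best_k are assumptions), where a higher score means tighter, better separated clusters.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the standardized training data
X, _ = make_blobs(n_samples=150, centers=4, n_features=7, random_state=123)

# The silhouette score needs at least 2 clusters, so k starts at 2
scores = {}
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=123).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(k, round(s, 3))
print("highest silhouette at k =", best_k)
```

The score is only a guide; a solution with a slightly lower score may still be preferable if its clusters are easier to interpret.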
In order to externally validate the clusters, an Analysis of Variance (ANOVA) was conducted to test for significant differences between the clusters on internet use rate. A Tukey test was used for post hoc comparisons between the clusters. Results indicated significant differences between the clusters on internet use rate (F=71.17, p<.0001). The Tukey post hoc comparisons showed significant differences between clusters on internet use rate, with the exception that clusters 1 and 3 were not significantly different from each other. Countries in cluster 0 had the highest internet use rate (mean=75.2, sd=14.1), and cluster 2 had the lowest internet use rate (mean=8.64, sd=8.40).
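The omnibus F test reported above comes from an OLS model with cluster as the only categorical predictor, which is equivalent to a one-way ANOVA. As a rough cross-check, the same kind of test can be run with scipy.stats.f_oneway. The sketch uses simulated groups (group means loosely echo the cluster means above, but the spread, sizes and seed are made up), so its numbers are not the post's results.

```python
import numpy as np
from scipy.stats import f_oneway

# Simulated internet use rates for four hypothetical clusters
rng = np.random.default_rng(0)
groups = [rng.normal(loc=mu, scale=5.0, size=30) for mu in (75.0, 28.0, 9.0, 36.0)]

# One-way ANOVA: are the group means all equal?
f_stat, p_value = f_oneway(*groups)
print("F =", round(f_stat, 2), "p =", p_value)
```

A significant F only says that at least one cluster differs; the pairwise Tukey comparisons are still needed to see which pairs do.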
9 notes · View notes