#not to mention the fact that i STILL have to go grab raw html from ao3 just to get this shit to copy correctly
commanderquinn · 1 year
Good Space Chapter 5: Stuck In The Middle With You
! i dont! keep these posts! updated! like i do! ao3!
that means you're going to find typos and shit (and possibly minor detail changes) that don't match the ao3 version! that's because im not going to bother fixing the tumblr posts until i finish good space as a whole. im only uploading them here as a backup tbh
master list / ao3 chapter link
consistent formatting? nah. in this house we believe in Convenient Formatting 🙏 rapid fire and no flashbacks again (when they start to get Super Painful later on you’ll mourn the days when i skipped them for extra fluff) we’re Zeroed In on the nerds for another hot minute. this is what happens when you get hooked on a fic by an idiot that’s more inspired by screenwriters than authors, srry ❤️
also this chapter (and probably quite a few throughout this fic) is specifically for the babes that have had to pick themselves up from the dirt after a romantic crash. i cannot tailor this in a vague way that lets anyone picking this up have their own catharsis here, right? mega impossible to one size fits all that. but what i CAN do is use the bundle of greek myth references that is ava’s concept to tell a story about regaining personal power after a total shitass tricks you into thinking youre not completely bitchin as you are ❤️
and i guess make a bunch of canadian jokes bc those are really funny to me tbh. thank you donnatella moss for the inspiration. the best accidental moose canada ever had
anyways. sit. get comfy 😌 think of the ex you reallyreallyreally wanna stab 🥰 and then go project that exact motherfucker onto alec ❤️
"Put it on."
"No."
"Put. It. On."
"Nope."
"It's going to look good on you."
Bucky flicks his eyes up from the news article open on his tablet. "Yes, it would."
"Great. Your head is still gigantic post-defrosting. Good thing the one I picked comes with buttons. Leave three of them undone—"
"I know how many to leave undone." That was a misstep. He knows it the second the words leave his mouth. She's going to use it as if it's compliance. It isn't.
"And I'm sure you remember how to get your arms through the holes, too. So, let's go." Natasha repeatedly taps her hand on the kitchen table, making her rings knock against the aged wood. "Make with the wardrobe change."
"I'm not wearing that, and I'm sure as hell not going anywhere," he counters blandly.
"Yes, you are. Get up."
"Eat dirt, Romanoff. I have this thing called a will of my o—"
"So, you don't want to go?"
"Correct."
"Nothing could convince you to change your mind?"
"Absolutely not."
"Who do you think is going to be more disappointed when I repeat that at the bar, Wyatt or Ava?"
Bucky's eyes close slowly. Gently. The movement is a stark contrast to the anger swirling in him, the majority of which is aimed at himself, not the Russian seeking to ruin his life. This was so easy to spot coming. So easy. And he walked right into it.
"Have you given—" Steve attempts around a mouthful of food, cutting off when Natasha hits him in the back of the head to make him stop. He takes a moment to wash down the Coco Puffs with a gulp of fresh coffee after that. "Have you given Wyatt an autograph yet? I gave him one. Super nice guy, you'll like him."
"Why is the brain trust suddenly invited to a night out?" Bucky demands. This is a fucking trap. There is no possible way that this isn't a fucking trap.
Natasha rolls her eyes at him. "We're plying them with booze to try and keep them from suing us into the ground for inflicting you on the populace. Now shut up and go change. You're not wearing those pants."
"I'm—" He cuts himself off mid-refusal. There's not a chance, not even a fraction of a percent of one, that Ava would take offense to him not wanting to go. He's told her, on multiple occasions, that he hates getting dragged out to these things. His friends are awful, and they just do this to torture him. He's not inclined to entertain that most weeks, and Ava knows that. "I don't have any other pants aside from—"
"Yes, you do."
"I'm not wearing tux pants to a—"
"The leather ones you keep for long rides."
Bucky stops, and not because Natasha just revealed knowing another secret he hasn't told her. That shit doesn't even faze him anymore. His eyes move down to the blue button-up she's trying to force him into, his lips pursing slightly. The leather pants she's not supposed to know about are worn to hell and back at this point. Heavy weathering, a hole or two at the back of the heels, more than a few deep scratches that'll become holes if he's not overly careful. Not the kind of thing that would usually be suitable for a night out.
That button-up is new, though. Looks expensive, too. Good quality silk. It'll look more natural on him under a jacket. Less like a significant effort and more like something he got roped into. Which is precisely what's happening.
Bucky sighs deeply, looking back up at her in resignation. "I have some ground rules."
"You're allowed to have approximately one."
He looks over at Steve in frustration. The bastard shakes his head with a cackle, a fresh scoop of Puffs halfway to his mouth. "Ooohoho, no. Nah-uh. There's a captain on deck tonight, but it is not me." He stands up, chewing quickly, a big dumb smile on his stupid face. "I'm being a good boy and following her orders."
Natasha knocks on the spot of hardwood directly in front of Bucky obnoxiously. "Name your singular rule. I still have to do my hair; hurry the hell up."
Her sass reminds him that he has to figure out what the fuck he's going to do with his hair. "I'm not dancing, for starters—"
"Great. None of us will hound you about dancing; you have my word. Go get dressed. We leave in an hour, and you'll be really embarrassed if I have to drag your unconscious body through the tower." Her eyebrows raise expectantly as she stands up, looking between him and the shirt. To add insult to injury, she taps her nails along his head on her way out of the kitchen.
Steve doesn't look over from where he's raiding the fridge for another snack. "For what it's worth, she sounded excited about the invitation."
Bucky's eyes squint suspiciously. "You invited her?"
"No, Nat did," he replies far too casually. "I was just in the room when she made the call."
"See, your fuck up here is that now I know—"
"I have information you can try to weasel out of me? Thanks, Buck, I appreciate that, seeing as I'm entirely inept when it comes to interrogation and spycraft—"
"Only for the most part. Was this your push or Nat's?"
"Are you asking to be a pest, or are you asking because you need to know?"
Bucky grinds his teeth. He can say the latter, and Steve will never know the difference. "I don't need to know, but—"
"Then fuck off." He shuts the fridge door with a gentle swing and a bright smile. "I have to go get dressed. So do you." He flicks at the bun resting against the back of Bucky's head on his way out. These fuckers are always touching him, and they don't pull the Canadian routine about it. "Should do something with your hair. It looks like it has blood on it."
It probably does. His last mission was designated complete all of twenty minutes ago, and he definitely bled through some of it. Bucky can't really tell on his end; he's still coming down from the adrenaline rush. Something Natasha used to her advantage, no doubt. 
"You fuck off," he grumbles long after Steve is out of earshot.
"I'm completely serious."
"No, you're panickin', ya big baby."
"I mean it."
"I'd like to go ahead and remind you that I was there when you purchased most'a your wardrobe. Both times. I think I'd know if y'didn't."
"I can't wear any of that. It's one thing when it's my space—"
"You're allowed to exist in other places, ya dweeb."
"I didn't say I wasn't allowed. Just that...." Ava trails off, her nerves finally catching up to her. The argument had felt like a funny joke when she poked her head through the doorway to start it. Now it's not feeling so funny anymore. Paige is doing that awful, shitty thing where she makes sense. Leaning against the frame and glancing down at the master bedroom's carpet, Ava feels small. "I don't know. The stuff I wear to conferences is too—prim. Most of it's ballroom shit and wouldn't work, anyways. All of my usual go-to's just... It all feels... stupid."
The energy drink chugging champion that is her best friend props herself up on her elbows where she's laid out on her bed. The headband she's wearing has two miniature alien heads poking up from it that wiggle with the motion. "Well, hey there, Alec. Long time no see, ya son of a—"
"Yeah, yeah," Ava waves her hand dismissively. The reminder does knock some of the pity party out of her, at least. There was a time when she made decisions for herself and herself alone. Those were damn good years, and Ava is trying like hell to get back into the mindset. The one she proudly lived in before she let someone talk her into being ashamed of who she is. "Let my freak flag fly, whatever. I still don't have anything to wear." Nothing that doesn't feel crushingly laughable, anyways.
"What about that lace skirt you've got, the one with the swirly patterns? That one's so cute."
Ava frowns. She's not looking to get squished in hosiery tonight, which would be the only way to save herself in something that short. "For dancing?"
"Mmm. That's, ya know, that ain't a bad point. It ain't exactly built for the breeze." Paige tilts her head to the side, making the aliens go wild. Her face pinches like she's brainstorming. Then her eyes go wide with excitement. "Oh! Wear that—the, the thing!"
"Gonna need more to go on." She snaps her fingers as Paige smacks at her own bedspread.
"The wrap dress!"
"You're out of your mind," Ava laughingly insists. Now that she's caught up to her best friend's train of thought, she's almost startled. "That's—first of all, I think it's technically a sun dress—"
"Who gives a shit? Ya look great in it."
"I look—that's beside the point. It... it's not too...?"
"Too...?"
"Shit, I don't know." She folds her arms over her chest and chews her lip for a few seconds. "What do I wear with it?"
"Nothin' but heels." The smirk on Paige's face is devious.
"You know what else isn't built for the breeze? Me. I'm not looking to flash the Avengers tonight, thanks." The words make her instantly think of Bucky, shamefully enough. He's not even going to be there tonight. She's absolutely sure of it. He's told her how much it takes to convince him to go out these days.
The manic pixie rolls her eyes. "Alright. The dress, the heels, and somethin' stringy."
"How about a jacket?" Ava reasons, already turning to go back to Paige's guest room, the one that's been unofficially hers for years.
"Pick one that's sheer, ya chickenshit," she shouts down the hallway behind her.
"That's a lot of sass coming from the woman who can't look America's Sweetheart in the eye!" 
"You'll thank me when you don't wake up here!"
Ava gets hit with the mental reminder that a certain sergeant has been threatening to fly her home for over a week. She hip-bumps her unofficial door closed with a huff. 
Bucky's not going to show up tonight. 
Even if he was, the man's a serial flirt, and she's his—the primary neurosurgeon on his case. Not-flirting through his appointments has been…. She's been trying to think of it as a bedside manner. A very unprofessional bedside manner. The kind she wouldn't have the balls to admit to out loud.
Natasha didn't mention him directly during the invitation call, only his case. All she said was that the whole team was welcome, including the duct rat, Findley. No mention of other attendees. It would have been brought up if he were going to be there; Ava's sure of that.
Natasha did mention getting Paige home on time, which was suspicious. Tomorrow is the engineer's first mission assigned to the Avengers as support, sure, but they don't seem like the type to need a pre-check. Ava's only seen a handful of SHIELD agents listed in the medical reports from Bucky's missions, and he never mentions any of them directly. She's always gotten the impression that assigned agents are an unknown hand in that machine.
If Steve ends up tagging along, she'll have her suspicions about the Russian's intent with this whole thing. She might have an ally in the fight to push her best friend that she didn't know about. 
Maybe she'll go to the tower after Paige is home safe. Ava's brought up the idea of switching to night appointments before, and she doubts Bucky would say no to a quick ten minutes on the roof. He might even stay for a while without having the excuse of leaving her to her work. 
She could pick up some late-night bagels to bribe him with. Her favorite shop closes early, but they work til midnight sometimes just for the baking process. Ava does the yearly medical work for the owner and his family without charging him. In return, he lets her sneak in after hours for cream cheese and salmon. With that and a quick stop to her office for a handful of lollipops, she's got herself some super soldier bait. 
She might not even stop to change back out of the dress. She'll grab the lab coat, though. Bucky looks more at ease whenever she has it on.
He wants to leave already.
It's been eighteen seconds since they corralled him through the front door. He's very proud of himself. He didn't think he'd make it to half that before the urge hit.
Bucky looks around the crowded bar with the sourest face he can muster. It's loud, it's cramped, it's loud, he's already hot enough to know he'll be sweating at some point, and it's too fucking loud. The checkpoint out front is a disaster. He's not real clear on what the standards for a bar security chief are, but that pick-up artist with the handheld, battery-powered metal detector out front doesn't fit his definition of competent. Not by a long shot.
The Avengers haven't rolled out with the full roster tonight. Tony, mercifully, is away with Pepper, Barton fucks off to god knows where, and Rhodey's as much of a workaholic as Bucky is. He tries not to think about where Thor goes. That particular can of worms is pretty full. He's still trying to get used to the fact that they've got a Quinjet that can just go to space. Whenever he—they want.
The ones that did come don't give him any shit when he breaks off to do his walk-about. They all figured out pretty early on that it's a sensitive subject. Bruce doesn't even notice him leave half the time. Steve used to do a piss-poor job of inconspicuously following him back when Bucky was primarily non-verbal. Natasha never mentions it.
The building is two stories. There's a halfway decent camera set-up that he can tap into through the wifi. No windows in the bathrooms. The roof access isn't wired with an alarm. All the emergency exits are, though. The owner's room was locked before Bucky got to it, but the staff areas are open to whoever turns a handle. They've got a round of code inspections coming up at the start of next month. They'll fail at least two of them if they don't unblock that rear door.
Sam silently checks in with an offered fist bump once he's back at the table eight minutes later. Bucky doesn't hesitate to reciprocate it. There's already a half glass of whiskey sitting on the table waiting for him. He doesn't hesitate to get his mitts on that, either.
Wyatt and Hannah show up before Ava and Paige do. It's the first time Bucky's been faced with meeting them since Ava offered that one time. She never pushed it after that. He's been meaning to get around to it. But the idea has been making his teeth buzz too much to go through with it.
Hannah is laser-focused on him from the start. She's just as conscious of it as he is, then. He can tell the moment that the realization hits Wyatt. His eyes widen with a flash of concern, his burly frame curling in on itself as if that'll make six feet of muscle look less threatening. It's almost heartwarming that he's worried about looking threatening to Bucky, of all people. The anxiety on the kid's face gets swallowed up by excitement. Seconds later, another wave of anxiety surfaces. It teeters back and forth as Hannah pushes him up to the table through the crowd.
Bucky watched Atlantis the other night after one of his nightmares took away any chance of falling back to sleep. It saved him from having to wake Steve up for a trip to the supply store. He texted Ava about it once he spotted the sun through the small gap in his blackout curtains; she was thrilled. Seeing the baby-faced brain surgeon nervously approach the table makes him understand why she compares him to Milo, not Dr. Sweet. 
Bucky's not looking to be the aggressive silent type anymore. At least not when it comes to the people working their asses off for him. He reaches out with his flesh hand, giving a reassuring half-smile to Wyatt. "Good to finally meet you, Combs."
The grin that stretches across the doctor's face looks wide enough to hurt. A stubby hand reaches out across the table for an enthusiastic shake. "It's an honor to meet you, Sergeant Barnes."
"I'll sign that journal Ava's warned me about if you promise to call me Bucky," he bribes, taking his hand back for another sip of whiskey.
"Y'mean it?" He's already headed for his patch-covered messenger bag with a hopeful look on his face. "I can use whatever makes ya comfortable. I'm not gonna make ya sign—"
"Hand it over." He glances over to where Hannah is sitting down across from Bruce. They trade an amicable nod when she makes direct eye contact again. "It's good to meet you, as well, Schuster."
"Barnes." He hears the sound of a boot being kicked under the table and watches Wyatt glare at the side of her head. She gives Bucky a strained smile. He's got a feeling it's usually strained. "Likewise."
Bucky likes her already.
As Ava warned, it doesn't take long for Wyatt to start asking about maps. He's bombarded with questions the moment he hands the journal back, with a fresh, chicken-scratch signature on one of its pages. The kid has a lot of trouble picking one at a time, and Bucky's trying not to shorten his answers out of habit. 
He keeps a mental list of the information Wyatt's most interested in. A year ago, he would have done it out of ingrained habit. Tonight it's a deliberate choice. Bucky can get his hands on records the Combs family doesn't know about. The kind they can't make a legal request for because there's no official log of it.
Ava and Paige are the last to arrive. He's too busy trying to give Wyatt more stories when they walk through the door to spot them. Steve is the first to notice their entrance, pausing mid-sentence about a mission the Howlies went on that Bucky barely remembers. Looking away from Wyatt's face, he understands why his best friend froze up. 
Good fucking god almighty. She's trying to kill him.
The doctor that haunts Bucky's dreams is walking through the crowded bar in an outfit that should be triggering the tactical analysis in his head. The analysis that, lately, only ends when his mind catches up to the fact that he shouldn't be thinking about being balls-deep in her while trying to make eye contact. It's probably—definitely inappropriate. But something about the thin, light blue fabric of her dress is shorting him the fuck out. 
It's low-cut, which is the first strike. The second is the way that split up her right leg only stops when it reaches the top of her thigh. The third—the one that really knocks him flat on his ass—is the way the whole thing is pulled in to show off her hips. The ones he'd have a lot of trouble letting go of if she ever let him put his hands on her to begin with.
He roughly swallows around nothing but air. His eyes shoot up to Ava's face, desperate to stave off his bastard mind latching onto her outfit. The last thing he needs in his head right now is a full-scale plan for laying her out on the table to unwrap that thing like a present. She's smiling at him, genuine surprise shaping most of her expression. God willing, it's about his presence here, not where his eyes were a second ago.
"They let you out of the house now?" she sasses him over the roar of the bar. Her hand folds into a fist and props high on her hip as she stops at the table's edge, her other arm linked with her best friend's.
Bucky is so fucking hopeless for her. "Yes, ma'am. But only if I get enough green stickers that week."
"In that case, thanks for behaving. I didn't think you'd be here tonight." That smile of hers is still bright as the sun. Still aimed at him. Christ, he's never been happier about Natasha ruining his life. "I'm pleasantly surprised around you, for once."
Gimmie half a chance, and I can show you every kind of pleasant surprise there is. 
If this were 1943, he'd still have the balls to say it to her. It'd be suicide to say it around his idiot friends, but he was a dumbass who wouldn't have hesitated back then. Not with someone like her. 
It's probably a good thing it's not still 1943. "If I make all the surprises annoy you, you'll tell me to stop. I have to keep you on your toes, or you'll get bored."
One of her eyebrows raises at him, entirely unimpressed. It makes him want to hold her hand. "You do understand how cool my job is, right? You're also a literal cyborg I get to poke at whenever I feel like telling you it's medically necessary. What part of that am I supposed to get bored with, sergeant?"
Bucky folds with a shy chuckle, bringing up his glass of whiskey to hide his mouth behind. "You get used to the shiny parts."
"I'm sure he'll let you add more when he busts his ass again," Sam jokes from off to Ava's left. He's staring at Bucky with an overly satisfied grin. It makes him glare over his whiskey while Ava and Paige sit down.
"Sorry we're late," Paige says, her eyes moving to Steve and her cheeks turning slightly pink. "Gettin' through Bronx traffic is always fun."
"Ordered Ryder's usual," Hannah mentions, pointing to a tall glass of ale the waiter dropped off while he wasn't looking. "Didn't know what you were in the mood for."
"Somethin' fizzy." She rhythmically taps her mismatched nails on the table, humming to herself while she glances over the drink menu. "Or maybe somethin' icey."
"I went the margarita route if you wanna go halfsies tonight," Wyatt offers, nudging his frosted glass over to her. Paige perks up and leans over for a sip.
He looks over at Steve, who's watching the interaction with the sappiest smile. It nearly makes his eyes roll. Natasha and Sam sniffed out the captain's big crush a long time ago, but it's the first time Bucky's seeing it for himself.
Neither one of them has learned a goddamn thing. Not in a hundred years.
A much more gentle nail taps right in front of his arm, dragging his eyes back to Ava while she gets herself seated. "What made you decide to come?" 
She would hit him with a question that blunt right off the bat. He tries not to notice Sam's silent laughter next to her. 
"Heard the egg heads were making an appearance," he decides to be mostly honest with.
The pleased smile on her face takes on a softer edge. She really hadn't been expecting him to show. It makes him all the more glad that he listened to Natasha. "We convinced you?"
You did. "You're surprised? I'm not about to put in the effort for these assholes."
"He only does that for our birthdays," Sam tells her, leaning into her space slyly. 
Bucky holds out his hands, mildly insulted. "And bank holidays."
Ava turns her head to offer her hand to Sam with a warm giggle. She looks so fucking good in the low bar light. With her neck muscles stretched like that, Bucky wants to kiss under her jaw just to see her reaction. "I've been hoping we'd meet again under better circumstances. Ava Ryder."
Sam barks a laugh, wrapping his hand around hers. "I'd say watchin' you hand Steve his own ass was great circumstance."
"Well thanks," Steve interjects, flipping him off before going back to drawing on a napkin with Paige.
The comment, and the gesture, gets ignored entirely. "Sam Wilson, but you can call me your favorite Avenger."
Bucky almost rolls his eyes again. Watching Ava's giggles get worse stops the urge.
She was wrong.
He came out tonight. To a bar. To spend time with them.
Ava takes another drink of her ale, watching the Winter Soldier over the rim of her glass. Wearing a dress that could unwind from her with a few strategic yanks on a couple pieces of string. And heels that could have paid a month of her first apartment's rent. In a New York bar.
If her parents could see her now, they'd croak.
Bucky is so goddamn attractive in his dark leather jacket that it's un-fucking-real. The bastard looks softer with his hair down like that, and there's chest hair peeking out from that button-up he's left open to a torturous degree. It keeps distracting her every time he turns to say something to Steve. His hand is the only shiny part on display at the moment.
The glory tales from Steve don't do the heartstopping aura justice. The fact that Bucky has had the nerve to lie—to her face no less—and say they're blown out of proportion makes her seethe sitting across from him now. No wonder he was prolific; how the hell could he not be with a face like that and the attitude to back it. Now that he's not in a professional headspace, the latter is coming out in spades. The super serum body is a mouthwatering, climbable bonus.
This is the man that keeps threatening to fly her home.
Ava takes a longer drink.
She hasn't been in this far over her head since college. The familiar knee-jerk reaction of bullying him is the only thing that doesn't feel petrifying. Bucky is the last person that would make her feel unsafe, but good god, the man is intimidating. Trying to find something to say to him that isn't a joke is a lot harder than usual, with him looking that good.
Paige tuned out the moment Steve gave her meticulously outlined boxes to doodle in on an unfolded napkin. He's been adding detailed frames to them ever since while the two trade work stories. It makes Ava jealous. Her best friend might be oblivious, but at least she's not the one tongue-tied tonight.
Knocking her knees together under the table, Ava leans forward and tries another round of facing down the sergeant. "Worth the trip so far?"
Way to go, moron. Pressure him, why don't you? Of course he's having a good time; he wouldn't still be sitting here if he—
Bucky smiles at her, calming her nerves without even trying. "Every second." He looks down at the glass in her hand, then back up at her face. "You havin' fun, doc?"
She misses hearing him call her doll. It's starting to feel like maybe it was an accident the handful of times it happened. He hasn't done it in days. "Unlike you, I enjoy human interaction. Plus, the hippie thing makes me partial to loud noises." And sweat. And weed to make the loud noises sound better. And men with long hair and deep voices that would sound—
"I don't mind human interaction," he argues, folding his arms on the table and leaning over with her. "I'm just picky about the people I interact with."
"Awww," Paige coos at her side. "And we made the cut? I'm honored."
"You should be," Steve confirms with a smirk, his eyes never leaving the napkin under his hand. "He's not exaggerating."
"That's unusual for him," Ava jumps on Bucky with. She regrets it right up until he snorts and briefly covers his mouth with his hand. It's a real fuck up on his end; she takes it as an all-clear to do it to him again at her leisure. "The only people I've met with bigger heads are cardiologists."
"That's the second time you've brought them up," Bucky notes. She honestly can't remember the first, but it sounds accurate. They're fun to mock.
"Nice deflection, superstar." His eyes widen a fraction at her teasing, boosting her confidence. "Have you had the displeasure of meeting one? I'm allowed to be mean to them as a neurologist, by the way. Secret doctor pecking order and whatnot."
"If I have, I probably don't want to remember," he deadpans. Steve gives him a dirty look, but it makes Ava snort. The smug look Bucky gives her in return makes her stomach flip. "I wanna hear more about this secret doctor pecking order. How far up that chain are you?"
"I don't know, man. How far up is your brain?" 
Bucky's eyes shut in pain, and he smiles. "It's so hard to be proud of your ego when your awful puns surround it."
"You'll manage," she assures in a supportive tone. 
A low whistle drags Ava's eyes to one end of the table, where Natasha is getting up. "I'm going dancing. It's up to you losers who's coming."
A majority of the table, including most of Ava's team, moves to follow. She doesn't. Bruce and Hannah don't, continuing their discussion on a medical journal he read that morning. Bucky doesn't leave either.
He watches Ava as Paige leans over to kiss the top of her head. She's pretty sure he watches her all through their short yes, I'll watch your bag check-in. He's still watching her when she looks back at him, slowly circling his glass to make the whiskey inside it swirl.
"Not a fan of dancing?" he finally asks.
"I like dancing," Ava confirms. "I just like picking on you more." The words feel outrageously bold for how innocuous they are. It's the truth, but she feels a little stupid for saying it out loud. Whatever, if it means spending the night out with him, that's fine—
Bucky puts down his glass, a determined set to his posture. "Dance with me."
Her jaw almost drops. She doesn't catch her nervous burst of laughter in time to stop it. "I—what? You? Bucky Barnes, mister touch me and die himself wants to—"
"I let you touch me all the time." The tone he uses for the blatant—
Christ, is she ever in over her head.
She ignores his flirting like a coward, racing to hide behind professionalism as fast as her mouth can get her there. "The funny thing about that is I have your willing participation—"
"You've got my willing participation for this, too." He sounds like he means it, which is the worst part. It makes it impossible to bring herself to tell him no.
She hesitates one last time, primarily out of fear of embarrassing herself. "You're sure you want to dance?"
"With you?" Bucky stands up, allowing her to see the well-worn leather sitting low enough on his hips to turn her into a bigger wreck. "Yeah, doll. I'm sure."
Hannah leans over to slide the bag Paige left behind across the table, closer to her. She doesn't bother to stop talking. Bruce is smiling from ear to ear, stealing glances at her and Bucky. He's doing a terrible job of hiding it. 
Standing up on nervous feet, Ava watches Bucky circle the table. He offers up his flesh hand when he approaches her, his signature Brooklyn smirk on his face. "Ready?"
Fuck no. She slides her hand into his, breathing deeply when he squeezes her fingers. "I really hope someone's given you the memo on modern dancing because I have no idea what the hell you people did in the 30s." 
"I'm sure you'll help me figure it out." He's sounding more confident with every word, and it's scaring the absolute shit out of her. 
It's innocent at the start. Bucky's a perfect gentleman leading her through the crowd. He spins slowly to face her when he finds them a wide enough space, pulling her in close. The pressure of his fingers is barely there when his metallic hand moves to her lower back. Ava brings both her hands up to his chest when he lets go of one of them. 
"You'll tell me if you're uncomfortable, right?" she checks again, stretching up as close to him as she can. There's no way he has trouble hearing her over the music, but she doesn't remember that until she's all but hanging off him. It makes her cheeks feel warm.
His flesh hand moves over her hip, resting on it gently. Bucky leans down and turns his head in, getting right up to her ear. He's already starting to guide the direction of her half-hearted movements. "I will. You gonna do the same?"
"I will," she promises. Mirroring his words is the only thing her brain can come up with, given how unfairly good he smells. It's obliterating every train of thought she has. 
It is… terrifyingly easy to let herself go in his arms. The movement of her hips gets more involved, following the tempo of the song and the direction of his hand. Hers go up to his shoulders, bringing him in closer a fraction at a time. By the time the song changes, she gives up and lets them wrap around the back of his neck. 
Somewhere around the third song, when the bar's DJ is trying to ramp up into a faster energy, she ends up turned away from him. Ava isn't sure how it happened. It could have been his doing; she's not paying all that much attention. All she knows is he's pressed up against her back now, the hand on her hip moving towards her leg incrementally. Her head tilts off to the side as her eyes close, letting the Winter Soldier guide her.
His fingers stop their advance once they reach the top of the gap in her dress, the one that splits up her thigh. She gives him all of thirty seconds to figure out if he's brave enough to go further on his own. Then the ego boost from having Bucky—of all fucking people—trying to make a move on her wins out over her fear. 
Ava lays her fingers on top of the hand hesitating on her leg, urging it down. 
The first touch of his skin on hers makes them both suck in a breath. She can feel the tension in him against her back. He gets over his nerves faster after that. His hand glides down the length of her thigh, and his fingers curl under the fabric when it comes back up. Not all that far, but the intent is there.
In escalating boldness, she reaches for his metal hand, dragging it to rest at the top of her ribs. His nose comes brushing across her temple at that point, giving her an idea of how close he's keeping himself around her with her eyes closed. One of her hands goes up into his hair, and that's when things really go off the fucking rails.
His thumb moves in a wide arc, dragging across the underside of one of her breasts. Her fingers curl around his hair, and her head rolls in toward him. If she tilts it up, she could brush her nose against his; that's how far into her space he is. And then the hand on her thigh moves in.
The pounding music swallows up the slight sound it pulls from her, but she's willing to bet Bucky heard it. She leans back against him, making him freeze up momentarily. He's already moving again before her mind finally pieces together the why.
He's hard, Ava realizes.
With one hand under her tits and the other getting itself further between her thighs. With her ass pressed back against him. With his towering frame curled all the way around her.
Sergeant James Barnes is hard as a rock. For her.
How the hell he hasn't gotten his good arm ripped off yet, Bucky's not quite sure. It feels impossible that she's just... letting him do this. 
Spinning her around really fucked him over. He had been behaving pretty well up until then. He'd even managed to hold off on putting his hand as far down her back as that fucking dress allows for. But then he'd been dumb enough to turn her, and her head had relaxed off to the side, and god, it took every ounce of restraint he has not to kiss the length of her neck.
Now she's leaning back against him, fully aware of how wound up he is, and he can't figure out where to stop. She isn't slowing down any part of his stumbling. There's no new tension in her now that she's in the know about the current state of his cock. Her hips are still fucking moving, and now they're moving against him.
She's going to kill him tonight, probably right out here on this dancefloor. He just hasn't figured out if it's going to be murder or manslaughter.
He lets his left hand get bolder, trying to test the waters one last time before he lets his right one go any further. He moves it up, his thumb brushing over her nipple. He hears her pull in a shaking breath while it skims back down the side. She doesn't stop him, making him want to bite at her neck all over again. 
With no signs of her looking for an out, and not one shred of critical thinking or self-control left in his head, Bucky slides his hand further up the inside of her thigh. Her fingers tighten in his hair, nearly pulling on it at this point. All he has to do is hike up his thumb, and he'll get more information than he's probably ready to have. She could tell him to drop to his knees right here; he's mildly certain he'd do it. 
That dress is so goddamn thin. There's no weight to it at all. He can't spot the outline of anything, but he knows from how high her tits are sitting that she's got a bra on, at least. Another inch or two up with his thumb, and he'll be able to tell for himself if she came out tonight with underwear on. He's not entirely out of the goddamn loop; he knows skipping it is a much more common practice nowadays. 
Bucky's almost hoping his favorite hippie is the type. He's spent a lot of time fantasizing about ways to get her out of them. That doesn't mean he's not going to fucking lose it if his fingers don't find a strip of fabric between her legs. 
The flash of a new fantasy hits him, one of Ava letting him pin her to the alley wall out back with his head between her legs. If he takes her around the corner, he won't have to stop when the kitchen staff come out for a smoke break. If she does have underwear on, he can leave it in her mouth to keep her quiet. Or reach up to make her bite down on his fingers. With the serum and her height, it'd work like a dream.
The curiosity becomes a burning need, driving his hand all the way up. When he first touches her, it's not with his thumb, and it's not a gentle brush. He pushes his middle and index finger along the length of her lips, coming into contact with lace that's wet.
"Fuck." The word is choked when it tumbles out of him. He's coated his hand to the thought of her so many times over by now. And here she is, pushing herself up against him and just as worked up about it.
Her hand grips his arm tight enough to bruise in reaction. She doesn't push him away. God fucking help him, she doesn't stop moving either. Still, there's something about her body language that's not sitting right in his gut. She's not pushing him away. But she's not pulling him along anymore.
That's not always a stop sign. Bucky knows that. Some people like leaving the significant steps in the hands of their chosen partner. She's silently urged him to keep going a few times already. Assuming she wants that to continue isn't out of the question. But he's not the kind of man who's comfortable with that leap. Not anymore.
He moves his hand down an inch, leaving it between her legs. Not on top of the lace he wants to bite at. If she's interested, she'll put it back. Simple as that.
Bucky waits, holding her close with his metal arm around her ribs and his nose pressed into her hair while they dance. She's hesitating now, which has him convinced he made the right call. He's not self-wallowing enough to take it as a rejection. It's not like he'd been planning for this to go anywhere near as far as it did to begin with.
Her hand pulls at his hair in a way that feels conflicted. She tilts her head up, her eyes finally opening to look at him. Yeah, there it is. Right there in her eyes. It's finally catching up to her.
"I..." she tries, her mouth opening and closing a few times. "We can...."
"We can keep going," he finishes for her, not backing off from his hold on her. "We absolutely can. Or we can head to the bar and watch them make something with a cherry on it. I'm more than comfortable with both."
He watches her chew over the offered out, her eyebrows pulling in. He doesn't push her; he's not looking to make the call for her. If she wants him to get her off right here on this dancefloor, he's pretty damn sure he'd be willing at this point, even with the threat of criminal charges. He's also ready to let go and spend the rest of the night doing something that doesn't make her look torn. Even if it means ending it early.
"We should probably go to the bar." Probably. She doesn't sound happy about it, meaning it's fueled by her professionalism. He understands why she has the line. He respects the shit out of it.
"We probably should," he agrees. He doesn't move his hands. She hasn't moved hers. 
Her eyes move down to his mouth, and fuck does that do a number on his impulse control. He hopes she doesn't feel how it makes his cock jump. Ava Ryder wants to kiss him. It feels odd to celebrate that, considering where his fingers were a minute ago, but fuck. The girl of his dreams wants to kiss him.
"Let's go to the bar." The frustration in her voice almost makes him laugh. It definitely makes him smile as he turns his metal hand over to link with hers.
"You drink anything other than ale, doll?" He lets his fingers brush over the skin of her thigh reassuringly as he pulls it back out from under her dress. She looks so mad at the world, her face scrunching under her glasses. He wants to kiss her more than he's ever wanted anything in his life.
Ava takes a deep breath that she lets out with a huff. It looks like it cools off some of the annoyance. "My answer depends on how much of a narc you are, g-man."
He puts his arm around her shoulder, dragging her in close to his side. His friends will hand him his ass over this for a month, but he's not about to let her feel rejected. He's trying to respect a boundary, not ward her off. "Lucky for you, this g-man has medical strains growing in his room at the tower."
"There's no fucking way. You're telling me the Winter Soldier grows weed?"
"Are you tellin' me you buy yours? Chump."
She snorts hard enough to feel the need to cover her mouth. It makes Bucky feel damn good being able to make her laugh again that fast. "I can't believe I'm being ridiculed about the source of my pot by a senior citizen."
He holds back on reminding her that she was about to let a senior citizen stick his hand down her panties. "Has it convinced you to give up the inaccurate jokes about my job?"
"Inaccurate, he says! Don't you have a literal badge you can shove in people's faces?" Ava doesn't lean against the bar when they reach it. She stays pressed up against him while he leans on it, distracting the hell out of him. He looks down the line of people, searching for a bartender to give himself a second to refocus. "I think that's a pretty clear-cut definition of a fed."
"I think you're trying to find out if I've got a pair of cuffs handy." This is the other problem presented with her letting him go that far; it burned through what little filter he has. Now that he knows she's interested and not just humoring him, he's fucked. Hearing his own words still makes him wish he'd shut his damn mouth.
He hears her laugh in surprise again, but he's not brave enough to look at her yet. There's a momentary lull filled with the sounds of rowdy New Yorkers kicking off their weekend. Then he feels her head lean against his arm. "Something tells me you could improvise without them."
It's manslaughter. She's trying for manslaughter. By god, she's going to accomplish it if she says some shit like that again.
"I can improvise whenever you need me to." He finally looks back at her, catching her ogling his chest. Again. Her cheeks are a few shades darker. It's good to know he's not the only one reeling. "You should answer my question first, though. Unless you're looking to put in the order."
Her eyes finally flick up to his, and her smile turns shy before she looks away. "Surprise me. I burn more than drown. I'm sure you can think of a fun option to entertain me with."
Bucky should have guessed she'd give him a run for every cent he earned back when he still had his mojo. It feels like he's trying not to trip over himself while she's still getting warmed up. "One entertainment, comin' right up."
She gives him a look, doing a lousy job of holding back her amusement. "You don't get to complain about my puns if you're going to tell dad jokes like that."
"You're just jealous that mine are better." He finally flags down a bartender over her shoulder, throwing out an order for two Mai Tais. The only other cocktail he can think of off the top of his head is a Sex on the Beach, and he sure as shit doesn't have the balls to order that in front of her at the moment. A Moscow Mule is not a cocktail in his eyes. It's also not the kind of inappropriate he's looking for.
Ava's finger hooks into his front pocket, threatening to ruin every effort he's made toward getting his cock to calm the fuck down. "Some of your jokes are pretty great; I'll give you that. The dry ones make my day."
It feels backwards—and mildly alarming—to hesitate to brush her hair behind her ear for a moment. A few minutes ago, he'd been ready to go down on her in front of a room full of people. Now he's trying to find the nerve to touch her at all. Doing so gets easier when her eyes slip closed at the feeling of his fingertip moving down the side of her head. 
"Seeing you makes my day," he murmurs, not caring about letting his mouth run. It feels less intimidating in the wake of her compliment. God knows it's going to sit in his head. Probably forever. The fact that she probably can't hear it over the music certainly doesn't hurt.
Her eyes open back up slowly, with her smile taking on a wicked edge. "You feel like showing me your stash, old man?"
They haven't talked about it.
It's been less than an hour since they stopped dancing. In under sixty minutes, Bucky managed to get them a drink and all the way through Manhattan to the Avengers Tower. On a Friday, no less.
No wonder they threaten to revoke his license. Ava thought she was a speed freak behind the wheel. Now that she's got firsthand experience as his rear passenger, Bucky being allowed to have a motorcycle makes her question SHIELD more than ever.
He let her go up to the roof without him. He made it sound like he was doing her a favor by not making her go out of her way just to raid his stash with him. She's guessing it's got more to do with not being down for a surprise tour of his space. It's not as if she's going to fault him for it. 
The idea that she's actually going to let him fly her home after this is already hitting her nerves. If that's throwing her off, she has no clue where she's going to find the will to bring up the subject of—this. Tonight. What happened.
How far she was about to let it go.
He smells too good. She's decided to blame it on that, at least in her head. Mainly to make herself feel better about crossing that many ethical boundaries. It's easier than accepting that she was about to give a patient the go-ahead to finger her in the middle of a bar. Without so much as a word about it beforehand.
Ava pushes her hands under her glasses to hold her face, resisting the urge to scrub at it. She doesn't want to fuck up her makeup. Not while she still has to face Bucky. How stupid—and then she doubled down—god, now they're here, and he's getting weed—
"I was starting to think I'd never get you up here, doll."
The way his voice quells her anxious mind without any effort at all ties her stomach in a different kind of knot. She lowers her hands into her lap, giving him a half-smile. "I'd like to remind you that I'm the one who offered initially. And again tonight."
Bucky waves his free hand dismissively, his flesh one cradling a bag. "Semantics." He dumps it onto the wicker table she picked out herself. She hears glass hit metal, the sound muffled by the black cloth of the bag. "I didn't know if you were a bowl or a joint kinda gal. Figured I'd come prepared since I'm dealing with a degenerate commie."
"Steve was right about your manners," Ava insists, reaching out to open it with greedy fingers. She kicks her heels off under the table, getting distracted by the sight of him shaking his leather jacket off his shoulders. The man's tall enough to have to duck under the makeshift canopy built to account for Wyatt's height. "Tell me how many words you know for pot while I judge your choices."
"Are you forgetting they took me out for walks every few years?" Bucky walks around to her side and puts his jacket over her shoulders, surprising her. She looks up at him with a shy smile, momentarily forgetting the promise of weed picked out by a super soldier. He's such a gentleman that it's frankly obnoxious. One of his eyebrows raises at her. "Those walks included the 60s, young lady. I probably know more than you do."
"What do you remember about the 60s?" she goads as he sits down next to her.
"Plenty." Bucky props his arm up on the back of the couch, leaning into her space. She's grateful for it. Even with his jacket around her, it's freezing up here. The added warmth isn't the only reason she's grateful for it. "Personally, though, I think you would have had a better time in the 70s." He tilts his head back and forth a few times. "At least the parts of it I fucked around in."
The mental image of the Winter Soldier undercover in some sleazy disco hits her like a ton of bricks. It feels wildly inappropriate, even with him talking about it that openly. All the fantasies she has of Bucky do. Especially the ones she uses to get herself off lately. 
"I'm going to take your word for it," she murmurs. There's so much potential there to poke at him. He's offering up the bait on his end. Hell, there's still the list of weed names to dig for. But she can't get her mind to latch onto any of it with him this close.
He nudges his chin in the direction of her hands, which are still hovering in his little heap of paraphernalia. "You should start us up so I can get you home at a reasonable hour. I don't know how fast you like to—smoke."
It's astounding how good he is at riding the line between being a gentleman and a terror.
Ava looks back down at her hands with a smile. "That depends on the accuracy of your warning about this couch-locking me. Technically I'm off tomorrow, so I'm not about to say no."
"Do you smoke medicinal strains?"
"On occasion. I started for anxiety, oddly enough. Then I noticed it helped with my mood overall." She shrugs, setting aside his box of hemp papers. There's a heavy-looking grinder and two different pipes further in. One of them's a goddamn steamroller. He sticks with quality from what she can see so far. "I feel like there's a bong that was held back from this collection."
"There's a lot that was held back. I'm not gonna parade all my ill-gotten goods through the tower." His pauses while she gets the last of it emptied out." You gonna show me how it's done or put me to the test?"
"Definitely the latter." She turns her head to smile at him innocently, pushing her glasses up her nose. It makes his lips twitch. "I don't see anything to assist rolling. Does that mean you're confident enough to show me your handiwork?"
Bucky scoffs, his expression becoming entirely unimpressed. He almost looks offended, leaning over to grab the papers and the grinder. "You're telling me you people need tools these days? After all the work I put into teaching Captain America how to do it properly?"
Ava's brows shoot up in shock. "You're fucking kidding. I figured the weed was a new development—"
"Nah, I've been smoking since my first job." He's not watching his hands much as he lays out the foundation of his work. He's primarily watching her. "Worked for a guy that owed a corner store. He had family that ran a not-so-secret farm." He turns the grinder lid enough to loosen it, then flicks it to spin it the rest of the way off with a cocky grin. "I was an outstanding employee. So was Steve once I got him hired."
"America's Sweetest Stoners," Ava coos, making him chuckle. He's not stingy about what he's rolling for them. It makes her wonder how many plants he's got set up. "Do the two of you still smoke together?"
"He doesn't bother much. Takes a lot to build up any kind of buzz with our systems, so he looks at it the same way he does drinking at this point. He still shows up whenever Banner drops off some new hybrid monstrosity for me to try." Bucky glances over at her quickly, his fingers never stopping their work. "This is from one of the normal plants, don't worry. I won't start you off that far in the deep end."
Ava shrugs. Banner's main lab is here in the tower, so there's no chance the process isn't documented. JARVIS wouldn't let her use anything that could do her actual harm. "You can if you want, but you're responsible for explaining to Tony why I'm passed out on his roof."
He gives her the most insulted look. "I wouldn't leave you up on the roof. I'd be enough of a gentleman to carry you inside."
He's ruining her life. There's no way she's going to be able to walk away from tonight without being completely wrapped around his finger. It makes her smile at him like a hopeless fucking moron. "I believe you."
Bucky brings the most well-balanced joint she's ever seen up to his mouth, licking it closed in one smooth stroke. His eyes never leave hers. It makes her swallow. The fucker smirks at her and twirls the joint between his fingers, holding it out for her inspection.
"Well?" he prompts, watching her intently as she plucks it from his hand. He's preening. Waiting for his praise.
Goddamn him, she's going to have to give it to him. The joint is so perfectly rolled it's mesmerizing. Even distribution, not pulled overly tight, and meticulously sealed. She can't remember the last time she managed to do a job half that good. Bowls have always been her go-to. It's clear that this is his.
Ava giggles at the absurdity of it all. It feels surreal to be a step away from lighting up with a cyborg PoW she first read about in primary school. "You're such a dork. Shut up and hand over the lighter before your head explodes from being over-inflated."
"Now I know I did a damn good job by today's standards." For the second time that night, she gets the overwhelming urge to kiss Bucky as he reaches for the lighter. She props the joint between her lips to distract herself and lets him light it for her when he silently offers. The flame does stunning things to the color of his eyes in the dark. "You only tell me to shut up when you're really impressed with me."
She doesn't miss that he waited until she started inhaling to make the point. It makes her roll her eyes in exasperation. Ava can tell from the first drag that his shit is going to hit harder than her usual. She turns her head to blow it away from his face, handing back the joint. He tucks it between his fingers and brings it up to his mouth in one smooth motion.
"Now look who's outright lying. I tell you to shut up for various reasons." The muscles in his neck look unfairly good when he turns to exhale. It makes her want to run her tongue up his throat. She looks back up at his face. Everything below his chin is hazardous to her health at the moment. "I don't remember any of them being because I was impressed until now."
His eyes flick back to hers, then down to her mouth as he smiles. His hand was up her dress. It was between her legs only an hour ago. And yet watching him stare at her mouth still feels obscene. "You've got a real funny way of stroking my ego, doll."
"I get the feeling you enjoy it," Ava counters, snatching the joint from his fingers. "I wouldn't do it otherwise. You're always welcome to suggest an alternative."
"No, thanks. I'm a pretty big fan of what you do to me." 
Damn. Him.
Yes, the question was a check-in. Yes, she was trying to get a read on how far he wants this to go. Then he had to go and double down without hesitation. She knows by now what door he's trying to invite her through. 
Ava is so not brave enough for this conversation. It's not—it's complicated. She really shouldn't be working on his case if they're going to go down this road, at least not as his primary surgeon. She'll have to pass it on to Hannah and have a few very embarrassing conversations with a handful of people. Ones that involve fessing up to wanting to fuck Bucky Barnes.
She's not saying no. But she's not brave enough to say yes. At least not tonight, up here on the roof.
Ava leans back against the couch, feeling his arm curl in around her shoulders. "Good. Let me know if that changes."
u dont get to yell at me for the edging, i warned u that im gonna leave an * on smut chapters. anything less than Full Fuckin aint gettin the badge 😤 i have a Standard to uphold in this house of sin
(tho if anyone feels there shoulda been a warning tag for smthing you can always lemme know bb 💞)
also ill never be able to properly articulate how much i love writing cranky old fart bucko. heartstopper is stupid fun, feral trauma man keeps me on my toes, but stick-shaking geezer mode??? mr. “kids these days with their MEMES” himself??? beautiful. fantastic. superb. his final form, truly 🤌 i yearn to write more of it
anyways there are writers on the internet that can make their slow burn wholesome. in all my years on this space rock of ours, ive never been one of them
even if i do write the longfic of the sunshine dweebs steve and paige, that probably wont be all that wholesome of a slow burn either ajdhdskjfdjsjf. they ARE my tooth rotting fluff ship tho. mmm okay so maybe paige is a tragedy in disguise but its ME so thats expected 😌 the babes that like their romance extra sappy and cutesy take a lotta shit and deserve a Safe Space and steve rogers fits that bill, imho
bucky is for the babes that like to verbally get their hair pulled before hearing ily 🥰
the good news is, i get a few more chapters in this fic to torture you with before i let bucko and ava do the Big Sin (not murder, the other one. no, not hand holding, the other other one) 😌💖💞
also PieAnnamay's comment reminded me that i never linked my fav buckaroo fic, safe with me!!! for anyone else that hasnt stumbled upon bitsandbobsandstuff, i cant recommend them enough. i HIGHLY encourage you to go read through all their works while you’re waiting for updates on this, the bucky and steve fics are 😫🤌 perfection (i promise when i finally have a day to really do tumblr stuff, ill make a list of my fav fics/writers in my pinned post. i promise i will try to get to it Soon, i still havent even caught up on chapter posts there asldhfsadf)
❤️ https://archiveofourown.org/works/13798047/chapters/31721565
0 notes
srasamua · 5 years
Text
Using Python to recover SEO site traffic (Part three)
When you incorporate machine learning techniques to speed up SEO recovery, the results can be amazing.
This is the third and last installment of our series on using Python to speed up SEO traffic recovery. In part one, I explained how our unique approach, which we call “winners vs losers,” helps us quickly narrow down the pages losing traffic to find the main reason for the drop. In part two, we improved on our initial approach by manually grouping pages using regular expressions, which is very useful when you have sites with thousands or millions of pages, as is typically the case with ecommerce sites. In part three, we will learn something really exciting: how to automatically group pages using machine learning.
As mentioned before, you can find the code used in parts one, two, and three in this Google Colab notebook.
Let’s get started.
URL matching vs content matching
When we grouped pages manually in part two, we benefited from the fact that the URL groups had clear patterns (collections, products, and the others), but it is often the case that there are no patterns in the URL. For example, Yahoo Stores’ sites use a flat URL structure with no directory paths. Our manual approach wouldn’t work in this case.
Fortunately, it is possible to group pages by their contents, because most page templates have different content structures. They serve different user needs, so their content structures need to differ.
How can we organize pages by their content? We can use DOM element selectors for this. We will specifically use XPaths.
For example, I can use the presence of a big product image to know the page is a product detail page. I can grab the product image address in the document (its XPath) by right-clicking on it in Chrome and choosing “Inspect,” then right-clicking to copy the XPath.
We can identify other page groups by finding page elements that are unique to them. However, note that while this would allow us to group Yahoo Store-type sites, it would still be a manual process to create the groups.
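To make the idea concrete, here is a minimal sketch (not code from the article) of such a check in Python using requests and lxml; the XPath and the example URL are placeholders you would swap for the selector copied from Chrome.

```python
import requests
from lxml import html

# Placeholder selector; paste the XPath copied from Chrome's "Inspect" here.
PRODUCT_IMAGE_XPATH = '//*[@id="main-product-image"]'

def looks_like_product_page(url):
    """Heuristic: treat the page as a product detail page if the big product image is present."""
    response = requests.get(url, timeout=10)
    tree = html.fromstring(response.content)
    return len(tree.xpath(PRODUCT_IMAGE_XPATH)) > 0

# Hypothetical usage:
# print(looks_like_product_page("https://example.com/some-product"))
```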
A scientist’s bottom-up approach
In order to group pages automatically, we need to use a statistical approach. In other words, we need to find patterns in the data that we can use to cluster similar pages together because they share similar statistics. This is a perfect problem for machine learning algorithms.
BloomReach, a digital experience platform vendor, shared their machine learning solution to this problem. To summarize it, they first manually selected cleaned features from the HTML tags like class IDs, CSS style sheet names, and the others. Then, they automatically grouped pages based on the presence and variability of these features. In their tests, they achieved around 90% accuracy, which is pretty good.
When you give problems like this to scientists and engineers with no domain expertise, they will generally come up with complicated, bottom-up solutions. The scientist will say, “Here is the data I have, let me try different computer science ideas I know until I find a good solution.”
One of the reasons I advocate practitioners learn programming is that you can start solving problems using your domain expertise and find shortcuts like the one I will share next.
Hamlet’s observation and a simpler solution
For most ecommerce sites, most page templates include images (and input elements), and those generally change in quantity and size.
I decided to test the quantity and size of images, and the number of input elements, as my feature set. We were able to achieve 97.5% accuracy in our tests. This is a much simpler and more effective approach for this specific problem. All of this is possible because I didn’t start with the data I could access, but with a simpler domain-level observation.
I am not trying to say my approach is superior, as they have tested theirs in millions of pages and I’ve only tested this on a few thousand. My point is that as a practitioner you should learn this stuff so you can contribute your own expertise and creativity.
Now let’s get to the fun part and write some machine learning code in Python!
Collecting training data
We need training data to build a model. This training data needs to come pre-labeled with “correct” answers so that the model can learn from the correct answers and make its own predictions on unseen data.
In our case, as discussed above, we’ll use our intuition that most product pages have one or more large images on the page, and most category type pages have many smaller images on the page.
What’s more, product pages typically have more form elements than category pages (for filling in quantity, color, and more).
Unfortunately, crawling a web page for this data requires knowledge of web browser automation, and image manipulation, which are outside the scope of this post. Feel free to study this GitHub gist we put together to learn more.
Here we load the raw data already collected.
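The loading step itself isn’t reproduced in the article, so here is one plausible version, assuming the crawl results were saved as two CSV files (the file names and column names below are assumptions, not the article’s):

```python
import pandas as pd

# Hypothetical file and column names for the collected crawl data.
img_counts = pd.read_csv("img_counts.csv")    # one row per image: url, file_size, height, width
form_counts = pd.read_csv("form_counts.csv")  # one row per URL: url, form_count, input_count

print(img_counts.head())
print(form_counts.head())
```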
Feature engineering
Each row of the form_counts data frame above corresponds to a single URL and provides a count of both the form elements and the input elements contained on that page.
Meanwhile, in the img_counts data frame, each row corresponds to a single image from a particular page. Each image has an associated file size, height, and width. Pages are likely to have multiple images, so there are many rows corresponding to each URL.
It is often the case that HTML documents don’t include explicit image dimensions. We are using a little trick to compensate for this: we are capturing the size of the image files, which is roughly proportional to the product of the width and the height of the images.
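Continuing the hypothetical frames loaded above, a rough sketch of that trick is to roll the per-image rows up to one row per URL, keeping the image count and the total file size as the stand-in for image dimensions:

```python
# One row per URL: number of images and total image bytes (our proxy for image size).
img_features = img_counts.groupby("url")["file_size"].agg(["count", "sum"]).reset_index()
img_features.columns = ["url", "img_count", "total_img_bytes"]

# Join with the per-URL form/input counts collected earlier.
page_data = img_features.merge(form_counts, on="url", how="inner")
```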
We want our image counts and image file sizes to be treated as categorical features, not numerical ones. When a numerical feature, say new visitors, increases, it generally implies improvement, but we don’t want bigger images to imply improvement. A common technique for handling this is called one-hot encoding.
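For readers new to the technique, here is a toy illustration of one-hot encoding in pandas (the column and values are made up, not the article’s features):

```python
import pandas as pd

toy = pd.DataFrame({"size_bucket": ["small", "large", "small", "medium"]})
# Each category becomes its own 0/1 indicator column: img_large, img_medium, img_small.
print(pd.get_dummies(toy, columns=["size_bucket"], prefix="img"))
```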
Most site pages can have an arbitrary number of images. We are going to further process our dataset by bucketing images into 50 groups. This technique is called “binning”.
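Continuing the sketch above (column names are still assumptions), binning the counts and then one-hot encoding the resulting buckets might look like this:

```python
# Cut each numeric column into 50 buckets ("binning"), then one-hot encode the buckets so
# the model sees categories ("this page falls in bucket 37") rather than raw magnitudes.
page_data["img_count_bin"] = pd.cut(page_data["img_count"], bins=50)
page_data["img_bytes_bin"] = pd.cut(page_data["total_img_bytes"], bins=50)

feature_cols = ["img_count_bin", "img_bytes_bin", "form_count", "input_count"]
features = pd.get_dummies(page_data[feature_cols].astype(str))
```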
Here is what our processed data set looks like.
Adding ground truth labels
As we already have correct labels from our manual regex approach, we can use them to create the correct labels to feed the model.
We also need to split our dataset randomly into a training set and a test set. This allows us to train the machine learning model on one set of data and test it on another set that it has never seen before. We do this to prevent our model from simply “memorizing” the training data and doing terribly on new, unseen data.
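A minimal version of that split with scikit-learn, assuming features is the one-hot matrix sketched above and labels holds each URL’s page-group name from the part-two regular expressions:

```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=42, stratify=labels
)
print(len(X_train), "training rows /", len(X_test), "test rows")
```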
Model training and grid search
Finally, the good stuff!
All the steps above, the data collection and preparation, are generally the hardest part to code. The machine learning code is generally quite simple.
We’re using the well-known scikit-learn Python library to train a number of popular models using a bunch of standard hyperparameters (settings for fine-tuning a model). Scikit-learn will run through all of them to find the best one. We simply need to feed the X variables (our engineered features from above) and the Y variables (the correct labels) to each model, call the .fit() function, and voila!
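The article’s exact grid isn’t shown, but a sketch of the idea with two of the models mentioned could look like the following (the hyperparameter values are illustrative, not the ones actually used):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Candidate models with small, illustrative hyperparameter grids.
candidates = {
    "linear_svm": (LinearSVC(), {"C": [0.01, 0.1, 1, 10]}),
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1, 10]}),
}

best_score, best_model = -1.0, None
for name, (model, param_grid) in candidates.items():
    search = GridSearchCV(model, param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)  # X_train / y_train from the split above
    print(name, search.best_score_, search.best_params_)
    if search.best_score_ > best_score:
        best_score, best_model = search.best_score_, search.best_estimator_
```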
Evaluating performance
After running the grid search, we find our winning model to be the Linear SVM (0.974), with Logistic Regression (0.968) coming in a close second. Even with such high accuracy, a machine learning model will still make mistakes. If it doesn’t make any mistakes, then there is definitely something wrong with the code.
In order to understand where the model performs best and worst, we will use another useful machine learning tool, the confusion matrix.
When looking at a confusion matrix, focus on the diagonal squares. The counts there are correct predictions, and the counts outside are failures. In the confusion matrix above we can quickly see that the model does really well labeling products, but terribly labeling pages that are neither products nor categories. Intuitively, we can assume that such pages do not have consistent image usage.
Here is the code to put together the confusion matrix:
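(The original snippet was embedded as an image and isn’t reproduced here; the stand-in below builds an equivalent matrix, reusing best_model and the test split from the sketches above.)

```python
from sklearn.metrics import confusion_matrix

y_pred = best_model.predict(X_test)
class_names = sorted(set(y_test))  # e.g. ["collection", "other", "product"] in this hypothetical setup
cm = confusion_matrix(y_test, y_pred, labels=class_names)
print(cm)
```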
Finally, here is the code to plot the model evaluation:
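(Again, the article’s own plotting code isn’t reproduced; one common way to draw the matrix is a seaborn heatmap.)

```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=class_names, yticklabels=class_names)
plt.xlabel("Predicted label")
plt.ylabel("True label")
plt.title("Page template classification")
plt.tight_layout()
plt.show()
```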
Resources to learn more
You might be thinking that this is a lot of work just to tell page groups apart, and you are right!
Mirko Obkircher commented in my article for part two that there is a much simpler approach, which is to have your client set up a Google Analytics data layer with the page group type. Very smart recommendation, Mirko!
I am using this example for illustration purposes. What if the issue requires a deeper exploratory investigation? If you already started the analysis using Python, your creativity and knowledge are the only limits.
If you want to jump onto the machine learning bandwagon, here are some resources I recommend to learn more:
Attend a PyData event. I got motivated to learn data science after attending the event they host in New York.
Hands-On Introduction To Scikit-learn (sklearn)
Scikit Learn Cheat Sheet
Efficiently Searching Optimal Tuning Parameters
If you are starting from scratch and want to learn fast, I’ve heard good things about Data Camp.
Got any tips or queries? Share it in the comments.
Hamlet Batista is the CEO and founder of RankSense, an agile SEO platform for online retailers and manufacturers. He can be found on Twitter @hamletbatista.
The post Using Python to recover SEO site traffic (Part three) appeared first on Search Engine Watch.
from Digital Marketing News https://searchenginewatch.com/2019/04/17/using-python-to-recover-seo-site-traffic-part-three/
2 notes · View notes
Text
Panda Quotes
Official Website: Panda Quotes
• A bunch of money-grubbin’, greenhouse-gasing, seal-clubbing, oil-drilling, Bible-thumping, missile-firing, right-to-life-ing, lethal-injecting hypocrites. People whose idea of a good time is strapping a dead panda to a Lincoln Navigator and running over everybody in the gay parade. – Richard Jeni • A Panda walks into a cafe. He orders a sandwich, eats it, then draws a gun and fires two shots into the air. “Why?” asks the confused waiter, as the panda makes toward the exit. The panda produces a badly punctuated wildlife annual and tosses it over his shoulder. “I’m a Panda,” he says, at the door. “Look it up.” The waiter turns to the relevant entry, and, sure enough, finds an explanation. Panda. Large black and white bear-like mammal, native to China. Eats, shoots and leaves. – Lynne Truss • A panda walks into a tea room and ordered a salad and ate it. Then it pulled out a pistol, shot the man in the next table dead, and walked out. Everyone rushed after it, shouting “Stop! Stop! Why did you do that?” “Becuase I am a panda,” said the panda. “That’s what pandas do. If you don’t believe me, look in the dictionary.” So they looked in the dictionary and sure enough they found Panda: Racoon-like animal of Asia. Eats shoots and leaves. – Ursula K. Le Guin • Americans love marriage too much. We rush into mariage with abandon, expecting a micro-Utopia on earth. We pile all our needs onto it, our expectations, neuroses, and hopes. In fact, we’ve made marriage into the panda bear of human social institutions: we’ve loved it to death. – Barbara Ehrenreich
• Despite what you may have been taught about Indians or Africans or ancient Celts, poor people are terrible stewards of their environment. For instance, if my kid were starving to death, I would happily feed her fresh panda. – Jonah Goldberg • Do you think pandas know they’re Chinese and they’re taking the one child policy a bit too seriously? – Jim Jefferies • Eating a RAW food lifestyle is the purest and best way to live. Many of the strongest and longest living animals are raw, such as the panda bear and gorillas. Self love has brought me to a RAW lifestyle. Feeding my body with pure natural energy. Most people’s perception is what has been ingrained inside them by manipulation, but slowly there is a shift in consciousness, one person at a time. People will ask more questions, begin to stand up for themselves, go their “own way”, take better care of themselves, which will benefit everyone and everything around them. – Eric Nies • Every time I mention her, Magnus says, “Are you two getting along?” in raised, hopeful tones, like we’re endangered pandas who need to make a baby. – Sophie Kinsella • For better or worse, zoos are how most people come to know big or exotic animals. Few will ever see wild penguins sledding downhill to sea on their bellies, giant pandas holding bamboo lollipops in China or tree porcupines in the Canadian Rockies, balled up like giant pine cones. – Diane Ackerman • He slung off his backpack. He’d managed to grab a lot of supplies at the Napa Bargain Mart: a portable GPS, duct tape, lighter, superglue, water bottle, camping roll, a Comfy Panda Pillow Pet (as seen on TV), and a Swiss army knife—pretty much every tool a modern demigod could want. – Rick Riordan • I am glad that the life of pandas is so dull by human standards, for our efforts at conservation have little moral value if we preserve creatures only as human ornaments; I shall be impressed when we show solicitude for warty toads and slithering worms. – Stephen Jay Gould • I couldn’t have invented crisps. … I don’t really want to be known as the man who invented crisps. … I invented apples. … I invented pandas, and caps. I invented soil. – Noel Fielding • I find it striking that the quality of the urban habitat of homo sapieans is so weakly researched compared to the habitats of gorillas, elephants, and Bengal tigers and panda bears in China…you hardly see anything on the habitat of man in the urban environment. – Jan Gehl • I only dated one Asian girl, but she was very Asian. She was a panda. – Jim Gaffigan • I said that I thought the secret of life was obvious: be here now, love as if your whole life depended on it, find your life’s work, and try to get hold of a giant panda. 
If you had a giant panda in your back yard, anything could go wrong — someone could die, or stop loving you, or you could get sick — and if you could look outside and see this adorable, ridiculous, boffo panda, you’d start to laugh; you’d be so filled with thankfulness and amusement that everything would be O.K. again. – Anne Lamott • I think it’s because Po [from Kung Fu Panda] is such a geek, and he is so relatable. He is so excited by life and is excited to learn new things. I think that accessibility is something that we all can relate to, there are so many things we wish we could do but don’t have the means to achieve it. – Jennifer Yuh Nelson • I think we had to push forward to make sure it was different, do something that we had never done before and yet still have the consistency to stay in the same world. This was our chance to do all the things we didn’t get a chance to do before. We’ve been working on these movies [Kung Fu Panda] for twelve years and we have to keep things exciting for us in order for us to devote that many years of our lives to do this. – Jennifer Yuh Nelson • I thought the secret of life was obvious: be here now, love as if your whole life depended on it, find your life’s work, and try to get hold of a giant panda. – Anne Lamott • If you can unify the public mind saving an iconic species like the tiger, like they did with the panda, that means you have to protect their habitat and everything that they hunt. And that means saving massive, thousands of acres for them to be able to roam and breed. So it’s more of a land effort. – Leonardo DiCaprio • I’m not one of those actors who romanticizes his trials working out and brags that he can bench press a panda now. – Ryan Reynolds • I’m really into pandas right now. They’re really scratching an itch for me. They’re so goddamn cute. – Nick Kroll • In the game of life, less diversity means fewer options for change. Wild or domesticated, panda or pea, adaptation is the requirement for survival. – Cary Fowler • It’s like the panda, they say that’s dying out. But what do they do? When you see them they’re just sitting in the jungle eating. – Karl Pilkington • I’ve been punched by a vampire, an Indian girl, and a panda… I should be a video game. – Adam Rex • Men look like pandas when they try and put make-up on. – Adam Ant • Met someone who works at the zoo. Apparently the panda is a nasty animal. – Dov Davidoff • One of my favorite things about the Kung Fu Panda 3 is the look of it. We never go for realism. I think a lot of time when people go for 3D that’s the mistake. Because we’re never going for full realism – for computer generated live action films like Avatar the goal is realism, to make the audience feel like they are seeing something that is real. Lord of the Rings had character design and environments to make it look real, whereas we aren’t going for that, we are going for something that is theatrically, viscerally, and emotionally real. – Jennifer Yuh Nelson • One of the most jolting days of adulthood comes the first time you run out of toilet paper. Toilet paper, up until this point, always just existed. And now it’s a finite resource, constantly in danger of extinction, that must be carefully tracked and monitored, like pandas? – Kelly Williams Brown • One of the things we love about Po [Kung Fu Panda] is that he’s vulnerable. He’s someone that we can all identify with because he has those insecurities. He’s an outsider feeling guy. – Jennifer Yuh Nelson • Panda. 
Large black-and-white bear-like mammal, native to China. Eats, shoots and leaves. – Lynne Truss • Physically he was the connoisseur’s connoisseur. He was a giant panda, Santa Claus and the Jolly Green Giant rolled into one. On him, a lean and slender physique would have looked like very bad casting. – Craig Claiborne • Po’s [Kung Fu Panda] unending enthusiasm is something we wish we could have. We can’t help but root for him because of his geek energy. – Jennifer Yuh Nelson • Summit meetings tend to be like panda matings. The expectations are always high, and the results usually disappointing. – Robert Orben • The hidden village was something we found when we went to research in China we climbed a mountain in the Sichuan province where the panda sanctuary is based, and we climbed to this beautiful, mist-covered, almost primordial place and when we turned these corners these moss covered old buildings would come into view, revealing themselves and it was so beautiful and so unlike anything we’d seen that we literally took those moments and put them into the film [Kung Fu Panda 3]. – Jennifer Yuh Nelson • The sad thing about destroying the environment is that we’re going to take the rest of life with us. The bluebirds will be gone, and the elephants will be gone, and the tigers will be gone, and the pandas will be gone. – Ted Turner • The way I paint is similar to rock in that you don’t stand around and say, ‘Gee, what are they talking about?’ Rock is simple, blunt, colorful. Same with my paintings. You don’t stand back and wonder what it is. That’s Jim Morrison, that’s a panda, that’s a scene on the West Coast. It’s not abstract. – Grace Slick • There are a lot of movies that take place internationally, like Kung Fu Panda portraying a little bit of China, and Ratatouille portraying a little about Paris, but it’s hard to find a movie that portrays Rio or Brazil. – Carlos Saldanha • There’s no point bleating about the future of pandas, polar bears and tigers when we’re not addressing the one single factor putting more pressure on the eco system than any other – namely the ever-increasing population. – Chris Packham • Things Isabella Wouldn’t Care About: – Titanic sinking again. – Metror striking Earth and landing directly on top of world’s most innocent panda. – Titanic sinking again and this time the entire crew is puppies. – Jim Benton • Those Grizzlies are more like pandas. – Charles Barkley • Today is a gift from God – that is why it is called the present. – Sri Sri Ravi Shankar • Tomorrow is a mystery. Today is a gift. That is why it is called the present. – Eleanor Roosevelt • We did two films [Kung Fu Panda], because the first two films were so embraced by the Chinese audiences we wanted to make something we could push further and since this is a co-production, it seemed like the perfect time to create something that felt native to Chinese audiences. – Jennifer Yuh Nelson • We got the best actors imaginable [in Kung Fu Panda]. If we could have made a wish list I don’t think there would anyone else we would have added. Yeah, we’ve been blessed with exactly how amazing a cast of actors we have. To have someone like Bryan Cranston, who is not just an amazing actor, but who has such a range. – Jennifer Yuh Nelson • We use pandas and eagles and things. I’d love to see a wilderness society with an angry-looking wolverine as their logo. – E. O. Wilson • When you see all of the pandas in this movie [Kung Fu Panda 3], they are rolling because that is exactly what they do. 
Not only were we able to watch the pandas play but we had free range to walk around and get a feel for the architecture and get a sense of where they lived ,so there’s a lot of firsthand exploration. – Jennifer Yuh Nelson • You don’t look in the eyes of a carrot seed quite in the way you do a panda bear, but it’s very important diversity. – Cary Fowler • You have hundreds of artists you’re dealing with across the world and the scale of this movie [Kung Fu Panda] was insane – we had a parallel pipeline going on where you had two versions recording Mandarin voice actors, getting it to be funny for Mandarin audiences going beyond a straight translation, and then animating it and lighting it, it’s a lot of work. – Jennifer Yuh Nelson
0 notes
alanajacksontx · 5 years
Text
Using Python to recover SEO site traffic (Part three)
When you incorporate machine learning techniques to speed up SEO recovery, the results can be amazing.
This is the third and last installment from our series on using Python to speed SEO traffic recovery. In part one, I explained how our unique approach, that we call “winners vs losers” helps us quickly narrow down the pages losing traffic to find the main reason for the drop. In part two, we improved on our initial approach to manually group pages using regular expressions, which is very useful when you have sites with thousands or millions of pages, which is typically the case with ecommerce sites. In part three, we will learn something really exciting. We will learn to automatically group pages using machine learning.
As mentioned before, you can find the code used in part one, two and three in this Google Colab notebook.
Let’s get started.
URL matching vs content matching
When we grouped pages manually in part two, we benefited from the fact the URLs groups had clear patterns (collections, products, and the others) but it is often the case where there are no patterns in the URL. For example, Yahoo Stores’ sites use a flat URL structure with no directory paths. Our manual approach wouldn’t work in this case.
Fortunately, it is possible to group pages by their contents because most page templates have different content structures. They serve different user needs, so that needs to be the case.
How can we organize pages by their content? We can use DOM element selectors for this. We will specifically use XPaths.
For example, I can use the presence of a big product image to know the page is a product detail page. I can grab the product image address in the document (its XPath) by right-clicking on it in Chrome and choosing “Inspect,” then right-clicking to copy the XPath.
We can identify other page groups by finding page elements that are unique to them. However, note that while this would allow us to group Yahoo Store-type sites, it would still be a manual process to create the groups.
A scientist’s bottom-up approach
In order to group pages automatically, we need to use a statistical approach. In other words, we need to find patterns in the data that we can use to cluster similar pages together because they share similar statistics. This is a perfect problem for machine learning algorithms.
BloomReach, a digital experience platform vendor, shared their machine learning solution to this problem. To summarize it, they first manually selected cleaned features from the HTML tags like class IDs, CSS style sheet names, and the others. Then, they automatically grouped pages based on the presence and variability of these features. In their tests, they achieved around 90% accuracy, which is pretty good.
When you give problems like this to scientists and engineers with no domain expertise, they will generally come up with complicated, bottom-up solutions. The scientist will say, “Here is the data I have, let me try different computer science ideas I know until I find a good solution.”
One of the reasons I advocate practitioners learn programming is that you can start solving problems using your domain expertise and find shortcuts like the one I will share next.
Hamlet’s observation and a simpler solution
For most ecommerce sites, most page templates include images (and input elements), and those generally change in quantity and size.
I decided to test the quantity and size of images, plus the number of input elements, as my feature set. We were able to achieve 97.5% accuracy in our tests. This is a much simpler and more effective approach for this specific problem. All of this is possible because I didn’t start with the data I could access, but with a simpler domain-level observation.
I am not trying to say my approach is superior, as they have tested theirs on millions of pages and I’ve only tested this on a few thousand. My point is that as a practitioner you should learn this stuff so you can contribute your own expertise and creativity.
Now let’s get to the fun part and write some machine learning code in Python!
Collecting training data
We need training data to build a model. This training data needs to come pre-labeled with “correct” answers so that the model can learn from them and make its own predictions on unseen data.
In our case, as discussed above, we’ll use our intuition that most product pages have one or more large images on the page, and most category type pages have many smaller images on the page.
What’s more, product pages typically have more form elements than category pages (for filling in quantity, color, and more).
Unfortunately, crawling a web page for this data requires knowledge of web browser automation and image manipulation, which is outside the scope of this post. Feel free to study this GitHub gist we put together to learn more.
Here we load the raw data already collected.
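The exact loading code lives in the gist mentioned above; a rough equivalent with pandas might look like the following. The file names and column layouts are assumptions based on how the two data frames are described in the next section, not copies of the original notebook.

```python
# A hedged sketch of the loading step. File names and columns are assumed.
import pandas as pd

# One row per URL: counts of <form> and <input> elements found on the page.
form_counts = pd.read_csv("form_counts.csv")  # assumed columns: url, form_count, input_count

# One row per image: the page it came from, plus file size, height, and width.
img_counts = pd.read_csv("img_counts.csv")    # assumed columns: url, file_size, height, width

print(form_counts.head())
print(img_counts.head())
```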
Feature engineering
Each row of the form_counts data frame above corresponds to a single URL and provides a count of both the form elements and the input elements contained on that page.
Meanwhile, in the img_counts data frame, each row corresponds to a single image from a particular page. Each image has an associated file size, height, and width. Pages are likely to have multiple images, so there are many rows corresponding to each URL.
It is often the case that HTML documents don’t include explicit image dimensions, so we use a little trick to compensate: we capture the size of the image files, which should be roughly proportional to the product of the width and height of the images.
We want our image counts and image file sizes to be treated as categorical features, not numerical ones. When a numerical feature, say new visitors, increases, it generally implies improvement, but we don’t want bigger images to imply improvement. A common technique to handle this is called one-hot encoding.
Most site pages can have an arbitrary number of images. We are going to further process our dataset by bucketing images into 50 groups. This technique is called “binning”.
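A rough sketch of these two steps, binning image file sizes and turning the bins into per-URL count columns, is below. It assumes the column names from the loading sketch above; the original notebook may differ.

```python
import pandas as pd

# Bucket image file sizes into 50 bins ("binning").
img_counts["size_bin"] = pd.cut(img_counts["file_size"], bins=50, labels=False)

# One column per bin, one row per URL, holding the number of images in that bin
# (a one-hot-style encoding of image sizes rather than a raw numerical feature).
img_features = (
    img_counts.groupby(["url", "size_bin"])
    .size()
    .unstack(fill_value=0)
    .add_prefix("img_bin_")
)

# Join with the per-URL form/input counts to build the full feature matrix.
features = form_counts.set_index("url").join(img_features, how="left").fillna(0)
print(features.head())
```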
Here is what our processed data set looks like.
Adding ground truth labels
As we already have correct labels from our manual regex approach, we can use them as the ground truth labels to feed the model.
We also need to split our dataset randomly into a training set and a test set. This allows us to train the machine learning model on one set of data and test it on another set that it has never seen before. We do this to prevent our model from simply “memorizing” the training data and doing terribly on new, unseen data.
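A minimal sketch of the label join and the split is below, assuming a page_group column produced by the regex work from part two and the feature matrix built in the earlier sketch; the file name is hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed file: one row per URL with the page group assigned by the part-two regexes.
labels = pd.read_csv("page_groups.csv").set_index("url")  # assumed columns: url, page_group
dataset = features.join(labels, how="inner")

X = dataset.drop(columns=["page_group"])
y = dataset["page_group"]

# Hold out 30% of pages so the model is scored on URLs it has never seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
```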
Model training and grid search
Finally, the good stuff!
All the steps above, the data collection and preparation, are generally the hardest part to code. The machine learning code is generally quite simple.
We’re using the well-known Scikit-learn Python library to train a number of popular models using a bunch of standard hyperparameters (settings for fine-tuning a model). Scikit-learn will run through all of them to find the best one. We simply need to feed in the X variables (our engineered features from above) and the Y variables (the correct labels) to each model, call the .fit() function, and voila!
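The models and parameter grids below are illustrative stand-ins rather than the exact set used in the original notebook, but they show the shape of the search.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# A handful of popular classifiers, each with a small grid of standard hyperparameters.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1, 10]}),
    "linear_svm": (LinearSVC(), {"C": [0.1, 1, 10]}),
    "random_forest": (RandomForestClassifier(), {"n_estimators": [100, 300]}),
}

best_models = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)  # training split from the step above
    best_models[name] = search
    print(name, round(search.best_score_, 3), search.best_params_)
```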
Evaluating performance
After running the grid search, we find our winning model to be the Linear SVM (0.974), with Logistic regression (0.968) coming in a close second. Even with such high accuracy, a machine learning model will still make mistakes. If it doesn’t make any mistakes, then there is definitely something wrong with the code.
In order to understand where the model performs best and worst, we will use another useful machine learning tool, the confusion matrix.
When looking at a confusion matrix, focus on the diagonal squares. The counts there are correct predictions, and the counts outside are failures. In the confusion matrix above, we can quickly see that the model does really well labeling products, but terribly on pages that are neither products nor categories. Intuitively, we can assume that such pages do not have consistent image usage.
Here is the code to put together the confusion matrix:
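What follows is a sketch rather than the exact original snippet: it uses scikit-learn’s confusion_matrix on the held-out test set with the Linear SVM from the grid search above.

```python
from sklearn.metrics import confusion_matrix

best_model = best_models["linear_svm"]   # assumed winner, per the accuracy scores above
y_pred = best_model.predict(X_test)

class_order = sorted(y.unique())         # fixed class order for rows and columns
cm = confusion_matrix(y_test, y_pred, labels=class_order)
print(class_order)
print(cm)
```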
Finally, here is the code to plot the model evaluation:
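Again, this is one reasonable way to plot it, assuming matplotlib and seaborn are installed; the original post may have used a different plotting setup.

```python
import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=class_order, yticklabels=class_order, ax=ax)
ax.set_xlabel("Predicted page group")
ax.set_ylabel("Actual page group")
ax.set_title("Confusion matrix on the test set")
plt.tight_layout()
plt.show()
```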
Resources to learn more
You might be thinking that this is a lot of work just to tell page groups apart, and you are right!
Mirko Obkircher commented in my article for part two that there is a much simpler approach, which is to have your client set up a Google Analytics data layer with the page group type. Very smart recommendation, Mirko!
I am using this example for illustration purposes. What if the issue requires a deeper exploratory investigation? If you already started the analysis using Python, your creativity and knowledge are the only limits.
If you want to jump onto the machine learning bandwagon, here are some resources I recommend to learn more:
Attend a PyData event. I got motivated to learn data science after attending the one they host in New York.
Hands-On Introduction To Scikit-learn (sklearn)
Scikit Learn Cheat Sheet
Efficiently Searching Optimal Tuning Parameters
If you are starting from scratch and want to learn fast, I’ve heard good things about Data Camp.
Got any tips or queries? Share them in the comments.
Hamlet Batista is the CEO and founder of RankSense, an agile SEO platform for online retailers and manufacturers. He can be found on Twitter @hamletbatista.
The post Using Python to recover SEO site traffic (Part three) appeared first on Search Engine Watch.
docacappella · 7 years
Text
The Best A Cappella Songs You’ve Never Heard
On this blog, I usually highlight a cappella albums that I believe deserve as much attention as the latest release by Pentatonix. You can view those posts here:
https://acappellaquest.blogspot.com/2013/11/the-best-albums-youve-never-heard-part-1.html
https://acappellaquest.blogspot.com/2014/04/the-best-albums-youve-never-heard-part-2.html
I’m going to change things up a little bit and talk about specific a cappella songs that I think also deserve special mention. Why am I changing from albums to songs? The reason is simple... I’ve discovered that it’s becoming more common for a cappella groups to release small EPs or singles. By releasing individual songs more frequently, a cappella groups can stay relevant in this ever-growing musical marketplace. To help foster that trend, here are some a cappella songs you should be listening to (in no particular order).
The criteria for selecting these songs are as follows:
A. It has to be a song I believe the majority of blog readers have not heard yet. This eliminates songs from more popular albums like the BOCA compilations and Sing-Off winners.
B. I have to be totally obsessed with it.
1) “Real Thing” by Hive
It’s probably a good sign that you like a song when you downloaded it last Wednesday and it’s already on your “top 25 most played” playlist. The ladies of Hive are clearly sending a strong message with this first single: female a cappella is here to stay, like it or not. In fact, the entire production of this single, from the arrangement to the mixing, was done by female a cappella artists.
The song is a little offbeat: it’s an arrangement of a song by Tune-Yards, a band I had never even heard of until last week. The song begins in a typical R&B style, but the sudden shift in the middle is enough to excite music nerds like me. Don’t judge the book by its cover; listen all the way through.
Download it here: https://itunes.apple.com/us/album/real-thing/1298618940?i=1298619284
2) “Agua De Beber” by Sambaranda
About a month ago, I asked the Facebook hive mind to suggest Latin a cappella albums that I could listen to, having little-to-no idea which groups specialized in Latin music. This is how I found Sambaranda, an a cappella group from Brazil. Their cover of Jobim’s “Agua De Beber” simply rocks. Half of the entire song is in 7/4, a meter where most of us never dare to tread.
What I love most about the arrangement is the opening loop, repeated several times throughout the recording. I’ve used that loop as inspiration for several a cappella arrangements I’ve recently written, and I mentioned the song in last week’s post about informative arrangements.
Download it here: https://itunes.apple.com/us/album/%C3%A1gua-de-beber/1171576410?i=1171576503
3) “Love is Just That Way” by Accent
As a massive Take 6 fan, I’ve played their albums to death. Naturally, this has led to some jazz withdrawal: it’s extremely rare to find anyone else writing the kind of complex harmonies that Take 6 delivers.
This is why I was so happy to find Accent’s new album In This Together. Their penchant for jazz writing breathes new life into my a cappella addiction. These harmonies are probably as close to Take 6 as any group has gotten thus far. Every song on the album is amazing, but my personal favorite is “Love is Just That Way.” Only a group like Accent could rock that hard and still be considered jazz.
Download it here: https://itunes.apple.com/us/album/love-is-just-that-way/1266524440?i=1266524931
4) “Stay” by Vocalight
Vocalight is the new “it” group in town, and they deserve it. A mix of alumni from Eleventh Hour and Forte, they stunned the world by taking 3rd in the Varsity Vocals Aca-Open, and now they’re debuting complex arrangements in the vein of Pentatonix, but without the restraints of trying to please a general audience. My expectation for “Stay” was that they would over-emphasize the harmonic clashes in the chorus (probably the most well-liked part of that song), but once again the group shocks and amazes me by totally reinventing the song and finding their own groove. It’s like they removed all the “Zedd” and added more “Alessia Cara.”
Download it here: https://itunes.apple.com/us/album/stay/1247700681?i=1247700682
5) “Wildest Dreams” by Drastic Measures from A cappella Academy
An older inclusion in the list, this insanely difficult version of the Taylor Swift tune makes me hate the fact that I’m too old to apply for the academy. If you were ever looking for a way to totally re-imagine a song, this would be a good example. Rarely have I heard a group sing an arrangement this complex. From now on, THIS is how I'm going to arrange Taylor Swift.
Download it here: https://itunes.apple.com/us/album/wildest-dreams/1147839392?i=1147839754
6) “Home” by Freshmen Fifteen
Another oldie but goodie. The absolute best arrangement of this song comes from the Freshmen Fifteen, who meld “Home” with several other spirituals. There’s a moment right before the final chorus that gives you goosebumps no matter how many times you hear it. The soloist conveys more emotion in this recording than every solo on the last BOCA...COMBINED. It’s raw, imperfect, and absolutely outstanding.
Download it here: https://itunes.apple.com/us/album/home/371227358?i=371227452
7) “Talk2Me” by House Jacks feat. Postyr Project
“Talk2Me” is a strange mix of rock and electronica that works a little too well. The House Jacks’ album Pollen is a concept album that has them traveling and recording with a cappella groups all over the world. The entire album deserves your attention, but “Talk2Me” is the track that grabs it hardest. The song builds an enormous amount of tension and never really releases the pressure, but you don’t seem to mind.
Download it here: https://itunes.apple.com/us/album/talk2me-feat-postyr-project/932053911?i=932053936
8) “What Kind of Band” by Avante
Avante is not widely known in a cappella circles yet, though they are better known in the vocal jazz community. This song was written for a specific kind of audience: the major a cappella nerd. I bet you’re shocked that I love this song…
Just try to catch all the a cappella easter eggs if you can…
Download it here: https://itunes.apple.com/us/album/what-kind-of-band/955049450?i=955049461
9) "In The River" by ARORA
I’m cheating a little here, because this song is not commercially available yet. As an attendee of SoJam 2017, I was able to purchase a copy of their demo CD for their upcoming album release. "In The River" is shaping up to be the next “Bridge”: a seamless mix of electronica, rock, and calming ambiance, a grouping of styles that only ARORA could pull off. While you probably can’t listen to this one yet, you can ABSOLUTELY set your expectations high and your anticipation at maximum. ARORA will deliver.
10) "Little Drummer Boy" by Five O' Clock Shadow
This one is definitely the oldest song on the list, but I have gotten multiple uses out of it in educational settings. The vocal percussion solo is a testament to both the incredible talent of David Stackhouse and the musicality one can bring to a percussion solo that is more than “look at the cool sounds I can make.” Whenever I introduce a class to vocal percussion, this is always the first track I play, because it never fails to shock and amaze. Couple that with the insane talent of this iteration of Five O’Clock Shadow, and you get my favorite a cappella holiday track of all time.
Download it here: https://itunes.apple.com/us/album/little-drummer-boy/342769403?i=342769834
Marc Silverberg
Follow the Quest:
twitter.com/docacappella
docacappella.com
facebook.com/docacappella
docacappella.tumblr.com
0 notes