A tumblog to capture all the things Peeyoosh Chandra (aka PC) finds interesting. @peeyooshchandra
Text
Building a Docker image to run Python & R
The Challenge
I wanted to containerise a pipeline of code that was predominantly developed in Python but has a dependency on a model that was trained in R.
I succeeded in doing so by using a base Ubuntu image with an installation of the r-base package and python3.6 - and much troubleshooting on the web! So I’m blogging this to share my learnings because the help forums online seem to still be in their infancy, hoping it helps others in their DevOps efforts.
The Setup
To simplify, let’s say I have an R script that runs the model (a random forest), but it needs to be part of a data pipeline that was built in Python. The Python pipeline first does some processing and generates input for the model, then executes the R script with that input, before passing the output to the next stage of the Python pipeline. So we’ll create a template for this process by writing a simple test Python function that calls an R script, and put this in a Docker container to demonstrate the capability.
Below is the test Python code “test_call_r.py”, which uses the subprocess module to execute the R script “run_rf_model.R” that runs the random forest model (as if the R script were run from the command line outside the pipeline):
# test_call_r.py
import subprocess

def call_r():
    print('Calling R')
    # Need to know the path to call the R application for executing the code via subprocess:
    subprocess.call(["Rscript", "run_rf_model.R"])
    print("Finished calling R")

call_r()
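The test function above ignores whether the R script succeeded and does not pass it any input. A minimal sketch of a slightly more defensive call is below, assuming the R script accepts an input file path as its first argument; the names run_script, call_r's parameters, model_input.csv and the 300-second timeout are all hypothetical and not part of the original pipeline. It sticks to subprocess arguments that are available on Python 3.6.

```python
import subprocess
import sys


def run_script(interpreter, script_args, timeout=300):
    """Run an external script; return its stdout or raise if it fails."""
    completed = subprocess.run(
        [interpreter] + list(script_args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str (works on Python 3.6)
        check=True,               # non-zero exit raises CalledProcessError
        timeout=timeout,          # seconds; an assumed limit, tune to the model
    )
    return completed.stdout


def call_r(script="run_rf_model.R", input_path="model_input.csv"):
    """Hypothetical wrapper: hand the pipeline's input file to the R script."""
    return run_script("Rscript", [script, input_path])


if __name__ == "__main__":
    # Stand-in demo: the Python interpreter plays the role of Rscript here,
    # so the sketch can be tried on a machine without R installed.
    print(run_script(sys.executable, ["-c", "print('ok')"]))
```

On the R side, the extra argument could be read with commandArgs(trailingOnly = TRUE). Raising on a non-zero exit code means a failed model run stops the pipeline instead of silently passing stale output to the next stage.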
The Solution
The Dockerfile I built to run R and Python together is:
FROM ubuntu:latest
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends build-essential r-base r-cran-randomforest python3.6 python3-pip python3-setuptools python3-dev
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt
RUN Rscript -e "install.packages('data.table')"
COPY . /app
The Execution
The commands to build the image, run the container (naming it SnakeR), and execute the code (from a second terminal, while the container is still running) are:
docker build -t my_image .
docker run -it --name SnakeR my_image
docker exec SnakeR /bin/sh -c "python3 test_call_r.py"
The Breakdown
I treated it like an Ubuntu OS and built the image as follows:
suppress the prompts for choosing your location during the R install (DEBIAN_FRONTEND=noninteractive);
update the apt package lists (apt-get update);
set installation criteria of:
y = yes to user prompts for proceeding (e.g. confirming the disk space to be used);
no-install-recommends = install only the required dependencies, skipping the recommended and suggested packages;
build-essential to include some essential build tools for Ubuntu;
r-base for the R software;
r-cran-randomforest to force the package to be available (installing it from within R, as is done separately for data.table, didn’t work for randomForest for some reason);
python3.6 version of Python;
python3-pip to allow pip to be used to install the requirements;
python3-setuptools, which pip needs when building packages from source distributions;
python3-dev so that the JayDeBeApi installation in the requirements succeeds (otherwise it gets confused and behaves as if it is for Python 2, not 3);
specify the active “working directory” to be the /app location;
copy the requirements file that holds the Python dependencies (built from the virtual environment of the Python codebase, e.g., with pip freeze);
install the Python packages from the requirements file (pip3 for Python 3);
install the R packages (e.g. just data.table here);
copy the directory contents to the specified working directory /app.
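One way to confirm that the steps above really left both runtimes in the image is a small smoke test run inside the container. The sketch below is an assumption of mine rather than part of the original setup: the function names and the default list of executables (python3, pip3, Rscript) are hypothetical, and it only checks that each executable is on the PATH, not that the R or Python packages load.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def check_runtime(required=("python3", "pip3", "Rscript")):
    """Raise if any required executable is absent; otherwise return True."""
    missing = missing_tools(required)
    if missing:
        raise RuntimeError("Missing from PATH: " + ", ".join(missing))
    return True
```

Saved as, say, check_runtime.py in /app, it could be called at the top of the pipeline so that a broken image fails straight away with a readable message rather than midway through a model run.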
The Take-Home
As a relative n00b to Docker, having just used some templates as a Python user (FROM python:3), I had to remind myself that a container behaves much like a lightweight Virtual Machine (strictly speaking it shares the host's kernel rather than running its own) that can be considered host to an OS, such as Ubuntu, on which any requirements can be installed. So the need to have Python and R beside each other in one container should not have been considered a challenge. After some web-searching I was under the impression that it might not be so straightforward, and couldn’t find examples of others who had done it already. But with some perseverance and troubleshooting of each error message as it arose, I eventually built a Docker image containing both R and Python - and it worked! So this is me sharing my process in case it helps anyone else new to Docker who needs something like it. Enjoy!
p.s. Thanks to colleagues who entertained my chats about this - their experiences and discussion helped it happen.
Quote
Too much flow could be a bad thing. Right now, we’re beginning to realize that all this connectivity enabling much richer flows on a global scale also has its downside. We’re all familiar with the flash crashes associated with high frequency trading on electronic markets. Stories of significant security breaches multiply as we become more connected. Extreme events cascade out of nowhere to disrupt activity on a large scale. It turns out that too much flow can make our systems more fragile.
Edge Perspectives with John Hagel: Flows, Fragility and Friction (via edgeperspectives)
Transaction cost theory at work
Quote
Consumption is so great that Twitch is already larger than 70% of American television networks, as well as Amazon’s own OTT video service.
Why the Next Sports Empire will be Built on eSports (via courtenaybird)
Text
The Need-Time-Location model for mobile advertising
If you browse the NZ news websites Stuff and NZ Herald, you will no doubt have been hit by a mobile homepage takeover similar to the one below.
It works by gaining attention via interruption. It is pretty annoying because it ignores the user’s needs, and so, over time, a number of consumers develop an avoidance reflex.
Mobile is a completely new environment, with a completely different personal & local context.
So while we’ve had Social, Local & Mobile (SoLoMo), probably the most important context we’re missing is that of time of day.
Current mobile takeover products in NZ don’t support time-of-day targeted advertising. Supporting it would incur more production cost, but may improve usefulness to the end recipient.
From Felix Portnoy - Chasing the Holy Grail of Mobile Advertising: An Interaction Framework for a User-Centered Mobile Marketing
Text
A model to understand how advertising influences brand perceptions
An oldie, but a goodie as they say. So often we forget that brand credibility and past experience significantly influence advertising credibility and receptiveness.
More interesting is how digital media, which is more pervasive, can act as a catalyst in shifting these perceptions. Little research has been done till now, but early signs from academia indicate that we’re in for _more_ change soon.
Photo
Interesting charts about users’ reactions to various brands’ social media presence and content.
Text
And so finally we witness Lawrence Lessig’s Creative Commons begin to come to pass.
German court hands Kraftwerk its ass, rules sampling is legal

Today, Kraftwerk lost its vindictive, 19-year-long copyright suit against Sabrina Setlur, whose 1997 song “Nur mir” looped a drum sequence from Kraftwerk’s 1977 “Metall auf Metall.”
The court found that sampling had helped create hip-hop music, and held that if a sample’s effect on the rightsholder’s usage rights is “negligible, then artistic freedom overrides the interest of the owner of the copyright.”
In support of his case, Kraftwerk’s Ralf Huetter (who had previously obtained a court order requiring the suppression of Nur mir) cited the Bible, insisting “that the commandment ‘thou shalt not steal’ applied also to music.”
http://boingboing.net/2016/05/31/german-court-hands-kraftwerk-i.html
Link
The Internet Advertising Bureau has made a belated (but not unwelcome!) conversion to the cause of improved user experience and lighter, less intrusive ads. The move comes because publishers and digital advertisers are having kittens over the rise of ad blocking software.
The UX crisis is real, modern ads are invasive and intrusive, and they hoover up your data like nobody’s business (something the IAB doesn’t dwell on). The alternative - LEAN ads as an alternative consumer standard - is probably the least the IAB can do and may turn out to be the most: its announcement includes lots of reassurance that all this consumer-friendly stuff is only optional, don’t worry. It’s quite possible that advertisers and publishers will go the road of the stick, not the carrot, and ban ad-blocker users from parts of the web.
But it seems to me there’s something disingenuous about its statement, even so. The way the IAB is telling it, they simply took their eye off the ball in pursuit of advertising excellence: they were so busy saving the free and democratic internet in the wake of the dotcom crash that they didn’t notice that invasive ads were spreading like knotweed and most media publishers were looking like shonky Russian torrent sites. It was a mistake, a sin of omission.
Except this isn’t quite true. The IAB’s problem wasn’t that it was so busy helping businesses that it ignored user experience entirely. The truth is if anything less callous but more awkward. The knotweed didn’t creep up on them. They helped plant it. The IAB, and other advertising bodies, have had lots to say about digital advertising and user experience, but until quite recently much of it has been based on the idea that people are fine with digital advertising and if anything want to see more of it.
The idea is rooted in some basic beliefs, which seem to me common to a lot of digital marketers. The beliefs run something like this: old advertising was mass media and based on interruption. It wasn’t efficient and people didn’t like it. But people would like new, online advertising if it was engaging and relevant. And young people would especially like it.
This presentation (link opens a powerpoint) from 2012 is a good example - the results of a study which asked people whether or not they agreed with statements about advertising, relevant advertising, privacy, and so on. The conclusions are upbeat: people expect to see advertising online, they find it helpful, they like relevant advertising more, and so on. The conclusions are also a fair summary of the data, as far as I can tell. There are perhaps a few warning signs an analyst might have brought out - almost 60% of people say they would like relevant advertising more. Only half that figure say they actually see more relevant advertising.
But on the whole, the IAB believed what the survey, and surveys like it, told them about what people thought, and reported it fairly. And hey, even if consumers currently didn’t feel advertising was relevant, with all the data being collected on them, and all the engaging new ways of communicating with them online, they soon would. Right? The IAB and other advertising bodies had faith in the idea of a new type of consumer, a digital native who understood the role advertising played and welcomed well-targeted, engaging and helpful examples of it, personalised using their data.
But this was a fantasy. Ad blocking is particularly on the rise among young people, the very group supposed to tolerate or enjoy digital ads. The promise of relevant, targeted ads either has not materialised yet or has failed to motivate people. Engagement with brands on social media is falling. And the very things that were meant to improve advertising - collecting personal data to improve relevance, and using rich media to improve engagement - have helped make it slower, more unwieldy and less popular than ever.
The claims made by the IAB - that advertising funds the web as we know it - are entirely true, but the rise of invasive, data-hungry, memory-heavy, non-LEAN advertising is not just down to an excess of understandable enthusiasm. It’s the consequence of a bet on a particular type of consumer and set of attitudes to advertising - one with the additional, seductive property that marketers very much wanted it to be true.
Why wasn’t it true? What did that survey (and others like it) get wrong? Three things spring to mind. The first is that what people think doesn’t necessarily match what they’ll do. People may think all sorts of reasonable things about advertising in the abstract, but still happily download a plug-in or tick a box that removes it, if that feels like an easy option. The second is that even in the survey, enough people disliked advertising to make a very big hole in the accounts if the option is given to get rid of it.
And the third is that the word “relevant” is not very useful. Would I like ads to be relevant to me? Yes. But what I mean by that isn’t necessarily what advertisers mean. Relevance is something you can only really decide after you see an ad, and it’s likely to correlate very strongly with “do I like what’s being advertised”. And a near miss - like something I’ve already decided I don’t want - may well feel less relevant than something miles off, even though that’s objectively unfair.
(Retargeting is the classic example here - the IAB, incidentally, deplore the practice of retargeting people who’ve already bought a thing. But that isn’t why retargeting is annoying! It has a relatively high conversion rate, I believe - because people who consider something might regret not buying it. For those people, it’s relevant advertising. But for the majority who don’t regret not buying it, and have actively rejected it, retargeting is the opposite of relevant!)
Where does this actually leave the IAB and advertising? They’re doing the right thing (or taking a big step towards it). Advertising UX is a disaster area. But their justification is going in the wrong direction - blaming the current mess on an excess of advertiser high spirits, rather than looking even slightly at the ideas and theories underpinning it. If those stay intact, no reform is likely to last.
Link
While the report is still incomplete and going through peer review, it finds that the volume of interaction with Firm Generated Content (FGC) has a positive association with car sales.
As more and more consumers move to evaluate their purchases online, looking for peer recommendations, the level of investment in socially bonding consumers to a brand will become even more important.
So it’s interesting that a number of strategists are now talking about moving away from “rented” platforms like Facebook, etc., and investing more in owned platforms.
Fundamentally, the way we think about distributing Firm Generated Content requires careful analysis of the level of influence by channel and frequency of exposure.
For the last few months, Facebook has proven to be a very effective channel for content distribution and driving in-store traffic.
Link
In 2013, eBay famously published the results of their study into how (in)effective paid search advertising was.
It was an interesting time, and the debate on digital advertising effectiveness was raging. Apparently, you were more likely to survive a plane crash than click on a banner.
Native advertising was on the rise, and the death of the banner had been pronounced.
And yet, the facts did not reconcile with the hype. And this research report showed how digital advertising was actually working.
Three years on, we’re seeing average banner response rates increase, and more importantly the area of cross-media interaction analysis is showing that research online, buy offline is how consumers now shop.
Link
As our investment in digital advertising increases, how do we build a robust model of how the advertising is working?
This research paper starts off by correctly stating the attribution problem that so many are struggling with.
“Currently, the literature on the effects of online advertisement solely looks on immediate response: that is only action that the user has taken directly after being exposed to a specific ad are measured and analysed. While this allows for simplicity, much, if not most, of the effects of marketing are not immediate but happen in a significantly longer timeframe.”
First, we know very little about the interactions between different online strategies when used together as a strategic mix in a campaign. Secondly, we do not know what are the dynamic overtime overall impacts of the usage of these different technological mixes, throughout an online marketing campaign.
To mitigate both of these limitations, we develop a framework that views online shopping as a journey.
Thus, instead of seeing the interaction of ads and users as immediate cause and effect, we analyse online shopping as a continuous process in which users develop preferences and gather information before culminating in a purchase.
This paper provides an outline of how to think through balancing paid search & behavioural targeting when developing digital media strategies. There is still an underlying assumption that somehow the brand is already known ...
Link
“Do ads have any effect?”
This is epitomized by an influential paper by Abraham et al. (1990), which has as its first sentence, “Until recently, believing in the effectiveness of advertising and promotion was largely a matter of faith.”
Link
Abstract: This paper presents the first large-scale measurement of the effectiveness—measured in terms of incremental conversion gains—of online search ads. We develop a simple metric called net acquisition benefit (NAB) that admits comparisons between the efficacy of different ad campaign strategies without access to advertisers’ private financial information. We study three common campaign strategies used by advertisers on a large search ad network: cannibalization, poaching, and ad extensions.
Link
The text below is that of the speech given by Hal Varian, Chief Economist at Google, on 25 September 2013 in Milan at the awards ceremony of the annual Italian journalism award È giornalismo. Hal Varian is the 2013 award winner but declined the prize money and asked the award jury to nominate innovative Italian journalists working in online media ...
Understanding the problem is half the battle
Link
In a study of a major grocery store's Facebook page (150,000 fans), YetiData and Collective Bias found that Facebook fans of the store on average bought 125 more items than a typical customer — a 35 percent rise.
Which is very close to what the Ehrenberg-Bass Institute study showed, and generally shows that your most loyal customers are the ones most likely to engage and purchase your brand and consume your product. This is even more true now that reach in Facebook is limited.
The broader question is how long does it take to build up this loyalty in new customers?