This was simultaneously hilarious and extremely relevant.
artemisarticles
poipoipoi-2016
On the one hand, this is fascinating and overlaps with certain things I’ve seen elsewhere albeit on higher budgets with more testing and more explicit QA people. 
On the other hand “Step 1: Build an exhaustive detailed description of what you want”, bitch please there is a reason we invented agile and it’s because no one including you knows what you want and being able to do 2-week to 2-month (to 2-hour on internal tooling) v1 -> feedback -> you get v2 is why modern software is what it is.  For good and bad.    
theemperorsfeather
This is really quite wonderful:
The Japanese employment market has a curious feature: there are regions of Japan with extremely high economic productivity (such as Tokyo, Osaka, and Nagoya, but for the purpose of this issue think “Tokyo” and you won’t be wrong) and regions with low economic productivity (substantially everywhere else). This counsels that a young person born and educated in e.g. Gifu move to Tokyo after graduation to earn a living.
Many, many do. While Japan’s overall population is declining, Tokyo’s increases by about 100,000 people per year.
The regions in Japan are not thrilled about this state of affairs for many reasons. Tokyo isn’t just the seat of Japanese commerce; it also houses the government, media, cultural institutions, etc etc. There is a real sense that your children moving to Tokyo causes them to lose connection with their culture and that the rewards from the national enterprise aren’t being allocated fairly. Tokyo, for its perspective, views the regions with the noblesse oblige that you would expect a cosmopolitan center of culture and learning to have with respect to their benighted country bumpkin cousins.
But Japan has a policy response for it, and it is sort of beautiful. Called ふるさと納税 (Furusato Nouzei or, roughly, the Hometown Tax System), it works something like this:
A substantial portion of Japan’s income-based taxes are residence taxes, which are paid to the city and prefecture (think state) that one resides in, based on one’s income in the previous year. The rate is a flat 10% of taxed income; due to quirks of calculating this which almost certainly aren’t relevant to you, you can estimate this as 8% of what white collar employees think their salary is.
Furusato Nouzei allows you to donate up to 40% of next year’s residence tax to one or many cities/prefectures of your choice, in return for a 1:1 credit on your tax next year. This is entirely opt-in. Anyone can participate, regardless of where they live.
In principle, the idea is to donate to one’s hometown. Importantly, one actually has unfettered discretion as to which city/prefecture one donates to...
There exists a culture in Japan of reciprocating gifts. . .
While not formally defined in the legislation for the Furusato Nouzei system, someone at a city government figured that it was just not appropriate to let someone just give ~3% of their salary to the city without receiving a token of appreciation in return. So they sent something back; a can of locally-produced plums, say, to remind you of the tastes of your childhood.
And this was a beautiful idea! It directly improved the ability of the system to cement relationships between internal migrants and their hometowns, one of the declared goals of the system. It motivated people to fill out paperwork and float the city a bit of money for part of a year, because who doesn’t like free plums. (You might sensibly object that they aren’t free given the time value of money, but prevailing interest rates in Japan are indistinguishable from zero.) And it let cities specialize in marketing this initiative.
And specialize they did.
... The government wasn’t willing to adjudicate one’s “true” hometown; 帰る場所 is where the heart is.
And then some bureaucrat realized that this created a market: you, as a city government, can bid for taxpayers to select you as a hometown.
readonline
mainepage
nomerit
nancydsmithus
Text To Speech And Back Again With AWS (Part 2)
Philip Kiely
This is the second half of a series on transforming content between text and speech on AWS. In part one, we used Amazon Polly to narrate blog posts and embedded the content in a website using an audio tag. In this article, we will use speech-to-text to draft transcripts of podcasts and interviews for publication. Finally, we will evaluate the overall accuracy of these format-transformation technologies by running a few samples through round-trip transcriptions.
Speech-To-Text Project
In 2012, Patrick McKenzie (a.k.a. patio11, of Kalzumeus and Stripe) and Ramit Sethi (of I Will Teach You To Be Rich) sat down and recorded two hour-long podcasts. As I am a fan of both of their work, I probably would have listened to the podcasts, but I definitely wouldn’t have listened to them several times each. The transcripts, on the other hand, I can reread and reference at my leisure. I also freely recommend the series when talking to people about freelancing, knowing that I am giving them a resource that takes a quarter of the time to read that it takes to listen to. Even though the content of the podcasts and transcripts is exactly the same, the combination is 10× as useful as the podcast alone.
In the first transcript, McKenzie says that he paid 75 dollars and waited a couple of days to have the podcast transcribed by a professional service. His other option was to transcribe it himself. When I worked for my college’s newspaper, I frequently transcribed interviews. Over time, I got more practiced at the skill and improved from taking four minutes of transcribing per minute of audio to three minutes per minute. While I imagine that a professional with specialized equipment and a faster typing speed could drop below two minutes per minute, as an amateur transcriber McKenzie likely saved himself five or six hours of work by paying for the service.
Seven years later, it seems like he should have another option: an automated transcription with Amazon Web Services. As we’ll see, the transcription would require significantly more editing before it would be publication-ready, but automated transcription has two killer features compared to hiring a professional: speed and price. He would have gotten the transcription back in real time for about a dollar. In this article, I’ll explain how you can use Speech-to-Text on AWS to easily make your content multi-format, and I’ll share ideas for using Amazon Transcribe in more complex applications.
Amazon provides a console to experiment with Transcribe. To access the console, log on to your AWS account and search “Transcribe” in the services search field. The console exposes the full power of Transcribe, and if you’re only planning on transcribing a few pieces of content per week then using the console is a solid long-term option. The transcription console gives you two options: streaming audio and uploading a file.
You can launch live transcriptions in the real-time transcription tab.
The “real-time transcription” tab offers the ability to speak into the microphone and have a transcription generated in real time. Speaking deliberately, and with my computer’s onboard microphone, I was able to transcribe the sentence “Smashing Magazine publishes technical content for developers worldwide” on the first try. However, when I tried to transcribe the previous paragraph at a more conversational speed and articulation, there were numerous errors.
“Amazon provides a consul to experiment with transcribe access. The console log onto a ws account and search transcribed in the services search field, The consul exposes the full power of transcribed. And if you only planning on transcribing a few pieces of content a week than using the consul is a solid long term option. The transcription Council gives you two options streaming audio and uploaded a file.”
In addition to simply missing some words, Transcribe has issues with homophones and punctuation. In the first sentence, it transcribed “console” as “consul.” This homophone error can only be corrected by evaluating each transcribed word in the context of the sentence and adjusting according to the algorithm’s best guess. The first sentence also runs into the second, which throws off the grammatical structure and meaning of the entire rest of the paragraph. Beyond contextual clues, Amazon Transcribe seems to use pauses to determine punctuation. That said, I am using a built-in microphone, transcribing in real time, and to be honest I don’t have the clearest speaking voice. Let’s see if we can find improvements by mitigating each of these factors.
I used a Blue Yeti, a midrange all-purpose recording microphone, to stream audio into the console. As you can see in the image below, improved audio quality did not significantly improve transcription quality. I hypothesize that while a poor quality audio input would further degrade the text’s accuracy, improvement past the threshold of a built-in microphone or cheap webcam does not provide the quality transcription that we are looking for.
Improving microphone quality does not materially improve transcription quality.
Using the same microphone, I recorded the same paragraph as an .mp3 file and uploaded it for transcription. To do the same, navigate to the “Transcription Jobs” panel and click the orange button with the text “Create Job.” This will bring you to a form where you can configure the transcription job.
A transcription job requires a title, language, input source, and file format.
The job name is arbitrary; just choose something that will be meaningful to you when you review the completed jobs. You can select from about a dozen languages, with English and Spanish available in regional variants. The transcription service draws its input from S3, so you’ll need to upload your audio file to the storage service before you can run the job. You can upload the file in one of four supported formats: .mp3, .mp4, .wav, and .flac.
A transcription job offers data location and audio identification options.
If you want to keep the output data in a permanent location, change “Data location” to “Customer specified” and enter the name of an S3 bucket that you can write to. Finally, you can choose between two identification options. Channel identification tags input with the channel that it came from in the audio file, while “Speaker identification” attempts to recognize distinct voices in the audio. If you are transcribing a multi-person podcast or interview, Speaker identification is a useful feature, but it is not applicable to this simple test.
Inspecting the output, unfortunately, reveals that the transcription is no more accurate than the real-time console transcription. However, running a transcription job does provide more data. In addition to the transcription text, the job outputs JSON with each word, its confidence score, and alternate words considered, if any. If you want to write your own natural language processing code to try to improve the readability of the output, this data will give you what you need to get started.
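As a starting point for that kind of post-processing, here is a minimal sketch that flags low-confidence words for manual review. The sample JSON is hand-written to mimic the shape I understand the job output to have (a `results.items` list whose entries carry `alternatives` with string-valued `confidence` scores); treat the field names as an assumption and check them against a real output file. The function name `low_confidence_words` and the 0.8 threshold are my own choices, not part of Transcribe.

```python
import json

# Hand-written sample imitating the assumed layout of a Transcribe result file.
# Real files are much longer; the field names here are an assumption to verify.
SAMPLE_OUTPUT = """
{
  "jobName": "example-job",
  "results": {
    "transcripts": [{"transcript": "The transcription council."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.00", "end_time": "0.12",
       "alternatives": [{"confidence": "0.99", "content": "The"}]},
      {"type": "pronunciation", "start_time": "0.12", "end_time": "0.55",
       "alternatives": [{"confidence": "0.98", "content": "transcription"}]},
      {"type": "pronunciation", "start_time": "0.55", "end_time": "0.98",
       "alternatives": [{"confidence": "0.70", "content": "council"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  }
}
"""


def low_confidence_words(raw_json, threshold=0.8):
    """Return (word, confidence, start_time) for words scored below threshold."""
    data = json.loads(raw_json)
    flagged = []
    for item in data["results"]["items"]:
        # Punctuation entries carry no meaningful confidence, so skip them.
        if item.get("type") != "pronunciation":
            continue
        best = item["alternatives"][0]  # assumed to be the top-ranked guess
        confidence = float(best["confidence"])  # scores arrive as strings
        if confidence < threshold:
            flagged.append((best["content"], confidence, item.get("start_time")))
    return flagged


for word, confidence, start in low_confidence_words(SAMPLE_OUTPUT):
    print(f"{start}s: {word!r} ({confidence:.0%})")
```

A list like this tells an editor where to look first, which is cheaper than rereading the whole transcript against the audio.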
Finally, I had a friend who hosts a local radio show narrate the same paragraph for live transcription. Despite his steady pace and clear enunciation, the resulting text was no more accurate than any of my live transcription attempts. While a professional narrator may be able to achieve even more specific pronunciation, the technology is really only useful if it is widely usable.
Unfortunately, it seems that the transcription quality is too low to fully automate our proposed use case. Depending on your typing speed, running audio through Amazon Transcribe and then editing by hand may be faster than simple manual transcription, but it is not a turnkey solution for speech-to-text that compares to what exists for text-to-speech. For specific domains, you can define Custom Vocabularies to improve transcription accuracy, but out of the box, the service is insufficiently advanced.
As with most of its services, AWS offers an API for using Transcribe. Unless you have a large number of files to transcribe or you need to transcribe audio in response to events, I would recommend using the console and save yourself the time of setting up programmatic access.
To use Transcribe from the AWS CLI, you’ll need a JSON file and a terminal command.
aws transcribe start-transcription-job \
  --region YOUR_REGION_HERE \
  --cli-input-json file://YOUR_FILE_PATH.json
At YOUR_FILE_PATH.json, you’ll need a .json file with four pieces of information. As above, you can set any meaningful string as the TranscriptionJobName and any supported language as the LanguageCode. The CLI supports the same four media file formats and still reads the media file from S3.
{
  "TranscriptionJobName": "request ID",
  "LanguageCode": "en-US",
  "MediaFormat": "mp3",
  "Media": {
    "MediaFileUri": "https://YOUR_S3_BUCKET/YOUR_MEDIA_FILE.mp3"
  }
}
This kind of access is also available through a Python SDK. Amazon recommends Transcribe for voice analytics, search and compliance, advertising, and closed-captioning media. In each of these cases, the transcribed text is an input to another system like Amazon Comprehend rather than the final output. Thus, as a developer, it is important to design your system and limit its use cases to tolerate the range of errors that Transcribe will feed into your application.
Note: For more on using Amazon Transcribe and other services programmatically, check out Amazon’s getting started guide.
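For the Python route mentioned above, a rough sketch using boto3 might look like the following. It sends the same four fields as the CLI input file. The helper names `build_job_request`, `start_job`, and `main`, the job name, and the placeholder region and bucket values are illustrative assumptions; boto3 is a third-party package, and `main()` only works with AWS credentials configured.

```python
def build_job_request(job_name, media_uri, language_code="en-US", media_format="mp3"):
    """Assemble the same four fields used in the CLI input file."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
    }


def start_job(client, request):
    """Submit the job through a Transcribe client and return its initial status."""
    response = client.start_transcription_job(**request)
    return response["TranscriptionJob"]["TranscriptionJobStatus"]


def main():
    # Imported here so the helpers above can be tested without boto3 installed.
    import boto3

    client = boto3.client("transcribe", region_name="YOUR_REGION_HERE")
    request = build_job_request(
        "my-first-job",  # hypothetical job name
        "https://YOUR_S3_BUCKET/YOUR_MEDIA_FILE.mp3",
    )
    print(start_job(client, request))
```

Passing the client in as an argument keeps the request-building logic separate from AWS itself, which makes it easy to exercise with a stub before spending money on real jobs.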
Round Trip Accuracy
While the live performance of Amazon Transcribe was somewhat disappointing, we can investigate the theoretical maximum accuracy of the system by transcribing something that was read by Amazon Polly. The two services should be using compatible pronunciation libraries and speech cadences, so text input into Amazon Polly should survive the round trip more or less intact. Of course, we will stick with the same test paragraph.
Lo and behold, this is the only strategy that has made the transcription noticeably better:
“Amazon provides a console to experiment with transcribe. To access the console, log onto your AWS account and search transcribing the service’s search field. The console exposes the full power of transcribe, and if you’re only planning on transcribing a few pieces of content per week than using the console is a solid long term option. The Transcription council gives you two options. Streaming audio and uploading a file.”
Stubborn errors persist (“council” versus “console” comes in at 70% confidence) but overall the text is a few edits away from useable. However, most of us don’t speak like synthesized robots, so this quality is unavailable to us at the time of writing.
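To put a number on "a few edits away," one option is word error rate: the word-level edit distance between the original text and the transcript, divided by the length of the original. This is a sketch of that standard metric, not something Transcribe reports; the function names and the choice to ignore case and punctuation are my own assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and keep only words, so punctuation is not scored."""
    return re.findall(r"[a-z0-9']+", text.lower().replace("’", "'"))


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        raise ValueError("reference text contains no words")
    # previous[j] holds the distance between the ref words seen so far and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


original = "The transcription console gives you two options: streaming audio and uploading a file."
round_trip = "The Transcription council gives you two options. Streaming audio and uploading a file."
print(f"Word error rate: {word_error_rate(original, round_trip):.1%}")
```

Because this version strips punctuation, it measures only word choice; the sentence-boundary problems described earlier would need a separate check.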
While the quality of the output speech and text is noticeably lower than what a person would produce, these services cost so little that they are a strong alternative for many applications. Text-to-speech, at 4 dollars per million characters (16 dollars per million for the superior neural voices), can narrate articles in seconds for pennies. Speech-to-text, at .04 cents per second, can transcribe podcasts in minutes for about a dollar. Of course, prices may change over time, but historically as technologies like these improve, they tend to become less expensive and more effective.
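A quick back-of-the-envelope calculator makes those rates concrete. The rates are the ones quoted in this article at the time of writing, so check current pricing before relying on them; the function names and the 10,000-character article length are just illustrative.

```python
# Rates as quoted in this article; current AWS pricing may differ.
POLLY_STANDARD_PER_MILLION_CHARS = 4.00   # dollars
POLLY_NEURAL_PER_MILLION_CHARS = 16.00    # dollars
TRANSCRIBE_PER_SECOND = 0.0004            # dollars, i.e. .04 cents


def polly_cost(characters, neural=False):
    """Estimated cost in dollars to narrate the given number of characters."""
    rate = POLLY_NEURAL_PER_MILLION_CHARS if neural else POLLY_STANDARD_PER_MILLION_CHARS
    return characters / 1_000_000 * rate


def transcribe_cost(seconds):
    """Estimated cost in dollars to transcribe the given duration of audio."""
    return seconds * TRANSCRIBE_PER_SECOND


article_chars = 10_000   # hypothetical article length
podcast_seconds = 60 * 60  # one hour of audio

print(f"Narrate article (standard): ${polly_cost(article_chars):.2f}")
print(f"Narrate article (neural):   ${polly_cost(article_chars, neural=True):.2f}")
print(f"Transcribe one hour:        ${transcribe_cost(podcast_seconds):.2f}")
```

At the quoted rate, an hour-long podcast comes to roughly $1.44, compared with the 75 dollars McKenzie paid a professional service.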
Because of the low cost, you can experiment with these technologies for things like improving your personal productivity. When biking or driving to work, it is impossible to type notes or outline a project; however, speaking and automatically transcribing a stream-of-consciousness narration would get a lot of planning done. Journalists frequently transcribe long interviews, a process which AWS can automate by tagging the voices of people speaking in a recording. On the other side of the writing process, having a steady, robotic voice read your work back to you can help you identify errors and awkward phrasing.
These technologies already have a number of use cases, and that number will only expand over time as the technologies improve. While text-to-speech is reaching near-perfect accuracy in pronunciation, especially when assisted by pronunciation alphabets and tags, the synthesized voice still doesn’t sound fully natural. Speech-to-text systems are pretty good at transcribing clear speech but still struggle with punctuation, homophones, and even moderately quick speech. Once the technologies overcome these challenges, I anticipate that most applications will have a use for at least one of them.
pearwaldorf
Some people write fanfic as a hobby. This dude wrote letters disputing erroneous debts for people. There’s a lot of real good advice in here about how to deal with reporting accounts that were not opened by you, which is probably going to become very relevant after the Experian hack. 
isanah
The author’s had to deal with debt that wasn’t his, and he recommends against a credit freeze. Really good post, worth bookmarking as a resource.
allisonacs
Just re-posting again because people always seem to forget.
Do you like music? What is your favorite musical genre to listen to? I personally love pornogrind and harsh noise
Yeah, I love listening to music.
Some favorites:
NPR's Music Minute (NPR) is a podcast about music that's good.
The Algebraist's YouTube channel (tA)
Kalzumeus' blog post (kA)
pdudits
“Your salary negotiation, which routinely takes less than 5 minutes, has an outsized influence on your compensation. It's very hard to get a $5,000 bonus through outstanding job performance, but you can trivially pick up $5,000 in salary negotiations.” https://t.co/6kO52drWXN
— John Arundel (@bitfield) December 2, 2020
cahouser
ourwitching
There exists an idiom called “dropping a hash” which is widely understood in the securi...
pawelpiotrowski
90% of programming jobs are in creating Line of Business software: Economics 101: the price for anything (including you) is a function of the supply of it and demand for it.
Software solves business problems.  Software often solves business problems despite being soul-crushingly boring and of minimal technical complexity. (..) It does not matter to the company that the reporting form is the world’s simplest CRUD app, it only matters that it either saves the company costs or generates additional revenue. There are companies which create software which actually gets used by customers, which describes almost everything that you probably think of when you think of software.  It is unlikely that you will work at one unless you work towards making this happen.  Even if you actually work at one, many of the programmers there do not work on customer-facing software, either.
Engineers are hired to create business value, not to program things:  Businesses do things for irrational and political reasons all the time (see below), but in the main they converge on doing things which increase revenue or reduce costs. Status (..) is awarded to people who successfully take credit for doing one of these things.
The person who has decided to bring on one more engineer is not doing it because they love having a geek around the room, they are doing it because adding the geek allows them to complete a project (or projects) which will add revenue or decrease costs.  Producing beautiful software is not a goal.  Solving complex technical problems is not a goal.  Writing bug-free code is not a goal.  Using sexy programming languages is not a goal.  Add revenue.  Reduce costs.  Those are your only goals.
Profit Centers are the part of an organization that bring in the bacon: partners at law firms, sales at enterprise software companies, “masters of the universe” on Wall Street, etc etc.  Cost Centers are, well, everybody else.  You really want to be attached to Profit Centers because it will bring you higher wages, more respect, and greater opportunities for everything of value to you.
Engineers in particular are usually very highly paid Cost Centers. This is what brings us wonderful ideas like outsourcing, which is “Let’s replace really expensive Cost Centers who do some magic which we kinda need but don’t really care about with less expensive Cost Centers in a lower wage country”. (..) Nobody ever outsources Profit Centers.
Don’t call yourself a programmer: “Programmer” sounds like “anomalously high-cost peon who types some mumbo-jumbo into some other mumbo-jumbo.”  If you call yourself a programmer, someone is already working on a way to get you fired.
You know Salesforce, widely perceived among engineers to be a Software as a Service company?  Their motto and sales point is “No Software”, which conveys to their actual customers “You know those programmers you have working on your internal systems?  If you used Salesforce, you could fire half of them and pocket part of the difference in your bonus.” (There’s nothing wrong with this, by the way.  You’re in the business of unemploying people.  If you think that is unfair, go back to school and study something that doesn’t matter.)
Instead, describe yourself by what you have accomplished for previous employers vis-a-vis increasing revenues or reducing costs.  If you have not had the opportunity to do this yet, describe things which suggest you have the ability to increase revenue or reduce costs, or ideas to do so.
Similarly, even though you might think Google sounds like a programmer-friendly company, there are programmers and then there’s the people who are closely tied to 1% improvements in AdWords click-through rates.
Do Java programmers make more money than .NET programmers?  Anyone describing themselves as either a Java programmer or .NET programmer has already lost, because a) they’re a programmer (you’re not, see above) and b) they’re making themselves non-hireable for most programming jobs.
Talented engineers are rare — vastly rarer than opportunities to use them — and it is a seller’s market for talent right now in almost every facet of the field.  Everybody at Matasano uses Ruby.  If you don’t, but are a good engineer, they’ll hire you anyway.  (A good engineer has a track record of — repeat after me — increasing revenue or decreasing costs.)  Much of Fog Creek uses the Microsoft Stack.  I can’t even spell ASP.NET and they’d still hire me.
There are companies with broken HR policies where lack of a buzzword means you won’t be selected.  You don’t want to work for them, but if you really do, you can add the relevant buzzword to your resume. (..)
Co-workers and bosses are not usually your friends: You will spend a lot of time with co-workers.  You may eventually become close friends with some of them (..) You should be a good person to everyone you meet — it is the moral thing to do, and as a sidenote will really help your networking.
You radically overestimate the average skill of the competition because of the crowd you hang around with: Many people already successfully employed as senior engineers cannot actually implement FizzBuzz. Key takeaway: you probably are good enough to work at that company you think you’re not good enough for.
Networking: it isn’t just for TCP packets: Networking just means a) meeting people who at some point can do things for you (or vice versa) and b) making a favorable impression on them.
Strive to help people.  It is the right thing to do, and people are keenly aware of who have in the past given them or theirs favors.  If you ever can’t help someone but know someone who can, pass them to the appropriate person with a recommendation.  If you do this right, two people will be happy with you and favorably disposed to helping you out in the future.
Academia is not like the real world: Your GPA largely doesn’t matter (modulo one high profile exception: a multinational advertising firm). (..) it only determines whether your resume gets selected for job interviews.
Your major and minor don’t matter.  Most decisionmakers in industry couldn’t tell the difference between a major in Computer Science and a major in Mathematics if they tried.
In general, big companies pay more (money, benefits, etc) than startups.  Engineers with high perceived value make more than those with low perceived value.  Senior engineers make more than junior engineers.  People working in high-cost areas make more than people in low-cost areas.  People who are skilled in negotiation make more than those who are not.
We have strong cultural training to not ask about salary, ever. This is not universal.  In many cultures, professional contexts are a perfectly appropriate time to discuss money.  (If you were a middle class Japanese man, you could reasonably be expected to reveal your exact salary to a 2nd date, anyone from your soccer club, or the guy who makes your sushi.)
If I were a Marxist academic or a conspiracy theorist, I might think that this bit of middle class American culture was specifically engineered to be in the interests of employers and against the interests of employees.  Prior to a discussion of salary at any particular target employer, you should speak to someone who works there in a similar situation and ask about the salary range for the position.
Engineers are routinely offered a suite of benefits.  It is worth worrying, in the United States, about health insurance (traditionally, you get it and your employer foots most or all of the costs) and your retirement program, which is some variant of “we will match contributions to your 401k up to X% of salary.”  The value of that is easy to calculate: X% of salary.  (It is free money, so always max out your IRA up to the employer match.  Put it in index funds and forget about it for 40 years.)
How do I become better at negotiation?  This could be a post in itself.  Short version:
a)  Remember you’re selling the solution to a business need (raise revenue or decrease costs) rather than programming skill or your beautiful face.
b)  Negotiate aggressively with appropriate confidence, like the ethical professional you are.  It is what your counterparty is probably doing.  You’re aiming for a mutually beneficial offer, not for saying Yes every time they say something.
c)  “What is your previous salary?” is employer-speak for “Please give me reasons to pay you less money.”  Answer appropriately.
d)  Always have a counteroffer.  Be comfortable counteroffering around axes you care about other than money.  If they can’t go higher on salary then talk about vacation instead.
e)  The only time to ever discuss salary is after you have reached agreement in principle that they will hire you if you can strike a mutually beneficial deal.  This is late in the process after they have invested a lot of time and money in you, specifically, not at the interview.
f)  Read a book.  Many have been written about negotiation.  I like Getting To Yes.
Working at a startup, you tend to meet people doing startups. Most of them will not be able to hire you in two years. Working at a large corporation, you tend to meet other people in large corporations in your area.  Many of them either will be able to hire you or will have the ear of someone able to hire you in two years.
Working in a startup is a career path but, more than that, it is a lifestyle choice. This is similar to working in investment banking or academia. Those are three very different lifestyles.  Many people will attempt to sell you those lifestyles as being in your interests, for their own reasons.  If you genuinely would enjoy that lifestyle, go nuts.  If you only enjoy certain bits of it, remember that many things are available a la carte if you really want them.  For example, if you want to work on cutting-edge technology but also want to see your kids at 5:30 PM, you can work on cutting-edge technology at many, many, many megacorps.
Your most important professional skill is communication: Remember how engineers are hired not to create programs but to create business value? The dominant quality which gets you jobs is the ability to give people the perception that you will create value.  This is not necessarily coextensive with ability to create value.
Some of the best programmers I know are pathologically incapable of carrying on a conversation.  People disproportionately a) wouldn’t want to work with them or b) will underestimate their value-creation ability because they gain insight into that ability through conversation and the person just doesn’t implement that protocol.
Conversely, people routinely assume that I am among the best programmers they know entirely because a) there exists observable evidence that I can program and b) I write and speak really, really well.
Communication is a skill. Practice it: you will get better. One key sub-skill is being able to quickly, concisely, and confidently explain how you create value to someone who is not an expert in your field and who does not have a priori reasons to love you.
All business decisions are ultimately made by one or a handful of multi-cellular organisms closely related to chimpanzees, not by rules or by algorithms: People are people.  Social grooming is a really important skill.  People will often back suggestions by friends because they are friends, even when other suggestions might actually be better.  People will often be favorably disposed to people they have broken bread with.
Actual grooming is at least moderately important, too, because people are hilariously easy to hack by expedients such as dressing appropriately for the situation, maintaining a professional appearance, speaking in a confident tone of voice, etc.  Your business suit will probably cost about as much as a computer monitor.  You only need it once in a blue moon, but when you need it you’ll be really, really, really glad that you have it.
At the end of the day, your life happiness will not be dominated by your career. Either talk to older people or trust the social scientists who have: family, faith, hobbies, etc etc generally swamp career achievements and money in terms of things which actually produce happiness.  Optimize appropriately.  Your career is important, and right now it might seem like the most important thing in your life, but odds are that is not what you’ll believe forever.  Work to live, don’t live to work.
