ramrachum
Ram's blog
ramrachum · 5 years ago
GridRoyale - A life simulation for exploring social dynamics
Another day, another project :)
This is a project that I wanted to do for years. I finally had the opportunity to do it. Check out the GridRoyale readme on GitHub for more details and a live demo.
GridRoyale is a life simulation. It's a tool for machine learning researchers to explore social dynamics.
It's similar to Game of Life or GridWorld, except I added game mechanics to encourage the players to behave socially. These game mechanics are similar to those in the battle royale genre of computer games, which is why it's called GridRoyale.
The game mechanics, Python framework and visualization are pretty good, but the core algorithm sucks, and I'm waiting for someone better than me to come and write a new one. If that's you, please open a pull request.
ramrachum · 5 years ago
Live-coding a music synthesizer
After so much work and waiting, the video of my EuroPython talk is finally released!
This is a fun live-coding session using NumPy and SoundDevice. The goal of this talk is to make the computer produce realistic-sounding instrument sounds, using nothing but math.
The talk starts with creating a simple sound using a sine wave. We gradually make it sound more like a real instrument, learning a little bit about music theory on the way. We add features one-by-one until by the end of the talk, we hear our synthesizer play a piece of classical music.
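To give a feel for that first step, here is a minimal sketch of generating a sine wave. It is not code from the talk: it uses only the standard library's math module instead of NumPy and SoundDevice, and the function name, sample rate and amplitude are assumptions made for illustration.

```python
import math

SAMPLE_RATE = 44100  # samples per second; a common audio rate, assumed here


def sine_wave(frequency, duration, sample_rate=SAMPLE_RATE, amplitude=0.5):
    """Return a list of float samples for a pure tone.

    frequency is in Hz, duration in seconds. Each sample is
    amplitude * sin(2 * pi * frequency * t), with t = i / sample_rate.
    """
    n_samples = int(sample_rate * duration)
    return [
        amplitude * math.sin(2 * math.pi * frequency * i / sample_rate)
        for i in range(n_samples)
    ]


# Half a second of the note A4 (440 Hz).
samples = sine_wave(440, 0.5)
```

Playing the samples back would need an audio library, which is left out here to keep the sketch dependency-free.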
ramrachum · 5 years ago
Quick project: Guitar tuning peg turner
Here's a cute 3D print I did on a whim a couple of days ago.
I wanted to change the strings on my acoustic bass guitar, so I'd have a crisp sound for my upcoming talk, Live-coding a music synthesizer, at EuroPython 2020. Do watch it, or watch the YouTube video after it's released.
To change my strings, I need to unwind the old ones all the way through, and then wind the new ones manually, using these tuning pegs:
[Image]
Ain't nobody got time for that! 3D-printing to the rescue:
[Image]
Printing took only 25 minutes:
[Image]
After connecting it to my electric screwdriver, it worked perfectly:
[Video]
Good times :)
ramrachum · 5 years ago
Improving Python exception chaining with raise-from
This is going to be a story about an esoteric feature of Python 3, and how I spent the last few months reviving it and bringing it into the limelight.
Back in the yee-haw days of 2003, Raymond Hettinger wrote an email to the python-dev mailing list, sharing an idea for improving the way that Python handles exceptions that are caught and replaced with other exceptions. The goal was to avoid losing information about the first exception while reporting the second one. Showing the full information to the user would make debugging easier, and if you've followed my work before, you know there's nothing I love better than that.
That idea was polished and refined by many discussions on python-dev. A year later, Python core developer Ka-Ping Yee wrote up a comprehensive PEP that was then known as PEP 344, later to be renamed to PEP 3134. That idea was detailed there, with all the loose ends, potential problems and solutions. Guido accepted the PEP, and it was implemented for the infamous Python 3.0, to be used... By no one. For a long time.
If there's one thing I don't miss, it's waiting 10 years for the Python ecosystem to adopt Python 3. But finally, it happened. Almost all the packages on PyPI support Python 3 now, and getting a job writing Python 3 code is no longer a luxury. Only a few days ago, NumPy finally dropped Python 2 support. We live in good times.
When a modern Python developer catches an exception and raises a new one to replace it, they can enjoy seeing the complete information for both tracebacks. This is very helpful for debugging, and is a win for everybody.
Except... For one thing.
Two cases of exception chaining
There was one interesting detail of PEP 3134 that was forgotten: It has to do with the question, "What does it mean when one exception is replaced with another? Why would someone make that switcheroo?"
These can be roughly divided into two cases, and PEP 3134 provided a solution for each case.
The first case is this:
"An exception was raised, we were handling it, and something went wrong in the process of handling it."
The second case is this:
"An exception was raised, and we decided to replace it with a different exception that will make more sense to whoever called this code. Maybe the new exception will make more sense because we're giving a more helpful error message. Or maybe we're using an exception class that's more relevant to the problem domain, and whoever's calling our code could wrap the call with an except clause that's tailored for this failure mode."
That second case is quite a mouthful, isn't it? It didn't help that the first case was defined as the default. The second case ended up falling by the wayside. Most Python developers haven't learned how to tell Python that the second case is what's happening in their code, and to listen when Python is telling them that it's happening in code that they're currently debugging. This resulted in a Catch-22 situation, not that different from the one that slowed down Python 3 adoption in the first place.
Before I tell you what I did to break that Catch-22, I'll bring you into the fold and show you how to make that feature work in your project.
Exception causes, or `raise new from old`
I'm going to show you both sides of this feature: How to tell Python that you're catching an exception to replace it with a friendlier one, and how to understand when Python is telling you that this is what's happening in code that you're debugging.
For the first part, here's a good example from MyPy's codebase:
try:
    self.connection, _ = self.sock.accept()
except socket.timeout as e:
    raise IPCException('The socket timed out') from e
See the `from e` bit at the end? That's the bit that tells Python: The `IPCException` that we're raising is just a friendlier version of the `socket.timeout` that we just caught.
When we run that code and reach that exception, the traceback is going to look like this:
Traceback (most recent call last):
  File "foo.py", line 19, in <module>
    self.connection, _ = self.sock.accept()
  File "foo.py", line 7, in accept
    raise socket.timeout
socket.timeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "foo.py", line 21, in <module>
    raise IPCException('The socket timed out') from e
IPCException: The socket timed out
See that message in the middle, about the exception above being the direct cause of the exception below? That's the important bit. That's how you know you have a case of a friendly wrapping of an exception.
If you were dealing with the first case, i.e. an exception handler that has an error in it, the message between the two tracebacks would be:
During handling of the above exception, another exception occurred:
That's it. Now you can tell the two cases apart.
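If you'd like to see both cases side by side, here is a minimal sketch. The `ConfigError` class and all the function names are made up for illustration; the only real machinery is the `__cause__`, `__context__` and `__suppress_context__` attributes that Python sets on chained exceptions.

```python
class ConfigError(Exception):
    """Hypothetical domain-specific exception, used only for this example."""


def load_port_wrapped(settings):
    # Second case: deliberately replace the KeyError with a friendlier error.
    try:
        return int(settings['port'])
    except KeyError as e:
        raise ConfigError('No port configured') from e


def load_port_buggy_handler(settings):
    # First case: the handler itself fails, because the fallback key is
    # missing too. Python chains the two exceptions implicitly.
    try:
        return int(settings['port'])
    except KeyError:
        return int(settings['default_port'])


def describe_chain(exception):
    """Say which message Python would print between the two tracebacks."""
    if exception.__cause__ is not None:
        return 'direct cause'      # set by `raise ... from ...`
    if exception.__context__ is not None and not exception.__suppress_context__:
        return 'during handling'   # set implicitly inside an except block
    return 'no chain'


for function in (load_port_wrapped, load_port_buggy_handler):
    try:
        function({})
    except Exception as e:
        print(function.__name__, '->', describe_chain(e))
```

The wrapped version reports a direct cause, while the buggy handler reports an exception raised during handling.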
What I did to push this feature
I found that almost no one knows about this feature, which is sad, because I think it's a useful piece of information when debugging. I decided I'll do my part to push the Python community to use this syntax.
I wrote a little script that uses Python's `ast` module to analyze a codebase and find all instances where this syntax isn't used and should be. The heuristic was simple: If you're doing a `raise` inside an `except`, then in 99.9% of cases you're wrapping an exception.
I took the output from that script and used it to open PRs to a slew of open-source Python packages. Some of the projects I fixed are: Setuptools, SciPy, Matplotlib, Pandas, PyTest, IPython, MyPy, Pygments and Sphinx. Check out my GitHub history for the full list.
I then added a rule to PyLint, now known as W0707: `raise-missing-from`. After the PyLint team makes the next release, and the thousands of projects around the world that use PyLint upgrade to that release, they will all get a warning when they fail to use `raise from` in places they should.
Hopefully, in a few years' time, this feature of Python will become more ingrained in the Python community.
What you can do to help
Do you maintain a Python project that already dropped Python 2 support? Install the latest version of PyLint from GitHub. You can do this in a virtualenv if you'd like to keep your system Python clean. Run this to install:
pip install git+https://github.com/PyCQA/pylint
Then, run this line on your repo:
pylint your_project_path | grep W0707
You'll get a list of lines showing where you should add `raise from` in your code. If you're not getting any output, your code is good!
ramrachum · 5 years ago
Symlinks and hardlinks, move over, make room for reflinks!
If you've been around Linux for a while, you know about symlinks and hardlinks. You've used them and you know the differences between how each of them behaves. Besides being a useful filesystem tool, they're also a favorite interview question, used to gauge a candidate's familiarity with filesystems.
What you might not know is that there's also a thing called reflink. Right now it's supported only on a handful of filesystems, such as Apple's APFS, used on all Apple devices, XFS which is used on lots of Linux file-sharing servers, Btrfs, OCFS2, and Microsoft's ReFS.
If a symlink is a shortcut to another file, and a hardlink is a first-class pointer to the same inode as another file, what's a reflink, and when is it useful?
A reflink is a tool for doing copy-on-write on the filesystem.
If you've heard the term copy-on-write before, I'm willing to bet that it was in the context of the Linux fork call. Let's talk a bit about that.
Copy-on-write when forking a process
When you fork a process in Linux, the new process has a new copy of the old process's memory. This is essential, because if the new process shared the old process's memory, either process could crash if the other process was making an unexpected change to their shared memory. Therefore, Linux needs to make a copy.
However, Linux is smart, and it knows better than to just make a naive copy. Making a naive copy could be a waste of memory, especially if your process has several gigabytes of memory allocated, and you're forking lots of processes for small tasks. If Linux were to make naive copies, you could find yourself with an out-of-memory crash very quickly.
When you fork a process, Linux uses copy-on-write to create the new process's memory. This means that it holds off on making actual copies of the existing memory pages until the last possible moment, which is the moment when the two processes start having different ideas about what the content of these memory pages should be. In other words, as soon as one of these processes starts writing to a memory page, Linux makes a copy of that page, assigning the original page to the original process, and the new copy to the newly-forked process.
This is a huge boon, because most of the time, the new process will either only be reading the memory, or not even that. So many copy actions are avoided thanks to this technique. The beauty part is that these shenanigans are completely transparent to the process, and to the developer who's writing the logic that this process performs. The new process behaves as if it has its own copy of the parent's memory pages, and the floor is being paved ahead of it as it walks forward, so to speak. It'll never even know that copy-on-write was performed.
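Copy-on-write itself can't be observed from inside a process, which is the whole point, but the guarantee it preserves can. Here is a minimal sketch, assuming a Unix-like system where `os.fork` is available; the function name is made up for this example. The child modifies its view of the data, and the parent's view stays untouched.

```python
import os


def fork_and_modify():
    """Fork, let the child change the data, and return both views of it."""
    data = bytearray(b'original')
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: this write only affects the child's own memory.
        try:
            os.close(read_fd)
            data[:] = b'modified'
            os.write(write_fd, bytes(data))
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately, without running parent cleanup
    # Parent process: read what the child saw, then wait for it to exit.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view


parent_view, child_view = fork_and_modify()
print(parent_view, child_view)
```

The parent still sees its original bytes even though the child overwrote them in its own memory.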
Now we're ready to talk about reflinks.
Reflinks are copy-on-write for the filesystem
If you read the section above, you already know 90% of what you need to know to understand and use reflinks.
A reflink is a copy of a file, except that copy isn't really created on the hard-drive until the last possible moment. Like the forking version, this logic is invisible. You could do a reflink of a 10 gigabyte file, and the new "copy" would be created immediately, because the 10 gigabytes wouldn't really be duplicated. They'll only be duplicated once you start modifying one of the copies.
All the while, you could treat the reflink as if it was a completely legitimate copy of your original file.
How do you create reflinks?
On Linux, run the following:
$ cp --reflink old_file new_file
On Mac, there's a different flag for some reason:
$ cp -c old_file new_file
If you're creating reflinks programmatically, you could also use dedicated libraries such as this one for Python.
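If you'd rather not add a dependency, one option is to shell out to `cp` with the flags shown above. This is only a sketch under a couple of assumptions: GNU `cp` on Linux, the stock `cp` on macOS, and a filesystem that supports reflinks. The function names are hypothetical.

```python
import subprocess
import sys


def build_reflink_command(src, dst, platform=sys.platform):
    """Return the cp command that asks for a copy-on-write clone."""
    if platform == 'darwin':
        return ['cp', '-c', src, dst]
    if platform.startswith('linux'):
        # Same as plain --reflink: fail rather than fall back to a real copy.
        return ['cp', '--reflink=always', src, dst]
    raise NotImplementedError(f'No known reflink flag for {platform!r}')


def reflink_copy(src, dst):
    """Create a reflink of src at dst, raising if cp reports an error."""
    subprocess.run(build_reflink_command(src, dst), check=True)
```

Calling `reflink_copy('old_file', 'new_file')` then behaves like the shell commands above.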
When are reflinks useful?
Here's an example of where I've used reflinks for a client of mine years back. They had a tool for developers that takes their entire codebase and copies it into a Docker container to run tests on it. (Don't ask.)
That recursive copying took a while, and the developers couldn't change their code in the meantime, or check out any other branches, because then an inconsistent version of their code would be copied into the container. That was pretty annoying for me personally, because I was twiddling my thumbs whenever I started the test process.
I figured, why not use reflinks?
I wrote some Python code that creates reflinks to the code in a temporary folder, and then does a real copy from that temporary folder to the Docker container. The big advantage here is that as soon as the reflinks were created, I could modify the original code as much as I wanted, without affecting the tests.
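The original code isn't shown here, but the snapshot step could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names, the use of `shutil.copytree` with a custom copy function, and the macOS `cp -c` flag for cloning each file.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def clone_file(src, dst):
    """Clone one file with macOS's `cp -c` (assumes a filesystem like APFS)."""
    subprocess.run(['cp', '-c', src, dst], check=True)
    return dst


def snapshot_tree(source_dir, copy_file=clone_file):
    """Copy source_dir into a fresh temporary folder, file by file.

    With a reflink-based copy_file this is nearly instant, and later edits
    to source_dir don't affect the snapshot.
    """
    snapshot_dir = Path(tempfile.mkdtemp(prefix='snapshot-')) / 'code'
    shutil.copytree(source_dir, snapshot_dir, copy_function=copy_file)
    return snapshot_dir
```

The slow, real copy into the container would then read from the returned snapshot folder instead of the working tree. Passing `copy_file=shutil.copy2` gives an ordinary copy, which is handy for trying the function on a filesystem without reflink support.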
Fortunately, all the developers were using Macs in that company, so I knew I didn't have to worry about filesystem support.
How can reflinks go wrong?
You might be thinking, "What happens if I create a reflink of a huge file that's bigger than the amount of space I have available on the hard drive?"
I've never tried this, but here's what I heard: The reflink will be created, but you'll get an error as soon as one of the copies is changed and an actual copy needs to be created. This is something you should take into account if you're relying on reflinks in your business logic.
ramrachum · 5 years ago
Make your 3D prints stronger with titanium rods
I've been doing 3D printing for a few years, and there's a technique I've been using that I thought I should share with the world. This was also a good opportunity to make my first YouTube video. Enjoy :)
Affiliate links to equipment I used in the video:
• Pin vise • Heavy-duty, keyed pin vise • Jeweller's vise • Medium-sized bolt cutter • Big bolt cutter
Buy titanium rods on eBay.
ramrachum · 5 years ago
git-recent: Quickly check out your favorite branches
I released yet another open-source project!
https://github.com/cool-RR/git-recent/
ramrachum · 6 years ago
PySnooper: Never use print for debugging again
I just released a new open-source project!
https://github.com/cool-RR/PySnooper/
ramrachum · 10 years ago
Mike Driscoll interviewed me on his blog
I recently had the pleasure of being interviewed by Mike Driscoll on his blog, The Mouse vs. The Python.
Mike is well-known in the Python world and especially in the wxPython user group. He often posts tutorials for beginners on his blog, and it's happened several times that I googled a technical question and found the answer in one of his tutorials.
Head over to Mike's blog to read the interview.
ramrachum · 10 years ago
PythonTurtle makes it into Saudi Arabia's official state curriculum
I just heard some very exciting news.
Six years ago, when I was just starting out my development career, I made a little program called PythonTurtle. It’s a program that helps children learn how to program in the Python programming language, which is the programming language that I use in my day-to-day work as a web developer. I created PythonTurtle as a side project, because I saw there wasn’t a viable solution for children to learn how to program in Python. I figured there should be a solution, so I spent roughly two months of hard work building and releasing PythonTurtle.
[Image: screenshot from the program]
What’s special about PythonTurtle is that it lets children learn programming in an exciting way that puts emphasis on fun and creativity rather than technical details.
When using the program, an illustrated turtle is displayed on the screen, and the children can program it to move around the screen and draw lines. The more programming concepts the children learn, the more impressive drawings they can create with the program. This gives them motivation to learn and improve their skills without feeling that it’s being forced on them by their schoolteacher.
PythonTurtle is based on an educational program called LOGO that was developed in the eighties; what I made is in fact a modern version, so instead of teaching programming in a didactic language, it taught programming in the Python programming language, which is a real language used in the industry today. The idea is to bring children closer to the techniques used in the real world, and possibly plant the seeds of a career in software development.
Because I was just starting out as a software developer back then, I didn’t have the skills that I have today, and developing this software was hard for me. There were technical challenges (specifically, modifying the wxPython shell to be able to command an auxiliary process). These challenges were so hard that it looked like I wasn’t going to solve them, and at a few points I considered giving up on the project entirely. I was asking myself, why am I even doing this? No one even knew I was working on this program, and no one seemed to care.
But I told myself that I’m creating something big here, and it’s important that I see this through to the end. So I did, and I overcame the technical problems.
I released the program as open-source under the MIT license, which means that every person on Earth could download it and use it free of charge. I decided to release it that way rather than as commercial software because I figured more children could use it if it was free, and that seemed more important to me than making a few bucks. I also liked the idea of contributing back to the open-source community, because so much of the software that I use every day is built on open-source software that was made by volunteers, so I was happy to contribute my share of open-source software.
I released the software for download and I submitted a link to the website to tech forums such as HN and Reddit, and over the next few days, the story blew up, and thousands of people visited the website. I was very happy and proud that people liked my project so much.
Over the six years since I released the program, I’ve gotten many happy emails from teachers and parents who used the program to teach their children to program. It’s always heartwarming to get these emails. They come from all over the world: From the States, from the UK, from Africa, Australia, South America… I would occasionally also get emails from children themselves, and one time even from an 80-year-old man who said that he used my program to learn to program himself. I got more reports of adults enjoying using the program. Looking at the analytics for the website, I saw that PythonTurtle was downloaded almost 100,000 times, which made me very proud.
But last year, I noticed something odd. I was checking how the site was doing on Google Analytics and saw that I was getting a disproportionately large number of hits from Saudi Arabia.
Specifically, there was a big peak of Saudi visitors around January 2014, and then that peak appeared again in January 2015. I also got more feedback emails from people with Arab-sounding names. I investigated why, and found a Saudi forum where PythonTurtle was mentioned. The text was in Arabic, and I tried translating it to English using Google Translate, but the result was too hard to understand, so I let it go and didn’t investigate further.
That is, until a couple of days ago, when I got an email from a teacher from Saudi Arabia about PythonTurtle. He told me that PythonTurtle is being used in all high-schools in Saudi Arabia! The ministry of education of Saudi Arabia has put PythonTurtle into the official state curriculum! This means that it’s being used by more than 4,000 schools which teach more than 700,000 students!!!
I’m very excited to have made a program that has helped so many students, and especially the students in Saudi Arabia. I’m an Israeli, and there are no diplomatic relations between Israel and Saudi Arabia. I’m ignorant regarding the political affairs between the countries, but I’m happy to see that open-source software has no borders; if a developer in one country makes a program that can help people, it can be used everywhere and help people all around the world, regardless of the political situation.
ramrachum · 10 years ago
Christoph Gohlke's awesome collection of Windows binaries for Python packages
Today I needed to upgrade the psycopg2 package on a Django app of one of my clients. Without giving it a second thought, I fired up my browser and started typing goh in the omnibar. I quickly got Christoph Gohlke’s page, which is on my favorites:
http://www.lfd.uci.edu/~gohlke/pythonlibs/
What is this? It’s a page where you can find Windows binaries of many popular Python packages. Whenever you need to install a Python package that requires compilation, and that package’s maintainers haven’t made Windows binaries available on PyPI, you could usually find it on Christoph’s page, categorized by package, Python version and 32bit/64bit.
I’ve never met Christoph. Never even spoken to him online. But he’s saved me, and thousands of other developers, from doing countless hours of dreadful work compiling PyPI packages. His page is a godsend for anyone who does Python development on Windows.
Thank you Christoph!
ramrachum · 10 years ago
Startup lesson learned: Work vertically before you work horizontally
In this blog post I’m going to share with you an important lesson that I’ve learned about startups. I’m happy to have learned this lesson in the beginning of my career as an entrepreneur, about a decade ago, because I got to apply it in all the projects that came afterwards, both my personal projects and my client projects.
The year was 2006. My friend and I had just founded our first startup, Bintos. We were both 20 years old. We’d been good friends since high-school, and we’d always dreamed of starting our own startup and making it big. Now that we were both “adults”, it was only natural that we’d start a startup and that we’d do it together.
Bintos did something similar to Khan Academy, before the latter came to be famous. We wanted to produce high-quality video lectures for high-school students that would target the material they had to learn for the matriculation exams, and then, of course, put these video lectures online for free. Please remember that this was 2006, when YouTube was just a young startup waiting to get acquired by Google, while Google Video was still superior. Everyone knew that it was only a matter of time until video on the web made it big, but the quality and speed weren’t close to what they are today.
How did we get the idea to produce video lectures for high-school students? We were both fresh out of high-school, and we remembered how difficult it was to find material to use to study for our exams. (There are a lot of books in the library, but students want learning materials that are laser-focused on the exam, because otherwise they might “waste” time learning things that aren’t on the exam :)
We remembered that when we were in high-school, we were so desperate for relevant learning materials, that we were stoked to find a text-only website that was badly-formatted and almost impossible to read. The fact that such a shitty website was our best option was a sign for us that a change was necessary; so why not bring it ourselves.
Now, I had just dropped out of the Technion, a leading technical university in Israel. The Technion had a very impressive and successful project of video lectures for its students. Students watched these video lectures obsessively; sometimes to cover for missed lessons, but usually to prepare for upcoming exams. I would personally watch them from my computer in my dorm room, but most people would watch them in the university’s computer farms. They even had a vending machine giving out VHS tapes of these lectures! It was incredible.
When we started Bintos we basically wanted to take the success that video lectures had in the Technion and apply it to the world of high-school students.
Now, we were young, inexperienced and poor. Starting a startup together was hard. Neither of us had even had a job before. I wasn’t a serious programmer back then. We had little experience in being adults and talking to people seriously. An email that I would write today in 3 minutes, we would labor on for maybe 30 minutes, to make sure it had just the right mix of politeness, assertiveness, and knowing-what-the-hell-we’re-talking-about-ness. My point is that getting anything done was an inefficient struggle. We were the very definition of a scrappy startup. But still, we worked hard and felt that this was our calling in life.
Our main goal was to produce a sample video course that we could put online so we could get feedback from students, and hopefully investor attention. This meant we needed (a) a teacher to give the lectures, who should be as good a teacher as we could get, (b) video equipment and the knowledge of how to use it, and (c) to edit the videos and put them online on a website.
After months of hard work, we got everything we needed. We got an amazing math teacher, who worked in a top private school, to volunteer to give the lectures. (I’m still amazed that my cofounder convinced him to spend a few days off from work with us for no pay.) We got the private school mentioned above to allow us to use an empty classroom to film the lectures. We hired a professional cameraman, who had a high-end camcorder that produced great video; he was also in charge of our sound recording. And we pooled our money to buy another, smaller camcorder, for the wide shot, so that we could combine footage from both cameras when we edit the video.
We scheduled a few days of filming, and everything worked out great. This was the culmination of all our hard work, and we were very excited. We both sat in the back while the teacher gave the lectures and the cameraman followed him with the camera, and made sure everything went according to plan. The teacher did a great job and gave a great lecture.
After we finished all the filming, I sat down to do the arduous work of editing all that footage down to consumable video lectures. (I picked up Adobe Premiere for this task.) We converted all the footage from MiniDV tapes to files, and I took a look at the footage. The footage from the auxiliary camera was good; the sound quality was shitty because it was using the on-camera microphone, but no worries, the main camera was the one connected to the collar microphone that the teacher was wearing, so we’ll use that.
I load up a video file from the main camera and play it. The video quality is great, audio is loud and clear with no background noises, everything looks perfect…
And then: Swoosh swoosh.
A loud, unpleasant noise. I was confused. A few more seconds of perfect audio went by, and then again: Swoosh swoosh swoosh swoosh.
That noise turned out to be the collar microphone, that wasn’t attached firmly enough, rubbing against the teacher’s sweater.
I looked at different points in the video file, and at other video files from the same camera. They all had those sounds all over the place. At most it would be gone for 15 seconds, and then it would come back again.
I showed this to my friend, and we were devastated. That noise was so loud and distracting, we couldn’t release a lecture that had it. We considered what we could do. We tried to digitally remove it, but I doubt even professionals could do that, certainly not us. We considered using audio from the second camera, but the quality was bad, the teacher’s voice was muffled and there was lots of echo.
We ended up having to refilm the whole thing. We were very lucky that the teacher was patient with us and volunteered a couple more days to help us. The cameraman took responsibility (after some arguing…), since he was supposed to be in charge of the sound, and he gave us those extra days for free. So we got to produce the lectures at the end and do it right: We made 100% sure there weren’t any noises when we filmed, and the lectures came out great after I edited them.
But after the whole thing was done, I stopped to think: What can I learn from this? How can I prevent something like this from happening again in the future? What’s the mistake I made?
My mistake was working horizontally before working vertically.
What do I mean by that?
There were several “layers” of work. The first layer was filming the lectures. The second layer was editing them. The third layer was uploading them and making sure they looked good on the site.
Every layer was a lot of work, and the most satisfying way to do work like that is to dive into the first layer, focus exclusively on it, finish it, and then move on to the next. I think that we like this method best because you get a feeling of accomplishment when you’re done with a certain type of task. Also, it’s probably more efficient because you get to concentrate better on each task. For example, if you were cleaning your house, you wouldn’t wash half the dishes in the sink, then clean half the floor, then scrub half the toilet, and then go back to the dishes again. You’d finish every layer of work before moving on to the next.
But this method (which I call working horizontally) is good only when you’re very familiar with the work and have high confidence that nothing will go wrong. (Like housework.) When you’re doing something you’ve never done before, the right thing is to first work vertically, taking a sample amount of work and driving it through all the layers. It’s less efficient, but it’s better because you learn what every layer really looks like, and you get to make mistakes earlier rather than later, so you can apply the lessons you’ve learned when you do the bulk of the work.
In the video lectures example, the right thing to do would be to film a single lecture, edit it and upload it. If that came out alright, then we should have gone forward with the course. In fact, I could have probably done the first lecture while they were still filming the other ones, so it wouldn’t have even wasted any time.
Ever since I’ve learned this lesson, I’ve been applying it to every project that I do.
When a client comes to me with a job and I look at all the work we have, I always insist on taking one sample unit of work and driving it through all the layers, just to catch any mistakes early. After that’s done, then it’s time to do the bulk of the work.
ramrachum · 11 years ago
Code comments that I find helpful
I’m a huge believer in code quality. When I write code, I put in a lot of effort to make it as easy to read and understand as possible. Code is read much more often than it’s written, so by writing your code in a way that’s easy to understand you’re saving lots of time for the developers who are going to read it in the future (one of them being future you), and you’re making it easy for them to build on your code and extend it.
As we all know, a big part of writing good code is comments that explain crucial points about the code. This is what this blog post is about.
Making good code comments is not a trivial thing. We’ve all seen comments like this:
x += 1 # Increment x by 1
Comments like that are not only unhelpful but outright harmful, because they add noise to the code; they grab our attention but then don’t give us any useful information. Attention is a scarce resource that shouldn’t be wasted.
Because adding comments to code adds noise, we need to make sure that our comments deliver the maximum amount of useful information in the minimum amount of noise. In this blog post I’ve listed a few kinds of comments that I put in my code to make it as clear as possible with as little noise as possible.
Comment braces
Here is a style of comment I picked up a few years ago by reading someone else’s code, and which has proved really helpful ever since:
def calculate_length_of_recurrent_perm_space(k, fbb):
    # ...
    ### Doing phase one, getting all sub-FBBs: ################################
    #                                                                         #
    levels = []
    current_fbbs = {fbb}
    while len(levels) < k and current_fbbs:
        k_ = k - len(levels)
        levels.append(
            {fbb_: fbb_.get_sub_fbbs_for_one_key_removed()
             for fbb_ in current_fbbs if (k_, fbb_) not in cache}
        )
        current_fbbs = set(itertools.chain(*levels[-1].values()))
    #                                                                         #
    ### Finished doing phase one, getting all sub-FBBs. #######################

    ### Doing phase two, solving FBBs from trivial to complex: ################
    #                                                                         #
    for k_, level in enumerate(reversed(levels), (k - len(levels) + 1)):
        if k_ == 1:
            for fbb_, sub_fbb_bag in level.items():
                cache[(k_, fbb_)] = fbb_.n_elements
        else:
            for fbb_, sub_fbb_bag in level.items():
                cache[(k_, fbb_)] = sum(
                    (cache[(k_ - 1, sub_fbb)] * factor
                     for sub_fbb, factor in sub_fbb_bag.items())
                )
    #                                                                         #
    ### Finished doing phase two, solving FBBs from trivial to complex. #######
I call these “comment braces” because they look like huge vertical braces that have code in them. Even though these comments are quite bulky, I still really love them because they divide the code into different segments, which is very helpful when you have a piece of code that can be logically separated into different segments. (Of course, if you can refactor these different segments into different functions, that would be ideal, but in many cases it’s not practical.)
This style of commenting makes it easier to read the code casually, because when you’re reading a line in the code you only need to think how it relates to the lines in its section, and not how it relates to the lines in the different sections.
Until challenged
“Until challenged” is a short two-word comment that communicates a common programming idiom:
def __lt__(self, other):
    found_strict_difference = False  # Until challenged.
    all_elements = set(other) | set(self)
    for element in all_elements:
        if self[element] > other[element]:
            return False
        elif self[element] < other[element]:
            found_strict_difference = True
    return found_strict_difference
We’ve all written algorithms where we set a variable to a boolean, and then later we might flip it and we might not depending on some condition, and eventually we’re going to check the value of that variable and possibly return it. Writing a comment “Until challenged” after a variable assignment communicates that we’re using this idiom.
It might be unclear to people who aren’t familiar with it, so it’s a compromise between brevity and understandability. If you want to make it more universally understandable, you can add a few words like “Set to False until we possibly discover a difference and set it to True”.
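Here’s a minimal, self-contained sketch of the same idiom that you can run as-is. It’s an illustration rather than the code the excerpt above was taken from: the function name `is_strictly_smaller` is made up, and `collections.Counter` stands in for the original class because it returns 0 for missing elements.

```python
from collections import Counter


def is_strictly_smaller(small, big):
    """Check that `small` never exceeds `big` and is smaller somewhere."""
    found_strict_difference = False  # Until challenged.
    for element in set(small) | set(big):
        if small[element] > big[element]:
            # `small` has more of this element, so it can't be smaller.
            return False
        elif small[element] < big[element]:
            # The initial assumption has been challenged.
            found_strict_difference = True
    return found_strict_difference


print(is_strictly_smaller(Counter('ab'), Counter('aab')))   # True
print(is_strictly_smaller(Counter('ab'), Counter('ab')))    # False
print(is_strictly_smaller(Counter('abc'), Counter('aab')))  # False
```

The flag starts out as False and only a strict difference can flip it, which is exactly what the two-word comment is telling the reader.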
Establishing current state
Also known as “the manual assert.” Sometimes it’s useful to put a comment somewhere in the middle of a function that describes what we’ve accomplished by this point, what the current state is, and what we’re going to do next. Example:
# ...
if self.is_degreed and (perm.degree not in self.degrees):
    raise ValueError

# At this point we know the permutation contains the correct items, and
# has the correct degree. Now, to calculate its index number.
if perm.is_dapplied:
    return self.undapplied.index(perm.undapplied)
# ...
I love this kind of comment. What all good comments have in common is that they’re saying what your internal monologue would say if you tried to read the code and understand it without comments, and this comment is no different.
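For a version you can run, here’s a small hypothetical example (a `median` function invented for this illustration, not taken from any of the projects mentioned here) where a “manual assert” comment marks the point at which validation ends and the real work begins:

```python
def median(numbers):
    """Return the median of a non-empty iterable of numbers."""
    ordered = sorted(numbers)
    if not ordered:
        raise ValueError('median() needs at least one number')

    # At this point we know `ordered` is non-empty and sorted. Now, to
    # pick the middle item, or average the two middle items if the
    # length is even.
    middle = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([3, 1, 2]))     # 2
print(median([4, 1, 3, 2]))  # 2.5
```

A real `assert` statement would be checked at runtime; the comment isn’t checked at all, it just saves the reader from reconstructing the state in their head.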
Clarifying else keyword
One thing that can be confusing when reading Python code is reaching the else part of a long if-else clause and not being sure which condition it belongs to. This is where I like to add a comment reiterating the condition:
if actual_item_test is None:
    if isinstance(single_or_sequence, collections.abc.Sequence):
        return tuple(single_or_sequence)
    elif single_or_sequence is None:
        return tuple()
    else:
        return (single_or_sequence,)
else: # actual_item_test is not None
    if actual_item_test(single_or_sequence):
        return (single_or_sequence,)
    elif single_or_sequence is None:
        return ()
    else:
        return tuple(single_or_sequence)
Note the comment after the middle else. It tells you which condition should be true for this else clause to be executed, so you don’t have to trace it back to the original if line.
That’s all I’ve got for now. I’ll be happy to hear your code commenting tips!
ramrachum · 11 years ago
My new open-source project: Combi, the Pythonic package for combinatorics
I’m proud to announce the first release of my new open-source project: Combi!
Combi is a combinatorics package for Python.
Combi on GitHub.
Combi on PyPI.
Combi documentation.
Combi is awesome. It’s like a marshmallow that was slowly and carefully roasted at just the right temperature to make it melt inside, but not too hot as to burn it; except instead of being a marshmallow, it’s a Python package.
Installation:
$ pip install combi
What is Combi good for? Combi lets you explore spaces of permutations and combinations as if they were Python sequences, but without generating all the permutations/combinations in advance. It also lets you specify a lot of special conditions on these spaces. This is helpful both for scientific computing, and for general-purpose programming, as combinations and permutations are concepts that come up when solving many different kinds of programming problems.
(I developed Combi while doing research for a bigger project of mine that’s going to remain a secret for a while. I call it Project SK. If you want to get updates on it when it becomes public, sign up here.)
Let’s look at the simplest example of using Combi. Check out this $5 padlock in the picture. I use this padlock for my gym locker, so people won’t steal my stuff when I’m swimming in the pool. It has 8 buttons, and to open it you have to press down a secret combination of 4 buttons. I wonder though, how easy is it to crack?
>>> from combi import *
>>> padlock_space = CombSpace(range(1, 9), 4)
>>> padlock_space
<CombSpace: range(1, 9), n_elements=4>
padlock_space is the space of all possible combinations for our padlock. At this point, the combinations haven’t actually been generated; if we ask for a combination from the space, it’s generated on demand:
>>> padlock_space[7]
<Comb, n_elements=4: (1, 2, 4, 7)>
As you can see, padlock_space behaves like a sequence. We can get a combination by index number. We can also do other sequence-y things, like getting the index number of a combination, or slicing it, or getting the length using len. This is a huge benefit because then we can explore these spaces in a declarative rather than imperative style of programming. (i.e. we don’t have to think about generating the permutations, we simply assume that the permutation space exists and we’re taking items from it at leisure.) Let’s try looking at the length of padlock_space:
>>> len(padlock_space)
70
Only 70 combinations. That’s pretty lame… At 3 seconds to try a combination, this means this padlock is crackable in under 4 minutes. Not very secure.
In the example above, I used CombSpace, which is a space of combinations. It’s a thin subclass of PermSpace, which is a space of permutations. A combination is like a permutation, except order doesn’t matter.
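The padlock numbers can be double-checked with nothing but the standard library. This sketch doesn’t use Combi at all; `math.comb` and `math.perm` (Python 3.8+) count combinations and k-permutations directly, and the 3 seconds per attempt is the same assumption made above.

```python
import itertools
import math

buttons = range(1, 9)  # The 8 buttons on the padlock.

# Order doesn't matter for the padlock, so we count combinations.
n_combinations = math.comb(8, 4)
# If order did matter, we would be counting 4-permutations instead.
n_permutations = math.perm(8, 4)

# Cross-check the formula by brute-force enumeration.
assert n_combinations == sum(1 for _ in itertools.combinations(buttons, 4))

seconds_per_try = 3
worst_case_minutes = n_combinations * seconds_per_try / 60

print(n_combinations)      # 70
print(n_permutations)      # 1680
print(worst_case_minutes)  # 3.5
```

So ignoring order shrinks the space from 1680 to 70, which is why the padlock falls in three and a half minutes.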
Now, because the permutations/combinations are generated on-demand, I can do something like this:
>>> huge_perm_space = PermSpace(1000)
>>> huge_perm_space
<PermSpace: 0..999>
This is a perm space of all permutations of the numbers between 0 and 999. It is ginormous. The number of permutations is 1000!, which is around 4 * 10**2567 (a number that far exceeds the number of particles in the universe). I’m not even going to show its length in the shell session because the length number alone would fill this entire blog post. And yet you can fetch any permutation from this space by index number in a fraction of a second:
>>> huge_perm_space[7]
<Perm: (0, 1, 2, 3, 4, ... 997, 996, 999, 998)>
Note that the permutation huge_perm_space[7] is itself a sequence, where every item is a number in range(1000).
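How can a permutation be fetched by index without generating the ones before it? One standard technique is “unranking” with the factorial number system. The sketch below is an illustration of that technique only; it is not taken from Combi’s source, `nth_permutation` is a made-up name, and it assumes lexicographic ordering.

```python
import math


def nth_permutation(n, index):
    """Return the permutation of range(n) at `index` in lexicographic order."""
    if not 0 <= index < math.factorial(n):
        raise IndexError(index)
    remaining = list(range(n))
    result = []
    for position in range(n):
        # Each choice for this position heads a block of (n-1-position)!
        # consecutive permutations, so integer division picks the block.
        block_size = math.factorial(n - 1 - position)
        choice, index = divmod(index, block_size)
        result.append(remaining.pop(choice))
    return tuple(result)


print(nth_permutation(4, 7))          # (1, 0, 3, 2)
print(nth_permutation(1000, 7)[-4:])  # (997, 996, 999, 998)
print(len(str(math.factorial(1000)))) # 2568
```

The tail of the 1000-item result matches the shell session above, and the last line shows that the length of the space is a 2568-digit number.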
Combi lets you specify a myriad of options on the spaces that you create. For example, you can make some elements be fixed:
>>> fixed_perm_space = PermSpace(4, fixed_map={3: 3,})
>>> fixed_perm_space
<PermSpace: 0..3, fixed_map={3: 3}>
>>> tuple(fixed_perm_space)
(<Perm: (0, 1, 2, 3)>, <Perm: (0, 2, 1, 3)>, <Perm: (1, 0, 2, 3)>,
 <Perm: (1, 2, 0, 3)>, <Perm: (2, 0, 1, 3)>, <Perm: (2, 1, 0, 3)>)
This limits the space and makes it smaller. This is useful when you’re making explorations on a huge PermSpace and want to inspect only a smaller subset of it that would be easier to handle.
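To make the meaning of fixed_map concrete, here’s a brute-force sketch using only `itertools`. It assumes that fixed_map maps a position to the item that must sit there, which is what the output above suggests. The helper name `permutations_with_fixed` is hypothetical, and unlike Combi this version generates every permutation and filters, so it’s only sensible for tiny inputs.

```python
import itertools


def permutations_with_fixed(n, fixed_map):
    """Yield permutations of range(n) where position i holds fixed_map[i]."""
    for perm in itertools.permutations(range(n)):
        if all(perm[position] == item
               for position, item in fixed_map.items()):
            yield perm


fixed = list(permutations_with_fixed(4, {3: 3}))
print(len(fixed))           # 6
print(fixed[0], fixed[-1])  # (0, 1, 2, 3) (2, 1, 0, 3)
```

Fixing one of the four positions leaves 3! = 6 permutations, the same six shown in the shell session.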
There are many more variations that you could have on a PermSpace or a CombSpace. You can specify a custom domain and a custom range to a space. You can constrain it to permutations of a certain degree (e.g. degrees=1 to limit to transpositions only.) You can do k-permutations by specifying the length of the desired permutations as n_elements. You can have the permutation objects be of a custom subclass that you define, so you could provide extra methods on them that fit your use case. You can provide sequences that have some items appear multiple times and Combi would be smart about it and consider multiple occurrences of the same item to be interchangeable. You can also toggle that behavior so it would treat them as unique. It’s very customizable :)
Combi has a bunch more useful features that are beyond the scope of this blog post:
MapSpace is like Python’s builtin map, except it’s a sequence that allows index access.
ProductSpace is like Python’s itertools.product, except it’s a sequence that allows index access.
ChainSpace is like Python’s itertools.chain, except it’s a sequence that allows index access.
SelectionSpace is a space of all selections from a sequence, of all possible lengths.
The Bag class is like Python’s collections.Counter, except it offers far more functionality, like more arithmetic operations between bags, comparison between bags, and more. (It can do that because unlike Python’s collections.Counter, it only allows natural numbers as values.)
Classes FrozenBag, OrderedBag and FrozenOrderedBag are provided, which are variations on Bag.
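The recurring phrase in the list above, “a sequence that allows index access,” can be sketched for the product case with some mixed-radix arithmetic. The `LazyProduct` class below is a hypothetical stand-in written for this illustration; it is not Combi’s ProductSpace and makes no claim about how Combi implements it. For simplicity it only accepts non-negative indexes.

```python
import math


class LazyProduct:
    """Index-accessible stand-in for itertools.product (illustration only)."""

    def __init__(self, *sequences):
        self.sequences = [tuple(sequence) for sequence in sequences]

    def __len__(self):
        return math.prod(len(sequence) for sequence in self.sequences)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        items = []
        # Treat the index as a mixed-radix number and peel off one "digit"
        # per sequence, starting from the last (fastest-changing) one.
        for sequence in reversed(self.sequences):
            index, position = divmod(index, len(sequence))
            items.append(sequence[position])
        return tuple(reversed(items))


space = LazyProduct('ab', range(3))
print(len(space))  # 6
print(space[4])    # ('b', 1)
```

Nothing is generated up front; each item is computed from its index when asked for, which is the same trade-off described earlier for PermSpace and CombSpace.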
I hope that the Combi package will be useful for you!
ramrachum · 11 years ago
Another silly Python riddle
Do you think of yourself as an experienced Python developer? Do you think you know Python’s quirks inside and out? Here’s a silly riddle to test your skills.
Observe the following Python code:
def f(x):
    return x == not x

f(None)
The question is: What will the call f(None) return?
Think carefully and try to come up with an answer without running the code. Then check yourself :)
ramrachum · 11 years ago
Observational comedy and tickling
I had a nice thought today.
I was thinking about comedy; about what makes people laugh. It’s something I think about a lot, especially the way people laugh the most when you’re being your genuine self and not apologizing for it.
I was thinking about what is called observational comedy. It’s the kind of comedy Seinfeld became famous for; it’s candid observations about everyday life that we all experience but never talk about. If you’ve ever seen one of Jerry Seinfeld’s standups on his shows, you know what I’m talking about. Here are a few random observations of his:
“You know you’re getting old when you get that one candle on the cake. It’s like, ‘See if you can blow this out.’”
“I am so busy doing nothing… that the idea of doing anything - which as you know, always leads to something - cuts into the nothing and then forces me to have to drop everything.”
“Men don’t care what’s on TV. They only care what else is on TV.”
You see this kind of joke a lot in internet memes. The beloved Louis C.K. also does a lot of gags in this style.
These are things we experience in our lives, but don’t think about; so when a talented comedian comes along and voices these thoughts that we’ve been harboring for years, possibly with a nice word play or a joke on top, we can’t help but laugh. Those previously-unvoiced thoughts are like regions of our mind that were always there, but that we never think about. When someone finally sheds light on them, we laugh in delight, as we are reassured that they exist, and that other people have them too.
Then it hit me: Observational comedy is very similar to tickling! If someone is tickling your foot with his finger, he’s touching a part of your body that you use all the time, but that never gets any kind of sensation other than the idiotic, monotonous pounding that is walking. Suddenly, the foot that became desensitized to touch experiences a small finger touching it gently; and we laugh. Observational humor is the equivalent of tickling, except for our minds. Thoughts that we took for granted before, that we never thought about, suddenly get probed by the comedian. So we laugh.
ramrachum · 11 years ago
Fun with etymology
In Hebrew, we have an expression "radiophonic voice". If someone has a radiophonic voice, it means they have a pleasant voice and good diction, much like a radio announcer would. 
This expression makes sense to me, and I was kinda surprised when I learned that it didn’t exist in English.
But then I thought about the origin of the word radio/radius in "radiophonic voice", and realized it made quite a funny journey.
It used to mean “the spoke of a wheel”; then 
Because the spoke of a wheel goes from the center to the edge of the circle, it also meant what we call radius, which is the abstract geometric idea of a line from the center of a circle to its edge. (Note: I may be wrong about the order of these first two meanings.) Then
Because electromagnetic radiation comes from a center point outwards in a sphere, it was given the name “radiation”. It’s interesting to note that the international sign for radiation is quite similar to the aforementioned wheel.
Because of the name “radiation”, the device used for listening to music using electromagnetic radiation was called a radio; and
“Radiophonic voice” came to mean (in Hebrew) a voice that’s pleasant enough that you’d want to listen to it on the radio.
Quite an interesting journey through the ages from “spoke of a wheel” to “pleasant voice” :)