Don't wanna be here? Send us removal request.
Text
Python Subprocess Module
Popen
Popen is a class handling the underlying processes creation and management in this module. Its constructor takes a bunch of arguments. Here I just list several important arguments.
The first argument is the command to be executed and its arguments. It can be either a string of a valid command line or a sequence of command and arguments. E.g.
import subprocess p1 = subprocess.Popen("ls -l", ...) p2 = subprocess.Popen(["ls", "-l"], ...)
shell argument indicates whether to use the shell of the underlying OS to execute the command. On Unix, shell=True is equivalent to executing "sh -c <cmd>". If shell=True, the command should be a string rather than a sequence;otherwise, the constructor will take only the first element in the sequence. Passing shell=True can be a security hazard.
stdin, stdout and stderr specify the program's standard input, standard output and standard error file handles respectively. The value can be subprocess.PIPE, an existing file or file descriptor and None.
0 notes
Text
Python Runtime Services
atexit
The atexit module is used to register functions to be executed when the interpreter exits.
register(func, [,arg [, kwargs]])
copy
The copy module has a copy(x) function for shallow copy and a deepcopy(x) function for deep copy. The former is seldom used to copy data structures such as list, set and dict. Copying them using their own constructors will be more efficient.
copy(x): shallow copy
deepcopy(x [,visit]): deep copy
gc
Python garbage collection uses a three-level generational scheme in which objects survive the initial garbage-collection step are placed onto lists of objects that are checked less frequently.
collect([generation]): runs a full garbage collection, generation indicates the generation to do garbage collection
enable()/disable(): turn on/off garbage collection
garbage: a list of user-defined instances no longer in use but involved in a reference cycle. These instances have also defined a __del__() method, so they won't be garbage collected due to difficulty in object finalization
get_count(): return a tuple showing number of objects in each generation
get_objects(): return a list of objects tracked by garbage collector
get_referrers(obj1, obj2, ...): return a list of objects that directly refer to obj1, obj2 and so on
get_referents(obj1, obj2, ...): return a list of objects that the obj1, obj2 and so on refer to
get_threshold()/set_threshold(th1, [, th2, [th3]): get/set threshold for garbage collection, i.e. frequency of garbage collection
isenabled(): if garbage collection is enabled
get_debug()/set_debug(flags): get/set debug flags for the garbage collector. The debug flags are constants chained by bitwise OR.
inspect
The inspect module is used to gather information about live Python objects such as attributes, doc strings, source code, stack frames, and so on.
currentframe(): return the frame object corresponding to the caller's stack frame.
getargspec(func):
0 notes
Text
Python Test and Debug
doctest
The first line of string in a method is called documentation string, which may include usage and example of the method. However, code may be changed time to time while the documentation may not be kept aligned to it in time. In this case, users may be confused by inaccurate or even incorrect documentation. To avoid this, doctest should be used to test if a method works consistently with examples presented in its documentation string. The example of doctest is shown below.
#!/usr/bin/env python if __name__ == '__main__': import doctest print doctest.testmod() def half(x): """ Halves x. For example: >>> half(6.8) 3.4 >>> half(5.0) 2.5 >>> half(7.0) 3.5 """ return x/2
0 notes
Text
[LeetCode] Tree Traversal with Morris Traversal
Tree traversal is one of the basic algorithms every programmer should master. Tree traversal can be done in various ways. The most commonly seen algorithm is to use a stack(implicitly or explicitly) to traverse a tree recursively or iteratively. With this algorithm, both time and space complexity are O(n).
However, a question you might be frequently asked is "Could you do it better?". To traverse a tree, one thing for sure is that the time complexity can not be improved - you have to go over each node, otherwise it is not a complete traversal. The only possibility is that we may have a chance to improve the space complexity. So here I will introduce Morris Traversal, which minimizes space you need for tree traversal.
An important problem to be solved in tree traversal is how to store a node's ancestors. How it is done differs a good algorithm from many mediocre ones. Instead of storing all the ancestors of a particular node as we usually do, Morris algorithm only stores its parent. So that the space complexity decreases significantly as we only need several extra pointers rather than a stack of them.
Morris algorithm makes full use of right pointer of the rightmost leaf of each subtree. Usually these pointers point to null but Morris borrows them for storing parents. Let's see how it works in detail.
In Morris algorithm, there are two pointers - cur points to the current node and prev points to the parent of it. You always look at the left first. If the left is null, you go right. If it is pre-order or in-order traversal, the current node should be printed out now since there is nothing on the left. If the left is not null, you must go left first. But before you go, you must know where to return once you finish the traversal on this side, so you need to store that first. The way to store the parent is that, you let the prev pointer keep going right of the current node's left node till it reaches the leaf and then you let the leaf's right pointer point to the current node. The reason why this pointer is used is that the right leaf of a subtree is usually the last node being traversed(except post-order, that's why using Morris for post-order traversal is such a pain, we'll see later). We can simply let cur go its right after traversing a subtree as the right pointer of the last node will point to the parent of the subtree.
Let's look at examples. The pre-order and in-order traversal are basically similar, while post-order is quite different as it violates an important assumption that the rightmost leaf is the last node to be traversed. The code of pre-order and in-order traversal are shown below.
Pre-order(Java):
/** * Definition for binary tree * public class TreeNode { * int val; * TreeNode left; * TreeNode right; * TreeNode(int x) { val = x; } * } */ public class Solution { public List preorderTraversal(TreeNode root) { List res = new ArrayList(); if(root == null) return res; TreeNode cur = root; while(cur != null){ // when nothing is on the left if(cur.left == null){ res.add(cur.val); cur = cur.right; }else{ TreeNode prev = cur.left; // look for the rightmost leaf while(prev.right != null && prev.right != cur) prev = prev.right; // when finding the rightmost leaf if(prev.right == null){ res.add(cur.val); prev.right = cur; cur = cur.left; // when traversed to the rightmost leaf }else{ cur = cur.right; } } } return res; } }
In-order(Java):
/** * Definition for binary tree * public class TreeNode { * int val; * TreeNode left; * TreeNode right; * TreeNode(int x) { val = x; } * } */ public class Solution { public List inorderTraversal(TreeNode root) { List res = new ArrayList(); if(root == null) return res; TreeNode cur = root; while(cur != null){ if(cur.left == null){ res.add(cur.val); cur = cur.right; }else{ TreeNode prev = cur.left; while(prev.right != null && prev.right != cur) prev = prev.right; if(prev.right == null){ prev.right = cur; cur = cur.left; }else{ res.add(cur.val); cur = cur.right; } } } return res; } }
As I said, post-order traversal is painful with Morris algorithm , although it finally achieves constant space complexity. In the post-order traversal, we still use the rightmost leaf's right pointer for storing the parent. The difference is that we need to reverse the order of output on the right side of a subtree, as the right node should be traversed earlier than the middle one. The code is shown below.
Post-order(Java):
/** * Definition for binary tree * public class TreeNode { * int val; * TreeNode left; * TreeNode right; * TreeNode(int x) { val = x; } * } */ public class Solution { public List postorderTraversal(TreeNode root) { List res = new ArrayList(); if(root == null) return res; TreeNode dummy = new TreeNode(0); dummy.left = root; TreeNode cur = dummy; while(cur != null){ if(cur.left == null){ cur = cur.right; }else{ TreeNode prev = cur.left; while(prev.right != null && prev.right != cur) prev = prev.right; if(prev.right == null){ prev.right = cur; cur = cur.left; }else{// when traversed to the rightmost leaf // reverse the output order of the right side add(cur.left, prev, res); cur = cur.right; } } } return res; } public void reverse(TreeNode from, TreeNode to){ if(from == to) return; TreeNode x = from, y = from.right; while(true){ TreeNode tmp = y.right; y.right = x; x = y; y = tmp; if(x == to) break; } } public void add(TreeNode from, TreeNode to, List res){ reverse(from, to); TreeNode p = to; while(true){ res.add(p.val); if(p == from) break; p = p.right; } reverse(to, from); } }
0 notes
Text
[LeetCode] Next Permutation
I've been stuck in this problem for a long time. I'm always thinking I'll figure it out since the AC rate is relatively high. But finally I give up and look at the answer - even the answer is not that easy to understand!
Before going through the solution, I suppose you have already figured out "Permutation" problem in LeetCode. It is a typical DFS(depth-first search) problem and should be easy if you understand how DFS works. Solving this problem requires you to take a slow motion to rethink the solution. Here is how the solution runs. Assuming we have an original sequence "1 2 3 4", the permutations are generated in this order:
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
...
I won't list all the permutations as it is a long list (4! permutations). Recall how "Permutation" works - you may work it recursively, the algorithm picks the first number in the first recursion, then the second in the second recursion, and so on so forth. Each number can only be picked once - that means a number cannot be picked by a recursion if its upper recursion have selected that number. Each recursion will go over the sequence before it returns. Once it returns, the upper level will pick the next number in the sequence and trigger another recursion. This process repeats until the uppermost recursion have traversed over the sequence.
Now let's look at the listed permutations. We can think that each number is determined by a recursion and the recursion returns once it has traversed the sequence, i.e. picked the largest unpicked number. When a recursion returns, the upper recursion will move to the next number, i.e. the number on the left will be the next number in the original sequence, and another recursion will start from the smallest unpicked number, i.e. the subsequence on the right is likely to be in an increasing order.
So we must find where a recursion is almost returning. The fact is that the head of a decreasing subsequence indicates a returning recursion. For example, if we have a sequence "1 4 3 2", you should notice that the second recursion (the second digit) picks the largest unpicked number and so do the third and the fourth. So all recursions lower than the first recursion will return and the first recursion will move to the next unpicked number, which is "2", in the next permutation. Then the first recursion will spawn another recursion starting from the smallest unpicked number, which is "1" and this recursion will spawn another recursion that acts in this way, and so on so forth. So finally we get a next permutation, which is "2 1 3 4".
Now the solution is clear. First of all, we need to find the head of a decreasing subsequence - it can be easy if you traverse the sequence from the end and find the first number not in an increasing order. We can call it "returning point". Secondly, we need to find the number next to the number at the "returning point" in the original sequence. Since the subsequence on the right is sorted decreasingly, the next number can be found if you traverse the sequence from the end again and find the first number larger than the "returning point" number. Thirdly, we need to let the "returning point" number move to the next number, i.e. swap the "returning point" number and the next number. Finally, we must reverse the subsequence on the right because the next recursion should start from the smallest unpicked number and it is likely to result in an increasing subsequence.
The code is shown below.
Java:
public class Solution { public void nextPermutation(int[] num) { if(num.length < 2) return; for(int i = num.length - 2; i >= 0; i --){ // find "returning point" number if(num[i] < num[i + 1]){ for(int j = num.length - 1; j >= 0; j --){ // find the next number if(num[j] > num[i]){ // let the "returning point" move to the next number int tmp = num[i]; num[i] = num[j]; num[j] = tmp; // spawn a new recursion starting from the smallest unpicked // number reverse(num, i + 1, num.length - 1); return; } } } } // in case that the first recursion returns, the algorithm shouldn't break but // return the first permutation of the original sequence. reverse(num, 0, num.length - 1); } public void reverse(int[] num, int start, int end){ while(start < end){ int tmp = num[start]; num[start] = num[end]; num[end] = tmp; start ++; end --; } } }
0 notes
Text
[Leetcode] Candy & Trapping Rain Water & Best Time to Buy and Sell Stock III
The reason why I bring them together is that they are very similar problems - all of them provides an array of data and ask you to find some kind of maximum/minimum total from it. The solutions to the problems are also similar in essence. We can solve this type of problems with dynamic programming with a one-dimensional array. The array is used for storing the maximum/minimum value at the point of time the corresponding data is traversed.
Let's look at the "Candy" problem first. The solution is like that: we go over the data array one way and then the other so that we can have two views of the data. In the first traversal, we compare the rating of a child with the child on the left. If the child on the left has a higher rating, the child on the right gets one candy because every child will get at least one candy. If the left child has a lower rating, the right child will get (the number of candies the left child gets + 1) candies. The numbers are stored in the one-dimensional array. In this traversal, we only consider what is the minimum number of candies the right child can get but ignore whether the left child with higher rating gets more candies or not. So we need to traverse the data array reversely to take care of this case. In the second traversal, we care about how many candies a child on the left can get in minimum. In theory, it should result in another one-dimensional array to store the minimum number of candies each child on the left can have. However, we can save that by computing the result when traversing the data. The code is shown below.
Java:
public class Solution { public int candy(int[] ratings) { if(ratings.length == 0 || ratings.length == 1) return ratings.length; int[] candies = new int[ratings.length]; candies[0] = 1; for(int i = 1; i < ratings.length; i ++){ if(ratings[i] > ratings[i - 1]) candies[i] = candies[i - 1] + 1; else candies[i] = 1; } int res = candies[candies.length - 1]; for(int i = ratings.length - 2; i >= 0; i --){ int num = 1; if(ratings[i] > ratings[i + 1]) num = candies[i + 1] + 1; res += Math.max(num, candies[i]); candies[i] = num; } return res; } }
Then we look at the "Trapping Rain Water". Instead of keeping track of the minimum number of candies a child can get, we find the maximum "wall" height before the current "wall" is examined. The first traversal finds the maximum height on the left of each wall and the second finds the maximum on the right. So that when we examined a specific "wall", we can definitely know how much water can be held in between with knowledge about the maximum heights of walls on the both sides. The code is shown below.
Java:
public class Solution { public int trap(int[] A) { if(A.length < 3) return 0; int[] maxL = new int[A.length]; int max = A[0]; for(int i = 0; i < A.length; i ++){ maxL[i] = max; max = Math.max(max, A[i]); } max = A[A.length - 1]; int total = 0, trap = 0; for(int i = A.length - 2; i >= 0; i --){ trap = Math.min(maxL[i], max) - A[i]; if(trap > 0) total += trap; max = Math.max(max, A[i]); } return total; } }
There are many solutions to the third problem, but dynamic programming one can be one of the most elegant solutions. The logic is rather simple and it is easy to implement. Similarly, we use a one-dimensional array to record the maximum profit that can be made during a period of time. We can find the maximum profit that can be made from a one-time transaction in the first traversal. The second traversal looks at the possibility of earning more profits if another transaction is made. The maximum can be found by adding up the maximum profit of two transactions at a specific day. The code is shown below.
Java:
public class Solution { public int maxProfit(int[] prices) { if(prices.length < 1) return 0; int[] profits = new int[prices.length]; int maxProfit = 0, lowestPrice = prices[0]; for(int i = 1; i < profits.length; i ++){ int profit = prices[i] - lowestPrice; maxProfit = Math.max(maxProfit, profit); profits[i] = maxProfit; lowestPrice = Math.min(lowestPrice, prices[i]); } maxProfit = 0; int highestPrice = prices[prices.length - 1], res = 0; for(int i = prices.length - 1; i >= 0; i --){ maxProfit = Math.max(maxProfit, highestPrice - prices[i]); highestPrice = Math.max(highestPrice, prices[i]); res = Math.max(res, maxProfit + profits[i]); } return res; } }
0 notes
Text
[Leetcode] Word Ladder II
This problem is the next step of "Word Ladder I". Although the solution is simple in logic, the great amount of details makes it very tricky.
The basic idea is that we need to use breadth-first search(BFS) to find the minimum depth of the target word in the "word tree" first, as we did before in "Word Ladder I"; then we need to use depth-first search(DFS) to find the path from the start word(root) to the target word(leaf).
Implementing the algorithm is not very difficult. You may want to use a map to keep track of paths from the start word to the current word. In this case, you don't even need DFS and will have all the possible paths to the target word in the end. It is great, isn't it? What? It brings you a "Memory Limit Exceeded" error? OK, let's look into it.
By examining the algorithm, we will find it does waste a lot of memory - it is not necessary to store many words. For instance, if there are two paths, dog -> dot -> hot and dog -> dot -> not, we can simply find that dog and dot are duplicate in the map. To avoid that, we can use a map to keep track of the previous word variation(parent) instead of the entire path.
Now we have a list of word pairs, but how we get the full path? To find the full path, we need to turn around, looking for the start word from the end. So how can we do that? That's why we need DFS. We can start from the end word, find a list of its previous variations, add each of them to the path respectively and make it the end word. This process repeat till we find the start word and then we have a full path. Since the algorithm goes over every possibilities, we will have all the possible paths in the end.
Python:
class Solution: # @param start, a string # @param end, a string # @param dict, a set of string # @return a list of lists of string def findLadders(self, start, end, dict): chars = "abcdefghijklmnopqrstuvwxyz" res, path = [], [] # curlevel and nextlevel for level-order traversal # prev is a map storing the previous word variations of each word curlevel, nextlevel, prev = [], [], {} curlevel.append(start) path.append(end) while curlevel: # remove words in the previous level from the dict for w in curlevel: if w in dict: dict.remove(w) # generate the next level while curlevel: word = curlevel[0] del curlevel[0] # generate paths when reaching the target word if word == end: self.buildPath(start, end, prev, res, path) return res for i in range(len(word)): for c in chars: nw = word[:i] + c + word[i + 1:] if nw in dict: nextlevel.append(nw) if nw not in prev: prev[nw] = [] prev[nw].append(word) curlevel, nextlevel = nextlevel, curlevel return res # recursion for generating paths def buildPath(self, start, end, prev, res, path): if start == end: p = list(path) p.reverse() res.append(p) else: for w in prev[end]: path.append(w) self.buildPath(start, w, prev, res, path) del path[-1]
It seems that the code should work, but it turns out to be "Time Limit Exceeded" on Leetcode. So I ran the code on my own and found that the result was even incorrect - there were duplicates in the result. Where's the problem? Then I found a corner case I missed - when we have "rex" and "tax" in a same level, both of them can vary into "tex". So there will be two "tex"s in a same level and the program have to waste cycles to traverse them. We must avoid that!
The workaround is simple: use a set to check if a word has been traversed previously when traversing a level. The improved code is shown below.
Python:
class Solution: # @param start, a string # @param end, a string # @param dict, a set of string # @return a list of lists of string def findLadders(self, start, end, dict): chars = "abcdefghijklmnopqrstuvwxyz" res, path = [], [] # curlevel and nextlevel for level-order traversal # prev is a map storing the previous word variations of each word curlevel, nextlevel, prev = [], [], {} curlevel.append(start) path.append(end) while curlevel: for w in curlevel: if w in dict: dict.remove(w) # a set for checking duplicate presences of a word in a level shown = set() # generate the next level while curlevel: word = curlevel[0] del curlevel[0] # skip a word if it is traversed before, otherwise add it # to the set if word in shown: continue shown.add(word) # generate paths when reaching the target word if word == end: self.buildPath(start, end, prev, res, path) return res for i in range(len(word)): for c in chars: nw = word[:i] + c + word[i + 1:] if nw in dict: nextlevel.append(nw) if nw not in prev: prev[nw] = [] prev[nw].append(word) curlevel, nextlevel = nextlevel, curlevel return res # recursion for generating paths def buildPath(self, start, end, prev, res, path): if start == end: p = list(path) p.reverse() res.append(p) else: for w in prev[end]: path.append(w) self.buildPath(start, w, prev, res, path) del path[-1]
Java:
public class Solution { public List<List> findLadders(String start, String end, Set dict) { List<List> res = new ArrayList<List>(); List curlevel = new ArrayList(), nextlevel = new ArrayList(), path = new ArrayList(); Map<String, ArrayList> prev = new HashMap<String, ArrayList>(); curlevel.add(start); path.add(end); while(!curlevel.isEmpty()){ for(String w : curlevel) dict.remove(w); Set shown = new HashSet(); while(!curlevel.isEmpty()){ String word = curlevel.remove(0); if(shown.contains(word)) continue; shown.add(word); if(word.equals(end)){ buildPath(start, end, prev, path, res); return res; } for(int i = 0; i < word.length(); i ++){ char[] chars = word.toCharArray(); for(int c = 'a'; c <= 'z'; c ++){ chars[i] = (char)c; String nw = new String(chars); if(dict.contains(nw)){ nextlevel.add(nw); if(prev.get(nw) == null) prev.put(nw, new ArrayList()); prev.get(nw).add(word); } } } } List tmp = curlevel; curlevel = nextlevel; nextlevel = tmp; } return res; } public void buildPath(String start, String end, Map<String, ArrayList> prev, List path, List<List> res){ if(start.equals(end)){ List p = new ArrayList(path); Collections.reverse(p); res.add(p); }else{ ArrayList prevs = prev.get(end); for(String w : prevs){ path.add(w); buildPath(start, w, prev, path, res); path.remove(path.size() - 1); } } } }
Note that creating a charset for the dictionary is trivial to the performance of the algorithm, especially when the dictionary is pretty large.
0 notes
Text
range() generates a list of numbers when the first time that it is being called, while xrange() acts like a generator, returning a single number every time it is called. So range() results in less runtime but more memory consumption, while xrange() is the opposite.
I found that when doing "Word Ladder" in Leetcode. It results in TLE when using xrange() for loop, but range() made it through.
0 notes
Text
[LeetCode] Reverse A Linked List II & Reverse Nodes in k-Group
These two problems are very similar in essence. The key to the problems is that you must know how to reverse a linked list. So first of all, let's look at an algorithm for reversing a linked list.
The idea of the algorithm I will introduce here is very simple: traverse the list and insert the newly found node to the head of the list at each iteration. For example, if we want to reverse a list 1->2->3->4, we start traversing the list first. When 1 is found, we keep it there as if we picked it out and inserted it back to the head of the list. When we find 2, we also insert it to the head of the list, then we get 2->1->3->4. When we find 3, we insert it to the head again and we get 3->2->1->4. Finally, we find 4 and put it to the head. Then we finally get the answer 4->3->2->1.
To implement this algorithm, we need 3 pointers: pointer head pointing to the head of the list, pointer first pointing to the first node we found and pointer cur pointing to the current node we found. To reverse the list, we let first point to cur's next node, cur point to the previously found node(should be head's next node) and head point to cur again and again till no more new node found in the list (cur is null). The implementation is shown below.
Python:
# Definition for singly-linked list. # class ListNode: # def __init__(self, x): # self.val = x # self.next = None def reverseList(l): # when l is null or there is a single node in the list if not l or not l.next: return l # head points to the head of the list # first points to the first node of the list # cur points to the currently traversing node head = ListNode(0) head.next = l first = head.next cur = first.next # before cur reaches the end of the list while cur: first.next = cur.next cur.next = head.next head.next = cur cur = first.next return head.next
With knowledge about how to reverse a list, we can dig into the problems mentioned before.
The first problem requires us reversing a portion of the list rather than the entire one. The indices of the start and the end are given. The problem is just a little bit more complex than reversing the entire list. We just need to find the start node, keep reversing the list and stop when the next node of the end node is found. In this case, the head node is the previous node of the start node and the first node is the start node. The code is shown below.
Python:
# Definition for singly-linked list. # class ListNode: # def __init__(self, x): # self.val = x # self.next = None class Solution: # @param head, a ListNode # @param m, an integer # @param n, an integer # @return a ListNode def reverseBetween(self, head, m, n): #dummy points to the head of the list dummy = ListNode(0) dummy.next = head # mhead points to the previous node of the start node # f points to the start node # c points to the currentl traversing node mhead, f, c = dummy, None, None # before reaching the end node for i in xrange(0, n): # looking for the start node if i < m - 1: mhead = mhead.next # start node found elif i == m - 1: f = mhead.next c = f.next # reversing the list else: f.next = c.next c.next = mhead.next mhead.next = c c = f.next return dummy.next
The second problem is a little more complex than the first one. It requires reversing every k nodes in the list. So we need to treat every k nodes as a sublist and reverse each of them. One thing to note is that if there are fewer than k nodes at the end of the list, these nodes are not reversed. In other words, a sublist won't be reversed if it has fewer than k nodes. So we need a "precursor" to identify sublists. The code is shown below.
Python:
# Definition for singly-linked list. # class ListNode: # def __init__(self, x): # self.val = x # self.next = None class Solution: # @param head, a ListNode # @param k, an integer # @return a ListNode def reverseKGroup(self, head, k): dummy = ListNode(0) dummy.next = head # pre is the precursor # h points to the head of each sublist # f points to the first node of each sublist # c points to the currently traversing node # cnt is used for tracking the number of nodes in each sublist pre, h, f, c = dummy, dummy, None, None cnt = 0 while pre: # identifying a sublist if cnt < k: pre = pre.next cnt += 1 # found a sublist else: # reverse the sublist f = h.next c = f.next for i in xrange(0, k - 1): f.next = c.next c.next = h.next h.next = c c = f.next # let h and pre point to the f, since f is at the end of this sublist # and points to the head of the next sublist h, pre = f, f cnt = 0 return dummy.next
0 notes
Text
Python Execution Environment
Interpreter Options
-3 # enable warnings about features removed or changed in Python 3 -i # enter interactive mode after program execution (mostly for debugging) -m module # run a module as a script (executes __main__ method within the module) -O # optimized mode -OO # optimized mode plus removal of documentation strings when creating .pyo files -t # reports warning about inconsistent tab usage -tt # throws TabError exception when finding inconsistent tab usage -x # skip the first line of the source program -c cmd # execute cmd as a string
Interpreter Environment Variables
PYTHONPATH # module search path, separated by ":" PYTHONSTARTUP # file executed on interactive startup PYTHONHOME # Python installation location PYTHONINSPECT # equivalent to "-i" option PYTHONUNBUFFERED # equivalent to "-u" option PYTHONIOENCODING # encoding and error handling for stdin, stdout and stderr PYTHONOPTIMIZE # equivalent to "-O" option PYTHONNOUSERSITE # equivalent to "-s" option PYTHONVERBOSE # equivalent to "-v" option PYTHONUSERBASE # root directory for per-user package PYTHONCASEOK # indicates to use case-insensitive matching for module names in search
Interactive Sessions
Display results of interactive sessions can be changed by redefining sys.displayhook. It is a method accepts a parameter, which is the result of the last executed statement.
Per-User Package
In UNIX-like operating systems, per-user package is put in ~/.local/ directory. It may look like ~/.local/lib/python2.7/site-packages. You can put your own modules or install third-party modules under this directory.
Clean up on Program Exit
Although Python interpreter destroys objects with no reference to it, it does not guarantee to do that, especially when there is circular reference in the program. In this case, it is necessary to clean up these resources explicitly. It is always a good idea to define a cleanup method explicitly and register it to atexit module. Also register Python's garbage collector to the atexit module so that it can be invoked on the program's exit. An example code is shown below.
= simport atexit, gc conn = open_connection("www.google.com") def cleanup(): print "Clean up ..." close_connection(conn) atexit.register(cleanup) atexit.register(gc.collect)
0 notes
Text
[Leetcode]Sort Colors
It is a typical bucket sort(or counting sort) problem. I believe that the bucket sort solution is of little interest and easy to understand, so I paste the code below.
Python:
class Solution: # @param A a list of integers # @return nothing, sort in place def sortColors(self, A): counter = [0 for i in range(3)] for i in A: counter[i] += 1 cnt = 0 for x, i in enumerate(counter): for j in range(i): A[cnt] = x cnt += 1
An interesting solution is the one-way pass solution. The basic idea is that the 0s are put to the front of the array and the 2s are thrown to the end, leaving the 1s in the middle. The unsorted part of the array shrinks as more and more 0s and 2s are found. To keep track of the unsorted part, we need two pointers pointing to the head and the end. The algorithm terminates once the size of unsorted part reaches 0, i.e. the index reaches the end of the unsorted part. The code is shown below.
Python:
class Solution: # @param A a list of integers # @return nothing, sort in place def sortColors(self, A): head = 0 end = len(A) - 1 idx = 0 while idx <= end: if A[idx] == 0: if idx == head: idx += 1 else: A[idx], A[head] = A[head], A[idx] head += 1 elif A[idx] == 2: A[idx], A[end] = A[end], A[idx] end -= 1 else: idx += 1
0 notes
Text
Python I/O
argv
# file: test.py import sys print argv[0] #test.py print len(argv) # number of arguments following the script plus the script name
Adding Options
Use optparse to make life easier ...
# file: test.py import optparse p = optparse.OptionParser() #store values p.add_option("-o", action="store", dest="outfile") p.add_option("--output", action="store", dest="outfile") #set boolean values p.add_option("-d", action="store_true", dest="debug") p.add_option("--debug", action="store_true", dest="debug") #set default values p.set_defaults(debug=False) #parse the args opts, args = p.parse_args() print opts.outfile print opts.debug
So when I execute
$./test.py -o abc.txt -d
, the result is as below
$./test.py -o abc.txt -d abc.txt True
The optparser automatically gives error message if an input option is not available. e.g.
$./test.py -b Usage: opt.py [options] opt.py: error: no such option: -b
Environment Variables
They are stored in os.environ. Typical environment variables including PATH, USER, EDITOR, etc. Modification of environment variables will affect running processes and their child processes.
Read Lines From a File
Instead of explicitly calling readline() or readlines(), we can directly iterate on a file if all the lines in the file will be consumed. e.g.
f = open("test.txt", "r") for line in f: print line
I/O Modes
Files can be encoded with built-in codec
"r" # read only "w" # write only "a" # append only "b" # read as a binary file "t" # read as a text file "+" # allows for in-place update "U" # universal newline support for reading. It translates different newline encodings into standard '\n'
File Method
f = open("test.txt", "r+") n = 10 s = "abc" lines = ["hello world", "hello python"] offset = 0 whence = 3 size = 1024 f.read(n) # reads at most n bytes, n is optional f.readline() # reads an input line, including the terminating newline f.readlines() # reads all the input lines and return a list f.write(s) # writes s to the file f.writelines(lines) # writes all strings in sequence lines into the file f.close() # closes the file f.tell() # returns the current file pointer f.seek(offset, whence) # seeks to a new file position, whence is optional. # When whence is # 0 - seek from the start of the file # 1 - seek from the current position of the file pointer # 2 - seek from the end of the file f.isatty() # returns 1 if f is an interactive terminal f.flush() # flushes the output buffers f.truncate(size) # truncates the file to at most size bytes f.fileno() # returns the file descriptor f.next() # returns the next line or raises StopIteration exception when EOF is reached. It is f.__next__() in Python 3
readline() also accepts a maximum line length. If the line is longer than the length, the remaining part of the line will be returned in the subsequent line.
Note that in Python 2, read operations return 8-bit strings no matter what file mode is specified. In Python 3, they return Unicode strings if text mode is specified and byte strings if binary mode is selected.
File Attributes
f.closed # Boolean. False if the file is open and True if closed f.mode # I/O mode f.name # name of the file if the file is created by open(). # Otherwise, it is the source of the file f.softspace # Boolean. Indicates if a space character is printed before another value # using the print statement f.newlines # All types of newlines found in the file. None if no newline found f.encoding # String. Indicates file encoding. None if no encoding is used
Standard I/O
Standard I/O is provided by sys module. They are sys.stdin, sys.stdout and sys.stderr. To read from user input, we can do this.
import sys sys.stdout.write("Enter your name:") name = sys.stdin.readline() #Alternatively, name = raw_input("Enter your name:") #Python 2 name = input("Enter your name:") #Python 3
The difference between sys.stdout.write() and raw_input() is that the latter omits newline of the input line while the former doesn't.
""The three standard I/O file objects can be changed in sys.__stdout__, sys.__stdin__ and sys.__stderr__, if necessary.
print Statement
"," will call the str() method of trailing object and is used to concatenate strings.
">>" is used to redirect the strings to a specific file object.
f = open("test.txt", "r+") print >> f, "Hello World" f.close()
Variable Interpolation
# formatted I/O form = """\ Dear %(name)s, Please send back my %(item)s or pay me %(amt)0.2f. Sincerely Yours, Hello Kitty """ print form % {"name": "Mr. Bush", "item": "blender", "amount": 50.00,} #string's format() method form2 = """\ Dear {name}, Please send back my {item} or pay me {amt}. Sincerely Yours, Hello Kitty """ print form2.format(name = "Mr.Bush", item="blender", amount = 50.0)
Unicode I/O
Files can be encoded with a built-in module codecs. You can encode a file when or after opening it as below
# encode a file when opening it f = codecs.open("test.txt", "r+", "utf-8", "strict") # encode a file after opening it. # Note: the file must be in binary mode. g = open("test.txt", "rb+") codecs.EncodedFile(g, "utf-8")
The "strict" shown in codecs.open() method is for error handling. There are four options - "strict", "ignore", "backslashreplace" and "xmlcharrefreplace".
pickle and shelve
pickle and shelve are built-in modules for object persistence. pickle serializes an object to and loads an object from a file with its protocol. shelve is compatible with pickle and acts like a dictionary, but only strings can be the keys. Certain types of objects with internal state, such as files and network connection, cannot be stored with shelve. The usage is as below.
pickle:
import pickle # serialize an object to a file obj = object() f = open("test", "wb+") pickle.dump(obj, f) f.close() # load the object from the file f = open("test", "rb+") obj=pickle.load(f) f.close()
shelve:
import shelve obj = object() db = shelve.open("data.store") # serialization db["key"]=obj # retrieval obj2 = db["key"] db.close()
0 notes
Text
There is no explicit overloading in Python as Python is dynamically typed. So overloading can be achieved in one function.e.g.
def func(a, b, c): pass # all the following calls are valid func(1) func(2, 3) func(1, 2, 3)
0 notes
Text
[Leetcode]Word Ladder
The most difficult part of this problem is to realize that it is related to tree and breadth-first search(BFS). Once you realize that, the problem becomes easy.
The problem requires us to find the shortest path to transform a word to another. Only one character can be changed each time and all the intermediate words during the transformation must be in a given dictionary. Now let's construct a tree to deal with this problem.
First of all, you should realize that the original word is the root of the tree and the target word is a leaf (as we stop building the tree once we find the target). Since we can change only one character each time, the children of a tree node are all the valid words in the dictionary with only one character different from their parent. For example, assuming we want to transform "hit" to "cog" and the given dictionary contains ["hot","dot","dog","lot","log"], we can construct such a tree:

With this vision, the problem can be easily solved by using two queues - one for tracking current word and another for tracking current depth. The code is shown below:
Java:
public class Solution { public int ladderLength(String start, String end, Set dict) { //characters shown in the dict Set chars = new HashSet(); List words = new ArrayList(); List steps = new ArrayList(); for(String w : dict){ for(Character c : w.toCharArray()){ chars.add(c); } } for(int i = 0; i < end.length(); i ++) chars.add(end.charAt(i)); words.add(start); steps.add(1); while(!words.isEmpty()){ String curWord = words.remove(0); int curStep = steps.remove(0); if(curWord.equals(end)) return curStep; for(int i = 0; i < curWord.length(); i ++){ char[] cs = curWord.toCharArray(); for(Character c : chars){ cs[i] = c; String newWord = new String(cs); if(dict.contains(newWord)){ words.add(newWord); steps.add(curStep + 1); dict.remove(newWord); } } } } return 0; } }
Python:
class Solution: # @param start, a string # @param end, a string # @param dict, a set of string # @return an integer def ladderLength(self, start, end, dict): charset = set() words, steps = ([] for i in range(2)) #collect all chars shown in the dict into the charset for w in dict: for c in w: charset.add(c) for c in end: charset.add(c) words.append(start) steps.append(1) while words: curword = words[0] curstep = steps[0] del words[0] del steps[0] if curword == end: return curstep for i in range(len(curword)): arr = list(curword) for c in charset: arr[i] = c nw = "".join(arr) if nw in dict: words.append(nw) steps.append(curstep + 1) dict.remove(nw) return 0
0 notes
Text
[Leetcode]Median of Two Sorted Array
The trickiest part of this problem is to find the median in a O(log(m + n)) runtime complexity. It is natural to think of merging the two sorted arrays first and find the midst element of the merged array. This approach results in an O(n) time and space complexity. It is not that bad, but not good enough to pass this test.
To reduce time complexity, we must come up with an approach to shrink size of the arrays and avoid traversing every element of them. Actually, this problem is quite similar to the problem "find the k-th element of two sorted arrays". In this case, we just need to find the (m + n)/2-th element with the exactly same approach.
Before writing code, we must know several facts about the "find the k-th element of two sorted arrays" problem:
When either array is empty, the program should return the k-th element of the other.
When k = 1, the program should return the smaller element between the first elements of the arrays.
If the k/2-th element of array A is smaller than the k/2-th element of array B, i.e. A[k/2 - 1] < B[k/2 - 1], the first k/2 elements can never be larger than the k-th element of A and B, and vice versa. In other word, if A[k/2 - 1] < B[k/2 -1], the first k/2 elements in A should be discarded, and vice versa.
If the k/2-th elements of A and B are equal, the program should return this element as the k-th element of the two arrays
With this approach, it is always assumed that A is smaller than B in size. Otherwise, k/2 might be out of B's index when A is much larger than B in size. So there must be a line to deal with the situation that A's size is larger than B's. The code is shown below:
Java:
public class Solution { public double findMedianSortedArrays(int[] A, int[] B){ int len = A.length + B.length; if(len % 2 == 1) return (double)findMedian(A, B, 0, 0, len / 2 + 1); else return (findMedian(A, B, 0, 0, len / 2) + findMedian(A, B, 0, 0, len / 2 + 1)) / 2.0; } public int findMedian(int[] A, int[] B, int as, int bs, int k){ int alen = A.length - as, blen = B.length - bs; if(alen > blen) return findMedian(B, A, bs, as, k); if(alen == 0) return B[bs + k - 1]; if(k == 1) return Math.min(A[as], B[bs]); int pa = Math.min(k / 2, alen), pb = k - pa; if(A[as + pa - 1] < B[bs + pb - 1]) return findMedian(A, B, as + pa, bs, k - pa); else if(A[as + pa - 1] > B[bs + pb - 1]) return findMedian(A, B, as, bs + pb, k - pb); else return A[as + pa - 1]; } }
Python:
class Solution: # @return a float def findMedianSortedArrays(self, A, B): total = len(A) + len(B) if total % 2 == 1: return self.findMedian(A, B, total / 2 + 1) / 1.0 else: return (self.findMedian(A, B, total / 2) + self.findMedian(A, B, total / 2 + 1)) / 2.0 def findMedian(self, a, b, k): if len(a) > len(b): return self.findMedian(b, a, k) if len(a) == 0: return b[k - 1] if k == 1: return min(a[0], b[0]) pa = min(k / 2, len(a)) pb = k - pa if a[pa - 1] < b[pb - 1]: return self.findMedian(a[pa:], b, k - pa) elif a[pa - 1] > b[pb - 1]: return self.findMedian(a, b[pb:], k - pb) else: return a[pa - 1]
The algorithm can be improved so that we don't need to concern about if A is smaller than B in size. The improvement is simple: we compare the k/(len(A) + len(B))-th element of A to the counterpart of B, so that it always take away a proper portion of A rather than a fix number of elements. The code is shown below:
Java:
public class Solution { public double findMedianSortedArrays(int[] A, int[] B){ int len = A.length + B.length; if(len % 2 == 1) return (float)findMedian(A, 0, A.length - 1, B, 0, B.length - 1, len / 2); else return (findMedian(A, 0, A.length - 1, B, 0, B.length - 1, len / 2) + findMedian(A, 0, A.length - 1, B, 0, B.length - 1, len / 2 - 1)) / 2.0; } public int findMedian(int[] A, int as, int ae, int[] B, int bs, int be, int k){ int alen = ae - as + 1, blen = be - bs + 1; if(alen == 0) return B[bs + k]; if(blen == 0) return A[as + k]; if(k == 0) return Math.min(A[as], B[bs]); int ak = alen * k / (alen + blen), bk = k - ak - 1; if(A[as + ak] < B[bs + bk]) return findMedian(A, as + ak + 1, ae, B, bs, be, k - ak - 1); else if(A[as + ak] > B[bs + bk]) return findMedian(A, as, ae, B, bs + bk + 1, be, k - bk - 1); else return A[as + ak]; } }
Python:
class Solution: # @return a float def findMedianSortedArrays(self, A, B): total = len(A) + len(B) if total % 2 == 1: return self.findMedian(A, B, total / 2) / 1.0 else: return (self.findMedian(A, B, total / 2) + self.findMedian(A, B, total / 2 - 1)) / 2.0 def findMedian(self, a, b, k): if len(a) == 0: return b[k] if len(b) == 0: return a[k] if k == 0: return min(a[0], b[0]) pa = len(a) * k / (len(a) + len(b)) pb = k - pa - 1 if a[pa] < b[pb]: return self.findMedian(a[pa + 1:], b, k - pa - 1) elif a[pa] > b[pb]: return self.findMedian(a, b[pb + 1:], k - pb - 1) else: return a[pa]
0 notes
Text
[Leetcode]Populating Next Right Pointers in Each Node I & II
The problems are easy if extra space can be used - they are typical problems related to binary tree level order traversal. However, it is not allowed to use extra space in solving the problems, things become a little bit tricky.
Problem I is not that tricky as it is guaranteed that test cases are perfect binary trees. It saves a lot of concerns about whether a node is present or not. To save extra space, we should fully use the next pointer of each node - we must chain a level first and use the chained level to connect nodes in the next level. The code is as below:
# Populating Next Right Pointers in Each Node I # Populating Next Right Pointers in Each Node I # Definition for a binary tree node # class TreeNode: # def __init__(self, x): # self.val = x # self.left = None # self.right = None # self.next = None class Solution: # @param root, a tree node # @return nothing def connect(self, root): if not root: return # dummy is a dummy pointer for working around chaining the leftmost node of each level # top is pointing to the upper level # cur is pointing to the current level that is being chained. It points to dummy at initial # first is pointing to the leftmost node of the next level to be chained dummy = TreeNode(0) top = root cur = dummy first = root.left while top: # since this loop is for chaining the next level, it ends when there is no more next level while top and top.left: cur.next = top.left cur = cur.next cur.next = top.right cur = cur.next top = top.next top = first cur = dummy first = top.left if top else None
Problem II is trickier as the binary tree is not guaranteed to be perfect. In this case, we must take into account whether children of a tree node is present or not. We also need to be able to recognize leaves in an "upper" level from leaves at "bottom"; otherwise, the program will exit when encountering a leave in an "upper" level, leaving nodes in "lower" levels unchained. The code is as below:
# Populating Next Right Pointers in Each Node II # Definition for a binary tree node # class TreeNode: # def __init__(self, x): # self.val = x # self.left = None # self.right = None # self.next = None class Solution: # @param root, a tree node # @return nothing def connect(self, root): if not root or (not root.left and not root.right): return # top is pointing to the upper level # first is pointing to the leftmost node of the next level to be chained. # dummy is a dummy pointer working around chaining the first node of each level. It points to the leftmost node of each level when the node is found # cur is pointing to the current level that is being chained top = root first = top.left if top.left else top.right dummy = TreeNode(0) cur = dummy while top: while top: if top.left: cur.next = top.left cur = cur.next if top.right: cur.next = top.right cur = cur.next elif top.right: cur.next = top.right cur = cur.next top = top.next top = first cur = dummy # this snippet of code is used for recognizing leaves in "upper" level from those at "bottom", avoiding the program from early exit. if not top: first = None f = top while f and not f.left and not f.right: f = f.next if not f: first = None else: if f.left: first = f.left else: first = f.right
0 notes
Text
A[i ++] in Python
A[i ++] in Python won't act as it does in Java, because the evaluation ordering of indexing is higher than addition in Python
0 notes