Computational processes are abstract beings that inhabit computers. As they
evolve, processes manipulate other abstract things called data. The evolution
of a process is directed by pattern of rules called a program.
Why use Lisp for the book?
The most significant unique feature of Lisp is the fact that Lisp descriptions
of processes, called procedures, can themselves be represented and manipulated
as Lisp data. The importance of this is that there are powerful program-design
techniques that rely on the ability to blur the traditional distinction between
"passive" data and "active" processes.
The three mechanisms of combining simple ideas to form more complex ideas in any
powerful programming languages:
Here comes the summary of the Coursera course
Fucntional Programming Principles in Scala
It’s a little bit late, it has been two months since I finished the course.
This is the first Coursera course I followed from the very beginning to the end
and accomplished all the programming assignments with full scores, which helped
me to get the certificate with distinction.
First the excellent parts of the course:
Martin Odersky is a pretty good teacher, and all the lectures are well
designed. I enjoyed watching the lectures a lot and most of which I have
read more than once.
The programming assignments are well structured, with detailed instructions
and step by step guide. In each of the assignments, the whole task is split
into several steps with approprivate level of abstractions.
There are also many useful supporting materials like scala style guide, sbt
tutorial, instructions on tools setup for the course, which made it easier
to concentrate on the course content. For me, I don’t have scala develop
environment setup before the course, and by following the tools setup section
in only took me less than half an hour to get everything ready for trying the
example code and the assignments.
As for the less-good things, I would say:
Both the lectures and the assignments are quite easy for an experienced
programmers (No functional programming background needed), I was expecting
more challenging stuff.
There are no official solutions distributed. (The course will be offered
again some time later) However, I still think for those students who passed
the course should be qualified to get the solutions so that they can compare
those with their own to see where they can still improve.
It’s a pity that this is only the first half of an advanced undergraduate
course that Martin taught on campus. I am interested in the other half.
After taking this cousre, I got a deeper understanding of the functional
programming basics and it made me feel more comfortable while picking up
SICP again (after 3 years). Now, I am convinced that I am able to go through
SICP and finish most of the exercises. Also, I had a firmer grasp of Scala
even though I didn’t write more than 100 lines of Scala before; I understand
more about the scala syntax, idioms and even the motivations behind some of the
language structures. E.g., call by name/value, lazy evaluation, currying,
pattern matching and so on. I will publish my detailed notes on the lectures
and assignments to this blog later.
To conclude, I really enjoyed taking the course and many thanks to Martin
, the TAs, and also the coursera staff for offering such a wonderful course.
This module provides the infrastructure for defining
abstract base classes
(ABCs) in Python. The ABCs define a minimal set of methods that establish the
characteristic behavior of the type. For more details about this, see
PEP 3119.
Highlights
The module provides a metaclass
used to create ABCs. An ABC can be subclassed directly. The class also has a ‘register’
method to register unrelated concrete classes (including built-in classes) and unrelated
ABCs as ‘virtual subclasses’
Also there are two decorators abstractmethod and abstractproperty, which will set
the function object’s attribute ‘__isabstractmethod__’ to True. Only when all of the
abstract methods and abstract properties are overriden, can a class that has a metaclass
derived from ABCMeta be instantiated.
Code comments
ABCMeta.__new__
123456789101112131415161718192021
def__new__(mcls,name,bases,namespace):cls=super(ABCMeta,mcls).__new__(mcls,name,bases,namespace)# Compute set of abstract method namesabstracts=set(nameforname,valueinnamespace.items()ifgetattr(value,"__isabstractmethod__",False))forbaseinbases:fornameingetattr(base,"__abstractmethods__",set()):value=getattr(cls,name,None)ifgetattr(value,"__isabstractmethod__",False):abstracts.add(name)cls.__abstractmethods__=frozenset(abstracts)# chaoc: caches are the typical usages of weak references# Set up inheritance registrycls._abc_registry=WeakSet()cls._abc_cache=WeakSet()cls._abc_negative_cache=WeakSet()cls._abc_negative_cache_version=ABCMeta._abc_invalidation_counterreturncls
It first creates a ‘type’ object cls. (super(ABCMeta, mcls) is ‘type’)
Iterate through all the attributes (including all the attributes inherited
from all the bases), if any of them have ‘__isabstractmethod__’ set to true,
add it to cls’s __abstractmethods__.
Initialize the attributes ‘_abc_registry’, ‘_abc_cache’, ‘_abc_negative_cache’
and ‘_abc_negative_cache_version’, which are used to speed up the check in
__instancecheck__ and __subclasscheck__.
ABCMeta.register(cls, subclass)
See the added comments in line
12345678910111213141516
defregister(cls,subclass):"""Register a virtual subclass of an ABC."""# chaoc: sanity check to make sure that subclass is class type# either a type object (for new style classes) or the type is# the same as ClassType(for old style classes)ifnotisinstance(subclass,(type,types.ClassType)):raiseTypeError("Can only register classes")ifissubclass(subclass,cls):return# Already a subclass# Subtle: test for cycles *after* testing for "already a subclass";# this means we allow X.register(X) and interpret it as a no-op.ifissubclass(cls,subclass):# This would create a cycle, which is bad for the algorithm belowraiseRuntimeError("Refusing to create an inheritance cycle")cls._abc_registry.add(subclass)ABCMeta._abc_invalidation_counter+=1# Invalidate negative cache
ABCMeta.__instancecheck__
See the added comments in line
12345678910111213141516171819202122
def__instancecheck__(cls,instance):"""Override for isinstance(instance, cls)."""# Inline the cache checking when it's simple.subclass=getattr(instance,'__class__',None)ifsubclassisnotNoneandsubclassincls._abc_cache:returnTruesubtype=type(instance)# Old-style instancesifsubtypeis_InstanceType:subtype=subclass# chaoc: subtype will also be subclass for old style classes# as assigned in the above stepifsubtypeissubclassorsubclassisNone:# chaoc: check if the negative cache is safe to use or notif(cls._abc_negative_cache_version==ABCMeta._abc_invalidation_counterandsubtypeincls._abc_negative_cache):returnFalse# Fall back to the subclass check.returncls.__subclasscheck__(subtype)return(cls.__subclasscheck__(subclass)orcls.__subclasscheck__(subtype))
ABCMeta.__subclasscheck__
The code and comment in this function is very clear and straightforward.
Just make sure the different cases needed to check:
check the subclass hook
check if it’s a direct subclass through __mro__
check if it’s a subclass of a registered class (issubclass is called to do
recursive check)
check if it’s a subclass of a subclass (issubclass is called to do recursive
check)
In this post, we only talk about the defitions of ABCMeta. We will see the
typical usages in the collections module.
This module provides support for maintaining a list in sorted order without
having to sort the list after each insertion. It uses a basic bisection
algorithm similar with the classic binary search.
Review of binary search
According to ‘Programming Pearls’
by Jon Bentley, It’s very hard to write a correct binary search(Unbelievable, Uh?).
Below is the note from the book:
I’ve assigned [binary search] in courses at Bell Labs and IBM.
Professional programmers had a couple of hours to convert [its] description
into a program in the language of their choice; a high-level pseudocode
was fine. At the end of the speci?ed time, almost all the programmers
reported that they had correct code for the task. We would then take thirty
minutes to examine their code, which the programmers did with test cases.
In several classes and with over a hundred programmers, the results varied
little: ninety percent of the programmers found bugs in their programs (and
I wasn’t always convinced of the correctness of the code in which no bugs were
found). I was amazed: given ample time, only about ten percent of professional
programmers were able to get this small program right. But they aren’t the only
ones to ?nd this task difficult: in the history in Section 6.2.1 of his
‘Sorting and Searching’, Knuth points out that while the first binary search
was published in 1946, the first published binary search without bugs did not
appear until 1962.
Understanding how to use loop invariants in composing a program is very important,
so let’s first implement a binary search and prove the correctness utilizing this.
Here we use a simplified specification compared to the bsearch function in the C
standard library.
1234567891011121314151617181920212223242526
intbinsearch(intx,int*A,intn)// @require 0 <= n && n <= length(A)// @require is_sorted(A, n)/* @ensures (-1 == result && !is_in(x, A, n)) || ((0 <= result && result < n) && A[result] == x)*/{intlower=0,higher=n;intmid;while(lower<higher)// @loop_invariant 0 <= lower && lower <= higher && higher <= n// @loop_invariant (lower == 0 || A[lower - 1] < x)// @loop_invariant (higher == n || A[higher] > x){mid=lower+(higher-lower)/2;// @assert lower <= mid && mid < higherif(A[mid]==x)returnmid;elseif(A[mid]<x)lower=mid+1;elsehigher=mid;}return-1;}
It’s easy to prove that the loop invariants are strong enough to imple the
postcodition of the function(@ensures). Does this function terminate? In the
loop body, we have lower < higher, so lower <= mid && mid < higher, then the
intervals from lower to mid and from (mid+1) to higher are strictly
smaller than the original interval (lower, higher), unless we find the element,
the difference between higher and lower will eventually become 0 and we will
exit the loop.
The bisect() functions provided in this module are useful for finding the
insertion points but can be tricky or awkward to use for common searching
tasks.
Code comments
The above binsearch function will return the first of the elements that equals
to the target if there are more than one. Sometimes it’s required to find the
left most one or the right most one. We can achieve this by using the bisect()
functions. Let’s examine the pre and post conditions, and also loop invariants.
12345678910111213141516171819
defbisect_right(a,x,lo=0,hi=None):# Let l be the original lo passed in, and h be hi. In order to differentiate.# @requires 0 <= l && l < h && h <= len(a)# @requires is_sorted(a[l:h])# @ensures all e in a[l:result] have e <= x, and all e in a[result:h] have e > xwhilelo<hi:# @loop_invariant l <= lo && lo <= hi && hi <= h# @loop_invariant lo == l || a[lo - 1] <= x# @loop_invariant hi == h || a[hi] > xmid=(lo+hi)//2# @assert lower <= mid && mid < higherifx<a[mid]:hi=midelse:lo=mid+1returnlo
After exiting the loop, we have lo == hi. Now we have to distinguish some cases:
If lo == l, from the third conjunct, we know that a[hi] > x, and since lo == hi,
we have a[lo] > x. In this case, all e in a[l:h] > x
If hi == h, from the second conjunct, we know that a[lo - 1] < x, in this case,
all e in a[l:h] <= x
if lo != l and hi != h, from the second and third conjuncts, we know that
a[lo - 1] <= x, a[hi] > x. The post condition still holds.
We can do the same analysis on the function bisect_left().
Note: In the pre condition I explicitly written down that lo < hi is required, the
code will directly return lo when hi is less than or equal to lo, but that’s simply
meaningless, if this is the case, the sorted order of ‘a’ after we insert element
before the index returned by the function.
About the function prototype
In the doc of the module, it’s bisect.bisect_right(a, x, lo=0, hi=len(a))
while in the source code, it’s def bisect_left(a, x, lo=0, hi=None)
bisect.bisect_right(a, x, lo=0, hi=len(a)) is not valid python code, you will
get error like this: NameError: name ‘a’ is not defined. This is because default
values are computed and bound at the function definition time rather than when you
call the function. This means that you can’t have a default which is dependent on
something that is not known until the function is called.
Pythonic stuff
Override function definitions in Python
12345
# Overwrite above definitions with a fast C implementationtry:from_bisectimport*exceptImportError:pass
It took me some time to finally get latex math formulas working in Octopress.
If you googled ‘Octopress latex’, you can get quite a few online resources about
how to support latex in octopress, with various levels of complexity. In this
post, I will write down how I achieve this.
The initial attempt
As I installed the jekyll-rst plugin
to use rst to write my posts, I thought it should be easy to write latex math
because docutils has native support for it since version 0.8 (A :math: role and
also a .. math: directive introduced for that). However, after I tried to use
these in a octopress post, I found that the post will be rendered to empty. For
example, if I insert the following rst code into my post, the whole post becomes
empty; but after removing this line, everything is fine.
1
The area of a circle is :math:`A_\text{c} = (\pi/4) d^2`.
I also verified that the exact same code can be successfully converted to valid
html using the ‘rst2html.py’ script on my system, so I guess maybe something is
wrong in ‘RbST’. I found that in RbST, it has its own copies of rst2html and
rst2latex tools under ** /gems/RbST-0.1.3/lib/rst2parts **,
1234
chuchao@chuchao:~/.rvm/gems/ruby-1.9.2-p320/gems/RbST-0.1.3/lib/rst2parts$ ls
__init__.py rst2html.py rst2latex.py transform.py transform.pycchuchao@chuchao:~/.rvm/gems/ruby-1.9.2-p320/gems/RbST-0.1.3/lib/rst2parts$ ls ..
rbst.rb rst2parts
which will be used in rbst.rb. I have even tried to change rbst.rb to use the
rst2html.py installed on my system, but this also didn’t get any luck.
The module provides an implementation of heap queue algorithm, also known as
priority queue algorithm.
Highlights
Zero-based indexing is used, so the children’s index of node with index k
are (2*k + 1) and (2*k + 2) respectively.
Internally a ‘min heap’ is maintained rather than ‘max heap’, which is more
generally used in algorithm textbooks.
Three general functions based on heaps are also provided:
12345678910
# Merge multiple sorted inputs into a single sorted output# (for example, merge timestamped entries from multiple log files)heapq.merge(*iterables)# The following two functions are effectively like# sorted(iterable, key=key, reverse=True)[:n] and# sorted(iterable, key=key)[:n],# but they perform best with smaller values of nheapq.nlargest(n,iterable[,key])heapq.nsmallest(n,iterable[,key]))
Pythonic stuff
Rich comparison methods
123
defcmp_lt(x,y):# use __lt__ if available; otherwise, try __le__returnx<yifhasattr(x,'__lt__')else(noty<=x)
In python, there are 6 so called "Rich Comparison" methods, x < y calls
x.__lt__(y) and others are similar (__le__ and <=; __gt__ and >=; __eq__ and ==;
__ne__ and <>). Arguments to rich comparison methods are never coerced. see
coercion
Code comments
Why heapify(x) is O(n)?
This is not obvious at first by seeing the code, given that there is a ‘while’
loop in _siftup and also a while loop in _siftdown(called in _siftup). Let’s
look into it further:
in the while loop of _siftup, it takes O(L) time for nodes that L levels
above leaves.
and in the while loop of _siftdown called in _siftup, it takes at most L
steps, so _siftdown is O(L).
since we have n/4 nodes in level 1, n/8 nodes in level 2, and finally one
root node, which is lg(n) levels above leaf, so the total amount in the while
loop of heapify is:
n/4 * c + n/8 * c + n/16 * 3c + … + 1 * lg(n) * c, and let n/4 = 2^k,
after simplification, we get:
c * 2^k(1/2^0 + 2/2^1 + 3/2^2 + … + (k+1)/2^k), as the limit of
(k+1)/2^k is 0 when k is infinite, so the term in the brackets bound to
a constant, from this we can conclude that heapify is O(2^k), which is O(n).
Why it continues to find the smaller child until a leaf is hit in _siftup?
As explained in the comment by the module author, this is a ad hoc to reduce
the comparisons on the following operations on the heap.
I am going to start the series of posts on reading the source of python standard
modules. I will go with the pure python modules first, and maybe later I can
continue with C implementations of the modules. Let’s see how far I could go.
What will be included
A brief introduction of the module. (It should be very short, people can go to
the standard library doc for more information.)
Special highlights about the important APIs, implementation details.
Python features/idioms/tricks/gotchas that worth the whistle, especially those
I was not familiar with
Detail explanations about the tricky part of the code if any
What will not be included
The example usage of the various APIs, for this kind of stuff,
Python module of the week is a better
place to go
Also, alone the way, I may start another series on some specific ‘advanced topics’
in python, like descriptor, decorator, method resolution order(mro)
and so on. Mainly about why they are introduced into python, how they are used and
the typical use cases. This is inspired by the blogs about
python history
This post documents how I set up this blog and the problems encounted. Octopress
is quite popular recently and you can just find a huge number of bloggers who
have written about how to set this up online. My main references are
Setting up a blog on Octopress and github
and Blog = Github + Octopress (In Chinese).