Chao's Random Thoughts

Keep looking, don't settle

Python Bisect

Introduction

This module provides support for maintaining a list in sorted order without having to sort the list after each insertion. It uses a basic bisection algorithm similar with the classic binary search.

Code comments

The above binsearch function will return the first of the elements that equals to the target if there are more than one. Sometimes it's required to find the left most one or the right most one. We can achieve this by using the bisect() functions. Let's examine the pre and post conditions, and also loop invariants.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
def bisect_right(a, x, lo=0, hi=None):

    # Let l be the original lo passed in, and h be hi. In order to differentiate.

    # @requires 0 <= l && l < h && h <= len(a)
    # @requires is_sorted(a[l:h])
    # @ensures all e in a[l:result] have e <= x, and all e in a[result:h] have e > x

    while lo < hi:
        # @loop_invariant l <= lo && lo <= hi && hi <= h
        # @loop_invariant lo == l || a[lo - 1] <= x
        # @loop_invariant hi == h || a[hi] > x
        mid = (lo + hi) // 2
        # @assert lower <= mid && mid < higher
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

After exiting the loop, we have lo == hi. Now we have to distinguish some cases:

  • If lo == l, from the third conjunct, we know that a[hi] > x, and since lo == hi, we have a[lo] > x. In this case, all e in a[l:h] > x
  • If hi == h, from the second conjunct, we know that a[lo - 1] < x, in this case, all e in a[l:h] <= x
  • if lo != l and hi != h, from the second and third conjuncts, we know that a[lo - 1] <= x, a[hi] > x. The post condition still holds.

We can do the same analysis on the function bisect_left().

Note: In the pre condition I explicitly written down that lo < hi is required, the code will directly return lo when hi is less than or equal to lo, but that's simply meaningless, if this is the case, the sorted order of 'a' after we insert element before the index returned by the function.

About the function prototype

In the doc of the module, it's bisect.bisect_right(a, x, lo=0, hi=len(a)) while in the source code, it's def bisect_left(a, x, lo=0, hi=None)

bisect.bisect_right(a, x, lo=0, hi=len(a)) is not valid python code, you will get error like this: NameError: name 'a' is not defined. This is because default values are computed and bound at the function definition time rather than when you call the function. This means that you can't have a default which is dependent on something that is not known until the function is called.

Pythonic stuff

Override function definitions in Python

1
2
3
4
5
# Overwrite above definitions with a fast C implementation
try:
    from _bisect import *
except ImportError:
    pass

Gotcha -- Mutable default arguments

See here for more details.

Comments