best-of-jupyter

Jupyter Tips, Tricks, Best Practices with Sample Code for Productivity Boost

View the Project on GitHub NirantK/best-of-jupyter

Making the Best of Jupyter

Found useful by Nobel Laureates and more:

“…, this looks very helpful”

  • Economics Nobel Laureate 2018, Dr. Paul Romer on Twitter

Contents

Getting Started Right

Debugging

from IPython.core.debugger import set_trace

def foobar(n):
    x = 1337
    y = x + n
    set_trace() #this one triggers the debugger
    return y

foobar(3)

Returns:

> <ipython-input-9-04f82805e71f>(7)foobar()
      5     y = x + n
      6     set_trace() #this one triggers the debugger
----> 7     return y
      8 
      9 foobar(3)

ipdb> q
Exiting Debugger.

Preference Note: If I already have an exception, I prefer %debug because it takes me straight to the exact line where the code breaks, whereas with set_trace() I have to step through the code line by line.
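To illustrate what "the exact line where code breaks" means, here is a minimal plain-Python sketch that walks a traceback down to the innermost frame, which is the frame a post-mortem debugger starts in. The division by zero is a made-up failure for illustration; in a notebook you would simply run %debug in the next cell instead.

```python
import sys


def foobar(n):
    x = 1337
    y = x / n  # fails when n == 0
    return y


try:
    foobar(0)
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    # Walk to the innermost traceback entry: the frame where the error was raised
    while tb.tb_next is not None:
        tb = tb.tb_next
    failing_function = tb.tb_frame.f_code.co_name
    failing_locals = dict(tb.tb_frame.f_locals)

print(failing_function)  # name of the function that raised
print(failing_locals)    # local variables at the moment of failure
```

Passing the same traceback to pdb.post_mortem(tb) opens an interactive debugger at that frame, with no stepping required.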

The autoreload extension makes the following workflow possible:

In [1]: %load_ext autoreload

In [2]: # mode 2 reloads all modules every time before executing the typed Python code
   ...: %autoreload 2

In [3]: from foo import some_function

In [4]: some_function()
Out[4]: 42

In [5]: # open foo.py in an editor and change some_function to return 43

In [6]: some_function()
Out[6]: 43
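For intuition, the same edit-and-rerun cycle can be sketched outside IPython with importlib.reload. This is only an approximation of what autoreload automates, not its implementation; the module name foo_autoreload_demo and the temporary directory are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc caching so an edit made within the same second is never masked
# by stale bytecode (both versions of the file have the same size).
sys.dont_write_bytecode = True

module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "foo_autoreload_demo.py"  # hypothetical module
module_path.write_text("def some_function():\n    return 42\n")

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import foo_autoreload_demo as foo

before = foo.some_function()

# "Open the file in an editor" and change the return value
module_path.write_text("def some_function():\n    return 43\n")
foo = importlib.reload(foo)  # autoreload does this step for you

after = foo.some_function()
print(before, after)
```

A plain reload does not update names bound earlier with from foo import some_function; autoreload also upgrades those, which is why the workflow above works with a from-import.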

Programming Sugar

import sys
!{sys.executable} -m pip install foo  # sys.executable points to the python that is running in your kernel 
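The same idea carries over to plain Python scripts through subprocess. The sketch below only builds the install command and checks that a child process started from sys.executable runs; nothing is installed, and pip_install_command is a hypothetical helper name.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


command = pip_install_command("foo")
print(command)

# Launch a child from the same interpreter without installing anything
check = subprocess.run(
    [sys.executable, "-c", "print('same interpreter')"],
    capture_output=True,
    text=True,
)
print(check.stdout.strip())
```

To actually install, pass the command to subprocess.run(command, check=True).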

Search Magic

Use the Search Magic file - no need to pip install. Download and use the file.

In [1]: from search_magic import SearchMagic
In [2]: get_ipython().register_magics(SearchMagic)

In [3]: %create_index
In [4]: %search tesseract
Out[4]: Cell Number -> 2
        Notebook -> similarity.ipynb
        Notebook Execution Number -> 2
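As a rough picture of what such a search involves, the sketch below scans the .ipynb files in a directory for a term and reports the notebook, cell number and execution count. It is not the SearchMagic implementation: search_notebooks is a hypothetical helper, counting cells from 1 is an assumption, and the demo notebook is generated on the fly.

```python
import json
import tempfile
from pathlib import Path


def search_notebooks(directory, term):
    """Return (notebook, cell number, execution count) for matching code cells."""
    hits = []
    for nb_path in sorted(Path(directory).glob("*.ipynb")):
        notebook = json.loads(nb_path.read_text(encoding="utf-8"))
        for cell_number, cell in enumerate(notebook.get("cells", []), start=1):
            if cell.get("cell_type") != "code":
                continue
            source = cell.get("source", "")
            # nbformat stores source as one string or as a list of lines
            if isinstance(source, list):
                source = "".join(source)
            if term.lower() in source.lower():
                hits.append((nb_path.name, cell_number, cell.get("execution_count")))
    return hits


# Build a tiny demo notebook so the example is self-contained
demo_dir = Path(tempfile.mkdtemp())
demo_notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# OCR notes"]},
        {
            "cell_type": "code",
            "execution_count": 2,
            "metadata": {},
            "outputs": [],
            "source": ["import pytesseract\n", "print('tesseract ready')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
(demo_dir / "similarity.ipynb").write_text(json.dumps(demo_notebook), encoding="utf-8")

hits = search_notebooks(demo_dir, "tesseract")
print(hits)
```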

Jupyter Kungfu

?str.replace

Returns:

Docstring:
S.replace(old, new[, count]) -> str

Return a copy of S with all occurrences of substring
old replaced by new.  If the optional argument count is
given, only the first count occurrences are replaced.
Type:      method_descriptor
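Outside IPython, the same docstring is available through the standard library. A small sketch using inspect.getdoc follows; the exact wording of the docstring can differ between Python versions.

```python
import inspect

# Fetch the docstring that the ? shortcut displays
doc = inspect.getdoc(str.replace)
print(doc)

# The optional count argument limits how many occurrences are replaced
result = "a-b-c".replace("-", "+", 1)
print(result)
```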

Sanity Checks

nbdime

Selective Diff/Merge Tool for jupyter notebooks

Install it first:

pip install -e git+https://github.com/jupyter/nbdime#egg=nbdime

This should configure nbdime for Jupyter Notebook automatically. If something doesn’t work, see the nbdime installation instructions.

Then put the following into ~/.jupyter/nbdime_config.json:

{

  "Extension": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "NbDiff": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "NbDiffDriver": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "NbMergeDriver": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "dummy": {}
}

Set "outputs" to true in each section if you also want to see output diffs.
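To see why a source-only diff is quieter, here is a minimal sketch that compares only the code-cell sources of two notebooks with difflib. It is a stand-in for illustration, not how nbdime works internally; code_sources and source_only_diff are hypothetical helpers and the notebooks are toy dictionaries.

```python
import difflib


def code_sources(notebook):
    """Return the source of every code cell, ignoring outputs and metadata."""
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source as one string or as a list of lines
        sources.append("".join(source) if isinstance(source, list) else source)
    return sources


def source_only_diff(old_nb, new_nb):
    """Unified diff of code-cell sources only."""
    old_lines = "\n".join(code_sources(old_nb)).splitlines()
    new_lines = "\n".join(code_sources(new_nb)).splitlines()
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


old_nb = {"cells": [{"cell_type": "code", "source": ["print(6 * 7)"], "outputs": [{"text": "42"}]}]}
rerun_nb = {"cells": [{"cell_type": "code", "source": ["print(6 * 7)"], "outputs": []}]}
edited_nb = {"cells": [{"cell_type": "code", "source": ["print(6 * 7 + 1)"], "outputs": [{"text": "43"}]}]}

rerun_diff = source_only_diff(old_nb, rerun_nb)    # only outputs changed
edited_diff = source_only_diff(old_nb, edited_nb)  # the code itself changed
print(rerun_diff)
print("\n".join(edited_diff))
```

Re-running a notebook without touching the code yields an empty diff here, which is the effect of keeping "outputs" set to false.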

Markdown Printing

Including markdown in your code’s output is very useful. Use this to highlight parameters, performance notes and so on. This enables colors, Bold, etc.

from IPython.display import Markdown, display
def printmd(string, color=None):
    # Wrap the text in a colored span only when a color is given
    if color is not None:
        string = "<span style='color:{}'>{}</span>".format(color, string)
    display(Markdown(string))

printmd("**bold and blue**", color="blue")
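For highlighting parameters, one option is to build a small Markdown table as a string first. The sketch below is pure Python; params_to_markdown is a hypothetical helper and the parameter values are made up. In a notebook, pass the result to display(Markdown(table)).

```python
def params_to_markdown(params):
    """Format a dict of parameters as a Markdown table."""
    lines = ["| Parameter | Value |", "| --- | --- |"]
    for name, value in params.items():
        lines.append("| **{}** | `{}` |".format(name, value))
    return "\n".join(lines)


table = params_to_markdown({"learning_rate": 0.01, "epochs": 10})
print(table)
```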

Find currently running cell

Add this snippet to the start of your notebook. Press Alt+I to find the cell being executed right now. This does not work if you have enabled vim bindings:

%%javascript
// Go to Running cell shortcut
Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Alt-I', {
    help : 'Go to Running cell',
    help_index : 'zz',
    handler : function (event) {
        setTimeout(function() {
            // Find running cell and click the first one
            if ($('.running').length > 0) {
                //alert("found running cell");
                $('.running')[0].scrollIntoView();
            }}, 250);
        return false;
    }
});

Better Mindset

Plotting and Visualization

def show_img(im, figsize=None, ax=None, title=None):
    import matplotlib.pyplot as plt
    if ax is None: fig, ax = plt.subplots(figsize=figsize)
    ax.imshow(im, cmap='gray')
    if title is not None: ax.set_title(title)
    ax.get_xaxis().set_visible(True)
    ax.get_yaxis().set_visible(True)
    return ax
    
def draw_rect(ax, bbox):
    import matplotlib.patches as patches
    x, y, w, h = bbox
    ax.add_patch(patches.Rectangle((x, y), w, h, fill=False, edgecolor='red', lw=2))

show_img is a reusable plotting function that can be extended to plot one-off images as well as to use subplots properly. In the example below, I use a single figure and add new images as subplots using the neater axes.flat syntax:

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(6, 2))
ax = show_img(char_img, ax= axes.flat[0], title = 'char_img_line_cropping:\n'+str(char_img.shape))
ax = show_img(char_bg_mask, ax=axes.flat[1], title = 'Bkg_mask:\n'+str(char_bg_mask.shape))

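# If you are working on an image segmentation task, you can add a red rectangle per bounding box.
# draw_rect takes one (x, y, w, h) box at a time; char_bounding_boxes is assumed to be a list of such boxes.
for bbox in char_bounding_boxes:
    draw_rect(ax, bbox)  # adds a red bounding box for each character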

Please don’t overdo Cell Magic