Chapter 3: Understand IPython #

Noah Gift

Back in 2007, I wrote an article for IBM developerWorks on IPython and SNMP (Simple Network Management Protocol), and this chapter borrows some of those ideas. In the physical data center era, SNMP was a useful way to collect metrics about the CPU, memory, and performance of a particular machine.

What makes IPython, the software engine behind Jupyter Notebook, particularly useful is its interactive nature. An object or variable is “declared,” then it is “played with.” This style is quite helpful.

Here is an example of how that works. A user can import the netsnmp library, declare a variable oid, then pass it into the library and get a result. This interactive loop dramatically speeds up building solutions.

In [1]: import netsnmp

In [2]: oid = netsnmp.Varbind('sysDescr')

In [3]: result = netsnmp.snmpwalk(oid,
   ...:                           Version=2,
   ...:                           DestHost="localhost",
   ...:                           Community="public")

In [4]: result
Out[4]: ('Linux localhost 2.6.18-8.1.14.el5 #1 SMP Thu Sep 27 18:58:54 EDT 2007 i686',)

This is a book about command-line tools, but it is essential to highlight the power of IPython in this endeavor. The click library makes it simple to map a function to a command-line tool. One workflow is to develop the code interactively in IPython, then move it into an editor like Visual Studio Code.
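As a rough sketch of that workflow, the block below starts with a plain function, the kind of thing you would try out at an IPython prompt, and then wraps it in a small command-line interface. To keep it runnable without extra installs, it uses argparse from the standard library as a stand-in for click. The function describe_host and the --name and --uppercase options are hypothetical names invented for this illustration.

```python
import argparse


def describe_host(name: str, uppercase: bool = False) -> str:
    """Return a short status line for a host name (hypothetical example logic)."""
    message = f"checking host: {name}"
    return message.upper() if uppercase else message


def build_parser() -> argparse.ArgumentParser:
    """Map the function's parameters to command-line options."""
    parser = argparse.ArgumentParser(description="Describe a host.")
    parser.add_argument("--name", default="localhost", help="host name to describe")
    parser.add_argument("--uppercase", action="store_true", help="upper-case the result")
    return parser


def main(argv=None) -> str:
    """Parse the given argument list and call the underlying function."""
    args = build_parser().parse_args(argv)
    result = describe_host(args.name, uppercase=args.uppercase)
    print(result)
    return result


# Simulate `python tool.py --name web01 --uppercase` without leaving the session.
output = main(["--name", "web01", "--uppercase"])
```

Because main accepts an explicit argument list, the same function can be exercised from an IPython prompt and later called with no arguments from a script, where argparse falls back to the real command line.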

Using IPython, Jupyter, Colab, and the Python Executable #

Let’s dive into key features of IPython and how they can empower the command-line tool developer.

IPython #

IPython is very similar to Jupyter, but it runs from the terminal:

  • IPython predates Jupyter
  • Both Jupyter and IPython accept !ls -l format to execute shell commands

To run a shell command, put a ! in front.

!ls -l
    total 4
    drwxr-xr-x 1 root root 4096 Nov 21 16:30 sample_data

Note that you can assign the results to a variable.

var = !ls -l
type(var)

The type generated is called an SList.


    IPython.utils.text.SList

Another method you can use on an SList is fields.

#var.fields?

Here is an example of the grep method. The SList object returned by a ! command has both a grep and a fields method.

var.grep("data")

    ['drwxr-xr-x 1 root root 4096 Nov 21 16:30 sample_data']
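To make the idea concrete outside of IPython, here is a plain-Python approximation of what grep and fields do to a list of output lines. It is only a sketch of the behavior, not IPython's implementation; the helper names grep_lines and pick_fields are hypothetical, and the sample lines are copied from the listing above.

```python
import re

# Sample listing copied from the `!ls -l` output shown earlier in this section.
lines = [
    "total 4",
    "drwxr-xr-x 1 root root 4096 Nov 21 16:30 sample_data",
]


def grep_lines(rows, pattern):
    """Keep rows that match a regular expression, similar in spirit to SList.grep."""
    return [row for row in rows if re.search(pattern, row)]


def pick_fields(rows, *indexes):
    """Split rows on whitespace and keep the requested columns, like SList.fields."""
    picked = []
    for row in rows:
        parts = row.split()
        # Skip indexes that a short row does not have instead of raising.
        picked.append(" ".join(parts[i] for i in indexes if i < len(parts)))
    return picked


matches = grep_lines(lines, "data")
names = pick_fields(matches, 0, 8)
print(matches)
print(names)
```

Here column 0 is the permission string and column 8 is the file name, so the second print shows just those two values for the matching row.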

Jupyter Notebook #

Going beyond just IPython, there are many flavors of Jupyter Notebook. A few popular ones:

  • Jupyter
  • JupyterHub
  • Colab
  • Kaggle
  • SageMaker

Hosted Commercial Flavors #

Pure Open Source #

Hybrid Solutions #

Colab Notebook Key Features #

The “flavor” of Jupyter I like the most is Colab. Generally, I like it because it “just works.”

Magic Commands #

Both Jupyter and IPython have “magic” commands. Here are a few examples.

%timeit #

too_many_decimals = 1.912345897

print("built in Python Round")
%timeit round(too_many_decimals, 2)

    built in Python Round
    The slowest run took 28.69 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 468 ns per loop
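Outside a notebook, a similar measurement can be sketched with the timeit module from the standard library. The loop count of 100,000 and the three repeats are arbitrary choices for this illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit

too_many_decimals = 1.912345897
loops = 100_000

# Run the statement `loops` times per trial and repeat the trial 3 times.
trials = timeit.repeat(
    "round(too_many_decimals, 2)",
    globals={"too_many_decimals": too_many_decimals},
    number=loops,
    repeat=3,
)

# Report the fastest trial, converted to nanoseconds per call.
best_ns = min(trials) / loops * 1e9
print(f"{loops} loops, best of 3: {best_ns:.0f} ns per loop")
```

Taking the minimum of the trials mirrors the "best of 3" figure in the output above, since slower runs usually reflect interference from other processes rather than the code itself.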

%alias #

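The %alias command is a way to create shortcuts. With automagic enabled (the IPython default), line magics such as alias also work without the leading %.

alias lscsv ls -l sample_data/*.csv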
lscsv
    -rw-r--r-- 1 root root   301141 Oct 25 16:58 sample_data/california_housing_test.csv
    -rw-r--r-- 1 root root  1706430 Oct 25 16:58 sample_data/california_housing_train.csv
    -rw-r--r-- 1 root root 18289443 Oct 25 16:58 sample_data/mnist_test.csv
    -rw-r--r-- 1 root root 36523880 Oct 25 16:58 sample_data/mnist_train_small.csv

To learn more, you can see the Magic Command references.

%who #

To list variables, an excellent command to know is %who. It shows which variables are defined in the current session.

var1=1
who
    drive     os     too_many_decimals     var     var1
too_many_decimals
    1.912345897

%%writefile #

One useful way to write scripts (perhaps a whole command-line tool) is to use the %%writefile magic command.

%%writefile magic_stuff.py
import pandas as pd
df = pd.read_csv(
    "https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True) #drop columns we do not need
df = df.drop(df.index[[1,11877]]) #drop outlier
df.rename(index=str, columns={"reconstructed_energy": "energy_100g"}, inplace=True)
print(df.head())
    Writing magic_stuff.py
cat magic_stuff.py
    import pandas as pd
    df = pd.read_csv(
        "https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
    df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True) #drop columns we do not need
    df = df.drop(df.index[[1,11877]]) #drop outlier
    df.rename(index=str, columns={"reconstructed_energy": "energy_100g"}, inplace=True)
    print(df.head())
!python magic_stuff.py
       fat_100g  ...                         product
    0     28.57  ...  Banana Chips Sweetened (Whole)
    2     57.14  ...          Organic Salted Nut Mix
    3     18.75  ...                  Organic Muesli
    4     36.67  ...                   Zen Party Mix
    5     18.18  ...            Cinnamon Nut Granola
    
    [5 rows x 7 columns]
!pip install -q pylint
!pylint magic_stuff.py
    ************* Module magic_stuff
    magic_stuff.py:3:0: C0301: Line too long (109/100) (line-too-long)
    magic_stuff.py:4:0: C0301: Line too long (110/100) (line-too-long)
    magic_stuff.py:5:24: C0326: Exactly one space required after comma
    df = df.drop(df.index[[1,11877]]) #drop outlier
                            ^ (bad-whitespace)
    magic_stuff.py:7:0: C0304: Final newline missing (missing-final-newline)
    magic_stuff.py:1:0: C0114: Missing module docstring (missing-module-docstring)
    magic_stuff.py:2:0: C0103: Constant name "df" doesn't conform to UPPER_CASE naming style (invalid-name)
    magic_stuff.py:5:0: C0103: Constant name "df" doesn't conform to UPPER_CASE naming style (invalid-name)
    
    ------------------------------------
    Your code has been rated at -1.67/10
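A negative rating can look surprising. The sketch below assumes the default evaluation formula documented for the Pylint 2.x series, 10.0 - ((5 * error + warning + refactor + convention) / statement) * 10, and plugs in the counts from the report above: seven convention (C) messages against six statements in magic_stuff.py. More recent Pylint releases appear to clamp the result at zero, so treat this as an explanation of this particular output rather than a description of every version.

```python
def pylint_score(error, warning, refactor, convention, statement):
    """Apply the assumed Pylint 2.x default evaluation formula."""
    penalty = 5 * error + warning + refactor + convention
    return 10.0 - (penalty / statement) * 10


# magic_stuff.py: 7 convention messages reported, 6 statements in the file
# (import, read_csv, drop, drop, rename, print).
score = pylint_score(error=0, warning=0, refactor=0, convention=7, statement=6)
print(f"Your code has been rated at {score:.2f}/10")
```

With so few statements, each message costs well over a point, which is why a seven-line script can drop below zero.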

Bash #

Another useful tool for automation is the %%bash cell magic. It runs the entire cell as a Bash script.

%%bash
uname -a
ls
ps
    Linux bee9b4559c13 4.14.137+ #1 SMP Thu Aug 8 02:47:02 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
    gdrive
    magic_stuff.py
    sample_data
        PID TTY          TIME CMD
          1 ?        00:00:00 run.sh
          9 ?        00:00:01 node
         24 ?        00:00:03 jupyter-noteboo
        119 ?        00:00:00 tail
        127 ?        00:00:06 python3
        298 ?        00:00:00 bash
        299 ?        00:00:00 drive
        300 ?        00:00:00 grep
        383 ?        00:00:01 drive
        394 ?        00:00:00 fusermount <defunct>
        450 ?        00:00:00 bash
        451 ?        00:00:00 tail
        452 ?        00:00:00 grep
        551 ?        00:00:00 bash
        554 ?        00:00:00 ps
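When the same step has to live in a regular script rather than a notebook cell, one option is the subprocess module from the standard library. This is a minimal sketch that assumes a Unix-like system where the uname command is available; the variable name kernel_info is just an illustration.

```python
import subprocess

# Run a command and capture its text output, roughly what `var = !uname -a` does.
result = subprocess.run(
    ["uname", "-a"],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError on a non-zero exit status
)
kernel_info = result.stdout.strip()
print(kernel_info)
```

Passing the command as a list avoids shell quoting issues, at the cost of losing shell features such as pipes and globbing that %%bash provides.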
!uname -a
    Linux bee9b4559c13 4.14.137+ #1 SMP Thu Aug 8 02:47:02 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux

Python2 #

Yet another useful trick is the ability to run Python 2 code with the %%python2 cell magic, provided a python2 interpreter is installed.

%%python2
print "old school"

    old school

Python executable #

You can use the python executable to run scripts, start a REPL, or run statements directly with the -c flag, using semicolons to string multiple statements together.

!python -c "import os;print(os.listdir())"
    ['.config', 'magic_stuff.py', 'gdrive', 'sample_data']
!ls -l
    total 712
    drwx------ 4 root root   4096 Sep  9 16:16 gdrive
    -rw-r--r-- 1 root root    407 Sep  9 16:20 magic_stuff.py
    -rw-r--r-- 1 root root 712814 Sep  9 16:25 pytorch.pptx
    drwxr-xr-x 1 root root   4096 Aug 27 16:17 sample_data
!pip install -q yellowbrick
# sys.argv holds the command-line arguments passed to the running program
import sys; sys.argv
    ['/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py',
     '-f',
     '/root/.local/share/jupyter/runtime/kernel-559953d0-ac45-4a1a-a716-8951070eaab5.json']
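Inside a notebook, sys.argv only shows how the kernel itself was launched. To see how a command-line tool receives its own arguments, the sketch below starts a child interpreter with -c and passes it two extra arguments. The --host and localhost values are made up for the illustration.

```python
import subprocess
import sys

# Child program: print every argument after the program name, space separated.
child_code = "import sys; print(*sys.argv[1:])"

result = subprocess.run(
    [sys.executable, "-c", child_code, "--host", "localhost"],
    capture_output=True,
    text=True,
    check=True,
)

# Everything after the -c code string is handed to the child as sys.argv[1:].
received = result.stdout.split()
print(received)
```

Libraries such as click and argparse read this same list, so anything a user types after the tool name ends up here before it is parsed.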

In summary, IPython and Jupyter are useful tools for developing Python code. They come in especially handy when testing out automation. Put a little IPython into your command-line tool kit, and you won’t regret it.