Chapter03 Understand Ipython

    Chapter 3: Understand IPython #

    Noah Gift

    Back in 2007, I wrote an article for IBM developerWorks on IPython and SNMP (Simple Network Management Protocol), and this chapter borrows some of those ideas. In the physical data center era, SNMP was a useful way to collect metrics about the CPU, memory, and performance of a particular machine.

    What makes IPython, the software engine behind Jupyter Notebook, particularly useful is its interactive nature. An object or variable is “declared,” then it is “played with.” This style is quite helpful.

    Here is an example of how that works. A user can import the netsnmp library, declare a variable oid, then pass it into the library and get a result. This interactive loop dramatically speeds up building solutions.

    In [1]: import netsnmp

    In [2]: oid = netsnmp.Varbind('sysDescr')

    In [3]: result = netsnmp.snmpwalk(oid,
       ...:                           Version=2,
       ...:                           DestHost="localhost",
       ...:                           Community="public")

    In [4]: result
    Out[4]: ('Linux localhost 2.6.18-8.1.14.el5 #1 SMP Thu Sep 27 18:58:54 EDT 2007 i686',)
    

    This book is about command-line tools, but it is essential to highlight how useful IPython is in building them. The click library makes it simple to map a function to a command-line tool. One workflow is to develop the code interactively in IPython, then move it into an editor like Visual Studio Code.
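
    As a rough sketch of that workflow, the hypothetical function shout below stands in for logic first worked out at the IPython prompt and then wrapped as a command-line tool. It uses the standard library's argparse rather than click so that it runs without extra dependencies; with click, the same wrapping is typically done with the @click.command() decorator. The names shout, build_parser, and main are illustrative assumptions, not from the original text.

```python
import argparse


def shout(text, times=1):
    """Logic first explored interactively (hypothetical example)."""
    return " ".join([text.upper() + "!"] * times)


def build_parser():
    """Map the function's parameters to command-line arguments."""
    parser = argparse.ArgumentParser(description="Shout some text.")
    parser.add_argument("text", help="text to shout")
    parser.add_argument("--times", type=int, default=1, help="repeat count")
    return parser


def main(argv=None):
    """Parse arguments (sys.argv[1:] when argv is None) and return the result."""
    args = build_parser().parse_args(argv)
    return shout(args.text, args.times)


# Called with an explicit argument list, as you might while experimenting:
print(main(["hello", "--times", "2"]))
```

    Because main accepts an explicit argument list, the same function can be exercised from an IPython session and from a real shell invocation.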

    Using IPython, Jupyter, Colab and Python executable #

    Let’s dive into key features of IPython and how they can empower the command-line tool developer.

    IPython #

    IPython is very similar to Jupyter, but it runs from the terminal:

    • IPython predates Jupyter
    • Both Jupyter and IPython accept the !ls -l format to execute shell commands

    To run a shell command, put a ! in front.

    !ls -l
    
        total 4
        drwxr-xr-x 1 root root 4096 Nov 21 16:30 sample_data
    

    Note that you can assign the results to a variable.

    var = !ls -l
    type(var)
    

    The type generated is called an SList.

    
        IPython.utils.text.SList
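
    Outside IPython, a roughly similar result can be had with the standard library. The sketch below is an approximation, not how IPython implements the ! syntax. The helper name run_lines is hypothetical, and the child process is the current Python interpreter (instead of ls) only so the output is predictable on any platform.

```python
import subprocess
import sys


def run_lines(args):
    """Run a command and return its standard output as a list of lines."""
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout.splitlines()


# Roughly what `var = !some-command` gives you: one list item per output line.
var = run_lines([sys.executable, "-c", "print('alpha'); print('beta')"])
print(var)
```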
    

    One Python method you can use on an SList is fields.

    #var.fields?
    

    Here is an example of the grep method. The SList object returned by a command prefixed with ! has both a grep and a fields method.

    var.grep("data")
    
    
        ['drwxr-xr-x 1 root root 4096 Nov 21 16:30 sample_data']
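
    To make the behavior concrete, here is a plain-Python approximation of what grep and fields do on a list of lines. It is a sketch based on the documented behavior of SList (a case-insensitive regular-expression search for grep, and whitespace-split columns joined by a space for fields), not IPython's actual implementation. The function names and the second sample line are made up for illustration.

```python
import re

lines = [
    "drwxr-xr-x 1 root root 4096 Nov 21 16:30 sample_data",
    "-rw-r--r-- 1 root root  407 Nov 21 16:31 notes.txt",  # hypothetical entry
]


def grep(lines, pattern):
    """Keep the lines that match a case-insensitive regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if regex.search(line)]


def fields(lines, *indexes):
    """Pick whitespace-separated columns from each line, awk-style."""
    picked = []
    for line in lines:
        columns = line.split()
        # Skip indexes that do not exist on this line.
        chosen = [columns[i] for i in indexes if i < len(columns)]
        picked.append(" ".join(chosen))
    return picked


print(grep(lines, "data"))
print(fields(lines, 0, 8))
```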
    
    

    Jupyter Notebook #

    Going beyond just IPython, there are many flavors of Jupyter Notebook. A few popular ones:

    • Jupyter
    • JupyterHub
    • Colab
    • Kaggle
    • Sagemaker

    Hosted Commercial Flavors #

    Pure Open Source #

    Hybrid Solutions #

    Colab Notebook Key Features #

    The “flavor” of Jupyter I like the most is Colab. Generally, I like it because it “just works.”

    Magic Commands #

    Both Jupyter and IPython have “magic” commands. Here are a few examples.

    %timeit #
    too_many_decimals = 1.912345897
    
    print("built in Python Round")
    %timeit round(too_many_decimals, 2)
    
    
        built in Python Round
        The slowest run took 28.69 times longer than the fastest. This could mean that an intermediate result is being cached.
        1000000 loops, best of 3: 468 ns per loop
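
    A comparable measurement can be made in an ordinary script with the standard library's timeit module. This is a minimal sketch: the loop and repeat counts are arbitrary choices, and the timings will differ from machine to machine.

```python
import timeit

too_many_decimals = 1.912345897

# Run round() 100,000 times per trial, for 3 trials; keep the fastest trial.
loops = 100_000
trials = timeit.repeat(
    "round(too_many_decimals, 2)",
    globals={"too_many_decimals": too_many_decimals},
    number=loops,
    repeat=3,
)
best_ns = min(trials) / loops * 1e9
print(f"{loops} loops, best of 3: {best_ns:.0f} ns per loop")
```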
    
    %alias #

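    The %alias command creates shortcuts for shell commands. With automagic enabled (the default), the leading % can be omitted, as in the example here.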

    alias lscsv ls -l sample_data/*.csv 
    
    lscsv
    
        -rw-r--r-- 1 root root   301141 Oct 25 16:58 sample_data/california_housing_test.csv
        -rw-r--r-- 1 root root  1706430 Oct 25 16:58 sample_data/california_housing_train.csv
        -rw-r--r-- 1 root root 18289443 Oct 25 16:58 sample_data/mnist_test.csv
        -rw-r--r-- 1 root root 36523880 Oct 25 16:58 sample_data/mnist_train_small.csv
    

    To learn more, you can see the Magic Command references.

    %who #

    To list the variables defined in the current session, use %who.

    var1=1
    
    who
    
        drive     os     too_many_decimals     var     var1     
    
    too_many_decimals
    
        1.912345897
    
    %%writefile #

    One useful way to write scripts (perhaps a whole command-line tool) is to use the %%writefile magic command, which saves the contents of a cell to a file.

    %%writefile magic_stuff.py
    import pandas as pd
    df = pd.read_csv(
        "https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
    df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True) #drop columns we do not need
    df = df.drop(df.index[[1,11877]]) #drop outlier
    df.rename(index=str, columns={"reconstructed_energy": "energy_100g"}, inplace=True)
    print(df.head())
    
        Writing magic_stuff.py
    
    cat magic_stuff.py
    
        import pandas as pd
        df = pd.read_csv(
            "https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
        df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True) #drop columns we do not need
        df = df.drop(df.index[[1,11877]]) #drop outlier
        df.rename(index=str, columns={"reconstructed_energy": "energy_100g"}, inplace=True)
        print(df.head())
    
    !python magic_stuff.py
    
           fat_100g  ...                         product
        0     28.57  ...  Banana Chips Sweetened (Whole)
        2     57.14  ...          Organic Salted Nut Mix
        3     18.75  ...                  Organic Muesli
        4     36.67  ...                   Zen Party Mix
        5     18.18  ...            Cinnamon Nut Granola
        
        [5 rows x 7 columns]
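
    The write-then-run cycle above can also be scripted without magics. In this sketch, pathlib plays the role of %%writefile and subprocess plays the role of !python. The file name hello_tool.py and its contents are placeholders, and a temporary directory is used so nothing is left behind.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

source = 'print("hello from a generated script")\n'

with tempfile.TemporaryDirectory() as workdir:
    script = Path(workdir) / "hello_tool.py"
    script.write_text(source)  # like %%writefile hello_tool.py

    # Like `!python hello_tool.py`, but with the output captured as a string.
    completed = subprocess.run(
        [sys.executable, str(script)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )

print(completed.stdout, end="")
```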
    
    !pip install -q pylint
    
    
    !pylint magic_stuff.py
    
        ************* Module magic_stuff
        magic_stuff.py:3:0: C0301: Line too long (109/100) (line-too-long)
        magic_stuff.py:4:0: C0301: Line too long (110/100) (line-too-long)
        magic_stuff.py:5:24: C0326: Exactly one space required after comma
        df = df.drop(df.index[[1,11877]]) #drop outlier
                                ^ (bad-whitespace)
        magic_stuff.py:7:0: C0304: Final newline missing (missing-final-newline)
        magic_stuff.py:1:0: C0114: Missing module docstring (missing-module-docstring)
        magic_stuff.py:2:0: C0103: Constant name "df" doesn't conform to UPPER_CASE naming style (invalid-name)
        magic_stuff.py:5:0: C0103: Constant name "df" doesn't conform to UPPER_CASE naming style (invalid-name)
        
        ------------------------------------
        Your code has been rated at -1.67/10
    
    Bash #

    Another useful tool for automation is the %%bash cell magic, which runs the whole cell as a bash script.

    %%bash
    uname -a
    ls
    ps
    
        Linux bee9b4559c13 4.14.137+ #1 SMP Thu Aug 8 02:47:02 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
        gdrive
        magic_stuff.py
        sample_data
            PID TTY          TIME CMD
              1 ?        00:00:00 run.sh
              9 ?        00:00:01 node
             24 ?        00:00:03 jupyter-noteboo
            119 ?        00:00:00 tail
            127 ?        00:00:06 python3
            298 ?        00:00:00 bash
            299 ?        00:00:00 drive
            300 ?        00:00:00 grep
            383 ?        00:00:01 drive
            394 ?        00:00:00 fusermount <defunct>
            450 ?        00:00:00 bash
            451 ?        00:00:00 tail
            452 ?        00:00:00 grep
            551 ?        00:00:00 bash
            554 ?        00:00:00 ps
    
    !uname -a
    
    Linux bee9b4559c13 4.14.137+ #1 SMP Thu Aug 8 02:47:02 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
    
    Python2 #

    Yet another useful trick is the ability to run Python 2 code with the %%python2 cell magic, provided a python2 interpreter is installed.

    %%python2
    print "old school"
    
    
        old school
    

    Python executable #

    The python executable can run scripts, start a REPL, and even run Python statements passed with the -c flag, using semicolons to string multiple statements together.

    !python -c "import os;print(os.listdir())"
    
        ['.config', 'magic_stuff.py', 'gdrive', 'sample_data']
    
    !ls -l
    
    total 712
    drwx------ 4 root root   4096 Sep  9 16:16 gdrive
    -rw-r--r-- 1 root root    407 Sep  9 16:20 magic_stuff.py
    -rw-r--r-- 1 root root 712814 Sep  9 16:25 pytorch.pptx
    drwxr-xr-x 1 root root   4096 Aug 27 16:17 sample_data
    
    !pip install -q yellowbrick
    
    #this is how you capture input to a program
    import sys;sys.argv
    
        ['/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py',
         '-f',
         '/root/.local/share/jupyter/runtime/kernel-559953d0-ac45-4a1a-a716-8951070eaab5.json']
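
    Inside a notebook, sys.argv reflects how the kernel was launched, as the output above shows. To see what a command-line tool would receive instead, the sketch below starts a child interpreter with -c and two made-up arguments (alpha and --beta) and prints the child's sys.argv.

```python
import subprocess
import sys

# Statements after -c are separated by a semicolon; the rest are arguments.
output = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(sys.argv)", "alpha", "--beta"],
    universal_newlines=True,
)
print(output, end="")  # ['-c', 'alpha', '--beta']
```

    With -c, the first element of sys.argv is the string '-c' rather than a script path, and everything after the statement string is passed through untouched.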
    
    

    In summary, IPython and Jupyter are useful tools for developing any Python code. They come in especially handy for testing out automation. Put a little IPython into your command-line toolkit, and you won’t regret it.