# Introducing Python Workshop #
### Session II - Python Basics ###

We will work through an example that uses many features of the Python programming language and explain each feature as it comes up. 


To get started, we will use the text at http://composingprograms.com/shakespeare.txt. This page contains the text of William Shakespeare's 37 plays. We will read the text of this page into a Python "list" and perform some basic operations on it. 

__Imports__

In order to read the content of the webpage, we will use Python's "urllib" package, which has several modules for working with URLs. Specifically, we will use the "urlopen" function from the "urllib.request" module, which can open a URL and read its contents. We can list the names (submodules, functions, classes) that a package or module provides using the __dir__ function. Note that "dir" on a package shows only the submodules that have already been imported.

In [6]:
import urllib.request

In [11]:
dir(urllib)

In [10]:
dir(urllib.request)

<br>

Before we can use the functions of a module, we need to import either the entire module or just the function we need. Here we import the function "urlopen" from the "urllib.request" module using Python's __import statement__. 


For more information on "urllib", see https://docs.python.org/3/library/internet.html

In [9]:
from urllib.request import urlopen
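
The two import styles can be compared side by side. The sketch below uses the standard-library "math" module purely as a stand-in (it is not part of the workshop example), so that it runs without a network connection.

```python
# Style 1: import the whole module, then reach its functions with the dot operator.
import math
root_via_module = math.sqrt(16)

# Style 2: import just one function and call it by its bare name.
from math import sqrt
root_via_name = sqrt(16)

print(root_via_module, root_via_name)   # both are 4.0
print('sqrt' in dir(math))              # dir() lists the names a module provides
```

The first style keeps it obvious where a function comes from; the second is shorter when one function is used many times.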

<br> 
__Statements and Expressions__

Statements and expressions are the building blocks of computer programs. __Statements__ carry out some action and __expressions__ typically describe computations that produce a value. 

Here is an example of an __assignment statement__. It creates a variable named "shakespeare" and binds it to the value of the expression to the right of the __= operator__, which here is the text of a URL. The type of this value is string. 

In [18]:
shakespeare = 'http://composingprograms.com/shakespeare.txt'

<br> 
__Values and Types__

A program works with values. Values can be numbers, text, and other kinds of data. Every value belongs to a type. Numbers are usually of type integer or float, and text is of type string. 

In our example, the URL is a piece of text and therefore belongs to type string. Python provides the "type" function to check the type of a value or variable. 

In [None]:
print( type(shakespeare) )
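
As a quick illustration, "type" can be applied to a few different kinds of values. The variable names and values below are made up for this sketch and are not part of the Shakespeare example.

```python
n_plays = 37                   # a whole number -> int
ratio = 3.14                   # a number with a decimal point -> float
file_name = 'shakespeare.txt'  # text in quotes -> str

print(type(n_plays))
print(type(ratio))
print(type(file_name))
```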

<br> 
__Strings__

Strings are sequences of characters. Strings are enclosed in either single or double quotes. Two strings can be combined using the __"+ operator"__, and a string can be repeated using the __"* operator"__. 

In [None]:
print( shakespeare + '  -  ' + shakespeare)

In [None]:
print( shakespeare * 3)
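
The same two operators are easier to read on short strings. The strings below are illustrative only.

```python
first = 'to be'
second = "or not to be"           # single and double quotes are interchangeable

combined = first + ' ' + second   # + joins strings end to end
repeated = 'ha' * 3               # * repeats a string

print(combined)    # to be or not to be
print(repeated)    # hahaha
```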

<br> 
__Variables__

A powerful feature of programming languages is the ability to manipulate variables. A variable is a name that refers to a value. There are rules for naming a variable. For example, variable names cannot start with a digit or contain a space. Violating a naming rule produces a SyntaxError, because Python cannot parse the statement. A NameError is a different and very common error: it occurs when a program uses a name that has not been assigned a value yet. 

It is a good practice to have descriptive but concise variable names. 

In [None]:
# This line is invalid on purpose: a name cannot start with a digit (SyntaxError)
1shakespeare = 'http://composingprograms.com/shakespeare.txt'
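
The difference between an undefined name and an invalid name can be checked directly. This is a minimal sketch: "undefined_name" is a deliberately unassigned, hypothetical name, and the built-in "compile" function is used only so that the parse error can be caught instead of stopping the cell.

```python
name_error_message = ''
syntax_error_caught = False

# Using a name that was never assigned raises NameError when the line runs.
try:
    print(undefined_name)
except NameError as err:
    name_error_message = str(err)
    print('NameError:', name_error_message)

# An invalid name is rejected earlier, when the code is parsed.
# compile() runs that parse step on a string so the error can be caught.
try:
    compile("1shakespeare = 'text'", '<example>', 'exec')
except SyntaxError as err:
    syntax_error_caught = True
    print('SyntaxError:', err.msg)
```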

<br> We will now use the "urlopen" function that we imported earlier to get the actual content of this webpage and assign it to the same variable name, "shakespeare". 

In [None]:
#shakespeare = urlopen(shakespeare)
shakespeare = urlopen('http://composingprograms.com/shakespeare.txt')

<br> The variable "shakespeare" no longer refers to a string. 

We have reassigned the variable to the value of a new expression, which calls the function "urlopen" on the URL. The commented-out line would do the same thing using the previous value of the variable "shakespeare". 

In [None]:
print( type(shakespeare) )

<br>

The new value of the variable "shakespeare" belongs to the class "http.client.HTTPResponse". We will not get into the details of classes in this workshop. Briefly, each class instance can have attributes attached to it. Class instances can also have __methods__ attached to them. Methods are functions that belong to a class and are called on its instances using the __"dot operator"__ (just like we did with the modules before). 

The variable "shakespeare" now gives access not only to the content of the webpage but also to everything that comes attached to the output of the "urlopen" function, such as the methods that are specific to it. A list of available attributes and methods for an object can be obtained with the "dir" function.

Note that if two separate classes have a method with the same name, the two methods may perform different operations. 

In [15]:
dir(shakespeare)

In [23]:
#dir(shakespeare.read())

In [33]:
#dir(shakespeare.read().decode())

In [34]:
#dir(shakespeare.read().decode().split())
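
To make the last point concrete, the sketch below uses a made-up sentence (not the downloaded text). Both the string class and the list class have a method called "count", but each does its own kind of counting.

```python
sentence = 'now is the winter of our discontent now'
word_list = sentence.split()

# Names without a leading underscore are the public methods and attributes.
public_str_methods = [name for name in dir(sentence) if not name.startswith('_')]
print(public_str_methods[:5])

# Same method name, different behavior:
print(sentence.count('o'))     # 5: occurrences of the character 'o' in the string
print(word_list.count('o'))    # 0: no list element is exactly equal to 'o'
```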

<br> 
__Functions__

Variables are manipulated using functions (and operators). The name of a function is bound to a compound operation. A URL is one piece of data and the content found at that URL is another piece of data. The process of getting the content from a URL is complex, and the function "urlopen" hides that complexity from us. The function takes the URL, runs some operations with it and returns an output that holds the content of the webpage, together with methods that can be applied to this output. Below, the methods "read", "decode" and "split" are chained with the dot operator to turn that output into a list of words. 

In [35]:
words = shakespeare.read().decode().split()

In [None]:
print( type(words) )
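
The chain of method calls can be unpacked one step at a time. The sketch below works offline on a short, made-up bytes value that stands in for what "shakespeare.read()" returns; the intermediate variable names are illustrative.

```python
# Stand-in for the raw content returned by read(): bytes, not text.
raw_bytes = b'To be , or not to be\nthat is the question'

text = raw_bytes.decode()      # bytes -> str (UTF-8 by default)
sample_words = text.split()    # str -> list, split on any whitespace

print(type(raw_bytes), type(text), type(sample_words))
print(sample_words)
```

Each method is applied to the result of the previous one, which is why the calls can be written as a single chain.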

<br> 
__List__ 

A list is a mutable, ordered sequence of items. It can be indexed, sliced and changed. Each element can be accessed using its position in the list. 

In our example, "words" is a variable that refers to a list of the words in the Shakespearean text, including the title of the first play, "A MIDSUMMER_NIGHT'S DREAM". Let's separate this title from the rest of the text. 

In [None]:
title = words[0:3]
print( title )

In [None]:
body = words[3:]

print( body[:10])

<br>

__Indexing Operator__

The indexing operator ([ ]) selects one or more elements from a sequence. Each element of a sequence has a number, called its position or index. An index must be an integer and is written inside a pair of square brackets. 

The operation that extracts a subsequence is called __slicing__. When selecting more than one element, the __": operator"__ is used with an integer before it to indicate where to start and an integer after it to indicate where to stop. The element at the stop index is not included.

Python indexing starts at 0 and ends at (n-1), where n is the number of items in the sequence. Using an index of n or more raises an IndexError. The function "len" can be used to get the number of items in a list. 

Python also supports negative indexing, which counts from the end of the sequence: -1 refers to the last element, -2 to the one before it, and so on.

In [None]:
n_words = len(body)
print( n_words )

In [None]:
print( body[980634])

In [None]:
print( body[980633])

In [None]:
print( body[-1])

In [None]:
sub_body = body[:10]
print( sub_body)

In [None]:
print( sub_body[:-2])

In [None]:
print( sub_body[::2])    # gives every 2nd element

In [None]:
print( sub_body[::-1])   # all elements in reverse order

In [None]:
print( sub_body[::-2])   # every 2nd element in reverse order
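
The indexing and slicing rules are easier to verify on a list small enough to check by eye. The list below is illustrative and not taken from the text.

```python
letters = ['a', 'b', 'c', 'd', 'e']

print(letters[0])      # a  (first element)
print(letters[-1])     # e  (last element)
print(letters[1:3])    # ['b', 'c']  (the stop index 3 is excluded)
print(letters[::2])    # ['a', 'c', 'e']
print(letters[::-1])   # ['e', 'd', 'c', 'b', 'a']

# The largest valid index is len(letters) - 1; going past it raises IndexError.
try:
    letters[len(letters)]
except IndexError:
    print('index', len(letters), 'is out of range')
```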

<br>

__Arithmetic Operator__

Just like strings, lists can be combined with "+" and repeated with "\*". These symbols belong to the group of arithmetic operators. 

In [None]:
new_list = title + sub_body
print(new_list)

In [None]:
numeric_list = [1, 2, 3]
print(numeric_list * 3)

Arithmetic operators behave differently on lists and strings than on integers and floats. 

In [None]:
print( 5 + 5 )          # addition
print( 5.5 - 6.5 )      # subtraction
print( 5 * 3 )          # multiplication
print( 5.5 // 1.25 )    # floor division

<br> __Iterations__

Iterations are useful for manipulating each item in a list. For example, if we want to multiply each element of a list by 3, we can do that using the __for statement__, which makes iteration very easy. Note that the behavior of the operator will be different for different types of elements in a list. 

In [None]:
# "+" builds a new list; title.append(...) would instead change "title" in place and return None
mixed_list = title + numeric_list

In [None]:
for element in mixed_list:
    print(element*3)
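
The difference between "+" and the "append" method is a common source of confusion, so here is a small side-by-side sketch. The list "title_words" is an illustrative stand-in for "title".

```python
title_words = ['A', 'DREAM']
numbers = [1, 2, 3]

joined = title_words + numbers    # new list: ['A', 'DREAM', 1, 2, 3]

nested = list(title_words)        # copy, so title_words itself is untouched
result = nested.append(numbers)   # append works in place and returns None

print(joined)
print(nested)                     # ['A', 'DREAM', [1, 2, 3]]
print(result)                     # None

for element in joined:
    print(element * 3)            # strings are repeated, numbers are multiplied
```

Assigning the result of "append" to a variable therefore leaves that variable holding None, which cannot be iterated over.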

<br>

__Python Syntax__

Syntax refers to the structure of the language. 

A statement does not need a semicolon or any other symbol to end it; the end of the line ends the statement. A semicolon can, however, be used to put two separate statements on the same line. 

Indentation, i.e. the whitespace at the start of a line, matters in Python. A block of code is a set of statements that are treated as a unit, and Python denotes blocks by indentation. For example, in compound statements such as loops and conditionals, the body usually starts on a new line after the colon and must be indented consistently; four spaces per level is the usual convention. Whitespace __within__ a line, however, does not matter.  

Comments can be added to code using the hash symbol (#). Anything written after # on a line is ignored by the interpreter. Python has no dedicated syntax for multi-line comments; each comment line starts with its own #. 

In [None]:
sub_body_lowercase = []
for word in sub_body:
    sub_body_lowercase.append( word.lower() )
    #print(sub_body_lowercase)
# print(sub_body_lowercase)
sub_body_lowercase
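
These syntax rules can be seen together in a few lines. The variable names below are illustrative.

```python
count = 0; total = 0          # two statements on one line (legal, but rarely recommended)

for number in [1, 2, 3]:
    count = count + 1         # indented: runs once per item
    total = total + number    # same indentation: part of the same block
print(count, total)           # not indented: runs once, after the loop ends

x   =   1 +    2              # extra spaces inside a line do not change the result
print(x)
```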

<br>

__Membership operator__

"in" and "not in" are membership operators. They test whether a value is a member of a sequence or other collection.

In [None]:
2 in numeric_list

In [None]:
4 in numeric_list

In [None]:
4 not in numeric_list
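
Membership tests also work on strings, where they behave slightly differently than on lists. The values below are illustrative.

```python
vowels = 'aeiou'

print('e' in vowels)            # True
print('sum' in 'midsummer')     # True: for strings, "in" looks for substrings
print('z' not in vowels)        # True
print([1, 2] in [1, 2, 3])      # False: for lists, "in" compares whole elements
```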

<br> __Conditionals__

In order to write useful programs, we almost always need the ability to check conditions and change the behavior of the program accordingly. In Python, conditionals are created with the __if statement__. 

Conditionals are often combined with __comparison operators__. In the conditional below, we use "==" (is equal to), which is an example of a comparison operator. Other examples include "!=" (is not equal to), ">=" (is greater than or equal to), etc.

In [None]:
my_string = 'some text'
#my_string = 7

if type(my_string) == str:
    print( str(my_string) + ' is of type string.')
else:
    print( str(my_string) + ' is not of type string.')

<br>
Iterations and conditionals can be combined to manipulate each element of a list conditionally. 

In [None]:
for element in mixed_list:
    if element in numeric_list:
        print('Yes')
    else:
        print('No')
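
When more than two outcomes are needed, "elif" adds further branches between "if" and "else". The sketch below also uses the built-in "isinstance" function, a common alternative to comparing "type(...)" with "=="; the values and labels are made up for illustration.

```python
labels = []

for value in ['DREAM', 7, 2.5]:
    if isinstance(value, str):
        label = 'string'
    elif value >= 5:                 # only reached when value is not a string
        label = 'number, at least 5'
    else:
        label = 'number, below 5'
    labels.append(label)
    print(value, '->', label)
```

The branches are tested from top to bottom and only the first one whose condition is true runs.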

<br>

__List Comprehension__

A list comprehension is a more concise way to build a new list by transforming, and optionally filtering, the items of an existing sequence without writing an explicit for loop.


In [None]:
doubled_list = [e*2 for e in numeric_list]
squared_list = [e**2 for e in numeric_list]

List comprehensions and conditionals can also be combined for more list operations. Code written with list comprehensions is generally compact and clear. 

In [None]:
new_list = [x + 23 for x in numeric_list if x > 2]
new_list
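
To see what a comprehension with a condition does, it helps to write the same operation as an explicit loop. The sketch below redefines "numeric_list" so that it runs on its own; "loop_result" and "comp_result" are illustrative names.

```python
numeric_list = [1, 2, 3]

# Explicit loop
loop_result = []
for x in numeric_list:
    if x > 2:
        loop_result.append(x + 23)

# Equivalent list comprehension
comp_result = [x + 23 for x in numeric_list if x > 2]

print(loop_result, comp_result)   # [26] [26]
```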

<br> __Dictionary__

A dictionary is a mutable collection of key-value pairs. Each key must be unique, but values do not have to be. Since Python 3.7, dictionaries keep their items in insertion order, but items are looked up by key, not by position. Each key is separated from its value by ":" and each key-value pair is separated from the next with a comma. To access a given element in a dictionary, we must refer to it by its key. Dictionaries are written using curly brackets. "{ }" creates an empty dictionary.

In [None]:
my_dict = { }
print( type(my_dict) )

In [None]:
my_dict = { 'Name':'John', 'Age':35, 'Hobbies':['basketball', 'football', 'swimming']}
my_dict['Hobbies']

In [None]:
my_dict['Last Name'] = "Smith"
my_dict
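
A few more everyday dictionary operations are sketched below on a small, made-up dictionary named "person": overwriting a value, reading a key that may be missing, and looping over key-value pairs.

```python
person = {'Name': 'John', 'Age': 35}
person['Last Name'] = 'Smith'          # add a new key-value pair
person['Age'] = 36                     # assigning to an existing key overwrites its value

print(list(person.keys()))             # ['Name', 'Age', 'Last Name']
print(person.get('Hobbies', 'none'))   # none: get() avoids a KeyError for a missing key

for key, value in person.items():
    print(key, '->', value)
```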

<br>

__Lists, dictionaries, conditionals and iterations can be combined together to perform complex computations.__ 

For example, if we want to find the frequency of each word and punctuation mark that appears in the Shakespearean text, we can do so as follows:

In [None]:
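freq_words = {}

# Lowercase every word first so that "Now" and "now" are counted together
body_lowercase = [word.lower() for word in body]

for word in body_lowercase:
    if word in freq_words:
        freq_words[word] = freq_words[word] + 1
    else:
        freq_words[word] = 1

#freq_words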

In [None]:
len( freq_words.keys() )

In [None]:
print( freq_words["daughter"] )

In [None]:
print( freq_words["now"])
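
The same counting pattern can be tried on a tiny sample and compared with "collections.Counter" from the standard library, which is a different technique from the loop above and is shown here only as an alternative. The sample sentence is made up and is not a quotation from the text.

```python
from collections import Counter

sample = 'Now , fair daughter , now'.split()
sample_lower = [word.lower() for word in sample]

# Manual counting, using dict.get to supply 0 for words not seen yet
manual = {}
for word in sample_lower:
    manual[word] = manual.get(word, 0) + 1

# Counter does the same bookkeeping in one call
counted = Counter(sample_lower)

print(manual)
print(counted['now'])            # 2
print(manual == dict(counted))   # True
```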