# Python

# What is *? In Regular Expression?

https://stackoverflow.com/questions/3075130/what-is-the-difference-between-and-regular-expressions

# Parameterization and string substitution

### String substitution

In the context of building a database query like so:

`CREATE TABLE fish (name TEXT, species TEXT, tank_number INTEGER)` you would likely want to include for example user input for say a search query into a database. There are two ways to go about it and the first way to build a query is via string substitution and this is the bad way.

Say the user input is in the variable `name` and the user can put whatever name they like including special characters. We have our search query to find the particular `name` that the user inputted as follows:

```python
query = f"SELECT * from user_table WHERE name='{name}'"
```

Using f-string we are interpolation the variable with the query and this give the chance for SQL Injection attack to occur. This is because the user can specify something like `name = "Tom' OR 1=1"` as it's input and when the variable is used for interpolation for building the query it results into:

```python
name = "Tom' OR 1 = 1"

"SELECT * from user_table WHERE name='{name}'"

"SELECT * from user_table WHERE name='Tom' OR 1 = 1"
```

And when the query is executed it will pick everyone from the database and retrieve their information (possibly private information).

### Parameterized queries

Now for a much safer approach to building query is via parameterized queries. In this case, the variables that are used to build the query are pass as parameters and not used directly in a string interpolation. How you do this will depend on the library that you use, for example, in Python when you execute a query you can pass a parameterized string along with the parameters to build a parameterized queries.

The idea is that if the user input are passed as parameters, they no longer have the chance to mess with the query since there are no interpolation, the user input is not used to build a query but rather as a parameter. **The variables themselves will no long be used as part of an executable code but rather treated as literal values.**

```python
query = "SELECT * from user_table WHERE name=?"
params = (name)
cursor.execute(query, params)
```

Now no matter what the user input, even `Tom' OR 1 = 1` as `name` it will be treated literally, as you are looking for a person named `Tom' OR 1 = 1` in the database.

##### Further clarification  


In the future, if you are back to this post and wonder isn't this the same as string substitution? You would be wrong.

The way that parameterized query works is that the query is first sent to the SQL engine, and the database will know exactly what this query will do, and only then it will insert the username and passwords as LITERAL VALUES. So the user input values cannot affect the query in anyway.

It is separating the values with the queries, the query is first sent to the database and database will know what it does, then it will ask for the parameters that you have layed out and use it LITERALLY, not as part of the query anymore.

In the previous example, you are asking the database can you select all the columns from `user_table` where the name is equal to a parameter, the database understood your request, then ask you for the `name` and the name is provided as parameter. Even if you sent `Tom' OR 1 = 1` the database will be looking for a person named `Tom' OR 1 = 1`.

# All About Importing Modules and from Packages

### Module?

A module is just a Python file with the corresponding `.py` extension. So if you're talking about the math module then there is a corresponding `math.py` file that contains functions, classes, and constants that are meant to be used by other Python files.

### Where does Python look for modules

If you have written a python module under say `a_module.py` in a directory of `code`.

And you have a script called `a_script.py` in directory called `scripts`. You would like to use the `a_module` in `a_script.py` by importing it.

```
import a_module
```

Then you try to run the `a_script.py` by running `python3 scripts/a_script.py` it will fail with

```
$ python3 scripts/a_script.py
Traceback (most recent call last):
  File "scripts/a_script.py", line 1, in <module>
    import a_module
ModuleNotFoundError: No module named 'a_module'
```

When Python imports a module it will try to find a package or module. But where does it look? Python has a simple algorithm for finding a module with a given name. It will look for a file called `a_module.py` in the directories listed in the variable `sys.path`.

```
>>> import sys
>>> type(sys.path)
<class 'list'>
>>> for path in sys.path:
...     print(path)
... 
/Users/brettmz-admin/dev_trees/psych-214-fall-2016/sphinxext
/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python37.zip
/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7
/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/lib-dynload
/Users/brettmz-admin/Library/Python/3.7/lib/python/site-packages
/Users/brettmz-admin/dev_trees/grin
/Users/brettmz-admin/dev_trees/rmdex
/usr/local/lib/python3.7/site-packages
```

It doesn't search recursively, it will only search in the directory that is listed under `sys.path`. And as you can see the `code` directory is not in `sys.path` which is why Python could not import `a_module.py`.

To fix this you can just simply append to the `sys.path` list like so:

```
import sys
sys.path.append('code')

import a_module
```

Now this will work as expected. This simple search algorithm for module also works for packages, it searches for packages then the module the same way.

<p class="callout info">Information from: [https://bic-berkeley.github.io/psych-214-fall-2016/sys\_path.html](https://bic-berkeley.github.io/psych-214-fall-2016/sys_path.html) </p>

### What is Namespace package and Regular Package

A namespace packages are special packages that allows you to unify two packages with the same name but are at different directories:

```
path1
+--namespace
   +--module1.py
   +--module2.py
path2
+--namespace
   +--module3.py
   +--module4.py
```

Notice that in order to make Namespace package work you would have to add the two paths `path1` and `path2` to `sys.path`. Then you can import the four modules by doing

```
from namespace import module1
from namespace import module2
from namespace import module3
from namespace import module4
```

Or import specific functions from the module

```
from namespace.module1 import boo
```

Namespace packages basically unifies the two packages with the same name in a single namespace. You can import all four modules freely.

However, if either one of the `namespace` packages gain an `__init__.py` then it will become a normal package, and the unification is no longer as the other directory will be ignored.

If both have `__init__.py` the first one encountered in `sys.path` is the one being used.

##### Regular package

A regular package is a collection of modules. In order to make Python recognize that it is a regular package you must add `__init__.py`. Without `__init__.py` then it will be interpreted as a namespace package which doesn't fit 99% of normal use cases.

So the moral of the story is that if you're creating packages, just put `__init__.py` in it unless you have the special use case of needing to unify two namespace packages that are in different directory.

Regular packages are also searched in `sys.path` just like everything else.

# Global variables with imports

### Sharing global variables between modules

If for whatever reason you need to share global variables you need to keep in mind that global variables in Python are global to a module, not across modules. Which is different than say `static` variable in C.

To make your global variable be able to be shared cross module you would obviously declare a global variable in a module, and the best practice is to declare it in a dedicated module file called `config.py` and have everyone that needs that global variable import it into their scope.

```python
# config.py
a = 69 # Variable declared on top level are global by default
```

```python
# cool.py
import config
print(config.a)
```

<p class="callout warning">A word of warning, don't use `from` import unless you want the global variable to be a constant. If you do `from config import a` this would create a new `a` variable that is initialized to whatever `config.a` is at the time of the import, and this new `a` variable would be unaffected by any assignments to `config.a`. So it is not global global anymore.</p>

# Unpacking operator in code and function header

### Unpacking

In Python you are allowed to do deconstruction similar to how you can do deconstruction assignment in JavaScript. You can take a list and then assign each of the elements individually to variables on the left hand. For example:

```python
(a, b, c) = (1, 2, 3)
# a = 1
# b = 2
# c = 3
```

If you are using this simple unpacking assignment then the values on the right hand side will be automatically assigned to the variables to the left according to their position in the tuple.

When you unpack values into variables the number of variables present must be exactly equal to the number of elements in the tuple, else ValueError

You can use this tuple unpacking to unpack a String into characters, unpack list into variables, unpack a generator, and unpack a dictionary as well.

The left hand side can be a list as well but is rarely used.

### \* Unpacking operator

The \* operator extends the unpacking tuple functionality to allow you to pack multiple values into a single variable as a list. For example:

```python
*a, = [1, 2]
# a = [1, 2]
```

In this example, the variable a gets stuffed two values 1 and 2 into it as a list. Notice the usage of comma there, it is required otherwise it will be a normal list assignment statement and not unpacking.

You can have as many variables as you want on the left hand side, but there can only be one starred variable. The starred variable contains all the rest of the values that are not distributed to the mandatory variable in the list, even if it has no value. If there are no element left to distribute to the starred variable, then the starred variable will just be an empty list:

```python
a, b, *c = [1, 2]
# a = 1
# b = 2
# c = []
```

### Function header with \*args and \*\*kwargs

Sometimes in Python function header you will see functions that takes in variable number of position arguments with \*, and variable number of keyword arguments with \*\*. For example:

```python
def func(required, *args, **kwargs)
```

`args` and `kwargs` are both optional, and are automatically default to `() and {}` respectively, if none are provided for the function.

Calling the function above would be like so:

```python
func("hello", 1, 2, 3, site="hi", oh="lord")
# required = "hello"
# args = (1, 2, 3)
# kwargs = {"site": "hi", "oh": "lord"}
```

##### Side note on method headers

1. Variables without default value must be before those that have default values  
    ```
    def test_func(a=10, b, *args, **kwargs)
    ```
    
    Is not valid Python function
2. `*args` must be fore `**kwargs`. They cannot switch, otherwise, invalid Python syntax
3. Follow the structure of `func`, don't try to put position arguments after `*args`.