"The only way to learn a new programming language is by writing programs in it." - Dennis Ritchie
I believe you can learn serious things through silly games. I also think you will learn best by doing. This is a book of programming exercises. Each chapter includes a description of a program you should write with examples of how the program should work. Most importantly, each program includes tests so that you know if your program is working well enough.
I won't necessarily show you beforehand how to write each program. I'll describe what the program should do and provide some discussion about how to write it. I'll also create an appendix with short examples of how to do things like how to use argparse
, how to read/write from/to a file, how to process all the files in a directory, how to extract k-mers from a string, etc. I'll provide some building blocks, but I want you to figure out how to put the pieces together.
When you are done with this books you be able to:
- Write command-line Python programs
- Process a variety of command-line arguments, options, and flags
- Write and run tests for your programs and functions
- Manipulate of Python data structures including strings, lists, tuples, sets, dictionaries
- Use higher-order functions like
map
andfilter
- Write and use regular expressions
- Read, parse, and write various text formats
- Use and control of randomness
- Create and use graphs, kmers, Markov chains, Hamming distance, the Soundex algorithm, and more
First use the GitHub interface to "fork" this repository into your own account. Then do git clone
of your repository to get a local copy. Inside that checkout, do:
git remote add upstream https://github.com/kyclark/playful_python.git
This will allow you to git pull upstream master
in order to get updates. When you create new files, git add/commit/push
them to your repository. (Please do not create pull requests on my repository -- unless, of course, you have suggestions for improving my repo!).
I provide some useful programs in the bin
directory including one called new.py
that will help you stub out new Python programs using the argparse
module to parse the command line arguments and options for your programs. I recommend you start every new program with this program. For example, in the article
directory the README.md
wants you to create a program called article.py
. You should do this:
$ cd article
$ new.py article
This will create a new file called article.py
(that has been made executable with chmod +x
, if your operating system supports that) that has example code for you to start writing your program.
Your $PATH
is a list of directories where your operating system will look for programs. To see what your $PATH
looks like, do:
$ echo $PATH
Probably each directory is separated by a colon (:
). The order of the directories matters! For instance, it's common to have more than one version of Python installed. When you type python
on the command line, the directories in your $PATH
are searched in order, and the first python
found is the one that is used (and it's probably Python version 2!)
You could execute new.py
by giving the full path to the program, e.g., $HOME/work/playful_python/bin/new.py
, but that's really tedious. It's best to put new.py
into one of the directories that is already in your $PATH
like maybe /usr/local/bin
. The problem is that you probably need administrator privileges to write to most of the directories that are in your $PATH
. If you are working on your laptop, this is probably not a problem, but if you are on a shared system, you probably won't be able to copy the program into your $PATH
directories.
An alternative is to alter your $PATH
to include the directory where new.py
is located. E.g., if new.py
is in $HOME/work/playful_python/bin/
, then add this directory to your $PATH
-- probably by editing .bashrc
or .bash_profile
located in your $HOME
directory (if you use bash
). See the documentation for your shell of choice to understand how to edit and persist your $PATH
.
For what it's worth, I always create a $HOME/.local
directory for local installations of software I need, so I add $HOME/.local/bin
to my $PATH
. Then I copy programs like new.py
there and they are available to me anywhere on the system.
Once you have stubbed out your new program, open it in your favorite editor and change the example arguments in get_args
to suit the needs of your app, then add your code to main
to accomplish the task described in the README. To run the test suite using make
, you can type make test
in the same directory as the test.py
and article.py
program. If your system does not have make
or you just don't want to use it, type pytest -v test.py
.
Your goal is to pass all the tests. The tests are written in an order designed to guide you in how break the problem down, e.g., often a test will ask you to alter one bit of text from the command line, and this it will ask you to read and alter the text from a file. I would suggest you solve the tests in order. The make test
target in every Makefile executes pytest -xv test.py
where the -x
flag will have pytest
halt testing after it finds one that fails. There's no point in running every test when one fails, so I think this is less frustrating that seeing perhaps hundreds of lines of failing tests shoot by.
A fair number of the program rely on a dictionary of English words. To be sure that you can reproduce my results, I include a copy of mine in inputs/words.zip
.
Notebooks are great for interactive exploration of data, especially if you want to visualize thing, but the downsides:
- Stored as JSON not line-oriented text, so no good
diff
tools - Not easily shared
- Too easy to run cells out of order
- Hard to test
- No way to pass in arguments
I believe you can better learn how to create testable, reproducible software by writing command-line programs that always run from beginning to end and have a test suite. It's difficult to achieve that with Notebooks, but I do encourage you to explore Notebooks on your own.
I always love when a language has a good REPL (read-evaluate-print-loop) tool. Python and Haskell both excel in this respect. For simplicity's sake, I show the standard REPL when you execute python3
on the command-line, but you won't be able to copy and paste the same code examples there. For your own purposes, I suggest using the iPython REPL (ipython
) instead.
Every program included has been automatically formatted with yapf
(Yet Another Python Formatter), a tool from Google that can be customized with a configuration file. I encourage you to adopt and regularly a formatter (see also black
) after every modification to your program. Sometimes I even set up the formatter to format every time I save my program.
I would also encourage you to look at code "linters" like pylint
to find potential errors in your code that the Python interpreter itself will not compalain about. The mypy
program can also be used as we introduce type hints.
The exercises are roughly arranged from easier to harder.
Ken Youens-Clark is a Sr. Scientific Programmer in the lab of Dr. Bonnie Hurwitz at the University of Arizona. He started college as a music major at the University of North Texas but changed to English lit for his BA in 1995. He started programming at his first job out of college, working through several languages and companies before landing in bioinformatics in 2001. In 2019 he earned his MS in Biosystems Engineering, and enjoys helping people learn programming. When he's not working, he likes playing music, riding bikes, cooking, and being with his wife and children.
© Ken Youens-Clark 2019